TEXT MINING – KNOWLEDGE FROM UNSTRUCTURED SOURCES
Text Mining is a process for obtaining information from unstructured texts.
Using linguistic, statistical and mathematical methods, Text Mining searches texts for patterns and structures in a targeted way and extracts information from them.
The word Mining in the term Text Mining comes from an analogy to coal mining. As in coal mining, a vast amount of unstructured raw material must first be dug up, exposed and processed. Coal mining yields a valuable raw material; Text Mining yields profitable knowledge.
WHY TEXT MINING?
The age of Big Data is bringing a huge increase in digital information. Storage is cheap today and no longer a limiting factor. Knowledge used to be spread across only a few sources, but new technical possibilities have led to a rapid increase in the number of text documents and storage locations.
This, however, leads to a serious problem:
Given the mass of information, understanding text content and correlations, a task that until now has been performed solely by the human mind, can no longer be carried out reliably without technological help.
Professional Text Mining software solutions provide support here, helping wherever data and information are found in text documents rather than in databases.
Tuesday, March 26, 2013 - AusAID Australia and the World
Bank's Global Environment Fund (GEF) reached an agreement
to give the government of Kiribati US$5 million to install solar
panels around the country's capital, located on the Tarawa atoll.
According to Business Desk of the Brunei Times, AusAID promised
AU$3.2 million in funding, while GEF promised US$1 million.
The country was the first in the Pacific to make a deal
with the World Bank.
Example: Text Mining
In the Text Mining analysis process, implicit information is deliberately made explicit. With the help of algorithms, correlations between pieces of information are structured so that they can be evaluated.
The sample text above gives a schematic, illustrative look at the Text Mining process.
Once a category of information has been selected, all relevant passages are marked and can be retrieved quickly by anyone searching for them.
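The marking step can be illustrated with a minimal Python sketch. It assumes four hypothetical clusters (money, date, organization, place) and uses hand-written regular expressions and small name lists; none of these names or patterns come from an actual Text Mining product, and real tools rely on far more robust linguistic and statistical models. The sample text is the one shown above, with its line breaks joined.

```python
import re

# Sample text from the example above (line breaks joined).
TEXT = (
    "Tuesday, March 26, 2013 - AusAID Australia and the World Bank's "
    "Global Environment Fund (GEF) reached an agreement to give the "
    "government of Kiribati US$5 million to install solar panels around "
    "the country's capital, located on the Tarawa atoll. According to "
    "Business Desk of the Brunei Times, AusAID promised AU$3.2 million "
    "in funding, while GEF promised US$1 million. The country was the "
    "first in the Pacific to make a deal with the World Bank."
)

MONTHS = (
    "January|February|March|April|May|June|July|"
    "August|September|October|November|December"
)

# Hypothetical clusters (an assumption for illustration): each label maps
# to a hand-written pattern or a small list of known names.
CLUSTERS = {
    "money": re.compile(r"(?:US|AU)\$\d+(?:\.\d+)? million"),
    "date": re.compile(rf"(?:{MONTHS}) \d{{1,2}}, \d{{4}}"),
    "organization": re.compile(r"\b(?:AusAID|World Bank|GEF|Brunei Times)\b"),
    "place": re.compile(r"\b(?:Kiribati|Tarawa|Pacific)\b"),
}


def extract(text):
    """Return, per cluster, the distinct matches in order of first appearance."""
    results = {}
    for label, pattern in CLUSTERS.items():
        seen = []
        for match in pattern.finditer(text):
            if match.group(0) not in seen:
                seen.append(match.group(0))
        results[label] = seen
    return results


if __name__ == "__main__":
    for label, values in extract(TEXT).items():
        print(f"{label}: {values}")
```

Selecting a cluster then amounts to looking up one key in the returned dictionary, for example `extract(TEXT)["money"]`.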
DIFFERENCE BETWEEN TEXT MINING AND DATA MINING
In Text Mining, the information and data are contained in text documents, whereas in Data Mining the data is already available in a database in structured and condensed form. The two disciplines are closely related. The main distinguishing factors are the source of the information and its degree of structuring: Text Mining mainly deals with unstructured data, whereas Data Mining often draws on structured sources.
TEXT MINING (simplified illustration)
1. Selection of suitable text documents
2. Processing of the information and structured extraction
3. Presentation of results
DATA MINING (simplified illustration)
1. Selection of data
2. Processing of data
3. Validation and evaluation
4. Presentation of results
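The three Text Mining steps from the simplified illustration can be sketched in Python as follows. This is only a rough sketch under stated assumptions: the mini-corpus, the stopword list and the function names are all hypothetical, and simple term counting stands in for the "structured extraction" step, which in practice is much more sophisticated.

```python
import re
from collections import Counter

# Hypothetical mini-corpus; in practice these would be files or web pages.
DOCUMENTS = {
    "news_01": "The World Bank funds solar panels. Solar energy projects grow.",
    "news_02": "The football season opened with a draw.",
    "news_03": "Solar panels and wind farms attract new funding.",
}

# Hypothetical stopword list, kept tiny for the example.
STOPWORDS = {"the", "and", "with", "a", "new"}


def select_documents(documents, keyword):
    """Step 1: keep only the documents that mention the keyword."""
    return {
        name: text
        for name, text in documents.items()
        if keyword.lower() in text.lower()
    }


def extract_terms(text):
    """Step 2: tokenize, drop stopwords and count the remaining terms."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return Counter(token for token in tokens if token not in STOPWORDS)


def present(counts, top_n=2):
    """Step 3: render the most frequent terms as aligned text rows."""
    return [f"{term:<10}{count}" for term, count in counts.most_common(top_n)]


def run_pipeline(documents, keyword):
    """Chain the three steps and return the combined term counts."""
    total = Counter()
    for text in select_documents(documents, keyword).values():
        total.update(extract_terms(text))
    return total


if __name__ == "__main__":
    for row in present(run_pipeline(DOCUMENTS, "solar")):
        print(row)
```

Keeping each step in its own function mirrors the illustration: the selection criterion or the extraction method can be swapped without touching the presentation.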
EXAMPLE: USE CASE FOR TEXT MINING
TEXT MINING IN HEALTHCARE
The pharmaceutical industry and medical research face enormous challenges as a result of the very restrictive handling of health data. For example, countless sources of unstructured data must be consolidated to predict the progression of various illnesses or the efficacy of medications.
Text Mining provides support here: precise data, and thus valuable knowledge, can be extracted from immense quantities of unstructured texts, surveys and documents. This knowledge enables better care and prognosis for patients and thus a healthier life.
IF ONLY YOUR COMPANY KNEW WHAT IT ALREADY KNOWS!
Data is the basis of all strategies, planning, reports and ultimately all decisions made in a company. More than 80% of a company’s business-relevant information lies dormant in unstructured data. New information accrues every day in the form of unstructured documents. Merging this information manually is extremely time-consuming, and therefore costly, and significantly increases the risk of error.
Today, this potential generally lies completely idle.
Professional and efficient Text Mining enables companies to obtain information that has remained hidden from them until now.
Readily available knowledge about customers, competitors and markets is becoming more and more important in today’s cut-throat markets.
This knowledge, and the management of it, is the most important success factor and the most crucial resource of a successful company.