Information Discovery vs. Apache UIMA

Information Discovery contains a 100% UIMA-compatible text mining platform. It offers numerous annotators for the semantic analysis of text. Our annotators are multilingual and support text analysis in various languages. Depending on the task, we use rule-based or trainable (machine-learning) approaches.
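To illustrate what "rule-based" means here, the following is a minimal sketch in plain Java, not the UIMA API and not Information Discovery's implementation. It assumes a hypothetical rule (a regular expression for tokens made of capital letters followed by digits) and a hypothetical `Span` type that records a type name plus begin and end character offsets, which is the general stand-off style that UIMA annotations follow.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class RuleBasedAnnotatorSketch {

    /** Hypothetical stand-off annotation: a typed span over the document text. */
    static final class Span {
        final String type;
        final int begin; // inclusive character offset
        final int end;   // exclusive character offset

        Span(String type, int begin, int end) {
            this.type = type;
            this.begin = begin;
            this.end = end;
        }
    }

    // Illustrative rule only (an assumption): two or more capital letters
    // followed by one or more digits, delimited by word boundaries.
    private static final Pattern RULE = Pattern.compile("\\b[A-Z]{2,}[0-9]+\\b");

    /** Applies the rule to the text and returns one span per match. */
    static List<Span> annotate(String text) {
        List<Span> spans = new ArrayList<>();
        Matcher matcher = RULE.matcher(text);
        while (matcher.find()) {
            spans.add(new Span("GeneLikeToken", matcher.start(), matcher.end()));
        }
        return spans;
    }

    public static void main(String[] args) {
        String text = "Mutations in BRCA1 and TP53 were reported.";
        for (Span span : annotate(text)) {
            // Print the annotation type, its offsets, and the covered text.
            System.out.println(span.type + " [" + span.begin + "," + span.end + ") "
                    + text.substring(span.begin, span.end));
        }
    }
}
```

A trainable annotator would produce the same kind of spans, but it would decide where they go using a statistical model learned from annotated examples rather than a hand-written pattern.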

All trainable annotators come with tools for creating new models for new languages or genres. In addition to standard models based on newspaper text, we offer a wide variety of biomedical annotators for analyzing research literature, patents, and medical text.

Framework

Framework	Information Discovery	Apache UIMA
UIMA Java Framework	yes	yes
UIMA C++ Framework	yes	yes
UIMA Default Viewers & Tooling	yes	yes
PEAR Packaging Facilities	yes	yes
UIMA-AS Scaleout Framework	yes	yes
UIMA-AS in the Cloud	yes	no

Infrastructure

Infrastructure	Information Discovery	Apache UIMA
Simple Server (UIMA REST service)	Add-On	Add-On
Generic Typesystem	yes	no
Web-based Annotation Client	yes	no
Scripting Language for Pipeline Configuration	yes	no

Core Components

Component	Information Discovery	Apache UIMA
Collection Readers (CR)
Simple File Reader	yes	Add-On
XMI Reader	yes	Add-On
Generic XML Reader	yes	no
Generic Database Reader	yes	no
Annotators
Tika Annotator	yes	Add-On
Document Zoning	yes	no
Language Detection	yes	no
Document Classification	yes	no
Sentence Splitting, Rule Based	yes	Add-On
Sentence Splitting, Trainable	yes	no
Tokenization, Rule Based	yes	Add-On
Tokenization, Trainable	yes	no
Part-Of-Speech Recognition	yes	no
Shallow Parsing / Chunking	yes	no
Stemming	yes	Add-On
Morphological Analysis	yes	no
Decompounding	yes	no
Stopword Recognition	yes	Add-On
Invariant Recognition	yes	no
Acronym and Abbreviation Resolution	yes	no
Regular Expression Annotator	yes	Add-On
Lemmatizer, Lexicon Based	yes	no
Concept Recognition	yes	Add-On
Named Entity Recognition, Trainable	yes	no
Concept Disambiguation	yes	no
Keyword Extraction, Controlled and Uncontrolled	yes	no
Evaluation Modules	yes	no
Table Format Recognition	yes	no
UIMA Default Annotators (HMM Tagger, BSF Annotator, AlchemyAPI, OpenCalais)	Add-On	Add-On
Drools Annotator	yes	no
Relation Extraction, Trainable	yes	no
CAS Consumer (CC)
XML Writer	yes	Add-On
Lucene CAS Indexer (Lucas)	yes	Add-On
Solr CAS Consumer (Solrcas)	yes	no
DB Writer	yes	no
Flow Controller
Document Language Flow Controller	yes	no
Document Category Flow Controller	yes	no

Biomedical Components

Component	Information Discovery	Apache UIMA
Medline Reader	yes	no
Biomedical Sentence Splitter	yes	no
Biomedical Tokenizer	yes	no
Negation Annotator	yes	no
Number Annotator	yes	no
Disease Annotator	yes	no
Anatomy Annotator	yes	no
Drug Annotator	yes	no
Gene Tagger (Uniprot, EntrezGene)	yes	no
ChemSpot Annotator	yes	no

Start Finding Answers in Your Data Today

We would be glad to present our products to you and create a demonstration based on your selected data repositories.