AUTOMATIC DOCUMENT CLASSIFICATION
Document classification: Articles and text documents are automatically classified for indexing using a freely definable category system. For example, agency reports can be assigned automatically to the relevant departments (e.g. “Economy”, “Politics”).
FLEXIBLE CONCEPT IDENTIFICATION THROUGH TERMINOLOGIES
Concept recognition through the use of terminologies: The lexicon structure is flexible and allows synonyms and other attributes relevant to annotation to be included. Lexicon matching can be carried out on contiguous phrases or on separated text blocks (“innovative ability” vs. “ability to innovate”).
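The idea of order-insensitive lexicon matching can be sketched in a few lines of Python. This is an illustrative toy with a made-up lexicon and a deliberately crude stemmer, not the product's actual matching logic:

```python
# Toy sketch of lexicon-based concept recognition (invented lexicon and
# crude suffix stemmer, not the actual annotation pipeline).

STOP = {"to", "of", "the", "a", "an"}

def stem(word):
    # Strip a few common suffixes so "innovate"/"innovative" unify.
    for suffix in ("ation", "ive", "ed", "ing", "e"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

def normalize(text):
    # Order-insensitive normalization: lowercase, drop function words, stem.
    return frozenset(stem(w) for w in text.lower().split() if w not in STOP)

LEXICON = {
    "innovative ability": ["innovative ability", "ability to innovate"],
}

def find_concepts(phrase):
    """Return canonical concepts whose synonym variants match the phrase."""
    target = normalize(phrase)
    return [concept for concept, variants in LEXICON.items()
            if any(normalize(v) == target for v in variants)]
```

With this normalization, both “innovative ability” and “ability to innovate” reduce to the same token set and map to the same concept.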
Entity recognition: Entities are identified through purely statistical computation of a variety of features derived from context words. In this way, person and product names, organizations and geographical references are identified precisely.
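A toy illustration of the kind of context-word features such a statistical recognizer might compute per token before feeding them to a classifier (the feature names and window size are assumptions made purely for illustration):

```python
# Illustrative sketch (not the actual model): surface and context-word
# features of the kind a statistical entity recognizer might compute.

def context_features(tokens, i, window=2):
    """Collect features for token i and its neighbouring context words."""
    feats = {
        "word": tokens[i].lower(),
        "is_capitalized": tokens[i][:1].isupper(),
        "has_digit": any(c.isdigit() for c in tokens[i]),
    }
    # Add the surrounding context words within the window.
    for off in range(-window, window + 1):
        if off != 0 and 0 <= i + off < len(tokens):
            feats[f"ctx{off:+d}"] = tokens[i + off].lower()
    return feats

tokens = "The CEO of Acme Corp visited Berlin".split()
feats = context_features(tokens, 3)   # features for "Acme"
```

A statistical model trained on many such feature vectors can then learn that a capitalized token preceded by “of” and followed by “corp” is likely part of an organization name.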
ANALYSIS & MINING
Sentiment analysis and opinion mining: Qualitative value judgments are reliably recognized in texts and evaluated at sentence level.
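A minimal lexicon-based sketch of sentence-level polarity scoring (a toy stand-in with an invented polarity lexicon; real opinion mining uses trained models):

```python
# Toy sentence-level sentiment scoring with a tiny polarity lexicon.
import re

POLARITY = {"excellent": 1, "reliable": 1, "poor": -1, "broken": -1}

def sentence_sentiment(text):
    """Score each sentence: positive (> 0), negative (< 0) or neutral (0)."""
    results = []
    for sent in re.split(r"(?<=[.!?])\s+", text.strip()):
        score = sum(POLARITY.get(w, 0)
                    for w in re.findall(r"\w+", sent.lower()))
        results.append((sent, score))
    return results

report = "The camera is excellent. The battery cover felt poor."
scores = sentence_sentiment(report)
```

Evaluating per sentence rather than per document is what allows mixed reviews like the one above to yield one positive and one negative judgment.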
AUTOMATIC CLASSIFICATION OF DOCUMENTS
The document and text classification of Information Discovery makes it simple to classify documents using statistical methods from the field of artificial intelligence.
We offer classification and clustering techniques based on modern text mining, machine learning and natural language processing.
This enables application scenarios such as sentiment analysis, content monitoring, technology categorization, predictive coding, clustering, alerting and document research to be implemented in just a few steps.
Users do not need a deep understanding of statistical learning methods. They can use our services both via a powerful graphical user interface and via web services. Machine learning methods such as natural language processing and deep learning support information professionals with complex annotation and classification work.
In contrast to rule-based approaches, in which a rule must be defined for every possible decision, machine learning systems learn from the examples and experience of experts. The system is trained, and once it has learned, it makes independent predictions on new, previously unseen documents.
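The learn-from-examples idea can be sketched with a tiny Naive Bayes text classifier. The example documents and labels below are invented for illustration; this is not the product's actual model:

```python
# Toy Naive Bayes classifier: trained on expert-labelled examples, then
# predicting the category of an unseen document.
import math
from collections import Counter, defaultdict

def train(examples):
    """examples: list of (text, label). Returns a prediction function."""
    word_counts = defaultdict(Counter)   # word frequencies per label
    label_counts = Counter()             # how many examples per label
    vocab = set()
    for text, label in examples:
        words = text.lower().split()
        word_counts[label].update(words)
        label_counts[label] += 1
        vocab.update(words)

    def predict(text):
        words = text.lower().split()
        best, best_lp = None, float("-inf")
        for label in label_counts:
            # Log prior plus Laplace-smoothed log likelihood of each word.
            lp = math.log(label_counts[label] / len(examples))
            total = sum(word_counts[label].values()) + len(vocab)
            for w in words:
                lp += math.log((word_counts[label][w] + 1) / total)
            if lp > best_lp:
                best, best_lp = label, lp
        return best

    return predict

predict = train([
    ("interest rates rise as markets react", "Economy"),
    ("central bank adjusts inflation target", "Economy"),
    ("parliament votes on the new coalition", "Politics"),
    ("the minister announced an election date", "Politics"),
])
category = predict("markets fear higher inflation")  # -> "Economy"
```

Note that no rule mentioning “markets” or “inflation” was ever written; the association with “Economy” was learned entirely from the labelled examples.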
Automatically categorizing large amounts of data into many hierarchical categories with high forecast quality requires a sufficient amount of training data. The concept of active learning minimizes the effort of creating this data manually through intelligent data sampling and iterative supervised learning.
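The sampling step at the heart of active learning can be sketched as uncertainty sampling: from the unlabelled pool, pick the documents the current model is least sure about, so the expert labels only the most informative ones. The stand-in probability model below is invented purely for illustration:

```python
# Sketch of uncertainty sampling, the core selection step of active learning.

def most_uncertain(pool, proba, k=1):
    """proba(doc) -> probability of the positive class in [0, 1].
    Return the k pool documents whose prediction is closest to 0.5."""
    return sorted(pool, key=lambda d: abs(proba(d) - 0.5))[:k]

# Stand-in model, invented for illustration: confidence grows with length.
toy_proba = lambda doc: min(1.0, len(doc.split()) / 10)

pool = [
    "short note",                                        # proba 0.2
    "a somewhat longer memo here",                       # proba 0.5
    "one two three four five six seven eight nine ten",  # proba 1.0
]
picked = most_uncertain(pool, toy_proba)   # the document scored near 0.5
```

The selected documents go to the expert for labelling, the model is retrained, and the cycle repeats until forecast quality is sufficient.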
By integrating special components, the search engine offers comprehensive treatment of linguistic phenomena. Even phrases, synonyms and individual components of compound words are recognized, and layperson and expert language are mapped onto one another (“appendicitis”, “inflammation of the appendix”, “inflamed appendix”, etc.).
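Mapping layperson and expert wording onto one another can be sketched as query-time synonym expansion (a toy lexicon invented for illustration; the real components also handle compound splitting and inflection):

```python
# Toy synonym expansion: every variant of a concept maps to one
# canonical term, and a query is expanded to all indexed variants.

SYNONYMS = {
    "appendicitis": {"appendicitis", "inflammation of the appendix",
                     "inflamed appendix"},
}

# Invert the lexicon: variant -> canonical term.
CANONICAL = {v: k for k, variants in SYNONYMS.items() for v in variants}

def expand_query(query):
    """Map the user's wording to all known variants of the concept."""
    canon = CANONICAL.get(query.lower(), query.lower())
    return sorted(SYNONYMS.get(canon, {canon}))
```

A layperson searching for “inflamed appendix” thus also retrieves documents that only use the expert term “appendicitis”.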
In order to narrow down large numbers of hits, the search engine shows the user related search terms that are semantically associated with a search query.
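One simple way to obtain semantically associated terms is co-occurrence: terms that frequently appear in the same documents as the query term are suggested. The tiny corpus below is invented for illustration:

```python
# Toy co-occurrence-based suggestion of related search terms.
from collections import Counter

DOCS = [
    "interest rates inflation bank",
    "inflation prices economy",
    "football cup final",
]

def related_terms(query, docs=DOCS, k=2):
    """Suggest terms that co-occur with the query term in the same documents."""
    counts = Counter()
    for doc in docs:
        words = doc.split()
        if query in words:
            counts.update(w for w in words if w != query)
    return [w for w, _ in counts.most_common(k)]
```

For the query “inflation”, terms like “rates” or “prices” surface, while the football document contributes nothing.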
Based on text similarities, the search engine automatically calculates recommendations for articles that may also be relevant to the user.
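Such recommendations can be sketched as cosine similarity over simple term-frequency vectors (illustrative only; production systems typically use weighted or embedding-based representations):

```python
# Toy text-similarity recommendation via cosine similarity of
# term-frequency vectors.
import math
from collections import Counter

def cosine(a, b):
    """Cosine similarity between two texts' term-frequency vectors."""
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ca[w] * cb[w] for w in ca)
    na = math.sqrt(sum(v * v for v in ca.values()))
    nb = math.sqrt(sum(v * v for v in cb.values()))
    return dot / (na * nb) if na and nb else 0.0

def recommend(current, candidates):
    """Rank candidate articles by similarity to the current article."""
    return sorted(candidates, key=lambda c: cosine(current, c), reverse=True)

articles = ["central bank raises interest rates",
            "local team wins the cup final"]
ranked = recommend("interest rates and inflation", articles)
```

The article sharing vocabulary with the one currently being read is ranked first and can be shown as “may also be relevant”.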
FLEXIBLE RIGHTS MANAGEMENT
Existing rights management concepts (e.g. LDAP user groups) can be adopted. The solution supports storing authorizations in the search index as well as querying existing authorization services.
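Storing authorizations in the search index can be sketched as an ACL field checked against the user's groups at query time (a toy in-memory index invented for illustration; in Solr this would typically be expressed as a filter query on such a field):

```python
# Toy index-stored authorizations: each document carries an "acl" field,
# and every search result is filtered by the querying user's groups.

INDEX = [
    {"id": 1, "title": "public memo",   "acl": {"staff", "public"}},
    {"id": 2, "title": "board minutes", "acl": {"board"}},
]

def search(user_groups, results=INDEX):
    """Drop every hit the user's groups are not authorized to see."""
    return [doc for doc in results if doc["acl"] & user_groups]

visible = search({"staff"})   # only the public memo remains
```

Because the check happens inside the search path, users never see hit counts or snippets for documents they are not allowed to read.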
Using a web-based editor, existing terminologies and other term catalogs can be imported, edited and used for information extraction and indexing.
Multilingualism is supported, as is enrichment with synonyms and cross-references to other terminologies.
FLEXIBLE & INTELLIGENT
The editor supports the entry of new terms with automatic validation and consistency checks, and helps enrich entries with information from various external sources.
Apache Solr is the popular, blazing fast, open source NoSQL search platform from the Apache Lucene project. Its major features include powerful full-text search, hit highlighting, faceted search, dynamic clustering, database integration, rich document (e.g., Word, PDF) handling, and geospatial search.
Solr is highly scalable, providing fault tolerant distributed search and indexing, and powers the search and navigation features of many of the world’s largest internet sites.
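A typical Solr request combining full-text search, faceting and hit highlighting might be assembled like this. The host, core name and field names are assumptions for illustration; the parameter names (`q`, `fq`, `facet`, `hl`, `rows`) are standard Solr query parameters:

```python
# Assembling a hypothetical Solr select request with standard parameters.
from urllib.parse import urlencode

params = {
    "q": "body:appendicitis",     # full-text query
    "fq": "doctype:article",      # filter query narrowing the result set
    "facet": "true",              # enable faceted search...
    "facet.field": "department",  # ...counting hits per department
    "hl": "true",                 # highlight matching snippets
    "hl.fl": "body",              # field to highlight
    "rows": 10,                   # page size
}
url = "http://localhost:8983/solr/docs/select?" + urlencode(params)
```

Sending this request returns matching documents together with per-department facet counts and highlighted body snippets in a single round trip.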
The Unstructured Information Management Architecture (UIMA) is an architecture and software framework for creating, discovering, composing and deploying a broad range of multi-modal analysis capabilities and integrating them with search technologies.
The architecture has been the subject of a standardization effort by a technical committee within OASIS, referred to as the UIMA specification. The Apache UIMA framework is an Apache-licensed, open-source implementation of the UIMA architecture. It provides a run-time environment in which developers can plug in and run their UIMA component implementations and with which they can build and deploy UIM applications. The framework itself is not tied to any specific IDE or platform.
Neo4j is the world’s leading graph database, powering numerous organizations worldwide, including more than 50 Global 2000 customers. Neo4j delivers the lightning-fast read and write performance you need while still protecting your data integrity.
It is the only enterprise-strength graph database that combines native graph storage, scalable architecture optimized for speed, and ACID compliance to ensure predictability of relationship-based queries.
AngularJS is an open-source web application framework maintained by Google and by a community of individual developers and corporations to address many of the challenges encountered in developing single-page applications.
It aims to simplify both the development and the testing of such applications by providing a framework for client-side model–view–controller (MVC) and model-view-viewmodel (MVVM) architectures, along with components commonly used in rich Internet applications.
Bootstrap is a front-end framework: it provides the interface presented to the user, in contrast to the server-side code that resides on the “back end”, i.e. the server.