SEMCARE: New Platform for Information Management in the Healthcare Industry

Information discovery designed specifically for the healthcare sector: that was the goal of the SEMCARE (Semantic Data Platform for Healthcare) research project. As coordinator of this EU-funded project, we worked with European partners from industry, academia, and clinical settings to develop the new data analytics platform Information Discovery for Healthcare.

The software developed in SEMCARE supports clinics in diagnosing illnesses and selecting suitable treatments, and makes it easier for them to choose suitable patients for clinical studies. In the SEMCARE project, we created a platform based on our proven information discovery technology to combine the newest text-mining technologies with multilingual semantics. Specific medical language features, terminology, and vocabulary were integrated, making it possible to harmonize and analyze structured and unstructured patient data from a variety of sources. Medical documents can be searched based on diagnoses, symptoms, regulations, and other criteria. This makes it possible to assemble patient groups based on specific characteristics, for instance for clinical trials, with just a few mouse clicks.

Screenshot Averbis Information Discovery

Text mining meets patient data

Information Discovery for Healthcare was developed to pool specific patient data according to defined clinical criteria. The technology makes it possible to find and evaluate individual criteria such as age, sex, diagnosis, indication, symptoms, or laboratory results in documents from various sources. Powerful full-text search capabilities and semantic textual analysis are combined into a hybrid semantic full-text search. This makes it possible to semantically integrate heterogeneous structured and unstructured data sources in order to identify information and documents, and to evaluate and present the data in customized ways.
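Selecting patients by clinical criteria can be pictured with a minimal sketch. It assumes the criteria have already been extracted from the documents into simple records; the `PatientRecord` class, the `select_cohort` function, the lab name, and all values are made up for illustration and are not part of the product.

```python
from dataclasses import dataclass, field


@dataclass
class PatientRecord:
    """Hypothetical, already-extracted view of one patient's data."""
    patient_id: str
    age: int
    sex: str
    diagnoses: set = field(default_factory=set)
    labs: dict = field(default_factory=dict)  # lab name -> latest value


def select_cohort(records, min_age, max_age, diagnosis, lab_name, lab_min):
    """Return the IDs of patients who meet every criterion."""
    return [
        r.patient_id
        for r in records
        if min_age <= r.age <= max_age
        and diagnosis in r.diagnoses
        # A missing lab value never satisfies the threshold.
        and r.labs.get(lab_name, float("-inf")) >= lab_min
    ]


# Illustrative records only; the numbers carry no clinical meaning.
records = [
    PatientRecord("p1", 67, "m", {"ischemic heart disease"}, {"troponin": 0.08}),
    PatientRecord("p2", 45, "f", {"hypertension"}, {"troponin": 0.01}),
    PatientRecord("p3", 72, "f", {"ischemic heart disease"}, {"troponin": 0.02}),
]

cohort = select_cohort(records, 60, 80, "ischemic heart disease", "troponin", 0.05)
print(cohort)  # ['p1']
```

In this sketch every criterion must hold at once, which mirrors assembling a patient group from a set of inclusion criteria.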

In order to provide basic functionality, the software requires only access to the text documents, independent of their format, whether it be PDF, RTF, TXT, or others. Data export to platforms like i2b2 and tranSMART is supported. Open interfaces also make it possible to integrate the system into existing hospital information systems.

Our Proof of Concept

The platform has already been evaluated in three European pilot locations in London, Rotterdam, and Graz during the SEMCARE project term. Information Discovery for Healthcare was used in these locations in the field of cardiovascular disease. High-risk patients were successfully identified using specific biomarkers and combinations of symptoms. Due to their ischemic heart disease, these patients are at risk of dying from sudden cardiac arrhythmia. Early recognition of this danger can be used to provide timely care with suitable treatments and lower the mortality of this patient group.

Privacy protection in accordance with international standards

Information Discovery for Healthcare was developed in a manner that conforms with statutory regulations on privacy protection. A multi-layer concept of data usage regulates access rights, guaranteeing the highest level of security:

  • Installation exclusively in individual hospitals: no higher-level collection of data from multiple hospitals into central databases
  • Integration into local access control systems: data from Information Discovery may only be viewed if the user has unrestricted access to the electronic patient data system
  • Automated, retrospective analysis of patient data: limited access to data from the user’s own department
  • No clinical data leaves the hospital
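One possible reading of the access rules above is a two-layer check: the user must hold access in the hospital's own patient data system, and may only see data from their own department. The sketch below is an assumption about how such a check could look; the `User`, `Document`, and `may_view` names are hypothetical and do not describe the actual implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class User:
    name: str
    department: str
    has_ehr_access: bool  # assumed to come from the hospital's local access control


@dataclass(frozen=True)
class Document:
    doc_id: str
    department: str


def may_view(user: User, doc: Document) -> bool:
    """Both layers must pass: local EHR access and same-department scope."""
    return user.has_ehr_access and user.department == doc.department


clinician = User("a.smith", "cardiology", has_ehr_access=True)
guest = User("b.jones", "cardiology", has_ehr_access=False)
report = Document("doc-1", "cardiology")
other = Document("doc-2", "oncology")

print(may_view(clinician, report))  # True
print(may_view(guest, report))      # False
print(may_view(clinician, other))   # False
```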

One platform – Many uses

Information Discovery for Healthcare is a modern analysis platform that lets users run comprehensive semantic searches of medical documents. It provides insight into structured and unstructured patient data, and supports flexible data analyses and correlations. With this range of functionality, the platform is suitable for a wide variety of applications in the healthcare industry.

Information Discovery: New version 4.5.0 available!

Information Discovery is a next-generation text analytics platform that allows you to gain insights into your unstructured data and explore key information in the most flexible way possible. Information Discovery collects and analyzes all kinds of documents, such as patents, research literature, databases, websites, and company-internal repositories.

With the new version, we have revised Information Discovery once more and again taken the opportunity to improve the user interface in detail.

The key new features of Version 4.5.0 include:

  • New! Significantly improved performance of the file system crawler
  • New! Additional configuration options with regard to indexing
  • New! Configurable facets: Facets can now be configured as AND or OR links, making search queries easier and quicker to configure. You can now also dynamically load further entries directly within the facets.
  • Various bugfixes
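The difference between AND and OR facet links can be illustrated with a short sketch. It is not the product's implementation; the `filter_by_facet` function and the sample documents are assumptions made purely for illustration.

```python
def filter_by_facet(docs, facet, selected, mode="OR"):
    """Return the IDs of documents whose values for `facet` match the selection.

    mode="OR":  a document matches if it carries at least one selected value.
    mode="AND": a document matches only if it carries every selected value.
    """
    if mode not in ("AND", "OR"):
        raise ValueError("mode must be 'AND' or 'OR'")
    selected = set(selected)
    result = []
    for doc in docs:
        values = set(doc.get(facet, ()))
        if mode == "AND":
            hit = selected <= values        # selection is a subset of the values
        else:
            hit = bool(selected & values)   # selection and values overlap
        if hit:
            result.append(doc["id"])
    return result


# Hypothetical documents with facet values.
docs = [
    {"id": 1, "type": ["patent"], "topic": ["cardiology", "genomics"]},
    {"id": 2, "type": ["article"], "topic": ["cardiology"]},
    {"id": 3, "type": ["article"], "topic": ["oncology"]},
]

print(filter_by_facet(docs, "topic", ["cardiology", "genomics"], mode="OR"))   # [1, 2]
print(filter_by_facet(docs, "topic", ["cardiology", "genomics"], mode="AND"))  # [1]
```

An OR link widens the result set with each selected value, while an AND link narrows it.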

Big Data in Healthcare

18%: proportion of studies in Europe that complete recruitment on time
7%: proportion of studies in the United States that complete recruitment on time
8 million US dollars: cost per day when the launch of a drug is delayed

Healthcare systems worldwide are currently going through major transformations brought on by increasing regulation, record public debt, and shrinking budgets. Traditionally separate and fragmented sectors of the industry such as healthcare providers, payers and drug companies are now looking at ways to work together and coordinate efforts to improve patient safety and healthcare quality while reducing costs.

In these times of reduced income from payers and declining R&D productivity, manufacturers seek to develop drugs that are cost-effective, well targeted, and of high value. The concept of personalized medicine offers the chance of improved healthcare, better patient outcomes, and less harm. Clinical data plays an important role here, whether by helping to identify who is likely to respond well to a given treatment or by speeding up and improving drug submissions to regulatory authorities.

Personalized medicine, however, has created various “big data” challenges for medical trials, including how to collect, manage, and effectively examine the increasing volume and velocity of patient data involved. The life sciences industry has seen an enormous upsurge in documented patient data over the last 10 years. This has been propelled by impressive changes that include advances in genome sequencing technologies; the adoption of Electronic Health Records (EHRs) by different healthcare systems; the sharing of clinical-trial data; and the explosion in data from patient registries, social media networks, and medical and non-medical devices (e.g. smartphones and fitness monitors). These changes have given rise to a profusion of data from diverse sources such as genomics, clinical trials, EHRs, and research studies.

Advanced analytical tools are more necessary than ever to derive insights from these data. Manufacturers are thus positioning themselves to create more targeted therapies and to revolutionize the way that biopharmaceutical drugs are discovered, developed, and marketed. Where there is no access to clinical data, identifying and recruiting suitable patients and finding trial sites are the main causes of trial delays. Delayed trials waste precious resources and curtail access to new drugs.

Half of clinical trials today fail to reach the target sample size needed for the study; just 18% of Europe-based studies and 7% of US-based studies complete enrolment on time. A single day of delay for a drug reaching the market can cost pharmaceutical companies up to 8 million US dollars.

Averbis’ mission is to facilitate collaboration between pharmaceutical and medtech companies and healthcare providers by building tools and services that give real-time access to large patient populations. We improve clinical research by leveraging patient data, providing tools and services for semantic harmonization and better data quality. The network enables collaboration with other member providers, advancing translational research efforts. We allow hospitals to fund their research via pharmaceutical company sponsorships, increase their participation in industry-sponsored clinical studies, and enhance their grant competitiveness. We help prevent unnecessary amendments to clinical studies and aid in identifying clinical trial sites that provide access to a sufficient number of patients meeting the inclusion and exclusion criteria. In this way, pharmaceutical companies get new therapeutics on the market faster, because inefficiency and unnecessary expenses in clinical studies are reduced and unsuccessful candidates are eliminated early in the clinical trial process.


In mid-2016, a new set of regulations concerning product compliance with European industry standards will be put in place for pharmaceutical and life science organizations. Identification of Medicinal Products (IDMP) is a framework of detailed descriptions of substances, composition and dosage forms, production procedures, and packaging. These IDMP norms require identification of all pharmaceutical products according to certain data standards, laid out by ISO (International Organization for Standardization). ISO has come up with five IDMP standards; these are aimed at accurately identifying medicinal products for human use, with a high degree of certainty.

IDMP Standards Averbis

Europe is the first region to adopt the Identification of Medicinal Products (IDMP) standards, with a deadline of July 1st, 2016, so time is running out for life science organizations to comply. The European Medicines Agency (EMA) is the first regulatory agency to require life science organizations to comply with the ISO standards, and it is anticipated that the U.S. Food and Drug Administration (FDA) and Japan’s Pharmaceuticals and Medical Devices Agency (PMDA) will follow its lead. In the case of noncompliance, organizations will face severe fines. Life science organizations will therefore need to quickly establish a data standardization process that is robust, reliable, and flexible enough to meet these varying regulatory demands.


The main challenges from the perspective of implementing compliance are:

  • An enormously ambitious schedule. Life science organizations are obliged to meet the requirements before the July 2016 deadline, but the final EMA implementation guidelines were only made available in late 2015. This leaves organizations with a window of less than a year to comply with the new standards.
  • The need to collaborate between business units. The product master data required to satisfy the IDMP standards is present in a wide set of units and systems within life science organizations and their suppliers.
  • A massive pool of unstructured data. So far, most data has been submitted to regulators in heterogeneous formats such as PDF, DOC, and TXT files. Details about substances, contained in documents such as the Summary of Product Characteristics (SmPC), Manufacturing Licenses, and Chemistry, Manufacturing and Control (CMC) documents, are spread across a wide set of source systems within the organizations.

Regarding the last point, text mining technologies are crucial for overcoming the complexity and heterogeneity of unstructured data. Text mining refers to the process of deriving high-quality information from text. It can quickly extract relevant information from text sources and structure the heterogeneous data contained in various IDMP-relevant sources, such as SmPC documents. Relevant information includes product names; ingredients; excipients; pharmaceutical dosage forms, strengths, and units; undesirable side effects; and much more. This information is mapped to standardized vocabulary as defined in the above-mentioned ISO standards and as shown in the following picture.

Identification of Medicinal Products (IDMP)

Let’s take undesirable effects as an example to show the complexity of the task and the capabilities of advanced text mining solutions. Side effects are usually listed in section 4.8 “Undesirable Effects” of SmPC documents. They mostly appear in tables of varying formats, but can also be found in running text. Side effects are coded, that is, mapped to the MedDRA vocabulary. The information to be extracted consists of multiple items from the table. For example, adverse events are usually listed together with their System Organ Class (SOC) and frequency. Text mining solutions must be able to extract such complex information and relations from tables and automatically map them to MedDRA codes.
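The mapping step described above can be sketched in a few lines. The example assumes the table has already been parsed into rows of SOC, frequency, and cell text. The `TERM_TO_CODE` lookup and its codes are placeholders and not real MedDRA codes, and `map_side_effects` is a hypothetical helper, not part of any Averbis product.

```python
# Hypothetical lookup table: the codes are placeholders, NOT real MedDRA codes.
TERM_TO_CODE = {
    "headache": "CODE-0001",
    "nausea": "CODE-0002",
    "dizziness": "CODE-0003",
}

# Rows as they might come out of a table parser: (SOC, frequency, cell text).
rows = [
    ("Nervous system disorders", "Very common", "Headache"),
    ("Nervous system disorders", "Common", "Dizziness, somnolence"),
    ("Gastrointestinal disorders", "Common", "Nausea"),
]


def map_side_effects(rows, term_to_code):
    """Split each cell into terms and attach SOC, frequency, and code."""
    mapped = []
    for soc, frequency, cell in rows:
        for term in cell.split(","):
            key = term.strip().lower()
            if not key:
                continue  # skip empty fragments such as trailing commas
            mapped.append({
                "term": key,
                "soc": soc,
                "frequency": frequency,
                "code": term_to_code.get(key),  # None marks an unmapped term
            })
    return mapped


mapped = map_side_effects(rows, TERM_TO_CODE)
for entry in mapped:
    print(entry)
```

Terms without a dictionary entry keep a code of `None`, so they can be routed to manual review rather than silently dropped. A real solution would also need to handle synonyms, spelling variants, and tables with differing layouts, which an exact-match lookup like this one does not.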

If a text mining solution is able to fulfill these needs with high precision, it will save pharmaceutical companies a lot of time and money. It creates a consistent and homogenized set of product master data, permitting meaningful analysis. This delivers new insights in areas including post-marketing surveillance, competitor analysis, supply chain, and sales/marketing. Expenses associated with managing, integrating, maintaining, and reconciling data across functions and sites are reduced.

Averbis offers solutions that enable organizations to comply with IDMP, within the given timeframe, and to leverage data assets. Please contact us to find out more.

Information Discovery: New version is available!

Information Discovery is a next-generation text analytics platform that allows you to gain insights into your unstructured data and explore important information in the most flexible way. Information Discovery collects and analyzes all kinds of documents, such as patents, research literature, databases, websites, and enterprise-internal repositories.

Information Discovery is a completely new version of the Averbis text mining and search technologies.

New features of Information Discovery Version 4.4 at a glance:

  • New! Significantly improved user interface for operation and configuration
  • New! Intuitive query builder for compiling complex queries
  • New! Available with a multilingual interface
  • New! Manage multiple projects within one instance
  • New! Comprehensive user and rights management
  • New! Support for single sign-on (SSO)
  • New! Extended document import and export with various new formats
  • Improved help functions
  • Latest cutting-edge technologies such as Spring 4, JPA 2, AngularJS, Bootstrap, HTML5, and a consistent REST API