Conducting informatics research is complex and time-consuming, and so is implementing and updating a solution to meet varying needs
Scientists in biopharma companies need to search for and analyze relevant concepts and entities in depth across a variety of public and internal information sources. They end up spending hours or days finding the related contextual information because data formats differ, vocabularies vary, and information is scattered across sources.
A semantic enrichment solution provides a framework to address this challenge by identifying and associating data, definitions, and contexts across large volumes of unstructured, heterogeneous content drawn from different sources.
Key challenges in implementing a semantic enrichment solution:
- Developing the many solution components, from data wrangling to entity extraction to knowledge presentation, is complex
- The solution must be continually updated to support different data sources and formats, upgraded vocabularies, and changing user interfaces
- Existing biomedical vocabulary databases have insufficient coverage (for example, UMLS falls short on preclinical records)
- Sourcing and wrangling data, converting formats, and extracting specific fields based on each source's conventions (e.g., press releases, blog posts)
- Semantic enrichment activities including extracting and resolving entities, associating them based on ontologies, and leveraging vocabulary databases
- Advanced NLP and AI-driven classification, ranking, and summarization
- Writing results to different search-friendly databases: full-text, graph, relational, and document databases
- Developing a user interface for search and exploration of entities and knowledge
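
To make the entity extraction and resolution step more concrete, here is a minimal sketch, not a production design. It assumes a tiny hand-made vocabulary in which several surface forms map to one canonical identifier; the identifiers (`DRUG:0001`, `DISEASE:0001`), the sample documents, and the function names are all hypothetical and do not come from UMLS or any real vocabulary database. A real solution would likely replace the dictionary lookup with a trained NLP model and the in-memory index with a search-friendly database.

```python
import re
from collections import defaultdict

# Hypothetical toy vocabulary: surface form -> (canonical id, preferred label).
# These ids are made-up placeholders, not real ontology identifiers.
VOCABULARY = {
    "aspirin": ("DRUG:0001", "aspirin"),
    "acetylsalicylic acid": ("DRUG:0001", "aspirin"),
    "myocardial infarction": ("DISEASE:0001", "myocardial infarction"),
    "heart attack": ("DISEASE:0001", "myocardial infarction"),
}


def build_pattern(vocabulary):
    """Compile one case-insensitive regex that matches any vocabulary term."""
    # Try longer terms first so multi-word terms win over shorter ones.
    terms = sorted(vocabulary, key=len, reverse=True)
    alternation = "|".join(re.escape(term) for term in terms)
    return re.compile(r"\b(" + alternation + r")\b", re.IGNORECASE)


PATTERN = build_pattern(VOCABULARY)


def extract_entities(text):
    """Find vocabulary terms in text and resolve each to its canonical id."""
    entities = []
    for match in PATTERN.finditer(text):
        canonical_id, label = VOCABULARY[match.group(1).lower()]
        entities.append({
            "start": match.start(),
            "end": match.end(),
            "text": match.group(1),
            "id": canonical_id,
            "label": label,
        })
    return entities


def build_index(documents):
    """Build an inverted index: canonical id -> sorted list of document ids."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for entity in extract_entities(text):
            index[entity["id"]].add(doc_id)
    return {cid: sorted(doc_ids) for cid, doc_ids in index.items()}


# Hypothetical sample documents standing in for wrangled source content.
documents = {
    "press-release-1": "Acetylsalicylic acid reduced the risk of heart attack.",
    "blog-post-1": "Aspirin is widely used.",
}
index = build_index(documents)
```

Because synonyms resolve to the same identifier, a search for the hypothetical `DRUG:0001` finds both documents even though only one of them uses the word "aspirin". This is the kind of association across varying vocabularies that the enrichment step is meant to provide.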