Semantic Enrichment
A low-code platform for quickly building semantic enrichment pipelines and metadata, enabling faster and deeper informatics research
Scientists in biopharma companies need to search and analyze relevant concepts and entities in depth across a variety of public and internal information sources. They end up spending hours or days finding related contextual information because data formats differ, vocabularies vary, and information is scattered across sources.
A semantic enrichment solution provides a framework to address this challenge by identifying and associating data, definitions, and contexts across large volumes of unstructured, heterogeneous content from different sources.
Key challenges in implementing a semantic enrichment solution:
- Developing the many solution components, from data wrangling to entity extraction to knowledge presentation, is complex
- The solution must be updated continually to support new data sources and formats, upgraded vocabularies, and changing user interfaces
- Existing biomedical vocabulary databases have insufficient coverage (for example, UMLS falls short on preclinical records)
Our approach uses a low-code, configuration- and declaration-driven solution for every stage, from extracting data to presenting knowledge, instead of developing a one-off custom solution for each use case. The stages covered are:
- Sourcing and wrangling data, converting formats, and extracting specific fields based on the conventions of each source type (e.g., press releases, blog posts)
- Semantic enrichment activities including extracting and resolving entities, associating them based on ontologies, and leveraging vocabulary databases
- Advanced NLP and AI-driven classification, ranking, and summarization
- Storing/writing results into different search-friendly databases – full-text, graph, relational DBs, document DBs, etc.
- Developing a user interface for search and exploration of entities and knowledge
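To make the idea of a declaration-driven pipeline concrete, here is a minimal sketch in which the pipeline is plain data and each step names a registered component. The registry, the component names (`strip_whitespace`, `tag_terms`), and the `use`/`with` keys are all hypothetical and chosen only for illustration; they are not the platform's actual configuration format.

```python
from typing import Callable, Dict, List

Record = Dict[str, object]

# Hypothetical component registry; names are illustrative assumptions.
REGISTRY: Dict[str, Callable[..., Record]] = {}


def component(name: str):
    """Register a processing step under a name usable from configuration."""
    def wrap(fn):
        REGISTRY[name] = fn
        return fn
    return wrap


@component("strip_whitespace")
def strip_whitespace(record: Record, field: str) -> Record:
    # Collapse runs of whitespace in the given field.
    return {**record, field: " ".join(str(record[field]).split())}


@component("tag_terms")
def tag_terms(record: Record, field: str, terms: List[str]) -> Record:
    # Naive substring tagging, standing in for real entity extraction.
    text = str(record[field]).lower()
    return {**record, "entities": [t for t in terms if t.lower() in text]}


# The pipeline is declared as data: an ordered list of step declarations.
PIPELINE = [
    {"use": "strip_whitespace", "with": {"field": "body"}},
    {"use": "tag_terms", "with": {"field": "body", "terms": ["aspirin", "EGFR"]}},
]


def run_pipeline(record: Record, steps: List[dict]) -> Record:
    # Look up each declared step in the registry and apply it in order.
    for step in steps:
        record = REGISTRY[step["use"]](record, **step.get("with", {}))
    return record


result = run_pipeline({"body": "  Aspirin   inhibits\nCOX-1. "}, PIPELINE)
```

Under this design, supporting a new source or use case would mostly mean editing the declaration list rather than writing new code, which is the property the low-code approach relies on.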
Highlights
Key components & strengths
Low-code Platform and Declarative Pipeline
A framework for creating adaptable, scalable low-code solutions, with code reuse across data formats, data types, and use cases
Library of reusable data processing components
Basic data wrangling (e.g. HTML parser), File format conversion (e.g. PDF to text), and Advanced components (e.g. document classification)
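As an illustration of what a basic wrangling component might look like, the following sketch extracts visible text from HTML using only Python's standard `html.parser` module. The class and function names (`TextExtractor`, `html_to_text`) are assumptions made for this example, not components of the platform.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect visible text, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Keep only non-empty text that is outside script/style blocks.
        if self._skip_depth == 0 and data.strip():
            self._chunks.append(data.strip())

    def text(self) -> str:
        return " ".join(self._chunks)


def html_to_text(html: str) -> str:
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.text()


sample = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<h1>Press Release</h1><p>Drug X enters <b>Phase II</b> trials.</p>"
    "</body></html>"
)
plain = html_to_text(sample)
```

A component like this stays reusable because it knows nothing about the source it is applied to; field-level conventions for press releases or blog posts would live in configuration.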
Semantic Enrichment Solution Components
Entity extraction, entity resolution, ontology association, classification, and ranking, in addition to leveraging MetaMap and Aganitha’s MDM
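The difference between extraction and resolution can be shown with a small dictionary-based sketch: mentions are found in text, then mapped to a canonical name and type. This is a deliberately simplified stand-in and does not call MetaMap or any MDM service; the vocabulary entries and the `extract_entities` function are assumptions made for illustration.

```python
import re
from typing import Dict, List, Tuple

# Toy vocabulary: surface form -> (canonical name, entity type).
# Entries are illustrative assumptions, not drawn from UMLS or any MDM.
VOCAB: Dict[str, Tuple[str, str]] = {
    "aspirin": ("aspirin", "drug"),
    "acetylsalicylic acid": ("aspirin", "drug"),
    "egfr": ("EGFR", "gene"),
    "epidermal growth factor receptor": ("EGFR", "gene"),
}

# Longest surface forms first so multi-word terms are tried before shorter ones.
_PATTERN = re.compile(
    r"\b("
    + "|".join(re.escape(k) for k in sorted(VOCAB, key=len, reverse=True))
    + r")\b",
    re.IGNORECASE,
)


def extract_entities(text: str) -> List[dict]:
    """Find vocabulary mentions and resolve each to its canonical entry."""
    entities = []
    for match in _PATTERN.finditer(text):
        canonical, entity_type = VOCAB[match.group(1).lower()]
        entities.append(
            {"mention": match.group(1), "canonical": canonical, "type": entity_type}
        )
    return entities


found = extract_entities(
    "Acetylsalicylic acid and aspirin were tested against EGFR."
)
```

Here two different mentions resolve to the same canonical drug, which is what allows records from different sources to be associated later in the pipeline.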
Enriched Data Storage in different databases
Integrations for writing results to full-text search engines, CMSs, and relational, document, and graph databases, enabling multi-faceted search
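As one example of a storage target, the sketch below writes enriched records to a relational database using Python's built-in `sqlite3` module and then queries by entity. The two-table schema and the `store` helper are assumptions made for illustration; a real deployment would choose its own schema and database.

```python
import sqlite3

# In-memory database for demonstration; the schema is a hypothetical example.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE records (id INTEGER PRIMARY KEY, source TEXT, body TEXT)"
)
conn.execute(
    "CREATE TABLE entities ("
    "record_id INTEGER REFERENCES records(id), type TEXT, canonical TEXT)"
)


def store(conn, source, body, entities):
    """Insert one record and its resolved entities; return the record id."""
    cur = conn.execute(
        "INSERT INTO records (source, body) VALUES (?, ?)", (source, body)
    )
    record_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO entities (record_id, type, canonical) VALUES (?, ?, ?)",
        [(record_id, e["type"], e["canonical"]) for e in entities],
    )
    conn.commit()
    return record_id


store(
    conn,
    "press_release",
    "Aspirin reduces stroke risk.",
    [
        {"type": "drug", "canonical": "aspirin"},
        {"type": "disease", "canonical": "stroke"},
    ],
)
store(
    conn,
    "blog",
    "EGFR inhibitors in trials.",
    [{"type": "gene", "canonical": "EGFR"}],
)

# Find the sources of all records that mention a given canonical entity.
rows = conn.execute(
    "SELECT r.source FROM records r "
    "JOIN entities e ON e.record_id = r.id "
    "WHERE e.type = ? AND e.canonical = ?",
    ("drug", "aspirin"),
).fetchall()
```

Keeping entities in their own table, keyed by canonical name, is what makes the same data usable from several search backends.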
Intuitive, Semantic, and Configurable UI
Dynamic view of the processed records that lets users search and analyze information across sources by slicing and dicing the facets produced by the pipeline
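Slicing and dicing by facet can be sketched as two operations: counting records per facet value and filtering records by a chosen value. The record layout and the helper names (`facet_counts`, `filter_by`) below are assumptions made for this example.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical enriched records, as a pipeline might emit them.
records: List[Dict] = [
    {"source": "press_release", "entities": {"drug": ["aspirin"], "disease": ["stroke"]}},
    {"source": "blog", "entities": {"drug": ["aspirin", "ibuprofen"]}},
    {"source": "press_release", "entities": {"drug": ["ibuprofen"]}},
]


def facet_counts(recs: List[Dict], entity_type: str) -> Counter:
    """Count how many records mention each value of the given entity type."""
    counts: Counter = Counter()
    for rec in recs:
        # Use a set so a value repeated within one record is counted once.
        counts.update(set(rec["entities"].get(entity_type, [])))
    return counts


def filter_by(recs: List[Dict], entity_type: str, value: str) -> List[Dict]:
    """Keep only records that mention the selected facet value."""
    return [r for r in recs if value in r["entities"].get(entity_type, [])]


drug_facet = facet_counts(records, "drug")
aspirin_records = filter_by(records, "drug", "aspirin")
source_facet = Counter(r["source"] for r in aspirin_records)
```

After a user selects "aspirin", recomputing the remaining facets over `aspirin_records` gives the narrowed counts that a faceted search interface would display.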
Variety of entities across different data sources
Extract entities like drugs, proteins, genes, diseases, enzymes, organizations, etc. from several public sources and data from internal repositories
Outcomes
Faster informatics research across use cases and easier solution implementation and updates
Enhanced discovery
Of entities and knowledge through proper association between data from different sources, contexts and definitions
Quicker to insights
Across multiple processes – competitive intelligence, drug discovery and development, drug safety etc.
Productive user experience
Through an e-commerce-style experience for searching and exploring entities and information by slicing and dicing relevant facets
Faster on-boarding and updates
Leveraging plug-and-play components available across the pipeline and a configuration-driven low-code platform
Wider Vocabulary
With Aganitha’s continually expanding MDM (Master Data Management) database complementing MetaMap (UMLS)
Acts as an add-on accelerator
For AI/ML-driven knowledge graphs, computational biology, and computational chemistry solutions