Inspiration

Genetic disorders and diseases affect millions of people across the planet. Genome data analysis can give insights into genetic variation and its implications for individual health. Despite advances in the field, analysts must still search for and read hundreds of research papers when analyzing genomes. This is a demanding task that takes genomic data analysts hours, often creating huge backlogs that can delay patient results for months.

What it does

GenSearch is a tool that takes an analyst’s query, aggregates hundreds of research articles, and automatically reads them to extract relevant information and prioritize the results.

How we built it

When an analyst enters a gene name such as FOXP2, the tool automatically searches multiple academic databases such as Google Scholar, bioRxiv, PubMed, and medRxiv, removing duplicates along the way. Then, by leveraging existing databases of diseases and genes, we use deep learning and natural language processing to decipher which diseases the studies refer to, even when they are not explicitly mentioned in the abstract. GenSearch also uses named entity extraction to supplement the metadata of the gene-related studies. This adds essential information for the analyst, such as the genes involved in the study, the diseases mentioned, mutation types, and species.
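To make the duplicate-removal step concrete, here is a minimal sketch of how results from several databases could be merged. It is an illustration, not our actual script: the record fields (`title`, `doi`, `source`), the function names, and the sample records with placeholder DOIs are all assumptions made for the example.

```python
import re


def normalize_title(title):
    """Lowercase a title and collapse punctuation so near-identical titles compare equal."""
    return re.sub(r"[^a-z0-9]+", " ", title.lower()).strip()


def merge_results(*result_lists):
    """Merge per-database result lists, keeping the first copy of each study.

    Two records count as the same study if they share a DOI or a normalized title.
    The record layout (dicts with 'title' and optional 'doi') is hypothetical.
    """
    seen = set()
    merged = []
    for results in result_lists:
        for record in results:
            keys = {"title:" + normalize_title(record["title"])}
            doi = (record.get("doi") or "").lower()
            if doi:
                keys.add("doi:" + doi)
            if keys & seen:
                continue  # already returned by an earlier database
            seen |= keys
            merged.append(record)
    return merged


# Made-up sample records; the DOIs are placeholders, not real identifiers.
pubmed_hits = [
    {"title": "FOXP2 and speech disorders", "doi": "10.0000/example-1", "source": "PubMed"},
]
scholar_hits = [
    {"title": "FOXP2 and Speech Disorders.", "doi": None, "source": "Google Scholar"},
    {"title": "FOXP2 expression in songbirds", "doi": "10.0000/example-2", "source": "Google Scholar"},
]
merged = merge_results(pubmed_hits, scholar_hits)
```

Matching on either DOI or normalized title is a deliberately simple choice: it catches the common case where one database omits the DOI, at the cost of occasionally merging two different papers that share a title.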

Challenges we ran into

API calls were limited, so although the script worked, we could not get results from all databases. It also took us a long time to figure out named-entity extraction from abstracts.

Accomplishments that we're proud of

Eventually, we used a combination of list matching and Watson-derived keywords, and ranked them by disease, gene, and animal species used in the study (if any). We managed to obtain results for queries with gene names.
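The list-matching half of that approach can be sketched roughly as follows. This is only an illustration under stated assumptions: the three vocabularies are tiny hypothetical stand-ins for the real gene and disease lists, the scoring rule is one possible way to rank by gene, disease, and species, and the Watson keyword step is left out entirely.

```python
import re

# Tiny hypothetical vocabularies; real lists would come from gene and disease databases.
GENES = {"FOXP2", "CNTNAP2"}
DISEASES = {"speech-language disorder", "autism"}
SPECIES = {"human", "mouse", "zebra finch"}


def find_terms(text, vocabulary):
    """Return the vocabulary terms that occur in the text as whole words (case-insensitive)."""
    lowered = text.lower()
    return sorted(
        term
        for term in vocabulary
        if re.search(r"\b" + re.escape(term.lower()) + r"\b", lowered)
    )


def annotate(abstract):
    """Tag an abstract with the genes, diseases, and species it mentions."""
    return {
        "genes": find_terms(abstract, GENES),
        "diseases": find_terms(abstract, DISEASES),
        "species": find_terms(abstract, SPECIES),
    }


def rank(abstracts, query_gene):
    """Order abstracts: query gene mentioned first, then more diseases, then more species.

    This ordering is an assumption for the sketch, not the exact ranking GenSearch uses.
    """
    annotated = [(abstract, annotate(abstract)) for abstract in abstracts]

    def score(item):
        tags = item[1]
        return (query_gene in tags["genes"], len(tags["diseases"]), len(tags["species"]))

    return sorted(annotated, key=score, reverse=True)


# Made-up abstracts for illustration only.
abstracts = [
    "Zebra finch song learning is studied here.",
    "FOXP2 mutations cause a speech-language disorder in human families.",
    "FOXP2 expression in mouse brain.",
]
ranked = rank(abstracts, "FOXP2")
```

Plain list matching only finds diseases that are named outright in the abstract, so it would miss studies that imply a disease without naming it.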

What we learned

A LOT! We learned how to query multiple APIs at the same time, create a search tool UI, map genes to associated diseases using lists, and call Watson to read the abstracts returned by our algorithms.

What's next for GenSearch

In the future, we will incorporate a feedback system that lets users rate results so we can improve our algorithms. GenSearch will not only save analysts time but also save lives, so let's get gensearching!
