September 29, 2020 – An artificial intelligence-driven program could help scientists without specialized expertise conduct big data analytics research, according to a study published in Cancer Cell.
The technologies used in modern biomedical research generate large, complex datasets that offer information about patients, animal models, or cell lines. These research efforts may include studying an organism's complete genetic information (genomics), gene expression, or protein expression.
Because these datasets are so complex, it is often challenging for researchers to answer specific biological questions without specialized analytical approaches. Scientists usually perform these analyses using computer scripts written in one of several programming languages, which requires some understanding of both programming and bioinformatics.
While bioinformaticians can help navigate and process these complex datasets, the work is time-consuming. To overcome this issue, researchers from the University of Texas MD Anderson Cancer Center developed an open-access, AI-driven program called DrBioRight, designed to lower barriers for all researchers.
The tool enables researchers to more easily conduct routine analyses of their own data through a user-friendly chat interface with natural language interactions. DrBioRight allows users to ask questions of the program as if they were speaking naturally and not in complex programming terms.
“We felt that we could improve the current model for conducting routine bioinformatics analysis and greatly speed up turnaround time by creating a tool that any researcher could use,” said Han Liang, PhD, professor of Bioinformatics and Computational Biology. “Our long-term goal for DrBioRight is to be an intelligent collaborator for every researcher.”
The tool is now freely available to academic researchers. The program has a number of ready-built tools to handle the most common types of bioinformatics questions, and includes some of the most frequently used public cancer datasets, such as The Cancer Genome Atlas and the Cancer Cell Line Encyclopedia.
To test the DrBioRight approach, researchers replicated the analysis of a cancer genomics paper using the tool and found that it accurately reproduced the previously published results.
Because the tool is driven by AI, it can learn from each inquiry and improve its analyses, becoming more useful over time. In the future, the researchers hope to further improve the tool, let users analyze their own datasets, and allow open development of new modules.
Programs like DrBioRight have become increasingly important in biomedical research. The massive amounts of data needed to generate actionable, significant insights can easily become overwhelming for researchers, who may not have the tools required to properly analyze these datasets.
“Getting genetic data to physicians at the point of care remains a huge challenge,” Joel Diamond, MD, adjunct associate professor of biomedical informatics at the University of Pittsburgh, told HealthITAnalytics.
“A lot of organizations have all their genetic data in a PDF or Word-based report. To make things more complicated, the language in those reports is often written in very scientific terminology, and it can be difficult to understand.”
“Additionally, adding genetic data into clinical workflows could increase fatigue and burnout, but to an even greater extent, because physicians themselves don’t feel that their knowledge base or their data infrastructure has kept up with some of the changes in genetics,” Diamond added.
With DrBioRight, the MD Anderson team aims to help researchers take advantage of the large amounts of data generated by modern research methods.
“As we work to improve the program, we also want to enable other bioinformaticians to contribute their algorithms and teach DrBioRight,” said Liang. “Involvement from the entire research community will help to create a tool that is useful in answering complex research questions more efficiently.”