
March 20, 2024 | Tom Ballard

AI delivers major win in fight for improved cancer treatments and diagnoses

The project involved ORNL researchers working with scientists at Louisiana State University in a partnership with the National Cancer Institute.

Artificial intelligence (AI) has delivered a major win for pathologists and researchers in the fight for improved cancer treatments and diagnoses.

In partnership with the National Cancer Institute (NCI), researchers from Oak Ridge National Laboratory (ORNL) and Louisiana State University developed a long-sequence AI transformer capable of processing millions of pathology reports to provide experts researching cancer diagnoses and management with substantially more accurate information on cancer reporting.

“Our goal is trying to see if we can automate the process of extraction of specific cancer site information from these pathology reports and make it into structured data for nation level cancer incidence reporting,” said Mayanka Chandrashekar, a Research Scientist in the Computational Sciences and Engineering Division at ORNL.

The team’s work was recently published in Clinical Cancer Informatics.

AI transformer models are trained on large amounts of data and “transform” those data into information that is useful and digestible to scientists. Using the secure CITADEL framework on the Oak Ridge Leadership Computing Facility’s Summit supercomputer, with support from the Exascale Computing Project and the Modeling Outcomes Using Surveillance Data and Scalable Artificial Intelligence for Cancer (MOSSAIC) program, researchers at ORNL used the specialized transformer model to process 2.7 million cancer pathology reports. This model, known as Path-BigBird, pulls data from six Surveillance, Epidemiology, and End Results (SEER) cancer registries.

The NCI’s SEER program is an authoritative source of information on cancer incidence and survival in the United States. SEER currently collects and publishes cancer incidence and survival data from population-based cancer registries covering approximately 48% of the U.S. population.

“We wanted to build a language model where we could ask, ‘Can we build something that will understand the language of pathology and help us to create predictive modeling or information extraction models which will basically extract cancer site, subsite, and other key details out of pathology reports?’” Chandrashekar said.

Currently, these cancer registries are updated by hand, leaving a two-year gap between cancer incidence and its reporting. If the cancer rate increases nationally, researchers have to wait two years before recognizing the area of concern.

By effectively processing the information from millions of pathology reports, Path-BigBird has the potential to increase the speed and accuracy of pathology information extraction, outperform traditional deep learning approaches at identifying key details such as cancer site and histology, and improve the precision of cancer incidence reporting at a population level.

