AI in Cancer Research: Tumor Phylogenetics

11/13/2021 4:46:17 PM Cancer Center at Illinois

Although cancer was not the initial focus for Mohammed El-Kebir’s research career, he found that computer science offered exciting opportunities to further our understanding of the evolutionary processes of tumor cells.

Written by Cancer Center at Illinois

Artificial intelligence is often employed in the field of cancer genomics, where bits of DNA sequencing data must be identified and further analyzed with statistical, evolutionary, and probabilistic models. “Off-the-shelf” computing tools are useful for many cancer researchers, but Mohammed El-Kebir, Illinois CS professor and Cancer Center at Illinois (CCIL) scientist, is taking these AI applications a step further.

Illinois CS professor Mohammed El-Kebir.

Although cancer was not the initial focus for El-Kebir’s research career, he found that computer science offered exciting opportunities to further our understanding of the evolutionary processes of tumor cells. Ultimately, El-Kebir began to devote his attention to these evolutionary processes and the phylogenetic trees they produce.

“Within a single tumor, you have intra-tumoral heterogeneity — different sets or types of mutations — that are intriguing. Evolution of species can be described in a single tree of life, but with cancer, you get a tree for each tumor,” El-Kebir said. “If you have the capability to code individual trees, you can start mining and identify subtypes — and ultimately, predict things like therapy response, taking precision medicine to the next level.”

These phylogenetic trees can reveal clones and clonal expansions, tracing mutations from single cells to populations of cells that share a lineage. A single tumor can accumulate multiple mutations and multiple clones, leading to intra-tumoral heterogeneity. This creates the risk that a cancer therapy misses some clones during treatment, allowing the disease to progress.

El-Kebir also develops software programs, such as PhySigs and MACHINA, that focus on phylogenetic trees and help researchers make inferences about the mutations on these trees.

PhySigs identifies shifts in mutational signatures, or patterns of mutations, in DNA sequencing data. These shifts contribute toward a final mutational portrait that can explain how or why a mutation occurred. PhySigs was able to identify a specific signature in a lung cancer patient’s mutational clones, which indicated the presence of a mutation in a DNA repair gene.
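To make the idea of a "shift" in mutation patterns concrete, here is a minimal Python sketch. It is not how PhySigs works internally and does not use its API. It only compares the mix of substitution types in two clones with cosine similarity, a common way to compare mutation profiles. The function name `cosine_similarity` and all counts are made up for illustration.

```python
from math import sqrt


def cosine_similarity(a, b):
    """Cosine similarity between two mutation-count profiles (dicts).

    Values near 1 mean the two clones share a similar mix of mutation
    types; lower values suggest the pattern has shifted.
    """
    keys = set(a) | set(b)
    dot = sum(a.get(k, 0) * b.get(k, 0) for k in keys)
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile has no defined direction
    return dot / (norm_a * norm_b)


# Hypothetical counts of the six base-substitution classes in two clones
# of the same tumor. These numbers are invented, not real patient data.
trunk_clone = {"C>A": 40, "C>G": 5, "C>T": 30, "T>A": 6, "T>C": 10, "T>G": 4}
branch_clone = {"C>A": 8, "C>G": 6, "C>T": 55, "T>A": 5, "T>C": 20, "T>G": 3}

similarity = cosine_similarity(trunk_clone, branch_clone)
print(f"profile similarity between clones: {similarity:.2f}")
```

A real signature analysis works with finer-grained mutation categories and reference signature catalogs; the sketch only shows the kind of numerical comparison involved.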

MACHINA, on the other hand, tracks metastatic tumor spread, allowing researchers to infer the sequence of migration from anatomical locations by comparing clones from those sites. That is, MACHINA identifies which clones migrated — and where — revealing whether a metastasis was seeded from the primary tumor or another metastatic site.

Migrations like these also occur in other medical contexts. Cancer researchers have discovered co-migration of clones that seed metastases. A similar pattern appears in viral infections, where multiple pathogens can be co-transmitted in a single event, infecting a patient with multiple variants.

“Parsimony — or Occam’s razor — tells us it is likely the simplest explanation with the fewest number of events: it is much more likely that a clone migrated than it appeared twice, independently, in different locations,” El-Kebir said.
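The parsimony principle El-Kebir describes can be sketched in a few lines of Python. This is a toy illustration, not MACHINA's actual algorithm or interface. It assumes a clone tree whose leaves are labeled with the anatomical site where each clone was observed, and it uses a standard Sankoff-style dynamic program to find the smallest number of site changes (migrations) along the tree edges. The function name `min_migrations`, the tree, and the site labels are all hypothetical.

```python
from math import inf


def min_migrations(children, leaf_site, root, root_site=None):
    """Smallest number of migrations that explains the observed sites.

    children  : dict mapping each internal node to a list of child nodes
    leaf_site : dict mapping each leaf to the site where it was sampled
    root_site : optionally force the root to a site (e.g. the primary tumor)
    """
    sites = sorted(set(leaf_site.values()))

    def cost(node):
        kids = children.get(node, [])
        if not kids:
            # A leaf is free at its observed site and impossible elsewhere.
            return {s: (0 if s == leaf_site[node] else inf) for s in sites}
        kid_costs = [cost(k) for k in kids]
        # For each candidate site s of this node, each child picks its
        # cheapest site t, paying 1 if the edge changes site (a migration).
        return {
            s: sum(min(kc[t] + (s != t) for t in sites) for kc in kid_costs)
            for s in sites
        }

    root_cost = cost(root)
    if root_site is not None:
        return root_cost[root_site]
    return min(root_cost.values())


# Toy tree: one clone stays in the primary tumor, and two sibling clones
# are both observed in the liver.
children = {"root": ["c1", "x"], "x": ["c2", "c3"]}
leaf_site = {"c1": "primary", "c2": "liver", "c3": "liver"}

print(min_migrations(children, leaf_site, "root", root_site="primary"))  # 1
```

In this toy case a single migration of the shared ancestor `x` to the liver explains both liver clones, which is cheaper than assuming each one arrived there independently.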

El-Kebir’s software is primarily used in research labs rather than in direct clinical settings. Precision medicine for cancer patients typically relies on a gene panel rather than whole-genome sequencing, which can take approximately 12 hours to generate.

“There’s a lot of data available, and it is only increasing — and so are the opportunities. But the interpretation of the data is lagging — the tools haven’t caught up yet,” El-Kebir said. “The bottleneck here isn’t the wait for better computers, but the sequencing technology. We will get longer reads and error rates will drop — I foresee this technology will continue to improve. Better data will lead to better methods and better understanding of the composition of individual tumors. We’re getting there.”

El-Kebir’s current focus is the development of methods that enable the estimation of cancer phylogenies from single-cell sequencing data. Specifically, he is addressing the integration of data obtained from the same tumor using distinct single-cell technologies. Another focus is the development of comprehensive evolutionary models for somatic mutations that occur at varying genomic scales. He does this in close collaboration with researchers at the Mayo Clinic.

See the original story from the Cancer Center at Illinois.

This story was published November 13, 2021.