Over the last two years, Illinois CS professor Mohammed El-Kebir has worked with graduate and undergraduate students on a new paper investigating gene sequences in coronaviruses.
Even though it’s been more than two years and much has changed about the COVID-19 pandemic, researchers like Illinois Computer Science professor Mohammed El-Kebir continue to investigate the virus to ensure the medical and scientific communities are better prepared to respond if something similar occurs in the future.
A recent paper from El-Kebir, entitled “Accurate Identification of Transcription Regulatory Sequences and Genes in Coronaviruses,” investigates transcription regulatory sequences (TRSs). The paper states that TRSs play a critical role in discontinuous transcription in coronaviruses.
Ultimately, the work introduces “two problems collectively aimed at identifying these regulatory sequences as well as their associated genes.”
“I think the true impact of this work is that we have developed a new tool at our disposal if the need arises and there’s another pandemic – if it’s going to be another coronavirus,” El-Kebir said. “The work allows scientists to quickly identify these TRS sites as well as the genes of future, yet undiscovered, coronaviruses. This information essentially allows us to classify the virus and accurately place it into the phylogeny of coronaviruses.
“This is a good tool that people can build upon to identify and understand new coronaviruses if and when the need occurs.”
The workgroup included one of El-Kebir’s Illinois CS graduate students, Palash Sashittal, who conducted the bulk of the technical work on the research with Chuanyi Zhang, a graduate student in Electrical & Computer Engineering (ECE). Two CS undergraduate students, Ayesha Kazi and Michael Xiang, and one ECE undergraduate student, Yichi Zhang, contributed to the effort by creating a web interface to make the findings accessible.
Their research began in the fall of 2020, with Chuanyi Zhang and Sashittal’s work in one of El-Kebir’s graduate courses, Introduction to Bioinformatics.
As the paper describes, the group first focused on the “TRS Identification problem of identifying TRS sites in a coronavirus genome sequence with prescribed gene locations.” Their solution, which the group calls CORSID-A, is an algorithm that “solves this problem to optimality in polynomial time.” The group also states that CORSID-A “outperforms existing motif-based methods in identifying TRS sites in coronaviruses.”
Second, the paper demonstrates “for the first time how TRS sites can be leveraged to identify gene locations in the coronavirus genome.”
As a computer scientist whose primary research focus is on combinatorial optimization algorithms in computational biology, El-Kebir’s work was well situated to help in response to the COVID-19 pandemic.
What amazed El-Kebir was the volume of work like his and the way the scientific community came together to produce unprecedented results so quickly. He noted, for example, that the COVID-19 vaccine was developed from scratch with assistance from those in the computational biology field. And the work is not stopping there; scientists are also constantly monitoring to see if the vaccine needs to be modified for new variants.
“It’s a pretty amazing time to be a part of the scientific community right now,” El-Kebir said. “Projects like these and the way we’re sharing them through faster-than-ever information flow are remarkable. I think it’s impressive to see how quickly we came to understand the problem and to see how quickly we responded.
"There has been a tremendous exchange of ideas in all forums that has been inspiring."