CS Faculty Produce a Quicker Way to Analyze Deluge of COVID-19 Related Research
As scientists and clinicians responded to the spread of COVID-19 beginning in December, research outlets were flooded with new information. In response, a group of five Illinois Computer Science professors set out to address two primary problems that arose from this development.
The first problem they noticed was that this influx created more research material than scientists and clinicians could thoroughly review in a timely manner. Second, the quality of these papers dipped, as many preprint manuscripts did not undergo peer review.
“What we can do with PaperRobot during these times is build a bridge between doctors and biologists,” said Illinois CS professor Heng Ji, whose research in this area is supported by DARPA's KAIROS and AIDA Programs. “Doctors are working to find out more about the symptoms of COVID-19, while the biologists are researching the chemical and gene side of the virus. We can build a knowledge graph that can walk them from a particular drug to its connection.”
“The scientific and medical communities can then analyze our graphs and summaries in a short amount of time, rather than reading through tens of thousands of papers.”
The workgroup includes four other Illinois CS faculty, and together they prepared PaperRobot:
- Jiawei Han, Michael Aiken Chair: Fine-grained knowledge extraction, evidence mining
- Bo Li, Assistant Professor: Link prediction, hypothesis verification
- Hanghang Tong, Associate Professor: Hypothesis generation
- ChengXiang Zhai, Donald Biggar Willett Professor in Engineering: Knowledge-driven question answering, interactive information retrieval
The group turned its attention to COVID-19 specifically after monitoring research outlets like PubMed following the outbreak in December.
Ji said that about 300 new papers appear on PubMed on a given day. Since December, a simple keyword search for “coronavirus” has returned about 125 new papers published every day.
Even before the spread of the virus, the rate at which new research appeared exceeded an individual’s capacity to review it. That already indicated the need for a platform like PaperRobot; the spread of COVID-19 only heightened the need.
“Nobody can read 300 new papers every day,” Ji said. “Obviously, at some point this pandemic will be over, but we will still face this knowledge distillation problem.”
To solve issues like this, PaperRobot first finds and collects all the knowledge on a certain topic by focusing on the text of the research papers.
The knowledge graph produced next shows how the elements within this topic interact with each other. PaperRobot then performs link prediction, scouring the available research to see whether previously unconnected subjects might have a hidden connection. Finally, the program uses graph mining to review conflicting or complementary information in order to verify a hypothesis or even generate a new one.
One extra perk of PaperRobot's work is that its results can also connect people from different research communities.
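To make the knowledge graph and link prediction steps concrete, here is a minimal Python sketch. It is not PaperRobot's actual method, which the article does not detail; it assumes a simple common-neighbors heuristic, and every entity name and triple (`drug_A`, `protein_P`, and so on) is a made-up placeholder rather than a real finding.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (head, relation, tail) triples standing in for facts
# extracted from paper text. All names are placeholders, not real findings.
TRIPLES = [
    ("drug_A", "inhibits", "protein_P"),
    ("drug_B", "inhibits", "protein_P"),
    ("drug_A", "targets", "gene_G"),
    ("drug_B", "targets", "gene_G"),
    ("drug_A", "treats", "disease_D"),
    ("drug_C", "targets", "gene_H"),
]


def build_graph(triples):
    """Build an undirected adjacency map, ignoring relation labels."""
    neighbors = defaultdict(set)
    for head, _relation, tail in triples:
        neighbors[head].add(tail)
        neighbors[tail].add(head)
    return neighbors


def predict_links(neighbors):
    """Score each unconnected pair by the number of neighbors it shares."""
    scores = {}
    for a, b in combinations(sorted(neighbors), 2):
        if b in neighbors[a]:
            continue  # already linked, nothing to predict
        shared = neighbors[a] & neighbors[b]
        if shared:
            scores[(a, b)] = len(shared)
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))


graph = build_graph(TRIPLES)
candidates = predict_links(graph)

for (a, b), score in candidates:
    print(f"{a} <-> {b}: {score} shared neighbor(s)")
```

In this toy graph, the top-ranked candidate pairs `drug_A` with `drug_B`, which are never linked directly but share two neighbors. A production system would likely use the relation labels and learned representations rather than raw neighbor counts.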
The influx of research covering COVID-19 makes it even harder for individuals from different communities to find the right counterpart.
“Our knowledge network will also connect you to the original papers and authors,” Ji said. “In essence, this provides them with the best collaborative network possible. For COVID-19, we are trying to build an infrastructure that can help the research community work best together and accelerate the scientific discovery process.”