Zhang, iSE Research Group Secure Place in Meta's 2022 AI4AI Research Program

11/22/2022 Aaron Seidlitz, Illinois CS

Illinois CS professor Lingming Zhang looks forward to the opportunity to engage in this project after years of research focused on ensuring the correctness of Deep Learning (DL) systems.


Deep Learning (DL) is a part of Machine Learning (ML) that has, as Illinois Computer Science professor Lingming Zhang mentioned, “penetrated to almost every corner of modern society.”


DL, which solves problems and builds intelligent solutions, has several applications. It impacts the way we stream television, the way we view advertising, the way we consume our news through aggregation, and much more.

While many see the potential in DL, Zhang’s research in Software Engineering and Programming Languages, and their synergy with ML, focuses on ensuring the correctness of DL systems.

His approach caught the attention of Meta Research, which recently awarded Zhang $50,000 over the next year through the 2022 AI4AI research program.

“There has been a huge body of research focusing on testing, analyzing, and verifying DL models,” Zhang said. “However, there is still limited work targeting the reliability of the emerging tensor compilers, often called DL compilers, which aim to automatically compile high-level tensor computation graphs directly into high-performance binaries for better efficiency, portability, and scalability than traditional DL libraries.

“This new opportunity helps our group further chase our goals – for which I want to take this chance to thank all my students and interns with the Intelligent Software Engineering (iSE) Lab. Without their hard work and effort, this would not be possible.”

The Meta release announcing the winners also explained the primary goal of the research now associated with this effort: “To support our AI workloads at scale in a proactive way, we are adopting a different approach to the design of our AI stack and the infrastructure it runs on. Through this RFP, we aim to partner with academics interested in using AI and ML approaches, such as reinforcement learning, Bayesian modeling, and graph representation learning to automate and improve the whole AI stack: from silicon to AI models’ output.”

Zhang said that his research group’s efforts over multiple years built up to this successful connection with Meta.

He immediately associated this successful proposal with a paper on tensor compiler fuzzing that he published with Illinois CS students Jiawei Liu, Yuxiang Wei, and Yinlin Deng, as well as Sen Yang, a summer intern from Fudan University in China who is now a PhD student at Yale University.

This work, dubbed “Tzer,” introduces “a practical fuzzing engine for the widely used TVM tensor compiler,” according to Zhang. Their experimental results show Tzer “substantially outperforms existing fuzzing techniques on tensor compiler testing.”

However, he said, its connection to the forthcoming Meta Research work derives from the way the group plans to “further leverage ML to effectively navigate through the search space for tensor compiler fuzzing.”

This work also aligns well with the group’s recent research focus on system reliability and fuzzing. To date, the group has detected numerous bugs in real-world systems for Apache, eBay, Google, Meta, Microsoft, NVIDIA, OctoML, Oracle, and Yahoo!

Even more specifically, this Meta Research project becomes part of their efforts on ML system reliability – for which they have already helped developers find 500+ new bugs for popular DL libraries and optimizers/compilers, such as PyTorch, TensorFlow, JAX, OneFlow, TVM, TensorRT, XLA, and ONNXRuntime, in less than two years.

“This proposed research work will produce more powerful learning-based fuzzing techniques for modern tensor compilers, which can be the foundation/basis for future DL systems, applications, and advances,” Zhang said. “The proposed research will not only help find critical bugs/vulnerabilities in such critical DL infrastructures, but also can potentially influence their design and implementation.

“The project will involve a number of our graduate students here at Illinois CS and will also involve Meta developers through remote collaboration and summer internships.”


This story was published November 22, 2022.