Adve, Team Awarded $5.6 Million to Streamline Complex Modern Software

4/14/2018 4:24:35 PM David Mercer, Illinois Computer Science

Professor Vikram Adve
Professor and Interim CS Department Head Vikram Adve will lead a five-year, $5.6 million effort to reduce the complexity and size of modern software systems, using his groundbreaking work creating the LLVM compiler infrastructure as part of the project.

The Office of Naval Research awarded the grant to Adve and two co-PIs: University of Rochester Assistant Professor John Criswell (BS CS ’03, PhD ’14), one of Adve’s former students, and University of Utah Professor John Regehr.

Assistant Professor John Criswell, University of Rochester
ONR made the award through its Total Platform Cyber Protection program, or TPCP. With the award, ONR is seeking advances that improve the ever-growing mass of software used by government systems, allowing that software to be much more efficient, compact and secure.

“The Navy, when it comes to shipboard systems, they have enormous amounts of code that have to run on the computers on a ship as well as even greater volumes of software that’s run on land in control centers,” Adve said. “Much of this software is built on top of open-source commodity software. Because the practices that enable us to develop large software systems today are also really inefficient, they face really large costs in developing, maintaining, and testing their software.”

New Work Will Allow Exploration of a Key Benefit of LLVM

When Professor and Interim CS Department Head Vikram Adve and his co-creators released LLVM in 2003, one intriguing piece was never fully explored. The grant from the Office of Naval Research could change that, Adve says.

LLVM’s lifelong optimization capability allows software to be analyzed and transformed at any time before or after shipping to end-users, unlike much software today, which must be frozen prior to shipping.

LLVM was released as an open source compiler infrastructure and then, as Adve says, “sort of took on a life of its own.”

The lifelong optimization capability of LLVM was used in Apple’s very first commercial product based on LLVM, but most other production uses from a range of companies, including Apple, Google, Intel, Qualcomm and NVIDIA, did not take advantage of it.

The lifelong optimization capability was recently adopted by Apple for their iPhone, iPad, Apple Watch and Apple TV. Software vendors for Apple’s mobile platforms now ship much of their software to Apple in LLVM form, enabling Apple to optimize and run their code on a range of different devices.

But even these uses still only scratch the surface of what might be possible, Adve says.

The project will allow Adve and his co-PIs to make use of one aspect of LLVM that he says was never fully explored – lifelong optimization. This capability allows software to be analyzed and transformed at any time before or after shipping to end-users, unlike much software today, which must be frozen prior to shipping.

Adve says he, his co-PIs and a group of eight to 10 students were already starting to work on the unexplored possibilities of lifelong optimization when ONR called for proposals. Their new project, an outgrowth of LLVM dubbed ALLVM, seemed like a good fit.

“LLVM allows you to do these kinds of late-stage software customizations or optimizations even after shipping the code because you ship it in this richer form,” he said. “What we’re trying to explore now is, what benefits would you get for performance, for security, for reliability if all software on a system, hence the name ALLVM, was available in a form that can be analyzed and optimized by compilers?”

Part of the project already underway involves building a database of all of the open source software Adve and his team can find that can be compiled by Clang, the C and C++ compiler front end for LLVM. Thousands of popular Linux packages are already available in this form.

Adve calls it a Bitcode database, referring to the intermediate form of a program that LLVM uses to analyze and optimize code.

One benefit of a Bitcode database is that it can be combed for duplicate fragments of code within programs and even across programs.

“Within a program you can eliminate redundancy, so you can make the program smaller,” Adve said. “One thing we’re looking at is reusing the common pieces so that when you update a program and ship a new version, you don’t have to have separate complete copies of the old version and the new version installed on a system.” For example, the project can ship nine different versions of a single, widely used database engine in the same space that a single version currently takes.
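The idea of reusing common pieces across versions can be illustrated with a small sketch. This is not the ALLVM implementation; it is a toy model that assumes each version of a program is a set of named function bodies and stores every distinct body only once, keyed by its hash. The names `store_versions`, `stored_size`, `naive_size` and the sample data are hypothetical.

```python
import hashlib


def store_versions(versions):
    """Store each distinct function body once, keyed by its SHA-256 digest.

    `versions` maps a version name to {function_name: body_text}.
    Returns (blobs, manifests):
      blobs     maps digest -> body text (one copy per distinct body)
      manifests maps version -> {function_name: digest}
    """
    blobs = {}
    manifests = {}
    for version, functions in versions.items():
        manifest = {}
        for name, body in functions.items():
            digest = hashlib.sha256(body.encode("utf-8")).hexdigest()
            blobs.setdefault(digest, body)  # keep only the first copy
            manifest[name] = digest
        manifests[version] = manifest
    return blobs, manifests


def stored_size(blobs):
    """Total characters stored once duplicates are shared."""
    return sum(len(body) for body in blobs.values())


def naive_size(versions):
    """Total characters if every version keeps a full private copy."""
    return sum(len(body) for fns in versions.values() for body in fns.values())


# Hypothetical toy data: two versions that differ in a single function.
VERSIONS = {
    "v1": {"parse": "define @parse() { A }", "exec": "define @exec() { B }"},
    "v2": {"parse": "define @parse() { A }", "exec": "define @exec() { B2 }"},
}
BLOBS, MANIFESTS = store_versions(VERSIONS)
```

In this toy example the two versions share the unchanged `parse` body, so three bodies are stored instead of four. A real system would have to compare code at a finer grain and account for differences that do not change behavior.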

Professor John Regehr, University of Utah
Another benefit of ALLVM and the Bitcode database, which co-PI Regehr is exploring, is to use a powerful class of techniques called superoptimizers to convert sequences of code into shorter ones. A third benefit, the focus of co-PI Criswell’s work, is to improve security of the software – and be able to measure the security improvement obtained.
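To give a feel for what a superoptimizer does, here is a minimal sketch under strong simplifying assumptions. It is not the technique used in the project: it models a hypothetical 8-bit machine with one register and five made-up operations, and it searches by brute force for the shortest sequence that behaves identically on all 256 possible inputs. Production superoptimizers work on real instruction sets and typically rely on a solver to prove equivalence rather than exhaustive testing.

```python
from itertools import product

MASK = 0xFF  # values wrap around at 8 bits on this toy machine

# Hypothetical single-register operations.
OPS = {
    "neg": lambda x: (-x) & MASK,
    "not": lambda x: (~x) & MASK,
    "inc": lambda x: (x + 1) & MASK,
    "dec": lambda x: (x - 1) & MASK,
    "shl": lambda x: (x << 1) & MASK,
}


def run(program, x):
    """Apply each operation in `program` to x, in order."""
    for op in program:
        x = OPS[op](x)
    return x


def superoptimize(program):
    """Return the shortest op sequence equal to `program` on every input.

    Candidates are tried in order of increasing length, so the first
    match is a shortest one. If nothing shorter matches, the original
    program is returned unchanged.
    """
    target = [run(program, x) for x in range(256)]
    for length in range(len(program)):
        for candidate in product(OPS, repeat=length):
            if [run(candidate, x) for x in range(256)] == target:
                return list(candidate)
    return list(program)
```

For instance, `superoptimize(["not", "inc"])` returns `["neg"]`, because flipping every bit and adding one is two's-complement negation, and `superoptimize(["inc", "dec"])` returns `[]`, because the two operations cancel out. The search space grows exponentially with sequence length, which is why practical tools limit themselves to short sequences.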

Beyond the work Adve, Regehr, Criswell and their team can manage on their own, the Bitcode database and associated tools will be open source and will be shared with anyone interested in accessing or contributing to them. Adve hopes that will allow other researchers to explore ideas in compilers, software engineering, security and software reliability that he and his partners may not even have considered.