October 12, 2021
Computer vision expertise from David Forsyth, Yuxiong Wang and Alexander Schwing could narrow the gap between human and machine intelligence while reducing the dependence on labeled data.
A team of three researchers from Illinois Computer Science and Electrical & Computer Engineering believes that now is the time to use computer vision techniques to drive the next advance in artificial intelligence.
The National Science Foundation agrees, which is why this group – led by Fulton Watson Copp Chair in Computer Science David Forsyth – recently earned a four-year, $1.2 million grant. Fellow CS professor Yuxiong Wang and ECE professor Alexander Schwing join Forsyth on the project, entitled “Creating Knowledge with All-Novel-Class Computer Vision.”
Together, the group formed three core goals for the next four years. The bulk of the effort centers on developing an effective and accurate classification and detection system based on all-novel-class recognition.
According to Forsyth, the biggest obstacle to widespread use of this computer vision solution is the lack of data – and, more specifically, labeled data – for many users. Big tech companies have the funds to accumulate and label data themselves, but for most that’s not a reality.
The kind of solution Forsyth, Wang and Schwing are working toward would change that.
“The important aim here is democratizing vision practice,” Forsyth said. “In a perfect world, everybody who has a lot of images and a detection or classification problem could use our ideas and continue toward their goals. Fixing that hole is intellectually important, and it’s something that many people aren’t all that motivated to take on.
“Now, the important thing to understand is that this work includes profound technical problems that we have to take on.”
Helping drive progress on a project of this scope are the experts that Forsyth sought out.
Forsyth, Wang and Schwing have already built a rapport, so they understand how their individual strengths support the collaborative spirit needed to tackle this challenge.
Wang joined the Illinois CS faculty a year ago with an established interest in meta-learning, few-shot learning, predictive learning, and streaming perception. Meanwhile, Schwing has held appointments in ECE and CS since 2016, establishing an interest in algorithms for prediction with and learning of non-linear (deep nets), multivariate and structured distributions, and their application in numerous tasks, e.g., 3D scene understanding from a single image.
“To me, computer vision is one of the key paths to understanding human and artificial intelligence,” Wang said. “Much of my research centers on this fundamental question surrounding the gap between human and machine intelligence through a new machine learning paradigm – which aims to mimic the human ability to learn from experience.”
“As a human, we are collaborative in nature. We can do a lot of things at once, because context shifting doesn’t take a lot of time,” Schwing said. “How we achieve that or design machines to achieve that is, I think, one of the important questions that I’m fascinated with through my research. In that area, computer vision provides a core test that we seek to study.”
To achieve the opportunities this NSF funded project provides, the group outlined three core goals.
The first is to train an object detection procedure with all-category data. Traditionally, though, this process takes a lot of data, which is the sticking point to advancement.
“So, the question you have to ask is – if you want to build an all-novel-class detector – can you learn the features it needs without a ton of data?” Forsyth said. “There’s already some evidence that you might be able to do that.”
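One way to picture recognizing novel classes from very little labeled data is a nearest-class-mean baseline: each new class is represented by the average of just a few labeled feature vectors, and queries are assigned to the closest such prototype. The sketch below is illustrative only and assumes features already come from a frozen, pretrained encoder (not shown); it is not the project's actual method.

```python
import numpy as np

def nearest_class_mean(support_feats, support_labels, query_feats):
    """Classify queries by distance to the per-class mean of a few
    labeled 'support' examples -- a common few-shot baseline, assuming
    the features come from a frozen, pretrained encoder."""
    classes = np.unique(support_labels)
    # One prototype per class: the mean of its few support features.
    protos = np.stack([support_feats[support_labels == c].mean(axis=0)
                       for c in classes])
    # Assign each query to the nearest prototype (Euclidean distance).
    dists = np.linalg.norm(query_feats[:, None, :] - protos[None, :, :], axis=2)
    return classes[dists.argmin(axis=1)]

# Toy demo: two well-separated "novel" classes, 3 labeled examples each.
rng = np.random.default_rng(0)
support = np.vstack([rng.normal(0, 0.1, (3, 4)), rng.normal(5, 0.1, (3, 4))])
labels = np.array([0, 0, 0, 1, 1, 1])
queries = np.vstack([rng.normal(0, 0.1, (2, 4)), rng.normal(5, 0.1, (2, 4))])
print(nearest_class_mean(support, labels, queries))  # → [0 0 1 1]
```

The point of the baseline is that all the data-hungry learning happens once, in the shared encoder; a brand-new class costs only a handful of labeled examples.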
The second core goal is a learning procedure that can share training examples across categories widely and effectively without explicit linking of the categories. This work stems from Wang’s previous project in “data hallucination.”
“Previous projects I’ve worked on have involved few-shot learning and data hallucination, and we’re going to try to extend the strategy behind those efforts as a solution to this additional challenge,” Wang said.
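The intuition behind data hallucination can be sketched in a few lines: a handful of real examples is expanded into a larger synthetic training set by sampling plausible variants of them. Published hallucinators learn the variation from data-rich base classes; the plain Gaussian noise below is a stand-in assumption, not the team's technique.

```python
import numpy as np

def hallucinate(feats, n_new, noise=0.2, rng=None):
    """Toy 'data hallucination': expand a few real feature vectors into
    a larger synthetic set by sampling perturbed copies of them.
    (Gaussian noise is an illustrative stand-in for a learned model of
    within-class variation.)"""
    rng = rng if rng is not None else np.random.default_rng(0)
    base = feats[rng.integers(0, len(feats), size=n_new)]  # resample real examples
    return np.vstack([feats, base + rng.normal(0, noise, base.shape)])

real = np.array([[1.0, 2.0], [1.2, 1.9]])      # only two real examples
augmented = hallucinate(real, n_new=8)
print(augmented.shape)  # → (10, 2)
```

A downstream classifier then trains on the augmented set as if the extra examples had been collected for real.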
The third core goal will link learning of early vision tasks – for example recovering shading and lighting from an image – to learning of classification and detection tasks. Their goal is to complete both tasks with very little data.
“I love digging into early vision tasks, attempting to complete them without data,” Forsyth said, explaining that a server in his office is capable of a learning procedure that adjusts an image of a room for various lighting effects. “The thing that is nice about it is that it doesn’t have any ground truth data, meaning you have to build underlying representations that know a great deal about surfaces, and light, and 3D shape – otherwise it will get the picture wrong.”
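The flavor of learning without ground truth can be illustrated with a toy intrinsic-image split. There are no "correct" shading or albedo labels to compare against; the only available check is the reconstruction constraint Forsyth describes, namely that the recovered pieces multiply back to the input image. The box-blur smoothness prior here is purely illustrative.

```python
import numpy as np

def decompose(image, blur_passes=3):
    """Toy intrinsic-image split with no ground truth: treat a heavily
    blurred copy of the image as 'shading' (slowly varying lighting)
    and the ratio image/shading as 'albedo'. The only supervision
    signal is that the two pieces must reconstruct the input."""
    shading = image.astype(float)
    for _ in range(blur_passes):  # crude box blur as a smoothness prior
        shading = (shading
                   + np.roll(shading, 1, 0) + np.roll(shading, -1, 0)
                   + np.roll(shading, 1, 1) + np.roll(shading, -1, 1)) / 5.0
    albedo = image / np.maximum(shading, 1e-6)
    return albedo, shading

img = np.random.default_rng(0).uniform(0.1, 1.0, (8, 8))
albedo, shading = decompose(img)
# Reconstruction is the self-supervision signal:
print(np.allclose(albedo * shading, img))  # → True
```

A learned system replaces the fixed blur with a network, but the training signal stays the same: the decomposition must explain the picture, which forces the representation to capture surfaces, light, and shape.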
Schwing said that previous work he’s engaged in ties to Forsyth’s expertise in this area and the group’s third core goal.
“Semi-supervised learning is a focus of mine that ties into this work, because it’s primarily about learning classification algorithms with as little data as possible,” Schwing said.
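A minimal illustration of the semi-supervised idea is pseudo-labeling: fit a model on the small labeled set, then adopt unlabeled points as extra training data only when the model is confident about them. The nearest-centroid model and margin threshold below are illustrative assumptions, not the methods under development here.

```python
import numpy as np

def pseudo_label(labeled_x, labeled_y, unlabeled_x, threshold=2.0):
    """Pseudo-labeling sketch: fit class centroids on a small labeled
    set, then keep only unlabeled points that sit confidently close to
    one centroid (a clear distance margin over the runner-up)."""
    classes = np.unique(labeled_y)
    cents = np.stack([labeled_x[labeled_y == c].mean(axis=0) for c in classes])
    d = np.linalg.norm(unlabeled_x[:, None] - cents[None], axis=2)
    order = np.sort(d, axis=1)
    confident = (order[:, 1] - order[:, 0]) > threshold  # clear margin
    return unlabeled_x[confident], classes[d[confident].argmin(axis=1)]

rng = np.random.default_rng(1)
lx = np.array([[0.0, 0.0], [6.0, 6.0]]); ly = np.array([0, 1])
ux = np.vstack([rng.normal(0, 0.3, (5, 2)), rng.normal(6, 0.3, (5, 2)),
                np.array([[3.0, 3.0]])])  # last point is ambiguous
px, py = pseudo_label(lx, ly, ux)
print(len(px))  # → 10 (the ambiguous midpoint is rejected)
```

Real systems iterate this loop with a learned classifier, but the principle is the same: unlabeled images supply most of the training signal, so very few labels are needed.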
Will the project progress exactly as the grant proposes? While that’s impossible to tell, the group's excitement is palpable.
“You know the old saying, ‘A man’s reach should exceed his grasp, or else what were the stars for?’ I think that applies here,” Forsyth said. “The scope of this project is very difficult to predict. Oftentimes the advances you believe you are going to make later shift into something a bit different.”
What remains clear, however, is that the collaborative nature of the faculty and students involved will produce an energy dedicated to progress.
“I’ve come to understand our work as very collaborative, collegial and supportive. These are three terms that make it so much fun here at the University of Illinois Urbana-Champaign,” Schwing said. “This project will be no different, as we have amazing faculty support and student capability.”