Distinguished Lecture Series: Dr. William T. Freeman

1/29/2017 3:17:34 PM

As part of the CS @ ILLINOIS Distinguished Lecture Series, Dr. William T. Freeman, the Thomas and Gerd Perkins Professor of EECS at MIT, will discuss a learning-through-interaction algorithm that predicts audio features from silent videos. The lecture will take place at 11 a.m. on January 30 in 2405 Siebel Center.

Visually Indicated Sounds

Children may learn about the world by pushing, banging, and manipulating things, watching and listening as materials make their distinctive sounds: dirt makes a thud; ceramic makes a clink. These sounds reveal physical properties of the objects, as well as the force and motion of the physical interaction.

We have explored a toy version of that learning-through-interaction by recording audio and video while we hit many things with a drumstick. We developed an algorithm that predicts sounds from silent videos of the drumstick interactions. The algorithm uses a recurrent neural network to predict sound features from videos and then produces a waveform from these features with an example-based synthesis procedure. We demonstrate that the sounds generated by our model are realistic enough to fool participants in a "real or fake" psychophysical experiment, and that the task of predicting sounds allows our system to learn about material properties in the scene.

Joint work with Andrew Owens, Phillip Isola, Josh McDermott, Antonio Torralba, and Edward H. Adelson, which appeared in CVPR 2016.

Bio: William T. Freeman is the Thomas and Gerd Perkins Professor of Electrical Engineering and Computer Science at MIT, and a member of the Computer Science and Artificial Intelligence Laboratory (CSAIL) there.  He was the Associate Department Head from 2011 to 2014.

His current research interests include machine learning applied to computer vision, Bayesian models of visual perception, and computational photography. He received outstanding paper awards at computer vision or machine learning conferences in 1997, 2006, 2009 and 2012, and test-of-time awards for papers from 1990 and 1995.  Previous research topics include steerable filters and pyramids, orientation histograms, the generic viewpoint assumption, color constancy, computer vision for computer games, and belief propagation in networks with loops.

He is active in the program or organizing committees of computer vision, graphics, and machine learning conferences.  He was the program co-chair for ICCV 2005, and for CVPR 2013.