Illinois CS Vision Group Provides Leadership in a Rapidly Growing Field
In less than a decade, driven by the deep learning revolution, computer vision has boomed and begun to transform photography, social and interactive computing, manufacturing, health care, construction, agriculture, and other fields.
Professor David Forsyth and Associate Professors Derek Hoiem and Svetlana Lazebnik lead Illinois Computer Science’s Vision Group and play, as Forsyth says, a significant role in the academic computer vision community.
“A decent fraction of the establishment is sitting in this room,” Forsyth said during an interview with the three of them in his office. “The concentration of people here is most unusual. Most universities have zero or one people in the vision establishment. We have three.”
Between them, they’re responsible for a large and growing network of graduates who hold Illinois Computer Science PhDs and exert their own influence on how computer vision is being used and the research that is driving it.
The computer vision faculty’s own research covers a wide range: Forsyth’s work spans everything from recognition to image synthesis and manipulation, Lazebnik’s recent work has focused on joint models for image-language understanding, and Hoiem has done groundbreaking work on creating 3D scene models from single photographs.
Through his research, Hoiem has a foot squarely in industry, too. Reconstruct, the startup he cofounded and for which he is CTO, uses computer vision to create 3D models for construction-project management.
The three professors are also taking leadership roles in another key area that brings the field together: conferences. All three have served or are serving in organizing roles for major conferences in 2018 and this year.
Hoiem is a program chair for this year’s IEEE Conference on Computer Vision and Pattern Recognition (CVPR), and Forsyth filled the same role last year. Forsyth is general chair for this year’s International Conference on Computer Vision (ICCV), while Lazebnik is a program chair.
In his role with CVPR last year, Forsyth also was part of a group that put together what he believes is a first-of-its-kind code of conduct. He says the code was created after hearing about female graduate students leaving another conference because they were uncomfortable over attendee behavior.
“That is something we simply cannot have,” he said, explaining that with the code came a formal set of procedures to follow if there were future problems. “We didn’t produce a code of conduct because of flourishing evil behavior in the vision community. We needed to know what to do if something happened.”
The conferences provide a key data point in measuring just how much the field has grown.
More than 6,100 people attended CVPR 2018, roughly triple the 2,000 who attended the 2014 version of the conference and about six times the number who were there in 2009. And the gathering is on track to be even bigger this year: 5,165 papers were submitted by the November deadline, a 56 percent increase over 2018.
“The growth in our conferences has been exponential,” Lazebnik said.
The Origin of Vision’s Growth
An article in Ars Technica traces the explosive growth in computer vision to essentially a single moment in 2012, and Forsyth, Lazebnik, and Hoiem generally agree: the publication of a paper introducing AlexNet, a deep convolutional neural network designed by Alex Krizhevsky and Ilya Sutskever, then PhD students at the University of Toronto. They used AlexNet to significantly outperform their competitors at identifying and classifying objects in the ImageNet competition.
Hoiem remembers exactly where he was when he and others heard the news: at the European Conference on Computer Vision in Florence, Italy.
“I still remember, outside of some cathedral that was part of the event, and everybody was discussing, ‘Does this mean that there’s a huge breakthrough?’ But everybody knew that it was a really significant moment,” Hoiem said.
“That was that kind of very visible moment when they broke through and outperformed the state of the art,” Lazebnik said. “And this was basically the start of the gold rush in deep learning.”
Hoiem added that computer vision’s growth has also been accelerated by the ease with which anyone can experiment with deep neural networks.
“It became possible for a high-schooler to download code from GitHub and train a recognition algorithm and get good results,” he said.
Some students applying for admission to Illinois CS have already completed high school projects that rely on computer vision, projects that would not have been possible even five or six years ago.
And the demand for classes reflects how many students are interested in the field. The introductory graduate-level computer vision class, CS543, had fewer than 50 students in 2012. This spring more than 250 students are enrolled.
In his career, Forsyth has explored the modeling of shading and illumination, human animation, user interfaces, and more. With former Illinois CS Professor Jean Ponce, in 2003 he also co-authored a now-classic computer vision textbook, “Computer Vision: A Modern Approach.” And in 2017, he authored “Probability and Statistics for Computer Science.” Now another textbook, “Applied Machine Learning,” is in press.
Lazebnik's work on models for image-language understanding has benefitted applications like text-based image search, automatic image captioning, visual question answering, and visual dialog.
Hoiem’s 3D modeling work is part of broader influential research on how computers can understand the space and shapes of objects and scenes from images. Hoiem and Forsyth also collaborated closely on representing the parts, materials, and shape of objects, and on placing synthetic objects into existing photos with natural lighting and shadows.
Influence of Vision Alumni
Students who have graduated from Illinois CS are helping shape the future of computer vision through their own research and teaching, and say their links back to the department remain influential.
“Even today I’m still learning from (Forsyth). I actually still speak to him a few times a year,” said Ali Farhadi (PhD ’11).
He is the senior research manager for the Computer Vision Group at the Allen Institute for Artificial Intelligence and an associate professor in the School of Computer Science and Engineering at the University of Washington. Farhadi led recent research at the Allen Institute that pairs a human player with an AI program in a game he and others believe could push machine learning beyond its current limits.
Lazebnik (PhD CS ’06) is an alumna, too. She worked as an assistant professor at the University of North Carolina, Chapel Hill, before joining the Illinois CS faculty in 2012.
Other Illinois CS graduates leading computer vision’s growth include:
- Brett Jones (BS CS ’08, MS ’10, PhD ’15), Kevin Karsch (PhD CS ’15), and Raj Sodhi (BS CS ’08, MS ’10, PhD ’15), the founders of Lightform, a startup they created as PhD students that uses projection mapping to turn any object into an interactive display. All three were advised by Forsyth, while Professor Brian Bailey co-advised Jones and Sodhi.
- Gang Wang (PhD ECE ’10), a researcher and senior director at Alibaba and chief scientist of Alibaba AI Labs. MIT Technology Review in 2017 named him one of its 35 Innovators Under 35. He was advised by Forsyth and Hoiem.
- Steve Sullivan (MS ECE ’90, PhD ECE ’96), general manager of Microsoft’s Mixed Reality Capture Studios and formerly of Lucasfilm, where he contributed to more than 70 films and won three Academy Awards for Technology. He was advised, like Lazebnik, by former Illinois Professor Jean Ponce.
- Yasutaka Furukawa (PhD CS ’08), who is an assistant professor at Simon Fraser University with a long track record of 3D vision research, including his multi-view stereo algorithm to construct detailed 3D models from images. His software has been used by Industrial Light and Magic. He, too, was advised by Ponce.
Ease of Access
The access to faculty that students find at Illinois Computer Science is another thing that sets the Vision Group apart, according to Forsyth, Lazebnik, and Hoiem. They say they work hard to be accessible to their students in a way they suspect might not be found on some campuses.
Forsyth pointed out that, before gathering to discuss the Vision Group, he came to his office early and a student noticed the open door and stopped by to talk.
“We spent 20 minutes. And it was interesting and useful for both of us,” Forsyth said. “It creates an attractive and valuable environment for students to speak with people who really know what they’re doing, without having to wait through six months’ worth of email to set up an appointment