Boundary Cues for 3D Object Shape Recovery
Kevin Karsch, Zicheng Liao, Jason Rock, Jonathan T. Barron, and Derek Hoiem
CVPR, 2013.
[pdf] [supp]
Learning Collections of Part Models for Object Recognition
Ian Endres, Kevin Shih, Johnston Jiaa, and Derek Hoiem
CVPR, 2013.
[pdf]
Improved Object Categorization and Detection Using Comparative Object Similarity
Gang Wang, David Forsyth, and Derek Hoiem
PAMI Vol. 99, 2013.
[pdf]
Diagnosing Error in Object Detectors
Derek Hoiem, Yodsawalai Chodpathumwan, and Qieyun Dai
ECCV, 2012.
[pdf]
[supp (40MB)]
[src/data (70MB)]
[slides]
Beyond the line of sight: labeling the underlying surfaces
Ruiqi Guo and Derek Hoiem
ECCV, 2012.
[pdf]
[project]
Indoor Segmentation and Support Inference from RGBD Images
Nathan Silberman, Derek Hoiem, Pushmeet Kohli, and Rob Fergus
ECCV, 2012.
[pdf]
[project]
[slides]
Learning Shared Body Plans
Ian Endres, Vivek Srikumar, Ming-wei Chang, and Derek Hoiem
CVPR, 2012.
[pdf]
Learning to Localize Detected Objects
Qieyun Dai and Derek Hoiem
CVPR, 2012.
[pdf]
Recovering Free Space of Indoor Scenes from a Single Image
Varsha Hedau and Derek Hoiem and David Forsyth
CVPR, 2012.
[pdf]
A Data-driven Method for Feature Transformation
Mert Dikmen and Derek Hoiem and Thomas S. Huang
CVPR, 2012.
[pdf]
Learning Image Similarity from Flickr Groups Using Fast Kernel Machines
Gang Wang, Derek Hoiem, and David Forsyth
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 16 Jan 2012.
[pdf]
Paired Regions for Shadow Detection and Removal
Ruiqi Guo, Qieyun Dai, and Derek Hoiem
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI), 02 Oct 2012.
[pdf]
Representations and Techniques for 3D Object Recognition and Scene Interpretation
Synthesis Lecture on Artificial Intelligence and Machine Learning
D. Hoiem and S. Savarese
Morgan & Claypool Publishers, Aug 2011. ISBN: 1608457281
[amazon]
[M&P]
Most universities already have a subscription to the series. If you don't and want a draft, let me know.
Rendering Synthetic Objects into Legacy Photographs
K. Karsch and V. Hedau and D. Forsyth and D. Hoiem
ACM SIGGRAPH Asia 2011.
[pdf (3mb)]
[pdf (42mb)]
[project/video]
Learning Random Fields using Graph Cuts
M. Szummer and P. Kohli and D. Hoiem
Chapter in Markov Random Fields for Vision and Image Processing
Edited by A. Blake, P. Kohli, and C. Rother, MIT Press, September 2011.
[link]
Single-Image Shadow Detection and Removal using Paired Regions
R. Guo and Q. Dai and D. Hoiem
CVPR, 2011.
[pdf]
[project]
Recovering Occlusion Boundaries from an Image
D. Hoiem, A.A. Efros, and M. Hebert
IJCV (91), No. 3, 2011.
[pdf]
The final publication is available at www.springerlink.com.
Category Independent Object Proposals
I. Endres and D. Hoiem
ECCV 2010. [pdf]
[project]
Thinking Inside the Box: Using Appearance Models and Context Based on Room Geometry
V. Hedau, D. Hoiem, and D.A. Forsyth
ECCV 2010. [pdf]
Attribute-Centric Recognition for Cross-Category Generalization
A. Farhadi, I. Endres, and D. Hoiem
CVPR 2010. [pdf] [CORE dataset]
Comparative object similarity for improved recognition with few or no examples
G. Wang, D.A. Forsyth, D. Hoiem
CVPR 2010. [pdf]
The Benefits and Challenges of Collecting Richer Object Annotations
Ian Endres, Ali Farhadi, Derek Hoiem, and David Forsyth
ACVHL 2010 (in conjunction with CVPR). [pdf] [CORE dataset]
It's All About the Data
T.L. Berg, A. Sorokin, G. Wang, D.A. Forsyth, D. Hoiem, A. Farhadi, and I. Endres
Proc. IEEE, Special Issue on Internet Vision, August 2010, 98 (8), 1434-1453.
Recovering the Spatial Layout of Cluttered Rooms
V. Hedau, D. Hoiem, and D.A. Forsyth
ICCV 2009. [pdf] [project]
Learning Image Similarity from Flickr Groups Using Stochastic Intersection Kernel Machines
G. Wang, D. Hoiem, and D.A. Forsyth
ICCV 2009. [pdf] [project]
Describing Objects by their Attributes
A. Farhadi, I. Endres, D. Hoiem, and D.A. Forsyth
CVPR 2009. [pdf] [project]
Building Text Features for Object Image Classification
G. Wang, D. Hoiem, and D.A. Forsyth
CVPR 2009. [pdf]
An Empirical Study of Context in Object Detection
S.K. Divvala, D. Hoiem, J.H. Hays, A.A. Efros, and M. Hebert
CVPR 2009. [pdf]
Learning CRFs using Graph Cuts
M. Szummer, P. Kohli, and D. Hoiem
ECCV 2008. [pdf]
Closing the Loop on Scene Interpretation
D. Hoiem, A.A. Efros, and M. Hebert
CVPR 2008. [pdf]
Putting Objects in Perspective
D. Hoiem, A.A. Efros, and M. Hebert
IJCV (80), No. 1, October 2008. [pdf]
The final publication is available at www.springerlink.com.
Seeing the World Behind the Image: Spatial Layout for 3D Scene Understanding
D. Hoiem
Doctoral Dissertation, CMU-RI-TR-07-28, Robotics Institute, Carnegie Mellon University, August 2007. [pdf]
CMU School of Computer Science Distinguished Dissertation Award
ACM Doctoral Dissertation Honorable Mention
Learning to Extract Object Boundaries using Motion Cues
A.N. Stein, D. Hoiem, and M. Hebert
ICCV 2007. [pdf]
Recovering Occlusion Boundaries from a Single Image
D. Hoiem, A.N. Stein, A.A. Efros, and M. Hebert
ICCV 2007. [pdf]
Photo Clip Art
J-F. Lalonde, D. Hoiem, A.A. Efros, J. Winn, C. Rother and A. Criminisi
ACM SIGGRAPH 2007. [pdf] [project]
3D LayoutCRF for Multi-View Object Class Recognition and Segmentation
D. Hoiem, C. Rother, and J. Winn
CVPR 2007. [pdf]
Recovering Surface Layout from an Image
D. Hoiem, A.A. Efros, and M. Hebert
IJCV, Vol. 75, No. 1, October 2007. [pdf]
The final publication is available at www.springerlink.com.
Putting Objects in Perspective
D. Hoiem, A.A. Efros, and M. Hebert
CVPR 2006. Best Paper Award [pdf]
[project]
Opportunistic use of vision to push back the path-planning horizon
B. Nabbe, D. Hoiem, A.A. Efros, and M. Hebert
IROS 2006. [pdf]
Geometric Context from a Single Image
D. Hoiem, A.A. Efros, and M. Hebert
ICCV 2005. [pdf]
[project]
Automatic Photo Pop-up
D. Hoiem, A.A. Efros, and M. Hebert
ACM SIGGRAPH 2005. [pdf] [project]
Computer Vision for Music Identification
Y. Ke, D. Hoiem, and R. Sukthankar
ICCV 2005. [pdf] [project]
SOLAR: Sound Object Localization and Retrieval in Complex Audio Environments
D. Hoiem, Y. Ke, and R. Sukthankar
IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), 2005. [pdf] [project]
Object-Based Image Retrieval Using the Statistics of Images
D. Hoiem, R. Sukthankar, H. Schneiderman, and L. Huston
CVPR 2004. [pdf] [project]
SnapFind: Brute Force Interactive Image Retrieval
L. Huston, R. Sukthankar, D. Hoiem, and J. Zhang
IEEE International Conference on Image Processing and Graphics, 2004.
[pdf]
Recovering the Spatial Layout of Cluttered Rooms
We propose to model indoor scenes with a 3D cuboid that gives a rough sense of the
global 3D structure and a surface layout that indicates which pixels belong to
each surface. By combining the global and local 3D representations, we are able
to better estimate each. We also can estimate which portions of the 3D scene
are occupied by objects.
Learning Image Similarity from Flickr Groups Using Stochastic Intersection Kernel Machines
We propose an online learning method for SVMs with Histogram Intersection Kernels
that achieves accuracy similar to batch training while running much faster. We
apply it to estimate which Flickr groups an image is likely to belong to and show
that this estimated membership provides a useful similarity measure.
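As a small illustration of the kernel named above, the histogram intersection of two histograms is the sum of their bin-wise minima. The sketch below shows only that kernel, not the paper's online training procedure; the function name and the example histograms are made up for illustration.

```python
def histogram_intersection(a, b):
    """Histogram intersection kernel: sum of element-wise minima of two histograms."""
    if len(a) != len(b):
        raise ValueError("histograms must have the same number of bins")
    return sum(min(x, y) for x, y in zip(a, b))


# Two hypothetical normalized 4-bin histograms.
h1 = [0.5, 0.25, 0.25, 0.0]
h2 = [0.25, 0.25, 0.0, 0.5]

similarity = histogram_intersection(h1, h2)
print(similarity)
```

For normalized histograms the value lies between 0 (no overlap) and 1 (identical histograms).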
Describing Objects by their Attributes
We show that, by learning semantic attributes, we can say what is unusual about
an object or learn to identify objects from a verbal description. We also propose
a feature selection method that suppresses attribute correlations that arise
through the object (e.g., many cars both have wheels and are made of metal), so
that our attribute predictors are not confused by correlated attributes (which
may lead to good quantitative accuracy in some tests but poor performance in
tasks that require the semantics to be correct). We also study cross-category
generalization, looking at how well attributes trained on one set of object
categories generalize to a new set.
Building Text Features for Object Image Classification
We propose a text feature representation built from histograms of the tags and
groups associated with nearby images in Flickr. This improves image
classification and provides a good way to leverage massive-scale internet photo
sharing sites.
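The idea of a tag histogram can be sketched in a few lines. The sketch assumes the nearby images have already been found by some other means and that their tags are available as lists of strings; the function name, vocabulary, and example tags are hypothetical and not taken from the paper.

```python
from collections import Counter


def tag_histogram(neighbor_tags, vocabulary):
    """Normalized histogram over `vocabulary` of the tags attached to nearby images.

    neighbor_tags: list of tag lists, one per nearby image (assumed precomputed).
    vocabulary:    ordered list of tags that defines the feature dimensions.
    """
    counts = Counter(tag for tags in neighbor_tags for tag in tags)
    total = sum(counts[t] for t in vocabulary)
    if total == 0:
        # None of the neighbors carries a tag from the vocabulary.
        return [0.0] * len(vocabulary)
    return [counts[t] / total for t in vocabulary]


# Hypothetical tags of three images judged to be near the query image.
neighbors = [["beach", "sea"], ["sea", "boat"], ["sea"]]
vocab = ["beach", "boat", "sea", "car"]

feature = tag_histogram(neighbors, vocab)
print(feature)
```

The resulting vector could then be passed to any standard classifier alongside visual features; a histogram over group memberships would be built the same way.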
An Empirical Study of Context in Object Detection
We use a variety of contextual sources to help predict the presence, position, and
size of objects in an image. We
show that this provides substantial gains on object detection on PASCAL 2008,
and we study how context changes the error patterns. We also propose a fairly
comprehensive organization of sources and uses of context in vision.
Learning CRFs Using Graph Cuts
Describes a structured learning approach for learning CRF parameters, performing the inference
step with graph cuts. It is
possible to learn many pairwise potential parameters from dozens of images in
several minutes. We apply the
method to segmentation and pixel labeling and show that it works well in
non-submodular cases (though without the usual optimality guarantee).
Closing the Loop on Scene Interpretation
Describes an approach that lets processes describing different characteristics
of the scene interact to produce a more accurate and more cohesive scene
interpretation. As in the original Barrow and Tenenbaum intrinsic image paper,
we use a set of confidence maps, each describing a particular aspect of the
scene, as an interface between the different processes. This approach allows
many different types of scene analysis algorithms to work together while keeping
the sample complexity low. However, it requires that cues be explicitly defined
for each pair of processes, making the algorithm somewhat design-heavy.
Seeing the World Behind the Image (dissertation)
Covers all of my work on surface layout, occlusion recovery, and viewpoint-object
reasoning, with an extended background and discussion of future directions.
Recovering Occlusion Boundaries from a Single Image
We use standard image cues together with surface and depth cues to recover
occlusion boundaries of major free-standing objects from a single image.
Learning to Extract Object Boundaries using Motion Cues
We use appearance (color, texture) and motion cues computed over an
oversegmentation to estimate occlusion boundaries from short video clips.
Recovering Surface Layout from an Image
We extend our ICCV 2005 paper with further description, analysis, and results.
Accuracy is improved and results on indoor images are demonstrated.
Putting Objects in Perspective
We propose an idea for modeling objects and their relationships to each other,
through the viewpoint, and to the surfaces in the scene. With a simple belief
propagation framework, the viewpoint of the camera is recovered with high
accuracy, and object detection performance improves considerably. Our method
can be used in conjunction with almost any local object detector.
The journal version includes derivations of our scene projection approximation
and a new example-based method to recover
viewpoint from the scene gist.
Automatic Photo Pop-up
Describes our method for estimating orientations (ground, vertical, sky) in
outdoor images by robustly estimating the scene structure and learning
appearance-based statistical models of geometry. We show how we can use the
estimated geometry to reconstruct simple 3D models of the scene from a single
image.
Geometric Context from a Single Image
We extend our system from Automatic Photo Pop-up by subclassifying vertical
regions and provide quantitative analysis of the geometric labeling. We also
use the geometric labels as context for object detection, significantly
improving the accuracy. Worth reading if interested in image understanding. If
you've read Automatic Photo Pop-up, much of the new material is in the results
and applications sections (although there are some small changes to our method
described in the other sections).
The journal version (Recovering Surface Layout from an Image) contains
additional description, analysis, and results. Some tweaks to the feature set
result in improved accuracy, and good results on indoor images are
demonstrated.
Computer Vision for Music Identification
We identify snippets (e.g., ten-second clips) of music recorded under noisy
settings by learning a 32-bit representation for the music that can be indexed
for fast retrieval and is robust to typical distortions.
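To illustrate why a compact 32-bit code is convenient to index, here is a minimal sketch of a hash-table lookup that tolerates a single flipped bit. It is not the paper's retrieval scheme; the function names, song labels, and codes are hypothetical, and how the codes are learned is outside the sketch.

```python
def hamming32(a, b):
    """Number of differing bits between two 32-bit codes."""
    return bin((a ^ b) & 0xFFFFFFFF).count("1")


def build_index(codes_by_song):
    """Map each 32-bit code to the set of songs that contain it."""
    index = {}
    for song, codes in codes_by_song.items():
        for code in codes:
            index.setdefault(code, set()).add(song)
    return index


def lookup(index, query, max_flips=1):
    """Songs with a stored code within `max_flips` (0 or 1) bits of `query`."""
    hits = set(index.get(query, ()))
    if max_flips >= 1:
        # Probe every code that differs from the query in exactly one bit.
        for bit in range(32):
            hits |= index.get(query ^ (1 << bit), set())
    return hits


# Hypothetical database: each song contributes one or more 32-bit codes.
index = build_index({"song_a": [0b1011], "song_b": [0xFFFF0000]})

# A noisy query that differs from song_a's code in one bit.
print(lookup(index, 0b1010))
```

An exact-match probe costs one dictionary lookup, and the one-bit-tolerant probe costs 33, so lookup time does not grow with the size of the database.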
SOLAR: Sound Object Localization and Retrieval in Complex Audio Environments
We learn a discriminative model for distinguishing between sound classes (such
as dog barks) and all other noises. We can then detect and localize certain
sounds in audio, such as that typically found in movies.
Object-Based Image Retrieval Using the Statistics of Images
We learn a semi-naive Bayes model of an object class (e.g. automobiles) from a
few images and retrieve images that contain that object. A feedback loop allows results to
improve. One of the key ideas is
that learning the structure of the data on an unlabeled dataset improves
results when few supervised examples exist.
SnapFind: Brute Force Interactive Image Retrieval
We present a system capable of fast user-guided image retrieval based on local
patches. The emphasis here is on
the system as a whole, which achieves speed due to distributed processing on
local data.