Proceedings of the IEEE 2010 Antonio Torralba, MIT Jenny Yuen, MIT Bryan C. Russell, MIT.

Page 1:

LabelMe: Online Image Annotation and Applications

Proceedings of the IEEE, 2010

Antonio Torralba, MIT
Jenny Yuen, MIT
Bryan C. Russell, MIT

Page 2:

Outline

Introduction

Web Annotation and Data Statistics
-A. Data Set Evolution and Distribution of Objects
-B. Study of Online Labelers

The Space of LabelMe Images
-A. Distribution of Scene Types
-B. The Space of Images
-C. Recognition by Scene Alignment

Beyond 2-D Images
-A. From Annotations to 3-D
-B. Video Annotation

Conclusion

Page 3:

Introduction

From small data sets to large data sets
In 2005, the online tool LabelMe was created
LabelMe provides functionality for drawing polygons to outline the spatial extent of objects in images

Page 4:

Web Annotation and Data Statistics
-A. Data Set Evolution and Distribution of Objects
-B. Study of Online Labelers

Page 5:

The Features of the LabelMe Database

Object class recognition
Learning about objects embedded in a scene
High-quality labeling
Many diverse object classes
Many diverse images
Many noncopyrighted images
Open and dynamic

Page 6:

Data Set Evolution and Distribution of Objects (1/2)

(a) Number of annotated objects
(b) Number of images with at least one annotated object
(c) Number of unique object descriptions

Page 7:

Data Set Evolution and Distribution of Objects (2/2)

The observation suggests two learning problems:
1) Learning from few training samples (N → 1)
2) Learning with millions of samples (N → ∞)

Page 8:

Study of Online Labelers

From July 7, 2008 to March 19, 2009
(a) Number of new annotations provided by individual users
(b) Distribution of the length of time it takes to label an object

Page 9:

The Space of LabelMe Images
-A. Distribution of Scene Types
-B. The Space of Images
-C. Recognition by Scene Alignment

Page 10:

Distribution of Scene Types (1/1)

Let’s start with cognitive psychology
Next we study how many configurations of 4 objects are present
The distribution follows a power law

(n = 1, 2, 4, 8)
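
One rough way to check a power-law claim like the one above is to fit a straight line in log-log space. The sketch below is only an illustration, not the authors' procedure. It assumes `counts` holds how often each configuration occurs, ranks them from most to least frequent, and fits log(count) against log(rank). The function name `fit_power_law` is hypothetical.

```python
import numpy as np

def fit_power_law(counts):
    """Fit count ~ prefactor * rank**(-exponent) by least squares in log-log space.

    `counts` is assumed to hold one occurrence count per configuration.
    Returns (exponent, prefactor). Needs at least two positive counts.
    """
    counts = np.sort(np.asarray(counts, dtype=float))[::-1]  # most frequent first
    counts = counts[counts > 0]                              # log is undefined at 0
    ranks = np.arange(1, counts.size + 1)
    # A power law is a straight line in log-log coordinates.
    slope, intercept = np.polyfit(np.log(ranks), np.log(counts), 1)
    return float(-slope), float(np.exp(intercept))

# Synthetic counts that follow an exact power law with exponent 1.5.
ranks = np.arange(1, 51)
exponent, prefactor = fit_power_law(1000.0 / ranks**1.5)
print(exponent, prefactor)
```

A least-squares fit in log-log space is a quick diagnostic only; it does not by itself establish that the data follow a power law.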

Page 11:

The Space of Images (1/3)

Define “Semantic Distance”:
1) Assign each pixel to a single object category
2) Divide the image into N × N nonoverlapping windows and build a histogram for each window
3) Use spatial pyramid matching over object labels
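
The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation. Each image is assumed to be given as a 2-D integer array `labels` with one object-category index per pixel. Windows are compared by histogram intersection, and the grid levels (1, 2, 4) are weighted equally, which simplifies the usual spatial pyramid weighting. The names `window_histograms` and `semantic_distance` are hypothetical.

```python
import numpy as np

def window_histograms(labels, n_windows, n_classes):
    """Normalized object-label histogram for each cell of an n_windows x n_windows grid."""
    h, w = labels.shape
    rows = np.linspace(0, h, n_windows + 1).astype(int)
    cols = np.linspace(0, w, n_windows + 1).astype(int)
    hists = np.zeros((n_windows, n_windows, n_classes))
    for i in range(n_windows):
        for j in range(n_windows):
            cell = labels[rows[i]:rows[i + 1], cols[j]:cols[j + 1]]
            counts = np.bincount(cell.ravel(), minlength=n_classes)
            hists[i, j] = counts / max(cell.size, 1)  # guard against empty cells
    return hists

def semantic_distance(labels_a, labels_b, n_classes, levels=(1, 2, 4)):
    """1 minus the mean histogram intersection, averaged over grid levels.

    Equal level weights are an assumption made for brevity.
    """
    sims = []
    for n in levels:
        ha = window_histograms(labels_a, n, n_classes)
        hb = window_histograms(labels_b, n, n_classes)
        inter = np.minimum(ha, hb).sum(axis=-1)  # per-window intersection in [0, 1]
        sims.append(inter.mean())
    return 1.0 - float(np.mean(sims))

# Two toy label maps: all "category 0" vs. right half "category 1".
a = np.zeros((8, 8), dtype=int)
c = a.copy()
c[:, 4:] = 1
print(semantic_distance(a, c, n_classes=2))
```

Two images with identical label maps get distance 0, and two images that share no labels get distance 1.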

Page 12:

Process of Defining Semantic Distance (2/3)

Page 13:

The Space of Images (3/3)

A visualization of 12,201 images that are fully annotated

Page 14:

Recognition by Scene Alignment

Given a new image as input, we use the GIST descriptor to compute its distance to the database images

Page 15:

The Power of a Large-Scale Database

An algorithm that provides an upper bound: find the nearest neighbor of the input image and use its annotation as the labeling of the input image
This result gives us a hint about the question “How many more images do we need to label?”
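
The nearest-neighbor labeling idea can be sketched as below. This is a hedged illustration, not the authors' code. It assumes every image has already been reduced to a fixed-length descriptor vector (a stand-in for a GIST descriptor, which is not computed here), and it uses Euclidean distance, which is an assumption. The function name `nearest_neighbor_labeling` is hypothetical.

```python
import numpy as np

def nearest_neighbor_labeling(query_desc, db_descs, db_labelings):
    """Return the labeling of the database image closest to the query, and its distance.

    query_desc   : 1-D descriptor of the input image (assumed precomputed)
    db_descs     : one descriptor per database image, same length as query_desc
    db_labelings : annotation of each database image, in the same order
    """
    db = np.asarray(db_descs, dtype=float)
    q = np.asarray(query_desc, dtype=float)
    dists = np.linalg.norm(db - q, axis=1)  # Euclidean distance to every image
    best = int(np.argmin(dists))
    return db_labelings[best], float(dists[best])

# Toy database with made-up 2-D descriptors and scene-level annotations.
labeling, dist = nearest_neighbor_labeling(
    [0.9, 0.1],
    [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]],
    ["street", "office", "forest"],
)
print(labeling, dist)
```

As the database grows, the nearest neighbor tends to lie closer to the query, which is why a transfer of this kind relates to the question of how many labeled images are needed.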

Page 16:

Beyond 2-D Images
-A. From Annotations to 3-D
-B. Video Annotation

Page 17:

From Annotations to 3-D (1/7)

Object labels contain implicit information that can be recovered by analyzing the overlap between object boundaries

Object types
  Ground objects
  Standing objects
  Attached objects

Relations between objects
  Supported-by
  Part-of

Page 18:

From Annotations to 3-D (2/7)

Learning the relationships between objects
1) Part-of: evaluate the frequency of high relative overlap between polygons
2) Supported-by: the bottom part of the object's polygon lies inside the supporting object
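
The two cues above can be sketched on rasterized polygons. This is an illustration under assumptions, not the authors' method. Polygons are assumed to be already converted to boolean masks of the same shape, with row indices increasing downward. The 0.8 overlap threshold and the function names are hypothetical.

```python
import numpy as np

def relative_overlap(part_mask, whole_mask):
    """Fraction of the candidate part's area that lies inside the candidate whole."""
    part_area = part_mask.sum()
    if part_area == 0:
        return 0.0
    return float(np.logical_and(part_mask, whole_mask).sum() / part_area)

def part_of_frequency(mask_pairs, threshold=0.8):
    """Fraction of (part, whole) co-occurrences with high relative overlap.

    The threshold value is an assumption chosen for illustration.
    """
    if not mask_pairs:
        return 0.0
    hits = sum(relative_overlap(p, w) >= threshold for p, w in mask_pairs)
    return hits / len(mask_pairs)

def bottom_inside(obj_mask, support_mask):
    """True if the lowest occupied row of the object lies inside the support region."""
    rows = np.flatnonzero(obj_mask.any(axis=1))
    if rows.size == 0:
        return False
    bottom = rows[-1]  # image rows grow downward, so the last row is the bottom
    cols = np.flatnonzero(obj_mask[bottom])
    return bool(support_mask[bottom, cols].all())

# Toy scene: a wheel inside a car, and the car resting on a road.
wheel = np.zeros((10, 10), dtype=bool); wheel[6:8, 2:4] = True
car = np.zeros((10, 10), dtype=bool); car[4:8, 1:9] = True
road = np.zeros((10, 10), dtype=bool); road[7:10, :] = True
print(relative_overlap(wheel, car), bottom_inside(car, road))
```

In this toy scene the wheel lies entirely inside the car, so it counts as a part-of candidate, and the car's bottom row falls inside the road, so the road is a supported-by candidate.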

Page 19:

From Annotations to 3-D (3/7)

Page 20:

From Annotations to 3-D (4/7)

Reconstructing a 3-D model for the input image
1) Define the object type
2) Define the polygon edge type
3) Compute the real distance between objects

Object type                  Edge type
Ground objects (green)       Contact (white)
Standing objects (red)       Attached (gray)
Attached objects (yellow)    Occlusion (black)

Page 21:

From Annotations to 3-D (5/7)

Page 22:

From Annotations to 3-D (6/7)

More labeling improves the quality
However, if the labeling goes wrong

Page 23:

From Annotations to 3-D (7/7)

Page 24:

Video Annotation (1/1)

Page 25:

Conclusion

LabelMe is a web-based tool that allows the labeling of objects and their locations in images
LabelMe has collected a large annotated database of images with many different scene and object classes
LabelMe annotations can be used to recover a 3-D description of an image
The next goal is expanding the database to video, which offers a promising direction for computer vision and computer graphics
