How do humans sketch objects?

Mathias Eitz; James Hays; Marc Alexa

Conference:

SIGGRAPH 2012

Type(s):

Technical Papers

Title:

How do humans sketch objects?

Presenter(s)/Author(s):

Mathias Eitz

James Hays

Marc Alexa

Abstract:

Humans have used sketching to depict our visual world since prehistoric times. Even today, sketching is possibly the only rendering technique readily available to all humans. This paper is the first large-scale exploration of human sketches. We analyze the distribution of non-expert sketches of everyday objects such as ‘teapot’ or ‘car’. We ask humans to sketch objects of a given category and gather 20,000 unique sketches evenly distributed over 250 object categories. With this dataset we perform a perceptual study and find that humans can correctly identify the object category of a sketch 73% of the time. We compare human performance against computational recognition methods. We develop a bag-of-features sketch representation and use multi-class support vector machines, trained on our sketch dataset, to classify sketches. The resulting recognition method is able to identify unknown sketches with 56% accuracy (chance is 0.4%). Based on the computational model, we demonstrate an interactive sketch recognition system. We release the complete crowd-sourced dataset of sketches to the community.
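To make the bag-of-features idea and the quoted chance level concrete, here is a minimal sketch in Python. It is not the authors' implementation: the abstract does not specify the local descriptors, the vocabulary size, or the assignment scheme, so this example assumes plain hard assignment to the nearest visual word, and the two-dimensional `vocabulary` and `descriptors` values are hypothetical toy data. The dataset figures (20,000 sketches, 250 categories) are taken from the abstract.

```python
import math

def nearest_word(descriptor, vocabulary):
    """Return the index of the visual word closest to the descriptor (Euclidean)."""
    return min(range(len(vocabulary)),
               key=lambda i: math.dist(descriptor, vocabulary[i]))

def bag_of_features(descriptors, vocabulary):
    """Hard-assign each local descriptor to a visual word and
    return the normalized histogram of word counts."""
    hist = [0.0] * len(vocabulary)
    for d in descriptors:
        hist[nearest_word(d, vocabulary)] += 1.0
    total = sum(hist)
    return [h / total for h in hist] if total else hist

# Hypothetical toy data: three visual words, four local descriptors.
vocabulary = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
descriptors = [(0.1, 0.1), (0.9, 0.1), (0.8, -0.1), (0.1, 0.9)]
histogram = bag_of_features(descriptors, vocabulary)

# Dataset figures from the abstract.
num_sketches, num_categories = 20_000, 250
sketches_per_category = num_sketches // num_categories  # even split
chance_accuracy = 1 / num_categories                    # uniform guessing

print(histogram)
print(sketches_per_category, f"{chance_accuracy:.1%}")
```

A histogram of this kind is a fixed-length vector regardless of how many strokes a sketch contains, which is what allows a standard classifier such as a multi-class support vector machine to be trained on it. The final two values reproduce the 80 sketches per category implied by an even split and the 0.4% chance level stated in the abstract.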

