“Micro Perceptual Human Computation for Visual Tasks” by Gingold, Shamir and Cohen-Or

  • ©Yotam Gingold, Ariel Shamir, and Daniel Cohen-Or

Title:

    Micro Perceptual Human Computation for Visual Tasks

Presenter(s)/Author(s):

    Yotam Gingold, Ariel Shamir, and Daniel Cohen-Or

Abstract:


    Human Computation (HC) utilizes humans to solve problems or carry out tasks that are hard for pure computational algorithms. Many graphics and vision problems have such tasks. Previous HC approaches mainly focus on generating data in batch, to gather benchmarks, or perform surveys demanding nontrivial interactions. We advocate a tighter integration of human computation into online, interactive algorithms. We aim to distill the differences between humans and computers and maximize the advantages of both in one algorithm. Our key idea is to decompose such a problem into a massive number of very simple, carefully designed, human micro-tasks that are based on perception, and whose answers can be combined algorithmically to solve the original problem. Our approach is inspired by previous work on micro-tasks and perception experiments. We present three specific examples for the design of micro perceptual human computation algorithms to extract depth layers and image normals from a single photograph, and to augment an image with high-level semantic information such as symmetry.
