“Tracking the gaze on objects in 3D: how do people really look at the bunny?”
Session/Category Title: How people look and move
Abstract:
We provide the first large dataset of human fixations on physical 3D objects presented under varying viewing conditions and made of different materials. Our experimental setup is carefully designed to allow for accurate calibration and measurement. We estimate a mapping from the pair of pupil positions to 3D coordinates in space and register the presented shape with the eye-tracking setup. By modeling the fixated positions on 3D shapes as a probability distribution, we analyze the similarities among different conditions. The resulting data indicate that salient features depend on the viewing direction. Features that remain stable across viewing directions appear to be connected to semantically meaningful parts. We also show that it is possible to estimate gaze density maps from view-dependent data. The dataset provides the necessary ground-truth data for computational models of human perception in 3D.
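The abstract does not specify the similarity measure used to compare fixation distributions across conditions. As a minimal sketch of the idea, the snippet below compares two discrete fixation distributions over mesh vertices using the Bhattacharyya coefficient; the per-vertex densities and the choice of measure are illustrative assumptions, not the paper's actual method.

```python
import numpy as np

def bhattacharyya_coefficient(p, q):
    """Similarity in [0, 1] between two discrete distributions.

    1.0 means identical distributions; 0.0 means disjoint support.
    """
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    p = p / p.sum()  # normalize to a probability distribution
    q = q / q.sum()
    return float(np.sum(np.sqrt(p * q)))

# Hypothetical per-vertex fixation densities for three viewing conditions.
rng = np.random.default_rng(0)
view_a = rng.random(1000)
view_b = view_a + 0.1 * rng.random(1000)  # slightly perturbed condition
view_c = rng.random(1000)                  # unrelated condition

# Similar conditions score higher than unrelated ones.
print(bhattacharyya_coefficient(view_a, view_b))
print(bhattacharyya_coefficient(view_a, view_c))
```

The coefficient is symmetric and invariant to the overall scale of the densities, which makes it convenient when fixation counts differ between recording sessions.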

