“Beyond blur: real-time ventral metamers for foveated rendering” by Walton, Dos Anjos, Friston, Swapp, Akşit, et al. …

  • © David R. Walton, Rafael Kuffner Dos Anjos, Sebastian Friston, David Swapp, Kaan Akşit, Anthony Steed, and Tobias Ritschel




    Beyond blur: real-time ventral metamers for foveated rendering



    To peripheral vision, a pair of physically different images can look the same. Such pairs are metamers relative to each other, just as physically different spectra of light are perceived as the same color. We propose a real-time method to compute such ventral metamers for foveated rendering where, in particular for near-eye displays, the largest part of the framebuffer maps to the periphery. This improves quality over state-of-the-art foveation methods that blur the periphery. Work in Vision Science has established that peripheral stimuli are ventral metamers if their statistics are similar. Existing methods, however, require a costly optimization process to find such metamers. To this end, we propose a novel type of statistics particularly well-suited for practical real-time rendering: smooth moments of steerable filter responses. These can be extracted from images in time constant in the number of pixels and in parallel over all pixels using a GPU. Further, we show that they can be compressed effectively and transmitted at low bandwidth. Finally, computing realizations of those statistics can again be performed in constant time and in parallel. This enables a new level of quality for foveated applications such as remote rendering, level-of-detail and Monte-Carlo denoising. Finally, in a user study, we show that human task performance increases and foveation artifacts are less suspicious when using our method compared to common blurring.
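The idea of "smooth moments" and their "realizations" can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a 1D list of filter responses instead of steerable filter responses of an image, uses only the first two moments (local mean and standard deviation), and uses a naive box blur for clarity rather than a constant-time GPU scheme. The names `box_blur`, `smooth_moments` and `realize` are hypothetical.

```python
import random


def box_blur(signal, radius):
    """Local average over a window of up to 2*radius+1 samples (clipped at the edges)."""
    n = len(signal)
    out = []
    for i in range(n):
        lo = max(0, i - radius)
        hi = min(n, i + radius + 1)
        window = signal[lo:hi]
        out.append(sum(window) / len(window))
    return out


def smooth_moments(signal, radius):
    """Return smoothly varying local mean and local standard deviation."""
    mean = box_blur(signal, radius)
    mean_sq = box_blur([v * v for v in signal], radius)
    # Var[x] = E[x^2] - E[x]^2, clamped at zero against rounding error.
    std = [max(m2 - m * m, 0.0) ** 0.5 for m, m2 in zip(mean, mean_sq)]
    return mean, std


def realize(mean, std, radius, seed=0):
    """Draw noise and rescale it so its local moments roughly follow the targets."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in mean]
    n_mean, n_std = smooth_moments(noise, radius)
    eps = 1e-8  # avoids division by zero where the noise is locally flat
    return [
        (x - nm) / (ns + eps) * s + m
        for x, nm, ns, m, s in zip(noise, n_mean, n_std, mean, std)
    ]


if __name__ == "__main__":
    # Hypothetical input: a flat half followed by a rapidly alternating half.
    responses = [0.0] * 32 + [(-1.0) ** i for i in range(32)]
    mean, std = smooth_moments(responses, radius=4)
    metamer = realize(mean, std, radius=4)
    print(len(metamer), round(std[5], 3), round(std[50], 3))
```

The realized signal differs from the input sample by sample, but it is flat where the input was flat and busy where the input was busy. In this sketch only the two moment maps are needed to produce it, which hints at why such statistics could be transmitted instead of the pixels themselves.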


