“Instant neural graphics primitives with a multiresolution hash encoding” by Müller, Evans, Schied and Keller

  • ©Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller

Conference:

    SIGGRAPH '22: ACM SIGGRAPH 2022


Type:

    Technical Paper

Title:

    Instant neural graphics primitives with a multiresolution hash encoding

Presenter(s)/Author(s):

    Thomas Müller, Alex Evans, Christoph Schied, Alexander Keller

Abstract:


    Neural graphics primitives, parameterized by fully connected neural networks, can be costly to train and evaluate. We reduce this cost with a versatile new input encoding that permits the use of a smaller network without sacrificing quality, thus significantly reducing the number of floating point and memory access operations: a small neural network is augmented by a multiresolution hash table of trainable feature vectors whose values are optimized through stochastic gradient descent. The multiresolution structure allows the network to disambiguate hash collisions, making for a simple architecture that is trivial to parallelize on modern GPUs. We leverage this parallelism by implementing the whole system using fully-fused CUDA kernels with a focus on minimizing wasted bandwidth and compute operations. We achieve a combined speedup of several orders of magnitude, enabling training of high-quality neural graphics primitives in a matter of seconds, and rendering in tens of milliseconds at a resolution of 1920×1080.
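To make the abstract's encoding concrete, the following is a minimal 2D toy sketch of a multiresolution hash encoding lookup. The hyperparameters (level count, table size, feature width, base resolution, growth factor) and the random initialization are illustrative assumptions, not the paper's tuned values; the real system operates on 3D inputs, trains the tables via stochastic gradient descent, and runs as fused CUDA kernels rather than NumPy.

```python
import numpy as np

NUM_LEVELS = 4          # L: number of resolution levels (assumed for the demo)
TABLE_SIZE = 2 ** 14    # T: entries per hash table
FEATURE_DIM = 2         # F: feature vector width per table entry
BASE_RES = 16           # coarsest grid resolution
GROWTH = 1.5            # per-level resolution growth factor

rng = np.random.default_rng(0)
# One trainable table per level; initialized near zero (optimized by SGD in the full system).
tables = [rng.uniform(-1e-4, 1e-4, size=(TABLE_SIZE, FEATURE_DIM))
          for _ in range(NUM_LEVELS)]

def spatial_hash(ix, iy):
    """XOR-based spatial hash of integer grid coordinates (after Teschner et al. 2003)."""
    return (ix * 1 ^ iy * 2654435761) % TABLE_SIZE

def encode(x, y):
    """Map a point in [0,1]^2 to its concatenated per-level feature vectors."""
    features = []
    for level in range(NUM_LEVELS):
        res = int(BASE_RES * GROWTH ** level)
        fx, fy = x * res, y * res
        ix, iy = int(fx), int(fy)          # lower-left corner of the enclosing cell
        tx, ty = fx - ix, fy - iy          # fractional position = interpolation weights
        t = tables[level]
        # Bilinearly interpolate the four hashed corner features of this level.
        f = ((1 - tx) * (1 - ty) * t[spatial_hash(ix,     iy)]
             + tx      * (1 - ty) * t[spatial_hash(ix + 1, iy)]
             + (1 - tx) * ty      * t[spatial_hash(ix,     iy + 1)]
             + tx       * ty      * t[spatial_hash(ix + 1, iy + 1)])
        features.append(f)
    # The concatenated features are the input to the small MLP in the full system.
    return np.concatenate(features)

enc = encode(0.37, 0.81)
print(enc.shape)  # (NUM_LEVELS * FEATURE_DIM,) = (8,)
```

Because coarse levels have fewer grid vertices than table entries, their lookups are collision-free, while finer levels rely on the hash; as the abstract notes, the multiresolution stack lets the subsequent network disambiguate those collisions.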

References:


    1. Thomas Annen, Tom Mertens, Philippe Bekaert, Hans-Peter Seidel, and Jan Kautz. 2007. Convolution Shadow Maps. In Rendering Techniques, Jan Kautz and Sumanta Pattanaik (Eds.). The Eurographics Association.
    2. Jonathan T. Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, and Pratul P. Srinivasan. 2021a. Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields. arXiv (2021). https://jonbarron.info/mipnerf/
    3. Jonathan T. Barron, Ben Mildenhall, Dor Verbin, Pratul P. Srinivasan, and Peter Hedman. 2021b. Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields. arXiv:2111.12077 (Nov. 2021).
    4. Rohan Chabra, Jan E. Lenssen, Eddy Ilg, Tanner Schmidt, Julian Straub, Steven Lovegrove, and Richard Newcombe. 2020. Deep Local Shapes: Learning Local SDF Priors for Detailed 3D Reconstruction. In Computer Vision – ECCV 2020, Andrea Vedaldi, Horst Bischof, Thomas Brox, and Jan-Michael Frahm (Eds.). Springer International Publishing, Cham, 608–625.
    5. Eric R. Chan, Connor Z. Lin, Matthew A. Chan, Koki Nagano, Boxiao Pan, Shalini De Mello, Orazio Gallo, Leonidas Guibas, Jonathan Tremblay, Sameh Khamis, Tero Karras, and Gordon Wetzstein. 2021. Efficient Geometry-aware 3D Generative Adversarial Networks. arXiv:2112.07945 (2021).
    6. Julian Chibane, Thiemo Alldieck, and Gerard Pons-Moll. 2020. Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE.
    7. Terrance DeVries, Miguel Angel Bautista, Nitish Srivastava, Graham W. Taylor, and Joshua M. Susskind. 2021. Unconstrained Scene Generation with Locally Conditioned Radiance Fields. arXiv (2021).
    8. Eric Enderton and Daniel Wexler. 2011. The Workflow Scale. In Computer Graphics International Workshop on VFX, Computer Animation, and Stereo Movies.
    9. Alex Evans. 2006. Fast Approximations for Global Illumination on Dynamic Scenes. In ACM SIGGRAPH 2006 Courses (Boston, Massachusetts) (SIGGRAPH '06). Association for Computing Machinery, New York, NY, USA, 153–171.
    10. Stephan J. Garbin, Marek Kowalski, Matthew Johnson, Jamie Shotton, and Julien Valentin. 2021. FastNeRF: High-Fidelity Neural Rendering at 200FPS. arXiv:2103.10380 (Mar. 2021).
    11. Jonas Gehring, Michael Auli, David Grangier, Denis Yarats, and Yann N. Dauphin. 2017. Convolutional Sequence to Sequence Learning. In Proceedings of the 34th International Conference on Machine Learning – Volume 70 (Sydney, NSW, Australia) (ICML'17). JMLR.org, 1243–1252.
    12. Xavier Glorot and Yoshua Bengio. 2010. Understanding the Difficulty of Training Deep Feedforward Neural Networks. In Proc. 13th International Conference on Artificial Intelligence and Statistics (Sardinia, Italy, May 13–15). JMLR.org, 249–256.
    13. Saeed Hadadan, Shuhong Chen, and Matthias Zwicker. 2021. Neural Radiosity. ACM Transactions on Graphics 40, 6 (Dec. 2021), 1–11.
    14. David Money Harris and Sarah L. Harris. 2013. 3.4.2 – State Encodings. In Digital Design and Computer Architecture (second ed.). Morgan Kaufmann, Boston, 129–131.
    15. Jon Jansen and Louis Bavoil. 2010. Fourier Opacity Mapping. In Proceedings of the 2010 ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games (Washington, D.C.) (I3D '10). Association for Computing Machinery, New York, NY, USA, 165–172.
    16. Chiyu Max Jiang, Avneesh Sud, Ameesh Makadia, Jingwei Huang, Matthias Nießner, and Thomas Funkhouser. 2020. Local Implicit Grid Representations for 3D Scenes. In Proceedings IEEE Conf. on Computer Vision and Pattern Recognition (CVPR).
    17. Diederik P. Kingma and Jimmy Ba. 2014. Adam: A Method for Stochastic Optimization. arXiv:1412.6980 (Dec. 2014).
    18. Derrick H. Lehmer. 1951. Mathematical Methods in Large-scale Computing Units. In Proceedings of the Second Symposium on Large Scale Digital Computing Machinery. Harvard University Press, Cambridge, MA, USA, 141–146.
    19. Jaakko Lehtinen, Jacob Munkberg, Jon Hasselgren, Samuli Laine, Tero Karras, Miika Aittala, and Timo Aila. 2018. Noise2Noise: Learning Image Restoration without Clean Data. arXiv:1803.04189 (Mar. 2018).
    20. Lingjie Liu, Jiatao Gu, Kyaw Zaw Lin, Tat-Seng Chua, and Christian Theobalt. 2020. Neural Sparse Voxel Fields. NeurIPS (2020). https://lingjie0206.github.io/papers/NSVF/
    21. Julien N.P. Martel, David B. Lindell, Connor Z. Lin, Eric R. Chan, Marco Monteiro, and Gordon Wetzstein. 2021. ACORN: Adaptive Coordinate Networks for Neural Representation. ACM Trans. Graph. (SIGGRAPH) (2021).
    22. Ishit Mehta, Michaël Gharbi, Connelly Barnes, Eli Shechtman, Ravi Ramamoorthi, and Manmohan Chandraker. 2021. Modulated Periodic Activations for Generalizable Local Functional Representations. In IEEE International Conference on Computer Vision. IEEE.
    23. Paulius Micikevicius, Sharan Narang, Jonah Alben, Gregory Diamos, Erich Elsen, David Garcia, Boris Ginsburg, Michael Houston, Oleksii Kuchaiev, Ganesh Venkatesh, and Hao Wu. 2018. Mixed Precision Training. arXiv:1710.03740 (Oct. 2018).
    24. Ben Mildenhall, Peter Hedman, Ricardo Martin-Brualla, Pratul Srinivasan, and Jonathan T. Barron. 2021. NeRF in the Dark: High Dynamic Range View Synthesis from Noisy Raw Images. arXiv:2111.13679 (Nov. 2021).
    25. Ben Mildenhall, Pratul P. Srinivasan, Rodrigo Ortiz-Cayon, Nima Khademi Kalantari, Ravi Ramamoorthi, Ren Ng, and Abhishek Kar. 2019. Local Light Field Fusion: Practical View Synthesis with Prescriptive Sampling Guidelines. ACM Trans. Graph. 38, 4, Article 29 (July 2019), 14 pages.
    26. Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. 2020. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. In ECCV.
    27. Thomas Müller. 2021. Tiny CUDA Neural Network Framework. https://github.com/nvlabs/tiny-cuda-nn.
    28. Thomas Müller, Brian McWilliams, Fabrice Rousselle, Markus Gross, and Jan Novák. 2019. Neural Importance Sampling. ACM Trans. Graph. 38, 5, Article 145 (Oct. 2019), 19 pages.
    29. Thomas Müller, Fabrice Rousselle, Alexander Keller, and Jan Novák. 2020. Neural Control Variates. ACM Trans. Graph. 39, 6, Article 243 (Nov. 2020), 19 pages.
    30. Thomas Müller, Fabrice Rousselle, Jan Novák, and Alexander Keller. 2021. Real-time Neural Radiance Caching for Path Tracing. ACM Trans. Graph. 40, 4, Article 36 (Aug. 2021), 16 pages.
    31. Ken Museth. 2013. VDB: High-Resolution Sparse Volumes with Dynamic Topology. ACM Trans. Graph. 32, 3, Article 27 (July 2013), 22 pages.
    32. Ken Museth. 2021. NanoVDB: A GPU-Friendly and Portable VDB Data Structure For Real-Time Rendering And Simulation. In ACM SIGGRAPH 2021 Talks (Virtual Event, USA) (SIGGRAPH '21). Association for Computing Machinery, New York, NY, USA, Article 1, 2 pages.
    33. Thomas Neff, Pascal Stadlbauer, Mathias Parger, Andreas Kurz, Joerg H. Mueller, Chakravarty R. Alla Chaitanya, Anton S. Kaplanyan, and Markus Steinberger. 2021. DONeRF: Towards Real-Time Rendering of Compact Neural Radiance Fields using Depth Oracle Networks. Computer Graphics Forum 40, 4 (2021).
    34. Matthias Nießner, Michael Zollhöfer, Shahram Izadi, and Marc Stamminger. 2013. Real-Time 3D Reconstruction at Scale Using Voxel Hashing. ACM Trans. Graph. 32, 6, Article 169 (Nov. 2013), 11 pages.
    35. Fakir S. Nooruddin and Greg Turk. 2003. Simplification and Repair of Polygonal Models Using Volumetric Techniques. IEEE Transactions on Visualization and Computer Graphics 9, 2 (Apr. 2003), 191–205.
    36. Melissa E. O'Neill. 2014. PCG: A Family of Simple Fast Space-Efficient Statistically Good Algorithms for Random Number Generation. Technical Report HMC-CS-2014-0905. Harvey Mudd College, Claremont, CA.
    37. Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. 2019. DeepSDF: Learning Continuous Signed Distance Functions for Shape Representation. arXiv:1901.05103 (Jan. 2019).
    38. Songyou Peng, Michael Niemeyer, Lars Mescheder, Marc Pollefeys, and Andreas Geiger. 2020a. Convolutional Occupancy Networks. In European Conference on Computer Vision (ECCV).
    39. Songyou Peng, Michael Niemeyer, Lars Mescheder, Marc Pollefeys, and Andreas Geiger. 2020b. Convolutional Occupancy Networks. arXiv:2003.04618 (2020).
    40. Matt Pharr, Wenzel Jakob, and Greg Humphreys. 2016. Physically Based Rendering: From Theory to Implementation (3rd ed.). Morgan Kaufmann Publishers Inc., San Francisco, CA, USA. 1266 pages.
    41. Vincent Sitzmann, Julien N.P. Martel, Alexander W. Bergman, David B. Lindell, and Gordon Wetzstein. 2020. Implicit Neural Representations with Periodic Activation Functions. In Proc. NeurIPS.
    42. Cheng Sun, Min Sun, and Hwann-Tzong Chen. 2021. Direct Voxel Grid Optimization: Super-fast Convergence for Radiance Fields Reconstruction. arXiv:2111.11215 (Nov. 2021).
    43. Towaki Takikawa, Joey Litalien, Kangxue Yin, Karsten Kreis, Charles Loop, Derek Nowrouzezahrai, Alec Jacobson, Morgan McGuire, and Sanja Fidler. 2021. Neural Geometric Level of Detail: Real-time Rendering with Implicit 3D Shapes. (2021).
    44. Matthew Tancik, Pratul P. Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan T. Barron, and Ren Ng. 2020. Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains. NeurIPS (2020). https://bmild.github.io/fourfeat/index.html
    45. Danhang Tang, Mingsong Dou, Peter Lincoln, Philip Davidson, Kaiwen Guo, Jonathan Taylor, Sean Fanello, Cem Keskin, Adarsh Kowdle, Sofien Bouaziz, Shahram Izadi, and Andrea Tagliasacchi. 2018. Real-Time Compression and Streaming of 4D Performances. ACM Trans. Graph. 37, 6, Article 256 (Dec. 2018), 11 pages.
    46. Matthias Teschner, Bruno Heidelberger, Matthias Müller, Danat Pomeranets, and Markus Gross. 2003. Optimized Spatial Hashing for Collision Detection of Deformable Objects. In Proceedings of VMV'03, Munich, Germany. 47–54.
    47. Sergios Theodoridis. 2008. Pattern Recognition. Elsevier.
    48. Ashish Vaswani, Noam Shazeer, Niki Parmar, Jakob Uszkoreit, Llion Jones, Aidan N. Gomez, Lukasz Kaiser, and Illia Polosukhin. 2017. Attention Is All You Need. arXiv:1706.03762 (June 2017).
    49. Dor Verbin, Peter Hedman, Ben Mildenhall, Todd Zickler, Jonathan T. Barron, and Pratul P. Srinivasan. 2021. Ref-NeRF: Structured View-Dependent Appearance for Neural Radiance Fields. arXiv:2112.03907 (Dec. 2021).
    50. Suttisak Wizadwongsa, Pakkapon Phongthawee, Jiraphon Yenphraphai, and Supasorn Suwajanakorn. 2021. NeX: Real-time View Synthesis with Neural Basis Expansion. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR).
    51. Alex Yu, Sara Fridovich-Keil, Matthew Tancik, Qinhong Chen, Benjamin Recht, and Angjoo Kanazawa. 2021a. Plenoxels: Radiance Fields without Neural Networks. arXiv:2112.05131 (Dec. 2021).
    52. Alex Yu, Ruilong Li, Matthew Tancik, Hao Li, Ren Ng, and Angjoo Kanazawa. 2021b. PlenOctrees for Real-time Rendering of Neural Radiance Fields. In ICCV.

