“NeuralVDB: High-resolution Sparse Volume Representation using Hierarchical Neural Networks”
Conference:
Type(s):
Title:
- NeuralVDB: High-resolution Sparse Volume Representation using Hierarchical Neural Networks
Presenter(s)/Author(s):
Abstract:
NeuralVDB enhances the VDB framework for efficient sparse volumetric data storage by integrating machine learning. This new structure significantly reduces memory usage while maintaining flexibility with minimal compression errors. It combines a shallow VDB tree with hierarchical neural networks for topology and value encoding, achieving high compression ratios and outperforming other neural representations. Additionally, NeuralVDB improves animation compression and temporal coherence through warm-starting from previous frames.
References:
[1]
Felix Achilles, Alexandru-Eugen Ichim, Huseyin Coskun, Federico Tombari, Soheyl Noachtar, and Nassir Navab. 2016. Patient MoCap: Human pose estimation under blanket occlusion for hospital monitoring applications. Medical Image Computing and Computer-Assisted Intervention – MICCAI 2016: 19th International Conference, 491–499.
[2]
Adam W. Bargteil, Tolga G. Goktekin, James F. O'Brien, and John A. Strain. 2006. A semi-Lagrangian contouring method for fluid simulation. ACM Transactions on Graphics 25, 1 (2006), 19–38.
[3]
Narasimha Boddeti, Yunlong Tang, Kurt Maute, David W. Rosen, and Martin L. Dunn. 2020. Optimal design and manufacture of variable stiffness laminated continuous fiber reinforced composites. Scientific Reports 10, 1 (2020), 16507.
[4]
Sofien Bouaziz, Andrea Tagliasacchi, Hao Li, and Mark Pauly. 2016. Modern techniques and applications for real-time non-rigid registration. In Proceedings of the SIGGRAPH ASIA 2016 Courses. 1–25.
[5]
A. Brock, Th. Lim, J. M. Ritchie, and N. Weston. 2016. Generative and discriminative voxel modeling with convolutional neural networks. CoRR abs/1608.04236 (2016).
[6]
Zhiqin Chen and Hao Zhang. 2019. Learning implicit fields for generative shape modeling. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (2019).
[7]
Nuttapong Chentanez and Matthias Müller. 2011. Real-time Eulerian water simulation using a restricted tall cell grid. ACM Transactions on Graphics 30, 4 (2011), 1–10.
[8]
Thomas Davies, Derek Nowrouzezahrai, and Alec Jacobson. 2020. On the effectiveness of weight-encoded neural implicit 3D shapes. arXiv:2009.09808. Retrieved from https://arxiv.org/abs/2009.09808
[9]
Jean-loup Gailly and Mark Adler. 2004. Zlib compression library. (2004).
[10]
P. Hedman, P. P. Srinivasan, B. Mildenhall, J. T. Barron, and P. Debevec. 2021. Baking neural radiance fields for real-time view synthesis. In 2021 IEEE/CVF International Conference on Computer Vision (ICCV'21). 5855–5864.
[11]
Rama Karl Hoetzlein. 2016. GVDB: Raytracing sparse voxel database structures on the GPU. In Proceedings of the High Performance Graphics. 109–117.
[12]
Ben Houston, Michael B. Nielsen, Christopher Batty, Ola Nilsson, and Ken Museth. 2006. Hierarchical RLE level set: A compact and versatile deformable surface representation. ACM Transactions on Graphics 25, 1 (2006), 151–175.
[13]
Geoffrey Irving, Eran Guendelman, Frank Losasso, and Ronald Fedkiw. 2006. Efficient simulation of large bodies of water by coupling two and three dimensional techniques. ACM Transactions on Graphics 25, 3 (2006), 805–811.
[14]
Arthur Jacot, Franck Gabriel, and Clement Hongler. 2018. Neural tangent kernel: Convergence and generalization in neural networks. Advances in Neural Information Processing Systems 31 (2018).
[15]
JangaFX. 2020. EmberGen VDB Dataset. Accessed: 2022-02-15.
[16]
Diederik P. Kingma and Jimmy Ba. 2014. Adam: A method for stochastic optimization. arXiv:1412.6980. Retrieved from https://arxiv.org/abs/1412.6980
[17]
Heiner Kirchhoffer, Paul Haase, Wojciech Samek, Karsten Müller, Hamed Rezazadegan-Tavakoli, Francesco Cricri, Emre B. Aksu, Miska M. Hannuksela, Wei Jiang, Wei Wang, Shan Liu, Swayambhoo Jain, Shahab Hamidi-Rad, Fabien Racapé, and Werner Bailer. 2021. Overview of the neural network compression and representation (NNR) standard. IEEE Transactions on Circuits and Systems for Video Technology 32, 5 (2021), 3203–3216.
[18]
Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, and Neoklis Polyzotis. 2018. The case for learned index structures. In Proceedings of the 2018 International Conference on Management of Data. 489–504.
[19]
Didier Le Gall. 1991. MPEG: A video compression standard for multimedia applications. Communications of the ACM 34, 4 (1991), 46–58.
[20]
Minjae Lee, David Hyde, Michael Bao, and Ronald Fedkiw. 2018. A skinned tetrahedral mesh for hair animation and hair-water interaction. IEEE Transactions on Visualization and Computer Graphics (2018).
[21]
Minjae Lee, David Hyde, Kevin Li, and Ronald Fedkiw. 2019. A robust volume conserving method for character-water interaction. In Proceedings of the 18th annual ACM SIGGRAPH/Eurographics Symposium on Computer Animation. 1–12.
[22]
Randall J. Leveque. 1996. High-resolution conservative algorithms for advection in incompressible flow. SIAM Journal on Numerical Analysis 33, 2 (1996), 627–665.
[23]
Yuanzhan Li, Yuqi Liu, Yujie Lu, Siyu Zhang, Shen Cai, and Yanting Zhang. 2022. High-fidelity 3D model compression based on key spheres. arXiv:2201.07486. Retrieved from https://arxiv.org/abs/2201.07486
[24]
Lingjie Liu, Jiatao Gu, Kyaw Zaw Lin, Tat-Seng Chua, and Christian Theobalt. 2020. Neural sparse voxel fields. Advances in Neural Information Processing Systems 33 (2020), 15651–15663.
[25]
Zihao Liu, Tao Liu, Wujie Wen, Lei Jiang, Jie Xu, Yanzhi Wang, and Gang Quan. 2018. DeepN-JPEG: A deep neural network favorable JPEG-based image compression framework. In Proceedings of the 55th Annual Design Automation Conference. 1–6.
[26]
Frank Losasso, Frédéric Gibou, and Ron Fedkiw. 2004. Simulating water and smoke with an octree data structure. ACM Transactions on Graphics 23, 3 (2004), 457–462.
[27]
Siwei Ma, Xinfeng Zhang, Chuanmin Jia, Zhenghui Zhao, Shiqi Wang, and Shanshe Wang. 2019. Image and video compression with neural networks: A review. IEEE Transactions on Circuits and Systems for Video Technology 30, 6 (2019), 1683–1698.
[28]
Jessie Maisano. 2003. CT Scan of a Chameleon. Accessed: 2022-02-15.
[29]
Julien N. P. Martel, David B. Lindell, Connor Z. Lin, Eric R. Chan, Marco Monteiro, and Gordon Wetzstein. 2021. ACORN: Adaptive coordinate networks for neural scene representation. arXiv:2105.02788. Retrieved from https://arxiv.org/abs/2105.02788
[30]
Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas Geiger. 2019a. Occupancy networks: Learning 3D reconstruction in function space. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 4460–4470.
[31]
Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas Geiger. 2019b. Occupancy networks: Learning 3D reconstruction in function space. In Proceedings IEEE Conference on Computer Vision and Pattern Recognition.
[32]
Mateusz Michalkiewicz, Jhony K. Pontes, Dominic Jack, Mahsa Baktashmotlagh, and Anders Eriksson. 2019. Implicit surface representations as layers in neural networks. In Proceedings of the IEEE/CVF International Conference on Computer Vision. 4743–4752.
[33]
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. 2020. NeRF: Representing scenes as neural radiance fields for view synthesis. In Proceedings of the European Conference on Computer Vision. Springer, 405–421.
[34]
Ben Moseley, Andrew Markham, and Tarje Nissen-Meyer. 2021. Finite basis physics-informed neural networks (FBPINNs): A scalable domain decomposition approach for solving differential equations. arXiv:2107.07871. Retrieved from https://arxiv.org/abs/2107.07871
[35]
Thomas Müller, Alex Evans, Christoph Schied, and Alexander Keller. 2022. Instant neural graphics primitives with a multiresolution hash encoding. arXiv:2201.05989. Retrieved from https://arxiv.org/abs/2201.05989
[36]
Thomas Müller, Brian McWilliams, Fabrice Rousselle, Markus Gross, and Jan Novák. 2019. Neural importance sampling. ACM Transactions on Graphics 38, 5 (2019), 1–19.
[37]
Thomas Müller, Fabrice Rousselle, Alexander Keller, and Jan Novák. 2020. Neural control variates. ACM Transactions on Graphics 39, 6 (2020), 1–19.
[38]
Thomas Müller, Fabrice Rousselle, Jan Novák, and Alexander Keller. 2021. Real-time neural radiance caching for path tracing. ACM Transactions on Graphics 40, 4 (2021), 36:1–36:16.
[39]
Ken Museth. 2011. DB+Grid: A novel dynamic blocked grid for sparse high-resolution volumes and level sets. In Proceedings of the ACM SIGGRAPH 2011 Talks (Vancouver, British Columbia). ACM, New York, NY, 1 page.
[40]
Ken Museth. 2013. VDB: High-resolution sparse volumes with dynamic topology. ACM Transactions on Graphics 32, 3 (2013), 1–22.
[41]
Ken Museth. 2021. NanoVDB: A GPU-friendly and portable VDB data structure for real-time rendering and simulation. In Proceedings of the ACM SIGGRAPH 2021 Talks. 1–2.
[42]
Michael B. Nielsen and Ken Museth. 2006. Dynamic tubular grid: An efficient data structure and algorithms for high resolution level sets. Journal of Scientific Computing 26, 3 (2006), 261–299.
[43]
Renato Pajarola and J. Rossignac. 2000. Compressed progressive meshes. IEEE Transactions on Visualization and Computer Graphics (2000).
[44]
Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. 2019. DeepSDF: Learning continuous signed distance functions for shape representation. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 165–174.
[45]
Adam Paszke, Sam Gross, Francisco Massa, Adam Lerer, James Bradbury, Gregory Chanan, Trevor Killeen, Zeming Lin, Natalia Gimelshein, Luca Antiga, Alban Desmaison, Andreas Kopf, Edward Yang, Zachary DeVito, Martin Raison, Alykhan Tejani, Sasank Chilamkurthy, Benoit Steiner, Lu Fang, Junjie Bai, and Soumith Chintala. 2019. PyTorch: An imperative style, high-performance deep learning library. In Proceedings of the Advances in Neural Information Processing Systems. H. Wallach, H. Larochelle, A. Beygelzimer, F. d’Alché-Buc, E. Fox, and R. Garnett (Eds.), Curran Associates, Inc., 8024–8035. Retrieved from http://papers.neurips.cc/paper/9015-pytorch-an-imperative-style-high-performance-deep-learning-library.pdf
[46]
Danping Peng, Barry Merriman, Stanley Osher, Hongkai Zhao, and Myungjoo Kang. 1999. A PDE-based fast local level set method. Journal of Computational Physics 155, 2 (1999), 410–438.
[47]
Songyou Peng, Michael Niemeyer, Lars Mescheder, Marc Pollefeys, and Andreas Geiger. 2020. Convolutional occupancy networks. In Proceedings of the European Conference on Computer Vision. Springer, 523–540.
[48]
William B. Pennebaker and Joan L. Mitchell. 1992. JPEG: Still Image Data Compression Standard. Springer Science & Business Media.
[49]
Nasim Rahaman, Aristide Baratin, Devansh Arpit, Felix Draxler, Min Lin, Fred Hamprecht, Yoshua Bengio, and Aaron Courville. 2019. On the spectral bias of neural networks. In Proceedings of the International Conference on Machine Learning. PMLR, 5301–5310.
[50]
Shunsuke Saito, Liwen Hu, Chongyang Ma, Hikaru Ibayashi, Linjie Luo, and Hao Li. 2018. 3D hair synthesis using volumetric variational autoencoders. ACM Transactions on Graphics 37, 6 (2018), 1–12.
[51]
Mirko Sattler, Ralf Sarlette, and Reinhard Klein. 2005. Simple and efficient compression of animation sequences. In Proceedings of the 2005 ACM SIGGRAPH/Eurographics Symposium on Computer Animation. Association for Computing Machinery, 209–217.
[52]
Rajsekhar Setaluri, Mridul Aanjaneya, Sean Bauer, and Eftychios Sifakis. 2014. SPGrid: A sparse paged grid structure applied to adaptive smoke simulation. ACM Transactions on Graphics 33, 6 (2014), 1–12.
[53]
Noam Shazeer, Azalia Mirhoseini, Krzysztof Maziarz, Andy Davis, Quoc V. Le, Geoffrey E. Hinton, and Jeff Dean. 2017. Outrageously large neural networks: The sparsely-gated mixture-of-experts layer. In Proceedings of the ICLR (Poster).
[54]
Vincent Sitzmann, Eric Chan, Richard Tucker, Noah Snavely, and Gordon Wetzstein. 2020. MetaSDF: Meta-learning signed distance functions. Advances in Neural Information Processing Systems 33 (2020), 10136–10147.
[55]
Vincent Sitzmann, Julien Martel, Alexander Bergman, David Lindell, and Gordon Wetzstein. 2020. Implicit neural representations with periodic activation functions. Advances in Neural Information Processing Systems 33 (2020), 7462–7473.
[56]
John Strain. 2001. A fast semi-Lagrangian contouring method for moving interfaces. Journal of Computational Physics 170, 1 (2001), 373–394.
[57]
Towaki Takikawa, Alex Evans, Jonathan Tremblay, Thomas Müller, Morgan McGuire, Alec Jacobson, and Sanja Fidler. 2022a. Variable bitrate neural fields. In ACM SIGGRAPH 2022 Conference Proceedings. 1–9.
[58]
Towaki Takikawa, Joey Litalien, Kangxue Yin, Karsten Kreis, Charles Loop, Derek Nowrouzezahrai, Alec Jacobson, Morgan McGuire, and Sanja Fidler. 2021. Neural geometric level of detail: Real-time rendering with implicit 3D shapes. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 11358–11367.
[59]
Towaki Takikawa, Or Perel, Clement Fuji Tsang, Charles Loop, Joey Litalien, Jonathan Tremblay, Sanja Fidler, and Maria Shugrina. 2022b. Kaolin Wisp: A PyTorch Library and Engine for Neural Fields Research. Retrieved April 4th, 2023 from https://github.com/NVIDIAGameWorks/kaolin-wisp
[60]
Matthew Tancik, Pratul P. Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan T. Barron, and Ren Ng. 2020. Fourier features let networks learn high frequency functions in low dimensional domains. NeurIPS (2020).
[61]
Danhang Tang, Mingsong Dou, Peter Lincoln, Philip Davidson, Kaiwen Guo, Jonathan Taylor, Sean Fanello, Cem Keskin, Adarsh Kowdle, Sofien Bouaziz, et al. 2018. Real-time compression and streaming of 4D performances. ACM Transactions on Graphics 37, 6 (2018), 1–11.
[62]
Danhang Tang, Saurabh Singh, Philip A. Chou, Christian Hane, Mingsong Dou, Sean Fanello, Jonathan Taylor, Philip Davidson, Onur G. Guleryuz, Yinda Zhang, et al. 2020. Deep implicit volume compression. In Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 1293–1303.
[63]
The Blosc Development Team. 2020. Blosc. Accessed: 2022-02-04.
[64]
Sébastien Valette and Rémy Prost. 2004. A wavelet-based progressive compression scheme for triangle meshes: Wavemesh. IEEE Transactions on Visualization and Computer Graphics 10, 2 (2004), 123–129.
[65]
Patricio Gonzalez Vivo and Jen Lowe. 2015. The book of shaders: Fractal brownian motion. Patricio Gonzalez Vivo, https://thebookofshaders.com/13 (2015).
[66]
Ignacio Vizzo, Tiziano Guadagnino, Jens Behley, and Cyrill Stachniss. 2022. Vdbfusion: Flexible and efficient tsdf integration of range sensor data. Sensors 22, 3 (2022), 1296.
[67]
Walt Disney Animation Studios. 2017. Disney Clouds Dataset. Accessed: 2021-12-09.
[68]
Magnus Wrenninge, Chris Allen, Sosh Mirsepassi, Stephen Marshall, Chris Burdorf, Henrik Falt, Scot Shinderman, and Doug Bloom. 2020. Field3D. Retrieved from https://github.com/imageworks/Field3D
[69]
Tong Wu, Liang Pan, Junzhe Zhang, Tai Wang, Ziwei Liu, and Dahua Lin. 2021. Density-aware chamfer distance as a comprehensive metric for point cloud completion. arXiv:2111.12702. Retrieved from https://arxiv.org/abs/2111.12702
[70]
Yiheng Xie, Towaki Takikawa, Shunsuke Saito, Or Litany, Shiqin Yan, Numair Khan, Federico Tombari, James Tompkin, Vincent Sitzmann, and Srinath Sridhar. 2022. Neural fields in visual computing and beyond. Computer Graphics Forum (2022).
[71]
Alex Yu, Vickie Ye, Matthew Tancik, and Angjoo Kanazawa. 2021. pixelNeRF: Neural radiance fields from one or few images. In Proceedings of the CVPR.
[72]
Zongwei Zhou, Jae Shin, Lei Zhang, Suryakanth Gurudu, Michael Gotway, and Jianming Liang. 2017. Fine-tuning convolutional neural networks for biomedical image analysis: Actively and incrementally. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 7340–7351.