“Diffusion-based Holistic Texture Rectification and Synthesis” by Hao, Iizuka, Hara, Simo-Serra, Kataoka, et al. …
Conference:
Type(s):
Title:
- Diffusion-based Holistic Texture Rectification and Synthesis
Session/Category Title: Visualizing the Future
Presenter(s)/Author(s):
Abstract:
We present a novel framework for rectifying occlusions and distortions in degraded texture samples from natural images. Traditional texture synthesis approaches focus on generating textures from pristine samples, which require meticulous preparation by humans and are unattainable in most natural images. These challenges stem from the frequent occlusions and distortions of texture samples in natural images due to obstructions and variations in object surface geometry. To address these issues, we propose a framework that synthesizes holistic textures from degraded samples in natural images, extending the applicability of exemplar-based texture synthesis techniques. Our framework utilizes a conditional Latent Diffusion Model (LDM) with a novel occlusion-aware latent transformer. This latent transformer not only effectively encodes texture features from partially observed samples necessary for the generation process of the LDM, but also explicitly captures long-range dependencies in samples with large occlusions. To train our model, we introduce a method for generating synthetic data by applying geometric transformations and free-form mask generation to clean textures. Experimental results demonstrate that our framework significantly outperforms existing methods both quantitatively and qualitatively. Furthermore, we conduct comprehensive ablation studies to validate the different components of our proposed framework. Results are corroborated by a perceptual user study which highlights the efficiency of our proposed approach.
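The synthetic-data step mentioned in the abstract (geometric transformations plus free-form masks applied to clean textures) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper: the choice of a random homography as the geometric transformation, random-walk brush strokes as the free-form mask, nearest-neighbour sampling, and all function names and parameter values (`random_homography`, `warp`, `free_form_mask`, `degrade`, `jitter`, `strokes`, `steps`, `width`) are hypothetical.

```python
import numpy as np


def random_homography(rng, size, jitter=0.15):
    """Hypothetical helper: homography sending the image corners to jittered corners."""
    src = np.array([[0, 0], [1, 0], [1, 1], [0, 1]], dtype=np.float64) * (size - 1)
    dst = src + rng.uniform(-jitter, jitter, src.shape) * (size - 1)
    # Direct linear transform: solve for the 8 unknowns of H with H[2, 2] = 1.
    rows, rhs = [], []
    for (x, y), (u, v) in zip(src, dst):
        rows.append([x, y, 1, 0, 0, 0, -u * x, -u * y])
        rows.append([0, 0, 0, x, y, 1, -v * x, -v * y])
        rhs.extend([u, v])
    h = np.linalg.solve(np.array(rows), np.array(rhs))
    return np.append(h, 1.0).reshape(3, 3)


def warp(texture, h):
    """Warp a square texture by H (inverse mapping, nearest-neighbour sampling)."""
    size = texture.shape[0]
    ys, xs = np.mgrid[0:size, 0:size]
    pts = np.stack([xs.ravel(), ys.ravel(), np.ones(size * size)])
    sx, sy, w = np.linalg.inv(h) @ pts
    front = w > 1e-9                      # discard degenerate projections
    w = np.where(front, w, 1.0)
    sx = np.clip(np.rint(sx / w), -1, size).astype(int)
    sy = np.clip(np.rint(sy / w), -1, size).astype(int)
    valid = front & (sx >= 0) & (sx < size) & (sy >= 0) & (sy < size)
    out = np.zeros_like(texture)
    out[ys.ravel()[valid], xs.ravel()[valid]] = texture[sy[valid], sx[valid]]
    return out, valid.reshape(size, size)


def free_form_mask(rng, size, strokes=4, steps=10, width=4):
    """Boolean mask (True = occluded) drawn as random-walk brush strokes."""
    mask = np.zeros((size, size), dtype=bool)
    ys, xs = np.mgrid[0:size, 0:size]
    for _ in range(strokes):
        pos = rng.uniform(0, size - 1, 2)
        angle = rng.uniform(0, 2 * np.pi)
        for _ in range(steps):
            mask |= (xs - pos[0]) ** 2 + (ys - pos[1]) ** 2 <= width ** 2
            angle += rng.normal(0, 0.6)
            step = width * np.array([np.cos(angle), np.sin(angle)])
            pos = np.clip(pos + step, 0, size - 1)
    return mask


def degrade(texture, seed=0):
    """Return (degraded texture, visibility mask) for a clean square texture."""
    rng = np.random.default_rng(seed)
    size = texture.shape[0]
    warped, inside = warp(texture, random_homography(rng, size))
    visible = inside & ~free_form_mask(rng, size)
    degraded = warped.copy()
    degraded[~visible] = 0                # hide occluded and out-of-frame pixels
    return degraded, visible
```

Under these assumptions, calling `degrade(clean_texture)` on a square `H x H` or `H x H x C` array yields a training pair: the degraded sample with its visibility mask as the conditioning input, and the clean texture as the target.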
References:
[1]
Safia Abdelmounaime and He Dong-Chen. 2013. New Brodatz-Based Image Databases for Grayscale Color and Multiband Texture Analysis. Volume 2013 (2013). https://doi.org/10.1155/2013/876386
[2]
Omer Bar-Tal, Lior Yariv, Yaron Lipman, and Tali Dekel. 2023. MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation. In International Conference on Machine Learning.
[3]
Sean Bell, Paul Upchurch, Noah Snavely, and Kavita Bala. 2013. OpenSurfaces: A Richly Annotated Catalog of Surface Appearance. ACM Transactions on Graphics (Proceedings of SIGGRAPH) 32, 4 (2013).
[4]
Urs Bergmann, Nikolay Jetchev, and Roland Vollgraf. 2017. Learning Texture Manifolds with the Periodic Spatial GAN. In International Conference on Machine Learning.
[5]
Fred L. Bookstein. 1989. Principal Warps: Thin-Plate Splines and the Decomposition of Deformations. IEEE Transactions on Pattern Analysis and Machine Intelligence 11, 6 (1989), 567–585.
[6]
Gertjan J. Burghouts and Jan-Mark Geusebroek. 2009. Material-specific Adaptation of Color Invariant Features. Pattern Recognition Letters 30 (2009), 306–313.
[7]
Chin-Fan Chen and Evan Suma Rosenberg. 2018. Virtual Content Creation Using Dynamic Omnidirectional Texture Synthesis. In IEEE Conference on Virtual Reality and 3D User Interfaces (VR).
[8]
Mircea Cimpoi, Subhransu Maji, Iasonas Kokkinos, Sammy Mohamed, and Andrea Vedaldi. 2014. Describing Textures in the Wild. In IEEE Conference on Computer Vision and Pattern Recognition.
[9]
Antonio Criminisi, Patrick Pérez, and Kentaro Toyama. 2003. Object Removal by Exemplar-Based Inpainting. In IEEE Conference on Computer Vision and Pattern Recognition.
[10]
Dengxin Dai, Hayko Riemenschneider, and Luc Van Gool. 2014. The Synthesizability of Texture Examples. In IEEE Conference on Computer Vision and Pattern Recognition.
[11]
Prafulla Dhariwal and Alexander Nichol. 2021. Diffusion Models Beat GANs on Image Synthesis. In Advances in Neural Information Processing Systems.
[12]
Alexei A. Efros and William T. Freeman. 2001. Image Quilting for Texture Synthesis and Transfer. In SIGGRAPH ’01: Proceedings of the 28th Annual Conference on Computer Graphics and Interactive Techniques. 341–346.
[13]
Alexei A. Efros and Thomas K. Leung. 1999. Texture Synthesis by Non-parametric Sampling. In International Conference on Computer Vision.
[14]
Patrick Esser, Robin Rombach, and Björn Ommer. 2021. Taming Transformers for High-Resolution Image Synthesis. In IEEE Conference on Computer Vision and Pattern Recognition.
[15]
Leon Gatys, Alexander S. Ecker, and Matthias Bethge. 2015. Texture Synthesis Using Convolutional Neural Networks. In Conference on Neural Information Processing Systems.
[16]
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative Adversarial Nets. In Conference on Neural Information Processing Systems.
[17]
Richard Hartley and Andrew Zisserman. 2003. Multiple View Geometry in Computer Vision. Cambridge University Press.
[18]
Martin Heusel, Hubert Ramsauer, Thomas Unterthiner, Bernhard Nessler, and Sepp Hochreiter. 2017. GANs Trained by a Two Time-Scale Update Rule Converge to a Local Nash Equilibrium. In Conference on Neural Information Processing Systems.
[19]
Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising Diffusion Probabilistic Models. In Advances in Neural Information Processing Systems.
[20]
Jonathan Ho and Tim Salimans. 2021. Classifier-Free Diffusion Guidance. In NeurIPS Workshop on Deep Generative Models and Downstream Applications.
[21]
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A. Efros. 2017. Image-to-Image Translation with Conditional Adversarial Nets. In IEEE Conference on Computer Vision and Pattern Recognition.
[22]
Naoya Isoyama, Yamato Sakuragi, Tsutomu Terada, and Masahiko Tsukamoto. 2021. Effects of Augmented Reality Object and Texture Presentation on Walking Behavior. Electronics 10, 6 (2021).
[23]
Nikolay Jetchev, Urs M. Bergmann, and Roland Vollgraf. 2016. Texture Synthesis with Spatial Generative Adversarial Networks. CoRR abs/1611.08207 (2016).
[24]
Justin Johnson, Alexandre Alahi, and Li Fei-Fei. 2016. Perceptual Losses for Real-Time Style Transfer and Super-Resolution. In European Conference on Computer Vision.
[25]
Diederik P. Kingma and Jimmy Ba. 2015. Adam: A Method for Stochastic Optimization. In International Conference on Learning Representations.
[26]
Roland Kwitt and Peter Meerwald. 2008. Salzburg Texture Image Database. Online.
[27]
Wenbo Li, Zhe Lin, Kun Zhou, Lu Qi, Yi Wang, and Jiaya Jia. 2022a. MAT: Mask-Aware Transformer for Large Hole Image Inpainting. In IEEE Conference on Computer Vision and Pattern Recognition.
[28]
Xueting Li, Xiaolong Wang, Ming-Hsuan Yang, Alexei A. Efros, and Sifei Liu. 2022b. Scraping Textures from Natural Images for Synthesis and Editing. In European Conference on Computer Vision.
[29]
Yijun Li, Chen Fang, Jimei Yang, Zhaowen Wang, Xin Lu, and Ming-Hsuan Yang. 2017. Diversified Texture Synthesis with Feed-forward Networks. In IEEE Conference on Computer Vision and Pattern Recognition.
[30]
Guilin Liu, Fitsum A. Reda, Kevin J. Shih, Ting-Chun Wang, Andrew Tao, and Bryan Catanzaro. 2018. Image Inpainting for Irregular Holes Using Partial Convolutions. In European Conference on Computer Vision.
[31]
Guilin Liu, Rohan Taori, Ting-Chun Wang, Zhiding Yu, Shiqiu Liu, Fitsum A. Reda, Karan Sapra, Andrew Tao, and Bryan Catanzaro. 2020. Transposer: Universal Texture Synthesis Using Feature Maps as Transposed Convolution Filter. CoRR abs/2007.07243 (2020).
[32]
P. B. Mallikarjuna, Alireza Tavakoli Targhi, Mario Fritz, Eric Hayman, Barbara Caputo, and J. O. Eklundh. 2006. THE KTH-TIPS 2 database.
[33]
Morteza Mardani, Guilin Liu, Aysegul Dundar, Shiqiu Liu, Andrew Tao, and Bryan Catanzaro. 2020. Neural FFTs for Universal Texture Image Synthesis. In Advances in Neural Information Processing Systems.
[34]
Mehdi Mirza and Simon Osindero. 2014. Conditional Generative Adversarial Nets. CoRR abs/1411.1784 (2014).
[35]
Rosalind Picard, Chris Graczyk, Steve Mann, Josh Wachman, Len Picard, and Lee Campbell. 2010. Vistex Vision Texture Database. Online.
[36]
E. Riba, D. Mishkin, D. Ponsa, E. Rublee, and G. Bradski. 2020. Kornia: an Open Source Differentiable Computer Vision Library for PyTorch. In WACV.
[37]
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. 2022. High-Resolution Image Synthesis with Latent Diffusion Models. In IEEE Conference on Computer Vision and Pattern Recognition.
[38]
Chitwan Saharia, William Chan, Huiwen Chang, Chris A. Lee, Jonathan Ho, Tim Salimans, David J. Fleet, and Mohammad Norouzi. 2022. Palette: Image-to-Image Diffusion Models. In SIGGRAPH ’22: ACM SIGGRAPH Conference Proceedings.
[39]
Chitwan Saharia, Jonathan Ho, William Chan, Tim Salimans, David J. Fleet, and Mohammad Norouzi. 2023. Image Super-Resolution via Iterative Refinement. IEEE Transactions on Pattern Analysis and Machine Intelligence 45, 4 (2023).
[40]
Lavanya Sharan, Ce Liu, Ruth Rosenholtz, and Edward H. Adelson. 2014. Accuracy and Speed of Material Categorization in Real-World Images. Journal of Vision 14, 9 (2014), article 12.
[41]
Jascha Sohl-Dickstein, Eric Weiss, Niru Maheswaranathan, and Surya Ganguli. 2015. Deep Unsupervised Learning Using Nonequilibrium Thermodynamics. In International Conference on Machine Learning.
[42]
Jiaming Song, Chenlin Meng, and Stefano Ermon. 2021a. Denoising Diffusion Implicit Models. In International Conference on Learning Representations.
[43]
Yang Song, Jascha Sohl-Dickstein, Diederik P. Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. 2021b. Score-Based Generative Modeling through Stochastic Differential Equations. In International Conference on Learning Representations.
[44]
Aaron van den Oord, Oriol Vinyals, and Koray Kavukcuoglu. 2017. Neural Discrete Representation Learning. In Conference on Neural Information Processing Systems.
[45]
Dor Verbin and Todd Zickler. 2020. Toward a Universal Model for Shape from Texture. In IEEE Conference on Computer Vision and Pattern Recognition.
[46]
Zhou Wang, Alan C. Bovik, Hamid R. Sheikh, and Eero P. Simoncelli. 2004. Image Quality Assessment: From Error Visibility to Structural Similarity. IEEE Transactions on Image Processing 13, 4 (2004), 600–612.
[47]
Li-yi Wei, Sylvain Lefebvre, Vivek Kwatra, and Greg Turk. 2009. State of the Art in Example-based Texture Synthesis. In Eurographics 2009 – State of the Art Reports.
[48]
Li-Yi Wei and Marc Levoy. 2000. Fast Texture Synthesis Using Tree-Structured Vector Quantization. In SIGGRAPH ’00: Proceedings of the 27th Annual Conference on Computer Graphics and Interactive Techniques. 479–488.
[49]
Huisi Wu, Xiaomeng Lyu, and Zhenkun Wen. 2018. Automatic Texture Exemplar Extraction Based on Global and Local Textureness Measures. Computational Visual Media 4 (2018), 173–184.
[50]
Huisi Wu, Wei Yan, Ping Li, and Zhenkun Wen. 2021. Deep Texture Exemplar Extraction Based on Trimmed T-CNN. IEEE Transactions on Multimedia 23 (2021), 4502–4514.
[51]
Jiahui Yu, Zhe Lin, Jimei Yang, Xiaohui Shen, Xin Lu, and Thomas Huang. 2019. Free-Form Image Inpainting with Gated Convolution. In International Conference on Computer Vision.
[52]
Han Zhang, Ian Goodfellow, Dimitris Metaxas, and Augustus Odena. 2019. Self-Attention Generative Adversarial Networks. In International Conference on Machine Learning.
[53]
Richard Zhang, Phillip Isola, Alexei A. Efros, Eli Shechtman, and Oliver Wang. 2018. The Unreasonable Effectiveness of Deep Features as a Perceptual Metric. In IEEE Conference on Computer Vision and Pattern Recognition.
[54]
Yang Zhou, Zhen Zhu, Xiang Bai, Dani Lischinski, Daniel Cohen-Or, and Hui Huang. 2018. Non-Stationary Texture Synthesis by Adversarial Expansion. ACM Transactions on Graphics (Proceedings of SIGGRAPH) 37, 4 (2018), 13 pages.
[55]
Jun-Yan Zhu, Taesung Park, Phillip Isola, and Alexei A. Efros. 2017. Unpaired Image-to-Image Translation using Cycle-Consistent Adversarial Networks. In International Conference on Computer Vision.


