“Matting by Generation”
Conference:
Type(s):
Title:
- Matting by Generation
Presenter(s)/Author(s):
Abstract:
This paper redefines traditional regression-based matting as a generative modeling challenge and harnesses the capabilities of latent diffusion models enriched with extensive pre-trained knowledge to tackle this challenge. It not only produces mattes with superior resolution and detail but is also versatile and can perform both guidance-free and guidance-based matting.
References:
[1]
Yagiz Aksoy, Tunç Ozan Aydin, and Marc Pollefeys. 2017. Designing effective inter-pixel information flow for natural image matting. In CVPR.
[2]
Omer Bar-Tal, Lior Yariv, Yaron Lipman, and Tali Dekel. 2023. MultiDiffusion: Fusing Diffusion Paths for Controlled Image Generation. In ICML.
[3]
Tim Brooks, Aleksander Holynski, and Alexei A. Efros. 2023. InstructPix2Pix: Learning to Follow Image Editing Instructions. In CVPR.
[4]
Ryan Burgert, Kanchana Ranasinghe, Xiang Li, and Michael S Ryoo. 2023. Peekaboo: Text to image diffusion models are zero-shot segmentors. In CVPRW.
[5]
Quan Chen, Tiezheng Ge, Yanyu Xu, Zhiqiang Zhang, Xinxin Yang, and Kun Gai. 2018. Semantic human matting. In ACM MM.
[6]
Qifeng Chen, Dingzeyu Li, and Chi-Keung Tang. 2013. KNN matting. IEEE TPAMI 35, 9 (2013), 2175–2188.
[7]
Donghyeon Cho, Yu-Wing Tai, and In So Kweon. 2019. Deep Convolutional Neural Network for Natural Image Matting Using Initial Alpha Mattes. IEEE TIP 28, 3 (2019), 1054–1067.
[8]
Yung-Yu Chuang, Brian Curless, David H. Salesin, and Richard Szeliski. 2001. A Bayesian Approach to Digital Matting. In CVPR.
[9]
Ben Fei, Zhaoyang Lyu, Liang Pan, Junzhe Zhang, Weidong Yang, Tianyue Luo, Bo Zhang, and Bo Dai. 2023. Generative Diffusion Prior for Unified Image Restoration and Enhancement. In CVPR.
[10]
Xiaoxue Feng, Xiaohui Liang, and Zili Zhang. 2016. A Cluster Sampling Method for Image Matting via Sparse Coding. In ECCV.
[11]
Eduardo S. L. Gastal and Manuel M. Oliveira. 2010. Shared Sampling for Real-Time Alpha Matting. In Eurographics.
[12]
Thomas Germer, Tobias Uelwer, Stefan Conrad, and Stefan Harmeling. 2021. Fast multi-level foreground estimation. In ICPR.
[13]
Leo Grady, Thomas Schiwietz, Shmuel Aharon, and Rüdiger Westermann. 2005. Random walks for interactive alpha-matting. In Proceedings of the IASTED International Conference on Visualization, Imaging and Image Processing.
[14]
Kaiming He, Christoph Rhemann, Carsten Rother, Xiaoou Tang, and Jian Sun. 2011. A global sampling method for alpha matting. In CVPR.
[15]
Jonathan Ho, Ajay Jain, and Pieter Abbeel. 2020. Denoising Diffusion Probabilistic Models. In NeurIPS.
[16]
Yihan Hu, Yiheng Lin, Wei Wang, Yao Zhao, Yunchao Wei, and Humphrey Shi. 2023. Diffusion for Natural Image Matting. arXiv preprint arXiv:2312.05915 (2023).
[17]
Bahjat Kawar, Shiran Zada, Oran Lang, Omer Tov, Huiwen Chang, Tali Dekel, Inbar Mosseri, and Michal Irani. 2023. Imagic: Text-Based Real Image Editing with Diffusion Models. In CVPR.
[18]
Zhanghan Ke, Jiayu Sun, Kaican Li, Qiong Yan, and Rynson WH Lau. 2022. MODNet: Real-time trimap-free portrait matting via objective decomposition. In AAAI.
[19]
Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, and Ross Girshick. 2023a. Segment Anything. In ICCV.
[20]
Alexander Kirillov, Eric Mintun, Nikhila Ravi, Hanzi Mao, Chloe Rolland, Laura Gustafson, Tete Xiao, Spencer Whitehead, Alexander C. Berg, Wan-Yen Lo, Piotr Dollár, and Ross Girshick. 2023b. Segment Anything. In ICCV.
[21]
Anat Levin, Dani Lischinski, and Yair Weiss. 2008. A Closed-Form Solution to Natural Image Matting. IEEE TPAMI 30, 2 (2008), 228–242.
[22]
Jiachen Li, Jitesh Jain, and Humphrey Shi. 2023a. Matting Anything. arXiv preprint arXiv:2306.05399 (2023).
[23]
Junnan Li, Dongxu Li, Silvio Savarese, and Steven Hoi. 2023b. BLIP-2: Bootstrapping language-image pre-training with frozen image encoders and large language models. In ICML.
[24]
Jizhizi Li, Sihan Ma, Jing Zhang, and Dacheng Tao. 2021. Privacy-Preserving Portrait Matting. In ACM MM.
[25]
Jizhizi Li, Jing Zhang, Stephen J. Maybank, and Dacheng Tao. 2022b. Bridging composite and real: towards end-to-end deep image matting. IJCV 130, 2 (2022), 246–266.
[26]
Jizhizi Li, Jing Zhang, and Dacheng Tao. 2023d. Deep Image Matting: A Comprehensive Survey. arXiv preprint arXiv:2304.04672 (2023).
[27]
Peizhuo Li, Kfir Aberman, Zihan Zhang, Rana Hanocka, and Olga Sorkine-Hornung. 2022a. GANimator: Neural Motion Synthesis from a Single Sequence. ACM TOG 41, 4 (2022), 138.
[28]
Yanyu Li, Huan Wang, Qing Jin, Ju Hu, Pavlo Chemerys, Yun Fu, Yanzhi Wang, Sergey Tulyakov, and Jian Ren. 2023c. SnapFusion: Text-to-Image Diffusion Model on Mobile Devices within Two Seconds. In NeurIPS.
[29]
Shanchuan Lin, Andrey Ryabtsev, Soumyadip Sengupta, Brian L Curless, Steven M Seitz, and Ira Kemelmacher-Shlizerman. 2021. Real-time high-resolution background matting. In CVPR.
[30]
Jinlin Liu, Yuan Yao, Wendi Hou, Miaomiao Cui, Xuansong Xie, Changshui Zhang, and Xian-Sheng Hua. 2020. Boosting semantic human matting with coarse annotations. In CVPR.
[31]
Yuhao Liu, Jiake Xie, Xiao Shi, Yu Qiao, Yujie Huang, Yong Tang, and Xin Yang. 2021. Tripartite information mining and integration for image matting. In ICCV.
[32]
Hao Lu, Yutong Dai, Chunhua Shen, and Songcen Xu. 2019a. Context-Aware Image Matting for Simultaneous Foreground and Alpha Estimation. In ICCV.
[33]
Hao Lu, Yutong Dai, Chunhua Shen, and Songcen Xu. 2019b. Indices matter: Learning to index for deep image matting. In ICCV.
[34]
Sihan Ma, Jizhizi Li, Jing Zhang, He Zhang, and Dacheng Tao. 2023. Rethinking Portrait Matting with Privacy Preserving. IJCV 131, 8 (2023), 2172–2197.
[35]
GyuTae Park, SungJoon Son, JaeYoung Yoo, SeHo Kim, and Nojun Kwak. 2022. Matteformer: Transformer-based image matting via prior-tokens. In CVPR.
[36]
Thomas Porter and Tom Duff. 1984. Compositing Digital Images. In SIGGRAPH.
[37]
Yu Qiao, Yuhao Liu, Xin Yang, Dongsheng Zhou, Mingliang Xu, Qiang Zhang, and Xiaopeng Wei. 2020. Attention-guided hierarchical structure aggregation for image matting. In CVPR.
[38]
Christoph Rhemann, Carsten Rother, Jue Wang, Margrit Gelautz, Pushmeet Kohli, and Pamela Rott. 2009. A perceptually motivated online benchmark for image matting. In CVPR.
[39]
Robin Rombach, Andreas Blattmann, Dominik Lorenz, Patrick Esser, and Björn Ommer. 2022. High-resolution image synthesis with latent diffusion models. In CVPR.
[40]
Christoph Schuhmann, Romain Beaumont, Richard Vencu, Cade Gordon, Ross Wightman, Mehdi Cherti, Theo Coombes, Aarush Katta, Clayton Mullis, Mitchell Wortsman, et al. 2022. LAION-5B: An open large-scale dataset for training next generation image-text models. In NeurIPS.
[41]
Soumyadip Sengupta, Vivek Jayaram, Brian Curless, Steven M Seitz, and Ira Kemelmacher-Shlizerman. 2020. Background matting: The world is your green screen. In CVPR.
[42]
Ehsan Shahrian, Deepu Rajan, Brian Price, and Scott Cohen. 2013. Improving image matting using comprehensive sampling sets. In CVPR.
[43]
Dmitriy Smirnov, Chloe LeGendre, Xueming Yu, and Paul Debevec. 2023. Magenta Green Screen: Spectrally Multiplexed Alpha Matting with Deep Colorization. In Proceedings of the Digital Production Symposium.
[44]
Jiaming Song, Chenlin Meng, and Stefano Ermon. 2021. Denoising Diffusion Implicit Models. In ICLR.
[45]
Yang Song, Prafulla Dhariwal, Mark Chen, and Ilya Sutskever. 2023. Consistency models. In ICML.
[46]
Yang Song, Jascha Sohl-Dickstein, Diederik P Kingma, Abhishek Kumar, Stefano Ermon, and Ben Poole. 2021. Score-Based Generative Modeling through Stochastic Differential Equations. In ICLR.
[47]
Jian Sun, Jiaya Jia, Chi-Keung Tang, and Heung-Yeung Shum. 2004. Poisson matting. ACM TOG 23, 3 (2004), 315–321.
[48]
Yanan Sun, Chi-Keung Tang, and Yu-Wing Tai. 2021. Semantic image matting. In CVPR.
[49]
Luming Tang, Nataniel Ruiz, Chu Qinghao, Yuanzhen Li, Aleksander Holynski, David E Jacobs, Bharath Hariharan, Yael Pritch, Neal Wadhwa, Kfir Aberman, and Michael Rubinstein. 2023. RealFill: Reference-Driven Generation for Authentic Image Completion. arXiv preprint arXiv:2309.16668 (2023).
[50]
Jue Wang and Michael F. Cohen. 2007. Optimized Color Sampling for Robust Matting. In CVPR.
[51]
Tianyi Wei, Dongdong Chen, Wenbo Zhou, Jing Liao, Hanqing Zhao, Weiming Zhang, and Nenghai Yu. 2021. Improved Image Matting via Real-time User Clicks and Uncertainty Estimation. In CVPR.
[52]
Jiawei Wu, Changqing Zhang, Zuoyong Li, Huazhu Fu, Xi Peng, and Joey Tianyi Zhou. 2023. dugMatting: decomposed-uncertainty-guided matting. In ICML.
[53]
Bin Xia, Yulun Zhang, Shiyin Wang, Yitong Wang, Xinglong Wu, Yapeng Tian, Wenming Yang, and Luc Van Gool. 2023. DiffIR: Efficient Diffusion Model for Image Restoration. In ICCV.
[54]
Jiarui Xu, Sifei Liu, Arash Vahdat, Wonmin Byeon, Xiaolong Wang, and Shalini De Mello. 2023b. Open-vocabulary panoptic segmentation with text-to-image diffusion models. In CVPR.
[55]
Ning Xu, Brian Price, Scott Cohen, and Thomas Huang. 2017. Deep image matting. In CVPR.
[56]
Yangyang Xu, Shengfeng He, Wenqi Shao, Kwan-Yee K Wong, Yu Qiao, and Ping Luo. 2023a. DiffusionMat: Alpha Matting as Sequential Refinement Learning. arXiv preprint arXiv:2311.13535 (2023).
[57]
Xin Yang, Ke Xu, Shaozhe Chen, Shengfeng He, Baocai Yin, and Rynson Lau. 2018. Active Matting. In NeurIPS.
[58]
Qihang Yu, Jianming Zhang, He Zhang, Yilin Wang, Zhe Lin, Ning Xu, Yutong Bai, and Alan Yuille. 2021. Mask guided matting via progressive refinement network. In CVPR.
[59]
Zongsheng Yue, Jianyi Wang, and Chen Change Loy. 2023. ResShift: Efficient Diffusion Model for Image Super-resolution by Residual Shifting. In NeurIPS.
[60]
Zhixing Zhang, Ligong Han, Arnab Ghosh, Dimitris Metaxas, and Jian Ren. 2023. SINE: SINgle Image Editing with Text-to-Image Diffusion Models. In CVPR.