“TryOnGAN: Body-Aware Try-On via Layered Interpolation” by Lewis, Varadharajan, and Kemelmacher-Shlizerman
Abstract:
Given a pair of images, a target person and a garment worn by another person, we automatically generate an image of the target person wearing the given garment. Previous methods mostly focused on texture transfer trained on paired data, overlooking body-shape deformation, skin color, and the seamless blending of garment and person. This work addresses those three components and does not require paired training data. We design a pose-conditioned StyleGAN2 architecture with a clothing-segmentation branch, trained on images of people wearing garments. Once trained, we propose a new layered latent-space interpolation method that preserves and synthesizes skin color and target body shape while transferring the garment from a different person. We demonstrate results on high-resolution 512 × 512 images and compare extensively to the state of the art in try-on, on both latent-space-generated and real images.
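The layered interpolation the abstract describes can be illustrated with a minimal sketch: in a StyleGAN2-style per-layer latent space, garment-controlling layers are taken (or interpolated) from the garment image's latent while the remaining layers keep the target person's latent. The latent shape `(num_layers, 512)`, the specific layer indices, and the single coefficient `alpha` are illustrative assumptions, not the paper's actual configuration.

```python
import numpy as np

def layered_interpolate(w_person, w_garment, garment_layers, alpha=1.0):
    """Blend two per-layer latents (shape: num_layers x latent_dim).

    Layers listed in garment_layers are linearly interpolated toward the
    garment latent (alpha=1.0 copies them outright); all other layers keep
    the person latent, preserving identity, pose, and body shape.
    Hypothetical sketch of layered latent-space interpolation.
    """
    w_mix = w_person.copy()
    w_mix[garment_layers] = ((1.0 - alpha) * w_person[garment_layers]
                             + alpha * w_garment[garment_layers])
    return w_mix

# Illustrative usage with an assumed 16-layer, 512-dim latent:
w_p = np.zeros((16, 512))   # stands in for the target person's latent
w_g = np.ones((16, 512))    # stands in for the garment image's latent
w_mix = layered_interpolate(w_p, w_g, garment_layers=[4, 5, 6])
```

The mixed latent would then be fed back through the generator; in the paper the choice of which layers carry the garment is what makes the blending "layered" rather than a single global interpolation.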