“iOrthoPredictor: model-guided deep prediction of teeth alignment” by Yang, Shi, Wu, Li, Zhou, et al. …
Title:
- iOrthoPredictor: model-guided deep prediction of teeth alignment
Session/Category Title: Generation and Inference from Images
Abstract:
In this paper, we present iOrthoPredictor, a novel system to visually predict teeth alignment in photographs. Our system takes a frontal face image of a patient with visible malpositioned teeth along with a corresponding 3D teeth model as input, and generates a facial image with aligned teeth, simulating a real orthodontic treatment effect. The key enabler of our method is an effective disentanglement of an explicit representation of the teeth geometry from the in-mouth appearance, where the accuracy of teeth geometry transformation is ensured by the 3D teeth model while the in-mouth appearance is modeled as a latent variable. The disentanglement enables us to achieve fine-scale geometry control over the alignment while retaining the original teeth appearance attributes and lighting conditions. The whole pipeline consists of three deep neural networks: a U-Net architecture to explicitly extract the 2D teeth silhouette maps representing the teeth geometry in the input photo, a novel multilayer perceptron (MLP) based network to predict the aligned 3D teeth model, and an encoder-decoder based generative model to synthesize the in-mouth appearance conditional on the original teeth appearance and the aligned teeth geometry. Extensive experimental results and a user study demonstrate that iOrthoPredictor is effective in qualitatively predicting teeth alignment, and applicable to the orthodontic industry.
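The data flow described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Networks` container, the function `predict_aligned_photo`, and every field name are hypothetical, and the `project` step (turning the aligned 3D teeth model into image-space geometry) is an assumption, since the abstract does not say how that is done. The toy callables only build strings so that the order of the stages is visible.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class Networks:
    """Hypothetical stand-ins for the three networks named in the abstract."""
    segment: Callable[[Any], Any]       # U-Net role: photo -> 2D teeth silhouette maps
    align: Callable[[Any], Any]         # MLP role: 3D teeth model -> aligned 3D teeth model
    encode: Callable[[Any], Any]        # encoder role: photo -> latent in-mouth appearance
    decode: Callable[[Any, Any], Any]   # decoder role: (appearance, geometry) -> image
    project: Callable[[Any, Any], Any]  # ASSUMED helper: aligned model -> image-space geometry


def predict_aligned_photo(photo: Any, teeth_model: Any, nets: Networks) -> Any:
    """Run the geometry branch and the appearance branch, then combine them."""
    # Geometry branch: explicit representation of the teeth shape.
    silhouettes = nets.segment(photo)
    aligned_model = nets.align(teeth_model)
    target_geometry = nets.project(aligned_model, silhouettes)

    # Appearance branch: latent code that is kept independent of the geometry.
    appearance = nets.encode(photo)

    # Synthesis: conditioned on the original appearance and the aligned geometry.
    return nets.decode(appearance, target_geometry)


# Toy callables (placeholders, not real models) that record what each stage saw.
toy = Networks(
    segment=lambda photo: f"sil({photo})",
    align=lambda model: f"aligned({model})",
    encode=lambda photo: f"z({photo})",
    decode=lambda z, geo: f"image[{z} | {geo}]",
    project=lambda model, sil: f"proj({model}, {sil})",
)

result = predict_aligned_photo("photo", "teeth3d", toy)
print(result)  # image[z(photo) | proj(aligned(teeth3d), sil(photo))]
```

Keeping the appearance code and the geometry as separate arguments to the decoder mirrors the disentanglement the abstract emphasizes: the geometry input can be changed (aligned) while the appearance input stays tied to the original photo.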


