“Mastering Sketching: Adversarial Augmentation for Structured Prediction” by Simo-Serra, Iizuka and Ishikawa

  • ©Edgar Simo-Serra, Satoshi Iizuka, and Hiroshi Ishikawa

Session/Category Title: Sketching


Abstract:


    We present an integral framework for training sketch simplification networks that convert challenging rough sketches into clean line drawings. Our approach augments a simplification network with a discriminator network, training both networks jointly so that the discriminator network discerns whether a line drawing is real training data or the output of the simplification network, which, in turn, tries to fool it. This approach has two major advantages. First, because the discriminator network learns the structure of line drawings, it encourages the output sketches of the simplification network to be more similar in appearance to the training sketches. Second, we can also train the networks with additional unsupervised data: by adding rough sketches and line drawings that do not correspond to each other, we can improve the quality of the sketch simplification. Thanks to a difference in architecture, our approach has advantages over similar adversarial training approaches in stability of training and in the aforementioned ability to utilize unsupervised training data. We show how our framework can be used to train models that significantly outperform the state of the art in the sketch simplification task, despite using the same architecture for inference. We also present an approach to optimize for a single image, which improves accuracy at the cost of additional computation time. Finally, we show that, using the same framework, it is possible to train the network to perform the inverse problem, i.e., convert simple line sketches into pencil drawings, which is not possible using the standard mean squared error loss. We validate our framework with two user tests, in which our approach is preferred to the state of the art in sketch simplification 88.9% of the time.
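    The joint training described above can be sketched as two coupled losses: the simplification network minimizes a supervised reconstruction term plus an adversarial term that pushes the discriminator to label its output as real, while the discriminator minimizes the usual real-vs-generated cross-entropy. The following minimal sketch uses toy stand-ins `simplify` and `discriminate` (assumptions for illustration, not the paper's convolutional networks):

    ```python
    import numpy as np

    def simplify(rough):
        """Toy stand-in for the simplification network S: binarize a rough sketch."""
        return (rough > 0.5).astype(float)

    def discriminate(drawing):
        """Toy stand-in for the discriminator D: probability the drawing is real.

        Sigmoid of the mean intensity keeps the output strictly in (0, 1).
        """
        return 1.0 / (1.0 + np.exp(-(drawing.mean() - 0.5)))

    def adversarial_losses(rough, clean, alpha=1.0, eps=1e-8):
        """Return (simplifier loss, discriminator loss) for one paired example.

        Unpaired data can feed the two discriminator terms independently:
        unsupervised clean line drawings contribute log D(y), and unsupervised
        rough sketches contribute log(1 - D(S(x))).
        """
        fake = simplify(rough)
        mse = np.mean((fake - clean) ** 2)           # supervised reconstruction term
        g_loss = mse - alpha * np.log(discriminate(fake) + eps)
        d_loss = -(np.log(discriminate(clean) + eps)
                   + np.log(1.0 - discriminate(fake) + eps))
        return g_loss, d_loss
    ```

    In practice both losses are minimized alternately by gradient descent on the two networks; the `alpha` weight balancing the adversarial term against the reconstruction term is a hyperparameter.
    
    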

