“Mastering Sketching: Adversarial Augmentation for Structured Prediction” by Simo-Serra, Iizuka and Ishikawa

  • ©Edgar Simo-Serra, Satoshi Iizuka, and Hiroshi Ishikawa

Session/Category Title: Sketching


Abstract:


    We present an integral framework for training sketch simplification networks that convert challenging rough sketches into clean line drawings. Our approach augments a simplification network with a discriminator network, training both networks jointly so that the discriminator network discerns whether a line drawing is real training data or the output of the simplification network, which, in turn, tries to fool it. This approach has two major advantages. First, because the discriminator network learns the structure of line drawings, it encourages the output sketches of the simplification network to be more similar in appearance to the training sketches. Second, we can also train the networks with additional unsupervised data: by adding rough sketches and line drawings that do not correspond to each other, we can improve the quality of the sketch simplification. Thanks to a difference in architecture, our approach has advantages over similar adversarial training approaches in stability of training and in the aforementioned ability to utilize unsupervised training data. We show how our framework can be used to train models that significantly outperform the state of the art in the sketch simplification task, despite using the same architecture for inference. We also present an approach to optimize for a single image, which improves accuracy at the cost of additional computation time. Finally, we show that, using the same framework, it is possible to train the network to perform the inverse problem, i.e., convert simple line sketches into pencil drawings, which is not possible using the standard mean squared error loss. We validate our framework with two user tests, in which our approach is preferred to the state of the art in sketch simplification 88.9% of the time.
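    The joint training described above can be sketched as two coupled losses: the simplification network minimizes a supervised reconstruction term plus an adversarial term that pushes the discriminator to label its output as real, while the discriminator minimizes the usual real-vs-generated cross-entropy. The following minimal sketch uses toy stand-ins `simplify` and `discriminate` (assumptions for illustration, not the paper's convolutional networks):

    ```python
    import numpy as np

    def simplify(rough):
        """Toy stand-in for the simplification network S: binarize a rough sketch."""
        return (rough > 0.5).astype(float)

    def discriminate(drawing):
        """Toy stand-in for the discriminator D: probability the drawing is real.

        Sigmoid of the mean intensity keeps the output strictly in (0, 1).
        """
        return 1.0 / (1.0 + np.exp(-(drawing.mean() - 0.5)))

    def adversarial_losses(rough, clean, alpha=1.0, eps=1e-8):
        """Return (simplifier loss, discriminator loss) for one paired example.

        Unpaired data can feed the two discriminator terms independently:
        unsupervised clean line drawings contribute log D(y), and unsupervised
        rough sketches contribute log(1 - D(S(x))).
        """
        fake = simplify(rough)
        mse = np.mean((fake - clean) ** 2)           # supervised reconstruction term
        g_loss = mse - alpha * np.log(discriminate(fake) + eps)
        d_loss = -(np.log(discriminate(clean) + eps)
                   + np.log(1.0 - discriminate(fake) + eps))
        return g_loss, d_loss
    ```

    In practice both losses are minimized alternately by gradient descent on the two networks; the `alpha` weight balancing the adversarial term against the reconstruction term is a hyperparameter.
    
    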

