“Globally optimal toon tracking”

  • ©Haichao Zhu, Xueting Liu, Tien-Tsin Wong, and Pheng-Ann Heng









    Identifying object or region correspondences between consecutive frames of a hand-drawn animation sequence is indispensable for automating animation modification tasks such as sequence-wide recoloring or shape editing of a specific animated character. Existing correspondence identification methods rely heavily on appearance features, but these features alone cannot reliably identify region correspondences under occlusion or when two or more objects look alike, so manual assistance is often required. In this paper, we propose a new correspondence identification method that considers both the appearance features and the motions of regions in a global manner. We formulate the correspondence likelihoods between temporal region pairs as a network flow graph problem, which can be solved by a well-established optimization algorithm. We evaluated our method on various animation sequences, and the results show that it consistently outperforms state-of-the-art methods without any user guidance.
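To make the network flow idea concrete, here is a minimal sketch that matches regions between two frames with a min-cost flow. It is an illustration only and not the authors' formulation: the paper reasons over the whole sequence, whereas this toy handles a single frame pair. The region representation (a gray level plus a centroid), the cost function, `motion_weight`, and the names `min_cost_flow` and `match_regions` are all assumptions made for this example.

```python
import math

def min_cost_flow(num_nodes, edges, source, sink):
    """Successive shortest paths; edges are (u, v, capacity, cost) tuples."""
    # Residual graph: each entry is [to, residual_capacity, cost, reverse_index].
    graph = [[] for _ in range(num_nodes)]
    for u, v, cap, cost in edges:
        graph[u].append([v, cap, cost, len(graph[v])])
        graph[v].append([u, 0, -cost, len(graph[u]) - 1])
    while True:
        # Bellman-Ford copes with the negative costs on residual (reverse) edges.
        dist = [math.inf] * num_nodes
        prev = [None] * num_nodes
        dist[source] = 0.0
        for _ in range(num_nodes - 1):
            changed = False
            for u in range(num_nodes):
                if dist[u] == math.inf:
                    continue
                for i, (v, cap, cost, _rev) in enumerate(graph[u]):
                    if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                        dist[v], prev[v], changed = dist[u] + cost, (u, i), True
            if not changed:
                break
        if prev[sink] is None:
            return graph  # no augmenting path left, so the flow is maximal
        # Push one unit of flow along the cheapest source-to-sink path.
        v = sink
        while v != source:
            u, i = prev[v]
            graph[u][i][1] -= 1
            graph[v][graph[u][i][3]][1] += 1
            v = u

def match_regions(frame_a, frame_b, motion_weight=0.05):
    """Match regions of two frames; each region is (gray_level, (x, y))."""
    n_a, n_b = len(frame_a), len(frame_b)
    source, sink = 0, 1 + n_a + n_b
    edges = []
    for i, (color_a, pos_a) in enumerate(frame_a):
        edges.append((source, 1 + i, 1, 0.0))
        for j, (color_b, pos_b) in enumerate(frame_b):
            # Assumed cost: appearance difference plus weighted displacement.
            appearance = abs(color_a - color_b)
            motion = math.hypot(pos_a[0] - pos_b[0], pos_a[1] - pos_b[1])
            edges.append((1 + i, 1 + n_a + j, 1, appearance + motion_weight * motion))
    for j in range(n_b):
        edges.append((1 + n_a + j, sink, 1, 0.0))
    graph = min_cost_flow(sink + 1, edges, source, sink)
    matches = {}
    for i in range(n_a):
        for v, cap, _cost, _rev in graph[1 + i]:
            # A saturated edge into a frame-B node carries one unit of flow.
            if n_a < v < sink and cap == 0:
                matches[i] = v - 1 - n_a
    return matches

# Toy data: two regions share the same gray level, so appearance alone is ambiguous.
frame_a = [(0.2, (10, 10)), (0.2, (50, 10)), (0.9, (30, 40))]
frame_b = [(0.2, (52, 11)), (0.9, (31, 41)), (0.2, (12, 10))]
matches = match_regions(frame_a, frame_b)
print(matches)
```

In this toy input the two regions with gray level 0.2 are indistinguishable by appearance, and the displacement term is what separates them. Unit capacities on the source and sink edges keep the matching one-to-one.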


