“FontCode: Embedding Information in Text Documents Using Glyph Perturbation” by Xiao, Zhang and Zheng

  • ©Chang Xiao, Cheng Zhang, and Changxi Zheng



Session Title:

    Decision & Style


    FontCode: Embedding Information in Text Documents Using Glyph Perturbation




    We introduce FontCode, an information embedding technique for text documents. Given a text document set in specific fonts, our method embeds user-specified information in the text by perturbing the glyphs of its characters while preserving the text content. We devise an algorithm to choose unobtrusive yet machine-recognizable glyph perturbations, leveraging a recently developed generative model that alters the glyph of each character continuously on a font manifold. We then introduce an algorithm that embeds a user-provided message in the text document and produces an encoded document whose appearance is minimally perturbed from the original. We also present a glyph recognition method that recovers the embedded information from an encoded document stored as a vector graphic or pixel image, or even printed on paper. In addition, we introduce a new error-correction coding scheme that rectifies a certain number of recognition errors. Lastly, we demonstrate that our technique enables a wide array of applications: a text document metadata holder, an unobtrusive optical barcode, a cryptographic message embedding scheme, and a text document signature.
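The embedding idea in the abstract can be illustrated with a toy sketch. It is not the authors' algorithm: it assumes each character has a small codebook of distinguishable perturbed glyphs, and it encodes an integer message by choosing one glyph index per character (a mixed-radix representation). The codebook sizes and the function names `capacities`, `embed`, and `extract` are hypothetical, and glyph recognition is replaced by reading the chosen indices back directly.

```python
from typing import Dict, List

# Hypothetical codebook sizes: how many distinguishable perturbed glyphs
# each character offers. These numbers are made up for illustration.
CODEBOOK_SIZE: Dict[str, int] = {"a": 4, "e": 3, "g": 5, "o": 2, "t": 3}


def capacities(text: str) -> List[int]:
    """Codebook size of every character in text that can carry information."""
    return [CODEBOOK_SIZE[c] for c in text if c in CODEBOOK_SIZE]


def embed(text: str, message: int) -> List[int]:
    """Write message as mixed-radix digits, one digit per usable character.

    Digit i is the index of the glyph variant chosen for the i-th usable
    character; in this toy model, index 0 stands for the unperturbed glyph.
    """
    radices = capacities(text)
    total = 1
    for r in radices:
        total *= r
    if not 0 <= message < total:
        raise ValueError("message does not fit in this text")
    digits = []
    for r in radices:
        digits.append(message % r)  # least significant digit first
        message //= r
    return digits


def extract(text: str, glyph_indices: List[int]) -> int:
    """Recover the integer from the glyph indices (assumed error-free here)."""
    radices = capacities(text)
    message = 0
    # Horner evaluation from the most significant digit down.
    for r, d in zip(reversed(radices), reversed(glyph_indices)):
        message = message * r + d
    return message


if __name__ == "__main__":
    text = "get a toe"
    indices = embed(text, 1234)
    print(indices, extract(text, indices))
```

In this sketch the capacity of a text is the product of its characters' codebook sizes, so longer texts carry larger messages. Recognition errors are not handled; the abstract's error-correction coding scheme addresses that case.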

