“Deshaking endoscopic video for kymography” by Schneider, Hilsmann and Eisert

  • ©David C. Schneider, Anna Hilsmann, and Peter Eisert




    Deshaking endoscopic video for kymography



    The opening and closing of the vocal folds (plica vocalis) at high frequencies is a major source of sound in human speech. Videokymography [Svec and Schutte 1995] is a technique for visualizing the motion of the vocal folds for medical diagnosis: the vibrating folds are filmed with an endoscopic camera pointed into the larynx, recording at a high frame rate to capture the vocal fold vibration (see fig. 1 for example frames). The kymogram used for diagnosis is a time-slice image, i.e. an X-t cut through the X-Y-t image cube of the endoscopic video (fig. 2). The quality and diagnostic interpretability of a kymogram deteriorate significantly if the camera moves relative to the scene, because this motion is superimposed on the vibratory motion of the vocal folds in the kymogram. Therefore, we propose an approach to stabilizing endoscopic video for kymography.
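To make the X-t cut concrete, the following is a minimal sketch, not the authors' implementation. It assumes the video is stored as nested Python lists indexed as video[t][y][x]; the names `kymogram` and `synthetic_video` and the `drift` parameter are hypothetical and chosen only for illustration. Each synthetic pixel stores the scene row it depicts, so the output shows which scene line a fixed image row samples in each frame.

```python
def kymogram(video, row):
    """Return the X-t slice of a video at a fixed image row.

    video: list of frames (index t); each frame is a list of rows (index y);
           each row is a list of pixel intensities (index x).
    Result: one line per frame, so the vertical axis of the result is time.
    """
    return [list(frame[row]) for frame in video]


def synthetic_video(n_frames, height, width, drift=0):
    """Hypothetical test clip: every pixel stores the scene row it shows.

    drift is an assumed vertical camera shift in rows per frame.
    """
    return [
        [[(y + drift * t) % height] * width for y in range(height)]
        for t in range(n_frames)
    ]


# Same image row, once with a steady camera and once with a drifting one.
steady = kymogram(synthetic_video(4, 8, 3), row=2)
shaky = kymogram(synthetic_video(4, 8, 3, drift=1), row=2)

print([line[0] for line in steady])  # [2, 2, 2, 2]
print([line[0] for line in shaky])   # [2, 3, 4, 5]
```

With a steady camera every kymogram line comes from the same scene row. With the assumed drift, successive lines come from different scene rows, which illustrates why camera motion mixes into the recorded vibration and why stabilization has to happen before the slice is taken.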


    1. Hilsmann, A., Schneider, D. C., and Eisert, P. 2010. Realistic cloth augmentation in single view video under occlusions. Comput. Graph. 34 (October), 567–574.
    2. Liu, F., Gleicher, M., Jin, H., and Agarwala, A. 2009. Content-preserving warps for 3D video stabilization. ACM Trans. Graphics 28 (July), 44:1–44:9.
    3. Schneider, D. C., Hilsmann, A., and Eisert, P. 2011. Warp-based Motion Compensation for Endoscopic Kymography. In Eurographics 2011, Llandudno.
    4. Svec, J. G., and Schutte, H. K. 1995. Videokymography: High-speed line scanning of vocal fold vibration. Journal of Voice 10/2, 201–205.
