“Motion-driven concatenative synthesis of cloth sounds” by An, James and Marschner

  • ©Steven S. An, Doug L. James, and Steve Marschner







    We present a practical data-driven method for automatically synthesizing plausible soundtracks for physics-based cloth animations running at graphics rates. Given a cloth animation, we analyze the deformations and use motion events to drive crumpling and friction sound models estimated from cloth measurements. We synthesize a low-quality sound signal, which is then used as a target signal for a concatenative sound synthesis (CSS) process. From a database of recorded cloth sounds, CSS selects a sequence of microsound units (very short segments) that best match the synthesized target sound in a low-dimensional feature space after applying a hand-tuned warping function. The selected microsound units are concatenated to produce the final cloth sound with minimal filtering. Our approach avoids expensive physics-based synthesis of cloth sound, instead relying on cloth recordings and our motion-driven CSS approach for realism. We demonstrate its effectiveness on a variety of cloth animations involving various materials and character motions, including first-person virtual clothing with binaural sound.
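
    The unit-selection and concatenation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The two-dimensional feature vector (RMS energy and zero-crossing rate), the unit length, the linear crossfade, and all function names are assumptions chosen for illustration. The hand-tuned warping function and any cost on how well neighbouring units join are omitted.

```python
import math

UNIT_LEN = 64   # samples per microsound unit (hypothetical value)
FADE_LEN = 8    # crossfade length in samples (hypothetical value)


def features(unit):
    """Hypothetical 2-D feature vector: (RMS energy, zero-crossing rate)."""
    rms = math.sqrt(sum(s * s for s in unit) / len(unit))
    crossings = sum(1 for a, b in zip(unit, unit[1:]) if (a < 0) != (b < 0))
    return (rms, crossings / (len(unit) - 1))


def split_units(signal, unit_len=UNIT_LEN):
    """Cut a signal into consecutive, non-overlapping units; drop any remainder."""
    return [signal[i:i + unit_len]
            for i in range(0, len(signal) - unit_len + 1, unit_len)]


def select_units(target, database):
    """For each target unit, pick the index of the nearest database unit
    in feature space (plain Euclidean distance, no warping)."""
    db_feats = [features(u) for u in database]
    chosen = []
    for t in split_units(target):
        ft = features(t)
        best = min(range(len(database)),
                   key=lambda i: math.dist(ft, db_feats[i]))
        chosen.append(best)
    return chosen


def concatenate(units, fade_len=FADE_LEN):
    """Join units, blending each boundary with a short linear crossfade."""
    out = list(units[0])
    for unit in units[1:]:
        for k in range(fade_len):
            w = (k + 1) / (fade_len + 1)          # weight of the incoming unit
            idx = len(out) - fade_len + k
            out[idx] = (1 - w) * out[idx] + w * unit[k]
        out.extend(unit[fade_len:])
    return out
```

    Here `target` stands for the low-quality synthesized signal and `database` for a list of recorded units, each `UNIT_LEN` samples long. Calling `concatenate([database[i] for i in select_units(target, database)])` yields the output signal. A nearest-neighbour search is the simplest possible selection rule; a practical system would likely also weigh continuity between consecutive units.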


