“Decoupled Sampling for Graphics Pipelines” by Ragan-Kelley, Lehtinen, Chen, Doggett and Durand

  • ©Jonathan Ragan-Kelley, Jaakko Lehtinen, Jiawen Chen, Michael Doggett, and Frédo Durand

Title:

    Decoupled Sampling for Graphics Pipelines

Abstract:


    We propose a generalized approach to decoupling shading from visibility sampling in graphics pipelines, which we call decoupled sampling. Decoupled sampling enables stochastic supersampling of motion and defocus blur at reduced shading cost, as well as controllable or adaptive shading rates which trade off shading quality for performance. It can be thought of as a generalization of multisample antialiasing (MSAA) to support complex and dynamic mappings from visibility to shading samples, as introduced by motion and defocus blur and adaptive shading. It works by defining a many-to-one hash from visibility to shading samples, and using a buffer to memoize shading samples and exploit reuse across visibility samples. Decoupled sampling is inspired by the Reyes rendering architecture, but like traditional graphics pipelines, it shades fragments rather than micropolygon vertices, decoupling shading from the geometry sampling rate. Also unlike Reyes, decoupled sampling only shades fragments after precise computation of visibility, reducing overshading.
    We present extensions of two modern graphics pipelines to support decoupled sampling: a GPU-style sort-last fragment architecture, and a Larrabee-style sort-middle pipeline. We study the architectural implications of decoupled sampling and blur, and derive end-to-end performance estimates on real applications through an instrumented functional simulator. We demonstrate high-quality motion and defocus blur, as well as variable and adaptive shading rates.
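    The memoization idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The names `shading_key`, `ShadingMemo`, and `toy_shader` are hypothetical, the many-to-one map is assumed to be a simple quantization of a shading-space position to a grid cell, and the memo buffer is assumed to be an unbounded dictionary rather than a fixed-size hardware buffer.

```python
import math
from typing import Callable, Dict, List, Tuple

# Hypothetical key: (primitive id, shading-grid x, shading-grid y).
ShadingKey = Tuple[int, int, int]
Color = Tuple[float, float, float]


def shading_key(prim_id: int, sx: float, sy: float, rate: float = 1.0) -> ShadingKey:
    """Many-to-one map (an assumption): quantize a shading-space position to a cell.

    `rate` scales the shading grid; values below 1.0 give coarser shading.
    """
    return (prim_id, math.floor(sx * rate), math.floor(sy * rate))


class ShadingMemo:
    """Memoizes shading results so each shading sample is computed once."""

    def __init__(self, shader: Callable[[ShadingKey], Color]) -> None:
        self.shader = shader
        self.cache: Dict[ShadingKey, Color] = {}
        self.shader_calls = 0

    def lookup(self, key: ShadingKey) -> Color:
        if key not in self.cache:          # miss: shade and store
            self.cache[key] = self.shader(key)
            self.shader_calls += 1
        return self.cache[key]             # hit: reuse the stored result


def toy_shader(key: ShadingKey) -> Color:
    """Stand-in shader; a real one would evaluate materials and lighting."""
    prim_id, gx, gy = key
    return (float(gx % 2), float(gy % 2), float(prim_id % 2))


# 32 visible samples of primitive 0 whose shading-space positions
# fall inside a 2x2 block of shading cells at rate 1.0.
samples: List[Tuple[int, float, float]] = [
    (0, 0.25 * i, 0.5 * j) for i in range(8) for j in range(4)
]

memo = ShadingMemo(toy_shader)
colors = [memo.lookup(shading_key(p, sx, sy)) for p, sx, sy in samples]

# Same samples at half the shading rate: every sample maps to one cell.
coarse = ShadingMemo(toy_shader)
coarse_colors = [coarse.lookup(shading_key(p, sx, sy, rate=0.5)) for p, sx, sy in samples]

print(len(colors), memo.shader_calls, coarse.shader_calls)
```

    In this sketch, the 32 visibility samples trigger 4 shader invocations at the full rate and 1 at the halved rate, which is the kind of reuse and shading-rate control the abstract describes. Only samples already known to be visible are passed to `lookup`, mirroring the statement that shading happens after visibility is resolved.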

References:


    1. Akeley, K. 1993. RealityEngine graphics. In Proceedings of SIGGRAPH International Conference on Computer Graphics and Interactive Techniques. 109–116.
    2. Akenine-Möller, T., Munkberg, J., and Hasselgren, J. 2007. Stochastic rasterization using time-continuous triangles. In Proceedings of the Graphics Hardware Conference. 7–16.
    3. Boulos, S., Luong, E., Fatahalian, K., Moreton, H., and Hanrahan, P. 2010. Space-time hierarchical occlusion culling for micropolygon rendering with motion blur. In Proceedings of the High Performance Graphics Conference. 11–18.
    4. Brunhaver, J., Fatahalian, K., and Hanrahan, P. 2010. Hardware implementation of micropolygon rasterization with motion and defocus blur. In Proceedings of the High Performance Graphics Conference. 1–9.
    5. Burns, C. A., Fatahalian, K., and Mark, W. R. 2010. A lazy object-space shading architecture with decoupled sampling. In Proceedings of the High Performance Graphics Conference. 19–28.
    6. Cook, R. L. 1986. Stochastic sampling in computer graphics. ACM Trans. Graph. 5, 1, 51–72.
    7. Cook, R. L., Carpenter, L., and Catmull, E. 1987. The Reyes image rendering architecture. In Proceedings of SIGGRAPH 87 International Conference on Computer Graphics and Interactive Techniques. 95–102.
    8. Cook, R. L., Porter, T., and Carpenter, L. 1984. Distributed ray tracing. In Proceedings of SIGGRAPH 84 International Conference on Computer Graphics and Interactive Techniques. 137–145.
    9. Eldridge, M. 2001. Designing graphics architectures around scalability and communication. Ph.D. thesis, Stanford University.
    10. Fatahalian, K., Boulos, S., Hegarty, J., Akeley, K., Mark, W. R., Moreton, H., and Hanrahan, P. 2010. Reducing shading on GPUs using quad-fragment merging. ACM Trans. Graph. 29, 4, 67:1–67:8.
    11. Fatahalian, K., Luong, E., Boulos, S., Akeley, K., Mark, W. R., and Hanrahan, P. 2009. Data-parallel rasterization of micropolygons with defocus and motion blur. In Proceedings of the High Performance Graphics Conference. 59–68.
    12. Fisher, M., Fatahalian, K., Boulos, S., Akeley, K., Mark, W. R., and Hanrahan, P. 2009. DiagSplit: Parallel, crack-free, adaptive tessellation for micropolygon rendering. ACM Trans. Graph. 28, 5, 150:1–150:10.
    13. Haeberli, P. E. and Akeley, K. 1990. The accumulation buffer: Hardware support for high-quality rendering. In Proceedings of SIGGRAPH International Conference on Computer Graphics and Interactive Techniques. 309–318.
    14. Hammon, E. 2007. Practical post-process depth of field. In GPU Gems 3, H. Nguyen Ed., Addison Wesley, Chapter 28, 583–605.
    15. Hasselgren, J. and Akenine-Möller, T. 2006. An efficient multi-view rasterization architecture. In Proceedings of the 17th Eurographics Workshop on Rendering. 61–72.
    16. Jones, T., Perry, R., and Callahan, M. 2000. Shadermaps: a method for accelerating procedural shading. Tech. rep. 2000-25, Mitsubishi Electric Research Labs.
    17. Lee, S., Eisemann, E., and Seidel, H.-P. 2009. Depth-of-field rendering with multiview synthesis. ACM Trans. Graph. 28, 5, 134:1–134:6.
    18. Max, N. L. and Lerner, D. M. 1985. A two-and-a-half-D motion-blur algorithm. In Proceedings of SIGGRAPH 85 International Conference on Computer Graphics and Interactive Techniques. 85–93.
    19. Molnar, S., Cox, M., Ellsworth, D., and Fuchs, H. 1994. A sorting classification of parallel rendering. IEEE Comput. Graph. Appl. 14, 4, 23–32.
    20. Nehab, D., Sander, P. V., Lawrence, J., Tatarchuk, N., and Isidoro, J. R. 2007. Accelerating real-time shading with reverse reprojection caching. In Proceedings of the Graphics Hardware Conference. 25–35.
    21. Olano, M. and Greer, T. 1997. Triangle scan conversion using 2D homogeneous coordinates. In Proceedings of the SIGGRAPH/Eurographics Workshop on Graphics Hardware. 89–96.
    22. Owens, J. D., Khailany, B., Towles, B., and Dally, W. J. 2002. Comparing Reyes and OpenGL on a stream architecture. In Proceedings of the Graphics Hardware Conference. 47–56.
    23. Patney, A. and Owens, J. D. 2008. Real-time Reyes-style adaptive surface subdivision. ACM Trans. Graph. 27, 5, 143:1–143:8.
    24. Ragan-Kelley, J., Kilpatrick, C., Smith, B. W., Epps, D., Green, P., Hery, C., and Durand, F. 2007. The Lightspeed automatic interactive lighting preview system. ACM Trans. Graph. 26, 3, 25:1–25:11.
    25. Rosado, G. 2007. GPU Gems 3. Addison Wesley, 575–581.
    26. Seiler, L., Carmean, D., Sprangle, E., Forsyth, T., Abrash, M., Dubey, P., Junkins, S., Lake, A., Sugerman, J., Cavin, R., Espasa, R., Grochowski, E., Juan, T., and Hanrahan, P. 2008. Larrabee: A many-core x86 architecture for visual computing. ACM Trans. Graph. 27, 3.
    27. Sitthi-Amorn, P., Lawrence, J., Yang, L., Sander, P. V., and Nehab, D. 2008. An improved shading cache for modern GPUs. In Proceedings of the Graphics Hardware Conference. 95–101.
    28. Sloan, P.-P., Luna, B., and Snyder, J. 2005. Local, deformable precomputed radiance transfer. ACM Trans. Graph. 24, 3, 1216–1224.
    29. Stoll, G., Mark, W. R., Djeu, P., Wang, R., and Elhassan, I. 2006. Razor: An architecture for dynamic multiresolution ray tracing. Tech. rep. 06-21, University of Texas at Austin.
    30. Torborg, J. and Kajiya, J. 1996. Talisman: Commodity real-time 3D graphics for the PC. In Proceedings of SIGGRAPH International Conference on Computer Graphics and Interactive Techniques. 353–364.
    31. Yang, L., Sander, P. V., and Lawrence, J. 2008. Geometry-aware framebuffer level of detail. Comput. Graph. Forum 27, 4, 1183–1188.
    32. Zhou, K., Hou, Q., Ren, Z., Gong, M., Sun, X., and Guo, B. 2009. RenderAnts: Interactive Reyes rendering on GPUs. ACM Trans. Graph. 28, 5.
