“RPU: a programmable ray processing unit for realtime ray tracing” by Woop, Schmittler and Slusallek

  • ©Sven Woop, Jörg Schmittler, and Philipp Slusallek







    Recursive ray tracing is a simple yet powerful and general approach for accurately computing global light transport and rendering high-quality images. While recent algorithmic improvements and optimized parallel software implementations have increased ray tracing performance to realtime levels, no compact and programmable hardware solution has been available so far. This paper describes the architecture and a prototype implementation of a single-chip, fully programmable Ray Processing Unit (RPU). It combines the flexibility of general-purpose CPUs with the efficiency of current GPUs for data-parallel computations. This design allows for realtime ray tracing of dynamic scenes with programmable material, geometry, and illumination shaders. Although running at only 66 MHz, the prototype FPGA implementation already renders images at up to 20 frames per second, which in many cases beats the performance of highly optimized software running on multi-GHz desktop CPUs. The performance and efficiency of the proposed architecture are analyzed using a variety of benchmark scenes.


