“ATI Stream Profiler: a tool to optimize an OpenCL kernel on ATI Radeon GPUs” by Purnomo, Rubin and Houston

  • ©Budirijanto Purnomo, Norman Rubin, and Michael (Mike) Houston

Conference:


Type:


Entry Number: 54

Title:

    ATI Stream Profiler: a tool to optimize an OpenCL kernel on ATI Radeon GPUs

Presenter(s)/Author(s):

    Budirijanto Purnomo, Norman Rubin, and Michael (Mike) Houston

Abstract:


    Modern GPUs have been shown to be highly efficient machines for data-parallel applications such as graphics, image and video processing, and physical simulation. For example, a single ATI Radeon™ HD 5870 GPU has a theoretical peak of 2.72 teraflops (one teraflop is 10^12 floating-point operations per second) and a video memory bandwidth of 153.6 GB/s. While it is not difficult to port CPU algorithms to run on GPUs, it is extremely challenging to optimize those algorithms to achieve teraflops performance. Only a select few expert engineers who combine application-domain expertise, a deep understanding of modern GPU architecture, and an intimate knowledge of shader compiler optimization can program GPUs close to their optimal capabilities. Many developers are content with a severalfold improvement over their optimized CPU implementations rather than an acceleration of one or more orders of magnitude.


ACM Digital Library Publication:



Overview Page: