“Anywhere pixel compositor”






    Despite rapid recent advances in hardware, the demands of high-end graphics applications (including video games) consistently outpace what a single GPU can deliver. Graphics hardware vendors now offer dual- and even quad-GPU configurations (such as NVIDIA’s SLI and ATI’s CrossFire technologies). As we migrate from a single GPU to multiple GPUs, and eventually to GPU clusters, assembling the final image efficiently from these distributed rendering nodes becomes an important problem. Here we propose to develop a flexible pixel compositor to solve this problem. Our compositor can map pixels arbitrarily from any input frame to any output frame while performing typical composition operations at the same time. Figure 1(a) shows a schematic of our design. Pixels are transmitted digitally. The core mapping and arithmetic operations are carried out by a programmable FPGA chip, which is connected to a large memory bank that stores both the mapping information and, if necessary, temporary frames. A single compositor unit has four input links and four outputs. Multiple units can be arranged in a network, as shown in Figure 1(b), to scale to large clusters.
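
    The idea of combining an arbitrary pixel mapping with a composition operation can be illustrated with a small software model. This is only a sketch and not the FPGA design: the pixel layout, the mapping-table format, and the names `composite` and `z_nearest` are assumptions made for illustration, and depth comparison stands in as one example of a typical composition operation in sort-last rendering.

```python
from typing import Callable, List, Tuple

# Assumed pixel layout for this sketch: (r, g, b, depth).
Pixel = Tuple[int, int, int, float]
# A frame is indexed as frame[y][x].
Frame = List[List[Pixel]]
# Hypothetical mapping-table entry: output (out_x, out_y) receives
# pixel (src_x, src_y) of input frame number `src`.
MapEntry = Tuple[int, int, int, int, int]

BACKGROUND: Pixel = (0, 0, 0, float("inf"))


def z_nearest(incoming: Pixel, current: Pixel) -> Pixel:
    """Depth compositing: keep whichever pixel is closer to the viewer."""
    return incoming if incoming[3] <= current[3] else current


def composite(
    inputs: List[Frame],
    mapping: List[MapEntry],
    width: int,
    height: int,
    op: Callable[[Pixel, Pixel], Pixel] = z_nearest,
) -> Frame:
    """Route input pixels to output positions and merge collisions with `op`."""
    out: Frame = [[BACKGROUND for _ in range(width)] for _ in range(height)]
    for out_x, out_y, src, src_x, src_y in mapping:
        incoming = inputs[src][src_y][src_x]
        out[out_y][out_x] = op(incoming, out[out_y][out_x])
    return out


# Two 2x1 input frames, as if produced by two rendering nodes.
frame_a: Frame = [[(255, 0, 0, 0.5), (10, 10, 10, 0.9)]]
frame_b: Frame = [[(0, 0, 255, 0.2), (20, 20, 20, 0.1)]]

# Output pixel (0, 0) receives a pixel from both inputs and is depth-merged;
# output pixel (1, 0) receives only frame_a's second pixel.
table: List[MapEntry] = [
    (0, 0, 0, 0, 0),
    (0, 0, 1, 0, 0),
    (1, 0, 0, 1, 0),
]
result = composite([frame_a, frame_b], table, width=2, height=1)
```

    Because the mapping table is plain data, the same loop covers tiling, overlap, and pixel reshuffling without code changes; swapping `op` for another function (for example, an alpha blend) changes the composition operation independently of the routing.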


    1. Cavin, X., Mion, C., and Filbois, A. 2005. COTS cluster-based sort-last rendering: Performance evaluation and pipelined implementation. In Proceedings of IEEE Visualization, 15–23.
    2. Stoll, G., Eldridge, M., Patterson, D., Webb, A., Berman, S., Levy, R., Caywood, C., Taveira, M., Hunt, S., and Hanrahan, P. 2001. Lightning-2: A high-performance display subsystem for PC clusters. In Proceedings of SIGGRAPH 2001, 141–148.
