Applications
Our objective is to develop algorithms, methodologies, and software that allow parallel scientific and engineering applications to produce visual output at run time.
Our approach is to exploit the available processing power to perform graphics and visualization operations in situ. To accomplish this, we are developing efficient, scalable parallel rendering algorithms and encapsulating them within a 3-D graphics library (called PGL) that can be invoked from parallel application programs. Additional components provide for image transport and display and allow construction of distributed user interfaces.
We are assessing the performance and functionality of our designs across a variety of aerospace applications and computing platforms and incorporating refinements in response to evolving application requirements and advances in computer architecture. The resulting software relies on accepted standards (ANSI C, MPI, Unix system services) for portability across a wide range of parallel computing systems. The results of our research are being distributed to the HPCC community in source code form via the World Wide Web.
The initial release of the PGL rendering system ran on the Intel Paragon and IBM SP2. During the past year, PGL has been ported to several new platforms, including the HP Exemplar, Silicon Graphics' CRAY Origin2000 and CRAY T3E, and networks of Sun and SGI workstations.
We have also devised an improved end-of-frame termination detection algorithm that substantially improves performance at large processor counts. Benchmark results on the Intel Paragon at Caltech show gains of 18 percent with 128 processors and around 80 percent with 256 and 512 processors.
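The termination detection algorithm itself is not described here. As background only, the sketch below shows one generic way a processor can decide that a frame is complete: every processor counts the messages it sends to each destination, the counts are combined so that each processor learns how many messages to expect, and a processor finishes the frame once its receive count matches. This is a single-process Python simulation under those assumptions, not PGL code and not the improved algorithm; `simulate_frame` and every other name in it are hypothetical.

```python
import random
from collections import deque


def simulate_frame(num_ranks, seed=0):
    """Simulate one frame of counting-based termination detection.

    Returns (expected, received, leftover), each a list indexed by rank.
    All names here are illustrative and do not come from PGL.
    """
    rng = random.Random(seed)
    inboxes = [deque() for _ in range(num_ranks)]
    # sent[src][dst] = number of messages src sent to dst this frame
    sent = [[0] * num_ranks for _ in range(num_ranks)]

    # Phase 1: each rank emits a random number of fragment messages,
    # each addressed to a randomly chosen destination rank.
    for src in range(num_ranks):
        for _ in range(rng.randint(0, 20)):
            dst = rng.randrange(num_ranks)
            inboxes[dst].append((src, "fragment"))
            sent[src][dst] += 1

    # Phase 2: combine the per-destination counts (column sums).
    # In an MPI program this step would typically be a collective
    # such as MPI_Reduce_scatter over each rank's send counts.
    expected = [
        sum(sent[src][dst] for src in range(num_ranks))
        for dst in range(num_ranks)
    ]

    # Phase 3: each rank consumes messages until it has received
    # exactly the expected number, then declares the frame finished.
    received = [0] * num_ranks
    for rank in range(num_ranks):
        while received[rank] < expected[rank]:
            inboxes[rank].popleft()
            received[rank] += 1

    leftover = [len(box) for box in inboxes]
    return expected, received, leftover


if __name__ == "__main__":
    expected, received, leftover = simulate_frame(8, seed=1)
    print("expected:", expected)
    print("received:", received)
    print("leftover:", leftover)
```

In this scheme the cost of the count-combining step grows with the number of processors, which is one reason the choice of termination detection method can matter at large processor counts.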
Our experience with PGL now spans four generations of parallel computing hardware (Figure 1). For this application, which is asynchronous and communication intensive, distributed-memory systems give good performance with up to 128 processors, while the newer CC-NUMA architectures show very poor scalability (Figure 2). We are currently trying to develop a better understanding of these results to determine whether the problems are inherent in the architecture or merely an artifact of poor MPI implementations.

Figure 1
By adding support for several new architectures, parallel visualization and graphics capabilities are becoming available to a broader user community. Algorithmic improvements are also extending the range of scalability, providing more efficient support for the largest parallel applications.

Figure 2
We plan to release a new version of PGL, incorporating the results described here, to the user community by the end of calendar year 1998. We also intend to explore explicit shared-memory formulations of our communication algorithms in an effort to boost performance on large CC-NUMA architectures. Long-term goals include additional functionality, improved user interfaces, higher-level visualization components, and the development of algorithms that scale to hundreds or thousands of processors.
Tom Crockett
Institute for Computer Applications in Science and Engineering (ICASE)
NASA Langley Research Center
tom@icase.edu
757-864-2182
http://www.icase.edu/~tom/