OBJECTIVE: Design and implement a climate data assimilation package, the Physical-space Statistical Analysis System (PSAS), for massively parallel, distributed-memory supercomputers, achieving both high efficiency and good scalability.
APPROACH: Group observation data into regions so that the correlation matrix takes on a block structure, which allows the BLAS (Basic Linear Algebra Subprograms) routines to be used at very high efficiency. Distribute the sparse matrix blocks across processors in a load-balanced way. Design the preconditioned Conjugate Gradient linear solver around a fan-in-fan-out communication scheme. Use equal-area grid regions and grid thinning to speed up the foldback process. Build a diagnostics system for debugging and monitoring.
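The preconditioned Conjugate Gradient iteration named in the approach can be sketched as follows. This is a minimal serial illustration, not the PSAS solver: it assumes a small dense symmetric positive-definite matrix and a Jacobi (diagonal) preconditioner, neither of which is specified in this summary, and it omits the block-sparse storage and the fan-in-fan-out communication. The function names `matvec`, `dot`, and `pcg` are hypothetical.

```python
def matvec(A, x):
    """Dense matrix-vector product; in a parallel solver this is the distributed step."""
    return [sum(a * xj for a, xj in zip(row, x)) for row in A]


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def pcg(A, b, tol=1e-10, max_iter=200):
    """Solve A x = b for symmetric positive-definite A.

    Assumption: a Jacobi (diagonal) preconditioner is used purely for illustration.
    Returns the solution vector and the number of iterations performed.
    """
    n = len(b)
    x = [0.0] * n
    r = list(b)                                   # residual b - A x for x = 0
    inv_diag = [1.0 / A[i][i] for i in range(n)]  # Jacobi preconditioner M^-1
    z = [d * ri for d, ri in zip(inv_diag, r)]    # preconditioned residual
    p = list(z)                                   # initial search direction
    rz = dot(r, z)
    for it in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                # converged on residual norm
            return x, it
        Ap = matvec(A, p)
        alpha = rz / dot(p, Ap)                   # step length along p
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, Ap)]
        z = [d * ri for d, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        beta = rz_new / rz                        # direction update coefficient
        p = [zi + beta * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x, max_iter


# Small example system with known solution x = (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x, iterations = pcg(A, b)
```

In a distributed setting, the matrix-vector product and the inner products are the steps that require communication, which is presumably where a fan-in-fan-out scheme would apply.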
ACCOMPLISHMENTS: Designed and implemented the following: a concurrent data partitioning algorithm which groups data into regions with good aspect ratios; a matrix block distribution algorithm which achieves the load-balance objective; an efficient Conjugate Gradient linear equation solver for sparse block matrices; equal-area grid region generation and grid thinning; a replicating algorithm for the foldback matrix generation and analysis increment calculation. Integrated all of these algorithms with a scalable parallel data I/O system and a diagnostic/monitoring system. The resulting parallel assimilation package is highly efficient. An assimilation problem with 80,000 observations is solved on a 512-node Paragon in 3 minutes, versus 5 hours on a single Cray C90 processor. This is a speedup of more than 100 times. The time-critical solver achieves a sustained 18.3 GFLOPS on the 512-node Paragon, which is 36% of the peak speed. On a 128-PE T3D, a problem with 52,000 observations achieved 3.6 GFLOPS for the entire problem.
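One common way to meet a load-balance objective when distributing matrix blocks is a greedy heuristic: sort the blocks by estimated cost and repeatedly give the next-largest block to the least-loaded processor. The sketch below illustrates that idea only. The summary does not describe the actual distribution algorithm, so the heuristic, the function name `distribute_blocks`, and the block labels and costs are all assumptions made for illustration.

```python
import heapq


def distribute_blocks(block_costs, num_procs):
    """Greedy largest-first assignment of matrix blocks to processors.

    block_costs: dict mapping a block label to its estimated work (hypothetical units).
    Returns (assignment, loads), both keyed by processor index.
    """
    # Min-heap of (current load, processor index); the least-loaded processor pops first.
    heap = [(0.0, p) for p in range(num_procs)]
    heapq.heapify(heap)
    assignment = {p: [] for p in range(num_procs)}

    # Place the most expensive blocks first so small blocks can even out the loads.
    for block_id, cost in sorted(block_costs.items(), key=lambda kv: kv[1], reverse=True):
        load, p = heapq.heappop(heap)
        assignment[p].append(block_id)
        heapq.heappush(heap, (load + cost, p))

    loads = {p: sum(block_costs[blk] for blk in blocks) for p, blocks in assignment.items()}
    return assignment, loads


# Hypothetical costs for the upper-triangular blocks of a 3x3 block matrix.
costs = {"B00": 9, "B01": 7, "B11": 6, "B02": 5, "B12": 4, "B22": 3}
assignment, loads = distribute_blocks(costs, num_procs=2)
```

For this toy input the two processors end up with equal total cost. A production scheme would also need to weigh communication volume, which this sketch ignores.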
SIGNIFICANCE: This work demonstrates that, for the climate data assimilation problem, a distributed-memory massively parallel computer can solve the same problem 100 times faster at about the same cost. This parallel package now adequately meets the real-time computational requirements of NASA's Data Assimilation Office (DAO) in its daily operations relating to NASA's Earth Observing System Data and Information System. It also enables the DAO to tackle much larger problems that were previously unthinkable. A paper describing this work was presented at the Intel Supercomputer Users Group Conference 1996 and won the Best Paper Award in the performance category.
STATUS/PLANS: The entire assimilation package is implemented on the Intel Paragon and the Cray T3D. The MPI version of the code runs on both platforms.
POINT OF CONTACT:
Hong Q. Ding
Jet Propulsion Laboratory
hding@olympic.jpl.nasa.gov
818-354-8983