
Peter Lyster, Principal Investigator, NASA Data Assimilation Office
July 4, 2000
NASA/GSFC Data Assimilation Office (DAO), Greenbelt, MD
University of Maryland Earth Systems Science Interdisciplinary Center (ESSIC)
Email: lys@dao.gsfc.nasa.gov
http://ct.gsfc.nasa.gov/lys/lys.html
Front Cover: Three-dimensional Visualization of Methane Distribution in the Stratosphere for September 1992. The results were generated from the Kalman filter using the GSFC Cray T3E, and were included in the video: Images of Earth and Space: SC97 Edition.

This project on software for data assimilation has five main achievements: (1) the first introduction of distributed-memory parallel algorithms to the DAO, for both the General Circulation Models (GCMs) and the analysis software; (2) advances in the understanding, with documentation, of the software and computational complexity of data assimilation systems; (3) a factor-of-four improvement in the wall-clock performance of the operational Physical-space Statistical Analysis System (PSAS); (4) development of the distributed-memory parallel PSAS with significant improvements in scalability and performance; and (5) development and scientific validation of a parallel Lagrangian Kalman filter for constituent assimilation that achieved its performance goals. The Lagrangian filter could not have been developed without high-end computing capability.
The multivariate production algorithm at the DAO is the Goddard Earth Observing System Data Assimilation System (GEOS DAS). The analysis component of data assimilation systems continues to require considerable research on software complexity and performance. For GEOS DAS the compute-intensive part of the analysis is performed by the PSAS, which involves complex databases and covariance models. See, for example, the panel discussion on software at the Third WMO Symposium on Data Assimilation in Meteorology and Oceanography, Quebec City, Canada, 7-11 June 1999. The 50 and 100 gigaflop/s milestones for the GEOS DAS were negotiated out of our agreement. We have gained significant understanding of the software complexity and performance of the GEOS DAS, and of the PSAS in particular; this is discussed in the text and attachments.
2. Discussion of the Key Elements of the Project
This is the second phase of the DAO's High Performance Computing and Communications (HPCC) Grand Challenge Principal Investigator project (1997-1999). Early in 1998, the DAO and the HPCC Project agreed that it was premature to establish 50 and 100 gigaflop/s milestones for the end-to-end GEOS DAS; at the time, considerably more work was still needed to stabilize the components of the core system, namely the General Circulation Model (GCM) and the Physical-space Statistical Analysis System (PSAS). In particular, the DAO agreed to provide working versions of its parallel GEOS-3 GCM and the PSAS, together with reports describing the performance aspects of these codes. Note that the GEOS-3 GCM is not the same model as the fvGCM that is being developed for the new Data Assimilation System at the DAO. For background, the reader is referred to earlier submissions to the HPCC Project on Peter Lyster's web page, http://ct.gsfc.nasa.gov/lys/lys.html.
Before discussing details of the performance of algorithms, it is important to recognize that the quality of the scientific software holds primacy in efforts such as data assimilation. Quality involves portability, maintainability, and extensibility to new science, as well as performance. At the DAO the software must also meet the needs of an operational environment. In this report, issues of scientific software quality in a multi-developer environment are ever present.
The most important measure of performance for scientific code is the time-to-solution, because shortening this value leads to faster turnaround for scientific study. For those who need repeated simultaneous experiments, e.g., ensemble studies, a secondary measure is the ability to run concurrent small jobs, possibly on the same machine or in a distributed heterogeneous environment. The fundamental limit to time-to-solution is the single-processor speed of the software. For parallel computing a second issue is the scalability, namely the extent to which using extra processors (i.e., more resources) effectively reduces the time-to-solution. Amdahl's law quantifies the effect of the non-parallelized part of an algorithm on scalability: the theoretical maximum number of processors that may be used effectively is approximately the inverse of the non-parallelized fraction of the algorithm. In addition to the limitations of unparallelized code, multi-processor communication costs and load imbalance also degrade scalability.
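As an illustration of the Amdahl's law argument above, the following minimal Python sketch evaluates the ideal speedup and parallel efficiency for a given non-parallelized fraction. The function names and the 1% serial fraction are assumptions chosen for illustration; they are not taken from the DAO codes, and the model deliberately ignores communication costs and load imbalance.

```python
def amdahl_speedup(serial_fraction: float, processors: int) -> float:
    """Ideal speedup from Amdahl's law.

    serial_fraction: fraction of single-processor run time that is not
        parallelized (hypothetical input, between 0 and 1).
    processors: number of processors used (at least 1).
    """
    if not 0.0 <= serial_fraction <= 1.0:
        raise ValueError("serial_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    # The serial part takes the same time on any number of processors;
    # the remaining part is assumed to divide perfectly among them.
    return 1.0 / (serial_fraction + (1.0 - serial_fraction) / processors)


def parallel_efficiency(serial_fraction: float, processors: int) -> float:
    """Speedup per processor; 1.0 would be perfect scaling."""
    return amdahl_speedup(serial_fraction, processors) / processors


if __name__ == "__main__":
    f = 0.01  # assumed: 1% of the run time is not parallelized
    for n in (1, 10, 100, 1000, 10000):
        print(n, amdahl_speedup(f, n), parallel_efficiency(f, n))
```

With a 1% serial fraction the speedup can never exceed 100. At 100 processors (the inverse of the serial fraction) the speedup is already only about 50, i.e., roughly 50% efficiency, which is why that processor count is a reasonable estimate of the largest number that can be used effectively.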
The following briefly reviews the history and phases of the project, then presents the key results of our work: the design, implementation, and performance of the parallel GCM, the PSAS, and the Lagrangian Kalman filter.
The following documents present theory, software, and performance issues of core algorithms at the DAO: the GCM, the PSAS, and the Lagrangian filter.