THE EARTH SCIENCES INFORMATION SYSTEM
Working Group Participation
John A. Dutton, Chairman
Francis P. Bretherton
Roy L. Jenne
Designated Federal Liaison: Dale Harris
Rapporteurs: Anne Linn, Frank Eden
WORKING GROUP SUMMARY
John A. Dutton, Chairman
The Earth Observing System Data and Information System (EOSDIS) is a central component of the EOS program for linking observations made from space with those obtained on the ground and assisting scientists to convert them into enhanced understanding of the Earth system and the processes that drive its evolution. EOSDIS must be designed and implemented so that the investment in EOS space observations is multiplied many times through revealing analyses, through new models of the Earth system and its components, and through stimulation of a wide range of educational and economic activities. The EOS program, and indeed the entire U.S. Global Change Research Program (USGCRP), cannot be successful unless EOSDIS fulfills expectations that it will empower new levels of achievement in the Earth sciences and applications, and in a wide range of activities in both the public and the private sectors.
To meet these expectations, we must now embrace a revolutionary expansion of the conceptual model that governs the management and operation of the system by affording the scientific community full partnership with shared responsibility. If we create and commit ourselves to the right model, all of the details related to design and technology will fall into place readily. Moreover, a new and successful model for EOSDIS, and by extension for USGCRP as a whole, will provide a stimulus for new approaches to data and information management in a wide variety of activities and will broadly benefit the nation.
The two key requirements for the system are that it must
- utilize an open management approach in which key decisions are made with community leadership, and assignment of responsibilities is based on peer review; and
- encourage innovation and creativity through wide participation of the scientific, public, and private sectors.
The revolution proposed in the management and implementation of EOSDIS will prove successful only if it incorporates, from the beginning, powerful incentives and meaningful criteria. As criteria for evaluating the design and implementation, we propose that the new concept should ensure that
- users can readily locate data sets with real and valuable scientific content;
- users can access and utilize such data sets readily and in a timely fashion;
- collaborative analysis and research is stimulated and encouraged; and
- demonstrable progress in scientific endeavors and in applications to other activities is evident.
To provide incentives for the scientific community, the system must enable and encourage scientists and scientific teams to use it for interaction and as a form of electronic publication and dissemination of their results.
Historical Background of EOSDIS
The EOSDIS was conceived a decade ago by the science steering groups that developed the initial plans for EOS as a powerful, distributed data and information system that would provide ready access to the data and stimulate new levels of scientific creativity and collaboration in studying the wide range of interdisciplinary issues that must be resolved to understand the evolution of the Earth system.
However, the system design developed in good faith by the National Aeronautics and Space Administration (NASA) was shaped and constrained by the engineering protocols then in vogue for the development of large and complex hardware systems. Thus, the initial architecture proposed by NASA was to be centrally controlled and operated to ensure that it met ambitious performance and reliability requirements. Later versions developed in response to the objections and advice of the scientific community retained these features. The architecture required by NASA in the initial contract with Hughes Applied Information Systems (HAIS) generated considerable concern and was revised after a thorough National Research Council review (NRC, 1994) that produced recommendations for a logically distributed system, based on a client-server model, that would accommodate evolving computer system concepts and technology. Despite the notable improvements in architecture and concept introduced by HAIS in response to NRC recommendations, the current design and performance requirements, the system of multiple Distributed Active Archive Centers (DAACs) (each configured as a stand-alone, high-performance, and highly reliable computing center), and an extensive engineering and management superstructure are stressing the bounds of affordability (see Table 1).
Still, considerable progress has been made. This new client-server architecture of EOSDIS takes advantage of logical distribution and modularity and will allow the system to evolve as both computer system concepts and technology advance in the years ahead. The system now can take advantage of the concepts of the World Wide Web (WWW), the continuing advances in computer and storage capabilities, and the advantages conferred by developing a set of permissive standards appropriate to global change research that will enable and encourage wide access to EOSDIS and wide use of, and contribution to, its resources. Thus, with appropriate incentives, the system can be flexible and quick to adapt to a rapidly changing environment.
TABLE 1.1 EOSDIS Components and Costs (FY 1991-2000)--NASA Concept

Components                                                               Cost ($ million)
Flight and Data Operations
  Flight operations and spacecraft control                                             86
  Ground stations (communication with spacecraft)                                      50
  EOS data and operations system (data capture and initial processing)                225
  EOSDIS backbone network (transmit data to DAACs)                                    106
  Distributed active archive centers (preparation of data products)                 1,021
  Distribution of data to users via Internet                                           52
System Engineering and Management
  System engineering and integration                                                  372
  Program and project management                                                       74
  Related science support                                                             144
TOTAL                                                                               2,230
A New Concept: The Earth Sciences Information System
The present plans for the development of EOSDIS have been widely criticized for reasons ranging from an apparently excessive cost to lack of a governance structure that engages and empowers the scientific community. A number of observers do not believe that problems with the system can be eliminated by engineering redesign. Instead, the concerns are much more fundamental and are related to the basic management approach--to the conceptual model that has guided and constrained the management and engineering of EOSDIS.
Thus, a new model is proposed that will distribute many of the functions of the system to a wide range of government, academic, and private organizations through a competitive process. To distinguish this new model from those of the past, we will refer to it as the Earth Sciences Information System (ESIS). The basic concept is illustrated in Figure F-1. The functions shown on the left--flight control, data receipt and Level-0 archive, and initial processing of the data through Level 1--will follow the existing EOSDIS model. Although the model for this part of the system does not change, these functions can be streamlined considerably with important reductions in cost.
On the right, the generation of products and the combination of initial products into a wide range of scientific data fields would be opened to a competitive process through an Announcement of Opportunity, with bidders allowed to bid on any number and combination of products and services. It may be anticipated that the successful bidders will include NASA laboratories (perhaps some of the present DAACs), teams of EOS investigators, other academic collaborators, and private sector organizations and firms. These entities are referred to as NASA Earth Science Information Partners (ESIPs), and it is anticipated that similar organizations will develop outside of NASA sponsorship or supervision. Thus, ESIS will become a privatized, market-driven federation of product generation and enhancement capabilities. Rather than a centrally managed entity, it will become a coordinated activity, drawing in new participants.
The effectiveness of NASA ESIPs will be determined by the criteria used to evaluate both proposals and continuing performance. Three are recommended:
- timely production of specific scientifically meaningful products;
- provision of effective user support and appropriate data access; and
- formatting data sets and associated documentation in a form suitable for transmission to permanent libraries.
Introducing competition will have important consequences. The first is that bid prices will be consistent with the marginal cost of providing actual ESIS services and thus can be presumed to be considerably less than the cost of dedicated, stand-alone facilities. Second, the new model is intellectually inclusive and will attract new participants, creating a much broader and more effective process for attacking the key problems of global change research. Third, with the development of standards and protocols to interchange data sets on Internet and WWW, ESIS will create a new capability of broad value to the scientific community and the private sector and thus to the entire nation.
With suitable extensions of the catalogs and advertising services being developed by HAIS, the results of EOS research will be available to all Internet users. This, too, has important consequences. First, with appropriate standards, a wide range of scientists and scientific facilities that use EOS data will be encouraged to make their results available to others by conforming to system standards and thus publishing them electronically. Second, a market for ESIS services will develop in which value-added concerns will offer search, browse, and data delivery services that are extensions of the basic capabilities. Such services may be especially attractive to private sector users of EOS results and to schools and colleges.
Issues, Challenges, and Risks
The most evident logical difference in the two models is that responsibility for processing and product generation at Levels 3 and higher has been transferred from designated government facilities to the federation of community entities. In this section, we provide a preliminary view of some of the consequences.
A variety of issues and risks are common to all computer systems and all endeavors in scientific data management. These include archiving, security, providing user assistance, and documenting user activities. Preliminary study leads to the conclusion that, except for minor variations, these are essentially similar in the two models. Successful bidders will have to demonstrate that they understand these issues and have adequate and rigorous plans for dealing with them.
The proposed model for ESIS does pose new issues, however. The first is that of managing collaboration in a competitive environment. Developing, processing, maintaining, and improving EOS scientific products will require collaboration between the instrument teams or investigators. Moreover, the strong interdependencies of some data sets will mandate effective collaboration and careful scheduling. The Announcement of Opportunity must provide for arrangements that will encourage the necessary collaboration and include initial provisions or a negotiation phase to permit instrument teams to explore collaboration with several bidders or an otherwise successful bidder.
Moreover, even in the proposed decentralized and federated systems, a number of specific functions will require centralized intellectual leadership, an example being definition of standards for metadata and supporting documentation. Further elaboration of these should take place in later stages of this review.
A second and critical issue is the governance of ESIS. A significant advantage of the proposed model is its potential to stimulate the collaboration and wide participation of the scientific community in the processing and refinement of EOS products and in the development of higher-order products that reveal new aspects of an improving scientific understanding of the Earth system. To achieve this potential, the system must be responsive to users and participants--it cannot be centrally managed from the top down but must be governed as a federation of collaborating entities. Moreover, the federation must expand to include other agencies and the research teams they support. A 1995 NRC report sets forth the basic structure of such a federation in the context of managing scientific data (NRC, 1995).
A third issue is that the transition to the new system must be very sensitive to the expectations of international partners and the commitments that have been made to them. Agreements in place must not be jeopardized and should be modified only with the enthusiastic concurrence of these partners, many of whom may prefer ESIS capabilities to the present plan.
A fourth issue is whether reassessment and relaxation of system performance and reliability requirements will produce significant savings in total costs. Current requirements derive from the spacecraft data production rates and are designed to reduce risks to the central facility. With adoption of the ESIS model, the risks are transformed into those associated with scientific research, and tolerance for central risk can be increased. For data products deriving from the AM-1 platform, the transition will have to be handled with particular care because of complex interdependencies and tight schedules.
Finally, the success of either model depends in part on the continued viability of the Internet as a mechanism for high-bandwidth computer-to-computer communication. Bidders would have to demonstrate the commitment of their host organizations to maintain Internet connections of sufficient bandwidth. Although the advancing capabilities of the Internet or other national high-performance computer communication capabilities are expected to keep pace with demands for service, there is a risk that they may not. A first complication would be inadequate bandwidth to support the interactive processing of interdependent products; such a difficulty could be ameliorated by transfer of data on physical media via overnight delivery. A second complication would be charges for Internet services, a development that would lead to complications for scientific research that extend far beyond EOS. Such complications would be equally problematic in both models.
Transition to the New Model
The ESIS model will create a data and information system that operates differently from the present concept and will require that the transition be carefully managed. The most important action now is to adopt the new intellectual concept for the system and be clear about our long-term goals. Every attempt should be made to put as much of the new system as possible in place before the launch of EOS AM-1. To do so, NASA, EOS investigators, and EOSDIS contractors must begin immediately to conduct a collaborative study of the implementation and cost of the federated system and to develop a plan for an effective, streamlined central management and engineering capability. Some representative actions typical of those required in such a study are listed in the next section. Although such a study may demonstrate that a gradual or incremental transition to the new system is advisable, we argue that the initial effort should be directed toward effecting a dramatic break with the past and creating an entirely new and contemporary federated management and operation of ESIS.
Recommendations
The following two recommendations summarize the discussion in this appendix:
1. The components of the EOSDIS now under development for flight control, data downlink, and initial processing should be retained, but streamlined.
Representative Actions to Respond to Recommendation 1
- Assess rigorously the relative costs of transmitting and receiving EOS spacecraft data with and without the Tracking Data Relay Satellite System (TDRSS).
- Reevaluate EOS Data and Operations System (EDOS) functions with the aim of incorporating advanced technologies and limiting the scope to that needed for data capture and processing to Level-0. Reduce initial data processing costs by utilizing receiving stations for Level-0 processing and existing capacity at DAACs (e.g., at Goddard and EROS Data Center) for Level-0 to Level-1 or Level-2 processing.
- Explore with end-to-end system plans the use of advanced technologies and concepts such as solid-state spacecraft data recorders, increased spacecraft autonomy, and contemporary data packet protocols to simplify data operations and reduce overall costs.
- Explore replacement of the EOSDIS Backbone Network with commercial facilities to reduce engineering and continuing management costs.
- Evaluate possible advantages and relative short-term and long-term cost savings associated with development of a unique flight operations system for each mission in order to take maximum advantage of new capabilities, new technologies, and lessons learned from previous missions.
2. Responsibility for product generation and publication and for user services should be transferred to a federation of partners selected through a competitive process open to all.
To effect this recommendation, it will be necessary to examine the systems implications of reconfiguring EOSDIS as a loosely coupled federation of quasi-autonomous partner organizations, each with a contractual obligation to perform a subset of the tasks involved in preparing and distributing scientifically reliable products at Level-2 and higher. The examination should identify in particular those functions or services to the federation that must be provided centrally and those for which responsibility can be delegated to the partners.
Representative Actions to Respond to Recommendation 2
- Reassess schedule, continuity, and reliability requirements for standard data products with the aim of simplifying preparation of the scientific data products, and thus reducing costs. Examine with EOS investigators and other potential users the hypothesis that only Level-0 data must be treated in a rigorous production sense.
- Assess rigorously the advantages, disadvantages, and relative costs of moving Level-1 or Level-2 data to a distributed system of scientific data processing partners via Internet, commercial surface and space-based communication networks, or overnight delivery of media.
- Obtain (from EOS instrument Principal Investigators and teams, other investigators, and an appropriate subset of existing DAACs) realistic cost estimates for preparing representative scientific data products in distributed processing units.
- Develop prototype models of minimum machine-independent data format standards and interchange protocols that will facilitate exchange, interactive use, and electronic publication of EOS scientific data sets over existing commercial and Internet facilities. This effort should engage experts from the academic and commercial computer science communities and should concentrate on whether extensions to existing standards, such as those used on Internet and World Wide Web, are necessary or advisable.
- Develop prototype protocols for peer review and signed electronic publication of scientific data sets that would provide incentives and quality control motivations to producers of these data sets.
- Explore the use of information search facilities modeled on those now in use on the World Wide Web as a means of providing users with data search and access capabilities; explore whether the EOSDIS Version 1 and Version 2 systems could operate exclusively over the Internet (or anticipated national high-performance computer communications networks) to facilitate data exchange by scientists and to provide search and access capabilities to users.
- Explore possible advantages of dividing EOS data into categories in order to determine the most effective means of processing and distributing data to users. Possible categories (and possible data producers) include operational data for other agencies (many possible producers, depending on timeliness), data of use to a limited community of scientists (instrument teams or Principal Investigators), data of wider scientific use (many possible producers), data of interest to educational institutions and the public (scientific or commercial data facilities), and data with commercial value (commercial or academic bidders).
- Develop a preliminary model of a procurement process and an Announcement of Opportunity that could be used to solicit proposals from potential participants in a distributed scientific data processing system.
- Develop a plan and realistic cost estimates, using the information generated by the above actions, for a distributed data processing federation as envisioned in Figure F-1, and seek the comments and advice of EOS investigators and the broader scientific and other user communities.
CONCLUSION
The proposal made here for creating ESIS offers many advantages to the government and the scientific community. Rather than being managed top-down by the government, the new model will create a federation of participants. By taking advantage of Internet capabilities, it will extend access to EOS results to a wide audience, including new participants in the private sector. Although substantial savings may be expected, the costs of the new approach can be estimated only after careful study.
Most significantly, it will stimulate participation of the scientific community in the governance of ESIS and create an entirely new system that can be the model and foundation for the broader Global Change Data and Information System. Perhaps its greatest benefit, however, will be that it will generate a new approach to the interactive management and use of distributed data sets and, with an appropriate set of standards and protocols, provide a new capability of significant benefit to a nation increasingly dependent on collaborative and innovative exploitation of complex arrays of data and information.
The proposed new approach has substantial benefits and some challenging risks. However, the benefits envisioned more than compensate for those risks.
REFERENCES
National Research Council (NRC). 1994. Final Report. Panel to Review EOSDIS Plans. National Academy Press. Washington, D.C.
National Research Council (NRC). 1995. Preserving Scientific Data on Our Physical Universe: A New Strategy for Archiving the Nation's Scientific Information Resources. National Academy Press. Washington, D.C.