

NASA’s Earth science data archives, its library of data relating to geology and meteorology, contain so many records in such a wide variety of forms, spread across so many databases, that scientists looking for information must search many files and cross-reference records manually. The problem is compounded by limited communication bandwidth, time constraints, and database protocols that differ from site to site. In addition, the autonomy of individual data sources must be preserved.
In this NASA-sponsored project, Charles River Analytics built ACQUIRE (Agent-based Complex QUery and Information Retrieval Engine), a system whose autonomous software agents move across computer networks to retrieve data from databases on distributed machines. We developed a fully functional, web-enabled prototype that retrieves data from NASA's Distributed Active Archive Centers (DAACs). Pre-defined templates combine planning with traditional database query optimization techniques, allowing agents to translate a scientist's query into sub-queries for individual data sources. By allowing cooperation among agents working on related sub-queries, the system increased retrieval efficiency by 100% to 600%, depending on the specific query posed. We targeted this effort specifically at integrating ACQUIRE with NASA’s Earth Observing System Data and Information System (EOSDIS).
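The template-based translation described above can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the source names (`daac_a`, `daac_b`), their fields, and the template strings are hypothetical and do not reflect actual DAAC schemas or the prototype's real planner. The sketch only shows the general idea of routing each condition of a query to the source that holds the relevant field and filling in that source's own query template.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Source:
    """A hypothetical data source: which fields it holds and how it is queried."""
    name: str
    fields: frozenset
    template: str  # sub-query template written in the source's own dialect


# Illustrative sources only; real archives use different schemas and protocols.
SOURCES = [
    Source("daac_a", frozenset({"region", "date"}),
           "SELECT granule_id FROM granules WHERE {conditions}"),
    Source("daac_b", frozenset({"sensor", "cloud_cover"}),
           "find granule_id where {conditions}"),
]


def plan_subqueries(predicates, sources):
    """Assign each predicate to the first source holding its field, then
    fill that source's template with the predicates assigned to it."""
    assigned = {}
    for field, op, value in predicates:
        owner = next((s for s in sources if field in s.fields), None)
        if owner is None:
            raise ValueError(f"no source holds field {field!r}")
        assigned.setdefault(owner.name, []).append(f"{field} {op} {value!r}")
    by_name = {s.name: s for s in sources}
    return {
        name: by_name[name].template.format(conditions=" AND ".join(conds))
        for name, conds in assigned.items()
    }


# One scientist-level query expressed as (field, operator, value) conditions.
query = [
    ("region", "=", "arctic"),
    ("cloud_cover", "<", 20),
    ("date", ">=", "1999-01-01"),
]
plan = plan_subqueries(query, SOURCES)
for source_name, subquery in plan.items():
    print(source_name, "->", subquery)
```

In this sketch each source receives only the conditions it can evaluate, expressed in its own dialect. A fuller planner would also have to decide how the partial results are joined and in what order the sub-queries run.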
Scientists can now use the prototype system to run a single search across multiple databases, including legacy systems. By using mobile agents, Charles River Analytics' search system leaves each database as it was while distributing the total query workload over a number of database servers. Since much of the computation and filtering can be done at the server itself, mobile agents minimize bandwidth usage by transferring only the data that will be in the final result set. The overall query plan can be optimized both statically (before the agents are deployed) and dynamically (while the agents are executing their respective tasks) in order to adapt to unexpected conditions. Finally, the system lets scientists deploy logic well beyond what standard database query languages can express, allowing for more powerful and flexible searches.
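The bandwidth argument above can be made concrete with a toy comparison. This is a sketch under stated assumptions, not the prototype's implementation: the two "servers" are hypothetical in-memory lists, the `cloud_cover` field and the filter threshold are invented, and "records transferred" is simply a count standing in for network traffic.

```python
# Hypothetical in-memory "servers"; real archive holdings and transport differ.
SERVERS = {
    "site_a": [{"id": i, "cloud_cover": i % 100} for i in range(1000)],
    "site_b": [{"id": 1000 + i, "cloud_cover": (i * 7) % 100} for i in range(1000)],
}


def keep(record):
    """Illustrative filter: keep only nearly cloud-free records."""
    return record["cloud_cover"] < 5


def fetch_then_filter(servers, predicate):
    """Client-side filtering: every record crosses the network first."""
    transferred = 0
    results = []
    for records in servers.values():
        transferred += len(records)
        results.extend(r for r in records if predicate(r))
    return results, transferred


def filter_at_server(servers, predicate):
    """Agent-style filtering: the predicate runs where the data lives,
    so only matching records are sent back."""
    transferred = 0
    results = []
    for records in servers.values():
        matches = [r for r in records if predicate(r)]
        transferred += len(matches)
        results.extend(matches)
    return results, transferred


naive_results, naive_cost = fetch_then_filter(SERVERS, keep)
agent_results, agent_cost = filter_at_server(SERVERS, keep)
print("records transferred, client-side filtering:", naive_cost)
print("records transferred, server-side filtering:", agent_cost)
```

Both strategies return the same result set; they differ only in how many records leave each server. The size of the saving depends on how selective the filter is, so the numbers in this toy setup say nothing about the gains measured in the actual system.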
Das, S., Shuster, K., Wu, C., and Levit, I. (to appear). "Mobile Agents for Distributed and Heterogeneous Information Retrieval". Information Retrieval Journal. Kluwer Publishing.
Das, S., Shuster, K., and Wu, C. (2002). "ACQUIRE: Agent-based Complex QUery and Information Retrieval Engine". AAMAS'02, July 15-19, 2002, Bologna, Italy.
