During 2005 and 2006, I worked on a distributed database system for the JESPP project at USC's Information Sciences Institute. This presentation is based on an unpublished paper about the project that I co-authored with Dr. Ke-Thia Yao. Below is the abstract of that paper, along with links to the PowerPoint slides (264KB) and a draft of the paper as a PDF file (144KB).
High Performance Computing (HPC) has made significant strides in the distributed simulation community. The Joint Forces Command has fielded more than a million independent agents in its JESPP project, with a concomitant data management challenge. Enabling and optimizing this transcontinental computing and analysis environment has drawn significant interest from the test and evaluation (T&E) community. This paper focuses on the Scalable Data Grid (SDG) project at the Information Sciences Institute and explains why some Java techniques proved useful and others did not. Studying the programming aids in examining the design and implementation of an effective distributed simulation database using the Java programming language and its associated tools and Application Programming Interfaces. The transition of intelligent agent simulations from training to experimentation requires the effective logging, processing, storage, retrieval, and analysis of terabytes of data. The design, construction, and evaluation of the SDG strive to balance efficiency of execution, clarity of development, and security of the environment to create a robust, scalable system that supports distributed simulation database population, organization, and utilization. The choice of the most appropriate programming language was central to the effective development and eventual utility of the SDG. The depth and breadth of Java technologies provide a rich set of capabilities, but not all of these capabilities are needed by every system. The SDG was built using, among other things, Java Database Connectivity (JDBC) and Remote Method Invocation (RMI). The frameworks available for wrapping these and other technologies, such as Java 2 Enterprise Edition™ (J2EE), Java Data Objects (JDO), or Hibernate, were not used. The rationale for these choices is laid out and reviewed, and the constructed classes are illustrated.
The authors review lessons learned and performance evaluations across several dimensions. This paper covers the design decisions behind the SDG system code, including the choice of language and programming tools. All of these techniques should extend to the T&E community as that community increases its use of HPC capabilities.