This work was partially funded by NASA GSRP (NGT 5-50165) and does not represent the position or opinions of the United States Government.

[MAIN] [DESCRIPTION] [DOWNLOAD] [LINKS]

Description of IDB

Overview

IDB is a framework for profiling the performance of parallel and object-oriented scientific applications. At the heart of the framework is the probe library (Probelib), which collects performance data in a scalable way. This data is stored in a relational database, where it can be queried through a front-end visualization tool or interactively using SQL.
Performance data is tightly coupled with the program's control flow, as opposed to the underlying architecture. Scalability is ensured by maintaining aggregate statistics and instrumenting only the relevant nodes of the control flow graph. The nodes that are probed map to specific performance-critical events: LOOPs, CALLs, PROCedures, and COMMunications or synchronizations. Aggregate statistics are collected at each of these events. The order in which the events are encountered at run time defines the Control Flow Hierarchy (CFH) of the application. Every time the application is run, the CFH and its associated statistical data populate a performance database.
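
The aggregation idea can be illustrated with a minimal sketch. This is not Probelib's API (Probelib targets C and C++ programs); the names ProbeStats, ControlFlowHierarchy, enter and leave are hypothetical, and Python is used only for brevity. The sketch keeps one fixed-size record per node of the hierarchy, so storage grows with the number of distinct probed nodes rather than with the number of events.

```python
import time
from dataclasses import dataclass, field


@dataclass
class ProbeStats:
    """Aggregate statistics for one node of the control flow hierarchy."""
    count: int = 0
    total: float = 0.0
    minimum: float = float("inf")
    maximum: float = 0.0

    def record(self, elapsed: float) -> None:
        # Fold one measurement into the running aggregates.
        self.count += 1
        self.total += elapsed
        self.minimum = min(self.minimum, elapsed)
        self.maximum = max(self.maximum, elapsed)


@dataclass
class ControlFlowHierarchy:
    """Maps each path of nested events to its aggregate statistics."""
    nodes: dict = field(default_factory=dict)   # path tuple -> ProbeStats
    _stack: list = field(default_factory=list)  # currently open events

    def enter(self, kind: str, name: str) -> None:
        self._stack.append(((kind, name), time.perf_counter()))
        path = tuple(event for event, _ in self._stack)
        # Nodes are created in the order they are first encountered.
        self.nodes.setdefault(path, ProbeStats())

    def leave(self) -> None:
        path = tuple(event for event, _ in self._stack)
        _, started = self._stack.pop()
        self.nodes[path].record(time.perf_counter() - started)


# Hypothetical usage: one procedure containing a loop body run 1000 times.
cfh = ControlFlowHierarchy()
cfh.enter("PROC", "main")
for _ in range(1000):
    cfh.enter("LOOP", "timestep")
    cfh.leave()
cfh.leave()

for path, stats in cfh.nodes.items():
    print(" > ".join(f"{kind}:{name}" for kind, name in path), stats.count)
```

In this sketch, a thousand loop iterations still produce only two records, one for the procedure and one for the loop nested inside it.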

System Requirements

The IDB framework can be used to analyze both C and C++ parallel programs that use the Message Passing Interface (MPI). The Instrumentation API works on any POSIX-compliant UNIX platform with MPI installed. PostgreSQL, a freely available DBMS package, is used to capture and query the collected performance data. Because the framework relies on standard SQL, other database management systems can be used without significant modifications. The INSTRUMENTOR application requires lex, yacc, and Motif, and the VISTOOL requires Perl/Tk with the appropriate DBEs installed.

IDB Components

The IDB framework comprises three components: the Instrumentor, the Probe library (Probelib), and the Vistool. The Instrumentor introduces instrumentation into the target application. The instrumentation consists of API calls to the Probe library. The Vistool displays general performance information graphically by querying the performance database.

Instrumentation

There are five critical performance events that can occur in a program: LOOPs, CALLs, PROCedures, and COMMunications or synchronizations. Calls to the Probelib API are placed at these events using an automated instrumentation tool.


The Instrumentor parses C or C++ files to generate a syntax tree. The user is presented with a catalog of classes, methods, functions, loops, and calls to select from. Probelib calls are introduced at the selected locations.
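
The last step, introducing calls at the selected locations, can be sketched as a simple source-to-source rewrite. The sketch below is an illustration only: the real Instrumentor works on a syntax tree, whereas this version selects whole lines by number, so it only handles single-line statements. The macro names PROBE_ENTER and PROBE_EXIT and the function instrument are hypothetical and are not taken from Probelib.

```python
def instrument(source: str, selections: dict) -> str:
    """Wrap each selected 1-based source line in hypothetical probe calls.

    selections maps a line number to an (event kind, label) pair.
    """
    out = []
    for number, line in enumerate(source.splitlines(), start=1):
        if number in selections:
            kind, label = selections[number]
            # Reuse the statement's indentation for the inserted calls.
            indent = line[: len(line) - len(line.lstrip())]
            out.append(f'{indent}PROBE_ENTER({kind}, "{label}");')
            out.append(line)
            out.append(f'{indent}PROBE_EXIT({kind}, "{label}");')
        else:
            out.append(line)
    return "\n".join(out) + "\n"


# A made-up C fragment; the user "selects" the call on line 3.
c_source = """\
int main(void) {
    init();
    solve(grid);
    return 0;
}
"""

instrumented = instrument(c_source, {3: ("CALL", "solve")})
print(instrumented)
```

Working from a syntax tree, as the Instrumentor does, avoids the limits of this line-based approach, for example when a loop body spans many lines or a procedure has several return points.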

Populating the Database

After the source code has been instrumented, it is compiled and linked against the Probelib library. As the program runs, the instrumentation database is populated with performance data. This is done using two SQL scripts, SCHEMA-CREATE.SQL and POPULATE.SQL. The former defines the database schema; the latter is generated at run time and populates the database. To verify that the database has been created and populated correctly, two validation queries are provided: SHOWME.SQL, which displays all statistical data, and TREE.SQL, which displays the connectivity of the control flow hierarchy.
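
To make the create, populate, and validate sequence concrete, here is a minimal sketch. The contents of the actual scripts are not shown on this page, so the table cfh_node, its columns, and the sample rows are all invented for illustration, and Python's built-in sqlite3 module stands in for a PostgreSQL server so that the example is self-contained. The two queries only mirror the stated purpose of SHOWME.SQL (all statistics) and TREE.SQL (hierarchy connectivity).

```python
import sqlite3

# Hypothetical schema: one row per node of the control flow hierarchy.
SCHEMA = """
CREATE TABLE cfh_node (
    id         INTEGER PRIMARY KEY,
    parent_id  INTEGER REFERENCES cfh_node(id),  -- NULL for the root
    kind       TEXT NOT NULL,                    -- e.g. LOOP, CALL, PROC, COMM
    name       TEXT NOT NULL,
    calls      INTEGER NOT NULL,
    total_secs REAL NOT NULL
);
"""

# Made-up data standing in for what a run-time generated script would insert.
ROWS = [
    (1, None, "PROC", "main", 1, 12.5),
    (2, 1, "LOOP", "timestep", 1000, 11.9),
    (3, 2, "CALL", "solve", 1000, 9.4),
    (4, 2, "COMM", "exchange_halo", 1000, 2.1),
]

# Recursive query that walks parent links from the root downward.
TREE = """
WITH RECURSIVE tree(id, depth, path) AS (
    SELECT id, 0, name FROM cfh_node WHERE parent_id IS NULL
    UNION ALL
    SELECT n.id, t.depth + 1, t.path || '/' || n.name
    FROM cfh_node AS n JOIN tree AS t ON n.parent_id = t.id
)
SELECT depth, path FROM tree ORDER BY path;
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.executemany("INSERT INTO cfh_node VALUES (?, ?, ?, ?, ?, ?)", ROWS)

# "Show me everything": all statistics, one row per node.
stats = conn.execute(
    "SELECT kind, name, calls, total_secs FROM cfh_node ORDER BY id"
).fetchall()

# "Tree": the connectivity of the hierarchy, indented by depth.
tree = conn.execute(TREE).fetchall()
for depth, path in tree:
    print("  " * depth + path)
```

Storing the hierarchy as parent links keeps the schema to a single table, and the recursive common table expression used here is standard SQL that PostgreSQL also accepts.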

Visualization of Performance Data

Vistool connects to the database server and allows the user to generate customized performance visualizations.


Vistool can connect to database servers running locally or over a network. Query results are displayed graphically based on user selections. The modular design of this tool demonstrates how visualization plug-ins can be developed to pull specialized data from the database.

Last updated on February 22, 2000 by Jeffrey Nesheiwat
(c) 1999 and 2000 Jeffrey Nesheiwat and Boleslaw Szymanski