Motivation: The Rensselaer Grid
- Currently 1000 processors
- Largest university-based supercomputing system (upcoming CCNI grant)
  - over 10,000 IBM Blue Gene processors
  - over 70 teraflops

TeraGrid/ETF (www.teragrid.org)
- Initially 4 geographically distributed sites with few users
- Now an extensible facility with more than 150 petabytes of storage and 102 teraflops of computing capability

iVDGL (www.ivdgl.org)

LCG @ CERN (www.cern.ch)
- LHC (Large Hadron Collider) Computing Project

Areas of Uncertainty
- Application Uncertainty
  - non-determinism
  - changing computational/memory usage demands
  - changing communication topology
- Environmental Uncertainty
  - processor availability
    - in grids, the processors an application is loaded on may not be known
  - OS effects
- Interaction Uncertainty
  - unknown how applications may perform on different architectures
    - state space too large to test all parameters
  - applications may compete with other concurrently running applications for resources
    - communication links
    - disk access
    - processor use
    - memory use

Autonomic Runtime Manager (ARM) for Large Scale Adaptive Distributed Applications

Adaptive Distributed Applications
- computational complexity associated with each computational region varies in both time and space over the application's execution
- Brief discussion of wildfire application

What is ARM?
- dynamically schedules computational units to resource units
- applications are profiled for information about resources and application performance

Reconfiguration
- IR: imbalance ratio
  - (max computation spent - min computation spent) / max computation spent
  - normalizes the difference in computation
- PCL: processor computational load
  - data assigned to processor
- PAF: processor allocation factor
  - average burning cell time / burning cell time at processor
- TB: computation time of a burning cell
  - equal to TB(estimated) * processes at processor (if 1, identity)
- ACW: application computational workload
  - number of burning cells plus number of unburnt cells

Brief results
- results 1: static -> natural region -> graph partitioning
- results 2: 256x256 (65536 cells) -> 512x512 (262144 cells)
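
The reconfiguration quantities listed under "Reconfiguration" can be sketched in a few lines of Python. This is only an illustrative reading of the outline, not ARM's actual implementation. The function names, the use of per-processor computation times as input to IR, and especially the proportional split used for PCL are assumptions: the outline says only that PCL is the data assigned to a processor and does not give the rule.

```python
from typing import List, Sequence


def imbalance_ratio(times: Sequence[float]) -> float:
    """IR = (max - min) / max over the computation spent per processor (assumed input)."""
    t_max, t_min = max(times), min(times)
    if t_max == 0:
        return 0.0  # nothing was computed, so there is no imbalance to report
    return (t_max - t_min) / t_max


def burning_cell_time(tb_estimated: float, n_processes: int) -> float:
    """TB at a processor = TB(estimated) * processes at that processor (identity if 1)."""
    return tb_estimated * n_processes


def allocation_factors(tb_per_processor: Sequence[float]) -> List[float]:
    """PAF = average burning cell time / burning cell time at the processor."""
    avg = sum(tb_per_processor) / len(tb_per_processor)
    return [avg / tb for tb in tb_per_processor]


def application_workload(n_burning: int, n_unburnt: int) -> int:
    """ACW = number of burning cells plus number of unburnt cells."""
    return n_burning + n_unburnt


def processor_loads(acw: float, pafs: Sequence[float]) -> List[float]:
    """ASSUMPTION (not in the outline): split ACW across processors in proportion to PAF."""
    total = sum(pafs)
    return [acw * paf / total for paf in pafs]
```

For example, with four processors running 1, 2, 1 and 4 processes and an estimated burning cell time of 2.0, `allocation_factors([burning_cell_time(2.0, n) for n in (1, 2, 1, 4)])` gives the lightly loaded processors a factor above 1 and the heavily loaded one a factor below 1, so under the assumed proportional split they would receive more and fewer cells respectively.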