---


Ph.D. Theses

An Information-Theoretic Approach to Storage Management for Middleware Caching

By Chi-nan Chiang
Advisor: Sibel Adali
July 25, 2005

Caching is a critical technique for improving the performance of many types of applications. For database applications, there has been a great deal of research on view caching in the past decade. Views in this context model queries that are executed repeatedly by applications that access databases across a network. Most studies in this area focus on improving algorithms for cache replacement and admission by using a fixed profit metric to measure the importance of views. It has been shown that in many data-intensive applications, caching methods that consider multiple factors, such as the unit computation cost together with the number of hits, outperform conventional caching techniques that rely solely on hits. The profit metric in these cache management systems is usually built from observations that identify the factors contributing most to performance. However, how to combine these factors appropriately for a specific workload is a problem that has received little attention. Since the performance of a cache system may change dramatically with the combination and scaling of these factors, this is a crucial step in designing an effective cache management system. In addition, workload changes can easily degrade system performance. A self-tuning cache system can address this problem by adapting to such changes. The design of such an adaptive system has not been addressed in the view caching literature.

In this thesis, we address these problems and propose an information-theoretic approach as a basis for combining multiple factors that predict cache performance. We describe a generic cache management system called CAVES that can incorporate any application-specific factor into the profit metric and evaluate such factors against a given performance measure. We describe the architecture of this system and develop methods for tuning its performance for a specific workload. We develop a simulation model of our system using the Time-Warp simulation technique and test it against simulated workloads as well as the TPC-H benchmark. We show that our profit metric can outperform other well-known methods that use the same factors. We also show that our method is able to adapt to a wide range of workloads with different properties. Based on these results, we develop a methodology for tuning cache management protocols to a given workload.
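
As a rough illustration of why the combination of factors in a profit metric matters, the following minimal Python sketch compares a hits-only metric with one that also weighs unit computation cost. It is not the CAVES metric or the information-theoretic method of the thesis, neither of which is specified in this abstract. The `ViewStats` fields, the `profit` formula, and the weights `w_hits` and `w_cost` are all hypothetical choices made for the example.

```python
from dataclasses import dataclass


@dataclass
class ViewStats:
    """Per-view statistics; all field names are illustrative assumptions."""
    name: str
    hits: int     # number of times the cached view was reused
    cost: float   # unit computation cost of re-running the query
    size: int     # storage occupied by the cached result


def profit(view, w_hits=1.0, w_cost=1.0):
    """Hypothetical profit metric: weighted product of hits and cost per unit of size."""
    return (view.hits ** w_hits) * (view.cost ** w_cost) / view.size


def choose_victim(views, w_hits=1.0, w_cost=1.0):
    """Return the cached view with the lowest profit, i.e. the eviction candidate."""
    return min(views, key=lambda v: profit(v, w_hits, w_cost))


cache = [
    ViewStats("daily_sales", hits=50, cost=1.0, size=10),
    ViewStats("yearly_rollup", hits=5, cost=40.0, size=10),
]

# Setting w_cost to 0 ignores computation cost, leaving a hits-only metric.
hits_only_victim = choose_victim(cache, w_cost=0.0)
# With both factors weighted equally, the cheap-to-recompute view is evicted instead.
combined_victim = choose_victim(cache)

print(hits_only_victim.name, combined_victim.name)
```

In this toy cache the two metrics pick different eviction victims: the hits-only metric discards the rarely used but expensive view, while the combined metric keeps it. Choosing the weights by hand, as done here, is exactly the workload-dependent step that the thesis proposes to handle systematically.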



---

---