Ph.D. Theses

A Machine Learning Approach for Designing Dynamic Programming Algorithms

By Eric A. Breimer
Advisor: Mark K. Goldberg
September 25, 2002

Dynamic programming is a fundamental algorithmic strategy used to solve many computationally intense problems. Some of the most recent and notable problems are sequence comparison and alignment, which are critical tools in the analysis of genetic data. While the use of dynamic programming has resulted in many practical algorithms, there still exist "real world" problems that are so large that they cannot be solved in a reasonable amount of time using standard dynamic programming algorithms.

We introduce a new approach for improving the efficiency of dynamic programming algorithms through the use of machine learning. The goal of the approach is to identify relationships between input properties and algorithm behavior, and to use these relationships to improve existing algorithms. We investigate a technique for automatically generating more efficient algorithms tailored for specific input sources and a technique for clustering input based on "solution similarity." We combine these techniques into an adaptive framework, which classifies new input based on expected algorithmic behavior and selects the most effective algorithm from a set of evolving algorithms.

The approach is implemented for the longest common subsequence and local sequence alignment problems. The solutions to these two problems are used to measure and represent similarity between two or more sequences. Practical problems include the comparison of DNA sequences, event sequences, and human-generated text. We investigate the effects of our learning approach on the quality of the solutions and the running time.

While some of the implementation details are problem dependent, our approach is modular and generic so that it can be applied to a variety of different dynamic programming problems with only slight modifications. The goal of this thesis is to demonstrate the practical application of machine learning in the enhancement of dynamic programming algorithms and to demonstrate the feasibility of automated design in heuristic, domain-specific algorithm development.

Return to main PhD Theses page