Research Interests

My broad research interests are in the areas of software engineering, programming languages and compilers. In particular, I am interested in static and dynamic program analysis for the purposes of testing, understanding, verification and maintenance of complex software systems. I am especially interested in building tools for object-oriented software that improve software productivity and software quality.

Static and Dynamic Program Analysis

Static program analysis is a technique that analyzes the source code of the program and reasons about its run-time behavior. Traditionally, program analysis has been applied in optimizing compilers. For example, such analysis may examine the program, identify expressions that have the same value, and rewrite the code to remove redundant computations; this usually makes the program run faster. I am especially interested in the applications of static program analysis in software productivity and software quality tools. Tools can use analysis to (i) reverse engineer and compare code with design documents, (ii) compute precisely what needs to be executed during testing in order to gain confidence that the software is well-tested, (iii) identify the code in a software system that is related to a change and needs to be reexamined or retested, (iv) statically check the program and identify bugs and security flaws, etc. Such software tools save valuable time for developers and testers and improve the overall quality of the software. (For example, Microsoft uses the PREfix tool which is based on advanced program analysis techniques, to check all its software for memory leaks and null pointer references.).

There are two important requirements for a static analysis to be successfully applied in software tools. First, the analysis needs to be precise, i.e., it should report as little spurious information not corresponding to any runtime execution, as possible. For example, many existing tools use static analysis to detect errors in production-strength software systems before deployment. If the underlying static analysis is imprecise, the tool outputs spurious warnings; this leads to a lot of wasted time of the developers who need to examine such spurious warnings. Second, the analysis needs to have practical cost in order to be successfully applied in interactive software tools. However, there is always a trade-off between the precision and the cost of the static analysis.

In previous projects I have worked on techniques for implementing relatively precise analyses at practical cost [OOPSLA'01,ISSTA'02,PhD03]. In particular, I have studied the important problem of pointer and reference analysis for Java [OOPSLA'01,ISSTA'02]. Pointer analysis has a wide variety of uses in software tools and optimizing compilers: it can be used for call graph construction, virtual call resolution, side-effect analysis, etc. In another project [ASE'03,SCAM'02] I have worked on pointer analysis for C for the purposes of call graph construction. This analysis produces precise call graph results while in the same time it has practical cost which makes it suitable for application in industrial-strength software understanding tools.

Testing and Understanding of Software

Testing is a very important part of software development. Static analysis can be used in software testing tools to gain confidence that substantial parts of the code have been executed by the test suites. Such tools lead to higher quality test suites and ultimately to improved software quality and developer productivity.

The work in [TSE'04,ICSE'03] describes a tool for testing of polymorphism in Java software. The first problem addressed by this work is the computation of precise information about the possible bindings at polymorphic calls; since dynamic binding is one of the most significant causes of faults in object-oriented software, in order to gain confidence that the software is well-tested, it is important that all possible bindings are covered by the test suite. For whole programs (programs for which the entire program is available during analysis) the possible bindings at polymorphic calls can be computed using a relatively precise pointer or reference analysis. However, testing is rarely done on whole programs. Therefore, the second problem addressed in [TSE'04,ICSE'03] is how to adapt the whole-program analysis from [OOPSLA'01] and other related whole-program analyses to work on partial programs. In another project [ICSM'02] I worked on techniques for identifying interclass dependences. This information can be applied in regression testing (retesting of classes affected by a change) and integration testing (testing of the interactions between different classes). Most recent work [ISSTA'04] describes a tool for testing web software for robustness. The tool uses relatively precise static analysis to direct a fault injection engine and to ensure that the recovery code for certain operating systems faults is well tested.

In the presence of complex software systems it is necessary to build tools for program understanding. Such tools are necessary to understand the structure of the software, inspect the software to determine the impact of a given change, etc. I am especially interested in building program understanding tools for object-oriented software. In an earlier project described in [ICSM'03] precise static analysis was used to determine interclass dependences in Java software. Object-oriented software presents many challenges for research in this direction.