Phase 4

Goal

In this phase we will use the infrastructure from phase3 to measure the coverage achieved by real-world test cases on real Java code. The code under test comes from the standard Java libraries (packages java.util.zip and java.text). The test cases for this code are part of the Mauve project, an open-source collaborative effort to write a free test suite for the standard Java libraries. Our goal is to evaluate the call graph coverage achieved by the Mauve test suites. These results may be used to improve the Mauve tests to achieve better coverage, and the improved tests may be contributed back to the Mauve project.

Overall Structure

The infrastructure controlled by the "run" script consists of five stages.

Stage 1

The starting point for the experiments is a set of "components under test" (CUTs). A CUT is a set of Java classes that is not a complete program. Each CUT is tested by a test driver that executes one or more test cases. The driver uses a test harness (from Mauve) to provide the infrastructure for executing the tests. Each CUT is represented by a separate subdirectory of project/phase4. Four CUTs are used for the experiments: sample, gzip, zip, and collator. "sample" is a simple CUT used as an example (based on program p5). "gzip" and "zip" are two components from java.util.zip that implement functionality related to I/O of ".gz" and ".zip" files. "collator" is a component from java.text with functionality related to text collation. Each directory corresponding to a component contains several files.

Stage 2

JIMPLE generation is done as in the previous phase. It needs to be done only once per program; there is no need to regenerate JIMPLE unless you change the tests or the tested code.

Stage 3

This stage runs CHA and is essentially the same as the previous phase of the project. The test driver (Main), the test cases (e.g. MyTest) and the CUT classes are analyzed together as a whole program. In addition to the files rmethods, etc. from the previous phase, CHA creates three files related to the CUT:

File rmethods.cut

Contains all CHA-reachable methods that are defined in CUT classes. This is a subset of rmethods, and uses the same method ids. For example, for component "sample", the file may look like this:

9: <sample.B: void <init>()>
10: <sample.B: void m()>
11: <sample.A: sample.A n()>
13: <sample.A: void <init>()>

Note: keep in mind that for you the actual ids may be different, since different invocations of CHA may produce different method ids.

File nmethods.cut

This is the complement of rmethods.cut - the set of all CUT methods that are not CHA-reachable from the main method in the test driver. A method from this set will never be executed by the test cases. A large number of such methods may indicate a problem with the tests. (The last line in the file summarizes the percentage of CHA-unreachable methods.) For example, for "sample" the file is

<sample.A: void m()>
CUT methods: total: 5, unreachable: 1 [20%]

File edges.cut

Contains call graph edges for which both the caller and the callee are CUT methods. Of course, this is a subset of "edges". Unlike in phase3, here we distinguish between different receiver classes for virtualinvoke and interfaceinvoke call sites. (For staticinvoke and specialinvoke, the edges are the same as in phase3.) For example, consider a call site with id "34_56" for which there are three possible classes of the receiver: A, B, and C. For A and B, the target method is "21"; for C, the target method is "89". Then edges.cut will list three separate edges: (34_56,21,A), (34_56,21,B), and (34_56,89,C). In other words, these are call graph edges annotated (labeled) with the class of the receiver. Our coverage measurements will be with respect to these annotated edges. For "sample", the file may look like this:

9_1,13
10_1,13
10_2,11,sample.B
10_2,11,sample.A
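
To make the line layout concrete, here is a minimal sketch of how one line of edges.cut could be split into its parts, assuming the comma-separated format shown above (two fields for staticinvoke and specialinvoke edges, three fields for annotated edges). The class AnnotatedEdge and its parse method are hypothetical names used only for illustration; they are not part of the provided infrastructure.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of the provided infrastructure.
final class AnnotatedEdge {
    final String callSite;      // call site id, e.g. "10_2"
    final String target;        // id of the target method, e.g. "11"
    final String receiverClass; // receiver class, or null for staticinvoke/specialinvoke edges

    AnnotatedEdge(String callSite, String target, String receiverClass) {
        this.callSite = callSite;
        this.target = target;
        this.receiverClass = receiverClass;
    }

    // Splits one line of edges.cut; assumes the fields are separated by commas.
    static AnnotatedEdge parse(String line) {
        String[] parts = line.trim().split(",");
        if (parts.length < 2 || parts.length > 3) {
            throw new IllegalArgumentException("Unexpected edge line: " + line);
        }
        return new AnnotatedEdge(parts[0], parts[1], parts.length == 3 ? parts[2] : null);
    }

    public static void main(String[] args) {
        // The four lines of the "sample" example above.
        List<String> lines = List.of("9_1,13", "10_1,13", "10_2,11,sample.B", "10_2,11,sample.A");
        for (String line : lines) {
            AnnotatedEdge e = parse(line);
            String label = (e.receiverClass == null) ? "" : " [receiver " + e.receiverClass + "]";
            System.out.println(e.callSite + " -> " + e.target + label);
        }
    }
}
```

Note that the two edges for call site 10_2 have the same target (11) and differ only in the receiver class, so they must be tracked separately.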

Stage 4

This stage is invoked by addInstrumentation inside script "run". The only difference from phase3 is the addition of another form of "beforeCall()" instrumentation. In phase3, beforeCall had a single parameter: the string id of the call site. In phase4, this version of beforeCall will only be used for staticinvoke and specialinvoke call sites (which by definition have only one possible target method). However, for potentially polymorphic call sites (virtualinvoke and interfaceinvoke), we need not only the call site id, but also the class of the receiver object, to keep track of the coverage of annotated call graph edges. So, in RuntimeTracker we will have a new method "beforeCall(java.lang.String,java.lang.Object)", where the second parameter is a reference to the receiver object at the call site. MyTransformer will insert the appropriate call to one of the two beforeCall methods, based on the kind of call site.

Note: instrumentation will be inserted only in methods listed in rmethods.cut.

Stage 5

This stage runs the test driver on the instrumented bytecode. As in phase3, we have RuntimeTracker that gathers coverage statistics for edges and methods. Unlike phase3, here we are only interested in coverage for methods from rmethods.cut and edges from edges.cut. The output is files nedges and nmethods with the same format as in phase3. For example, for component "sample", file nedges looks like this:

10_2,11,sample.B
Not covered: 1 out of 4 [25%]

This tells us that of the 4 edges listed in edges.cut, one is not covered.

At present, the code inside RuntimeTracker creates empty files nmethods and nedges. You will have to add code to output the appropriate information in these files. Make sure that the output format is exactly the same as in the examples above: each line except the last one should be exactly the same as the corresponding line from rmethods.cut or edges.cut. The last line should also follow the format shown above.
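
As an illustration of the nedges output format, the following minimal sketch builds the report lines from a list of all edges and a set of covered edges. The class CoverageReport and the method notCoveredLines are hypothetical names, and the use of integer division for the percentage is an assumption; check it against your phase3 output. The sketch prints to standard output, whereas the real code would write to the file, and the summary line for nmethods should follow the wording used in phase3.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the provided infrastructure.
class CoverageReport {
    // Returns the uncovered entries (in their original order, unchanged),
    // followed by one summary line in the format of the nedges example.
    static List<String> notCoveredLines(List<String> all, Set<String> covered) {
        List<String> out = new ArrayList<>();
        for (String entry : all) {
            if (!covered.contains(entry)) {
                out.add(entry);
            }
        }
        int notCovered = out.size();
        // Assumption: the percentage is truncated to an integer.
        int percent = all.isEmpty() ? 0 : (notCovered * 100) / all.size();
        out.add("Not covered: " + notCovered + " out of " + all.size() + " [" + percent + "%]");
        return out;
    }

    public static void main(String[] args) {
        List<String> all = List.of("9_1,13", "10_1,13", "10_2,11,sample.B", "10_2,11,sample.A");
        Set<String> covered = new HashSet<>(List.of("9_1,13", "10_1,13", "10_2,11,sample.A"));
        for (String line : notCoveredLines(all, covered)) {
            System.out.println(line);
        }
    }
}
```

With the "sample" data used in main, this prints the two lines of the nedges example shown earlier.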

Important Observation

Many people run into the same problem in phase4. A tempting way to determine whether a call edge is covered is this: the code in beforeCall remembers the last invoked call site (the call site id, and the receiver class if necessary), and then the code in methodEntry updates the necessary data structure to record that the edge is covered. For example, if edges.cut contains edge (x_y,z), this edge is recorded as covered when methodEntry(z) is called and the last call site was x_y.

Unfortunately, this doesn't work. Executing call site x_y does not guarantee that the next method to be entered is z. There are several reasons for this. For example, the invocation of z may trigger the Java virtual machine to load the class that contains z; this process of loading includes the implicit invocation of method <clinit> for that class. As a result, the method that is invoked immediately after the call site is not z. Another problem arises in multi-threaded programs: if there is thread switching immediately after the call site is executed, method z will not be invoked immediately after x_y. To avoid this problem, you should decide whether an edge is covered by using only the information available in beforeCall.
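
One possible way to follow this advice is sketched below. It relies on the fact that, for a given call site, the receiver class determines the target method, so the pair (call site id, receiver class) identifies exactly one line of edges.cut; for staticinvoke and specialinvoke the call site id alone is enough. The class EdgeCoverageSketch, the methods loadEdges and isCovered, and the edge data in main are all hypothetical; only the two beforeCall signatures come from the description above. This is a sketch under those assumptions, not the required implementation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch; only the two beforeCall signatures come from the handout.
class EdgeCoverageSketch {
    // Lookup key -> original line of edges.cut.
    private static final Map<String, String> edgeByKey = new HashMap<>();
    private static final Set<String> coveredEdges = new HashSet<>();

    // Key is "callSite" for two-field lines and "callSite,receiverClass" for three-field lines.
    static synchronized void loadEdges(List<String> edgeLines) {
        for (String line : edgeLines) {
            String[] p = line.trim().split(",");
            String key = (p.length == 3) ? p[0] + "," + p[2] : p[0];
            edgeByKey.put(key, line);
        }
    }

    // staticinvoke / specialinvoke: the call site has a single possible target.
    static synchronized void beforeCall(String callSiteId) {
        mark(callSiteId);
    }

    // virtualinvoke / interfaceinvoke: the run-time class of the receiver selects the edge.
    static synchronized void beforeCall(String callSiteId, Object receiver) {
        if (receiver == null) {
            return; // the call itself will fail, so no edge is covered
        }
        mark(callSiteId + "," + receiver.getClass().getName());
    }

    // Records the edge as covered if the key corresponds to a line of edges.cut.
    private static void mark(String key) {
        String edge = edgeByKey.get(key);
        if (edge != null) {
            coveredEdges.add(edge);
        }
    }

    static synchronized boolean isCovered(String edgeLine) {
        return coveredEdges.contains(edgeLine);
    }

    public static void main(String[] args) {
        // Made-up edges that use JDK classes as receivers so the demo runs on its own.
        loadEdges(List.of("1_1,5", "1_2,7,java.lang.String", "1_2,8,java.lang.Integer"));
        beforeCall("1_1");
        beforeCall("1_2", "some receiver");
        System.out.println(isCovered("1_1,5"));
        System.out.println(isCovered("1_2,7,java.lang.String"));
        System.out.println(isCovered("1_2,8,java.lang.Integer"));
    }
}
```

Because nothing is deferred to methodEntry, an intervening <clinit> call or a thread switch between the call site and the callee cannot cause an edge to be attributed to the wrong method.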