Phase 3
Goal
The goal of this phase of the project is to build a simple
infrastructure for measuring the coverage of call graph edges and nodes. Using
this infrastructure, we can determine how many methods and edges were actually
exercised when the program was executed. Similar kinds of measurements are the
basis for coverage tools used to evaluate the quality of test suites: a
test suite that achieves low coverage is inadequate and has to be enhanced
appropriately.
Overall Structure
The infrastructure consists of four distinct stages:
- Stage 1: Generate JIMPLE
- Stage 2: Run CHA and produce files that describe the nodes and edges in
the call graph. This stage is very similar to what you did in the last phase
of the project; there are some minor differences, as described below.
- Stage 3: Use Soot to create instrumented Java bytecode. The input to this
stage is the set of CHA-computed files from stage 2; the output is valid
instrumented Java bytecode for the data programs (p1, p2, etc.).
- Stage 4: Run the instrumented code and gather coverage measurements for
each data program. The input is the instrumented Java bytecode; the output
consists of several files describing the methods and edges NOT covered during
program execution.
Stage 1
JIMPLE generation is done much as before. This needs to be
done only once per program; there is no need to create JIMPLE multiple times.
Stage 2
This is essentially the same as the previous phase of the
project. We will use my own implementation of CHA, just to make sure that
everyone starts from the same point. (Note: If you couldn't figure out something
in phase 2, check my code for CHA.) You do not need to read or write any code
for this stage. The only difference from phase 2 is the output of CHA. There are
three new things:
- Each reachable method is assigned a unique integer id. These ids are shown
in files rmethods and calls. The ids are needed to simplify the
instrumentation. IMPORTANT: Different invocations of CHA may assign different
ids to the same method: in one invocation of runCHA, method m may be
assigned id 5, and in the next invocation of runCHA the id may be 3. All
subsequent processing in stages 3 and 4 depends on these ids (as recorded in
file rmethods). Normally, this is not an issue because you need to
invoke runCHA only once. However, if you ever rerun stage 2 for some reason,
you must also rerun stages 3 and 4.
- Each call site within a reachable method is assigned a string id of the
form "x_y". Here x is the id of the enclosing method, and y is the index of
the call site within the method. For example, if method m has id 7, the 2nd
call site in m has id "7_2". The ids are shown in file calls.
- CHA produces a new output file edges. The file lists all edges of
the call graph, in the form "call_site_id,target_method_id".
There is no need to run CHA more than once per program.
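The id formats described above can be illustrated with a small parser. This is only a sketch: it assumes that each line of edges has exactly the form "x_y,target" (apart from surrounding whitespace), and the class EdgeRecord and its members are hypothetical names that are not part of the provided code.

```java
// Hypothetical helper for illustration; not part of the provided project code.
final class EdgeRecord {
    final int callerId;   // x in "x_y": id of the method that encloses the call site
    final int siteIndex;  // y in "x_y": index of the call site within that method
    final int targetId;   // id of the possible target method

    EdgeRecord(int callerId, int siteIndex, int targetId) {
        this.callerId = callerId;
        this.siteIndex = siteIndex;
        this.targetId = targetId;
    }

    // Parses one line of the assumed form "x_y,target", e.g. "7_2,3".
    static EdgeRecord parse(String line) {
        String[] parts = line.trim().split(",");
        if (parts.length != 2) {
            throw new IllegalArgumentException("Malformed edge line: " + line);
        }
        String[] site = parts[0].split("_");
        if (site.length != 2) {
            throw new IllegalArgumentException("Malformed call site id: " + parts[0]);
        }
        return new EdgeRecord(Integer.parseInt(site[0]),
                              Integer.parseInt(site[1]),
                              Integer.parseInt(parts[1]));
    }

    // Rebuilds the string id of the call site, e.g. "7_2".
    String callSiteId() {
        return callerId + "_" + siteIndex;
    }
}
```

For example, EdgeRecord.parse("7_2,3") describes an edge from the 2nd call site of method 7 to method 3.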
Stage 3
This stage is invoked by addInstrumentation inside script "run".
It uses the functionality provided by Soot to "tweak" the JIMPLE and to create
Java bytecode from it. If, for example, stage 3 is executed for program p3, the
input is file p3/rmethods, and the output is stored in directory
p3/CLASSES. CLASSES contains the instrumented bytecode (e.g. X.class,
Y.class, etc.), as well as the JIMPLE code for this instrumented bytecode (e.g.
X.jimple, Y.jimple, etc.). The instrumented bytecode is executed during stage 4.
The generated JIMPLE is for debugging purposes only (to see if the
instrumentation is inserted correctly); it has no other uses.
The basic idea of the instrumentation is the following: inside each reachable
method, additional JIMPLE statements are inserted. All these statements are
calls to methods in class RuntimeTracker, the class that collects the run-time
information during stage 4. There are two important methods in this
class. Method methodEntry in RuntimeTracker is invoked every time the
execution "enters" some method; the id of the entered method is used as a
parameter of methodEntry. Method beforeCall in RuntimeTracker is
invoked immediately before a call site; it takes as a parameter the string id of
the call site. As an example, suppose that we have a method with id 2 whose body
looks like this:
r0 := @this: A;
specialinvoke r0.<java.lang.Object: void <init>()>();
return;
After instrumentation is inserted, the body of the method should look like
this:
r0 := @this: A;
staticinvoke <RuntimeTracker: void methodEntry(int)>(2);
staticinvoke <RuntimeTracker: void beforeCall(java.lang.String)>("2_1");
specialinvoke r0.<java.lang.Object: void <init>()>();
return;
The instrumentation is inserted by classes Instrumenter and MyTransformer.
You will have to add some code to MyTransformer: at present, it inserts
calls to methodEntry, but not to beforeCall.
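To see why the two kinds of inserted calls are enough to measure both node and edge coverage, here is one possible way the run-time side could pair them up. This is a sketch under stated assumptions, not the provided RuntimeTracker: the class name CoverageSketch, its fields, and its accessors are hypothetical, and instance methods are used instead of static ones only to keep the example self-contained. It assumes single-threaded execution and ignores complications such as exceptions and implicitly invoked methods (e.g. static initializers).

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical stand-in for the bookkeeping done at run time; all names are assumptions.
final class CoverageSketch {
    private final Set<Integer> coveredMethods = new HashSet<>();
    private final Set<String> coveredEdges = new HashSet<>();
    // Call site announced by the most recent beforeCall and not yet matched to a callee.
    private String pendingCallSite;

    // Corresponds to the call inserted immediately before a call site.
    void beforeCall(String callSiteId) {
        pendingCallSite = callSiteId;
    }

    // Corresponds to the call inserted at the start of each reachable method.
    void methodEntry(int methodId) {
        coveredMethods.add(methodId);
        if (pendingCallSite != null) {
            // Record the edge in the same "call_site_id,target_method_id" form as file edges.
            coveredEdges.add(pendingCallSite + "," + methodId);
            pendingCallSite = null;
        }
    }

    Set<Integer> coveredMethods() {
        return coveredMethods;
    }

    Set<String> coveredEdges() {
        return coveredEdges;
    }
}
```

If the callee is not instrumented (for example, a library method), no methodEntry follows the beforeCall, and the pending call site is simply overwritten by the next beforeCall. A recorded pair that does not appear in file edges can be ignored when the results are reported.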
Stage 4
This stage runs the instrumented bytecode. This is done using
runInstrumented inside "run". The execution is done through a wrapper around the
main class of the executed program. The wrapper calls RuntimeTracker.start,
invokes main, and then calls RuntimeTracker.end. Method start in
RuntimeTracker can be used to read info about the CHA-computed call graph (in
particular, files rmethods and edges). Method end writes
the coverage results to disk, in the same directory as the files produced by
CHA.
There are two output files written by end. File nmethods lists
all methods from rmethods that were never executed (i.e. not-covered
methods). Each line in the file should have the same format as in
rmethods, except for the last line. For example, the file may look
like this:
3: < C: void m()>
5: < C: A n()>
7: < A: void m()>
Not covered: 3 out of 9 [33%]
File nedges lists all edges from edges that were never covered.
Each line in the file should have the same format as in edges, except
for the last line. For example, the file may look like this:
1_4,7
1_4,3
1_2,3
1_3,5
4_2,5
Not covered: 5 out of 13 [38%]
At present, the code inside RuntimeTracker creates empty files
nmethods and nedges. You will have to add code to output the
appropriate information in these files. Make sure that the output format is
exactly the same as in the examples above: each line except the last one
should be exactly the same as the corresponding line from rmethods or
edges. The last line should also follow the format shown above. (Note: If
you don't have experience with file IO in Java, have a look at classes
Instrumenter and ChaWriter for some examples.)
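The output format can be sketched as follows. This is a minimal illustration, not the required implementation: the class NotCoveredReport and its methods are hypothetical, the lines of rmethods and edges are assumed to be already loaded into lists, each line of rmethods is assumed to start with the method id followed by a colon, and the percentage is assumed to be truncated to an integer (truncation and rounding both agree with the two examples above, so the intended rule is not certain).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; all names are assumptions.
final class NotCoveredReport {

    // Keeps every rmethods line whose id was never entered, then appends the summary line.
    static List<String> forMethods(List<String> rmethodsLines, Set<Integer> coveredIds) {
        List<String> out = new ArrayList<>();
        for (String line : rmethodsLines) {
            // Assumed line shape: "<id>: <signature>", e.g. "3: < C: void m()>".
            int id = Integer.parseInt(line.substring(0, line.indexOf(':')).trim());
            if (!coveredIds.contains(id)) {
                out.add(line); // copied unchanged from rmethods
            }
        }
        out.add(summary(out.size(), rmethodsLines.size()));
        return out;
    }

    // Keeps every edges line that was never covered, then appends the summary line.
    static List<String> forEdges(List<String> edgesLines, Set<String> coveredEdges) {
        List<String> out = new ArrayList<>();
        for (String line : edgesLines) {
            if (!coveredEdges.contains(line)) {
                out.add(line); // copied unchanged from edges
            }
        }
        out.add(summary(out.size(), edgesLines.size()));
        return out;
    }

    // Builds the last line, e.g. "Not covered: 3 out of 9 [33%]".
    static String summary(int notCovered, int total) {
        int percent = (total == 0) ? 0 : (100 * notCovered) / total; // integer division truncates
        return "Not covered: " + notCovered + " out of " + total + " [" + percent + "%]";
    }
}
```

Each returned list holds the lines of the corresponding output file in order; writing them to nmethods or nedges is then a matter of ordinary file IO.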