| CompOrg Spring 2004 Lab |
|   CompOrg Home   |   Matrix Transpose   |   Matrix Multiplication   |   Dramatic performance issues |
All the files for this lab are located on monte in
~hollingd/public/lab11.
|
The C program below creates 2 100x100 matrices (of int) and then calls the
function
|
|
You need to determine what effect changing the loop nesting int he
transpose function will have on the memory performance using
time or gprof. There are a number of issues
you need to consider (and evaluate with some experiments):
The memory access pattern determined by the order of the loops in
transpose will alter the stride used in access to
the array a.
The loop order will also determine the stride used to access
the array b!
Each time in the inner most loop of transpose the array
a is read from memory (or the cache), and the array b
is written with a new value. Part of what you need to determine is
whether the write operations or read operations are more significant
given the cache architecture of the machine you are running on. This
is to be determined by running experiments!
Obviously the code in transpose needs to be called many times, once is not enough to get any kind of reliable measurement.
The size of the arrays (the constant SIZE) can have an effect on memory performance (if the arrays can fit in the cache and we run the function many times we should have a very low miss rate).
|
The code shown below does multiplication of two 2-d arrays of int. Each array is the same size (100x100).
|
|
You need to play with the order of loops again, determining what effect this has on performance. There are additional complications!:
The memory access pattern for arrays a,
b and c are all effected! Note that
c is both read and written each time
in the innermost loop!
This function is much slower than transpose, as the runtime of
this function depends on SIZE3, where the run time of the
transpose function depends on SIZE2.
|
Write a program that you can has the property that when a relatively simple change is made to the program (that does not effect what the program computes, but only changes the order of memory accesses) the run time can change significantly. The general idea is to see how much of a difference we can make in the overall performance of a program just by change memory access patterns. Note that this exercise is worth doing for a couple of reasons:
|
|
|   |