CompOrg Spring 2004 Lab

Lab #11 - Memory Performance and Cache-Friendly-Code

4/21/2004

All the files for this lab are located on monte in ~hollingd/public/lab11.


The Matrix: Transpose Function Timing

The C program below creates 2 100x100 matrices (of int) and then calls the function transpose. The function transpose puts the transpose of 2-d matrix a into matrix b. The sample main calls tranpose only once, and does not initialize either matrix.

transpose.c
#include <stdio.h> 
#define SIZE 100
typedef int mtx[SIZE][SIZE];

/* put the transpose of matrix a in b */

void transpose(mtx a, mtx b ) {
  int i,j,k;

  for (i=0;i<SIZE;i++) {
    for (j=0;j<SIZE;j++) {
      b[j][i] = a[i][j];
    }
  }
}

int main(int argc, char **argv) {
  mtx x,y;

  transpose(x,y);

  return(0);
}


You need to determine what effect changing the loop nesting int he transpose function will have on the memory performance using time or gprof. There are a number of issues you need to consider (and evaluate with some experiments):


The Matrix Reloaded: Matrix Multiplication

The code shown below does multiplication of two 2-d arrays of int. Each array is the same size (100x100).

mm.c
 
#define SIZE 100
typedef int mtx[SIZE][SIZE];

/* matrix multiplication. c=ab
   assumes that all of the matrices are the same
   size (and square).
*/
void matrix_mult(mtx a, mtx b, mtx c ) {
  int i,j,k;
  /* initialize c to all zeros */
  for (i=0;i<SIZE;i++) 
    for (j=0;j<SIZE;j++) 
      c[i][j]=0;

  /* do the multiplication */

  for (i=0;i<SIZE;i++) {
    for (j=0;j<SIZE;j++) {
      for (k=0;k<SIZE;k++) {
	c[i][j] += a[i][k]*b[k][j];
      }
    }
  }
}


int main(int argc, char **argv) {
  int i;
  mtx x,y,z;

  matrix_mult(x,y,z);

  return(0);
}


You need to play with the order of loops again, determining what effect this has on performance. There are additional complications!:


Revolutions: Most dramatic performance contrast you can come up with.

Write a program that you can has the property that when a relatively simple change is made to the program (that does not effect what the program computes, but only changes the order of memory accesses) the run time can change significantly. The general idea is to see how much of a difference we can make in the overall performance of a program just by change memory access patterns.

Note that this exercise is worth doing for a couple of reasons:

  • You might learn something!
  • You might be asked about this on the final exam!