| CompOrg Spring 2004 Lab |
|   CompOrg Home   |   cnt.c   |   ops.c   |   substring |
All the files for this lab are located on monte in
~hollingd/public/lab10.
The C program shown below calls the function
count_to_n with a count specified as a command line
parameter. You also specify the number of times that
count_to_n is called, for example, to run the program
and tell it to call count_to_n(10000000) 100 times:
> ./cnt 10000000 100
The program does not generate any output.
| cnt.c |
#include <stdio.h>
int count_to_n(int n) {
int i;
int tot;
for (i=0;i<n;i++) {
tot++;
}
return(tot);
}
int main(int argc, char **argv) {
int n,times;
int i;
if (argc<3) {
printf("Usage: cnt n number_of_times\n");
exit(1);
}
n = atoi(argv[1]);
times = atoi(argv[2]);
for (i=0;i<times;i++) {
count_to_n(n);
}
return(1);
}
|
You will use the time command to measure how long the
program runs. The time command will run a program for you
and keep track of how long it takes the program to run. The
time command will output three times:
real: This is the elapsed time (wall clock time). This value depends on what else is happening on the machine (if other people are also running programs you don't get all the CPU cycles for your program). This value is not what we are interested in today.
user: This is the amount of time spent processing your program (the time spent with the CPU executing your code). This is what we want to measure today!
system: This is the amount of time the system spends in service of your program. For example, if your code makes system calls (creating processes, allocating pages of memory, I/O, etc) this time indicates how much time the kernel spends processing your system calls. We won't deal with system calls (we can ignore this time).
Below is sample use of the time command on the above
program:
> gcc -o cnt cnt.c > time ./cnt 10000000 100 real 0m2.598s user 0m2.590s sys 0m0.010s
As shown, this program took 2.59 seconds when running with the parameters shown (counts to 10,000,000 100 times).
Your job is to make some measurements of this program using the time command, then apply some optimizations and see how fast you can get it to run.
Try modifying the command line parameters to the cnt
program to determine what the overhead of the function call
contributes to the total execution time. For example, if you switch
the parameters the program will count to 100 10,000,000 times (same
total number of counts, but lots more subroutine calls).
The function count_to_n doesn't do much, but your jobs
is to speed it up as much as possible by changing the C code
(using compiler optimizations not allowed!). Use the parameters
10000000 100 to make your measurements.
Get serious! Generate assembly code (gcc -S
cnt.c), see what is happening at the assembly level, and now
see how fast you can get the program by modifying the assembly code
for count_to_n. Your code must
actually have the loop (and do all the actual work involved, you can't
just return n)
Compile the original source with optimizations turned on:
gcc -O3 -o cnt cnt.c. Run (and time) this version.
Go back to previous step and try harder! (or look at the code
generated by gcc when optimization is turned on, and yell "it
cheats!").
Use the time command to make some measurements that can be used to compare the relative time of various integer and floating point operations. Below is some code to get you started (it supports only integer addition and subtraction). You need to make measurements of the following operations:
Your results should be relative (how much slower is division that addition), absolute times are not important. Don't trust any timings less than 1 second (modify parameters to make sure the program takes at least one second to run).
| ops.c |
#include <stdio.h>
int main(int argc, char **argv) {
int times,i;
int a,b,c;
char op;
if (argc<3) {
printf("Usage: ops operation number_of_times\n");
printf(" valid operations: + - \n");
exit(1);
}
op = argv[1][0];
times = atoi(argv[2]);
switch(op) {
case '+':
for (i=0;i<times;i++) a = b + c; break;
case '-':
for (i=0;i<times;i++) a = b - c; break;
default:
printf("invalid operation. Must be + or - \n");
}
return(1);
}
|
This exercise involves using gprof to profile
a program (so that you can see where the program can be
improved). Profiling using gprof requires that you use a special
option when compiling (-pg), this includes profiling code
in your application that will create a data file named
gmon.out when you run the program. gmon.out
includes information about how many times each function got called,
and how much time was spent in each function. The gprof
program presents this information in a human readable format.
You can find complete documentation on using gprof at
gnu.org
gprof documentation. Below is shown a sample session to get you
started.
> gcc -pg -o cnt cnt.c
> ./cnt 1000000 100
> gprof -b -T cnt
granularity: each sample hit covers 4 byte(s) for 4.35% of 0.23
seconds
called/total parents
index %time self descendants called+self name index
called/total children
0.23 0.00 100/100 main [2]
[1] 100.0 0.23 0.00 100 count_to_n [1]
-----------------------------------------------
<spontaneous>
[2] 100.0 0.00 0.23 main [2]
0.23 0.00 100/100 count_to_n [1]
-----------------------------------------------
granularity: each sample hit covers 4 byte(s) for 4.35% of 0.23
seconds
% cumulative self self total
time seconds seconds calls ms/call ms/call name
100.0 0.23 0.23 100 2.30 2.30 count_to_n [1]
Index by function name
[1] count_to_n
|
Your job is to profile the code shown below that searches a string for a substring. There are a things you need to know before you can get this to work:
profiling using gprof is based on statistical sampling (interval counting), and is very inaccurate for programs that don't take long (and functions that are rarely called). You need to modify the main program below to make profiling possible - basically you need the function substring to be called many times to get reasonable statistics.
Notice that the function substring might take a long time (if the substring is never found) or take very little time (if the substring is found early on). I'd suggest giving substring large strings (both string and substring) and that it not find the substring (or not find it real fast).
There are some obvious optimizations that can speed up the substring function, but the idea is to first profile the code and see for yourself that gprof can provide useful information!
You want the profiling information to include statistics about
libray functions like strlen and strncmp!
This won't happen unless you do something special, the C library is
(by default) a dynamic library and you won't see gprof mention these
functions. The solution to this is to build an executable that does
not rely on dynamic libraries, but instead has all the code it needs
(so a copy of strlen and strcmp are actually included in the
executable). You can enable this by passing gcc the
-static command line option:
gcc -static -pg -o substring substring.c
The gprof command has lots of options that control
the output, I'd suggest using gprof -b -T substring to
get a useful summary (a flat summary).
Here is the code you need to profile:
| substring.c |
#include <stdio.h>
/* substring locates a substring sub in a string s.
If the substring is not found, it returns -1.
If the substring is found, it returns the index in s
where the substring is found.
strncmp(s1,s2,n) compares the first n characters of two strings
and returns 0 if they match.
*/
int substring( char *s, char *sub) {
int i;
for (i=0;i<strlen(s);i++) {
if (strncmp(s+i,sub,strlen(sub))==0)
return(i);
}
return(-1);
}
int main(int argc, char **argv) {
int pos;
if (argc<3) {
printf("Usage: substring string sub\n");
exit(1);
}
printf("Looking for %s in %s\n",argv[2],argv[1]);
pos = substring(argv[1],argv[2]);
if (pos==-1)
printf("Not found\n");
else
printf("found at position %d\n",pos);
return(1);
}
|
Once you have profiling working, start trying to optimize the substring function and see how much faster you can make it for your test case(s). Profile the new code and find out where the time is spent as well!