OpSys Spring 2005 - HW2

HW2 - Unix and Windows diskusage commands

Due Date: Sunday, Feb 13th, 11:55PM

Submit to WebCT drop box labeled HW2

Late Policy

- diskusage command for Unix and Windows (2 programs)

The objectives of this assignment are:

  1. Get used to making system calls in both Unix and Windows.
  2. Become familiar with the Windows (win32) API.
  3. Learn how to deal with directories and find out information about files.

You are to write two programs, one that runs on the CS FreeBSD machines and the other that will run on a PC running Windows. The Windows program will be graded (run) on a machine running Windows XP, but if you develop on something running Windows 98 or 2000 it will work fine under XP.

Both programs will print out the size of any files or directories listed on the command line. For the Unix version, your program should match the numbers reported by the du command; this is the actual disk usage, including the space used by the directory file itself. For the Windows version you just need to compute the file sizes (NOT DISKUSAGE, and you DO NOT INCLUDE the size of a directory file). Both need to work recursively, so they report on an entire hierarchy: given a directory name, compute the space used by everything in the directory. Here is an example of running the program:

> ./diskusage build comporg opsys .bashrc
807424  build
1659904 comporg
389120  opsys
4096    .bashrc

In the above example the directory build and its contents are using up 807424 bytes, and the file .bashrc is using 4096.

There are many ways to write this program, but in general the simplest approach is to write a recursive function that can determine the disk usage of a file or directory. In the case of a directory, the function calls itself recursively on each file/directory contained in the directory.

- Unix system calls and library functions

The Unix program will report on the number of bytes of disk space used by a file or by a directory and everything contained in the directory. Your program can receive any number of command line arguments, each of which is the name of a file or directory. Your program should compute the total disk space used by the file or directory (and its contents) and print a single line that includes the number of bytes followed by the file/directory name.

NOTE: Your command must include the size of a directory itself when dealing with directories. You can verify that your program reports the correct sizes by comparing to the output of the Unix du command:

>  du -s -B 1 build comporg opsys .bashrc
807424  build
1659904 comporg
389120  opsys
4096    .bashrc

In the above example, the Unix du command is passed some options that make it act the way your program is supposed to act. You don't need to support any options at all (everything on the command line is assumed to be a file or directory name).

To get information about a specific file or directory in Unix you can use the stat() system call. You can view an on-line version of the man page here: freebsd stat(2) man page

To read a directory you need to use opendir and readdir, both are documented here: freebsd opendir(3) man page.

NOTE: readdir returns a pointer to a struct dirent, which may contain many fields, although the POSIX standard indicates that you should only use the field named d_name, which is a null-terminated string containing a file name.

- Windows (win32 api) system calls and library functions

IMPORTANT! As it turns out, it is very difficult to write a program that computes disk usage on a Windows machine with an NTFS file system (any file system that can handle sizes > 2GB). The issue is finding out the block size (called a cluster in the Windows world) so you can round up file sizes (you don't have something like stat that will just hand you # blocks and blocksize...). SO - don't worry about reporting actual disk usage; you should report file sizes without regard to how much disk space is actually used. For reporting the size of a directory, simply add up the sizes of the files held in the directory (and within any subdirectories), and don't worry about the size of the directory file itself... (you do have to handle this in the Unix version!).

Your Windows program will be compiled using gcc and run in a cygwin shell. It is expected that you can test your code under cygwin (feel free to develop using Visual Studio, but make sure it compiles and runs under cygwin). Most RPI laptops (last 2 years at least) have cygwin installed. If you need to install cygwin yourself, you can get everything (programs and documentation) at www.cygwin.com. The download process involves getting and running an installation program in which you can pick and choose what you want to install. You need to tell it you want gcc!

When programming in cygwin, you can use any Windows editor to edit your files, or you can get a cygwin version of emacs/vi/pico and edit within the cygwin shell window. The gcc command line is just like when running on the FreeBSD machines, and other utilities (like Make) also work the same.

The win32 API includes many functions for messing with files and directories - below are some that you may find useful:

NOTE: We don't care whether you support big files or not. Feel free to assume any file or directory is no larger than 2^32 bytes. The WIN32_FIND_DATA structure supports file sizes up to 2^64 bytes; you can ignore the upper 32 bits (nFileSizeHigh).

- Project Requirements

The following are the requirements for the project:

Grading: Each of the programs is worth 1/2 of the grade. Each program will be graded according to the formula below. Note that to get full credit we must be able to understand your code (it must be commented!).

30% Proper output. For Unix, computes diskusage correctly (matches the du command). For Windows, correctly computes file and/or directory sizes.
10% Impossible to crash the program. Basically you need to make sure there is nothing we can do (there are no command line arguments) that can cause your program to crash (for example, SEGV). Check the return value of all system calls (check for errors)! Keep track of dynamically allocated memory (and free anything that is not needed).
10% Code quality (comments, organization, how hard is it to understand?).

You can get partial credit for any part.

If your code does not compile and run under FreeBSD/Cygwin, you will lose at least 1/2 the relevant points (partial credit will be awarded based on visual inspection of the code).

- How to Submit

Log in to WebCT at webct.rpi.edu using your RCS id and password. Once you get to MyWebCT click on "Operating Systems", and from there go to the homework drop boxes. Submit your files (individually, zipped or tarred) to the drop box labeled HW2. IMPORTANT: make sure you submit both programs!

- Resources