CS423 -- Operating Systems Design

MP4: Nachos File System

Due date: Friday Monday, December 5 2005 (final hard extension) December 2, 2005 at 10:00 am CST

 

 

0. Update

HANDIN is now READY! Switch to your filesys directory, type ~cs423ug/handin and follow directions.

=================================================\
| CS 423 UG |
| Fall 2005 |
| Welcome to handin |
\===================================================/
-> Type in the Machine Problem number (1, 2, 3, 4): PNumber: 4 
| Current Time: Wed Nov 30 21:26:15 CST 2005 
------------------------------------------------------
andin currently believes that your are: bgdavis
Enter y if this is correct or anything else otherwise
y

-> Your MP has been handed in successfully.

These are the files that you submitted:
.:
total 108
-rw-r--r-- 1 bgdavis stdt 8 Nov 30 21:25 README.txt
-rw-r--r-- 1 bgdavis stdt 6232 Nov 23 2004 directory.cc
-rw-r--r-- 1 bgdavis stdt 3166 Nov 23 2004 directory.h
-rw-r--r-- 1 bgdavis stdt 5021 Nov 23 2004 filehdr.cc
-rw-r--r-- 1 bgdavis stdt 2555 Nov 23 2004 filehdr.h
-rw-r--r-- 1 bgdavis stdt 14075 Nov 23 2004 filesys.cc
-rw-r--r-- 1 bgdavis stdt 2426 Nov 23 2004 filesys.h
-rw-r--r-- 1 bgdavis stdt 7407 Nov 23 2004 openfile.cc
-rw-r--r-- 1 bgdavis stdt 2440 Nov 23 2004 openfile.h

______________________________________________________________________

The mp4 has been modified! If you have downloaded it before November 17, 2005 you need to download it again.  Afterward, you will need to copy and paste your source into the new files so that it will work with the test cases.

1. Overview

In this assignment you are asked to implement a file system for Nachos. The given baseline file system is severely limited: it only supports one directory (the root directory); the root directory can only contain 10 files; files do not include a "last modified time,"; and files have a fixed size that is specified when they are created. Files cannot grow any larger than the size specified when they are created. Files cannot be truncated to a smaller size.

You are asked to over come these limitations.

2. Preparation

You solutions must compile and run on the EWS Sun systems. All instructions are intended for the EWS Sun systems i. e., the machines you have been using..

File systems are discussed in Section 5.4 and Chapter 6 of the the Tanenbaum textbook.

As with previous machine problems, given code has been provided. The code can be downloaded from here

The tar program will list the files it is creating. Do not issue the above commands again from the same directory unless you want to overwrite all changes you have made! If you wish to replace individual files, untar the mp4.tar.gz file in a different directory and copy the relevant files to your own source tree. You need all of the files provided in the given code to compile the changes you will be making. In this MP, we will be focusing on the filesys directory in Nachos source code directory tree.

To compile your code, use the following commands: (from the filesys directory)

  1. gmake depend

  2. gmake

Compilation Note: You must use gmake to compile your programs; the regular make program will not work. You may work on other systems, but your code must compile on the CSIL Solaris systems. After you have run gmake depend for the first time, you can just use gmake to compile your programs.

Important Note: For this MP, there is not one solution you are expected to find. Part of this MP is to design your own file system. You may need to radically alter some of the provided C++ classes to correctly implement your file system. This MP may be more time consuming than previous MPs since it involves more than finding places to insert code.

For this MP, you are allowed to change a number files. Depending on your implementation, you may not need to change all of the files. The files you are allowed to change are: directory.{h,cc}, filehdr.{h,cc}, filesys.{h,cc}, and openfile.{h,cc}.

3. Baseline File System

filesys/filehdr.h, filesys/filehdr.cc -- These files maintain routines for managing the disk file header (the counterpart for Unix i-node). The file header is used to locate where on disk the file data is stored. This is implemented as a fixed size table of pointers - each entry in the table points to the disk sector containing that portion of the file data. The table size is chosen so that the file header will be just big enough to fit in one disk sector. This implies that there is space for 126 sector numbers, making the maximum size of a Nachos file equal to about 63K. Unlike in a real system, the filehdr does not keep track of file permissions, ownership, last modification date, etc. You will not be concerned with these limitations in this MP.

filesys/directory.h, filesys/directory.cc -- These routines maintain a directory of file names. The directory is a table of fixed length entries; each entry represents a single file, and contains the file name and the location of the file header on disk. The fixed size of each directory entry means that we have the restriction of a fixed maximum size for file names. In addition, the provided implementation has the restriction that the size of the directory cannot expand. In other words, once all the entries in the directory are used, no more files can be created.

filesys/filesys.h, filesys/filesys.cc -- Routines are provided in these files for initializing (formatting) the Nachos disk, and performing basic file operations.

There are two key data structures in the file system. There is a single "root" directory, listing all of the files in the file system; unlike UNIX, the baseline system does not provide a hierarchical directory structure. In addition, there is a bitmap for allocating disk sectors. Both the root directory and the bitmap are themselves stored as files in the Nachos file system.

filesys/openfile.h, filesys/openfile.cc -- These routines are used by the rest of the Nachos file system implementation to manipulate Nachos files once they have been opened and the file header information brought into kernel memory. In addition, routines are supplied for reading and writing file data to or from kernel memory.  Currently, the file size is fixed during file creation. In particular, the writeAt member function does not write bytes beyond the current file size.

userprog/bitmap.h, userprog/bitmap.cc -- routines for manipulating bitmaps, which will be useful for keeping track of the usage of Nachos disk.

userprog/syscall.h, userprog/exception.cc -- exception.cc is where the code you write will be called from. It includes an interrupt handle that handles all system calls generated by user-level code. exception.cc then dispatches each system call to the appropriate handler.

You are advised to examine the above mentioned files carefully and thoroughly for efficient implementation of this assignment.

For this MP, you may assume that there is only one process executing in the system at any time. Thus, you need not worry about concurrent accesses to the file system by different threads. You also do not to maintain additional data structure about files such as ownership, permissions, etc.

After reading all the files provided in filesys and userprog directories, you may want to try out the baseline file system provided to you.

4. Requirements

4.1. Extensible Files

In the baseline file system, files are limited to a size specified when they are created. Additionally, the maximum size of a file is a little less than 64KB. You are to remove both of these limitations.

You should support a file size as large as 500,000 bytes.

Your code should make sure there are actually enough free sectors on the disk to expand the file and return an error if the request cannot be satisfied. You also need to support file truncation (i.e. files can decrease in size).

You can keep the 9 character limit on file names.

Hint: OpenFile::WriteAt is where you can tell if a file needs to grow. The 64KB limit is due to how the Nachos i-nodes are stored. You have to come up with a new i-node system.

4.2. Extensible Directories

The baseline file system allows only one directory, the root directory. Additionally, the root directory only allows ten entries. You need to remove the ten entry limit. The number of directory entries should only be limited by the amount of space on the disk and the maximum file size.

4.3. Hierarchical Directories

The baseline file system has no concept of a "sub-directory." There is only one root directory. You need to add support for a hierarchical directory system (tree-structured directory). The FileSystem::MkNode and FileSystem::RmNode function are used to create and remove directories respectively. In addition to implementing these functions, you will need to modify the other file systems functions (like FileSystem::Open, FileSystem::Create, FileSystem::Remove) to handle the hierarchical file system. You should also make sure that FileSystem::Remove does not remove directories.

File and directory entries should use the same namespace. So for any directory, a file and a subdirectory should not share the same name.

If the RmNode function is asked to remove a non-empty directory, it should fail.

4.4. Documentation

Please create a short text file named README to describe your file system. This does not need to be extremely long, but it should describe issues such as how directory entries are distinguished from file entries, how additional space for file is allocated when one is extended, and how the i-node system handles files larger than 64KB.

5. Testing

Your code need to pass six tests.

A test case can be run by running these steps from the filesys directory:

  1. nachos -f
  2. nachos -t# (For example, it is nachos -t1 for test1)

Step 1 formats the disk; you should do this before running each test case. There are some other nachos arguments that you might find useful:

-f Formats the Nachos disk. (Creates a new, empty file system.)
-cp Copies a file from UNIX to Nachos.
-p Prints a file on the Nachos file system to stdout.
-l Lists the contents of the Nachos file system.
-D Prints the contents of the entire Nachos file system.
-x Execute a Nachos file.

To output DEBUG lines in your code, use the -d f flag (there is a blank between d and f). For example, nachos -d f -t1

Matching execution time, page faults, and disk reads is not important. You should examine test.[1-5].cc to see what operations each of the test cases attempt to perform and if necessary, change them while debugging your code.

6. Measurements

Measure the performance of your system for a variety of different workloads. Specifically, describe three large-scale workloads (each with at least 20 files), the data from your system, and your interpretation of the data. Suggestions for workloads are to use different combinations/fractions of (1) small files that are short-lived, (2) large files that are few in number but account for most of the reads and writes. Notice that these are chosen to match with reported characteristics of real file systems.

Write the experimental section up as a 2-page report (exceeding the limit will receive a 0) and submit a printed copy in-class on the day MP4 is due.

6. Submission and Grading

Handin instructions will be posted soon.

Correctness will be measured by comparing the output of your program to the standard output. Also the -l and -p Nachos command line options will be used to examine the changes your program makes to the Nachos files system. Style and legibility is important: if the graders can't see that your code is correct, we can't give you points. We reserve the right to use additional test cases when testing your programs; however, these test cases will not test for any conditions not clearly mentioned in the assignment. Crashing or failing to compile is, of course, not a correct behavior.

Implementation: 60%
Report: 30%
Coding, Commenting etc.: 10%

7. General Rules and Information

You should read the relevant Nachos source code before you begin coding. Chapters 5.4 and 6 of the textbook may be helpful in understanding and completing this assignment. As a reminder, the MPs for this class at to be done in groups of TWO; you may discuss among one another general operating system concepts or Sun lab problems, including those concepts which are covered in this MP, but not the details of the assignment solution.

You should use gdb to assist you. We won't be able to help you if you simply send us your code by email and say "I've got a segmentation fault. Help me debug it."

A good quick tutorial on gdb: http://www.unknownroad.com/rtfm/gdbtut/gdbtoc.html

For this MP, you can expect an amount of work more than that of the three previous MPs combined. So start as early as you can. If you are smart *and* lucky (encountering very few bugs), you may write the MP in two whole days. Otherwise, give yourself more time. 

Direct general questions to the newsgroup and specific questions to ta423ug@cs.uiuc.edu.

Any revisions will be listed at the top of the page and announced in the course newsgroup.


Written by: Brian Davis
Written on: Weds., Nov. 9, 2005