The starting point for this assignment is a simplified file system, cse451fs, the design of which imposes strict limits on both the number of files that can be stored and the maximum size of any one file. In particular, no matter how big a disk you might have, this file system can hold only about 8,000 distinct files, and no file can be larger than 13KB. These restrictions result from the choice of on-disk data structures used to find files and the data blocks of a given file, that is, the superblock and inode representations.
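As a rough illustration of how fixed-size on-disk structures turn into hard limits, the sketch below computes a maximum file size from a count of direct block pointers and a maximum file count from a fixed-size bitmap. The block size, pointer count, and bitmap size are assumptions chosen only because they reproduce the limits quoted above. They are not taken from the cse451fs source, so check the real headers before relying on them. The function names are hypothetical as well.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical values, NOT read from the cse451fs source. They are picked
 * only because they reproduce the limits stated in the text. */
#define SKETCH_BLOCK_SIZE  1024u  /* bytes per disk block (assumed) */
#define SKETCH_DIRECT_PTRS 13u    /* direct block pointers per inode (assumed) */
#define SKETCH_MAP_BLOCKS  1u     /* blocks reserved for the inode bitmap (assumed) */

/* Largest file an inode can describe when it holds only direct pointers:
 * each pointer names exactly one data block. */
uint64_t max_file_bytes(uint32_t direct_ptrs, uint32_t block_size)
{
    assert(block_size > 0);
    return (uint64_t)direct_ptrs * block_size;
}

/* Number of inodes a fixed-size bitmap can track, at one bit per inode.
 * This bound does not grow with the disk, only with the bitmap. */
uint64_t max_inodes(uint32_t map_blocks, uint32_t block_size)
{
    assert(block_size > 0);
    return (uint64_t)map_blocks * block_size * 8u;
}
```

With these assumed values, `max_file_bytes(SKETCH_DIRECT_PTRS, SKETCH_BLOCK_SIZE)` is 13312 bytes (13 KB) and `max_inodes(SKETCH_MAP_BLOCKS, SKETCH_BLOCK_SIZE)` is 8192, regardless of how large the disk is.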
Here are the major steps involved in this assignment:
- If you are working in a team, make sure that you, individually, understand the mechanical aspects of the "development environment" you'll be working in. These include how to build the file system code, how to configure the raw disk device to host your file system, and how to run and test your file system. A description of these mechanical aspects is here.
- Design how you want to represent the file system on disk: what the superblock looks like, what the inodes look like, how you keep track of free/allocated space, etc. (Those are the minimal changes needed to relax the size restrictions of the current implementation, and are probably sufficient for anything you'll want to do.)
There are many tradeoffs among factors such as maximum file size, the number of files you can store, the amount of raw disk space (blocks) spent on management data structures (and so not available to hold file data), and efficiency (as measured, say, by the number of disk IOs needed to access the first byte of a file, or just the Nth byte, or all bytes). An ideal (but unachievable) design would:
- Be able to store a single file that was as big as the raw disk capacity, no matter how big the disk.
- Be able to store as many distinct files as there are blocks on the physical device.
- Be able to access the Nth byte of the file in one IO operation.
You may decide to come very close to achieving one of these goals while compromising significantly on the others. Or, you may decide to compromise somewhat on all of them, so that no one aspect is terribly bad. The latter is the approach taken by ext2, the default Linux file system (hmm, this may be out-of-date by now ... is ext3 now the default file system?). That system is motivated by measurements of real file systems showing that most files are small (under 8K or so) and are read sequentially. You can make these same assumptions in your design, or you can make other assumptions entirely. For instance, you might want to design a file system for the explicit purpose of storing streaming media files (e.g., audio or video clips). Those files are large and, when used for playback, are read sequentially with the requirement that the times to access successive chunks have low variance.
- Alter the skeleton code to implement your file system. There are two major components to this. One is that the user-level program mkfs.cse451fs must be changed to initialize the raw disk device with a valid, empty file system using your new on-disk data structures. The other is to change the file system source (fsSource/) itself.
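Before committing to an on-disk format, it can help to put numbers on the tradeoffs listed in the design step. The sketch below is one possible way to do that for an inode that mixes direct, single-indirect, and double-indirect block pointers (the general scheme ext2 uses). The `struct layout` fields and both function names are hypothetical and exist only for this illustration. Nothing here reflects the actual cse451fs structures, and the IO count assumes the inode is already in memory and no blocks are cached.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of a candidate inode design. Every field is an
 * assumption for exploring tradeoffs, not part of the skeleton code. */
struct layout {
    uint32_t block_size;  /* bytes per disk block */
    uint32_t ptr_size;    /* bytes per on-disk block pointer */
    uint32_t direct;      /* number of direct pointers in the inode */
    uint32_t single_ind;  /* 1 if the inode has a single-indirect pointer, else 0 */
    uint32_t double_ind;  /* 1 if the inode has a double-indirect pointer, else 0 */
};

/* How many block pointers fit in one indirect block. */
static uint64_t ptrs_per_block(const struct layout *l)
{
    assert(l->ptr_size > 0 && l->block_size >= l->ptr_size);
    return l->block_size / l->ptr_size;
}

/* Maximum number of data blocks a single file can occupy. */
uint64_t max_file_blocks(const struct layout *l)
{
    uint64_t p = ptrs_per_block(l);
    return l->direct + l->single_ind * p + l->double_ind * p * p;
}

/* Disk reads needed to fetch the byte at `offset`, assuming the inode is
 * already in memory and nothing is cached. Returns -1 if the offset lies
 * beyond the largest file this layout can represent. */
int reads_for_offset(const struct layout *l, uint64_t offset)
{
    uint64_t p = ptrs_per_block(l);
    uint64_t blk = offset / l->block_size;  /* logical block index */

    if (blk < l->direct)
        return 1;                /* data block only */
    blk -= l->direct;

    if (l->single_ind) {
        if (blk < p)
            return 2;            /* indirect block + data block */
        blk -= p;
    }

    if (l->double_ind && blk < p * p)
        return 3;                /* two indirect blocks + data block */

    return -1;
}
```

For example, a layout of `{1024, 4, 10, 1, 1}` (1 KB blocks, 4-byte pointers, 10 direct pointers, one single-indirect and one double-indirect pointer) can address 65802 data blocks per file. Bytes in the first 10 blocks cost one read, while bytes reached through the double-indirect pointer cost three. Changing the fields shows how a design tuned for many small files differs from one tuned for a few large, sequentially read files.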
You should schedule an appointment with us (instructor & TA) to discuss your design before you get too deep into your implementation (although you probably should do some prototyping first just to make sure you understand all the issues). Once you are done, hand in your completed assignment using WebCT.
Hand in the following:
- A report detailing:
- what your file system is trying to accomplish
- the design for the file system you implemented. This might include a discussion of other approaches you considered but rejected, if any.
- what you had to do to implement it (e.g., which files required major changes, and of what sort)
- how you tested your file system (for functionality), and whether or not it works
- the goals and design of your hypothetical persistent file system. (Again, you might also describe other ideas you discussed but rejected, and say why.)
- A gzipped tar file containing:
- source code for your modified mkfs.cse451fs and instructions for making and running it.
- a patch that the TA can apply to the Linux kernel with the original fsSource subdirectory to get your modified file system. If all of your changes are limited to fsSource, then you can alternatively just include the entire subdirectory. Again, give appropriate instructions.