CSCI 333 :: Storage Systems

Lab 3: FUSE FS, part I

Assigned	Thursday, 03/05
Due Date	Friday 03/20 at 11:59pm

Objectives

You have become (relatively) familiar with FUSE through your barebones "Hello FUSE" implementation. So far you have successfully created a minimal file system that contains a single pseudo-file. Now we will extend our FS behavior to support the creation, deletion, and modification of files and directories. Through this lab, we will explore the challenges of managing persistent data in a file system.

Overview

Developing a working file system is very hard. For that reason, this assignment is divided into two parts. In part I (this part), you will develop a lot of scaffolding and enough code for your filesystem to do something testable. In part II (the next part), you will complete the filesystem.

The Assignment

Your assignment is to develop a "Simple FS"-like filesystem that supports the following features by the end of part I:

The general structure of the filesystem is similar to the "Simple FS" design (see "Simple FS Design" below).
The filesystem supports the following operations at a minimum: getattr, access, readdir, and mkdir. (Note that at this point it is not necessary to support file I/O, or even regular files.)
Your mkdir operation must allocate an inode in the inode table and a data block from the data area.
Your filesystem should be backed by a SINGLE preallocated 10-MB file. That file should have a fixed name and exist on a non-FUSE file system. For example, you may have a file named "simplefs_disk" in your project directory (i.e., it lives outside your FUSE file system). The size of the file should be a #defined constant (this way you can change it later). Remember to watch out for the working-directory gotcha.
When the filesystem is invoked, if the backing file doesn't exist, the backing file is created and initialized. (You probably have a fixed layout of your disk that divides it into a superblock, bitmap(s), an inode region, and a data region, so be sure to write the initial state of your metadata structures to their appropriate regions on disk so that your file system is usable.) However, if the backing file does exist, the backing file should be used and its previous contents should be visible (i.e., populate your in-memory data structures by reading the contents of the on-disk versions that you have persisted).
When a mutating operation occurs, that operation's effects must be immediately visible in the backing file. (This means that you can't do everything in memory and then wait until exit time to write file system state out. I will test this feature by killing your process with SIGKILL. O_DSYNC isn't necessary, because the operating system will make sure your data reaches stable storage unless the entire OS crashes—which isn't part of the testing plan!)
Subdirectories must be supported (i.e., you can create a directory inside another directory).
Your directories may be fixed-size; it is not necessary to be able to create an arbitrary number of entries in a directory. If you do have a limit, you should document the limit somewhere.
Directory entries (paths) may also be fixed-size, as long as the name length is moderately reasonable. (Nothing under 16 characters is "reasonable" in my book; my minimal implementation compromises with a limit of 32.) Recall that a path is broken into many components; you traverse the file system one component at a time.
If you choose, file sizes may be limited to either 2³² or 2⁶⁴ bytes.
The "completeness" test of your filesystem is that you can create directories, list them (with ls -la returning reasonable results including "." and ".."), and cd into the directories that you've created.
Other operations are up to you. We will be extending the filesystem to support files, rmdir, etc. in the next assignment, so you are welcome to implement those things. However, they will not be tested in the current assignment.
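The component-at-a-time traversal mentioned in the list above might be sketched as follows. This is only an illustration, not the required design: the helper name next_component and the constant NAME_MAX_LEN (set to 32 here to match the limit mentioned above) are assumptions introduced for this example.

```c
#include <assert.h>
#include <string.h>

#define NAME_MAX_LEN 32  /* hypothetical limit on one path component */

/*
 * Copy the next component of *path into name and advance *path past it.
 * Returns 1 if a component was copied, 0 if the path is exhausted,
 * and -1 if the component is longer than NAME_MAX_LEN.
 */
int next_component(const char **path, char name[NAME_MAX_LEN + 1])
{
    const char *p = *path;

    assert(path != NULL && name != NULL);

    while (*p == '/')            /* skip one or more separators */
        p++;
    if (*p == '\0') {            /* nothing left to traverse */
        *path = p;
        return 0;
    }

    size_t len = strcspn(p, "/"); /* length up to next '/' or end */
    if (len > NAME_MAX_LEN)
        return -1;

    memcpy(name, p, len);
    name[len] = '\0';
    *path = p + len;
    return 1;
}
```

A lookup routine could call this in a loop, starting at the root inode and searching the current directory for each returned name. A -1 result would most naturally map to -ENAMETOOLONG in a FUSE callback.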

Why this particular set of features? It's the minimum set of necessary operations to have a filesystem where you can do something visible: create and list directories. You'll find that you need to create quite a bit of scaffolding to get that far (in particular, the code that creates an initialized "Simple FS" filesystem from scratch).
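As a starting point for that scaffolding, here is a minimal sketch of opening the backing file and detecting whether it needs to be initialized. The function name open_backing_file and the created out-parameter are assumptions made for this example. Note that ftruncate sets the file's size but may leave it sparse; whether that counts as "preallocated" for your purposes is your call.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define DISK_SIZE (10L * 1024L * 1024L)  /* 10 MB backing file */

/*
 * Open the backing file at path, creating it at DISK_SIZE bytes if absent.
 * On success returns a file descriptor and sets *created to 1 if a fresh
 * file was made (the caller must then write the initial metadata) or to 0
 * if an existing file was opened (the caller should load its metadata).
 * Returns -1 on error.
 */
int open_backing_file(const char *path, int *created)
{
    assert(path != NULL && created != NULL);

    /* O_EXCL makes creation fail with EEXIST if the file already exists. */
    int fd = open(path, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd >= 0) {
        if (ftruncate(fd, DISK_SIZE) != 0) {
            close(fd);
            unlink(path);        /* don't leave a half-made disk behind */
            return -1;
        }
        *created = 1;
        return fd;
    }
    if (errno != EEXIST)
        return -1;

    fd = open(path, O_RDWR);     /* existing disk: reuse its contents */
    if (fd < 0)
        return -1;
    *created = 0;
    return fd;
}
```

Because a daemonized FUSE process typically no longer runs in the directory where it was started, passing an absolute path (or opening the file before FUSE takes over) is one way to sidestep the working-directory gotcha.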

Simple FS Design

When I refer to a "Simple FS"-like filesystem, I mean the following:

The first sector stores the superblock, which describes the filesystem-level details, including the locations of important file system structures like the bitmaps, inode region, and data region.
The inode bitmap should have one bit that represents the allocation status of each inode in your inode table.
The data bitmap should have one bit that represents the allocation status of each available data block in your data region.
The inode table should be a logical array of inode structures on disk. You should pad your structures so that no inode structure spans a disk block boundary.
The data region should occupy whatever leftover space is on your disk.
The on-disk copy of the superblock, bitmaps, and inode table can be read at mount (filesystem initialization) time and is updated at your discretion (but note that your process might be killed at any time!).
All file metadata is kept in an inode structure. At a minimum, this should include the file type (directory or file), the size (in bytes), the name, and the locations of the first few data blocks. (Subsequent blocks are located via an indirect block. You do not need to support doubly or triply indirect blocks.) Other metadata, such as ownership, permissions, and timestamps, are up to you but are not required.
The block size is up to you, but it must be at least 512. (I recommend 4096, just to keep up with the modern world.)
Like any other file system, the on-disk data structures are stored in a single file (pseudo-disk) and are kept in binary.
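To make the design above concrete, here is one possible layout, offered as a sketch rather than a required format. Every constant and name here (a 4096-byte block, 256 inodes, 12 direct pointers, struct disk_layout, compute_layout) is an assumption chosen for illustration. The inode is padded to 128 bytes so that a whole number of inodes fits in each block and none spans a block boundary.

```c
#include <assert.h>
#include <stdint.h>

#define BLOCK_SIZE 4096u
#define DISK_SIZE  (10u * 1024u * 1024u)
#define NUM_BLOCKS (DISK_SIZE / BLOCK_SIZE)   /* 2560 blocks */
#define NUM_INODES 256u                       /* hypothetical choice */
#define NAME_LEN   32u
#define NUM_DIRECT 12u                        /* hypothetical choice */
#define INODE_SIZE 128u

struct simple_inode {
    uint32_t type;                 /* 0 = free, 1 = directory, 2 = file */
    uint32_t size;                 /* in bytes (limits files to 2^32) */
    char     name[NAME_LEN];
    uint32_t direct[NUM_DIRECT];   /* first few data block numbers */
    uint32_t indirect;             /* block holding further block numbers */
    uint8_t  pad[INODE_SIZE - (4 + 4 + NAME_LEN + 4 * NUM_DIRECT + 4)];
};

_Static_assert(sizeof(struct simple_inode) == INODE_SIZE,
               "inode must be exactly INODE_SIZE bytes");
_Static_assert(BLOCK_SIZE % INODE_SIZE == 0,
               "inodes must not span a block boundary");
_Static_assert(NUM_INODES <= BLOCK_SIZE * 8 && NUM_BLOCKS <= BLOCK_SIZE * 8,
               "each bitmap must fit in a single block");

struct disk_layout {
    uint32_t inode_bitmap_block;   /* block 0 holds the superblock */
    uint32_t data_bitmap_block;
    uint32_t inode_table_block;
    uint32_t inode_table_blocks;
    uint32_t data_start;           /* first block of the data region */
    uint32_t num_data_blocks;      /* whatever space is left over */
};

struct disk_layout compute_layout(void)
{
    struct disk_layout l;
    l.inode_bitmap_block = 1;
    l.data_bitmap_block  = 2;
    l.inode_table_block  = 3;
    l.inode_table_blocks =
        (NUM_INODES * INODE_SIZE + BLOCK_SIZE - 1) / BLOCK_SIZE;
    l.data_start      = l.inode_table_block + l.inode_table_blocks;
    l.num_data_blocks = NUM_BLOCKS - l.data_start;
    return l;
}
```

With these particular numbers the inode table occupies 8 blocks, so the data region starts at block 11 and holds 2549 blocks. The values produced by a function like this are the kind of information the superblock would record.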

For reference, my minimal implementation used a block size of 512 bytes. To make it easy to store the superblock in a filesystem block, I used the following union:

        union {
            struct simple_superblock s;
            char pad[512];
        } superblock;

(Note that the superblock should be only 512 bytes, even if you use a different block size for your filesystem. That design makes it possible to read the superblock without knowing the block size, which is a useful feature. If the filesystem uses blocks larger than 512 bytes, the remaining space in the larger "block" is simply wasted.)
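A sketch of how a fixed 512-byte superblock lets you discover the block size at mount time is shown here. The fields of struct simple_superblock, the magic value, and the function names are all assumptions for this example, since the actual contents of the superblock are up to you.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

#define SUPERBLOCK_SIZE 512
#define SIMPLEFS_MAGIC  0x53465333u   /* hypothetical magic number */

struct simple_superblock {            /* hypothetical fields */
    uint32_t magic;
    uint32_t block_size;
    uint32_t num_inodes;
    uint32_t num_data_blocks;
};

union superblock_block {
    struct simple_superblock s;
    char pad[SUPERBLOCK_SIZE];
};

_Static_assert(sizeof(union superblock_block) == SUPERBLOCK_SIZE,
               "superblock must occupy exactly 512 bytes");

/* Write the superblock, zero-padded to 512 bytes, at offset 0. */
int write_superblock(int fd, const struct simple_superblock *sb)
{
    union superblock_block blk;
    memset(&blk, 0, sizeof blk);
    blk.s = *sb;
    return pwrite(fd, &blk, sizeof blk, 0) == (ssize_t)sizeof blk ? 0 : -1;
}

/*
 * Read the first 512 bytes and validate the magic number. This works
 * before the block size is known; on success the block size can be
 * taken from out->block_size.
 */
int read_superblock(int fd, struct simple_superblock *out)
{
    union superblock_block blk;
    if (pread(fd, &blk, sizeof blk, 0) != (ssize_t)sizeof blk)
        return -1;
    if (blk.s.magic != SIMPLEFS_MAGIC)
        return -1;
    *out = blk.s;
    return 0;
}
```

If your block size is larger than 512, you may prefer to write the superblock as part of a full zero-filled block so that writes stay block-sized; only the initial read needs to be 512 bytes.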

I also found it useful to create a few macros to do things like seeking to a particular block, converting back and forth between byte offsets and block numbers, checking/setting a bitmap bit, etc.

Important Notes

Note: You are supposed to be writing a real filesystem. The only differences from a true implementation of a "Simple FS" should be:

Your implementation is backed by a plain file in the filesystem, rather than an actual disk. This means you can use file interfaces like pread and pwrite to access your structures on disk.
You do not need to handle things like concurrency and consistency checking. However, by making your updates synchronous, you should never have an inconsistent on-disk state.

To mimic a real filesystem, your implementation must satisfy the following criteria:

All access to the "disk" must be in multiples of the block size, which must be a power of 2 and must be 512 or greater.
Changes to files and directories must be reflected on disk immediately. It's cheating (and incorrect) to save things in memory and then write them out when you unmount.
Information must persist in the backing store (your persistent file) after unmount.
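One way to satisfy the block-granularity and write-immediately criteria is to funnel all disk access through a pair of whole-block helpers, as in the sketch below. The function names and the 512-byte BLOCK_SIZE are assumptions for this example; set_bit_on_disk shows the read-modify-write pattern for changing a single bit without ever issuing a partial-block write.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdint.h>
#include <sys/types.h>
#include <unistd.h>

#define BLOCK_SIZE 512

/* Read exactly one block into buf. Returns 0 on success, -1 otherwise. */
int read_block(int fd, uint32_t blockno, void *buf)
{
    ssize_t n = pread(fd, buf, BLOCK_SIZE, (off_t)blockno * BLOCK_SIZE);
    return n == BLOCK_SIZE ? 0 : -1;
}

/* Write exactly one block from buf. Returns 0 on success, -1 otherwise. */
int write_block(int fd, uint32_t blockno, const void *buf)
{
    ssize_t n = pwrite(fd, buf, BLOCK_SIZE, (off_t)blockno * BLOCK_SIZE);
    return n == BLOCK_SIZE ? 0 : -1;
}

/*
 * Set one bit in the bitmap block stored at blockno. The whole block is
 * read, modified in memory, and written straight back, so the change is
 * in the backing file before the function returns.
 */
int set_bit_on_disk(int fd, uint32_t blockno, uint32_t bit)
{
    uint8_t buf[BLOCK_SIZE];

    assert(bit < BLOCK_SIZE * 8);
    if (read_block(fd, blockno, buf) != 0)
        return -1;
    buf[bit / 8] |= (uint8_t)(1u << (bit % 8));
    return write_block(fd, blockno, buf);
}
```

Because each mutating operation ends with a completed pwrite, the operating system already holds the new data even if the process is killed immediately afterward.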

Submission

Submit your code (it should be inside a single file named simplefs.c) to your git repository. If you implement any additional features, describe them prominently in your README.md file so that you receive credit.

I would like everyone to use a "new" feature when submitting part 3a: git tags. A "tag" is essentially a label for a specific commit. You can create a tag from the command line (git manual on tags).

When you have completed your Lab 3a, please do two things:

Create a tag named "v3a" (this does not follow the semantic versioning spec...)

Send an email to let me know that you have finished.

Since you will continue to work on code in the same repository for part 3b, the tag will make sure that I test and give feedback on the correct version of your lab.

This lab borrows heavily from an assignment created by Geoff Kuenning.