Build Your Own ext2 File System Tools: A Hands-On C Programming Tutorial for CSC369

Learn how to implement ext2 file system tools in C, including mkdir, ln, rm, restore, and checker, with practical examples and timely analogies from modern tech trends.

Introduction: Why File Systems Matter in 2026

File systems are the unsung heroes of every digital device you use. From your smartphone’s storage to cloud servers running AI models, the way data is organized and retrieved impacts performance, reliability, and security. In May 2026, as AI-powered apps generate terabytes of data daily, understanding low-level file system internals is more relevant than ever. This tutorial will guide you through building essential ext2 file system tools in C, mirroring the classic CSC369 assignment but with fresh examples drawn from today’s tech landscape.

Understanding ext2: The Classic Linux File System

ext2 (second extended file system) was the default file system for many Linux distributions before ext3 added journaling and ext4 built on that design. Despite its age, ext2 remains an excellent teaching tool because its on-disk structures are relatively simple yet complete. You’ll work with virtual disk images, which are just binary files mimicking a real disk. Key structures include the superblock, block group descriptors, block and inode bitmaps, inode tables, and data blocks.
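To make the relationship between these structures concrete, here is a minimal sketch of the arithmetic that turns an inode number into a position in the image. It assumes 1 KiB blocks and 128-byte inodes (the classic revision-0 layout); the struct and function names are illustrative, not part of any official header. The caller is assumed to pass bg_inode_table from the descriptor of the group that owns the inode.

```c
#include <assert.h>
#include <stdint.h>

#define EXT2_BLOCK_SIZE 1024u   /* assumption: image formatted with 1 KiB blocks */
#define EXT2_INODE_SIZE 128u    /* assumption: revision-0 inode size */

/* Hypothetical helper type for this sketch. */
struct inode_location {
    uint32_t group;   /* block group that owns the inode */
    uint32_t index;   /* zero-based slot inside that group's inode table */
    uint64_t offset;  /* byte offset of the inode from the start of the image */
};

/* inode_table_block is bg_inode_table from the owning group's descriptor.
 * Inode numbers start at 1, so inode 1 lives in slot 0 of group 0. */
static struct inode_location locate_inode(uint32_t ino,
                                          uint32_t inodes_per_group,
                                          uint32_t inode_table_block)
{
    assert(ino >= 1 && inodes_per_group > 0);

    struct inode_location loc;
    loc.group  = (ino - 1) / inodes_per_group;
    loc.index  = (ino - 1) % inodes_per_group;
    loc.offset = (uint64_t)inode_table_block * EXT2_BLOCK_SIZE
               + (uint64_t)loc.index * EXT2_INODE_SIZE;
    return loc;
}
```

For example, with 32 inodes per group and an inode table starting at block 9, the root directory (inode 2) sits in slot 1 of group 0, 128 bytes past the start of block 9.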

Think of ext2 like a well-organized library: the superblock is the library catalog, block groups are sections, inodes are book records, and data blocks are the actual books. In 2026, similar hierarchical structures underpin AI training datasets and blockchain storage systems.

Setting Up Your Environment

To follow along, you’ll need a Linux environment (or WSL2 on Windows) with a C compiler (gcc) and basic command-line tools. Create a virtual disk using dd and format it with mke2fs:

dd if=/dev/zero of=my_disk.img bs=1024 count=1024
mke2fs -b 1024 my_disk.img

This creates a 1 MiB ext2 image with 1 KiB blocks. You’ll manipulate it with your own tools.
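A quick sanity check on a freshly formatted image is to read the superblock and confirm its magic number. The sketch below assumes the whole image has already been loaded (or mmap’ed) into memory; sb_summary and read_superblock are hypothetical names for this tutorial. The field offsets used (magic at byte 56, block-size shift at byte 24) follow the standard ext2 superblock layout, and all values are stored little-endian.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define EXT2_SUPERBLOCK_OFFSET 1024u   /* superblock starts 1024 bytes into the image */
#define EXT2_SUPER_MAGIC       0xEF53u

/* Hypothetical summary struct: only the fields these tools need most. */
struct sb_summary {
    uint32_t inodes_count;
    uint32_t blocks_count;
    uint32_t free_blocks_count;
    uint32_t free_inodes_count;
    uint32_t block_size;
};

static uint16_t le16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }

static uint32_t le32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Returns 0 on success, -1 if the image is too small or the magic is wrong. */
static int read_superblock(const uint8_t *img, size_t len, struct sb_summary *out)
{
    assert(img != NULL && out != NULL);
    if (len < EXT2_SUPERBLOCK_OFFSET + 1024u) return -1;

    const uint8_t *sb = img + EXT2_SUPERBLOCK_OFFSET;
    if (le16(sb + 56) != EXT2_SUPER_MAGIC) return -1;   /* s_magic */

    out->inodes_count      = le32(sb + 0);              /* s_inodes_count */
    out->blocks_count      = le32(sb + 4);              /* s_blocks_count */
    out->free_blocks_count = le32(sb + 12);             /* s_free_blocks_count */
    out->free_inodes_count = le32(sb + 16);             /* s_free_inodes_count */
    out->block_size        = 1024u << le32(sb + 24);    /* s_log_block_size */
    return 0;
}
```

Reading fields byte by byte avoids depending on the host’s endianness or struct padding; many course solutions instead cast the buffer to a struct from a provided ext2 header, which is fine on little-endian machines.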

Tool 1: ext2_mkdir – Creating Directories

Your first tool, ext2_mkdir, creates a directory at a given absolute path. It must parse the path, traverse existing directories, allocate a new inode and data block, and update the parent directory’s entries. The ext2 specification requires directory entries to be 4-byte aligned, and names are not null-terminated: the name_len field records the name’s length, while rec_len records the size of the whole entry, including any padding or unused space after the name.
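Path parsing is easier to get right when it is isolated in one small function. The sketch below splits an absolute path into the parent directory and the final component, tolerating repeated and trailing slashes. The function name and the choice of EINVAL for malformed input are assumptions for illustration; only the 255-byte name limit comes from ext2 itself.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

#define EXT2_NAME_LEN 255   /* longest name a directory entry can hold */

/* Splits an absolute path into its parent directory and final component.
 * Returns 0 on success or a positive errno value on failure. */
static int split_path(const char *path, char *parent, size_t parent_cap,
                      char *name, size_t name_cap)
{
    assert(parent != NULL && name != NULL);
    if (path == NULL || path[0] != '/') return EINVAL;

    size_t end = strlen(path);
    while (end > 1 && path[end - 1] == '/') end--;   /* drop trailing slashes */
    if (end == 1) return EINVAL;                     /* nothing left but "/" */

    size_t start = end;
    while (path[start - 1] != '/') start--;          /* start of final component */

    size_t name_len = end - start;
    if (name_len > EXT2_NAME_LEN || name_len + 1 > name_cap) return ENAMETOOLONG;

    size_t parent_len = start;                       /* still ends in '/' here */
    while (parent_len > 1 && path[parent_len - 1] == '/') parent_len--;
    if (parent_len + 1 > parent_cap) return ENAMETOOLONG;

    memcpy(name, path + start, name_len);
    name[name_len] = '\0';
    memcpy(parent, path, parent_len);
    parent[parent_len] = '\0';
    return 0;
}
```

With this split, ext2_mkdir can resolve the parent first (failing with ENOENT if it is missing) and then look for the final component inside it (failing with EEXIST if it is found).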

Key Steps:

  1. Open the disk image and read the superblock to locate block group descriptors.
  2. For each path component, search the current directory’s entries.
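  3. If the final component doesn’t exist, allocate an inode (type directory) and a data block, and initialize that block with the . and .. entries.
  4. Write the new directory entry into the parent’s data block, respecting alignment, and increment the parent’s link count to account for the new .. entry.
  5. Mark the new inode and block as used in the bitmaps, then update the free inode/block counts in the superblock and block group descriptors, along with the group’s used-directories count.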
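Step 4 is where most alignment bugs appear, so here is a minimal sketch of inserting an entry into a single directory block. It assumes 1 KiB blocks and works on a raw byte buffer; add_entry and entry_size are hypothetical names. The technique is the usual one: find an entry whose rec_len is larger than it needs, shrink it to its real size, and give the leftover space to the new entry.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXT2_BLOCK_SIZE 1024u   /* assumption: 1 KiB blocks */

/* Entry layout: inode (4 bytes), rec_len (2), name_len (1), file_type (1), name. */
static uint16_t get16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t get32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
static void put16(uint8_t *p, uint16_t v) { p[0] = (uint8_t)v; p[1] = (uint8_t)(v >> 8); }
static void put32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
}

/* Bytes an entry really needs: 8-byte header plus name, rounded up to 4. */
static uint16_t entry_size(uint8_t name_len)
{
    return (uint16_t)((8u + name_len + 3u) & ~3u);
}

/* Adds (ino, name, type) to one directory block.
 * Returns 0 on success, -1 if no entry has enough slack or the block is corrupt. */
static int add_entry(uint8_t *block, uint32_t ino, const char *name, uint8_t type)
{
    size_t len = strlen(name);
    assert(len >= 1 && len <= 255);
    uint16_t need = entry_size((uint8_t)len);

    for (uint32_t off = 0; off < EXT2_BLOCK_SIZE; ) {
        uint8_t *e = block + off;
        uint16_t rec_len = get16(e + 4);
        if (rec_len < 8 || off + rec_len > EXT2_BLOCK_SIZE) return -1;

        /* An entry with inode 0 is unused, so all of its space is available. */
        uint16_t used = get32(e) ? entry_size(e[6]) : 0;
        if (rec_len - used >= need) {
            uint8_t *slot = e + used;
            if (used) put16(e + 4, used);            /* shrink the existing entry */
            put32(slot, ino);
            put16(slot + 4, (uint16_t)(rec_len - used));
            slot[6] = (uint8_t)len;
            slot[7] = type;
            memcpy(slot + 8, name, len);             /* no terminating NUL on disk */
            return 0;
        }
        off += rec_len;
    }
    return -1;
}
```

If every block of the parent is full, the tool has to allocate another data block for the parent and place a single entry spanning the whole block there; that case is left out of the sketch.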

Error Handling: Return ENOENT if a parent directory doesn’t exist, EEXIST if the target already exists.
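Allocation in step 3 has its own failure mode: the bitmap may have no free bit left, which a tool would typically report as ENOSPC (that error choice is an assumption, not something stated above). The sketch below shows one simple way to claim and release bits. ext2 bitmaps number bits from the least significant bit of each byte, and bit i of the inode bitmap corresponds to inode i + 1. The function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Claims the lowest clear bit among the first nbits bits of the bitmap.
 * Returns the zero-based bit index, or -1 when every bit is already set. */
static long bitmap_alloc(uint8_t *bitmap, uint32_t nbits)
{
    assert(bitmap != NULL);
    for (uint32_t i = 0; i < nbits; i++) {
        uint8_t mask = (uint8_t)(1u << (i % 8));
        if (!(bitmap[i / 8] & mask)) {
            bitmap[i / 8] |= mask;     /* mark as in use */
            return (long)i;
        }
    }
    return -1;
}

/* Clears one bit, marking the inode or block as free again. */
static void bitmap_free(uint8_t *bitmap, uint32_t bit)
{
    assert(bitmap != NULL);
    bitmap[bit / 8] &= (uint8_t)~(1u << (bit % 8));
}
```

On a freshly formatted image the reserved inodes are already marked in use, so the first bit this returns for the inode bitmap is usually bit 11, which is inode 12. Remember to adjust the free counters every time a bit changes.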

In 2026, directory creation analogies appear in AI file organization tools that automatically sort generated images into folders. Your tool does the same at the file system level.

Tool 2: ext2_ln – Hard and Symbolic Links

Links are fundamental to Unix file systems. Hard links share the same inode (data), while symbolic links store a path string. Your tool must support both, with a -s flag for symlinks.

Hard Links

To create a hard link, increment the source inode’s link count and add a new directory entry pointing to the same inode number. No new data blocks needed.

// Pseudocode: read_inode, write_inode, and add_directory_entry are helpers you write yourself
inode = read_inode(source_inode_num);
inode.i_links_count++;
write_inode(source_inode_num, inode);
add_directory_entry(parent_dir, new_name, source_inode_num, EXT2_FT_REG_FILE);

Hard links cannot cross file systems, and your tool should refuse to hard-link directories. In 2026, hard links still save space in snapshot-style backups (for example, rsync’s --link-dest option), where unchanged files in successive snapshots share the same inode and data blocks.

Symbolic Links

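Symlinks allocate a new inode (type symlink). A short target path (under 60 bytes) can be stored inline in the inode’s i_block array, which is known as a fast symlink; a longer path goes into a separately allocated data block. In both cases the path length is stored in i_size.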

if (symlink_flag) {
    allocate_inode(&new_inode_num, EXT2_S_IFLNK);  // inode mode bits, not the directory-entry type
    write_inode_data(new_inode_num, target_path, strlen(target_path));
    add_directory_entry(parent_dir, link_name, new_inode_num, EXT2_FT_SYMLINK);
}

Symlinks are analogous to shortcuts on modern operating systems, but they work at the file system level. In AI pipelines, symlinks help manage large datasets without duplication.
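The inline case can be sketched in a few lines. The struct below is a deliberately cut-down stand-in for the real 128-byte inode (it is not the on-disk layout), and set_symlink_target is a hypothetical helper. The 60-byte limit comes from the size of i_block: 15 block pointers of 4 bytes each. If your assignment handout tells you to always use a data block, skip the inline branch.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXT2_S_IFLNK      0xA000u
#define FAST_SYMLINK_MAX  60u     /* sizeof(i_block): 15 pointers * 4 bytes */

/* Illustrative subset of inode fields; NOT the on-disk layout. */
struct mini_inode {
    uint16_t i_mode;
    uint16_t i_links_count;
    uint32_t i_size;
    uint32_t i_block[15];
};

/* Fills in a symlink inode for the given target path.
 * Returns 1 if the path was stored inline in i_block,
 * or 0 if the caller must still allocate a data block and write the path there. */
static int set_symlink_target(struct mini_inode *ino, const char *target)
{
    assert(ino != NULL && target != NULL);
    size_t len = strlen(target);

    memset(ino, 0, sizeof *ino);
    ino->i_mode = (uint16_t)(EXT2_S_IFLNK | 0777);
    ino->i_links_count = 1;
    ino->i_size = (uint32_t)len;          /* length excludes any terminator */

    if (len < FAST_SYMLINK_MAX) {
        memcpy(ino->i_block, target, len); /* fast symlink: no data block needed */
        return 1;
    }
    return 0;
}
```

A fast symlink uses no data blocks, so the inode’s block count stays at 0 and the block bitmap is untouched.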

Tool 3: ext2_rm – Removing Files

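Removing a file involves removing its directory entry and decrementing the inode’s link count. If the count reaches zero, mark the inode as free in the inode bitmap, set i_dtime to the current timestamp, and free its data blocks in the block bitmap (you don’t need to zero them), updating the free counters as you go. Importantly, you must not shift the remaining entries. Instead, fold the removed entry into the one before it by adding its rec_len to the previous entry’s rec_len; only when the removed entry is the first in its block do you mark it unused by setting its inode number to 0. Leaving the removed entry’s bytes in place is what later allows ext2_restore to find it.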
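A minimal sketch of entry removal without shifting, assuming 1 KiB blocks and a raw byte buffer for the directory block (remove_entry is a hypothetical name). The removed entry’s bytes are left untouched inside the previous entry’s enlarged rec_len; only an entry that is first in its block is cleared by zeroing its inode field.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXT2_BLOCK_SIZE 1024u   /* assumption: 1 KiB blocks */

static uint16_t get16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t get32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
static void put16(uint8_t *p, uint16_t v) { p[0] = (uint8_t)v; p[1] = (uint8_t)(v >> 8); }
static void put32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)v;         p[1] = (uint8_t)(v >> 8);
    p[2] = (uint8_t)(v >> 16); p[3] = (uint8_t)(v >> 24);
}

/* Unlinks `name` from one directory block.
 * Returns the inode number the entry referred to, or 0 if it was not found. */
static uint32_t remove_entry(uint8_t *block, const char *name)
{
    assert(block != NULL && name != NULL);
    size_t len = strlen(name);
    uint8_t *prev = NULL;

    for (uint32_t off = 0; off < EXT2_BLOCK_SIZE; ) {
        uint8_t *e = block + off;
        uint16_t rec_len = get16(e + 4);
        if (rec_len < 8 || off + rec_len > EXT2_BLOCK_SIZE) return 0;  /* corrupt */

        uint32_t ino = get32(e);
        if (ino != 0 && e[6] == len && 8u + len <= rec_len &&
            memcmp(e + 8, name, len) == 0) {
            if (prev)   /* let the previous entry swallow this one */
                put16(prev + 4, (uint16_t)(get16(prev + 4) + rec_len));
            else        /* first entry in the block: nothing to merge into */
                put32(e, 0);
            return ino;
        }
        prev = e;
        off += rec_len;
    }
    return 0;
}
```

The caller then uses the returned inode number to decrement i_links_count and, if it reaches zero, to set i_dtime and release the inode and its blocks.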

Bonus -r flag: For recursive removal of directories, traverse all entries and recursively remove files/subdirectories. This is like an AI training script cleaning up a temporary dataset folder.

if (is_directory(path)) {
    if (!recursive_flag) return EISDIR;
    // recursively remove all children
    for each entry in dir {
        if (entry is not . or ..) {
            ext2_rm(disk, full_child_path, recursive_flag);
        }
    }
    // then remove the directory itself
}

Tool 4: ext2_restore – Undeleting Files

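This tool recovers recently deleted files by scanning directory entry “gaps”: the slack between the end of a live entry’s name and the end of its rec_len, where the structure of a removed entry (including its inode number) may still exist. For a candidate whose name matches, check that the inode’s i_dtime is set, that the inode bitmap still marks the inode as free, and that the block bitmap still marks all of its data blocks as free. Then reallocate the inode and blocks, clear i_dtime, restore the link count, and splice the entry back into the directory by shrinking the rec_len that currently covers it.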

Constraints: Do not try to restore files that had other hard links at deletion time, and give up if the inode or any of its data blocks has since been reused. Real-world recovery tools such as extundelete (which targets ext3 and ext4 and relies on the journal) face similar limits, and they are crucial after accidental deletions in critical AI training pipelines.
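The gap scan can be sketched as follows, again assuming 1 KiB blocks and a raw byte buffer. It walks the live entries and, inside each one’s slack, steps through anything that still looks like an entry. This is a heuristic: it assumes leftover entries are packed back to back at their minimal size, and find_removed_entry is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define EXT2_BLOCK_SIZE 1024u   /* assumption: 1 KiB blocks */

static uint16_t get16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t get32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Bytes an entry really needs: 8-byte header plus name, rounded up to 4. */
static uint16_t entry_size(uint8_t name_len)
{
    return (uint16_t)((8u + name_len + 3u) & ~3u);
}

/* Looks for a removed entry called `name` hidden in the slack of live entries.
 * Returns its byte offset within the block, or -1 if there is none. */
static long find_removed_entry(const uint8_t *block, const char *name)
{
    assert(block != NULL && name != NULL);
    size_t len = strlen(name);

    for (uint32_t off = 0; off < EXT2_BLOCK_SIZE; ) {
        uint16_t rec_len = get16(block + off + 4);
        if (rec_len < 8 || off + rec_len > EXT2_BLOCK_SIZE) return -1;  /* corrupt */

        uint32_t end = off + rec_len;
        uint32_t gap = off + entry_size(block[off + 6]);  /* skip the live entry */

        while (gap + 8 <= end) {
            const uint8_t *g = block + gap;
            uint16_t need = entry_size(g[6]);
            /* Stop at anything that cannot be a leftover entry. */
            if (get32(g) == 0 || get16(g + 4) < need || gap + need > end) break;
            if (g[6] == len && memcmp(g + 8, name, len) == 0) return (long)gap;
            gap += need;
        }
        off += rec_len;
    }
    return -1;
}
```

A match is only a candidate. Before restoring, the tool still has to read the inode number at that offset and run the i_dtime and bitmap checks described above.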

Tool 5: ext2_checker – File System Consistency

A lightweight file system checker validates and fixes common inconsistencies:

  • Free block/inode counts: Compare superblock and block group counters with bitmap scans. Trust bitmaps and update counters. Output: Fixed: superblock's free blocks counter was off by 5 compared to the bitmap.
  • Inode mode vs. directory entry type: For each entry, ensure the file type matches the inode’s mode. Trust the inode and fix the entry. Output: Fixed: Entry type vs inode mismatch: inode 42.
  • Allocated inodes in bitmap: If an inode is in use but not marked in the bitmap, mark it and update counters.

Automated checkers are vital for large-scale storage systems. In 2026, AI-driven data centers use similar tools to maintain integrity across millions of files.

Putting It All Together: Testing Your Tools

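After implementing all tools, test them on your virtual disk. Create directories and links; remove and restore; then run the checker to verify consistency. The example below creates a symbolic link: a freshly formatted image contains no /etc/passwd to hard-link, and a symlink owns its inode, so removing it leaves something for ext2_restore to recover. (Placing -s right after the image name is an assumption; follow your assignment handout.) Example workflow:

./ext2_mkdir my_disk.img /home
./ext2_mkdir my_disk.img /home/user
./ext2_ln my_disk.img -s /etc/passwd /home/user/passwd_link
./ext2_rm my_disk.img /home/user/passwd_link
./ext2_restore my_disk.img /home/user/passwd_link
./ext2_checker my_disk.img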

This sequence mimics real-world file management tasks. In 2026, similar operations are automated in cloud storage systems like Google Drive or AI training data pipelines.

Conclusion: From Assignment to Real-World Skills

Building ext2 tools teaches you low-level C programming, binary data manipulation, and file system internals. These skills are directly applicable to modern challenges: optimizing storage for AI models, debugging file system corruption in embedded devices, or contributing to open-source storage projects. As data continues to explode in volume, understanding how file systems work becomes a superpower.

Remember to test edge cases: empty paths, root directory, symbolic link loops, and full disks. Your implementation should be robust and follow the ext2 specification precisely. Good luck, and happy coding!