Mastering Cache Simulation in C: A COMP9414 AI Assignment Tutorial

Learn to implement stub functions for cache simulation in C, focusing on setSizesOffsetsAndMaskFields, getindex, gettag, writeback, and fill. This tutorial uses timely examples from AI and gaming to make concepts relatable.

Introduction: Why Cache Simulation Matters in 2026

In the world of artificial intelligence and high-performance computing, cache hierarchies are the unsung heroes that bridge the speed gap between processors and memory. As AI models grow larger and real-time applications like gaming and autonomous driving demand instant data access, understanding cache behavior becomes critical. This tutorial will guide you through implementing the five stub functions in YOURCODEHERE.c for the COMP9414 assignment, using analogies from current trends such as AI chatbots, esports tournaments, and viral apps. By the end, you'll not only ace your assignment but also grasp concepts that power modern computing.

Understanding the Cache Hierarchy

Before diving into code, let's demystify the cache hierarchy. Imagine a gaming console's memory: the fastest but smallest cache (L1) is like a player's quick-access inventory, while slower but larger L2 and L3 caches are like storage chests. The CPU requests data from L1 first; if the data is absent (a cache miss), it looks in L2, then L3, and finally main memory. This assignment simulates such a hierarchy for matrix multiplication, a core operation in AI training.

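The provided framework uses a structure cache_t defined in csim.h. Key fields include num_sets, associativity, block_size, index_mask, tag_mask, and offset_mask. The code in this tutorial also reads size, offset_bits, index_bits, sets, and next from the same structure; check csim.h for the exact field names before copying anything. Your job is to compute the sizes and masks and implement the cache access functions.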

Step 1: setSizesOffsetsAndMaskFields

This function calculates the number of sets and the bit masks for index, tag, and offset. Given the cache size, associativity, and block size, you compute:

  • Number of blocks: cache_size / block_size
  • Number of sets: num_blocks / associativity
  • Offset bits: log2(block_size)
  • Index bits: log2(num_sets)
  • Tag bits: address_bits - index_bits - offset_bits
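
To make the arithmetic above concrete, here is a minimal sketch in C. The cache_geometry_t struct and the ilog2_u64 and compute_geometry helpers are hypothetical names invented for this illustration and are not part of csim.h. The sketch assumes 64-bit addresses and that every size is a power of two.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical container for the derived values; not part of csim.h. */
typedef struct {
    uint64_t num_blocks;
    uint64_t num_sets;
    int offset_bits;
    int index_bits;
    int tag_bits;
} cache_geometry_t;

/* Integer log2 for powers of two: counts how many times x can be halved. */
int ilog2_u64(uint64_t x) {
    int bits = 0;
    while (x > 1) {
        x >>= 1;
        bits++;
    }
    return bits;
}

/* Applies the five formulas from the list, assuming 64-bit addresses. */
cache_geometry_t compute_geometry(uint64_t cache_size,
                                  uint64_t block_size,
                                  uint64_t associativity) {
    cache_geometry_t g;
    /* The bit counts are only exact when the block size is a power of two. */
    assert(block_size > 0 && (block_size & (block_size - 1)) == 0);
    assert(associativity > 0);
    g.num_blocks = cache_size / block_size;
    g.num_sets = g.num_blocks / associativity;
    g.offset_bits = ilog2_u64(block_size);
    g.index_bits = ilog2_u64(g.num_sets);
    g.tag_bits = 64 - g.index_bits - g.offset_bits;
    return g;
}
```

For a 32 KiB, 8-way cache with 64-byte blocks, compute_geometry(32768, 64, 8) yields 512 blocks, 64 sets, 6 offset bits, 6 index bits, and 52 tag bits.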

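For a 64-bit system (like the CSE lab machines), address_bits = 64. Each mask is built by shifting 1 left by the number of bits and subtracting 1. The 1 must be a 64-bit constant (1ULL): a plain int is only 32 bits wide on these machines, so shifting it by 31 bits or more is undefined behaviour in C. For example, offset_mask = (1ULL << offset_bits) - 1. Similarly, index_mask = ((1ULL << index_bits) - 1) << offset_bits, and tag_mask = ~(index_mask | offset_mask), which selects every remaining high bit of the 64-bit address.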

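void setSizesOffsetsAndMaskFields(cache_t *cache) {
    uint64_t block_size = cache->block_size;
    uint64_t assoc = cache->associativity;
    uint64_t cache_size = cache->size;
    uint64_t num_blocks = cache_size / block_size;
    uint64_t num_sets = num_blocks / assoc;
    cache->num_sets = num_sets;
    /* Integer log2 (no math.h needed); sizes are assumed to be powers of two. */
    int offset_bits = 0;
    while ((1ULL << offset_bits) < block_size) offset_bits++;
    int index_bits = 0;
    while ((1ULL << index_bits) < num_sets) index_bits++;
    /* getindex and gettag read these two fields. */
    cache->offset_bits = offset_bits;
    cache->index_bits = index_bits;
    /* Use 64-bit constants: shifting a plain int by 31 or more bits is undefined. */
    cache->offset_mask = (1ULL << offset_bits) - 1;
    cache->index_mask = ((1ULL << index_bits) - 1) << offset_bits;
    /* The tag is every address bit above the index and offset. */
    cache->tag_mask = ~(cache->index_mask | cache->offset_mask);
}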
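
One way to sanity-check the masks is to confirm that they never overlap and that together they cover all 64 address bits. The sketch below is a standalone illustration, not framework code: masks_t, build_masks, and masks_are_consistent are hypothetical names, and the real cache_t may lay out its fields differently.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the three mask fields of cache_t. */
typedef struct {
    uint64_t offset_mask;
    uint64_t index_mask;
    uint64_t tag_mask;
} masks_t;

/* Builds the masks from the bit counts using 64-bit constants throughout. */
masks_t build_masks(int offset_bits, int index_bits) {
    masks_t m;
    /* Keep every shift strictly below the 64-bit width. */
    assert(offset_bits >= 0 && index_bits >= 0);
    assert(offset_bits + index_bits < 64);
    m.offset_mask = (UINT64_C(1) << offset_bits) - 1;
    m.index_mask = ((UINT64_C(1) << index_bits) - 1) << offset_bits;
    m.tag_mask = ~(m.index_mask | m.offset_mask);
    return m;
}

/* Returns 1 when the masks are pairwise disjoint and cover all 64 bits. */
int masks_are_consistent(masks_t m) {
    int disjoint = (m.offset_mask & m.index_mask) == 0 &&
                   (m.index_mask & m.tag_mask) == 0 &&
                   (m.offset_mask & m.tag_mask) == 0;
    int complete =
        (m.offset_mask | m.index_mask | m.tag_mask) == UINT64_MAX;
    return disjoint && complete;
}
```

With 6 offset bits and 6 index bits, the masks come out as 0x3F, 0xFC0, and 0xFFFFFFFFFFFFF000. A check like masks_are_consistent can be called temporarily while debugging and removed before submission.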

Step 2: getindex and gettag

These functions extract the index and tag from a memory address. Using the masks computed above:

int getindex(cache_t *cache, uint64_t address) {
    return (address & cache->index_mask) >> cache->offset_bits;
}
uint64_t gettag(cache_t *cache, uint64_t address) {
    return (address & cache->tag_mask) >> (cache->index_bits + cache->offset_bits);
}

Think of this like sorting trading cards: the index tells you which set (e.g., box number), and the tag identifies the specific card within that box.
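
The extraction can be checked by splitting an address into its fields and then rebuilding the block address from the tag and index. The following is a sketch under stated assumptions: addr_fields_t, split_address, and block_address are hypothetical helpers that take the bit counts as plain parameters instead of reading them from cache_t.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical record of the three fields of one address. */
typedef struct {
    uint64_t tag;
    uint64_t index;
    uint64_t offset;
} addr_fields_t;

/* Splits an address into tag, index, and offset by shifting and masking. */
addr_fields_t split_address(uint64_t address, int offset_bits, int index_bits) {
    addr_fields_t f;
    assert(offset_bits >= 0 && index_bits >= 0);
    assert(offset_bits + index_bits < 64);
    f.offset = address & ((UINT64_C(1) << offset_bits) - 1);
    f.index = (address >> offset_bits) & ((UINT64_C(1) << index_bits) - 1);
    f.tag = address >> (offset_bits + index_bits);
    return f;
}

/* Rebuilds the address of the first byte of the block (offset cleared). */
uint64_t block_address(addr_fields_t f, int offset_bits, int index_bits) {
    return (f.tag << (offset_bits + index_bits)) | (f.index << offset_bits);
}
```

For address 0x12345678 with 6 offset bits and 6 index bits, the tag is 0x12345, the index is 25, and the offset is 56. Rebuilding gives 0x12345640, which is the original address with its low 6 bits cleared.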

Step 3: writeback and fill

These functions handle moving data between cache levels. writeback sends a dirty block from the current cache to the next level (lower cache or memory). fill brings a block from the next level into the current cache. They rely on performaccess, a function that accesses the next level. The provided performaccess expects arguments: cache, address, size, type (read/write), and data. For writeback, you call performaccess on the next cache with a write request. For fill, you call performaccess on the next cache with a read request, then copy the data into the local block.

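void writeback(cache_t *cache, int set, int way) {
    cache_line_t *line = &cache->sets[set].lines[way];
    /* Rebuild the block address from the stored tag and the set number. */
    /* Cast set to 64 bits so the shift happens at full address width. */
    uint64_t addr = (line->tag << (cache->index_bits + cache->offset_bits)) | ((uint64_t)set << cache->offset_bits);
    performaccess(cache->next, addr, 8, WRITE, line->data);
    line->dirty = 0;
}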

void fill(cache_t *cache, int set, int way, uint64_t address) {
    cache_line_t *line = &cache->sets[set].lines[way];
    performaccess(cache->next, address, 8, READ, line->data);
    line->tag = gettag(cache, address);
    line->valid = 1;
    line->dirty = 0;
}

Note: The size argument in performaccess is fixed as 8 bytes (one word) per assignment instructions.
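
The bookkeeping in these two functions can be exercised without the framework by swapping the next level for a mock that only records requests. Everything in this sketch is hypothetical (mock_level_t, mock_line_t, mock_access, mock_writeback, mock_fill); it does not reproduce the real performaccess signature and it moves no data, only addresses and flags.

```c
#include <assert.h>
#include <stdint.h>

enum { MOCK_READ, MOCK_WRITE };

/* Stands in for the next cache level: remembers the latest request. */
typedef struct {
    uint64_t last_address;
    int last_type;
    unsigned requests;
} mock_level_t;

/* Hypothetical cache line holding only the state bits and the tag. */
typedef struct {
    int valid;
    int dirty;
    uint64_t tag;
} mock_line_t;

void mock_access(mock_level_t *next, uint64_t address, int type) {
    next->last_address = address;
    next->last_type = type;
    next->requests++;
}

/* Sends a dirty line down: its address is rebuilt from tag and set number. */
void mock_writeback(mock_level_t *next, mock_line_t *line, uint64_t set,
                    int offset_bits, int index_bits) {
    assert(line->valid && line->dirty);
    uint64_t addr = (line->tag << (offset_bits + index_bits)) |
                    (set << offset_bits);
    mock_access(next, addr, MOCK_WRITE);
    line->dirty = 0;
}

/* Brings a block in: reads from the next level, then updates tag and flags. */
void mock_fill(mock_level_t *next, mock_line_t *line, uint64_t address,
               int offset_bits, int index_bits) {
    mock_access(next, address, MOCK_READ);
    line->tag = address >> (offset_bits + index_bits);
    line->valid = 1;
    line->dirty = 0;
}
```

Writing back a line with tag 0x12345 in set 25 (6 offset bits, 6 index bits) sends a write for address 0x12345640 to the mock. A later fill for 0xABCDE678 sends a read and leaves the line valid, clean, and tagged 0xABCDE.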

Putting It All Together: The Cache Access Flow

When a memory reference occurs, the performaccess function in the top-level cache (L1) is called. It uses getindex and gettag to check for a hit. On a miss, it selects a victim line (using a simple LRU or FIFO policy; the framework may already have this). If the victim line is dirty, writeback is called. Then fill brings the requested block from the next level. This repeats down the hierarchy until main memory.
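
The flow described above can be sketched as a toy single-level cache. This is an illustration only: the toy_ names, the fixed 4-set, 2-way, 64-byte-block geometry, and the LRU replacement policy are all assumptions made for the example, and counters stand in for the real writeback and fill calls.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define TOY_SETS 4          /* assumed geometry: 4 sets */
#define TOY_WAYS 2          /* 2 lines per set */
#define TOY_OFFSET_BITS 6   /* 64-byte blocks */
#define TOY_INDEX_BITS 2    /* log2(TOY_SETS) */

typedef struct {
    int valid;
    int dirty;
    uint64_t tag;
    uint64_t last_used;     /* timestamp of the latest access, for LRU */
} toy_line_t;

typedef struct {
    toy_line_t lines[TOY_SETS][TOY_WAYS];
    uint64_t clock;
    unsigned hits, misses, writebacks, fills;
} toy_cache_t;

void toy_init(toy_cache_t *c) {
    memset(c, 0, sizeof *c);
}

/* Performs one access; returns 1 on a hit and 0 on a miss. */
int toy_access(toy_cache_t *c, uint64_t address, int is_write) {
    assert(c != NULL);
    uint64_t index = (address >> TOY_OFFSET_BITS) & (TOY_SETS - 1);
    uint64_t tag = address >> (TOY_OFFSET_BITS + TOY_INDEX_BITS);
    toy_line_t *set = c->lines[index];
    toy_line_t *victim = &set[0];
    c->clock++;

    for (int way = 0; way < TOY_WAYS; way++) {
        toy_line_t *line = &set[way];
        if (line->valid && line->tag == tag) {
            c->hits++;
            line->last_used = c->clock;
            if (is_write) line->dirty = 1;
            return 1;
        }
        /* Victim choice: first invalid line, else least recently used. */
        if (victim->valid &&
            (!line->valid || line->last_used < victim->last_used)) {
            victim = line;
        }
    }

    c->misses++;
    if (victim->valid && victim->dirty) {
        c->writebacks++;    /* the real code would call writeback here */
    }
    c->fills++;             /* the real code would call fill here */
    victim->valid = 1;
    victim->dirty = is_write ? 1 : 0;
    victim->tag = tag;
    victim->last_used = c->clock;
    return 0;
}
```

Addresses 0x000, 0x100, and 0x200 all map to set 0 in this geometry. Writing 0x000, reading it again, and then reading 0x100, 0x200, and 0x000 produces one hit, four misses, four fills, and a single writeback, triggered when the dirty block for 0x000 is evicted to make room for 0x200.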

Testing and Debugging

After implementing, run make test. Expected output should match provided test output. Common pitfalls: incorrect mask computation (especially on 64-bit systems), off-by-one errors in bit shifts, and forgetting to update the tag after fill. Use printf statements to debug, but remove them before submission.
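
For printf-style debugging, it can help to print each address together with its decoded fields and to guard the tracing behind a macro so it disappears from the submitted build. The helper below is a sketch: format_fields, TRACE_ADDRESS, and the CACHE_DEBUG flag are hypothetical names, not part of the assignment framework.

```c
#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Writes "addr=... tag=... index=... offset=..." into buf.
   Returns the value returned by snprintf. */
int format_fields(char *buf, size_t len, uint64_t address,
                  int offset_bits, int index_bits) {
    assert(buf != NULL);
    assert(offset_bits >= 0 && index_bits >= 0);
    assert(offset_bits + index_bits < 64);
    uint64_t offset = address & ((UINT64_C(1) << offset_bits) - 1);
    uint64_t index =
        (address >> offset_bits) & ((UINT64_C(1) << index_bits) - 1);
    uint64_t tag = address >> (offset_bits + index_bits);
    return snprintf(buf, len,
                    "addr=0x%" PRIx64 " tag=0x%" PRIx64
                    " index=%" PRIu64 " offset=%" PRIu64,
                    address, tag, index, offset);
}

/* Build with -DCACHE_DEBUG to trace to stderr; otherwise this is a no-op. */
#ifdef CACHE_DEBUG
#define TRACE_ADDRESS(addr, ob, ib)                          \
    do {                                                     \
        char trace_buf_[128];                                \
        format_fields(trace_buf_, sizeof trace_buf_,         \
                      (addr), (ob), (ib));                   \
        fprintf(stderr, "%s\n", trace_buf_);                 \
    } while (0)
#else
#define TRACE_ADDRESS(addr, ob, ib) do { } while (0)
#endif
```

With 6 offset bits and 6 index bits, formatting 0x12345678 gives the string "addr=0x12345678 tag=0x12345 index=25 offset=56". Because the trace goes to stderr and is compiled out by default, it should not disturb the output that make test compares.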

Trend Connection: AI Chatbots and Cache Performance

AI chatbots spend most of their compute on large matrix multiplications, so their throughput depends heavily on efficient cache usage, just like your matrix multiplication simulation. A poorly configured cache can thrash, repeatedly evicting blocks that are about to be reused, which slows down responses. Similarly, esports games like Valorant and League of Legends rely on cache-friendly code to sustain high frame rates such as 240 FPS. Understanding cache simulation gives you insight into optimizing real-world systems.

Conclusion

By implementing these five functions, you've built the core of a cache simulator. This skill is invaluable for careers in AI, systems programming, and game development. Remember to test thoroughly and adhere to the assignment constraints. Good luck!