Breaking a SHA-1 MAC: Length Extension Attack Tutorial for CS6260

Understanding the UF-CMA Vulnerability in SHA-1 MACs

In applied cryptography, unforgeability under chosen-message attack (UF-CMA) is the standard security notion for message authentication codes (MACs). Homework 1 of CS6260 challenges you to break a deterministic hash-based MAC built on SHA-1, a Merkle-Damgård hash function. Specifically, the MAC computes tag = SHA1(password ∥ message). This construction is vulnerable to a length extension attack, which lets an attacker who knows one valid (message, tag) pair and the password length forge a new valid pair without knowing the password itself. In this tutorial, we walk through the attack step by step, using the provided Python files and the SHA-1 specification (RFC 3174).

Why Length Extension Attacks Matter Today

Length extension attacks are not just academic; they have real-world implications. A well-known case is the Flickr API signature forgery disclosed by Thai Duong and Juliano Rizzo in 2009, where requests were signed as MD5(secret ∥ parameters). Picture an app that accepts signed requests: if the signature is a naive hash of the secret followed by the request, an attacker can append extra commands to a captured request and compute a valid signature for the extended request. This is exactly what we will demonstrate.

How SHA-1 and Merkle-Damgård Enable Length Extension

SHA-1 processes messages in 512-bit (64-byte) blocks, updating a 160-bit internal state after each block. The final hash value is simply that internal state after the last block of the padded input password ∥ message ∥ padding. An attacker who knows the tag can therefore initialize a new SHA-1 instance with that state, without knowing the password, and feed it additional blocks. The resulting hash is exactly SHA1(password ∥ message ∥ padding ∥ extra), where padding is determined by the length of password ∥ message. This is the core of the attack.

Step-by-Step Implementation Guide

Step 1: Understand the Provided Code

You are given student.py where you must implement the main() function. The function should return a (message, tag) pair. The autograder expects a forged message that is longer than the original, with a valid tag computed via length extension.

Step 2: Compute the Padding for the Original Message

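Given the password length (e.g., 16 bytes) and the original message (e.g., b'msg'), compute the SHA-1 padding that would be appended to password + original_message. Only the length of that input matters, so the password itself is not needed. The padding consists of a single 1 bit, enough 0 bits to bring the length to 448 mod 512, and the input length in bits as a 64-bit big-endian integer. See RFC 3174 for the exact specification.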
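The padding rule above can be sketched as a small helper. This is a minimal sketch, not the course's reference code: the name sha1_padding and its bit-length argument are assumptions chosen to match the call in the example snippet of this tutorial, and it assumes the input is a whole number of bytes.

```python
import struct

def sha1_padding(bit_len: int) -> bytes:
    """Return the SHA-1 padding for an input of bit_len bits (hypothetical helper).

    Assumes bit_len is a multiple of 8, i.e. the input is byte-aligned.
    """
    byte_len = bit_len // 8
    # A single 1 bit followed by seven 0 bits is the byte 0x80.
    padding = b"\x80"
    # Zero bytes until the length is 56 mod 64, leaving 8 bytes for the length field.
    padding += b"\x00" * ((55 - byte_len) % 64)
    # The input length in bits as a 64-bit big-endian integer.
    padding += struct.pack(">Q", bit_len)
    return padding

# Assumed example values: a 16-byte password and the 3-byte message b'msg'.
total_len = 16 + 3
pad = sha1_padding(total_len * 8)
print(len(pad), pad.hex())
```

For these example values the input plus padding fills exactly one 64-byte block, which is why the attack can resume hashing at a block boundary.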

Step 3: Extract the Internal State from the Known Tag

The known tag is the final hash of password + original_message. SHA-1's internal state consists of five 32-bit words (H0-H4), and the 160-bit digest is just these five words written out in big-endian order. Splitting the 40-character hex tag into five 8-character groups therefore recovers the state. For example, if the tag is a94a8fe5ccb19ba61c4c0873d391e987982fbbd3, then H0 = 0xa94a8fe5, H1 = 0xccb19ba6, H2 = 0x1c4c0873, H3 = 0xd391e987, H4 = 0x982fbbd3.
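As a quick illustration of this split, the sketch below unpacks a hex tag into five big-endian 32-bit words. The function name tag_to_state is hypothetical and not part of the provided files.

```python
import struct

def tag_to_state(tag_hex: str) -> tuple:
    """Split a 160-bit SHA-1 tag into the five state words H0..H4 (hypothetical helper)."""
    digest = bytes.fromhex(tag_hex)
    if len(digest) != 20:
        raise ValueError("a SHA-1 tag must be exactly 20 bytes (40 hex characters)")
    # ">5I" means five unsigned 32-bit integers in big-endian byte order.
    return struct.unpack(">5I", digest)

state = tag_to_state("a94a8fe5ccb19ba61c4c0873d391e987982fbbd3")
print([hex(word) for word in state])
```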

Step 4: Initialize a New SHA-1 with That State

Create a new SHA-1 instance but set its internal state to the extracted H0-H4. Also set the message length counter to the total length of password + original_message + padding (in bits). This simulates the state after processing the original message.

Step 5: Append Extra Data and Compute the New Hash

Feed your extra data (e.g., b'extra') into the modified SHA-1 instance. When you finalize it, the instance appends its own padding for the total length, and the resulting hash is the tag for the forged message: original_message + padding + extra. Note that the padding inside the forged message is the SHA-1 padding for the length of password + original_message, not for the total length. This is crucial.
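Steps 4 and 5 can be sketched end to end as follows. Python's hashlib does not let you set the internal state, so this sketch includes a small pure-Python SHA-1 following RFC 3174. The names ModifiedSHA1 and sha1_padding follow the example snippet in this tutorial, but their implementations here are assumptions, not the course's provided code. The password is a stand-in used only to produce a tag and to check the forgery locally.

```python
import hashlib
import struct

def _rotl(x, n):
    """Rotate a 32-bit word left by n bits."""
    return ((x << n) | (x >> (32 - n))) & 0xFFFFFFFF

def _compress(state, block):
    """Apply the SHA-1 compression function (RFC 3174) to one 64-byte block."""
    w = list(struct.unpack(">16I", block))
    for t in range(16, 80):
        w.append(_rotl(w[t - 3] ^ w[t - 8] ^ w[t - 14] ^ w[t - 16], 1))
    a, b, c, d, e = state
    for t in range(80):
        if t < 20:
            f, k = (b & c) | (~b & d), 0x5A827999
        elif t < 40:
            f, k = b ^ c ^ d, 0x6ED9EBA1
        elif t < 60:
            f, k = (b & c) | (b & d) | (c & d), 0x8F1BBCDC
        else:
            f, k = b ^ c ^ d, 0xCA62C1D6
        temp = (_rotl(a, 5) + f + e + k + w[t]) & 0xFFFFFFFF
        a, b, c, d, e = temp, a, _rotl(b, 30), c, d
    return tuple((x + y) & 0xFFFFFFFF for x, y in zip(state, (a, b, c, d, e)))

def sha1_padding(bit_len):
    """SHA-1 padding for a byte-aligned input of bit_len bits."""
    return b"\x80" + b"\x00" * ((55 - bit_len // 8) % 64) + struct.pack(">Q", bit_len)

class ModifiedSHA1:
    """SHA-1 that can start from an arbitrary state and bit count (hypothetical helper)."""
    INITIAL = (0x67452301, 0xEFCDAB89, 0x98BADCFE, 0x10325476, 0xC3D2E1F0)

    def __init__(self, state=INITIAL, count=0):
        self.state = tuple(state)
        self.count = count  # bits already hashed; must be a multiple of 512 when resuming
        self.buffer = b""

    def update(self, data):
        self.buffer += data
        self.count += len(data) * 8
        while len(self.buffer) >= 64:
            self.state = _compress(self.state, self.buffer[:64])
            self.buffer = self.buffer[64:]

    def digest(self):
        # Final padding encodes the TOTAL bit count, including the resumed prefix.
        tail = self.buffer + sha1_padding(self.count)
        state = self.state
        for i in range(0, len(tail), 64):
            state = _compress(state, tail[i:i + 64])
        return struct.pack(">5I", *state)

# Stand-in secret for local testing only; the attacker code below never reads its bytes.
password = b"0123456789abcdef"
original_msg, extra = b"msg", b"extra"
known_tag = hashlib.sha1(password + original_msg).hexdigest()

# Attacker side: uses only len(password), original_msg, known_tag and extra.
total_len = len(password) + len(original_msg)
glue = sha1_padding(total_len * 8)
state = struct.unpack(">5I", bytes.fromhex(known_tag))
sha = ModifiedSHA1(state=state, count=(total_len + len(glue)) * 8)
sha.update(extra)
forged_msg = original_msg + glue + extra
forged_tag = sha.digest().hex()

# Local check against the real construction, possible here because we chose the password.
assert forged_tag == hashlib.sha1(password + forged_msg).hexdigest()
```

Setting count correctly matters as much as setting the state: the final padding encodes the total number of bits hashed, so a wrong count yields a tag that does not verify.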

Step 6: Return the Forged Pair

Your main() should return (forged_message, forged_tag_hex). The forged message includes the original message, the padding bytes (which may include null bytes), and your extra data. The autograder will verify that the tag is valid for the forged message under the unknown password.
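To make the layout of the forged message concrete, the sketch below assembles it for the example values used in this tutorial (a 16-byte password, b'msg', and b'extra'); these values are illustrative assumptions, not the grader's actual inputs. Note that the password bytes never appear in the message; only the password length affects the padding.

```python
import struct

password_len = 16          # assumed to be known or guessed by the attacker
original_msg = b"msg"
extra = b"extra"

# Glue padding for the (password + original_msg) input, computed from lengths only.
total_len = password_len + len(original_msg)
glue = b"\x80" + b"\x00" * ((55 - total_len) % 64) + struct.pack(">Q", total_len * 8)

# The forged message is what the verifier receives; the verifier prepends the password.
forged_msg = original_msg + glue + extra

# Mostly non-printable bytes, so return it as bytes rather than decoding it to text.
print(forged_msg)
print(len(forged_msg))
```

Because the padding contains 0x80 and null bytes, treating the forged message as a text string (for example, decoding it as UTF-8) is likely to corrupt it.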

Example Code Snippet

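# Note: sha1_padding() and ModifiedSHA1 are not standard-library functions.
# They are helpers you must implement yourself, since hashlib cannot resume
# hashing from a chosen internal state.
def main(password_len=16, original_msg=b'msg', known_tag='a94a8fe5ccb19ba61c4c0873d391e987982fbbd3', extra=b'extra'):
    # Step 2: compute the padding for the original input (password + original_msg)
    total_len = password_len + len(original_msg)  # length in bytes
    padding = sha1_padding(total_len * 8)  # helper takes the length in bits

    # Step 3: extract the five 32-bit state words H0-H4 from known_tag
    state = [int(known_tag[i:i+8], 16) for i in range(0, 40, 8)]

    # Step 4: initialize a modified SHA-1 with that state and the bit count
    # of everything hashed so far (password + original_msg + padding)
    sha = ModifiedSHA1(state=state, count=(total_len + len(padding)) * 8)

    # Step 5: feed the extra data and finalize
    sha.update(extra)
    forged_tag = sha.digest()  # 20 raw bytes

    # Step 6: construct the forged message (the password is NOT included)
    forged_msg = original_msg + padding + extra
    return (forged_msg, forged_tag.hex())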

Testing Your Solution

Run python grader.py your_username to test locally. The autograder checks that your forged tag is the valid tag for your forged message. Common mistakes include an incorrect padding calculation (the length field is a 64-bit big-endian bit count) and forgetting that the padding inside the forged message encodes the length of password + original_message only, not the extra data.
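Before running the grader, it can help to check a pair against a stand-in verifier with a password you choose yourself. The sketch below is an assumption about what such a check looks like; the function verify and the test password are hypothetical and do not reflect how grader.py is actually implemented.

```python
import hashlib
import hmac

def verify(password: bytes, message: bytes, tag_hex: str) -> bool:
    """Recompute SHA1(password || message) and compare it with the given tag."""
    expected = hashlib.sha1(password + message).hexdigest()
    # compare_digest avoids leaking where the first mismatch occurs.
    return hmac.compare_digest(expected, tag_hex.lower())

# Stand-in password for local experiments; the real one is unknown to you.
test_password = b"0123456789abcdef"
honest_tag = hashlib.sha1(test_password + b"msg").hexdigest()

print(verify(test_password, b"msg", honest_tag))    # honest pair
print(verify(test_password, b"msg!", honest_tag))   # tampered message, same tag
```

Generate a tag with the test password, run your attack using only the tag and the password length, then pass the forged pair to verify to confirm it is accepted.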

Trend-Inspired Example: Gaming Leaderboard Cheat

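Imagine a gaming platform where scores are authenticated using SHA1(secret_key ∥ score). A player who scores 1000 receives a tag for the message 1000. Using length extension, that player can forge a valid tag for 1000 ∥ padding ∥ extra, where extra is attacker-chosen data such as additional digits or a cheat code. Whether this changes the recorded score depends on how the server parses the extended message, but the tag itself will verify. This is exactly the attack we are implementing.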

Conclusion

Length extension attacks are a classic vulnerability in hash-based MACs of the form H(key ∥ message). By understanding the Merkle-Damgård structure, you can forge valid tags without the key. This assignment reinforces why modern MACs like HMAC use a nested construction that hashes the key-dependent inner digest a second time, so the published tag is not an internal state an attacker can resume from. Submit your student.py and a one-page report explaining the attack.
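For contrast, here is a minimal sketch of computing and checking a tag with HMAC using Python's standard hmac module. The key and message are stand-in values, and the choice of SHA-256 as the underlying hash is an assumption made for this illustration; it is not part of the assignment.

```python
import hashlib
import hmac

def hmac_tag(key: bytes, message: bytes) -> str:
    """Compute an HMAC tag over the message (SHA-256 chosen for illustration)."""
    return hmac.new(key, message, hashlib.sha256).hexdigest()

def hmac_verify(key: bytes, message: bytes, tag_hex: str) -> bool:
    """Recompute the tag and compare it in constant time."""
    return hmac.compare_digest(hmac_tag(key, message), tag_hex)

key = b"0123456789abcdef"   # stand-in key for illustration
tag = hmac_tag(key, b"msg")

print(hmac_verify(key, b"msg", tag))            # honest pair
print(hmac_verify(key, b"msg\x80extra", tag))   # extended message, old tag
```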
