Programming lesson
Understanding SHA-3 Array Transformations: A Step-by-Step Guide for CSCI 181 Homework 4
Learn how to implement the SHA-3 input, output, and θ functions with clear code examples and conceptual insights. This guide helps you tackle Homework 4 without giving away the answers.
Introduction to SHA-3 and Array Mapping
In modern cryptography, hash functions like SHA-3 are essential for data integrity and security. SHA-3, also known as Keccak, operates on a 3-dimensional state array. For your CSCI 181 Homework 4, you need to implement functions that convert between 1D and 3D arrays and perform the θ (theta) step. This guide explains the concepts, provides code templates, and helps you avoid common pitfalls.
Understanding the Mapping: From 1D to 3D
The SHA-3 state is a 5×5×64 array, represented as a[5][5][64]. The input is a 1600-bit string stored in a 1D array v[1600]. The mapping given is: a[i][j][k] = v[64(5j + i) + k]. This is not the typical row-major order; it's designed to match the Keccak specification. Let's break it down.
Think of i as the x-coordinate (0–4), j as the y-coordinate (0–4), and k as the bit position within a lane (0–63). The index formula groups bits into lanes of 64 bits. The lane index is 5j + i, which means we iterate over j (y) first, then i (x). This is similar to a column-major order but with a twist.
Example: Mapping a Small Array
Suppose we have a tiny state with 2×2×2 instead of 5×5×64. The formula would be a[i][j][k] = v[2(2j + i) + k]. For i=0, j=0, k=0: index = 2*(0+0)+0 = 0. For i=1, j=0, k=0: index = 2*(0+1)+0 = 2. For i=0, j=1, k=0: index = 2*(2+0)+0 = 4. This shows that lanes are stored contiguously in memory, but the order is j-major.
Implementing inputSHA3()
Your inputSHA3() function should read the 1600-bit file into v and then populate a. Here's a pseudocode outline:
function inputSHA3(v[1600]) returns a[5][5][64]:
for i in 0..4:
for j in 0..4:
for k in 0..63:
index = 64 * (5*j + i) + k
a[i][j][k] = v[index]
return aNote the order of loops: i, j, k. This matches the formula where i and j are used to compute the lane offset. If you loop over j first, you'll get the correct mapping.
Implementing outputSHA3()
The inverse function outputSHA3() reverses the mapping:
function outputSHA3(a[5][5][64]) returns v[1600]:
for i in 0..4:
for j in 0..4:
for k in 0..63:
index = 64 * (5*j + i) + k
v[index] = a[i][j][k]
return vThis is essentially the same code, but the assignment direction is reversed. Ensure that your loops are identical to inputSHA3() to maintain consistency.
The θ (Theta) Step
The θ step is a linear operation that provides diffusion. It computes parity of columns and adds it to the state. The algorithm from the Keccak specification:
- For each column (i, k), compute
C[i][k] = XOR of a[i][j][k] for all j. - For each column, compute
D[i][k] = C[(i-1) mod 5][k] XOR C[(i+1) mod 5][(k-1) mod 64]. - Update the state:
aout[i][j][k] = ain[i][j][k] XOR D[i][k].
Here's a Python-like implementation (adjust for your language):
function theta(ain[5][5][64]) returns aout[5][5][64]:
# Step 1: Compute C
C = array[5][64]
for i in 0..4:
for k in 0..63:
C[i][k] = 0
for j in 0..4:
C[i][k] ^= ain[i][j][k]
# Step 2: Compute D
D = array[5][64]
for i in 0..4:
for k in 0..63:
D[i][k] = C[(i-1+5)%5][k] ^ C[(i+1)%5][(k-1+64)%64]
# Step 3: Update state
for i in 0..4:
for j in 0..4:
for k in 0..63:
aout[i][j][k] = ain[i][j][k] ^ D[i][k]
return aoutBe careful with modulo operations for negative indices. Use (i-1+5)%5 and (k-1+64)%64.
Verifying Your Implementation
Your assignment provides a check: after applying θ to the input file, aout[4][3][9..18] should be 0011011000. This means bits 9 through 18 (inclusive) of lane (i=4, j=3) should match that pattern. Note that indices are 0-based, so aout[4][3][9] is the 10th bit in that lane. Also, you need to list aout[3][1][15..24] in your writeup.
To debug, print the entire aout[4][3] lane (64 bits) and compare the substring. If it doesn't match, check your mapping and θ logic. Common mistakes include incorrect loop order in inputSHA3() or off-by-one errors in the parity computation.
Conceptual Questions: Hash Function Properties
The first two questions deal with hash function collisions and the relationship between one-way and collision resistance. Let's discuss them conceptually.
Question 1: Collision Probability
Given a hash function h: {0,1}^1088 → {0,1}^256, the number of inputs is 2^1088, outputs is 2^256. On average, each output has n = 2^1088 / 2^256 = 2^832 preimages. The probability that a random input maps to a specific output y is n / (number of inputs) = 2^832 / 2^1088 = 1/2^256. That's extremely small, showing why brute-force inversion is infeasible.
Question 2: Weak Collision Resistance Implies One-Way
Prove by contrapositive: If f is not one-way, then it is not weakly collision resistant. Assume there exists an efficient algorithm that given y finds x such that f(x)=y. Then for a given x, we can compute y=f(x) and use the algorithm to find x' such that f(x')=y. If x' ≠ x, we have a collision. If x' = x, we can try again with a different random x or modify the algorithm to find a different preimage. Thus, breaking one-way leads to breaking weak collision resistance.
Trend Connection: Why Hash Functions Matter in 2026
In 2026, with the rise of AI-generated content and deepfakes, cryptographic hashing is more critical than ever. SHA-3 is used in blockchain technologies, secure messaging apps, and digital signatures. Understanding how the state array works gives you insight into the inner workings of these systems. For example, the θ step provides diffusion, ensuring that a single bit change in input affects many output bits—a property essential for security.
Moreover, the concept of mapping between 1D and 3D arrays is analogous to how data is organized in modern GPUs and AI accelerators. When you run a neural network, tensors are often reshaped and permuted, similar to the SHA-3 transformations. So mastering these array manipulations is not just for cryptography; it's a fundamental skill in high-performance computing.
Common Pitfalls and Tips
- Indexing: Double-check the mapping formula. The lane index is
5j + i, not5i + j. This is a common source of errors. - Bit order: The k index runs from 0 to 63, where k=0 is the least significant bit? Check your environment. In Keccak, bits are typically stored with k=0 as the least significant bit, but your assignment may treat them as a simple array. Consistency is key.
- Modulo arithmetic: In θ, when computing
D[i][k], the second term uses(k-1) mod 64. Ensure you handle wrap-around correctly. - Testing: Use the provided check values to validate each step. Write small test cases with known outputs.
Conclusion
Homework 4 covers fundamental SHA-3 operations and theoretical hash function properties. By implementing inputSHA3(), outputSHA3(), and θ, you gain hands-on experience with cryptographic primitives. Remember to test incrementally and verify your mappings. Good luck!