Programming lesson
Understanding SHA-3 Hash Functions: From Collision Probability to Theta, Rho, and Pi Implementations
A step-by-step tutorial covering hash function collision probability, one-way vs weak collision resistance proofs, and implementing SHA-3's θ, ρ, and π functions in C++.
Introduction: Why SHA-3 Matters in 2026
As of May 2026, SHA-3 remains a cornerstone of cryptographic security, used in blockchain technologies, digital signatures, and even AI model verification. This tutorial breaks down key concepts from Homework 4 of CSCI 181, focusing on hash function properties and the implementation of SHA-3's core permutation functions. Whether you're a computer science student or a developer diving into cryptography, understanding these fundamentals is essential for building secure systems.
Part 1: Hash Function Collision Probability (10 points)
What is a Hash Function?
A hash function h: X → Y maps an input set X to a smaller output set Y. In our case, h takes 1088-bit strings and outputs 256-bit strings. Because |X| > |Y|, collisions are inevitable.
Finding the Average n-to-1 Ratio
Given |X| = 2^1088 and |Y| = 2^256, the average number of inputs mapping to each output is:
n = |X| / |Y| = 2^1088 / 2^256 = 2^(1088-256) = 2^832So on average, each output has 2^832 preimages.
Probability of Solving the One-Way Problem by Brute Force
The one-way problem asks: given y ∈ Y, find any x ∈ X such that h(x) = y. If we randomly pick an x, the probability it maps to a specific y is:
P = n / |X| = 2^832 / 2^1088 = 1 / 2^256This astronomically small probability shows why brute-forcing a hash is infeasible—a key insight for security in cryptocurrencies and password storage.
Part 2: Proving Weak Collision Resistance Implies One-Way Property (10 points)
Contrapositive Proof
We need to show: if f is not one-way, then it is not weakly collision resistant.
Assume f does NOT have the one-way property. That means for a given y, we can efficiently find some x such that f(x) = y. Now, to break weak collision resistance, we are given an arbitrary x. Compute y = f(x). Using the one-way breaker, find x' such that f(x') = y. If x' ≠ x, we have a collision. If x' = x, repeat the process with a different random input to find another preimage. Since |X|/|Y| is large, there are many preimages, so we will quickly find a distinct x'. This shows weak collision resistance is broken. Thus, weak collision resistance implies one-way property.
This proof is crucial for understanding why hash functions like SHA-3 are trusted in applications such as AI model integrity checks and blockchain mining.
Part 3: Implementing SHA-3 Functions in C++
Understanding the State Array
SHA-3 operates on a 3D state array a[5][5][64]. The input is a 1D array v[1600]. The mapping between them is:
a[i][j][k] = v[64*(5*j + i) + k]This layout follows the SHA-3 standard, where i is the x-coordinate, j is the y-coordinate, and k is the bit position in the lane.
Function: inputSHA3()
Converts a 1D array to 3D:
void inputSHA3(bool v[1600], bool a[5][5][64]) {
for (int i = 0; i < 5; i++)
for (int j = 0; j < 5; j++)
for (int k = 0; k < 64; k++)
a[i][j][k] = v[64*(5*j + i) + k];
}Function: outputSHA3()
Converts 3D back to 1D:
void outputSHA3(bool a[5][5][64], bool v[1600]) {
for (int i = 0; i < 5; i++)
for (int j = 0; j < 5; j++)
for (int k = 0; k < 64; k++)
v[64*(5*j + i) + k] = a[i][j][k];
}The θ (Theta) Step
The θ function updates each bit based on the parity of its neighboring columns. The C++ implementation:
void theta(bool ain[5][5][64], bool aout[5][5][64]) {
bool C[5][64] = {0};
bool D[5][64] = {0};
// Compute column parity
for (int x = 0; x < 5; x++)
for (int k = 0; k < 64; k++)
for (int y = 0; y < 5; y++)
C[x][k] ^= ain[x][y][k];
// Compute D[x][k] = C[x-1][k] XOR C[x+1][k-1]
for (int x = 0; x < 5; x++)
for (int k = 0; k < 64; k++)
D[x][k] = C[(x+4)%5][k] ^ C[(x+1)%5][(k+63)%64];
// Update state
for (int x = 0; x < 5; x++)
for (int y = 0; y < 5; y++)
for (int k = 0; k < 64; k++)
aout[x][y][k] = ain[x][y][k] ^ D[x][k];
}To verify, apply θ to the provided input file and check that aout[4][3][9..18] equals 0011011000. Then list aout[3][1][15..24] in your writeup.
The ρ (Rho) Step
ρ performs bitwise rotations on each lane. The rotation offsets are given by the rhomatrix:
int rhomatrix[5][5] = {
{0, 36, 3, 41, 18},
{1, 44, 10, 45, 2},
{62, 6, 43, 15, 61},
{28, 55, 25, 21, 56},
{27, 20, 39, 8, 14}
};
void rho(bool ain[5][5][64], bool aout[5][5][64]) {
for (int x = 0; x < 5; x++)
for (int y = 0; y < 5; y++) {
int rot = rhomatrix[x][y];
for (int k = 0; k < 64; k++)
aout[x][y][k] = ain[x][y][(k + 64 - rot) % 64];
}
}Check: aout[4][3][9..18] should be 0110011001. List aout[3][1][15..24].
The π (Pi) Step
π rearranges lanes by permuting coordinates:
void pi(bool ain[5][5][64], bool aout[5][5][64]) {
for (int x = 0; x < 5; x++)
for (int y = 0; y < 5; y++)
for (int k = 0; k < 64; k++)
aout[x][y][k] = ain[(x + 3*y) % 5][x][k];
}Verify: aout[4][3][9..18] = 0110110001. List aout[3][1][15..24].
Connecting to Real-World Trends
In 2026, SHA-3 is widely used in AI model verification to ensure training data integrity, and in decentralized finance (DeFi) for secure transactions. Understanding these low-level functions helps developers build trust in cryptographic systems. Just as a video game's random number generator must be collision-resistant to prevent exploits, SHA-3's design ensures unpredictable, one-way mappings.
Conclusion
This tutorial covered the theoretical foundations of hash function collisions and proofs, plus practical implementations of SHA-3's θ, ρ, and π steps. By completing these exercises, you gain hands-on experience with cryptographic primitives that underpin modern security. For the next homework, you'll implement χ and ι to complete the Keccak permutation. Happy coding!