Programming lesson
Mastering Run-Length Encoding for Image Compression: A COP3504C Project Guide
Learn how to implement run-length encoding (RLE) for image data in Python, covering encoding, decoding, string conversion, and menu-driven programs. This guide aligns with the COP3504C Project 1 assignment, featuring real-world applications in gaming, AI, and retro pixel art.
Introduction to Run-Length Encoding (RLE) in Image Processing
Run-length encoding (RLE) is a simple yet powerful lossless compression technique widely used in image processing, particularly for pixel art and simple graphics. The core idea is to replace sequences of repeated values (runs) with a count and a single value. For example, the flat data [0, 0, 2, 2, 2, 0, 0, 0, 0, 0, 0, 2, 2] becomes the RLE representation [2, 0, 3, 2, 6, 0, 2, 2], meaning two 0s, three 2s, six 0s, and two 2s. This technique is especially effective for images with large uniform areas, like the black-and-green gator pixel image from the COP3504C assignment.
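The example above can be reproduced in a few lines with the standard library's itertools.groupby, which groups consecutive equal values. This is only an illustration of the idea; the helper name rle_pairs is made up here, and the project functions in this guide build the same result with explicit loops.

```python
from itertools import groupby

def rle_pairs(flat_data):
    """Return [count, value, count, value, ...] for consecutive runs."""
    encoded = []
    for value, run in groupby(flat_data):
        # groupby yields each distinct consecutive value with an iterator over its run.
        encoded.extend([len(list(run)), value])
    return encoded

flat = [0, 0, 2, 2, 2, 0, 0, 0, 0, 0, 0, 2, 2]
print(rle_pairs(flat))  # [2, 0, 3, 2, 6, 0, 2, 2]
```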
In this tutorial, you will build Python functions to encode and decode image data using RLE, then wire them into a menu-driven program that loads, displays, and converts image data. By the end, you will have practiced loops, strings, arrays, methods, and type-casting, all essential skills for programming projects and real-world applications.
Why RLE Matters in 2026: From Retro Gaming to AI Art
RLE is not just a classroom exercise: variants of it appear in long-standing formats such as BMP, PCX, and TIFF. With the current interest in AI-generated pixel art and retro-style indie games, understanding compression algorithms remains relevant. Tile- and block-based games such as Minecraft and Terraria also have to store large regions of identical blocks compactly, which is the same kind of redundancy RLE exploits. By mastering RLE, you're learning a foundational skill that bridges classic computer science with modern trends.
Setting Up Your Project
Your program must run in standalone mode with a menu offering five ways to load data and four ways to display it. You'll also implement eight core functions. Let's break down each component step by step.
Function 1: count_runs(flat_data)
This function counts the number of runs in flat (unencoded) data. For example, [15,15,15,4,4,4,4,4,4] has two runs: three 15s and six 4s. The result is 2. This is useful for determining the length of the RLE byte array (double the run count).
def count_runs(flat_data):
    if not flat_data:
        return 0
    runs = 1
    for i in range(1, len(flat_data)):
        if flat_data[i] != flat_data[i-1]:
            runs += 1
    return runs

Function 2: to_hex_string(data)
Converts a list of integers (0-15) into a hexadecimal string without delimiters. For [3, 15, 6, 4], the output is "3f64". Values above 9 are written as lowercase letters (10=a, 11=b, and so on up to 15=f).
def to_hex_string(data):
    hex_chars = "0123456789abcdef"
    return ''.join(hex_chars[byte] for byte in data)

Function 3: encode_rle(flat_data)
Encodes flat data into RLE format. The output is a bytes object where each run is represented by a count followed by the value. For [15,15,15,4,4,4,4,4,4], the result is b'\x03\x0f\x06\x04' (count 3, value 15, count 6, value 4).
def encode_rle(flat_data):
    if not flat_data:
        return b''
    rle = []
    count = 1
    for i in range(1, len(flat_data)):
        if flat_data[i] == flat_data[i-1]:
            count += 1
        else:
            # The run just ended: store its length and value.
            rle.extend([count, flat_data[i-1]])
            count = 1
    # Store the final run, which the loop never closes.
    rle.extend([count, flat_data[-1]])
    return bytes(rle)

Function 4: get_decoded_length(rle_data)
Given RLE data, returns the length of the decoded flat data. For [3, 15, 6, 4], the sum of counts (3+6) yields 9.
def get_decoded_length(rle_data):
    return sum(rle_data[::2])

Function 5: decode_rle(rle_data)
Decodes RLE data back to flat data. For [3, 15, 6, 4], the output is b'\x0f\x0f\x0f\x04\x04\x04\x04\x04\x04'.
def decode_rle(rle_data):
    flat = []
    for i in range(0, len(rle_data), 2):
        count = rle_data[i]
        value = rle_data[i+1]
        flat.extend([value] * count)
    return bytes(flat)

Function 6: string_to_data(data_string)
Converts a hexadecimal string (e.g., "3f64") into a bytes object. This is the inverse of to_hex_string, so each hex digit is read as one value: "3f64" becomes the values 3, 15, 6, 4.
def string_to_data(data_string):
    # One character per value, mirroring to_hex_string.
    return bytes(int(ch, 16) for ch in data_string)

Function 7: to_rle_string(rle_data)
Converts RLE data into a human-readable string with decimal counts and hex values separated by colons. For [10, 15, 6, 4], output is "10f:64".
def to_rle_string(rle_data):
    parts = []
    for i in range(0, len(rle_data), 2):
        count = rle_data[i]
        value = rle_data[i+1]
        parts.append(f"{count}{value:x}")
    return ':'.join(parts)

Function 8: string_to_rle(rle_string)
Parses a human-readable RLE string back into bytes. For "10f:64", output is b'\x0a\x0f\x06\x04'.
def string_to_rle(rle_string):
    rle = []
    for part in rle_string.split(':'):
        count = int(part[:-1])
        value = int(part[-1], 16)
        rle.extend([count, value])
    return bytes(rle)

Building the Menu-Driven Program
Your main function should display a welcome message, show a color test, and present a menu with options 1-9. Use a loop to keep the program running until the user chooses to exit. The menu options are:
- 1: Load a file (uses console_gfx.load_file)
- 2: Load test image
- 3: Read RLE string (decimal with delimiters)
- 4: Read RLE hex string (no delimiters)
- 5: Read flat data hex string
- 6: Display image
- 7: Display RLE string (human-readable)
- 8: Display RLE hex data
- 9: Display flat hex data
Each load option needs to store the current image data internally. Use a variable like image_data to hold the flat bytes. When the input is RLE data (options 3 and 4), decode it immediately to flat data so that every display option starts from the same representation.
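One way to keep the text-based load options consistent is to funnel them through a single helper that always returns flat data. The sketch below is an assumption about how you might organize this, not the assignment's required structure: load_from_text is a hypothetical helper, and compact versions of the converters are repeated only so the block runs on its own. It assumes hex strings carry one digit per value, as in to_hex_string. Options 1 and 2 are left out because they depend on the course-provided console_gfx module.

```python
def string_to_data(data_string):
    # One hex digit per value: "3f64" -> 3, 15, 6, 4.
    return bytes(int(ch, 16) for ch in data_string)

def string_to_rle(rle_string):
    # "10f:64" -> count 10, value 15, count 6, value 4.
    rle = []
    for part in rle_string.split(':'):
        rle.extend([int(part[:-1]), int(part[-1], 16)])
    return bytes(rle)

def decode_rle(rle_data):
    flat = []
    for i in range(0, len(rle_data), 2):
        flat.extend([rle_data[i + 1]] * rle_data[i])
    return bytes(flat)

def load_from_text(option, text):
    """Hypothetical helper: turn menu options 3-5 into flat image data."""
    if option == 3:      # human-readable RLE string
        return decode_rle(string_to_rle(text))
    if option == 4:      # RLE hex string, no delimiters
        return decode_rle(string_to_data(text))
    if option == 5:      # flat hex string
        return string_to_data(text)
    raise ValueError(f"option {option} does not read text input")

image_data = load_from_text(3, "3f:64")
print(list(image_data))  # [15, 15, 15, 4, 4, 4, 4, 4, 4]
```

Because image_data is always flat, the display options can each start from the same variable and convert on demand.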
Real-World Application: Pixel Art Compression in Indie Games
Consider a hypothetical pixel-art indie game whose characters and environments are stored as RLE-compressed sprites. When the game loads a level, it decodes the RLE data into a pixel buffer for rendering. Your project mirrors this process: you load RLE or flat data, convert between formats, and display the image. Understanding RLE helps you appreciate how games manage memory and loading times.
Common Pitfalls and Debugging Tips
- Off-by-one errors: In count_runs, start the run counter at 1 for non-empty data and compare each element with the one before it.
- Hex conversion: Remember that values 10-15 map to the letters a-f. Use int(value, 16) when parsing a hex digit.
- Byte vs. int: Every element of a bytes object must be within 0-255, so bytes(rle) raises ValueError if a run count exceeds 255. The assignment specifies that pixel values are 0-15. Note also that to_hex_string writes one hex digit per value, so it raises IndexError on any count above 15; such a count cannot appear in the delimiter-free RLE hex string.
- Delimiters: In to_rle_string, the count is written in decimal (1-3 digits) and the value is always a single hex digit, with runs separated by colons. For example, "10f" means count 10, value f (15).
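Since to_hex_string writes each value as one hex digit, an RLE count above 15 cannot be represented in the delimiter-free hex output. One possible way around that is to split long runs. The sketch below assumes a cap of 15 per run; the function name encode_rle_capped and the max_run parameter are illustrative, so check your assignment specification for the limit it actually expects.

```python
def encode_rle_capped(flat_data, max_run=15):
    """Encode (count, value) pairs, splitting any run longer than max_run."""
    rle = []
    count = 0
    previous = None
    for value in flat_data:
        if count > 0 and value == previous and count < max_run:
            count += 1
        else:
            # Either the value changed or the current run hit the cap.
            if count > 0:
                rle.extend([count, previous])
            previous = value
            count = 1
    if count > 0:
        rle.extend([count, previous])
    return bytes(rle)

print(list(encode_rle_capped([4] * 20)))  # [15, 4, 5, 4]
```

If you adopt a cap like this, count_runs would need the same splitting rule so that the run count still equals half the length of the encoded data.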
Testing Your Implementation
Use the provided test files and examples. For instance, load the chubby smiley image (testfiles/smiley.gfx) and verify that the RLE hex string matches 28106b10ab102b10cb102b105b20bb106b10. Also, test edge cases like empty data or single-run data.
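A round-trip check is a quick way to exercise several functions at once: parse the expected hex string, decode it, re-encode it, and confirm you get the original string back. The sketch below repeats compact versions of the converters so it runs on its own; it assumes hex strings carry one digit per value, as in to_hex_string.

```python
def to_hex_string(data):
    return ''.join("0123456789abcdef"[v] for v in data)

def string_to_data(data_string):
    return bytes(int(ch, 16) for ch in data_string)

def decode_rle(rle_data):
    flat = []
    for i in range(0, len(rle_data), 2):
        flat.extend([rle_data[i + 1]] * rle_data[i])
    return bytes(flat)

def encode_rle(flat_data):
    if not flat_data:
        return b''
    rle, count = [], 1
    for i in range(1, len(flat_data)):
        if flat_data[i] == flat_data[i - 1]:
            count += 1
        else:
            rle.extend([count, flat_data[i - 1]])
            count = 1
    rle.extend([count, flat_data[-1]])
    return bytes(rle)

expected = "28106b10ab102b10cb102b105b20bb106b10"
flat = decode_rle(string_to_data(expected))

# Round trip: decoding then re-encoding should reproduce the RLE hex string.
assert to_hex_string(encode_rle(flat)) == expected
# Edge cases: empty data and a single run.
assert encode_rle(b'') == b''
assert encode_rle(bytes([7, 7, 7])) == b'\x03\x07'
print(len(flat))  # 66
```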
Conclusion
By completing this project, you've gained hands-on experience with run-length encoding, a fundamental compression technique. You've also practiced essential Python skills: loops, string manipulation, list/byte operations, and modular programming. These skills are directly applicable to more advanced topics like Huffman coding, LZW compression, and even AI model serialization. In 2026, as data continues to grow, efficient encoding remains critical. Your work here is a stepping stone to mastering data compression and image processing.