Programming lesson
Run-Length Encoding in Python: A Step-by-Step Guide for Image Compression (COP3502c HW3 & HW4)
Learn run-length encoding (RLE) in Python with practical examples for image data. Covers encoding/decoding, hex conversion, RLE strings, and menu-driven programs. Perfect for COP3502c assignments.
Introduction to Run-Length Encoding (RLE)
Run-length encoding (RLE) is a simple form of lossless data compression that is particularly effective for data with many consecutive repeated values. In the context of image compression, RLE is often used for pixel art, icons, and simple graphics where large areas of the same color appear. For example, in retro video games, sprite sheets frequently contain long runs of identical pixels, making RLE an ideal choice for reducing file size without losing quality.
In this tutorial, we will explore the core concepts of RLE as applied to image data, focusing on the methods required for COP3502c Homework 3 and 4. You will learn how to encode flat pixel data into RLE format, decode RLE back to flat data, convert between hexadecimal strings and lists, and build a menu-driven program to load, display, and manipulate image data. By the end, you will have a solid understanding of how RLE works and be able to implement the required functions.
Understanding the Data Format
Before diving into the code, it is crucial to understand the data structures involved. In this assignment, images are represented as a list of integers. The first two elements are the width and height of the image in pixels. The remaining elements are the pixel values, each ranging from 0 to 15 (representing 16 possible colors). For example, a simple 4x4 image might have flat data like:
[4, 4, 0, 0, 0, 0, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0]
Here, the width is 4, the height is 4, and the 16 pixel values follow. The flat data is the raw, uncompressed representation. RLE encoding compresses it by storing each run of identical values as a pair: the run length first, then the value. The pixel values above form three runs: four 0s, four 2s, and eight 0s.
The encoded list therefore alternates between run lengths and values, so these pixel values become [4, 0, 4, 2, 8, 0]. This is the format you will work with.
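To make the layout concrete, here is a small sketch that splits the example list into its dimensions and pixel values, prints the rows, and lists the runs. The helper name `split_image` is hypothetical and not part of the assignment.

```python
from itertools import groupby

def split_image(flat_data):
    """Hypothetical helper: separate width, height, and pixel values."""
    width, height = flat_data[0], flat_data[1]
    pixels = flat_data[2:]
    return width, height, pixels

flat = [4, 4, 0, 0, 0, 0, 2, 2, 2, 2, 0, 0, 0, 0, 0, 0, 0, 0]
width, height, pixels = split_image(flat)

# The header promises width * height pixel values.
assert len(pixels) == width * height

# Print the image one row at a time.
for row in range(height):
    print(pixels[row * width:(row + 1) * width])

# Collect (run_length, value) pairs for the pixel values only.
runs = [(len(list(group)), value) for value, group in groupby(pixels)]
print(runs)  # [(4, 0), (4, 2), (8, 0)]
```

Printing the rows makes the runs easy to see: one row of 0s, one row of 2s, then two rows of 0s that merge into a single run of eight.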
Key Functions to Implement
You will need to implement eight functions. Let's go through each one with detailed explanations and examples.
1. to_hex_string(data)
This function converts a list of integers (each 0-15) into a hexadecimal string without delimiters. Each integer is represented as a single hex digit (0-9, a-f). For example, [3, 15, 6, 4] becomes "3f64". Note that 15 is 'f', 6 is '6', etc.
Implementation tip: Since values are 0-15, format(value, 'x') returns the hex digit directly. Python's built-in hex() also works, but it returns a string with a '0x' prefix that you would need to strip. A custom mapping is a third option.
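The three options mentioned above produce the same digit for every value from 0 to 15. The sketch below checks that; the function names and the `HEX_DIGITS` lookup string are illustrative only.

```python
HEX_DIGITS = "0123456789abcdef"  # custom mapping: index -> hex digit

def digit_with_format(value):
    return format(value, "x")

def digit_with_hex(value):
    # hex(15) returns '0xf', so drop the '0x' prefix.
    return hex(value)[2:]

def digit_with_table(value):
    return HEX_DIGITS[value]

# All three approaches agree across the valid range.
for value in range(16):
    assert digit_with_format(value) == digit_with_hex(value) == digit_with_table(value)

print("".join(digit_with_table(v) for v in [3, 15, 6, 4]))  # 3f64
```

One difference shows up outside the valid range: format(16, 'x') returns the two-character string '10', while the lookup string raises an IndexError. Which behavior you prefer depends on whether you want out-of-range values to fail loudly.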
def to_hex_string(data):
    return ''.join(format(v, 'x') for v in data)
2. count_runs(flat_data)
This function returns the number of runs in the flat data. A run is a sequence of consecutive identical values. For example, [15,15,15,4,4,4,4,4,4] has two runs: three 15s and six 4s.
Implementation: Iterate through the list, counting changes.
def count_runs(flat_data):
    if not flat_data:
        return 0
    runs = 1
    for i in range(1, len(flat_data)):
        if flat_data[i] != flat_data[i-1]:
            runs += 1
    return runs
3. encode_rle(flat_data)
This function encodes flat data into RLE format. It returns a list where each pair is (run_length, value). For the example above, it should return [3,15,6,4].
Implementation: Traverse the flat data, count consecutive identical values, and append count then value to the result list.
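A caveat to check against your assignment handout before settling on an implementation: the hex conversions in this tutorial store each list element as a single hex digit, so a run length above 15 cannot be written as one digit. The following variant is a sketch, assuming long runs must be split at 15. The name `encode_rle_capped` and the `max_run` parameter are illustrative, not part of the assignment.

```python
def encode_rle_capped(flat_data, max_run=15):
    """Sketch: RLE-encode, starting a new run once a run reaches max_run."""
    result = []
    count = 0
    previous = None
    for value in flat_data:
        if count > 0 and value == previous and count < max_run:
            count += 1
        else:
            if count > 0:
                # Flush the finished (or full) run as count, value.
                result.extend([count, previous])
            previous = value
            count = 1
    if count > 0:
        # Flush the final run.
        result.extend([count, previous])
    return result

print(encode_rle_capped([15, 15, 15, 4, 4, 4, 4, 4, 4]))  # [3, 15, 6, 4]
print(encode_rle_capped([7] * 20))                         # [15, 7, 5, 7]
```

For inputs whose runs are all 15 or shorter, this returns the same list as an uncapped encoder, so the cap only matters for long runs.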
def encode_rle(flat_data):
    if not flat_data:
        return []
    result = []
    count = 1
    for i in range(1, len(flat_data)):
        if flat_data[i] == flat_data[i-1]:
            count += 1
        else:
            result.append(count)
            result.append(flat_data[i-1])
            count = 1
    result.append(count)
    result.append(flat_data[-1])
    return result
4. get_decoded_length(rle_data)
This function works on the RLE side rather than the flat side: given RLE data, it returns the total number of pixels after decoding, without building the decoded list. For [3,15,6,4], the decoded length is 3+6 = 9.
Implementation: Sum every other element starting from index 0.
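"Every other element" maps directly onto Python's slice step. A minimal illustration, using variable names chosen here for clarity:

```python
rle_data = [3, 15, 6, 4]

run_lengths = rle_data[0::2]  # elements at even indexes: the counts
run_values = rle_data[1::2]   # elements at odd indexes: the pixel values

print(run_lengths)       # [3, 6]
print(run_values)        # [15, 4]
print(sum(run_lengths))  # 9
```

The same pair of slices is handy whenever you need to walk RLE data as (count, value) pairs, for example with zip(run_lengths, run_values).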
def get_decoded_length(rle_data):
    return sum(rle_data[i] for i in range(0, len(rle_data), 2))
5. decode_rle(rle_data)
This function decodes RLE data back into flat data. For [3,15,6,4], it should produce [15,15,15,4,4,4,4,4,4].
Implementation: Iterate over pairs (count, value) and extend the result list with value repeated count times.
def decode_rle(rle_data):
    result = []
    for i in range(0, len(rle_data), 2):
        count = rle_data[i]
        value = rle_data[i+1]
        result.extend([value] * count)
    return result
6. string_to_data(data_string)
This function converts a hexadecimal string (like "3f64") into a list of integers. It is the inverse of to_hex_string.
Implementation: Use int(char, 16) for each character.
def string_to_data(data_string):
    return [int(ch, 16) for ch in data_string]
7. to_rle_string(rle_data)
This function converts RLE data into a human-readable string with delimiters. For each run, it displays the run length in decimal (1-2 digits) and the run value in hexadecimal (1 digit), separated by colons. For example, [15,15,6,4] becomes "15f:64". Note that the run length is 15 (decimal) and the value is 15 (f).
Implementation: Iterate over pairs, format count as decimal and value as hex, join with ':' .
def to_rle_string(rle_data):
    parts = []
    for i in range(0, len(rle_data), 2):
        count = rle_data[i]
        value = rle_data[i+1]
        parts.append(f"{count}{format(value, 'x')}")
    return ':'.join(parts)
8. string_to_rle(rle_string)
This function parses a human-readable RLE string (like "15f:64") back into RLE data list. It is the inverse of to_rle_string.
Implementation: Split by ':', then for each part, the last character is the hex value, and the preceding characters are the decimal count.
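The "last character is the value" rule works because the value is always exactly one hex digit, so whatever precedes it must be the decimal count. The sketch below walks a few tokens through that split; `split_run` is a hypothetical helper name used only for illustration.

```python
def split_run(part):
    """Hypothetical helper: split one 'count + value' token such as '15f'."""
    count_text, value_text = part[:-1], part[-1]
    return int(count_text), int(value_text, 16)

for token in "15f:64:101".split(":"):
    print(token, "->", split_run(token))
# 15f -> (15, 15)
# 64 -> (6, 4)
# 101 -> (10, 1)
```

The token "101" shows why splitting from the right matters: it is a run of ten 1s, not a run of one with value "01".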
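def string_to_rle(rle_string):
    result = []
    if not rle_string:
        # ''.split(':') yields [''], so return early for empty input
        return result
    for part in rle_string.split(':'):
        value = int(part[-1], 16)   # last character: hex run value
        count = int(part[:-1])      # everything before it: decimal run length
        result.append(count)
        result.append(value)
    return result
Building the Menu-Driven Program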
Your program should present a menu with options to load data (from file, test image, RLE string, RLE hex string, flat hex string) and display data (image, RLE string, RLE hex, flat hex). Here is a sample skeleton:
def main():
    image_data = None
    while True:
        print("1. Load file")
        print("2. Load test image")
        print("3. Read RLE string")
        print("4. Read RLE hex string")
        print("5. Read flat hex string")
        print("6. Display image")
        print("7. Display RLE string")
        print("8. Display RLE hex")
        print("9. Display flat hex")
        print("0. Exit")
        choice = input("Select a Menu Option: ")
        if choice == "0":
            break
        # handle each remaining option
For options 1-5, you will load data into image_data (as flat data). For options 6-9, you will display based on the current image_data. Note that when loading RLE data, you must first decode it to flat data for display. The ConsoleGfx class is provided by the assignment; you do not need to implement it.
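One possible way to organize the option handling is a pair of lookup tables that map each menu choice to a conversion. The sketch below covers only the string-based options (3-5 and 7-9), because the file, test image, and image display options depend on assignment-provided code. The names `LOADERS`, `DISPLAYS`, and `handle_choice` are hypothetical, and the six conversion functions are compact stand-ins so the example runs on its own.

```python
from itertools import groupby

# Compact stand-ins for the tutorial's functions so the sketch is self-contained.
def to_hex_string(data):
    return "".join(format(v, "x") for v in data)

def string_to_data(data_string):
    return [int(ch, 16) for ch in data_string]

def encode_rle(flat_data):
    result = []
    for value, group in groupby(flat_data):
        result.extend([len(list(group)), value])
    return result

def decode_rle(rle_data):
    result = []
    for count, value in zip(rle_data[0::2], rle_data[1::2]):
        result.extend([value] * count)
    return result

def to_rle_string(rle_data):
    pairs = zip(rle_data[0::2], rle_data[1::2])
    return ":".join(f"{count}{value:x}" for count, value in pairs)

def string_to_rle(rle_string):
    result = []
    for part in rle_string.split(":"):
        result.extend([int(part[:-1]), int(part[-1], 16)])
    return result

# Hypothetical dispatch tables: menu choice -> conversion to or from flat data.
LOADERS = {
    "3": lambda text: decode_rle(string_to_rle(text)),   # RLE string
    "4": lambda text: decode_rle(string_to_data(text)),  # RLE hex string
    "5": string_to_data,                                 # flat hex string
}
DISPLAYS = {
    "7": lambda flat: to_rle_string(encode_rle(flat)),   # RLE string
    "8": lambda flat: to_hex_string(encode_rle(flat)),   # RLE hex
    "9": to_hex_string,                                  # flat hex
}

def handle_choice(choice, image_data, text=""):
    """Return (new_image_data, message) for one menu selection."""
    if choice in LOADERS:
        return LOADERS[choice](text), "loaded"
    if choice in DISPLAYS:
        if image_data is None:
            return image_data, "no data loaded"
        return image_data, DISPLAYS[choice](image_data)
    return image_data, "unsupported option in this sketch"

image_data, _ = handle_choice("3", None, "3f:64")
print(image_data)                         # [15, 15, 15, 4, 4, 4, 4, 4, 4]
print(handle_choice("8", image_data)[1])  # 3f64
print(handle_choice("9", image_data)[1])  # fff444444
```

Keeping the stored data flat means every display option starts from the same representation, which matches the note above about decoding RLE input as soon as it is loaded.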
Practical Tips and Common Pitfalls
- Hex digits are case-insensitive when parsed: int(ch, 16) accepts both 'F' and 'f'. When converting to a string, always output lowercase hex to match the examples.
- Edge cases: Handle empty lists gracefully. For example, to_hex_string([]) should return an empty string.
- Testing: Use the provided test images (like "uga.gfx") to verify your functions. The ConsoleGfx.test_image is also useful.
- Performance: RLE is efficient for simple images; your functions should handle large lists without excessive memory.
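The case and empty-input tips above rest on a few built-in behaviors that are easy to confirm directly. This short check uses only standard Python and makes no assumptions about the assignment's own code:

```python
# Parsing accepts either case; formatting with 'x' always yields lowercase.
assert int("F", 16) == int("f", 16) == 15
assert format(15, "x") == "f"

# Joining an empty sequence gives an empty string, so an empty list maps to "".
assert "".join(format(v, "x") for v in []) == ""

# A list comprehension over an empty string gives an empty list.
assert [int(ch, 16) for ch in ""] == []

# Splitting an empty string still yields one (empty) token, so a parser that
# indexes part[-1] needs a guard for empty input.
assert "".split(":") == [""]

print("all checks passed")
```

The last check is the one most likely to cause a surprise: an empty RLE string does not produce an empty list of tokens.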
Connecting to Real-World Trends
RLE is not just for homework. AI image generators such as DALL-E 3 and Midjourney can produce pixel-art sprite sheets for indie games, and art with large flat-colored regions is the kind of data that run-length schemes compress well. Retro-gaming tools such as emulators can apply the same idea to data with long repeated runs. More broadly, run-length coding survives as one stage inside mainstream image formats; JPEG, for example, run-length codes runs of zero coefficients before entropy coding.
"Understanding RLE is like learning the alphabet of image compression. Once you master it, you can appreciate how data is optimized in everything from video streaming to cloud storage."
Conclusion
In this tutorial, we covered the essential functions for run-length encoding in Python as required by COP3502c Homework 3 and 4. By implementing to_hex_string, count_runs, encode_rle, get_decoded_length, decode_rle, string_to_data, to_rle_string, and string_to_rle, you have built a solid foundation for working with RLE image data. Practice with the provided test files and ensure your output matches the expected results exactly. Good luck with your assignment!