Assignment Chef icon Assignment Chef
All English tutorials

Programming lesson

Building a Binary Tree Decoder in Java: Reconstruct Archived Messages Like a Pro

Learn how to implement a binary tree-based decoder in Java to reconstruct archived messages. This step-by-step tutorial covers tree construction, preorder traversal, decoding algorithms, and performance statistics, using a real-world assignment example from COM S 2280.

binary tree decoder Java archived message reconstruction COM S 2280 project 4 MsgTree class preorder traversal encoding decode bit string Java space savings calculation iterative tree construction Java data structures tutorial binary tree compression prefix code decoding Java recursive tree building decode archived messages arch file format binary tree statistics Java programming assignment help

Introduction: Why Binary Tree Decoding Matters in 2026

In the age of AI-powered compression and real-time data streaming, understanding how to decode archived messages using binary trees is a fundamental skill for any computer science student. Whether you're working on a COM S 2280 project or just exploring data structures, this tutorial will guide you through reconstructing messages from a binary-tree-based algorithm. We'll use Java to build a MsgTree class, parse encoding strings, and decode bit sequences — all while keeping performance in mind.

Think of this like unzipping a file, but instead of a generic algorithm, you're implementing a custom tree-based decoder. It's the same concept behind how modern apps like WhatsApp compress messages or how AI models tokenize text. By the end, you'll be able to decode any .arch file and even calculate space savings.

Understanding the Encoding Scheme

The archival algorithm uses a binary tree where each leaf node holds a character. Edges to left children represent 0, and edges to right children represent 1. The path from root to leaf gives the character's binary code. For example, with the tree encoding ^a^^!^dc^rb, the character 'a' is encoded as 0, '!' as 100, and 'c' as 1011. This is a prefix code — no code is a prefix of another, which makes decoding unambiguous.

The encoding string is a preorder traversal of the tree: ^ denotes an internal node, and any other character is a leaf. If the string spans two lines (due to newline or space characters), you must concatenate them properly.

Building the MsgTree Class

Start by creating a MsgTree class with three fields: payloadChar (the character at this node), left, and right child references. Include a static integer staticCharIdx to track the current position in the encoding string during recursive construction.

public class MsgTree {
    public char payloadChar;
    public MsgTree left;
    public MsgTree right;
    private static int staticCharIdx = 0;

    public MsgTree(String encodingString) {
        // Constructor that builds the tree from the encoding string
        if (encodingString.charAt(staticCharIdx) == '^') {
            payloadChar = '^'; // internal node
            staticCharIdx++;
            left = new MsgTree(encodingString);
            right = new MsgTree(encodingString);
        } else {
            payloadChar = encodingString.charAt(staticCharIdx);
            staticCharIdx++;
            left = null;
            right = null;
        }
    }

    public MsgTree(char payloadChar) {
        this.payloadChar = payloadChar;
        this.left = null;
        this.right = null;
    }
}

Notice the recursive constructor: if the current character is ^, we create an internal node and recursively build left and right subtrees. Otherwise, it's a leaf. This mirrors the preorder traversal of the tree.

Printing Character Codes with printCodes()

The printCodes method performs a preorder traversal and prints each leaf character along with its binary code. Use a StringBuilder to accumulate the path.

public static void printCodes(MsgTree root, String code) {
    if (root == null) return;
    if (root.left == null && root.right == null) {
        System.out.println(root.payloadChar + " " + code);
    } else {
        printCodes(root.left, code + "0");
        printCodes(root.right, code + "1");
    }
}

This method is essential for verifying your tree structure. In the main program, you'll call it after building the tree and print a header like "character code".

Decoding the Message

The decode method takes the tree and the bit string (as a String of '0's and '1's). It walks the tree from the root, following bits until it reaches a leaf, then prints the leaf's character and resets to the root.

public void decode(MsgTree codes, String msg) {
    MsgTree current = codes;
    for (int i = 0; i < msg.length(); i++) {
        char bit = msg.charAt(i);
        if (bit == '0') {
            current = current.left;
        } else {
            current = current.right;
        }
        if (current.left == null && current.right == null) {
            System.out.print(current.payloadChar);
            current = codes;
        }
    }
    System.out.println();
}

For example, with the encoding ^a^^!^dc^rb and message 10110101011101101010100, the output is cadbard!. This is exactly how your program will reconstruct the original text.

Handling Multi-line Encoding Files

Some archive files may have the encoding string split across two lines (if it contains a newline character). To handle this, read all lines into a list. If the file has three lines, the first two are the encoding (concatenate them), and the third is the message. If two lines, the first is encoding and second is message.

List<String> lines = Files.readAllLines(Paths.get(args[0]));
String encoding;
String message;
if (lines.size() == 3) {
    encoding = lines.get(0) + lines.get(1);
    message = lines.get(2);
} else {
    encoding = lines.get(0);
    message = lines.get(1);
}

Calculating Statistics (Extra Credit)

For the 5% bonus, print statistics after decoding: average bits per character, total characters, and space savings. Space savings assumes uncompressed characters use 16 bits each.

int totalChars = decodedMessage.length();
int totalBits = 0; // sum over each character of its code length
// You need to compute code lengths during printCodes or store them in a map.
// Then:
double avgBits = (double) totalBits / totalChars;
double savings = (1 - (double) totalBits / (totalChars * 16)) * 100;
System.out.println("STATISTICS:");
System.out.println("Avg bits/char: " + avgBits);
System.out.println("Total characters: " + totalChars);
System.out.println("Space savings: " + String.format("%.1f%%", savings));

Iterative Solution (15% Bonus)

If you're up for a challenge, implement an iterative tree builder using a stack. This avoids recursion and is more efficient. The idea: push nodes onto a stack as you parse the encoding string, and when you encounter a leaf, pop to build the tree bottom-up. This is significantly harder but demonstrates advanced understanding of tree construction.

Testing with Sample Files

Download the test files from the assignment (e.g., monalisa.arch). Run your program with the filename as a command-line argument. Verify that the output matches expected decoded messages. For example, the provided cadbard.arch should output:

character code
c 1011
r 110
b 111
a 0
! 100
d 1010

MESSAGE: cadbard!

Common Pitfalls and Tips

  • Static index: Reset staticCharIdx to 0 before building a new tree.
  • Newline in encoding: Ensure you concatenate lines correctly; a newline character is a valid leaf payload.
  • Decoding reset: After printing a leaf, always reset to the root for the next character.
  • Space savings: Use 16 bits per uncompressed character; compute compressed bits as sum of code lengths for each character in the decoded message.

Real-World Connections: From Gaming to AI

Binary tree decoding isn't just for homework. In 2026, it's used in:

  • Game streaming: Services like Xbox Cloud Gaming use prefix codes to compress controller inputs.
  • AI tokenizers: Models like GPT-4 use Byte-Pair Encoding, a form of tree-based compression.
  • Messaging apps: WhatsApp's message compression relies on Huffman coding, which is a binary tree algorithm.
  • Financial data: High-frequency trading systems use custom binary trees to encode market data.

By mastering this project, you're building skills directly applicable to cutting-edge technology.

Conclusion

You've now built a complete binary tree decoder in Java. You learned to parse encoding strings, construct a tree recursively, print character codes, decode messages, and compute statistics. This project from COM S 2280 Spring 2025 is a fantastic way to solidify your understanding of trees and recursion. Whether you pursue the iterative bonus or stick with recursion, you're now equipped to handle archived message reconstruction like a pro.

Remember to package your classes in edu.iastate.cs2280.hw4 and include the @author Javadoc tag. Good luck!