Programming lesson
Binary Trees in C++: Building a Simplified XML Parser with Tilted Trees
Learn how to implement binary trees in C++ to parse XML data using the tilted tree representation. This tutorial covers tree traversals, parsing functions, and practical coding tips for CSCI 340 students.
Introduction: Why Binary Trees and XML Matter Today
Binary trees are a fundamental data structure in computer science, and their applications go far beyond academic exercises. In 2026, with AI-driven apps and complex data interchange formats like JSON and XML everywhere, understanding how to parse hierarchical data efficiently is a must-have skill. Whether you're working on a school project for CSCI 340 or building a real-world tool that reads configuration files or web data, binary trees offer a powerful way to represent nested structures.
In this tutorial, we'll explore how to implement a simplified XML parser using binary trees with a twist: the tilted tree representation. This technique allows us to store any non-binary tree (like an XML document) inside a binary tree by redefining the left and right pointers. We'll walk through the core functions you need to write for your assignment, from basic tree traversals to XML-specific parsing logic. Let's dive in.
Understanding the Assignment: Binary Trees and XML
Your assignment asks you to implement several functions in C++ for binary trees and an XML parser. The key challenge is that XML elements can have many children, but a binary tree node only has two pointers. The solution is to use a tilted binary tree: the left child points to the first child of the node, and the right child points to the next sibling. This transforms any non-binary tree into a binary tree, making it easy to store and traverse.
For example, consider an XML snippet like:
<book>
<title>Learning C++</title>
<author>John Doe</author>
</book>In a tilted tree, the book node's left child is title, and title's right child is author. The plain text inside title is stored as a separate child node of title (left child).
Core Binary Tree Functions
You'll need to implement four standard traversals in bintree.h: inorder, preorder, postorder, and levelorder. Each takes a root pointer and a function to apply to each node's data.
Inorder Traversal
In a binary tree, inorder visits the left subtree, then the node, then the right subtree. For a tilted tree, this corresponds to visiting the node's first child (left), then the node itself, then the next sibling (right). However, for XML parsing, you'll often use preorder or levelorder to build the tree.
template <typename T>
void inorder(Node<T>* root, void (*fn)(T&)) {
if (!root) return;
inorder(root->left, fn);
fn(root->data);
inorder(root->right, fn);
}Preorder Traversal
Preorder visits the node first, then left, then right. This is useful for outputting XML with opening tags before children.
template <typename T>
void preorder(Node<T>* root, void (*fn)(T&)) {
if (!root) return;
fn(root->data);
preorder(root->left, fn);
preorder(root->right, fn);
}Postorder Traversal
Postorder visits left, right, then node. This is useful for closing tags after children.
Levelorder Traversal
Levelorder visits nodes level by level, using a queue. This is especially important for the tilted tree because it mimics the original non-binary tree's breadth-first order.
template <typename T>
void levelorder(Node<T>* root, void (*fn)(T&)) {
if (!root) return;
std::queue<Node<T>*> q;
q.push(root);
while (!q.empty()) {
Node<T>* cur = q.front(); q.pop();
fn(cur->data);
// In tilted tree, left goes to first child, right to sibling
// But levelorder should follow the original tree structure:
// For each node, enqueue its first child (left) and then all siblings via right pointers
Node<T>* child = cur->left;
while (child) {
q.push(child);
child = child->right;
}
}
}Note: The standard levelorder for a binary tree would just enqueue left then right. But for a tilted tree, we need to traverse the original hierarchy: the left child is the first child, and then we follow right pointers to get all children at the same level.
Delete Tree
Don't forget to free memory with delete_tree, which recursively deletes all nodes.
Tilted Tree Specific Functions
Your assignment also requires three functions specifically for tilted trees.
tilted_find_parent
Given a node, find its parent in the tilted tree. This is tricky because the parent is not directly stored. In a tilted tree, a node's parent is either the node whose left child points to it (if it's a first child) or the node whose right child points to it (if it's a sibling). You can traverse from the root to locate the parent by checking both left and right pointers.
tilted_get_children
Return a list of all children of a node. Since children are stored as a linked list via right pointers starting from the left child, you can iterate from node->left and follow right pointers.
tilted_levelorder
This is similar to the levelorder above but specifically for the tilted tree representation. The key is to treat the left pointer as the first child and the right pointer as the next sibling. When you pop a node, you should enqueue its first child (left) and then all siblings (right chain).
XML Parser Functions
Now let's look at the XML-specific functions in xml.cc. The provided parse_xml function reads the file and calls your handlers for tags and plain text. Your job is to build the tilted tree.
to_string
Convert an xml_element to a string for printing. If opening is true, output the opening tag with attributes; otherwise, output the closing tag.
std::string to_string(const xml_element &element, bool opening) {
if (element.type == plain) {
return element.fulltext;
} else {
if (opening) {
std::string result = "<" + element.name;
for (auto &[key, value] : element.attrs) {
result += " " + key + "=\"" + value + "\"";
}
result += ">";
return result;
} else {
return "</" + element.name + ">";
}
}
}xml_handle_tag
This is called when an opening tag is encountered. You need to create a new xml_element of type tag, set its name and attributes, and add it to the tree using xml_add_node. Also update the current node pointer.
xml_handle_plaintext
When plain text is found, create a plain element and add it as a child of the current node.
xml_handle_attributes
Parse the attribute string into the map. Attributes are key-value pairs separated by whitespace, with values in double quotes.
xml_add_node
This function adds a new node as a child of the current node. In the tilted tree, if the current node has no children (left is null), set left to the new node. Otherwise, traverse the right chain of the left child to find the last sibling and append the new node there.
void xml_add_node(xml_tree_state &state, const xml_element &elem) {
Node<xml_element>* new_node = new Node<xml_element>(elem);
if (state.cur == nullptr) {
state.root = new_node;
state.cur = new_node;
return;
}
// Add as child of current node
if (state.cur->left == nullptr) {
state.cur->left = new_node;
} else {
Node<xml_element>* sibling = state.cur->left;
while (sibling->right != nullptr) {
sibling = sibling->right;
}
sibling->right = new_node;
}
// For tags, the new node becomes the current node (to handle nesting)
if (elem.type == tag) {
state.cur = new_node;
}
}xml_close_tag
When a closing tag is encountered, verify that it matches the current node's name. Then set the closed flag to true and move the current pointer up to the parent. To find the parent, you can use tilted_find_parent or maintain a stack (but the assignment expects you to use the tilted tree functions).
xml_print_subtree
Print the subtree rooted at a given node in a human-readable format, often using indentation to show nesting.
xml_find_by_name
Search the tree for an element with a given tag name. You can use a traversal (e.g., preorder) and check each node's name.
xml_find_with_attr
Search for an element that has a specific attribute key-value pair.
Practical Tips for CSCI 340 Students
- Start with the traversals: Implement and test the four basic traversals first. Use a simple binary tree (not tilted) to verify correctness.
- Understand the tilted tree: Draw the non-binary tree and its tilted binary tree representation. Practice converting between them.
- Test edge cases: Empty tree, single node, deeply nested XML, multiple siblings, attributes with spaces, etc.
- Use the provided code: The
parse_xmlfunction andxml_elementclass are given. Don't reinvent the wheel. - Debug with print statements: Print the tree after each step to see if the structure is correct.
Real-World Analogy: Social Media Feed as a Tilted Tree
Imagine you're building a feed for a social media app like TikTok or Instagram in 2026. Each post can have comments, and comments can have replies—a classic non-binary tree. Using a tilted binary tree, you can store the feed efficiently. The left child of a post points to its first comment, and that comment's right child points to the next comment. Replies to a comment are stored as left children of that comment. This structure allows you to traverse the feed in different orders: preorder to display posts and their comments in a threaded view, or levelorder to show all top-level comments first.
Conclusion
Binary trees are a versatile tool, and the tilted tree representation is a clever way to store hierarchical data in a binary tree. By implementing the functions in this assignment, you'll gain a deep understanding of tree traversals, parsing, and data structure design. Whether you're preparing for a career in software engineering, AI, or web development, these skills are invaluable. Good luck with your CSCI 340 assignment!