Programming lesson
Mastering Serialization and Network Protocols in Distributed Systems: A Problem Set 1 Guide
Learn key concepts from 15-440/15-640 Distributed Systems problem set 1, including serialization, network delay calculation, and protocol selection with real-world analogies.
Understanding Serialization in Distributed Systems
In distributed systems, data must be transmitted between machines that may have different architectures. Serialization—converting data structures into a byte stream—is a fundamental operation. The problem set excerpt shows a serialize function for a game state. Let's break down the key components.
Memory Layout and Size Calculation
On a 64-bit system, pointers are 8 bytes, int32_t is 4 bytes, and int8_t is 1 byte. The game_t struct contains a pointer (squareval) and an integer (round). However, the board data is dynamically allocated. To serialize, we need to include the board's raw data. The total size includes the board (100*100 bytes) plus the round number (4 bytes). The missing code fragments are:
- _(1)_:
sizeof(int8_t) * 100 * 100 + sizeof(int32_t)or10000 + 4. Explanation: we need space for the board and the round. - _(2)_:
&total_sizeorsizeof(size_t)? Actually,memcpy(data_packet, total_size, ...)is incorrect; likely they meant to copy the total_size value into the packet. Probablymemcpy(data_packet, &total_size, sizeof(size_t))to store the size first. But the pseudocode saysmemcpy(data_packet, total_size, _(2)_); the second argument should be a pointer. So _(2)_ should besizeof(size_t)and the first argument should be&total_size. However, the snippet showsmemcpy(data_packet, total_size, _(2)_); it's ambiguous. In typical serialization, you'd store the total size first. Let's assume _(2)_ issizeof(size_t)and the line is corrected tomemcpy(data_packet, &total_size, sizeof(size_t)). - _(3)_:
sizeof(size_t). The offset after storing the total size. - _(4)_:
sizeof(int8_t) * 100= 100 bytes per row. - _(5)_:
currentstate->squareval + i * 100or&(currentstate->squareval[i*100]). The pointer to the start of row i.
Calculating Network Delays
Part B asks for total delay from move completion to opponent seeing the move. This includes marshalling (5 ms), one-way network latency (10 ms), transmission time (data size / bandwidth), and unmarshalling (5 ms). The board size is 10000 bytes = 80,000 bits. Bandwidth is 240 kbps = 240,000 bps. Transmission time = 80,000 / 240,000 = 0.3333 s = 333.33 ms. Total delay = 5 + 10 + 333.33 + 10 + 5 = 363.33 ms. Note: marshalling/unmarshalling times are on each side, so add both.
Efficient Move Transmission
Instead of sending the entire board, send only the changed squares. For up to 10 changed squares, we can encode each change as (row, col, new_value). A compact structure: typedef struct { int16_t row; int16_t col; int8_t new_val; } MoveChange;. Then a move packet contains a count (1 byte) followed by the changes. Average size per move: 1 + 10 * (2+2+1) = 1 + 50 = 51 bytes. For original encoding, average size is still 10000 bytes. Delay with original: 5 + 10 + (80000/240000)*1000 + 10 + 5 = 363.33 ms. With efficient: 5 + 10 + (51*8/240) + 10 + 5 = 5+10+1.7+10+5 = 31.7 ms. Much faster!
When Many Squares Change
If average changes per move is 5000, the efficient encoding size = 1 + 5000 * 5 = 25001 bytes. Original still 10000 bytes. Now original is smaller! So the best approach depends on the update pattern. Insight: use delta encoding when changes are few; otherwise, full state may be better.
Endianness and Hardware Upgrades
Part G highlights a classic issue: different CPU architectures use different byte orders (endianness). Intel uses little-endian; some non-Intel chips (e.g., PowerPC) use big-endian. If the serialized data is sent without converting to a common format (e.g., network byte order), the integers will be interpreted incorrectly. Fix: use functions like htonl/ntohl to convert to network byte order (big-endian) before transmission.
Protocol Choice: Stop-and-Go vs. Blast
Stop-and-go sends one recipe at a time, waiting for ACK. Blast sends all at once. On a high-latency link (Mars, L seconds), blast is better because it uses the bandwidth more efficiently—only one round-trip delay for the entire batch. On a low-latency link with high packet loss, stop-and-go may be better because retransmitting all recipes on a single loss is wasteful.
End-to-End Delay Pipeline
For the ChatGPT query pipeline, total delay = sum of latencies + transmission times + processing. For a 10 KB query and 5 KB response: transmission times: Wi-Fi 10KB/10Mbps = 8ms, Ethernet 10KB/100Mbps = 0.8ms, WAN 10KB/50Mbps = 1.6ms; similarly for response 5KB. Total = (10+1+500)*2 for latencies? Actually one-way: query goes through three links, response comes back. So total delay = (10+1+500) ms latency each way + transmission times + server processing 200 ms. Compute carefully.
Real-World Connections
These concepts are crucial for modern multiplayer games, cloud gaming (like GeForce Now), and real-time AI inference pipelines. Understanding serialization and protocol design helps build efficient distributed systems.