Mastering Time-Varying Channels: A Tutorial on Ee276 Homework #6 P0
Learn how to analyze time-varying discrete memoryless channels, jointly typical sequences, and feedback in BSC with this step-by-step tutorial inspired by Ee276 homework #6 p0. Perfect for students preparing for exams in information theory.
Introduction to Time-Varying Channels and Joint Typicality
In information theory, understanding how to compute channel capacity and error probabilities is crucial. This tutorial breaks down the key concepts from Ee276 homework #6 p0, focusing on time-varying channels, jointly typical sequences, and the binary symmetric channel (BSC) with feedback. Whether you're preparing for an exam or deepening your knowledge of channel coding, these examples will help solidify your understanding.
Time-Varying Discrete Memoryless Channels
Consider n uses of a channel in which the i-th use has its own crossover probability δ_i. The channel is memoryless but time-varying, meaning the conditional distribution is p(y^n|x^n) = ∏_{i=1}^{n} p_i(y_i|x_i), where each p_i is a BSC(δ_i). The goal is to show that the maximum mutual information over input distributions, max_{p(x^n)} I(X^n;Y^n), equals the sum of the per-use capacities, ∑_{i=1}^{n} C_i with C_i = 1 - H(δ_i). The capacity per channel use is therefore the average (1/n)∑_{i=1}^{n} C_i.
Proving the Capacity Formula
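Using a chain of inequalities similar to the channel coding converse proof, we can show that for any input distribution p(x^n), I(X^n;Y^n) = H(Y^n) - H(Y^n|X^n) = H(Y^n) - ∑_{i=1}^{n} H(Y_i|X_i) ≤ ∑_{i=1}^{n} H(Y_i) - ∑_{i=1}^{n} H(Y_i|X_i) = ∑_{i=1}^{n} I(X_i;Y_i) ≤ ∑_{i=1}^{n} C_i. The step H(Y^n|X^n) = ∑ H(Y_i|X_i) uses the memoryless property, the first inequality is the independence bound on entropy (conditioning reduces entropy), and the second uses the fact that for each i the maximum of I(X_i;Y_i) is C_i. Equality is achieved when the inputs are independent and each X_i is uniform, because the outputs are then independent as well.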
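The equality case can be checked numerically for a short block. The sketch below enumerates all input and output sequences for independent uniform inputs and compares I(X^n;Y^n) with the sum of the per-use capacities. The crossover values in deltas and the function names are illustrative assumptions, not part of the homework.

```python
import itertools
import math


def h2(p):
    """Binary entropy in bits."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)


def mutual_information_uniform(deltas):
    """I(X^n;Y^n) in bits for i.i.d. uniform inputs over BSC(delta_i), by enumeration."""
    n = len(deltas)
    px = 1.0 / 2 ** n                      # every input sequence is equally likely
    py = {}                                # output distribution p(y^n)
    h_y_given_x = 0.0                      # H(Y^n | X^n)
    for x in itertools.product((0, 1), repeat=n):
        for y in itertools.product((0, 1), repeat=n):
            p_y_given_x = 1.0
            for xi, yi, d in zip(x, y, deltas):
                p_y_given_x *= d if xi != yi else 1 - d
            py[y] = py.get(y, 0.0) + px * p_y_given_x
            if p_y_given_x > 0:
                h_y_given_x -= px * p_y_given_x * math.log2(p_y_given_x)
    h_y = -sum(p * math.log2(p) for p in py.values() if p > 0)
    return h_y - h_y_given_x


deltas = [0.05, 0.1, 0.25]                 # hypothetical crossover probabilities
sum_of_capacities = sum(1 - h2(d) for d in deltas)
brute_force = mutual_information_uniform(deltas)
print(f"sum of C_i      = {sum_of_capacities:.6f} bits")
print(f"I(X^n;Y^n)      = {brute_force:.6f} bits")
print(f"average per use = {sum_of_capacities / len(deltas):.6f} bits")
```

Enumeration grows as 4^n, so this is only practical for very small n; it is meant as a sanity check on the inequality chain rather than a way to compute capacity.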
Jointly Typical Sequences for a BSC
Consider a BSC with crossover probability 0.1 and uniform input distribution. The joint distribution p(x,y) is:
- p(0,0)=0.45, p(0,1)=0.05
- p(1,0)=0.05, p(1,1)=0.45
From this, we compute H(X)=1 bit, H(Y)=1 bit, H(X,Y)= -2(0.45 log 0.45 + 0.05 log 0.05) ≈ 1.469 bits, and I(X;Y)=H(X)+H(Y)-H(X,Y)=0.531 bits.
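These values can be reproduced directly from the joint table. The following is a minimal sketch, with logarithms taken base 2; the variable names are chosen here for illustration.

```python
import math

# Joint distribution p(x, y) for a BSC(0.1) with uniform input, as listed above.
joint = {(0, 0): 0.45, (0, 1): 0.05, (1, 0): 0.05, (1, 1): 0.45}


def entropy(probs):
    """Shannon entropy in bits of a collection of probabilities."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


# Marginals obtained by summing the joint table over the other variable.
px = [sum(p for (x, _), p in joint.items() if x == v) for v in (0, 1)]
py = [sum(p for (_, y), p in joint.items() if y == v) for v in (0, 1)]

H_X = entropy(px)
H_Y = entropy(py)
H_XY = entropy(joint.values())
I_XY = H_X + H_Y - H_XY

print(f"H(X)={H_X:.3f}  H(Y)={H_Y:.3f}  H(X,Y)={H_XY:.3f}  I(X;Y)={I_XY:.3f}")
```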
Typical Sequences for n=25
For n=25 and ε=0.2, a sequence x^n is typical if its empirical entropy -(1/n) log p(x^n) is within ε of H(X)=1. For Bernoulli(1/2), every sequence has probability 2^{-25} regardless of how many 1's it contains, so its empirical entropy is exactly 1 bit and every sequence is typical. The typical set A_ε^(n)(X) therefore has size exactly 2^{nH(X)} = 2^25 = 33,554,432, and the same holds for Y, which is also uniform. The table in the homework shows probabilities for k ones. A window on the number of 1's is a different, frequency-based criterion: the sequences with k in [10,15] number C(25,10)+C(25,11)+C(25,12)+C(25,13)+C(25,14)+C(25,15) = 3,268,760 + 4,457,400 + 5,200,300 + 5,200,300 + 4,457,400 + 3,268,760 = 25,852,920, about 77% of all sequences, but that restriction is not part of the entropy-based definition used here.
Jointly Typical Set Size
The jointly typical set requires both x^n and y^n to be typical and (x^n,y^n) to be jointly typical, meaning the empirical joint entropy -(1/n) log p(x^n,y^n) is within ε of H(X,Y). For the BSC with uniform input, p(x^n,y^n) = 2^{-n} p(z^n) with z^n = y^n ⊕ x^n, so this is equivalent to requiring that the error pattern z^n is typical with respect to the Bernoulli(0.1) distribution. For n=25 and ε=0.2, the typical set for Z consists of the sequences with between 1 and 4 ones. Since H(0.1)=0.469, the estimate is 2^{nH(Z)} ≈ 2^{11.725} ≈ 3400, while the actual count is 25 + 300 + 2300 + 12,650 = 15,275 for k = 1, 2, 3, 4. The jointly typical set is the set of pairs (x^n, z^n) with x^n typical and z^n typical, so its size is |A_ε^(n)(X)| × |A_ε^(n)(Z)|. Because each x^n has probability 2^{-25} under the uniform input, every x^n is typical and |A_ε^(n)(X)| = 2^25, which gives 33,554,432 × 15,275 ≈ 5.13e11 pairs.
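The count for the noise pattern can be reproduced by grouping sequences by their number of ones k, since all z^n with the same k have the same probability. This is a minimal sketch using the values n=25, ε=0.2 and crossover probability 0.1 from the text; the variable names are illustrative.

```python
import math

n, eps, p = 25, 0.2, 0.1
H_Z = -p * math.log2(p) - (1 - p) * math.log2(1 - p)   # about 0.469 bits


def empirical_entropy(k):
    """-(1/n) log2 p(z^n) for any z^n containing exactly k ones."""
    return -(k * math.log2(p) + (n - k) * math.log2(1 - p)) / n


# Values of k whose sequences fall inside the entropy window [H_Z - eps, H_Z + eps].
typical_k = [k for k in range(n + 1) if abs(empirical_entropy(k) - H_Z) <= eps]
typical_count = sum(math.comb(n, k) for k in typical_k)

for k in typical_k:
    print(f"k={k}: {math.comb(n, k):>6} sequences, empirical entropy {empirical_entropy(k):.4f}")
print("typical k values:", typical_k)
print("size of typical set for Z:", typical_count)
print("estimate 2^(n H(Z)):", round(2 ** (n * H_Z)))
```

The gap between the exact count and the 2^{nH(Z)} estimate reflects the factor 2^{nε} of slack that the typical-set bounds allow at finite n.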
Probability of Error for Joint Typical Decoding
For a fixed received sequence, say y^n = 0...0, the probability that an independently drawn random codeword X^n is jointly typical with y^n equals the probability that Z^n = y^n ⊕ X^n is typical. Since X^n is uniform, Z^n is also uniform, and the probability that Z^n is typical is |A_ε^(n)(Z)| / 2^n = 15,275 / 33,554,432 ≈ 0.000455. For a code with 512 codewords, the probability that at least one of the 511 other codewords is jointly typical with the received sequence is at most 511 × 0.000455 ≈ 0.2326 by the union bound. Because the codewords are drawn independently, the exact probability is 1 - (1 - 0.000455)^511 ≈ 0.208. The sent codeword fails to be jointly typical exactly when the actual noise pattern is atypical. That probability vanishes as n→∞, but at n=25 it is not negligible: a Bernoulli(0.1) noise pattern has between 1 and 4 ones with probability about 0.830, leaving about 0.170. The total error probability is therefore at most about 0.170 + 0.208 ≈ 0.378 by the union bound, or about 0.342 when the two independent events are combined exactly as 1 - 0.830 × (1 - 0.208).
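The arithmetic for a wrong codeword landing in the jointly typical set can be sketched as follows. The typical-set size 15,275 and the codebook size 512 are taken from the text; the variable names are illustrative.

```python
n = 25
typical_z_count = 15275          # size of the typical set for Z, from the text
num_codewords = 512

# Probability that one independently drawn uniform codeword is jointly
# typical with a fixed received sequence.
p_collision = typical_z_count / 2 ** n

# Union bound over the other codewords, and the exact value obtained by
# treating the other codewords as independent draws.
others = num_codewords - 1
union_bound = others * p_collision
exact = 1 - (1 - p_collision) ** others

print(f"single-codeword collision probability: {p_collision:.6f}")
print(f"union bound over {others} codewords:     {union_bound:.4f}")
print(f"exact (independent codewords):         {exact:.4f}")
```

The union bound is always at least as large as the exact value, and the two stay close here because the per-codeword collision probability is small.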
BSC with Feedback
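When feedback is used, the strategy X_1 ~ Bern(1/2), X_2 = Y_1, ..., X_n = Y_{n-1} simply echoes the previous channel output into the channel, so after the first use the input carries no new information from the sender. Writing Y_i = X_i ⊕ Z_i with Z_i ~ Bern(p), we get Y_i = Y_{i-1} ⊕ Z_i, hence H(Y^n) = 1 + (n-1)H(p) and H(Y^n|X^n) = H(Y_n|Y_{n-1}) = H(p), so I(X^n;Y^n) = 1 + (n-2)H(p). The mutual information per channel use therefore approaches H(p) as n→∞, which exceeds the capacity 1 - H(p) whenever H(p) > 1/2. This does not contradict the converse: with feedback, X^n depends on past outputs, so I(X^n;Y^n) no longer bounds the achievable rate, and this particular scheme conveys no message at all. Feedback does not increase the capacity of a discrete memoryless channel, which remains 1 - H(p) for the BSC.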
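One way to check the finite-n value of I(X^n;Y^n) for the strategy X_i = Y_{i-1} is brute-force enumeration over the first input bit and the noise pattern. The sketch below assumes the channel output is Y_i = X_i ⊕ Z_i with Z_i i.i.d. Bernoulli(p); the function name is hypothetical.

```python
import itertools
import math
from collections import defaultdict


def feedback_mutual_information(n, p):
    """I(X^n;Y^n) in bits for X_1 ~ Bern(1/2), X_i = Y_{i-1}, over a BSC(p)."""
    joint = defaultdict(float)             # p(x^n, y^n)
    for x1 in (0, 1):
        for z in itertools.product((0, 1), repeat=n):
            prob = 0.5                     # probability of the first input bit
            for zi in z:
                prob *= p if zi else 1 - p
            xs, ys = [], []
            x = x1
            for zi in z:
                y = x ^ zi                 # channel flips the input when zi = 1
                xs.append(x)
                ys.append(y)
                x = y                      # feedback: next input is this output
            joint[(tuple(xs), tuple(ys))] += prob

    px, py = defaultdict(float), defaultdict(float)
    for (xs, ys), pr in joint.items():
        px[xs] += pr
        py[ys] += pr

    return sum(
        pr * math.log2(pr / (px[xs] * py[ys]))
        for (xs, ys), pr in joint.items()
        if pr > 0
    )


for n in (1, 2, 4, 8):
    value = feedback_mutual_information(n, 0.1)
    print(f"n={n}: I = {value:.4f} bits, per use = {value / n:.4f}")
```

Running it for a few values of n and p gives the finite-n values directly, which can then be compared against a closed-form expression and against the capacity 1 - H(p).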
Fano's Inequality Without Conditioning
Given a random variable X with probabilities p_1 ≥ p_2 ≥ ... ≥ p_m, the best guess without any observation is the most likely value X = 1, so the minimum error probability is P_e = 1 - p_1. To maximize H(X) subject to fixed P_e, we use the fact that the conditional distribution of X given that we make an error (i.e., X ≠ 1) is supported on m-1 values and so has entropy at most log(m-1). By the grouping (chain) rule, H(X) = H(P_e) + P_e H(X | X ≠ 1) ≤ H(P_e) + P_e log(m-1), where H(P_e) is the binary entropy of P_e, with equality when p_2 = ... = p_m = P_e/(m-1). This is Fano's inequality: H(X) ≤ H(P_e) + P_e log(m-1). Bounding H(P_e) ≤ 1 gives the weaker but explicit form P_e ≥ (H(X) - 1)/log(m-1), valid for m > 2.
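The bound and its equality case can be checked numerically. In the sketch below the two example distributions are made up for illustration, and fano_bound is a hypothetical helper name.

```python
import math


def entropy(probs):
    """Shannon entropy in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)


def fano_bound(probs):
    """Return (H(X), H(Pe) + Pe*log2(m-1)) with Pe = 1 - max(probs)."""
    m = len(probs)
    pe = 1 - max(probs)
    bound = entropy([pe, 1 - pe]) + pe * math.log2(m - 1)
    return entropy(probs), bound


# A generic distribution: the bound holds with slack.
h_generic, bound_generic = fano_bound([0.5, 0.2, 0.2, 0.1])

# Error mass spread uniformly over the other m-1 symbols: the bound is tight.
h_tight, bound_tight = fano_bound([0.7, 0.1, 0.1, 0.1])

print(f"generic: H(X) = {h_generic:.4f} <= {bound_generic:.4f}")
print(f"uniform error mass: H(X) = {h_tight:.4f}, bound = {bound_tight:.4f}")
```

The second example shows why the bound cannot be improved in general: spreading the error probability evenly over the remaining m-1 symbols attains it.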
Real-World Analogy: AI Chatbot Error Rates
Just as a BSC has crossover probability, an AI chatbot may have a probability of misunderstanding user intent. Understanding joint typicality helps in designing error-correcting codes for reliable communication, similar to how redundancy in prompts can improve chatbot accuracy. The concept of feedback in BSC mirrors how a chatbot can ask clarifying questions to reduce errors.
Conclusion
This tutorial covered time-varying channels, jointly typical sequences, BSC with feedback, and Fano's inequality. These concepts are fundamental for understanding channel capacity and error correction in modern communication systems, from 5G networks to deep space communications.