Problem 13
Question
Prove that if \(X\) can take on any of \(n\) possible values with respective probabilities \(P_{1}, \ldots, P_{n},\) then \(H(X)\) is maximized when \(P_{i}=1 / n, i=1, \ldots, n .\) What is \(H(X)\) equal to in this case?
Step-by-Step Solution
Verified Answer
In summary, the entropy \(H(X)\) of a discrete random variable \(X\) is maximized when all probabilities are equal to \(1/n\), i.e., \(P_i = 1/n\) for all \(i\). In this case, the maximum entropy is \(-\log\frac{1}{n}\). We proved this using the method of Lagrange multipliers for constrained optimization.
1Step 1: Write down the formula for entropy
The entropy \(H(X)\) of a discrete random variable \(X\) is given by:
\[H(X) = -\sum_{i=1}^n P_i \log P_i\]
Our goal is to maximize this function subject to the constraint that the probabilities sum to 1, i.e.,
\[\sum_{i=1}^n P_i = 1\]
2Step 2: Set up the Lagrangian function
In order to solve this constrained optimization problem, we will use the method of Lagrange multipliers. Let's introduce a Lagrange multiplier \(\lambda\) and form the Lagrangian function:
\[L(P_1,\ldots,P_n,\lambda) = -\sum_{i=1}^n P_i \log P_i - \lambda \left(\sum_{i=1}^n P_i - 1\right)\]
Now to find the optimal probability values, we will take partial derivatives with respect to each \(P_i\) and \(\lambda\), and set them to zero.
3Step 3: Find partial derivatives and set them to zero
Let's find the partial derivative with respect to \(P_i\) and set it to zero:
\[\frac{\partial L}{\partial P_i} = -\log P_i - 1 - \lambda = 0\]
\[\log P_i + 1 + \lambda = 0\]
\[\log P_i = -1 - \lambda\]
Now take the partial derivative with respect to \(\lambda\) and set it to zero:
\[\frac{\partial L}{\partial \lambda} = -\left(\sum_{i=1}^n P_i - 1\right) = 0\]
\[\sum_{i=1}^n P_i = 1\]
4Step 4: Solve the system of equations
We have obtained a system of equations: one equation for each \(P_i\):
\[\log P_i = -1 - \lambda, \qquad i = 1,\ldots, n\]
and one equation for the constraint:
\[\sum_{i=1}^n P_i = 1\]
Let's solve these equations:
From the equation \(\log P_i = -1 - \lambda\), we can get:
\[P_i = e^{-1 - \lambda}\]
By plugging this into the constraint equation, we get:
\[\sum_{i=1}^n e^{-1 - \lambda} = 1\]
\[ne^{-1 - \lambda} = 1\]
\[\lambda = -1 - \log \frac {1} {n}\]
Plug this back into the equation for \(P_i\):
\[P_i = e^{-1 - \lambda} = \frac{1}{n}\]
This shows that the entropy is maximized when all probabilities are equal to \(1/n\).
5Step 5: Compute the entropy for uniform probabilities
Now let's find the value of \(H(X)\) when all probabilities are equal to \(1/n\). Plugging the values back into the entropy formula:
\[H(X) = -\sum_{i=1}^n P_i \log P_i = -\sum_{i=1}^n \frac{1}{n} \log \frac{1}{n} = -\log\frac{1}{n}\]
Thus, the maximum entropy \(H(X)\) is \(-\log\frac{1}{n}\) when all probabilities are equal to \(1/n\).
Other exercises in this chapter
Problem 7
A transition probability matrix is said to be doubly stochastic if $$ \sum_{i=0}^{M} P_{i j}=1 $$ for all states \(j=0,1, \ldots, M .\) Show that such a Markov
View solution Problem 9
Suppose that whether it rains tomorrow depends on past weather conditions only through the last 2 days. Specifically, suppose that if it has rained yesterday an
View solution Problem 14
A pair of fair dice is rolled. Let $$ X=\left\\{\begin{array}{ll} 1 & \text { if the sum of the dice is } 6 \\ 0 & \text { otherwise } \end{array}\right. $$ and
View solution Problem 15
A coin having probability \(p=\frac{2}{3}\) of coming up heads is flipped 6 times. Compute the entropy of the outcome of this experiment.
View solution