Problem 23

Question

Let \(\mathcal{B}\) be an event with \(\mathrm{P}[\mathcal{B}] \neq 0,\) and let \(\left\{\mathcal{B}_{i}\right\}_{i \in I}\) be a finite, pairwise disjoint family of events whose union is \(\mathcal{B}\). Generalizing the law of total expectation \((8.24),\) show that for every real-valued random variable \(X,\) if \(I^{*}:=\left\{i \in I: \mathrm{P}\left[\mathcal{B}_{i}\right] \neq 0\right\},\) then we have $$ \mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}]=\sum_{i \in I^{*}} \mathrm{E}\left[X \mid \mathcal{B}_{i}\right] \mathrm{P}\left[\mathcal{B}_{i}\right] $$ Also show that if \(\mathrm{E}\left[X \mid \mathcal{B}_{i}\right] \leq \alpha\) for each \(i \in I^{*}\), then \(\mathrm{E}[X \mid \mathcal{B}] \leq \alpha\).

Step-by-Step Solution

Answer
Question: Prove that if $\mathrm{E}[X \mid \mathcal{B}_i] \leq \alpha$ for every $i \in I^*$, where $I^* = \{i \in I: \mathrm{P}[\mathcal{B}_i] \neq 0\}$, then $\mathrm{E}[X \mid \mathcal{B}] \leq \alpha$. Solution: By the first part of the problem, $\mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}] = \sum_{i \in I^*} \mathrm{E}[X \mid \mathcal{B}_i] \mathrm{P}[\mathcal{B}_i]$. Since $\mathrm{E}[X \mid \mathcal{B}_i] \leq \alpha$ and $\mathrm{P}[\mathcal{B}_i] > 0$ for every $i \in I^*$, we have $\sum_{i \in I^*} \mathrm{E}[X \mid \mathcal{B}_i] \mathrm{P}[\mathcal{B}_i] \leq \alpha \sum_{i \in I^*} \mathrm{P}[\mathcal{B}_i]$. The events $\mathcal{B}_i$ are pairwise disjoint with union $\mathcal{B}$, and those with $i \notin I^*$ have probability zero, so $\sum_{i \in I^*} \mathrm{P}[\mathcal{B}_i] = \mathrm{P}[\mathcal{B}]$. Hence $\mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}] \leq \alpha \mathrm{P}[\mathcal{B}]$, and dividing by $\mathrm{P}[\mathcal{B}] > 0$ gives $\mathrm{E}[X \mid \mathcal{B}] \leq \alpha$.
Step 1: Define Conditional Expectation and the Law of Total Expectation
Recall that the conditional expectation of a random variable \(X\) given an event \(\mathcal{B}\) with \(\mathrm{P}[\mathcal{B}] \neq 0\) is the expected value of \(X\) given that \(\mathcal{B}\) has occurred. It is defined as: $$ \mathrm{E}[X \mid \mathcal{B}] = \frac{\mathrm{E}[X \mathbb{I}\{\mathcal{B}\}]}{\mathrm{P}[\mathcal{B}]} $$ where \(\mathbb{I}\{\mathcal{B}\}\) is the indicator function of the event \(\mathcal{B}\). The law of total expectation states that for a random variable \(X\) and a finite family of pairwise disjoint events \(\mathcal{B}_i\) whose union is the whole sample space, each with \(\mathrm{P}[\mathcal{B}_i] \neq 0\): $$ \mathrm{E}[X] = \sum_i \mathrm{E}[X \mid \mathcal{B}_i]\mathrm{P}[\mathcal{B}_i] $$
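To make the definition concrete, here is a minimal Python sketch that evaluates \(\mathrm{E}[X \mid \mathcal{B}]\) through the indicator formula. The fair-die sample space, the name `cond_expectation`, and the choice of \(X\) as the face value are illustrative assumptions, not part of the original problem.

```python
from fractions import Fraction

# Assumed example space: a fair six-sided die, outcomes 1..6.
prob = {w: Fraction(1, 6) for w in range(1, 7)}

def X(w):
    # Illustrative random variable: the face value of the die.
    return w

def cond_expectation(X, event, prob):
    """Return E[X | event] = E[X * I{event}] / P[event], for P[event] != 0."""
    p_event = sum(prob[w] for w in event)
    if p_event == 0:
        raise ValueError("E[X | event] is undefined when P[event] = 0")
    # Summing only over outcomes in the event is the same as multiplying
    # X by the indicator I{event} and summing over the whole space.
    e_x_indicator = sum(X(w) * prob[w] for w in event)
    return e_x_indicator / p_event

evens = {2, 4, 6}
result = cond_expectation(X, evens, prob)
print(result)  # 4
```

Here \(\mathrm{E}[X \mathbb{I}\{\mathcal{B}\}] = (2+4+6)/6 = 2\) and \(\mathrm{P}[\mathcal{B}] = 1/2\), so the quotient is 4.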
Step 2: Rewrite Conditional Expectation using Indicator Functions
We start by multiplying both sides of the definition of \(\mathrm{E}[X \mid \mathcal{B}]\) by \(\mathrm{P}[\mathcal{B}]\), which is non-zero by assumption: $$ \mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}] = \frac{\mathrm{E}[X \mathbb{I}\{\mathcal{B}\}]}{\mathrm{P}[\mathcal{B}]}\mathrm{P}[\mathcal{B}] $$ Simplifying, we have: $$ \mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}] = \mathrm{E}[X \mathbb{I}\{\mathcal{B}\}] $$
Step 3: Apply the Law of Total Expectation
The law of total expectation cannot be quoted directly here: it requires a partition of the whole sample space, whereas the family \(\left\{\mathcal{B}_{i}\right\}_{i \in I}\) only partitions \(\mathcal{B}\), and some \(\mathcal{B}_i\) may have probability zero, in which case \(\mathrm{E}[X \mid \mathcal{B}_i]\) is undefined. We therefore apply the argument behind the law to \(X \mathbb{I}\{\mathcal{B}\}\). Since the \(\mathcal{B}_i\) are pairwise disjoint with union \(\mathcal{B}\), every outcome in \(\mathcal{B}\) lies in exactly one \(\mathcal{B}_i\), so: $$ \mathbb{I}\{\mathcal{B}\} = \sum_{i \in I} \mathbb{I}\{\mathcal{B}_i\} $$ Multiplying by \(X\) and using linearity of expectation (the sum is finite), we have: $$ \mathrm{E}[X \mathbb{I}\{\mathcal{B}\}] = \sum_{i \in I} \mathrm{E}\left[X \mathbb{I}\{\mathcal{B}_i\}\right] $$ For each \(i \in I\) with \(\mathrm{P}[\mathcal{B}_i] = 0\), the variable \(X \mathbb{I}\{\mathcal{B}_i\}\) is zero with probability 1, so \(\mathrm{E}\left[X \mathbb{I}\{\mathcal{B}_i\}\right] = 0\) and the term drops out of the sum. For each \(i \in I^*\), the definition of conditional expectation gives: $$ \mathrm{E}\left[X \mathbb{I}\{\mathcal{B}_i\}\right] = \mathrm{E}\left[X \mid \mathcal{B}_i\right] \mathrm{P}[\mathcal{B}_i] $$ So, $$ \mathrm{E}[X \mathbb{I}\{\mathcal{B}\}] = \sum_{i \in I^*} \mathrm{E}\left[X \mid \mathcal{B}_i\right] \mathrm{P}[\mathcal{B}_i] $$ Combining this with the identity \(\mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}] = \mathrm{E}[X \mathbb{I}\{\mathcal{B}\}]\) from Step 2 proves the first part: $$ \mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}] =\sum_{i \in I^*} \mathrm{E}\left[X \mid \mathcal{B}_i\right] \mathrm{P}[\mathcal{B}_i] $$
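The identity can be checked numerically on a small example. The sketch below uses a made-up five-outcome space in which one block has probability zero, so that \(I^*\) is strictly smaller than \(I\). The probabilities, the block names `B1`, `B2`, `B3`, and the variable \(X(\omega) = \omega^2\) are assumptions chosen purely for illustration.

```python
from fractions import Fraction

# Assumed example space: outcomes 0..4; outcome 4 has probability zero.
prob = {0: Fraction(1, 4), 1: Fraction(1, 4), 2: Fraction(1, 8),
        3: Fraction(3, 8), 4: Fraction(0)}

def X(w):
    return w * w  # illustrative random variable

def P(event):
    return sum(prob[w] for w in event)

def E_on(event):
    # E[X * I{event}]: sum X over the outcomes of the event only.
    return sum(X(w) * prob[w] for w in event)

def E_given(event):
    # E[X | event]; only call this when P(event) != 0.
    return E_on(event) / P(event)

# Pairwise disjoint blocks whose union is B; B3 has probability zero.
blocks = {"B1": {1}, "B2": {2, 3}, "B3": {4}}
B = set().union(*blocks.values())

# I* keeps only the indices of blocks with non-zero probability.
I_star = [i for i, Bi in blocks.items() if P(Bi) != 0]

lhs = E_given(B) * P(B)
rhs = sum(E_given(blocks[i]) * P(blocks[i]) for i in I_star)
print(lhs, rhs)  # 33/8 33/8
```

Exact `Fraction` arithmetic is used so that the two sides can be compared with `==` without floating-point tolerance.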
Step 4: Prove the inequality
We're given the following inequality: $$ \mathrm{E}\left[X \mid \mathcal{B}_i\right] \leq \alpha, \text{ for every } i \in I^* $$ Multiplying both sides by \(\mathrm{P}[\mathcal{B}_i]\) (which is positive for \(i \in I^*\)), we get: $$ \mathrm{E}\left[X \mid \mathcal{B}_i\right] \mathrm{P}[\mathcal{B}_i] \leq \alpha \mathrm{P}[\mathcal{B}_i], \text{ for every } i \in I^* $$ Now summing over all \(i \in I^*\), $$ \sum_{i \in I^*} \mathrm{E}\left[X \mid \mathcal{B}_i\right] \mathrm{P}[\mathcal{B}_i] \leq \alpha \sum_{i \in I^*} \mathrm{P}[\mathcal{B}_i] $$ By the result of the first part, the left-hand side equals \(\mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}]\), so: $$ \mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}] \leq \alpha \sum_{i \in I^*} \mathrm{P}[\mathcal{B}_i] $$ Because the \(\mathcal{B}_i\) are pairwise disjoint with union \(\mathcal{B}\), additivity gives \(\mathrm{P}[\mathcal{B}] = \sum_{i \in I} \mathrm{P}[\mathcal{B}_i]\), and the terms with \(i \notin I^*\) are zero, so \(\sum_{i \in I^*} \mathrm{P}[\mathcal{B}_i] = \mathrm{P}[\mathcal{B}]\). Dividing both sides by \(\mathrm{P}[\mathcal{B}] > 0\) gives: $$ \mathrm{E}[X \mid \mathcal{B}] \leq \alpha $$ Thus proving the inequality.
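As a sanity check on the bound (not a substitute for the proof), the following sketch draws random finite spaces and random disjoint families, takes \(\alpha\) to be the largest \(\mathrm{E}[X \mid \mathcal{B}_i]\) over \(i \in I^*\), and confirms that \(\mathrm{E}[X \mid \mathcal{B}]\) never exceeds it. The function name `check_bound`, the value ranges, and the seed are arbitrary assumptions.

```python
import random
from fractions import Fraction

def check_bound(trials=200, seed=0):
    """Return True if E[X | B] <= max_i E[X | B_i] held in every random trial."""
    rng = random.Random(seed)
    for _ in range(trials):
        n = rng.randint(2, 8)
        # Integer weights (zeros allowed) normalized into exact probabilities.
        weights = [rng.randint(0, 5) for _ in range(n)]
        weights[0] += 1  # guarantees the total weight is positive
        total = sum(weights)
        prob = [Fraction(w, total) for w in weights]
        x = [rng.randint(-10, 10) for _ in range(n)]  # values of X

        # Label 0 means "outside B"; labels 1..3 name the disjoint blocks.
        blocks = {}
        for w in range(n):
            label = rng.randint(0, 3)
            if label != 0:
                blocks.setdefault(label, []).append(w)

        def p(event):
            return sum(prob[w] for w in event)

        def e_given(event):
            return sum(x[w] * prob[w] for w in event) / p(event)

        B = [w for event in blocks.values() for w in event]
        if p(B) == 0:
            continue  # the hypothesis P[B] != 0 fails; skip this trial

        # alpha is the tightest bound satisfied by every block in I*.
        alpha = max(e_given(ev) for ev in blocks.values() if p(ev) != 0)
        if e_given(B) > alpha:
            return False
    return True

print(check_bound())
```

Choosing \(\alpha\) as the maximum is the sharpest test of the statement: any larger \(\alpha\) satisfies the hypothesis too, and the conclusion for it follows from this case.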

Key Concepts

Law of Total Expectation
The Law of Total Expectation is a fundamental concept used in probability theory. It allows us to find the expected value of a random variable by considering it in the context of multiple possible scenarios. Imagine you have a random variable, say \(X\), and this variable can be understood through different events \(\{\mathcal{B}_i\}_{i \in I}\) occurring in your probability space. Each \(\mathcal{B}_i\) might have different probabilities and conditions impacting \(X\).

Here's how the Law of Total Expectation works:
  • It states that the expected value of \(X\) can be seen as a combination of expectations calculated for each "scenario" or sub-event \(\mathcal{B}_i\).
  • Mathematically, when the events \(\mathcal{B}_i\) are pairwise disjoint, together cover the whole sample space, and each have non-zero probability, it reads: \[\mathrm{E}[X] = \sum_i \mathrm{E}[X \mid \mathcal{B}_i]\mathrm{P}[\mathcal{B}_i]\]
  • In simple terms, you weight the expectation of \(X\) given each \(\mathcal{B}_i\) by the probability that \(\mathcal{B}_i\) happens.

This principle helps in simplifying and solving complex probability problems by breaking them down into simpler cases that collectively explain the behavior of the whole system.
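As a small illustration, the sketch below compares the direct expectation of a fair die roll with the weighted sum of conditional expectations over a partition of the outcomes. The die and the particular three-part partition are assumptions made for the example.

```python
from fractions import Fraction

# Assumed example: a fair die, with X equal to the face value.
prob = {w: Fraction(1, 6) for w in range(1, 7)}

# A partition of the whole sample space into "low", "middle", and "high".
partition = [{1, 2}, {3, 4, 5}, {6}]

def p(event):
    return sum(prob[w] for w in event)

def e_given(event):
    # E[X | event] for an event of non-zero probability.
    return sum(w * prob[w] for w in event) / p(event)

direct = sum(w * prob[w] for w in prob)                          # E[X]
weighted = sum(e_given(part) * p(part) for part in partition)   # sum of E[X | B_i] P[B_i]
print(direct, weighted)  # 7/2 7/2
```

The three weighted terms are \(\tfrac{3}{2}\cdot\tfrac{1}{3}\), \(4\cdot\tfrac{1}{2}\), and \(6\cdot\tfrac{1}{6}\), which add up to \(\tfrac{7}{2}\).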
Random Variable
A random variable is a cornerstone of probability and statistics. It is a way to assign numeric values to the outcomes of a random process. Let's break it down for a clearer understanding.

  • **Definitions**: Random variables map outcomes of a random process to numerical values, making mathematical manipulation possible. They can be discrete (having countable values like dice rolls) or continuous (with any value in a range, like measuring height).
  • **Expectations**: The expected value of a random variable represents its average or mean over many trials. For a discrete random variable \(X\) taking the values \(x_i\), this is calculated as: \[ \mathrm{E}[X] = \sum_i x_i \mathrm{P}[X = x_i] \]

Random variables simplify the study of randomness by allowing average behaviors and trends to be analyzed with defined mathematical tools. They are essential for modeling real-world situations in fields like finance, science, and engineering.
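For a concrete case of the expectation formula, this sketch builds the distribution of the sum of two fair dice and evaluates \(\sum_i x_i \mathrm{P}[X = x_i]\). The two-dice setup and the name `pmf` are illustrative assumptions.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely rolls give each sum.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))

# pmf[x] is P[X = x], where X is the sum of the two dice.
pmf = {x: Fraction(c, 36) for x, c in counts.items()}

expected = sum(x * px for x, px in pmf.items())
print(expected)  # 7
```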
Indicator Function
The indicator function is a simple yet powerful tool in probability, acting like a switch that is on when an event occurs and off when it does not.

  • **Basics**: An indicator function for an event \(\mathcal{B}\), denoted as \(\mathbb{I}\{ \mathcal{B} \}\), is a function that takes the value 1 if the event \(\mathcal{B}\) occurs and 0 if it does not.
  • **Uses**: It helps isolate and study specific events within a broader analysis. For example, the conditional expectation formula \( \mathrm{E}[X \mid \mathcal{B}] = \frac{\mathrm{E}[X \mathbb{I}\{\mathcal{B}\}]}{\mathrm{P}[\mathcal{B}]} \) uses indicator functions to focus only on scenarios where \(\mathcal{B}\) is true.

Indicator functions simplify the handling of cases or scenarios, letting us focus on computations directly related to the occurrence or non-occurrence of key events in a probability space.
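Two properties of indicators carry most of the weight in the solution: \(\mathrm{E}[\mathbb{I}\{\mathcal{B}\}] = \mathrm{P}[\mathcal{B}]\), and the indicator of a disjoint union is the sum of the indicators of its parts. The sketch below checks both on a fair die; the events `B1` and `B2` and the helper `indicator` are assumptions made for illustration.

```python
from fractions import Fraction

# Assumed example space: a fair six-sided die.
prob = {w: Fraction(1, 6) for w in range(1, 7)}

def indicator(event):
    # Returns the function I{event}: 1 on outcomes in the event, 0 elsewhere.
    return lambda w: 1 if w in event else 0

B1, B2 = {1, 2}, {5}   # disjoint events
B = B1 | B2            # their union

I_B, I_B1, I_B2 = indicator(B), indicator(B1), indicator(B2)

# For disjoint events, I{B} = I{B1} + I{B2} at every outcome.
additive = all(I_B(w) == I_B1(w) + I_B2(w) for w in prob)

# The expectation of an indicator is the probability of its event.
e_indicator = sum(I_B(w) * prob[w] for w in prob)
print(additive, e_indicator)  # True 1/2
```

If `B1` and `B2` overlapped, the sum of their indicators would be 2 on the overlap and the additivity check would fail, which is why pairwise disjointness is needed.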