Problem 23

Question

Let $\mathcal{B}$ be an event with $\mathrm{P}[\mathcal{B}] \neq 0$, and let $\left\{\mathcal{B}_{i}\right\}_{i \in I}$ be a finite, pairwise disjoint family of events whose union is $\mathcal{B}$. Generalizing the law of total expectation (8.24), show that for every real-valued random variable $X$, if $I^{*}:=\left\{i \in I: \mathrm{P}\left[\mathcal{B}_{i}\right] \neq 0\right\}$, then we have $$ \mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}]=\sum_{i \in I^{*}} \mathrm{E}\left[X \mid \mathcal{B}_{i}\right] \mathrm{P}\left[\mathcal{B}_{i}\right] $$ Also show that if $\mathrm{E}\left[X \mid \mathcal{B}_{i}\right] \leq \alpha$ for each $i \in I^{*}$, then $\mathrm{E}[X \mid \mathcal{B}] \leq \alpha$.

Step-by-Step Solution

Answer

Question: Prove that if $\mathrm{E}[X \mid \mathcal{B}_i] \leq \alpha$ for every $i \in I^*$, where $I^* = \{i \in I: \mathrm{P}[\mathcal{B}_i] \neq 0\}$, then $\mathrm{E}[X \mid \mathcal{B}] \leq \alpha$. Solution: By the first part, $\mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}] = \sum_{i \in I^*} \mathrm{E}[X \mid \mathcal{B}_i] \mathrm{P}[\mathcal{B}_i]$. Since $\mathrm{E}[X \mid \mathcal{B}_i] \leq \alpha$ and $\mathrm{P}[\mathcal{B}_i] > 0$ for every $i \in I^*$, we have $\sum_{i \in I^*} \mathrm{E}[X \mid \mathcal{B}_i] \mathrm{P}[\mathcal{B}_i] \leq \alpha \sum_{i \in I^*} \mathrm{P}[\mathcal{B}_i]$. The events $\mathcal{B}_i$ with $i \notin I^*$ have probability zero, so by additivity $\sum_{i \in I^*} \mathrm{P}[\mathcal{B}_i] = \mathrm{P}[\mathcal{B}]$. Hence $\mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}] \leq \alpha \mathrm{P}[\mathcal{B}]$, and dividing by $\mathrm{P}[\mathcal{B}] > 0$ gives $\mathrm{E}[X \mid \mathcal{B}] \leq \alpha$.

Step 1: Define Conditional Expectation and the Law of Total Expectation

Recall that the conditional expectation of $X$ given an event $\mathcal{B}$ with $\mathrm{P}[\mathcal{B}] \neq 0$ is the expected value of $X$ given that $\mathcal{B}$ has occurred. It is defined as: $$ \mathrm{E}[X \mid \mathcal{B}] = \frac{\mathrm{E}[X \mathbb{I}\{\mathcal{B}\}]}{\mathrm{P}[\mathcal{B}]} $$ where $\mathbb{I}\{\mathcal{B}\}$ is the indicator function of the event $\mathcal{B}$. The law of total expectation states that if the events $\mathcal{B}_i$ are pairwise disjoint, their union is the whole sample space, and each has nonzero probability, then for every random variable $X$: $$ \mathrm{E}[X] = \sum_i \mathrm{E}[X \mid \mathcal{B}_i]\mathrm{P}[\mathcal{B}_i] $$
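The definition of conditional expectation can be tried out on a small finite probability space. The sketch below is only an illustration: the fair die, the event `evens`, and the helper names `expectation`, `indicator`, and `cond_expectation` are assumptions introduced here, not part of the exercise.

```python
from fractions import Fraction

# Hypothetical finite probability space: one roll of a fair six-sided die.
prob = {w: Fraction(1, 6) for w in range(1, 7)}

def expectation(f):
    """E[f] = sum over outcomes w of f(w) * P[w]."""
    return sum(f(w) * p for w, p in prob.items())

def indicator(event):
    """Indicator function of an event, given as a set of outcomes."""
    return lambda w: 1 if w in event else 0

def cond_expectation(f, event):
    """E[f | event] = E[f * I{event}] / P[event]; requires P[event] != 0."""
    ind = indicator(event)
    p_event = expectation(ind)          # E[I{event}] = P[event]
    if p_event == 0:
        raise ValueError("conditioning event has probability zero")
    return expectation(lambda w: f(w) * ind(w)) / p_event

X = lambda w: w                         # X is the face value of the die
evens = {2, 4, 6}                       # the event "the roll is even"
e_given_evens = cond_expectation(X, evens)
print(e_given_evens)                    # 4
```

Exact fractions are used instead of floats so that equalities can be checked without rounding error.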

Step 2: Rewrite Conditional Expectation using Indicator Functions

We start by multiplying both sides of the definition of conditional expectation by $\mathrm{P}[\mathcal{B}]$: $$ \mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}] = \frac{\mathrm{E}[X \mathbb{I}\{\mathcal{B}\}]}{\mathrm{P}[\mathcal{B}]}\mathrm{P}[\mathcal{B}] $$ Since $\mathrm{P}[\mathcal{B}] \neq 0$, this simplifies to: $$ \mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}] = \mathrm{E}[X \mathbb{I}\{\mathcal{B}\}] $$

Step 3: Decompose the Expectation over the Family

The events $\mathcal{B}_i$ partition $\mathcal{B}$, not the whole sample space, so the law of total expectation does not apply to them directly. Instead, we work with indicator functions. Since the family $\left\{\mathcal{B}_{i}\right\}_{i \in I}$ is pairwise disjoint with union $\mathcal{B}$, every outcome in $\mathcal{B}$ lies in exactly one $\mathcal{B}_i$, and every outcome outside $\mathcal{B}$ lies in none of them. Hence: $$ \mathbb{I}\{\mathcal{B}\} = \sum_{i \in I} \mathbb{I}\{\mathcal{B}_i\} $$ Multiplying by $X$ and using linearity of expectation (the sum is finite), we get: $$ \mathrm{E}[X \mathbb{I}\{\mathcal{B}\}] = \sum_{i \in I} \mathrm{E}\left[X \mathbb{I}\{\mathcal{B}_i\}\right] $$ For each $i \in I$ with $\mathrm{P}[\mathcal{B}_i] = 0$, the random variable $X \mathbb{I}\{\mathcal{B}_i\}$ is zero outside an event of probability zero, so $\mathrm{E}\left[X \mathbb{I}\{\mathcal{B}_i\}\right] = 0$ and the term drops out of the sum. For each $i \in I^*$, the definition of conditional expectation gives: $$ \mathrm{E}\left[X \mathbb{I}\{\mathcal{B}_i\}\right] = \mathrm{E}\left[X \mid \mathcal{B}_i\right] \mathrm{P}[\mathcal{B}_i] $$ So, $$ \mathrm{E}[X \mathbb{I}\{\mathcal{B}\}] = \sum_{i \in I^*} \mathrm{E}\left[X \mid \mathcal{B}_i\right] \mathrm{P}[\mathcal{B}_i] $$ Combining this with Step 2 proves the first part: $$ \mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}] =\sum_{i \in I^*} \mathrm{E}\left[X \mid \mathcal{B}_i\right] \mathrm{P}[\mathcal{B}_i] $$
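The identity can be checked exactly on a small example that includes a block of probability zero. This is a sketch under stated assumptions: the seven-outcome space, the choice $X(\omega) = \omega^2$, the family of three blocks, and the helper names `P` and `E_given` are all hypothetical and chosen only for illustration.

```python
from fractions import Fraction

# Hypothetical space: outcomes 1..6 are equally likely, outcome 7 has probability 0.
prob = {w: Fraction(1, 6) for w in range(1, 7)}
prob[7] = Fraction(0)

def P(event):
    """Probability of an event, given as a set of outcomes."""
    return sum(prob[w] for w in event)

def E_given(X, event):
    """E[X | event] = E[X * I{event}] / P[event], for P[event] != 0."""
    return sum(X(w) * prob[w] for w in event) / P(event)

X = lambda w: w * w
family = [{1, 2}, {3, 4}, {7}]          # pairwise disjoint; the last block has probability 0
B = set().union(*family)                # B is the union of the family

I_star = [Bi for Bi in family if P(Bi) != 0]
lhs = E_given(X, B) * P(B)
rhs = sum(E_given(X, Bi) * P(Bi) for Bi in I_star)
print(lhs, rhs)                         # 5 5
```

The block `{7}` is excluded from `I_star`, which avoids a division by zero in `E_given` and mirrors the restriction of the sum to $I^*$.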

Step 4: Prove the Inequality

We're given the following inequality: $$ \mathrm{E}\left[X \mid \mathcal{B}_i\right] \leq \alpha, \text{ for every } i \in I^* $$ Multiplying both sides by $\mathrm{P}[\mathcal{B}_i]$ (which is positive for $i \in I^*$), we get: $$ \mathrm{E}\left[X \mid \mathcal{B}_i\right] \mathrm{P}[\mathcal{B}_i] \leq \alpha \mathrm{P}[\mathcal{B}_i], \text{ for every } i \in I^* $$ Now summing over all $i \in I^*$, $$ \sum_{i \in I^*} \mathrm{E}\left[X \mid \mathcal{B}_i\right] \mathrm{P}[\mathcal{B}_i] \leq \alpha \sum_{i \in I^*} \mathrm{P}[\mathcal{B}_i] $$ By the result of the first part, we have: $$ \mathrm{E}[X \mid \mathcal{B}] \mathrm{P}[\mathcal{B}] \leq \alpha \sum_{i \in I^*} \mathrm{P}[\mathcal{B}_i] $$ Since the $\mathcal{B}_i$ are pairwise disjoint with union $\mathcal{B}$, additivity gives $\mathrm{P}[\mathcal{B}] = \sum_{i \in I} \mathrm{P}[\mathcal{B}_i]$, and the terms with $i \notin I^*$ are zero, so $\sum_{i \in I^*} \mathrm{P}[\mathcal{B}_i] = \mathrm{P}[\mathcal{B}]$. Dividing both sides by $\mathrm{P}[\mathcal{B}]$, which is positive, gives: $$ \mathrm{E}[X \mid \mathcal{B}] \leq \alpha $$ Thus proving the inequality.
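One way to read the inequality is that $\mathrm{E}[X \mid \mathcal{B}]$ is a weighted average of the values $\mathrm{E}[X \mid \mathcal{B}_i]$ with weights $\mathrm{P}[\mathcal{B}_i] / \mathrm{P}[\mathcal{B}]$, so it cannot exceed the largest of them. The following sketch illustrates this with made-up numbers; the list `blocks` and its values are assumptions, not data from the exercise.

```python
from fractions import Fraction

# Hypothetical data: for each block B_i with nonzero probability,
# the pair (P[B_i], E[X | B_i]).
blocks = [
    (Fraction(1, 10), Fraction(3)),
    (Fraction(3, 10), Fraction(-2)),
    (Fraction(1, 5), Fraction(5)),
]

p_B = sum(p for p, _ in blocks)               # P[B], by additivity
e_B = sum(e * p for p, e in blocks) / p_B     # E[X | B], from the identity of the first part
alpha = max(e for _, e in blocks)             # smallest alpha bounding every E[X | B_i]

weights = [p / p_B for p, _ in blocks]        # positive weights that sum to 1
print(e_B, alpha)                             # 7/6 5
```

Here `alpha` is taken to be the maximum of the conditional expectations, which is the tightest bound the argument can give.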

Key Concepts

Law of Total Expectation, Random Variable, Indicator Function

Law of Total Expectation

The Law of Total Expectation is a fundamental result in probability theory. It lets us find the expected value of a random variable by splitting the sample space into separate scenarios. Imagine you have a random variable, say $X$, and the sample space is divided into pairwise disjoint events $\{\mathcal{B}_i\}_{i \in I}$ that together cover every outcome. Each $\mathcal{B}_i$ may have a different probability and may affect $X$ differently.

Here's how the Law of Total Expectation works:

It states that when the events $\mathcal{B}_i$ are pairwise disjoint and together cover the whole sample space, the expected value of $X$ is a combination of the expectations calculated for each "scenario" $\mathcal{B}_i$.
Mathematically, summing over the $\mathcal{B}_i$ with nonzero probability, it looks like this: \[\mathrm{E}[X] = \sum_i \mathrm{E}[X \mid \mathcal{B}_i]\mathrm{P}[\mathcal{B}_i]\]
In simple terms, you weight the expectation of $X$ given each $\mathcal{B}_i$ by the probability that $\mathcal{B}_i$ happens.

This principle helps in simplifying and solving complex probability problems by breaking them down into simpler cases that collectively explain the behavior of the whole system.
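The formula relies on the events covering the whole sample space. A minimal sketch, assuming a fair die and hypothetical helper names (`P`, `E_given`, `weighted_sum`), compares a family that partitions the sample space with one that covers only part of it:

```python
from fractions import Fraction

prob = {w: Fraction(1, 6) for w in range(1, 7)}     # fair die (illustrative assumption)
X = lambda w: w

def P(event):
    return sum(prob[w] for w in event)

def E_given(event):
    """E[X | event], for an event of nonzero probability."""
    return sum(X(w) * prob[w] for w in event) / P(event)

def weighted_sum(events):
    """Sum of E[X | B_i] * P[B_i] over the given events."""
    return sum(E_given(Bi) * P(Bi) for Bi in events)

total = sum(X(w) * p for w, p in prob.items())      # E[X]
full = weighted_sum([{1, 2}, {3, 4, 5}, {6}])       # events partition the sample space
partial = weighted_sum([{1, 2}, {3, 4, 5}])         # events cover only {1, ..., 5}
print(total, full, partial)                         # 7/2 7/2 5/2
```

When the events cover only an event $\mathcal{B}$, the weighted sum equals $\mathrm{E}[X \mid \mathcal{B}]\,\mathrm{P}[\mathcal{B}]$ rather than $\mathrm{E}[X]$, which is the generalization this problem asks for.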

Random Variable

A random variable is a cornerstone of probability and statistics. It is a way to assign numeric values to the outcomes of a random process. Let's break it down for a clearer understanding.

**Definitions**: Random variables map outcomes of a random process to numerical values, making mathematical manipulation possible. They can be discrete (having countable values like dice rolls) or continuous (with any value in a range, like measuring height).
**Expectations**: The expected value of a random variable represents its average or mean over many trials. For a discrete random variable $X$ taking the values $x_1, x_2, \ldots$, this is calculated as: \[ \mathrm{E}[X] = \sum_i x_i \mathrm{P}[X = x_i] \]

Random variables simplify the study of randomness by allowing average behaviors and trends to be analyzed with defined mathematical tools. They are essential for modeling real-world situations in fields like finance, science, and engineering.
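The expectation of a discrete random variable can be computed either outcome by outcome or value by value, and the two agree. The sketch below assumes two fair coin flips with $X$ counting heads; this setup and the names `by_outcome`, `dist`, and `by_value` are illustrative choices.

```python
from collections import defaultdict
from fractions import Fraction
from itertools import product

# Hypothetical experiment: two fair coin flips; X counts the heads.
outcomes = list(product("HT", repeat=2))
prob = {w: Fraction(1, 4) for w in outcomes}
X = lambda w: w.count("H")

# Sum over outcomes: E[X] = sum of X(w) * P[w].
by_outcome = sum(X(w) * p for w, p in prob.items())

# Sum over values: first build the distribution P[X = x], then sum x * P[X = x].
dist = defaultdict(Fraction)
for w, p in prob.items():
    dist[X(w)] += p
by_value = sum(x * px for x, px in dist.items())

print(by_outcome, by_value)   # 1 1
```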

Indicator Function

The indicator function is a simple yet powerful tool in statistics, acting like a switch that turns events on or off.

**Basics**: An indicator function for an event $\mathcal{B}$, denoted as $\mathbb{I}\{ \mathcal{B} \}$, is a function that takes the value 1 if the event $\mathcal{B}$ occurs and 0 if it does not.
**Uses**: It helps isolate and study specific events within a broader analysis. For example, the conditional expectation formula $ \mathrm{E}[X \mid \mathcal{B}] = \frac{\mathrm{E}[X \mathbb{I}\{\mathcal{B}\}]}{\mathrm{P}[\mathcal{B}]} $ uses indicator functions to focus only on scenarios where $\mathcal{B}$ is true.

Indicator functions simplify the handling of cases or scenarios, letting us focus on computations directly related to the occurrence or non-occurrence of key events in a probability space.
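Two facts about indicators are worth checking on a concrete case: $\mathrm{E}[\mathbb{I}\{\mathcal{B}\}] = \mathrm{P}[\mathcal{B}]$, and the indicator of a union of pairwise disjoint events is the sum of their indicators. The following sketch assumes a fair die and a hypothetical two-block family.

```python
from fractions import Fraction

prob = {w: Fraction(1, 6) for w in range(1, 7)}   # fair die (illustrative assumption)

def indicator(event):
    """Indicator function: 1 on outcomes in the event, 0 elsewhere."""
    return lambda w: 1 if w in event else 0

family = [{1, 2}, {5}]                  # pairwise disjoint events
B = set().union(*family)                # their union
ind_B = indicator(B)

# I{B}(w) equals the sum of the I{B_i}(w) at every outcome w.
additive = all(
    ind_B(w) == sum(indicator(Bi)(w) for Bi in family) for w in prob
)

# E[I{B}] equals P[B].
p_B = sum(ind_B(w) * p for w, p in prob.items())
print(additive, p_B)                    # True 1/2
```

The additivity check would fail for overlapping events, since an outcome in two blocks would be counted twice on the right-hand side.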
