\framebox{
\vbox{\vspace{2mm}
\hbox to 6.28in { {\bf STAT 205~Probability Theo...
...ribe:} Daniel Metzger, {\it Editor:} Ricardo Fernholz \hfill }
\vspace{2mm}}
}


Prerequisites

Basic Measure Theory, Random Variables

Summary

We define the expected value of a random variable via the Lebesgue integral. We then state the basic properties of expected value and sketch a proof of its existence and uniqueness. The proof sketch closely follows the traditional construction of the Lebesgue integral.

Expected Value

Throughout this section, $ (\Omega,{\cal F},{\mathbb{P}})$ shall denote a probability space.

Definition 1   Let $ X: \Omega \to {\mathbb{R}}$ be an $ {\cal F}$-measurable random variable. The expected value of $ X$ is defined as

$\displaystyle {\mathbb{E}}(X):= \int_{\Omega} Xd{\mathbb{P}}= \int_{\Omega} X(\omega){\mathbb{P}}(d\omega),$ (1)

where the integral is defined as in Lebesgue integration (whenever $ \int_{\Omega} \vert X\vert d{\mathbb{P}}<\infty$).

Theorem 2 (Existence of the Integral for Extended Real Random Variables)   There exists a unique functional $ {\mathbb{E}}:X\mapsto {\mathbb{E}}(X) \in [0,\infty]$, defined on nonnegative extended real random variables $ X$, such that
$\displaystyle {\mathbb{E}}({\mathbf 1}_A) = {\mathbb{P}}(A), \quad \forall\ A\in {\cal F}$ (2)
$\displaystyle {\mathbb{E}}(cX) = c\, {\mathbb{E}}(X), \quad \forall\ c \geq 0,\ X \geq 0$ (3)
$\displaystyle {\mathbb{E}}(X+Y) = {\mathbb{E}}(X)+{\mathbb{E}}(Y), \quad \forall\ X,Y \geq 0$ (4)
$\displaystyle X \leq Y \ \Rightarrow\ {\mathbb{E}}(X)\leq {\mathbb{E}}(Y)$ (5)
$\displaystyle X_n \uparrow X \ \Rightarrow\ {\mathbb{E}}(X_n) \uparrow {\mathbb{E}}(X)$ (6)

Proof Sketch: Recall that an extended real random variable is simply a real random variable whose range is extended to $ \bar{{\mathbb{R}}}$ (so $ +\infty$ and $ -\infty$ are possible values). The construction of $ {\mathbb{E}}(X)$ closely follows the construction of the Lebesgue integral: start with indicators and simple random variables, extend to general nonnegative random variables using continuity from below, and then generalize to signed random variables.

Step 1: Simple random variables

We start with indicators, setting $ {\mathbb{E}}({\mathbf 1}_A) := {\mathbb{P}}(A)$ where $ A \in {\cal F}$. Next, we extend to simple random variables, so that when $ X=\sum_{i=1}^n c_i{\mathbf 1}_{A_i}$ (where $ c_i \in {\mathbb{R}}$ and $ A_i \in {\cal F}$) we define

$\displaystyle {\mathbb{E}}(X) := \sum_{i=1}^n c_i {\mathbb{P}}(A_i).$ (7)

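The fact that $ {\mathbb{E}}(X)$ is well-defined in this case, that is, that its value does not depend on the chosen representation of $ X$ as a sum of indicators, is a standard result from Lebesgue integration. Furthermore, it is clear that this definition satisfies the relevant conditions of the theorem.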
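The well-definedness claim can be checked numerically on a small example. The following is a minimal sketch, assuming a hypothetical finite probability space (a fair six-sided die with the uniform measure); the names `OMEGA`, `prob`, and `expect_simple` are illustrative and not part of the notes. It evaluates formula (7) for two different representations of the same simple random variable.

```python
from fractions import Fraction

# Hypothetical finite probability space: a fair six-sided die.
OMEGA = frozenset(range(1, 7))

def prob(event):
    """P(A) under the uniform measure on OMEGA."""
    return Fraction(len(event & OMEGA), len(OMEGA))

def expect_simple(terms):
    """E(X) for X = sum_i c_i 1_{A_i}, given as (c_i, A_i) pairs, as in (7)."""
    return sum((c * prob(a) for c, a in terms), Fraction(0))

# The same X written two ways: X = 2 on {1,2}, 3 on {3,4}, 1 on {5,6}.
overlapping = [(2, frozenset({1, 2, 3, 4})), (1, frozenset({3, 4, 5, 6}))]
disjoint = [(2, frozenset({1, 2})), (3, frozenset({3, 4})), (1, frozenset({5, 6}))]

print(expect_simple(overlapping), expect_simple(disjoint))
```

Both representations give the same value, as the well-definedness result requires. Exact rational arithmetic is used here so the comparison is not affected by rounding.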

Step 2: Nonnegative random variables

Let $ X \geq 0$, and observe that

$\displaystyle X_n = \left\{ \begin{array}{ll} \text{greatest multiple of $2^{-n}$ that is $\leq X$} & \text{if $X \leq n$,} \\ n & \text{if $X>n$,} \end{array} \right.$

is a sequence of simple random variables that satisfies $ X_n \uparrow X$. By the monotonicity of $ {\mathbb{E}}$, it follows that $ {\mathbb{E}}(X_n)$ is nondecreasing in $ n$, and therefore we define

$\displaystyle {\mathbb{E}}(X) := \lim_{n \rightarrow \infty}{\mathbb{E}}(X_n).$ (8)

As before, it can be verified that $ {\mathbb{E}}(X)$ is well-defined (the limit exists in $ [0,\infty]$ because the sequence $ {\mathbb{E}}(X_n)$ is nondecreasing; this is a standard result from Lebesgue integration), and that this definition satisfies all conditions of the theorem. Note, however, that we do not require the limit to be finite.
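The approximating sequence of Step 2 can be computed pointwise. The following is a minimal sketch, assuming the closed form $ X_n = \min(2^{-n}\lfloor 2^n X \rfloor, n)$, which agrees with the case definition above; the function name `dyadic_approx` is illustrative only.

```python
import math

def dyadic_approx(x, n):
    """Value of X_n at a sample point where X takes the nonnegative value x."""
    if x > n:
        # Cap at n, which also handles x = +infinity.
        return float(n)
    # Greatest multiple of 2^{-n} that is <= x.
    return math.floor(x * 2**n) / 2**n

# Approximations of X(omega) = pi for n = 0, 1, ..., 19.
values = [dyadic_approx(math.pi, n) for n in range(20)]
print(values[:7])
```

For each fixed sample point the values are nondecreasing in $ n$ and, once $ n \geq X$, lie within $ 2^{-n}$ of $ X$, which is the pointwise statement $ X_n \uparrow X$.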

Step 3: Signed random variables

Finally, we generalize the definition of $ {\mathbb{E}}(X)$ to random variables that take both positive and negative values. To do this, we write $ X = X^+ - X^-$, where $ X^+ = \max(X,0)$ and $ X^- = \max(-X,0)$ (so $ X^+$ is the positive part of $ X$ and $ X^-$ is the negative part of $ X$), and then define $ {\mathbb{E}}(X)$ as

$\displaystyle {\mathbb{E}}(X) := {\mathbb{E}}(X^+) - {\mathbb{E}}(X^-).$ (9)

Both $ {\mathbb{E}}(X^+)$ and $ {\mathbb{E}}(X^-)$ have already been defined above (because $ X^+$ and $ X^-$ are nonnegative), so this definition is well-defined provided the expression is not $ \infty - \infty$ (in which case $ {\mathbb{E}}(X)$ does not exist), and it satisfies the relevant conditions of the theorem.
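As a small illustration of definition (9) (a hypothetical example, not taken from the lecture), suppose $ X$ takes the value $ 2$ with probability $ 1/2$ and the value $ -1$ with probability $ 1/2$. Then $ X^+ = 2\cdot{\mathbf 1}_{\{X=2\}}$ and $ X^- = {\mathbf 1}_{\{X=-1\}}$ are simple random variables, and

$\displaystyle {\mathbb{E}}(X^+) = 2\cdot\tfrac{1}{2} = 1, \qquad {\mathbb{E}}(X^-) = 1\cdot\tfrac{1}{2} = \tfrac{1}{2}, \qquad {\mathbb{E}}(X) = 1 - \tfrac{1}{2} = \tfrac{1}{2}.$

This agrees with applying formula (7) directly to $ X = 2\cdot{\mathbf 1}_{\{X=2\}} - {\mathbf 1}_{\{X=-1\}}$.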

Just as in Lebesgue integration, a random variable $ X$ is said to be integrable if $ {\mathbb{E}}(\vert X\vert)<\infty$. If at least one of $ {\mathbb{E}}(X^+)$ and $ {\mathbb{E}}(X^-)$ is finite, so that $ {\mathbb{E}}(X^+) - {\mathbb{E}}(X^-)$ is not of the form $ \infty - \infty$, the random variable $ X$ is called quasi-integrable. It is important to note that $ {\mathbb{E}}(X) = +\infty$ is possible even if $ {\mathbb{P}}(X<\infty)=1$. Consider a geometric random variable $ G$ with $ {\mathbb{P}}(G = g)= 2^{-g}$ for all $ g = 1, 2, 3, \ldots$. Then $ {\mathbb{P}}(G<\infty) = 1$, so $ 2^G$ is finite almost surely, but $ {\mathbb{E}}(2^G) = \sum_{g=1}^{\infty} 2^g2^{-g} = \infty$.
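
The divergence can also be seen by truncation (a worked check added for illustration, not part of the original notes). Since $ {\mathbb{P}}(G>n) = \sum_{g=n+1}^{\infty} 2^{-g} = 2^{-n}$, truncating $ 2^G$ at level $ 2^n$ gives

$\displaystyle {\mathbb{E}}\left(\min(2^G,2^n)\right) = \sum_{g=1}^{n} 2^g 2^{-g} + 2^n\, {\mathbb{P}}(G>n) = n + 1.$

Because $ \min(2^G,2^n) \uparrow 2^G$ as $ n \to \infty$, property (6) yields $ {\mathbb{E}}(2^G) = \lim_{n\to\infty}(n+1) = \infty$.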

References

Durrett, Section 1.3