A random variable $X$ maps each possible outcome $s∈S$, where $S$ is the sample space, to a real number. A random variable can represent any numerical quantity derived from an outcome.

### Example

If we flip a coin 5 times, we can define a random variable $H$ equal to the number of heads. $H$ takes values in the set ${0,1,2,3,4,5}$, because we can flip at most 5 heads.

We can ask about the probability of an event occurring for a random variable. For example, asking $P(H≥4)$ is asking the probability of the event that the number of heads we flip is 4 or greater.

Formally, this is shorthand for $P({s∈S∣H(s)≥4})$. We are considering the set of outcomes $s$ whose value under the mapping $H$ is 4 or greater.
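The event-as-a-set-of-outcomes view can be checked by brute force. A minimal sketch (the `"H"`/`"T"` encoding of outcomes is an assumption, not from the notes) that enumerates the sample space of 5 flips and counts the outcomes with $H(s)≥4$:

```python
from itertools import product
from fractions import Fraction

# Sample space: all 2^5 = 32 sequences of five fair coin flips
# (hypothetical encoding: each outcome is a tuple of "H"/"T").
S = list(product("HT", repeat=5))

# The random variable H maps each outcome to its number of heads.
def H(s):
    return s.count("H")

# P(H >= 4) = |{s in S : H(s) >= 4}| / |S| under equally likely outcomes.
# There are C(5,4) + C(5,5) = 6 such outcomes, so the answer is 6/32.
p = Fraction(sum(1 for s in S if H(s) >= 4), len(S))
print(p)  # 3/16
```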

You can also define a random variable by explicitly stating how it maps outcomes to numbers. For example, we can have an $X$ such that $X=1$ if the flip is heads, and $X=0$ if the flip is tails. This is called an **indicator variable**.
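As a quick sketch of this explicit mapping (again assuming a hypothetical `"H"`/`"T"` encoding of a single fair flip):

```python
from fractions import Fraction

# Indicator variable for the event "the flip is heads".
def X(outcome):
    return 1 if outcome == "H" else 0

# For a fair coin, each outcome has probability 1/2.
outcomes = ["H", "T"]
EX = sum(Fraction(1, 2) * X(s) for s in outcomes)
print(EX)  # 1/2
```

A useful consequence: the expectation of an indicator variable equals the probability of the event it indicates.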

## Discrete random variable

A r.v. $X$ is called **discrete** if there is a countable set of values $V$ that $X$ can take. $V$ can be a finite set or a countably infinite set, but not an uncountable set.

The countable set of values $x$ such that the PMF satisfies $P(X=x)>0$ is called the **support** of $X$. This coincides with $V$ above (once values of probability zero are discarded).

In problem sets, we need to specify the support whenever we give a PMF for a random variable.
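To make the PMF-and-support pairing concrete, here is a sketch for $H$, the number of heads in 5 fair flips, whose PMF is $P(H=k)=\binom{5}{k}/2^5$ (a standard binomial fact, not derived in the notes above):

```python
from math import comb
from fractions import Fraction

# PMF of H, the number of heads in 5 fair flips: P(H = k) = C(5, k) / 2^5.
pmf = {k: Fraction(comb(5, k), 2**5) for k in range(6)}

# The support is the set of values with strictly positive probability.
support = {k for k, p in pmf.items() if p > 0}
print(support)            # {0, 1, 2, 3, 4, 5}
print(sum(pmf.values()))  # 1 — a valid PMF sums to 1 over its support
```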

## Continuous random variable

**Continuous** r.v.s can take on any real value in some interval of $R$.

It doesn’t make sense to define a PMF for a continuous r.v. $X$: when $X$ is defined on a *continuous* interval, asking for the probability of a single exact value makes no sense.

- If $X$ equals the high temperature in Philly, then $P(X=68)=0$. Intuitively, we’re asking for the probability that $X$ equals a single number out of an *infinite* number of values.
- In fact, for any constant $k$, if $X$ is continuous, then $P(X=k)=0$.
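One way to see why $P(X=k)=0$: shrink an interval around $k$ and watch the probability vanish. A minimal sketch, assuming a hypothetical uniform model for the temperature on $[50, 90]$ (so interval probabilities are just lengths divided by 40):

```python
# Hypothetical model: X uniform on [50, 90], so P(a <= X <= b) = (b - a) / 40
# for any subinterval [a, b] of [50, 90].
def prob_interval(a, b):
    lo, hi = max(a, 50.0), min(b, 90.0)
    return max(hi - lo, 0.0) / 40.0

# Shrinking the interval around 68 drives the probability toward 0,
# which is the intuition behind P(X = 68) = 0.
for eps in (1.0, 0.1, 0.001):
    print(eps, prob_interval(68 - eps, 68 + eps))
```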

### Formal definition

A r.v. $X$ has a continuous distribution if its CDF $F(u)=P(X≤u)$ is:

- Differentiable (and therefore continuous everywhere)
- Or continuous everywhere, and differentiable at all but a finite number of points.

If $X$ has such a continuous distribution, then we say $X$ is a continuous random variable.

### Using the continuous distribution instead

We define the **support** of the random variable as the set of all $u$ such that the PDF of $X$ is greater than zero. That is, $f_{X}(u)>0$.

The probability that $X$ lies within an interval $[a,b]$, however, is well-defined. This is just the integral of the PDF from $a$ to $b$.

$P(a≤X≤b)=F_{X}(b)−F_{X}(a)=∫_{a}^{b}f_{X}(t)dt$

Note that $P(X∈[a,b])=P(X∈(a,b))=P(X∈[a,b))=P(X∈(a,b])$ because, again, $P(X=a)$ and $P(X=b)$ are both zero.
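We can sanity-check the identity $P(a≤X≤b)=F_X(b)−F_X(a)=∫_a^b f_X(t)\,dt$ numerically. A sketch using the Exponential(1) distribution as a concrete example (an assumption for illustration; its PDF $f(t)=e^{-t}$ and CDF $F(u)=1-e^{-u}$ for $t,u≥0$ are standard facts):

```python
import math

# Exponential(1): PDF f(t) = e^{-t}, CDF F(u) = 1 - e^{-u}, for t, u >= 0.
f = lambda t: math.exp(-t)
F = lambda u: 1 - math.exp(-u)

a, b = 0.5, 2.0
exact = F(b) - F(a)  # the CDF-difference form

# The integral form, approximated with a midpoint Riemann sum.
n = 100_000
dt = (b - a) / n
approx = sum(f(a + (i + 0.5) * dt) for i in range(n)) * dt

print(exact, approx)  # the two forms agree to many decimal places
```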

### Expectation

Since $X$ is continuous, we now use an integral instead of a summation to find the expected value.

$E(X)=∫_{−∞}^{∞}x⋅f_{X}(x)dx$

where $f_{X}(x)$ is the PDF evaluated at $x$ (remember that this really corresponds to an infinitesimal range around $x$).

- This only holds when the integral converges absolutely ($∫_{−∞}^{∞}∣x∣⋅f_{X}(x)dx$ is finite). Otherwise, $E(X)$ is undefined.
- In this course we usually assume $E(X)$ is well-defined.
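The expectation integral can also be approximated numerically. A sketch using Exponential(λ=2) as a concrete example (an assumption for illustration; its closed-form mean $1/λ=0.5$ is a standard fact), truncating the improper integral at a large upper bound where the tail is negligible:

```python
import math

# Exponential(lam=2): PDF f(x) = lam * e^{-lam x} for x >= 0; E(X) = 1/lam = 0.5.
lam = 2.0
f = lambda x: lam * math.exp(-lam * x)

# Approximate E(X) = integral of x * f(x) dx with a midpoint Riemann sum,
# truncated at x = 50 (the tail beyond that is vanishingly small).
n, upper = 200_000, 50.0
dx = upper / n
EX = sum((i + 0.5) * dx * f((i + 0.5) * dx) for i in range(n)) * dx

print(round(EX, 4))  # ≈ 0.5, matching the closed form 1/lam
```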