Sometimes we want to find the probability of events that involve two or more Random variables. For example, we might want $P(H>65,L<55)$, where $H$ and $L$ represent tomorrow’s high and low temperatures, respectively.

Also, while we’ve discussed ways of summarizing a distribution of a single r.v. (with Expectation and Variance), we would also like a way to explain the relationship between two r.v.s.

## Discrete variables

There are three kinds of PMFs that we can observe between two discrete random variables. Given two discrete $X$ and $Y$, we have:

- The **joint PMF** is given by the function $p_{X,Y}$ and equals $p_{X,Y}(x,y)=P(X=x,Y=y)$. Three or more variables follow the same pattern.
- WLOG, the **marginal PMF** of $X$ is the function $p_{X}$ given by

$$p_{X}(x)=P(X=x)=\sum_{y}p_{X,Y}(x,y)$$

Note that this is just the original PMF that we’ve learned before. Here, we are summing up across the whole support of $Y$. The only new part is that $P(X=x)$ can be defined in terms of the joint PMF.

- WLOG, for each value $x \in Support(X)$, the **conditional PMF** of $Y$, given $X=x$, is simply the equation for Conditional probability:

$$p_{Y \mid X}(y \mid x)=P(Y=y \mid X=x)=\frac{p_{X,Y}(x,y)}{p_{X}(x)}$$

Here, we’re holding the value of $x$ constant. Depending on $X=x$, we will get a different conditional probability for $Y$.
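The three PMFs can be sketched in a few lines of Python. This is only an illustration: the joint PMF values are made up, and the names `joint_pmf`, `marginal_x`, and `conditional_y_given_x` are hypothetical helpers, not part of any library.

```python
from collections import defaultdict

# Hypothetical joint PMF p_{X,Y}(x, y); the probabilities are made up and sum to 1.
joint_pmf = {
    (0, 0): 0.10, (0, 1): 0.20,
    (1, 0): 0.30, (1, 1): 0.40,
}

def marginal_x(joint):
    """Marginal PMF of X: p_X(x) = sum over y of p_{X,Y}(x, y)."""
    p_x = defaultdict(float)
    for (x, _y), p in joint.items():
        p_x[x] += p
    return dict(p_x)

def conditional_y_given_x(joint, x):
    """Conditional PMF of Y given X = x: p_{X,Y}(x, y) / p_X(x)."""
    p_x = marginal_x(joint)[x]
    return {y: p / p_x for (xi, y), p in joint.items() if xi == x}

p_x = marginal_x(joint_pmf)
p_y_given_x0 = conditional_y_given_x(joint_pmf, 0)
p_y_given_x1 = conditional_y_given_x(joint_pmf, 1)
```

With these made-up numbers, `p_y_given_x0` and `p_y_given_x1` differ, which shows how the conditional PMF of $Y$ depends on the value of $x$ being held constant.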

## Continuous variables

### Motivation

Remember that for a continuous r.v., the probability that it takes exactly one value is zero. Only a range of values can have positive probability.

Consider two continuous r.v.s, $X$ and $Y$. If we were to draw a plane, with $X$ on the horizontal axis and $Y$ on the vertical, something similar occurs.

- Any point, line, or curve in the plane has zero probability.
- However, regions of the plane (e.g. $63≤X≤65,47≤Y≤49$) can have positive probability.

Then, the probability of an event involving both $X$ and $Y$ will be the *volume* under the joint density surface over the relevant region, which we will calculate with a *double integral*.

Given two continuous r.v.s $X$ and $Y$, we can define the following.

### Joint CDF

This is a function $F_{X,Y}(x,y)$ given by

$$F_{X,Y}(x,y)=P(X \le x,Y \le y)$$

It is very similar to a regular CDF for one variable. Three or more variables follow the same pattern. Like regular CDFs, this definition also holds for discrete r.v.s.

### Joint PDF

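The joint PDF is the *derivative* of the joint CDF with respect to both $x$ and $y$, i.e. the mixed partial derivative:

$$f_{X,Y}(x,y)=\frac{\partial^{2}}{\partial x\,\partial y}F_{X,Y}(x,y)$$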

A valid joint PDF is nonnegative for all $x,y$ and integrates to $1$ over $x,y∈(−∞,∞)$, which makes sense, given that this integral is the total volume under the PDF.
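As a small worked example (the CDF below is an assumption chosen for illustration, not one from these notes), suppose $F_{X,Y}(x,y)=(1-e^{-x})(1-e^{-y})$ for $x,y>0$, and $0$ otherwise. Differentiating once in each variable gives

$$f_{X,Y}(x,y)=\frac{\partial^{2}}{\partial x\,\partial y}(1-e^{-x})(1-e^{-y})=e^{-x}e^{-y},\quad x,y>0$$

which is nonnegative, and

$$\int_{0}^{\infty}\int_{0}^{\infty}e^{-x}e^{-y}\,dy\,dx=1\cdot 1=1$$

so this assumed joint PDF satisfies both validity conditions.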

For any set $A \subseteq \mathbb{R}^{2}$, we have

$$P((X,Y) \in A)=\iint_{A}f_{X,Y}(x,y)\,dx\,dy$$

which just means that given $a \le X \le b$ and $c \le Y \le d$, we integrate $\int_{a}^{b}\int_{c}^{d}f_{X,Y}(x,y)\,dy\,dx$.
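The rectangle case can be checked numerically. The sketch below assumes a made-up joint PDF, $f_{X,Y}(x,y)=e^{-x-y}$ for $x,y>0$, and approximates the double integral with a midpoint Riemann sum. The function names are hypothetical, and the grid size `n` is an arbitrary choice.

```python
import math

def joint_pdf(x, y):
    """Assumed joint PDF for illustration: exp(-x - y) for x, y > 0, else 0."""
    return math.exp(-x - y) if x > 0 and y > 0 else 0.0

def rectangle_probability(pdf, a, b, c, d, n=400):
    """Approximate P(a <= X <= b, c <= Y <= d) with a midpoint Riemann sum."""
    dx = (b - a) / n
    dy = (d - c) / n
    total = 0.0
    for i in range(n):
        x = a + (i + 0.5) * dx          # midpoint of the i-th x-cell
        for j in range(n):
            y = c + (j + 0.5) * dy      # midpoint of the j-th y-cell
            total += pdf(x, y)
    return total * dx * dy              # sum of heights times cell area

approx = rectangle_probability(joint_pdf, 0.0, 1.0, 0.0, 2.0)
# For this particular PDF the integral also has a closed form to compare against.
exact = (1 - math.exp(-1)) * (1 - math.exp(-2))
```

Here `approx` should agree closely with `exact`. In general, a finer grid trades run time for accuracy.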

Bound definitions

Be careful with how the bounds are defined for $x$ and $y$. $0<x<1,0<y<1$ is different from $0<x<y<1$ and also changes the bounds for the integration. The latter would have the double integral

$$\int_{0}^{1}\int_{x}^{1}f_{X,Y}(x,y)\,dy\,dx$$

This is because $y$ always has to be greater than $x$, so for each fixed $x$ the inner integral over $y$ starts at $x$ and runs up to $1$.
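To see the dependent bounds in action, the following sketch integrates an assumed joint PDF that equals $2$ on the triangle $0<x<y<1$ (a made-up example), letting the inner limits for $y$ depend on $x$. The helper `integrate_dy_dx` is hypothetical and uses a simple midpoint Riemann sum.

```python
def integrate_dy_dx(pdf, x_lo, x_hi, y_lo, y_hi, n=200):
    """Midpoint Riemann sum over x in [x_lo, x_hi] and y in [y_lo(x), y_hi(x)]."""
    dx = (x_hi - x_lo) / n
    total = 0.0
    for i in range(n):
        x = x_lo + (i + 0.5) * dx
        lo, hi = y_lo(x), y_hi(x)       # inner bounds depend on the current x
        dy = (hi - lo) / n
        inner = sum(pdf(x, lo + (j + 0.5) * dy) for j in range(n)) * dy
        total += inner * dx
    return total

def pdf(x, y):
    """Assumed joint PDF for illustration: 2 on the triangle 0 < x < y < 1."""
    return 2.0 if 0 < x < y < 1 else 0.0

# Total probability: x from 0 to 1, y from x to 1.
triangle_total = integrate_dy_dx(pdf, 0.0, 1.0, lambda x: x, lambda x: 1.0)
# P(X < 0.5): same inner bounds, but x only runs from 0 to 0.5.
p_x_below_half = integrate_dy_dx(pdf, 0.0, 0.5, lambda x: x, lambda x: 1.0)
```

Under this assumed PDF, `triangle_total` comes out at $1$ and `p_x_below_half` at $0.75$, matching $\int_{0}^{0.5}2(1-x)\,dx$.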

### Marginal PDF

Just like with the marginal PMF, WLOG we can get the marginal PDF of $X$ by integrating over the entire support of $Y$, i.e. $(−∞,∞)$. This is just the regular PDF for $X$.

$$f_{X}(x)=\int_{-\infty}^{\infty}f_{X,Y}(x,y)\,dy$$

### Conditional PDF

WLOG, for each value of $x∈Support(X)$, the conditional PDF of $Y$, given $X=x$, is

$$f_{Y \mid X}(y \mid x)=\frac{f_{X,Y}(x,y)}{f_{X}(x)}$$

Note that $P(X=x)=0$, so we’re actually conditioning on $X$ falling in the small interval $\left(x-\frac{\Delta}{2},x+\frac{\Delta}{2}\right)$ and taking the limit as $\Delta \to 0$.
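To make the marginal and conditional PDFs concrete, here is a worked sketch under an assumed joint PDF (chosen for illustration): $f_{X,Y}(x,y)=2$ for $0<x<y<1$, and $0$ otherwise. Integrating out $y$ over the part of its support allowed by $x$ gives the marginal:

$$f_{X}(x)=\int_{x}^{1}2\,dy=2(1-x),\quad 0<x<1$$

Dividing the joint PDF by this marginal gives the conditional:

$$f_{Y \mid X}(y \mid x)=\frac{2}{2(1-x)}=\frac{1}{1-x},\quad x<y<1$$

So under this assumption, given $X=x$, $Y$ is Uniform on $(x,1)$, and the conditional PDF changes with the value of $x$ we condition on.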

### Factoring

If the joint PDF of $X$ and $Y$ can be factored as $f_{X,Y}(x,y)=g(x)⋅h(y)$ for all $x,y \in \mathbb{R}$, where $g$ and $h$ are nonnegative functions, then $X$ and $Y$ are Independent.

- This is useful when we don’t know the marginal PDFs, or don’t know whether $g$ and $h$ are themselves the marginal PDFs.
- Then, to get their marginal PDFs, we simply “rescale” $g$ and $h$ so they both integrate to 1.
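
As a worked sketch of the rescaling step, assume the joint PDF $f_{X,Y}(x,y)=6e^{-2x-3y}$ for $x,y>0$ (a made-up example). One possible factoring is $g(x)=6e^{-2x}$ and $h(y)=e^{-3y}$, neither of which integrates to $1$:

$$\int_{0}^{\infty}6e^{-2x}\,dx=3,\qquad \int_{0}^{\infty}e^{-3y}\,dy=\frac{1}{3}$$

Dividing each factor by its own integral gives the marginal PDFs:

$$f_{X}(x)=\frac{g(x)}{3}=2e^{-2x},\qquad f_{Y}(y)=3h(y)=3e^{-3y}$$

and their product recovers the joint PDF, $2e^{-2x}\cdot 3e^{-3y}=6e^{-2x-3y}$. Note that the factoring has to hold for all $x,y$, including the support: a PDF defined on a region such as $0<x<y<1$ does not factor this way, because the constraint $x<y$ ties the two variables together.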