Motivation

The Poisson distribution is often used as a model for data we’re counting. This implies non-negative integer values: $0, 1, 2, \dots$

A Poisson r.v. models the total number of “successes” (in this case, something that we count) in a sequence of Bernoulli trials, where we have a large number of trials and the probability of “success” (being counted) on each trial is small. Some examples:

  • Number of text messages during this class period
  • Number of car accidents within the next hour
  • Number of raisins in a loaf of raisin bread: here, a “trial” is a very small surface area of the bread, where we can either have 0 or 1 raisin (count, or no count).

Notice the similarities with the Binomial distribution. Often, when the number of trials $n$ is exceedingly large, calculating the Binomial PMF takes more compute time (a lot of multiplications) and runs into accuracy issues (multiplying very large numbers). So the Poisson is often used as a good approximation in this case.
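As a quick numerical sanity check, here is a minimal sketch (assuming scipy is available; the values of $n$, $p$, and $k$ are arbitrary choices) showing that the Binomial and Poisson PMFs are very close when $n$ is large and $p$ is small:

```python
# Compare Binomial(n, p) with its Poisson(lambda = n*p) approximation.
# n, p, and the k values below are arbitrary choices for illustration.
from scipy.stats import binom, poisson

n, p = 10_000, 0.0003      # many trials, small success probability
lam = n * p                # rate of the approximating Poisson

for k in range(8):
    b = binom.pmf(k, n, p)
    q = poisson.pmf(k, lam)
    print(f"k={k}: Binomial={b:.6f}  Poisson={q:.6f}")
```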

PMF

Start with the PMF of the Binomial distribution, but take the number of trials $n$ to infinity and the success probability $p$ to zero. Keep $np$ constant though (otherwise this would make no sense), so let $\lambda = np$. This is called the rate.

Calculate $P(X = 0) = (1 - p)^n = \left(1 - \frac{\lambda}{n}\right)^n$. Taking the limit, $P(X = 0) \to e^{-\lambda}$.

For $k \geq 1$, we calculate the ratio $\frac{P(X = k)}{P(X = k - 1)}$, beginning with $\frac{P(X = 1)}{P(X = 0)}$, because we already have the denominator $P(X = 0) = e^{-\lambda}$.

Taking the limits, we have $\frac{P(X = k)}{P(X = k - 1)} \to \frac{\lambda}{k}$. From this, we can see $P(X = 1) = \lambda e^{-\lambda}$, and generalizing this we have $P(X = k) = \frac{e^{-\lambda} \lambda^k}{k!}$ for $k = 0, 1, 2, \dots$
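For reference, writing out the ratio from the Binomial PMF (a standard derivation, filled in here since the intermediate algebra is not shown above):

```latex
\[
\frac{P(X = k)}{P(X = k-1)}
  = \frac{\binom{n}{k} p^k (1-p)^{n-k}}{\binom{n}{k-1} p^{k-1} (1-p)^{n-k+1}}
  = \frac{n - k + 1}{k} \cdot \frac{p}{1 - p}
  = \frac{n - k + 1}{k} \cdot \frac{\lambda/n}{1 - \lambda/n}
  \;\longrightarrow\; \frac{\lambda}{k}
\quad \text{as } n \to \infty .
\]
```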

There is no upper bound on the support of a r.v. that has the Poisson distribution: the support is $\{0, 1, 2, \dots\}$. It takes one parameter, $\lambda$, so we can write $X \sim \text{Pois}(\lambda)$.
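A small sketch (again assuming scipy; $\lambda$ is an arbitrary choice) checking that the derived formula matches a library implementation and that the PMF sums to 1 over the support:

```python
# Check the derived PMF e^{-lam} * lam^k / k! against scipy and
# verify it sums to (approximately) 1.  lam is an arbitrary choice.
import math
from scipy.stats import poisson

lam = 3.5
pmf = lambda k: math.exp(-lam) * lam**k / math.factorial(k)

assert all(abs(pmf(k) - poisson.pmf(k, lam)) < 1e-12 for k in range(50))
print("total probability over k=0..99:", sum(pmf(k) for k in range(100)))
```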

Properties of the distribution

If $X \sim \text{Bin}(n, p)$, then as we take the limits of $n$ and $p$ to infinity and 0 respectively (with $np \to \lambda$), the PMF of $X$ converges to the PMF of the Poisson with rate $\lambda$.

The sum of independent Poissons is also a Poisson: given $X \sim \text{Pois}(\lambda_1)$ and $Y \sim \text{Pois}(\lambda_2)$, if $X$ and $Y$ are independent, then $X + Y \sim \text{Pois}(\lambda_1 + \lambda_2)$.
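A quick simulation sketch of this property (numpy and scipy assumed; the rates and sample size are arbitrary), comparing the empirical distribution of $X + Y$ with $\text{Pois}(\lambda_1 + \lambda_2)$:

```python
# Simulate X ~ Pois(lam1) and Y ~ Pois(lam2) independently and compare the
# empirical PMF of X + Y to Pois(lam1 + lam2).  Parameters are arbitrary.
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)
lam1, lam2, n_samples = 2.0, 3.0, 200_000

s = rng.poisson(lam1, n_samples) + rng.poisson(lam2, n_samples)
for k in range(10):
    emp = np.mean(s == k)
    print(f"k={k}: empirical={emp:.4f}  Pois({lam1 + lam2})={poisson.pmf(k, lam1 + lam2):.4f}")
```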

If $X \sim \text{Pois}(\lambda)$, then both the Expectation $E[X] = \lambda$ and the Variance $\text{Var}(X) = \lambda$.
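For the mean, a short derivation (standard, filled in here for completeness) using the PMF above:

```latex
\[
E[X] = \sum_{k=0}^{\infty} k \, \frac{e^{-\lambda} \lambda^k}{k!}
     = \lambda e^{-\lambda} \sum_{k=1}^{\infty} \frac{\lambda^{k-1}}{(k-1)!}
     = \lambda e^{-\lambda} e^{\lambda}
     = \lambda .
\]
```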

Poisson paradigm (informal)

The Poisson distribution is a good approximation even when the Bernoulli trials each have different (small) success probabilities and aren’t perfectly independent.

Let $A_1, \dots, A_n$ be events with $p_j = P(A_j)$, where $n$ is large, the $p_j$ are small, and the events are independent or weakly dependent. Let $I_j$ be an Indicator variable for $A_j$, and let $X = \sum_{j=1}^{n} I_j$. Then $X$ is approximately $\text{Pois}(\lambda)$, where $\lambda = \sum_{j=1}^{n} p_j$.
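A simulation sketch of this paradigm (numpy and scipy assumed; the $p_j$ here are arbitrary small, unequal probabilities for independent indicators, so only the "different success probabilities" part is illustrated):

```python
# Sum n independent Bernoulli indicators with different small p_j and
# compare the distribution of the sum to Pois(lambda = sum of p_j).
# The p_j values and sample size are arbitrary choices for illustration.
import numpy as np
from scipy.stats import poisson

rng = np.random.default_rng(0)
n, n_samples = 500, 20_000
p = rng.uniform(0.0, 0.01, size=n)                 # small, unequal success probabilities
lam = p.sum()

X = (rng.random((n_samples, n)) < p).sum(axis=1)   # each row: sum of n indicators
for k in range(10):
    print(f"k={k}: empirical={np.mean(X == k):.4f}  Pois({lam:.2f})={poisson.pmf(k, lam):.4f}")
```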