Suppose a population is partitioned into two groups $A$ and $B$, where $a$ and $b$ are the number of people in each group. We then sample $n$ people without replacement.
Let $X$ be the random variable that equals the number of people from group $A$ in the sample. Then $X$ is said to have a Hypergeometric distribution with parameters $a$, $b$, and $n$. The textbook writes $X \sim \text{Hypergeometric}(a, b, n)$.
Note that we are sampling without replacement to get our group of $n$ people.
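What distribution would $X$ have if we sampled with replacement?
We can work this out by considering that the number of trials is the sample size $n$, and the success probability (picking a person from group $A$) is $\frac{a}{a+b}$ on every draw, where $a$ and $b$ are the group sizes. In addition, the trials are independent Bernoulli trials.
Then we can say $X \sim \text{Binomial}\left(n, \frac{a}{a+b}\right)$.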
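The with-replacement claim can be checked by brute force on a small population. The sketch below is illustrative only: the function names and the values `a = 3`, `b = 2`, `n = 4` are assumptions chosen for the example, not part of the notes. It enumerates every ordered draw with replacement and compares the resulting distribution to the Binomial formula, using exact fractions.

```python
from fractions import Fraction
from itertools import product
from math import comb


def with_replacement_pmf(a, b, n):
    """Exact distribution of the number of group-A draws in n draws with replacement."""
    population = ["A"] * a + ["B"] * b
    counts = [0] * (n + 1)
    # Every ordered sequence of n draws is equally likely when we replace after each draw.
    for draw in product(population, repeat=n):
        counts[draw.count("A")] += 1
    total = (a + b) ** n
    return [Fraction(c, total) for c in counts]


def binomial_pmf(n, p):
    """Binomial(n, p) probabilities for k = 0, ..., n."""
    return [comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(n + 1)]


# Hypothetical small example: 3 people in group A, 2 in group B, 4 draws.
a, b, n = 3, 2, 4
enumerated = with_replacement_pmf(a, b, n)
formula = binomial_pmf(n, Fraction(a, a + b))
print(enumerated == formula)
```

Exact `Fraction` arithmetic is used so the comparison is an equality check rather than a floating-point tolerance check.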
PMF
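Each sample of $n$ people is a valid outcome in our experiment. With $a$ people in group $A$ and $b$ people in group $B$, the number of outcomes is $\binom{a+b}{n}$, and they are all equally likely.
To find $P(X = k)$, we need to count the number of samples that contain exactly $k$ people from group $A$.
- $\binom{a}{k}$ ways to choose $k$ people from group $A$.
- $\binom{b}{n-k}$ ways to choose the remaining $n-k$ people from group $B$.
Therefore, we have
$$P(X = k) = \frac{\binom{a}{k}\binom{b}{n-k}}{\binom{a+b}{n}}$$
for integers $0 \le k \le a$ and $0 \le n-k \le b$.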
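The counting argument can be verified on a small case. This is a minimal sketch, assuming group sizes `a` and `b` and sample size `n` as above; the function names and the example values `a = 5`, `b = 4`, `n = 3` are hypothetical. It computes the PMF from the binomial-coefficient formula and compares it against a direct enumeration of every possible sample.

```python
from fractions import Fraction
from itertools import combinations
from math import comb


def hypergeom_pmf(k, a, b, n):
    """P(X = k): k people from group A (size a) in a sample of n taken without replacement."""
    if k < 0 or k > a or n - k < 0 or n - k > b:
        return Fraction(0)  # outside the support
    return Fraction(comb(a, k) * comb(b, n - k), comb(a + b, n))


def enumerated_pmf(a, b, n):
    """Same distribution, obtained by listing every unordered sample of size n."""
    population = [("A", i) for i in range(a)] + [("B", i) for i in range(b)]
    counts = [0] * (n + 1)
    for sample in combinations(population, n):
        counts[sum(1 for group, _ in sample if group == "A")] += 1
    total = comb(a + b, n)
    return [Fraction(c, total) for c in counts]


# Hypothetical small example: 5 people in group A, 4 in group B, sample of 3.
a, b, n = 5, 4, 3
formula = [hypergeom_pmf(k, a, b, n) for k in range(n + 1)]
print(formula == enumerated_pmf(a, b, n))
```

The explicit support check matters because `math.comb` raises `ValueError` for a negative second argument, which would otherwise occur when `n - k` is negative.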
Expectation
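We can use Indicator variables for Hypergeometric distributions as well. Given $X \sim \text{Hypergeometric}(a, b, n)$, we let $X = I_1 + I_2 + \cdots + I_a$ be the sum of indicator variables, where $I_j = 1$ if the $j$th person of group $A$ is included in the sample and $I_j = 0$ otherwise.
This assumes that $X$ equals the number of people from group $A$ in our sample of $n$. Without loss of generality, the same argument applies to counting people from group $B$ as well.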
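The notes set up the indicators but stop before stating the expected value. A sketch of the usual way to finish the argument follows, assuming the indicators $I_j$ run over the $a$ members of group $A$, that $b$ is the size of group $B$, and that every one of the $a+b$ people is equally likely to end up in the sample of $n$:

$$
E[I_j] = P(I_j = 1) = \frac{\binom{a+b-1}{n-1}}{\binom{a+b}{n}} = \frac{n}{a+b},
\qquad
E[X] = \sum_{j=1}^{a} E[I_j] = \frac{na}{a+b}.
$$

Linearity of expectation does not require the $I_j$ to be independent, which matters here because sampling without replacement makes them dependent. The result equals the mean of a Binomial with $n$ trials and success probability $\frac{a}{a+b}$, so sampling with or without replacement gives the same expected count.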