The binomial distribution

Contents
1. Definition
2. Derivation of the binomial distribution
3. Use of the binomial distribution
4. Mean & Variance

1. Definition

Let X_i for i=1,2,...,n be independent random variables such that X_i \sim Bernoulli(p) for each i. Then the probability mass function of X = X_1 + X_2 + ... + X_n is given by

p(X=x) = \binom{n}{x} p^x (1-p)^{n-x}

We write X \sim Binomial(n,p).
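As a quick sketch, the pmf above can be computed directly in Python (the helper name `binomial_pmf` and the values n = 10, p = 0.3 are illustrative choices, not from the text):

```python
from math import comb

def binomial_pmf(x: int, n: int, p: float) -> float:
    """P(X = x) for X ~ Binomial(n, p)."""
    return comb(n, x) * p**x * (1 - p)**(n - x)

# Sanity check: the probabilities of x = 0, 1, ..., n sum to 1.
total = sum(binomial_pmf(x, 10, 0.3) for x in range(11))
print(total)  # ≈ 1.0
```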

2. Derivation of the binomial distribution

The derivation of the binomial distribution is rather simple. From combinatorics, we have that there are \binom{n}{x} ways of arranging n elements of which x are of one kind and the remaining (n-x) are of another. Thus, there are \binom{n}{x} distinct sequences containing x ones and (n-x) zeros. Since the Bernoulli trials are independent, each such sequence occurs with probability p^x (1-p)^{n-x}, and summing over all of them gives

\sum^{\binom{n}{x}}_{i=1} p^x (1-p)^{n-x} = \binom{n}{x} p^x (1-p)^{n-x}
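The counting argument can be checked by brute force: enumerating all 2^n outcome sequences and keeping those with exactly x ones recovers \binom{n}{x} (a small illustrative sketch; the values n = 4, x = 2, p = 0.3 are arbitrary):

```python
from itertools import product
from math import comb

n, x, p = 4, 2, 0.3  # arbitrary illustrative values
# All 2^n Bernoulli outcome sequences with exactly x ones.
seqs = [s for s in product((0, 1), repeat=n) if sum(s) == x]
print(len(seqs), comb(n, x))  # 6 6
# Each such sequence has probability p^x (1-p)^(n-x), so:
prob = len(seqs) * p**x * (1 - p)**(n - x)
```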

3. Use of the binomial distribution

We use the binomial distribution when we have a series of n independent Bernoulli trials, each with the same success probability p, and we wish to find the probability of exactly x successes among those n trials.
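For instance (a hypothetical example, not from the text), the probability of exactly 3 heads in 5 tosses of a fair coin is \binom{5}{3} (0.5)^3 (0.5)^2 = 0.3125:

```python
from math import comb

n, p = 5, 0.5  # 5 tosses of a fair coin
x = 3          # exactly 3 heads
prob = comb(n, x) * p**x * (1 - p)**(n - x)
print(prob)  # 0.3125
```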

4. Mean & Variance

  • E(X) = np
  • Var(X) = np(1-p)

Proof

Recall that the expectation of a sum of random variables is the sum of their expectations (this holds even without independence), i.e.,

E(X_1 + X_2) = E(X_1)+E(X_2).

Thus, for the sum of n independent Bernoulli random variables,

X = X_1 + X_2 + ... + X_n

We have,

E(X_1 + X_2 + ... + X_n) = E(X_1) + E(X_2) + ... + E(X_n)

Thus, since E(X_i) = p for each Bernoulli(p) variable, we have

E(X) = E(X_1 + X_2 + ... + X_n) = E(X_1) + E(X_2) + ... + E(X_n) = np
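The result E(X) = np can be verified numerically by summing x · p(X = x) over all outcomes (a sketch; n = 10, p = 0.3 are arbitrary illustrative values):

```python
from math import comb

n, p = 10, 0.3  # arbitrary illustrative values
mean = sum(x * comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1))
print(mean, n * p)  # both ≈ 3.0
```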

Similarly, the variance of a sum of independent random variables equals the sum of their variances (here independence is needed), and since Var(X_i) = p(1-p) for each i,

Var(X) = Var(X_1 + X_2 + ... + X_n) \\ = Var(X_1) + Var(X_2) + ... + Var(X_n) \\ = np(1-p) \quad\square