Definition – Normal distribution.

Page contents
1. Definition
2. Bessel's Correction

Definition

Let Z be a random variable. We say that Z is normally distributed if it has the pdf

f_Z(z) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{1}{2} \left(\frac{z-\mu}{\sigma}\right)^2}

and we write Z \sim N(\mu,\sigma^2). By convention, the letter Z is often reserved for a normally distributed random variable, particularly one following the standard normal distribution.

The standard normal distribution is as above, but with \mu = 0 and \sigma^2 = 1.
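
As a quick sanity check on the formula, the following minimal Python sketch evaluates the density directly (the helper name normal_pdf is ours, not from any library):

```python
import math

def normal_pdf(z: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of N(mu, sigma^2) at z, per the formula above."""
    coeff = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    exponent = -0.5 * ((z - mu) / sigma) ** 2
    return coeff * math.exp(exponent)

# Standard normal (mu = 0, sigma = 1) at z = 0: 1 / sqrt(2*pi) ~ 0.3989
print(normal_pdf(0.0))
# N(1, 4), i.e. mu = 1 and sigma = 2, has peak height 1 / (2*sqrt(2*pi)) at z = 1
print(normal_pdf(1.0, mu=1.0, sigma=2.0))
```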

Bessel’s Correction

The maximum likelihood estimator for the population variance \sigma^2, based on an iid, normally distributed sample \mathbf{X} = (X_1, \ldots, X_n) of size n, is

\bar{s}^2 = \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n},

where \bar{X} = \frac{1}{n}\sum_{i=1}^n X_i is the sample mean.

The expected value of the estimator \bar{s}^2 is \frac{n-1}{n}\sigma^2, so \bar{s}^2 systematically underestimates \sigma^2.
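
Before the proof, this claim can be checked numerically. Below is a hedged Monte Carlo sketch (the parameter values mu, sigma, n, and trials are arbitrary choices of ours): it averages the estimator over many samples and compares the result to \frac{n-1}{n}\sigma^2.

```python
import random

# Draw many iid N(mu, sigma^2) samples of size n, compute the MLE estimator
# s-bar^2 for each, and compare the average against (n - 1)/n * sigma^2.
mu, sigma, n, trials = 5.0, 2.0, 10, 100_000
random.seed(0)

total = 0.0
for _ in range(trials):
    sample = [random.gauss(mu, sigma) for _ in range(n)]
    xbar = sum(sample) / n
    total += sum((x - xbar) ** 2 for x in sample) / n  # s-bar^2 for this sample

print(total / trials)            # approximately 3.6
print((n - 1) / n * sigma ** 2)  # 3.6 exactly, the predicted biased expectation
```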

Proof

\begin{aligned} E(\bar{s}^2) = E\left(\frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n}\right) &= \frac{1}{n} E\left[\sum_{i=1}^n X_i^2 - 2\bar{X} \sum_{i=1}^n X_i + \sum_{i=1}^n \bar{X}^2\right] \\\\ &= \frac{1}{n} E\left[\sum_{i=1}^n X_i^2 - 2n\bar{X}^2 + n\bar{X}^2\right] \\\\ &= \frac{1}{n} \sum_{i=1}^n E(X_i^2) - E(\bar{X}^2) \\\\ &= E(X_i^2) - E(\bar{X}^2),\end{aligned}

using \sum_{i=1}^n X_i = n\bar{X} in the second step and the identical distribution of the X_i in the last.

However, E(\bar{X}^2) = var(\bar{X}) + E(\bar{X})^2 = \frac{1}{n^2}var\left(\sum_{i=1}^n X_i\right) + \mu^2, so

\begin{aligned} E(X_i^2) - E(\bar{X}^2) &= E(X_i^2) - \frac{1}{n^2}var\left(\sum_{i=1}^n X_i\right) - \mu^2 \\\\ &= var(X_i) + \mu^2 - \frac{1}{n^2}var\left(\sum_{i=1}^n X_i\right) - \mu^2 \\\\ &= var(X_i) - \frac{1}{n^2}var\left(\sum_{i=1}^n X_i\right)\end{aligned}

Since, by assumption, the sample elements are independent, we may use the fact that var(X+Y) = var(X) + var(Y) for independent X and Y, which gives var\left(\sum_{i=1}^n X_i\right) = n \, var(X_i). Thus,

\begin{aligned} var(X_i) - \frac{1}{n^2}var(\sum_{i=1}^n X_i) &= var(X_i) - \frac{1}{n}var(X_i) \\\\ &= \sigma^2 - \frac{1}{n}\sigma^2 \\\\ &= \frac{n-1}{n}\sigma^2\quad \square\end{aligned}

We have shown that \bar{s}^2 is a biased estimator, since E(\bar{s}^2) = \frac{n-1}{n}\sigma^2 \neq \sigma^2. To correct for this bias (and by the linearity of expectation), the unbiased estimator for the population variance is \frac{n}{n-1}\bar{s}^2 = \frac{\sum_{i=1}^n (X_i - \bar{X})^2}{n-1}. Multiplying by the factor \frac{n}{n-1} is known as Bessel's correction.
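
For reference, Python's standard statistics module exposes both estimators: statistics.pvariance divides by n (the MLE above), while statistics.variance divides by n-1 (Bessel-corrected). A short sketch checking that they differ by exactly the factor \frac{n}{n-1}:

```python
import random
import statistics

random.seed(1)
sample = [random.gauss(0.0, 1.0) for _ in range(10)]
n = len(sample)

biased = statistics.pvariance(sample)   # divides by n: the MLE s-bar^2
unbiased = statistics.variance(sample)  # divides by n - 1: Bessel-corrected

# The two estimates differ by exactly the correction factor n / (n - 1).
print(unbiased)
print(biased * n / (n - 1))  # same value as the line above
```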