Bessel's Correction


A short proof of why the maximum likelihood estimator (MLE) of the variance is biased.

When we measure the variation of our random variable (r.v.) around the sample mean, the mean is itself computed from the same observations. The sample mean is exactly the point that minimizes the squared deviations of the sample, so using it in place of the true mean removes one degree of freedom and skews the result towards less variance.

Conversely, the MLE of the variance computed with respect to the true population mean is unbiased. This can be seen in the factor (n-1)/n, which tends to 1 as n grows and the sample mean converges to the true population mean.


In other words:
$$ \sigma_{MVU}^2 = \frac{n}{n-1} \sigma_{ML}^2 $$
$$ = \frac{1}{n-1} \sum_{i=1}^n (x_i - \mu_{ML})^2$$
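The bias can also be checked numerically. Below is a minimal Monte Carlo sketch (assuming NumPy; the population parameters mu = 3, sigma = 2 and sample size n = 5 are arbitrary choices for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
mu, sigma, n, trials = 3.0, 2.0, 5, 200_000

# Draw `trials` independent samples of size n from N(mu, sigma^2).
samples = rng.normal(mu, sigma, size=(trials, n))
sample_means = samples.mean(axis=1, keepdims=True)
sq_dev = ((samples - sample_means) ** 2).sum(axis=1)

var_ml = sq_dev / n         # MLE: divide by n (biased)
var_mvu = sq_dev / (n - 1)  # Bessel-corrected: divide by n - 1 (unbiased)

print(var_ml.mean())   # close to (n-1)/n * sigma^2 = 3.2
print(var_mvu.mean())  # close to sigma^2 = 4.0
```

Averaged over many samples, dividing by n systematically underestimates sigma^2 by the factor (n-1)/n, while dividing by n - 1 recovers it.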


Full proof:

$$ \mathbb{E}[\sigma_{ML}^2] = \mathbb{E}[\frac{1}{n} \sum_{i=1}^n(x_i - \mu_{ML})^2] $$

$$ = \color{red}\frac{1}{n} \sum_{i=1}^n \mathbb{E}[x_i^2] \color{black} -
\color{blue}2 \, \mathbb{E}[x_i \mu_{ML}]\color{black} +
\color{green}\mathbb{E}[\mu_{ML}^2] \color{black}$$

(By symmetry, each cross term E[x_i mu_ML] is identical, so the average over i collapses to a single term.)

$$ = \color{red} \frac{1}{n} \sum_{i=1}^n \mathbb{E}[x_i^2] \color{black} -
\color{blue} \frac{2}{n} \mathbb{E}[x_i \sum_{j=1}^n x_j] \color{black} +
\color{green} \frac{1}{n^2} \mathbb{E}[(\sum_{i=1}^n x_i)^2]\color{black}$$

$$ = \color{red} \frac{1}{n} \sum_{i=1}^n (\sigma^2 + \mu^2) \color{black} -
\color{blue}\frac{2}{n} \mathbb{E}[x_i^2 + x_i \sum_{j \neq i} x_j] \color{black} +
\color{green}\frac{1}{n^2}\mathbb{E}[\sum_{i=1}^n x_i^2 + \sum_{i \neq j} x_i x_j]\color{black}$$

$$ = \color{red} \sigma^2 + \mu^2 \color{black} -
\color{blue} 2(\frac{1}{n}(\sigma^2 + \mu^2) + \frac{n-1}{n} \mu^2)\color{black} +
\color{green}\frac{n}{n^2} (\mu^2 + \sigma^2) + \frac{n^2 - n}{n^2} \mu^2 \color{black} $$

$$ = \frac{n-1}{n} \sigma^2 $$

Given that:
$$
\begin{cases}
\mathbb{E}[x_i x_j] = \mu^2 + \sigma^2 & \text{if } i = j \\
\mathbb{E}[x_i x_j] = \mu^2 & \text{if } i \neq j
\end{cases}
$$
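These two moment identities, together with linearity of expectation, are enough to verify the whole result symbolically. A sketch assuming SymPy, with a small concrete n = 4 chosen purely for illustration:

```python
import sympy as sp

n = 4  # a small concrete sample size, for illustration only
mu, sigma = sp.symbols("mu sigma", positive=True)
xs = sp.symbols(f"x0:{n}")

# sigma_ML^2 = (1/n) * sum_i (x_i - mu_ML)^2, with mu_ML the sample mean.
mean_ml = sp.Rational(1, n) * sum(xs)
var_ml = sp.expand(sp.Rational(1, n) * sum((x - mean_ml) ** 2 for x in xs))

# Take the expectation term by term, using the two moment rules:
# E[x_i x_j] = mu^2 for i != j, and E[x_i^2] = mu^2 + sigma^2.
subs = [(xi * xj, mu**2) for i, xi in enumerate(xs)
        for j, xj in enumerate(xs) if i < j]
subs += [(xi**2, mu**2 + sigma**2) for xi in xs]
expected = sp.simplify(var_ml.subs(subs))

print(expected)  # 3*sigma**2/4, i.e. (n-1)/n * sigma^2 for n = 4
```

The mu^2 terms cancel exactly, leaving only the (n-1)/n factor in front of sigma^2, in agreement with the proof above.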

By Yann HOFFMANN