# Chebyshev’s Inequality

## Chebyshev's Inequality

Chebyshev's Inequality is a very powerful inequality, because it applies to any probability distribution.
Chebyshev's Inequality is used to estimate the probability that a random variable $X$ is within $k$ standard deviation of the mean.

Before we derive Chebyshev's Inequality, let us derive the Chebyshev's Theorem.

## Chebyshev's Theorem

If $g(x)$ is a non-negative function and $f(x)$ be p.m.f. or p.d.f. of a random variable $X$, having finite expectation and if $k$ is any positive real constant, then

 $$\begin{equation*} P[g(x)\geq k] \leq \frac{E[g(x)]}{k}\quad \text{ and }\quad P[g(x) < k] \geq 1-\frac{E[g(x)]}{k} \end{equation*}$$

#### Proof

Discrete Case

Let $X$ be a discrete random variable with p.m.f. $f(x)$. Let $g(x)$ be a non-negative function of $X$. Then $E[g(x)] = \sum g(x)f(x)$.

Let us introduce a new random variable $Y$ such that

 $$\begin{equation*} Y=\left\{ \begin{array}{ll} 0, & \hbox{g(x) < k;} \\ k, & \hbox{g(x) \geq k.} \end{array} \right. \end{equation*}$$

Hence,

 $$\begin{eqnarray*} E(Y) & = & 0\cdot P[Y=0] + k \times P[Y=k]\\ &=& k\times P[Y=k]\\ & = & k \times P[g(x)\geq k]. \end{eqnarray*}$$

But $g(x) \geq Y$. Therefore $E[g(x)] \geq E[Y]$.

 $$\begin{eqnarray*} \therefore E[g(x)] &\geq & k\times P[g(x) \geq k]\\ \Rightarrow P[g(x)\geq k] &\leq & \dfrac{E[g(x)]}{k}. \end{eqnarray*}$$

By taking complement of the above probability, we get

 $$\begin{eqnarray*} P[g(x) < k] & = & 1- P[g(x) \geq k]\\ \text{i.e., } P[g(x) < k] &\geq & 1 -\frac{E[g(x)]}{k}. \end{eqnarray*}$$

Continuous Case

Let $X$ be a continuous random variable with p.d.f. $f(x)$ and $g(x)$ be a non-negative function of $X$. Let $S$ be the set of all $x$ for which $g(x)\geq k$, i.e., $S = { x | g(x) \geq k}$, so $\overline{S} = { x | g(x) < k}$.

 $$\begin{eqnarray*} E[g(x)] &=& \int_{-\infty}^{\infty} g(x) f(x) \; dx \\ &=& \int_S g(x) f(x) \;dx +\int_{\overline{S}}g(x) f(x) \; dx\\ &\geq & \int_S g(x) f(x) \;dx \;\;\text{ (since both the integrals are positive)}\\ & \geq & k \int_S f(x) \;dx \;\; \text{ (since g(x) \geq k)}\\ & \geq & k \times P[g(x)\geq k]. \end{eqnarray*}$$

Therefore,

 $$\begin{equation*} P[g(x)\geq k]\leq \frac{E[g(x)]}{k}. \end{equation*}$$

Taking complement, we get

 $$\begin{equation*} P[g(x)< k]\geq 1-\frac{E[g(x)]}{k}. \end{equation*}$$

## Chebyshev's Inequality

Let $X$ be a random variable with mean $\mu$ and finite variance $\sigma^2$. Then for any real constant $k>0$,

 $$\begin{equation*} P[|X-\mu| \geq k\sigma] \leq \frac{1}{k^2}\quad \text{ and }\quad P[|X-\mu|< k\sigma] \geq 1-\frac{1}{k^2} \end{equation*}$$

OR

If $\mu$ and $\sigma$ are the mean and the standard deviation of a random variable $X$, then for any positive constant $k$, the probability is at least $1-\dfrac{1}{k^2}$ that $X$ will take on a value within $k$ standard deviations of the mean. In notation it can be written as

 $$\begin{equation*} P[|X-\mu|< k\sigma] \geq 1-\frac{1}{k^2} \end{equation*}$$

#### Proof

We know that $\sigma^2 = E(X-\mu)^2$.

Discrete Case

Let $X$ be a discrete random variable with p.m.f. $f(x)$.

 \begin{aligned} \sigma^2 &= E(X-\mu)^2 \\ &=\sum_x (x-\mu)^2 f(x) \\ &=\sum_{-\infty}^{\mu-k\sigma}(x-\mu)^2 f(x)+\sum_{\mu-k\sigma}^{\mu+k\sigma}(x-\mu)^2 f(x)\\ &\quad +\sum_{\mu+k\sigma}^{\infty}(x-\mu)^2 f(x) \end{aligned}

Since the second summand is nonnegative, we can form the Chebyshev's inequality by deleting the second summation as

 $$$$\label{eq1} \sigma^2 \geq \sum_{-\infty}^{\mu-k\sigma}(x-\mu)^2 f(x) +\sum_{\mu+k\sigma}^{\infty}(x-\mu)^2 f(x)$$$$

In the first part $x < \mu-k\sigma$ $\Rightarrow x-\mu< - k\sigma$ $\Rightarrow (x-\mu)^2\geq k^2\sigma^2$.

In the second part $x \geq \mu+k\sigma$ $\Rightarrow x-\mu\geq k\sigma$ $\Rightarrow (x-\mu)^2\geq k^2\sigma^2$.

Using these in \eqref{eq1}, we get

 $$\begin{eqnarray*} \sigma^2 &\geq & \sum_{-\infty}^{\mu-k\sigma}k^2\sigma^2 f(x) +\sum_{\mu+k\sigma}^{\infty}k^2\sigma^2 f(x) \\ \frac{1}{k^2} &\geq & \sum_{-\infty}^{\mu-k\sigma} f(x) +\sum_{\mu+k\sigma}^{\infty}f(x) \\ &\geq& P[X\leq \mu-k\sigma] + P[X\geq \mu +k\sigma]\\ &\geq& P[X-\mu \leq k\sigma] + P[X-\mu \geq k\sigma]\\ &\geq& P[|X-\mu| \geq k\sigma]\\ \text{i.e., } P[|X-\mu| \geq k\sigma]& \leq & \frac{1}{k^2}. \end{eqnarray*}$$

Taking complement, we get

 $$\begin{eqnarray*} P[|X-\mu| < k\sigma]& = & 1- P[|X-\mu|\geq k\sigma]\\ & \geq & 1-\frac{1}{k^2}. \end{eqnarray*}$$

Continuous Case

Let $X$ be a continuous random variable with p.d.f. $f(x)$.

 $$\begin{eqnarray*} \sigma^2 &=& E(X-\mu)^2 \\ &=&\int_x (x-\mu)^2 \cdot f(x)\; dx \\ &=&\int_{-\infty}^{\mu-k\sigma}(x-\mu)^2\cdot f(x)\; dx+\int_{\mu-k\sigma}^{\mu+k\sigma}(x-\mu)^2\cdot f(x)\; dx \\ & & +\int_{\mu+k\sigma}^{\infty}(x-\mu)^2 \cdot f(x)\; dx. \end{eqnarray*}$$

Since the integrand $(x-\mu)^2\cdot f(x)$ is nonnegative, we can form the inequality by deleting the second integral as

 $$$$\label{eq2} \sigma^2 \geq \int_{-\infty}^{\mu-k\sigma}(x-\mu)^2\cdot f(x)\; dx +\int_{\mu+k\sigma}^{\infty}(x-\mu)^2\cdot f(x)\; dx$$$$

In the first part $x < \mu-k\sigma$ $\Rightarrow x-\mu< - k\sigma$ $\Rightarrow (x-\mu)^2\geq k^2\sigma^2$.

In the second part $x \geq \mu+k\sigma$ $\Rightarrow x-\mu\geq k\sigma$ $\Rightarrow (x-\mu)^2\geq k^2\sigma^2$.

Using these in \eqref{eq2}, we get

 $$\begin{eqnarray*} \sigma^2 &\geq & \int_{-\infty}^{\mu-k\sigma}k^2\sigma^2\cdot f(x)\; dx +\int_{\mu+k\sigma}^{\infty}k^2\sigma^2\cdot f(x)\; dx \\ \frac{1}{k^2} &\geq & \int_{-\infty}^{\mu-k\sigma} f(x)\; dx +\int_{\mu+k\sigma}^{\infty}f(x)\; dx \\ &\geq& P[X\leq \mu-k\sigma] + P[X\geq \mu +k\sigma]\\ &\geq& P[X-\mu \leq-k\sigma] + P[X-\mu \geq k\sigma]\\ &\geq& P[|X-\mu| \geq k\sigma] \end{eqnarray*}$$

 \begin{aligned} \text{i.e., } P[|X-\mu| \geq k\sigma]& \leq \frac{1}{k^2}. \end{aligned}

Taking complement, we get

 $$\begin{eqnarray*} P[|X-\mu| < k\sigma]& = & 1- P[|X-\mu|\geq k\sigma]\\ & \geq & 1-\frac{1}{k^2}. \end{eqnarray*}$$

## Another Form of Chebyshev's Inequality

Let $X$ be a random variable with mean $\mu$ and finite variance $\sigma^2$. Then for any real constant $k>0$,

 $$\begin{equation*} P[|X-\mu| \geq k] \leq \frac{\sigma^2}{k^2} \end{equation*}$$

and

 $$$$P[|X-\mu|< k] \geq 1-\frac{\sigma^2}{k^2}$$$$

First inequality gives upper bound for the probability whereas the second inequality gives lower bound for the probability.