Chebyshev's Inequality

Chebyshev's Inequality is a powerful result because it applies to any probability distribution with finite mean and variance.
Chebyshev's Inequality is used to bound the probability that a random variable $X$ falls within $k$ standard deviations of its mean.

Before we derive Chebyshev's Inequality, let us first prove Chebyshev's Theorem.

Chebyshev's Theorem

If $g(x)$ is a non-negative function, $f(x)$ is the p.m.f. or p.d.f. of a random variable $X$ such that $E[g(x)]$ is finite, and $k$ is any positive real constant, then

$$ \begin{equation*} P[g(x)\geq k] \leq \frac{E[g(x)]}{k}\quad \text{ and }\quad P[g(x) < k] \geq 1-\frac{E[g(x)]}{k} \end{equation*} $$
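To see the bound in action, here is a minimal numeric check (the fair die and the choice $g(x) = x^2$ are my own illustration, not from the source). It computes $E[g(X)]/k$ exactly with rational arithmetic and confirms that $P[g(X) \geq k]$ never exceeds it.

```python
# Minimal sketch: verify P[g(X) >= k] <= E[g(X)]/k for a fair die and g(x) = x^2.
# (Example distribution and g are assumptions for illustration.)
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair six-sided die
g = lambda x: x * x

E_gX = sum(g(x) * p for x, p in pmf.items())    # E[g(X)] = 91/6

for k in (10, 20, 30):
    prob = sum(p for x, p in pmf.items() if g(x) >= k)  # exact P[g(X) >= k]
    bound = E_gX / k                                     # the theorem's bound
    assert prob <= bound
    print(f"k={k}: P[g(X)>={k}] = {prob} <= E[g(X)]/k = {bound}")
```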

Proof

Discrete Case

Let $X$ be a discrete random variable with p.m.f. $f(x)$, and let $g(x)$ be a non-negative function of $X$. Then $E[g(x)] = \sum_x g(x)f(x)$.

Let us introduce a new random variable $Y$ such that

$$ \begin{equation*} Y=\left\{ \begin{array}{ll} 0, & \hbox{$g(x) < k$;} \\ k, & \hbox{$g(x) \geq k$.} \end{array} \right. \end{equation*} $$

Hence,

$$ \begin{eqnarray*} E(Y) & = & 0\times P[Y=0] + k \times P[Y=k]\\ &=& k\times P[Y=k]\\ & = & k \times P[g(x)\geq k]. \end{eqnarray*} $$

By construction $g(x) \geq Y$ in every case: if $g(x) \geq k$, then $Y = k \leq g(x)$, and otherwise $Y = 0 \leq g(x)$ since $g$ is non-negative. Therefore $E[g(x)] \geq E[Y]$.

$$ \begin{eqnarray*} \therefore E[g(x)] &\geq & k\times P[g(x) \geq k]\\ \Rightarrow P[g(x)\geq k] &\leq & \dfrac{E[g(x)]}{k}. \end{eqnarray*} $$

Taking the complement of this probability, we get

$$ \begin{eqnarray*} P[g(x) < k] & = & 1- P[g(x) \geq k]\\ \text{i.e., } P[g(x) < k] &\geq & 1 -\frac{E[g(x)]}{k}. \end{eqnarray*} $$
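The auxiliary variable $Y$ can be checked numerically as well. The sketch below (an assumed example, reusing the die from the check above) builds $Y$ directly from its definition and confirms the key step $E[Y] = k \cdot P[g(x) \geq k] \leq E[g(x)]$.

```python
# Sketch of the proof's auxiliary variable Y (die example is an assumption):
# Y = k when g(x) >= k, else 0, so E[Y] = k * P[g(X) >= k] <= E[g(X)].
from fractions import Fraction

pmf = {x: Fraction(1, 6) for x in range(1, 7)}  # fair six-sided die
g = lambda x: x * x
k = 20

E_gX = sum(g(x) * p for x, p in pmf.items())
E_Y = sum((k if g(x) >= k else 0) * p for x, p in pmf.items())

print(f"E[Y] = {E_Y}, E[g(X)] = {E_gX}")
assert E_Y <= E_gX  # the inequality E[g(x)] >= E[Y], numerically confirmed
```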

Continuous Case

Let $X$ be a continuous random variable with p.d.f. $f(x)$ and $g(x)$ be a non-negative function of $X$. Let $S$ be the set of all $x$ for which $g(x)\geq k$, i.e., $S = \{ x : g(x) \geq k\}$, so $\overline{S} = \{ x : g(x) < k\}$.

$$ \begin{eqnarray*} E[g(x)] &=& \int_{-\infty}^{\infty} g(x) f(x) \; dx \\ &=& \int_S g(x) f(x) \;dx +\int_{\overline{S}}g(x) f(x) \; dx\\ &\geq & \int_S g(x) f(x) \;dx \;\;\text{ (since the second integral is non-negative)}\\ & \geq & k \int_S f(x) \;dx \;\; \text{ (since $g(x) \geq k$ on $S$)}\\ & = & k \times P[g(x)\geq k]. \end{eqnarray*} $$

Therefore,

$$ \begin{equation*} P[g(x)\geq k]\leq \frac{E[g(x)]}{k}. \end{equation*} $$

Taking the complement, we get

$$ \begin{equation*} P[g(x)< k]\geq 1-\frac{E[g(x)]}{k}. \end{equation*} $$
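For the continuous case a closed-form check is available (the exponential distribution and $g(x) = x$ are my choice, not the source's): for $X \sim \text{Exp}(1)$ we have $E[X] = 1$ and $P[X \geq k] = e^{-k}$, which indeed stays below $1/k$.

```python
# Closed-form check of the continuous case for X ~ Exp(1) with g(x) = x
# (distribution chosen for illustration): P[X >= k] = exp(-k) <= E[X]/k = 1/k.
import math

for k in (1.0, 2.0, 5.0):
    exact = math.exp(-k)   # P[X >= k] for Exp(1)
    bound = 1.0 / k        # E[X]/k with E[X] = 1
    assert exact <= bound
    print(f"k={k}: exact {exact:.4f} <= bound {bound:.4f}")
```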

Chebyshev's Inequality

Let $X$ be a random variable with mean $\mu$ and finite variance $\sigma^2$. Then for any real constant $k>0$,

$$ \begin{equation*} P[|X-\mu| \geq k\sigma] \leq \frac{1}{k^2}\quad \text{ and }\quad P[|X-\mu|< k\sigma] \geq 1-\frac{1}{k^2} \end{equation*} $$

Equivalently:

If $\mu$ and $\sigma$ are the mean and the standard deviation of a random variable $X$, then for any positive constant $k$, the probability is at least $1-\dfrac{1}{k^2}$ that $X$ takes on a value within $k$ standard deviations of the mean. In symbols,

$$ \begin{equation*} P[|X-\mu|< k\sigma] \geq 1-\frac{1}{k^2} \end{equation*} $$
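A quick Monte Carlo sanity check (an illustrative sketch; the exponential distribution is my own choice) estimates $P[|X-\mu| < k\sigma]$ from simulated data and compares it with the guaranteed lower bound $1 - 1/k^2$.

```python
# Monte Carlo sketch of Chebyshev's Inequality for Exp(1), where mu = sigma = 1.
# (Distribution and sample size are assumptions for illustration.)
import random

random.seed(0)
sample = [random.expovariate(1.0) for _ in range(100_000)]
mu, sigma = 1.0, 1.0

for k in (1.5, 2.0, 3.0):
    inside = sum(abs(x - mu) < k * sigma for x in sample) / len(sample)
    bound = 1 - 1 / k**2
    print(f"k={k}: empirical {inside:.4f} >= guaranteed {bound:.4f}")
```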

Proof

We know that $\sigma^2 = E(X-\mu)^2$.

Discrete Case

Let $X$ be a discrete random variable with p.m.f. $f(x)$.

$$ \begin{aligned} \sigma^2 &= E(X-\mu)^2 \\ &=\sum_x (x-\mu)^2 f(x) \\ &=\sum_{x\leq\mu-k\sigma}(x-\mu)^2 f(x)+\sum_{\mu-k\sigma<x<\mu+k\sigma}(x-\mu)^2 f(x)\\ &\quad +\sum_{x\geq\mu+k\sigma}(x-\mu)^2 f(x) \end{aligned} $$

Since every term $(x-\mu)^2 f(x)$ is non-negative, dropping the middle summation can only make the right-hand side smaller, so

$$ \begin{equation}\label{eq1} \sigma^2 \geq \sum_{x\leq\mu-k\sigma}(x-\mu)^2 f(x) +\sum_{x\geq\mu+k\sigma}(x-\mu)^2 f(x) \end{equation} $$

In the first sum, $x \leq \mu-k\sigma$ $\Rightarrow x-\mu\leq -k\sigma$ $\Rightarrow (x-\mu)^2\geq k^2\sigma^2$.

In the second sum, $x \geq \mu+k\sigma$ $\Rightarrow x-\mu\geq k\sigma$ $\Rightarrow (x-\mu)^2\geq k^2\sigma^2$.

Using these in \eqref{eq1} and dividing both sides by $k^2\sigma^2$, we get

$$ \begin{eqnarray*} \sigma^2 &\geq & \sum_{x\leq\mu-k\sigma}k^2\sigma^2 f(x) +\sum_{x\geq\mu+k\sigma}k^2\sigma^2 f(x) \\ \frac{1}{k^2} &\geq & \sum_{x\leq\mu-k\sigma} f(x) +\sum_{x\geq\mu+k\sigma}f(x) \\ &=& P[X\leq \mu-k\sigma] + P[X\geq \mu +k\sigma]\\ &=& P[X-\mu \leq -k\sigma] + P[X-\mu \geq k\sigma]\\ &=& P[|X-\mu| \geq k\sigma]\\ \text{i.e., } P[|X-\mu| \geq k\sigma]& \leq & \frac{1}{k^2}. \end{eqnarray*} $$

Taking the complement, we get

$$ \begin{eqnarray*} P[|X-\mu| < k\sigma]& = & 1- P[|X-\mu|\geq k\sigma]\\ & \geq & 1-\frac{1}{k^2}. \end{eqnarray*} $$
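As a concrete discrete example (the binomial parameters below are my own, not from the source), the exact probability $P[|X-\mu| < k\sigma]$ computed from the p.m.f. sits comfortably above the Chebyshev lower bound.

```python
# Exact discrete check of Chebyshev's Inequality for X ~ Binomial(20, 0.3)
# (parameters are an assumption for illustration).
from math import comb, sqrt

n, p = 20, 0.3
mu = n * p                      # mean = 6.0
sigma = sqrt(n * p * (1 - p))   # s.d. ~ 2.05

pmf = {x: comb(n, x) * p**x * (1 - p)**(n - x) for x in range(n + 1)}

for k in (1.5, 2.0):
    exact = sum(q for x, q in pmf.items() if abs(x - mu) < k * sigma)
    bound = 1 - 1 / k**2
    assert exact >= bound
    print(f"k={k}: exact {exact:.4f} >= bound {bound:.4f}")
```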

Continuous Case

Let $X$ be a continuous random variable with p.d.f. $f(x)$.

$$ \begin{eqnarray*} \sigma^2 &=& E(X-\mu)^2 \\ &=&\int_{-\infty}^{\infty} (x-\mu)^2 \cdot f(x)\; dx \\ &=&\int_{-\infty}^{\mu-k\sigma}(x-\mu)^2\cdot f(x)\; dx+\int_{\mu-k\sigma}^{\mu+k\sigma}(x-\mu)^2\cdot f(x)\; dx \\ & & +\int_{\mu+k\sigma}^{\infty}(x-\mu)^2 \cdot f(x)\; dx. \end{eqnarray*} $$

Since the integrand $(x-\mu)^2\cdot f(x)$ is non-negative, dropping the middle integral can only make the right-hand side smaller, so

$$ \begin{equation}\label{eq2} \sigma^2 \geq \int_{-\infty}^{\mu-k\sigma}(x-\mu)^2\cdot f(x)\; dx +\int_{\mu+k\sigma}^{\infty}(x-\mu)^2\cdot f(x)\; dx \end{equation} $$

In the first integral, $x \leq \mu-k\sigma$ $\Rightarrow x-\mu\leq -k\sigma$ $\Rightarrow (x-\mu)^2\geq k^2\sigma^2$.

In the second integral, $x \geq \mu+k\sigma$ $\Rightarrow x-\mu\geq k\sigma$ $\Rightarrow (x-\mu)^2\geq k^2\sigma^2$.

Using these in \eqref{eq2} and dividing both sides by $k^2\sigma^2$, we get

$$ \begin{eqnarray*} \sigma^2 &\geq & \int_{-\infty}^{\mu-k\sigma}k^2\sigma^2\cdot f(x)\; dx +\int_{\mu+k\sigma}^{\infty}k^2\sigma^2\cdot f(x)\; dx \\ \frac{1}{k^2} &\geq & \int_{-\infty}^{\mu-k\sigma} f(x)\; dx +\int_{\mu+k\sigma}^{\infty}f(x)\; dx \\ &=& P[X\leq \mu-k\sigma] + P[X\geq \mu +k\sigma]\\ &=& P[X-\mu \leq-k\sigma] + P[X-\mu \geq k\sigma]\\ &=& P[|X-\mu| \geq k\sigma] \end{eqnarray*} $$

$$ \begin{aligned} \text{i.e., } P[|X-\mu| \geq k\sigma]& \leq \frac{1}{k^2}. \end{aligned} $$

Taking the complement, we get

$$ \begin{eqnarray*} P[|X-\mu| < k\sigma]& = & 1- P[|X-\mu|\geq k\sigma]\\ & \geq & 1-\frac{1}{k^2}. \end{eqnarray*} $$
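For comparison (my own illustration), when $X$ is normal the exact probability is available from the error function; the gap shows how conservative the distribution-free Chebyshev bound can be.

```python
# Compare the exact normal probability P[|X - mu| < k*sigma] = erf(k/sqrt(2))
# with the Chebyshev lower bound 1 - 1/k^2 (comparison added for illustration).
from math import erf, sqrt

for k in (1.0, 2.0, 3.0):
    exact = erf(k / sqrt(2))   # P[|Z| < k] for Z ~ N(0, 1)
    bound = 1 - 1 / k**2       # Chebyshev lower bound (0 when k = 1)
    print(f"k={k}: normal exact {exact:.4f} vs Chebyshev bound {bound:.4f}")
```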

Another Form of Chebyshev's Inequality

Let $X$ be a random variable with mean $\mu$ and finite variance $\sigma^2$. Then for any real constant $k>0$,

$$ \begin{equation*} P[|X-\mu| \geq k] \leq \frac{\sigma^2}{k^2} \end{equation*} $$

and

$$ \begin{equation*} P[|X-\mu|< k] \geq 1-\frac{\sigma^2}{k^2} \end{equation*} $$

The first inequality gives an upper bound on the probability, whereas the second gives a lower bound.
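A quick use of the second form (the numbers are assumed for illustration): if $E[X] = 50$ and $\operatorname{Var}(X) = 16$, then the probability that $X$ deviates from 50 by 10 or more is at most $16/100 = 0.16$.

```python
# Worked use of the second form with assumed mu = 50, Var(X) = 16, k = 10:
# P[|X - 50| >= 10] <= sigma^2/k^2 and P[|X - 50| < 10] >= 1 - sigma^2/k^2.
sigma2, k = 16, 10
upper = sigma2 / k**2       # 0.16
lower = 1 - sigma2 / k**2   # 0.84
print(f"upper bound {upper}, lower bound {lower}")
```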
