# Confidence Interval for means when variances are known

This article covers the theory behind the confidence interval for the difference between two population means when the population standard deviations are known, followed by a step-by-step procedure for constructing that interval.

## CI for difference between two means when variances are known

### Assumptions

a. The two samples are independent.
b. Both samples are simple random samples.
c. Both samples come from normally distributed populations.
d. The two population variances $\sigma^2_1$ and $\sigma^2_2$ are known.

### Derivation

Let $X_1,X_2, \cdots, X_n$ be a random sample from $N(\mu_1,\sigma^2_1)$ with known $\sigma^2_1$.

Let $Y_1,Y_2, \cdots, Y_m$ be a random sample from $N(\mu_2,\sigma^2_2)$ with known $\sigma^2_2$.

The two samples are independent of each other.

Let $\overline{X}=\dfrac{1}{n}\sum_{i=1}^n X_i$ be the sample mean of the first sample and $\overline{Y}=\dfrac{1}{m}\sum_{i=1}^m Y_i$ be the sample mean of the second sample. Then $\overline{X}\sim N(\mu_1,\sigma^2_1/n)$ and $\overline{Y}\sim N(\mu_2,\sigma^2_2/m)$. Moreover, $\overline{X}$ and $\overline{Y}$ are independent.

Then,

$\overline{X}-\overline{Y}\sim N(\mu_1-\mu_2, \dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m})$.
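The distribution of the difference stated above can be checked numerically. The following is a minimal sketch using Python's standard-library `statistics.NormalDist`, whose subtraction operator treats its two operands as independent normal variables. The parameter values are hypothetical and chosen only for illustration.

```python
from math import sqrt
from statistics import NormalDist

# Hypothetical parameter values, chosen only for illustration.
mu1, sigma1, n = 10.0, 3.0, 9
mu2, sigma2, m = 7.0, 4.0, 16

# Sampling distributions of the two sample means.
xbar_dist = NormalDist(mu1, sigma1 / sqrt(n))  # N(mu1, sigma1^2 / n)
ybar_dist = NormalDist(mu2, sigma2 / sqrt(m))  # N(mu2, sigma2^2 / m)

# Subtracting two NormalDist objects assumes they are independent:
# the means subtract and the variances add.
diff_dist = xbar_dist - ybar_dist

print("mean of Xbar - Ybar:    ", diff_dist.mean)
print("variance of Xbar - Ybar:", diff_dist.variance)
print("sigma1^2/n + sigma2^2/m:", sigma1**2 / n + sigma2**2 / m)
```

The last two printed values should agree up to floating-point rounding, matching the variance $\sigma^2_1/n+\sigma^2_2/m$ in the formula.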

Therefore,

 $$Z=\dfrac{\overline{X}-\overline{Y}-(\mu_1-\mu_2)}{\sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}}\sim N(0,1)$$

Here $Z$ is a function of the sample observations and of the parameters $\mu_1$ and $\mu_2$, while its distribution, $N(0,1)$, does not depend on any unknown parameter. Hence $Z$ can be used as a pivotal quantity.
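The pivotal property can be illustrated by simulation. The sketch below is only a rough check under assumed, hypothetical parameter values: it draws repeated samples, computes $Z$ each time, and compares the empirical mean and variance of $Z$ for two very different choices of $(\mu_1, \mu_2)$. The function name `simulate_z` is introduced here for illustration.

```python
import random
from math import sqrt
from statistics import fmean, pvariance

def simulate_z(mu1, mu2, sigma1, sigma2, n, m, reps, rng):
    """Draw `reps` realisations of the pivotal quantity Z."""
    se = sqrt(sigma1**2 / n + sigma2**2 / m)
    zs = []
    for _ in range(reps):
        xbar = fmean([rng.gauss(mu1, sigma1) for _ in range(n)])
        ybar = fmean([rng.gauss(mu2, sigma2) for _ in range(m)])
        zs.append((xbar - ybar - (mu1 - mu2)) / se)
    return zs

rng = random.Random(0)  # fixed seed so the run is reproducible
results = {}
for mu1, mu2 in [(0.0, 0.0), (50.0, -20.0)]:
    zs = simulate_z(mu1, mu2, 2.0, 3.0, n=8, m=12, reps=5000, rng=rng)
    results[(mu1, mu2)] = (fmean(zs), pvariance(zs))
    print(f"mu1={mu1}, mu2={mu2}: mean(Z)={fmean(zs):.3f}, var(Z)={pvariance(zs):.3f}")
```

In both cases the empirical mean should be close to 0 and the empirical variance close to 1, whatever the true means are, which is what makes $Z$ usable as a pivot.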

Therefore, there exist two numbers $z_1$ and $z_2$ ($z_1 < z_2$) depending on $\alpha$ ($0 < \alpha < 1$) such that

 $$P(z_1 < Z < z_2) =1-\alpha$$

Therefore,

$$
\begin{aligned}
& P(z_1 < Z < z_2) =1-\alpha\\
\Rightarrow & P\bigg(z_1 < \dfrac{\overline{X}-\overline{Y}-(\mu_1-\mu_2)}{\sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}} < z_2\bigg) =1-\alpha\\
\Rightarrow & P\bigg(z_1 \sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}} < \overline{X}-\overline{Y}-(\mu_1-\mu_2) < z_2 \sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}\bigg) =1-\alpha\\
\Rightarrow & P\bigg(-z_1 \sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}} > -(\overline{X}-\overline{Y})+(\mu_1-\mu_2) > -z_2\sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}\bigg) =1-\alpha\\
\Rightarrow & P\bigg(-z_2 \sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}} < -(\overline{X}-\overline{Y})+(\mu_1-\mu_2) < -z_1\sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}\bigg) =1-\alpha\\
\Rightarrow & P\bigg((\overline{X}-\overline{Y})-z_2 \sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}} < (\mu_1-\mu_2)< (\overline{X}-\overline{Y})-z_1\sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}\bigg) =1-\alpha
\end{aligned}
$$

Thus, a $100(1-\alpha)\%$ confidence interval for the difference $\mu_1 -\mu_2$ when the variances are known is

 $$\bigg((\overline{X}-\overline{Y})-z_2 \sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}, (\overline{X}-\overline{Y})-z_1\sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}\bigg)$$

where $z_1$ and $z_2$ can be determined from $P(z_1< Z < z_2) =1-\alpha$.

Since the distribution of $Z$ is symmetric about zero, the equal-tailed choice is $z_1 = -z_2=-z_{\alpha/2}$, where $z_{\alpha/2}$ satisfies $P(Z > z_{\alpha/2}) = \alpha/2$.

Hence, a $100(1-\alpha)\%$ confidence interval for the difference $\mu_1-\mu_2$ when the variances are known is

 $$\bigg((\overline{X}-\overline{Y})-z_{\alpha/2} \sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}, (\overline{X}-\overline{Y})+z_{\alpha/2}\sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}\bigg)$$
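The final interval translates directly into code. The following is a minimal sketch, not a definitive implementation: the function name `diff_means_ci` and the example inputs are hypothetical, and the critical value $z_{\alpha/2}$ is obtained from the standard-library `NormalDist().inv_cdf` rather than from a printed table.

```python
from math import sqrt
from statistics import NormalDist

def diff_means_ci(xbar, ybar, sigma1, sigma2, n, m, alpha=0.05):
    """Equal-tailed 100(1 - alpha)% CI for mu1 - mu2 with known variances."""
    z = NormalDist().inv_cdf(1 - alpha / 2)       # z_{alpha/2}
    se = sqrt(sigma1**2 / n + sigma2**2 / m)      # std. dev. of Xbar - Ybar
    centre = xbar - ybar
    return centre - z * se, centre + z * se

# Hypothetical summary statistics, for illustration only.
lower, upper = diff_means_ci(xbar=12.4, ybar=10.1,
                             sigma1=2.0, sigma2=3.0, n=25, m=36)
print(f"95% CI for mu1 - mu2: ({lower:.3f}, {upper:.3f})")
```

The interval is centred at $\overline{X}-\overline{Y}$ and its half-width is $z_{\alpha/2}$ times the standard deviation of the difference, exactly as in the expression above.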

### Step by Step Procedure

Let $X_1, X_2, \cdots, X_{n_1}$ be a random sample of size $n_1$ from a population with mean $\mu_1$ and standard deviation $\sigma_1$.

Let $Y_1, Y_2, \cdots, Y_{n_2}$ be a random sample of size $n_2$ from a population with mean $\mu_2$ and standard deviation $\sigma_2$. The two samples are independent.

Let $\overline{X} = \frac{1}{n_1}\sum X_i$ and $\overline{Y} =\frac{1}{n_2}\sum Y_i$ be the sample means of the first and second samples, respectively.

Let $C=1-\alpha$ be the confidence coefficient. Our objective is to construct a $100(1-\alpha)$% confidence interval estimate for the difference $(\mu_1-\mu_2)$.

The step-by-step procedure for constructing a confidence interval for the difference between two population means when the variances are known is as follows:

#### Step 1 Specify the confidence level

Specify the confidence level $(1-\alpha)$.

#### Step 2 Specify the given information

The given information consists of the sample sizes $n_1, n_2$, the sample means $\overline{X},\overline{Y}$, and the known population standard deviations $\sigma_1, \sigma_2$.

#### Step 3 Specify the formula

The $100(1-\alpha)$% confidence interval estimate for the difference $(\mu_1-\mu_2)$ is

 $$(\overline{X} -\overline{Y})- E \leq (\mu_1-\mu_2) \leq (\overline{X} -\overline{Y}) + E,$$

where

 $$E = Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}$$

is called the margin of error.

#### Step 4 Determine the critical value

Find the critical value $Z_{\alpha/2}$ from the standard normal table for the desired confidence level.
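As an alternative to a printed table, the critical value can be computed from the inverse of the standard normal distribution function. The sketch below assumes Python's standard-library `statistics.NormalDist`; the helper name `z_critical` is hypothetical and introduced only for illustration.

```python
from statistics import NormalDist

def z_critical(confidence):
    """Return Z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1 - confidence
    # Z_{alpha/2} leaves probability alpha/2 in the upper tail of N(0, 1).
    return NormalDist().inv_cdf(1 - alpha / 2)

for level in (0.90, 0.95, 0.99):
    print(f"confidence {level:.2f}: Z = {z_critical(level):.3f}")
```

These should reproduce the familiar table values of about 1.645, 1.960 and 2.576 for the 90%, 95% and 99% confidence levels.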

#### Step 5 Compute the margin of error

The margin of error for the difference of means is

 $$E = Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}}$$

#### Step 6 Determine the confidence interval

The $100(1-\alpha)$% confidence interval estimate for the difference $(\mu_1-\mu_2)$ is

 $$(\overline{X} -\overline{Y})- E \leq (\mu_1-\mu_2) \leq (\overline{X} -\overline{Y}) + E.$$

That is, the $100(1-\alpha)$% confidence interval estimate for the difference $(\mu_1-\mu_2)$ is $(\overline{X} -\overline{Y})\pm E$ or $\big((\overline{X} -\overline{Y})- E, (\overline{X} -\overline{Y})+E\big)$.
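The whole procedure can be followed end to end on a small example. The sketch below uses made-up summary statistics (all numbers are hypothetical, chosen only to illustrate the arithmetic) and labels each block of code with the stage of the procedure it corresponds to.

```python
from math import sqrt
from statistics import NormalDist

# Confidence level (hypothetical choice).
confidence = 0.95
alpha = 1 - confidence

# Given information (hypothetical values, for illustration only).
n1, n2 = 40, 50                # sample sizes
xbar, ybar = 82.5, 78.0        # sample means
sigma1, sigma2 = 8.0, 10.0     # known population standard deviations

# Critical value Z_{alpha/2} from the standard normal distribution.
z_crit = NormalDist().inv_cdf(1 - alpha / 2)

# Margin of error E = Z_{alpha/2} * sqrt(sigma1^2/n1 + sigma2^2/n2).
E = z_crit * sqrt(sigma1**2 / n1 + sigma2**2 / n2)

# Confidence interval (Xbar - Ybar) +/- E.
lower = (xbar - ybar) - E
upper = (xbar - ybar) + E
print(f"margin of error E = {E:.3f}")
print(f"{confidence:.0%} CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
```

With these inputs the margin of error is about 3.72, so the interval is roughly $(0.78, 8.22)$. Because it lies entirely above zero, these illustrative data would suggest $\mu_1 > \mu_2$ at the 95% confidence level.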

## Conclusion

In this tutorial, you learned how the confidence interval for the difference between two means is derived when the population variances are known. You also learned the step-by-step procedure for constructing that interval.