This article covers the theory behind the confidence interval for the difference between two means when the population standard deviations are known, along with a step-by-step procedure for constructing that interval.
CI for difference between two means when variances are known
Assumptions
a. The two samples are independent.
b. Both samples are simple random samples.
c. Both samples come from normally distributed populations.
d. The two population variances $\sigma^2_1$ and $\sigma^2_2$ are known.
Derivation
Let $X_1,X_2, \cdots, X_n$ be a random sample from $N(\mu_1,\sigma^2_1)$ with known $\sigma^2_1$.
Let $Y_1,Y_2, \cdots, Y_m$ be a random sample from $N(\mu_2,\sigma^2_2)$ with known $\sigma^2_2$.
The two samples are independent of each other.
Let $\overline{X}=\dfrac{1}{n}\sum_{i=1}^n X_i$ be the sample mean of the first sample and $\overline{Y}=\dfrac{1}{m}\sum_{i=1}^m Y_i$ be the sample mean of the second sample. Then $\overline{X}\sim N(\mu_1,\sigma^2_1/n)$ and $\overline{Y}\sim N(\mu_2,\sigma^2_2/m)$. Moreover, $\overline{X}$ and $\overline{Y}$ are independent.
Then,
$\overline{X}-\overline{Y}\sim N\bigg(\mu_1-\mu_2, \dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}\bigg)$.
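The parameters of this distribution follow from linearity of expectation and, because $\overline{X}$ and $\overline{Y}$ are independent, from the additivity of variances. Normality holds because a difference of independent normal variables is again normal. Writing $\mathbb{E}$ for expectation and $\operatorname{Var}$ for variance (notation introduced here for this step only):

$$ \begin{aligned} \mathbb{E}(\overline{X}-\overline{Y}) &= \mathbb{E}(\overline{X})-\mathbb{E}(\overline{Y}) = \mu_1-\mu_2,\\ \operatorname{Var}(\overline{X}-\overline{Y}) &= \operatorname{Var}(\overline{X})+\operatorname{Var}(\overline{Y}) = \dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}. \end{aligned} $$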
Therefore,
$$ Z=\dfrac{\overline{X}-\overline{Y}-(\mu_1-\mu_2)}{\sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}}\sim N(0,1) $$
Here $Z$ is a function of sample observations and parameters $\mu_1$ and $\mu_2$. Moreover, the distribution of $Z$ is independent of any unknown parameter. Hence $Z$ can be used as a pivotal quantity.
Therefore, there exist two numbers $z_1$ and $z_2$ ($z_1 < z_2$) depending on $\alpha$ ($0 < \alpha < 1$) such that
$$ P(z_1 < Z < z_2) =1-\alpha $$
Therefore,
$$ \begin{aligned} & P(z_1 < Z < z_2) =1-\alpha\\ \Rightarrow & P\bigg(z_1 < \dfrac{\overline{X}-\overline{Y}-(\mu_1-\mu_2)}{\sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}} < z_2\bigg) =1-\alpha\\ \Rightarrow & P\bigg(z_1 \sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}} < \overline{X}-\overline{Y}-(\mu_1-\mu_2) < z_2 \sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}\bigg) =1-\alpha\\ \Rightarrow & P\bigg(-z_1 \sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}} > -(\overline{X}-\overline{Y})+(\mu_1-\mu_2) > -z_2\sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}\bigg) =1-\alpha\\ \Rightarrow & P\bigg(-z_2 \sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}} < -(\overline{X}-\overline{Y})+(\mu_1-\mu_2) < -z_1\sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}\bigg) =1-\alpha\\ \Rightarrow & P\bigg((\overline{X}-\overline{Y})-z_2 \sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}} < (\mu_1-\mu_2)< (\overline{X}-\overline{Y})-z_1\sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}\bigg) =1-\alpha\\ \end{aligned} $$
Thus, a $100(1-\alpha)\%$ confidence interval for the difference $\mu_1 -\mu_2$ when variances are known is
$$ \bigg((\overline{X}-\overline{Y})-z_2 \sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}, (\overline{X}-\overline{Y})-z_1\sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}\bigg) $$
where $z_1$ and $z_2$ can be determined from $P(z_1< Z < z_2) =1-\alpha$.
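But the distribution of $Z$ is symmetric about zero, so the equal-tailed choice is $z_1 = -z_2 = -z_{\alpha/2}$, where $z_{\alpha/2}$ is the value satisfying $P(Z > z_{\alpha/2}) = \alpha/2$.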
Hence, the $100(1-\alpha)\%$ confidence interval for the difference $\mu_1-\mu_2$ when variances are known is
$$ \bigg((\overline{X}-\overline{Y})-z_{\alpha/2} \sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}, (\overline{X}-\overline{Y})+z_{\alpha/2}\sqrt{\dfrac{\sigma^2_1}{n}+\dfrac{\sigma^2_2}{m}}\bigg) $$
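One way to make the meaning of the confidence coefficient concrete is a small simulation: draw many pairs of normal samples, build the interval each time, and count how often it contains the true difference $\mu_1-\mu_2$. The sketch below is only an illustration; the function name `coverage`, the parameter values, and the seed are all assumptions chosen for the example, not part of the derivation.

```python
import random
from statistics import NormalDist, fmean


def coverage(mu1, mu2, sigma1, sigma2, n, m, alpha=0.05, reps=2000, seed=0):
    """Fraction of simulated intervals that contain the true mu1 - mu2."""
    rng = random.Random(seed)
    # z_{alpha/2}: upper alpha/2 point of the standard normal distribution
    z = NormalDist().inv_cdf(1 - alpha / 2)
    # Half-width is fixed because both variances are assumed known
    half_width = z * (sigma1**2 / n + sigma2**2 / m) ** 0.5
    true_diff = mu1 - mu2
    hits = 0
    for _ in range(reps):
        xbar = fmean(rng.gauss(mu1, sigma1) for _ in range(n))
        ybar = fmean(rng.gauss(mu2, sigma2) for _ in range(m))
        diff = xbar - ybar
        if diff - half_width < true_diff < diff + half_width:
            hits += 1
    return hits / reps


# Hypothetical populations: N(10, 2^2) and N(8, 3^2), with n = 25 and m = 30
rate = coverage(mu1=10.0, mu2=8.0, sigma1=2.0, sigma2=3.0, n=25, m=30)
print(f"empirical coverage: {rate:.3f}")
```

With $\alpha = 0.05$ the printed proportion should land close to $0.95$, up to simulation noise.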
Step by Step Procedure
Let $X_1, X_2, \cdots, X_{n_1}$ be a random sample of size $n_1$ from a population with mean $\mu_1$ and standard deviation $\sigma_1$.
Let $Y_1, Y_2, \cdots, Y_{n_2}$ be a random sample of size $n_2$ from a population with mean $\mu_2$ and standard deviation $\sigma_2$. The two samples are independent.
Let $\overline{X} = \frac{1}{n_1}\sum X_i$ and $\overline{Y} =\frac{1}{n_2}\sum Y_i$ be the sample means of the first and second samples, respectively.
Let $C=1-\alpha$ be the confidence coefficient. Our objective is to construct a $100(1-\alpha)$% confidence interval estimate for the difference $(\mu_1-\mu_2)$.
The step-by-step procedure for constructing the confidence interval for the difference between two population means when the variances are known is as follows:
Step 1 Specify the confidence level $(1-\alpha)$
Step 2 Specify the given information
The given information consists of the sample sizes $n_1, n_2$, the sample means $\overline{X},\overline{Y}$, and the known population standard deviations $\sigma_1, \sigma_2$.
Step 3 Specify the formula
The $100(1-\alpha)$% confidence interval estimate for the difference $(\mu_1-\mu_2)$ is
$$ \begin{aligned} (\overline{X} -\overline{Y})- E \leq (\mu_1-\mu_2) \leq (\overline{X} -\overline{Y}) + E, \end{aligned} $$
where
$$ \begin{aligned} E &= Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}} \end{aligned} $$
is called the margin of error.
Step 4 Determine the critical value
Find the critical value $Z_{\alpha/2}$ from the standard normal table for the desired confidence level.
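Instead of a printed table, the critical value can be obtained from the inverse of the standard normal distribution function. The sketch below uses `NormalDist` from Python's standard library; the helper name `z_critical` is an assumption made for this illustration.

```python
from statistics import NormalDist


def z_critical(confidence):
    """Return Z_{alpha/2} for a two-sided interval at the given confidence level."""
    alpha = 1 - confidence
    # The upper alpha/2 point is the (1 - alpha/2) quantile of N(0, 1)
    return NormalDist().inv_cdf(1 - alpha / 2)


for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_critical(level):.3f}")
```

These agree with the familiar table values of about 1.645, 1.960 and 2.576 for 90%, 95% and 99% confidence.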
Step 5 Compute the margin of error
The margin of error for the difference of means is
$$ \begin{aligned} E = Z_{\alpha/2} \sqrt{\frac{\sigma_1^2}{n_1}+\frac{\sigma_2^2}{n_2}} \end{aligned} $$
Step 6 Determine the confidence interval
$100(1-\alpha)$% confidence interval estimate for the difference $(\mu_1-\mu_2)$ is
$$ \begin{aligned} (\overline{X} -\overline{Y})- E \leq (\mu_1-\mu_2) \leq (\overline{X} -\overline{Y}) + E. \end{aligned} $$
That is, the $100(1-\alpha)$% confidence interval estimate for the difference $(\mu_1-\mu_2)$ is $(\overline{X} -\overline{Y})\pm E$ or $\big((\overline{X} -\overline{Y})- E, (\overline{X} -\overline{Y})+E\big)$.
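The six steps can be sketched as a single short function. This is a minimal illustration that assumes the summary statistics are already available; the function name `ci_diff_means_known_sigma` and the numbers in the example call are hypothetical and not taken from any real data set.

```python
from math import sqrt
from statistics import NormalDist


def ci_diff_means_known_sigma(xbar, ybar, sigma1, sigma2, n1, n2, confidence=0.95):
    """Confidence interval for mu1 - mu2 with known population standard deviations."""
    alpha = 1 - confidence                               # Step 1: confidence level
    # Step 2: the given information arrives as the function arguments
    z = NormalDist().inv_cdf(1 - alpha / 2)              # Step 4: critical value Z_{alpha/2}
    margin = z * sqrt(sigma1**2 / n1 + sigma2**2 / n2)   # Steps 3 and 5: margin of error E
    diff = xbar - ybar
    return diff - margin, diff + margin                  # Step 6: the interval


# Hypothetical summary data: means 80 and 75, sigmas 5 and 4, sizes 50 and 40
lower, upper = ci_diff_means_known_sigma(
    xbar=80.0, ybar=75.0, sigma1=5.0, sigma2=4.0, n1=50, n2=40, confidence=0.95
)
print(f"95% CI for mu1 - mu2: ({lower:.2f}, {upper:.2f})")
```

For these illustrative inputs the margin of error is about 1.86, so the interval is roughly $(3.14, 6.86)$. Since it excludes zero, such data would suggest that $\mu_1$ exceeds $\mu_2$.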
Conclusion
In this tutorial, you learned how the confidence interval for the difference between two means is derived when the population variances are known. You also learned the step-by-step procedure for constructing that interval.
To learn more about interval estimation and construction of confidence interval, please refer to the following tutorials:
CI calculator with examples for difference between two means when the population variances are known
Let me know in the comments if you have any questions about the confidence interval for the difference between two means when the population variances are known, and share your thoughts on this article.