Five number summary for grouped data

A five number summary is a quick and easy way to describe the center, the spread, and the outliers (if any) of a data set.

The five number summary consists of five values, namely,

  • minimum value ($\min$),
  • first quartile ($Q_1$),
  • $\text{median }$ ($Q_2$),
  • third quartile ($Q_3$),
  • maximum value ($\max$).

Formula

  • $\min$= lower limit of the first class,
  • $\max$= upper limit of the last class,
  • $Q_i=l + \bigg(\dfrac{\dfrac{iN}{4} - F_<}{f}\bigg)\times h$, $i=1,2,3$

where

  • $l$ is the lower limit of the $i^{th}$ quartile class
  • $N=\sum f$ is the total number of observations
  • $f$ is the frequency of the $i^{th}$ quartile class
  • $F_<$ is the cumulative frequency of the class preceding the $i^{th}$ quartile class
  • $h$ is the class width
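
The quartile formula above can be sketched in Python. This is a minimal illustration, not a definitive implementation: the function name `grouped_quartile`, its parameters, and the small frequency table are made up for the example. It assumes the classes are continuous and given as a sorted list of boundaries, and it picks the quartile class as the first class whose cumulative frequency is greater than or equal to $\dfrac{iN}{4}$.

```python
from itertools import accumulate

def grouped_quartile(boundaries, freqs, i):
    """Return Q_i (i = 1, 2, 3) by interpolation within the quartile class.

    boundaries: sorted class boundaries (one more entry than freqs)
    freqs:      class frequencies f
    """
    n = sum(freqs)                       # N = sum of f
    target = i * n / 4                   # position iN/4
    cum = list(accumulate(freqs))        # cumulative frequencies
    # Quartile class: first class with cumulative frequency >= iN/4
    k = next(j for j, c in enumerate(cum) if c >= target)
    l = boundaries[k]                    # lower limit of the quartile class
    h = boundaries[k + 1] - boundaries[k]  # class width
    f = freqs[k]                         # frequency of the quartile class
    f_less = cum[k - 1] if k > 0 else 0  # F_<, cumulative frequency before it
    return l + (target - f_less) * h / f

# Made-up table for illustration only: classes 0-10, 10-20, 20-30
boundaries = [0, 10, 20, 30]
freqs = [2, 4, 2]
for i in (1, 2, 3):
    print(f"Q{i} =", grouped_quartile(boundaries, freqs, i))
```

For this illustrative table the three quartiles come out as 10, 15 and 20.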

Example 1

A class teacher has the following data about the number of absences of 35 students of a class. Compute five number summary for the following frequency distribution.

No. of days ($x$) 2 3 4 5 6
No. of Students ($f$) 1 15 10 5 4

Solution

$x_i$ $f_i$ $cf$
2 1 1
3 15 16
4 10 26
5 5 31
6 4 35
Total 35

Minimum Value

The minimum number of absent days $\min = 2$.

Maximum Value

The maximum number of absent days $\max = 6$.

The formula for $i^{th}$ quartile is

$Q_i =\bigg(\dfrac{i(N)}{4}\bigg)^{th}$ value, $i=1,2,3$

where $N$ is the total number of observations.

First Quartile $Q_1$

$$ \begin{aligned} Q_{1} &=\bigg(\dfrac{1(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{1(35)}{4}\bigg)^{th}\text{ value}\\ &=\big(8.75\big)^{th}\text{ value} \end{aligned} $$

The cumulative frequency just greater than or equal to $8.75$ is $16$. The corresponding value of $X$ is the $1^{st}$ quartile. That is, $Q_1 =3$ days.

Thus, at least $25\%$ of the students were absent for $3$ days or fewer.

Median $M$

$$ \begin{aligned} M &=\bigg(\dfrac{N}{2}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{35}{2}\bigg)^{th}\text{ value}\\ &=\big(17.5\big)^{th}\text{ value} \end{aligned} $$

The cumulative frequency just greater than or equal to $17.5$ is $26$. The corresponding value of $X$ is the median. That is, $M =4$ days.

Thus, at least $50\%$ of the students were absent for $4$ days or fewer.

Third Quartile $Q_3$

$$ \begin{aligned} Q_{3} &=\bigg(\dfrac{3(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{3(35)}{4}\bigg)^{th}\text{ value}\\ &=\big(26.25\big)^{th}\text{ value} \end{aligned} $$

The cumulative frequency just greater than or equal to $26.25$ is $31$. The corresponding value of $X$ is the $3^{rd}$ quartile. That is, $Q_3 =5$ days.

Thus, at least $75\%$ of the students were absent for $5$ days or fewer.

Thus the five number summary of given data set is

$\min = 2$ days, $Q_1 = 3$ days, $\text{median }=4$ days, $Q_3=5$ days and $\max = 6$ days.
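
The lookup used in Example 1 can be checked with a short Python sketch. The function name `discrete_quartile` is hypothetical. It assumes the values are sorted in ascending order and returns the first value whose cumulative frequency is greater than or equal to $\dfrac{iN}{4}$, which is the rule applied in the steps above.

```python
from itertools import accumulate

def discrete_quartile(values, freqs, i):
    """Return Q_i for a discrete frequency distribution (values sorted ascending)."""
    n = sum(freqs)                # N, total number of observations
    target = i * n / 4            # position iN/4
    for x, cf in zip(values, accumulate(freqs)):
        if cf >= target:          # first cumulative frequency >= iN/4
            return x
    raise ValueError("frequencies must sum to a positive total")

days = [2, 3, 4, 5, 6]            # number of days absent (x)
students = [1, 15, 10, 5, 4]      # number of students (f)

q1, median, q3 = (discrete_quartile(days, students, i) for i in (1, 2, 3))
summary = (min(days), q1, median, q3, max(days))
print(summary)  # (2, 3, 4, 5, 6)
```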

Example 2

The following table gives the amount of time (in minutes) spent on the internet each evening by a group of 56 students. Compute five number summary for the following frequency distribution.

Time spent on Internet ($x$) 10-12 13-15 16-18 19-21 22-24
No. of students ($f$) 3 12 15 24 2

Solution

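The classes are of inclusive type. To convert them to exclusive type, subtract 0.5 from the lower limit and add 0.5 to the upper limit of each class. The value 0.5 is half the gap between the upper limit of one class and the lower limit of the next.

Class Interval Class Boundaries $f_i$ $cf$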
10-12 9.5-12.5 3 3
13-15 12.5-15.5 12 15
16-18 15.5-18.5 15 30
19-21 18.5-21.5 24 54
22-24 21.5-24.5 2 56
Total 56
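
The conversion to class boundaries and the cumulative frequency column can be reproduced with a few lines of Python. This is only a sketch: the variable names are illustrative, and it assumes the gap between consecutive classes is the same throughout the table.

```python
from itertools import accumulate

classes = [(10, 12), (13, 15), (16, 18), (19, 21), (22, 24)]  # inclusive limits
freqs = [3, 12, 15, 24, 2]

# Half the gap between one class's upper limit and the next class's lower limit
half_gap = (classes[1][0] - classes[0][1]) / 2   # (13 - 12) / 2 = 0.5

# Widen every class by half the gap on both sides
boundaries = [(lo - half_gap, hi + half_gap) for lo, hi in classes]
cum_freqs = list(accumulate(freqs))              # running total of f

for (lo, hi), f, cf in zip(boundaries, freqs, cum_freqs):
    print(f"{lo}-{hi}  f={f}  cf={cf}")
```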

Minimum Value

The minimum time spent on the internet is $\min = 9.5 \text{ minutes}$.

Maximum Value

The maximum time spent on the internet is $\max = 24.5 \text{ minutes}$.

Quartiles

The formula for $i^{th}$ quartile is

$Q_i =\bigg(\dfrac{i(N)}{4}\bigg)^{th}$ value, $i=1,2,3$

where $N$ is the total number of observations.

First Quartile $Q_1$

$$ \begin{aligned} Q_{1} &=\bigg(\dfrac{1(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{1(56)}{4}\bigg)^{th}\text{ value}\\ &=\big(14\big)^{th}\text{ value} \end{aligned} $$

The cumulative frequency just greater than or equal to $14$ is $15$. The corresponding class $12.5-15.5$ is the $1^{st}$ quartile class.

Thus

  • $l = 12.5$, the lower limit of the $1^{st}$ quartile class
  • $N=56$, total number of observations
  • $f =12$, frequency of the $1^{st}$ quartile class
  • $F_< = 3$, cumulative frequency of the class previous to $1^{st}$ quartile class
  • $h =3$, the class width

The first quartile $Q_1$ can be computed as follows:

$$ \begin{aligned} Q_1 &= l + \bigg(\frac{\frac{1(N)}{4} - F_<}{f}\bigg)\times h\\ &= 12.5 + \bigg(\frac{\frac{1\times 56}{4} - 3}{12}\bigg)\times 3\\ &= 12.5 + \bigg(\frac{14 - 3}{12}\bigg)\times 3\\ &= 12.5 + \big(0.9167\big)\times 3\\ &= 12.5 + 2.75\\ &= 15.25 \text{ minutes} \end{aligned} $$

Thus, $25$ % of the students spent less than or equal to $15.25$ minutes on the internet.

Median

$$ \begin{aligned} M &=\bigg(\dfrac{N}{2}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{56}{2}\bigg)^{th}\text{ value}\\ &=\big(28\big)^{th}\text{ value} \end{aligned} $$

The cumulative frequency just greater than or equal to $28$ is $30$. The corresponding class $15.5-18.5$ is the median class.

Thus

  • $l = 15.5$, the lower limit of the median class
  • $N=56$, total number of observations
  • $f =15$, frequency of the median class
  • $F_< = 15$, cumulative frequency of the class previous to median class
  • $h =3$, the class width

The median $M$ can be computed as follows:

$$ \begin{aligned} M &= l + \bigg(\frac{\frac{N}{2} - F_<}{f}\bigg)\times h\\ &= 15.5 + \bigg(\frac{\frac{56}{2} - 15}{15}\bigg)\times 3\\ &= 15.5 + \bigg(\frac{28 - 15}{15}\bigg)\times 3\\ &= 15.5 + \big(0.8667\big)\times 3\\ &= 15.5 + 2.6\\ &= 18.1 \text{ minutes} \end{aligned} $$

Thus, $50$ % of the students spent less than or equal to $18.1$ minutes on the internet.

Third Quartile $Q_3$

$$ \begin{aligned} Q_{3} &=\bigg(\dfrac{3(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{3(56)}{4}\bigg)^{th}\text{ value}\\ &=\big(42\big)^{th}\text{ value} \end{aligned} $$

The cumulative frequency just greater than or equal to $42$ is $54$. The corresponding class $18.5-21.5$ is the $3^{rd}$ quartile class.

Thus

  • $l = 18.5$, the lower limit of the $3^{rd}$ quartile class
  • $N=56$, total number of observations
  • $f =24$, frequency of the $3^{rd}$ quartile class
  • $F_< = 30$, cumulative frequency of the class previous to $3^{rd}$ quartile class
  • $h =3$, the class width

The third quartile $Q_3$ can be computed as follows:

$$ \begin{aligned} Q_3 &= l + \bigg(\frac{\frac{3(N)}{4} - F_<}{f}\bigg)\times h\\ &= 18.5 + \bigg(\frac{\frac{3\times 56}{4} - 30}{24}\bigg)\times 3\\ &= 18.5 + \bigg(\frac{42 - 30}{24}\bigg)\times 3\\ &= 18.5 + \big(0.5\big)\times 3\\ &= 18.5 + 1.5\\ &= 20 \text{ minutes} \end{aligned} $$

Thus, $75$ % of the students spent less than or equal to $20$ minutes on the internet.

Thus the five number summary of time spent on the internet is

$\min = 9.5$ minutes, $Q_1 = 15.25$ minutes, $\text{median }=18.1$ minutes, $Q_3=20$ minutes and $\max = 24.5$ minutes.
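
As a cross-check of Example 2, the whole summary can be computed in one pass. The sketch below is an illustration under stated assumptions, not a reference implementation: the function name `five_number_summary` is hypothetical, it takes the class boundaries from the table above as a single sorted list, and it uses the lower boundary of the first class and the upper boundary of the last class as $\min$ and $\max$.

```python
from itertools import accumulate

def five_number_summary(boundaries, freqs):
    """Return (min, Q1, median, Q3, max) for a grouped frequency distribution."""
    n = sum(freqs)
    cum = list(accumulate(freqs))
    quartiles = []
    for i in (1, 2, 3):
        target = i * n / 4
        # Quartile class: first class with cumulative frequency >= iN/4
        k = next(j for j, c in enumerate(cum) if c >= target)
        f_less = cum[k - 1] if k > 0 else 0          # F_<
        width = boundaries[k + 1] - boundaries[k]    # h
        quartiles.append(boundaries[k] + (target - f_less) * width / freqs[k])
    return (boundaries[0], *quartiles, boundaries[-1])

boundaries = [9.5, 12.5, 15.5, 18.5, 21.5, 24.5]     # class boundaries
freqs = [3, 12, 15, 24, 2]                           # number of students

summary = five_number_summary(boundaries, freqs)
print([round(v, 2) for v in summary])  # [9.5, 15.25, 18.1, 20.0, 24.5]
```

The rounding is only for display; the interpolated values match the hand computation above.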

We hope you enjoyed this step-by-step solution for finding the five number summary of grouped data.

Do read more about the step-by-step solution for finding the five number summary of ungrouped data and about the five number summary calculator.
