Five number summary for grouped data

A five number summary is a quick and easy way to describe the center, the spread, and the outliers (if any) of a data set.

The five number summary consists of five values, namely,

  • minimum value ($\min$),
  • first quartile ($Q_1$),
  • $\text{median }$ ($Q_2$),
  • third quartile ($Q_3$),
  • maximum value ($\max$).

Formula

  • $\min$ = lower limit of the first class,
  • $\max$ = upper limit of the last class,
  • $Q_i=l + \bigg(\dfrac{\dfrac{iN}{4} - F_<}{f}\bigg)\times h$ ; $i=1,2,3$

where

  • $l$ is the lower limit of the $i^{th}$ quartile class
  • $N=\sum f$ is the total number of observations
  • $f$ is the frequency of the $i^{th}$ quartile class
  • $F_<$ is the cumulative frequency of the class preceding the $i^{th}$ quartile class
  • $h$ is the class width
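
The quartile formula above translates directly into a small function. The following Python sketch is illustrative only: the function name `grouped_quartile` and its parameter names are assumptions chosen to mirror the symbols $l$, $N$, $f$, $F_<$ and $h$, and the sample inputs are made up for demonstration.

```python
def grouped_quartile(i, l, N, f, F_prev, h):
    """Return Q_i = l + ((i*N/4 - F_prev) / f) * h for grouped data.

    i      : quartile index (1, 2 or 3)
    l      : lower limit of the i-th quartile class
    N      : total number of observations
    f      : frequency of the i-th quartile class
    F_prev : cumulative frequency of the class before the quartile class
    h      : class width
    """
    if i not in (1, 2, 3):
        raise ValueError("i must be 1, 2 or 3")
    if f <= 0:
        raise ValueError("quartile class frequency must be positive")
    return l + ((i * N / 4 - F_prev) / f) * h


# Hypothetical inputs (not taken from any data set in this article):
# quartile class starts at 10, N = 40, f = 8, F_prev = 6, h = 5.
q1 = grouped_quartile(1, l=10, N=40, f=8, F_prev=6, h=5)
print(q1)  # 10 + ((10 - 6) / 8) * 5 = 12.5
```

The function only evaluates the formula; identifying the quartile class (and therefore $l$, $f$ and $F_<$) is a separate step that uses the cumulative frequencies.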

Example 1

A class teacher has the following data about the number of absences of 35 students of a class. Compute the five number summary for the following frequency distribution.

| No. of days ($x$) | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|
| No. of students ($f$) | 1 | 15 | 10 | 5 | 4 |

Solution

| $x_i$ | $f_i$ | $cf$ |
|---|---|---|
| 2 | 1 | 1 |
| 3 | 15 | 16 |
| 4 | 10 | 26 |
| 5 | 5 | 31 |
| 6 | 4 | 35 |
| Total | 35 | |
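
The cumulative frequency column is simply a running total of the frequencies. As a minimal sketch (the variable names `days`, `students` and `cum_freq` are illustrative, not part of the original solution), it can be reproduced with `itertools.accumulate`:

```python
from itertools import accumulate

days = [2, 3, 4, 5, 6]          # number of days absent (x)
students = [1, 15, 10, 5, 4]    # number of students (f)

# A running total of the frequencies gives the cumulative frequency column.
cum_freq = list(accumulate(students))

for x, f, cf in zip(days, students, cum_freq):
    print(x, f, cf)
print("Total", sum(students))
```

The last cumulative frequency always equals the total number of observations, which is a quick check that the column was built correctly.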

Minimum Value

The minimum number of absent days $\min = 2$.

Maximum Value

The maximum number of absent days $\max = 6$.

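Because the values here are single numbers rather than class intervals, each quartile is read directly from the cumulative frequency column, with no interpolation. The formula for the $i^{th}$ quartile is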

$Q_i =\bigg(\dfrac{i(N)}{4}\bigg)^{th}$ value, $i=1,2,3$

where $N$ is the total number of observations.

First Quartile $Q_1$

$$ \begin{aligned} Q_{1} &=\bigg(\dfrac{1(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{1(35)}{4}\bigg)^{th}\text{ value}\\ &=\big(8.75\big)^{th}\text{ value} \end{aligned} $$

The cumulative frequency just greater than or equal to $8.75$ is $16$. The corresponding value of $X$ is the $1^{st}$ quartile. That is, $Q_1 =3$ days.

Thus, $25$ % of the students had absences less than or equal to $3$ days.

Median $M$

$$ \begin{aligned} M &=\bigg(\dfrac{N}{2}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{35}{2}\bigg)^{th}\text{ value}\\ &=\big(17.5\big)^{th}\text{ value} \end{aligned} $$

The cumulative frequency just greater than or equal to $17.5$ is $26$. The corresponding value of $X$ is the median. That is, $M =4$ days.

Thus, $50$ % of the students had absences less than or equal to $4$ days.

Third Quartile $Q_3$

$$ \begin{aligned} Q_{3} &=\bigg(\dfrac{3(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{3(35)}{4}\bigg)^{th}\text{ value}\\ &=\big(26.25\big)^{th}\text{ value} \end{aligned} $$

The cumulative frequency just greater than or equal to $26.25$ is $31$. The corresponding value of $X$ is the $3^{rd}$ quartile. That is, $Q_3 =5$ days.

Thus, $75$ % of the students had absences less than or equal to $5$ days.

Thus the five number summary of the given data set is

$\min = 2$ days, $Q_1 = 3$ days, $\text{median }=4$ days, $Q_3=5$ days and $\max = 6$ days.
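
The whole procedure for this example can be sketched in Python as follows. This is an illustrative sketch, not a library routine: the function name `discrete_five_number_summary` is an assumption, and it presumes the values are sorted in ascending order with positive frequencies.

```python
from itertools import accumulate


def discrete_five_number_summary(values, freqs):
    """Five number summary for single values listed with their frequencies."""
    cum_freq = list(accumulate(freqs))
    n = cum_freq[-1]                      # total number of observations

    def quartile(i):
        position = i * n / 4              # the (iN/4)-th value
        # Pick the first value whose cumulative frequency is >= the position.
        for x, cf in zip(values, cum_freq):
            if cf >= position:
                return x
        raise ValueError("position exceeds the total frequency")

    return values[0], quartile(1), quartile(2), quartile(3), values[-1]


summary = discrete_five_number_summary([2, 3, 4, 5, 6], [1, 15, 10, 5, 4])
print(summary)  # (2, 3, 4, 5, 6)
```

The linear scan keeps the rule "cumulative frequency just greater than or equal to the position" explicit, which is adequate for short frequency tables like this one.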

Example 2

The following table gives the amount of time (in minutes) spent on the internet each evening by a group of 56 students. Compute the five number summary for the following frequency distribution.

| Time spent on internet ($x$) | 10-12 | 13-15 | 16-18 | 19-21 | 22-24 |
|---|---|---|---|---|---|
| No. of students ($f$) | 3 | 12 | 15 | 24 | 2 |

Solution

The classes are of the inclusive type. To convert them to the exclusive type, subtract 0.5 from the lower limit and add 0.5 to the upper limit of each class.

| Class Interval | Class Boundaries | $f_i$ | $cf$ |
|---|---|---|---|
| 10-12 | 9.5-12.5 | 3 | 3 |
| 13-15 | 12.5-15.5 | 12 | 15 |
| 16-18 | 15.5-18.5 | 15 | 30 |
| 19-21 | 18.5-21.5 | 24 | 54 |
| 22-24 | 21.5-24.5 | 2 | 56 |
| Total | | 56 | |
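
The conversion from inclusive class limits to class boundaries can be sketched in Python. The helper name `class_boundaries` is hypothetical, and the sketch assumes at least two classes, sorted in ascending order, with the same gap between every pair of consecutive classes.

```python
def class_boundaries(limits):
    """Convert inclusive (lower, upper) class limits to class boundaries."""
    # Half the gap between one class's upper limit and the next class's
    # lower limit; for the classes 10-12, 13-15, ... this is (13 - 12) / 2.
    correction = (limits[1][0] - limits[0][1]) / 2
    return [(lower - correction, upper + correction) for lower, upper in limits]


limits = [(10, 12), (13, 15), (16, 18), (19, 21), (22, 24)]
print(class_boundaries(limits))
# [(9.5, 12.5), (12.5, 15.5), (15.5, 18.5), (18.5, 21.5), (21.5, 24.5)]
```

After the correction, each upper boundary coincides with the next lower boundary, so every class has width 3.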

Minimum Value

The minimum time spent on the internet is $\min = 9.5 \text{ minutes}$, the lower boundary of the first class.

Maximum Value

The maximum time spent on the internet is $\max = 24.5 \text{ minutes}$, the upper boundary of the last class.

Quartiles

The formula for the $i^{th}$ quartile is

$Q_i =\bigg(\dfrac{i(N)}{4}\bigg)^{th}$ value, $i=1,2,3$

where $N$ is the total number of observations.

First Quartile $Q_1$

$$ \begin{aligned} Q_{1} &=\bigg(\dfrac{1(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{1(56)}{4}\bigg)^{th}\text{ value}\\ &=\big(14\big)^{th}\text{ value} \end{aligned} $$

The cumulative frequency just greater than or equal to $14$ is $15$. The corresponding class $12.5-15.5$ is the $1^{st}$ quartile class.

Thus

  • $l = 12.5$, the lower limit of the $1^{st}$ quartile class
  • $N=56$, total number of observations
  • $f =12$, frequency of the $1^{st}$ quartile class
  • $F_< = 3$, cumulative frequency of the class preceding the $1^{st}$ quartile class
  • $h =3$, the class width
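
Locating the quartile class and reading off these quantities can be sketched in Python. This is a minimal illustration under the assumption that the classes are stored as sorted `(lower, upper)` boundary pairs; the variable names are not from the original solution.

```python
from bisect import bisect_left
from itertools import accumulate

boundaries = [(9.5, 12.5), (12.5, 15.5), (15.5, 18.5), (18.5, 21.5), (21.5, 24.5)]
freqs = [3, 12, 15, 24, 2]
cum_freq = list(accumulate(freqs))      # [3, 15, 30, 54, 56]

position = 1 * sum(freqs) / 4           # 14.0 for the first quartile

# bisect_left returns the index of the first cumulative frequency >= position.
k = bisect_left(cum_freq, position)

lower, upper = boundaries[k]
F_prev = cum_freq[k - 1] if k > 0 else 0
print(lower, freqs[k], F_prev, upper - lower)  # 12.5 12 3 3.0
```

Because the cumulative frequencies are non-decreasing, a binary search finds the same class as scanning the table from the top.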

The first quartile $Q_1$ can be computed as follows:

$$ \begin{aligned} Q_1 &= l + \bigg(\frac{\frac{1(N)}{4} - F_<}{f}\bigg)\times h\\ &= 12.5 + \bigg(\frac{\frac{1\times 56}{4} - 3}{12}\bigg)\times 3\\ &= 12.5 + \bigg(\frac{14 - 3}{12}\bigg)\times 3\\ &= 12.5 + \big(0.9167\big)\times 3\\ &= 12.5 + 2.75\\ &= 15.25 \text{ minutes} \end{aligned} $$

Thus, $25$ % of the students spent less than or equal to $15.25$ minutes on the internet.

Median

$$ \begin{aligned} M &=\bigg(\dfrac{N}{2}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{56}{2}\bigg)^{th}\text{ value}\\ &=\big(28\big)^{th}\text{ value} \end{aligned} $$

The cumulative frequency just greater than or equal to $28$ is $30$. The corresponding class $15.5-18.5$ is the median class.

Thus

  • $l = 15.5$, the lower limit of the median class
  • $N=56$, total number of observations
  • $f =15$, frequency of the median class
  • $F_< = 15$, cumulative frequency of the class preceding the median class
  • $h =3$, the class width

The median $M$ can be computed as follows:

$$ \begin{aligned} M &= l + \bigg(\frac{\frac{N}{2} - F_<}{f}\bigg)\times h\\ &= 15.5 + \bigg(\frac{\frac{56}{2} - 15}{15}\bigg)\times 3\\ &= 15.5 + \bigg(\frac{28 - 15}{15}\bigg)\times 3\\ &= 15.5 + \big(0.8667\big)\times 3\\ &= 15.5 + 2.6\\ &= 18.1 \text{ minutes} \end{aligned} $$

Thus, $50$ % of the students spent less than or equal to $18.1$ minutes on the internet.

Third Quartile $Q_3$

$$ \begin{aligned} Q_{3} &=\bigg(\dfrac{3(N)}{4}\bigg)^{th}\text{ value}\\ &= \bigg(\dfrac{3(56)}{4}\bigg)^{th}\text{ value}\\ &=\big(42\big)^{th}\text{ value} \end{aligned} $$

The cumulative frequency just greater than or equal to $42$ is $54$. The corresponding class $18.5-21.5$ is the $3^{rd}$ quartile class.

Thus

  • $l = 18.5$, the lower limit of the $3^{rd}$ quartile class
  • $N=56$, total number of observations
  • $f =24$, frequency of the $3^{rd}$ quartile class
  • $F_< = 30$, cumulative frequency of the class preceding the $3^{rd}$ quartile class
  • $h =3$, the class width

The third quartile $Q_3$ can be computed as follows:

$$ \begin{aligned} Q_3 &= l + \bigg(\frac{\frac{3(N)}{4} - F_<}{f}\bigg)\times h\\ &= 18.5 + \bigg(\frac{\frac{3\times 56}{4} - 30}{24}\bigg)\times 3\\ &= 18.5 + \bigg(\frac{42 - 30}{24}\bigg)\times 3\\ &= 18.5 + \big(0.5\big)\times 3\\ &= 18.5 + 1.5\\ &= 20 \text{ minutes} \end{aligned} $$

Thus, $75$ % of the students spent less than or equal to $20$ minutes on the internet.

Thus the five number summary of time spent on the internet is

$\min = 9.5$ minutes, $Q_1 = 15.25$ minutes, $\text{median }=18.1$ minutes, $Q_3=20$ minutes and $\max = 24.5$ minutes.
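
As a cross-check, the full calculation for this example can be sketched end to end in Python. The function name `grouped_five_number_summary` is an assumption made for illustration, and the sketch presumes the classes are given as sorted `(lower, upper)` class boundaries.

```python
from bisect import bisect_left
from itertools import accumulate


def grouped_five_number_summary(boundaries, freqs):
    """Five number summary from class boundaries and class frequencies."""
    cum_freq = list(accumulate(freqs))
    n = cum_freq[-1]                           # total number of observations

    def quartile(i):
        position = i * n / 4
        k = bisect_left(cum_freq, position)    # index of the quartile class
        lower, upper = boundaries[k]
        F_prev = cum_freq[k - 1] if k > 0 else 0
        # Q_i = l + ((iN/4 - F_prev) / f) * h
        return lower + (position - F_prev) / freqs[k] * (upper - lower)

    return (boundaries[0][0], quartile(1), quartile(2), quartile(3),
            boundaries[-1][1])


boundaries = [(9.5, 12.5), (12.5, 15.5), (15.5, 18.5), (18.5, 21.5), (21.5, 24.5)]
freqs = [3, 12, 15, 24, 2]
summary = grouped_five_number_summary(boundaries, freqs)
print([round(v, 2) for v in summary])  # [9.5, 15.25, 18.1, 20.0, 24.5]
```

The values are rounded only for display; the intermediate arithmetic is carried out in floating point without rounding, so small differences from hand calculations that round at each step are possible in other data sets.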

Hope you enjoyed the step-by-step solution for finding the five number summary for grouped data.

Do read more about the step-by-step solution for finding the five number summary for ungrouped data and the five number summary calculator.

Raju is a VRCBuzz co-founder, passionate about making every day the greatest day of life. He is a nerd at heart with a background in Statistics. Raju oversees day-to-day operations and focuses on strategic planning and the growth of VRCBuzz products and services. He has more than 25 years of teaching experience. He gains energy by helping people reach their goals and motivating them to align with their passion. Raju holds a Ph.D. degree in Statistics. He loves to spend his leisure time reading about and implementing AI and machine learning concepts using statistical models.