Charts and graphs are fundamental components of modern research. Graphs, histograms, scatter plots, etc. are among the most effective ways to summarise and display large amounts of numerical data. They can also be used to show patterns, trends, and relationships between variables.
In statistics, different charts and graphs serve different purposes. A scatter plot shows the relationship between two variables in a data set, while a line plot shows information as a series of connected points and is suited to simple studies. Complex studies, however, often involve data points clustered around a central value, and you need to see how the data are spread around that value. Representing such a distribution calls for a more advanced method known as the box & whisker plot.
So what is a box & whisker plot and how to draw it?
A box & whisker plot is a graphical method used to display the variation in a data set. It summarises the data with five values (minimum, first quartile, median, third quartile, and maximum) and allows multiple data sets to be displayed side by side in the same graph.
Some of the steps involved in developing a box & whisker plot are:
1. Data collection and organization
In the initial step, ensure that you have access to all the necessary information. Collect the data and then organize it from the lowest to the highest numerical value. Doing so makes it easier to find the quartiles and to transfer the data onto the plot.
2. Median calculation
Calculate the median, which is the second quartile. If the data set has an odd number of values, the median is the middle value of the ordered data. If it has an even number of values, the median is calculated by adding the two middle numbers and dividing the result by two. Either way, the median splits the data so that there is an equal number of points on each side of it.
3. Quartile calculation
In the third step, calculate the first and third quartiles. The first quartile is the median of the values below the overall median, and the third quartile is the median of the values above it.
4. Developing the plot line
After calculating the quartiles, draw a plot line (a number line) long enough to record all the data, from the smallest to the largest number. The smallest and largest numbers used as the end points of the whiskers should not be outliers.
5. Creating a line for each quartile
Use the median to mark the center of the box on the plot line. Next, take the first and third quartile values and add them to the plot line as short vertical lines. If a horizontal box & whisker plot is being developed, the first quartile must be to the left of the median and the third quartile to the right.
6. Developing a box to connect the quartiles
To highlight the data, develop a box that connects the first and third quartiles. Join the top and bottom of the first quartile line to the top and bottom of the third quartile line to obtain a box. The median line divides the box into two sections.
7. Recording & connecting the whiskers
Extend a line (a whisker) from each end of the box to the minimum and maximum numbers in the data set. Use separate dots for numbers that lie far from the rest of the data, i.e. the outliers.
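The calculation steps above (ordering the data, finding the median, and finding the quartiles) can be sketched in Python. This is a minimal illustration rather than a definitive implementation: the function name `five_number_summary` and the sample values are hypothetical, and it assumes the median-of-halves convention described in step 3, which leaves the middle value out of both halves when the number of values is odd. Other tools may use slightly different quartile conventions and return slightly different numbers.

```python
from statistics import median


def five_number_summary(values):
    """Return (minimum, Q1, median, Q3, maximum) for a list of numbers.

    Assumption: quartiles are the medians of the lower and upper halves,
    with the middle value left out of both halves when the count is odd.
    """
    data = sorted(values)              # step 1: order from lowest to highest
    n = len(data)
    if n < 2:
        raise ValueError("need at least two data points")
    half = n // 2
    lower = data[:half]                # values below the median position
    upper = data[half + (n % 2):]      # values above the median position
    q2 = median(data)                  # step 2: the median (second quartile)
    q1 = median(lower)                 # step 3: first quartile
    q3 = median(upper)                 # step 3: third quartile
    return data[0], q1, q2, q3, data[-1]


# Hypothetical sample data, used only for illustration.
sample = [12, 5, 22, 30, 7, 36, 14, 42, 15, 53, 25]
print(five_number_summary(sample))     # (5, 12, 22, 36, 53)
```

The five values returned are exactly the positions needed for the drawing steps: the box runs from Q1 to Q3, the median line sits inside it, and the whiskers reach toward the minimum and maximum.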
Once the box is drawn and the data is recorded, the next step is to interpret the box & whisker plot. While interpreting, consider the following parts of the plot.
- Median - The median or second quartile is the midpoint of the data: half of the values lie below it and half above it. It shows where the center of the data lies, although it is not the same as the arithmetic average (mean).
- Interquartile range - This measures the difference between the third and first quartiles. The first quartile is subtracted from the third quartile to obtain the spread of the middle 50% of the data, a figure that is not affected by the outliers.
- Whiskers - They extend from the edges of the box to the outer range of the data. They usually stop at the lowest and highest values that are not outliers, although in some plots they extend all the way to the minimum and maximum. Any points drawn beyond the whiskers are the outliers.
- Upper quartile - The third or upper quartile is the value below which 75% of the data points fall. It is the median of the upper half of the data.
- Lower quartile - The first or lower quartile is the value below which 25% of the data points fall. It is the median of the lower half of the data.
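To make the interquartile range and the idea of outliers concrete, here is a short Python sketch. The article does not say how far a point must be from the box to count as an outlier, so this example assumes the widely used convention of fences at 1.5 times the interquartile range beyond each quartile. The function name `whiskers_and_outliers`, the parameter `k`, and the sample values are all hypothetical.

```python
def whiskers_and_outliers(values, q1, q3, k=1.5):
    """Return (IQR, lower whisker end, upper whisker end, outliers).

    Assumption: a value is an outlier if it lies more than k * IQR
    below the first quartile or above the third quartile.
    """
    iqr = q3 - q1                      # interquartile range: Q3 minus Q1
    low_fence = q1 - k * iqr
    high_fence = q3 + k * iqr
    inside = [v for v in values if low_fence <= v <= high_fence]
    outliers = sorted(v for v in values if v < low_fence or v > high_fence)
    # Whiskers end at the most extreme values that are not outliers.
    return iqr, min(inside), max(inside), outliers


# Hypothetical data with one unusually large value; for this data,
# Q1 = 12 and Q3 = 36 under the median-of-halves method.
data = [5, 7, 12, 14, 15, 22, 25, 30, 36, 42, 95]
print(whiskers_and_outliers(data, q1=12, q3=36))   # (24, 5, 42, [95])
```

Here the interquartile range is 24, so the fences sit at -24 and 72. The value 95 falls outside the upper fence and would be drawn as a separate dot, while the whiskers would end at 5 and 42.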
Why use a box & whisker plot?
Besides representing the data distribution, a box & whisker plot offers several benefits: it displays outliers clearly, it shows the overall spread and skew of the data at a glance, and it makes it quicker to compare several data sets and spot the differences between them.