7/30/2019 (9) Basic Box-Plot
Applied Statistics and Computing Lab
BASIC BOX-PLOT
Applied Statistics and Computing Lab
Indian School of Business
Learning goals
What are the components of a basic box-plot?
How is a basic box-plot constructed?
How to interpret it?
What are its salient features?
What are its limitations?
How is it related to a histogram?
What is the effect of translation on a box-plot?
Where does a box-plot fit in?
Exploratory data-analytic tool for continuous data
Visual display of certain important summary statistics
Why Box-plot?
Useful in Studying
Location
Spread
Distribution
Symmetry
Tail behaviour
Skewness
Useful in comparison of different batches of Data, or a batch of data with
factors
Useful to study information of observations at the tails
Easy to compute and draw, yet informative
User-friendly
Constructing a Basic Box-plot
Data set 1. Suppose we have data on a batch (variable)
90, 41, 22, 135, 15, 72, 50, 26, 105
Step 1: Arrange the data in the increasing order:
15, 22, 26, 41, 50, 72, 90, 105, 135
Step 2: Get the Five-point Summary, consisting of (i) the Minimum, (ii) First
quartile (Q1), (iii) Median, (iv) Third quartile (Q3) and (v) the Maximum
For the above data, the Five-point Summary is:
Minimum= 15
First Quartile= 26
Median= 50
Third Quartile= 90
Maximum= 135
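The five-point summary above can be checked in R. This is a minimal sketch using base R's fivenum(); the variable name x is chosen here for illustration. Quartile conventions differ between methods, but for this data set fivenum() and the default quantile() agree with the values on the slide.

```r
# Data set 1 from the slide
x <- c(90, 41, 22, 135, 15, 72, 50, 26, 105)

# Step 1: arrange in increasing order
sorted_x <- sort(x)

# Step 2: five-point summary (minimum, Q1, median, Q3, maximum)
five <- fivenum(x)
names(five) <- c("Minimum", "Q1", "Median", "Q3", "Maximum")
print(five)

# Cross-check with quantile() under its default definition
q <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))
print(q)
```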
Constructing a Basic Box-plot (contd.)
Step 3: Draw a box of length equal to (Q3 - Q1). For now, we can choose the
width as per convenience. The lower and upper hinges of the box represent the
first and third quartiles. (In this case, the width is the vertical distance and the hinges
are the left and right extremes of the box.)
Constructing a Basic Box-plot (contd.)
Step 4: From the middle of the lower hinge draw a line (parallel to the
lines corresponding to the length of the box) up to the minimum. Similarly
draw a line from the middle of the upper hinge (parallel to the lines
corresponding to the length of the box) up to the maximum. These lines
are called the whiskers.
Step 5: Draw a line at the median parallel to the hinges, dividing the box
into two parts.
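Steps 3 to 5 can be followed literally with base R graphics. The sketch below draws a horizontal basic box-plot of Data set 1 by hand; the vertical positions (0.5, 1, 1.5) and the variable names are arbitrary choices made for illustration, not part of the original slides.

```r
# Data set 1 and its five-point summary
x <- c(90, 41, 22, 135, 15, 72, 50, 26, 105)
five <- fivenum(x)
x_min <- five[1]; q1 <- five[2]; med <- five[3]; q3 <- five[4]; x_max <- five[5]

# Empty canvas with the scale on the horizontal axis
plot.new()
plot.window(xlim = range(x), ylim = c(0, 2))
axis(1)

# Step 3: box of length Q3 - Q1; the width (here 1 unit) is arbitrary
rect(q1, 0.5, q3, 1.5)

# Step 4: whiskers from the middle of each hinge to the minimum and maximum
segments(x_min, 1, q1, 1)
segments(q3, 1, x_max, 1)

# Step 5: median line parallel to the hinges
segments(med, 0.5, med, 1.5)

box_length <- q3 - q1
```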
Horizontal and vertical Box-plots
The box-plot we saw is a horizontal box-plot
(here the scale is on the horizontal axis)
One can also have a vertical plot (where the
scale is on the vertical axis)
There is no specific advantage of one over the
other, in general
What can we get from a basic Box-plot?
We can obviously get the position of the
location measure, the spread and where the middle 50% of observations are concentrated,
in a visual display
After all, a picture is worth a thousand words
What else can we get ?
What can we get from a basic Box-plot? (contd.)
[Figure: visuals from Aczel A., Sounderpandian J., Complete Business Statistics]
Interpretation of the basic box-plot
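In Data set 1 the median (50) is closer to the first quartile (26) than to the third quartile (90), and the upper whisker (90 to 135) is longer than the lower whisker (15 to 26).
Thus the distribution of Data set 1 is right skewed.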
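The comparison behind this reading can be made explicit by measuring the four segments of the box-plot. The sketch below does this for Data set 1; the variable names are illustrative, and comparing segment lengths is an informal check rather than a formal test of skewness.

```r
# Data set 1 and its five-point summary
x <- c(90, 41, 22, 135, 15, 72, 50, 26, 105)
five <- fivenum(x)  # minimum, Q1, median, Q3, maximum

# Lengths of the four segments, each covering roughly 25% of the data
lower_whisker <- five[2] - five[1]  # minimum to Q1
lower_box     <- five[3] - five[2]  # Q1 to median
upper_box     <- five[4] - five[3]  # median to Q3
upper_whisker <- five[5] - five[4]  # Q3 to maximum

# Longer upper segments suggest a longer right tail (right skew)
suggests_right_skew <- (upper_box > lower_box) && (upper_whisker > lower_whisker)
print(c(lower_whisker, lower_box, upper_box, upper_whisker))
print(suggests_right_skew)
```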
Basic Box-plot: Features and limitations
Features
Basic box-plot displays the location (median) and the intervals required for the first, second, third and fourth quarters of the data
It visually shows where the middle 50% of data is located
It tells us whether the data is symmetric or left skewed or right skewed
Limitations
We cannot get modal information
We cannot identify unusual observations
It is hard to identify the tail behaviour
Histogram and box-plot
In histogram, the width of the interval is fixed and the height
of the vertical bar is proportional to the (relative) frequency in that interval
In a box-plot the relative frequency is fixed at 25% and the
intervals correspond to the first, second, third and fourth 25%
of the relative frequencies. More precisely:
The left and right whiskers correspond to the first and fourth 25%
The part of the box from the first quartile to the median corresponds
to the second 25%, and
The part of the box from the median to the third quartile corresponds
to the third 25%
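The contrast can be illustrated numerically: a histogram fixes the interval width and lets the counts vary, while the box-plot fixes the count at about 25% and lets the interval width vary. The sketch below uses a simulated right-skewed sample, which is an assumption made purely for illustration and is not the Scores data from the slides.

```r
# Simulated right-skewed sample (illustrative only, not from the slides)
set.seed(1)
x <- rexp(1000)

# Histogram view: equal-width intervals, varying counts
h <- hist(x, plot = FALSE)
bin_widths <- diff(h$breaks)

# Box-plot view: intervals cut at the quartiles, equal counts
breaks <- quantile(x, probs = c(0, 0.25, 0.5, 0.75, 1))
counts <- table(cut(x, breaks = breaks, include.lowest = TRUE))
interval_widths <- diff(breaks)

print(counts)           # each interval holds 25% of the observations
print(interval_widths)  # widths differ, reflecting the shape of the data
```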
Scores dataset
Comprises the scores of 50 students in the second-semester exam of the second course in
Quantitative Methods
We also have their GPA of first semester exam
and scores in 3 minors of the subject
Maximum marks: semester exam score = 60; score in minor i = 20, i = 1, 2, 3; GPA = 10
Histogram and box-plot (contd.)
[Figure: Box-plot of scores; Histogram of scores]
Histogram and box-plot (contd.)
Box-plot readily gives the information on the 5-point
summary and the information about skewness. However, it is
not possible to get information about the mode(s)
Histogram readily gives information about mode(s). But it takes
some effort to extract information about the 5-point summary
Thus the two plots complement each other!
Effect of translation
Consider the box-plot of a variable X. If we translate
from X to Y = aX + b where a is positive, the features of the
box-plot do not change (only the scale does)
If we translate from X to Y = aX + b where a is negative, then
the features of the box-plot would be the same as those of the
box-plot of (-X), that is, the mirror image of the box-plot of X
We shall demonstrate this by getting the box-plots of X,
(3X + 5) and (-3X + 5) for the Scores data set (X is the score
in the 2nd minor)
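The same effect can be verified on the five-point summary directly. Since the Scores data are not reproduced in these slides, the sketch below uses Data set 1 as a stand-in for X; that substitution is an assumption made for illustration.

```r
# Data set 1 used as a stand-in for X (the Scores data are not available here)
x <- c(90, 41, 22, 135, 15, 72, 50, 26, 105)
five_x <- fivenum(x)

# Positive slope: every summary point is mapped by the same transformation,
# so the shape of the box-plot is unchanged
five_pos <- fivenum(3 * x + 5)

# Negative slope: the order is reversed, so the box-plot is mirrored
five_neg <- fivenum(-3 * x + 5)

print(five_pos)
print(five_neg)

# Side-by-side box-plots with whiskers drawn to the extremes
boxplot(list("X" = x, "3X + 5" = 3 * x + 5, "-3X + 5" = -3 * x + 5), range = 0)
```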
Effect of translation
[Figure: Box-plot of X; Box-plot of (3X + 5); Box-plot of (-3X + 5)]
R-codes
The R-code for box-plot is boxplot(variable name)
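As a usage sketch on Data set 1: by default boxplot() limits the whiskers to 1.5 times the box length and plots points beyond them separately, so the basic box-plot described in these slides, with whiskers reaching the minimum and maximum, corresponds to range = 0. The title text and variable names below are illustrative.

```r
# Data set 1 from the slides
x <- c(90, 41, 22, 135, 15, 72, 50, 26, 105)

# Basic box-plot: range = 0 extends the whiskers to the minimum and maximum;
# horizontal = TRUE puts the scale on the horizontal axis
bp <- boxplot(x, range = 0, horizontal = TRUE, main = "Basic box-plot of Data set 1")

# The returned object holds the five values that were drawn
print(bp$stats)
```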
Thank you