Box whisker show

Page 1: Box whisker show

Application of Box-whisker Plot in Psychological Research

Dr. D. Dutta Roy, Ph.D.

Psychology Research Unit
INDIAN STATISTICAL INSTITUTE

203, B.T. Road, Kolkata – 700 108
E-mail: [email protected]

http://www.isical.ac.in/~ddroy
Venue: Psychology Research Unit, ISI, Kolkata

Page 2: Box whisker show

Dr. D. Dutta Roy, Indian Statistical Institute

Box-Whisker Plot

• It is a plot that displays summary information about the distribution of the values.

• SPSS and STATISTICA are statistical software packages that can draw box-whisker plots.

JOHN WILDER TUKEY (1915-2000)

Page 3: Box whisker show

PROPERTIES

Page 4: Box whisker show

HINGES

• There are two hinges: the 25th and 75th percentiles.

• The lower boundary of the box is the 25th percentile and the upper boundary of the box is the 75th percentile.

• The horizontal line inside the box represents the median. 50% of the cases fall within the box.
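
A minimal sketch of how the hinges and the box length can be computed. It assumes Tukey's "halves" rule (each hinge is the median of one half of the sorted data); statistical packages define the 25th and 75th percentiles in slightly different ways, so their values may differ a little. The function name and the scores are hypothetical.

```python
from statistics import median

def hinges(values):
    """Return (lower hinge, median, upper hinge) using Tukey's halves rule."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2              # when n is odd, the middle value joins both halves
    lower = median(data[:half])      # median of the lower half
    upper = median(data[n - half:])  # median of the upper half
    return lower, median(data), upper

scores = [2, 4, 4, 5, 7, 8, 9, 11, 12]   # hypothetical scores
q1, med, q3 = hinges(scores)
print("lower hinge:", q1, "median:", med, "upper hinge:", q3)
print("box length:", q3 - q1)
```

The box length (upper hinge minus lower hinge) is the unit used to judge how far a case lies from the box.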

Page 5: Box whisker show

Whiskers

Lines are drawn from the ends of the box to the largest and smallest observed values that are not outliers. These lines are called whiskers.

Page 6: Box whisker show

OUTLYING VALUES

Cases with values that are more than 3 box-lengths from the upper or lower edge of the box are called extreme values. These are designated with an asterisk (*).

Cases with values that are between 1.5 and 3 box-lengths from the upper or lower edge of the box are called outliers and are designated with a circle (O).
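
The rules above can be sketched in a few lines. This is an illustration only: it assumes hinges computed by Tukey's halves rule, treats a case exactly 1.5 box-lengths away as a regular case, and uses hypothetical data and function names.

```python
from statistics import median

def classify(values):
    """Label each case as regular, outlier or extreme, and find the whisker ends."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2
    q1, q3 = median(data[:half]), median(data[n - half:])
    box = q3 - q1                                # box length
    labels = []
    for v in data:
        distance = max(q1 - v, v - q3, 0)        # distance from the nearer box edge
        if distance > 3 * box:
            labels.append((v, "extreme"))        # drawn as an asterisk (*)
        elif distance > 1.5 * box:
            labels.append((v, "outlier"))        # drawn as a circle (O)
        else:
            labels.append((v, "regular"))
    regular = [v for v, label in labels if label == "regular"]
    whiskers = (min(regular), max(regular))      # whiskers stop at the last regular cases
    return labels, whiskers

data = [4, 5, 5, 6, 6, 7, 7, 8, 14, 20]          # hypothetical data
labels, whiskers = classify(data)
print(labels)
print("whiskers run from", whiskers[0], "to", whiskers[1])
```

Here the box runs from 5 to 8, so the box length is 3: the case at 14 is 6 units above the box (an outlier) and the case at 20 is 12 units above it (an extreme value).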

[Figure: two box plots of DATA (N = 6), annotated with the 1.5 box-length and 3 box-length distances]

Page 7: Box whisker show

Normal Probability Curve Properties

• For the standard normal curve, the mean, median and mode all equal 0.

• The mean, median and mode coincide, and the right and left halves of the curve are perfectly balanced.

• Within ±1 SD of the mean (roughly the middle two-thirds) = 68.27% of total cases.

• Within ±2 SD of the mean = 95.45% of total cases.

• Within ±3 SD of the mean = 99.73% of total cases, that is, nearly all of them.

• Skewness = 0. When the tail of a distribution spreads to the left, it is negatively skewed; when the tail spreads to the right, it is positively skewed.

• Peakedness = mesokurtic.
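
The percentages for a normal curve can be checked numerically. The sketch below uses the error function for the share of cases within k SD of the mean, and then, as an added illustration that is not taken from the slides, locates the box edges and the 1.5 box-length limits of a box plot drawn for a standard normal distribution. All function and variable names are hypothetical.

```python
import math
from statistics import NormalDist

def coverage_within(k):
    """Proportion of a normal distribution lying within k SD of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within +/-{k} SD: {coverage_within(k) * 100:.2f}%")

# Box plot of a standard normal curve: box edges and 1.5 box-length limits.
z = NormalDist()                      # mean 0, SD 1
q3 = z.inv_cdf(0.75)                  # upper edge of the box, in SD units
box = 2 * q3                          # box length (the curve is symmetric)
upper_limit = q3 + 1.5 * box          # cases beyond this count as outliers
share_outside = 2 * (1 - z.cdf(upper_limit))
print(f"box edges at +/-{q3:.3f} SD, limits at +/-{upper_limit:.3f} SD")
print(f"share of cases beyond the limits: {share_outside * 100:.2f}%")
```

Under these assumptions less than 1% of normally distributed cases fall beyond the 1.5 box-length limits, so many flagged cases in a sample suggest a departure from normality.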

Page 8: Box whisker show

Box-Plot and NPC

Page 9: Box whisker show

Application

• Outlier detection

• Detecting changes in learning process

Page 10: Box whisker show

What is an outlier?

• Outliers are observations with a unique combination of characteristics identifiable as distinctly different from the other observations.

Do you find outliers in the pictures?

Page 11: Box whisker show

Impact of outliers

Correlations with income

                Pearson Correlation   Sig. (2-tailed)   N
income          1                                       60
expenditure     0.16988               0.20646           57

Correlations with income (After Eliminating 99999)

                Pearson Correlation   Sig. (2-tailed)   N
income          1                                       58
expenditure     0.50                  0.00              55

**. Correlation is significant at the 0.01 level (2-tailed).
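
A small sketch of the same effect with made-up numbers. The income and expenditure values below are hypothetical and are not the data behind the table; the value 99999 stands in for a miscoded entry. The Pearson coefficient is computed by hand so the example needs no extra packages.

```python
import math

def pearson(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical data; the last income value is a coding error.
income = [10, 12, 14, 16, 18, 20, 99999]
expenditure = [8, 9, 11, 12, 14, 15, 11]

r_all = pearson(income, expenditure)

# Drop the miscoded case and recompute.
pairs = [(x, y) for x, y in zip(income, expenditure) if x != 99999]
clean_income, clean_expenditure = zip(*pairs)
r_clean = pearson(clean_income, clean_expenditure)

print(f"r with 99999 included: {r_all:.2f}")
print(f"r after eliminating 99999: {r_clean:.2f}")
```

A single miscoded case is enough to hide a strong linear relationship, which is why outlier screening comes before correlation-based analysis.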

Page 12: Box whisker show

Types of Outliers

Page 13: Box whisker show

1. Procedural Error

This is a data entry error or a mistake in coding.

Page 14: Box whisker show

2. An extraordinary event for which the researcher has an explanation.

Page 15: Box whisker show

3. An extraordinary event for which the researcher has no explanation.

4. Observations that fall within the ordinary range of values on each of the variables but are unique in their combination of values across the variables.

Page 16: Box whisker show

Is an outlier harmful?

• Outliers cannot be categorized as either beneficial or problematic, but instead must be viewed within the context of the analysis and should be evaluated by the types of information they may provide.

Page 17: Box whisker show

Can an outlier be detected?

• Non-robust statistics such as the correlation coefficient are seriously affected by outliers. Outlier detection is therefore a prelude to item analysis, and to testing the reliability and validity of a psychological instrument using correlation coefficients.

• In univariate statistics, outliers can be detected by stem-and-leaf plots and box-whisker plots.

• In bivariate statistics, the scatter plot is useful for outlier detection, and in multivariate statistics, Mahalanobis D².
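
A minimal sketch of Mahalanobis D² for the two-variable case, written out by hand so that it needs no matrix library. The data are hypothetical: nine cases lie close to a straight line, and the last case is within the ordinary range on each variable but unusual in its combination of values. The covariance matrix is estimated from all cases, including the unusual one.

```python
def mahalanobis_d2(xs, ys):
    """Squared Mahalanobis distance of each (x, y) case from the centroid."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # Sample variances and covariance (divisor n - 1).
    sxx = sum((x - mx) ** 2 for x in xs) / (n - 1)
    syy = sum((y - my) ** 2 for y in ys) / (n - 1)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (n - 1)
    det = sxx * syy - sxy ** 2        # determinant of the 2 x 2 covariance matrix
    d2 = []
    for x, y in zip(xs, ys):
        dx, dy = x - mx, y - my
        # (dx, dy) times the inverse covariance matrix times (dx, dy) transposed.
        d2.append((syy * dx * dx - 2 * sxy * dx * dy + sxx * dy * dy) / det)
    return d2

# Hypothetical data; the last case (3, 7.5) is ordinary on x and on y separately.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 3]
ys = [1.2, 1.9, 3.1, 4.0, 5.2, 5.9, 7.1, 8.0, 9.1, 7.5]

d2 = mahalanobis_d2(xs, ys)
for x, y, d in zip(xs, ys, d2):
    print(f"({x}, {y}): D2 = {d:.2f}")
print("most unusual case:", d2.index(max(d2)))
```

A box plot of x or of y alone would not flag the last case; its distance is large only when both variables are considered together.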

[Figure: box plot of DATA (N = 6)]

Page 18: Box whisker show

The Information out of properties

The box-plot contains an impressive amount of information.

• From the median one can determine the central tendency, or location.

• From the length of the box one can determine the spread, or variability, of the observations.

• If the median is not in the centre of the box, the observed values are skewed.

• If the median is closer to the bottom of the box than to the top, the data are positively skewed.

• If the median is closer to the top of the box than to the bottom, the distribution is negatively skewed.

• The length of the tail is shown by the whiskers and the outlying and extreme points.
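
The skewness rule above can be written as a short check on the position of the median inside the box. This is only a sketch of the rule of thumb; the function name and the example values are hypothetical.

```python
def skew_direction(q1, med, q3):
    """Read the direction of skew from where the median sits inside the box."""
    lower_part = med - q1     # distance from the bottom of the box to the median
    upper_part = q3 - med     # distance from the median to the top of the box
    if lower_part < upper_part:
        return "positive"     # median closer to the bottom of the box
    if lower_part > upper_part:
        return "negative"     # median closer to the top of the box
    return "symmetric"

print(skew_direction(4, 5, 9))      # median near the bottom
print(skew_direction(4, 8, 9))      # median near the top
print(skew_direction(4, 6.5, 9))    # median in the centre
```

In practice a small difference between the two parts of the box is not evidence of skew, so the whiskers and outlying points should be read alongside it.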

Page 19: Box whisker show

CASE STUDY ON APPLICATION OF BOX-WHISKER PLOT IN DETECTING CHANGE IN LEARNING PROCESS

Page 20: Box whisker show

Detecting change in learning process

• Learning is the modification of behaviour through practice and experience.

• Change in the learning process can usually be detected using a learning curve.

• A learning curve is a graphical representation of the changing rate of learning for a given activity or tool.

• Typically, the increase in retention of information is sharpest after the initial attempts, and then gradually evens out, meaning that less and less new information is retained after each repetition.

Page 21: Box whisker show

CASE STUDY

25 students were trained with 7 training modules of Fast ForWord.

Results were analyzed in terms of box-whisker plots.

Page 22: Box whisker show

Circus Sequence (CS)

• The exercise develops the participant's listening accuracy by presenting sound sweeps at different frequencies, durations, and with different lengths of time between sounds. The frequencies and durations of the sound sweeps correspond to the rapid transitions in the sounds of the English language.

Page 23: Box whisker show

Results of CS

[Figure: Box & Whisker Plot (CS exercise, Treatments = 34). Y-axis: Percentage of Success (-10 to 110). X-axis: trials T1 to T34. Legend: Min-Max, 25%-75%, Median value.]

Size: The box gradually becomes larger, indicating that more cases are included in the learning competency group.

Location of median: The median moves upward with successive trials. This indicates that learning competency grows across trials.

Whiskers: The upper whisker gradually vanishes and the lower whisker moves upward. This indicates that most cases achieve learning competency, though a few find it difficult. After 100% achievement, the box size increases, indicating fluctuation of attention or the operation of other intervening factors once the goal is achieved.
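
The reading above can be supported by tabulating the box-plot values trial by trial. The sketch below uses hypothetical percentage-of-success scores for three trials, not the case-study data, and takes the whiskers as the minimum and maximum, as in the plot legend. The hinges follow Tukey's halves rule, which is an assumption.

```python
from statistics import median

def five_numbers(values):
    """Return (min, lower hinge, median, upper hinge, max) for one trial."""
    data = sorted(values)
    n = len(data)
    half = (n + 1) // 2
    return (data[0], median(data[:half]), median(data),
            median(data[n - half:]), data[-1])

# Hypothetical percentage-of-success scores of seven students on three trials.
trials = {
    "T1": [10, 20, 25, 30, 40, 45, 60],
    "T2": [30, 40, 50, 55, 60, 70, 85],
    "T3": [55, 70, 80, 85, 90, 95, 100],
}

summary = {name: five_numbers(scores) for name, scores in trials.items()}
for name, (low, q1, med, q3, high) in summary.items():
    print(f"{name}: min={low} box={q1}-{q3} median={med} max={high}")

# The median moving upward from trial to trial is the sign of learning.
medians = [s[2] for s in summary.values()]
rising = all(a < b for a, b in zip(medians, medians[1:]))
print("median rises with successive trials:", rising)
```

The same table also shows the lower whisker (the minimum) moving upward and the upper whisker shrinking as the maximum approaches 100%.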

Page 24: Box whisker show

Old MacDonald’s Flying Farm (OM)

• Students use the computer mouse to catch and hold a flying animal. The animal repeats a single syllable several times, and students must release the animal when they hear a change in the syllable.

Page 25: Box whisker show

Results of OM

[Figure: Box & Whisker Plot (OM exercise, Treatment = 20). Y-axis: Percentage of Success (-10 to 110). X-axis: trials T1 to T20. Legend: Min-Max, 25%-75%, Median value.]

Page 26: Box whisker show

Phonic Words (PW)

• Students see two pictures representing two similar words that differ only by initial or final consonant (“tack” versus “tag”). When students hear the word representing one of the pictures, they must click the picture that matches the word.

[Figure: Box & Whisker Plot (PW Exercise, Treatment = 20). Y-axis: Percentage of Success (-10 to 110). X-axis: trials T1 to T20. Legend: Min-Max, 25%-75%, Median value.]

Page 27: Box whisker show

Compare relative effectiveness of training modules

[Figure: the three box & whisker plots repeated for comparison: CS exercise (Treatments = 34, trials T1 to T34), OM exercise (Treatment = 20, trials T1 to T20) and PW Exercise (Treatment = 20, trials T1 to T20). Y-axis in each: Percentage of Success (-10 to 110). Legend: Min-Max, 25%-75%, Median value.]

Page 28: Box whisker show

SUMMARY

• The box-whisker plot is a useful statistical tool for detecting outliers and for detecting change in the learning process.

• The box plot is an effective statistical tool for comparing the relative effectiveness of different training modules.