Application of Box-whisker Plot in Psychological Research
Dr. D. Dutta Roy, Ph.D.
Psychology Research Unit
INDIAN STATISTICAL INSTITUTE
203, B.T. Road, Kolkata – 700 108
E-mail: [email protected]
http://www.isical.ac.in/~ddroy
Venue: Psychology Research Unit, ISI, Kolkata
Dr. D. Dutta Roy, Indian Statistical Institute
Box-Whisker Plot
• It is a plot that displays summary information about the distribution of values.
• SPSS and STATISTICA are statistical software packages that can draw box-whisker plots.
JOHN WILDER TUKEY (1915-2000)
PROPERTIES
HINGES
• There are two hinges: the 25th and 75th percentiles.
• The lower boundary of the box is the 25th percentile and the upper boundary is the 75th percentile.
• The horizontal line inside the box represents the median. 50% of the cases are included within the box.
Whiskers
Lines are drawn from the ends of the box to the largest and smallest observed values that are not outliers. These lines are called whiskers.
OUTLYING VALUES
Cases with values more than three box-lengths from the upper or lower edge of the box are called extreme values. These are designated with an asterisk (*).
Cases with values between 1.5 and 3 box-lengths from the upper or lower edge of the box are called outliers and are designated with a circle (O).
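The rule for outliers and extreme values can be sketched in Python. This is a minimal illustration, not the SPSS implementation: the function name `classify_cases`, the sample scores, and the choice of `statistics.quantiles` with the inclusive method for the box edges are all assumptions (SPSS computes Tukey's hinges, which can differ slightly from these quartiles).

```python
from statistics import quantiles

def classify_cases(data):
    """Return whisker ends, outliers, and extreme values for a batch of scores."""
    q1, _median, q3 = quantiles(data, n=4, method="inclusive")
    box = q3 - q1  # box length (interquartile range)
    inside, outliers, extremes = [], [], []
    for x in sorted(data):
        # distance beyond the nearer edge of the box (0 if x lies within the box)
        dist = max(q1 - x, x - q3, 0)
        if dist > 3 * box:
            extremes.append(x)      # plotted as *
        elif dist > 1.5 * box:
            outliers.append(x)      # plotted as O
        else:
            inside.append(x)
    # whiskers end at the smallest and largest values that are not outliers
    whiskers = (inside[0], inside[-1])
    return whiskers, outliers, extremes

data = [10, 12, 13, 14, 15, 16, 17, 18, 30, 45]  # hypothetical scores
whiskers, outliers, extremes = classify_cases(data)
print("whiskers:", whiskers)
print("outliers (O):", outliers)
print("extreme values (*):", extremes)
```

With these hypothetical scores the box runs from 13.25 to 17.75, so the box length is 4.5; the value 30 lies between 1.5 and 3 box-lengths above the box, and 45 lies more than 3 box-lengths above it.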
[Figure: two box plots of DATA (N = 6), one on a 0-6 scale and one on a 0-12 scale, annotated with the 1.5 box-length and 3 box-length distances.]
Normal Probability Curve Properties
• Mean, median and mode = 0 (in standard-score units).
• The mean, median and mode all coincide, and there is perfect balance between the right and left halves of the curve.
• Within ±1 SD of the mean (the middle two-thirds) = 68.27% of total cases.
• Within ±2 SD of the mean = 95.45% of total cases.
• Within ±3 SD of the mean = 99.73% (nearly 100%) of total cases.
• Skewness = 0. When the tail of a distribution spreads to the left, it is negatively skewed; when the tail spreads to the right, it is positively skewed.
• Peakedness = mesokurtic.
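The percentages of cases within 1, 2 and 3 SD of the mean can be checked from the normal distribution function. The sketch below uses only the standard library; the helper name `coverage_within` is an assumption made for illustration.

```python
import math

def coverage_within(k):
    """Proportion of a normal distribution lying within k SD of the mean."""
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    print(f"within +/-{k} SD: {coverage_within(k):.2%}")
# within +/-1 SD: 68.27%
# within +/-2 SD: 95.45%
# within +/-3 SD: 99.73%
```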
Box-Plot and NPC
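One way to relate the box plot to the normal probability curve is to ask where the box edges and the 1.5 box-length limits fall when the data are perfectly normal. The sketch below assumes a standard normal distribution and the 1.5 box-length rule; the variable names are illustrative only.

```python
from statistics import NormalDist

z = NormalDist()  # standard normal: mean 0, SD 1
q1, q3 = z.inv_cdf(0.25), z.inv_cdf(0.75)  # box edges (25th and 75th percentiles)
box = q3 - q1                               # box length
upper_fence = q3 + 1.5 * box
lower_fence = q1 - 1.5 * box
inside = z.cdf(upper_fence) - z.cdf(lower_fence)

print(f"box edges at {q1:.3f} and {q3:.3f} SD")
print(f"1.5 box-length limits at {lower_fence:.3f} and {upper_fence:.3f} SD")
print(f"share of cases inside the limits: {inside:.2%}")
```

Under these assumptions the box covers roughly ±0.67 SD and the 1.5 box-length limits fall near ±2.7 SD, so fewer than 1% of cases from a normal distribution would be flagged as outliers.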
Application
• Outlier detection
• Detecting changes in learning process
What is an outlier?
• Outliers are observations with a unique combination of characteristics identifiable as distinctly different from the other observations.
Do you find outliers in the pictures?
Impact of outliers

Correlations
                                   income
income       Pearson Correlation   1
             Sig. (2-tailed)
             N                     60
expenditure  Pearson Correlation   0.16988
             Sig. (2-tailed)       0.20646
             N                     57

Correlations (After Eliminating 99999)
                                   income
income       Pearson Correlation   1
             Sig. (2-tailed)
             N                     58
expenditure  Pearson Correlation   0.50**
             Sig. (2-tailed)       0.00
             N                     55
**. Correlation is significant at the 0.01 level (2-tailed).
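The effect of a single implausible value on a correlation can be reproduced on a small scale. The sketch below uses synthetic numbers, not the income and expenditure data from the slides; the function `pearson_r` and the assumption that 99999 is a placeholder code entered as a real income value are illustrative only.

```python
import math

def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

# Synthetic data: the last income value is the hypothetical code 99999
income = [10, 12, 14, 16, 18, 20, 22, 24, 99999]
expenditure = [8, 9, 11, 12, 14, 15, 17, 18, 10]

r_all = pearson_r(income, expenditure)

# Drop the case carrying the placeholder code, then recompute
kept = [(x, y) for x, y in zip(income, expenditure) if x != 99999]
r_clean = pearson_r([x for x, _ in kept], [y for _, y in kept])

print(f"r with 99999 included: {r_all:.2f}")
print(f"r after eliminating 99999: {r_clean:.2f}")
```

In this synthetic example one miscoded case is enough to hide a strong positive relationship, which mirrors the pattern in the two correlation tables.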
Types of Outliers
1. Procedural Error
This is a data entry error or a mistake in coding.
2. Extraordinary event for which the researcher has an explanation.
3. Extraordinary event for which the researcher has no explanation.
4. Observations that fall within the ordinary range of values on each of the variables but are unique in their combination of values across the variables.
Is an outlier harmful?
• Outliers cannot be categorized as either beneficial or problematic; instead they must be viewed within the context of the analysis and evaluated by the types of information they may provide.
Can outliers be detected?
• Non-robust statistics such as the correlation coefficient are seriously affected by outliers. Outlier detection is therefore a prelude to item analysis and to testing the reliability and validity of a psychological instrument using correlation coefficients.
• In univariate statistics, outliers can be detected by stem-and-leaf plots and box-whisker plots.
• In bivariate statistics, the scatter plot is useful for outlier detection, and in multivariate statistics, Mahalanobis D².
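For the multivariate case, Mahalanobis D² measures how far each case lies from the centroid after allowing for the correlation between variables. The sketch below handles two variables only, with the 2x2 covariance matrix inverted by hand; the function name `mahalanobis_d2` and the data are hypothetical. The last case, (2, 7), is within the ordinary range on each variable but unusual in combination.

```python
def mahalanobis_d2(xs, ys):
    """Squared Mahalanobis distance of each (x, y) case from the centroid."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    dx = [x - mx for x in xs]
    dy = [y - my for y in ys]
    # sample variances and covariance (divisor n - 1)
    sxx = sum(a * a for a in dx) / (n - 1)
    syy = sum(b * b for b in dy) / (n - 1)
    sxy = sum(a * b for a, b in zip(dx, dy)) / (n - 1)
    det = sxx * syy - sxy ** 2  # determinant of the 2x2 covariance matrix
    # d' S^-1 d written out for the 2x2 case
    return [(a * a * syy - 2 * a * b * sxy + b * b * sxx) / det
            for a, b in zip(dx, dy)]

# Hypothetical data: eight cases on a line plus one unusual combination
xs = [1, 2, 3, 4, 5, 6, 7, 8, 2]
ys = [1, 2, 3, 4, 5, 6, 7, 8, 7]

d2 = mahalanobis_d2(xs, ys)
for case, value in enumerate(d2, start=1):
    print(f"case {case}: D2 = {value:.2f}")
```

A univariate box plot of either variable would not flag the last case, since 2 and 7 both fall inside the observed range, yet it has by far the largest D² in this example.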
[Figure: box plot of DATA (N = 6) on a 0-12 scale.]
The Information out of properties
The box plot contains an impressive amount of information.
• From the median one can determine the central tendency, or location.
• From the length of the box one can determine the spread, or variability, of the observations.
• If the median is not in the centre of the box, the observed values are skewed.
• If the median is closer to the bottom of the box than to the top, the data are positively skewed.
• If the median is closer to the top of the box than to the bottom, the distribution is negatively skewed.
• The length of the tail is shown by the whiskers and by the outliers and extreme points.
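These reading rules can be expressed as a short function. It is a sketch under assumptions: the name `read_box`, the sample data, and the inclusive quartile method are illustrative choices, and the skew label reflects only the position of the median inside the box.

```python
from statistics import quantiles

def read_box(data):
    """Summarise location, spread, and direction of skew from the box alone."""
    q1, median, q3 = quantiles(data, n=4, method="inclusive")
    lower_part = median - q1   # bottom of the box up to the median
    upper_part = q3 - median   # median up to the top of the box
    if lower_part < upper_part:
        skew = "positive"      # median closer to the bottom of the box
    elif lower_part > upper_part:
        skew = "negative"      # median closer to the top of the box
    else:
        skew = "symmetric"
    return {"median": median, "box_length": q3 - q1, "skew": skew}

summary = read_box([1, 2, 2, 3, 3, 4, 5, 8, 13])  # hypothetical scores
print(summary)
```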
CASE STUDY ON APPLICATION OF BOX-WHISKER PLOT IN DETECTING CHANGE IN LEARNING PROCESS
Detecting change in learning process
• Learning is the modification of behaviour through practice and experience.
• Change in the learning process can usually be detected using a learning curve.
• A learning curve is a graphical representation of the changing rate of learning for a given activity or tool.
• Typically, the increase in retention of information is sharpest after the initial attempts, and then gradually evens out, meaning that less and less new information is retained after each repetition.
CASE STUDY
25 students were trained with 7 training modules of Fast ForWord.
Results were analyzed in terms of box-whisker plots.
Circus Sequence (CS)
• The exercise develops the participant's listening accuracy by presenting sweep sounds at different frequencies, durations, and with different lengths of time between sounds. The frequencies and durations of the sound sweeps correspond to the rapid transitions in the sounds of the English language.
Results of CS
[Figure: Box & Whisker Plot (CS exercise, Treatments = 34). Y-axis: Percentage of Success (-10 to 110); X-axis: trials T1 to T34. Legend: Min-Max (whiskers), 25%-75% (box), Median value.]
• Size: The box gradually becomes larger, indicating that more cases are included in the learning competency group.
• Location of median: The median moves upward with successive trials, indicating progressive gains in learning competency across trials.
• Whiskers: The upper whisker gradually vanishes and the lower whisker moves upward. This indicates that most cases achieve learning competency, though a few find it difficult.
• After 100% achievement, the box size increases, indicating fluctuation of attention or the operation of other intervening factors once the goal has been achieved.
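The quantities read off each box (Min-Max, 25%-75%, and median) can be tabulated trial by trial to follow the median and box size numerically. The sketch below uses made-up percentage scores for three trials, not the Fast ForWord results; the names `trial_summary` and `trials` and the inclusive quartile method are assumptions.

```python
from statistics import quantiles

def trial_summary(scores):
    """Min, 25%, median, 75%, and max for one trial's percentage scores."""
    q1, median, q3 = quantiles(scores, n=4, method="inclusive")
    return {"min": min(scores), "q1": q1, "median": median,
            "q3": q3, "max": max(scores)}

# Hypothetical percentage-of-success scores for five students per trial
trials = {
    "T1": [10, 20, 30, 40, 50],
    "T2": [30, 40, 50, 60, 70],
    "T3": [60, 70, 80, 90, 100],
}

summaries = {name: trial_summary(scores) for name, scores in trials.items()}
medians = [s["median"] for s in summaries.values()]
# True when the median never drops from one trial to the next
median_rising = all(a <= b for a, b in zip(medians, medians[1:]))

for name, s in summaries.items():
    print(name, s, "box size:", s["q3"] - s["q1"])
print("median rises across trials:", median_rising)
```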
Old MacDonald’s Flying Farm (OM)
• Students use the computer mouse to catch and hold a flying animal. The animal repeats a single syllable several times, and students must release the animal when they hear a change in the syllable.
Results of OM
[Figure: Box & Whisker Plot (OM exercise, Treatment = 20). Y-axis: Percentage of Success (-10 to 110); X-axis: trials T1 to T20. Legend: Min-Max (whiskers), 25%-75% (box), Median value.]
Phonic Words (PW)
• Students see two pictures representing two similar words that differ only by initial or final consonant (“tack” versus “tag”). When students hear the word representing one of the pictures, they must click the picture that matches the word.
[Figure: Box & Whisker Plot (PW exercise, Treatment = 20). Y-axis: Percentage of Success (-10 to 110); X-axis: trials T1 to T20. Legend: Min-Max (whiskers), 25%-75% (box), Median value.]
Compare relative effectiveness of training modules
[Figure: the three box & whisker plots shown together for comparison: CS exercise (Treatments = 34), OM exercise (Treatment = 20), and PW exercise (Treatment = 20). Each plots Percentage of Success against trials, with Min-Max whiskers, 25%-75% box, and median value.]
SUMMARY
• The box-whisker plot is a useful statistical tool for detecting outliers and for detecting change in the learning process.
• The box plot is an effective statistical tool for comparing the relative effectiveness of different training modules.