Malhotra18-Discriminant Analysis-With SPSS Output Inserts-2003 Format


  • 8/4/2019 Malhotra18-Discriminant Analysis-With SPSS Output Inserts-2003 Format


    Chapter Eighteen

    Discriminant Analysis


    Chapter Outline

    1) Overview

    2) Basic Concept

    3) Relation to Regression and ANOVA

    4) Discriminant Analysis Model

    5) Statistics Associated with Discriminant Analysis

    6) Conducting Discriminant Analysis

    i. Formulation

    ii. Estimation

    iii. Determination of Significance

    iv. Interpretation

    v. Validation


    7) Multiple Discriminant Analysis

    i. Formulation

    ii. Estimation

    iii. Determination of Significance

    iv. Interpretation

    v. Validation

    8) Stepwise Discriminant Analysis

    9) Internet and Computer Applications

    10) Focus on Burke

    11) Summary

    12) Key Terms and Concepts

    Table 18.1: Similarities and Differences between ANOVA, Regression, and Discriminant Analysis

                                              ANOVA         REGRESSION    DISCRIMINANT ANALYSIS
    Similarities
      Number of dependent variables           One           One           One
      Number of independent variables         Multiple      Multiple      Multiple
    Differences
      Nature of the dependent variable        Metric        Metric        Categorical
      Nature of the independent variables     Categorical   Metric        Metric

    Discriminant Analysis

    Discriminant analysis is a technique for analyzing data when the criterion or dependent variable is categorical and the predictor or independent variables are interval in nature.

    The objectives of discriminant analysis are as follows:

    Development of discriminant functions, or linear combinations of the predictor or independent variables, which will best discriminate between the categories of the criterion or dependent variable (groups).

    Examination of whether significant differences exist among the groups, in terms of the predictor variables.

    Determination of which predictor variables contribute to most of the intergroup differences.

    Classification of cases to one of the groups based on the values of the predictor variables.

    Evaluation of the accuracy of classification.

    Discriminant Analysis Model

    The discriminant analysis model involves linear combinations of the following form:

    D = b_0 + b_1 X_1 + b_2 X_2 + b_3 X_3 + \dots + b_k X_k

    where

    D = discriminant score

    b's = discriminant coefficients or weights

    X's = predictor or independent variables

    The coefficients, or weights (b), are estimated so that the groups differ as much as possible on the values of the discriminant function.

    This occurs when the ratio of between-group sum of squares to within-group sum of squares for the discriminant scores is at a maximum.
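    The criterion above can be sketched in a few lines of Python. This is a minimal illustration, not an estimation routine: it only evaluates the between-group to within-group ratio for a given set of weights. The toy data and the function names are hypothetical and do not come from the chapter.

```python
from statistics import mean


def discriminant_scores(b0, b, X):
    """Compute D = b0 + b1*X1 + ... + bk*Xk for every row of X."""
    return [b0 + sum(bj * xj for bj, xj in zip(b, row)) for row in X]


def between_within_ratio(scores, groups):
    """Ratio of between-group to within-group sum of squares of the scores."""
    grand_mean = mean(scores)
    by_group = {}
    for score, group in zip(scores, groups):
        by_group.setdefault(group, []).append(score)
    ss_between = 0.0
    ss_within = 0.0
    for values in by_group.values():
        group_mean = mean(values)
        ss_between += len(values) * (group_mean - grand_mean) ** 2
        ss_within += sum((v - group_mean) ** 2 for v in values)
    return ss_between / ss_within


# Hypothetical toy data: two predictors, two groups of three cases each.
X = [[1.0, 5.0], [2.0, 4.0], [3.0, 6.0], [6.0, 5.0], [7.0, 4.0], [8.0, 6.0]]
groups = [1, 1, 1, 2, 2, 2]

# Weights that use only the first predictor separate the groups well ...
ratio_x1 = between_within_ratio(discriminant_scores(0.0, [1.0, 0.0], X), groups)
# ... while weights that use only the second predictor do not separate them at all.
ratio_x2 = between_within_ratio(discriminant_scores(0.0, [0.0, 1.0], X), groups)
print(ratio_x1, ratio_x2)
```

    Estimation searches for the weights that make this ratio as large as possible; the sketch only compares two candidate weight vectors.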

    Statistics Associated with Discriminant Analysis

    Canonical correlation. Canonical correlation measures the extent of association between the discriminant scores and the groups. It is a measure of association between the single discriminant function and the set of dummy variables that define the group membership.

    Centroid. The centroid is the mean value of the discriminant scores for a particular group. There are as many centroids as there are groups, as there is one for each group. The means for a group on all the functions are the group centroids.

    Classification matrix. Sometimes also called confusion or prediction matrix, the classification matrix contains the number of correctly classified and misclassified cases.

    Discriminant function coefficients. The discriminant function coefficients (unstandardized) are the multipliers of variables, when the variables are in the original units of measurement.

    Discriminant scores. The unstandardized coefficients are multiplied by the values of the variables. These products are summed and added to the constant term to obtain the discriminant scores.

    Eigenvalue. For each discriminant function, the eigenvalue is the ratio of between-group to within-group sums of squares. Large eigenvalues imply superior functions.
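    The eigenvalue and the canonical correlation are two views of the same quantity: the canonical correlation equals the square root of eigenvalue / (1 + eigenvalue). The sketch below applies that standard identity to the eigenvalues reported in Tables 18.4 and 18.5; the function names are illustrative only.

```python
import math


def canonical_correlation(eigenvalue):
    """Canonical correlation implied by a discriminant function's eigenvalue."""
    return math.sqrt(eigenvalue / (1.0 + eigenvalue))


def percent_of_variance(eigenvalues):
    """Share of the total eigenvalue sum attributable to each function, in percent."""
    total = sum(eigenvalues)
    return [100.0 * ev / total for ev in eigenvalues]


# Eigenvalues as reported in the SPSS output (Table 18.4 and Table 18.5).
two_group = canonical_correlation(1.786)
three_group = [canonical_correlation(ev) for ev in (3.8190, 0.2469)]
shares = percent_of_variance([3.8190, 0.2469])

print(round(two_group, 3))                    # compare with .801 in Table 18.4
print([round(r, 4) for r in three_group])     # compare with 0.8902 and 0.4450
print([round(s, 2) for s in shares])          # compare with 93.93 and 6.07
```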

    F values and their significance. These are calculated from a one-way ANOVA, with the grouping variable serving as the categorical independent variable. Each predictor, in turn, serves as the metric dependent variable in the ANOVA.

    Group means and group standard deviations. These are computed for each predictor for each group.

    Pooled within-group correlation matrix. The pooled within-group correlation matrix is computed by averaging the separate covariance matrices for all the groups.
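    As an illustration of the pooled within-group correlation, the following sketch computes one entry of the matrix (INCOME with TRAVEL) from the analysis sample in Table 18.2. Deviations are taken from each group's own mean and the cross-products are pooled over the groups before the correlation is formed. The helper names are hypothetical, and the data values are copied from Table 18.2.

```python
import math
from statistics import mean

# INCOME and TRAVEL for the 30 analysis-sample households (Table 18.2), keyed by VISIT group.
income = {
    1: [50.2, 70.3, 62.9, 48.5, 52.7, 75.0, 46.2, 57.0, 64.1, 68.1, 73.4, 71.9, 56.2, 49.3, 62.0],
    2: [32.1, 36.2, 43.2, 50.4, 44.1, 38.3, 55.0, 46.1, 35.0, 37.3, 41.8, 57.0, 33.4, 37.5, 41.3],
}
travel = {
    1: [5, 6, 7, 7, 6, 8, 5, 2, 7, 7, 6, 5, 1, 4, 5],
    2: [5, 4, 2, 5, 6, 6, 1, 3, 6, 2, 5, 8, 6, 3, 3],
}


def pooled_cross_products(x_by_group, y_by_group):
    """Sum over groups of cross-products of deviations from each group's own means."""
    total = 0.0
    for group in x_by_group:
        xs, ys = x_by_group[group], y_by_group[group]
        x_bar, y_bar = mean(xs), mean(ys)
        total += sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    return total


def pooled_within_correlation(x_by_group, y_by_group):
    """Correlation built from pooled within-group sums of squares and cross-products."""
    s_xy = pooled_cross_products(x_by_group, y_by_group)
    s_xx = pooled_cross_products(x_by_group, x_by_group)
    s_yy = pooled_cross_products(y_by_group, y_by_group)
    return s_xy / math.sqrt(s_xx * s_yy)


r_income_travel = pooled_within_correlation(income, travel)
print(round(r_income_travel, 3))  # compare with the INCOME/TRAVEL entry in Table 18.4
```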

    Standardized discriminant function coefficients. The standardized discriminant function coefficients are the discriminant function coefficients and are used as the multipliers when the variables have been standardized to a mean of 0 and a variance of 1.

    Structure correlations. Also referred to as discriminant loadings, the structure correlations represent the simple correlations between the predictors and the discriminant function.

    Total correlation matrix. If the cases are treated as if they were from a single sample and the correlations computed, a total correlation matrix is obtained.

    Wilks' λ. Sometimes also called the U statistic, Wilks' λ for each predictor is the ratio of the within-group sum of squares to the total sum of squares. Its value varies between 0 and 1. Large values of λ (near 1) indicate that group means do not seem to be different. Small values of λ (near 0) indicate that the group means seem to be different.
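    The univariate Wilks' λ and its F ratio can be computed directly from these definitions. The sketch below does so for INCOME in the analysis sample (values copied from Table 18.2). The F ratio uses the standard one-way ANOVA relation F = ((1 - λ) / λ) × (df2 / df1); the function name is hypothetical.

```python
from statistics import mean

# INCOME for the analysis sample (Table 18.2), keyed by VISIT group.
income = {
    1: [50.2, 70.3, 62.9, 48.5, 52.7, 75.0, 46.2, 57.0, 64.1, 68.1, 73.4, 71.9, 56.2, 49.3, 62.0],
    2: [32.1, 36.2, 43.2, 50.4, 44.1, 38.3, 55.0, 46.1, 35.0, 37.3, 41.8, 57.0, 33.4, 37.5, 41.3],
}


def univariate_wilks_and_f(values_by_group):
    """Return (Wilks' lambda, F ratio, df1, df2) for a single predictor."""
    all_values = [v for values in values_by_group.values() for v in values]
    grand_mean = mean(all_values)
    ss_total = sum((v - grand_mean) ** 2 for v in all_values)
    ss_within = 0.0
    for values in values_by_group.values():
        group_mean = mean(values)
        ss_within += sum((v - group_mean) ** 2 for v in values)
    df1 = len(values_by_group) - 1                 # groups - 1
    df2 = len(all_values) - len(values_by_group)   # cases - groups
    wilks = ss_within / ss_total
    f_ratio = ((1.0 - wilks) / wilks) * (df2 / df1)
    return wilks, f_ratio, df1, df2


wilks, f_ratio, df1, df2 = univariate_wilks_and_f(income)
print(round(wilks, 3), round(f_ratio, 3), df1, df2)  # compare with the INCOME row of Table 18.4
```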

    Conducting Discriminant Analysis (Fig. 18.1)

    1. Formulate the Problem

    2. Estimate the Discriminant Function Coefficients

    3. Determine the Significance of the Discriminant Function

    4. Interpret the Results

    5. Assess Validity of Discriminant Analysis

    Conducting Discriminant Analysis: Formulate the Problem

    Identify the objectives, the criterion variable, and the independent variables.

    The criterion variable must consist of two or more mutually exclusive and collectively exhaustive categories.

    The predictor variables should be selected based on a theoretical model or previous research, or the experience of the researcher.

    One part of the sample, called the estimation or analysis sample, is used for estimation of the discriminant function.

    The other part, called the holdout or validation sample, is reserved for validating the discriminant function.

    Often the distribution of the number of cases in the analysis and validation samples follows the distribution in the total sample.

    A Sample Problem

    Suppose we want to determine the salient characteristics of families that have already visited a vacation resort during the last two years.

    Data were obtained from a pretest sample of 42 households. Of these, 30 households were included in the analysis sample and the remaining 12 were part of the validation sample.

    Variables in Data Set

    DEPENDENT or CRITERION variable

    The households that visited a resort during the last two years are coded as 1; those that did not, as 2 (VISIT variable as criterion with two categories).

    INDEPENDENT or PREDICTOR variables

    Annual family income (INCOME)

    Attitude toward travel (TRAVEL)

    Importance attached to family vacation (VACATION)

    Household size (HSIZE)

    Age of the head of household (AGE)

    Information on Resort Visits: Analysis Sample (Table 18.2)

    Columns: No.; Resort Visit; Annual Family Income ($000); Attitude Toward Travel; Importance Attached to Family Vacation; Household Size; Age of Head of Household; Amount Spent on Family Vacation.

    No.  Visit  Income  Travel  Vacation  HSize  Age  Amount
      1    1     50.2     5        8        3     43   M (2)
      2    1     70.3     6        7        4     61   H (3)
      3    1     62.9     7        5        6     52   H (3)
      4    1     48.5     7        5        5     36   L (1)
      5    1     52.7     6        6        4     55   H (3)
      6    1     75.0     8        7        5     68   H (3)
      7    1     46.2     5        3        3     62   M (2)
      8    1     57.0     2        4        6     51   M (2)
      9    1     64.1     7        5        4     57   H (3)
     10    1     68.1     7        6        5     45   H (3)
     11    1     73.4     6        7        5     44   H (3)
     12    1     71.9     5        8        4     64   H (3)
     13    1     56.2     1        8        6     54   M (2)
     14    1     49.3     4        2        3     56   H (3)
     15    1     62.0     5        6        2     58   H (3)
     16    2     32.1     5        4        3     58   L (1)
     17    2     36.2     4        3        2     55   L (1)
     18    2     43.2     2        5        2     57   M (2)
     19    2     50.4     5        2        4     37   M (2)
     20    2     44.1     6        6        3     42   M (2)
     21    2     38.3     6        6        2     45   L (1)
     22    2     55.0     1        2        2     57   M (2)
     23    2     46.1     3        5        3     51   L (1)
     24    2     35.0     6        4        5     64   L (1)
     25    2     37.3     2        7        4     54   L (1)
     26    2     41.8     5        1        3     56   M (2)
     27    2     57.0     8        3        2     36   M (2)
     28    2     33.4     6        8        2     50   L (1)
     29    2     37.5     3        2        3     48   L (1)
     30    2     41.3     3        3        2     42   L (1)

    Information on Resort Visits: Holdout Sample (Table 18.3)

    Columns: No.; Resort Visit; Annual Family Income ($000); Attitude Toward Travel; Importance Attached to Family Vacation; Household Size; Age of Head of Household; Amount Spent on Family Vacation.

    No.  Visit  Income  Travel  Vacation  HSize  Age  Amount
      1    1     50.8     4        7        3     45   M (2)
      2    1     63.6     7        4        7     55   H (3)
      3    1     54.0     6        7        4     58   M (2)
      4    1     45.0     5        4        3     60   M (2)
      5    1     68.0     6        6        6     46   H (3)
      6    1     62.1     5        6        3     56   H (3)
      7    2     35.0     4        3        4     54   L (1)
      8    2     49.6     5        3        5     39   L (1)
      9    2     39.4     6        5        3     44   H (3)
     10    2     37.0     2        6        5     51   L (1)
     11    2     54.5     7        3        3     37   M (2)
     12    2     38.2     2        2        3     49   L (1)

    Conducting Discriminant Analysis: Estimate the Discriminant Function Coefficients

    The direct method involves estimating the discriminant function so that all the predictors are included simultaneously.

    In stepwise discriminant analysis, the predictor variables are entered sequentially, based on their ability to discriminate among groups.

    Results of Two-Group Discriminant Analysis (Table 18.4)

    Group Statistics

    visit  Statistic                          income    travel  vacation    hsize      age
    1      Mean                               60.520     5.400     5.800    4.333   53.733
           Std. Deviation                     9.8307    1.9198    1.8205   1.2344   8.7706
           Valid N (listwise), Unweighted         15        15        15       15       15
           Valid N (listwise), Weighted       15.000    15.000    15.000   15.000   15.000
    2      Mean                               41.913     4.333     4.067    2.800   50.133
           Std. Deviation                     7.5511    1.9518    2.0517    .9411   8.2710
           Valid N (listwise), Unweighted         15        15        15       15       15
           Valid N (listwise), Weighted       15.000    15.000    15.000   15.000   15.000
    Total  Mean                               51.217     4.867     4.933    3.567   51.933
           Std. Deviation                    12.7952    1.9780    2.0998   1.3309   8.5740
           Valid N (listwise), Unweighted         30        30        30       30       30
           Valid N (listwise), Weighted       30.000    30.000    30.000   30.000   30.000

    Pooled Within-Groups Matrices (Correlation)

               income   travel  vacation   hsize     age
    income      1.000     .197      .091    .089   -.014
    travel       .197    1.000      .084   -.017   -.197
    vacation     .091     .084     1.000    .070    .017
    hsize        .089    -.017      .070   1.000   -.043
    age         -.014    -.197      .017   -.043   1.000

    Tests of Equality of Group Means

               Wilks' Lambda        F   df1   df2   Sig.
    income              .453   33.796     1    28   .000
    travel              .925    2.277     1    28   .143
    vacation            .824    5.990     1    28   .021
    hsize               .657   14.636     1    28   .001
    age                 .954    1.338     1    28   .257

    Contd.

    Results of Two-Group Discriminant Analysis (Table 18.4 contd.)

    Eigenvalues

    Function   Eigenvalue   % of Variance   Cumulative %   Canonical Correlation
    1            1.786(a)           100.0          100.0                    .801

    a. First 1 canonical discriminant functions were used in the analysis.

    Wilks' Lambda

    Test of Function(s)   Wilks' Lambda   Chi-square   df   Sig.
    1                              .359       26.130    5   .000

    Standardized Canonical Discriminant Function Coefficients

               Function 1
    income           .743
    travel           .096
    vacation         .233
    hsize            .469
    age              .209

    Structure Matrix

               Function 1
    income           .822
    hsize            .541
    vacation         .346
    travel           .213
    age              .164

    Pooled within-groups correlations between discriminating variables and standardized canonical discriminant functions. Variables ordered by absolute size of correlation within function.

    Contd.

    Results of Two-Group Discriminant Analysis (Table 18.4 contd.)

    Canonical Discriminant Function Coefficients

                 Function 1
    income             .085
    travel             .050
    vacation           .120
    hsize              .427
    age                .025
    (Constant)       -7.975

    Unstandardized coefficients

    Functions at Group Centroids

    visit   Function 1
    1            1.291
    2           -1.291

    Unstandardized canonical discriminant functions evaluated at group means

    Contd.
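    To make the use of these coefficients concrete, the sketch below computes the discriminant score for two households from the analysis sample (Table 18.2) and assigns each to the group whose centroid is nearest. The coefficients are the rounded values printed above, so the scores are approximate; the function names are illustrative and not part of the SPSS output.

```python
# Rounded unstandardized coefficients and centroids as printed in Table 18.4.
COEFFICIENTS = {"income": 0.085, "travel": 0.050, "vacation": 0.120, "hsize": 0.427, "age": 0.025}
CONSTANT = -7.975
CENTROIDS = {1: 1.291, 2: -1.291}


def discriminant_score(case):
    """Constant plus the sum of coefficient * predictor value."""
    return CONSTANT + sum(COEFFICIENTS[name] * case[name] for name in COEFFICIENTS)


def nearest_centroid(score):
    """Group whose centroid lies closest to the score."""
    return min(CENTROIDS, key=lambda group: abs(score - CENTROIDS[group]))


# Households 1 and 2 of the analysis sample (both actually visited a resort, VISIT = 1).
household_1 = {"income": 50.2, "travel": 5, "vacation": 8, "hsize": 3, "age": 43}
household_2 = {"income": 70.3, "travel": 6, "vacation": 7, "hsize": 4, "age": 61}

score_1 = discriminant_score(household_1)
score_2 = discriminant_score(household_2)
print(round(score_1, 3), nearest_centroid(score_1))
print(round(score_2, 3), nearest_centroid(score_2))
```

    With these rounded coefficients, household 1 scores slightly below zero and falls on the non-visitor side, while household 2 falls clearly on the visitor side.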

    Results of Two-Group Discriminant Analysis (Table 18.4 contd.)

    Prior Probabilities for Groups

                     Cases Used in Analysis
    visit   Prior    Unweighted   Weighted
    1        .500            15     15.000
    2        .500            15     15.000
    Total   1.000            30     30.000

    Classification Function Coefficients

                  visit 1    visit 2
    income           .678       .459
    travel          1.509      1.381
    vacation         .938       .628
    hsize           3.322      2.218
    age              .832       .768
    (Constant)    -57.532    -36.936

    Fisher's linear discriminant functions

    Classification Results(a)

                            Predicted Group Membership
                    visit        1        2     Total
    Original Count  1           12        3        15
                    2            0       15        15
             %      1         80.0     20.0     100.0
                    2           .0    100.0     100.0

    a. 90.0% of original grouped cases correctly classified.

    Conducting Discriminant Analysis: Determine the Significance of Discriminant Function

    The null hypothesis that, in the population, the means of all discriminant functions in all groups are equal can be statistically tested.

    In SPSS this test is based on Wilks' λ. If several functions are tested simultaneously (as in the case of multiple discriminant analysis), the Wilks' λ statistic is the product of the univariate λ for each function. The significance level is estimated based on a chi-square transformation of the statistic.

    If the null hypothesis is rejected, indicating significant discrimination, one can proceed to interpret the results.
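    The product rule and the chi-square transformation can be sketched as follows. Each function's λ is taken as 1 / (1 + eigenvalue), and the chi-square uses the usual Bartlett approximation, -(n - 1 - (p + g) / 2) ln λ. That this is exactly the transformation SPSS applies is an assumption, although it reproduces the values reported in Tables 18.4 and 18.5 closely. The function names are hypothetical.

```python
import math


def wilks_lambda(eigenvalues):
    """Product of 1 / (1 + eigenvalue) over the functions being tested."""
    result = 1.0
    for eigenvalue in eigenvalues:
        result *= 1.0 / (1.0 + eigenvalue)
    return result


def bartlett_chi_square(lam, n_cases, n_predictors, n_groups):
    """Bartlett's chi-square approximation for Wilks' lambda."""
    return -(n_cases - 1 - (n_predictors + n_groups) / 2.0) * math.log(lam)


def degrees_of_freedom(n_predictors, n_groups, functions_removed=0):
    """(p - k) * (g - 1 - k) after k functions have been removed."""
    return (n_predictors - functions_removed) * (n_groups - 1 - functions_removed)


# Two-group example (Table 18.4): one function, 30 cases, 5 predictors.
lam_two = wilks_lambda([1.786])
chi_two = bartlett_chi_square(lam_two, 30, 5, 2)

# Three-group example (Table 18.5): both functions, then function 2 alone.
lam_all = wilks_lambda([3.8190, 0.2469])
chi_all = bartlett_chi_square(lam_all, 30, 5, 3)
lam_after_1 = wilks_lambda([0.2469])
chi_after_1 = bartlett_chi_square(lam_after_1, 30, 5, 3)

print(round(lam_two, 3), round(chi_two, 2), degrees_of_freedom(5, 2))
print(round(lam_all, 4), round(chi_all, 2), degrees_of_freedom(5, 3))
print(round(lam_after_1, 4), round(chi_after_1, 2), degrees_of_freedom(5, 3, 1))
```

    The significance levels themselves require the chi-square distribution function, which the sketch leaves out to stay within the standard library.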

    Conducting Discriminant Analysis: Interpret the Results

    The interpretation of the discriminant weights, or coefficients, is similar to that in multiple regression analysis.

    Given the multicollinearity in the predictor variables, there is no unambiguous measure of the relative importance of the predictors in discriminating between the groups.

    With this caveat in mind, we can obtain some idea of the relative importance of the variables by examining the absolute magnitude of the standardized discriminant function coefficients.

    Some idea of the relative importance of the predictors can also be obtained by examining the structure correlations, also called canonical loadings or discriminant loadings. These simple correlations between each predictor and the discriminant function represent the variance that the predictor shares with the function.

    Another aid to interpreting discriminant analysis results is to develop a characteristic profile for each group by describing each group in terms of the group means for the predictor variables.

    Conducting Discriminant Analysis: Assess Validity of Discriminant Analysis

    Many computer programs, such as SPSS, offer a leave-one-out cross-validation option.

    The discriminant weights, estimated by using the analysis sample, are multiplied by the values of the predictor variables in the holdout sample to generate discriminant scores for the cases in the holdout sample. The cases are then assigned to groups based on their discriminant scores and an appropriate decision rule. The hit ratio, or the percentage of cases correctly classified, can then be determined by summing the diagonal elements of the classification matrix and dividing by the total number of cases.

    It is helpful to compare the percentage of cases correctly classified by discriminant analysis to the percentage that would be obtained by chance. Classification accuracy achieved by discriminant analysis should be at least 25% greater than that obtained by chance.
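    The holdout procedure described above can be sketched for the two-group example. The sketch scores the 12 holdout households of Table 18.3 with the rounded unstandardized coefficients of Table 18.4 and applies a cutoff of zero, the midpoint between the two centroids (1.291 and -1.291). Two points are assumptions rather than statements from the chapter: the zero cutoff as the decision rule, and reading "25% greater than chance" as 1.25 times the chance rate.

```python
# Rounded unstandardized coefficients from Table 18.4 (income, travel, vacation, hsize, age).
COEFFICIENTS = (0.085, 0.050, 0.120, 0.427, 0.025)
CONSTANT = -7.975

# Holdout sample from Table 18.3: (visit, income, travel, vacation, hsize, age).
HOLDOUT = [
    (1, 50.8, 4, 7, 3, 45), (1, 63.6, 7, 4, 7, 55), (1, 54.0, 6, 7, 4, 58),
    (1, 45.0, 5, 4, 3, 60), (1, 68.0, 6, 6, 6, 46), (1, 62.1, 5, 6, 3, 56),
    (2, 35.0, 4, 3, 4, 54), (2, 49.6, 5, 3, 5, 39), (2, 39.4, 6, 5, 3, 44),
    (2, 37.0, 2, 6, 5, 51), (2, 54.5, 7, 3, 3, 37), (2, 38.2, 2, 2, 3, 49),
]


def predict_group(predictors):
    """Assign group 1 when the discriminant score is above the zero cutoff, else group 2."""
    score = CONSTANT + sum(b * x for b, x in zip(COEFFICIENTS, predictors))
    return 1 if score > 0 else 2


hits = sum(1 for case in HOLDOUT if predict_group(case[1:]) == case[0])
hit_ratio = hits / len(HOLDOUT)

chance_rate = 0.5                    # two groups of equal size
required_rate = 1.25 * chance_rate   # assumed reading of "25% greater than chance"

print(hits, round(hit_ratio, 4), hit_ratio >= required_rate)
```

    With these rounded coefficients the sketch classifies 10 of the 12 holdout households correctly; results computed with full-precision coefficients could differ for cases that score very close to the cutoff.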

    Results of Three-Group Discriminant Analysis (Table 18.5)

    Group Means

    AMOUNT     INCOME    TRAVEL   VACATION     HSIZE       AGE
    1        38.57000   4.50000    4.70000   3.10000  50.30000
    2        50.11000   4.00000    4.20000   3.40000  49.50000
    3        64.97000   6.10000    5.90000   4.20000  56.00000
    Total    51.21667   4.86667    4.93333   3.56667  51.93333

    Group Standard Deviations

    AMOUNT     INCOME    TRAVEL   VACATION     HSIZE       AGE
    1         5.29718   1.71594    1.88856   1.19722   8.09732
    2         6.00231   2.35702    2.48551   1.50555   9.25263
    3         8.61434   1.19722    1.66333   1.13529   7.60117
    Total    12.79523   1.97804    2.09981   1.33089   8.57395

    Pooled Within-Groups Correlation Matrix

                 INCOME     TRAVEL   VACATION      HSIZE       AGE
    INCOME      1.00000
    TRAVEL      0.05120    1.00000
    VACATION    0.30681    0.03588    1.00000
    HSIZE       0.38050    0.00474    0.22080    1.00000
    AGE        -0.20939   -0.34022   -0.01326   -0.02512   1.00000

    Wilks' λ (U-statistic) and univariate F ratio with 2 and 27 degrees of freedom

    Variable    Wilks' Lambda        F   Significance
    INCOME            0.26215    38.00         0.0000
    TRAVEL            0.78790    3.634         0.0400
    VACATION          0.88060    1.830         0.1797
    HSIZE             0.87411    1.944         0.1626
    AGE               0.88214    1.804         0.1840

    Canonical Discriminant Functions

    Function  Eigenvalue  % of Variance   Cum %  Canonical Correlation  :  After Function  Wilks' λ  Chi-square  df  Significance
                                                                        :        0           0.1664      44.831  10         0.001
       1*       3.8190        93.93       93.93         0.8902          :        1           0.8020       5.517   4         0.242
       2*       0.2469         6.07      100.00         0.4450          :

    * marks the two canonical discriminant functions remaining in the analysis.

    Standardized Canonical Discriminant Function Coefficients

                  FUNC 1     FUNC 2
    INCOME       1.04740   -0.42076
    TRAVEL       0.33991    0.76851
    VACATION    -0.14198    0.53354
    HSIZE       -0.16317    0.12932
    AGE          0.49474    0.52447

    Structure Matrix: pooled within-groups correlations between discriminating variables and canonical discriminant functions (variables ordered by size of correlation within function)

                  FUNC 1     FUNC 2
    INCOME       0.85556*  -0.27833
    HSIZE        0.19319*   0.07749
    VACATION     0.21935    0.58829*
    TRAVEL       0.14899    0.45362*
    AGE          0.16576    0.34079*

    Unstandardized canonical discriminant function coefficients

                         FUNC 1            FUNC 2
    INCOME            0.1542658    -0.6197148E-01
    TRAVEL            0.1867977         0.4223430
    VACATION     -0.6952264E-01         0.2612652
    HSIZE            -0.1265334         0.1002796
    AGE           0.5928055E-01     0.6284206E-01
    (constant)        -11.09442         -3.791600

    Canonical discriminant functions evaluated at group means (group centroids)

    Group     FUNC 1     FUNC 2
    1       -2.04100    0.41847
    2       -0.40479   -0.65867
    3        2.44578    0.24020

    Classification Results

                                  Predicted Group Membership
    Actual Group   No. of Cases        1         2         3
    Group 1             10             9         1         0
                                   90.0%     10.0%      0.0%
    Group 2             10             1         9         0
                                   10.0%     90.0%      0.0%
    Group 3             10             0         2         8
                                    0.0%     20.0%     80.0%

    Percent of grouped cases correctly classified: 86.67%

    Classification results for cases not selected for use in the analysis

                                  Predicted Group Membership
    Actual Group   No. of Cases        1         2         3
    Group 1              4             3         1         0
                                   75.0%     25.0%      0.0%
    Group 2              4             0         3         1
                                    0.0%     75.0%     25.0%
    Group 3              4             1         0         3
                                   25.0%      0.0%     75.0%

    Percent of grouped cases correctly classified: 75.00%
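    The percentages of correctly classified cases follow directly from the classification matrices: sum the diagonal and divide by the number of cases. A minimal sketch, using the counts from Table 18.5 (the function names are illustrative):

```python
def hit_ratio(matrix):
    """Share of cases on the diagonal of a classification (confusion) matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


def per_group_accuracy(matrix):
    """Share of each actual group (row) that was assigned to the correct group."""
    return [row[i] / sum(row) for i, row in enumerate(matrix)]


# Rows are actual groups, columns are predicted groups (Table 18.5).
analysis_sample = [[9, 1, 0], [1, 9, 0], [0, 2, 8]]
holdout_sample = [[3, 1, 0], [0, 3, 1], [1, 0, 3]]

print(round(100 * hit_ratio(analysis_sample), 2))   # 86.67
print(round(100 * hit_ratio(holdout_sample), 2))    # 75.0
print(per_group_accuracy(analysis_sample))          # [0.9, 0.9, 0.8]
```

    The lower hit ratio in the holdout sample is expected, since the analysis-sample figure is computed on the same cases that were used to estimate the functions.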

    Fig. 18.2: All-Groups Scattergram. Across: Function 1; Down: Function 2. Cases are plotted by group number (1, 2, 3); * indicates a group centroid.

    Fig. 18.3: Territorial Map. Across: Function 1; Down: Function 2. Territories are marked by group number (1, 2, 3); * indicates a group centroid.

    Stepwise Discriminant Analysis

    Stepwise discriminant analysis is analogous to stepwise multiple regression (see Chapter 17) in that the predictors are entered sequentially based on their ability to discriminate between the groups.

    An F ratio is calculated for each predictor by conducting a univariate analysis of variance in which the groups are treated as the categorical variable and the predictor as the criterion variable.

    The predictor with the highest F ratio is the first to be selected for inclusion in the discriminant function, if it meets certain significance and tolerance criteria.

    A second predictor is added based on the highest adjusted or partial F ratio, taking into account the predictor already selected.

    Each predictor selected is tested for retention based on its association with other predictors selected.

    The process of selection and retention is continued until all predictors meeting the significance criteria for inclusion and retention have been entered in the discriminant function.

    The selection of the stepwise procedure is based on the optimizing criterion adopted. The Mahalanobis procedure is based on maximizing a generalized measure of the distance between the two closest groups.

    The order in which the variables were selected also indicates their importance in discriminating between the groups.
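    The first step of the stepwise procedure can be sketched with the univariate F ratios and significance levels reported for the two-group example in Table 18.4. This is only the entry step: the 0.05 entry threshold is an assumed value, the tolerance check is omitted, and later steps would need partial F ratios recomputed from the data, which the sketch does not attempt.

```python
# Univariate (F ratio, significance) per predictor, as reported in Table 18.4.
UNIVARIATE_TESTS = {
    "income": (33.796, 0.000),
    "travel": (2.277, 0.143),
    "vacation": (5.990, 0.021),
    "hsize": (14.636, 0.001),
    "age": (1.338, 0.257),
}


def first_predictor_to_enter(tests, alpha_to_enter=0.05):
    """Predictor with the highest F ratio among those meeting the entry criterion."""
    eligible = {name: f for name, (f, sig) in tests.items() if sig <= alpha_to_enter}
    if not eligible:
        return None  # no predictor qualifies, so no function is built
    return max(eligible, key=eligible.get)


def rank_by_univariate_f(tests):
    """Predictors sorted by univariate F ratio, largest first (not the final stepwise order)."""
    return sorted(tests, key=lambda name: tests[name][0], reverse=True)


print(first_predictor_to_enter(UNIVARIATE_TESTS))   # income
print(rank_by_univariate_f(UNIVARIATE_TESTS))
```

    The univariate ranking is only a starting point; because later entries depend on partial F ratios, the final order of entry can differ from it.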

    SPSS Windows

    The DISCRIMINANT program performs both two-group and multiple discriminant analysis. To select this procedure using SPSS for Windows click:

    Analyze>Classify>Discriminant