Multivariate Research Assignment


    Business Research Methods

    Assignment

    On

    Multivariate Analysis

    Submitted By:

    Tariq Mahmood Asghar

    Roll No. # 77

    MBA (A, B1)


    DATA ANALYSIS

    The terms "statistics" and "data analysis" mean the same thing -- the study of how

    we describe, combine, and make inferences from numbers. A lot of people are

    scared of numbers (quantiphobia), but statistics has got less to do with numbers,

    and more to do with rules for arranging them. It even lets you create some of those

    rules yourself, so instead of looking at it like a lot of memorization, it's best to see it

    as an extension of the research mentality, something researchers do (crunch

    numbers) to obtain complete and total power over the numbers. After awhile, the

    principles behind the computations become clear, and there's no better way to

    accomplish this than by understanding the research purpose of statistics.

    MULTIVARIATE DATA ANALYSIS

As the name indicates, multivariate analysis comprises a set of techniques dedicated to the analysis of data sets with three or more variables; it is the simultaneous analysis of three or more variables. It is frequently done to refine a bivariate analysis, taking into account the possible influence of a third variable on the original bivariate relationship. Multivariate analysis is also used to test the joint effects of two or more variables upon a dependent variable. In some instances, the association between two variables is assessed with a multivariate rather than a bivariate statistical technique; this situation arises when two or more variables are needed to express the functional form of the association.

Multivariate Data Analysis refers to any statistical technique used to analyze data that arise from more than one variable. This essentially models reality, where every situation, product, or decision involves more than a single variable. The information age has produced masses of data in every field, yet despite the quantity of data available, obtaining a clear picture of what is going on and making intelligent decisions remains a challenge. When the available information is stored in database tables containing rows and columns, Multivariate Analysis can be used to process that information in a meaningful fashion.

Multivariate analysis methods are typically used for:

Consumer and market research

Quality control and quality assurance across a range of industries such as food and beverage, pharmaceuticals, chemicals, energy, telecommunications, etc.

Process optimization and process control

Research and development


With Multivariate Analysis you can:

Obtain a summary or an overview of a table. This analysis is often called Principal Components Analysis or Factor Analysis. In the overview, it is possible to identify the dominant patterns in the data, such as groups, outliers, trends, and so on. The patterns are displayed as two plots. (A small sketch of this use follows the list.)

Analyze groups in the table, how these groups differ, and to which group individual table rows belong. This type of analysis is called Classification and Discriminant Analysis.

Find relationships between columns in data tables, for instance relationships between process operation conditions and product quality. The objective is to use one set of variables (columns) to predict another, for the purpose of optimization, and to find out which columns are important in the relationship.
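
As an illustration of the first use, here is a minimal sketch of an "overview" with Principal Components Analysis. The data table, its size, and the use of scikit-learn are assumptions made for this example, not part of the assignment.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

# Hypothetical data table: 50 rows (observations) by 4 columns (variables).
rng = np.random.default_rng(0)
table = rng.normal(size=(50, 4))
table[:, 3] = 0.8 * table[:, 0] + rng.normal(scale=0.2, size=50)  # one correlated column

# Standardize the columns, then project the rows onto the first two principal components.
scores = PCA(n_components=2).fit_transform(StandardScaler().fit_transform(table))

# Each row now has two "score" coordinates; plotting them gives the overview in which
# groups, outliers, and trends in the original table become visible.
print(scores[:5])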

    Tools for Multivariate Analysis

The Multiple Correlation Coefficient:

A correlation is a way of measuring the linear association between two variables. But what is the correlation if Y, the dependent variable, is being predicted from more than one independent variable? The answer is that the multiple correlation is the correlation between Y and the best linear combination of the independent variables, and that best linear combination is given by the multiple regression slopes.

In bivariate regression, the predictor X and the predicted value Y' are perfectly correlated, because Y' is simply a linear function of X. In multiple correlation, it is the best linear combination of the predictors X1 and X2 that is perfectly correlated with the predicted value Y'. You already know how to establish that "best possible predictor" using the multiple regression procedures described above, and you already know how to solve for a correlation coefficient. In this section, we will show how these principles are used to solve for the multiple correlation coefficient.

The multiple correlation coefficient varies in strength between .00 and 1.00, and it does not specify direction. If the value of the dependent variable does not change in any systematic way as any of the independent variables varies in either direction, then there is no relationship and the coefficient is .00. Thus, if the plane describing the points is flat, indicating no systematic change in the dependent variable as we increase the values of the independent variables, the coefficient is 0.


In formula form, the multiple correlation between the X's and Y is equal to the covariance of Y and the Y predicted from the best linear combination of the X's, divided by the product of their respective standard deviations.
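
A small numeric sketch of this formula, with made-up data: regress Y on the X's, form the predicted values, and correlate them with Y.

import numpy as np

# Made-up data: Y depends on two predictors plus noise.
rng = np.random.default_rng(1)
n = 100
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
Y = 2.0 * X1 - 1.0 * X2 + rng.normal(size=n)

# Least-squares fit of Y on [1, X1, X2] gives the multiple regression slopes.
X = np.column_stack([np.ones(n), X1, X2])
b, *_ = np.linalg.lstsq(X, Y, rcond=None)
Yhat = X @ b  # the best linear combination of the X's

# R = cov(Y, Yhat) / (sd(Y) * sd(Yhat)), i.e. the correlation between Y and the predicted Y.
R = np.cov(Y, Yhat)[0, 1] / (Y.std(ddof=1) * Yhat.std(ddof=1))
print(round(R, 3))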

Multiple Regression (MR):

    Multiple Regression is a "general linear model" with a wide range of applications. It

    is basically an extension of the bivariate correlation and simple regression analysis.

The primary uses of MR are as follows:

Prediction of a continuous Y with several continuous X variables: Unlike ordinary bivariate regression, MR allows the use of an entire set of variables to predict another variable.

Use of categorical variables in prediction: Through the technique of dummy coding, categorical variables (such as marital status or treatment group) can be used in addition to continuous variables. (A brief sketch of dummy coding follows the list.)

Calculation of the unequal n ANOVA problem: Disproportional cell sizes in any factorial ANOVA design produce a correlation among the independent variables. MR estimates effects and interactions in this situation through the use of dummy codes.

Modeling nonlinear relationships between Y and a set of X variables: By adding "polynomial" terms (e.g., quadratic or cubic trends) to the equation, relationships that do not meet the linearity assumption can be analyzed.
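
Here is a minimal sketch of dummy coding, assuming an invented marital-status variable alongside a continuous predictor; the variable names and data are illustrative only.

import numpy as np

# Made-up data: one continuous predictor and one three-category predictor.
rng = np.random.default_rng(2)
n = 60
income = rng.normal(50, 10, size=n)
group = rng.choice(["single", "married", "divorced"], size=n)

# Dummy coding: k categories become k - 1 indicator columns ("single" is the reference).
d_married = (group == "married").astype(float)
d_divorced = (group == "divorced").astype(float)

Y = 0.3 * income + 5.0 * d_married + rng.normal(size=n)

# The dummy columns enter the regression exactly like continuous predictors.
X = np.column_stack([np.ones(n), income, d_married, d_divorced])
b, *_ = np.linalg.lstsq(X, Y, rcond=None)
print(b)  # intercept, slope for income, and the two group effects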

    Multiple linear regression analysis (MLR):

In MLR, several IVs (which are assumed to be fixed or, equivalently, measured without error) are used to predict one DV with a least-squares approach. If the IVs are orthogonal, the problem reduces to a set of univariate regressions. When the IVs are correlated, their importance is estimated from the partial coefficients of correlation. An important problem arises when one of the IVs can be predicted from the other variables, because the computations required by MLR can no longer be performed: this is called multicollinearity.
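
A tiny sketch of the multicollinearity problem, with invented data in which one IV is an exact linear combination of the others:

import numpy as np

# Made-up predictors: X3 is an exact linear combination of X1 and X2.
rng = np.random.default_rng(3)
n = 50
X1 = rng.normal(size=n)
X2 = rng.normal(size=n)
X3 = X1 + X2
X = np.column_stack([np.ones(n), X1, X2, X3])

# The design matrix loses a column of information, so X'X is (numerically) singular
# and the usual least-squares computation breaks down.
print(np.linalg.matrix_rank(X))  # 3 rather than 4: one column is redundant
print(np.linalg.cond(X.T @ X))   # an enormous condition number signals multicollinearity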

    Non-Linear Regression:

Ordinary multiple regression assumes that each bivariate relationship between X and Y is linear, and that the relationship between Y and Y' is also linear. How can we examine how closely we have met this linearity assumption? We could, of course, always inspect the separate scatter plots of Y against each X, but that does not address the second part of the linearity assumption.


We can determine the conformity of our data to the MR linearity assumption by direct examination of the residual scores. Recall that a residual or "error" score is equal to (Y - Y') and that we would expect it to be approximately zero once the linear regression is determined. Thus, a plot of residual scores against Y' that shows a peculiar pattern suggests a violation of linearity; two of the most common types of nonlinear relationship can be detected in this way.

    These cases can be modeled through polynomial regression. Polynomial regression

    simply adds terms to the original equation to account for nonlinear relationships.

    Squaring the original variable accounts for a quadratic trend, a cubed term accounts

    for the cubic trend, and so on. After the linear component, each term adds another

    "bend" in the prediction line. As with traditional MR, polynomial regression can be

    interpreted by looking at R Squared and changes in R Squared.
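
A short sketch of the idea, with made-up data containing a quadratic trend: fit the linear equation, add the squared term, and compare R squared.

import numpy as np

def r_squared(X, Y):
    # Least-squares fit and the proportion of variance in Y accounted for.
    b, *_ = np.linalg.lstsq(X, Y, rcond=None)
    resid = Y - X @ b
    return 1.0 - resid.var() / Y.var()

# Made-up data with a true quadratic relationship.
rng = np.random.default_rng(4)
n = 80
x = rng.uniform(-3, 3, size=n)
Y = 1.0 + 0.5 * x + 2.0 * x**2 + rng.normal(size=n)

linear = np.column_stack([np.ones(n), x])
quadratic = np.column_stack([np.ones(n), x, x**2])  # the lower-order (linear) term stays in

print(round(r_squared(linear, Y), 3))     # modest R squared from the linear model
print(round(r_squared(quadratic, Y), 3))  # the change in R squared reveals the quadratic trend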

A couple of notes of caution: in order to look at any higher-order effect (such as the cubic), all lower-order effects (the quadratic and the linear) must be included in the equation; a degree of freedom is lost for each additional term added to the equation; and lastly, the N:k ratio changes with each added term.

Recall that trend tests in ANOVA were accomplished with subjects grouped into discrete categories. With polynomial regression, the subjects are not grouped; rather, each individual has a unique score on a continuous variable. Therefore, MR trend information is much more complete than the information available from a typical ANOVA.

    Multivariate analysis of variance (MANOVA):

Multivariate analysis of variance (MANOVA) is a generalized form of analysis of variance (ANOVA) that covers cases where there is more than one (correlated) dependent variable and where the dependent variables cannot simply be combined. As well as identifying whether changes in the independent variables have a significant effect on the dependent variables, the technique also seeks to


    identify the interactions among the independent variables and the association

    among dependent variables, if any.

    MANOVA Procedure

MANOVA procedures are multivariate significance-test analogues of various univariate ANOVA experimental designs. MANOVA, like its univariate counterparts, typically involves random assignment of participants to levels of one or more nominal independent variables; however, all participants are measured on several continuous dependent variables. There are three basic variations of MANOVA:

Hotelling's T: This is the MANOVA analogue of the two-group t-test situation; in other words, one dichotomous independent variable and multiple dependent variables.

One-Way MANOVA: This is the MANOVA analogue of the one-way F situation; in other words, one multi-level nominal independent variable and multiple dependent variables.

Factorial MANOVA: This is the MANOVA analogue of the factorial ANOVA design; in other words, multiple nominal independent variables and multiple dependent variables.

While all of the above MANOVA variations are used in somewhat different applications, they all have one feature in common: they form linear combinations of the dependent variables that best discriminate among the groups in the particular experimental design. In other words, MANOVA is a test of the significance of group differences in some m-dimensional space, where each dimension is defined by a linear combination of the original set of dependent variables. This relationship will be represented for each design in the following sections.
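
As a concrete sketch of the simplest variation, Hotelling's T-squared for two groups measured on two dependent variables can be computed directly from the group means and the pooled covariance matrix; the data below are invented for illustration.

import numpy as np

# Made-up data: two groups, each measured on two correlated dependent variables.
rng = np.random.default_rng(5)
g1 = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=30)
g2 = rng.multivariate_normal([0.6, 0.3], [[1.0, 0.5], [0.5, 1.0]], size=30)

n1, n2 = len(g1), len(g2)
diff = g1.mean(axis=0) - g2.mean(axis=0)

# Pooled within-group covariance matrix of the dependent variables.
S_pooled = ((n1 - 1) * np.cov(g1, rowvar=False) +
            (n2 - 1) * np.cov(g2, rowvar=False)) / (n1 + n2 - 2)

# Hotelling's T-squared: the squared group difference measured in the metric of S_pooled,
# i.e. along the linear combination of the DVs that best separates the two groups.
T2 = (n1 * n2) / (n1 + n2) * diff @ np.linalg.solve(S_pooled, diff)
print(round(T2, 2))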