MB0024 - Set 2


  • 8/14/2019 MB0024 - Set 2

    1/15

    STATISTICS FOR MANAGEMENT

    MB0024

    SET 2

    MBA 1 SEM

    Name Mohammed Roohul Ameen

    Roll Number

    Learning Center SMU Riyadh ( 02543)

    Subject Statistics For Management

    Date of Submission 15th August 2009

    Assignment Number MB0024


    Mohammed Roohul Ameen Roll Number: Assignment MBA 1 st Semester Subject: MB0024



    1. What do you mean by sample survey? What are the different sampling methods? Briefly describe them.

    Sampling is that part of statistical practice concerned with the selection of individual observations intended to yield some knowledge about a population of concern, especially for the purposes of statistical inference. Each observation measures one or more properties (weight, location, etc.) of an observable entity enumerated to distinguish objects or individuals. Survey weights often need to be applied to the data to adjust for the sample design. Results from probability theory and statistical theory are employed to guide practice. In business, sampling is widely used for gathering information about a population.

    Types of Sampling
    Simple random sampling
    Systematic sampling
    Stratified sampling
    Probability proportional to size sampling
    Cluster sampling
    Matched random sampling
    Quota sampling
    Mechanical sampling
    Convenience sampling
    Line-intercept sampling
    Panel sampling
    Event Sampling Methodology

    Simple random sampling: In a simple random sample ('SRS') of a given size, all such subsets of the frame are given an equal probability. Each element of the frame thus has an equal probability of selection: the frame is not subdivided or partitioned. Furthermore, any given pair of elements has the same chance of selection as any other such pair (and similarly for triples, and so on). This minimizes bias and simplifies analysis of results. In particular, the variance between individual results within the sample is a good indicator of variance in the overall population, which makes it relatively easy to estimate the accuracy of results.

    Systematic sampling relies on arranging the target population according to some ordering scheme and then selecting elements at regular intervals through that ordered list. Systematic sampling involves a random start and then proceeds with the selection of every k-th element from then onwards. In this case, k = (population size / sample size). It is important that the starting point is not automatically the first in the list, but is instead randomly chosen from within the first to the k-th element in the list. A simple example would be to select every 10th name from the telephone directory (an 'every 10th' sample, also referred to as 'sampling with a skip of 10').
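The random start and fixed skip described above can be sketched in a few lines of Python. This is only an illustration: the function name `systematic_sample` and the list of 100 made-up names are assumptions, not part of the assignment.

```python
import random

def systematic_sample(frame, sample_size, rng=None):
    """Pick every k-th element of `frame` after a random start.

    k = population size // sample size, as in the description above.
    """
    rng = rng or random.Random()
    if not 0 < sample_size <= len(frame):
        raise ValueError("sample_size must be between 1 and len(frame)")
    k = len(frame) // sample_size   # sampling interval (the "skip")
    start = rng.randrange(k)        # random start within the first k elements
    return [frame[start + i * k] for i in range(sample_size)]

# Hypothetical directory of 100 names; an 'every 10th' sample of size 10.
names = [f"name_{i:03d}" for i in range(100)]
sample = systematic_sample(names, 10, random.Random(42))
print(sample)
```

Passing a seeded `random.Random` makes the draw repeatable, which is convenient when a sample has to be audited later.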


    Stratified sampling: Where the population embraces a number of distinct categories, the frame can be organized by these categories into separate "strata." Each stratum is then sampled as an independent sub-population, out of which individual elements can be randomly selected.
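A minimal sketch of stratified sampling follows. It assumes proportional allocation (the same sampling fraction in every stratum), which the text does not prescribe; the employee list and the helper name `stratified_sample` are hypothetical.

```python
import random
from collections import defaultdict

def stratified_sample(units, stratum_of, fraction, rng=None):
    """Split `units` into strata and draw a simple random sample from each.

    Assumption: proportional allocation, i.e. the same `fraction`
    is sampled from every stratum (at least one unit per stratum).
    """
    rng = rng or random.Random()
    strata = defaultdict(list)
    for unit in units:
        strata[stratum_of(unit)].append(unit)
    sample = {}
    for label, members in strata.items():
        n = max(1, round(fraction * len(members)))
        sample[label] = rng.sample(members, n)  # independent draw per stratum
    return sample

# Hypothetical frame: 50 employees, every fifth one a manager.
employees = [("E%03d" % i, "manager" if i % 5 == 0 else "staff") for i in range(50)]
picked = stratified_sample(employees, lambda e: e[1], 0.2, random.Random(1))
for label, members in picked.items():
    print(label, len(members))
```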

    Probability proportional to size sampling: In some cases the sample designer has access to an "auxiliary variable" or "size measure", believed to be correlated to the variable of interest, for each element in the population. This data can be used to improve accuracy in sample design. One option is to use the auxiliary variable as a basis for stratification, as discussed above.

    Cluster sampling: Sometimes it is cheaper to 'cluster' the sample in some way, e.g. by selecting respondents from certain areas only, or certain time-periods only. (Nearly all samples are in some sense 'clustered' in time, although this is rarely taken into account in the analysis.)

    Cluster sampling is an example of 'two-stage sampling' or 'multistage sampling': in the first stage a sample of areas is chosen; in the second stage a sample of respondents within those areas is selected.

    Matched random sampling: A method of assigning participants to groups in which pairs of participants are first matched on some characteristic and then individually assigned randomly to groups.

    Matched random sampling arises in the following contexts:

    1. Two samples in which the members are clearly paired, or are matched explicitly by the researcher. For example, IQ measurements on pairs of identical twins.

    2. Those samples in which the same attribute, or variable, is measured twice on each subject, under different circumstances. These are commonly called repeated measures. Examples include the times of a group of athletes for 1500m before and after a week of special training, and the milk yields of cows before and after being fed a particular diet.

    In quota sampling, the population is first segmented into mutually exclusive sub-groups, just as in stratified sampling. Then judgment is used to select the subjects or units from each segment based on a specified proportion. For example, an interviewer may be told to sample 200 females and 300 males between the ages of 45 and 60.

    It is this second step which makes the technique one of non-probability sampling. In quota sampling the selection of the sample is non-random. For example, interviewers might be tempted to interview those who look most helpful. The problem is that these samples may be biased because not everyone gets a chance of selection. This non-random element is its greatest weakness, and quota versus probability has been a matter of controversy for many years.


    Mechanical sampling is typically used in sampling solids, liquids and gases, using devices such as grabs, scoops, thief probes, the COLIWASA and riffle splitter. Care is needed in ensuring that the sample is representative of the frame. Much work in the theory and practice of mechanical sampling was developed by Pierre Gy and Jan Visman.

    Convenience sampling (sometimes known as grab or opportunity sampling) is a type of nonprobability sampling which involves the sample being drawn from that part of the population which is close to hand. That is, a sample population is selected because it is readily available and convenient. The researcher using such a sample cannot scientifically make generalizations about the total population from this sample because it would not be representative enough.

    Panel sampling is the method of first selecting a group of participants through a random sampling method and then asking that group for the same information again several times over a period of time. Therefore, each participant is given the same survey or interview at two or more time points; each period of data collection is called a "wave". This sampling methodology is often chosen for large-scale or nation-wide studies in order to gauge changes in the population with regard to any number of variables, from chronic illness to job stress to weekly food expenditures.

    Event Sampling Methodology (ESM) is a new form of sampling method that allows researchers to study ongoing experiences and events that vary across and within days in their naturally occurring environment. Because of the frequent sampling of events inherent in ESM, it enables researchers to measure the typology of activity and detect the temporal and dynamic fluctuations of work experiences.


    2. What is the difference between correlation and regression? What do you understand by rank correlation? When do we use rank correlation and when do we use the Pearsonian correlation coefficient? Fit a linear regression line to the following data:

    X   12   15   18   20   27   34   28   48
    Y  123  150  158  170  180  184  176  130

    Correlation: In statistics, correlation (often measured as a correlation coefficient, ρ) indicates the strength and direction of a linear relationship between two random variables. That is in contrast with the usage of the term in colloquial speech, which denotes any relationship, not necessarily linear. In general statistical usage, correlation or co-relation refers to the departure of two random variables from independence. In this broad sense there are several coefficients, measuring the degree of correlation, adapted to the nature of the data.

    Correlation analysis deals with:
    1) Measuring the relationship between variables.
    2) Testing the relationship for its significance.
    3) Giving a confidence interval for the population correlation measure.

    Regression: In statistics, linear regression refers to any approach to modeling the relationship between one or more variables denoted y and one or more variables denoted X, such that the model depends linearly on the unknown parameters to be estimated from the data. Such a model is called a "linear model." Most commonly, linear regression refers to a model in which the conditional mean of y given the value of X is an affine function of X. Less commonly, linear regression could refer to a model in which the median, or some other quantile of the conditional distribution of y given X, is expressed as a linear function of X. Like all forms of regression analysis, linear regression focuses on the conditional probability distribution of y given X, rather than on the joint probability distribution of y and X, which is the domain of multivariate analysis.

    Difference between correlation and regression

    In probability theory and statistics, correlation (often measured as a correlation coefficient) indicates the strength and direction of a linear relationship between two random variables. In general statistical usage, correlation or co-relation refers to the departure of two variables from independence. In this broad sense there are several coefficients, measuring the degree of correlation, adapted to the nature of the data.

    A number of different coefficients are used for different situations. The best known is the Pearson product-moment correlation coefficient, which is obtained by dividing the covariance of the two variables by the product of their standard deviations. Despite its name, it was first introduced by Francis Galton.
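The definition just given (covariance divided by the product of the standard deviations) can be checked numerically. The sketch below is illustrative only: the helper name `pearson_r` is an assumption, and the data are the X and Y values from the question.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson product-moment correlation: cov(x, y) / (sd(x) * sd(y))."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    sd_x = sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    sd_y = sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (sd_x * sd_y)

# Data from the question.
x = [12, 15, 18, 20, 27, 34, 28, 48]
y = [123, 150, 158, 170, 180, 184, 176, 130]
r = pearson_r(x, y)
print(round(r, 4))
```

Population (divide-by-n) moments are used throughout; dividing by n - 1 instead would give the same coefficient because the factor cancels.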


    Linear regression is a form of regression analysis in which observational data are modeled by a least squares function which is a linear combination of the model parameters and depends on one or more independent variables. In simple linear regression the model function represents a straight line. The results of data fitting are subject to statistical analysis.

    Spearman's rank correlation coefficient or Spearman's rho, named after Charles Spearman and often denoted by the Greek letter ρ (rho) or as r_s, is a nonparametric measure of correlation; that is, it assesses how well an arbitrary monotonic function could describe the relationship between two variables, without making any other assumptions about the particular nature of the relationship between the variables. Certain other measures of correlation are parametric in the sense of being based on possible relationships of a parameterized form, such as a linear relationship.

    In principle, ρ is simply a special case of the Pearson product-moment coefficient in which two sets of data X_i and Y_i are converted to rankings x_i and y_i before calculating the coefficient. In practice, however, a simpler procedure is normally used to calculate ρ. The raw scores are converted to ranks, and the differences d_i between the ranks of each observation on the two variables are calculated.

    If there are no tied ranks, then ρ is given by:

    \[ \rho = 1 - \frac{6 \sum_{i=1}^{n} d_i^2}{n(n^2 - 1)} \]

    Where:

    d_i = x_i − y_i = the difference between the ranks of corresponding values X_i and Y_i, and
    n = the number of values in each data set (same for both sets).

    If tied ranks exist, the classic Pearson's correlation coefficient between ranks has to be used instead of this formula.

    Each of the equal values is assigned the same rank, which is the average of their positions in the ascending order of the values.
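The ranking procedure, including the averaging of tied ranks, can be sketched as follows. This is a rough illustration rather than a reference implementation: the function names are assumptions, the shortcut formula 1 − 6Σd²/(n(n² − 1)) is applied only when there are no ties, and the Pearson coefficient of the ranks is used otherwise, as stated above.

```python
from math import sqrt

def average_ranks(values):
    """Ascending ranks (1-based); tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Pearson correlation, used on the ranks when ties are present."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    sxx = sum((a - mx) ** 2 for a in xs)
    syy = sum((b - my) ** 2 for b in ys)
    return sxy / sqrt(sxx * syy)

def spearman_rho(xs, ys):
    rx, ry = average_ranks(xs), average_ranks(ys)
    n = len(xs)
    if len(set(xs)) == n and len(set(ys)) == n:
        # No ties: shortcut formula based on rank differences d_i.
        d2 = sum((a - b) ** 2 for a, b in zip(rx, ry))
        return 1 - 6 * d2 / (n * (n * n - 1))
    return pearson_r(rx, ry)

# Data from the question (no ties, so the shortcut formula applies).
x = [12, 15, 18, 20, 27, 34, 28, 48]
y = [123, 150, 158, 170, 180, 184, 176, 130]
rho = spearman_rho(x, y)
print(round(rho, 4))
```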


    Conditions under which the probable error (P.E.) can be used:
    1. Samples should be drawn from a normal population.
    2. The value of r must be determined from sample values.
    3. Samples must have been selected at random.

    X   12   15   18   20   27   34   28   48
    Y  123  150  158  170  180  184  176  130

    The linear regression line for the above data is:

    Total numbers (n): 8
    Slope (b): 0.16701
    Y-intercept (a): 154.65795
    Regression equation: y = a + bx = 154.66 + 0.17x

    Suppose we want to know the approximate y value for x = 20. We can substitute the value in the above equation, using the rounded slope 0.17:

    y = 154.65795 + 0.17 × 20 = 158.05795
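The slope and intercept quoted above can be reproduced with the usual least-squares formulas, b = Σ(x − x̄)(y − ȳ) / Σ(x − x̄)² and a = ȳ − b·x̄. The sketch below uses the data from the question; the function name `fit_line` is an assumption made for illustration.

```python
def fit_line(xs, ys):
    """Least-squares fit of y = a + b*x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    b = sxy / sxx              # slope
    a = mean_y - b * mean_x    # intercept
    return a, b

# Data from the question.
x = [12, 15, 18, 20, 27, 34, 28, 48]
y = [123, 150, 158, 170, 180, 184, 176, 130]
a, b = fit_line(x, y)
y_at_20 = a + b * 20
print(f"a = {a:.5f}, b = {b:.5f}, y(20) = {y_at_20:.2f}")
```

With the unrounded slope the prediction at x = 20 comes out at about 158.00; the figure 158.05795 results from rounding the slope to 0.17 before substituting.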


    3. What do you mean by business forecasting? What are the different methods of business forecasting? Describe the effectiveness of time-series analysis as a mode of business forecasting. Describe the method of moving averages.

    Business forecasting has always been one component of running an enterprise. However, forecasting traditionally was based less on concrete and comprehensive data than on face-to-face meetings and common sense. In recent years, business forecasting has developed into a much more scientific endeavor, with a host of theories, methods, and techniques designed for forecasting certain types of data. The development of information technologies and the Internet propelled this development into overdrive, as companies not only adopted such technologies into their business practices, but into forecasting schemes as well.

    Business forecasting involves a wide range of tools, including simple electronic spreadsheets, enterprise resource planning (ERP) and electronic data interchange (EDI) networks, advanced supply chain management systems, and other Web-enabled technologies. The practice attempts to pinpoint key factors in business production and extrapolate from given data sets to produce accurate projections for future costs, revenues, and opportunities. This normally is done with an eye toward adjusting current and near-future business practices to take maximum advantage of expectations.

    Time Series Analysis is also used for business forecasting. Forecasting through time series analysis is possible only when business data for several years are available and reflect a definite trend and seasonal variation.

    Extrapolation is the simplest method of business forecasting. By extrapolation, a businessman estimates the likely trend in demand for his goods and their future price trends. The accuracy of extrapolation depends on two factors: i) knowledge about the fluctuations of the figures, and ii) knowledge about the course of events relating to the problem under consideration.

    Regression Analysis: The regression approach offers many valuable contributions to the solution of the forecasting problem. It is the means by which we select, from among the many possible relationships between variables in a complex economy, those which will be useful for forecasting. A regression relationship may involve one predicted (dependent) variable and one independent variable (simple regression), or it may involve relationships between the variable to be forecast and several independent variables (multiple regression). Statistical techniques to estimate the regression equations are often fairly complex and time-consuming, but there are many computer programs now available that estimate simple and multiple regressions quickly.

    Modern Econometric Methods: Econometric techniques, which originated in the eighteenth century, have recently gained in popularity for forecasting. The term econometrics refers to the application of mathematical economic theory and statistical procedures to economic data in order to verify economic theorems. Models take the form of a set of simultaneous equations. The values of the constants in such equations are supplied by a study of statistical time series.

    Exponential Smoothing Method: This method is often regarded as the best method of business forecasting compared to other methods. Exponential smoothing is a special kind of weighted average and is found extremely useful in short-term forecasting of inventories and sales.
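As a rough illustration of the "special kind of weighted average", simple exponential smoothing updates each smoothed value as s_t = α·x_t + (1 − α)·s_{t−1}, so older observations receive geometrically decreasing weights. The sales figures, the smoothing constant α = 0.5 and the choice of the first observation as the starting value are all assumptions made for this sketch.

```python
def exponential_smoothing(series, alpha):
    """Simple exponential smoothing: s_t = alpha * x_t + (1 - alpha) * s_(t-1)."""
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = [float(series[0])]   # assumption: start from the first observation
    for value in series[1:]:
        smoothed.append(alpha * value + (1 - alpha) * smoothed[-1])
    return smoothed

# Hypothetical monthly sales figures.
sales = [120, 132, 101, 134, 90, 110]
smoothed = exponential_smoothing(sales, alpha=0.5)
forecast = smoothed[-1]   # one-step-ahead forecast for the next period
print(smoothed, forecast)
```

A larger α makes the forecast react faster to recent changes; a smaller α gives a smoother series that reacts more slowly.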

    Choice of a Method of Forecasting: The selection of an appropriate method depends on many factors: the context of the forecast, the relevance and availability of historical data, the degree of accuracy desired, the time period for which forecasts are required, the cost benefit of the forecast to the company, and the time available for making the analysis.

    Effectiveness of Time Series Analysis:

    The term time series analysis is used to distinguish a problem, firstly from more ordinary data analysis problems (where there is no natural ordering of the context of individual observations), and secondly from spatial data analysis where there is a context that observations (often) relate to geographical locations. There are additional possibilities in the form of space-time models (often called spatial-temporal analysis). A time series model will generally reflect the fact that observations close together in time will be more closely related than observations further apart. In addition, time series models will often make use of the natural one-way ordering of time so that values in a series for a given time will be expressed as deriving in some way from past values, rather than from future values.

    Methods for time series analyses are often divided into two classes: frequency-domain methods and time-domain methods. The former centre around spectral analysis and recently wavelet analysis, and can be regarded as model-free analyses well-suited to exploratory investigations. Time-domain methods have a model-free subset consisting of the examination of auto-correlation and cross-correlation analysis, but it is here that partly and fully-specified time series models make their appearance.

    Merits:
    i) It is an easy method of forecasting.
    ii) By this method a comparative study of variations can be made.
    iii) Reliable results of forecasting are obtained, as this method is based on a mathematical model.


    Moving Average

    In statistics, a moving average, also called rolling average, rolling mean or running average, is a type of finite impulse response filter used to analyze a set of data points by creating a series of averages of different subsets of the full data set. A moving average is not a single number, but a set of numbers, each of which is the average of the corresponding subset of a larger set of data points. A moving average may also use unequal weights for each data value in the subset to emphasize particular values in the subset.

    A moving average is commonly used with time series data to smooth out short-term fluctuations and highlight longer-term trends or cycles. The threshold between short-term and long-term depends on the application, and the parameters of the moving average will be set accordingly. For example, it is often used in technical analysis of financial data, like stock prices, returns or trading volumes. It is also used in economics to examine gross domestic product, employment or other macroeconomic time series. Mathematically, a moving average is a type of convolution and so it is also similar to the low-pass filter used in signal processing. When used with non-time series data, a moving average simply acts as a generic smoothing operation without any specific connection to time, although typically some kind of ordering is implied.
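A simple (equally weighted) moving average can be sketched as below; each output value is the mean of one window of consecutive observations. The sales series and the 3-period window are hypothetical choices for illustration.

```python
def moving_average(series, window):
    """Equally weighted moving average over consecutive windows of `window` points."""
    if not 0 < window <= len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i:i + window]) / window
        for i in range(len(series) - window + 1)
    ]

# Hypothetical sales for seven periods, smoothed with a 3-period window.
sales = [10, 12, 14, 13, 15, 17, 16]
ma3 = moving_average(sales, 3)
print(ma3)   # [12.0, 13.0, 14.0, 15.0, 16.0]
```

The output has `len(series) - window + 1` values, which is why a moving average is a set of numbers rather than a single figure.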


    4. What is the definition of Statistics? What are the different characteristics of statistics? What are the different functions of Statistics? What are the limitations of Statistics?

    Statistics is a mathematical science pertaining to the collection, analysis, interpretation or explanation, and presentation of data. Statisticians improve the quality of data with the design of experiments and survey sampling. Statistics also provides tools for prediction and forecasting using data and statistical models. Statistics is applicable to a wide variety of academic disciplines, including natural and social sciences, government, and business.

    Characteristics of Statistics

    a. Statistics deals with aggregates of facts: A single figure cannot be analyzed.

    b. Statistics are affected to a marked extent by multiplicity of causes: The statistics of yield of paddy is the result of factors such as fertility of soil, amount of rainfall, quality of seed used, quality and quantity of fertilizer used, etc.

    c. Statistics are numerically expressed: Only numerical facts can be statistically analyzed. Therefore, a statement such as "price decreases with increasing production" cannot be called statistics.

    d. Statistics are enumerated or estimated according to reasonable standards of accuracy: The facts should be enumerated (collected from the field) or estimated (computed) with the required degree of accuracy. The degree of accuracy differs from purpose to purpose. In measuring the length of screws, an accuracy up to a millimeter may be required, whereas, while measuring the heights of students in a class, accuracy up to a centimeter is enough.

    e. Statistics are collected in a systematic manner: The facts should be collected according to planned and scientific methods. Otherwise, they are likely to be wrong and misleading.

    f. Statistics are collected for a pre-determined purpose: There must be a definite purpose for collecting facts, e.g. movement of wholesale prices of commodities.

    g. Statistics are placed in relation to each other: The facts must be placed in such a way that a comparative and analytical study becomes possible. Thus, only related facts which are arranged in logical order can be called statistics.


    Functions of Statistics

    1. It simplifies mass data.
    2. It makes comparison easier.
    3. It brings out trends and tendencies in the data.
    4. It brings out hidden relations between variables.
    5. It makes the decision-making process easier.

    Major limitations of Statistics are:

    Statistical laws are true on average.

    Statistics are aggregates of facts, so a single observation is not a statistic; statistics deals with groups and aggregates only.

    Statistical methods are best applicable to quantitative data.

    Statistical methods cannot be applied to heterogeneous data.

    If sufficient care is not exercised in collecting, analyzing and interpreting the data, statistical results might be misleading.

    Some errors are possible in statistical decisions. In particular, inferential statistics involves certain errors, and we do not know whether an error has been committed or not.

    Statistics can be misused and misinterpreted: increasing misuse of statistics has led to increasing distrust in statistics.


    5. What are the different stages of planning a statistical survey? Describe the various methods for collecting data in a statistical survey.

    The planning stage consists of the following sequence of activities.

    The nature of the problem to be investigated should be clearly defined in an unambiguous manner.

    Objectives of the investigation should be stated at the outset. Objectives could be to obtain certain estimates, to establish a theory, to verify an existing statement, to find relationships between characteristics, etc.

    The scope of the investigation has to be made clear. It refers to the area to be covered, identification of units to be studied, nature of characteristics to be observed, accuracy of measurements, analytical methods, time, cost and other resources required.

    Whether to use data collected from a primary or secondary source should be determined in advance.

    The organization of the investigation is the final step in the process. It encompasses the determination of the number of investigators required, their training, supervision work needed, funds required, etc.

    Collection of primary data can be done by any one of the following methods:
    1) Direct personal observation
    2) Indirect oral interview
    3) Information through agencies
    4) Information through mailed questionnaires
    5) Information through schedules filled by investigators


    6. What are the functions of classification? What are the requisites of a good classification? What is a table? Describe the usefulness of a table as a mode of presentation of data.

    The functions of classification are:

    1) It reduces the bulk of the data.
    2) It simplifies the data and makes the data more comprehensible.
    3) It facilitates comparison of characteristics.
    4) It renders the data ready for any statistical analysis.

    Requisites of good classification are:
    1. Unambiguous: It should not lead to any confusion.
    2. Exhaustive: Every unit should be allotted to one and only one class.
    3. Mutually exclusive: There should not be any overlapping.
    4. Flexibility: It should be capable of being adjusted to changing situations.
    5. Suitability: It should be suitable to the objectives of the survey.
    6. Stability: It should remain stable throughout the investigation.
    7. Homogeneity: Similar units are placed in the same class.
    8. Revealing: It should bring out essential features of the collected data.

    A table is a logical listing of related data in rows and columns. The objectives of tabulation are:
    To simplify complex data
    To highlight important characteristics
    To present data in minimum space
    To facilitate comparison
    To bring out trends and tendencies
    To facilitate further analysis