Minitab 13.31 for Windows Tutorial - NTNU · 2010-07-15 · 16. selecting observations 16 17....


Minitab 15 for Windows

Tutorial

Section of Applied Statistics


Contents

1. INTRODUCTION
2. GETTING STARTED
3. DATA TO BE ANALYSED
4. DATA ENTRY
5. EDITING DATA
6. LISTING THE DATA
7. SAVING THE DATA
8. DESCRIPTIVE STATISTICS
9. CALCULATIONS
10. X-Y SCATTERPLOTS
11. STARTING A SEPARATE ANALYSIS WITHIN THE SAME MINITAB PROJECT
12. READING DATA FROM A SEPARATE FILE
13. READING IN BLOCKS OF DATA
14. GENERATING DATA
15. SPLITTING THE WORKSHEET INTO GROUPS
16. SELECTING OBSERVATIONS
17. SORTING DATA
18. TABULATION
19. BAR CHARTS
20. REGRESSION
21. ANALYSIS OF VARIANCE
22. SAVING GRAPHS AND IMPORTING INTO WORD 2000
23. USING MINITAB HELP
24. LEAVING MINITAB


1. Introduction

Minitab provides a wide range of basic and advanced statistics, including exploratory data analysis, basic statistics, regression, analysis of variance, multivariate analysis, time series, cross-tabulations, and simulations and distributions. It also has facilities to produce a comprehensive array of graphs.

2. Getting Started

This tutorial assumes that you have a basic knowledge of Windows. The Information Technology Services (ITS) provide introductory courses and documents for this purpose. This tutorial is based on the default installation of the software at Reading, where your Unix home directory is available to you as the N:\ drive and is shown on the desktop in the Windows XP environment as My Documents.

Access the package from the Start menu: select Statistics, then Minitab 15 and then the Minitab icon. Wait until Figure 1 appears on the screen. [For information: to exit Minitab, select the File option on the menu bar, choose Exit and confirm your intention. You do not need to save changes to the project at this stage.] From now on, a combination of commands from the pull-down menus is shown in this tutorial as File→Exit.

Figure 1

Below the Minitab Menu bar are two sub-windows: Session and Worksheet1. The Session window is where any non-graphical output is displayed. Note that you can also type commands in this window, but for this tutorial you will use the pull-down menus. The Worksheet1 *** window is a spreadsheet, where you can type in and view your data.

3. Data to be Analysed

The data used to illustrate a simple use of the Minitab package consist of three variables: a Plot code number, the number of Trees in that plot and the total yield of Apples (in unspecified units) from that plot. The data are used in the first few exercises to illustrate data entry and modification, descriptive statistics and an X-Y scatterplot.

4. Data Entry

Exercise 1

Move the cursor to the Worksheet1 window and click in the grey cell immediately below C1. This is above row 1 of the data and is where you will enter the names of the three variables Plot, Trees and Apples into C1, C2 and C3. The data should then be entered in the 7 rows below.

 11  110  223
 15  129  250
104    9  195
168  102  177
 60  145  297
125   86  186
111  109  188

The Worksheet1 window should look like Figure 2.

Figure 2

5. Editing Data

To show you how to edit data, we have introduced two mistakes in the data you have just typed in. (You may have made some more!)

Exercise 2

An incorrect value has been entered as the third observation for Trees. The value should be 90, not 9. To correct the value, use the mouse to move the cursor to the correct cell in the Worksheet1 window and type the correct value, 90.

A row of data was omitted between rows 2 and 3, which you should now insert. Move the cursor to row 3 of column C1. Select Editor→Insert Rows. An empty row is inserted above the active row, and the remaining rows are moved down. Type the values 88 122 266 to replace the row of ‘*’s. Note that in Minitab a ‘*’ represents a missing value.
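As a cross-check outside Minitab, the two corrections can be sketched in Python. This is only an illustration of what the worksheet should contain afterwards; the variable name `rows` is an assumption and is not part of the tutorial.

```python
# Worksheet as typed in Exercise 1 (columns: Plot, Trees, Apples).
rows = [
    [11, 110, 223],
    [15, 129, 250],
    [104, 9, 195],
    [168, 102, 177],
    [60, 145, 297],
    [125, 86, 186],
    [111, 109, 188],
]

# Correction 1: the third observation for Trees should be 90, not 9.
rows[2][1] = 90

# Correction 2: insert the omitted row between rows 2 and 3.
rows.insert(2, [88, 122, 266])

for plot, trees, apples in rows:
    print(plot, trees, apples)
```

After both edits the worksheet has eight rows, with the new row in position 3 and the corrected row moved down to position 4.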

6. Listing the Data

Although you can see the data in the Worksheet1 window, it is sensible to have a listing of the data along with the output from your statistical analyses. The output will be displayed in the Session window.

Exercise 3

From the Menu Bar at the top of the screen, select Data→Display Data.... A dialogue box for Display Data is shown. Select all three variables by double-clicking on each one in turn. Your Display Data dialogue box should look like Figure 3. Click on OK and the Session window should now show the results as in Figure 4.

Figure 3

Figure 4

7. Saving the Data

First check that your data are correct. If there are any mistakes, make corrections by following the instructions in Section 5 and repeating Exercise 3. Now you should save the data in a file.

Exercise 4

Select File→Save Current Worksheet As... In the dialogue-box shown in Figure 5, move to the N:\ drive and, in the File name box, type the name of the file in which you wish to save the data, e.g. apples. Do not change the Save as type box. Click on Save to save the data to the named file.

Note: For information only, an appropriate statement will appear in the History window that can be seen by clicking on the icon. This will show a complete listing of all the Minitab commands that you have used throughout the session and can be saved for use on similar data sets.


Figure 5

8. Descriptive Statistics

Given the nature of the data set, you wish to look at some basic descriptive statistics (e.g. mean and standard deviation) for Trees and Apples.

Exercise 5

Select Stat→Basic Statistics→Display Descriptive Statistics... Select the variables Trees and Apples by double-clicking on each name in turn. The top of the Descriptive Statistics dialog box should look like Figure 6. Click on OK. The expected output is shown in Figure 7.

Figure 6


Figure 7
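If you want an independent check of the mean and standard deviation, a minimal Python sketch is given here. It assumes the corrected data from Exercise 2; the list names are illustrative only, and the layout of the printed line does not imitate the Minitab output in Figure 7.

```python
from statistics import mean, stdev

# Corrected data after Exercise 2 (eight plots).
trees = [110, 129, 122, 90, 102, 145, 86, 109]
apples = [223, 250, 266, 195, 177, 297, 186, 188]

for name, values in (("Trees", trees), ("Apples", apples)):
    # stdev() is the sample standard deviation (divisor n - 1).
    print(f"{name}: N={len(values)} mean={mean(values):.3f} stdev={stdev(values):.3f}")
```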

9. Calculations

We now wish to calculate the average yield of apples per tree for each plot.

Exercise 6

Select Calc→Calculator. Type Average in the Store result in variable box. Move to the Expression box. Double-click on Apples; click on the ' / ' symbol; double-click on Trees. Your screen should look like Figure 8. (The program automatically adds the quotation marks displayed around the column names.) Click on OK. The results of the calculation can be seen in the Worksheet1 window in the next available column (C4). Check one or two values.

Try to calculate C5 as the square root of the number of trees on each plot. How did you do this? (Hint: select from the list of Functions available.)
……………………………………………………………………………………………………

Figure 8
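To check one or two values of the new columns by hand, the same row-by-row arithmetic can be sketched in Python. The names `average` and `root_trees` are assumptions chosen for this illustration; the code does not show how the calculation is entered in Minitab.

```python
import math

# Corrected data after Exercise 2 (eight plots).
trees = [110, 129, 122, 90, 102, 145, 86, 109]
apples = [223, 250, 266, 195, 177, 297, 186, 188]

# C4: average yield of apples per tree, computed plot by plot.
average = [a / t for a, t in zip(apples, trees)]

# C5: square root of the number of trees on each plot.
root_trees = [math.sqrt(t) for t in trees]

for a, r in zip(average, root_trees):
    print(f"{a:.4f}  {r:.4f}")
```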

Exercise 7

If you want to calculate a column statistic, e.g. the total yield of apples on all the plots (one number) in this experiment, you can use Calc→Column Statistics as shown in Figure 9. Check that the total yield is 1782.0 (in unspecified units). The value stored in K1 can be used in further calculations if required.

Figure 9
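The total quoted in Exercise 7 can be confirmed with a one-line sum over the corrected Apples column. This Python fragment is only a sanity check; `total` plays the role of the stored constant K1.

```python
# Apples column after the corrections in Exercise 2.
apples = [223, 250, 266, 195, 177, 297, 186, 188]

# Column statistic: a single number summarising the whole column.
total = sum(apples)
print(total)
```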

10. X-Y Scatterplots

You wish to plot the yield of apples per plot against the number of trees in the plot to see if there is any relationship between them. This is also useful to spot any data values that may need further investigation and checking.

Exercise 8

Select Graph→Scatterplot. Choose Simple. In the Plot dialog box choose the Y-variable to be Apples. In the same way choose Trees for the X-variable. The top of the Plot dialog box should look like Figure 10. Click on OK. The graph appears in a Graphics window as shown in Figure 11. To discard the graph, click in the box at the top left of the graphics window and click on Close Graph.

Figure 10

Figure 11

11. Starting a Separate Analysis within the same Minitab Project

The previous set of data and its analysis should be cleared from Minitab before the next exercise is attempted. This can be done in two ways as shown in Exercise 9.

Exercise 9

1. Create a new Minitab worksheet, using File→New and choosing Minitab Worksheet from the dialogue box shown in Figure 12. This will create a new analysis within the same Minitab project and is useful if you want to keep “experiments” together as one unit.

2. Exit from Minitab, using File→Exit and answering No to all other questions. This will allow you to start a completely separate Minitab project.

Using the first method described, clear the previous analysis now.

Figure 12

12. Reading Data from a Separate File

A data set that you wish to analyse may have been entered using an editor, database or spreadsheet. This example assumes the data is in an ASCII (plain text) file created by an editor such as Notepad, or by Office 2000 with the file saved in the correct format (MS-DOS Text (*.txt) or Data (*.dat)). Other forms of data (e.g. Excel spreadsheets) can be opened directly with the File→Open Worksheet command. This tutorial does not extend that far; for more help on linking other software with Minitab, advice can be obtained from the Statistical Computing Advisory Service (see the back page of this document) or refer to the appropriate manual.

There are many ways to lay out data for input into Minitab but, for the purpose of this tutorial, data values are separated by one or more spaces. Whilst this is a trivial example, it easily demonstrates the method needed.

Data sets do not have to be confined to numeric values. For example, you have been supplied with a dataset called school.dat. The four data values for each response represent:

School   Name of school (character values)
Quest    The number of the multiple-choice question asked (yes/no answers only)
Number   The number of responses to that question
Correct  The number of “yes” responses to that question

A section of the data (19 observations) is shown here:

Willink 1 10 7
Willink 2 15 8
…………………..
Crossley 4 30 27
Crossley 5 30 29

If formatted input were not used, the records labelled Maiden E would cause a problem, as there is a space within the name of the school.

Exercise 10

Select File→Other Files→Import Special Text and you should see Figure 13. Type C1 - C4 into Store data in Column(s) and then select Format to obtain Figure 14. Select the User specified format and then fill in the dialogue box with the phrase A10,1X,F1,2X,F3,2X,F2. This indicates that the first 10 characters in the data file are alphanumeric (A10), that there are one or two spaces (1X or 2X) separating each piece of information and that the numeric data values occupy one, two or three digits (F1, F2 or F3). Select OK within each dialogue box; you will then be shown Figure 15 and must select the appropriate data file N:\school.dat. Click on Open to input the data set into the new Worksheet2 window. Name the columns appropriately.
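To see why a fixed-width format copes with a school name that contains a space, the format A10,1X,F1,2X,F3,2X,F2 can be mimicked in Python by slicing each record at fixed character positions. This is a sketch under stated assumptions: the helper names `make_line` and `parse_record` are hypothetical, numbers are assumed to be right-justified in their fields, and the values in the Maiden E record are invented for illustration because the tutorial does not list them.

```python
# Field layout implied by A10,1X,F1,2X,F3,2X,F2: (width, column name or None to skip).
LAYOUT = [(10, "School"), (1, None), (1, "Quest"), (2, None),
          (3, "Number"), (2, None), (2, "Correct")]

def make_line(school, quest, number, correct):
    """Build one 21-character record in the assumed fixed-width layout."""
    return f"{school:<10} {quest:1d}  {number:3d}  {correct:2d}"

def parse_record(line):
    """Cut a record at fixed character positions, ignoring the skipped gaps."""
    record, pos = {}, 0
    for width, name in LAYOUT:
        chunk = line[pos:pos + width]
        pos += width
        if name == "School":
            record[name] = chunk.strip()
        elif name is not None:
            record[name] = int(chunk)
    return record

lines = [
    make_line("Willink", 1, 10, 7),     # values taken from the listing above
    make_line("Maiden E", 3, 20, 15),   # hypothetical values, for illustration only
]
records = [parse_record(line) for line in lines]
for record in records:
    print(record)
```

Splitting the second record on blanks instead would give five pieces rather than four, which is the problem that the formatted input avoids.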

Exercise 11

Using the data set just input, the percentage of correct answers to each question might be required. Follow the basic instructions in Exercise 6 to calculate this percentage. Label the column Percent and check the value for a random observation. [Hint: Supply (Correct/Number)*100 to the Calc→Calculator dialogue-box.] Plot the percentage of correct answers, Percent, against the question number, Quest, as a revision of Exercise 8.
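When checking a value by hand, the expression in the hint works out as follows for the four records listed in Section 12. This is a small Python sketch; the tuple layout and the name `percent` are assumptions made for the illustration.

```python
# (School, Quest, Number, Correct) for the four records shown in Section 12.
records = [
    ("Willink", 1, 10, 7),
    ("Willink", 2, 15, 8),
    ("Crossley", 4, 30, 27),
    ("Crossley", 5, 30, 29),
]

# Same expression as the hint: (Correct/Number)*100, row by row.
percent = [(correct / number) * 100 for _, _, number, correct in records]

for (school, quest, _, _), p in zip(records, percent):
    print(f"{school:<10} question {quest}: {p:.1f}%")
```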

Figure 13


Figure 14

Figure 15

13. Reading in Blocks of Data

Occasionally, data has been typed into a file or database in a rectangular block as shown in the data set below. This data is held in a data file called N:\cropyd.dat. One way to deal with this type of data in Minitab involves reading the values into six columns and then stacking them on top of one another. Try the next exercise to demonstrate this procedure.

                          Water Added
                    0          50         100
                   Rep         Rep         Rep
                  1    2     1    2     1    2
Shaded
  Nutrient 1     132  129   134  130   130  132
  Nutrient 2     129  127   135  133   129  127
  Nutrient 3     134  154   133  135   127  126
Unshaded
  Nutrient 1     129  131   135  137   137  141
  Nutrient 2     125  122   130  129   132  126
  Nutrient 3     120  124   129  132   119  120

Exercise 12

Repeat Exercise 10, and read the data from N:\cropyd.dat into C1-C6. This time select Blank delimited (numeric data only) from the Format dialogue-box so that the previously defined format is ignored. (Minitab will assume that each column of data is separated by a space.) Now the data has to be stacked to create a single column C7 from the six columns already in place. From the drop-down menu select Data→Stack→Stack Columns, which should be as shown in Figure 16. Select all six columns to be stacked, then select the option Column of current worksheet and type C7 in the adjacent box. Click on OK. At this point you could delete the first six columns but this tutorial will continue with them in place.

Figure 16
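The effect of stacking C1-C6 into C7 can be pictured with a short Python sketch: the six columns are placed one below another, so the first six stacked values are the whole of C1, the next six are C2, and so on. The names `block`, `columns` and `c7` are illustrative assumptions.

```python
# The 6 x 6 block from Section 13, one list per row of the data file.
block = [
    [132, 129, 134, 130, 130, 132],
    [129, 127, 135, 133, 129, 127],
    [134, 154, 133, 135, 127, 126],
    [129, 131, 135, 137, 137, 141],
    [125, 122, 130, 129, 132, 126],
    [120, 124, 129, 132, 119, 120],
]

# Worksheet columns C1..C6 after reading the file.
columns = [[row[j] for row in block] for j in range(6)]

# Stack the columns on top of one another to form the single column C7.
c7 = [value for column in columns for value in column]
print(len(c7), c7[:6])
```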

14. Generating Data

As described in the previous section, the file N:\cropyd.dat contains data values representing crop yields. The data have been collected from an experiment with three factors. The factors are Light (Shaded/Unshaded), Water Added (0, 50 and 100) and Nutrient (levels 1, 2 and 3), and there are two Replicates. Rather than type in values to represent the order of the data corresponding to the factors, Minitab can generate data that follows a predefined pattern. Follow Exercises 13 and 14 to generate numbers and text values for the factors.

Exercise 13

Generate the values to represent the Replicate by selecting Calc→Make Patterned Data→Simple Set of Numbers. Complete the dialogue-box as shown in Figure 17 and select OK. Do you understand the values calculated and why? If not, ask the tutor. Complete columns C9 and C10 to represent Water and Nutrient respectively. Add column names if you wish. Look in the History window and the statements

SET C8 3(1:2/1)6
SET C9 1(0:100/50)12
SET C10 12(1:3/1)1

should be visible. If not, then you may have specified the patterned data incorrectly. Try again or, if necessary, ask the tutor.

Figure 17
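One way to reason about the three patterns is to write them out explicitly. The Python sketch below assumes that in a statement such as 3(1:2/1)6 the number after the parentheses lists each value that many times in a row and the number before them repeats the whole sequence; this reading is inferred from the stacked order of the data in Section 13. The function name `patterned` is hypothetical.

```python
def patterned(values, each, times):
    """List every value `each` times in a row, then repeat the whole sequence `times` times."""
    return [v for _ in range(times) for v in values for _ in range(each)]

rep = patterned([1, 2], each=6, times=3)           # C8: 3(1:2/1)6
water = patterned([0, 50, 100], each=12, times=1)  # C9: 1(0:100/50)12
nutrient = patterned([1, 2, 3], each=1, times=12)  # C10: 12(1:3/1)1

print(rep[:12])
print(water[10:14])
print(nutrient[:6])
```

Each list has 36 entries, one for every value in the stacked column C7.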

Exercise 14

You do not have to be confined to numeric data. Minitab will accept text values. These can be generated using Calc→Make Patterned Data→Text Values; fill in the box as shown in Figure 18. Click on OK. This is represented in the History window by

TSET C11 6("Shaded" "Unshaded")3

Figure 18
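The text pattern can be written out in the same way as the numeric ones. In this Python sketch the reading of the statement (each label listed 3 times, the pair of labels repeated 6 times) is an assumption inferred from the layout of the data, and `patterned` is a hypothetical helper.

```python
def patterned(values, each, times):
    """List every value `each` times in a row, then repeat the whole sequence `times` times."""
    return [v for _ in range(times) for v in values for _ in range(each)]

# C11: 6("Shaded" "Unshaded")3
light = patterned(["Shaded", "Unshaded"], each=3, times=6)
print(light[:6])
```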

15. Splitting the Worksheet into Groups

It is possible that you may want to work on a group of observations using commands that do not have a BY option. In the following exercise, split the worksheet into two further worksheets dependent on the amount of light applied to the experimental units and perform a test of normality for each group. As previously stated, this tutorial is not able to offer statistical help and therefore this Exercise is only included to show how this function of splitting the worksheet can be used.

Exercise 15

From the main drop-down menu select Data→Split Worksheet and get the dialogue-box as shown in Figure 19. Use the text variable Light as the grouping variable. You are working on a worksheet that contains columns (variables) of unequal length, so you may see warning messages saying that certain columns have been excluded. These can be ignored (click on Cancel), but which columns are they and why are they ignored?
……………………………………………………………………………………………………

Perform a test for Normality on these two worksheets using Stat→Basic Statistics→Normality Test, which does not contain the BY option.

Figure 19

16. Selecting Observations

In the course of an analysis you may want to exclude certain observations on the assumption that they are not consistent with the overall hypothesis. In the crop data the values of yield when there was no water supplied are not valid. It is therefore necessary to exclude them from the next stage of the analysis. The next exercise demonstrates how this can be done.

Exercise 16

Move back to the Worksheet containing the complete set of data (from the main menu bar use Window→Worksheet2). Then from the main menu again select Data→Subset Worksheet to get Figure 20. We want to include only the rows of data where Water (C9) is greater than zero. Select Rows that match and then the Condition button. On a screen similar to the Calculator dialogue-box type in C9 > 0 (see Figure 21). Click on OK twice and look at the contents of the new worksheet.

Figure 20

Figure 21
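The condition C9 > 0 keeps only the rows whose Water value is positive. A minimal Python sketch of the same row filter is shown here; it rebuilds the stacked (Water, Yield) pairs from the block in Section 13, and the names `waters`, `stacked` and `subset` are assumptions made for the illustration.

```python
# The 6 x 6 block from Section 13, one list per row of the data file.
block = [
    [132, 129, 134, 130, 130, 132],
    [129, 127, 135, 133, 129, 127],
    [134, 154, 133, 135, 127, 126],
    [129, 131, 135, 137, 137, 141],
    [125, 122, 130, 129, 132, 126],
    [120, 124, 129, 132, 119, 120],
]
waters = [0, 0, 50, 50, 100, 100]  # water level for worksheet columns C1..C6

# (Water, Yield) pairs in stacked order, i.e. C9 alongside C7.
stacked = [(waters[j], row[j]) for j in range(6) for row in block]

# Keep only the rows that match the condition Water > 0.
subset = [(water, value) for water, value in stacked if water > 0]
print(len(stacked), len(subset))
```

The twelve rows with no water added are dropped, leaving 24 of the 36 observations.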

17. Sorting Data

Occasionally, you may need to re-order your data dependent on the values in one particular column. Any number of columns can be re-ordered in this way and saved in further columns, but only a maximum of four sorting columns can be identified unless you use the syntax command in full (which can be seen in the History window). Follow Exercise 17 to demonstrate this technique - remember to include the sorting column in the columns to be sorted (it is not done automatically).

Exercise 17

Select Data→Sort and get Figure 22. In the example shown, select C7 to be sorted by Light (in Descending order) into C17. Remember to include Light into C21 as well. Click OK and examine the output in the worksheet. Is it correct?

Figure 22
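To judge whether the sorted output is correct, it helps to know what a descending sort on Light should produce: all Unshaded rows first, then all Shaded rows. The Python sketch below sorts (Light, Yield) pairs together, which mirrors the reminder to carry the sorting column along with the data. The names are illustrative, and the order of rows within each Light group in Minitab is not claimed to match Python's.

```python
# The 6 x 6 block from Section 13, one list per row of the data file.
block = [
    [132, 129, 134, 130, 130, 132],
    [129, 127, 135, 133, 129, 127],
    [134, 154, 133, 135, 127, 126],
    [129, 131, 135, 137, 137, 141],
    [125, 122, 130, 129, 132, 126],
    [120, 124, 129, 132, 119, 120],
]
lights = ["Shaded"] * 3 + ["Unshaded"] * 3  # light level for block rows 1..6

# (Light, Yield) pairs in stacked order, i.e. C11 alongside C7.
stacked = [(lights[i], block[i][j]) for j in range(6) for i in range(6)]

# Sort by Light only, in descending (reverse alphabetical) order.
by_light_desc = sorted(stacked, key=lambda pair: pair[0], reverse=True)
print(by_light_desc[:3], by_light_desc[18:21])
```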

18. Tabulation

Cross-tabulation or simple frequency tables can often be used as a check on your data as well as presenting summaries of other variables for reports. The following example is used simply to demonstrate the capabilities of Minitab to lay out the original data from the previous exercises in a table similar to that shown in Section 13.

Exercise 18

Return once again to the complete Worksheet2. Select Stat→Tables→Descriptive Statistics to reveal Figure 23. Complete the list of Categorical variables as in Figure 23. Select the Associated Variables button to produce Figure 24 and select C7. Select the option to produce the Sums of this variable and click OK once.

Figure 23

Figure 24
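A table of sums can also be built by hand as a check. The Python sketch below cross-tabulates the sum of yield by Light and Water; this choice of categorical variables is an assumption, since the selection shown in Figure 23 is not reproduced in the text, and the names are illustrative.

```python
from collections import defaultdict

# The 6 x 6 block from Section 13, one list per row of the data file.
block = [
    [132, 129, 134, 130, 130, 132],
    [129, 127, 135, 133, 129, 127],
    [134, 154, 133, 135, 127, 126],
    [129, 131, 135, 137, 137, 141],
    [125, 122, 130, 129, 132, 126],
    [120, 124, 129, 132, 119, 120],
]
lights = ["Shaded"] * 3 + ["Unshaded"] * 3  # light level for block rows 1..6
waters = [0, 0, 50, 50, 100, 100]           # water level for worksheet columns C1..C6

# Accumulate the sum of yield in each (Light, Water) cell.
sums = defaultdict(int)
for i, row in enumerate(block):
    for j, value in enumerate(row):
        sums[(lights[i], waters[j])] += value

for light in ("Shaded", "Unshaded"):
    print(light, [sums[(light, water)] for water in (0, 50, 100)])
```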

19. Bar Charts

As explained in Section 10, a simple diagram can serve to summarise data very well and this illustration of a basic bar-chart is no exception. Follow Exercise 19 to display the means of the crop yield (C7) for each level of nutrient added (C10).

Exercise 19

Select Graph→Bar Chart… Complete the dialogue-box as shown in Figure 26 and Figure 27. Figure 27 shows Mean, but try others such as Sum or StDev. Do the charts agree with any statistical summaries that you may have already produced?

Figure 25

Figure 27
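The bar heights can be checked against group means computed directly from the data. This Python sketch averages the yield for each nutrient level using the block in Section 13; the names `totals`, `counts` and `means` are assumptions made for the illustration.

```python
# The 6 x 6 block from Section 13; rows are Nutrient 1, 2, 3 (Shaded) then 1, 2, 3 (Unshaded).
block = [
    [132, 129, 134, 130, 130, 132],
    [129, 127, 135, 133, 129, 127],
    [134, 154, 133, 135, 127, 126],
    [129, 131, 135, 137, 137, 141],
    [125, 122, 130, 129, 132, 126],
    [120, 124, 129, 132, 119, 120],
]

totals = {1: 0, 2: 0, 3: 0}
counts = {1: 0, 2: 0, 3: 0}
for i, row in enumerate(block):
    nutrient = i % 3 + 1          # nutrient level for this row of the block
    totals[nutrient] += sum(row)
    counts[nutrient] += len(row)

# Mean yield per nutrient level: the quantity shown by the Mean bar chart.
means = {level: totals[level] / counts[level] for level in totals}
for level, value in means.items():
    print(f"Nutrient {level}: mean yield {value:.2f}")
```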

There are many Minitab commands available for statistical analysis. All should be used initially under the guidance of a statistician, as there are many ways of producing the wrong analysis and many procedures can be used for different forms of analysis, e.g. Stat→ANOVA→General Linear Model. This tutorial is not designed to show how to analyse your data but rather to give you a demonstration of how the drop-down menus work and the options they contain. The data we provide for you is very simple and in many places fabricated to show certain features.

20. Regression

The data used here to illustrate the regression techniques in Minitab is from a clinical trial concerned with 25 patients suffering from cystic fibrosis. The data, contained in the dataset N:\cystic.dat, comprises the following variables.

Subject  Subject Number
Age
Sex      0=male, 1=female
Height   (cm)
Weight   (kg)
BMP      Body Mass (Weight/Height²) as a percentage of the age-specific median in normal individuals
FEV1     Forced expiratory volume in 1 second
RV       Residual volume
FRV      Functional residual capacity
TLC      Total lung capacity
PEmax    Maximal static expiratory pressure (cm H2O)

A subset of the data is shown here:

 1   7  0  109  13.1  68  32  258  183  137   95
 2   7  1  112  12.9  65  19  449  245  134   85
 3   8  0  124  14.1  64  22  441  268  147  100
 . . .
23  23  0  180  73.8  97  57  171  108   98  165
24  23  0  175  51.1  71  33  224  131  113   95
25  23  0  179  71.5  95  52  225  127  101  195

Exercise 20

Using the methods described in Exercises 9 and 12, create a new worksheet and then import the data file, N:\cystic.dat, into C1-C11. You will not need to specify a format, as all the data is numeric and separated by at least one blank space. Name each of the columns as suggested above (e.g. subject, age etc.).

♦ Produce an X-Y Scatterplot as shown in Exercise 8 to check the relationship between "Maximal static expiratory pressure (C11)" and "Weight (C5)". Can you suggest a relationship between the two variables?
………………………………………………………………………………………………

♦ Fit a linear regression between the two variables by selecting Stat→Regression→Regression… so that Figure 27 appears. Complete the dialogue-box as shown, where the Response variable is C11 and the Predictor is C5. Click OK and observe the results. Write down the equation of the line fitted.
………………………………………………………………………………………………

♦ Repeat the exercise with more pairs of variables, such as Height and Weight.

♦ Multiple regression is an extension of this basic analysis whereby more than one predictor is used to explain the variation in a response variable. Extend the list of predictors to analyse PEmax against Weight and Age. Write down the equation of the line fitted.
………………………………………………………………………………………………

Figure 26

Exercise 21

If you have time to spare, repeat the Regression analysis of “Maximal static expiratory pressure” and “Weight”. This time select the Graphs button and plot the Residuals versus the Fits (residual plot). This is a very useful aid to deciding whether a more complicated regression could better explain the relationship.
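To make the fitted line and the residual plot more concrete, the least-squares calculation can be sketched in Python. Note the assumption: only the six rows printed in the subset above are used here, so the coefficients will not match the Minitab output for all 25 patients. The variable names are illustrative.

```python
# Weight (C5) and PEmax (C11) for the six patients printed in the subset above.
weight = [13.1, 12.9, 14.1, 73.8, 51.1, 71.5]
pemax = [95, 85, 100, 165, 95, 195]

n = len(weight)
mean_x = sum(weight) / n
mean_y = sum(pemax) / n

# Least-squares estimates for the line PEmax = intercept + slope * Weight.
sxx = sum((x - mean_x) ** 2 for x in weight)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(weight, pemax))
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Fits and residuals: the two quantities plotted in a Residuals versus Fits graph.
fits = [intercept + slope * x for x in weight]
residuals = [y - f for y, f in zip(pemax, fits)]

print(f"PEmax = {intercept:.2f} + {slope:.3f} Weight")
for f, r in zip(fits, residuals):
    print(f"fit {f:7.2f}  residual {r:7.2f}")
```

In a residual plot, a curved or funnel-shaped pattern of residuals against fits would suggest that a straight line is not an adequate description of the relationship.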

21. Analysis of Variance

Minitab has several commands to perform analysis of variance dependent on your data collection. This exercise demonstrates a simple analysis of variance of a balanced designed experiment.

Exercise 22

A Minitab worksheet has been created on the diskette called N:\balaov.mtw. Create a new worksheet window (Exercise 9) and then, using the option File→Open Worksheet, select the appropriate file and look at the data. The data consists of the Yield of tomatoes from an agricultural experiment that was set out in 3 blocks (block) with 2 treatments (side with 2 levels and strain with 4 levels of classification). The data should be analysed in the following way, assuming that strain is randomised across all plots in each block - a factorial randomised block design. From the main menu bar choose Stat→ANOVA→Balanced ANOVA… and get Figure 28. Complete the screen as shown in the figure with Yield as the Response variate and the Model as block side strain side*strain. Click OK to run the analysis. Repeat the analysis after selecting the Graph button and selecting Residuals versus Fits (as in Exercise 21).

Figure 27

22. Saving Graphs and Importing into Microsoft Word

The quickest and easiest way to import a graph from Minitab into Microsoft Word is to copy the graphics window and then paste it straight into the document. Try the next exercise on the graph just produced.

Exercise 23

Minimise your Minitab session and open a new blank Word document (from Start→Microsoft Office→New Word Document). Find Minitab again from the taskbar and maximize the window. From the main menu bar choose Edit→Copy Graph and then return to the Microsoft Word document and paste the graph into position (<Ctrl><V> or Edit→Paste Special).

23. Using Minitab Help

To find out more about Minitab, select Help on the main menu bar. This is a useful tool when you cannot remember which dialogue-box to choose for simple tasks. Minitab includes a StatGuide and Tutorial but we recommend that these should be used in conjunction with statistical help provided by the School of Applied Statistics.

24. Leaving Minitab

Before you finally leave Minitab, you should first save your work and data onto your N:\ drive. To leave Minitab, select Exit from the File menu. The dialogue box shown in Figure 30 should appear.

Figure 28

Select Yes when asked if you wish to save changes to the project and give it a suitable name e.g. Minitab_tutorial.