volweb.utk.eduvolweb.utk.edu/.../2019/04/STAT-201-Project-3-Spring-2019-V-2.6.docx  · Web...


Page 1

Stat 201 – Project 3 – Spring 2019

Due Thursday, April 18, 2019 (1 minute before midnight, submitted to Canvas)

Assignments submitted by 11:59pm on Tuesday, April 16 will receive +8 bonus points

New Project File: For project 3 you will be using “STAT 201 – Spring 2019 – Project 3.jmp”. This file can be found on the STAT 201 webpage under the “Projects” tab.

Getting Started: In this project, you will explore a subset (i.e., a sample) of some of the data collected from the survey that most Stat 201 students completed this semester. You will be including a substantial amount of output within your write-up. INCLUDE ONLY THE OUTPUT NECESSARY TO ANSWER THE PROJECT QUESTIONS. The data are found in the file “STAT 201 – Spring 2019 – Project 3.jmp”, which is located on the STAT 201 webpage under the “Projects” tab. This file contains 1,308 responses. In real-life situations, researchers would use all of the data they have available after conducting a survey. For this project, however, you will get JMP to help you take a random sample from the entire data set so that each student will have different results, and therefore will be turning in a UNIQUE project. The size of the random sample will be 200 plus the last two digits of your UT ID. When you create your random sample from the original JMP file, JMP creates a new file that will be named “Subset of STAT 201 – Spring 2019 – Project 3”. You should immediately save a copy of this file by clicking the “File” menu and choosing “Save As…”. JMP will prompt you to keep the same name, which is acceptable, or you can rename it to something like “Stat Project3 – My Data”.

Taking Screenshots: Although there are many ways to get JMP graphics into a written presentation, we want you to use the “screen shot” method in all cases. Please see the video at http://tinyurl.com/utk-screenshots for instructions on how to take selective screen shots on a PC or a Mac. Clearly label what question and part you are answering so your project is graded correctly! See page 5 for an example screenshot for a question.

Tutorials and Write-up: See the JMP tutorials at http://web.utk.edu/~cwiek/201Tutorials/ and Project 3 Playlist for instructions on how to get JMP to perform most tasks. Use page 5 of this project for guidance in which tutorial to look at for each question in the project. In every question that asks you to produce output from JMP, we expect the output you produce to answer the question to be within the write-up. You should put this output immediately after your comments regarding that specific part of the assignment (i.e., not just a series of printouts from JMP at the back of your write-up). You can get help in the Stat 201 Lab with specific questions about the project. You can NOT ask a Stat 201 Lab worker to read your entire project for suggestions on what to change. Your finished work must be submitted within Canvas (see “Assignments”), and must be a Microsoft Word document (.doc or .docx).

JMP and Hodges Library computers: Using JMP installed on your own computer is much simpler than using JMP on a library computer! If you choose to use a computer in the library to do your project, be sure to first read the document “Using JMP in the Library”, found in MyLab under the Project Files tab. Also, you will need to save your project and your random sample subset file to a location you can access later, such as a memory stick. You could also e-mail these files to yourself for later use.

Writing a Good STAT 201 Project Report: Please take note that the last page of the instructions is a page titled “Writing a Good Stat 201 Project Report”. This page contains a series of guidelines for the written part of your report. A portion of your grade (6%) is related to following these guidelines.


Page 2

1. The data for this project are found in the file “STAT 201 – Spring 2019 – Project 3.jmp”, which is located on the Stat 201 webpage under the “Projects” tab. From the full database, get JMP to help you take a random sample of size 200 plus the last two digits of your UT ID. Save this file. You will be using this random sample data file, and the larger database, to answer the following questions. (6 points)

Scroll to the bottom of your random sample data file, and take a screen shot of the far-left hand portion of your file that includes the first column and at least the last 10 rows. (See the example on page 5.)
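As a rough illustration of the sampling rule in Question 1, the sample size and a simple random sample of row numbers could be sketched in Python. This does not replace JMP, which is the required tool for the project. The function names, the example ID, and the seed are hypothetical; only the 1,308 responses and the “200 plus the last two digits” rule come from the instructions above.

```python
import random

POPULATION_SIZE = 1308  # number of responses in the full data file (from the instructions)


def sample_size_from_id(ut_id: str) -> int:
    """Return 200 plus the last two digits of a UT ID given as a string of digits."""
    return 200 + int(ut_id[-2:])


def draw_row_sample(n: int, population_size: int = POPULATION_SIZE, seed=None) -> list:
    """Draw n distinct 1-based row numbers without replacement, returned sorted."""
    rng = random.Random(seed)
    return sorted(rng.sample(range(1, population_size + 1), n))


# Hypothetical ID ending in 37 -> sample size 237
n = sample_size_from_id("000123437")
rows = draw_row_sample(n, seed=2019)
print(n, rows[-10:])  # sample size and the last 10 sampled row numbers
```

Sampling without replacement means no survey response can appear twice in the subset, which matches the idea of a simple random sample.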

2. In this question you will be analyzing a categorical variable.

Question Two

a) Choose a categorical variable with only two levels and pick a level you are interested in. Use JMP to summarize the variable you choose with a bar chart from Graph Builder and a table from Tabulate. Interpret your graphic and the percentages in the Tabulate output. Finally, make sure it is clear what level you are interested in analyzing. (8 points)

b) Do you believe the true proportion for your variable is 50%? Take a moment and briefly explain why you think the true proportion might or might not be 50%. Make sure to mention your understanding of the variable and why the truth for this value might or might not be 50%. NOTE: there is no “right” answer here, it’s just your opinion. (3 points)

c) Clearly state all three conditions you need to check regarding the data collected to run a test on the true proportion. Explain if these conditions are met. Assume that your sample is the sample and the population is all the students at The University of TN Knoxville. (6 points)

d) Write out the null and alternative hypotheses in statistical notation. After writing the statistical notation, include a short paragraph where you state the null and alternative in plain English. (6 points)

e) Use JMP to analyze the distribution, which will produce a bar chart and frequency table. Add to the graphic a 95% confidence interval for the true population proportion. Interpret your confidence interval in context of the problem below the output. (6 points)

f) Using your confidence interval, state your conclusion regarding the null hypothesis in context of the problem. It should be clear to the reader whether or not you believe 50% might be the true proportion. (4 points)
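The link between a confidence interval and the 50% claim in parts (e) and (f) can be sketched outside JMP. The example below is a minimal sketch that assumes the normal-approximation (Wald) interval with z = 1.96; JMP may compute its interval with a different method, so its numbers can differ slightly. The function name and the counts are hypothetical.

```python
import math


def wald_proportion_ci(successes: int, n: int, z: float = 1.96):
    """Normal-approximation (Wald) confidence interval for a proportion.

    z = 1.96 corresponds to roughly 95% confidence.
    """
    p_hat = successes / n
    margin = z * math.sqrt(p_hat * (1 - p_hat) / n)
    return p_hat - margin, p_hat + margin


# Hypothetical counts: 130 of 237 sampled students fall in the level of interest
low, high = wald_proportion_ci(130, 237)
contains_half = low <= 0.5 <= high
print(f"95% CI: ({low:.3f}, {high:.3f}); contains 0.50: {contains_half}")
```

If 0.50 lies inside the interval, the sample does not rule out 50% as the true proportion; if it lies outside, the null value of 50% is not plausible at that confidence level.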

NOTE- For the next question you will need to make an Excel version of the random sample data set you created earlier. Open your random sample data file in JMP, then PC: File>Save As, select “Excel Workbook” as the File Type. MAC: File>Export>Excel>Next>Export.


Page 3

3. In this question, you will be exploring students’ answers to question 15: “How many credit hours are you taking this semester?”

[For parts (a) through (i), once again assume you don’t have access to the full database (i.e., “the population”), but instead only have your sample. However, assume you do know that the population consists of all students at The University of TN Knoxville.]

a) From your random sample data file, for the variable Q15 - Credit Hours This Semester, get JMP to display a histogram, quantiles and summary statistics for this variable. Your histogram must be in “horizontal layout” and have a count axis. (3 points)

b) We must meet three conditions to perform a hypothesis test and construct a confidence interval for the population average of credit hours for UT students. Are the three conditions for doing this hypothesis test and calculating a valid confidence interval met in this case? State each condition. Clearly explain whether or not you think each condition is satisfied. Provide numerical justification where appropriate. (6 points)

c) Online research suggests that the mean number of credit hours a student takes per semester is 15. State the null and alternative hypotheses suggested by this statement, using proper mathematical notation. [Although in a report to a non-technical person, you typically would avoid proper mathematical notation, you must use correct mathematical notation here. Hint: on a PC, INSERT tab, Symbol. On a Mac, Insert tab, Advanced Symbols, choose Symbol font.] You can also copy/paste from symbols in this project (π). (4 points)

d) Regardless of your answers to part (b), perform this hypothesis test by hand. Use the insert equation option in Word to write out the equation and find the t-statistic. After finding the t-statistic, plot it to find the p-value on the appropriate t-distribution and take a screenshot to include in your report. The following applet can be used to take pictures. NOTE: For this problem you must complete the mathematics by hand and create the output using an applet. (9 points)

e) Using α = 0.05, state your conclusion regarding your null hypothesis. Be sure to state your conclusion in the context of the problem. (5 points)

f) Report your 95% confidence interval for μ, the population average credit hours of Stat 201 students. Report two decimal place accuracy here. (Hint: the 95% confidence interval is already displayed in the output you generated for part [a].) Interpret what this interval means in the context of this problem. (6 points)

g) Use equation editor in Word to show the computations for a 95% confidence interval in part (f). You do not need to solve this equation. You just need to write the math you would solve using the output JMP created. Make sure to include the right t-statistic which will have to be found using the applet. The t-statistic will be a different one than the one from before. Use two decimal place accuracy. (HINT: there is an easy and a hard way to do this! Either way is acceptable.) (3 points)


Page 4

h) From the Excel version of your random sample data file, use Excel to calculate Summary Statistics and the margin of error for a 95% confidence interval (see page 5 of this project for some hints on producing this output). Display within your report the Excel output you generated (it should look similar to the output shown on page 5 of this project). Report the numerical value of the margin of error Excel displayed. Does this reported value from Excel match the margin of error you calculated in part (g)? (6 points)

i) Since you performed a two-sided hypothesis test for μ using α = 0.05, and you created a 95% confidence interval for μ, there should be a direct correspondence between your decision about your null hypothesis and the values within your confidence interval. What value of α would you be testing at if you had constructed a 98% confidence interval? Also, what type of test would you be performing? (6 points)

j) In research people are often worried about a Type I or II error. Imagine a scenario where a researcher cannot collect much data. Which error will this directly impact? What is one change the researcher can make to increase the power of their test? What is something they cannot change that impacts power? (6 points)
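The hand computations asked for in Question 3, parts (d), (g), and (h), follow the usual one-sample t formulas: t = (x̄ − μ0) / (s / √n) with n − 1 degrees of freedom, and margin of error = t* × s / √n. The sketch below shows that arithmetic under stated assumptions: the summary statistics are hypothetical, the function names are illustrative, and the critical value t* is a placeholder that must be looked up (for example, with the applet) for your own degrees of freedom.

```python
import math


def one_sample_t(sample_mean: float, sample_sd: float, n: int, mu0: float):
    """Return (t statistic, degrees of freedom) for testing H0: mu = mu0."""
    standard_error = sample_sd / math.sqrt(n)
    return (sample_mean - mu0) / standard_error, n - 1


def mean_ci(sample_mean: float, sample_sd: float, n: int, t_star: float):
    """Return (margin of error, lower, upper) using the critical value t_star."""
    margin = t_star * sample_sd / math.sqrt(n)
    return margin, sample_mean - margin, sample_mean + margin


# Hypothetical summary statistics; replace with the values from your own output.
x_bar, s, n, mu0 = 14.6, 2.4, 225, 15.0
t_stat, df = one_sample_t(x_bar, s, n, mu0)

# t_star is a placeholder: roughly the 95% two-sided critical value for large df.
margin, low, high = mean_ci(x_bar, s, n, t_star=1.97)
print(f"t = {t_stat:.2f}, df = {df}")
print(f"margin of error = {margin:.2f}, CI = ({low:.2f}, {high:.2f})")
```

The t-statistic from the test and the critical value t* in the interval are different numbers: the first is computed from the data, while the second depends only on the confidence level and the degrees of freedom.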

Additional point values:

Project organization and flow (3 points)

Projects should look neat and organized. Use the crop tool in Word if you need to improve screenshots. Your project should read like a report without the prompt of each question.

Use of the guidelines on page 6 (3 points)

The opening paragraph of the project should give the reader a short summary (3-5 sentences) of the analysis that follows. The closing paragraph should summarize interesting findings and discuss any ideas you have regarding further data collection and/or analysis.


Page 5

JMP Tutorials and Excel Hints Needed for Each Question

Each entry below lists the question, the tutorial heading, the tutorial name, and any notes.

Question 1. Heading: Miscellaneous Topics. Tutorial: Taking a Simple Random Sample. Notes: The “Random – sample size” you use will be n=200.

Question 2a. Heading: Bar Chart. Tutorial: Making a Bar Chart.

Question 2a. Heading: Tabulate. Tutorial: Using Tabulate in JMP.

Question 2e. Heading: Inference About a Population Proportion. Tutorial: Confidence Interval for a Population Proportion. Notes: Your data are “unaggregated”, so you will place nothing in the Freq box.

Question 3a. Heading: Graphical Display of Quantitative Data. Tutorial: Histogram & Box Plot.

Question 3f. Heading: T-Tests (Confidence Intervals & Hypothesis Testing). Tutorial: One-sample t Test.

Question 3h. Heading: Using Excel. Notes: Only Excel for the PC or Excel 2016 for the Mac can perform this work. For both PC and Mac, the “Analysis ToolPak” Add-In must be active. PC: File – Options – Add-Ins – Manage Excel Add-ins – Go… – check Analysis ToolPak – OK. Mac: Tools – Excel Add Ins – check Analysis ToolPak – OK. PC and Mac: See further instructions in the PowerPoint slides, Chapter 14, slides 20-21.

[Figure: Example Screen Shot for Question 1]

[Figure: Example Excel Output for Question 3(h)]


Page 6

Writing a Good STAT 201 Project Report

Writing a report to your boss about a statistical analysis he has asked you to do is very different than writing a novel, or writing to your Statistics instructor. What does it take to write a good project report? Of course, it’s important to know your audience when you write anything.

Let’s assume you are writing your project report for some busy executives in the company, and they have asked you to answer the questions in the project. They are very intelligent people, but they are not “Statisticians”. Assume that these executives have had some basic statistical education, but perhaps a long time ago. Keep this in mind as you complete your project.

Below are some guidelines for writing an effective project report:

1. The first sentence or two of your report should “orient” the reader. What is this document about? Who is it from? What will you be covering? On what date did you complete the analysis?

2. Answer each question on the project instructions using correct sentence structure, spelling and grammar. Sentences should be succinct and clear. You can assume the executives have a copy of the questions they asked.

3. Avoid using "statistical jargon". Explain the results of the analysis in a way that the executives can understand it.

4. As explained in the project instructions, graphics from JMP and/or Excel that address the project question must be embedded within the document, at the point where the executives need to see them. Don’t make them hunt for the output at the back of your report.

5. Avoid including discussion and/or graphics within the report that have no relevance to the question being addressed.

6. Wrap-up and sign off: Give two to three sentences that showcase your meaningful findings and ideas regarding further data collection or analysis. The wrap-up should contain something meaningful related to the report you wrote and should not just restate results.

rev. 2019-11-20
