
MIDTERM EXAM – 1st December 2014
Statistics for Economics & Business – Domenico Vistocco

EXERCISE 1 Associate the numbered boxplots with the corresponding densities (using the numbers from 1 to 8):

Starting from the previous graphs, briefly comment on the features of each of the eight distributions and on the reasoning used to match the graphs:

DISTRIBUTION    Main distribution features

# 1

# 2

# 3

# 4

# 5

# 6

# 7

# 8

EXERCISE 2 In which of the following two scatterplots are the variables more strongly correlated? Briefly comment on and justify your answer:

A

B

EXERCISE 3 Guess a value for the correlation coefficient of each pair of variables shown in the following scatterplots, briefly justifying your answer:

(tip: answer by comparing the plots)

EXERCISE 4 Nobel Prize Winners by Age, 1901 to 2014 The Nobel Prizes and the Prize in Economic Sciences have been awarded 567 times to 889 people and organizations since 1901. The boxplots on the right depict the distribution of the Laureates’ ages for the different fields.

a) Briefly comment on the plot.

b) In 2014 Malala Yousafzai won the Nobel Peace Prize, making her the youngest winner in history. Identify the position of the corresponding point on the graph and comment on its particular feature.

c) What analysis should you use to study the relationship between Age and Field? Briefly describe the steps to follow to measure this relationship and explain the statistical index that you would use for this purpose.

EXERCISE 5 Read the following newspaper article carefully:

Player age in football - The clock is ticking (Source: The Economist) Jul 4th 2014, 11:36 BY S.H.

FRESH off their dramatic extra-time victories in the Round of 16, Argentina and Belgium are set to face off on July 5th in the World Cup quarterfinals. With Lionel Messi in top form, the Argentines are strongly favoured: the latest betting lines have Belgium as a three-to-one underdog. Then again, before the World Cup started, virtually all handicappers (including The Economist’s own journalists) projected that Spain would reach at least the tournament’s semifinals. Instead, the defending champions were the first team to get knocked out. And one of the key factors that did the Spaniards in (a roster that was among the oldest in the Cup) could easily undermine Argentina as well.

Following his squad’s early exit, Vicente del Bosque, the Spanish manager, dismissed concerns that his men were over the hill. “This is a mature team with players in their prime”, he insisted. On the surface, the results of the 2010 World Cup seem to confirm that he had little reason to worry. In that tournament, there was no statistically significant relationship between teams’ average age and their final standing. The two youngest teams were Spain and North Korea: one finished first, the other dead last.

However, it is hard to detect the impact of a factor like age using a sample of just 32 teams in a single World Cup, because so many other variables also influence performance. After all, Spain and North Korea differed in every meaningful way except for their average age. In order to isolate the age factor, we must compare teams of otherwise roughly similar skill. One simple way to control for overall quality is to limit the study to defending World Cup champions, all of whom were good enough to win a title four years before the tournament in question.

And within this group, age seems to have a remarkably strong impact. The single strongest factor that influenced their performance was probably the (close to) home-field advantage: teams that played on their own continent performed nearly six places better in the final standings than those that had to travel further afield. But after adjusting for the effect of geography, a one-year increase in average age was associated with a four-place drop in performance (see chart). In other words, if a reigning champion simply brought back its roster from four years before, its mean age would increase by four years, and it would be expected to finish a dismal 17th.

Although the sample of title defenders is small, the examples seem compelling. When Italy repeated as the victor in 1938 (it is still only one of two teams to win back-to-back Cups), it had the second-youngest team of any returning champion in tournament history. One-third of Cup victors won with an average age below 26, including Spain itself in 2010. Conversely, France in 2002 and Italy in 2010 sent two of the oldest squads, and neither won a single match.

Had the oddsmakers placed greater weight on this variable, they would have been far more bearish on Spain’s chances, and on Argentina’s. The players on this year’s edition of La Roja had an average age of 28, two years older than those who won in South Africa in 2010. Based on that factor alone, they would not even have been expected to reach the quarterfinals. Yet even this rather gray Spanish squad was not the oldest in the 2014 World Cup. That honour goes to Mr Messi and Co., who have the added misfortune of facing a Belgian team that is the tournament’s second-youngest.

Why do a few piddling birthdays seem to be the difference between triumph and collapse? While there is clearly some value to experience and mastering the intricacies of the game, the raw physical demands of football at the highest level have grown increasingly extreme. In the 1970s players ran a modest four km (2.5 miles) per match; today the figure is over ten. In most other continuous-play sports, managers have the flexibility to rest older veterans to keep them fresh for key moments: the San Antonio Spurs won the National Basketball Association this year by keeping their three biggest stars on the bench 43% of the time. But football’s limit of just three substitutions per match puts a premium on endurance above all else. And it takes a huge amount of guile and technique to compensate for even a small loss of foot speed or stamina. As a result, modern football players tend to peak between the ages of 23 and 25, and are usually well into their decline phase by their late 20s.

Managers are understandably reluctant to leave stars with a relatively recent record of success on the bench or off the team altogether. In addition to prompting an uproar from fans, promoting a green youngster over a battle-tested veteran could easily sow friction among players. But the evidence suggests that managers would be well-advised to kill their darlings at the first opportunity. For all but the most precocious or durable players, even a second World Cup appearance is probably one too many.

Guess the value of the slope estimated in this regression analysis, briefly justifying your answer:

EXERCISE 6 Passing by guessing. A quiz in a statistics course has four multiple-choice questions, each with five possible answers. A passing grade is three or more correct answers to the four questions. Allison has not studied for the quiz. She has no idea of the correct answer to any of the questions and decides to guess at random for each.

a. Find the probability she lucks out and answers all four questions correctly.

b. Find the probability that she passes the quiz.
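Answers to parts a and b can be checked numerically with a short sketch such as the one below. It assumes that the four guesses are independent and that each is correct with probability 1/5 (one correct option out of five), so the number of correct answers follows a binomial distribution. The helper name binom_pmf is illustrative only.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n independent trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n = 4      # questions on the quiz
p = 1 / 5  # assumed chance of guessing one question correctly

# Part a: all four answers correct.
p_all_four = binom_pmf(4, n, p)

# Part b: passing means three or four correct answers.
p_pass = sum(binom_pmf(k, n, p) for k in range(3, n + 1))

print(f"P(all four correct) = {p_all_four:.4f}")
print(f"P(pass)             = {p_pass:.4f}")
```

Under these assumptions the two probabilities come out at 0.0016 and 0.0272 respectively.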

EXERCISE 7 Death penalty and false positives For the decision about whether to convict someone charged with murder and give the death penalty, consider the variables reality (defendant innocent, defendant guilty) and decision (convict, acquit).

a. Explain what the two types of errors are in this context.

b. Jurors are asked to convict a defendant if they feel the defendant is guilty “beyond a reasonable doubt.” Suppose this means that given the defendant is executed, the probability that he or she truly was guilty is 0.99. For the 1234 people put to death from the time the death penalty was reinstated in 1977 until December 2010, find the probability that

(i) they were all truly guilty

(ii) at least one of them was actually innocent.

c. How do the answers in part b change if the probability of true guilt is actually 0.95?

EXERCISE 8 You’d like to estimate the proportion of the 14,201 (www.syr.edu/about/facts.html) undergraduate students at Syracuse University who are full-time students. You poll a random sample of 100 students, of whom 94 are full-time. Unknown to you, the proportion of all undergraduate students who are full-time students is 0.951. Let X denote a random variable for which x = 1 denotes full-time student and for which x = 0 denotes part-time student.

a. Describe the data distribution. Sketch a graph representing the data distribution.

b. Describe the population distribution. Sketch a graph representing the population distribution.

c. Find the mean and standard deviation of the sampling distribution of the sample proportion for a sample of size 100. Explain what this sampling distribution represents. Sketch a graph representing this sampling distribution.
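For part c of Exercise 8, the following sketch shows one way to compute the two quantities. It assumes the standard results that the sample proportion has mean p and standard deviation sqrt(p(1 - p)/n), and it ignores any finite-population correction for sampling 100 students out of 14,201. That simplification is an assumption here, not something stated in the exercise.

```python
from math import sqrt

p = 0.951  # population proportion of full-time students (given)
n = 100    # sample size (given)

# Mean of the sampling distribution of the sample proportion.
mean_phat = p

# Standard deviation (standard error) of the sample proportion,
# assuming independent draws (no finite-population correction).
sd_phat = sqrt(p * (1 - p) / n)

# The observed sample proportion, for comparison with the mean.
observed_phat = 94 / 100

print(f"mean of sample proportion = {mean_phat:.3f}")
print(f"sd of sample proportion   = {sd_phat:.4f}")
print(f"observed proportion       = {observed_phat:.2f}")
```

The standard deviation is roughly 0.022, so the observed proportion of 0.94 lies well within one standard deviation of the mean.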

EXERCISE 9 Canada lottery In one lottery option in Canada (Source: Lottery Canada), you bet on a six-digit number between 000000 and 999999. For a $1 bet, you win $100,000 if you are correct. The mean and standard deviation of the probability distribution for the lottery winnings are µ = 0.10 (that is, 10 cents) and σ = 100.00. Joe figures that if he plays enough times every day, eventually he will strike it rich, by the law of large numbers. Over the course of several years, he plays 1 million times. Let x̄ denote his average winnings.

a. Find the mean and standard deviation of the sampling distribution of x̄.

b. About how likely is it that Joe’s average winnings exceed $1, the amount he paid to play each time? Use the central limit theorem to find an approximate answer.
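The calculation asked for in Exercise 9 can be sketched as follows. The sketch assumes that the plays are independent, so the sample mean has standard deviation σ/sqrt(n), and it takes the normal approximation from the central limit theorem at face value, as the exercise asks. The upper-tail probability is computed with the complementary error function from the standard library.

```python
from math import erfc, sqrt

mu = 0.10       # mean winnings per play, in dollars (given)
sigma = 100.00  # standard deviation of winnings per play (given)
n = 1_000_000   # number of plays (given)

# Part a: mean and standard deviation of the sample mean.
mean_xbar = mu
sd_xbar = sigma / sqrt(n)

# Part b: z-score of an average of $1 per play, then the upper-tail
# probability P(Z > z) of a standard normal variable.
z = (1.00 - mean_xbar) / sd_xbar
p_exceed = 0.5 * erfc(z / sqrt(2))

print(f"mean of sample mean = {mean_xbar:.2f}")
print(f"sd of sample mean   = {sd_xbar:.2f}")
print(f"z-score for $1      = {z:.1f}")
print(f"P(average > $1)     = {p_exceed:.2e}")
```

With these inputs the z-score is about 9, so the approximate probability is vanishingly small.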