Survey – extra credits (1.5pt)! Study investigating general patterns of college students’...

Survey – extra credits (1.5pt)!

• Study investigating general patterns of college students’ understanding of astronomical topics

• There will be 3~4 surveys this semester.

• Anonymous survey (the accuracy of your responses will not affect your course grade). But, be accurate, please!

• Your participation is entirely voluntary.

• SPARK: Assessments > Survey2

• The second survey is due: 11:59pm, March 27th (Sun.)

• Questions? - Hyunju Lee ([email protected]) or Stephen Schneider ([email protected])

Funded by Hubble Space Telescope Education & Public Outreach grant

Uncertainty

• Ultimately, all measurements of physical quantities are subject to uncertainties.

• Variability in the results of repeated measurements arises because the variables that can affect the measurement results are impossible to hold constant.

• Even if the "circumstances" could be precisely controlled, measures would still have an error associated with them, because measuring instruments can only be manufactured to a finite level of quality (the infinitely accurate instrument is only a theoretical abstraction)

• Steps can be taken to limit the amount of uncertainty, but it will always be there, no matter how refined (and expensive) technology can be.

• So, the real goal of an experiment is to reduce the uncertainty in the measures to the degree that is needed to prove or disprove a theory

How to keep uncertainty at bay

• Even if errors cannot be eliminated, they can be understood, controlled and minimized. For example:

• Systematic errors: need to be thoroughly investigated to understand in which way they affect the experiment, and minimized (often this is the most difficult thing to do)

• Random Errors: if they “randomly” add to or subtract from the true value, then a very effective way to minimize them is to take repeated measures (under the same conditions) and then take the average.

• When taking the average of a large number of measures, the measures that fall below the true value compensate for those that fall above it, and the net result is a much better estimate of the true value

The Average

• Suppose you have N repeated measurements of the same physical quantity:

• x1, x2, x3, …, xN

• Their average is their sum divided by their number, namely

$$A = \frac{x_1 + x_2 + \dots + x_N}{N} = \frac{1}{N}\sum_{i=1}^{N} x_i$$

• The average is a much better estimator of the true value than any individual measurement

• The more numerous the measurements, the more accurately their average estimates the true value

Why does the average work? I

• Of course, this is no magic!

• Think of each measurement as the sum of the True Value (Tv) plus the Random Error (ε):

$$x_i = \mathrm{Tv} + \varepsilon_i$$

• In each measurement, the Random Error unpredictably adds to (positive ε) or subtracts from (negative ε) the True Value Tv

• The magnitude of ε, too, varies at random from measure to measure. Sometimes ε is big, other times small

Why does the average work? II

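• Examples of ε’s:

• -0.1, +0.15, -1.5, +1.3, -0.01, +0.7, +0.9, -0.005, +1.0…, you get the idea!

• When we add measures together to take the average A, some of the negative ε’s compensate for the positive ε’s:

$$A = \frac{1}{N}\sum_{i=1}^{N} (\mathrm{Tv} + \varepsilon_i) = \mathrm{Tv} + \frac{1}{N}\sum_{i=1}^{N} \varepsilon_i$$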

• If the sum of all the random errors (the ε’s) were zero, the average would be exactly equal to the True Value!

• But in the average of only a few measurements, the compensation is almost certainly crude:

• For example, the sum of the first two ε’s is -0.1+0.15 = +0.05

• The more measurements we average together, the better the compensation (i.e. the closer the average of all the errors is to zero)

Why many repeated measurements?

• In a large sample of measurements the likelihood of finding pairs of ε’s, one positive and the other negative, that have nearly exactly the same absolute value is high:

• E.g.: -0.301 and +0.299; -0.001 and +0.0009

• In other words, large numbers of measurements explore more thoroughly all the possible realizations (i.e. all the possible values) of the random errors.

• As long as the errors are symmetrically distributed relative to zero, i.e. there are as many positive ε’s as negative ones, and they have the same absolute values, the compensation will be nearly perfect.

• Larger numbers of measurements have a larger information content than smaller ones because they contain more realizations of the random errors.

• The important thing to know: the distribution of the errors

The probability distribution: how many times is a measure with a given value observed?

• Measures closer to the true value are more likely to occur

• In other words, measures with smaller random errors are more likely than those with larger errors (blue histogram: peaked probability distribution).

• Greatly deviant measures can be found, too, only more rarely

• By contrast, uniformly random numbers do not cluster around a peak: all values are equally likely (red histogram: flat probability distribution). E.g.: tossing a die
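
The difference between a peaked and a flat distribution can be sketched with a few lines of Python. This is only an illustration, not the code behind the slide's histograms: the Gaussian shape of the peaked case, its width of 0.5, the interval [-2, 2), the bin count and the seed are all assumptions chosen for the example.

```python
import random

rng = random.Random(0)   # arbitrary fixed seed so the run is repeatable
N = 10_000               # number of simulated values (assumption)
BINS = 8                 # equal-width bins covering [LOW, HIGH)
LOW, HIGH = -2.0, 2.0

def histogram(values):
    """Count how many values fall in each bin; values outside [LOW, HIGH) are ignored."""
    counts = [0] * BINS
    width = (HIGH - LOW) / BINS
    for v in values:
        if LOW <= v < HIGH:
            index = min(int((v - LOW) / width), BINS - 1)
            counts[index] += 1
    return counts

# Peaked case: random errors assumed Gaussian around zero with width 0.5.
peaked = histogram(rng.gauss(0.0, 0.5) for _ in range(N))
# Flat case: every value in the interval is equally likely.
flat = histogram(rng.uniform(LOW, HIGH) for _ in range(N))

print("bin   peaked   flat")
for i in range(BINS):
    print(f"{i:>3}   {peaked[i]:>6}   {flat[i]:>4}")
```

In the peaked column the counts pile up in the central bins and fall off quickly toward the edges, while the flat column stays roughly constant from bin to bin.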

How big can errors be? The dispersion

• But how large an error can one encounter?

• That depends on how good the measures are.

• This is reflected in the width of the distribution function

• The larger the width of the distribution, the larger the probability of having big random errors

• The width (a.k.a. dispersion) is estimated by the standard deviation:

$$\sigma = \sqrt{\frac{1}{N}\sum_{i=1}^{N} (x_i - A)^2}$$

• …here A is the average of the N measurements x1, x2, …, xN

• In other words, one takes the average of the errors squared!

• Why squared? Because squared numbers are positive, so that the terms in the sum do not compensate for each other. One wants to know the average error magnitude!

• The standard deviation is roughly the typical absolute value of the error (the final square root compensates for having taken the square of the errors)

The class’ measurements of the period of the pendulum

The average of many measures is a much more accurate estimator of the true value than the individual measures.

The standard deviation is an estimate of the typical error

3.00, 2.72, 2.94, 2.87, 2.69, 2.97, 2.94, 2.72, 2.94, 2.91, 2.87, 2.97, 2.85, 2.60, 2.50, 2.60, 3.00, …
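
As a rough check, the average and standard deviation can be computed for the values that are legible above. This is a minimal sketch: the slide's list is truncated, so these 17 numbers are only part of the class data, the unit is assumed to be seconds, and the function names are made up for the example. The standard deviation divides by N, matching "the average of the errors squared".

```python
import math

# The 17 periods legible on the slide (assumed to be in seconds).
# The slide's list is truncated, so this is only a subset of the class data.
periods = [3.00, 2.72, 2.94, 2.87, 2.69, 2.97, 2.94, 2.72, 2.94,
           2.91, 2.87, 2.97, 2.85, 2.60, 2.50, 2.60, 3.00]

def average(values):
    """Sum of the values divided by their number."""
    return sum(values) / len(values)

def standard_deviation(values):
    """Square root of the average squared deviation from the average."""
    a = average(values)
    return math.sqrt(sum((x - a) ** 2 for x in values) / len(values))

A = average(periods)
sigma = standard_deviation(periods)
print(f"average = {A:.3f} s, standard deviation = {sigma:.3f} s")
```

Individual measurements in the list differ from each other by up to 0.5, while the standard deviation summarizes that scatter as a single typical error.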

The importance of repeated measures

• Let’s do a computer simulation

• i.e. let’s simulate measures of the period of the pendulum.

• The true value is 1.000 sec, and the typical random error of the measure is 0.5 sec.

• (…but let’s pretend we do not know this).

• Then let’s take three sets of data, one set with 100 measurements, one with 1,000 and one with 10,000

• Then let’s compare the results, like calculating the average, standard deviation, and plotting the distribution of the measures
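
A simulation along these lines might look like the sketch below. It is not the program used in class: it assumes the random errors are Gaussian with a standard deviation equal to the 0.5 sec "typical random error", and the seed and function names are arbitrary choices for the example.

```python
import math
import random

TRUE_PERIOD = 1.000   # seconds (from the slide)
TYPICAL_ERROR = 0.5   # seconds (from the slide), used here as the Gaussian sigma

def simulate(n, rng):
    """Return n simulated measurements: true value plus a random Gaussian error."""
    return [TRUE_PERIOD + rng.gauss(0.0, TYPICAL_ERROR) for _ in range(n)]

def summarize(values):
    """Return (average, standard deviation) of the measurements."""
    a = sum(values) / len(values)
    s = math.sqrt(sum((x - a) ** 2 for x in values) / len(values))
    return a, s

rng = random.Random(42)  # arbitrary fixed seed so the run is repeatable
results = {}
for n in (100, 1_000, 10_000):
    results[n] = summarize(simulate(n, rng))
    a, s = results[n]
    print(f"{n:>6} measures: average = {a:.3f} s, standard deviation = {s:.3f} s")
```

Changing the seed changes the individual numbers, but the averages of the larger samples should tend to land closer to 1.000, while the standard deviation stays near 0.5 for every sample size.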

The larger the sample of measures, the more accurate the average

100 measures

1,000 measures

10,000 measures

10,000 measures contain much more information than 1,000 measures, which contain much more information than 100 measures, which contain much more information than 1 measure!

That is why large samples are important

Distributions in Astronomy: the distribution of galaxy luminosity

The Distribution of Galaxy Luminosity

• There are many more faint galaxies than bright ones

• In fact, the number of very bright ones is uncertain, because they are rare

• (step the telescope a bit to the right, and you lose the two very bright ones)

• The distribution of galaxy luminosity (and colors) contains key information on how these systems formed and developed

Bright End Faint End

More on distributions

• When tossing one die, what is the distribution of probability of each value?

• A: all values are equally likely

• B: some values are more likely than others

• When tossing two dice, what is the distribution of probability of the sum of the two values?

• A: all sums are equally likely

• B: some sums are more likely than others

• Let’s try…

The answer: can you explain why?

• Each face of a die has equal probability

• However, the probability of getting 2 or 12 is low, because each can be made in only one way: 1+1 or 6+6.

• Much more likely are numbers that can be made by more combinations, such as 7, which can be made as 1+6, 2+5, 3+4 (and the same pairs with the two dice swapped).
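
The counting argument can be checked by listing every possible outcome of two dice. This is a small illustrative sketch, assuming two fair six-sided dice; the variable names are made up for the example.

```python
from collections import Counter
from itertools import product

faces = range(1, 7)

# Every ordered pair (first die, second die) is equally likely: 36 pairs in all.
ways = Counter(a + b for a, b in product(faces, repeat=2))

# Print a text histogram: one '#' per combination that produces each sum.
for total in range(2, 13):
    print(f"{total:>2}: {ways[total]} ways  {'#' * ways[total]}")
```

The histogram is a triangle: one way each for 2 and 12, rising to six ways for 7, so the sums are not equally likely even though each face is.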

Information content of distributions

• The shape of the distribution tells us which values are more likely to be found, and which are less likely

• In a Gaussian (bell-shaped) distribution of measures, the most likely value of all is the true value

• The least likely values are those with big deviations from the true value

• Unfortunately, medium-size deviations are likely, too.

• This is why the standard deviation is a good indicator of the random error

• The average is a powerful way to “average out” these deviations
