OezbekC07-How to Lie With Statistics


  • 8/3/2019 OezbekC07-How to Lie With Statistics

    1/37

    Christopher Oezbek, Lutz Prechelt

    Course "Empirical Evaluation in Informatics"

    Christopher Oezbek, Prof. Dr. Lutz Prechelt

    Freie Universität Berlin, Institut für Informatik

    http://www.inf.fu-berlin.de/inst/ag-se/

    How to lie with statistics

    What do they mean?

    Biased measures

    Biased samples

    What is the real reason?

    Misleading averages

    Misleading visualizations

    Pseudo-precision

    Plain false statements

    What is not being said?

    "Just try again"

    Incomparable measures

    Invalid measures

    Course "Empirische Bewertung in der Informatik" (German title of the course)

    Christopher Oezbek, Prof. Dr. Lutz Prechelt

    Freie Universität Berlin, Institut für Informatik

    http://www.inf.fu-berlin.de/inst/ag-se/

    How to lie with statistics (German title: "Wie man mit Statistik lügt")

    What is actually meant?

    Is the measure used biased?

    Is the sample selection biased?

    Is that really the reason?

    Misleading averages

    Misleading presentations

    Pseudo-precision

    Plain false statements

    What is not being said?

    "Just try again"

    Incomparable data

    Validity of measures


    Source

    This slide set was created roughly along the lines of

    Darrell Huff: "How to Lie With Statistics" (Victor Gollancz 1954, Pelican Books 1973, Penguin Books 1991)

    I urge everyone to read this book in full

    It is short (120 p.), entertaining, and insightful

    Many different editions available

    Other, similar books exist as well


    Example: Human Growth Hormone (HGH)


    Remark

    We use this real spam email as an arbitrary example and will make unwarranted assumptions about what is behind it, for illustrative purposes

    I do not claim that HGH treatment is useful, useless, or harmful

    Note:

    HGH is on the IOC doping list: http://www.dshs-koeln.de/biochemie/rubriken/01_doping/06.html

    "Currently, only two major clinical conditions qualify for the therapeutic use of HGH: dwarfism in children and HGH deficiency in adults"

    "The effectiveness of HGH in athletes must, however, be seriously questioned so far, since no scientific study to date has been able to show that additional HGH administration in persons with normal HGH production can lead to increased performance."


    Problem 1: What do they mean?

    "Body fat loss: up to 82%"
    OK, can be measured

    "Wrinkle reduction: up to 61%"
    Maybe they count the wrinkles and measure their depth?

    "Energy level: up to 84%"
    What is this?

    Also note they use language loosely:

    Loss in percent: OK; reduction in percent: OK

    Level in percent??? (should be 'increase')


    Lesson: Dare to ask "what?"

    Always question the definition of the measures for which somebody gives you statistics

    Surprisingly often, there is no stringent definition at all

    Or multiple different definitions are used and incomparable data get mixed

    Or the definition has dubious value

    e.g. "Energy level" may be a subjective estimate of patients who knew they were treated with a "wonder drug"


    Problem 2: A maximum does not say much

    Wrinkle reduction: up to 61%

    So that was the best value. What about the rest?

    Maybe the distribution was like this:

    [Figure: dot plot of reduction, axis from 0 to 60, mean marked M]


    Lesson: Dare to ask for unbiased measures

    Always ask for neutral, informative measures, in particular when talking to a party with a vested interest

    Extremes are rarely useful to show that something is generally large (or small)

    Averages are better

    But even averages can be very misleading (see the example later in this presentation)

    If the shape of the distribution is unknown, we need summary information about variability at the very least

    e.g. the data from the plot in the previous slide has arithmetic mean 10 and standard deviation 8

    Note: In different situations, rather different kinds of information might be required for judging something
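
The gap between an "up to" figure and a neutral summary can be checked in a few lines. The following is a minimal sketch in Python; the values are invented for illustration and are not the data behind the plot on the previous slide.

```python
import statistics

# Hypothetical wrinkle-reduction values in percent (invented for illustration).
reductions = [2, 3, 5, 5, 6, 7, 8, 8, 9, 10, 10, 11, 12, 14, 15, 61]

best = max(reductions)                  # the "up to" figure an advertiser would quote
mean = statistics.mean(reductions)      # a more neutral summary of the typical effect
spread = statistics.stdev(reductions)   # sample standard deviation (variability)

print(f"up to {best}%, mean {mean:.1f}%, standard deviation {spread:.1f}%")
```

A single extreme value determines the "up to" claim completely, while the mean and the standard deviation describe what most of the sample looks like.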


    Problem 3: Underlying population

    Wrinkle reduction: up to 61%

    Maybe they measured a very special set of people?

    [Figure: dot plots of reduction for the groups "healthy" and "heartAttack", axis from -20 to 60, each mean marked M]

    Note: This data is pure fantasy!


    Lesson: Insist on unbiased samples

    How and from where the data was collected can have a tremendous impact on the results

    It is important to understand whether there is a certain (possibly intended) tendency in this

    A fair statistic talks about the possible bias it contains

    If it does not, ask.

    Notes:

    A biased sample may be the best one can get

    Sometimes we can suspect that there is a bias, but cannot be sure
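
How strongly the choice of sample can move a result is easy to simulate. The sketch below is only an illustration: the population, the two subgroups, and their response levels are assumptions made up for this example (pure fantasy, like the plots).

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is reproducible

# Hypothetical population: most people respond weakly,
# a small special subgroup responds strongly.
general = [random.gauss(5, 5) for _ in range(900)]
special = [random.gauss(30, 10) for _ in range(100)]
population = general + special

fair_sample = random.sample(population, 50)   # drawn from everybody
biased_sample = random.sample(special, 50)    # drawn only from the special subgroup

fair_mean = statistics.mean(fair_sample)
biased_mean = statistics.mean(biased_sample)

print(f"fair sample mean:   {fair_mean:.1f}")
print(f"biased sample mean: {biased_mean:.1f}")
```

Both numbers are honest averages of real measurements; only the way the sample was selected differs.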


    Problem 4: Is HGH even part of the cause?

    Wrinkle reduction: up to 61%

    Maybe that could happen even without HGH?

    [Figure: dot plots of reduction for the groups "h.A., no HGH", "healthy", and "heartAttack", axis from -20 to 60, each mean marked M]

    Note: This data is pure fantasy!


    Lesson: Question causality

    Sometimes the data is not just biased, it contains hardly anything other than bias

    If somebody presents you with a presumably causal relationship ("A causes B"), ask yourself

    What other influences besides A may be important?

    What is the relative weight of A compared to these?
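
One way to put a number on "the relative weight of A" is to compare against a group that did not receive A. The following sketch uses invented values (they are assumptions for illustration, not measurements) for two comparable groups of patients.

```python
import statistics

# Hypothetical wrinkle reduction in percent (invented numbers).
with_hgh = [22, 25, 19, 30, 27, 24, 21, 28]      # patients treated with HGH
without_hgh = [20, 24, 18, 29, 25, 23, 20, 27]   # comparable patients, no HGH

treated_mean = statistics.mean(with_hgh)
control_mean = statistics.mean(without_hgh)
difference = treated_mean - control_mean

print(f"reduction with HGH:    {treated_mean:.2f}%")
print(f"reduction without HGH: {control_mean:.2f}%")
print(f"difference between the groups: {difference:.2f} percentage points")
```

In this made-up case almost all of the observed reduction also occurs without the treatment, so quoting the treated group alone would be misleading.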


    Example 2: Tungu and Bulugu

    We look at the yearly per-capita income in two small hypothetical island states: Tungu and Bulugu

    Statement: "The average yearly income in Tungu is 94.3% higher than in Bulugu"


    Problem 1: Misleading averages

    The island states are rather small: 81 people in Tungu and 80 in Bulugu

    And the income distribution is not as even in Tungu:

    [Figure: dot plots of income for Bulugu and Tungu, axis from 0 to 5000, each mean marked M]

    Note: This data is pure fantasy!


    Misleading averages and outliers

    The only reason is Dr. Waldner, owner of a small software company in Berlin, who has been enjoying his retirement in Tungu since last year

    [Figure: dot plots of income for Bulugu and Tungu on a logarithmic axis from 10^3.0 to 10^5.0, each mean marked M]


    Lesson: Question appropriateness

    A certain statistic (very often the arithmetic average) may be inappropriate for characterizing a sample

    If there is any doubt, ask that additional information be provided

    such as standard deviation

    or some quantiles, e.g. 0, 0.25, 0.5, 0.75, 1

    Note: the 0.25 quantile is equivalent to the 25th percentile, etc.

    [Figure: dot plot of income in Tungu, mean marked M]
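
The effect of a single outlier on the mean, and the robustness of quantiles, can be reproduced with a small sketch. The 80 ordinary incomes below are invented for illustration; only the 160,000 of the rich retiree is taken from the slides.

```python
import statistics

# Hypothetical incomes: 80 ordinary islanders plus one very rich retiree.
ordinary = [1500 + 15 * i for i in range(80)]   # 1500, 1515, ..., 2685
incomes = ordinary + [160_000]

mean = statistics.mean(incomes)
median = statistics.median(incomes)
q1, q2, q3 = statistics.quantiles(incomes, n=4)  # 0.25, 0.5, and 0.75 quantiles

print(f"mean:     {mean:.0f}")
print(f"median:   {median:.0f}")
print(f"quartiles: {q1:.0f}, {q2:.0f}, {q3:.0f}")
print(f"min/max:  {min(incomes)}, {max(incomes)}")
```

In this sketch the mean is higher than the income of every islander except the retiree, whereas the median and the quartiles hardly notice him.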


    Logarithmic axes

    Waldner earns 160,000 per year. How much more that is than what the other Tunguans earn is impossible to see on the logarithmic axis we just used

    [Figure: dot plots of income for Bulugu and Tungu on a linear axis from 0 to 150000, each mean marked M]


    Lesson: Beware of inappropriate visualizations

    Logarithmic axes are useful for reading hugely different values from a graph with some precision

    But they completely defeat our intuition about how large the differences are

    There are many more kinds of inappropriate visualizations (see later in this presentation)
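
Why a logarithmic axis hides the size of a difference can be seen by computing where a value is drawn. In this sketch the axis minimum of 1,000 and the "typical" income of 2,000 are assumptions chosen for illustration; the 160,000 is the figure from the slides.

```python
import math

def axis_position(value, low, high, log_scale=False):
    """Relative position (0.0 to 1.0) of `value` on an axis running from `low` to `high`."""
    if log_scale:
        value, low, high = math.log10(value), math.log10(low), math.log10(high)
    return (value - low) / (high - low)

low, typical, waldner = 1_000, 2_000, 160_000

linear_pos = axis_position(typical, low, waldner)
log_pos = axis_position(typical, low, waldner, log_scale=True)

print(f"linear axis: typical income sits at {linear_pos:.1%} of the axis width")
print(f"log axis:    typical income sits at {log_pos:.1%} of the axis width")
print(f"actual ratio Waldner / typical: {waldner / typical:.0f}x")
```

On the linear axis the typical income is squeezed against the left edge, which matches the factor of 80; on the logarithmic axis it sits visibly inside the plot, so the eye underestimates the gap.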


    Problem 3: Misleading precision

    "The average yearly income in Tungu is 94.3% higher than in Bulugu"

    Assume that tomorrow Mrs. Alulu Nirudu from Tungu gives birth to her twins

    There are now 83 rather than 81 people on Tungu

    The average income drops from 3922 to 3827

    The difference to Bulugu drops from 94.3% to 89.7%
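
The arithmetic behind this drop can be retraced as a rough sketch. Bulugu's average is not given on the slide, so it is reconstructed here from the two rounded figures (an assumption); because of that rounding, the last digit of the result may differ slightly from the slide's value.

```python
people_before = 81
mean_before = 3922     # average income in Tungu, from the slide
lead_before = 0.943    # Tungu's lead over Bulugu, from the slide

# Assumption: Bulugu's average reconstructed from the rounded slide figures.
bulugu_mean = mean_before / (1 + lead_before)

total_income = mean_before * people_before
people_after = people_before + 2           # the newborn twins have no income
mean_after = total_income / people_after
lead_after = mean_after / bulugu_mean - 1

print(f"average income: {mean_before} -> {mean_after:.0f}")
print(f"lead over Bulugu: {lead_before:.1%} -> {lead_after:.1%}")
```

Two births shift the result by several percentage points, so the decimal place in "94.3%" carries no real information.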


    Lesson: Do not be easily impressed

    The usual reason for presenting very precise numbers is the wish to impress people

    "Round numbers are always false"

    But round numbers are much easier to remember and compare

    Clearly tell people you will not be impressed by precision

    in particular if the precision is purely imaginary


    Example 3: Phantasmo Corporation stock price

    We look at the recent development of the price of shares for Phantasmo Corporation

    "Phantasmo shows a remarkably strong and consistent value growth and continues to be a top recommendation"

    [Figure: stock price (y axis from 180 to 192) over day (x axis from 0 to 400)]

    (Phantasmo and this data are purely imaginary)


    Problem: Looks can be misleading

    The following two plots show exactly the same data!

    and the same as the plot on the previous slide!

    [Figures: two plots of stock price (y axis from 180 to 192) over day (x axis from 0 to 400)]


    Problem: Scales can be misleading

    What really happened is shown here

    We intuitively interpret a trend plot on a ratio scale

    [Figures: stock price over day (x axis from 0 to 400), once with the y axis from 180 to 192 and once with the y axis from 0 to 200]
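
How much a truncated axis exaggerates a trend can be quantified. The sketch below assumes a start price of 181 and an end price of 191 (hypothetical values chosen to fit inside the axis range shown) and compares the drawn heights on an axis from 180 to 192 with those on an axis from 0 to 200.

```python
def drawn_height(value, axis_min, axis_max):
    """Fraction of the plot height at which `value` is drawn."""
    return (value - axis_min) / (axis_max - axis_min)

start_price, end_price = 181.0, 191.0   # assumed prices, for illustration only

actual_growth = end_price / start_price - 1

# Apparent ratio of end height to start height on each axis.
truncated_ratio = drawn_height(end_price, 180, 192) / drawn_height(start_price, 180, 192)
zero_based_ratio = drawn_height(end_price, 0, 200) / drawn_height(start_price, 0, 200)

print(f"actual growth: {actual_growth:.1%}")
print(f"axis 180 to 192: the end point is drawn {truncated_ratio:.1f} times as high as the start")
print(f"axis 0 to 200:   the end point is drawn {zero_based_ratio:.2f} times as high as the start")
```

Only on the zero-based axis does the ratio of drawn heights equal the ratio of the prices, which is what a reader intuitively assumes.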


    Problem: Scales can be missing

    The most insolent persuaders may even leave the scale out altogether

    [Figure: the same plot over day (x axis from 0 to 400) without a y axis scale]


    Problem: Scales can be abused

    Observe the global impression first

    Problem: People may invent unexpected things

    Source: advertisement by Donau-Universität Krems, DIE ZEIT, 07.10.2004


    Lesson: Seeing is believing

    but often it shouldn't be

    Always consider what it really is that you are seeing

    Do not believe anything purely intuitively

    Do not believe anything that does not have a well-defined meaning


    Example 4: blend-a-med Night Effects

    What do they not say?

    What exactly does "sichtbar" ("visible") mean?

    What were the results of the clinical trials?

    What other effects does Night Effects have?

    Example 6: economic growth (D vs. USA)

    On 2003-10-30, the US Bureau of Economic Analysis (BEA) announced

    USA economic growth in 3rd quarter: 7.2%

    Assume that on the same day the German Statistisches Bundesamt had announced

    D economic growth in 3rd quarter: 2%

    (Note: This value is fictitious)

    Note: Both values refer to gross domestic product (GDP, "Brutto-Inlandsprodukt", BIP)

    Which economy was growing faster?


    Problem: Different definitions

    The US BEA extrapolates the growth for each quarter to a full year

    Statistisches Bundesamt does not

    Thus, the actual US growth factor during (from start to end of) this quarter was only x, where x^4 = 1.072.

    x = 1.0175

    US growth was only 1.75% in this quarter
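
The conversion between an annualized rate and the actual quarterly rate is a one-line calculation. The sketch below retraces it for the US figure and, as an additional illustration not on the slide, annualizes the fictitious German value the same way so that both numbers follow one definition.

```python
annualized_us = 0.072                       # US figure as announced (annualized)
quarterly_factor = (1 + annualized_us) ** (1 / 4)
quarterly_us = quarterly_factor - 1         # actual growth within the quarter

quarterly_de = 0.02                         # fictitious German value from the slide
annualized_de = (1 + quarterly_de) ** 4 - 1

print(f"US:      {annualized_us:.1%} annualized = {quarterly_us:.2%} per quarter")
print(f"Germany: {quarterly_de:.1%} per quarter = {annualized_de:.1%} annualized")
```

Compared on either definition, the fictitious German figure is the larger one, which is the opposite of what the two published numbers suggest.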


    Real-world example: 25-fold reliability

    "Warum billigere Tintenpatronen verwenden, wenn Original HP Tinten bis zu 25-mal zuverlässiger sind?"

    "Why use cheaper ink cartridges when genuine HP ink is up to 25 times more reliable?"


    25-fold reliability explanation

    DOA: Dead-on-arrival (


    25-fold reliability explanation (2)

    Percentage of PF cartridges (less than 75% of the avg. capacity of all cartridges) per brand

    [Figure: x axis "size" from 0 to 120, y axis "percent" from 0 to 50]


    25-fold reliability explanation (3)

    More problems with this data:

    52/120 = 43% is what they used

    52/103 = 50% is right if PF excludes DOA (as claimed)

    (52-17)/103 = 34% is right if PF includes DOA
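
The three percentages can be recomputed as a quick check. In this sketch the count of 17 dead-on-arrival cartridges is inferred as 120 - 103, and reading 120 as all tested cartridges and 103 as those that were not dead on arrival is an assumption about what the numbers mean.

```python
total = 120          # denominator used in the advertisement's figure
working = 103        # denominator once DOA cartridges are excluded
doa = total - working   # inferred number of dead-on-arrival cartridges (17)
pf = 52              # cartridges counted as PF

used = pf / total                      # what they used
pf_excludes_doa = pf / working         # right if the 52 do not contain DOA cartridges
pf_includes_doa = (pf - doa) / working # right if the 52 do contain the DOA cartridges

print(f"as used:            {used:.0%}")
print(f"if PF excludes DOA: {pf_excludes_doa:.0%}")
print(f"if PF includes DOA: {pf_includes_doa:.0%}")
```

Depending on the definition, the same raw counts give anything from about a third to about a half, and the published figure matches neither consistent reading.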


    Thank you!