This report is very disappointing. What kind of software are you using?


Space Age and Stone Age Syndrome

• Data: Space Age/Stone Age

• Analysis: Space Age/Stone Age

Data:                 Space Age   Stone Age
Analysis Space Age:       +           +
Analysis Stone Age:       +


Life and Death with Averages and Variability

Happy Hunter:

First shot: one inch on the left of the animal. Second shot: one inch on the right of the animal.

So, on the average, shot on the spot; a perfect average shot!

Senior Lawyer:

Initially in my career I lost some cases I should have won. Lately in my career I have won some cases I should have lost. So, on the average, justice has been accomplished.


Life and Death with Averages and Variability

• Happy Hunter: First shot, one inch on the left of the animal; second shot, one inch on the right of the animal. So, on the average, shot on the spot; a perfect average shot.

• Tourist: I wish to cross the river. I cannot swim. Can you help?
  – Native: Certainly! Average depth of this river around here is known to be well below three feet. You look to be six.
  – Tourist: You are encouraging, and yet not quite helpful. Depth is usually uneven. Variability sure is a matter of life and death.
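The hunter and the tourist make the same point: a mean can look reassuring while the individual values are not. The following is a minimal Python sketch. The two shots come from the slide; the river depth profile is purely hypothetical, invented only so that the mean is well below three feet while one spot is deeper than the tourist is tall.

```python
from statistics import mean

# Hunter: horizontal miss distance in inches (negative = left, positive = right).
shots = [-1.0, 1.0]
mean_miss = mean(shots)                      # the "perfect average shot"
closest_miss = min(abs(s) for s in shots)    # yet every shot missed by an inch

# Tourist: HYPOTHETICAL depths in feet along the crossing (assumed values).
depths_ft = [1.0, 1.5, 2.0, 8.0, 1.5, 1.0]
tourist_height_ft = 6.0
mean_depth = mean(depths_ft)
max_depth = max(depths_ft)

print(f"mean miss = {mean_miss} in, closest miss = {closest_miss} in")
print(f"mean depth = {mean_depth} ft, maximum depth = {max_depth} ft")
print("safe to wade across:", max_depth < tourist_height_ft)
```

The decision that matters for the tourist depends on the maximum depth, not the mean, which is why a summary of variability has to accompany the average.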


Life and Death with Averages and Variability

• Birds: Concerned about the typical direction in which disoriented birds of a certain species fly, someone goes out in an open field, stands facing north, and observes a bird vanish at the horizon at an angle of 10 degrees. A little later, he sees a second bird vanish at the horizon at an angle of 350 degrees. What can be said of the typical direction based on this evidence? After these data are submitted to a computer with a request for the average direction, the software returns (10 + 350)/2 = 180 degrees. The report concludes that, on average, the birds are flying south. Of course, the exact opposite is true: both birds flew almost due north. Directional data demand correct and appropriate software.
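One standard way to average directions is to treat each angle as a unit vector, add the vectors, and take the direction of the resultant. The sketch below contrasts that circular mean with the arithmetic mean for the two bearings in the example. The function name `circular_mean_deg` is an illustrative choice, not something taken from the slides.

```python
import math

def circular_mean_deg(angles_deg):
    """Mean direction in degrees, in [0, 360), from the resultant of unit vectors."""
    s = sum(math.sin(math.radians(a)) for a in angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg)
    deg = math.degrees(math.atan2(s, c))
    # Rounding suppresses floating-point noise (e.g. -1e-15), which would
    # otherwise wrap around to just under 360 instead of 0.
    return round(deg, 9) % 360.0

bearings = [10.0, 350.0]                      # degrees clockwise from north
naive_mean = sum(bearings) / len(bearings)    # arithmetic mean: points south
circular_mean = circular_mean_deg(bearings)   # direction of the resultant: north

print("arithmetic mean:", naive_mean)
print("circular mean:", circular_mean)
```

The arithmetic mean ignores the fact that 350 degrees and 10 degrees are only 20 degrees apart on the circle; the resultant-vector mean does not.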


Blind Men and the Elephant

by J. G. Saxe (1816‑1887)

It was six men of Indostan
To learning much inclined,
Who went to see the Elephant
(Though all of them were blind),
That each by observation
Might satisfy his mind.

The First approached the Elephant,
And happening to fall
Against his broad and sturdy side,
At once began to bawl:
“God bless! but the Elephant
Is very like a wall!”

The Second, feeling of the tusk,
Cried, “Ho! what have we here
So very round and smooth and sharp?
To me 'tis mighty clear
This wonder of an Elephant
Is very like a spear!”

The Third approached the animal,
And happening to take
The squirming trunk within his hands,
Thus boldly up and spake:
“I see,” quoth he, “the Elephant
Is very like a snake!”

The Fourth reached out an eager hand,
And felt about the knee,
“What most this wondrous beast is like
Is mighty plain,” quoth he:
“'Tis clear enough the Elephant
Is very like a tree!”

The Fifth who chanced to touch the ear,
Said: “E'en the blindest man
Can tell what this resembles most;
Deny the fact who can,
This marvel of an Elephant
Is very like a fan!”

The Sixth no sooner had begun
About the beast to grope,
Than, seizing on the swinging tail
That fell within his scope,
“I see,” quoth he, “the Elephant
Is very like a rope!”

And so these men of Indostan
Disputed loud and long,
Each in his own opinion
Exceeding stiff and strong,
Though each was partly in the right
And all were in the wrong!


Comprehensive vs. Comprehensible:

1. For lack of information, we do not quite comprehend the situation.
2. We therefore collect information, tending to collect comprehensive information.
3. Because the information is comprehensive, we do not quite comprehend it.
4. Therefore we summarize the information through a set of indices (statistics) so that it would be comprehensible.
5. Now, however, we do not comprehend quite what the indices exactly mean.
6. Therefore we do not quite comprehend the situation.
7. Thus, without (all) information, or with (partial) information, or with summarized information, we do not quite comprehend a situation!

This dilemma is not to suggest a bleak picture for one's ability to understand, predict, or manage a situation in the face of uncertainty. It is more to suggest a need to clearly state the purpose, formulation, and solution for the study under consideration, in line with Data Quality Objectives.


How Many of Them are Out There

This scenario takes place in a court of law.

The issue is about the abundance of a species seemingly endangered, threatened, or rare. The judge orders an investigation. A seasoned investigator conducts the survey. He reports having seen 75 individual members of the species under consideration.

The judge invites comments.

Industrial Lobby: The reported record of 75 members makes sense. The visibility factor is low in such surveys. The investigator has surely missed some of them that are out there. The exploitation should not cause alarm.

Environmental Lobby: The reported record of 75 members makes sense. The investigator is an expert in such surveys. He has observed and recorded most of them that are out there. And, therefore, only a few are out there. The species population needs to be protected.

The scenario is a typical one. It brings home the issues characteristic of field observations often lacking a sampling frame necessary for the classical sampling theory to apply. One needs to work with visibility analysis instead. Satisfactory estimation of biological population abundance depends largely, in such cases, on adequate measurement of visibility, variously termed catchability, audibility, etc. And, this is not a trivial problem!
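The two lobbies agree on the count of 75 and disagree only about visibility. A simple way to see how much that matters is to divide the count by an assumed detection probability. The sketch below does this in Python; the estimator (count divided by visibility) is one common simplification rather than the method used in the study, and both visibility values are hypothetical.

```python
def abundance_estimate(count, visibility):
    """Estimated population size when each individual is seen with probability `visibility`."""
    if not 0.0 < visibility <= 1.0:
        raise ValueError("visibility must be in (0, 1]")
    return count / visibility

observed = 75            # individuals reported by the investigator (from the slide)
low_visibility = 0.30    # HYPOTHETICAL: "the investigator has surely missed some"
high_visibility = 0.90   # HYPOTHETICAL: "he has observed most of them"

industrial_reading = abundance_estimate(observed, low_visibility)
environmental_reading = abundance_estimate(observed, high_visibility)

print(f"visibility {low_visibility}: about {industrial_reading:.0f} individuals")
print(f"visibility {high_visibility}: about {environmental_reading:.0f} individuals")
```

The same field count supports population estimates that differ by a factor of three, which is why the visibility itself has to be measured rather than argued.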


Am I a Specialist or a Generalist?

My wife: I am a specialist...because I do "something"; not cooking, not washing, not shopping, etc.

My son: I am a generalist...because I read, play, swim, drive, draw, etc.

My Dean: I am a specialist...because I do statistics; not physics, not chemistry, not astronomy, etc.

My Head: I am a generalist...because I do statistical ecology, environmental statistics, risk assessment, journal editing, etc.

In other words, the degree of specialization/diversification has to be relative to the categories identified.


Diversity Measurement and Comparison

Basic Question

S(2) = 1 + 1 = ?

S(n) = 1 + 1 + ... + 1 (n times) = ?


WHAT IS A WATERSHED?

• A watershed is an area of land that drains water (and everything the water carries) to a common outlet.

• The critical thing to remember about watersheds is that the streams and rivers, the hills, and the bottom lands are all part of an inter-connected system.

• Every activity on the land, in the water or even in the air has the potential to affect a watershed.


Figure 4. River basins, watersheds, and stream order. One watershed within the Patapsco River Basin is that of Herring Run. The numbers beside the streams indicate each stream’s order. The smallest permanently flowing stream is termed first order, and the union of two first order streams creates a second order stream. A third order stream is formed where two second order streams join.


Selected landscape metrics for the medium-delineated watersheds

Metric | Name | Definition
PSCV | Patch Size Coefficient of Variation | Variability in patch size, or the size of homogeneous land cover areas, relative to the mean patch size
DFLD | Double Log Fractal Dimension | 2 divided by the slope of the regression line calculated by regressing the log of the patch area against the log of patch perimeter
IJI | Interspersion and Juxtaposition Index | Measures the unevenness in patch types across a watershed
CONTAG | Contagion Index | Measures the unevenness in patch types across all pixels in a watershed


Selected landscape metrics for the large-delineated watersheds

Metric | Name | Definition
LPI | Largest Patch Index | Percentage of watershed comprised by the largest continuous patch of homogeneous land cover type
PSCV | Patch Size Coefficient of Variation | Variability in patch size, or the size of homogeneous land cover areas, relative to the mean patch size
DFLD | Double Log Fractal Dimension | 2 divided by the slope of the regression line calculated by regressing the log of the patch area against the log of patch perimeter
CONTAG | Contagion Index | Measures the unevenness in patch types across all pixels in a watershed
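Two of the definitions above are simple enough to sketch directly. The Python below computes PSCV and DFLD from lists of patch areas and perimeters. Expressing PSCV as a percentage and using the population standard deviation are assumptions, since the slide only says "relative to the mean patch size"; the square patches are hypothetical test data.

```python
import math
from statistics import mean, pstdev

def pscv(areas):
    """Patch Size Coefficient of Variation: spread of patch size relative to the mean.

    Assumption: reported as a percentage, using the population standard deviation.
    """
    return 100.0 * pstdev(areas) / mean(areas)

def dfld(areas, perimeters):
    """Double Log Fractal Dimension: 2 / slope of log(area) regressed on log(perimeter)."""
    x = [math.log(p) for p in perimeters]
    y = [math.log(a) for a in areas]
    x_bar, y_bar = mean(x), mean(y)
    # Ordinary least-squares slope of y on x.
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return 2.0 / slope

# HYPOTHETICAL patches: squares with side lengths 1, 2, 4, and 8 units.
sides = [1.0, 2.0, 4.0, 8.0]
areas = [s * s for s in sides]
perimeters = [4.0 * s for s in sides]

print(f"PSCV = {pscv(areas):.1f}%")
print(f"DFLD = {dfld(areas, perimeters):.3f}")
```

For perfectly regular shapes such as these squares, area grows as the square of perimeter, so the log-log slope is 2 and DFLD comes out at 1; more convoluted patch outlines lower the slope and push DFLD toward 2.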


Comparison of Three Watershed Types Using Conditional Entropy Profiles

[Figure: Entropy (0 to 7) plotted against Resolution (1 to 8) for three watershed types: Mostly Forested, Transitional, and Agricultural/Urban.]


Comparison of Land Cover Proportions for Three Types of Watersheds

[Figure: Proportion (0 to 0.7) plotted against Land Cover Type for three watersheds: Medium 25, Medium 76, and Medium 126.]


Figure 1. Example of perfect positive and perfect negative correlation between two coordinates (variables).


Hasse Diagram (all countries)

[Figure: Hasse diagram with nodes numbered 1 to 141.]


Hasse Diagram (W Europe)

[Figure: Hasse diagram of Iceland, Sweden, Finland, Norway, Austria, Switzerland, Spain, France, Germany, Portugal, Italy, Greece, Belgium, Netherlands, Denmark, UK, and Ireland.]


Cumulative Rank Frequency Operator - 4

An Example of the Procedure

• We illustrate with the following poset containing 6 elements

[Figure: Poset (Hasse diagram) with a and b on the top level, d and c on the middle level, and e and f on the bottom level.]


Cumulative Rank Frequency Operator - 5

An Example of the Procedure

In the example from the preceding slide, there are a total of 16 linear extensions, giving the following cumulative frequency table.

          Rank
Element    1    2    3    4    5    6
a          9   14   16   16   16   16
b          7   12   15   16   16   16
c          0    4   10   16   16   16
d          0    2    6   12   16   16
e          0    0    1    4   10   16
f          0    0    0    0    6   16

Each entry gives the number of linear extensions in which the element (row label) receives a rank equal to or better than the column heading.
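The cumulative frequency table can be reproduced by brute force: list every ordering of the six elements, keep those that respect the poset, and count how often each element lands at or above each rank. In the Python sketch below, the cover relations are an assumption. The connecting lines of the Hasse diagram did not survive in this transcript, so the relations (a above c, b above d, c above e and f, d above f) were inferred as a set that yields exactly 16 linear extensions and the counts in the table.

```python
from itertools import permutations

ELEMENTS = "abcdef"
# ASSUMED cover relations (better, worse), inferred from the table, not read off the diagram.
COVERS = [("a", "c"), ("b", "d"), ("c", "e"), ("c", "f"), ("d", "f")]

def linear_extensions(elements, covers):
    """All total orderings (best first) that respect every (better, worse) pair."""
    extensions = []
    for perm in permutations(elements):
        position = {x: i for i, x in enumerate(perm)}
        if all(position[hi] < position[lo] for hi, lo in covers):
            extensions.append(perm)
    return extensions

def cumulative_rank_frequency(elements, covers):
    """Return (number of extensions, {element: counts of rank <= 1, 2, ..., n})."""
    extensions = linear_extensions(elements, covers)
    n = len(elements)
    table = {}
    for x in elements:
        ranks = [ext.index(x) + 1 for ext in extensions]
        table[x] = [sum(r <= k for r in ranks) for k in range(1, n + 1)]
    return len(extensions), table

n_ext, table = cumulative_rank_frequency(ELEMENTS, COVERS)
print("linear extensions:", n_ext)
for x in ELEMENTS:
    print(x, table[x])
```

Enumerating all permutations is only practical for very small posets (6 elements give 720 candidates); larger posets would need a smarter enumeration or sampling of linear extensions.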


Cumulative Rank Frequency Operator - 6

An Example of the Procedure

[Figure: Cumulative Frequency (0 to 16) plotted against Rank (1 to 6), with one curve for each of the elements a, b, c, d, e, f.]

The curves are stacked one above the other and the result is a linear ordering of the elements: a > b > c > d > e > f


Cumulative Rank Frequency Operator - 7

An example where F must be iterated

[Figure: Original poset (Hasse diagram) on the elements a to h, followed by the posets obtained after applying F and F^2.]


Cumulative Rank Frequency Operator - 8

An example where F results in ties

[Figure: Original poset (Hasse diagram) with a at the top, b and c in the middle, and d at the bottom; after applying F, the ordering is a, then b and c (tied), then d.]

• Ties reflect symmetries among incomparable elements in the original Hasse diagram.

• Elements that are comparable in the original Hasse diagram will not become tied after applying the F operator.
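The tie in this four-element example can be checked numerically. The Python sketch below builds each element's cumulative rank-frequency profile and compares profiles entry by entry: identical profiles are tied, and a profile that is at least as large everywhere ranks above the other. Two things are assumptions here: the diamond-shaped cover relations (a above b and c, both above d), since the diagram's lines are lost, and the reading of F as pointwise comparison of cumulative profiles.

```python
from itertools import permutations

def cumulative_profiles(elements, covers):
    """{element: [number of linear extensions with rank <= k, for k = 1..n]}."""
    n = len(elements)
    counts = {x: [0] * n for x in elements}
    for perm in permutations(elements):
        position = {x: i for i, x in enumerate(perm)}
        if all(position[hi] < position[lo] for hi, lo in covers):
            for x in elements:
                # Rank position[x] + 1 counts toward every column k >= that rank.
                for k in range(position[x], n):
                    counts[x][k] += 1
    return counts

def compare(p, q):
    """Relation of profile p to profile q: 'tied', 'above', 'below', or 'incomparable'."""
    if p == q:
        return "tied"
    if all(a >= b for a, b in zip(p, q)):
        return "above"
    if all(a <= b for a, b in zip(p, q)):
        return "below"
    return "incomparable"

elements = "abcd"
covers = [("a", "b"), ("a", "c"), ("b", "d"), ("c", "d")]  # ASSUMED diamond poset
profiles = cumulative_profiles(elements, covers)

for x in elements:
    print(x, profiles[x])
print("b vs c:", compare(profiles["b"], profiles["c"]))
```

An "incomparable" result, where two curves cross, is the situation in which a single application of F does not give a linear ordering and the operator has to be applied again.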


[Figure: Rank Intervals (0 to 110) on the vertical axis; horizontal axis labeled HEI Rank (0 to 110); series: Upper endpoints, Lower endpoints, Midpoints.]

• Rank intervals for all 106 countries. The intervals (countries) are labeled by their midpoints as shown along the horizontal axis. For each interval, the lower endpoint and the upper endpoint are shown vertically. The length of each interval corresponds to the ambiguity inherent in attempting to rank the country among all 106 countries.


[Figure: Rank Intervals (0 to 110) on the vertical axis; horizontal axis labeled Midpoint (0 to 110); series: Upper endpoints, Lower endpoints, Midpoints.]

• Rank intervals for all 106 countries, plotted against their HEI rank. The HEI rank appears as the 45-degree line. The HEI tends to be optimistic (closer to lower endpoint) for better ranked countries and pessimistic (closer to upper endpoint) for poorer ranked countries.


Breast Cancer by ZIP Code, New York State, 1993-1997

SIR (maximum likelihood estimate)

• more than 100% above expected (28)
• 50% to 100% above expected (93)
• 15% to 49% above expected (279)
• within 15% of expected (471)
• 15% to 50% below expected (338)
• more than 50% below expected (100)
• very sparse data (104)

Simple SIRs as observed/expected
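As a rough illustration of how a simple SIR maps onto the legend categories, the Python sketch below computes observed/expected and assigns a label. The treatment of values that fall exactly on a boundary is an assumption, the "very sparse data" category is left out because the slide does not say how sparseness was decided, and the case counts in the usage lines are hypothetical.

```python
def sir(observed, expected):
    """Simple standardized incidence ratio: observed cases divided by expected cases."""
    return observed / expected

def sir_category(value):
    """Map an SIR to a legend label (handling of exact boundaries is ASSUMED)."""
    pct = (value - 1.0) * 100.0   # percent above (+) or below (-) expected
    if pct > 100.0:
        return "more than 100% above expected"
    if pct >= 50.0:
        return "50% to 100% above expected"
    if pct >= 15.0:
        return "15% to 49% above expected"
    if pct > -15.0:
        return "within 15% of expected"
    if pct >= -50.0:
        return "15% to 50% below expected"
    return "more than 50% below expected"

# HYPOTHETICAL (observed, expected) case counts for four ZIP codes.
examples = [(30, 10), (12, 10), (10, 10), (4, 10)]
for observed, expected in examples:
    value = sir(observed, expected)
    print(f"SIR = {value:.2f}: {sir_category(value)}")
```

With small expected counts the ratio is very unstable, which is presumably why a separate sparse-data class appears in the legend.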


Ranking Possible Disease Clusters in the State of New York

Data Matrix

Cluster* | SIR | LL | Young Cases | Multiple Cancers | Atypical Demographics | Late Stage of Diagnosis
LF2 | 2.09 | 10.36 | 2 | 1 | 1 | 2
LM14 | 1.5 | 36 | 2 | 0 | 0 | 2
LM4 | 2.04 | 19.21 | 2 | 0 | 0 | 2
LF7 | 1.51 | 15.43 | 1 | 1 | 1 | 1
B2 | 1.21 | 31.3 | 2 | 1 | 0 | 2
B4 | 1.25 | 28.4 | 1 | 0 | 0 | 0
LM1 | 2.32 | 21.91 | 0 | 1 | 0 | 2
LM3 | 2.13 | 21.26 | 1 | 1 | 0 | 1
LM7 | 2.12 | 13.33 | 1 | 0 | 0 | 2

* LF = lung, female; LM = lung, male; B = breast


Logo for Statistics, Ecology, Environment, and Society