Exercise 1


Page 1: Exercise 1

Biology 112 Dr. Paradise

Page 1

Exercise 1: How to Record and Present Your Data Graphically Using Excel Dr. Chris Paradise, edited by Steven J. Price

Introduction

In this world of high technology and information overload, scientists must communicate effectively and efficiently. How can we get our point across to both fellow scientists and nonscientists in a way that is meaningful to both groups? To communicate effectively, scientists must be clear, concise, and consistent. To communicate efficiently, scientists must know what data to present in order to emphasize the point they wish to make. Numerical results are useful because they help answer a question or test a hypothesis. Raw data that you collect to test a particular hypothesis must be digested in some way and presented, not as a jumbled mass of field notes and numbers, but as concise and informative tables, diagrams, or graphs.

Objective

For you to learn and gain perspective on the ways in which raw data may be abstracted and presented in a report, and to prepare sample graphs.

Recording Your Data

Writing down a number may seem like a simple thing to do, but the numbers that we read from a measuring device are limited in the number of significant digits that we obtain. Every piece of measuring equipment lets us determine the value in question only to a certain degree. For instance, a ruler with centimeters as the smallest gradation can measure to the nearest centimeter. We then estimate the last digit, which in this case would be to the nearest millimeter. So there is a degree of uncertainty in our estimate of the last digit. Let's say an object is measured and found to be 1.269 meters long. The rightmost digit is in the millimeter position and is our "uncertain" digit. There are four significant digits in this measurement, which gives us a certain amount of confidence as to the precision and accuracy of the measurement, based on the quality of our measuring device.

Precision comes into play when you measure the same quantity more than once. How close does your repeated measure come to the original measurement?
That is the degree of precision of your instrument. Measuring a quantity more than once with the ruler described above, we might expect to get readings that differ by about 1 millimeter. Such a ruler is precise to ±1 mm. However, a ruler that measured to a tenth of a millimeter would give one more significant digit than our first ruler. It might also be considered more precise.

Absorbance of light is measured on a spectrophotometer. While measuring the quantity of light absorbed by a certain solution, you want to be sure that the machine is zeroed properly and that you are obtaining the correct reading. So initially, your objective is to check your precision. However, you may discover that your readings are inaccurate because the machine is not zeroed properly. Accuracy is the term used to describe how close any measured value is to the true value. In general, a more precise measurement will also be more accurate, assuming that the measuring device has been calibrated properly.

Presenting Your Data

Numerical results are generally useful because they may help answer a question or test a hypothesis. Your task is to make certain that readers understand why certain data do or do not answer a given question. Usually this requires that the raw data be digested and processed in some way. Tables and graphs are two methods that scientists use to assist their readers (and themselves) in interpreting their results. Tables and figures should always be labeled with a table or figure number, such as "Table 1" or "Figure 3." Always refer to the table or figure in the text, either to point out a trend in the data or to discuss the significance of the data. Never put tables or figures into a protocol or write-up without discussing them in the text!

Tables are generally used to show the relationships between treatments and controls when it is necessary to present all of the information obtained from an experiment. However, this method can become cumbersome when large data sets are displayed. One way around this is to use graphs, which often demonstrate relationships without burdening the reader with large quantities of numbers. There are advantages and disadvantages to both styles of presentation. Tables are useful when one wants the reader to see the actual numbers. They are also useful for showing frequency counts of categorical variables. As previously mentioned, tables can be a disadvantage when one has to present large data sets, because they force the reader to visualize the relationships among many numbers and the complex interactions between many variables.

That's where graphs come in. When a reader looks at a graph, they should be able to quickly grasp the point that is being made. The graph should effectively show a trend (e.g., an increase in the measured variable over time) or a relationship (e.g., the difference between a control and treatments). Graphs generally consist of an independent and a dependent variable. The independent variable is generally displayed on the abscissa (the x-axis) and can be a measured quantity, such as time, or a category, and may be set by the investigator. This variable should be unaffected by, or as the name implies, independent of, the other variables being studied.
The dependent variable, on the other hand, is affected by the independent factor under investigation (or at least one hypothesizes that it is, prior to experimentation). The dependent variable is graphed against the independent variable so that one can see how the dependent variable changes with changes in the independent variable.

There are many different ways to graph data. Line graphs and histograms are probably the most popular and will be the types you will use most frequently. Line graphs show the relationship of a dependent variable to an independent variable when the independent variable is a continuous measurement, such as time (Fig. 1). Histograms, or bar graphs, are the best way to show frequency distributions of categorical variables. The frequency is the number or proportion and is the dependent variable, while the various categories used make up the independent variable.

Let's say, for example, that we wanted to test the effects of robin flock size on feeding rates of robins. We have hypothesized that larger flocks of birds will be more successful in terms of number of worms caught by each individual. Below (Table A.1) is a portion of the fictional raw data used to construct Table A.2 and Figure 1. Table A.1 is fairly well organized, which is the key when producing tables and graphs. It has titles at the head of each column; labels and titles are critical in tables and graphs. Always title your graphs and tables so the reader knows what they are examining, and label axes, rows, and columns so that one can quickly see what variables are presented and determine the scale of the numbers.

However, getting back to our raw data table, a report that simply included these raw data unaltered would be flawed because it is difficult to extract the significant aspects of the results in this form. The first step might be to analyze the field data to determine the number of worms taken per robin for each 30 minutes of hunting time, using the following equation:

[(# of worms captured)/(median # of birds in flock)] / [(total observation time (min.))/(30 minutes)]

Table A.1. Raw data from experiment on flock size and worm capture rate observations.

Date and time       Number of robins in flock   Number of worms taken by entire flock
12 Apr 0705-0740    18-22                       16
12 Apr 0800-0830    6-7                         2
12 Apr 1530-1635    14-20                       18
12 Apr 1610-1630    3-5                         1
13 Apr 0730-0845    16-19                       15
13 Apr 0750-0845    23-28                       31
13 Apr 0815-0925    2                           3

This provides us with a standard result that will permit direct comparisons among the various flocks of robins (Table A.2).

Table A.2. The relation between number of robins in a foraging flock and the average foraging success of individuals in flock. Observations made Apr. 12-13 from 0700-0900 & 1500-1700.

Total observation time (min.)   Number of robins in flock   Median number   Worms taken by flock   Success rate (see equation)
70                              2                           2               3                      0.6
65                              3-5                         4               1                      0.1
25                              6-7                         6.5             2                      0.4
45                              14-20                       17              18                     1.4
20                              16-19                       17.5            15                     1.3
30                              18-22                       20              16                     0.8
55                              23-28                       25.5            31                     0.7
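The equation above can be checked against Table A.2 with a few lines of code. The Python sketch below is an illustration only and is not part of the original exercise, which uses Excel; the names `success_rate`, `recompute`, and `TABLE_A2` are assumptions made for this example. It recomputes the success rate for each row from the table's own columns and compares the result, rounded to one decimal place, with the printed value.

```python
def success_rate(worms, median_birds, minutes, window=30):
    """Worms captured per robin per `window` minutes of observation."""
    return (worms / median_birds) / (minutes / window)


# Rows copied from Table A.2:
# (observation minutes, median robins in flock, worms taken, printed success rate)
TABLE_A2 = [
    (70, 2, 3, 0.6),
    (65, 4, 1, 0.1),
    (25, 6.5, 2, 0.4),
    (45, 17, 18, 1.4),
    (20, 17.5, 15, 1.3),
    (30, 20, 16, 0.8),
    (55, 25.5, 31, 0.7),
]


def recompute(rows):
    """Return (median, recomputed rate rounded to 0.1, printed rate) per row."""
    return [
        (median, round(success_rate(worms, median, minutes), 1), printed)
        for minutes, median, worms, printed in rows
    ]


for median, rate, printed in recompute(TABLE_A2):
    flag = "" if rate == printed else "  <- differs from printed value"
    print(f"median flock {median:>4}: computed {rate:.1f}, printed {printed:.1f}{flag}")
```

With the values as printed, six of the seven rows reproduce the table after rounding. The row for the 14-20 flock computes to about 0.7 rather than the printed 1.4, so that entry is worth rechecking before it is graphed.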

There are still other ways that this material might have been offered to the reader. For example, the data in Table A.2 could be used to make a scatter diagram (Figure 1). Diagrams, graphs, and charts tend to have more visual impact than a table, and as a result, they project their meaning more quickly and dramatically. They do so, however, at the expense of some information that might appear in a table. Figure 1 does not allow the reader to determine how long the flocks were watched in order to produce the success rate scores that appear as points on the diagram, whereas Table A.2 contains this information. You as a scientist and writer must decide which of several possible ways of presenting data conveys information most advantageously, in relation to the question you are asking or the hypothesis you are testing.

[Figure 1: scatter diagram. x-axis: Median number of robins in flock (0 to 30). y-axis: Success rate (#/30 minutes) (0 to 1.6).]

Figure 1. The correlation between foraging success for worms and number of robins in a flock.

Using Excel

Provided below are examples of graphs created using Microsoft Excel. There are two of each graph, and one of each is considered high quality, while the other is of poor quality. In each case the legend provides hints and guidelines for how to professionally create graphs, and why the poor quality graphs are of poor quality.

[Figure 2, left panel (2a): scatter plot titled "Figure 2a. Tree Size Does Not Affect Ice Storm Damage". x-axis: Tree Diameter (cm), 0 to 100. y-axis: No. of Branches Broken, 0 to 20. Trend line: y = -0.0363x + 8.3809.]

[Figure 2, right panel: scatter plot titled "Tree Size Does Not Affect Ice Storm Damage". x-axis: Tree Diameter (cm), 0 to 150. y-axis: No. of Branches Broken, 0 to 25. Trend line: y = -0.0363x + 8.3809.]

Figure 2. Both panels depict the same data. However, the figure on the left is of much higher quality than the one on the right. In particular, note in 2a that: 1) the sizes of the fonts are large, easy to read, and of consistent size, 2) the area showing the actual data is maximized, 3) the contrast is high (dark points on a white background), 4) the figure number appears in the title/legend, and 5) the axis labels contain the units of measurement. In the right panel: 1) the fonts are generally smaller and of inconsistent size, 2) the data points are light blue on a gray background (low contrast), 3) the gridlines make it difficult to see the data, 4) there is no figure number, and 5) the scales are not adjusted to maximize the usage of the graph area (there's a lot of empty space on the right and top).
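Both panels of Figure 2 print an equation of the form y = mx + b on the chart, which is how Excel labels a linear trend line. The tree data behind that equation are not listed in this handout, so the sketch below instead fits a straight line to the robin data from Table A.2. It is a minimal Python illustration of an ordinary least-squares fit, assuming that is the method behind the trend line; `fit_line` is a name chosen for this example, not something from the exercise.

```python
def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared x-deviations.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Median flock size and printed success rate, copied from Table A.2.
median_robins = [2, 4, 6.5, 17, 17.5, 20, 25.5]
success = [0.6, 0.1, 0.4, 1.4, 1.3, 0.8, 0.7]

slope, intercept = fit_line(median_robins, success)
print(f"y = {slope:.4f}x + {intercept:.4f}")
```

A fitted line always passes through the point (mean x, mean y), which gives a quick sanity check on any trend line equation copied from a chart.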

[Figure 3, left panel (3a): bar graph titled "Figure 3a. Average Damage to Specific Tree Genera". x-axis: Genus, with categories Oaks (6 spp.), Elms (3 spp.), and Maples (2 spp.). y-axis: Average # of Broken Branches (+/- 1 s.d.), 0 to 12.]

[Figure 3, right panel: bar graph titled "Average Damage to Specific Tree Genera". x-axis: Genus, with categories Oaks (6 spp.), Elms (3 spp.), and Maples (2 spp.). y-axis: Average # of Broken Branches (+/- 1 s.d.), 0 to 12. A legend repeats the three genus names.]

Figure 3. Both panels depict the same data. The figure on the left is of higher quality than the one on the right. In particular, note in 3a that: 1) the sizes of the fonts are large, easy to read, and of consistent size, 2) the area showing the actual data is maximized, 3) the contrast is high (dark bars on a white background), 4) the figure number appears in the title/legend, and 5) the axis labels contain the units of measurement. In the right panel: 1) the fonts are generally smaller and of inconsistent size, 2) the data bars are unnecessarily different colors (use different colors for different series of data plotted against the same independent variable), 3) averages are plotted without error bars, 4) there is no figure number, 5) there is an unnecessary and redundant legend, 6) the labels overlap the scales, and 7) the y-axis scale is not adjusted to maximize the usage of the graph area (there’s a lot of empty space at the top).
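The y-axis of Figure 3 is labeled as an average +/- 1 s.d., so each bar needs two numbers: the mean (bar height) and the standard deviation (error bar length). The Python sketch below shows one way to compute both; it is an illustration only, since the exercise itself uses Excel. The branch counts are hypothetical values invented for this example and are not data from the handout, and `mean_and_sd` is an assumed helper name. The sample standard deviation (the n - 1 form) is assumed here.

```python
from statistics import mean, stdev


def mean_and_sd(values):
    """Return (mean, sample standard deviation) for one group of measurements."""
    return mean(values), stdev(values)


# HYPOTHETICAL broken-branch counts per tree; not data from the handout.
broken_branches = {
    "Oaks": [9, 11, 8, 12],
    "Elms": [5, 7, 6],
    "Maples": [3, 4, 2, 3],
}

for genus, counts in broken_branches.items():
    avg, sd = mean_and_sd(counts)
    print(f"{genus}: bar height {avg:.1f}, error bar +/- {sd:.1f}")
```

Each genus forms a single series here, which is why the caption's advice is to keep the bars one color and reserve different colors for different data series.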

Assignment

This handout is very important and should be referred to throughout the semester. Next week you will need to present data from the simulations we run in lab to illustrate two major points regarding your data. Use the information in this document to construct publication-quality graphs in Excel. Cut and paste the graphs into Word and add proper figure and table legends (see Appendix C in handout). Work in pairs and turn in via electronic submission. That is, email the Word document (NOT Excel files) to me. The file should be named using the following convention: lastnames_graphs_ex1.doc.