Using Excel

Biostatistics 212

Lecture 4

Housekeeping

• Finish Lab 2 today and/or start Lab 3

• Mac Addendum

• Copying and pasting from Stata

Today...

• Why are we talking about spreadsheets?

• Pros and Cons of using a spreadsheet for:

– Data management, Statistics, Calculating, Modeling, Tables, Figures

• Cells

• Formulas

• Cutting and pasting formulas

• Spreadsheet style

• Examples

Why spreadsheets?

• Excel is very popular, and very useful
  – Store numbers and text
  – Calculations
  – Desktop graphics
  – Tables and Figures
  – Flexible creation of ledgers, models, other complex programs

Why spreadsheets?

• How is a spreadsheet different from Stata’s data editor?
  – Less structured
  – Formulas
  – Formatting

Why spreadsheets?

• How is a spreadsheet different from a database program like Access?
  – Less structured
  – Formula chains
  – Formatting

Pros and Cons of spreadsheets

• For data management
  – Pros
    • Easy start – just name columns and start typing
  – Cons
    • No structure
    • Hard to sort or query data safely
    • Awkward to generate new variables
    • “Flat” file – no relational table structure allowed

Pros and Cons of spreadsheets

• For statistical analysis
  – Pros
    • Easy start, if you know how to do formulas
  – Cons
    • Extremely limited range of options
    • Difficult to document

Pros and Cons of spreadsheets

• For calculating, or “modeling”
  – Pros
    • Repetitive calculations easy
    • Complex calculations easy
  – Cons
    • Simple, 1-time calculations not as fast
    • Sometimes hard to decipher in retrospect

Pros and Cons of spreadsheets

• Tables and Figures – will discuss in Sessions 5 and 6

Cells

• The basic building block of a spreadsheet

• Can contain:
  – Numbers
  – Text
  – Dates, times, other special formats
  – “blanks”

• Start with 256 columns x 65,536 rows = about 16.8 million blank cells per worksheet (Excel 97-2003 limits)

Cells, cont

• Enter anything you like into each cell (numbers, text, symbols, etc) using keyboard

• Contents displayed on spreadsheet

• Organized and named by column/row

Formulas

• Use when you want the contents of one cell to depend on the contents of other cells

• ALWAYS starts with: = (an “equals sign”)

Formulas

• Can contain:
  – Numbers
  – Text
  – References to cells
  – The usual math operators (+ - * / ^)
  – Built-in functions

Formulas

• Cell contents update automatically when a referenced cell content changes

• “Chains” of formulas make for flexible calculating
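
To make the idea of formula chains concrete, here is a minimal Python sketch of a toy spreadsheet. It is not how Excel works internally; the `Sheet` class and the cell names are made up for illustration. Each cell holds either a constant or a formula, and formulas are re-evaluated whenever they are read, so a change to one input flows through every cell that depends on it.

```python
class Sheet:
    """Toy spreadsheet (hypothetical): cells hold constants or formulas."""

    def __init__(self):
        self.cells = {}

    def set(self, name, value):
        """Store a constant, or a formula given as a function of the sheet."""
        self.cells[name] = value

    def get(self, name):
        """Evaluate a cell on demand, following the chain of references."""
        value = self.cells[name]
        return value(self) if callable(value) else value


sheet = Sheet()
sheet.set("A1", 10)                                   # constant
sheet.set("A2", 5)                                    # constant
sheet.set("B1", lambda s: s.get("A1") + s.get("A2"))  # like =A1+A2
sheet.set("C1", lambda s: s.get("B1") * 2)            # like =B1*2

before = sheet.get("C1")   # (10 + 5) * 2
sheet.set("A1", 20)        # change one input cell
after = sheet.get("C1")    # (20 + 5) * 2, updated through the chain
print(before, after)
```

C1 never refers to A1 directly, yet it changes when A1 changes, because it depends on B1, which depends on A1.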

Formulas

• Contents of a cell displayed on spreadsheet

• The formula determining that content is displayed in the formula bar

• Example

Formulas

• Types of formulas
  – Arithmetic
    • +, -, *, /, ^
  – Logic
    • IF(boolean, value 1, value 2)
      – Returns value 1 if TRUE, value 2 if FALSE
    • AND(boolean, boolean, boolean…)
      – Returns TRUE if all booleans are true, otherwise FALSE
    • OR(boolean, boolean, boolean…)
      – Returns TRUE if any boolean is true, otherwise FALSE
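
For readers who think more easily in code, the logic functions above can be mirrored in a few lines of Python. This is only a sketch: the helper names `excel_if`, `excel_and` and `excel_or`, and the example values for age and smoking status, are invented for illustration and are not part of Excel or the lecture.

```python
def excel_if(condition, value_if_true, value_if_false):
    """Like IF(boolean, value 1, value 2)."""
    return value_if_true if condition else value_if_false


def excel_and(*conditions):
    """Like AND(...): TRUE only if every condition is true."""
    return all(conditions)


def excel_or(*conditions):
    """Like OR(...): TRUE if at least one condition is true."""
    return any(conditions)


# Made-up example values standing in for two cells
age, smoker = 62, True

# Like =IF(AND(A1>60, B1), "high", "low")
high_risk = excel_if(excel_and(age > 60, smoker), "high", "low")

# Like =OR(A1>80, B1)
any_flag = excel_or(age > 80, smoker)

print(high_risk, any_flag)
```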

Formulas

• Types of formulas, cont
  – Functions, for example:
    • SUM(range of cells)
      – Returns the sum of the values in the range
      – SUM(A5:A10)
    • AVERAGE(range of cells)
      – Returns the average of the values in the range
    • STDEV(range of cells)
      – Returns the sample standard deviation of the values in the range
    • NORMINV(probability, mean of dist, SD of dist)
      – Returns the value below which the given probability falls in a normal distribution with that mean and SD (the z-value when mean = 0 and SD = 1)
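
As a cross-check outside Excel, the same four calculations can be sketched with Python's standard `statistics` module. The data values are made up, and the pairing of each Python call with an Excel function reflects the usual definitions (sample SD with n - 1, inverse normal CDF), not anything specific to this lecture.

```python
from statistics import NormalDist, mean, stdev

values = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up data standing in for a range

total = sum(values)       # like SUM(range)
average = mean(values)    # like AVERAGE(range)
sd = stdev(values)        # like STDEV(range): sample SD, divides by n - 1

# Like NORMINV(0.975, 0, 1): the z-value with 97.5% of the
# standard normal distribution below it (about 1.96)
z = NormalDist(mu=0, sigma=1).inv_cdf(0.975)

# Like NORMINV(0.975, 100, 15): same probability, but on a
# distribution with mean 100 and SD 15
x = NormalDist(mu=100, sigma=15).inv_cdf(0.975)

print(total, average, round(sd, 3), round(z, 3), round(x, 1))
```

The last two lines show why the result is a z-value only for mean 0 and SD 1: with any other mean and SD, the answer is on the scale of that distribution (here, 100 + 15 × z).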

Formulas

• Types of formulas, cont
  – Functions, for example:
    • LN(number)
      – Returns the natural log of a number
    • ABS(number)
      – Returns the absolute value of a number
    • LEFT(text, number of characters=x)
      – Returns x number of characters from the text in the cell, starting at the left side…
    • NOW()
      – Returns the current date and time
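
These functions have close counterparts in most programming languages. A rough Python sketch, where the helper `left` is a made-up name written to mimic LEFT:

```python
import math
from datetime import datetime


def left(text, num_chars=1):
    """Like LEFT(text, num_chars): the first num_chars characters."""
    return text[:num_chars]


log_value = math.log(10)            # like LN(10)
abs_value = abs(-3.5)               # like ABS(-3.5)
prefix = left("Biostatistics", 3)   # like LEFT("Biostatistics", 3)
timestamp = datetime.now()          # like NOW()

print(round(log_value, 4), abs_value, prefix, timestamp)
```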

Formulas

• Tips
  – Use parentheses
    • =IF(SUM(A5:A10)>5,1,IF(C9="y",2,3))
  – Or do in multiple steps
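
The "multiple steps" tip can be illustrated by unpacking the nested formula above into one decision per step, much as you might spread it over helper cells. This is a Python sketch; the function name `classify` and its arguments are hypothetical stand-ins for the cells A5:A10 and C9.

```python
def classify(a5_to_a10, c9):
    """Stepwise version of IF(SUM(A5:A10)>5, 1, IF(C9="y", 2, 3))."""
    # Step 1: the inner calculation, like a helper cell holding =SUM(A5:A10)
    total = sum(a5_to_a10)

    # Step 2: the outer test, IF(total > 5, 1, ...)
    if total > 5:
        return 1

    # Step 3: the nested IF(C9 = "y", 2, 3).
    # Excel's = comparison on text ignores case, so lowercase first.
    return 2 if c9.lower() == "y" else 3


print(classify([1, 2, 3, 0, 0, 0], "n"))   # sum is 6, so the first branch
print(classify([1, 1, 1, 1, 0, 0], "y"))   # sum is 4, C9 is "y"
print(classify([0, 0, 0, 0, 0, 0], "n"))   # sum is 0, C9 is not "y"
```

Each step can be checked on its own, which is the main advantage over a single long formula.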

Cutting and pasting formulas

• Excel assumes the cell references are ABSOLUTE, and you’re just moving the location of the formula cell

• Example

Copying and pasting formulas

• Excel assumes the cell references are RELATIVE

• Shortcut: drag the little square (the fill handle) at the corner of the selected cell

• Example

Copying and pasting formulas

• If you want to FIX the position of a referenced cell, use $ signs, e.g. =A5+$B$6

• Example
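
The copy-and-paste rule for relative and $-fixed references can be sketched in Python. This is a simplified illustration, not Excel's actual implementation: `copy_formula` and its helpers are hypothetical names, and the pattern only handles plain A1-style references (it would also shift a function name that ends in digits, such as LOG10).

```python
import re

# Matches A1-style references such as A5, $B$6, B$6 or $B6.
REF = re.compile(r"(\$?)([A-Z]+)(\$?)(\d+)")


def col_to_num(col):
    """Column letters to a number: A -> 1, Z -> 26, AA -> 27."""
    n = 0
    for ch in col:
        n = n * 26 + (ord(ch) - ord("A") + 1)
    return n


def num_to_col(n):
    """Column number back to letters: 1 -> A, 27 -> AA."""
    col = ""
    while n > 0:
        n, rem = divmod(n - 1, 26)
        col = chr(ord("A") + rem) + col
    return col


def copy_formula(formula, rows_down=0, cols_right=0):
    """Rewrite a formula as if copied rows_down / cols_right cells away.

    Parts of a reference preceded by $ stay fixed; the rest shift.
    """
    def shift(match):
        col_lock, col, row_lock, row = match.groups()
        if not col_lock:
            col = num_to_col(col_to_num(col) + cols_right)
        if not row_lock:
            row = str(int(row) + rows_down)
        return f"{col_lock}{col}{row_lock}{row}"

    return REF.sub(shift, formula)


print(copy_formula("=A5+$B$6", rows_down=1))        # A5 shifts, $B$6 stays
print(copy_formula("=SUM(A5:A10)", cols_right=1))   # both ends of the range shift
```

Copying =A5+$B$6 down one row gives =A6+$B$6: the relative reference moves with the formula, while the $-fixed reference keeps pointing at the same cell.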

Examples

• Repetitive calculations
  – Back-transforming linear regression coefficients

• Complex calculations
  – 2 x 2 template

• Modeling
  – Mortgage calculator
  – Risk integrator
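
The slides do not show what the 2 x 2 template contains, so the following Python sketch is only a guess at the kind of calculation such a template might chain together. The table layout, the function name `two_by_two`, the counts, and the choice of measures are all assumptions made for illustration.

```python
def two_by_two(a, b, c, d):
    """Summarize a 2 x 2 table (assumed layout):

                    disease   no disease
        exposed        a          b
        unexposed      c          d
    """
    risk_exposed = a / (a + b)
    risk_unexposed = c / (c + d)
    return {
        "risk_exposed": risk_exposed,
        "risk_unexposed": risk_unexposed,
        "risk_ratio": risk_exposed / risk_unexposed,
        "risk_difference": risk_exposed - risk_unexposed,
        "odds_ratio": (a * d) / (b * c),
    }


# Made-up counts: 20 of 100 exposed and 10 of 100 unexposed have the disease
result = two_by_two(a=20, b=80, c=10, d=90)
for name, value in result.items():
    print(f"{name}: {value:.3f}")
```

In a spreadsheet, the four counts would be the input cells and each measure would be a formula cell referring to them, so changing one count updates every result.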

Spreadsheet style

• Formatting
  – Text
  – Column width
  – Borders
  – Placement of stuff on the page

Spreadsheet style

• For models:
  – Inputs on the left, in red
  – Outputs on the right, in blue, boxed, bolded, etc
  – Calculations on other sheets
  – “Protect” all cells besides inputs
    • Format/Cells…/Protection
    • Tools/Protect

Summary

• Know what Excel can do for you

• It’s all about formulas

Lab 3

• Practice with:
  – A repetitive calculation spreadsheet
  – A complex calculation spreadsheet
  – Introduction to making a figure with Excel

• Extra credit puzzle challenge – 2 x 2 Excel template

• Due 10/25/05

To come…

• Next two lectures
  – Epidemiologic analysis with Stata
    • 2 x 2 tables, confounding and interaction
    • Epitab commands
    • Logistic regression introduction