Biometry. Lecture 1

Alexey Shipunov

Minot State University

January 13, 2016

Shipunov (MSU) Biometry. Lecture 1 January 13, 2016 1 / 31

Outline

1 Course in general
    Description

2 Computer literacy
    Computer knowledge and skills needed

3 Statistics
    What is statistics
    Data
    Samples

4 R
    Non-R software
    Starting with R

Course in general Description

Course description

The course covers introductory statistical concepts in a form designed specifically for biology majors; its goal is to strengthen the statistical knowledge and abilities of Biology and Chemistry students. It is a practical, software-based examination of the concepts of sampling, hypothesis testing (non-parametric and parametric), descriptive statistics, contingency, correlation, analysis of variation, linear models and basic multivariate techniques. Only biological, real-world data will be used. The course will concentrate on the underlying principles, applicability and practical use of the methods covered. The R statistical environment will be used as the main software tool.

The course relies on computer literacy: file system and basic file operations, basic text operations, spreadsheets, vector and raster graphics, Internet file formats and protocols.

Main concepts

What is data and how to process it
What are statistical hypotheses and how to test them
How to get answers from one-, two- and multidimensional data

What your skills should be by May: Exam 4

Instructor

Dr. Alexey Shipunov
Office: Moore 229
Office Hours: Mondays, Wednesdays, 1 p.m. to 3 p.m.
Phone: 858-3116
E-mail: [email protected] (this is the preferred way of communication)

Know your Syllabus!

http://ashipunov.info/shipunov/school/biol_240/

Computer literacy Computer knowledge and skills needed

Checklist of the necessary computer skills

File system and basic file operations, working with a file manager:
    use only lowercase letters, numbers and underscores in file names (a dot for the extension)
    learn how to use ZIP folders
Understanding of simple and formatted text:
    use Notepad, Text or other simple text editors
    be aware of different line endings on Mac, Windows and Unix/Linux
    be aware of invisible symbols, including tab characters
    basic text operations (copy/paste etc.)
Spreadsheets:
    know basic operations; use LibreOffice Calc instead of Excel if you like
Vector and raster graphics:
    will be explained in due course
Internet file formats and protocols:
    HTML, PDF, http://, ftp://, mailto:
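
The file naming rule from the checklist can be checked mechanically. Below is a minimal sketch in Python (the course itself uses R); the function name `is_safe_name` is hypothetical, and the pattern assumes a strict reading of the rule: exactly one dot, used only to separate the extension.

```python
import re

# Assumed reading of the checklist rule: lowercase letters, digits and
# underscores, followed by a single dot and a lowercase extension.
SAFE_NAME = re.compile(r"[a-z0-9_]+\.[a-z0-9]+")


def is_safe_name(name: str) -> bool:
    """Return True if the whole file name follows the assumed rule."""
    return SAFE_NAME.fullmatch(name) is not None


# Illustrative file names, not taken from the course materials.
for name in ["lab_01_data.txt", "Lab 1 Data.TXT", "results.final.csv"]:
    print(name, is_safe_name(name))
```

Only the first name passes: the second contains spaces and uppercase letters, and the third has more than one dot.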

Statistics What is statistics

Definition of Statistics

Data collection: collecting any numerical data, e.g. unemployment rate per state.

Sampling: working with any subsets (samples) of data, like voting polls.

Data analysis: procedures used to analyze data, such as ANOVA or the chi-square statistic.

Research: science that develops mathematical procedures to describe data.

In all, statistics is about data.

Statistics Data

Small data

Small data is often self-explanatory.
Experiments with cognition show that it is easy to keep 5-9 objects in mind.
Visual inspection gives an average value close to 2.

2 3 4 2 1 2 2 0
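
The visual estimate can be checked with a few lines of code. This is a minimal sketch in Python (the course itself uses R); the variable name `small` is illustrative.

```python
from statistics import fmean, median

small = [2, 3, 4, 2, 1, 2, 2, 0]  # the eight values shown above

print(len(small))     # number of observations
print(fmean(small))   # arithmetic mean
print(median(small))  # middle value of the sorted data
```

The values sum to 16, so the mean of the eight observations is exactly 2.0, which agrees with the estimate made by eye.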

Uniform data

2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 3 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2
2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2 2

Visual inspection again gives an average value close to 2.
Uniform data could be (relatively) big, but understandable without special tools.

Real data

Data from Shipunov et al., 2012

88 22 52 31 51 63 32 57 68 27 15 20 26 3 33 7 35 17
28 32 8 19 60 18 30 104 0 72 51 66 22 44 75 87 95 65
77 34 47 108 9 105 24 29 31 65 12 82

However, in most cases biological data is much more complicated.
Therefore, we will need specific (statistical) tools even for a preliminary description of data.
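
A preliminary description of this kind usually starts with a handful of summary numbers. Below is a minimal sketch in Python (the course itself uses R). The values are typed in as 48 separate numbers, which is an assumption about how the rows were laid out on the slide, and the choice of summaries is illustrative rather than the lecture's own procedure.

```python
from statistics import fmean, median, quantiles

# Values from the "Real data" slide, read as 48 separate numbers
# (an assumption about the original row layout).
data = [
    88, 22, 52, 31, 51, 63, 32, 57, 68, 27, 15, 20, 26, 3, 33, 7, 35, 17,
    28, 32, 8, 19, 60, 18, 30, 104, 0, 72, 51, 66, 22, 44, 75, 87, 95, 65,
    77, 34, 47, 108, 9, 105, 24, 29, 31, 65, 12, 82,
]

# Quartiles with the default ("exclusive") method of statistics.quantiles.
q1, q2, q3 = quantiles(data, n=4)

print("n      =", len(data))
print("min    =", min(data))
print("q1     =", q1)
print("median =", median(data))
print("mean   =", round(fmean(data), 2))
print("q3     =", q3)
print("max    =", max(data))
```

Unlike the small and uniform examples, the centre and spread of these values are hard to guess by eye, which is the point the slide makes.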

Statistics Samples

Sampling

Biologists often work with large numbers of objects and therefore need to sample (subset) the initial population.
Sampling gives you a free hand, it is robust to errors and it is cheaper than full research.
Moreover, philosophically, any research is based on sampling.
However, the sample may not necessarily be a good representative of a population.
Only statistical tools will help to determine the reliability of the sample.
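
Drawing a sample and comparing it with its population can be sketched as follows. This is a minimal illustration in Python (the course itself uses R); the population is made up (1000 simulated measurements), and the sample size of 30 and the seed are arbitrary choices.

```python
import random
from statistics import fmean

rng = random.Random(42)  # fixed seed so the run is repeatable

# Hypothetical population: 1000 simulated measurements, not real data.
population = [rng.gauss(50, 10) for _ in range(1000)]

# Simple random sample without replacement.
sample = rng.sample(population, 30)

print(f"population mean: {fmean(population):.2f}")
print(f"sample mean:     {fmean(sample):.2f}")
```

The two means are usually close but not equal; how close they can be expected to be is exactly what statistical tools are meant to quantify.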

Typical problem of sampling

Even samples chosen at random from two different populations may not necessarily be different.
Whereas an experiment requires simpler statistical tools, observation frequently needs things like data mining.
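
The first point can be illustrated with a small simulation. This is a sketch in Python (the course itself uses R) under made-up assumptions: two normal populations with true means 50 and 55, a common standard deviation of 10, and samples of only 5 objects each.

```python
import random
from statistics import fmean

rng = random.Random(1)  # fixed seed so the run is repeatable


def draw(mean, n):
    """Return n values from a hypothetical normal population with sd 10."""
    return [rng.gauss(mean, 10) for _ in range(n)]


trials = 1000
reversed_order = 0
for _ in range(trials):
    a = draw(50, 5)  # population A, true mean 50
    b = draw(55, 5)  # population B, true mean 55
    if fmean(a) >= fmean(b):
        reversed_order += 1  # the sample from A looks at least as large

print(f"A looked at least as large as B in {reversed_order} of {trials} trials")
```

Although population B really has the larger mean, a noticeable share of the small samples suggests the opposite, so a single pair of samples is not enough to tell the populations apart.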

Experiments vs. observation

An experiment requires controlled conditions, whereas observation minimizes the researcher's influence.
Again, only careful examination of samples with appropriate tools will make the results of an experiment robust.

R Non-R software

Calculators

A calculator is almost always embedded into the OS
Too laborious if we use samples

Spreadsheets

MS Excel, OpenOffice.org/LibreOffice Calc, Gnumeric
Very handy for data input and visualization
Do not contain advanced and optimized statistical methods
Are not able to conduct complex calculations

Graphical statistical software

SPSS, MiniTab and many others
Have a high diversity of graphs and plots
Will fail if you need to repeat complex procedures with different datasets

Statistical environments

SAS, S-Plus and R
Full control: it is possible to implement every statistical method
The user has to remember commands

R Starting with R

R history

Started in 1993 as a non-commercial analog of S-Plus
R is just another implementation of the S statistical language developed at AT&T
In the last five years, it became a standard for statistical research
Has more than 7,700 extension packages

R pros and cons

Extremely flexible, open source
No GUI: which command?

Final question (2 points)

What is sampling?

Together with name and answer, supply your 4-digit class ID

Summary

Statistics is:

Gathering data
Making samples
Applying tools
Developing new ways of doing the things above

For Further Reading

A. Shipunov. Biometry [Electronic resource]. 2012 onwards. Mode of access: http://ashipunov.info/shipunov/school/biol_240

A. Shipunov, and many others. Visual statistics. Use R! 2015 onwards. Mode of access: http://ashipunov.info/shipunov/school/biol_240/en/visual_statistics.pdf