Community and gradient analysis: Matrix approaches in macroecology
Introduction to R. Damaris Zurell, Dynamic Macroecology, Swiss Federal Research Institute WSL...

http://www.r-project.org/
R is a tool …
Data manipulation
• Integrating different data sources
• Aggregating, disaggregating, transforming data ...
Data modelling (y ~ x)
• Statistical modelling
• Numeric simulations
Data visualisation
• Visualising models
• Make your own graphics
R is an environment
The R environment: "more than an incremental accretion of very specific and inflexible tools"
"fully planned and coherent system"
R history
• First, there was S: developed in 1976 by John Chambers at Bell Laboratories (AT&T) as a programming language for statistics, stochastic simulation, and graphical display
• 1988, commercial implementation as S-PLUS (Insightful Corp.)
• 1992, Ross Ihaka and Robert Gentleman start the free implementation R under the GNU General Public License, mainly for teaching purposes
• 1997, founding of the R Development Core Team (abbrev.: R Core Team), today with 20 members from science and industry
• 1998, founding of the Comprehensive R Archive Network (CRAN); today >4000 additional packages
• 2000, first version completely compatible with S: R 1.0.0
R pros
• Open source, runs on many operating systems
• "At the pulse of science": new methods are implemented in R by scientists/developers and made available as packages
• Publication-ready graphics
• Excellent for simulations, programming, computer-intensive analyses, automation
• Best option for statistical computing
• Active user community: help by the R Core Team, R-help mailing list, fast bug fixing
R cons
• No fancy graphical user interface, bulky; steep learning curve for newbies, high beginner's frustration
• Easy to make mistakes
• Computation of big data sets is limited by RAM
• "Many ways lead to Rome"
• An interpreted programming language: commands are executed immediately
• Data types: empty values, numerical, logical, character
• Data structures/object types: scalar, vector, matrix, array, data frame, list
• During one session, all objects are stored in your workspace
• Built-in and self-defined functions
R is plain
Command line language: this is the prompt: >
All commands follow after the prompt
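The data types and object types listed on this slide can be tried out directly at the prompt. The following is only a minimal sketch; all object names and values (for example the site labels and richness counts) are made up for illustration. Note that R has no separate scalar type: a scalar is simply a vector of length one.

```r
# Basic data types (all values are illustrative)
n  <- 3.14       # numeric
l  <- TRUE       # logical
ch <- "Fagus"    # character
na <- NA         # missing ("empty") value

# Object types, from simple to complex
v   <- c(10, 13, 6)                         # vector
m   <- matrix(1:6, nrow = 2)                # matrix with 2 rows and 3 columns
a   <- array(1:24, dim = c(2, 3, 4))        # three-dimensional array
df  <- data.frame(site = c("A", "B", "C"),  # data frame: columns may differ in type
                  richness = c(12, 7, 21))
lst <- list(values = v, table = df)         # list: may hold arbitrary objects

# Inspect the objects
class(n); class(l); class(ch)
dim(m); dim(a)
str(df)

# List all objects currently stored in the workspace
ls()
```

`str()` and `class()` are usually the quickest way to check what kind of object a command has returned.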
R is a great calculator
• Simple algebra
> 2+2
4
• Assign your results to a variable
> x <- 2+2 # assignment operator "<-"
> x^2
16
• Vector-based calculations
> mass <- c(10,13,6) # 3 masses
> acceleration <- c(2.2,1.7,3.1)
> (force <- mass * acceleration)
22.0 22.1 18.6
R is a great calculator
• Simple statistics
> (x <- sample(1:20,10))
4 15 12 14 18 3 9 20 19 16
> mean(x)
13
> sd(x)
5.981453
• Set operations: union, intersect, setdiff
• Advanced statistics
> pbinom(40,100,0.5) # coin toss: is the coin unbiased?
0.02844397
> (pshare <- pbirthday(18,366,coincident=2))
0.3461382
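The slide names the set operations without showing them in use. A small sketch with two hypothetical species lists (the plot names and species are invented for illustration) could look like this:

```r
# Species recorded in two hypothetical plots
plot_a <- c("Fagus", "Quercus", "Pinus", "Betula")
plot_b <- c("Pinus", "Betula", "Picea")

all_species <- union(plot_a, plot_b)      # species found in at least one plot
shared      <- intersect(plot_a, plot_b)  # species found in both plots
only_a      <- setdiff(plot_a, plot_b)    # species found only in plot A
only_b      <- setdiff(plot_b, plot_a)    # species found only in plot B

all_species
shared
only_a
only_b

# Membership test for a single element
"Picea" %in% plot_a
```

`setdiff()` is not symmetric, so the order of its arguments matters.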
R is a numeric simulator
• Built-in functions for common probability distributions
• e.g. simulate 10 000 repetitions of 100 coin tosses each, using pseudorandom numbers
– How often do you get heads?
> heads <- rbinom(10000,100,0.5)
> hist(heads)
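As a rough check of the simulation, the simulated share of experiments with at most 40 heads can be compared with the exact binomial probability from `pbinom`. This is only a sketch; the seed value is an arbitrary choice so that the run can be repeated.

```r
set.seed(42)  # arbitrary seed, only for reproducibility

# 10 000 experiments of 100 fair coin tosses each
heads <- rbinom(10000, 100, 0.5)

# Average number of heads per experiment (expected value: 50)
mean(heads)

# Share of experiments with at most 40 heads: simulated vs. exact
simulated   <- mean(heads <= 40)
theoretical <- pbinom(40, 100, 0.5)
c(simulated = simulated, theoretical = theoretical)
```

The two numbers should agree to about two decimal places; the agreement improves as the number of simulated experiments grows.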
R Probability distributions
Functions:
d (density): probability density function
p (probability): cumulative distribution function
q: calculate quantiles
r: draw random numbers
Examples:
Normal: dnorm, pnorm, qnorm, rnorm
Binomial: dbinom, pbinom, …
Poisson: dpois, ..
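The four prefixes work the same way for every distribution. A brief sketch for the normal, binomial and Poisson distributions follows; the parameter values and the seed are arbitrary choices.

```r
# d: density of the standard normal distribution at 0
d0 <- dnorm(0)

# p: cumulative probability P(X <= 1.96) for the standard normal
p196 <- pnorm(1.96)

# q: quantile function, the inverse of p
q975 <- qnorm(0.975)

# r: draw five pseudorandom numbers from the standard normal
set.seed(1)  # arbitrary seed, only for reproducibility
z <- rnorm(5)

# The same prefixes for discrete distributions
dbinom(50, 100, 0.5)            # P(X = 50) for 100 fair coin tosses
pbinom(40, 100, 0.5)            # P(X <= 40)
q50 <- qbinom(0.5, 100, 0.5)    # median number of heads
rpois(3, lambda = 4)            # three Poisson random numbers with mean 4

c(d0 = d0, p196 = p196, q975 = q975, q50 = q50)
```

For discrete distributions, the d function returns a probability mass rather than a density.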
R Probability distributions
?Distributions
(the underscore stands for one of the prefixes d, p, q, r)
Function Distribution
_beta() Beta
_binom() Binomial
_cauchy() Cauchy
_chisq() χ2
_exp() Exponential
_f() F
_gamma() Gamma
_geom() Geometric
_hyper() Hypergeometric
_logis() Logistic
_lnorm() Lognormal
_multinom() Multinomial
_nbinom() Negative binomial
_norm() Normal
_pois() Poisson
_signrank() Wilcoxon signed rank statistic (one-sample case)
_t() Student's t
_unif() Uniform
_weibull() Weibull
_wilcox() Wilcoxon rank sum statistic (two-sample case)
R accepts all kinds of data sources
• Files (text, binary, data sets from other statistics programs)
> example <- read.csv("example.csv", header=TRUE)
> example2 <- read.table("example2.txt", header=TRUE)
• Clipboard
> cohesion <- read.table(file="clipboard", sep="\t", header=TRUE)
• Database
> library(RODBC)
> mdbConnect <- odbcConnectAccess("GPDDdist")
> sqlTables(mdbConnect)
• Web
> con <- url('http://anywebsite.com/test.txt')
> example3 <- read.table(con, header=TRUE)
• R objects (binary)
> load("example.RData")
R writes to all kinds of data sources
• to files
> write.csv(example, "example.csv")
> write.table(example, "example2.txt", row.names=FALSE)
• to the clipboard
> write.table(CORMAT, file="clipboard", sep="\t", col.names=NA)
• to databases
> channel <- odbcConnect("test")
> sqlSave(channel, USArrests, rownames = "state", addPK=TRUE)
> close(channel)
• to R objects
> save(example3, file="example.RData")
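Reading and writing can be tested together in a small round trip. The sketch below uses temporary files so that nothing in the working directory is overwritten; the data frame and its column names are invented for illustration.

```r
# A small made-up data set
example <- data.frame(species = c("Fagus sylvatica", "Picea abies"),
                      cover   = c(12.5, 30.25),
                      stringsAsFactors = FALSE)

# Round trip through a text file
csv_file <- tempfile(fileext = ".csv")
write.csv(example, csv_file, row.names = FALSE)
example_csv <- read.csv(csv_file, header = TRUE, stringsAsFactors = FALSE)

# Round trip through a binary R object file
rdata_file <- tempfile(fileext = ".RData")
save(example, file = rdata_file)   # the file name must be passed as file=
original <- example
rm(example)                        # remove the object from the workspace ...
load(rdata_file)                   # ... and restore it under its old name

# Both copies should match the original
all.equal(original, example_csv)
identical(original, example)
```

`load()` restores objects under the names they had when they were saved, which is why `example` reappears without an assignment.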
R visualising
• Many graphic functions are generic: they respond "intelligently" to different object types
> plot(iris)
> with(iris, plot(Petal.Length, Petal.Width, pch=as.numeric(Species)))
R visualising
• Many graphic functions are generic: they respond "intelligently" to different object types
> boxplot(iris)
> boxplot(Petal.Length~Species, data=iris, ylab="Petal.Length")
R visualising
• R Graph Gallery: http://gallery.r-enthusiasts.com/thumbs.php
R statistical modelling
• Linear model
> fm <- lm(y ~ x, data=dummy)
> summary(fm)
Call:
lm(formula = y ~ x, data = dummy)

Residuals:
    Min      1Q  Median      3Q     Max
-4.3400 -1.7353  0.2107  1.4644  4.8445

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)   1.9150     1.2155   1.575    0.133
x             0.8581     0.1015   8.457  1.1e-07 ***
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

Residual standard error: 2.617 on 18 degrees of freedom
Multiple R-squared: 0.7989, Adjusted R-squared: 0.7877
F-statistic: 71.52 on 1 and 18 DF, p-value: 1.102e-07
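The slide does not show how the data frame `dummy` was created. The sketch below simulates a comparable data set so that the model can be fitted from scratch. The intercept, slope, noise level and seed are assumptions, so the numbers will differ from the output shown on the slide.

```r
set.seed(123)  # arbitrary seed, only for reproducibility

# Simulated data: a linear trend plus normally distributed noise
# (intercept 2, slope 0.9 and sd 2.5 are assumed values)
x <- 1:20
y <- 2 + 0.9 * x + rnorm(20, sd = 2.5)
dummy <- data.frame(x = x, y = y)

# Fit the linear model and extract the main results
fm <- lm(y ~ x, data = dummy)
coef(fm)                      # estimated intercept and slope
r2 <- summary(fm)$r.squared   # proportion of variance explained
r2

# Predict the response for a new x value
predict(fm, newdata = data.frame(x = 25))
```

Extractor functions such as `coef()`, `residuals()` and `predict()` are usually more convenient than reading values off the printed summary.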
R statistical modelling
And much more ...
Dormann & Kühn (2009): Angewandte Statistik für die biologischen Wissenschaften.
R geostatistical analyses
• variograms, Kriging etc.
www.mathworks.de
Hengl 2009
R as programming language
> hi.there <- function() {
+   cat("Hello World!\n")
+ }
> hi.there()
Hello World!
• Build your own functions to keep your code tidy
• Build "new" functions (and write packages)
• Dynamic models …
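A self-defined function can also wrap a small dynamic model. As one possible illustration (not taken from the slides), the sketch below iterates a discrete logistic growth equation. The function name `logistic_growth` and all parameter values are hypothetical.

```r
# Discrete logistic growth: N[t+1] = N[t] + r * N[t] * (1 - N[t] / K)
# N0: initial population size, r: growth rate, K: carrying capacity
logistic_growth <- function(N0, r = 0.5, K = 100, steps = 50) {
  N <- numeric(steps + 1)   # pre-allocate the result vector
  N[1] <- N0
  for (t in seq_len(steps)) {
    N[t + 1] <- N[t] + r * N[t] * (1 - N[t] / K)
  }
  N                         # the last expression is the return value
}

# Run the model with the default parameters
N <- logistic_growth(N0 = 10)
head(N)
tail(N, 1)   # the population approaches the carrying capacity K
```

The trajectory can be displayed with `plot(N, type = "l")`. Default arguments keep the call short while still allowing other growth rates or carrying capacities to be tested.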
R extensions
• Integrate other source codes
• Batch processing
• Call from terminal
R GUIs
http://www.rcommander.com/
http://rstudio.org/
R Literature
• http://www.r-project.org/
– Manuals
– "Contributed Documentation"