8/12/2019 intro_to_R1
1/36
A Short Introduction to R
By Richard Harris, School of Geographical Sciences, University of Bristol
A Short Introduction to R by Richard Harris is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License.
Based on a work at www.social-statistics.org.
You are free:
to Share: to copy, distribute and transmit the work
to Remix: to adapt the work
Under the following conditions:
Attribution: You must attribute the work in the following manner: Based on A Short Introduction to R by Richard Harris (www.social-statistics.org).
Noncommercial: You may not use this work for commercial purposes. Use for education in a recognised higher education institution (a University) is permissible.
Share Alike: If you alter, transform, or build upon this work, you may distribute the resulting work only under the same or similar license to this one.
With the understanding that:
Waiver: Any of the above conditions can be waived if you get permission from the copyright holder (Richard Harris, rich.harris@bris.ac.uk)
Public Domain: Where the work or any of its elements is in the public domain under applicable law, that status is in no way affected by the license.
Other Rights: In no way are any of the following rights affected by the license:
Your fair dealing or fair use rights, or other applicable copyright exceptions and limitations;
The author's moral rights;
Rights other persons may have either in the work itself or in how the work is used, such as publicity or privacy rights.
Notice: For any reuse or distribution, you must make clear to others the license terms of this work (which applies also to derivatives).
(Document version 1, 2012)
Introduction
This document presents a short introduction to R highlighting some geographical functionality.
Specifically, it provides:
A basic overview of the 'nuts and bolts' of R (Session 1)
Some examples of data analysis and simple mapping in R (Session 2)
Some further information about the workings of R (Session 3)
The document is provided in good faith and the contents have been tested by the author. However, use is entirely at the user's risk. Absolutely no responsibility or liability is accepted by the author for consequences arising from howsoever this document is used. It is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 Unported License (see above).
Before starting the following should be considered:
First, you will notice that in this document the pages and, more unusually, the lines are numbered. The reason is educational: it makes directing a class to a specific part of a page easier and faster. For other readers, the line numbers can be ignored.
Second, the sessions presume that, as well as R, a number of additional R packages (libraries) have been installed and are available to use. The complete list of packages used is RgoogleMaps, png, sp and spdep. To install these packages, use
> install.packages(c("RgoogleMaps", "png", "sp", "spdep"))
Further instructions for how to install packages can be found in Section 3.5.1, 'Installing and loading one or more of the packages' on page 35.
Third, each session is written to be completed in a single sitting. If that is not possible, then it would normally be possible to stop at a convenient point, save the workspace before quitting R, then reload the saved workspace when you wish to continue. Note, however, that whilst the additional packages (libraries) need only be installed once, they must be loaded each time you begin again in R and require them. Any objects that were attached before quitting R also need to be attached again to take you back to the point at which you left off. See the sections entitled 'Saving and loading workspaces', 'Attaching a data frame' and 'Installing and loading one or more of the packages'.
Session 1: Getting Started with R
This session provides a brief introduction to how R works and introduces some of the more common commands and procedures.
1.1 About R
R is an open source software package, licensed under the GNU General Public Licence. You can obtain and install it for free, with versions available for PCs, Macs and Linux. To find out what's available, go to the Comprehensive R Archive Network (CRAN) at http://cran.r-project.org/
Being free is not necessarily a good reason to use R. However, R is not just free, it is also well developed, well documented, widely used and well supported by an extensive user community. It is not just software for 'hobbyists'. It is widely used in research, both academic and commercial.
In his book R in a Nutshell (O'Reilly, 2010), Joseph Adler writes, "R is very good at plotting graphics, analyzing data, and fitting statistical models using data that fits in the computer's memory."
Nevertheless, no software is a perfect tool for every job and Adler adds that "it's not good at storing data in complicated structures, efficiently querying data, or working with data that doesn't fit in the computer's memory."
To these caveats it should be added that R does not offer spreadsheet editing of data to the level found, for example, in Microsoft Excel. Consequently, it is often easier to prepare and 'clean' data prior to loading them into R. There is an add-in to R that provides some integration with Excel. Go to http://rcom.univie.ac.at/ and look for RExcel.
A possible barrier to learning R is that it is generally command-line driven. That is, the user types a command that the software interprets and responds to. This can be daunting for those who are used to extensive graphical user interfaces (GUIs) with drop-down menus, tabs, pop-up menus, left or right-clicking and other navigational tools to steer you through a process. It may mean that R takes a while longer to learn; however, that time is well spent. Once you know the commands it is usually much faster to type them than to work through a series of menu options. They can be easily edited to change things such as the size or colour of symbols on a graph, and a log or script of the commands can be saved for use on another occasion.
Saying that, a fairly simple and platform independent GUI called R Commander can be installed (see http://cran.r-project.org/web/packages/Rcmdr/index.html). Field et al.'s book Discovering Statistics Using R provides a comprehensive introduction to statistical analysis in R using both command-lines and R Commander.
1.2 Getting Started
Assuming R has been installed in the normal way on your computer, clicking on the link/shortcut to R on the desktop will open the RGui, offering some drop-down menu options, and also the R Console, within which R commands are typed and executed. The appearance of the RGui differs a little depending upon the operating system being used (Windows, Mac or Linux) but having used one it should be fairly straightforward to navigate around another.
1.3.2 Logging
You can save the contents of the R Console window to a text file. The easiest way to do this is to click on the R Console (to take the focus from the Scripting window) and then use File → Save History (in Windows) or File → Save As (Mac). Note that graphics are not usually plotted in the R Console and therefore need to be saved separately.
1.4 Some R Basics
1.4.1 Functions, assignments and getting help
It is helpful to understand R as an object-oriented system that assigns information to objects within the current workspace. The workspace is simply all the objects that have been created or loaded since beginning the session in R. Look at it this way: the objects are like box files, containing useful information, and the workspace is a larger storage box, keeping the information together. A useful feature of this is that R can operate on multiple tables of data at once: they are just stored as separate objects within the workspace.
To view the objects currently in the workspace, type
> ls()
character(0)
Doing this runs the function ls(), which lists the contents of the workspace. The result, character(0), indicates that the workspace is empty (assuming it currently is).
To find out more about a function, type ? or help with the function name,
> ?ls()
> help(ls)
This will provide details about the function, including examples of its use. It will also list the arguments required to run the function, some of which may be optional and some of which may have default values which can be changed if required. Consider, for example,
> ?log()
A required argument is x, which is the data value or values. Typing log() omits any data and generates an error. However, log(100) works just fine. The argument base takes a default value of e^1 (which is approximately 2.72) and means the natural logarithm is calculated. Using log(100, base=10) gives the common logarithm, which can also be calculated using the convenience function log10(100).
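The behaviour of the default and named arguments described above can be checked directly. The following is only a small sketch; the object names nat, com and com10 are made up for the illustration.

```r
# Natural logarithm: the argument base defaults to exp(1)
nat <- log(100)
# Common logarithm, supplying the base argument by name
com <- log(100, base = 10)
# The convenience function should give the same answer
com10 <- log10(100)
print(c(nat, com, com10))
# Changing the base is the same as dividing two natural logarithms
print(all.equal(com, log(100) / log(10)))
```

Naming the argument (base = 10) is optional here, because base is the second argument of log(), but it makes the intention clearer to a reader.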
The results of mathematical expressions can be assigned to objects, as can the outcome of many commands executed in the R Console. When the object is given a name that is different to other objects within the current workspace, a new object will be created. Where the name and object already exists, the previous contents of the object will be over-written, without warning, so be careful!
> a <- 10 - 5
> print(a)
[1] 5
> b <- 10 * 2
> print(b)
[1] 20
> print(a * b)
[1] 100
> a <- a * b
> print(a)
[1] 100
In these examples the assignment is achieved using the combination of < and -, as in a <- 100. Alternatively, 100 -> a could be used or, more simply, a = 100. The print(..) command can often be omitted, though it is useful, and sometimes necessary (for example, when what you had hoped would appear on screen doesn't).
> f = a * b
> print(f)
[1] 2000
> f
[1] 2000
> sqrt(b)
[1] 4.472136
> print(sqrt(b), digits=3) # The additional parameter now specifies
                           # the number of significant figures
[1] 4.47
> c(a,b) # The c(...) function combines its arguments
[1] 100 20
> c(a,sqrt(b))
[1] 100.000000 4.472136
> print(c(a,sqrt(b)), digits=3)
[1] 100.00 4.47
Although the naming of objects is flexible, there are some exceptions,
> 2a <- 10
Error: unexpected symbol in "2a"
Note also that R is case sensitive, so a and A are different objects
> a <- 10
> A <- 20
> a == A
[1] FALSE
The following is not sensible because it won't appear in the workspace, although it is there,
> .a <- 10
> ls()
[1] "a" "b" "f"
> .a
[1] 10
> rm(.a, A) # Removes the objects .a and A (see below)
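Objects whose names begin with a dot are hidden from ls() but can still be listed on request. A minimal sketch follows, using a separate environment (here called e, a made-up name) so that nothing in your own workspace is touched.

```r
# Work in a separate environment so the example leaves the workspace alone
e <- new.env()
assign("a", 10, envir = e)
assign(".a", 99, envir = e)
visible <- ls(e)                        # names starting with a dot are not shown
everything <- ls(e, all.names = TRUE)   # all.names = TRUE shows them as well
print(visible)
print(everything)
```

In an ordinary session the equivalent command is simply ls(all.names = TRUE).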
1.4.2 Removing objects from the workspace
From typing ls() we know that the workspace is no longer empty. To remove an object from the workspace it can be referenced explicitly or indirectly by its position in the workspace. To see how the second of these options will work, type
> ls()
[1] "a" "b" "f"
The output returned from the ls() function here is a vector of length three where the first element is the first object (alphabetically) in the workspace, the second is the second object, and so forth. We can access specific elements by using notation of the form ls()[index.number]. So, the first element, the first object in the workspace, can be obtained using,
> ls()[1] # Get the brackets right! Some rounded, some square
[1] "a"
> ls()[2]
[1] "b"
Note how the square brackets [...] are being used to reference specific elements within the vector. Similarly,
> ls()[3]
[1] "f"
> ls()[c(1,3)]
[1] "a" "f"
> ls()[c(1,2,3)]
[1] "a" "b" "f"
> ls()[c(1:3)] # 1:3 means the numbers 1 to 3
[1] "a" "b" "f"
Using the remove function, rm(...), the first and third objects in the workspace can be removed using
> rm(list=ls()[c(1,3)])
> ls()
[1] "b"
Alternatively, objects can be removed by name
> rm(b)
To delete all the objects in the workspace and therefore empty it, type the following code but, be warned, there is no undo function. Whenever rm(...) is used the objects are deleted permanently.
> rm(list=ls())
> ls()
character(0) # In other words, the workspace is empty
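Between removing objects one at a time and removing everything, it is sometimes useful to remove only those objects whose names match a pattern. The sketch below assumes some temporary objects named tmp1 and tmp2 (made-up names) and again works in a separate environment, e, purely so that the example is safe to run.

```r
# A separate environment stands in for the workspace in this example
e <- new.env()
for (nm in c("tmp1", "tmp2", "keep")) assign(nm, 1, envir = e)
# ls(pattern = ...) returns only the names matching a regular expression
to.remove <- ls(e, pattern = "^tmp")
rm(list = to.remove, envir = e)
remaining <- ls(e)
print(remaining)
```

In the workspace itself the same idea would be written rm(list = ls(pattern = "^tmp")). As before, there is no undo.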
1.4.3 Saving and loading workspaces
Because objects are deleted permanently, a sensible precaution prior to using rm(...) is to save the workspace. To do so permits the workspace to be reloaded if necessary and the objects recovered. One way to save the workspace is to use
> save.image(file.choose(new=T))
Alternatively, the drop-down menus can be used (File → Save Workspace in the Windows version of the RGui). In either case, type the extension .RData manually else it risks being omitted, making it harder to locate and reload what has been saved. Try creating a couple of objects in your workspace and then save it with the name workspace1.RData
2o load a previously saved (or#space, use
> load(&ile.choose())
or the drop!do(n enus%
When quitting R, it will prompt to save the workspace image. If the option to save is chosen it will be saved to the file .RData within the working directory. Assuming that directory is the default one, the workspace will be reloaded automatically each and every time R is opened, which could be useful or it could be irritating. To stop it, locate and delete the file. The current working directory is identified using the get working directory function, getwd(), and changed most easily using the drop-down menus.
> getwd()
[1] "/Users/ggrjh"
Your working directory will differ from the above.
A good strategy for file management is to create a new folder for each project in R, saving the workspace regularly in it using a naming convention such as Dec_8_1.RData, Dec_8_2.RData etc. That way you can easily find and recover work.
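The file management strategy just described can also be scripted rather than done through the menus. This is only a sketch: the folder name my_project and the file naming are assumptions for the illustration, and a temporary directory is used so that nothing is written to your own folders.

```r
# A hypothetical project folder, created inside R's temporary directory
project.dir <- file.path(tempdir(), "my_project")
dir.create(project.dir, showWarnings = FALSE)
x <- 1:5
# Build a dated file name, remembering the .RData extension
ws.file <- file.path(project.dir, paste0("workspace_", Sys.Date(), ".RData"))
save(x, file = ws.file)   # save.image(ws.file) would save the whole workspace
rm(x)                     # the object is now gone from the workspace ...
load(ws.file)             # ... and is recovered from the saved file
print(x)
```

For real work, replace tempdir() with the path to the project folder, or change the working directory with setwd() first.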
1.5 Quitting R
Before quitting R, you may wish to save the workspace. To quit R use either the drop-down menus or
> q()
As promised, you will be prompted whether to save the workspace. Answering yes will save the workspace to the file .RData in the current working directory (see section 1.4.3, 'Saving and loading workspaces', on page 10, above). To exit without the prompt, use
> q(save = "no")
Or, more simply,
> q("no")
1.6 Getting Help
In addition to the use of the ? or help(...) documentation and the material available at CRAN, http://cran.r-project.org/, R has an active user community. Helpful mailing lists can be accessed from www.r-project.org/mail.html.
Perhaps the best all round introduction to R is An Introduction to R, which is freely available at CRAN (http://cran.r-project.org/manuals.html) or by using the drop-down Help menus in the RGui. It is clear and succinct.
I also have a free introduction to statistical analysis in R which accompanies the book Statistics for Geography and Environmental Science. It can be obtained from http://www.social-statistics.org/?p=354.
There are many books available. My favourite, with a moderate level statistical leaning and written with clarity is,
Maindonald, J. & Braun, J., 2007. Data Analysis and Graphics using R (2nd edition). Cambridge: CUP.
I also find useful,
Adler, J., 2010. R in a Nutshell. O'Reilly: Sebastopol, CA.
Crawley, MJ, 2005. Statistics: An Introduction using R. Chichester: Wiley (which is a shortened version of The R Book by the same author).
Field, A., Miles, J. & Field, Z., 2012. Discovering Statistics Using R. London: Sage.
However, none of these books is about mapping or spatial analysis (of particular interest to me as a geographer). For that, the authoritative guide making the links between geographical information [...] Applied Spatial Data Analysis with R. Berlin: Springer.
Also helpful is,
Ward, M.D. & Skrede Gleditsch, K., 2008. Spatial Regression Models. London: Sage. (Which uses R code examples).
The following book has a short section of maps as well as other graphics in R (and is also, as the title suggests, good for practical guidance on how to analyse surveys using cluster and stratified sampling, for example):
Lumley, T., 2010. Complex Surveys. A Guide to Analysis Using R. Hoboken, NJ: Wiley.
Springer publish an ever-growing series of books under the banner Use R! If you are interested in visualization, time-series analysis, Bayesian approaches, econometrics, data mining, ..., then you'll find something of relevance at http://www.springer.com/series/6...
Next the number of columns and rows, and a check, row-by-row, to see if the data are complete (have no missing data).
> ncol(schools.data)
> nrow(schools.data)
> complete.cases(schools.data)
It is not the most comprehensive check but everything appears to be in order.
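With several hundred rows, reading the row-by-row output of complete.cases(...) by eye is error prone, so it can help to summarise it. The sketch below uses a small made-up data frame, toy, rather than the schools data.

```r
# A made-up data frame with two missing values
toy <- data.frame(id = 1:4,
                  attainment = c(28.1, NA, 30.4, 27.9),
                  fsm = c(0.2, 0.1, NA, 0.3))
n.incomplete <- sum(!complete.cases(toy))   # how many rows have a missing value
na.by.column <- colSums(is.na(toy))         # where the missing values are
all.complete <- all(complete.cases(toy))    # a single TRUE/FALSE answer
print(n.incomplete)
print(na.by.column)
print(all.complete)
```

Applied to the schools data, all(complete.cases(schools.data)) should return TRUE if everything is in order.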
2.3 Some simple graphics
The file schools.csv contains information about the location and some attributes of schools in Greater London (in 2008). The locations are given as a grid reference (Easting, Northing). The information is not real but is realistic. It should not, however, be used to make inferences about real schools in London.
Of particular interest is the average attainment on leaving primary school of pupils entering their first year of secondary school. Do some schools in London attract higher attaining pupils more than others? The variable attainment contains this information.
A stripchart and then a histogram will show that (not surprisingly) there is variation in the average prior attainment by school.
> attach(schools.data)
> stripchart(attainment, method="stack", xlab="Mean Prior Attainment by School")
> hist(attainment, col="light blue", border="dark blue", freq=F, ylim=c(0,0.30),
+ xlab="Mean attainment")
Here the histogram is scaled so the total area sums to one. To this we can add a rug plot,
> rug(attainment)
also a density curve, a Normal curve for comparison and a legend.
> lines(density(sort(attainment)))
> xx <- seq(from=23, to=35, by=0.1) # The names xx, yy and the upper limit 35 are assumed (illegible in the source)
> yy <- dnorm(xx, mean(attainment), sd(attainment))
> lines(xx, yy, lty="dotted")
> rm(xx, yy)
> legend("topright", legend=c("density curve","Normal curve"),
+ lty=c("solid","dotted"))
It would be interesting to know if attainment varies by school type. A simple way to consider this is to produce a box plot. The data contain a series of dummy variables for each of a series of school types (Voluntary Aided Church of England: coe = 1; Voluntary Aided Roman Catholic: rc = 1; Voluntary controlled faith school: vol.con = 1; another type of faith school: other.faith = 1; a selective school (with an entrance exam): selective = 1). We will combine these into a single, categorical variable then produce the box plot showing the distribution of average attainment by school type.
First the categorical variable:
> school.type <- rep("Not Faith/Selective", times=nrow(schools.data))
> school.type[coe==1] <- "VA CoE"
> school.type[rc==1] <- "VA RC"
> school.type[vol.con==1] <- "VC"
> school.type[other.faith==1] <- "Other Faith"
> school.type[selective==1] <- "Selective"
> school.type <- factor(school.type)
> levels(school.type)
[1] "Not Faith/Selective" "Other Faith" "Selective" [etc.]
Now the box plots:
> par(mai=c(1,1.5,0.5,0.5)) # Changes the graphic margins (the values after each decimal point are assumed)
> boxplot(attainment ~ school.type, horizontal=T, xlab="Mean attainment", las=1,
+ cex.axis=0.8) # Includes options to draw the boxes and labels horizontally
> abline(v=mean(attainment), lty="dashed") # Adds the mean value to the plot
> legend("topright", legend="Grand Mean", lty="dashed")
Figure 2.1. A histogram with annotation in R
Figure 2.2. Mean prior attainment by school type
Not surprisingly, the selective schools recruit the pupils with highest average prior attainment.
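The way a set of dummy variables was combined into one categorical variable for the box plot can be tried on a small scale. The data frame toy below is made up for the purpose, with only three of the school types, and is not the schools data.

```r
# Made-up dummy variables: each row is a school, at most one dummy equals 1
toy <- data.frame(coe = c(1, 0, 0, 0),
                  rc = c(0, 1, 0, 0),
                  selective = c(0, 0, 1, 0))
# It is worth checking that no school belongs to two categories at once
overlap <- any(rowSums(toy) > 1)
# Start with the default category, then overwrite where a dummy is 1
type <- rep("Other", times = nrow(toy))
type[toy$coe == 1] <- "VA CoE"
type[toy$rc == 1] <- "VA RC"
type[toy$selective == 1] <- "Selective"
type <- factor(type)
print(overlap)
print(table(type))
```

If two dummy variables could both equal 1 for the same school, the later assignment would silently overwrite the earlier one, which is why the check on rowSums(...) is included.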
schools in London by the proportion of their intake who are free school meal eligible. (The result is the regression line shown on the scatterplot above.)
The second adds a variable giving the proportion of the intake of a white ethnic group.
The third adds a dummy variable indicating whether the school is selective or not.
> model1 <- lm(attainment ~ fsm, data=schools.data)
> summary(model1)
Call:
lm(formula = attainment ~ fsm, data = schools.data)
[etc.]
> model2 <- lm(attainment ~ fsm + white, data=schools.data)
> summary(model2)
Call:
lm(formula = attainment ~ fsm + white, data = schools.data)
[etc.]
> model3 <- update(model2, . ~ . + selective)
> summary(model3)
Call:
lm(formula = attainment ~ fsm + white + selective, data = schools.data)
[etc.]
Looking at the adjusted R-squared value, each model appears to be an improvement on the one that precedes it (marginally so for model 2). However, looking at the last model (3), we may suspect that we could drop the white ethnicity variable with no significant loss in the amount of variance explained. An analysis of variance confirms that to be the case.
> model4 <- update(model3, . ~ . - white)
> anova(model4, model3)
Analysis of Variance Table
Model 1: attainment ~ fsm + selective
Model 2: attainment ~ fsm + white + selective
[etc.]
The residual error, measured by the residual sum of squares (RSS), is not very different for the two models, and that difference, 0.882, is not significant (F = 1.045, p = 0.307).
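The comparison of two nested models can be rehearsed with simulated data. In the sketch below everything is made up: x1 genuinely affects y whereas x2 plays no part, so we would expect dropping x2 to make little difference.

```r
set.seed(1)
n <- 200
x1 <- runif(n)
x2 <- runif(n)
y <- 30 - 6 * x1 + rnorm(n)          # x2 is not used to generate y
full <- lm(y ~ x1 + x2)
reduced <- update(full, . ~ . - x2)  # the same model without x2
comparison <- anova(reduced, full)   # simpler model first, then the fuller one
print(comparison)
# The 'Sum of Sq' entry is the difference in the residual sums of squares
rss.diff <- comparison$RSS[1] - comparison$RSS[2]
print(rss.diff)
```

The F test in the table asks whether that reduction in the RSS is larger than would be expected by chance given the extra variable.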
2.5 Some simple maps
The schools data contain geographical coordinates and are therefore geographical data. Consequently they can be mapped. The simplest way for point data is to use a 2-dimensional plot, making sure the aspect ratio is fixed correctly.
> plot(Easting, Northing, asp=1, main="Map of London schools")
Amongst the attribute data for the schools, the variable esl gives the proportion of pupils who speak English as an additional language. It would be interesting for the size of the symbol on the map to be proportional to it.
> plot(Easting, Northing, asp=1, main="Map of London schools",
+ cex=sqrt(esl*5)) # The multiplier 5 is assumed (the digit is illegible in the source)
It might also be nice to add a little colour to the map. We might, for example, change the default plotting 'character' to a filled circle with a yellow background.
> plot(Easting, Northing, asp=1, main="Map of London schools",
+ cex=sqrt(esl*5), pch=21, bg="yellow")
A more interesting option would be to have the circles filled with a colour gradient that is related to a second variable in the data, the proportion of pupils eligible for free school meals for example. To achieve this, we can begin by creating a simple colour palette:
> palette <- c("yellow","orange","red","purple")
We now cut the free school meals eligibility variable into quartiles (four classes, each containing approximately the same number of observations).
> map.class <- cut(fsm, quantile(fsm), labels=FALSE, include.lowest=TRUE)
What has happened is that the fsm variable has been split into four groups with the value 1 given to the first quarter of the data (schools with the lowest proportions of eligible pupils), the value 2 given to the next quarter, then 3, and finally the value 4 for schools with the highest proportions of FSM eligible pupils.
There are, then, now four map classes and the same number of colours in the palette. Schools in map class 1 (and with the lowest proportion of fsm-eligible pupils) will be coloured yellow, the next class will be orange, and so forth.
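To see how cutting a variable into quartiles and indexing a palette by class fit together, here is a small sketch with twenty made-up values (fsm.toy) in place of the real fsm variable. The palette is stored under the made-up name pal.

```r
set.seed(42)
fsm.toy <- runif(20)                       # made-up proportions
# quantile() with its defaults returns the minimum, quartiles and maximum
breaks <- quantile(fsm.toy)
# labels = FALSE returns the class numbers 1 to 4 instead of interval labels
classes <- cut(fsm.toy, breaks, labels = FALSE, include.lowest = TRUE)
counts <- table(classes)
print(counts)
# Using the class numbers as an index picks one colour per observation
pal <- c("yellow", "orange", "red", "purple")
cols <- pal[classes]
print(head(data.frame(fsm.toy, classes, cols)))
```

Without include.lowest = TRUE the smallest value would fall outside the first interval and be given the class NA, so it would not be plotted.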
Bringing it all together,
> plot(Easting, Northing, asp=1, main="Map of London schools",
+ cex=sqrt(esl*5), pch=21, bg=palette[map.class]) # The multiplier 5 is assumed (illegible in the source)
It would be good to add a legend, and perhaps a scale bar and North arrow. Nevertheless, as a first map in R this isn't too bad!
Figure 2.3. A simple point map in R
Why don't we be a bit more ambitious and overlay the map on a Google Maps tile, adding a legend as we do so? This requires us to load an additional library for R and to have an active Internet connection.
> library(RgoogleMaps)
If it hasn't been installed, it could be using install.packages(c("RgoogleMaps","png")) (which installs both it and another package, png, that it requires for many functions).
Assuming that the data frame, schools.data, remains in the workspace and attached (it will be if you have followed the instructions above), and that the colour palette created above has not been deleted, then the map shown in Figure 2.4 is created with the following code:
> MyMap <- MapBackground(lat=Lat, lon=Long)
> PlotOnStaticMap(MyMap, Lat, Long, cex=sqrt(esl*5), pch=21,
+ bg=palette[map.class]) # The multiplier 5 is assumed (illegible in the source)
> legend("topleft", legend=paste("<",tapply(fsm, map.class, max)), pch=21, pt.bg=palette,
+ pt.cex=1.5, bg="white", title="P(FSM-eligible)") # pt.cex=1.5 is assumed
> legVals <- seq(from=0.2,to=1,by=0.2)
> legend("topright", legend=round(legVals,3), pch=21, pt.bg="white",
+ pt.cex=sqrt(legVals*5), bg="white", title="P(ESL)")
Remember that the data are simulated. The points shown on the map are not the true locations of schools in London.
Figure 2.4. A slightly less simple map produced in R
2.6 Some simple geographical analysis
Remember the regression models from earlier? It would be interesting to test the assumption that the residuals exhibit independence by looking for spatial dependencies. To do this we will consider to what degree the residual value for any one school correlates with the mean residual value for its six nearest other schools (the choice of six is completely arbitrary).
First, we will take a copy of the schools data and convert that into an explicitly spatial object in R:
> detach(schools.data)
> schools.xy <- schools.data # The suffix xy is assumed (illegible in the source)
> library(sp)
> attach(schools.xy)
> coordinates(schools.xy) <- c("Easting", "Northing")
> # Converts into a spatial object
> class(schools.xy)
> detach(schools.xy)
> proj4string(schools.xy) <- CRS("+proj=tmerc +datum=OSGB36")
> # Sets the Coordinate Referencing System
Second, we find the six nearest neighbours for each school.
> library(spdep)
> nearest.six <- knearneigh(schools.xy, k=6, RANN=F)
> # RANN = F to override the use of the RANN package that may not be installed
We can learn from this that the six nearest schools to the first school in the data (row 1) are schools 5, 38, 2, 40, 223 and 6:
> nearest.six$nn[1,]
[1] 5 38 2 40 223 6
The neighbours object, nearest.six, is an object of class knn:
> class(nearest.six)
It is next converted into the more generic class of neighbours.
> neighbours <- knn2nb(nearest.six)
> class(neighbours)
[1] "nb"
> summary(neighbours)
Neighbour list object:
Number of regions: 367
Number of nonzero links: 2202
Percentage nonzero weights: 1.634877
Average number of links: 6
[etc.]
The connections between each point and its neighbours can then be plotted. It may take a few minutes.
> plot(neighbours, coordinates(schools.xy))
Having identified the six nearest neighbours to each school we could give each equal weight in a spatial weights matrix or, alternatively, decrease the weight with distance away (so the first nearest neighbour gets most weight and the sixth nearest the least). Creating a matrix with equal weight given to all neighbours is straightforward.
> spatial.weights <- nb2listw(neighbours)
(The other possibility will not be considered further here but is achieved by creating then supplying a list of general weights to the function)
We now have all the information required to test whether there are spatial dependencies in the residuals. The answer is yes (Moran's I = 0.218, p < 0.001, indicating positive spatial autocorrelation).
> lm.morantest(model4, spatial.weights)
Global Moran's I for regression residuals
data:
model: lm(formula = attainment ~ fsm + selective, data = schools.data)
weights: spatial.weights
[etc.]
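To make the idea behind the test a little less of a black box, the following sketch calculates Moran's I by hand in base R, using equal weights for the k nearest neighbours of each point. It is an illustration under stated assumptions, not the spdep implementation: the coordinates and values are simulated, the statistic is computed on raw values rather than regression residuals, and no significance test is attempted.

```r
set.seed(10)
n <- 50
xy <- cbind(runif(n), runif(n))        # made-up point coordinates
z <- xy[, 1] + rnorm(n, sd = 0.2)      # values with an east-west trend
k <- 6                                 # number of nearest neighbours
d <- as.matrix(dist(xy))               # distances between every pair of points
diag(d) <- Inf                         # a point is not its own neighbour
W <- matrix(0, n, n)                   # the spatial weights matrix
for (i in 1:n) {
  nn <- order(d[i, ])[1:k]             # indices of the k nearest points to i
  W[i, nn] <- 1 / k                    # equal weights, so each row sums to one
}
zc <- z - mean(z)                      # deviations from the mean
# Moran's I: weighted cross-products of neighbouring deviations over the variance
moran.I <- (n / sum(W)) * sum(W * outer(zc, zc)) / sum(zc^2)
print(moran.I)
```

Because nearby points have similar values in this simulation, the statistic is positive. Values close to zero would suggest no spatial pattern.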
> save.image(file.choose(new=T))
> rm(list=ls()) # Be careful, it deletes everything!
Session 3: A Little More about the workings of R
This session provides a little more guidance on the 'inner workings' of R. All the commands are contained in the file session3.R and can be run using it (see 'Scripting' on p.7).
3.1 Classes and types
Let us create two objects, each a vector containing ten elements. The first will be the numbers from one to ten, recorded as integers. The second will be the same sequence but now recorded as real numbers (that is, 'floating point' numbers, those with a decimal place).
> b <- 1:10
> b
 [1] 1 2 3 4 5 6 7 8 9 10
> c <- seq(from=1.0, to=10.0, by=1)
> c
 [1] 1 2 3 4 5 6 7 8 9 10
Note that in the second case, we could just type,
> c <- seq(1, 10, 1)
> c
 [1] 1 2 3 4 5 6 7 8 9 10
This works because if we don't explicitly define the arguments (so omit from=1 etc.) then R will assume that we are giving values to the arguments in their default order, which in this case is from, to and by. Type ?seq and look under Usage for this to make a little more sense.
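A short sketch of the difference between matching arguments by position and by name follows. The object names s1 to s4 are made up for the example.

```r
s1 <- seq(from = 1, to = 10, by = 1)   # every argument named
s2 <- seq(1, 10, 1)                    # matched by position: from, to, by
s3 <- seq(by = 1, to = 10, from = 1)   # named arguments can come in any order
s4 <- seq(10, 1)                       # by position: from = 10, to = 1
print(identical(s1, s2))
print(identical(s1, s3))
print(s4)
```

Relying on position is quicker to type, but naming the arguments guards against supplying them in the wrong order, as s4 shows.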
In any case, the two objects, b and c, are printed the same on screen but one is an object of class integer whereas the other is an object of class numeric and of type double (double precision in the memory space).
> class(b)
[1] "integer"
> class(c)
[1] "numeric"
> typeof(c)
[1] "double"
Often it is possible to coerce an object from one class and type to another.
> b <- 1:10
> class(b)
[1] "integer"
> b <- as.double(b)
> class(b)
[1] "numeric"
> typeof(b)
[1] "double"
> class(c)
[1] "numeric"
> c <- as.integer(c)
> class(c)
[1] "integer"
> c
 [1] 1 2 3 4 5 6 7 8 9 10
> c <- as.character(c)
> class(c)
[1] "character"
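Coercion does not always succeed, or may not do quite what is expected. The following sketch shows a few cases worth knowing about; the object names are made up for the example.

```r
truncated <- as.integer(3.9)     # the decimal part is dropped, not rounded
from.text <- as.integer("7")     # digits stored as characters convert cleanly
# Text that is not a number cannot be converted: the result is NA, with a warning
bad <- suppressWarnings(as.integer("seven"))
mixed.num <- class(c(1L, 2.5))   # integer with double becomes "numeric"
mixed.chr <- class(c(1, "a"))    # anything with character becomes "character"
print(truncated)
print(from.text)
print(is.na(bad))
print(c(mixed.num, mixed.chr))
```

The last two lines show that R will also coerce silently when values of different types are combined in one vector.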
> set.seed(101)
> var2 <- 3 * var1 + 10 + rnorm(100, 0, 25)
# which, because n, mean and sd are the first three arguments into rnorm,
# is the same as writing var2 <- 3 * var1 + 10 + rnorm(n=100, mean=0, sd=25)
> head(var2)
[etc.]
Next, the two variables are gathered together in a data table, of class data frame, where each row is an observation and each column is a variable. (There is more about data frames on page 27, in Section 3.2 'Data frames')
> mydata <- data.frame(x = var1, y = var2)
> class(mydata)
[1] "data.frame"
> head(mydata)
[etc.]
> nrow(mydata) # The number of rows in the data
[1] 100
> ncol(mydata) # The number of columns
[1] 2
In this case, plotting the data frame will produce a scatter plot. (The line of best fit also shown in Figure 3.1 will be added shortly.)
> plot(mydata)
If there had been more than two columns in the data table, or if they had not been arranged in x, y order, then the plot could be produced by referencing the columns directly. All the following are equivalent:
> with(mydata, plot(x, y))    # Here the order is x, y
> with(mydata, plot(y ~ x))   # Here it is y ~ x
> plot(mydata$x, mydata$y)
> plot(mydata[,1], mydata[,2])   # Plot using the first and second columns
> plot(mydata[,2] ~ mydata[,1])
The attach(...) command could also be used. This is introduced in Section 3.2.2, 'Attaching a data
frame'.
> model1 <- lm(y ~ x, data=mydata)   # lm is short for linear model
> class(model1)
[1] "lm"
model1 is an object of class lm, short for linear model. Using the summary(...) function summarises
the relationship between y and x.
> summary(model1)
Call:
lm(formula = y ~ x, data = mydata)
[etc.]
Now using the plot(...) function on the object of class lm has an effect that is somewhat different
from the previous two cases. It produces a series of diagnostic plots to help check the assumptions of
the linear model.
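The parts of a fitted model can also be extracted for further use instead of being read off the printed summary. The following is a minimal sketch on made-up data: the names toy, toy.model, toy.coefs and toy.summary are hypothetical, and the alternating disturbance of -1 and +1 is an assumption chosen so that the result does not depend on a random seed.

```r
# Made-up data: y is roughly 3 * x + 10 with a small, fixed disturbance
toy <- data.frame(x = 1:20)
toy$y <- 3 * toy$x + 10 + rep(c(-1, 1), times = 10)

toy.model <- lm(y ~ x, data = toy)

toy.coefs <- coef(toy.model)         # named vector: "(Intercept)" and "x"
toy.summary <- summary(toy.model)    # the object behind the printed summary

toy.summary$coefficients             # estimates, standard errors, t and p values
toy.summary$r.squared                # proportion of variance explained
toy.summary$sigma                    # residual standard error
df.residual(toy.model)               # residual degrees of freedom (n - 2 here)
head(residuals(toy.model))           # first few residuals
```

The estimated slope should be close to 3 and the intercept close to 10, since those were the values used to build the data.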
> names(mydata)
[1] "x" "y"
or with
> colnames(mydata)
[1] "x" "y"
The row names appear to be the numbers from 1 to 100 (the number of rows in the data), though
actually they are character data:
> rownames(mydata)
 [1] "1" "2" "3" "4" "5" "6" "7" "8" [etc.]
> class(rownames(mydata))
[1] "character"
The column names can be changed either individually or together. Individually:
> names(mydata)[1] <- "v1"
> names(mydata)[2] <- "v2"
> names(mydata)
[1] "v1" "v2"
And all at once:
> names(mydata) <- c("x","y")
> names(mydata)
[1] "x" "y"
... as can the row names,
> rownames(mydata)[1] <- "0"
> rownames(mydata)
 [1] "0" "2" "3" "4" "5" "6" "7" "8" [etc.]
> rownames(mydata) = seq(from=0, by=1, length.out=nrow(mydata))
> rownames(mydata)
 [1] "0" "1" "2" "3" "4" "5" "6" "7" "8" [etc.]
The above can be especially useful when merging data tables with GIS shapefiles in R (because the
first entry in an attribute table for a shapefile usually is given an ID of 0). Otherwise, it is usually
easiest for the first row in a data table to be labelled 1, so let's put them back to how they were.
> rownames(mydata) = 1:nrow(mydata)
> rownames(mydata)
 [1] "1" "2" "3" "4" "5" "6" "7" "8" [etc.]
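Renaming by position, as above, relies on knowing the column order. A minimal sketch of two related idioms follows, using a small made-up data frame (named.toy, the new name "response" and the row labels "a", "b", "c" are all hypothetical).

```r
# A small made-up data frame
named.toy <- data.frame(x = c(5, 6, 7), y = c(1, 2, 3))

# Rename a column by matching its current name rather than its position
names(named.toy)[names(named.toy) == "y"] <- "response"
names(named.toy)

# Row names can be used to look up cells
rownames(named.toy) <- c("a", "b", "c")
toy.cell <- named.toy["b", "response"]

# Assigning NULL restores the automatic row names "1", "2", "3"
rownames(named.toy) <- NULL
rownames(named.toy)
```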
3.2.1 Referencing rows and columns in a data frame
The square bracket notation can be used to index specific rows, columns or cells in the data frame.
For example:
> mydata[1,]   # The first row of data
> mydata[2,]   # The second row of data
> round(mydata[2,],2)   # The second row, rounded to 2 decimal places
> mydata[nrow(mydata),]   # The final row of the data
> mydata[,1]   # The first column of data
> mydata[,2]   # The second column, which is also ...
> mydata[,ncol(mydata)]   # ... the final column of data
> mydata[1,1]   # The data in the first row of the first column
> mydata[5,2]   # The data in the fifth row of the second column
> round(mydata[5,2],0)
[1] 338
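What the square brackets return depends on what is selected: a row comes back as a data frame, whereas a single column comes back as a plain vector unless drop = FALSE is given. A minimal sketch, using a made-up data frame (index.toy and its values are hypothetical):

```r
# A small made-up data frame
index.toy <- data.frame(x = c(10, 20, 30), y = c(1.234, 5.678, 9.1011))

toy.row <- index.toy[2, ]                   # second row: still a data frame
toy.col <- index.toy[, 2]                   # second column: a numeric vector
toy.col.df <- index.toy[, 2, drop = FALSE]  # second column kept as a data frame
toy.some <- index.toy[c(1, 3), "y"]         # rows 1 and 3 of the column named y
toy.rest <- index.toy[-1, ]                 # everything except the first row
toy.cell <- round(index.toy[2, 2], 1)       # one cell, rounded to 1 decimal place
```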
Specific columns of data can also be referenced using the $ notation
> mydata$x   # Equivalent to mydata[,1] because the column name is x
> mydata$y
> summary(mydata$x)
> summary(mydata$y)
> mean(mydata$x)
> median(mydata$y)
> sd(mydata$x)   # Gives the standard deviation of x
> boxplot(mydata$y)
> boxplot(mydata$y, horizontal=T, main="Boxplot of variable y")
(Boxplots are sometimes said to be easier to read when drawn horizontally)
One way to avoid the use of the $ notation is to use the function with(...) instead:
> with(mydata, var(x))   # Gives the variance of x
> with(mydata, plot(y, xlab="Observation number"))
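A third way to reach a column is the double square bracket, which is convenient when the column name is itself stored in a variable. A minimal sketch with made-up data (access.toy and col.name are hypothetical names):

```r
# A small made-up data frame
access.toy <- data.frame(x = c(2, 4, 6), y = c(1, 3, 5))

col.name <- "x"
by.dollar <- access.toy$x              # the name is typed literally
by.bracket <- access.toy[[col.name]]   # the name is looked up from the variable
wrong <- access.toy$col.name           # NULL: there is no column called "col.name"

# with() lets several columns be used in one expression
toy.diff <- with(access.toy, mean(y) - mean(x))
```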
3.2.2 Attaching a data frame
Sometimes any of the ways to access a specific part of a data table becomes tiresome and it is useful
to reference the column or variable name directly. For example, instead of having to type
mean(mydata[,1]), mean(mydata$x) or with(mydata, mean(x)) it would be easier just to refer to the
variable of interest, x, as in mean(x).
To achieve this the attach(...) command is used. Compare, for example,
> mean(x)
Error in mean(x) : object 'x' not found
(which generates an error because there is not an object called x in the workspace; it is only a
column name within the data frame mydata) with
> attach(mydata)
> mean(x)
which works fine. If, to use the earlier analogy, objects in R's workspace are like box files, then now
you have opened one up and its contents (which include the variable x) are visible.
To detach the contents of the data frame use detach(...)
> detach(mydata)
> mean(x)
Error in mean(x) : object 'x' not found
It is sensible to use detach when the data frame is no longer being used or else confusion can arise
when multiple data frames contain the same column names, as in the following example:
> attach(mydata)
> mean(x)   # This will give the mean of mydata$x
> mydata2 = data.frame(x = 1:10, y = 11:20)
> head(mydata2)
  x  y
1 1 11
2 2 12
3 3 13
4 4 14
5 5 15
6 6 16
> attach(mydata2)
The following object(s) are masked from mydata:
    x, y
> mean(x)   # This will now give the mean of mydata2$x
[1] 5.5
> detach(mydata2)
> mean(x)
> detach(mydata)
> rm(mydata2)
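A related point, offered here as an aside and not taken from the text above, is that attach(...) makes the columns available as a copy: changes made to the data frame afterwards are not seen through the attached names. A minimal sketch with made-up data (attach.toy and the column name u1 are hypothetical):

```r
# A small made-up data frame with an unusual column name to avoid clashes
attach.toy <- data.frame(u1 = 1:3)

attach(attach.toy)
attach.toy$u1 <- attach.toy$u1 * 10   # change the data frame after attaching

attached.mean <- mean(u1)             # uses the attached copy: still 2
direct.mean <- mean(attach.toy$u1)    # uses the updated data frame: 20

detach(attach.toy)                    # tidy up the search path
```

Using with(attach.toy, mean(u1)) avoids the problem because it always looks at the data frame as it currently is.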
3.2.3 Sub-setting the data table and logical queries
Subsets of a data frame can be created by referencing specific rows within it. For example, imagine
we want a table only of those observations that have a value above the mean of some variable.
> attach(mydata)
> subset <- which(x > mean(x))
> class(subset)
[1] "integer"
> subset
 [1]  2  4  5  7  8  9 11 12 [etc.]
> mydata.sub <- mydata[subset,]
> head(mydata.sub)
Note how the row names of this subset have been inherited from the parent data frame.
A more direct approach is to define the subset as a logical vector that is either true or false
dependent upon whether a condition is met.
> subset <- x > mean(x)
> class(subset)
[1] "logical"
> subset
  [1] FALSE  TRUE FALSE  TRUE  TRUE FALSE  TRUE  TRUE  TRUE [etc.]
> mydata.sub <- mydata[subset,]
> head(mydata.sub)
A yet more parsimonious way of achieving the same is:
> mydata.sub <- mydata[x > mean(x),]
# Selects those rows that meet the logical condition, and all columns
> head(mydata.sub)
In the same way, to select those rows where x is greater than or equal to the mean of x and y is
greater than or equal to the mean of y
> mydata.sub <- mydata[x >= mean(x) & y >= mean(y),]
# The symbol & is used for and
Or, those rows where x is less than the mean of x or y is less than the mean of y
> mydata.sub <- mydata[x < mean(x) | y < mean(y),]
# The symbol | is used for or
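The sub-setting approaches above can be sketched on a data frame small enough to check by eye. The data and object names below (sub.toy, idx, keep and so on) are made up for illustration, and the columns are referenced with $ so that nothing needs to be attached.

```r
# A small made-up data frame: mean of x is 4.6, mean of y is 4.4
sub.toy <- data.frame(x = c(1, 5, 3, 8, 6), y = c(2, 1, 9, 7, 3))

idx <- which(sub.toy$x > mean(sub.toy$x))   # integer positions of the matches
keep <- sub.toy$x > mean(sub.toy$x)         # logical vector, one value per row

by.position <- sub.toy[idx, ]
by.logical <- sub.toy[keep, ]               # the same rows, selected directly
rownames(by.logical)                        # row names inherited from sub.toy

# Combining conditions: & means and, | means or
both <- sub.toy[sub.toy$x >= mean(sub.toy$x) & sub.toy$y >= mean(sub.toy$y), ]
either <- sub.toy[sub.toy$x < mean(sub.toy$x) | sub.toy$y < mean(sub.toy$y), ]
```

Rows 2, 4 and 5 have x above its mean; only row 4 is at or above both means; rows 1, 2, 3 and 5 fall below at least one of them.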
3.2.4 Missing data
Missing data is given the value NA. For example,
> mydata[1,1] = NA
> mydata[2,2] = NA
> head(mydata)
R will, by default, report NA or an error when some calculations are tried with missing data:
> mean(mydata$x)
[1] NA
> quantile(mydata$y)
Error in quantile.default(mydata$y) :
  missing values and NaN's not allowed if 'na.rm' is FALSE
To overcome this, the default can be changed or the missing data removed.
> mean(mydata$x, na.rm=T)
> quantile(mydata$y, na.rm=T)   # Divides the data into quartiles
For the second, there are various ways to remove the missing data. For example ...
> subset <- !is.na(mydata$x)
... creates a logical vector which is true where the data values of x are not missing (the ! in the
expression means not):
> head(subset)
[1] FALSE  TRUE  TRUE  TRUE  TRUE  TRUE
Using the subset,
> x2 <- mydata$x[subset]
> mean(x2)
More succinctly,
> with(mydata, mean(x[!is.na(x)]))
Alternatively, a new data frame could be created without any missing data whereby any row with
any missing value is omitted.
> subset <- complete.cases(mydata)
> head(subset)
[1] FALSE FALSE  TRUE  TRUE  TRUE  TRUE
> mydata.complete = mydata[subset,]
> head(mydata.complete)
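The same steps can be tried on a data frame with known gaps. The sketch below uses made-up data (na.toy and the other object names are hypothetical) and also shows na.omit(...), a base R shortcut that is not used in the text above but gives the same rows as indexing with complete.cases(...).

```r
# A small made-up data frame with one missing value in each column
na.toy <- data.frame(x = c(NA, 2, 4, 6), y = c(1, NA, 3, 5))

mean.default <- mean(na.toy$x)               # NA, because of the missing value
mean.narm <- mean(na.toy$x, na.rm = TRUE)    # mean of 2, 4 and 6

n.missing <- colSums(is.na(na.toy))          # number of NAs in each column

# Two ways to keep only the rows with no missing values
toy.complete <- na.toy[complete.cases(na.toy), ]
toy.omitted <- na.omit(na.toy)
```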
3.2.5 Reading data from a file into a data frame
The accompanying file schools.csv contains information about the location and some attributes of
schools in Greater London (in 2009). The locations are given as a grid reference (Easting, Northing).
The information is not real but is realistic. It should not, however, be used to make
inferences about real schools in London.
A standard way to read a file into a data frame, with cases corresponding to lines and variables to
fields in the file, is to use the read.table(...) command.
> ?read.table
In the case of schools.csv, it is comma delimited and has column headers. Looking through the
arguments for read.table the data might be read into R using
> schools.data <- read.table("schools.csv", header=T, sep=",")
This will only work if the file is located in the working directory, else the location (path) of the file
needs to be given.
> schools.data <- read.table(file.choose(), header=T, sep=",")
Looking through the usage of read.table in the R help page, a variant of the command is found
where the defaults are for comma delimited data. So, most simply, we could use,
> schools.data <- read.csv(file.choose())
Having read-in the data, some basic checks of it are helpful,
> head(schools.data, n=3)
   FSM   EAL   SEN white blk.car blk.afr indian pakistani [etc.]
# Views the first three lines of the data
> ncol(schools.data)
> nrow(schools.data)
> summary(schools.data)
      FSM              EAL              SEN         [etc.]
It seems to be fine.
For more about importing and exporting data in R, consult the R help document, R Data
Import/Export.
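Because file.choose() needs someone at the keyboard, a self-contained way to try the reading commands is to write a small file first and read it back. The sketch below does this with a temporary file; the data are made up, the column name school is hypothetical, and only FSM echoes a column name from schools.csv.

```r
# Made-up data written to a temporary comma delimited file
csv.toy <- data.frame(school = c("A", "B", "C"), FSM = c(0.10, 0.25, 0.40))
toy.path <- tempfile(fileext = ".csv")
write.csv(csv.toy, file = toy.path, row.names = FALSE)

# read.csv() assumes a header row and comma separators by default
toy.in <- read.csv(toy.path, stringsAsFactors = FALSE)

# The longer read.table() call gives the same result
toy.in2 <- read.table(toy.path, header = TRUE, sep = ",", stringsAsFactors = FALSE)

# Basic checks, as in the text
names(toy.in)
nrow(toy.in)
str(toy.in)

unlink(toy.path)   # remove the temporary file
```

Setting stringsAsFactors explicitly keeps the text column as character data whichever version of R is in use.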
3.3 Lists
A list is a little like a data frame but offers a more flexible way to gather objects of different classes
together. For example,
> mylist <- list(schools.data, model1, "a")
> class(mylist)
[1] "list"
To find the number of components in a list, use length(...),
> length(mylist)
[1] 3
Here the first component is the data frame containing the schools data. The second component is the
linear model created earlier. The third is the character "a". To reference a specific component,
double square brackets are used:
> head(mylist[[1]], n=3)
   FSM   EAL   SEN white blk.car blk.afr indian pakistani [etc.]
> summary(mylist[[2]])
Call:
lm(formula = y ~ x, data = mydata)
[etc.]
> class(mylist[[3]])
[1] "character"
The double square brackets can be combined with single ones. For example,
> mylist[[1]][1,]
   FSM   EAL   SEN white blk.car blk.afr indian pakistani [etc.]
is the first row of the schools data. The first cell of the same data is
> mylist[[1]][1,1]
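Components of a list can also be given names, which saves having to remember their positions. A minimal sketch with made-up contents (toy.list and the component names data, label and values are hypothetical):

```r
# A made-up list holding a data frame, a character string and a numeric vector
toy.list <- list(data = data.frame(x = 1:3, y = c(2, 4, 6)),
                 label = "a",
                 values = c(10, 20))

length(toy.list)                          # number of components
by.position <- toy.list[[2]]              # second component, by position
by.name <- toy.list$label                 # the same component, by name
one.cell <- toy.list[["data"]][2, "y"]    # row 2 of column y in the data frame

# Single brackets return a shorter list; double brackets return the component
class(toy.list[1])
class(toy.list[[1]])

toy.classes <- sapply(toy.list, class)    # class of every component
```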
3.4 Writing a function
In brief, a function is written in R in the following way,
> function.name <- function(list of arguments) {
+ function code
+ return(result)
+ }
So, a simple function to divide the product of two numbers by their sum could be,
> my.function <- function(x1, x2) {
+ result <- (x1 * x2) / (x1 + x2)
+ return(result)
+ }
Now running the function
> my.function(3, 7)
[1] 2.1
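Two small refinements are often useful in practice: checking the inputs before calculating, and relying on the fact that R arithmetic works on whole vectors. The following is a minimal sketch of one way to do this, not the only one; the function name safe.ratio is made up for illustration.

```r
# A variant of the function above that refuses to divide by zero
safe.ratio <- function(x1, x2) {
  if (any(x1 + x2 == 0)) {
    stop("x1 + x2 must not be zero")
  }
  (x1 * x2) / (x1 + x2)   # the value of the last expression is returned
}

single <- safe.ratio(3, 7)                  # one pair of numbers
several <- safe.ratio(c(3, 1), c(7, 1))     # two pairs, handled element by element

# tryCatch() captures the error rather than halting the script
failed <- tryCatch(safe.ratio(1, -1), error = function(e) conditionMessage(e))
```

Here single matches the result in the text, several holds one answer per pair, and failed holds the error message.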
3.5 R packages for mapping and spatial data analysis
By default, R comes with a base set of packages and methods for data analysis and visualization.
However, there are many other packages available, too, that greatly extend R's value and
functionality. These packages are listed alphabetically at
http://cran.r-project.org/web/packages/available_packages_by_name.html.
Because there are so many, it can be useful to browse the packages by topic (at
http://cran.r-project.org/web/views/). The topic, or 'task view' of particular interest here is the analysis of spatial
data: http://cran.r-project.org/web/views/Spatial.html