Data Structures in Python - R Grapenthin
Data Structures in Python
October 2, 2017
What is a data structure?
• Way to store data and have some method to retrieve and manipulate it
• Lots of examples in Python:
  • List, dict, tuple, set, string
  • Array
  • Series, DataFrame
• Some of these are “built-in” (meaning you can just use them), others are contained within other Python packages, like numpy and pandas
Basic Python Data Structures (built-in)
• List, dict, tuple, set, string
• Each of these can be accessed in a variety of ways
• Decision on which to use? Depends on what sort of features you need (easy indexing, immutability, etc.)
• Mutable vs immutable
  • Mutable – can change
  • Immutable – doesn’t change

    x = something  # immutable type
    print(x)
    func(x)
    print(x)  # prints the same thing

    x = something  # mutable type
    print(x)
    func(x)
    print(x)  # might print something different
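A concrete version of the pattern on this slide, written for Python 3. The helper `func` and the sample values are hypothetical stand-ins for the slide's placeholders; this is only a sketch of the idea.

```python
def func(x):
    """Hypothetical helper: tries to extend whatever sequence it receives."""
    if isinstance(x, tuple):
        x += (0,)   # tuples are immutable: this builds a new tuple bound only to the local name
    else:
        x += [0]    # lists are mutable: this extends the caller's list in place
    return x


t = (1, 2)          # immutable type
func(t)
print(t)            # (1, 2): prints the same thing

numbers = [1, 2]    # mutable type
func(numbers)
print(numbers)      # [1, 2, 0]: the function changed the caller's list
```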
Basic Structure: List
• Very versatile, can have items of different types, is mutable
• To create: use square brackets [] to contain comma-separated values
• Example:
    >>> l = ['a', 'b', 123]
    >>> l
    ['a', 'b', 123]
• To get values out (use index, starts with 0):
    >>> l[1]
    'b'
• We saw these back in lab 3
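A short sketch of the list properties listed on this slide (mixed types, zero-based indexing, mutability). The variable names are illustrative only.

```python
items = ['a', 'b', 123]   # items of different types in one list
second = items[1]         # indexing starts at 0, so index 1 is 'b'

items[2] = 456            # lists are mutable: replace an element in place
items.append('c')         # ...or grow the list

print(second)             # b
print(items)              # ['a', 'b', 456, 'c']
```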
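Basic Structure: Set
• Set is an unordered collection with no duplicate values, is mutable
• Create using {} with values inside (an empty {} creates a dict; use set() for an empty set)
• Example:
    >>> s = {1, 2, 3}
    >>> s
    set([1, 2, 3])
• Useful for eliminating duplicate values from a list, doing operations like intersection, difference, union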
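The uses named in the last bullet can be sketched as follows, written for Python 3 (where a set displays as `{1, 2, 3}` rather than the Python 2 form `set([1, 2, 3])` shown on the slide). The sample data is made up.

```python
readings = [3, 1, 3, 2, 1]
unique = set(readings)        # duplicates are dropped

a = {1, 2, 3}
b = {3, 4}
common = a & b                # intersection
only_in_a = a - b             # difference
either = a | b                # union

print(sorted(unique))         # [1, 2, 3]
print(common)                 # {3}
print(sorted(only_in_a))      # [1, 2]
print(sorted(either))         # [1, 2, 3, 4]
```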
Basic Structure: Tuple
• Tuple holds values separated by commas, is immutable
• Create using commas, or () to create an empty tuple
• Example:
    >>> t = 1, 2, 3
    >>> t
    (1, 2, 3)
    >>> type(t)
    <type 'tuple'>
• Useful when storing data that does not change, when needing to optimize performance of code (Python knows how much memory is needed)
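A minimal sketch of tuple creation and immutability in Python 3. The variable names are illustrative.

```python
t = 1, 2, 3          # commas create the tuple; parentheses are optional
empty = ()           # empty tuple

first = t[0]         # reading by index works just like a list

try:
    t[0] = 99        # writing does not: tuples are immutable
except TypeError as err:
    error_message = str(err)

print(t)             # (1, 2, 3)
print(len(empty))    # 0
```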
Basic Structure: Dict
• Represented by key:value pairs
  • Keys: can be any immutable type, and must be unique
  • Values: can be any type (mutable or immutable)
• To create: use curly braces {} or dict() and list both key and value
    >>> letters = {1: 'a', 2: 'b', 3: 'c', 4: 'd'}
    >>> type(letters)
    <type 'dict'>
• To access data in a dictionary, call by the key
    >>> letters[2]
    'b'
• Has useful methods like keys(), values(), iteritems(), itervalues() for accessing dictionary entries (iteritems() and itervalues() are Python 2 only; in Python 3 use items() and values())
• Useful when:
  • Need association between key:value pairs
  • Need to quickly look up data based on a defined key
  • Values are modified
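A small sketch of the dictionary operations on this slide, written for Python 3 (so it uses `items()` in place of `iteritems()`). It reuses the slide's `letters` example; the added entries are made up.

```python
letters = {1: 'a', 2: 'b', 3: 'c', 4: 'd'}

second = letters[2]                   # look up by key
letters[5] = 'e'                      # add a new key:value pair
letters[1] = 'A'                      # modify an existing value

keys = list(letters.keys())
values = list(letters.values())
pairs = [(k, v) for k, v in letters.items()]
missing = letters.get(99, 'not found')   # safe lookup with a default

print(second)    # b
print(keys)      # [1, 2, 3, 4, 5]
print(values)    # ['A', 'b', 'c', 'd', 'e']
print(missing)   # not found
```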
Array: Use NumPy!
• What is an array?
  • “list of lists”
  • Similar to Matlab in some ways
• Create a 2x3 array
  • [1 2 3; 4 5 6] : Matlab
  • np.array([[1., 2., 3.], [4., 5., 6.]])
• What is NumPy?
  • Numerical Python
  • Python library very useful for scientific computing
• How to access NumPy?
  • Need to import it into your Python workspace or into your script
    >>> import numpy as np
    >>> y = np.array([[1., 2., 3.], [4., 5., 6.]])
    >>> y
    array([[ 1.,  2.,  3.],
           [ 4.,  5.,  6.]])
Why use a NumPy array?
• What is it?
  • “multidimensional array of objects of all the same type”
• More compact than a list (don’t need to store both value and type like in a list)
• Reading/writing faster with NumPy
• Get a lot of vector and matrix operations
  • Can’t do “vectorized” operations on a list (like element-wise addition, multiplication)
• Can also do the standard stuff, like indexing, comparisons, logical operations
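To make the “vectorized” point concrete, here is a small comparison of the same operators applied to a list and to an array. The values are made up; only standard NumPy calls are used.

```python
import numpy as np

values_list = [1.0, 2.0, 3.0]
values_array = np.array(values_list)

doubled_list = values_list * 2        # list: * repeats the list
doubled_array = values_array * 2      # array: * multiplies each element

joined_list = values_list + values_list       # list: + concatenates
summed_array = values_array + values_array    # array: + adds element by element

print(doubled_list)          # [1.0, 2.0, 3.0, 1.0, 2.0, 3.0]
print(doubled_array)         # [2. 4. 6.]
print(values_array.dtype)    # float64: one type for every element
```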
Creating NumPy Arrays
Creating NumPy array and checking if each element is > 3
Create NumPy array, print out array dimensions, and use indexing tools
Create 2x2 NumPy array with just zeros
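The slide's screenshots did not survive in this transcript, so the following is a reconstruction sketch of the three captions, not the original code. The array values are assumptions.

```python
import numpy as np

# Create an array and check whether each element is > 3
a = np.array([1, 2, 3, 4, 5])
greater_than_3 = a > 3            # boolean array, one entry per element

# Create a 2-D array, inspect its dimensions, and index into it
b = np.array([[1., 2., 3.], [4., 5., 6.]])
shape = b.shape                   # (rows, columns)
ndim = b.ndim                     # number of axes
first_row = b[0]
element = b[1, 2]                 # second row, third column

# Create a 2x2 array with just zeros
zeros = np.zeros((2, 2))

print(greater_than_3)             # [False False False  True  True]
print(shape, ndim)                # (2, 3) 2
print(element)                    # 6.0
```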
More Creating NumPy Arrays
• arange: like “range”, returns an ndarray
• Use reshape to define/change shape of array
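A brief sketch of `arange` and `reshape`, with made-up sizes:

```python
import numpy as np

r = np.arange(6)                  # like range(6), but returns an ndarray
grid = r.reshape(2, 3)            # same six values arranged as 2 rows x 3 columns
stepped = np.arange(0, 1, 0.25)   # unlike range, non-integer steps are allowed

print(r)          # [0 1 2 3 4 5]
print(grid)
print(stepped)    # [0.   0.25 0.5  0.75]
```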
Operations with NumPy Arrays
• Arithmetic operations (e.g. +, -, *, /, **) with scalars and between equal-size arrays – done element by element
  • A new array is created with the result
• Universal functions (for example: sin, cos, exp) also operate elementwise on an array, new array results
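A short sketch of elementwise arithmetic and universal functions, using made-up values:

```python
import numpy as np

a = np.array([1., 2., 3.])
b = np.array([10., 20., 30.])

total = a + b          # array with array, element by element
scaled = a * 2         # array with scalar
squared = a ** 2
ratio = b / a

sines = np.sin(np.array([0., np.pi / 2]))   # universal function, applied to each element

print(total)           # [11. 22. 33.]
print(squared)         # [1. 4. 9.]
print(ratio)           # [10. 10. 10.]
print(a)               # [1. 2. 3.]: the inputs are unchanged, results are new arrays
```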
Be careful: * vs dot
• * is product operator, operates elementwise in NumPy arrays
A*B – elementwise multiplication
.dot – matrix product
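The difference is easy to see on a pair of small matrices. The matrices here are made up for illustration.

```python
import numpy as np

A = np.array([[1, 2],
              [3, 4]])
B = np.array([[0, 1],
              [1, 0]])

elementwise = A * B     # multiplies matching entries
matrix = A.dot(B)       # matrix product (rows of A times columns of B)

print(elementwise)      # [[0 2]
                        #  [3 0]]
print(matrix)           # [[2 1]
                        #  [4 3]]
```

In Python 3.5 and later, `A @ B` gives the same result as `A.dot(B)`.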
Other Useful NumPy Array Operations
• Sum, min, max: can be used to get values for all elements in array
• Can use (axis=#) to specify certain rows and columns
Get sum of all elements in array, also min and max within array
Sum of each column (axis=0)
Min of each row (axis=1)
Cumulative sum along each row
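The captions above describe screenshots that are missing from this transcript. A sketch of the same operations, on an assumed 2x3 array:

```python
import numpy as np

a = np.array([[1, 2, 3],
              [4, 5, 6]])

total = a.sum()                  # over all elements
smallest = a.min()
largest = a.max()

column_sums = a.sum(axis=0)      # collapse the rows: one sum per column
row_mins = a.min(axis=1)         # collapse the columns: one min per row
row_cumsum = a.cumsum(axis=1)    # running total along each row

print(total, smallest, largest)  # 21 1 6
print(column_sums)               # [5 7 9]
print(row_mins)                  # [1 4]
print(row_cumsum)                # [[ 1  3  6]
                                 #  [ 4  9 15]]
```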
Indexing with NumPy Arrays
• 1D arrays (just like lists)
• Multidimensional arrays: work with an index per axis
Create array using arange
Pull out element at position 3
Pull out elements in positions starting at 3, before 6
Element at row 3, column 4
Each row in 2nd column
Each column in 2nd and 3rd row
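A reconstruction sketch of the indexing captions above (the original screenshots are not in this transcript). The array sizes are assumptions, and "row 3, column 4" is read here as zero-based indices 3 and 4.

```python
import numpy as np

# 1-D array: indexing works just like a list
a = np.arange(10)
at_3 = a[3]                  # element at position 3
from_3_to_6 = a[3:6]         # positions 3, 4, 5 (stops before 6)

# 2-D array: one index per axis, written as [row, column]
m = np.arange(20).reshape(4, 5)
element = m[3, 4]            # index 3 along rows, index 4 along columns
second_column = m[:, 1]      # every row, 2nd column
middle_rows = m[1:3, :]      # every column, 2nd and 3rd rows

print(at_3)                  # 3
print(from_3_to_6)           # [3 4 5]
print(element)               # 19
print(second_column)         # [ 1  6 11 16]
print(middle_rows)           # [[ 5  6  7  8  9]
                             #  [10 11 12 13 14]]
```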
What is pandas?
• Open source package with user-friendly data structures and data analysis tools for Python
  • Built on top of NumPy, gives more tools
• Very useful for tabular data in columns (i.e. spreadsheets), time series data, matrix data, etc.
• Two main data structures:
  • Series (1-dimensional)
  • DataFrame (2-dimensional)
• How to access:
  • Need to import it into your Python workspace or into your script
    >>> import pandas as pd
panel data: multidimensional structured data sets
Pandas: Series
• Effectively a 1-D NumPy array with an index
• 1D labeled array that can hold any data type, with labels known as the “index”
    >>> s = pd.Series(data, index=index)
data can be an array, scalar, or a dict
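One example for each kind of `data` mentioned above. The values and labels are made up.

```python
import numpy as np
import pandas as pd

# From an array, with explicit labels
from_array = pd.Series(np.array([1.5, 2.5, 3.5]), index=['a', 'b', 'c'])

# From a dict: the keys become the index
from_dict = pd.Series({'x': 10, 'y': 20})

# From a scalar: the value is repeated for every label
from_scalar = pd.Series(5.0, index=['a', 'b', 'c'])

print(from_array)
print(from_dict['y'])          # 20
print(from_scalar.tolist())    # [5.0, 5.0, 5.0]
```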
Pandas: Series
• Can use slicing to grab out values
• Can also use index to grab out values
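A sketch of both access styles on a made-up Series. Positional slicing is written with `.iloc` here to keep position and label lookups clearly separate; the original slide may have used plain `s[...]` slicing.

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40], index=['a', 'b', 'c', 'd'])

first_two = s.iloc[:2]         # slice by position
by_label = s['c']              # single value by index label
several = s[['a', 'd']]        # several labels at once
label_slice = s['b':'c']       # label slices include both end points

print(first_two.tolist())      # [10, 20]
print(by_label)                # 30
print(several.tolist())        # [10, 40]
print(label_slice.tolist())    # [20, 30]
```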
Pandas: DataFrame
• Most commonly used pandas object
• DataFrame is basically a table made up of named columns of Series
• Think spreadsheet or table of some kind
• Can take data from
  • Dict of 1D arrays, lists, dicts, Series
  • 2D numpy array
  • Series
  • Another DataFrame
• Can also define index (row labels) and columns (column labels)
• Series can be dynamically added to or removed from the DataFrame
Creating DataFrames
• From dict of Series or dicts:
Have 2 series (one and two)
New DataFrame (df) is union of the 2 Series indices
Output includes row labels (index) and column labels as specified
Note the NaN reported because of no 4th value in “one”
Using arrays/lists is similar:
If no index is given, index will be range(n) where n is array length
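The screenshots for this slide are missing, so this is a reconstruction sketch. The Series names `one` and `two` come from the slide; the values and labels are assumptions.

```python
import pandas as pd

# Two Series with different lengths: 'one' has no value for label 'd'
one = pd.Series([1., 2., 3.], index=['a', 'b', 'c'])
two = pd.Series([1., 2., 3., 4.], index=['a', 'b', 'c', 'd'])

# The DataFrame index is the union of both Series indices
df = pd.DataFrame({'one': one, 'two': two})
print(df)                      # row 'd' shows NaN in column 'one'

# From a dict of lists: with no index given, rows are labeled 0..n-1
df_lists = pd.DataFrame({'one': [1., 2., 3., 4.],
                         'two': [4., 3., 2., 1.]})
print(list(df_lists.index))    # [0, 1, 2, 3]
```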
Accessing DataFrame Info
Can access specific rows
Can access specific rows and columns
Grab specific column from existing DataFrame
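A sketch of the three kinds of access named above, using `.loc` (labels) and `.iloc` (positions). The DataFrame is made up, and the original slide may have used different accessors.

```python
import pandas as pd

df = pd.DataFrame({'one': [1., 2., 3.], 'two': [4., 5., 6.]},
                  index=['a', 'b', 'c'])

# Specific rows
row_by_label = df.loc['b']            # one row, returned as a Series
first_rows = df.iloc[0:2]             # rows at positions 0 and 1

# Specific rows and columns
cell = df.loc['a', 'two']
block = df.loc[['a', 'c'], ['one']]

# Specific column
column = df['one']

print(row_by_label['two'])            # 5.0
print(cell)                           # 4.0
print(column.tolist())                # [1.0, 2.0, 3.0]
```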
Accessing DataFrame Info
Grab specific column from existing DataFrame
Make a new column through operations on others
Get rid of columns
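A sketch of adding and removing columns on a made-up DataFrame. The column names `three` and `flag` are illustrative.

```python
import pandas as pd

df = pd.DataFrame({'one': [1., 2., 3.], 'two': [4., 5., 6.]},
                  index=['a', 'b', 'c'])

# Make new columns through operations on existing ones
df['three'] = df['one'] * df['two']
df['flag'] = df['one'] > 2

# Get rid of columns
del df['two']                 # delete in place
three = df.pop('three')       # remove and keep the removed column as a Series

print(list(df.columns))       # ['one', 'flag']
print(three.tolist())         # [4.0, 10.0, 18.0]
```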
Working with DataFrames
Create 2 different DataFrames
Add the dataframes together
Note elementwise addition, with the result having the union of row and column labels, even if you don’t have values in each position
Lots of NumPy elementwise functions work on DataFrames, as do operations like transpose (.T), .dot
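A small sketch of label-aligned addition, with two made-up DataFrames that overlap in only one cell:

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame({'A': [1., 2.], 'B': [3., 4.]}, index=['x', 'y'])
df2 = pd.DataFrame({'B': [10., 20.], 'C': [30., 40.]}, index=['y', 'z'])

# Addition aligns on row AND column labels; the result covers the union of both.
# Only ('y', 'B') exists in both frames, so every other cell is NaN.
total = df1 + df2
print(total)

roots = np.sqrt(df1)          # NumPy elementwise functions accept DataFrames
transposed = df1.T            # rows become columns
product = df1.T.dot(df1)      # matrix product, aligned on labels
print(product)
```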
Other cool things to do with DataFrames
Basic statistics
Sorting
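A sketch of the two items above on a made-up DataFrame. The slide's own examples are not in this transcript, so the specific calls shown (`describe`, `mean`, `sort_values`, `sort_index`) are one reasonable choice.

```python
import pandas as pd

df = pd.DataFrame({'A': [3., 1., 2.], 'B': [10., 30., 20.]},
                  index=['x', 'y', 'z'])

# Basic statistics
summary = df.describe()        # count, mean, std, min, quartiles, max per column
means = df.mean()              # one mean per column

# Sorting
by_b = df.sort_values(by='B')                   # sort rows by the values in column B
by_index_desc = df.sort_index(ascending=False)  # sort rows by label, descending

print(summary)
print(means['A'], means['B'])      # 2.0 20.0
print(list(by_b.index))            # ['x', 'z', 'y']
print(list(by_index_desc.index))   # ['z', 'y', 'x']
```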
Other cool things to do with DataFrames
Grabbing data that meets a certain condition
Filtering data to grab only rows that contain certain values using .isin
Add a new column at end of dataframe
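A sketch of the three operations above. The column names `A`, `E` and `F` and all values are assumptions.

```python
import pandas as pd

df = pd.DataFrame({'A': [-1., 2., 3., -4.],
                   'E': ['one', 'two', 'three', 'two']})

# Rows that meet a condition (boolean indexing)
positive = df[df['A'] > 0]

# Rows whose 'E' value is one of a given set
selected = df[df['E'].isin(['two', 'three'])]

# A new column is added at the end
df['F'] = df['A'] * 10

print(list(positive.index))    # [1, 2]
print(list(selected.index))    # [1, 2, 3]
print(list(df.columns))        # ['A', 'E', 'F']
```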
DataFrames: groupby
• This allows you to split up data into groups based on some criteria, apply some function, and get a result
Using “groupby” to select rows that contain the same value in E, then sum those values
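A minimal split-apply-combine sketch. The grouping column `E` comes from the slide; the other columns and all values are made up.

```python
import pandas as pd

df = pd.DataFrame({'E': ['one', 'two', 'one', 'two'],
                   'A': [1., 2., 3., 4.],
                   'B': [10., 20., 30., 40.]})

# Split rows by their value in E, sum each group, combine into one table
sums = df.groupby('E').sum()

print(sums)
#        A     B
# E
# one  4.0  40.0
# two  6.0  60.0
```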
Plotting Data in Series
Created a series of 1000 random numbers, with an index of dates starting at 1/1/2000
Plotted the cumulative sum of those random numbers
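A sketch of the steps described above. The random seed and the helper `plot_cumulative` are assumptions, not the slide's original code. Plotting needs matplotlib, so the plot call is wrapped in a function and the data can be built without it.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)    # fixed seed (an assumption) so the run is repeatable

# 1000 random numbers, indexed by consecutive dates starting at 1/1/2000
ts = pd.Series(rng.standard_normal(1000),
               index=pd.date_range('1/1/2000', periods=1000))

cumulative = ts.cumsum()          # running total of the random numbers


def plot_cumulative(series):
    """Hypothetical helper: draw the series against its date index."""
    import matplotlib.pyplot as plt   # imported here so building the data does not need it
    ax = series.plot()
    plt.show()
    return ax
```

Calling `plot_cumulative(cumulative)` draws the running total against the dates.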
Plotting Data in DataFrames
Using .plot() with DataFrames will plot all of the columns with labels
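A sketch of the same idea for a DataFrame. The column names `A` to `D`, the seed, and the helper `plot_all_columns` are assumptions. As with any pandas plotting, matplotlib must be installed before the helper is called.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)    # fixed seed (an assumption)

# Four columns of random numbers on a shared date index, accumulated over time
df = pd.DataFrame(rng.standard_normal((1000, 4)),
                  index=pd.date_range('1/1/2000', periods=1000),
                  columns=['A', 'B', 'C', 'D']).cumsum()


def plot_all_columns(frame):
    """Hypothetical helper: one line per column, labeled with the column names."""
    import matplotlib.pyplot as plt   # imported here so building the data does not need it
    ax = frame.plot()
    plt.show()
    return ax
```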
Next up:
• Lab today – working with data structures
• Next week: how to get data into and out of Python (I/O topics)