CompSci516 Data Intensive Computing Systems · 2016-11-02 · 2. Horizontal vs. Vertical Scaling...

CompSci 516DataIntensiveComputingSystems

Lecture18NoSQLand

ColumnStore

Instructor:Sudeepa Roy

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

1

Announcements

• HW3(lastHW)hasbeenpostedonSakai

• SameproblemsasinHW1butinMongoDB(NOSQL)

• Dueintwoweeksaftertoday’slecture(~11/16)

2DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems

ReadingMaterialNOSQL:• “ScalableSQLandNoSQLDataStores”RickCattell,SIGMODRecord,December2010(Vol.39,No.4)• seewebpagehttp://cattell.net/datastores/ forupdatesandmorepointers

ColumnStore:• D.Abadi,P.Boncz,S.Harizopoulos,S.Idreos andS.Madden.TheDesignandImplementationof

ModernColumn-OrientedDatabaseSystems.FoundationsandTrendsinDatabases,vol.5,no.3,pp.197–280,2012.

• SeeVLDB2009tutorial:http://nms.csail.mit.edu/~stavros/pubs/tutorial2009-column_stores.pdf

Optional:• “Dynamo:Amazon’sHighlyAvailableKey-valueStore”ByGiuseppeDeCandia et.al.SOSP

2007

• “Bigtable:ADistributedStorageSystemforStructuredData”FayChanget.al.OSDI2006

DukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 3

NoSQL


Sofar-- RDBMS

• RelationalDataModel• RelationalDatabaseSystems(RDBMS)• RDBMSshave– acompletepre-definedfixedschema– aSQLinterface– andACIDtransactions


Today• NoSQL:”new”databasesystems– nottypicallyRDBMS– relaxonsomerequirements,gainefficiencyandscalability

• Newsystemschoosetouse/notuseseveralconceptswelearntsofar– e.g.SystemXdoesnotuselocksbutusemulti-versionCC(MVCC)or,

– SystemYusesasynchronousreplication• therefore,itisimportanttounderstandthebasics(Lectures1-17)eveniftheyarenotusedinsomenewsystems!


Warnings!

• MaterialfromCattell’spaper(2010-11)–someinfowillbeoutdated– seewebpagehttp://cattell.net/datastores/ forupdatesandmorepointers

• WewillfocusonthebasicideasofNoSQLsystems

• Optional readingslidesattheend– therearealsocomparisontablesintheCattell’spaperifyouareinterested


OLAPvs.OLTP

• OLTP(OnLine TransactionProcessing)– Recalltransactions!– Multipleconcurrentread-writerequests– Commercialapplications(banking,onlineshopping)– Datachangesfrequently– ACIDproperties,concurrencycontrol,recovery

• OLAP(OnLine AnalyticalProcessing)– Manyaggregate/group-byqueries– multidimensionaldata– Datamostlystatic– WillstudyOLAPCubesoon


NewSystems• WewillexamineanumberofSQLandso- called“NoSQL”systemsor“datastores”

• DesignedtoscalesimpleOLTP-styleapplicationloads– todoupdatesaswellasreads– incontrasttotraditionalDBMSsanddatawarehouses– toprovidegoodhorizontalscalabilityforsimpleread/writedatabaseoperationsdistributedovermanyservers

• OriginallymotivatedbyWeb2.0applications– thesesystemsaredesignedtoscaletothousandsormillionsofusers


NewSystemsvs.RDMS• Whenyoustudyanewsystem,compareitwithRDBMS-sonits– datamodel– consistencymechanisms– storagemechanisms– durabilityguarantees– availability– querysupport

• Thesesystemstypicallysacrificesomeofthesedimensions– e.g.database-widetransactionconsistency,inordertoachieveothers,e.g.higheravailabilityandscalability


NoSQL

• Manyofthenewsystemsarereferredtoas“NoSQL”datastores

• NoSQLstandsfor“NotOnlySQL”or“NotRelational”– notentirelyagreedupon

• Next:sixkeyfeaturesofNoSQLsystems


NoSQL:SixKeyFeatures

1. theabilitytohorizontallyscale“simpleoperations”throughputovermanyservers

2. theabilitytoreplicateandtodistribute(partition)dataovermanyservers

3. asimplecalllevelinterfaceorprotocol(incontrasttoSQLbinding)

4. aweakerconcurrencymodelthantheACIDtransactionsofmostrelational(SQL)databasesystems

5. efficientuseofdistributedindexesandRAMfordatastorage6. theabilitytodynamicallyaddnewattributestodatarecords


ImportantExamplesofNewSystems

• Threesystemsprovideda“proofofconcept”andinspiredmanyotherdatastores

1. Memcached2. Amazon’sDynamo3. Google’sBigTable


1.Memcached:mainfeatures

• popularopensourcecache

• supportsdistributedhashing(later)

• demonstratedthatin-memoryindexes canbehighlyscalable,distributing andreplicatingobjectsovermultiplenodes


2.Dynamo:mainfeatures

• pioneeredtheideaofeventualconsistencyasawaytoachievehigheravailabilityandscalability

• datafetchedarenotguaranteedtobeup-to-date

• butupdatesareguaranteedtobepropagatedtoallnodeseventually


3.BigTable :mainfeatures

• demonstratedthatpersistentrecordstoragecouldbescaledtothousandsofnodes


BASE(notACIDJ)

• RecallACIDforRDBMSdesiredpropertiesoftransactions:– Atomicity,Consistency,Isolation,andDurability

• NOSQLsystemstypicallydonotprovideACID

• BasicallyAvailable• Softstate• Eventuallyconsistent

DukeCS,Fall2016 CompSci 516:DataIntensiveComputingSystems 18

ACIDvs.BASE• TheideaisthatbygivingupACIDconstraints,onecanachievemuchhigherperformanceandscalability

• Thesystemsdifferinhowmuchtheygiveup– e.g.mostofthesystemscallthemselves“eventuallyconsistent”,meaningthatupdatesareeventuallypropagatedtoallnodes

– butmanyofthemprovidemechanismsforsomedegreeofconsistency,suchasmulti-versionconcurrencycontrol(MVCC)


“CAP”Theorem

• OftenEricBrewer’sCAPtheoremcitedforNoSQL

• A systemcanhaveonlytwooutofthreeofthefollowingproperties:– Consistency,– Availability– Partition-tolerance

• TheNoSQLsystemsgenerallygiveupconsistency– However,thetrade-offsarecomplex


TwofociforNoSQLsystems

1. “Simple”operations

2. HorizontalScalability


1.“Simple”Operations

• Readingorwritingasmallnumberofrelatedrecordsineachoperation– e.g.keylookups– readsandwritesofonerecordorasmallnumberofrecords

• Thisisincontrasttocomplexqueries,joins,orread-mostlyaccess

• Inspiredbyweb,wheremillionsofusersmaybothreadandwritedatainsimpleoperations– e.g.searchandupdatemulti-serverdatabasesofelectronic

mail,personalprofiles,webpostings,wikis,customerrecords,onlinedatingrecords,classifiedads,andmanyotherkindsofdata


2.HorizontalScalability

• Shared-NothingHorizontalScaling

• Theabilitytodistributeboththedataandtheloadofthesesimpleoperationsovermanyservers– withnoRAMordisksharedamongtheservers

• Not“vertical”scaling– whereadatabasesystemutilizesmanycoresand/orCPUsthatshareRAManddisks

• Someofthesystemswedescribeprovidebothverticalandhorizontalscalability


2.Horizontalvs.VerticalScaling

• Effectiveuseofmultiplecores(verticalscaling)isimportant– butthenumberofcoresthatcansharememoryislimited

• horizontalscalinggenerallyislessexpensive– canusecommodityservers

• Note:horizontalandverticalpartitioningarenotrelatedtohorizontalandverticalscaling– exceptthattheyarebothusefulforhorizontalscaling(Lecture17)


WhatisdifferentinNOSQLsystems

• WhenyoustudyanewNOSQLsystem,noticehowitdiffersfromRDBMSintermsof

1. ConcurrencyControl2. DataStorageMedium3. Replication4. Transactions


ChoicesinNOSQLsystems:1.ConcurrencyControl

a) Locks– somesystemsprovideone-user-at-a-timereadorupdatelocks– MongoDBprovideslockingatafieldlevel

b) MVCCc) None– donotprovideatomicity– multipleuserscaneditinparallel– noguaranteewhichversionyouwillread

d) ACID– pre-analyzetransactionstoavoidconflicts– nodeadlocksandnowaitsonlocks


ChoicesinNOSQLsystems:2.DataStorageMedium

a) StorageinRAM– snapshotsorreplicationtodisk– poorperformancewhenoverflowsRAM

b) Diskstorage– cachinginRAM


ChoicesinNOSQLsystems:3.Replication

• whethermirrorcopiesarealwaysinsynca) Synchronousb) Asynchronous– faster,butupdatesmaybelostinacrash

c) Both– localcopiessynchronously,remotecopies

asynchronously


ChoicesinNOSQLsystems:4.TransactionMechanisms

a) supportb) donotsupportc) inbetween– supportlocaltransactionsonlywithinasingle

objector“shard”– shard=ahorizontalpartitionofdataina

database


ComparisonfromCattell’spaper(2011)


DataModelTerminologyforNoSQL

• UnlikeSQL/RDBMS,theterminologyforNoSQLisofteninconsistent– wearefollowingnotationsinCattell’spaper

• Allsystemsprovideawaytostorescalarvalues– e.g.numbersandstrings

• Someofthemalsoprovideawaytostoremorecomplexnestedorreferencevalues


DataModelTerminologyforNoSQL

• Thesystemsallstoresetsofattribute-valuepairs– butusefourdifferentdatastructures

1. Tuple2. Document3. ExtensibleRecord4. Object


1.Tuple

• Sameasbefore• A“tuple”isarowinarelationaltable– attributenamesarepre-definedinaschema– thevaluesmustbescalar– thevaluesarereferencedbyattributename– incontrasttoanarrayorlist,wheretheyarereferencedbyordinalposition


2.Document

• Allowsvaluestobenesteddocumentsorlistsaswellasscalarvalues

• Theattributenamesaredynamicallydefinedforeachdocumentatruntime

• Adocumentdiffersfromatupleinthattheattributesarenotdefinedinaglobalschema– anda widerrangeofvaluesarepermitted


3.ExtensibleRecord

• A hybrid betweenatupleandadocument• familiesofattributesaredefinedinaschema• butnewattributescanbeadded(withinanattributefamily)onaper-recordbasis

• Attributesmaybelist-valued


4.Object

• Analogoustoanobjectinprogramminglanguages– butwithouttheproceduralmethods

• Valuesmaybereferencesornestedobjects


DataStoreCategories• Thedatastoresaregroupedaccordingtotheirdatamodel• Key-valueStores:

– storevaluesandanindextofindthem– basedonaprogrammer- definedkey

• DocumentStores:– storedocuments– Thedocumentsareindexedandasimplequerymechanismis

provided• ExtensibleRecordStores:

– storeextensiblerecordsthatcanbepartitionedverticallyandhorizontallyacrossnodes

– Somepaperscallthese“widecolumnstores”• RelationalDatabases:

– store(andindexandquery)tuples– e.g.thenewRDBMSsthatprovidehorizontalscaling


ExampleNOSQLsystems

• Key-valueStores:– ProjectVoldemort,Riak,Redis,Scalaris,TokyoCabinet,Memcached/Membrain/Membase

• DocumentStores:– AmazonSimpleDB,CouchDB,MongoDB,Terrastore

• ExtensibleRecordStores:– Hbase,HyperTable,Cassandra,Yahoo’sPNUTS

• RelationalDatabases:– MySQLCluster,VoltDB,Clustrix,ScaleDB,ScaleBase,NimbusDB,GoogleMegastore(alayeronBigTable)


Key-valuestore:1/2• Thesimplestdatastores• datamodelsimilartothememcached distributedin-memorycache– withasinglekey-valueindexforallthedata– doesnotprovidesecondaryindicesorkeys

• butunlikememcached,generallyprovide– apersistencemechanism– additionalfunctionalitylike replication,versioning,locking,transactions,sorting,etc

• Theclientinterfaceprovidesinserts,deletes,andindexlookups

DukeCS,Spring2016 CompSci516:DataIntensiveComputingSystems 39

Key-valuestore:2/2• Allkey-valuestoresprovidescalabilitythroughkey

distributionovernodes• Voldemort,Riak,TokyoCabinet,andenhancedmemcached

systemscanstoredatainRAMorondisk– TheothersstoredatainRAM,andprovidediskasbackup,orrely

onreplicationandrecoverysothatabackupisnotneeded• Scalaris andenhancedmemcached systemsusesynchronous

replication– therestuseasynchronous

• Scalaris andTokyoCabinetimplementtransactions– theothersdonot.

• VoldemortandRiak usemulti-versionconcurrencycontrol– theothersuselocks


UseCase:Key-valuestore

• ifyouhaveasimpleapplicationwithonlyonekindofobject,andyou onlyneedtolookupobjectsupbasedononeattribute

• Supposeyouhaveawebapplication– thatdoesmanyRDBMSqueriestocreateatailoredpagewhenauserlogsin

– Supposeittakesseveralsecondstoexecutethosequeries,andtheuser’sdataisrarelychanged

– youmightwanttostoretheuser’stailoredpageasasingleobjectinakey-valuestore


Documentstore:1/3• Documentstoressupportmorecomplexdatathanthekey-valuestores

• “documentstore”maybeconfusing– thesesystemscouldstore“documents”inthetraditionalsense(articles,MicrosoftWordfiles,etc.)

– butadocumentinthesesystemscanbeanykindof“pointerless object”

• Unlikethekey-valuestores,thesesystemsgenerallysupport– secondaryindexes– multipletypesofdocuments(objects)perdatabase,and– nesteddocumentsorlists

• LikeotherNoSQLsystems,thedocumentstoresdonotprovideACIDtransactionalproperties


Documentstore:2/3• Thedocumentstoresareschema-less,exceptfor

– attributes(whicharesimplyaname,andarenotpre- specified)– collections(whicharesimplyagroupingofdocuments),and– indexesdefinedoncollections(explicitlydefined,exceptinSimpleDB)– Therearesomedifferencesintheirdatamodels,e.g.SimpleDB does

notallownesteddocuments

• Thedocumentstoresareverysimilarbutusedifferentterminology– e.g.aSimpleDB Domain=CouchDB Database=MongoDBCollection

(=Terrastore Bucket)– SimpleDB callsdocuments“items”– anattributeisafieldinCouchDB,orakeyinMongoDB(orTerrastore)


Documentstore:3/3• Unlikethekey-valuestores,thedocumentstores“typically”provideamechanismtoquerycollectionsbasedonmultipleattributevalueconstraints

• donotprovideexplicitlocks– haveweakerconcurrencyandatomicitypropertiesthantraditionalACID-compliantdatabases

• Documentscanbedistributedovernodesinallofthesystems– Allofthesystemscanachievescalabilitybyreading(potentially)out-of-datereplicas


Usecase:DocumentStore

• applicationwithmultipledifferentkindsofobjects– e.g.inaDepartmentofMotorVehiclesapplication,withvehiclesanddrivers

• whereyouneedtolookupobjectsbasedonmultiplefields– e.g.,adriver’sname,licensenumber,ownedvehicle,orbirthdate


ExtensibleRecordStores:1/1

• MotivatedbyGoogle’ssuccesswithBigTable– stilltherecentextensiblerecordstorescannotcomeclosetoBigTable’s

scalability• Basicdatamodelisrowsandcolumns• Basicscalabilitymodelissplittingbothrowsandcolumnsover

multiplenodes• Rowsaresplitacrossnodesthroughsharding ontheprimarykey

– Theytypicallysplitbyrangeratherthanahashfunction• Columnsofatablearedistributedovermultiplenodesbyusing

“columngroups”– awayforthecustomertoindicatewhichcolumnsarebeststored

together• Bothhorizontalandverticalpartitioningcanbeusedsimultaneously

onthesametable


Usecase:ExtensibleRecordStore• usescasessimilartothosefordocumentstores:

– multiplekindsofobjects,withlookupsbasedonanyfield.

• However,aimedathigherthroughput,andmayprovidestrongerconcurrencyguarantees,– atthecostofslightlymorecomplexity thanthedocumentstores

• SupposestoringcustomerinformationforaneBay-styleapplication,andyouwanttopartitionyourdatabothhorizontallyandvertically:– clustercustomersbycountry,sothatyoucanefficientlysearchallof

thecustomersinonecountry– separatetherarely-changed“core”customerinformationsuchas

customeraddressesandemailaddressesinoneplace,and– putcertainfrequently-updatedcustomerinformation(suchascurrent

bidsinprogress)inadifferentplace,toimproveperformanceDukeCS,Fall2016 CompSci516:DataIntensiveComputingSystems 47

ScalableRDBMS:1/1

• SomeRDBMSsareexpectedtoprovidescalabilitycomparablewithNoSQLdatastores

• But,withtwoprovisos:– Usesmall-scopeoperations: Operationsthatspanmanynodes,

e.g.joinsovermanytables,willnotscalewellwithsharding– Usesmall-scopetransactions:Likewise,transactionsthatspan

manynodesaregoingtobeveryinefficient,withthecommunicationand2PCoverhead

• TypicalNOSQLsystemsmakethesetwoimpossible• ScalableRDBMSallowsthem,butpenalizesacustomerfor

theseoperations• Havehigher-levelSQLlanguageandACIDproperties

– butpayapricewhentheyspannodes


Usecase:ScalableRDBMS

• Ifyourapplicationrequiresmanytableswithdifferenttypesofdata– arelationalschemacentralizesandsimplifiesdatadefinitionandSQLsimplifiesoperations

– orforprojectswithmanyprogrammers• However,moreusefuliftheapplicationdoesnotrequire– updatesorjoinsthatspanmanynodes– transactioncoordination– or,datamovement


ConsistentHashing


inDynamoDB

ConsistentHashing(CH)• Recalldynamichashingschemes• Ifthe#ofslots(directorysize)changes,thenalmostallkeyshadtoberemapped

• Inconsistenthashing(CH),with#keys=Kand#slots=N,onlyK/Nkeysneedtoberemappedonaverage

• AppliestothedesignofDistributedHashTable(DHTs)forUniformLoadDistribution– partitionakeyspace amongasetofsites/nodes– additionallyprovideanoverlaynetworkthatconnectsnodessuchthatthenodesresponsibleforanykeycanbeefficientlylocated


DynamoDB :CH1/2• [ref.theDynamoDB paper,sec4.3]• Mustscaleincrementally• Consistenthashingisusedtodynamicallydistribute

dataarounda“ring”ofnodes(=sites)• Theoutputofahashfunctionistreatedasacircular

ring• Eachnodeisassignedarandomvalueinthisspace

– representsthe“position”onthering


• Dataitemidentifiedbyakey• Assigntoanodebyhashing

thekeyto

DynamoDB :CH2/2• Dataitemidentifiedbyakey• Assigntoanodebyhashingthekeytoyielditsposition

onthering• Walktheringclockwisetofindthefirstnodewitha

positionlargerthantheitem’sposition• Eachnodeisresponsiblefortheregioninthering

betweenitanditspredecessornodeonthering


• Note:• departureorarrivalofanodeonly

affectsitsimmediateneighbor• Theothernodesremainunaffected• K/Nonaverage!

DynamoDB :ChallengesinCH• However,thisbasicCHalgorithmposessomechallenges1. Randompositionassignmentofeachnodeonthering

leadstonon-uniformdataandloaddistribution2. Thebasicalgorithmisoblivioustotheheterogeneityinthe

performanceofthenodes

• Solution:DynamousesavariantofCH


• Eachnodegetsassignedtomultiplepointsinthering

• called“virtualnode”• onenodetakescareofmultiple

virtualnodes

DynamoDB:VirtualNodes• Usingvirtualnodeshasadvantages

1. Ifanodebecomesunavailable(duetofailuresorroutinemaintenance),theloadhandledbythisnodeisevenlydispersedacrosstheremainingavailablenodes

2. Whenanodebecomesavailableagain,oranewnodeisaddedtothesystem,thenewlyavailablenodeacceptsaroughlyequivalentamountofloadfromeachoftheotheravailablenodes


3. Thenumberofvirtualnodesthatanodeisresponsiblecandecidedbasedonitscapacity,accountingforheterogeneityinthephysicalinfrastructure

DynamoDB:Replication• Dynamoreplicatesitsdataonmultiple(N)hostsforhigh

availabilityanddurability• Eachkeykisassignedtoacoordinatorwhichisinchargeof

replication– coordinatorhandlesallkeysinitsrange

• Coordinatorreplicateseachkeyitisinchargeof– bystoringitlocally– replicatingitattheN-1clockwisesuccesor nodesinthering

• EachnodeisinchargeofregionoftheringbetweenitanditsN-th predecessor


NodeBreplicateskeyKatnodesCandDNodeDwillstorekeysintherange(A,B],(B,C],(C,D]Note:theremaybe<N“physical”nodes

CHHistory• ProposedbyCStheoreticiansfromMIT:

– Karger-Lehman-Leighton-Panigrahy-Levine-Lewin– “ConsistentHashingandRandomTrees:DistributedCachingProtocols

forRelievingHotSpotsontheWorldWideWeb”– STOC1997

• ConsistenthashinggavebirthtoAkamaiTechnologies– FoundedbyDannyLewinandTomLeightonin1998– Akamai’scontentdeliverynetworkisoneofthelargestdistributed

computingplatforms– Nowmarketcap$12Band6200employees– Managingweb-presenceofmanymajorcompanies

• 2001:TheconceptofDistributedHashTable(DHT)isproposed(howtolookforafile)andCHwasre-purposed

• NowusedinDynamo,Couchbase,Cassandra,Voldemort,Riak,..


SQLvs.NOSQL


Argumentsforbothsidesstillacontroversialtopic

WhychooseRDBMSoverNoSQL:1/31. Ifnewrelationalsystemscandoeverything

thataNoSQLsystemcan,withanalogousperformanceandscalability(?),andwiththeconvenienceoftransactionsandSQL,NoSQLisnotneeded

2. RelationalDBMSshavetakenandretainedmajoritymarketshareoverothercompetitorsinthepast30years– (network,object,andXMLDBMSs)


WhychooseRDBMSoverNoSQL:2/33. SuccessfulrelationalDBMSshavebeenbuilt

tohandleotherspecificapplicationloads inthepast:– read-onlyorread-mostlydatawarehousing– OLTPonmulti-coremulti-diskCPUs– in-memorydatabases– distributeddatabases,and– nowhorizontallyscaleddatabases


WhychooseRDBMSoverNoSQL:3/3

4. Whileno“onesizefitsall”intheSQLproductsthemselves,thereisacommoninterfacewithSQL,transactions,andrelationalschemathatgiveadvantagesintraining,continuity,anddatainterchange


WhychooseNoSQLoverRDBMS:1/31. Wehaven’tyetseengoodbenchmarksshowing

thatRDBMSscanachievescaling comparablewithNoSQLsystemslikeGoogle’sBigTable

2. Ifyouonlyrequirealookupofobjectsbasedonasinglekey– thenakey-valuestoreisadequateandprobablyeasiertounderstand

thanarelationalDBMS– Likewiseforadocumentstoreonasimpleapplication:youonlypay

thelearningcurveforthelevelofcomplexityyourequire


WhychooseNoSQLoverRDBMS:2/3

3. Someapplicationsrequireaflexibleschema– allowingeachobjectinacollectiontohavedifferentattributes

– WhilesomeRDBMSsallowefficient“packing”oftupleswithmissingattributes,andsomeallowaddingnewattributesatruntime,thisisuncommon


WhychooseNoSQLoverRDBMS:3/3

4. ArelationalDBMSmakes“expensive”(multi- nodemulti-table)operations“tooeasy”– NoSQLsystemsmakethemimpossibleorobviouslyexpensiveforprogrammers

5. WhileRDBMSshavemaintainedmajoritymarketshareovertheyears,otherproductshaveestablishedsmallerbutnon-trivialmarketsinareaswherethereisaneedforparticularcapabilities– e.g.indexedobjectswithproductslikeBerkeleyDB,orgraph-following

operationswithobject-orientedDBMSs


ColumnStore


Rowvs.ColumnStore

• Rowstore– storeallattributesofatupletogether– storagelike“row-majororder”inamatrix

• Columnstore– storeallrowsforanattribute(column)together– storagelike“column-majororder”inamatrix

• e.g.– MonetDB,Vertica(earlier,C-store),SAP/SybaseIQ,GoogleBigtable (withcolumngroups)



Ack:SlidefromVLDB2009tutorialonColumnstore


AdditionalandOptionalSlidesonMongoDB

(MaybeusefulforHW3)https://docs.mongodb.com

MongoDB

• MongoDBisanopensourcedocumentstorewritteninC++• providesindexesoncollections• lockless• providesadocumentquerymechanism• supportsautomaticsharding• Replicationismostlyusedforfailover• doesnotprovidetheglobalconsistencyofatraditionalDBMS

– butyoucangetlocalconsistencyontheup-to-dateprimarycopyofadocument

• supportsdynamicquerieswithautomaticuseofindices,likeRDBMSs

• alsosupportsmap-reduce– helpscomplexaggregationsacrossdocs

• providesatomicoperationsonfields


Optionalslide:Readyourself

MongoDB:AtomicOpsonFields• Theupdatecommandsupports“modifiers”thatfacilitateatomic

changestoindividualvalues– $setsetsavalue– $inc incrementsavalue– $pushappendsavaluetoanarray– $pushAll appendsseveralvaluestoanarray– $pullremovesavaluefromanarray,and$pullAll removesseveral

valuesfromanarray• Sincetheseupdatesnormallyoccur“inplace”,theyavoidthe

overheadofareturntriptotheserver• Thereisan“updateifcurrent”conventionforchangingadocument

onlyiffieldvaluesmatchagivenpreviousvalue• MongoDBsupportsafindAndModify commandtoperforman

atomicupdateandimmediatelyreturntheupdateddocument– usefulforimplementingqueuesandotherdatastructuresrequiring

atomicity



MongoDB:Index• MongoDBindicesareexplicitlydefinedusinganensureIndex call– anyexistingindicesareautomaticallyusedforqueryprocessing

• Tofindallproductsreleasedlastyear(2015)orlatercostingunder$100youcouldwrite:

• db.products.find({released:{$gte:newDate(2015,1,1,)},price{‘$lte’:100},})



MongoDB:Data

• MongoDBstoresdatainabinaryJSON-likeformatcalledBSON– BSONsupportsboolean,integer,float,date,stringandbinarytypes

–MongoDBcanalsosupportlargebinaryobjects,eg.imagesandvideos

– Thesearestoredinchunksthatcanbestreamedbacktotheclientforefficientdelivery



MongoDB:Replication

• MongoDBsupportsmaster-slavereplicationwithautomaticfailoverandrecovery– Replication(andrecovery)isdoneatthelevelofshards

– Replicationisasynchronousforhigherperformance,sosomeupdatesmaybelostonacrash



CompSci516 Data Intensive Computing Systems · 2016-11-02 · 2. Horizontal vs. Vertical Scaling...

Documents

Transcript of CompSci516 Data Intensive Computing Systems · 2016-11-02 · 2. Horizontal vs. Vertical Scaling...