DB - NoSQL - v180405d · Vertical vs. Horizontal Scaling "scale up" • Vertical scaling ("scale...

15
2 L22: NoSQL CS3200 Database design (sp18 s2) https://course.ccs.neu.edu/cs3200sp18s2/ 4/5/2018 Several slides courtesy of Benny Kimelfeld

Transcript of DB - NoSQL - v180405d · Vertical vs. Horizontal Scaling "scale up" • Vertical scaling ("scale...

2

L22:NoSQL

CS3200 Databasedesign(sp18 s2)https://course.ccs.neu.edu/cs3200sp18s2/4/5/2018 SeveralslidescourtesyofBennyKimelfeld

3

Outline

• Introduction• TransactionConsistency• 4maindatamodels- DocumentStores(e.g.,MongoDB)- Key-ValueStores(e.g.,Redis)- GraphDatabases(e.g.,Neo4j)- Column-FamilyStores

• ConcludingRemarks

4

SQLMeansMorethanSQL

• SQLstandsforthequerylanguage• ButcommonlyreferstothetraditionalRDBMS:- Relationalstorage ofdata

• Eachtupleisstoredconsecutively- Joins asfirst-classcitizens

• Infact,normalformspreferjoinstomaintenance- Strongguarantees ontransactionmanagement

• Noconsistencyworrieswhenmanytransactionsoperatesimultaneouslyoncommondata

• Focusonscalingup- Thatis,makeasinglemachinedomore,faster

5

Verticalvs.HorizontalScaling

"scaleup"

• Verticalscaling("scaleup"):youscalebyaddingmorepower(CPU,RAM)

• Horizontalscaling("scaleout"): youscalebyaddingmoremachines

"scaleout"

6

TrendsDriveCommonRequirements

Social media + mobile computing

• Explosion in data, always available, constantly read and updated

• High load of simple requests of a common nature

• Some consistency can be compromised (e.g., 👍 )

Cloud computing + open source

• Affordable resources for management / analysis of data

• People of various skills / budgets need software solutions for distributed analysis of massive data

Database solutions need to scale out(utilize distribution, “scale horizontally”)

7

CompromisesRequired

Whatisneededforeffectivedistributed,data-anduser-intensiveapplications?

1. Usedatamodelsandstoragethatallowtoavoidjoinsofbigobjects

2. Relaxtheguaranteesonconsistency

8

NoSQL

• NotOnlySQL- Nottheotherthing!- TermintroducedbyCarloStrozzi in1998todescribeanalternativedatabasemodel

- Becamethenameofamovement followingEricEvans’sreuseforadistributed-databaseevent

• Seminalpapers:- Google’sBigTable

• Chang,Dean,Ghemawat,Hsieh,Wallach,Burrows,Chandra,Fikes,Gruber:Bigtable:ADistributedStorageSystemforStructuredData.OSDI2006:205-218

- Amazon’sDynamoDB• DeCandia,Hastorun,Jampani,Kakulapati,Lakshman,Pilchin,Sivasubramanian,Vosshall,Vogels:Dynamo:amazon'shighly

availablekey-valuestore.SOSP2007:205-220

9

NoSQL fromnosql-database.org

“• NextGenerationDatabasesmostlyaddressingsomeofthepoints:beingnon-relational,distributed,open-source andhorizontallyscalable.

• Theoriginalintentionhasbeenmodernweb-scaledatabases.Themovementbeganearly2009andisgrowingrapidly.Oftenmorecharacteristicsapplysuchas:schema-free,easyreplicationsupport,simpleAPI,eventuallyconsistent/BASE(notACID),ahugeamountofdataandmore.

• Sothemisleadingterm“nosql”(thecommunitynowtranslatesitmostlywith“notonlysql”)shouldbeseenasanaliastosomethinglikethedefinitionabove.

10

WhatisNoSQL?

Source:GeekandPoke:http://geekandpoke.typepad.com/geekandpoke/2011/01/nosql.html

11

CommonNoSQL Features

• Non-relationaldatamodels• Flexiblestructure- Noneedtofixaschema,attributescanbeaddedandreplacedonthefly

• Massiveread/writeperformance;availabilityviahorizontalscaling- Replication andsharding (datapartitioning)- Potentiallythousandsofmachinesworldwide

• Opensource(veryoften)• APIstoimposelocality

12

Whenthedatabasegrows:PartitioningTables

Source:http://cloudgirl.tech/data-partitioning-vertical-horizontal-hybrid-partitioning/

13Source:http://cloudgirl.tech/data-partitioning-vertical-horizontal-hybrid-partitioning/

Verticalvs.Horisontal Partitioning

14

HorizontalPartitioning("sharding")

Source:http://cloudgirl.tech/data-partitioning-vertical-horizontal-hybrid-partitioning/

15

Verticalvs.Horizontalpartitioning

Source:http://www.piyushgupta.co.uk/2016/04/database-scaling-jargons.html,http://slideplayer.com/slide/12131436/70/images/17/SQL+Azure+Azure+Custom+Sharding.jpg

16Source:Anefficientschemeforprobabilisticskylinequeriesoverdistributeduncertaindata,Xiaoyong Li,Yijie Wang,Jie Yu