My First 100 days with a Cassandra Cluster

Post on 14-Apr-2017

Presented by : Gustavo René Antúnez

February, 2017

© 2016 Pythian. Confidential

ABOUT PYTHIAN

Pythian’s 400+ IT professionals help companies adopt and manage disruptive technologies to better compete

TECHNICAL EXPERTISE

Infrastructure: Transforming and managing the IT infrastructure that supports the business

DevOps: Providing critical velocity in software deployment by adopting DevOps practices

Cloud: Using the disruptive nature of cloud for accelerated, cost-effective growth

Databases: Ensuring databases are reliable, secure, available and continuously optimized

Big Data: Harnessing the transformative power of data on a massive scale

Advanced Analytics: Mining data for insights & business transformation using data science

Welcome to RMOUG 2017

Where do I come from: Oracle DBA

• Started with Version 9.2 in 2004
– Speaker at Oracle Open World, Oracle Developers Day and Collaborate
– Co-President of ORAMEX (Mexico Oracle User Group)
– Web Events Chair for IOUG Cloud Computing Special Interest Group (SIG)
– International Chair RAC Special Interest Group (SIG)
– Movie Fanatic & Music Lover
– Bringing the best from México (Mexihtli) to the rest of the world and in the process photographing it :)
– rene-ace.com
– @rene_ace
• #TD16

Where do I come from

rene-ace.com, @rene_ace

• #TD17

How did you get to be a DBA


6th Happiest Job of 2015!

http://www.forbes.com/sites/susanadams/2014/03/20/the-happiest-and-unhappiest-jobs-in-2014/

Work-life balance

Relationship with boss and co-workers

Daily tasks

Job resources

Field will grow by 15% between 2012 and 2022

DBA can be the key driver of success

Happiest Job of 2034?

• 47 percent of American jobs are at high risk of being taken by computers within the next two decades.

– 1st Wave
• Computers will start replacing people in especially vulnerable fields like transportation/logistics, production labor, and administrative support.

– 2nd Wave
• Dependent upon the development of good artificial intelligence. This could next put jobs in management, science and engineering, and the arts at risk.

The most important question: Normalize or Denormalize

• Goal of normalization is to store a fact in one place to minimize update, delete and insert anomalies*

• Normalized data, depending on how complex the schema becomes, often affects query performance

http://blog.rdx.com/cassandra-and-relational-database-schema-comparison-query-vs-relationship-modeling

The most important question: Normalize or Denormalize

• Normalize to reduce data anomalies and denormalize to improve query performance.

• In relational systems, administrators model the data. In Cassandra, administrators design schemas that are based on query patterns.

• Denormalization is the merging of attributes that are often accessed together into a single schema object

http://blog.rdx.com/cassandra-and-relational-database-schema-comparison-query-vs-relationship-modeling

What is Cassandra?

• NoSQL database, developed in Java
• Fully distributed DB
  • Meaning that there is no master DB, unlike Oracle or MySQL.
• Linearly scalable
• Based on 2 core technologies: Google’s BigTable and Amazon’s Dynamo
• 2 versions of Cassandra
  • Community Edition: distributed under the Apache™ License
  • Enterprise Edition: distributed by DataStax

What is Cassandra? CAP Theorem

• In a distributed system you can only have two out of the following three guarantees across a write/read pair:
  • Consistency: A read is guaranteed to return the most recent write for a given client.
  • Availability: A non-failing node will return a reasonable response within a reasonable amount of time (no error or timeout).
  • Partition Tolerance: The system will continue to function when network partitions occur.

[Figure: node pairs N1 and N2]

CAP Theorem

• One fallacy of distributed computing is that networks are reliable

• AP (Availability/Partition Tolerance): Return the most recent version of the data you have, which could be stale. Will also accept writes that can be processed later when the partition is resolved.

• CP (Consistency/Partition Tolerance): Wait for a response from the partitioned node, which could result in a timeout error.

What is Cassandra?

Cassandra is a BASE (Basically Available, Soft state, Eventually consistent) type system

• Not an ACID (Atomicity, Consistency, Isolation, Durability) type system

Cassandra is classified as an AP system

It Can be as easy as …

• Start your machine and install the following:
  • ntp (packages are normally ntp, ntpdate and ntp-doc)
  • wget (unless you have your packages copied over via other means)
  • vim (or your favorite text editor)
• Yum Package Management
• Root or sudo access to the install machine
• Latest version of Oracle Java SE Runtime Environment (JRE) 8 (recommended) or OpenJDK 7
• Python 2.6+ (needed if installing OpsCenter)

It Can be as easy as …

• Install Cassandra.
~$ sudo yum install dsc21-2.1.5-1 cassandra21-2.1.5-1

• Install optional utilities.
~$ sudo yum install cassandra21-tools-2.1.5-1

• Stop the Cassandra service and clear the system data.
~$ sudo service cassandra stop
~$ sudo rm -rf /var/lib/cassandra/data/system/*

• In the cassandra-rackdc.properties file:
# indicate the rack and dc for this node
dc=Pythian
rack=RAC1

• Start the Cassandra service.
~$ sudo service cassandra start

Where is everything in Cassandra?

Directories | Description
/var/lib/cassandra | Data directories
/var/log/cassandra | Log directory
/var/run/cassandra | Runtime files
/usr/share/cassandra | Environment settings
/usr/share/cassandra/lib | JAR files
/usr/bin | Optional utilities, such as sstablelevelreset, sstablerepairedset, and sstablesplit
/usr/bin, /usr/sbin | Binary files
/etc/cassandra | Configuration files
/etc/init.d | Service startup script
/etc/security/limits.d | Cassandra user limits
/etc/default |
/usr/share/doc/cassandra/examples | Sample cassandra.yaml files for stress testing

I come from this world…

12c Version Architecture…

I come from this world… Oracle…

[Diagram: Oracle logical and physical storage structures. Labels: Database, Tablespace, Segment, Extent, Oracle data block, Schema, Data file, OS block, Data Files, Control Files, Online Redo Log]

I come from this world… RAC: for node point of failure

[Diagram: RAC Cluster with Node1, Node2 and Node3, each running an ASM instance and a DB instance, over shared ASM Disks. Labels: Public Network, Storage Network, ASM Network, CSS Network]

Global Data Services – Service Failover / Load Balancing

I come from this world… Dataguard: for failover

[Diagram: Primary, Far Sync Instance and Standby. Labels: SYNC, ASYNC]

Zero data loss failover

Cassandra Architecture

[Diagram: Cassandra Cluster. Datacenter México: Rack1 with nodes N1 and N2. Datacenter Portugal: Rack2 with nodes N3 and N4]

One Ring to Rule them All

• The total amount of data managed by the cluster is represented as a ring

• Each node is assigned a part of the database to hold based on each table’s primary key.

• To guarantee both availability and durability, multiple nodes will be assigned to the same data.

• There is no master node; all nodes can perform all operations

[Diagram: ring of four nodes (1, 2, 3, 4), each holding three of the four key ranges: "A-F, T-Z, M-S", "G-L, A-F, T-Z", "M-S, G-L, A-F", "T-Z, M-S, G-L"]

Gossip

• Peer-to-peer communication protocol in which nodes periodically exchange state information

• Runs every second and exchanges state messages with up to three other nodes in the cluster

• Failure detection
  • It determines locally from gossip state and history if another node in the system is down or has come back up.
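The gossip exchange described above can be sketched in a few lines of Python. This is only a toy model, not Cassandra's implementation: the `merge` and `gossip_round` names, the heartbeat-counter state, and the four-node cluster are assumptions made for illustration.

```python
import random


def merge(local: dict, remote: dict) -> None:
    """Keep the highest heartbeat version seen for every node."""
    for node, version in remote.items():
        if version > local.get(node, -1):
            local[node] = version


def gossip_round(states: dict, fanout: int = 3, rng=random) -> None:
    """Each node bumps its own heartbeat and swaps state with up to `fanout` peers."""
    for node, view in states.items():
        view[node] = view.get(node, 0) + 1
        peers = [p for p in states if p != node]
        for peer in rng.sample(peers, min(fanout, len(peers))):
            merge(states[peer], view)   # peer learns what this node knows
            merge(view, states[peer])   # this node learns what the peer knows


# Hypothetical cluster: every node starts out knowing only itself.
states = {name: {name: 0} for name in ("n1", "n2", "n3", "n4")}
rng = random.Random(0)
for _ in range(3):
    gossip_round(states, rng=rng)
```

A node whose heartbeat stops advancing in its peers' views is a candidate for being marked down, which is the local decision the failure detection bullet refers to.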

Consistent Hashing

• A hash consists of one or more arithmetic operations on a piece of data

• Common way of load balancing across several nodes

• Hash function must have an upper and lower bound so objects can be mapped onto a circle

• Common hash algorithms
– Simple checksums
– Message Digest (MD5)
– Secure Hash Algorithm (SHA-1/2)
– MurmurHash
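To make the ring idea concrete, here is a minimal Python sketch of consistent hashing with replicas placed by walking clockwise. It is an illustration under stated assumptions: MD5 stands in for the hash function because it is in the standard library, each node gets a single token, and the `HashRing` class, node names and sample key are hypothetical.

```python
import bisect
import hashlib


def token(key: str) -> int:
    """Map a key to a position on the ring using MD5 (a 128-bit integer)."""
    return int.from_bytes(hashlib.md5(key.encode("utf-8")).digest(), "big")


class HashRing:
    """Toy ring: each node owns the arc that ends at its own token."""

    def __init__(self, nodes):
        self._ring = sorted((token(name), name) for name in nodes)
        self._tokens = [t for t, _ in self._ring]

    def replicas(self, key: str, rf: int = 1):
        """Return the rf nodes found walking clockwise from the key's token."""
        start = bisect.bisect_left(self._tokens, token(key))
        count = min(rf, len(self._ring))
        return [self._ring[(start + i) % len(self._ring)][1] for i in range(count)]


ring = HashRing(["node1", "node2", "node3", "node4"])
owners = ring.replicas("videogames:events:42", rf=3)
print(owners)
```

Because the hash output is bounded, the position past the largest token wraps around to the first node, which is what turns the token range into a circle.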

Partitioners

• Determines how data is distributed across the nodes in the cluster

• Function for deriving a token representing a row from its partition key by hashing.

Cassandra offers:
– Murmur3Partitioner
– RandomPartitioner
– ByteOrderedPartitioner (Not Recommended)

Coordinators

• Acts as a proxy between the client application and the nodes that own the data being requested.

• Any client request can be sent to any node.

Snitch

• Is responsible for keeping all of the nodes up to date on what node has what data, what nodes are currently down, what nodes are bootstrapping, etc.

• It interprets the topology

The most popular are:
– Gossiping property file snitch
– EC2 Snitch
– EC2 Multi-region snitch
– Dynamic Snitch

Logical database container

Data is Stored in Keyspaces

Model Around Your Queries

• Determine what queries to support
  • Grouping by an attribute
  • Ordering by an attribute
  • Filtering based on some set of conditions

• Create a table where you can satisfy your query
  • Generally means you will use roughly one table per query pattern
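The "one table per query pattern" idea can be illustrated without a cluster. The sketch below uses plain Python dictionaries as stand-ins for tables; the table names, the `record_event` helper and the sample data are all hypothetical.

```python
from collections import defaultdict

# Two "tables", one per query pattern, holding the same events (denormalized).
events_by_player = defaultdict(list)   # query: all events for a player
events_by_game = defaultdict(list)     # query: all events for a game


def record_event(player: str, game: str, score: int) -> None:
    """Write the same fact to every table that serves a query."""
    event = {"player": player, "game": game, "score": score}
    events_by_player[player].append(event)
    events_by_game[game].append(event)


record_event("ana", "tetris", 120)
record_event("ana", "pong", 15)
record_event("luis", "tetris", 300)

# Each query is a single lookup by its partition key; no join is needed.
ana_games = [e["game"] for e in events_by_player["ana"]]
tetris_top = max(events_by_game["tetris"], key=lambda e: e["score"])["player"]
```

The cost of this design is that every write goes to each table, which is the trade the denormalization slides describe: more storage and write work in exchange for reads that touch one partition.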

A Cassandra Table or Column Family

[Diagram: a Cassandra node. Labels: Coordinator, Snitch, Commit log writer, Memtable writer, MemTable flush (SSTable writer), Reader, Memtables, Bloom filters, Commit Log, SSTables]

A Cassandra Table or Column Family

• Consists of one or more SSTables and 0 or more memtables

• SSTable stands for Sorted String Table.
  • E.g. all of the columns in the SSTable are sorted in order by key.

• Each SSTable consists of the data table, bloom filter, index and some other minor files.

• SSTables are immutable. Once written they are never altered, only read and eventually deleted

SSTables on disk (/var/lib/cassandra):

videogames-events-data-jb-1.db
videogames-events-filters-jb-1.db
videogames-events-index-jb-1.db
videogames-events-data-jb-2.db
videogames-events-filters-jb-2.db
videogames-events-index-jb-2.db
videogames-events-data-jb-3.db
videogames-events-filters-jb-3.db
videogames-events-index-jb-3.db
videogames-events-data-jb-4.db
videogames-events-filters-jb-4.db
videogames-events-index-jb-4.db
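A rough Python sketch of how a memtable turns into immutable, sorted SSTables may help here. It is a toy model under loud assumptions: the `ToyTable` class is hypothetical, the flush is triggered by entry count rather than memory size, and bloom filters and index files are left out.

```python
class ToyTable:
    """Toy model: one mutable memtable plus a list of immutable, sorted SSTables."""

    def __init__(self, flush_threshold: int = 2):
        self.commit_log = []      # append-only record of every write
        self.memtable = {}        # in-memory and mutable
        self.sstables = []        # each entry is a sorted tuple of (key, value)
        self.flush_threshold = flush_threshold

    def write(self, key, value):
        self.commit_log.append((key, value))   # log first, for durability
        self.memtable[key] = value
        if len(self.memtable) >= self.flush_threshold:
            self.flush()

    def flush(self):
        """Freeze the memtable into a new SSTable sorted by key."""
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}

    def read(self, key):
        """Check the memtable, then SSTables from newest to oldest."""
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            for k, v in sstable:
                if k == key:
                    return v
        return None


table = ToyTable()
table.write("b", 1)
table.write("a", 2)   # second write reaches the threshold and triggers a flush
table.write("a", 3)   # newer value lives in the memtable, the old one stays on "disk"
```

Note that the update to "a" does not touch the flushed SSTable; the read path simply finds the newer value first.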

Replication Factor (RF) and Consistency

• Replication Factor is the number of copies of columns stored in the ring

• Replication factor should not exceed the number of nodes in the cluster

– RF=1 is one copy; this means that the data for each column is stored only once in the ring.

– RF=3 (default) means every column stored in the database is stored three times.

– Quorum: The read and write must be acked/returned from a quorum of nodes.

Replication Factor (RF) and Consistency

• Consistency

– When a write or read is performed, the application can choose to wait for n copies of the data to be written or read; this is referred to as a consistency of n.

– There is a special consistency value called quorum, which means a response from RF/2+1 nodes is required (with RF/2 rounded down).
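The quorum arithmetic is easy to check with a short Python sketch. The function names are made up for illustration; the overlap test encodes the usual rule of thumb that a read is guaranteed to see the latest acknowledged write when the read and write replica counts add up to more than RF.

```python
def quorum(rf: int) -> int:
    """Number of replicas that must respond for QUORUM: floor(RF / 2) + 1."""
    return rf // 2 + 1


def overlaps(read_n: int, write_n: int, rf: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return read_n + write_n > rf


for rf in (1, 3, 5):
    print(rf, quorum(rf))

strong = overlaps(quorum(3), quorum(3), 3)   # QUORUM reads + QUORUM writes
weak = overlaps(1, 1, 3)                     # reads and writes that wait for one copy
```

With RF=3 a quorum is 2, so one replica can be down and quorum reads and writes still succeed.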

How to make sure we don’t lose data

A.K.A. Anti-Entropy

• Three anti-entropy mechanisms in Cassandra
  1) Hinted handoff
  2) Read repair
  3) Repair
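Of the three mechanisms, read repair is the simplest to sketch. The Python below is a toy model rather than Cassandra's implementation: replicas are plain dictionaries mapping a key to a (timestamp, value) pair, and the `read_repair` name is hypothetical.

```python
def read_repair(replicas, key):
    """Return the newest value for key and overwrite replicas that hold older data.

    Each replica is a dict mapping key -> (timestamp, value).
    """
    answers = [r[key] for r in replicas if key in r]
    if not answers:
        return None
    newest = max(answers, key=lambda tv: tv[0])
    for r in replicas:
        if r.get(key) != newest:
            r[key] = newest          # push the winning version to the stale replica
    return newest[1]


n1 = {"k": (2, "new")}
n2 = {"k": (1, "old")}   # missed the latest write
n3 = {}                  # missed every write
value = read_repair([n1, n2, n3], "k")
```

The read itself is what triggers the fix, so data that is read often converges quickly, while rarely read data depends on hinted handoff and repair.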

Write Path


Compactions

• SSTables are immutable.
• Deletes and updates are just new writes.
• SSTables are merged together by partition key. Old obsolete data is discarded.
• Lots of SSTables become a few.
• Compaction can require a lot of disk space. DO NOT LET your disks get more than 50% full.
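The merge step can be sketched in Python as a last-write-wins pass over several SSTables. This is a simplified illustration: the `compact` function and the `TOMBSTONE` marker are hypothetical, and deleted keys are dropped immediately, whereas Cassandra keeps tombstones for a grace period (`gc_grace_seconds`) before purging them.

```python
TOMBSTONE = object()   # marker written by a delete


def compact(sstables):
    """Merge SSTables into one, keeping only the newest cell per key.

    Each SSTable is a list of (key, timestamp, value) tuples.
    Keys whose newest cell is a tombstone are dropped from the output.
    """
    newest = {}
    for sstable in sstables:
        for key, ts, value in sstable:
            if key not in newest or ts > newest[key][0]:
                newest[key] = (ts, value)
    return sorted(
        (key, ts, value)
        for key, (ts, value) in newest.items()
        if value is not TOMBSTONE
    )


old = [("a", 1, "v1"), ("b", 1, "v1")]
new = [("a", 2, "v2"), ("b", 3, TOMBSTONE), ("c", 2, "v1")]
merged = compact([old, new])
```

The inputs stay on disk until the merged output is complete, which is why compaction needs free space on the order of the data being compacted.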

CQL - Cassandra Query Language

CQL is not SQL

• Default and primary interface into the Cassandra Database (since 2.0)
• Cassandra does not support joins or subqueries
• Only way to create users and user based permissions
• Very similar to SQL in syntax:

cqlsh> CREATE KEYSPACE sandbox WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 1 };
cqlsh> USE sandbox;
cqlsh:sandbox> CREATE TABLE data (id uuid, data text, PRIMARY KEY (id));
cqlsh:sandbox> INSERT INTO data (id, data) VALUES (c37d661d-7e61-49ea-96a5-68c34e83db3a, 'testing');
cqlsh:sandbox> SELECT * FROM data;

Feature/Function | DSE/Cassandra | Oracle RDBMS
Core architecture | “Masterless”; peer-to-peer with all nodes being the same | Traditional standalone
High availability | Continuous availability with built in redundancy and hardware rack awareness in both single and multiple data centers | Oracle Dataguard (for failover) and Oracle RAC (Node SPOF), GoldenGate
Data model | Google Bigtable | Relational/tabular
Data consistency model | Tunable consistency (CAP theorem consistency) per operation | Traditional ACID
Storage model | Targeted directories with separation | Tablespaces
Logical database container | Keyspace | Database
Backup/recovery | Online, point-in-time restore | Online, point-in-time restore
Enterprise management/monitoring | DataStax OpsCenter | Oracle Enterprise Manager

Lessons Learned

• Understand the Data Model Differences
• Hardware Setup does Matter
• Grep the logs for errors and warnings
• Make sure each node is created properly
• Know your tools
  • nodetool utility
  • Cassandra bulk loader (sstableloader)
  • jconsole / Java VisualVM
  • Cassandra-Stress
  • OpsCenter

rene-ace.com

Thank you – Q&A

CONSULTING & STRATEGY

IMPLEMENTATION

MANAGED SERVICES

To contact us sales@pythian.com

1-877-PYTHIAN

To follow us http://www.pythian.com/blog

http://goo.gl/bImXcJ

@pythian

http://goo.gl/DMXExf