Pythian: My First 100 days with a Cassandra Cluster

My First 100 days with a Cassandra Cluster. Presented by: Gustavo René Antúnez (DBA Team Lead) and Carlos Rolo (Cassandra MVP). September 2015

Transcript of Pythian: My First 100 days with a Cassandra Cluster

Page 1: Pythian: My First 100 days with a Cassandra Cluster

My First 100 days with a Cassandra Cluster

Presented by: Gustavo René Antúnez (DBA Team Lead) and Carlos Rolo (Cassandra MVP). September 2015

Page 2: Pythian: My First 100 days with a Cassandra Cluster


Welcome to Cassandra Summit 2015

Page 3: Pythian: My First 100 days with a Cassandra Cluster

About Pythian

• 18 years of data infrastructure management consulting
• 200+ top brands
• 6000+ databases under management
• Over 400 DBAs in 35 countries
• Top 5% of DBA workforce: 9 Oracle ACEs, 2 Microsoft MVPs, 1 Cassandra MVP
• Oracle, Microsoft, MySQL, DataStax partners, Netezza, Hadoop and MongoDB, plus UNIX sysadmin and Oracle apps

Page 4: Pythian: My First 100 days with a Cassandra Cluster

Where does René come from

• Oracle DBA
• Started with version 9.2 in 2004
• Speaker at Oracle Open World, Developers Day and Collaborate
• APress Q1 2016: "Practical Data Refresh"
• Movie fanatic & music lover
• Bringing the best from México (Mexihtli) to the rest of the world and in the process photographing it :)
• rene-ace.com
• @rene_ace

Page 5: Pythian: My First 100 days with a Cassandra Cluster

Where does Carlos come from

• Cassandra consultant
• First contact was 0.8
• Cassandra MVP & DataStax Certified Architect
• Lisbon Cassandra Meetup
• Passion for distributed systems
• Loves a good challenge
• Waterpolo is my sport
• @cjrolo

Page 6: Pythian: My First 100 days with a Cassandra Cluster

How did you get to be a DBA?

Page 7: Pythian: My First 100 days with a Cassandra Cluster

6th Happiest Job of 2015!

http://www.forbes.com/sites/susanadams/2014/03/20/the-happiest-and-unhappiest-jobs-in-2014/

Work-life balance

Relationship with boss and co-workers

Daily tasks

Job resources

Field will grow by 15% between 2012 and 2022

DBA can be the key driver of success

Page 8: Pythian: My First 100 days with a Cassandra Cluster

Happiest Job of 2034?

Oxford University: THE FUTURE OF EMPLOYMENT: HOW SUSCEPTIBLE ARE JOBS TO COMPUTERISATION?

• 47 percent of American jobs are at high risk of being taken by computers within the next two decades.
– 1st wave: computers will start replacing people in especially vulnerable fields like transportation/logistics, production labor, and administrative support.
– 2nd wave: dependent upon the development of good artificial intelligence. This could next put jobs in management, science and engineering, and the arts at risk.

Page 9: Pythian: My First 100 days with a Cassandra Cluster

What is Cassandra?

• NoSQL database, developed in Java
• Fully distributed DB
– Meaning that there is no master DB, unlike Oracle or MySQL.
• Linearly scalable
• Based on 2 core technologies: Google's Bigtable and Amazon's Dynamo
• 2 versions of Cassandra
– Community Edition: distributed under the Apache™ License
– Enterprise Edition: distributed by DataStax

Page 10: Pythian: My First 100 days with a Cassandra Cluster

What is Cassandra?

CAP Theorem

• In a distributed system you can only have two out of the following three guarantees across a write/read pair:
– Consistency: a read is guaranteed to return the most recent write for a given client.
– Availability: a non-failing node will return a reasonable response within a reasonable amount of time (no error or timeout).
– Partition tolerance: the system will continue to function when network partitions occur.

[Figure: diagrams of node pairs N1 and N2]

Page 11: Pythian: My First 100 days with a Cassandra Cluster

What is Cassandra?

• Cassandra is a BASE (Basically Available, Soft state, Eventually consistent) type system
• Not an ACID (Atomicity, Consistency, Isolation, Durability) type system

Page 12: Pythian: My First 100 days with a Cassandra Cluster

It Can be as easy as …

• Start your machine and install the following:
– ntp (packages are normally ntp, ntpdate and ntp-doc)
– wget (unless you have your packages copied over via other means)
– vim (or your favorite text editor)
– Yum package management
– Root or sudo access to the install machine
– Latest version of Oracle Java SE Runtime Environment (JRE) 8 (recommended) or OpenJDK 7
– Python 2.6+ (needed if installing OpsCenter)

Page 13: Pythian: My First 100 days with a Cassandra Cluster

It Can be as easy as …

• Install Cassandra.
~$ sudo yum install dsc21-2.1.5-1 cassandra21-2.1.5-1

• Install optional utilities.
~$ sudo yum install cassandra21-tools-2.1.5-1

• Stop the Cassandra service and clear the system data before the first real start.
~$ sudo service cassandra stop
~$ sudo rm -rf /var/lib/cassandra/data/system/*

• In the cassandra-rackdc.properties file, indicate the rack and dc for this node.
dc=Pythian
rack=RAC1

• Start the Cassandra service.
~$ sudo service cassandra start

Page 14: Pythian: My First 100 days with a Cassandra Cluster

Where is everything in Cassandra?

Directories | Description
/var/lib/cassandra | Data directories
/var/log/cassandra | Log directory
/var/run/cassandra | Runtime files
/usr/share/cassandra | Environment settings
/usr/share/cassandra/lib | JAR files
/usr/bin | Optional utilities, such as sstablelevelreset, sstablerepairedset, and sstablesplit
/usr/bin | Binary files
/usr/sbin |
/etc/cassandra | Configuration files
/etc/init.d | Service startup script
/etc/security/limits.d | Cassandra user limits
/etc/default |
/usr/share/doc/cassandra/examples | Sample cassandra.yaml files for stress testing

Page 15: Pythian: My First 100 days with a Cassandra Cluster

I come from this world…

12c Version Architecture…

Page 16: Pythian: My First 100 days with a Cassandra Cluster

I come from this world… Oracle…

[Figure: Oracle storage structures, logical and physical. Logical: database, schema, tablespace, segment, extent, Oracle data block. Physical: data files, control files, online redo log, OS block.]

Page 17: Pythian: My First 100 days with a Cassandra Cluster

I come from this world…

RAC - For Node Point of Failure

[Figure: RAC cluster with Node1, Node2 and Node3, each running ASM and a database instance on shared ASM disks, connected through the public, storage, ASM and CSS networks]

Global Data Services: Service Failover / Load Balancing

Page 18: Pythian: My First 100 days with a Cassandra Cluster

I come from this world…

Dataguard - For Failover

[Figure: Primary, Far Sync Instance and Standby, linked by SYNC and ASYNC transport]

Zero data loss failover

Page 19: Pythian: My First 100 days with a Cassandra Cluster

Cassandra Architecture

[Figure: a Cassandra cluster with Datacenter México (Rack 1: nodes N1 and N2) and Datacenter Portugal (Rack 2: nodes N3 and N4)]

Page 20: Pythian: My First 100 days with a Cassandra Cluster

One Ring to Rule them All

• The total amount of data managed by the cluster is represented as a ring.
• Each node is assigned a part of the database to hold, based on each table's primary key.
• To guarantee both availability and durability, multiple nodes are assigned to the same data.
• There is no master node; all nodes can perform all operations.

[Figure: ring of four nodes (1 to 4), each holding three key ranges: A-F,T-Z,M-S / G-L,A-F,T-Z / M-S,G-L,A-F / T-Z,M-S,G-L]
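The range labels in the ring figure can be reproduced with a small sketch. It assumes a hypothetical four-node ring where each node owns one letter range and every range is also copied to the next two nodes clockwise (three copies in total). The node numbers, the range-to-node mapping and the function name are illustrative assumptions, not taken from Cassandra itself.

```python
# Hypothetical four-node ring; each node owns one primary key range.
RING = [(1, "A-F"), (2, "G-L"), (3, "M-S"), (4, "T-Z")]


def ranges_held(ring, replication_factor):
    """Map each node to the key ranges it stores.

    Each range is placed on its owner and then on the next
    (replication_factor - 1) nodes walking clockwise around the ring.
    """
    held = {node: [] for node, _ in ring}
    size = len(ring)
    for index, (_, key_range) in enumerate(ring):
        for step in range(replication_factor):
            replica_node = ring[(index + step) % size][0]
            held[replica_node].append(key_range)
    return held


placement = ranges_held(RING, replication_factor=3)
for node, ranges in sorted(placement.items()):
    print("node", node, "holds", ranges)
```

With three copies on four nodes, any single node can be lost and every range is still available on two others.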

Page 21: Pythian: My First 100 days with a Cassandra Cluster

Gossip

• Peer-to-peer communication protocol in which nodes periodically exchange state information
• Runs every second and exchanges state messages with up to three other nodes in the cluster
• Failure detection
– It determines locally, from gossip state and history, whether another node in the system is down or has come back up.
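The gossip idea above can be illustrated with a toy simulation. This is a deliberately simplified sketch, not Cassandra's actual protocol: each node keeps a map of heartbeat versions, bumps its own counter every round, and swaps its view with up to three randomly chosen peers. The node names, the `fanout` parameter and the fixed random seed are assumptions made for the example.

```python
import random


def merge_state(local, remote):
    """Keep the highest heartbeat version seen for every node."""
    for node, version in remote.items():
        if version > local.get(node, -1):
            local[node] = version


def gossip_round(states, rng, fanout=3):
    """One round: every node bumps its heartbeat and swaps state with up to `fanout` peers."""
    nodes = list(states)
    for node in nodes:
        states[node][node] += 1
        peers = [n for n in nodes if n != node]
        for peer in rng.sample(peers, min(fanout, len(peers))):
            # The exchange is two-way: both sides end up with the merged view.
            merge_state(states[peer], states[node])
            merge_state(states[node], states[peer])


rng = random.Random(42)  # fixed seed so the run is repeatable
states = {name: {name: 0} for name in ["n1", "n2", "n3", "n4", "n5", "n6"]}
rounds = 0
while rounds < 100 and any(len(view) < len(states) for view in states.values()):
    gossip_round(states, rng)
    rounds += 1
print("every node has heard of every other node after", rounds, "round(s)")
```

A node whose heartbeat stops increasing in everyone's view is the raw signal a failure detector would work from; that part is not modelled here.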

Page 22: Pythian: My First 100 days with a Cassandra Cluster

Consistent Hashing

• A hash consists of one or more arithmetic operations on a piece of data
• Common way of load balancing across several nodes
• Hash function must have an upper and lower bound so objects can be mapped in a circle
• Common hash algorithms
– Simple checksums
– Message Digest (MD5)
– Secure Hash Algorithm (SHA-1/2)
– MurmurHash
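A minimal sketch of consistent hashing follows, assuming MD5 as the bounded hash function and a ring of size 2^32. The class name `HashRing`, the node names and the key format are hypothetical. Keys and nodes are hashed onto the same circle, and a key belongs to the first node at or after its position, wrapping around at the end.

```python
import bisect
import hashlib


def token_for(key, ring_size=2**32):
    """Hash a key onto a bounded ring [0, ring_size) using MD5."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % ring_size


class HashRing:
    def __init__(self, nodes):
        # Each node sits on the ring at the token of its own name.
        self._entries = sorted((token_for(n), n) for n in nodes)
        self._tokens = [token for token, _ in self._entries]

    def owner(self, key):
        """Return the first node at or after the key's token, wrapping around."""
        index = bisect.bisect_left(self._tokens, token_for(key))
        return self._entries[index % len(self._entries)][1]


ring = HashRing(["node1", "node2", "node3"])
bigger = HashRing(["node1", "node2", "node3", "node4"])
keys = ["user:%d" % i for i in range(1000)]
moved = [k for k in keys if ring.owner(k) != bigger.owner(k)]
print("keys that moved after adding node4:", len(moved), "of", len(keys))
```

The point of the design is visible in `moved`: adding a node only relocates the keys that the new node takes over, and everything else stays where it was.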

Page 23: Pythian: My First 100 days with a Cassandra Cluster

Partitioners

• Determines how data is distributed across the nodes in the cluster
• Function for deriving a token representing a row from its partition key

Cassandra offers:
– Murmur3Partitioner
– RandomPartitioner
– ByteOrderedPartitioner

Page 24: Pythian: My First 100 days with a Cassandra Cluster

Virtual Nodes

• Solution for avoiding calculating node tokens and thinking about the cluster size beforehand
• Each node has multiple virtual nodes
• Each virtual node owns a much smaller subset of data
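The effect of virtual nodes can be explored with a rough sketch. It assumes tokens are derived by hashing a made-up "node#index" label with MD5; Cassandra picks its tokens differently, so this only illustrates the idea of many small ranges per physical node. All names and the key set are hypothetical.

```python
import bisect
import hashlib
from collections import Counter


def token_for(value, ring_size=2**64):
    """Hash a string onto a bounded ring [0, ring_size) using MD5."""
    digest = hashlib.md5(value.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % ring_size


def build_ring(nodes, vnodes_per_node):
    """Place `vnodes_per_node` tokens on the ring for every physical node."""
    return sorted(
        (token_for("%s#%d" % (node, i)), node)
        for node in nodes
        for i in range(vnodes_per_node)
    )


def load_per_node(nodes, vnodes_per_node, keys):
    """Count how many keys land on each physical node."""
    ring = build_ring(nodes, vnodes_per_node)
    tokens = [token for token, _ in ring]
    counts = Counter()
    for key in keys:
        index = bisect.bisect_left(tokens, token_for(key)) % len(ring)
        counts[ring[index][1]] += 1
    return counts


nodes = ["node1", "node2", "node3", "node4"]
keys = ["key-%d" % i for i in range(10000)]
for vnodes in (1, 256):
    counts = load_per_node(nodes, vnodes, keys)
    print(vnodes, "token(s) per node:", sorted(counts.items()))
```

Comparing the two printed lines shows how the key counts per node change when each node holds many small ranges instead of one large one.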

Page 25: Pythian: My First 100 days with a Cassandra Cluster

Coordinators

• Acts as a proxy between the client application and the nodes that own the data being requested
• Any client request can be sent to any node.

Page 26: Pythian: My First 100 days with a Cassandra Cluster

Snitch

• Is responsible for keeping all of the nodes up to date on what node has what data, what nodes are currently down, what nodes are bootstrapping, etc.
• It interprets the topology

The most popular are:
– Gossiping property file snitch
– EC2 snitch
– EC2 multi-region snitch
– Dynamic snitch

Page 27: Pythian: My First 100 days with a Cassandra Cluster


Page 28: Pythian: My First 100 days with a Cassandra Cluster

Logical database container

Data is stored in keyspaces

Page 29: Pythian: My First 100 days with a Cassandra Cluster

A CASSANDRA TABLE OR COLUMN FAMILY

[Figure: Cassandra node components: coordinator, snitch, commit log writer, memtable writer, memtable flush (SSTable writer), reader, memtables and bloom filters, with the CommitLog and SSTables on disk]

Page 30: Pythian: My First 100 days with a Cassandra Cluster

A CASSANDRA TABLE OR COLUMN FAMILY

• Consists of one or more SSTables and zero or more memtables
• SSTable stands for Sorted String Table.
– I.e. all of the columns in the SSTable are sorted in order by key.
• Each SSTable consists of the data table, bloom filter, index and some other minor files.
• SSTables are immutable. Once written they are never altered, only read and eventually deleted.

SSTables on disk (/var/lib/cassandra):
videogames-events-data-jb-1.db
videogames-events-filters-jb-1.db
videogames-events-index-jb-1.db
videogames-events-data-jb-2.db
videogames-events-filters-jb-2.db
videogames-events-index-jb-2.db
videogames-events-data-jb-3.db
videogames-events-filters-jb-3.db
videogames-events-index-jb-3.db
videogames-events-data-jb-4.db
videogames-events-filters-jb-4.db
videogames-events-index-jb-4.db

Page 31: Pythian: My First 100 days with a Cassandra Cluster

REPLICATION FACTOR (RF) AND CONSISTENCY

• Replication factor is the number of copies of columns stored in the ring
• Replication factor should not exceed the number of nodes in the cluster
– RF=1 is one copy; this means that the data for each column is stored only once in the ring.
– RF=3 (default) means every column stored in the database is stored three times.
– Quorum: the read and write must be acked/returned from a quorum of nodes.

Page 32: Pythian: My First 100 days with a Cassandra Cluster

REPLICATION FACTOR (RF) AND CONSISTENCY

• Consistency
– When a write or read is performed, the application can choose to wait for n copies of the data to be written or read; this is referred to as a consistency of n.
– There is a special consistency value called quorum, which means a response from RF/2+1 nodes is required.
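The quorum arithmetic can be made concrete with a short sketch. It assumes RF/2+1 uses integer division, and it adds the common rule of thumb that a read is guaranteed to see the latest write when the replicas acknowledged on write plus those consulted on read exceed RF. The function names are illustrative only.

```python
def quorum(replication_factor):
    """Number of replicas that must respond: RF/2 + 1 with integer division."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, write_acks, read_acks):
    """True when the write and read replica sets must share at least one node."""
    return write_acks + read_acks > replication_factor


for rf in (1, 3, 5):
    q = quorum(rf)
    print("RF=%d quorum=%d overlap=%s" % (rf, q, reads_see_latest_write(rf, q, q)))
```

For RF=3, quorum is 2: writing to two replicas and reading from two always overlaps in at least one node, while writing to one and reading from one does not.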

Page 33: Pythian: My First 100 days with a Cassandra Cluster

HOW TO MAKE SURE WE DON'T LOSE DATA

• Three anti-entropy mechanisms in Cassandra
1) Hinted handoff
2) Read repair
3) Repair (a.k.a. anti-entropy)

Page 34: Pythian: My First 100 days with a Cassandra Cluster

WRITE PATH


Page 35: Pythian: My First 100 days with a Cassandra Cluster

COMPACTIONS

• SSTables are immutable.
• Deletes and updates are just new writes.
• SSTables are merged together by partition key. Old obsolete data is discarded.
• Lots of SSTables become a few.
• Compaction can require a lot of disk space. DO NOT LET your disks get more than 50% full.
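The merge described above can be sketched in a few lines. This is a simplified model, not Cassandra's compaction code: each SSTable is a list of (key, timestamp, value) tuples, a delete is represented by a `None` value, and deleted keys are dropped immediately, whereas a real cluster keeps delete markers around for a grace period. All names and data are made up.

```python
TOMBSTONE = None  # marker written by a delete (assumption for this sketch)


def compact(sstables):
    """Merge SSTables into one, keeping only the newest cell per key.

    Each SSTable is a list of (key, timestamp, value) tuples sorted by key.
    Inputs are never modified; a new sorted list is returned.
    """
    newest = {}
    for table in sstables:
        for key, timestamp, value in table:
            if key not in newest or timestamp > newest[key][0]:
                newest[key] = (timestamp, value)
    return [
        (key, timestamp, value)
        for key, (timestamp, value) in sorted(newest.items())
        if value is not TOMBSTONE
    ]


older = [("a", 1, "v1"), ("b", 1, "v1"), ("c", 1, "v1")]
newer = [("a", 2, "v2"), ("b", 3, TOMBSTONE)]  # update "a", delete "b"
merged = compact([older, newer])
print(merged)
```

Because the inputs stay on disk until the merged table is fully written, both copies exist at once, which is why compaction needs free disk space.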

Page 36: Pythian: My First 100 days with a Cassandra Cluster

CQL - Cassandra Query Language

CQL is not SQL

• Default and primary interface into the Cassandra Database (since 2.0)
• Cassandra does not support joins or subqueries
• Only way to create users and user based permissions
• Very similar:

cqlsh> CREATE KEYSPACE sandbox WITH REPLICATION = { 'class' : 'NetworkTopologyStrategy', 'DC1' : 1 };
cqlsh> USE sandbox;
cqlsh:sandbox> CREATE TABLE data (id uuid, data text, PRIMARY KEY (id));
cqlsh:sandbox> INSERT INTO data (id, data) VALUES (c37d661d-7e61-49ea-96a5-68c34e83db3a, 'testing');
cqlsh:sandbox> SELECT * FROM data;

Page 37: Pythian: My First 100 days with a Cassandra Cluster


Page 38: Pythian: My First 100 days with a Cassandra Cluster

Feature/Function | DSE/Cassandra | Oracle RDBMS
Core architecture | "Masterless"; peer-to-peer with all nodes being the same | Traditional standalone
High availability | Continuous availability with built-in redundancy and hardware rack awareness in both single and multiple data centers | Oracle Dataguard (for failover), Oracle RAC (node SPOF), GoldenGate
Data model | Google Bigtable | Relational/tabular
Data consistency model | Tunable consistency (CAP theorem consistency per operation) | Traditional ACID
Storage model | Targeted directories with separation | Tablespaces
Logical database container | Keyspace | Database
Backup/recovery | Online, point-in-time restore | Online, point-in-time restore
Enterprise management/monitoring | DataStax OpsCenter | Oracle Enterprise Manager

Page 39: Pythian: My First 100 days with a Cassandra Cluster

LESSONS LEARNED

• Understand the data model differences
• Hardware setup does matter
• Grep the logs for errors and warnings
• Make sure each node is created properly
• Know your tools
– nodetool utility
– Cassandra bulk loader (sstableloader)
– jconsole/Java VisualVM
– cassandra-stress
– OpsCenter
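The "grep the logs" lesson can be sketched as a small filter. It assumes each log line starts with its level (for example `ERROR` or `WARN`) followed by the thread, timestamp and message; the sample lines below are invented for illustration and are not real Cassandra output.

```python
import re

# Assumes lines start with the log level, e.g. "ERROR [thread] 2015-09-22 ...".
LEVEL_PATTERN = re.compile(r"^(ERROR|WARN)\b")


def errors_and_warnings(lines):
    """Yield (level, line) for every line that starts with ERROR or WARN."""
    for line in lines:
        match = LEVEL_PATTERN.match(line)
        if match:
            yield match.group(1), line.rstrip("\n")


# Invented sample lines standing in for a real log file.
sample = [
    "INFO  [main] 2015-09-22 10:00:00,000 Example.java:1 - starting\n",
    "WARN  [main] 2015-09-22 10:00:01,000 Example.java:2 - something looks off\n",
    "ERROR [main] 2015-09-22 10:00:02,000 Example.java:3 - something failed\n",
]
for level, line in errors_and_warnings(sample):
    print(level, "|", line)
```

To scan a real node, the same function can be fed an open file handle for the system log under /var/log/cassandra instead of the sample list.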

Page 40: Pythian: My First 100 days with a Cassandra Cluster


Page 41: Pythian: My First 100 days with a Cassandra Cluster

FIT-ACER

• F – Focus (SLOW DOWN! Are you ready?)

• I – Identify server/DB name, time, authorization

• T – Type the command (do not hit enter yet)

• A – Assess the command (SPEND TIME HERE!)

• C – Check the server / database name again

• E – Execute the command

• R – Review and document the results


Page 42: Pythian: My First 100 days with a Cassandra Cluster


rene-ace.com

Page 43: Pythian: My First 100 days with a Cassandra Cluster


To contact us

[email protected]

1-877-PYTHIAN

To follow us

http://www.pythian.com/blog

http://www.facebook.com/pages/The-Pythian-Group/163902527671

@pythian

http://www.linkedin.com/company/pythian

Thank you – Q&A