Scaling Web Applications With Cassandra Presentation

introduction to cassandra eben hewitt september 29. 2010 web 2.0 expo new york city

Transcript of Scaling Web Applications With Cassandra Presentation

  • introduction to cassandra

    eben hewitt

    september 29, 2010

    web 2.0 expo

    new york city

  • director, application architecture at a global corp

    focus on SOA, SaaS, Events

    i wrote this

    @ebenhewitt

  • agenda: context, features, data model, api

  • nosql, big data

    mongodb, couchdb, tokyo cabinet, redis, riak

    what about? Poet, Lotus, Xindice

    they've been around forever

    rdbms was once the new kid

  • innovation at scale

    google bigtable (2006)
      consistency model: strong
      data model: sparse map
      clones: hbase, hypertable

    amazon dynamo (2007)
      O(1) dht
      consistency model: client tune-able
      clones: riak, voldemort

    cassandra ~= bigtable + dynamo

  • proven

    The Facebook stores 150TB of data on 150 nodes

    web 2.0

    used at Twitter, Rackspace, Mahalo, Reddit, Cloudkick, Cisco, Digg, SimpleGeo, Ooyala, OpenX, others

  • cap theorem

    consistency: all clients have same view of data

    availability: writeable in the face of node failure

    partition tolerance: processing can continue in the face of network failure (crashed router, broken network)

  • daniel abadi: pacelc

  • write consistency

    Level                        Description
    ZERO                         Good luck with that
    ANY                          1 replica (hints count)
    ONE                          1 replica. read repair in bkgnd
    QUORUM (DCQ for RackAware)   (N / 2) + 1
    ALL                          N = replication factor

    read consistency

    Level                        Description
    ZERO                         Ummm
    ANY                          Try ONE instead
    ONE                          1 replica
    QUORUM (DCQ for RackAware)   Return most recent TS after (N / 2) + 1 report
    ALL                          N = replication factor
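The QUORUM rows reduce to a little integer arithmetic. Below is a minimal sketch of it; the class and method names are made up for illustration and are not part of Cassandra. The overlap check (R + W > N) is the usual Dynamo-style rule of thumb rather than something stated on the slide.

```java
// Illustrative only: QuorumMath is a hypothetical helper, not a Cassandra class.
class QuorumMath {
    // QUORUM = (N / 2) + 1 using integer division, where N is the replication factor.
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    // A read is guaranteed to touch at least one replica that took the
    // latest write when readReplicas + writeReplicas > replicationFactor.
    static boolean readsOverlapWrites(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int n = 3;
        int q = quorum(n);
        System.out.println("N=" + n + " quorum=" + q);
        System.out.println("QUORUM write + QUORUM read overlap: " + readsOverlapWrites(q, q, n));
        System.out.println("ONE write + ONE read overlap: " + readsOverlapWrites(1, 1, n));
    }
}
```

With N = 3, a quorum is 2 replicas, so QUORUM writes plus QUORUM reads overlap while ONE plus ONE does not.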

  • agenda: context, features, data model, api

  • cassandra properties

    tunably consistent
    very fast writes
    highly available
    fault tolerant
    linear, elastic scalability
    decentralized/symmetric
    ~12 client languages
    Thrift RPC API
    ~automatic provisioning of new nodes
    O(1) dht
    big data

  • write op

  • Staged Event-Driven Architecture

    A general-purpose framework for high concurrency & load conditioning

    Decomposes applications into stages separated by queues

    Adopts a structured approach to event-driven concurrency
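To make "stages separated by queues" concrete, here is a toy two-stage pipeline. It is a sketch under simplifying assumptions: one thread per stage, made-up stage names (parse, apply) and a sentinel event for shutdown. It is not how Cassandra wires its own stages.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

// Hypothetical illustration of SEDA-style staging; not Cassandra code.
class StagedPipeline {
    // Sentinel event, compared by identity, that tells a stage to shut down.
    private static final String STOP = new String("__stop__");

    // A stage is a thread that drains its inbound queue, applies a handler,
    // and hands the result to the next stage's queue.
    static Thread stage(String name, BlockingQueue<String> in, BlockingQueue<String> out,
                        Function<String, String> handler) {
        Thread t = new Thread(() -> {
            try {
                while (true) {
                    String event = in.take();
                    if (event == STOP) {
                        out.put(STOP); // propagate shutdown downstream
                        return;
                    }
                    out.put(handler.apply(event));
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, name);
        t.start();
        return t;
    }

    static List<String> run(List<String> events) {
        // Bounded queues give crude load conditioning: a full queue blocks the producer.
        BlockingQueue<String> toParse = new LinkedBlockingQueue<>(16);
        BlockingQueue<String> toApply = new LinkedBlockingQueue<>(16);
        BlockingQueue<String> done = new LinkedBlockingQueue<>(); // unbounded sink
        Thread parse = stage("parse", toParse, toApply, String::trim);
        Thread apply = stage("apply", toApply, done, s -> s.toUpperCase());
        try {
            for (String e : events) {
                toParse.put(e);
            }
            toParse.put(STOP);
            parse.join();
            apply.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("pipeline interrupted", e);
        }
        List<String> results = new ArrayList<>();
        for (String s : done) {
            if (s != STOP) {
                results.add(s);
            }
        }
        return results;
    }
}
```

Because each stage here has a single thread, events leave the pipeline in the order they entered; a real SEDA stage would typically have its own thread pool and queue monitoring.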

  • instrumentation

  • data replication

  • partitioner smack-down

    Random
      system will use MD5(key) to distribute data across nodes
      even distribution of keys from one CF across ranges/nodes

    Order Preserving
      key distribution determined by token
      lexicographical ordering
      required for range queries (scan over rows like cursor in index)
      can specify the token for this node to use
      "scrabble" distribution
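A rough sketch of the random-partitioner idea: hash the row key with MD5, treat the digest as a token, and give the row to the first node whose token is at or after it on the ring. The TokenRing class and node names are hypothetical, and details such as how Cassandra normalizes the hash are simplified away.

```java
import java.math.BigInteger;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical illustration of MD5-based key placement; not Cassandra's implementation.
class TokenRing {
    // token -> node name, sorted so the ring can be walked in token order
    private final TreeMap<BigInteger, String> ring = new TreeMap<>();

    // MD5(key) interpreted as a non-negative 128-bit integer.
    static BigInteger md5Token(String key) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            return new BigInteger(1, md.digest(key.getBytes(StandardCharsets.UTF_8)));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("MD5 not available", e);
        }
    }

    void addNode(String name, BigInteger token) {
        ring.put(token, name);
    }

    // Owner is the first node whose token is >= the key's token, wrapping
    // around to the lowest token when the key hashes past the last node.
    String nodeFor(String key) {
        if (ring.isEmpty()) {
            throw new IllegalStateException("no nodes in ring");
        }
        Map.Entry<BigInteger, String> owner = ring.ceilingEntry(md5Token(key));
        return owner != null ? owner.getValue() : ring.firstEntry().getValue();
    }

    public static void main(String[] args) {
        TokenRing ring = new TokenRing();
        BigInteger step = BigInteger.ONE.shiftLeft(128).divide(BigInteger.valueOf(4));
        for (int i = 0; i < 4; i++) {
            ring.addNode("node" + i, step.multiply(BigInteger.valueOf(i)));
        }
        for (String key : new String[] {"eben", "alison", "123", "456"}) {
            System.out.println(key + " -> " + ring.nodeFor(key));
        }
    }
}
```

Since MD5 output is effectively uniform, adjacent keys such as "123" and "124" usually land on unrelated nodes, which is why range scans need the order-preserving partitioner instead.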

  • agenda: context, features, data model, api

  • structure

  • keyspace

    ~= database

    typically one per application

    some settings are configurable only per keyspace

  • column family

    group records of similar kind

    not same kind, because CFs are sparse tables

    ex: User, Address, Tweet, PointOfInterest, HotelRoom

  • think of cassandra as row-oriented

    each row is uniquely identifiable by key

    rows group columns and super columns

  • column family (diagram)

    rows: key123, key456

    columns shown: user=eben, n=42, nickname=The Situation, user=alison, icon=

  • json-like notation

    User {
      123 : { email: [email protected], icon: },
      456 : { email: [email protected], location: The Danger Zone }
    }
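One way to picture a column family in code is as a map of row keys to sorted maps of columns, where each row carries only the columns it actually has. The sketch below assumes String keys and values for readability (Cassandra stores bytes); the class name and the email values are made up.

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory picture of a sparse column family; not a Cassandra class.
class ColumnFamilySketch {
    // row key -> (column name -> value); columns within a row stay sorted by name
    private final SortedMap<String, SortedMap<String, String>> rows = new TreeMap<>();

    void set(String rowKey, String columnName, String value) {
        SortedMap<String, String> row = rows.get(rowKey);
        if (row == null) {
            row = new TreeMap<>();
            rows.put(rowKey, row);
        }
        row.put(columnName, value);
    }

    // Returns null when the row or the column does not exist (rows are sparse).
    String get(String rowKey, String columnName) {
        SortedMap<String, String> row = rows.get(rowKey);
        return row == null ? null : row.get(columnName);
    }

    public static void main(String[] args) {
        ColumnFamilySketch user = new ColumnFamilySketch();
        user.set("123", "email", "eben@example.com");     // placeholder address
        user.set("123", "icon", "");
        user.set("456", "email", "alison@example.com");   // placeholder address
        user.set("456", "location", "The Danger Zone");
        System.out.println(user.rows);
        System.out.println("123.location = " + user.get("123", "location"));
    }
}
```

Row 123 has no location column and row 456 has no icon column; nothing is stored for the missing cells.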

  • 0.6 example

    $ cassandra -f
    $ bin/cassandra-cli
    cassandra> connect localhost/9160
    cassandra> set Keyspace1.Standard1['eben']['age']='29'
    cassandra> set Keyspace1.Standard1['eben']['email']='[email protected]'
    cassandra> get Keyspace1.Standard1['eben']['age']
    => (column=6e616d65, value=39, timestamp=1282170655390000)

  • a column has 3 parts

    name: byte[]
      determines sort order
      used in queries
      indexed

    value: byte[]
      you don't query on column values

    timestamp: long (clock)
      last write wins conflict resolution
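"Last write wins" can be sketched as a comparison of timestamps between two versions of the same column. The ColumnSketch class is hypothetical, and the tie rule (keep the first argument) is a simplification chosen for the example.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical three-part column (name, value, timestamp); not the Thrift Column class.
class ColumnSketch {
    final byte[] name;
    final byte[] value;
    final long timestamp; // supplied by the client clock

    ColumnSketch(String name, String value, long timestamp) {
        this.name = name.getBytes(StandardCharsets.UTF_8);
        this.value = value.getBytes(StandardCharsets.UTF_8);
        this.timestamp = timestamp;
    }

    String valueAsString() {
        return new String(value, StandardCharsets.UTF_8);
    }

    // Conflict resolution: the version with the higher timestamp survives.
    // On equal timestamps this sketch simply keeps the first argument.
    static ColumnSketch resolve(ColumnSketch a, ColumnSketch b) {
        return b.timestamp > a.timestamp ? b : a;
    }

    public static void main(String[] args) {
        ColumnSketch older = new ColumnSketch("age", "29", 1000L);
        ColumnSketch newer = new ColumnSketch("age", "30", 2000L);
        System.out.println("winner: " + resolve(older, newer).valueAsString());
    }
}
```

The order in which replicas receive the two versions does not matter; only the timestamps do, which is why client clocks need to be reasonably in sync.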

  • column comparators: byte, utf8, long, timeuuid, lexicaluuid

    ex: lat/long

  • super column

    super columns group columns under a common name

  • PointOfInterest: super column family (diagram)

    row 10017
      Central Park: desc=Fun to walk in.
      Empire State Bldg: phone=212. 555.11212, desc=Great view from 102nd floor!

    row 85255
      Phoenix Zoo

  • PointOfInterest {
      key: 85255 {
        Phoenix Zoo { phone: 480-555-5555, desc: They have animals here. },
        Spring Training { phone: 623-333-3333, desc: Fun for baseball fans. },
      }, //end phx

      key: 10019 {
        Central Park { desc: Walk around. It's pretty. },
        Empire State Building { phone: 212-777-7777, desc: Great view from 102nd floor. }
      } //end nyc
    }

    callouts: key, column, super column, super column family, flexible schema

  • about super column families

    sub-column names in a SCF are not indexed

    top level columns (SCF Name) are always indexed

    often used for denormalizing data from standard CFs

  • agenda: context, features, data model, api

  • slice predicate

    data structure describing columns to return

    SliceRange
      start column name
      finish column name (can be empty to stop on count)
      reverse
      count (like LIMIT)
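The four SliceRange fields map naturally onto operations over a sorted map of columns. The sketch below models that with a NavigableMap; the SliceSketch class is hypothetical, column names are Strings for readability, and an empty string stands for "no bound".

```java
import java.util.ArrayList;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical model of a slice range over one row's sorted columns; not the Thrift API.
class SliceSketch {
    // Returns up to `count` column names between start and finish (inclusive),
    // walking the row backwards when `reversed` is true. Bounds must be given
    // in iteration order (start >= finish when reversed), otherwise the
    // underlying map views throw IllegalArgumentException.
    static List<String> slice(NavigableMap<String, String> row, String start, String finish,
                              boolean reversed, int count) {
        NavigableMap<String, String> view = reversed ? row.descendingMap() : row;
        if (!start.isEmpty()) {
            view = view.tailMap(start, true);
        }
        if (!finish.isEmpty()) {
            view = view.headMap(finish, true);
        }
        List<String> names = new ArrayList<>();
        for (String name : view.keySet()) {
            if (names.size() >= count) {
                break; // count works like LIMIT
            }
            names.add(name);
        }
        return names;
    }

    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        for (String name : new String[] {"a", "b", "c", "d", "e"}) {
            row.put(name, "");
        }
        System.out.println(slice(row, "b", "d", false, 10));
        System.out.println(slice(row, "", "", true, 2));
    }
}
```

Leaving finish empty and setting a small count gives "the first N columns from start", which is the case the slide calls out.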

  • read api

    get() : Column
    get the Col or SC at given ColPath
    COSC cosc = client.get(key, path, CL);

    get_slice() : List
    get Cols in one row, specified by SlicePredicate:
    List results = client.get_slice(key, parent, predicate, CL);

    multiget_slice() : Map
    get slices for list of keys, based on SlicePredicate
    Map results = client.multiget_slice(rowKeys, parent, predicate, CL);

    get_range_slices() : List
    returns multiple Cols according to a range
    range is startkey, endkey, starttoken, endtoken:
    List slices = client.get_range_slices(parent, predicate, keyRange, CL);

  • write api

    client.insert(userKeyBytes, parent, new Column("band".getBytes(UTF8), "Funkadelic".getBytes(), clock), CL);

    batch_mutate
    void batch_mutate(map, CL)

    remove
    void remove(byte[], ColumnPath column_path, Clock, CL)

  • batch_mutate

    //create param
    Map mutationMap = new HashMap();

    //create Cols for Muts
    Column nameCol = new Column("name".getBytes(UTF8), "Funkadelic".getBytes("UTF-8"), new Clock(System.nanoTime()));
    //wrap the Col in a COSC (assumed step; not shown on the slide)
    ColumnOrSuperColumn nameCosc = new ColumnOrSuperColumn();
    nameCosc.column = nameCol;
    Mutation nameMut = new Mutation();
    nameMut.column_or_supercolumn = nameCosc; //also phone, etc

    Map muts = new HashMap();
    List cols = new ArrayList();
    cols.add(nameMut);
    cols.add(phoneMut);
    muts.put(CF, cols);
    //outer map key is a row key; inner map key is the CF name
    mutationMap.put(rowKey.getBytes(), muts);
    //send to server
    client.batch_mutate(mutationMap, CL);

  • raw thrift: for masochists only

    pycassa (python), fauna (ruby), hector (java), pelops (java), kundera (JPA), hectorSharp (C#)

  • what about SELECT WHERE, ORDER BY, JOIN ON, GROUP?

  • rdbms: domain-based model what answers do I have?

    cassandra: query-based model what questions do I have?

  • SELECT WHERE

    cassandra is an index factory

    USER
    Key: UserID
    Cols: username, email, birth date, city, state

    How to support this query?

    SELECT * FROM User WHERE city = 'Scottsdale'

    Create a new CF called UserCity:

    USERCITY
    Key: city
    Cols: IDs of the users in that city.

    Also uses the Valueless Column pattern
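The UserCity idea amounts to the application maintaining its own index on every write. Here is a small in-memory sketch of that pattern; the class and method names are made up, and plain maps stand in for the two column families.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical illustration of a hand-maintained index CF; not Cassandra client code.
class UserCityIndexSketch {
    // USER: user id -> columns
    private final Map<String, Map<String, String>> user = new HashMap<>();
    // USERCITY: city -> (user id -> empty value). The column name carries the
    // data and the value stays empty: the "valueless column" pattern.
    private final Map<String, SortedMap<String, String>> userCity = new HashMap<>();

    void saveUser(String userId, String username, String city, String state) {
        // If the user moved, drop the stale index entry first.
        Map<String, String> previous = user.get(userId);
        if (previous != null) {
            SortedMap<String, String> oldIndexRow = userCity.get(previous.get("city"));
            if (oldIndexRow != null) {
                oldIndexRow.remove(userId);
            }
        }
        Map<String, String> columns = new HashMap<>();
        columns.put("username", username);
        columns.put("city", city);
        columns.put("state", state);
        user.put(userId, columns);

        SortedMap<String, String> indexRow = userCity.get(city);
        if (indexRow == null) {
            indexRow = new TreeMap<>();
            userCity.put(city, indexRow);
        }
        indexRow.put(userId, "");
    }

    // Equivalent of SELECT * FROM User WHERE city = ?: one row read on USERCITY.
    List<String> userIdsInCity(String city) {
        SortedMap<String, String> indexRow = userCity.get(city);
        return indexRow == null ? new ArrayList<String>() : new ArrayList<String>(indexRow.keySet());
    }
}
```

The cost is an extra write per user and the responsibility to keep both column families in step; the payoff is that the city lookup becomes a single-row read.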

  • SELECT WHERE pt 2

    Use an aggregate key state:city: { user1, user2 }

    Get rows between AZ: & AZ; for all Arizona users

    Get rows between AZ:Scottsdale & AZ:Scottsdale1 for all Scottsdale users
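The range trick works because ';' sorts immediately after ':' in ASCII, so every key that starts with "AZ:" falls between "AZ:" and "AZ;". A sketch using a sorted map as a stand-in for rows stored under an order-preserving partitioner (class name and sample rows are made up):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical illustration of range scans over aggregate "state:city" keys.
class AggregateKeySketch {
    // Returns the row keys between startKey and endKey, both inclusive,
    // in lexicographic order.
    static List<String> rowKeysBetween(NavigableMap<String, List<String>> rows,
                                       String startKey, String endKey) {
        return new ArrayList<>(rows.subMap(startKey, true, endKey, true).keySet());
    }

    static NavigableMap<String, List<String>> sampleRows() {
        NavigableMap<String, List<String>> rows = new TreeMap<>();
        rows.put("AK:Juneau", Arrays.asList("user9"));
        rows.put("AZ:Phoenix", Arrays.asList("user3"));
        rows.put("AZ:Scottsdale", Arrays.asList("user1", "user2"));
        rows.put("CA:Fresno", Arrays.asList("user7"));
        return rows;
    }

    public static void main(String[] args) {
        NavigableMap<String, List<String>> rows = sampleRows();
        System.out.println("Arizona: " + rowKeysBetween(rows, "AZ:", "AZ;"));
        System.out.println("Scottsdale: " + rowKeysBetween(rows, "AZ:Scottsdale", "AZ:Scottsdale1"));
    }
}
```

This only works when rows are stored in key order, so it depends on the order-preserving partitioner rather than the random one.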

  • ORDER BY

    Rows are placed according to their Partitioner:

    Random: MD5 of key
    Order-Preserving: actual key

    Rows are sorted by key, regardless of partitioner

    Columns are sorted according to CompareWith or CompareSubcolumnsWith
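Since column order is fixed by the comparator chosen for the column family, "ORDER BY" is effectively decided at design time. The sketch below shows the same three column names sorting differently under a text comparator and a numeric one; the two comparator constants are stand-ins for utf8 and long ordering, not Cassandra classes.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeMap;

// Hypothetical illustration of comparator-driven column ordering.
class ComparatorSketch {
    // Text ordering (Java String order; matches byte order for ASCII names).
    static final Comparator<String> UTF8 = Comparator.naturalOrder();
    // Numeric ordering: column names are parsed as longs before comparing.
    static final Comparator<String> LONG = Comparator.comparingLong((String s) -> Long.parseLong(s));

    // Inserts the names into a row sorted by the given comparator and
    // returns them in stored order.
    static List<String> sortedColumnNames(Comparator<String> compareWith, String... names) {
        TreeMap<String, String> row = new TreeMap<>(compareWith);
        for (String name : names) {
            row.put(name, "");
        }
        return new ArrayList<>(row.keySet());
    }

    public static void main(String[] args) {
        System.out.println("utf8: " + sortedColumnNames(UTF8, "9", "10", "100"));
        System.out.println("long: " + sortedColumnNames(LONG, "9", "10", "100"));
    }
}
```

Under text ordering "10" and "100" come before "9"; under numeric ordering the names come back as 9, 10, 100, which is usually what a slice over numeric column names expects.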

  • is cassandra a good fit?

    you need really fast writes
    you need durability
    you have lots of data: > GBs, >= three servers
    your app is evolving: startup mode, fluid data structure
    loose domain data ("points of interest")

    your programmers can deal: documentation, complexity, consistency model, change, visibility tools
    your operations can deal: hardware considerations, can move data, JMX monitoring

  • thank you!

    @ebenhewitt
