C* Summit EU 2013: Cassandra Internals
CASSANDRA EU 2013
CASSANDRA INTERNALS
Aaron Morton @aaronmorton
Co-Founder & Principal Consultant www.thelastpickle.com
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License #CassandraEU
About The Last Pickle.
Work with clients to deliver and improve Apache Cassandra based solutions.
Apache Cassandra Committer, DataStax MVP, Hector Maintainer, Apache Usergrid Committer.
Based in New Zealand & Austin, TX.
#CassandraEU www.thelastpickle.com

Architecture Code
Cassandra Architecture.
[Diagram: Clients, and a node layered as APIs, Cluster Aware, Cluster Unaware, on top of Disk.]

Cassandra Cluster Architecture.
[Diagram: Clients, plus Node 1 and Node 2, each layered as APIs, Cluster Aware, Cluster Unaware, on top of Disk.]

Dynamo Cluster Architecture.
[Diagram: Clients, plus Node 1 and Node 2, each layered as APIs, Dynamo, Database, on top of Disk.]

Architecture API
Dynamo Database
API Transports.
Thrift
Native Binary

Thrift Transport.
// Custom TServer implementations
o.a.c.thrift.CustomTThreadPoolServer
o.a.c.thrift.CustomTHsHaServer

API Transports.
Thrift
Native Binary

Native Binary Transport.
Beta in Cassandra 1.2, now GA. Uses Netty. CQL 3 only.

o.a.c.transport.Server.run()
// Set up the Netty server
new ExecutionHandler()
new NioServerSocketChannelFactory()
ServerBootstrap.setPipelineFactory()

o.a.c.transport.Message.Dispatcher.messageReceived()
// Process message from client
ServerConnection.validateNewMessage()
Request.execute()
ServerConnection.applyStateTransition()
Channel.write()

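The four calls in messageReceived() can be read as a small per-connection state machine: validate the message against the connection state, execute it, advance the state, then write the response. The following is a minimal sketch of that flow in plain Java. All types here (MsgType, ConnState, SketchConnection, SketchDispatcher) are hypothetical stand-ins, not Cassandra's classes, and a list stands in for Channel.write().

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-ins; not Cassandra's real classes.
enum MsgType { STARTUP, QUERY }

enum ConnState { UNINITIALIZED, READY }

class SketchConnection {
    ConnState state = ConnState.UNINITIALIZED;

    // Reject messages that are not legal in the current connection state.
    void validateNewMessage(MsgType type) {
        if (state == ConnState.UNINITIALIZED && type != MsgType.STARTUP)
            throw new IllegalStateException("Unexpected " + type + ", expecting STARTUP");
        if (state == ConnState.READY && type == MsgType.STARTUP)
            throw new IllegalStateException("Unexpected STARTUP, connection already started");
    }

    // Advance the connection state once the request has been executed.
    void applyStateTransition(MsgType type) {
        if (type == MsgType.STARTUP)
            state = ConnState.READY;
    }
}

class SketchDispatcher {
    final SketchConnection connection = new SketchConnection();
    final List<String> written = new ArrayList<>(); // stands in for Channel.write()

    void messageReceived(MsgType type, String body) {
        connection.validateNewMessage(type);
        String response = execute(type, body);
        connection.applyStateTransition(type);
        written.add(response);
    }

    // Assumed behaviour for the sketch: echo a fake result.
    private String execute(MsgType type, String body) {
        return type == MsgType.STARTUP ? "READY" : "RESULT for " + body;
    }
}

class DispatcherDemo {
    public static void main(String[] args) {
        SketchDispatcher dispatcher = new SketchDispatcher();
        dispatcher.messageReceived(MsgType.STARTUP, "");
        dispatcher.messageReceived(MsgType.QUERY, "SELECT * FROM t");
        System.out.println(dispatcher.written);
    }
}
```

Validating before executing means a client that skips the startup handshake is rejected without any query work being done.
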
Messages.
Defined in the Native Binary Protocol
$SRC/doc/native_protocol.spec

API Services.
JMX
Thrift
CQL 3

JMX Management Beans.
Spread around the code base.
Interfaces named *MBean

JMX Management Beans.
Registered with names such as org.apache.cassandra.db:type=StorageProxy

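To make the *MBean naming convention and the domain:type=Name registration concrete, here is a minimal sketch using only the standard javax.management API. The interface, the implementing class, the attribute and the object name org.example.sketch:type=SketchProxy are all hypothetical; none of them are Cassandra's own beans.

```java
import java.lang.management.ManagementFactory;
import javax.management.MBeanServer;
import javax.management.ObjectName;

class JmxSketch {
    // Hypothetical management interface. Standard MBean rules: the interface
    // is public and is named after the implementing class plus "MBean".
    public interface SketchProxyMBean {
        long getReadOperations();
    }

    public static class SketchProxy implements SketchProxyMBean {
        @Override
        public long getReadOperations() {
            return 42L; // fixed value, for illustration only
        }
    }

    // Registers the bean (once) and reads its attribute back through JMX.
    static Object registerAndRead() {
        try {
            MBeanServer mbs = ManagementFactory.getPlatformMBeanServer();
            ObjectName name = new ObjectName("org.example.sketch:type=SketchProxy");
            if (!mbs.isRegistered(name))
                mbs.registerMBean(new SketchProxy(), name);
            // The getter getReadOperations() is exposed as attribute "ReadOperations".
            return mbs.getAttribute(name, "ReadOperations");
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(registerAndRead());
    }
}
```

A JMX client such as jconsole would show this bean under the org.example.sketch domain, in the same way the slide's example name appears under org.apache.cassandra.db.
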
API Services.
JMX
Thrift
CQL 3

o.a.c.thrift.CassandraServer
// Implements Thrift Interface
// Access control
// Input validation
// Mapping to/from Thrift and internal types

Thrift Interface.
Thrift IDL $SRC/interface/cassandra.thrift

o.a.c.thrift.CassandraServer.get_slice()
// Get columns for one row
Tracing.begin()
ClientState cState = state()
cState.hasColumnFamilyAccess()
multigetSliceInternal()

CassandraServer.multigetSliceInternal()
// Get columns for many rows
ThriftValidation.validate*()
// Create ReadCommands
getSlice()

CassandraServer.getSlice()
// Process ReadCommands
// Return Thrift types
readColumnFamily()
thriftifyColumnFamily()

CassandraServer.readColumnFamily()
// Process ReadCommands
// Return ColumnFamilies
StorageProxy.read()

API Services.
JMX
Thrift
CQL 3

o.a.c.cql3.QueryProcessor
// Prepares and executes CQL3 statements
// Used by Thrift & Native transports
// Access control
// Input validation
// Returns transport.ResultMessage

CQL3 Grammar.
ANTLR Grammar $SRC/o.a.c.cql3/Cql.g

o.a.c.cql3.statements.ParsedStatement
// Subclasses generated by ANTLR
// Tracks bound term count
// Prepare CQLStatement
prepare()

o.a.c.cql3.statements.CQLStatement
checkAccess(ClientState state)
validate(ClientState state)
execute(ConsistencyLevel cl, QueryState state, List<ByteBuffer> variables)

statements.SelectStatement.RawStatement
// Implements ParsedStatement
// Input validation
prepare()

statements.SelectStatement.execute()
// Create ReadCommands
StorageProxy.read()

Architecture API
Dynamo Database
Dynamo Layer.
o.a.c.service
o.a.c.net
o.a.c.dht
o.a.c.gms
o.a.c.locator
o.a.c.stream

o.a.c.service.StorageProxy
// Cluster wide storage operations
// Select endpoints & check CL available
// Send messages to Stages
// Wait for response
// Store Hints

o.a.c.service.StorageService
// Ring operations
// Track ring state
// Start & stop ring membership
// Node & token queries

o.a.c.service.IResponseResolver
preprocess(MessageIn<T> message)
resolve() throws DigestMismatchException
RowDigestResolver
RowDataResolver
RangeSliceResponseResolver

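The resolve() contract, which returns a result or throws when replicas disagree, can be illustrated with a digest comparison. This is a sketch under stated assumptions, not the RowDigestResolver implementation: rows are plain strings, MD5 is assumed as the digest function, and the exception is a hypothetical unchecked stand-in for DigestMismatchException (which is checked in the signature shown on the slide).

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Arrays;
import java.util.List;

// Hypothetical stand-in; unchecked here only to keep the sketch short.
class SketchDigestMismatchException extends RuntimeException {
    SketchDigestMismatchException(String message) {
        super(message);
    }
}

class SketchDigestResolver {
    // Digest of a row, as a replica answering a digest request might compute it.
    static byte[] digest(String row) {
        try {
            return MessageDigest.getInstance("MD5").digest(row.getBytes(StandardCharsets.UTF_8));
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException(e);
        }
    }

    // Returns the data response if every digest response agrees with it.
    static String resolve(String dataResponse, List<byte[]> digestResponses) {
        byte[] expected = digest(dataResponse);
        for (byte[] received : digestResponses) {
            if (!Arrays.equals(expected, received))
                throw new SketchDigestMismatchException("replica digests disagree");
        }
        return dataResponse;
    }
}

class DigestResolverDemo {
    public static void main(String[] args) {
        List<byte[]> digests = Arrays.asList(SketchDigestResolver.digest("row-v2"));
        System.out.println(SketchDigestResolver.resolve("row-v2", digests));
    }
}
```

In the flow on the later fetchRows() slide, a mismatch is caught by the caller, which is why resolve() signals it with an exception rather than a return value.
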
Response Handlers / Callback.
implements IAsyncCallback<T>
response(MessageIn<T> msg)

o.a.c.service.ReadCallback.get()
// Wait for blockfor & data response
condition.await(timeout, TimeUnit.MILLISECONDS)
throw ReadTimeoutException()
resolver.resolve()

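The await-then-resolve pattern in get() can be sketched with a java.util.concurrent.CountDownLatch. This is an illustration only: the class and exception names are hypothetical, a latch is used in place of whatever condition object the real class holds, responses are plain strings, and "resolving" is reduced to returning the first response.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.TimeUnit;

// Hypothetical unchecked stand-in for ReadTimeoutException.
class SketchReadTimeoutException extends RuntimeException {
    SketchReadTimeoutException(String message) {
        super(message);
    }
}

class SketchReadCallback {
    private final CountDownLatch condition;
    private final List<String> responses = new ArrayList<>();

    // blockFor: how many replica responses the consistency level requires.
    SketchReadCallback(int blockFor) {
        this.condition = new CountDownLatch(blockFor);
    }

    // Called once per replica response, typically from another thread.
    synchronized void response(String message) {
        responses.add(message);
        condition.countDown();
    }

    // Blocks until blockFor responses have arrived or the timeout expires.
    String get(long timeoutMillis) {
        boolean signalled;
        try {
            signalled = condition.await(timeoutMillis, TimeUnit.MILLISECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        if (!signalled)
            throw new SketchReadTimeoutException("timed out waiting for replicas");
        synchronized (this) {
            return responses.get(0); // stands in for resolver.resolve()
        }
    }
}

class ReadCallbackDemo {
    public static void main(String[] args) {
        SketchReadCallback callback = new SketchReadCallback(2);
        new Thread(() -> callback.response("row")).start();
        new Thread(() -> callback.response("row")).start();
        System.out.println(callback.get(1000));
    }
}
```
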
o.a.c.service.StorageProxy.fetchRows()
getLiveSortedEndpoints()
new RowDigestResolver()
new ReadCallback()
MessagingService.sendRR()
---------------------------------------
ReadCallback.get() // blocking
catch (DigestMismatchException ex)
catch (ReadTimeoutException ex)

Dynamo Layer.
o.a.c.service
o.a.c.net
o.a.c.dht
o.a.c.gms
o.a.c.locator
o.a.c.stream

o.a.c.net.MessagingService.Verb <<enum>>
MUTATION
READ
REQUEST_RESPONSE
TREE_REQUEST
TREE_RESPONSE
(And more...)

o.a.c.net.MessagingService.verbHandlers
new EnumMap<Verb, IVerbHandler>(Verb.class)

o.a.c.net.IVerbHandler<T>
doVerb(MessageIn<T> message, String id);

o.a.c.net.MessagingService.verbStages
new EnumMap<MessagingService.Verb, Stage>(MessagingService.Verb.class)

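The two EnumMaps above give each verb a handler and a stage. A minimal sketch of that lookup structure follows. The enums, the handler interface and the service class are hypothetical and cut down to three verbs; unlike the doVerb() signature on the slide, the sketch handler takes strings and returns a value so that the result is visible.

```java
import java.util.EnumMap;
import java.util.Map;

// Hypothetical, reduced versions of the verb and stage enums.
enum SketchVerb { MUTATION, READ, REQUEST_RESPONSE }

enum SketchStage { MUTATION, READ, REQUEST_RESPONSE }

interface SketchVerbHandler {
    String doVerb(String message, String id);
}

class SketchMessagingService {
    private final Map<SketchVerb, SketchVerbHandler> verbHandlers = new EnumMap<>(SketchVerb.class);
    private final Map<SketchVerb, SketchStage> verbStages = new EnumMap<>(SketchVerb.class);

    void register(SketchVerb verb, SketchStage stage, SketchVerbHandler handler) {
        verbStages.put(verb, stage);
        verbHandlers.put(verb, handler);
    }

    // Which stage (thread pool) a message with this verb would run on.
    SketchStage stageFor(SketchVerb verb) {
        return verbStages.get(verb);
    }

    // Look up the handler for the verb and let it process the message.
    String deliver(SketchVerb verb, String message, String id) {
        SketchVerbHandler handler = verbHandlers.get(verb);
        if (handler == null)
            throw new IllegalArgumentException("No handler registered for " + verb);
        return handler.doVerb(message, id);
    }
}

class MessagingDemo {
    public static void main(String[] args) {
        SketchMessagingService service = new SketchMessagingService();
        service.register(SketchVerb.READ, SketchStage.READ, (msg, id) -> "read " + msg + " #" + id);
        System.out.println(service.stageFor(SketchVerb.READ));
        System.out.println(service.deliver(SketchVerb.READ, "key1", "7"));
    }
}
```

An EnumMap is a natural fit here because the key set is a fixed enum, so lookups are array-indexed rather than hashed.
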
o.a.c.net.MessagingService.receive()
runnable = new MessageDeliveryTask(message, id, timestamp);
stage = StageManager.getStage(message.getMessageType());
stage.execute(runnable);

o.a.c.net.MessageDeliveryTask.run()
// If droppable and older than rpc_timeout
MessagingService.incrementDroppedMessages(verb);
return;
MessagingService.getVerbHandler(verb)
verbHandler.doVerb(message, id)

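The drop check in run() amounts to: if the verb is droppable and the message has waited longer than the RPC timeout, count it as dropped and skip the handler. A minimal sketch of that decision follows. The class, the verb enum and the choice of which verbs are droppable are assumptions for illustration, and the current time is passed in so the behaviour is deterministic.

```java
import java.util.EnumSet;
import java.util.Set;

// Hypothetical, reduced verb enum.
enum TaskVerb { READ, MUTATION, GOSSIP_DIGEST_SYN }

class SketchDeliveryTask implements Runnable {
    // Assumption for the sketch: reads and mutations may be dropped, gossip may not.
    static final Set<TaskVerb> DROPPABLE_VERBS = EnumSet.of(TaskVerb.READ, TaskVerb.MUTATION);

    private final TaskVerb verb;
    private final long createdAtMillis;
    private final long nowMillis; // injected instead of reading the system clock
    private final long rpcTimeoutMillis;
    String outcome = "pending";

    SketchDeliveryTask(TaskVerb verb, long createdAtMillis, long nowMillis, long rpcTimeoutMillis) {
        this.verb = verb;
        this.createdAtMillis = createdAtMillis;
        this.nowMillis = nowMillis;
        this.rpcTimeoutMillis = rpcTimeoutMillis;
    }

    @Override
    public void run() {
        // The coordinator has already given up on a timed-out request,
        // so doing the work would be wasted effort.
        if (DROPPABLE_VERBS.contains(verb) && nowMillis > createdAtMillis + rpcTimeoutMillis) {
            outcome = "dropped";
            return;
        }
        outcome = "handled"; // stands in for verbHandler.doVerb(message, id)
    }
}

class DeliveryTaskDemo {
    public static void main(String[] args) {
        SketchDeliveryTask late = new SketchDeliveryTask(TaskVerb.READ, 0L, 15_000L, 10_000L);
        late.run();
        System.out.println(late.outcome);
    }
}
```
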
Architecture API Layer
Dynamo Layer Database Layer
Database Layer.
o.a.c.concurrent
o.a.c.db
o.a.c.cache
o.a.c.io
o.a.c.trace

o.a.c.concurrent.StageManager
stages = new EnumMap<Stage, ThreadPoolExecutor>(Stage.class);
getStage(Stage stage)

o.a.c.concurrent.Stage
READ
MUTATION
GOSSIP
REQUEST_RESPONSE
ANTI_ENTROPY
(And more...)

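The StageManager and Stage slides describe one thread pool per stage, looked up from an EnumMap. A minimal sketch of that arrangement follows. The enum is reduced to two stages, and the class name, pool sizes, daemon thread factory and the call() helper are hypothetical choices for the example.

```java
import java.util.EnumMap;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical, reduced stage enum.
enum PoolStage { READ, MUTATION }

class SketchStageManager {
    private final Map<PoolStage, ThreadPoolExecutor> stages = new EnumMap<>(PoolStage.class);

    SketchStageManager(int threadsPerStage) {
        for (PoolStage stage : PoolStage.values()) {
            stages.put(stage, new ThreadPoolExecutor(
                    threadsPerStage, threadsPerStage, 60, TimeUnit.SECONDS,
                    new LinkedBlockingQueue<Runnable>(),
                    runnable -> {
                        // Daemon threads so the sketch never keeps the JVM alive.
                        Thread thread = new Thread(runnable, "sketch-" + stage);
                        thread.setDaemon(true);
                        return thread;
                    }));
        }
    }

    ThreadPoolExecutor getStage(PoolStage stage) {
        return stages.get(stage);
    }

    // Convenience helper for the example: run a task on a stage and wait for it.
    <T> T call(PoolStage stage, Callable<T> task) {
        try {
            return getStage(stage).submit(task).get(5, TimeUnit.SECONDS);
        } catch (Exception e) {
            throw new IllegalStateException(e);
        }
    }

    void shutdown() {
        for (ThreadPoolExecutor executor : stages.values())
            executor.shutdown();
    }
}

class StageManagerDemo {
    public static void main(String[] args) {
        SketchStageManager manager = new SketchStageManager(2);
        String threadName = manager.call(PoolStage.READ, () -> Thread.currentThread().getName());
        System.out.println("ran on " + threadName);
        manager.shutdown();
    }
}
```

Separate pools per stage mean a backlog of one kind of work, for example mutations, cannot starve the threads that serve another, for example reads.
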
Database Layer.
o.a.c.concurrent
o.a.c.db
o.a.c.cache
o.a.c.io
o.a.c.trace

o.a.c.db.Table
// Keyspace
open(String table)
getColumnFamilyStore(String cfName)
getRow(QueryFilter filter)
apply(RowMutation mutation, boolean writeCommitLog)

o.a.c.db.ColumnFamilyStore
// Column Family
getColumnFamily(QueryFilter filter)
getTopLevelColumns(...)
apply(DecoratedKey key, ColumnFamily columnFamily, SecondaryIndexManager.Updater indexer)

o.a.c.db.IColumnContainer
addColumn(IColumn column)
remove(ByteBuffer columnName)
ColumnFamily
SuperColumn
(Removed in 2.0)

o.a.c.db.ISortedColumns
addColumn(IColumn column, Allocator allocator)
removeColumn(ByteBuffer name)
ArrayBackedSortedColumns
AtomicSortedColumns
TreeMapBackedSortedColumns

o.a.c.db.Memtable
put(DecoratedKey key, ColumnFamily columnFamily, SecondaryIndexManager.Updater indexer)
flushAndSignal(CountDownLatch latch, Future<ReplayPosition> context)

o.a.c.db.ReadCommand
getRow(Table table)
SliceByNamesReadCommand
SliceFromReadCommand
RangeSliceCommand
(Additional classes for paging in 2.0)

o.a.c.db.IDiskAtomFilter
getMemtableColumnIterator(...)
getSSTableColumnIterator(...)
IdentityQueryFilter
NamesQueryFilter
SliceQueryFilter

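The difference between a names filter and a slice filter can be shown over a sorted map of columns. This is a sketch of the two selection styles only, not of the IDiskAtomFilter API: the interface and classes are hypothetical, columns are a NavigableMap of strings, and the slice is assumed to take an inclusive start, an inclusive finish and a maximum count.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical filter interface over one row's sorted columns.
interface SketchColumnFilter {
    List<String> select(NavigableMap<String, String> columns);
}

// Selects specific columns by name, skipping names that are absent.
class SketchNamesFilter implements SketchColumnFilter {
    private final List<String> names;

    SketchNamesFilter(String... names) {
        this.names = Arrays.asList(names);
    }

    @Override
    public List<String> select(NavigableMap<String, String> columns) {
        List<String> selected = new ArrayList<>();
        for (String name : names) {
            if (columns.containsKey(name))
                selected.add(name + "=" + columns.get(name));
        }
        return selected;
    }
}

// Selects a contiguous range of columns, up to a maximum count.
class SketchSliceFilter implements SketchColumnFilter {
    private final String start;
    private final String finish;
    private final int count;

    SketchSliceFilter(String start, String finish, int count) {
        this.start = start;
        this.finish = finish;
        this.count = count;
    }

    @Override
    public List<String> select(NavigableMap<String, String> columns) {
        List<String> selected = new ArrayList<>();
        for (Map.Entry<String, String> entry : columns.subMap(start, true, finish, true).entrySet()) {
            if (selected.size() == count)
                break;
            selected.add(entry.getKey() + "=" + entry.getValue());
        }
        return selected;
    }
}

class ColumnFilterDemo {
    public static void main(String[] args) {
        NavigableMap<String, String> row = new TreeMap<>();
        for (String name : new String[] {"a", "b", "c", "d", "e"})
            row.put(name, name.toUpperCase());
        System.out.println(new SketchNamesFilter("a", "e", "z").select(row));
        System.out.println(new SketchSliceFilter("b", "d", 2).select(row));
    }
}
```
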
Summary
CustomTThreadPoolServer
Message.Dispatcher
CassandraServer
QueryProcessor
ReadCommand
StorageProxy
IResponseResolver
IAsyncCallback
MessagingService
IVerbHandler
Table
ColumnFamilyStore
IDiskAtomFilter
[Diagram layers: API, Dynamo, Database]

Thanks.
Aaron Morton @aaronmorton
www.thelastpickle.com
Licensed under a Creative Commons Attribution-NonCommercial 3.0 New Zealand License