My Futuristic Vision of the Future of Cassandra's Future - NGCC 2015


I am and always will be the optimist. The hoper of far-flung hopes and the dreamer of improbable dreams.

The Doctor, Season 6, Episode 6

THESE ARE THE SPEAKER'S NOTES.

https://www.flickr.com/photos/boyce-d/4205175031

Next Generation Cassandra Conference Gary Dusbabek

Gary from SVDS.
Committer since 2009.
Seen a lot of what has transpired.
Call for presentations first went out.
Interesting work I had done around metrics at Rackspace.
Fat Client and other ways of extending Cassandra.

Current Generation (Past)
Next Generation (Futshure)
Realized a few things as I started preparing.
But that's not really next generation, is it?
As I was working on metrics for Rackspace, I thought: it would be nice to change this, and that, and, and, and.
This is not the Current Generation Cassandra Conference.

Some of these ideas and challenges are worth talking about.

https://www.flickr.com/photos/saarblitz/16803524015

A Heretic's Vision of the Future of Cassandra
Gary Dusbabek

Leveraging Cassandra to Build New Distributed Systems
At some points I will wax heretical. I will hold up this blue star: an indication to suspend your disbelief and/or cognitive biases.
We will pause in the middle of this presentation for a reality check and a few deep breathing exercises.

Roadmap
Current Status
The Futshure
How to Get There
Measuring Success
Address Concerns

For the most part, this will be a fun ride. Humor me.

https://www.flickr.com/photos/simone_pittaluga/6877522821

Current State
Regular releases
Many users
Among the top NoSQL databases
Fast
Scalable
Usability is getting better

This is remarkable

https://www.flickr.com/photos/downeym/6063328180

What is our trajectory?
What trajectory are we on?
Not "What new features are we going to add?"
But "What is the next generation of this project?"

https://www.flickr.com/photos/blair25/3240324932

What is our trajectory?
What trajectory are we on?
Not "What new features are we going to add?"
But "What is the next generation of this project?"
It's more of a path.

https://www.flickr.com/photos/blair25/3240324932

Six Months vs Three Years

What is the next generation for this project?
If you had to rate Cassandra as being in the beginning, middle, or end of its lifecycle, where would you put it?
Umm. What is the expected lifecycle?
Dunno.
Look at life cycles and trajectories of other open source projects.

https://www.flickr.com/photos/judy-van-der-velden/6637487865

Other Projects

Look at life cycles and trajectories of other open source projects.
Examine attributes that made them successful.

https://www.flickr.com/photos/judy-van-der-velden/6637487865

Apache Httpd

Version 1 to 2 Migration

Incremental upgrades until version 2.
Then incremental upgrades to 1 and 2.
Point: Devs dug in and made necessary changes for the future in 2.
Mistakes:
But 1.3 continued to be good enough, which ruled over 2's dunno.
2 had some FUD problems (cautionary tale).
Upgrade process was not simple (multiple files, etc.).

Composability

Lucene existed as a library for a long time.
Begat Solr.
Merged with it,
then kind of split from it again.
Elasticsearch has its roots in Lucene.
Point: Focus on how ES composes Lucene.

Postgres
Since 1996

Dozens of service and support companies (see http://www.postgresql.org/support/professional_support)

Postgres (success story), since 1996.
Dozens of companies providing support and services.
Evidence of a healthy ecosystem.
Project is obviously playing a very long game.

Since 2007
Spawned subprojects: HBase, Pig, Avro, Hive, Spark

Hadoop, since 2007.
Roughly the same age as Cassandra.
Has spawned subprojects, etc.: Pig, Hive, HBase, Spark, Avro.
Interesting:
Few years older than Cassandra.
Starting to lose its edge => look at traction around Spark.
Prolonged by YARN.

Since 2009
Ten service and support companies.

Cassandra, since 2009. Our wiki lists 10 support companies.
No subprojects.
Just a really good database.

Ecosystems

Every open source project is different (use snowflake image here).
Is there anything in common?
Summarize: these projects spawned ecosystems.
What enabled them to do this?
Answer: each project did different things.
Question: What is the Cassandra community doing to develop an ecosystem?
My take: not much.
But I think there are good reasons for this.

https://www.flickr.com/photos/chaoticmind75/5529107926

We are just getting started

The good news is that I think we're just getting started.
Let's look at some data (that suggests we are just getting started).

https://www.flickr.com/photos/matthewpaulson/5794901439

WARNING!
Possibly meaningless data.
I draw my own conclusions.
Wrong in the past.

WARNING: Data ahead.
Used to support my arguments.
Could be used to support other arguments.
You do not have to agree with me.

https://www.flickr.com/photos/arenamontanus/3492063978

#cassandra = 285
source: https://twitter.com/postgresql/status/586210482433818624

https://twitter.com/postgresql/status/586210482433818624 (#cassandra was 285)

Job Postings

Cassandra is more popular than Postgres in this metric.
Still kind of small given a different context.

Overall trend = down
Lucene+Solr the winner here
Mongo makes strong showing

Findings?
What conclusions can you draw?
Not scientific. Draw your own.
I think it points to: that we're just getting started... or that we are just our own snowflake.

https://www.flickr.com/photos/tambako/3593686294

Futuristic Vision

Taking all those things into account.
Risks of having futuristic views.

Our predictions are always sterile,
because they tend towards utopia.
What we need is to distil the ideas
and figure out what we like about the utopia.
Isolate them and target them.
In this picture you don't see the garbage men
or all the pipes and systems underground.
I can't imagine a world without tractors and ditches.
There will always be tractors and ditches.

https://www.flickr.com/photos/x-ray_delta_one/3815958811

Reusable Composable Parts

I'd like to introduce my Futuristic Vision of the Future of Cassandra's Future:
an ecosystem where the parts of Cassandra can be reused to build systems that may outlive the project itself.
It will end up growing our base.
There are good side effects (will get to those later).
You might be thinking...

And that's OK. Get this:
Reusable Composable Parts
Healthier Ecosystem
Better Software
More Users
More Committers

I think a Cassandra made up of parts will have some positive attributes:

Ok. What parts?

What do we need to do?

What is the next step?
This is how we could proceed.

https://www.flickr.com/photos/iansand/3999841402

What Must Be Done Today: Modularize the code

Modularize the code so we can be more nimble.
Nimble?
Make reusable parts.
Maybe even subprojects.

https://www.flickr.com/photos/pedromourapinheiro/5075612989

What Must Be Done Today: Commit Log

Commit Log
A concurrent journaling system that accepts bytes and supports checkpoints and recovery.
Refactor out the parts that keep track of CF last write.
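
One way to picture the extracted part is a journal that only knows about bytes, positions, and checkpoints, with no knowledge of column families. The sketch below is a minimal in-memory illustration under that assumption. `JournalSketch` and its method names are hypothetical and are not Cassandra's CommitLog API, and a real journal would write to disk segments rather than a list.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Hypothetical standalone journal; not Cassandra's actual CommitLog API. */
final class JournalSketch {
    private final List<byte[]> records = new ArrayList<>();
    private long firstPosition = 0; // journal position of records.get(0)

    /** Accepts opaque bytes and returns the position assigned to the record. */
    synchronized long append(byte[] payload) {
        records.add(payload.clone());
        return firstPosition + records.size() - 1;
    }

    /** The caller declares everything up to and including position safe to discard. */
    synchronized void checkpoint(long position) {
        long drop = position - firstPosition + 1;
        if (drop <= 0) {
            return; // already checkpointed past this position
        }
        if (drop > records.size()) {
            throw new IllegalArgumentException("position not yet written: " + position);
        }
        records.subList(0, (int) drop).clear();
        firstPosition += drop;
    }

    /** Recovery: replays every record written after the last checkpoint, in order. */
    synchronized void replay(Consumer<byte[]> sink) {
        for (byte[] record : records) {
            sink.accept(record.clone());
        }
    }

    public static void main(String[] args) {
        JournalSketch journal = new JournalSketch();
        journal.append("a".getBytes(StandardCharsets.UTF_8));
        long flushed = journal.append("b".getBytes(StandardCharsets.UTF_8));
        journal.append("c".getBytes(StandardCharsets.UTF_8));
        journal.checkpoint(flushed); // "a" and "b" are durable elsewhere
        journal.replay(r -> System.out.println(new String(r, StandardCharsets.UTF_8))); // prints "c"
    }
}
```

The database would keep its own map from column family to last written position and decide when to call `checkpoint`, which is the bookkeeping the note proposes to refactor out.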

https://www.flickr.com/photos/pedromourapinheiro/5075612989

What Must Be Done Today: Internode Messaging

Internode Messaging
Almost its own framework.
Core: a way to send acked and non-acked messages between nodes with guaranteed delivery.
Also includes the notion of verbs and handlers.
We'd need ad hoc verbs.
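
The verbs-and-handlers idea, with verbs that can be added at runtime, might look roughly like the sketch below. It is an assumption-laden illustration: `VerbRegistrySketch`, `VerbHandler`, and the `METRICS_PING` verb are hypothetical names, verbs are plain strings, and the network transport, acks, and delivery guarantees are left out entirely.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical verb registry; transport, acks and retries are out of scope here. */
final class VerbRegistrySketch {
    /** Handles one message payload and returns a reply payload. */
    interface VerbHandler {
        byte[] handle(String fromNode, byte[] payload);
    }

    private final Map<String, VerbHandler> handlers = new ConcurrentHashMap<>();

    /** Registers an ad hoc verb at runtime; a verb can only be claimed once. */
    void register(String verb, VerbHandler handler) {
        if (handlers.putIfAbsent(verb, handler) != null) {
            throw new IllegalStateException("verb already registered: " + verb);
        }
    }

    /** Stands in for the receive path: find the handler for the verb and invoke it. */
    byte[] dispatch(String verb, String fromNode, byte[] payload) {
        VerbHandler handler = handlers.get(verb);
        if (handler == null) {
            throw new IllegalArgumentException("unknown verb: " + verb);
        }
        return handler.handle(fromNode, payload);
    }

    public static void main(String[] args) {
        VerbRegistrySketch registry = new VerbRegistrySketch();
        // A system built on top of the messaging layer adds its own verb.
        registry.register("METRICS_PING",
                (from, payload) -> ("pong to " + from).getBytes(StandardCharsets.UTF_8));
        byte[] reply = registry.dispatch("METRICS_PING", "node-a", new byte[0]);
        System.out.println(new String(reply, StandardCharsets.UTF_8)); // pong to node-a
    }
}
```

Keying handlers by a runtime value rather than a fixed list is what would let another project reuse the messaging layer without patching it.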

https://www.flickr.com/photos/pedromourapinheiro/5075612989

What Must Be Done Today: Failure Detector + Gossiper

Failure Detector
Clustering software

https://www.flickr.com/photos/pedromourapinheiro/5075612989

What Must Be Done Today: Pluggable Storage

Not the first time this topic has come up.
Optimized for read cases.
Probably more interested in tuning the lookup/query strategy for reducing seeks.

https://www.flickr.com/photos/pedromourapinheiro/5075612989

What Must Be Done Today: SEDA Architecture

Internal event and worker queues.
Seeing some of this shake out in Guava.
Combine Function objects and thread pools.
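
Combining a function object with a thread pool gives a reusable stage: the pool's queue is the event queue, its threads are the workers, and the function is the work. The sketch below shows that shape using only the JDK. `StageSketch` is a hypothetical name, and nothing here is taken from Cassandra's stage code or from Guava.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;
import java.util.function.Function;

/** Hypothetical SEDA-style stage: an event queue, worker threads, and one function. */
final class StageSketch<I, O> {
    private final ExecutorService pool; // owns the stage's queue and worker threads
    private final Function<I, O> work;
    private final Consumer<O> next;     // downstream stage or final sink

    StageSketch(int threads, Function<I, O> work, Consumer<O> next) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.work = work;
        this.next = next;
    }

    /** Enqueues an event; a worker applies the function and hands the result downstream. */
    void submit(I event) {
        pool.execute(() -> next.accept(work.apply(event)));
    }

    /** Stops accepting events and waits for queued ones to drain; false on timeout. */
    boolean shutdown(long timeout, TimeUnit unit) {
        pool.shutdown();
        try {
            return pool.awaitTermination(timeout, unit);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            return false;
        }
    }

    public static void main(String[] args) {
        // Two stages chained: parse strings, then format and print the numbers.
        StageSketch<Integer, String> format =
                new StageSketch<>(1, i -> "value=" + i, System.out::println);
        StageSketch<String, Integer> parse =
                new StageSketch<>(2, Integer::parseInt, format::submit);
        parse.submit("41");
        parse.submit("42");
        parse.shutdown(1, TimeUnit.SECONDS);  // drain upstream first
        format.shutdown(1, TimeUnit.SECONDS); // then downstream
    }
}
```

With two workers in the first stage, the order of the printed lines is not guaranteed. Each stage can be sized and tested on its own, which is the property that makes it a candidate for reuse.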

https://www.flickr.com/photos/pedromourapinheiro/5075612989

These next two things are hard for me to say.
But I'm going to say them anyway.

What Must Be Done Today: Adopt a modern build tool

Current build file is a discombobulation

https://www.flickr.com/photos/pedromourapinheiro/5075612989

What Must Be Done Today: Kill the singletons

Kill the singletons
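
To make the point concrete, the sketch below contrasts a class reached through a global static instance with one that receives its dependency through the constructor. All names (`GlobalClock`, `ExpiryChecker`) are hypothetical and chosen for illustration only; they are not classes from the Cassandra code base.

```java
import java.util.function.LongSupplier;

/** Hypothetical before/after illustration of replacing a singleton with an injected dependency. */
final class SingletonFreeSketch {
    /** Before: a global instance that every caller reaches through a static field. */
    static final class GlobalClock {
        static final GlobalClock instance = new GlobalClock();

        long now() {
            return System.currentTimeMillis();
        }
    }

    /** After: the clock is passed in, so a test can substitute a fixed one. */
    static final class ExpiryChecker {
        private final LongSupplier clock;

        ExpiryChecker(LongSupplier clock) {
            this.clock = clock;
        }

        boolean isExpired(long writtenAtMillis, long ttlMillis) {
            return clock.getAsLong() - writtenAtMillis >= ttlMillis;
        }
    }

    public static void main(String[] args) {
        // Production wiring can still use the real clock.
        ExpiryChecker production = new ExpiryChecker(GlobalClock.instance::now);
        System.out.println(production.isExpired(0L, 1L)); // true

        // A test pins time to 1000 ms and needs no global state.
        ExpiryChecker underTest = new ExpiryChecker(() -> 1_000L);
        System.out.println(underTest.isExpired(400L, 500L)); // true
        System.out.println(underTest.isExpired(600L, 500L)); // false
    }
}
```

Two independent `ExpiryChecker` instances can live in one process, which is also what running more than one node, or more than one reused part, in a single JVM would require.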

https://www.flickr.com/photos/pedromourapinheiro/5075612989

Evidence of Success (or failure)

This is where we get to experience some aspect of the utopia

Subprojects
More Committers
More Users (cross-project adoption)

Generally a bigger footprint.
This isn't all...

Stability
Easier tests
Nimbler

Since the systems are independent and less coupled, this would be a BIG help for the project moving forward.
Might even help us hit some of those short term goals more quickly.
Less coupling means that we can write better tests more easily.

Some may argue: at what expense?
There are always tradeoffs, right?
We'll cover those.

Have Concerns?

This is where I address your concerns before you have them.
Maybe.
There are many reasons that make this not practical.
Reasons why this may not happen.

https://www.flickr.com/photos/boyce-d/4205175031

Have Concerns?
Best case - no bugs
Worst case - many bugs (byte buffers, anybody?)

Best case: no bugs introduced.

https://www.flickr.com/photos/boyce-d/4205175031

Have Concerns?
Touches every class
Complicated Merges

Complicates merges.

https://www.flickr.com/photos/boyce-d/4205175031

Have Concerns?
Gives up the short term

https://www.flickr.com/photos/boyce-d/4205175031

Have Concerns?
Is this right for the database?

Is this right for the database?

https://www.flickr.com/photos/boyce-d/4205175031

Have Concerns?
Is this right for the project?

Is this right for the project?

https://www.flickr.com/photos/boyce-d/4205175031

Real Question:
Benefits
Cost

Real question: Do the benefits outweigh the costs?
Not something I'll attempt to argue for or against here.
But I set forth what I perceive as the benefits early on:
Stability, Testing, Bigger Footprint.
Call to greatness.

https://www.flickr.com/photos/archeon/2941655917

We have the opportunity to make something great.

Call to greatness.
Not just something great,
but a great thing even better,
and the chance of greater things.

https://www.flickr.com/photos/lara604/5405044734

THANK YOU!!!!11
Gary [email protected]
@gdusbabek
Yes, we're [email protected]#

Something went wrong with the font on the "Thank You"

Photo & Image Credits
Conan O'Brien: The Internet
Goat: https://www.flickr.com/photos/saarblitz/16803524015
Road: https://www.flickr.com/photos/simone_pittaluga/6877522821
Stones: https://www.flickr.com/photos/downeym/6063328180
Frog: https://www.flickr.com/photos/blair25/3240324932
Clock: https://www.flickr.com/photos/judy-van-der-velden/6637487865
Feather: Apache Foundation
Lucene: Apache Foundation
Solr: Apache Foundation
Elasticsearch: Elasticsearch.com
Postgres: PostgreSQL Global Development Group
Hadoop: Apache Foundation
Cassandra: Apache Foundation
Snowflake: https://www.flickr.com/photos/chaoticmind75/5529107926
Duck: https://www.flickr.com/photos/matthewpaulson/5794901439
Warning: https://www.flickr.com/photos/arenamontanus/3492063978
Monkey: https://www.flickr.com/photos/tambako/3593686294
Future: https://www.flickr.com/photos/x-ray_delta_one/3815958811
Jackie Chan: Internet Meme
Gears: https://www.flickr.com/photos/iansand/3999841402
Crane: https://www.flickr.com/photos/pedromourapinheiro/5075612989
Fistpump: Internet Meme
Tardis: https://www.flickr.com/photos/boyce-d/4205175031
Scale: https://www.flickr.com/photos/archeon/2941655917
Bacon: https://www.flickr.com/photos/lara604/5405044734

All Flickr images are CC BY-NC-ND 2.0