Surviving the Hadoop Revolution

© 2016 IBM Corporation Hadoop Summit – Dublin 2016 Surviving the Hadoop Revolution Adriana Zubiri ([email protected]) Program Director, Analytics Scott C. Gray ([email protected]) Senior Architect and STSM, Big SQL, Big Data Open Source

Transcript of Surviving the Hadoop Revolution

Page 1: Surviving the Hadoop Revolution


Surviving the Hadoop Revolution

Adriana Zubiri ([email protected])
Program Director, Analytics

Scott C. Gray ([email protected])
Senior Architect and STSM, Big SQL, Big Data Open Source

Page 2: Surviving the Hadoop Revolution

© 2016 IBM Corporation, Hadoop Summit – Dublin – April 2016

Who Are We?

Adriana Zubiri
- Program Director, IBM Analytics
- Former development lead, IBM Big SQL
- Data Warehouse Performance expert
- Database engine developer (DB2)

Scott C. Gray
- Lead Architect of IBM Open Platform for Apache Hadoop
- On the ODPi (http://odpi.org) Technical Steering Committee
- Former lead architect of IBM Big SQL
- Open source developer (jsqsh and jline2)

Disclaimer: The perspective of this presentation is shaped by our backgrounds and experiences; it is our own and not that of IBM.

Page 3: Surviving the Hadoop Revolution


Agenda

The business impact of open source
- Once upon a time
- Then the world changed
- What customers want … or they think they want…
- Adapt or else…

Software Engineering: a new era
- Traditional vs open development model
- Integrating into the community
- Changing processes: getting agile

What’s Next?

Page 4: Surviving the Hadoop Revolution


Proprietary software
- own the software end to end

Major releases years apart
- focus on high quality and Service Level Agreements (SLAs)

Focus on patents
- protect from the competition

Competition was well defined
- relatively small number of players

Customers with large IT budgets
- multi-year contracts, large profit margins

Page 5: Surviving the Hadoop Revolution


Setting the stage… …for Hadoop

Data volume & variety explosion

Expensive storage

Budget restrictions due to economic climate

Open source gaining popularity in enterprises

Large data sets

Cheap commodity hardware

Easy to Scale

Resilient

Open & Free software

Page 6: Surviving the Hadoop Revolution


What customers want…

- To move to open source
• Open software vs. vendor lock-in

- Inexpensive software
• Unwilling to run unsupported software
• Free software is not free: deployment costs

- Bleeding edge and stable high quality software

Page 7: Surviving the Hadoop Revolution


In the meantime… at many Corporate HQs….

No! It will take part of our business!

Yes! We risk becoming irrelevant!!

Do we embrace and support?

Page 8: Surviving the Hadoop Revolution


What Business Model?

Open Source Software Pure Plays
- Revenue based on maintenance and support

Value Add Plays
- They leverage open source software
- They add proprietary value on top of open source rather than depending only on maintenance revenue
- Software as a Service companies are deploying this strategy

Page 9: Surviving the Hadoop Revolution


Why might companies open source their code?

Products that are not significantly more advanced than what the open source community has

Products where the community is very active and will catch up soon

Products that are already unprofitable

Areas where they need infrastructure for value adds further up the chain

To improve customer and community perception

Page 10: Surviving the Hadoop Revolution


Why might companies still NOT open source their code?

Giving up control
- Don’t own the roadmap of the project

Too complex and large
- Low chance for the community to adopt

It’s expensive!
- Need to invest in being the custodian of code that you don’t make money from
- Need to prepare the code for open source: remove proprietary IP, write docs, go through code clearance

Corporate politics
- Open source projects have corporate leaders, and sometimes corporate politics are involved when someone wants to contribute

Page 11: Surviving the Hadoop Revolution


Marketing & Sales: Who do we compete with?

Number of players in some spaces is unprecedented (e.g. SQL)
- Barrier to entry is low
- Small and open is seen as attractive
- Everyone can sit at the table (vs traditional vendors)

The new rules of roadmaps
- We compete on features delivered and planned features

Traditional benchmarking: the right strategy to show we are faster?
- The fast pace of new releases makes benchmarks obsolete almost as soon as they are published

Keeping customer trust
- Sometimes our solutions are faster and more reliable but … the feature exists in open source
- When just good enough is enough

Page 12: Surviving the Hadoop Revolution


Evolving Software Engineering

Waterfall to Agile
and
Learning to Let Go

Page 13: Surviving the Hadoop Revolution


Once Upon a Time – There Was Lots of (Slow) Process

Development involved a lot of waterfall-style process
- Gather requirements from the customer
- Prioritize the requirements
- Develop and publish a roadmap
- Develop a project plan for each release
- Develop and document
- QA
- Release
- Rake in the money!
- Repeat

So what you do is you take the specifications from the customers and you bring them down to the software engineers?

Page 14: Surviving the Hadoop Revolution


Once Upon a Time – We Were Control Freaks!

With all that process also came a lot of control:
- What features will be available
- When features will be made available
- How features will be built
- Quality and testing of all of the features
- Focus on usability and not just function
- Compatibility testing and assurances
- Thorough knowledge of the code base
- Detailed documentation of functionality
- Control over code style, documentation, interfaces
- Etc., etc., etc.

Page 15: Surviving the Hadoop Revolution


The Results of Process and Control?

Slow, methodical pace…
- Difficult to “keep pace” with this rapidly evolving open source world, but....
- Enterprise customers tend to like the stability and predictable evolution
- Integrating applications have time to stabilize

(Reasonably) strong assurances to customers

Complete responsibility for quality (both the good and bad)

Freedom and autonomy

Page 16: Surviving the Hadoop Revolution


Control In The New Age: Learning to Let Go

Adopting Free and Open Source Software (FOSS) as a foundation means letting go of much of the control we so cherished!

The community controls a lot
- Features and functionality
- Release timeline and roadmap
- Quality, stability, and performance
- Backwards compatibility
- Quality of documentation

At times one or more of the above is lacking
- And not in our control!

Page 17: Surviving the Hadoop Revolution


It’s OPEN, dummy. Get involved.

Fix it. Improve it.

The community is YOU!

Open Source

Page 18: Surviving the Hadoop Revolution


Entering the Community

Some developers have been involved in the ASF

Most came from closed source
- ASF processes feel burdensome and unfamiliar
- Not clear/comfortable how to work openly

Learning to work with the community
- Filing JIRAs and waiting for something to happen
- Getting attention from the community
- Accepting input and criticism
- Building committers
- Designing in the open
- Being patient

Page 19: Surviving the Hadoop Revolution


Community Challenges

Like your local community, there are conflicts, disputes, and other challenges

Sometimes the “community” is small, inclusive
- The founding team usually maintains a lot of control
- May be extraordinarily difficult to become a committer or have code accepted
- They may be (rightfully) very concerned about quality and design

Even an “open” project has its fiefdoms
- Even as a committer you can’t just “force” changes
- Different areas have different owners/authors
- Conflicts arise if you just commit without review

Page 20: Surviving the Hadoop Revolution


Closed source has these problems too!

But working directly with people means they can be quickly solved by
- Meetings & Leaders
- Hand-to-Hand Combat

Page 21: Surviving the Hadoop Revolution


Evolving to Agile

We used to deliver software on feature boundaries
- A release was “done” when a certain set of features was completed

The open development model mandates an AGILE strategy
- We could no longer plan when a feature might be added
- We can control the time to implement, but not the time to adoption
- We could no longer publish detailed time-based roadmaps
• Sales, marketing, product management HATE this!
- But agile paid off by speeding up our rate of release to the market and response to customer demands

Page 22: Surviving the Hadoop Revolution


Policing & Herding Cats: Many Projects With Many Unknowns

Nowhere can FOSS agility prove more challenging than maintaining a Hadoop distribution!

Bringing together many different projects, each:
- Evolving at its own pace
- With different degrees of stability and docs
- Different levels of maturity and security
- Varied user interfaces
- May include “preview” features that customers will invariably use
- Different levels of compatibility with the others
- Varied levels of backwards compatibility with prior releases

We are committed to working with projects to address as much as we can
- It is truly challenging though!

Page 23: Surviving the Hadoop Revolution


So what’s next?

Hadoop and open source have forced an evolution of the traditional software business

Continues pushing industry to innovate to stay relevant… to survive
- Pushing new areas of research
- Identifying and defining new markets
- Pushing technological advances
- Continuing to increase the rate of innovation