My talk at LVEE 2016
Using Hadoop stack to build a cloud VAT declarations revising service
Alex Chistyakov, Git in Sky
Grodno, LVEE 2016
Who I am
● Hello, my name is Alex
● Principal Engineer @ Git in Sky
● Hadoop operations engineer
● Former Java developer (not only Java and not so “former” in fact)
Who are you?
● Linux and OSS enthusiasts?
● Software developers?
● DevOps engineers?
● Big data guys?
Well, what is this all about?
● Configuring a Hadoop/HBase cluster is easy
● 1) Buy a lot of hardware
● 2) Configure the bloody cluster!
● 3) ???
● 4) PROFIT!!!
Big Data is hard!
● A customer wants a number of environments for different purposes (dev, testing, staging & production)
● DevOps culture requires repeatability!
● (Observe a beautiful snowflake to the right)
● Business wants to reduce costs
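One way to picture repeatability across the dev, testing, staging and production environments is a single set of defaults plus a small, explicit set of overrides per environment. The sketch below is only an illustration in Python: the setting names (`datanodes`, `replication`, `heap_mb`) and all values are hypothetical and do not come from the talk.

```python
# Hypothetical settings for illustration; names and values are assumptions.
DEFAULTS = {"datanodes": 3, "replication": 3, "heap_mb": 4096}

# Each environment lists only what differs from the defaults.
OVERRIDES = {
    "dev": {"datanodes": 1, "replication": 1, "heap_mb": 1024},
    "testing": {"datanodes": 2, "replication": 2},
    "staging": {},
    "production": {"datanodes": 12, "heap_mb": 16384},
}


def settings_for(env):
    """Return the defaults with the overrides for one environment applied."""
    if env not in OVERRIDES:
        raise KeyError(f"unknown environment: {env}")
    # Later keys win, so overrides replace defaults; nothing is mutated.
    return {**DEFAULTS, **OVERRIDES[env]}


if __name__ == "__main__":
    for env in OVERRIDES:
        print(env, settings_for(env))
```

Because every environment is derived from the same defaults, the differences between them stay visible in one place instead of living on hand-configured "snowflake" machines.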
So, we need a detailed plan
● 1) Buy an enterprise subscription from Oracle
● ^ FAIL!
So, we need a detailed plan
● 1) Read the manual on the product site
● 2) Configure everything manually
● ^ FAIL!
So, we need a detailed plan
● 1) Take Cloudera distribution of Hadoop
● 2) Configure everything from a web interface
● 3) Don’t forget to buy an enterprise subscription
● 4) ^ MULTIPLE FAILS!!!
A word on proprietary software
● Proprietary software is full of nasty bugs, period
A word on open source software
● Open source software is awesome
Software market in 2016
● It’s not “proprietary vs open source”
● It’s “open source vs open source”
Open source vs open source
● Cloudera CDH vs vanilla Apache
So, we need a detailed plan
● 1) Hire a DevOps engineer
● 2) Use Chef or something
● 3) Automate all the things
● 4) ???
● 5) PROFIT!!!
100 reasons not to use Cloudera CDH
● Cloudera CDH obscures configuration
● Cloudera CDH generates textual configs from the DB
● Cloudera CDH is web-interface centric
● Cloudera CDH is a monolith with a vendor lock-in
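The alternative to configs generated from a database is to keep the settings as plain text under version control and render the Hadoop `*-site.xml` files from them. The Python sketch below is a minimal illustration of that idea, not the code from the Ansible roles: `render_site_xml` is a hypothetical helper and the host name is a placeholder, while `fs.defaultFS` and `dfs.replication` are standard Hadoop property names.

```python
import xml.etree.ElementTree as ET


def render_site_xml(properties):
    """Render a mapping of Hadoop properties as a *-site.xml document."""
    root = ET.Element("configuration")
    # Sort the keys so repeated runs produce identical, diff-friendly output.
    for name in sorted(properties):
        prop = ET.SubElement(root, "property")
        ET.SubElement(prop, "name").text = name
        ET.SubElement(prop, "value").text = str(properties[name])
    # ET.indent only exists on Python 3.9+; skip pretty-printing otherwise.
    if hasattr(ET, "indent"):
        ET.indent(root)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Placeholder values; the host name is an assumption for illustration.
    settings = {
        "fs.defaultFS": "hdfs://namenode.example.com:8020",
        "dfs.replication": 3,
    }
    print(render_site_xml(settings))
```

With the source mapping kept in a repository, a config change becomes a reviewable text diff rather than a click in a web interface.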
Our own little open source product
● Based on Ansible (Ansible is like Chef but awesome)
● https://github.com/gitinsky/ansible-hadoop-stack-howto
● https://github.com/gitinsky/ansible-role-*
Problems
● Lack of documentation
● Lack of manpower
● Nobody uses our product (except us)
What about the VAT service thing?
● Forget it, it’s not that relevant
Conclusions
● Open source software is awesome
● But Cloudera CDH is not
● We can make open source software better
So long, and thanks for all the fish!
● Ask your questions please
● Alex Chistyakov, Principal Engineer @ Git in Sky
● http://gitinsky.com
● http://meetup.com/DevOps-40