Post on 19-May-2020
ODPi (Open Data Platform Initiative)
Standardizing Hadoop Ecosystem
Presented by: Ganesh Raju & Naresh Bhat, Big Data Team, LEG
Date: March 10, 2016 (BKK14-400B)
Event: Linaro Connect BKK16
Why ODPi?
● Hadoop is a collection of 47+ components
  ○ Compatibility issues
  ○ Insufficient documentation
  ○ No proper integrated tests
  ○ Maintenance nightmare
    ■ Release cycle timelines
    ■ API changes
    ■ Config files
● Help minimize the fragmentation and duplication of effort within the industry
In the table below, you can see who supports what. Other projects supported by all the vendors include HBase, Hive, Pig, Spark, and ZooKeeper, for a total of 8 projects supported by all. Potential for more.
What is ODPi?
● A shared industry effort to advance the state of Apache Hadoop and Big Data technologies for the enterprise
● Provide a well integrated and tested stable base
● Bring real enterprise demands into alignment with the developer community
● Configuration optimizations
● Best practices and standards
● Platform agnostic (ARM, X86, etc.)
● Certification programs
● Backed by the Linux Foundation
Benefits for ISVs
● Eliminates fragmentation, reduces costs, accelerates time to market
  ○ Cost of maintaining external open source components
  ○ Insufficient documentation
● Promotes compatibility between distros
● Members can focus on innovation and differentiated value-add
● Companies like Pivotal, Hortonworks, IBM and Altiscale already have big data solutions based on ODPi specs
ODPi Components
● ODPi consists of 2 stacks
  ○ Runtime stack
  ○ Management stack
● ODPi currently consists of only 3 components in the runtime stack and Ambari in the management stack
ODPi - Current status
● Automated CI loop builds are established. A local Nexus repository is set up to store all artifacts instead of fetching them from Apache
● Utilizing Bigtop as the single tool for CI build, deployment and automated tests
● Defining more smoke tests
● Spec for runtime stack is in draft, publicly shared
● Working on certification specs for distros
● Technical progress can be found at: https://github.com/odpi
Challenges:
● Concessions between members
● Management tool like Ambari
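
To make the "smoke tests" item above concrete, here is a minimal sketch of what one such check could look like. It is not taken from the ODPi or Bigtop test suites; the function names `parse_hadoop_version` and `smoke_test_hadoop` are hypothetical, and it assumes only that a `hadoop` launcher is on the PATH and that `hadoop version` prints a first line of the form `Hadoop <version>`.

```python
import re
import shutil
import subprocess


def parse_hadoop_version(output):
    """Return the version from a 'Hadoop X.Y.Z' line, or None if absent."""
    match = re.search(r"^Hadoop\s+(\d+(?:\.\d+)*\S*)", output, re.MULTILINE)
    return match.group(1) if match else None


def smoke_test_hadoop(executable="hadoop"):
    """Hypothetical smoke test: run '<executable> version' and inspect it.

    Returns (ok, detail). ok is True when the command ran successfully and
    printed a recognizable version line; detail is the version string or a
    short description of what went wrong.
    """
    path = shutil.which(executable)
    if path is None:
        return False, f"{executable!r} not found on PATH"
    try:
        result = subprocess.run(
            [path, "version"], capture_output=True, text=True, timeout=60
        )
    except (OSError, subprocess.TimeoutExpired) as exc:
        return False, f"could not run {executable!r}: {exc}"
    if result.returncode != 0:
        return False, f"exit code {result.returncode}"
    version = parse_hadoop_version(result.stdout)
    if version is None:
        return False, "no 'Hadoop <version>' line in output"
    return True, version


if __name__ == "__main__":
    ok, detail = smoke_test_hadoop()
    print("PASS" if ok else "FAIL", detail)
```

A real smoke test would likely go further (for example, writing a file to HDFS or submitting a small job), but even this level of check can catch a broken package install early in a CI loop.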
Linaro is a Member of ODPi
● Enablement of AARCH64
● Tested on Member Platforms
● Participate in Technical Spec
● Provide feedback to the community
● Provide Engineering efforts
Linaro’s contribution
● Reference ARM64 build
● Patches to Apache Bigtop
● ODPi tested against multi-node ARM-based clusters
● Benchmarking and performance tuning for ARM
● JDK optimizations
● ARM-based Developer Cloud for ODPi members to set up CI builds and tests
ODPi Installation and Run Instructions
● ODPi specs can be found here: https://github.com/odpi/specs/blob/v1.0.0-runtime-draft/ODPi-Operations.md
● ODPi deb and rpm packages can be found on Linaro repositories:
  ○ Debian Jessie - http://repo.linaro.org/ubuntu/linaro-overlay/
  ○ CentOS7 - http://repo.linaro.org/rpm/linaro-overlay/centos-7/
● ODPi Installation, setup and instructions to run
  ○ https://github.com/96boards/documentation/wiki/ODPi-Hadoop-Installation
ODPi Milestones
● RC of the spec for the runtime stack will be delivered in March
● First official release of ODPi will be by end of March
● Next release will be in October, with a release cadence of every 6 months
● Voting procedure to be defined by April to let members choose the next important component to be added into ODPi (could be HBase / Hive / Spark)
ODPi Roadmap
● Add more tests
● Certification program with tests for members to validate against
● Expanding tests to applications
● Containerize ODPi
● Make ODPi more cloud friendly; make the Ambari-like management stack more cloud integrated
● Grow the footprint and have as many components included as possible
Demo
By Naresh Bhat
Q & A