How we lose etu hadoop competition

description

Our experience of joining a Hadoop deployment competition in Taiwan.

Transcript of How we lose etu hadoop competition

Page 1: How we lose etu hadoop competition

How We Lose Etu Hadoop Competition

Evans Ye

2014.6.16

04/07/2023 Confidential | Copyright 2013 TrendMicro Inc. 1

Page 2: How we lose etu hadoop competition

This April, a Hadoop Competition hosted by Etu was announced

Page 3: How we lose etu hadoop competition

It’s about Hadoop deployment

Page 4: How we lose etu hadoop competition

I have a dream… to win that 150 grand

Page 5: How we lose etu hadoop competition

Our Team

• Fann Wu, Mammi Chang
  – Solid hardware knowledge
  – Know well how to tune performance on Hadoop clusters
• Evans Ye
  – Some experience developing an automatic Hadoop deployment tool

Page 6: How we lose etu hadoop competition

Agenda

• The preliminary
  – Winning criteria
  – What we’ve prepared
• The final
  – Winning criteria
  – What we’ve prepared
• Why we lost the competition
• Lessons learned

Page 7: How we lose etu hadoop competition

The preliminary

• Deploy an all-in-one Hadoop instance on EC2

Page 8: How we lose etu hadoop competition

Criteria to win the preliminary

• NameNode daemon exists
• Put a 100 MB file into HDFS
• YARN daemons exist
• Run a pi job
• ZooKeeper daemon exists
• HBase daemon exists
• Run HBase put and scan
• Run a Pig script
• Run a Hive query
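
The "daemon exists" checks above can be rehearsed locally. The sketch below is only an assumption about how such a check might look (the organizers' actual checker is not shown in the slides): it scans `jps`-style output, one "<pid> <ClassName>" pair per line, for a given JVM main class. `daemon_listed` is a hypothetical helper name.

```sh
# Succeed when a daemon with the given JVM main-class name appears in
# jps-style output ("<pid> <ClassName>" per line) read from standard input.
# daemon_listed is a hypothetical helper, not part of Hadoop.
daemon_listed() {
  awk -v name="$1" '$2 == name { found = 1 } END { exit(found ? 0 : 1) }'
}

# Example usage on a live node (not executed here):
#   jps | daemon_listed NameNode        && echo "namenode up"
#   jps | daemon_listed ResourceManager && echo "yarn up"
#   jps | daemon_listed QuorumPeerMain  && echo "zookeeper up"
#   jps | daemon_listed HMaster         && echo "hbase up"
```

Reading from standard input keeps the helper testable without a running cluster.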

Page 9: How we lose etu hadoop competition

And the most important one: finish time

Page 10: How we lose etu hadoop competition

Prepare for the fight

Page 11: How we lose etu hadoop competition

What we prepared

• To achieve the fastest finish time, we needed to practice over and over.
  – Vagrant-based scripts to simulate the AWS environment
  – A shell script that automatically provisions an all-in-one Hadoop

Page 12: How we lose etu hadoop competition

Vagrant

• An open source command-line VM provisioning tool
  – http://www.vagrantup.com/
• Supports VirtualBox, VMware, AWS and more as VM providers
• Supports shell, Puppet and Chef for provisioning
• Previous sharing

Page 13: How we lose etu hadoop competition

Vagrant-aws plugin

• https://github.com/mitchellh/vagrant-aws
• Vagrantfile

Page 14: How we lose etu hadoop competition

Provision script

• Jazz Wang already leaked the script to provision an all-in-one Hadoop on Ubuntu at OSDC.TW
  – Package-based deployment (you can also start from tarballs)

Page 15: How we lose etu hadoop competition

Our hack #1

• Use a self-cloned S3 repo instead of worldwide public repos
  – Avoid SPOF
  – Co-located with the Singapore region to speed up network transmission

Page 16: How we lose etu hadoop competition

Our hack #2

• The evil /usr/lib/hadoop/libexec/init-hdfs.sh

Page 17: How we lose etu hadoop competition

Our hack #2

• /usr/lib/hadoop/libexec/init-hdfs.sh
  – An HDFS directory bootstrap script
    • /user/hbase, /tmp, /var/log/hadoop-yarn/apps…
  – Executes lots of hadoop shell commands
    • HELL SLOW!
  – BIGTOP-952 attempts to solve it by calling the HDFS API directly using Groovy
  – Our hack is to concatenate similar commands into one command
    • hadoop fs -mkdir -p /tmp /var/log /tmp/hadoop-yarn
    • 50 → 15 calls
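
The batching idea can be sketched as follows. Each `hadoop fs` invocation starts its own JVM, so merging directories into one call saves startup time. The helper below only prints the batched command so it can be inspected first; `batched_mkdir_cmd` is a hypothetical name, and the directory list is the one from the slide, not the full init-hdfs.sh list.

```sh
# Print one batched "hadoop fs -mkdir -p" command covering every directory
# passed as an argument (dry run: nothing is executed against HDFS).
# Directory names are assumed to contain no whitespace.
batched_mkdir_cmd() {
  printf 'hadoop fs -mkdir -p'
  for dir in "$@"; do
    printf ' %s' "$dir"
  done
  printf '\n'
}

# Example usage (not executed here):
#   batched_mkdir_cmd /tmp /var/log /tmp/hadoop-yarn | sh
```

Commands that need different options (for example different owners or permissions) still have to be grouped separately, which is presumably why the count dropped to 15 rather than 1.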

Page 18: How we lose etu hadoop competition

Our hack #3

• Run the HDFS, HBase, Pig and Hive test cases in parallel
  – (hdfs test case here) &
  – (hbase test case here) &
  – (…) &
  – wait
  – send my score
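
A slightly more defensive variant of the pattern above, as a sketch: a bare `wait` returns 0 even when a background job failed, so this version remembers each PID and waits on them one by one. `run_in_parallel` and the job names in the usage comment are hypothetical; the real test cases are not shown in the slides.

```sh
# Run each named job (a shell function or command) in the background,
# then wait for every one of them and return the number of failed jobs.
run_in_parallel() {
  local pids="" failed=0 job pid
  for job in "$@"; do
    "$job" &
    pids="$pids $!"
  done
  for pid in $pids; do
    wait "$pid" || failed=$((failed + 1))
  done
  return "$failed"
}

# Example usage (hypothetical job functions, not executed here):
#   run_in_parallel hdfs_test hbase_test pig_test hive_test && send_score
```

The total time is then bounded by the slowest test case instead of the sum of all of them.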

Page 19: How we lose etu hadoop competition

Pretty good result on the preliminary

Page 20: How we lose etu hadoop competition

The Final

Page 21: How we lose etu hadoop competition

Evans: GJ, let’s get some rest

• 2 weeks gone

Page 22: How we lose etu hadoop competition

The Final

Page 23: How we lose etu hadoop competition

Criteria to win the final

• held on 5/31 at Etu’s building

Page 24: How we lose etu hadoop competition

Criteria to win the final

• Deployment completeness (20%)
  – ZooKeeper, HDFS, YARN deployed
• High availability verification (20%)
  – NameNode HA using JournalNodes
• System security verification (10%)
  – Kerberos enabled
• Runtime performance (30%)
  – DFSIO (write throughput)
  – TeraSort (sort speed)
  – HBaseEvaluation (HBase write throughput)

Page 25: How we lose etu hadoop competition

Environment

• Hardware

• Software

Page 26: How we lose etu hadoop competition

Summary of what we need to do

• This time, finish time doesn’t matter. We need to focus on correctness and performance
  – Choose a Hadoop deployment tool that supports
    • NameNode HA
    • Kerberos
    • YARN
  – Figure out how to get the best performance on YARN and VirtualBox

Page 27: How we lose etu hadoop competition

Choosing the deployment tool

• Cloudera Manager
  – You need to install and configure Kerberos yourself
• Ambari
  – “Claims” to support Kerberos, but actually it does not
• Bigtop
  – Has Kerberos and NameNode HA Puppet recipes, but they are currently kind of buggy
• Hadooppet
  – YARN deployment still needs to be implemented

Page 28: How we lose etu hadoop competition

Cloudera Manager

…Kerberos installation/configuration is on your own

Page 29: How we lose etu hadoop competition

Ambari has great UI design, but…

Page 30: How we lose etu hadoop competition

Comparison

Deployment Tool    Namenode HA   Kerberos             YARN   Hadoop distro                        Troubleshooting
Cloudera Manager   YES           NO                   YES    Hadoop 2.3.0 (CDH5)                  HARD
Ambari             YES           NO (enable failed)   YES    Hadoop 2.4.0 (HDP2.1)                HARD
Bigtop             NO (NFS)      NO (buggy)           YES    Hadoop 2.0.6-alpha (bigtop-0.7.0)    MIDDLE
Hadooppet          YES           YES                  NO     Hadoop 2.3.0 (CDH5)                  EASY

勝 勝 (win, win)

Page 31: How we lose etu hadoop competition

Getting our deployment tool ready

Page 32: How we lose etu hadoop competition

Trap #1

• Got connection refused from the JournalNodes while formatting the NameNodes
• The root cause
  – When a hostname is defined in the Vagrantfile, Vagrant sets up the VM’s hostname AND /etc/hosts
  – As a result the JournalNodes listen on 127.0.0.1, which causes the connection refused error while formatting the NameNodes
• The fix
  – cat /dev/null > /etc/hosts
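
Before wiping /etc/hosts, it can help to confirm that this is really the problem. The sketch below is an assumption about how to detect it, not something from the slides: it reports whether a hostname is mapped to a loopback address in a hosts-format file. `maps_to_loopback` and the hostnames in the usage comment are hypothetical.

```sh
# Succeed when HOST is mapped to a loopback address (127.x.x.x or ::1)
# in a hosts-format file. Usage: maps_to_loopback HOSTS_FILE HOST
maps_to_loopback() {
  awk -v host="$2" '
    { sub(/#.*/, "") }                      # drop comments
    $1 ~ /^127\./ || $1 == "::1" {
      for (i = 2; i <= NF; i++)
        if ($i == host) found = 1
    }
    END { exit(found ? 0 : 1) }
  ' "$1"
}

# Example usage on a VM (not executed here):
#   maps_to_loopback /etc/hosts "$(hostname)" && echo "hostname resolves to loopback"
```

A daemon that binds to the address its own hostname resolves to will then only be reachable from the local machine, which matches the symptom on the slide.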

Page 33: How we lose etu hadoop competition

Trap #2

• Kerberos database initialization failed because a timeout was exceeded
• The root cause
  – VirtualBox has poor entropy performance (Ticket #11297)
  – Kerberos DB init cannot get enough random data
  – Entropy is often collected from hardware sources for use in cryptography

Page 34: How we lose etu hadoop competition

Trap #2

• A quick test to get entropy
  – A Xen VM
  – A VirtualBox VM
• The fix
  – Set up the havege package, which improves entropy performance
    • havege official site, Installation
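
The outputs of the quick test are not preserved in this transcript. One way such a test might look, as a sketch: Linux exposes its entropy estimate (in bits) at /proc/sys/kernel/random/entropy_avail. The helper names and the 1000-bit threshold below are assumptions for illustration only.

```sh
# Print the kernel's available-entropy estimate in bits.
# The optional path argument exists so the function can be tried on a fixture file.
entropy_avail() {
  cat "${1:-/proc/sys/kernel/random/entropy_avail}"
}

# Succeed when the estimate is at least MIN_BITS (hypothetical default: 1000).
# Usage: entropy_ok [MIN_BITS] [PATH]
entropy_ok() {
  local min_bits="${1:-1000}" path="${2:-/proc/sys/kernel/random/entropy_avail}"
  local bits
  bits=$(entropy_avail "$path") || return 2
  [ "$bits" -ge "$min_bits" ]
}

# Example usage on a VM (not executed here):
#   entropy_ok || echo "low entropy: Kerberos DB init may stall"
```

Running the check before and after installing the entropy daemon gives a rough idea of whether the fix took effect.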

Page 35: How we lose etu hadoop competition

Performance Tuning

Page 37: How we lose etu hadoop competition

VirtualBox tuning

• Raw hard disk access
  – Directly access host disks from the guest VM
  – Create a VMDK file to represent the disk/partition
  – Attach it to the guest through the VirtualBox GUI
  – fdisk the newly added disk in the guest VM
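
For the "create a VMDK file" step, VirtualBox releases of that period document a `VBoxManage internalcommands createrawvmdk` command; newer releases may use a different command, so check the manual for your version. The sketch below only prints the command. `raw_vmdk_cmd`, the VMDK path and the device name are hypothetical placeholders.

```sh
# Print (do not run) the VBoxManage command that creates a VMDK descriptor
# pointing at a raw host disk. Paths are assumed to contain no whitespace.
# Usage: raw_vmdk_cmd VMDK_FILE HOST_DEVICE
raw_vmdk_cmd() {
  printf 'VBoxManage internalcommands createrawvmdk -filename %s -rawdisk %s\n' "$1" "$2"
}

# Example usage (hypothetical paths, not executed here):
#   raw_vmdk_cmd /vms/data1.vmdk /dev/sdb
```

Raw disk access lets the guest write to the host disk directly, so pointing it at the wrong device can destroy host data; printing the command first is a cheap safeguard.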

Page 38: How we lose etu hadoop competition

YARN tuning

• HDFS cache for reads (available since 2.3.0)
• YARN:
  – yarn.nodemanager.resource.memory-mb
• MapReduce:
  – io.sort.mb
  – mapreduce.map.memory.mb
  – mapreduce.map.java.opts
  – mapreduce.map.speculative
  – …
  – Most properties are job specific
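
Because most of these properties are job specific, they can be overridden per job with Hadoop's generic `-D property=value` options instead of editing the site files. A minimal sketch, assuming the job uses ToolRunner/GenericOptionsParser; `job_overrides` is a hypothetical helper and the values in the usage comment are placeholders, not the settings we used.

```sh
# Turn key=value pairs into a string of generic "-D key=value" options.
# Keys and values are assumed to contain no whitespace.
job_overrides() {
  local out="" kv
  for kv in "$@"; do
    out="$out -D $kv"
  done
  printf '%s\n' "${out# }"
}

# Example usage (placeholder values, not executed here):
#   job_overrides mapreduce.map.memory.mb=2048 mapreduce.map.speculative=false
```

Generic options have to come before the job's own arguments on the command line.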

Page 39: How we lose etu hadoop competition

Deployment Architecture

Page 40: How we lose etu hadoop competition

VMs configuration

       RAM   CPU        DISK              Daemons
VM1    7G    3 vcpus    Local disk        Namenode, Resourcemanager
VM2    7G    3 vcpus    Local disk        Namenode, Resourcemanager
VM3    15G   8 vcpus    1T raw disk *2    Datanode, Nodemanager
VM4    15G   8 vcpus    1T raw disk *2    Datanode, Nodemanager
total  44G   22 vcpus   4T for hdfs       -

Page 41: How we lose etu hadoop competition

5/31: The Day

Page 42: How we lose etu hadoop competition

The check we’re so eager to win

Page 43: How we lose etu hadoop competition

And the result

Page 44: How we lose etu hadoop competition

WE LOST

Page 45: How we lose etu hadoop competition

The reason we lost

• VirtualBox performs sluggishly with hyper-threading
• To avoid that:
  – Disable hyper-threading
  – Set an equal number of cores for host and guest
• VMs != physical machines
  – We all assumed that hyper-threading helps performance a lot; at least it does on our Hadoop cluster

Page 46: How we lose etu hadoop competition

Poor support for multi-cores

• VMs with multiple vCPUs require that all allocated cores be free before processing can begin
  – Do not configure too many vCPUs for a single VM
  – A strong VM will not perform as well as you expect

Page 47: How we lose etu hadoop competition

The better architecture

       RAM   CPU                                   DISK              Daemons
VM1    10G   4 vcpus                               1T raw disk *1    Namenode, Resourcemanager, Datanode, Nodemanager
VM2    10G   4 vcpus                               1T raw disk *1    Namenode, Resourcemanager, Datanode, Nodemanager
VM3    10G   4 vcpus                               1T raw disk *1    Datanode, Nodemanager
VM4    10G   4 vcpus                               1T raw disk *1    Datanode, Nodemanager
total  40G   16 vcpus (equal to physical cores)    4T for hdfs       -
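
The sizing rule behind this table (total vCPUs should not exceed the physical cores) is easy to check before booting anything. A minimal sketch, with hypothetical helper names; the numbers in the usage comment come from the two tables in this talk, where the better layout totals 16 vCPUs, equal to the physical cores.

```sh
# Print the sum of the vCPU counts given as arguments.
total_vcpus() {
  local total=0 n
  for n in "$@"; do
    total=$((total + n))
  done
  printf '%s\n' "$total"
}

# Succeed when the summed vCPUs do not exceed the physical core count.
# Usage: fits_physical_cores PHYSICAL_CORES VCPUS...
fits_physical_cores() {
  local physical="$1"
  shift
  [ "$(total_vcpus "$@")" -le "$physical" ]
}

# Example usage (not executed here):
#   fits_physical_cores 16 3 3 8 8 || echo "oversubscribed"   # our layout: 22 vCPUs
#   fits_physical_cores 16 4 4 4 4 && echo "fits"              # better layout: 16 vCPUs
```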

Page 48: How we lose etu hadoop competition

How about Hadoop performance tuning?

• Pretty much everybody used the defaults, including the team that won the competition
• …

Page 49: How we lose etu hadoop competition

Lessons learned

• Don't judge too soon
• Don’t stay up for a week. If you do, you can’t make decisions wisely
• We need better project management
  – We spent too much time tuning our deployment tool
  – We didn’t do many tests on different deployment architectures

Page 50: How we lose etu hadoop competition

Acknowledgments

• Thanks to Fann for taking care of the tedious groundwork
  – Packaging the box
  – Cloning repositories
  – Preparing the testing environment
• Thanks to Mammi for the great presentation on that day

Page 51: How we lose etu hadoop competition

Q&A
