How We Prepared Etu Hadoop Competition 2014

How We Prepared Etu Hadoop Competition 2014. Study Hsueh, 2014/06/26. That Year, the Hadoop We Chased Together



How We Prepared Etu Hadoop Competition 2014

Study Hsueh

2014/06/26

That Year, the Hadoop We Chased Together


That Year, How We Lucked Into Winning EHC


Background

• qrtt1
  • Java & AWS Expert
• Study
  • Java Fan
• Lu
  • Machine Learning Beauty


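Hadoop Experience

• qrtt1
  • Had been saying "I should try Hadoop" since the Hadoop 1.x days, but never got around to it
• Study
  • Had installed CDH and knew a little about Hadoop 1.x
  • Had integrated with Hive and used sqoop to move RDBMS data
• Lu
  • Had heard other people talk about Hadoop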


Preliminary Round


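Division of Work Before the Preliminary Round

• qrtt1
  • Set up the Hadoop environment by hand
• Study
  • Prepared Bigtop RPMs (hosted on S3)
  • Modified the Vagrantfile
  • Testing
• Lu
  • Focused on learning Linux and setting up Hadoop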


Division of Work on the Day of the Preliminary Round

• qrtt1
  • Analyzed the scoring program
• Study
  • Ran the Vagrant script


Preliminary Round Result

• We forgot to set the hostname, which made HBase misbehave. Luckily, we still made it into the final :)
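A missing hostname is easy to catch before any service starts. The following is a minimal Python sketch, not the team's actual procedure. It assumes the mistake shows up as a hostname that is unset or that resolves to a loopback address, which is one common way a hostname problem upsets HBase. The function name `hostname_problems` and its injectable `resolve` parameter are hypothetical, introduced only for illustration.

```python
import socket
from typing import Callable, List


def hostname_problems(
    hostname: str,
    resolve: Callable[[str], str] = socket.gethostbyname,
) -> List[str]:
    """Return human-readable problems found with a node's hostname.

    `resolve` maps a hostname to an IPv4 address string; it is injectable
    so the check can be exercised without touching real DNS.
    """
    problems: List[str] = []

    # An unset hostname typically shows up as "localhost" or a variant of it.
    if not hostname or hostname.startswith("localhost"):
        problems.append(f"hostname {hostname!r} looks unset")
        return problems

    try:
        address = resolve(hostname)
    except OSError as exc:  # socket.gaierror is a subclass of OSError
        problems.append(f"{hostname!r} does not resolve: {exc}")
        return problems

    # Other nodes cannot reach a service that advertises a loopback address.
    if address.startswith("127."):
        problems.append(f"{hostname!r} resolves to loopback {address}")

    return problems
```

Calling `hostname_problems(socket.gethostname())` on each node before starting HBase would list any warnings; an empty list means both checks passed.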


Final


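Division of Work Before the Final Briefing

• qrtt1
  • Set up a Hadoop cluster by hand
  • Set up a KDC
  • HA and Kerberos setup & usage
• Study
  • Prepared a test machine similar to the competition environment
  • Prepared CDH & CentOS repository mirrors
  • Tried out various Hadoop distributions (CDH, HDP and Bigtop)
  • Performance tuning & testing
  • HA & Kerberos usage
• Lu
  • Set up a Hadoop cluster by hand
  • Tested Hadoop parameters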


Test Machine v1

• Type 1 hypervisor: VMware ESXi 5.5
• CPU: Intel i5 760
• RAM: 16 GB
• HDD: 2 TB × 2


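Choosing a Hadoop Distribution

• We chose CDH
• Pros
  • Hadoop parameters are easy to change and deploy
  • Log locations are fixed
• Cons
  • Cloudera Management Service is very resource-hungry (it can be turned off)
  • Installation is time-consuming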


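Division of Work After the Final Briefing

• qrtt1
  • Performance testing
• Study
  • Adjusted the test machine to match the competition environment as closely as possible
  • Prepared the VMs to be used on competition day
  • Performance testing
• Lu
  • Tested Hadoop parameters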


Test Machine v2

• Host: CentOS 6.5 x86_64 Desktop
• Type 2 hypervisor: Oracle VirtualBox 4.3.12
• CPU: Intel i5 760
• RAM: 32 GB
• HDD: 2 TB × 4


The Day Before the Final...

• The more we prepared, the more we found there was still to prepare
• We were exhausted


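Division of Work on the Day of the Final

• qrtt1
  • KDC setup
  • Watched the logs
  • Ran the scoring program
• Study
  • Prepared the hardware and software environment
  • Helped with troubleshooting
• Lu
  • Tuned Hadoop parameters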


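What We Knew Before the Final

• A single large VM was several times faster than four small VMs
• By default, CDH does not allow the system user hdfs to perform certain operations
• VirtualBox
  • JBOD made no significant difference
  • Much slower than ESXi VMs, and unresponsive from time to time
  • Changing permissions on a Shared Folder had no effect
  • VM-to-VM transfer speed was about 30 MB/s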


Strategy

• First, make sure we score some points on every item
• Only start tuning if another team pulls ahead on points
  • VM tuning
  • Hadoop parameter tuning
  • ramfs
  • Make the Hadoop cluster run like a single-node Hadoop
  • JBOD
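For the JBOD item, a Hadoop 2.x DataNode spreads its blocks across every directory listed in `dfs.datanode.data.dir`. A rough sketch of generating that property follows. The mount points and the `dfs/dn` subdirectory are assumptions for illustration, and on CDH the value would normally be set through Cloudera Manager rather than by editing `hdfs-site.xml` by hand.

```python
from typing import Sequence
from xml.sax.saxutils import escape


def datanode_data_dir_property(
    mount_points: Sequence[str],
    subdir: str = "dfs/dn",  # assumed layout, not taken from the slides
) -> str:
    """Build the hdfs-site.xml <property> for a JBOD-style DataNode.

    Each mount point is expected to be a separate physical disk; the
    DataNode then distributes block writes across all listed directories.
    """
    dirs = ",".join(f"{m.rstrip('/')}/{subdir}" for m in mount_points)
    return (
        "<property>\n"
        "  <name>dfs.datanode.data.dir</name>\n"
        f"  <value>{escape(dirs)}</value>\n"
        "</property>"
    )


# Hypothetical mount points, one per disk.
print(datanode_data_dir_property(["/data/1", "/data/2", "/data/3", "/data/4"]))
```

Note that the slides report no significant gain from JBOD under VirtualBox, so a setting like this is something to measure rather than assume.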


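Problems We Ran Into During the Final

• The VMs were abnormally slow
• 30 × 3 GB of data had to be written to HDFS, but the VM disks we had prepared were only 80 GB
• The HA failover check waited only 10 seconds, which was not enough time for the NameNode to switch over
• HBase was run as the system user hdfs, which caused permission errors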


Troubleshooting

• The VMs were abnormally slow
  • Cause: each VM was allocated too many cores (12 cores)
  • Fix: reduced each VM to 4 cores
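This kind of misconfiguration can be flagged with a simple sanity check: report any VM whose vCPU count exceeds the host's CPU count. The sketch below assumes the host CPU count is either passed in or read with `os.cpu_count()`, which reports logical CPUs. The VM names and the host core count in the usage example are hypothetical; the slides do not state the core count of the competition host.

```python
import os
from typing import Dict, Optional


def oversubscribed_vms(
    vcpus_by_vm: Dict[str, int],
    host_cores: Optional[int] = None,
) -> Dict[str, int]:
    """Return the VMs that are allocated more vCPUs than the host has CPUs."""
    if host_cores is None:
        # os.cpu_count() can return None, so fall back to 1.
        host_cores = os.cpu_count() or 1
    return {name: n for name, n in vcpus_by_vm.items() if n > host_cores}


# Hypothetical example: a 4-core host with one VM given 12 vCPUs.
print(oversubscribed_vms({"hadoop-big": 12, "hadoop-small": 4}, host_cores=4))
```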


Troubleshooting

• 30 × 3 GB of data had to be written to HDFS, but the VM disks we had prepared were only 80 GB
  • Mount new virtual disks
  • Stop Kerberos
  • Reformat HDFS
  • Start Kerberos
  • In the end this broke HBase
  • Restored the VM from a snapshot
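The disk shortfall can be estimated ahead of time with simple arithmetic. In the sketch below, only the 30 × 3 GB workload and the 80 GB disk come from the slides. The replication factor of 3 is the stock HDFS default and may not match the contest setting, and the node count and the `usable_fraction` headroom are assumptions.

```python
def raw_gb_needed(chunk_count: int, chunk_gb: int, replication: int = 3) -> int:
    """Raw disk space HDFS needs once every block is replicated."""
    return chunk_count * chunk_gb * replication


def fits(
    chunk_count: int,
    chunk_gb: int,
    disk_gb_per_node: int,
    node_count: int,
    replication: int = 3,
    usable_fraction: float = 0.8,  # assumed headroom for OS, logs, temp files
) -> bool:
    """Check whether the replicated data fits on the cluster's data disks."""
    usable = disk_gb_per_node * node_count * usable_fraction
    return raw_gb_needed(chunk_count, chunk_gb, replication) <= usable


# 30 x 3 GB is 90 GB before replication, already more than one 80 GB disk.
print(raw_gb_needed(30, 3, replication=1))
# With the default replication of 3 the raw requirement triples.
print(raw_gb_needed(30, 3))
# Hypothetical 4-node cluster with 80 GB per node.
print(fits(30, 3, disk_gb_per_node=80, node_count=4))
```

Running a check like this against the task description before the contest would have shown the 80 GB disks to be too small without any reformatting of HDFS.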


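Troubleshooting

• The scoring program for HA failover waited only 10 seconds, which was not enough time for the NameNode to switch over
  • Suspended the scoring program with Ctrl+Z
  • Once we had confirmed that the failover was complete, resumed the scoring program with fg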
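The manual Ctrl+Z and fg workaround amounts to waiting until the new NameNode reports itself active. A hedged sketch of automating that confirmation step follows. `hdfs haadmin -getServiceState` is the stock HDFS command for querying HA state; the service ID `nn2`, the timeout, and the polling interval are assumptions, and on a Kerberized cluster the caller would need a valid ticket first.

```python
import subprocess
import time
from typing import Callable


def namenode_state(service_id: str) -> str:
    """Ask HDFS for one NameNode's HA state ("active" or "standby")."""
    result = subprocess.run(
        ["hdfs", "haadmin", "-getServiceState", service_id],
        capture_output=True,
        text=True,
        check=False,
    )
    return result.stdout.strip()


def wait_until_active(
    service_id: str,
    get_state: Callable[[str], str] = namenode_state,
    timeout_s: float = 120.0,  # assumed; the scoring program itself waited 10 s
    poll_s: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll until the given NameNode is active; return False on timeout."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_state(service_id) == "active":
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_s)


# Hypothetical usage on a cluster node, with "nn2" as the standby's service ID:
#   if wait_until_active("nn2"):
#       print("failover complete; safe to resume the scoring program")
```

`get_state` and `sleep` are injectable so the loop can be exercised without a live cluster.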


Troubleshooting

• HBase was run as the system user hdfs, which caused permission errors
  • Added a new Kerberos user
  • Granted that user permission to run MapReduce, HBase and HDFS
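The slides do not list the exact commands used, so the following is only a sketch of what adding such a user commonly involves, written as a Python helper that builds (but does not run) the command lines. The user name, realm, and home directory layout are hypothetical. It also assumes an MIT Kerberos KDC, that the HDFS commands are run by an HDFS superuser, and that HBase authorization is enabled so that `grant` has an effect.

```python
from typing import List


def onboarding_commands(user: str, realm: str) -> List[str]:
    """Build shell commands for giving a new Kerberos user cluster access.

    Nothing is executed here; the caller decides where each command runs
    (KDC host, HDFS superuser session, HBase admin session).
    """
    principal = f"{user}@{realm}"
    return [
        # On the KDC: create the principal (prompts for a password).
        f'kadmin.local -q "addprinc {principal}"',
        # As an HDFS superuser: give the user a home directory they own.
        f"hdfs dfs -mkdir -p /user/{user}",
        f"hdfs dfs -chown {user}:{user} /user/{user}",
        # As an HBase admin: grant read/write/execute/create/admin rights.
        f"echo \"grant '{user}', 'RWXCA'\" | hbase shell",
    ]


for command in onboarding_commands("contest", "EXAMPLE.COM"):
    print(command)
```

Running MapReduce jobs as the new user on a secured cluster typically also requires a matching OS account on every worker node, which this sketch does not cover.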


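Conclusion

• The competition involved many trade-offs, and in the end much of what we had prepared went unused
• The competition was over before we had played our trump card
• Perhaps we only won by a small margin because we knew Linux a little better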


With enough hard work, even a layman can one day become a real man!


References

• Etu Hadoop Competition 2014
  • http://ehc.etusolution.com/index.php/tw/
• A Layman's Hadoop Deployment Contest (Part 1)
  • http://www.codedata.com.tw/social-coding/contest-of-hadoop-layman-1/
• A Layman's Hadoop Deployment Contest (Part 2)
  • http://www.codedata.com.tw/social-coding/contest-of-hadoop-layman-2/