Karmasphere hadoop-productivity-tools
Transcript of Karmasphere hadoop-productivity-tools
![Page 1: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/1.jpg)
This slide intentionally left blank.
![Page 2: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/2.jpg)
State-of-the-Art Productivity Tools for Developers & Analysts
Shevek
![Page 3: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/3.jpg)
About Karmasphere
● Productivity suite for Developers and Analysts.
● Point-and-drool GUI for Hadoop, Hive, Cascading, Pig.
● MapReduce development and debugging on-cluster.
● Integrated with Eclipse and NetBeans IDEs.
● Interface between a human (you!) and a Hadoop cluster.
● Does the boring, tedious or repetitive bits.
● Finds the errors fast before you do.
● Works anywhere with anything.
HALP!
Karmasphere
Hockey sticks!
![Page 4: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/4.jpg)
The Idea
● Collect Underpants
● ....?
● Profit
But what goes in the middle?
![Page 5: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/5.jpg)
The Problem
● Collect Data
● Convert to MapReduce
● Execute
● Debug
● Tune
● … Profit
Get someone else to do it!
![Page 6: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/6.jpg)
How long will it take?
● Performance
Of what? Surely not the computer.
![Page 7: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/7.jpg)
Computational Performance
Time (faster considered better)
Make this algorithm as fast as you can.
![Page 8: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/8.jpg)
Analytics Performance
But what about this bit?
Or this bit?
Analytics is slightly different.
![Page 9: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/9.jpg)
Analytics Performance
But what about this bit?
Or this bit?
That the human understands the problem does not mean that the computer understands the problem.
![Page 10: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/10.jpg)
Analytics Performance
But what about this bit?
Or this bit?
The computer knowing the answer is not the same as the human understanding the answer.
![Page 11: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/11.jpg)
Common MapReduce Challenges
● How do I write a Hadoop job?
● Did my job work?
● If it didn't throw an exception, it worked. Right?
● Did I get the correct answer?
● Are you sure?
● Do you have enough information to prove that?
● … to your accountants or customers?
● What happened? or What do I need to know?
● Please note, this feature is now officially called the “Job Profiler”, not the “What?! Window.”
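To make "Did I get the correct answer?" concrete, here is a minimal in-memory sketch of a map/shuffle/reduce run that records simple counters alongside its output. It is not Hadoop code and not how Karmasphere's Job Profiler works; `run_job`, `word_mapper`, `sum_reducer`, and the counter names are hypothetical names chosen for illustration.

```python
from collections import defaultdict

def run_job(records, mapper, reducer):
    """Toy map/shuffle/reduce over an in-memory list (illustration only)."""
    counters = {"map_in": 0, "map_out": 0, "reduce_out": 0}
    groups = defaultdict(list)
    for record in records:
        counters["map_in"] += 1
        for key, value in mapper(record):
            counters["map_out"] += 1
            groups[key].append(value)      # "shuffle": collect values per key
    output = {}
    for key in sorted(groups):
        output[key] = reducer(key, groups[key])
        counters["reduce_out"] += 1
    return output, counters

def word_mapper(line):
    # Emit (word, 1) for every whitespace-separated word.
    for word in line.split():
        yield word, 1

def sum_reducer(key, values):
    return sum(values)

lines = ["to be or not to be", "to do"]
counts, counters = run_job(lines, word_mapper, sum_reducer)

# A cheap sanity check: every emitted (word, 1) pair must be accounted
# for in the final totals. "No exception" alone would not tell us this.
assert sum(counts.values()) == counters["map_out"]
print(counts, counters)
```

The point of the counters is that a job finishing without an exception says nothing about whether records were silently dropped. An invariant such as "total of the output equals the number of map outputs" is the kind of evidence the slide is asking for.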
![Page 12: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/12.jpg)
Karmasphere Studio
![Page 13: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/13.jpg)
Karmasphere Studio
![Page 14: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/14.jpg)
Common Analytical Tasks
So common, in fact, that ...
group
sort
aggregate
intersection
unique
limit
scan
join
function
hash
materialize
condition
set operations
store
cat
index
![Page 15: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/15.jpg)
High Level Languages
Hive
Pig
Cascading
![Page 16: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/16.jpg)
Cascading
A workflow-based language
Perfect for dylsexics like me.
![Page 17: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/17.jpg)
Pig
An imperative scripting language
data = LOAD '$input' AS (query:CHARARRAY, count:INT);
queries_group = GROUP data BY query PARALLEL $reducers;
queries_sum = FOREACH queries_group GENERATE group AS query, SUM(data.count) AS count;
queries_ordered = ORDER queries_sum BY count DESC PARALLEL $reducers;
Simple and accessible to all.
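For readers who do not know Pig, the script above groups rows by query, sums the counts in each group, and orders the totals from largest to smallest. A rough Python sketch of the same computation on an in-memory list follows. The function name `top_queries` and the sample rows are hypothetical, and the tie-break on query text is an assumption added only to make the output deterministic.

```python
from collections import defaultdict

def top_queries(rows):
    """Sum counts per query, then order by total count, descending."""
    totals = defaultdict(int)
    for query, count in rows:
        # Corresponds to GROUP data BY query plus SUM(data.count).
        totals[query] += count
    # Corresponds to ORDER queries_sum BY count DESC.
    # Ties are broken by query text (an assumption, not in the Pig script).
    return sorted(totals.items(), key=lambda kv: (-kv[1], kv[0]))

# Hypothetical sample input: (query, count) pairs.
sample = [("hadoop", 3), ("pig", 1), ("hadoop", 2), ("hive", 4)]
result = top_queries(sample)
print(result)
```

The `PARALLEL $reducers` clauses have no counterpart here. They only tell Pig how many reduce tasks to use and do not change the result.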
![Page 18: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/18.jpg)
Hive
An SQL-like language
FROM (
  FROM (
    FROM src src1
    SELECT src1.key AS c1, src1.value AS c2
    WHERE src1.key > 10 AND src1.key < 20
  ) a
  FULL OUTER JOIN (
    FROM src src2
    SELECT src2.key AS c3, src2.value AS c4
    WHERE src2.key > 15 AND src2.key < 25
  ) b
  ON (a.c1 = b.c3)
  SELECT a.c1 AS c1, a.c2 AS c2, b.c3 AS c3, b.c4 AS c4
) c
SELECT c.c1, c.c2, c.c3, c.c4
I can parse that in my head, honest.
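For anyone who cannot parse it in their head, the query above selects two overlapping key ranges from the same table (keys between 10 and 20, and keys between 15 and 25, both exclusive) and combines them with a full outer join on the key. The Python sketch below follows the same logic on an in-memory list. The function name, the sample rows, and the use of `None` to stand in for SQL NULL are assumptions for illustration.

```python
def full_outer_join_ranges(src):
    """Emulate the nested Hive query on a list of (key, value) rows."""
    a = [(k, v) for k, v in src if 10 < k < 20]   # subquery a: c1, c2
    b = [(k, v) for k, v in src if 15 < k < 25]   # subquery b: c3, c4
    rows = []
    matched_b = set()
    for c1, c2 in a:
        hit = False
        for i, (c3, c4) in enumerate(b):
            if c1 == c3:                          # ON (a.c1 = b.c3)
                rows.append((c1, c2, c3, c4))
                matched_b.add(i)
                hit = True
        if not hit:
            rows.append((c1, c2, None, None))     # left row with no match
    for i, (c3, c4) in enumerate(b):
        if i not in matched_b:
            rows.append((None, None, c3, c4))     # right row with no match
    return rows

# Hypothetical sample table: key 12 is only in a, 22 only in b, 17 in both.
src = [(12, "val_12"), (17, "val_17"), (22, "val_22"), (30, "val_30")]
joined = full_outer_join_ranges(src)
for row in joined:
    print(row)
```

Row order is an artifact of this sketch. Hive makes no ordering guarantee for the query as written.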
![Page 19: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/19.jpg)
Karmasphere Analyst
FROM (
  FROM src SELECT src.key, src.value WHERE src.key < 100
  UNION ALL
  FROM src SELECT src.* WHERE src.key > 100
) unioninput
INSERT OVERWRITE DIRECTORY 'union.out' SELECT unioninput.*
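As a rough guide to what the query above computes, the sketch below applies the same two filters and concatenates the results, as `UNION ALL` does. The function name and the sample rows are hypothetical, and the sketch returns a list rather than writing to the `union.out` directory.

```python
def union_input(src):
    """Emulate the UNION ALL query on a list of (key, value) rows."""
    below = [(k, v) for k, v in src if k < 100]   # first branch
    above = [(k, v) for k, v in src if k > 100]   # second branch
    return below + above                          # UNION ALL keeps duplicates

# Hypothetical sample table, including a duplicate row and a key of exactly 100.
src = [(99, "a"), (100, "b"), (101, "c"), (99, "a")]
out = union_input(src)
print(out)
```

Two details are easy to miss when reading the SQL: a row whose key is exactly 100 matches neither branch and is dropped, and duplicate rows survive because `UNION ALL` does not deduplicate.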
![Page 20: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/20.jpg)
Karmasphere Analyst
![Page 21: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/21.jpg)
Conclusions
How long does it take to get your answers?
![Page 22: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/22.jpg)
How to get involved
● Getting started as a Hadoop Java Developer?
● Download Karmasphere Studio FREE!
● Deploying Hadoop jobs in production?
● Use Karmasphere Studio Professional Edition.
● Want to use high level languages like SQL?
● Talk to us about Karmasphere Analyst.
● Join the beta programme!
![Page 23: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/23.jpg)
Questions, Errata, Heckling
● Some questions suggested by others:
● Where can I download Karmasphere Studio Community Edition?
– Visit http://www.karmasphere.com/ for free downloads and great justice.
● What about building production-ready jobs for enterprise deployment?
– Ask us about introductory offers on Karmasphere Studio Professional Edition.
● How can I use graphical SQL on Hadoop?
– Talk to us about the Karmasphere Analyst Sekrit(!) Beta.
● Some questions I thought up:
● How do I (something awfully complicated)?
– Please talk to us, we enjoy the challenges.
● Is there any tea on this spaceship?
● And some from the audience, please!
● I get paid by the answer. I need questions.
![Page 24: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/24.jpg)
![Page 25: Karmasphere hadoop-productivity-tools](https://reader033.fdocuments.us/reader033/viewer/2022042813/54be2e734a7959ee018b46c3/html5/thumbnails/25.jpg)
BAY AREA HADOOP USER GROUP
KARMASPHERE® PRODUCTION
KARMASPHERE STUDIO
PRODUCTIVITY SUITE FOR DEVELOPERS AND ANALYSTS
SHEVEK, CTO, KARMASPHERE
MARTIN HALL, CEO, KARMASPHERE
DARREN ARONOFSKY, CLAUDE BESSON, METALLICA, ENNIO MORRICONE, JK ROWLING
JACQUELINE DURRAN, JIM HENSON, INDUSTRIAL LIGHT AND MAGIC