Big Data Analytics Hands-on Sessions

Posted on 07-Apr-2017

BIG DATA Hands-on!

Pravin Hanchinal
pravinhanchinal.com

Typical Overview

Ways to Run Hadoop

Files to Be Configured

Typical Use Case

During Hadoop Setup

Multi-Node Clustering

http://www.michael-noll.com/tutorials/running-hadoop-on-ubuntu-linux-multi-node-cluster/

The Feel:

Hortonworks = Hadoop + web UI
Cloudera = Hadoop + GUI
Both use CentOS (Linux)
Everything comes preconfigured

Big Data Stacks

Why Cloudera?

Enough! Let's see it in action
Source: http://www.crackerjackann.net/blog/is-a-lack-of-leads-frustrating-you

Prerequisites

RAM: 4 GB (min), 8 GB recommended
Free space: > 10 GB
VirtualBox on Ubuntu / Windows
Big Data Stacks

The Tools

Flume: streams large volumes of data into HDFS
Sqoop: transfers data between Hadoop and relational database servers (MySQL, Oracle <--> HDFS)
More: http://pravinhanchinal.com/what-is-for-what-hadoop-tools

YARN: cluster resource management tool

Multiple access, single data set

Go for Certifications

Got questions?

Text/WhatsApp on 974-086-1099

What Next?

Dive in and Explore

Resources

https://ayende.com/blog/4435/map-reduce-a-visual-explanation

Multi-node on Amazon: https://dzone.com/articles/how-set-multi-node-hadoop

Run Sample MapReduce Examples:

MapReduce examples: http://www.informit.com/articles/article.aspx?p=2190194&seqNum=3

https://hortonworks.com/hadoop-tutorial/how-to-process-data-with-apache-pig/
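
Before running the sample jobs, it may help to see the map, shuffle, and reduce steps in miniature. The sketch below is a plain-Python word count that runs in memory on one machine. It does not use Hadoop or any Hadoop API, and the function names (map_phase, shuffle_phase, reduce_phase, word_count) are made up for illustration.

```python
from collections import defaultdict


def map_phase(lines):
    """Map: emit a (word, 1) pair for every word in every input line."""
    for line in lines:
        for word in line.lower().split():
            yield word, 1


def shuffle_phase(pairs):
    """Shuffle: group all emitted values under their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's list of values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(lines):
    """Chain the three phases (hypothetical helper, not a Hadoop call)."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["big data hands on", "big data stacks"]
    print(word_count(sample))
```

On a real cluster the same three phases are spread across many nodes, with the framework handling the shuffle; here everything happens in one process so the data flow is easy to follow.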