Big data: Loading your data with Flume and Sqoop

Post on 08-May-2015

While studying the Hortonworks stack, I created this 10-minute presentation. http://hortonworks.com

Transcript of Big data: Loading your data with Flume and Sqoop

Loading data in Hadoop 2 with SQOOP and Flume

Christophe Marchal | Software Architect

Problem to solve

Hortonworks stack

Batch Loading vs Stream Loading

SQOOP

HCatalog

SQOOP 1: Import

SQOOP 1: Export
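
As a rough illustration of what an export looks like on the command line, here is a minimal sketch. It is not taken from the slides: the JDBC URL, user, table name and HDFS directory are all hypothetical, and the `SQOOP` variable is only a convenience added here so the arguments can be printed without a Sqoop installation.

```shell
# Sketch of a Sqoop 1 export: push files from an HDFS directory into an
# existing relational table. Host, database, user, table and path are
# hypothetical placeholders.
# Set SQOOP=echo to print the arguments instead of launching the job.
export_order_summary() {
  "${SQOOP:-sqoop}" export \
    --connect jdbc:mysql://db.example.com/shop \
    --username sqoop_user -P \
    --table order_summary \
    --export-dir /user/hadoop/order_summary \
    --input-fields-terminated-by ',' \
    --num-mappers 4
}
```

Calling `export_order_summary` on a machine with Sqoop installed would prompt for the password (`-P`) and start the job; the target table is expected to exist already.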

SQOOP 2

Flume

[Diagram: Web Server -> Agent (Source -> Channel -> Sink) -> HDFS]

[Diagrams: Web Servers and Agents, each Agent made of a Source, a Channel and a Sink]

Multi agent flow

Consolidation flow

Flume vs SQOOP

Flume:

● distributed

● reliable (transaction)

● available (backup routes)

● collecting data

● aggregating data

SQOOP:

● Data imports

● Parallelizes data transfer

● Copies data quickly

Flume example
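
The slide content is not part of this transcript, so the following is only a minimal sketch of what a single-agent setup (Source -> Channel -> Sink) might look like. The agent name `agent1`, the component names `weblog`, `mem` and `tohdfs`, the log path and the NameNode address are assumptions chosen for illustration.

```shell
# Write a minimal Flume agent definition: an exec source tailing a web
# server log, a memory channel, and an HDFS sink.
# All names, paths and hosts below are hypothetical placeholders.
cat > weblog-agent.conf <<'EOF'
agent1.sources = weblog
agent1.channels = mem
agent1.sinks = tohdfs

# Source: follow the web server access log
agent1.sources.weblog.type = exec
agent1.sources.weblog.command = tail -F /var/log/httpd/access_log
agent1.sources.weblog.channels = mem

# Channel: in-memory buffer between source and sink
agent1.channels.mem.type = memory
agent1.channels.mem.capacity = 10000
agent1.channels.mem.transactionCapacity = 1000

# Sink: write events to HDFS, one directory per day
agent1.sinks.tohdfs.type = hdfs
agent1.sinks.tohdfs.hdfs.path = hdfs://namenode:8020/flume/weblogs/%Y-%m-%d
agent1.sinks.tohdfs.hdfs.fileType = DataStream
agent1.sinks.tohdfs.hdfs.useLocalTimeStamp = true
agent1.sinks.tohdfs.channel = mem
EOF

# Where Flume is installed, the agent could then be started with:
# flume-ng agent --conf conf --conf-file weblog-agent.conf --name agent1
```

A memory channel is the simplest choice but loses buffered events if the agent stops; a file channel trades some speed for durability.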

SQOOP: import HDFS
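
The original slides are not reproduced in this transcript; as a stand-in, here is a minimal sketch of importing one table into an HDFS directory. The JDBC URL, user, table, split column and target directory are hypothetical, and the `SQOOP` variable is an addition made here so the arguments can be inspected without running a job.

```shell
# Sketch of a Sqoop 1 import into HDFS. The table is split on a numeric
# column so that several mappers can copy it in parallel.
# Host, database, user, table, column and path are hypothetical.
# Set SQOOP=echo to print the arguments instead of launching the job.
import_orders_to_hdfs() {
  "${SQOOP:-sqoop}" import \
    --connect jdbc:mysql://db.example.com/shop \
    --username sqoop_user -P \
    --table orders \
    --split-by id \
    --num-mappers 4 \
    --target-dir /user/hadoop/orders \
    --fields-terminated-by ','
}
```

With four mappers, Sqoop would write four part files under the target directory, one per slice of the `id` range.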

SQOOP: import Hive
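
Again as a sketch rather than the slide's own content, an import that loads the data straight into a Hive table could look like this. The connection details and table names are hypothetical, and the `SQOOP` variable exists only to allow a dry run.

```shell
# Sketch of a Sqoop 1 import that creates and loads a Hive table.
# Host, database, user and table names are hypothetical placeholders.
# Set SQOOP=echo to print the arguments instead of launching the job.
import_orders_to_hive() {
  "${SQOOP:-sqoop}" import \
    --connect jdbc:mysql://db.example.com/shop \
    --username sqoop_user -P \
    --table orders \
    --hive-import \
    --create-hive-table \
    --hive-table orders \
    --num-mappers 1
}
```

With `--create-hive-table` the job fails if the Hive table already exists; dropping that flag, or using `--hive-overwrite`, would be the usual alternatives for repeated loads.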

Thanks

Christophe Marchal | Software Architect @toff63