Hadoop Hive Tutorial | Hive Fundamentals | Hive Architecture

Slide 1 © 2015 BlueCamphor Technologies (P) Ltd. www.skillspeed.com Big Data Analytics using Hive

Scope of PPT – BIG Data Analytics via Hive

ᗍ Introduction to Big Data and Hadoop

ᗍ Understanding Hive and its Concepts

ᗍ Hive Architecture, Hive Meta Store and Hive Use-Cases

ᗍ BIG Data Analytics via Hive

ᗍ BIG Data & Hadoop Job Trends

ᗍ Webinar Session by Skillspeed

Get Started with BIG Data & Hadoop

Big Data and its Challenges

Big data is the term for a collection of data sets so large and complex that they become difficult to process using on-hand database management tools or traditional data processing applications.

Systems and enterprises generate huge amounts of data, ranging from terabytes to petabytes.

Data at this scale is very difficult to manage.

Who Generates Big Data?

Have you ever wondered how Google, Facebook or LinkedIn manage to store and use such huge volumes of data?

Today, managing unstructured and voluminous data is a big problem.

Hadoop can be used to process and analyze large data sets.

Before that, let's understand what Hadoop is.

Hadoop and its Characteristics

Apache Hadoop is a framework that allows the distributed processing of large data sets across clusters of commodity computers using a simple programming model.

It is an open-source data management technology with scale-out storage and distributed processing.

Hadoop Characteristics

Flexible

Reliable

Economical

Scalable

Hadoop Ecosystem

Flume: import or export of unstructured or semi-structured data

Sqoop: import or export of structured data

Apache Oozie (Workflow)

Pig Latin (Data Analysis)

Hive (DW System)

MapReduce Framework

HBase

Other YARN Frameworks (MPI, Giraph)

YARN (Cluster Resource Management)

HDFS (Hadoop Distributed File System)

Hive Origination

ᗍ Hive originated as an internal project at Facebook

ᗍ Later it was adopted by Apache as an open-source project

ᗍ Facebook deals with massive amounts of data (petabyte scale) and needs to run more than 75k ad-hoc queries on it

ᗍ Because the data is collected from multiple servers and is diverse in nature, no RDBMS was a suitable solution

ᗍ MapReduce could be a natural choice, but it had its own limitations

What is Hive?

ᗍ It is a query engine wrapper built on top of MapReduce

ᗍ It is treated as the data warehousing tool of the Hadoop ecosystem

ᗍ It is used for data analysis

ᗍ It is primarily targeted at users with an SQL background

ᗍ It provides HiveQL, which is very similar to SQL

ᗍ It is used for managing and querying structured data

ᗍ Hadoop complexity is hidden from end users

ᗍ Java and Hadoop API knowledge is optional for core users

ᗍ It was developed by Facebook and contributed to the community
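
To make the "wrapper on top of MapReduce" idea concrete, the sketch below shows roughly how a simple HiveQL-style aggregation could be split into map, shuffle and reduce steps. It is plain Python, not Hive's actual compiler output; the table name page_views, its columns and the sample rows are all hypothetical.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for a Hive table
# page_views(user, country, seconds).
ROWS = [
    ("alice", "IN", 30),
    ("bob", "US", 12),
    ("carol", "IN", 45),
    ("dave", "US", 8),
    ("erin", "IN", 5),
]

# Query being illustrated (HiveQL, shown for reference only):
#   SELECT country, COUNT(*), SUM(seconds)
#   FROM page_views
#   WHERE seconds >= 10
#   GROUP BY country;


def map_phase(rows):
    """Apply the WHERE filter and emit (group key, value) pairs."""
    for _user, country, seconds in rows:
        if seconds >= 10:
            yield country, seconds


def shuffle(pairs):
    """Group values by key, as the framework does between map and reduce."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Compute COUNT(*) and SUM(seconds) for each group."""
    return {key: (len(values), sum(values)) for key, values in groups.items()}


def run_query(rows):
    """Chain the three steps the way a single MapReduce job would."""
    return reduce_phase(shuffle(map_phase(rows)))


if __name__ == "__main__":
    for country, (count, total) in sorted(run_query(ROWS).items()):
        print(country, count, total)
```

The user only writes the query in the comment; the filter, grouping and aggregation code is the kind of work Hive generates on the user's behalf, which is what "Hadoop complexity is hidden from end users" refers to.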

Hive Use Cases

Ad-hoc analysis of underlying data

Hypothesis testing of the underlying data

Big Data testing of huge data sets

Analysis of the processed data

Hive Components

Driver

Shell

Metastore

Compiler

Execution Engine

Hive Architecture

Client interfaces: JDBC/ODBC; Browse, Query, DDL

Metastore, Thrift API

Hive QL: Parser, Planner, Optimizer, Execution

User-defined MapReduce Scripts

File Formats: TextFile, SequenceFile, RCFile

UDF/UDAF: Substr, Sum, Average

SerDe: CSV, Thrift, Regex

MapReduce, HDFS
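
The roles of the SerDe, UDF and UDAF boxes in the architecture can be sketched in a few lines of plain Python. This is only an illustration of the division of labour, not Hive's Java interfaces; the sample records, column names and function names are hypothetical.

```python
import csv
import io

# Hypothetical raw records, as they might sit in a TextFile on HDFS.
RAW_TEXT = "1001,Bangalore,250\n1002,Boston,100\n1003,Bangalore,50\n"


def csv_deserialize(text, columns):
    """SerDe role: turn raw delimited records into rows with named columns."""
    reader = csv.reader(io.StringIO(text))
    return [dict(zip(columns, record)) for record in reader]


def substr(value, start, length):
    """UDF role: one value in, one value out (1-based start, as in SQL)."""
    return value[start - 1:start - 1 + length]


def average(values):
    """UDAF role: many values in, one aggregated value out."""
    values = list(values)
    return sum(values) / len(values)


rows = csv_deserialize(RAW_TEXT, ["order_id", "city", "amount"])
city_codes = [substr(row["city"], 1, 3) for row in rows]
total_amount = sum(int(row["amount"]) for row in rows)
average_amount = average(int(row["amount"]) for row in rows)

if __name__ == "__main__":
    print(city_codes, total_amount, average_amount)
```

The SerDe step runs once per record when data is read, a UDF such as Substr runs once per row, and a UDAF such as Sum or Average combines values across many rows.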

Hive Meta Store

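Embedded Metastore: the Driver and the Metastore run in the same Hive service JVM, backed by Derby

Local Metastore: each Hive service JVM runs its own Driver and Metastore, backed by a shared MySQL database

Remote Metastore: Drivers in the Hive service JVMs connect to separate Metastore server JVMs, backed by MySQL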
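
As a rough illustration of how the three metastore setups differ in configuration, the sketch below classifies a set of Hive properties. The property names javax.jdo.option.ConnectionURL and hive.metastore.uris are real Hive settings; the host names, the sample values and the metastore_mode helper are assumptions made for this example, and the classification rule is a simplification.

```python
# Default used by Hive when no connection URL is configured (embedded Derby).
DEFAULT_DERBY_URL = "jdbc:derby:;databaseName=metastore_db;create=true"

# Hypothetical example configurations; host names are placeholders.
EMBEDDED = {"javax.jdo.option.ConnectionURL": DEFAULT_DERBY_URL}
LOCAL = {"javax.jdo.option.ConnectionURL": "jdbc:mysql://db.example.com/metastore"}
REMOTE = {"hive.metastore.uris": "thrift://metastore.example.com:9083"}


def metastore_mode(conf):
    """Guess the metastore setup from a dict of Hive properties (simplified)."""
    # A Thrift URI means the driver talks to a separate metastore server.
    if conf.get("hive.metastore.uris"):
        return "remote"
    url = conf.get("javax.jdo.option.ConnectionURL", DEFAULT_DERBY_URL)
    # In-process Derby database: the embedded setup.
    if url.startswith("jdbc:derby:") and not url.startswith("jdbc:derby://"):
        return "embedded"
    # Metastore code runs in the Hive JVM but stores data in an external database.
    return "local"


if __name__ == "__main__":
    for name, conf in [("EMBEDDED", EMBEDDED), ("LOCAL", LOCAL), ("REMOTE", REMOTE)]:
        print(name, "->", metastore_mode(conf))
```

The embedded setup suits single-user experiments because the Derby database is used by one process at a time, while the local and remote setups let several drivers share one metastore database.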

Job Trends – Hadoop

Why SkillSpeed?

Course curriculum from industry experts

Instructor-led live virtual sessions

Lifetime access to course content via LMS

100% placement assistance

24x7 support

Course Topics

Module 1: Introduction to Big Data and Hadoop

Module 2: HDFS Internals, Hadoop Configurations and Data Loading

Module 3: Introduction to Map Reduce

Module 4: Advanced Map Reduce Concepts

Module 5: Introduction to Pig

Module 6: Advanced Pig and Introduction to Hive

Module 7: Advanced Hive Concepts

Module 8: Extending Hive and HBase Introduction

Module 9: Advanced HBase and Oozie Introduction

Module 10: Project Set-up Discussion

Corporate Partners

Lines open 24/7

To know more about the course, please contact:

IND +91-90660-20904 USA 1866-607-6547 (Toll Free)

Or reach us at

[email protected]

Contact Us
