Big Data & HadoopBy Mr.Nataraj
UNITS OF DATA

The smallest unit is the bit.
1 byte = 8 bits
1 KB (Kilobyte) = 1024 bytes = 1024 * 8 bits
1 MB (Megabyte) = 1024 KB = (1024)^2 * 8 bits
1 GB (Gigabyte) = 1024 MB = (1024)^3 * 8 bits
1 TB (Terabyte) = 1024 GB = (1024)^4 * 8 bits
1 PB (Petabyte) = 1024 TB = (1024)^5 * 8 bits
1 EB (Exabyte) = 1024 PB = (1024)^6 * 8 bits
1 ZB (Zettabyte) = 1024 EB = (1024)^7 * 8 bits
1 YB (Yottabyte) = 1024 ZB = (1024)^8 * 8 bits
1 XB (Xenottabyte, unofficial) = 1024 YB = (1024)^9 * 8 bits
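As a quick illustration (a minimal sketch, not from the original slides), the powers-of-1024 table above can be reproduced in Java, the language Hadoop itself is written in:

import java.math.BigInteger;

public class DataUnits {
    public static void main(String[] args) {
        String[] units = {"KB", "MB", "GB", "TB", "PB", "EB", "ZB", "YB"};
        BigInteger bytes = BigInteger.ONE;
        for (String unit : units) {
            // each unit is 1024 times the previous one
            bytes = bytes.multiply(BigInteger.valueOf(1024));
            System.out.println("1 " + unit + " = " + bytes + " bytes = "
                    + bytes.multiply(BigInteger.valueOf(8)) + " bits");
        }
    }
}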
HOW BIG ARE THOSE NUMBERS

1 byte = a single character
1 KB = a very short story
1 MB = a small novel (6 seconds of TV-quality video)
1 GB = a pickup truck filled with paper
1 TB = 50,000 trees made into paper
2 PB = all US academic research libraries
5 EB = all words ever spoken by human beings
WHAT IS BIG DATA
SOME INTERESTING FACTS
• Google: 2,000,000 queries per second
• Facebook: 34,000 likes per minute
• Online shopping worth USD 300,000 per minute
• 100,000 tweets on Twitter per minute
• 600 new videos uploaded to YouTube per minute
• Barack Obama used Big Data to win an election
• Driverless cars use Big Data processing to drive vehicles
• AT&T transfers about 30 petabytes of data through its networks each day
• Google processed about 24 petabytes of data per day in 2009
• The 2009 movie Avatar is reported to have taken over 1 petabyte of local storage at Weta Digital for the rendering of the 3D CGI effects
As of January 2013, Facebook users had uploaded over 240 billion photos, with 350 million new photos every day. For each uploaded photo, Facebook generates and stores four images of different sizes, which translated to a total of 960 billion images and an estimated 357 petabytes of storage
PROCESSING CAPABILITY
Google processes 20 PB a day
Facebook: 2.5 PB of user data + 15 TB/day
eBay: 6.5 PB of data + 50 TB/day
Evolution of Hadoop
• Doug Cutting, working on the Lucene project (a search engine to index and search documents), ran into problems of storage and computation and was looking for distributed processing.
• Google published a paper on GFS (Google File System).
• Doug Cutting & Michael Cafarella implemented the ideas from GFS and came out with Hadoop.
WHAT IS HADOOP
• A framework written in Java for running applications on large clusters of commodity hardware.
• Mainly contains 2 parts:
  – HDFS for storing data
  – Map-Reduce for processing data
• Maintains fault tolerance using a replication factor.
• employee.txt (eno,ename,empAge,empSal,empDes)
• 101,prasad,20,1000,lead

Assume you have billions of such records and you would like to find all the employees above 60 years of age. How would you program that traditionally?
At 10 GB ≈ 10 minutes, 1 TB ≈ 1,000 minutes ≈ 16 hours on a single machine.
Google processes 20 PB of data per day. 20 PB ≈ 20,971,520 GB, so at that sequential rate it would take about 350,000 hours, roughly 40 years, on one machine. A sketch of the distributed alternative follows below.
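As a sketch of how this filter would look in Hadoop rather than as a traditional sequential program: the mapper below (class name AgeFilterMapper is illustrative, not from the slides) parses each line of employee.txt and emits only employees older than 60. Many copies of this mapper run in parallel, each on the node holding its block of the data.

import java.io.IOException;
import org.apache.hadoop.io.LongWritable;
import org.apache.hadoop.io.NullWritable;
import org.apache.hadoop.io.Text;
import org.apache.hadoop.mapreduce.Mapper;

public class AgeFilterMapper extends Mapper<LongWritable, Text, NullWritable, Text> {
    @Override
    protected void map(LongWritable key, Text value, Context context)
            throws IOException, InterruptedException {
        // Each line: eno,ename,empAge,empSal,empDes
        String[] fields = value.toString().split(",");
        int age = Integer.parseInt(fields[2].trim());
        if (age > 60) {
            context.write(NullWritable.get(), value); // emit matching records only
        }
    }
}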
INSPIRATION FOR HADOOP
To store huge data (unlimited)
To process huge data
Basic Terminology
• Node: a single computer with its own processor and memory.
• Cluster: a combination of nodes acting as a single unit.
• Commodity Hardware: cheap, not highly reliable hardware.
• Replication Factor: how many times data gets duplicated and saved in more than one place.
• Data Local Optimization: data will be processed locally, on the node where it is stored.
• Block: a part of the data. Example: 1 file of 200 MB is stored as blocks (50 MB + 50 MB + 50 MB + 50 MB) spread across node1, node2, node3, ...
• Block size: the size of data that can be stored as a single unit.
• Apache Hadoop default: 64 MB (configurable)
  – 1 GB in Apache Hadoop = 16 blocks
  – a 65 MB file = 64 MB + 1 MB (2 blocks)
• Replication: duplicating the data; the default replication factor is 3.
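The block arithmetic above is just a ceiling division; a minimal Java sketch (the numbers follow the slide's examples):

public class BlockCount {
    // number of blocks needed to hold a file at a given block size
    static long blocks(long fileSizeBytes, long blockSizeBytes) {
        return (fileSizeBytes + blockSizeBytes - 1) / blockSizeBytes; // ceiling division
    }

    public static void main(String[] args) {
        long MB = 1024L * 1024L;
        long blockSize = 64 * MB;                         // Apache Hadoop default (configurable)
        System.out.println(blocks(1024 * MB, blockSize)); // 1 GB  -> 16 blocks
        System.out.println(blocks(65 * MB, blockSize));   // 65 MB -> 2 blocks (64 MB + 1 MB)
    }
}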
NODE CLUSTER
SCALING
• Vertical Scaling: adding more powerful hardware to an existing system. Will scale only up to a certain limit.
• Horizontal Scaling: adding a completely new node to an existing cluster. Will scale out to many nodes.
3 V's of Big Data
• Volume: the amount of data generated
• Variety: structured data (e.g., database table data) and unstructured data
• Velocity: the frequency at which data is generated
HADOOP VS RDBMS
1. Hadoop believes in scale-out instead of scale-up: when needed, buy more oxen, don't grow your ox more powerful.
2. Hadoop works on structured as well as unstructured data; an RDBMS only works with structured data. (However, nowadays many NoSQL databases have come out in the market, like MongoDB and Couchbase.)
3. Hadoop believes in key-value pairs rather than data in columns, as sketched below.
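To make point 3 concrete, a minimal sketch (illustrative, not from the slides) of the same employee record seen as an RDBMS-style row of columns versus a Hadoop-style key-value pair:

import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

public class KeyValueExample {
    public static void main(String[] args) {
        // RDBMS view: fixed columns (eno, ename, empAge, empSal, empDes)
        String row = "101,prasad,20,1000,lead";

        // Hadoop view: an opaque key-value pair, e.g. key = eno, value = the rest
        Map.Entry<String, String> kv = new SimpleEntry<>("101", "prasad,20,1000,lead");

        System.out.println("row:       " + row);
        System.out.println("key-value: " + kv.getKey() + " -> " + kv.getValue());
    }
}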
HADOOP ALTERNATIVES

No doubt Hadoop is a framework for processing big data, but it is not the only framework to do so. Below are a few alternatives:
Apache Spark
GraphLab
HPCC Systems (High Performance Computing Cluster)
Dryad
Stratosphere
Storm
R3
Disco
Phoenix
Plasma
DOWNLOADING HADOOP

You can download Hadoop from:
http://hadoop.apache.org/releases.html
http://apache.bytenet.in/hadoop/common/
· 18 November, 2014: Release 2.6.0 available
· 27 June, 2014: Release 0.23.11 available
· 1 Aug, 2013: Release 1.2.1 (stable) available
HADOOP CORE COMPONENTS

HADOOP does two things:
1. Storing huge data (HDFS)
2. Processing huge data (Map-Reduce)

Hadoop Daemons
1. Name Node
2. Secondary Name Node
3. Job Tracker
4. Task Tracker
5. Data Node
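A quick way to verify which daemons are running on a node is the JDK's jps tool, which lists running Java processes. On a Hadoop 1.x node with all five daemons started, the output looks something like the following (process IDs are illustrative):

$ jps
2481 NameNode
2601 DataNode
2725 SecondaryNameNode
2810 JobTracker
2934 TaskTracker
3001 Jps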
Modes in Hadoop
Standalone Mode
Pseudo Distributed Mode
Fully Distributed Mode
Standalone Mode
It is the default mode
1 node
No separate processes (daemons) will be running
Everything runs in a single JVM
Used for small development, testing, and debugging
Pseudo Distributed Mode
1. A single node, but a cluster will be simulated
2. Daemons will run as separate processes in separate JVMs
3. Used for development and debugging
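As a sketch, pseudo-distributed mode is typically configured by pointing Hadoop at a local HDFS and Job Tracker and reducing replication to 1, since there is only one node (Hadoop 1.x property names; the host/port values shown are the conventional defaults, adjust as needed):

conf/core-site.xml:
<configuration>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:9000</value>
  </property>
</configuration>

conf/hdfs-site.xml:
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

conf/mapred-site.xml:
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:9001</value>
  </property>
</configuration>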
Fully Distributed Mode
1. Multiple nodes
2. Hadoop will run on a cluster of machines/nodes
3. Used in production environments
Hadoop Architecture
Ecosystem Components
Hive
Pig
Sqoop
Avro
Flume
Oozie
HBase
Cassandra
Job Tracker
HDFS write