Install Sqoop on Amazon EMR (Elastic Map Reduce)

Installing Sqoop on AWS Elastic Map Reduce. Rohit Ghatol, Director of Engineering @ Synerzip. http://www.linkedin.com/in/rohitghatol, @rohitghatol, http://rohitghatol.com

description

Slides used in the Video "Sqoop on EMR" - https://www.youtube.com/watch?v=3YJwDJOyDE0

Transcript of Install Sqoop on Amazon EMR (Elastic Map Reduce)

Page 1: Install Sqoop on Amazon EMR (Elastic Map Reduce)

Installing Sqoop on AWS Elastic Map Reduce

Rohit Ghatol, Director of Engineering @ Synerzip

http://www.linkedin.com/in/rohitghatol  @rohitghatol  http://rohitghatol.com

Page 2: Install Sqoop on Amazon EMR (Elastic Map Reduce)

Software Stack

Amazon EMR

Apache Sqoop

Page 3: Install Sqoop on Amazon EMR (Elastic Map Reduce)

Step 1 – Set S3 Buckets

S3 Bucket with Sqoop Scripts: synerzip-sqoop-scripts
• install-sqoop.sh
• sqoop-import-all.sh
• mysql-connector-java-5.1.33.tar.gz
• sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz

S3 Bucket with EMR Logs: synerzip-emr-logs
• j-2SL51VFFUEVZT/
  • daemons
  • node
  • steps

S3 Bucket with Sqoop Imported Data: synerzip-imported-data
• User_Profile-12-12-12_10:10:10
  • part-m-00000
  • part-m-00001
  • part-m-00002
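The bucket layout above could also be prepared from a terminal instead of the S3 console. The following is a minimal sketch, assuming the AWS CLI is installed and configured and that the four files sit in the current directory; the `run` helper, the `DRY_RUN` switch, and the `REGION` variable are illustrative names that do not appear in the slides.

```shell
#!/bin/bash
# Sketch: create the three buckets and upload the Sqoop artifacts.
# DRY_RUN=1 (the default here) only prints the commands for review.
REGION="us-west-1"        # assumed from the script-runner path used in the deck
DRY_RUN="${DRY_RUN:-1}"

run() {
  # Print the command in dry-run mode, otherwise execute it.
  if [ "$DRY_RUN" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

setup_buckets() {
  local bucket file
  # One bucket each for scripts, EMR logs, and imported data.
  for bucket in synerzip-sqoop-scripts synerzip-emr-logs synerzip-imported-data; do
    run aws s3 mb "s3://${bucket}" --region "$REGION"
  done
  # Upload the two scripts and the two tarballs into the scripts bucket.
  for file in install-sqoop.sh sqoop-import-all.sh \
              mysql-connector-java-5.1.33.tar.gz \
              sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz; do
    run aws s3 cp "$file" "s3://synerzip-sqoop-scripts/${file}"
  done
}

setup_buckets
```

Running it with `DRY_RUN=0` would execute the commands for real. The EMR logs folder (`j-.../`) and the imported-data folder are written later by EMR and Sqoop, so they are not created here.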

Page 4: Install Sqoop on Amazon EMR (Elastic Map Reduce)

S3 Buckets

Page 5: Install Sqoop on Amazon EMR (Elastic Map Reduce)

install-sqoop.sh

    #!/bin/bash
    cd /home/hadoop

    # Fetch and unpack the Sqoop distribution from the scripts bucket
    hadoop fs -copyToLocal s3://synerzip-sqoop-scripts/sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz
    tar -xzf sqoop-1.4.4.bin__hadoop-2.0.4-alpha.tar.gz

    # Fetch and unpack the MySQL JDBC connector
    hadoop fs -copyToLocal s3://synerzip-sqoop-scripts/mysql-connector-java-5.1.33.tar.gz mysql-connector-java-5.1.33.tar.gz
    tar -xzf mysql-connector-java-5.1.33.tar.gz

    # Put the JDBC driver on Sqoop's classpath
    cp mysql-connector-java-5.1.33/mysql-connector-java-5.1.33-bin.jar sqoop-1.4.4.bin__hadoop-2.0.4-alpha/lib/
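After the install step runs, it can help to confirm that the two things the import depends on are present: the `sqoop` launcher and the MySQL connector jar in `lib/`. The check below is a sketch only; the function name `check_sqoop_install` and the `SQOOP_HOME` variable are hypothetical and not part of the original scripts.

```shell
#!/bin/bash
# Sketch: confirm install-sqoop.sh left the expected files in place.
# SQOOP_HOME defaults to the directory the install script unpacks into.
SQOOP_HOME="${SQOOP_HOME:-/home/hadoop/sqoop-1.4.4.bin__hadoop-2.0.4-alpha}"

check_sqoop_install() {
  local home="$1" missing=0
  # The launcher that the import script calls.
  if [ ! -f "${home}/bin/sqoop" ]; then
    echo "missing: ${home}/bin/sqoop"
    missing=1
  fi
  # The JDBC driver copied into lib/ by the install script.
  if [ ! -f "${home}/lib/mysql-connector-java-5.1.33-bin.jar" ]; then
    echo "missing: ${home}/lib/mysql-connector-java-5.1.33-bin.jar"
    missing=1
  fi
  if [ "$missing" -eq 0 ]; then
    echo "sqoop install looks complete"
  fi
  return "$missing"
}

# Report the result without aborting the calling shell.
check_sqoop_install "$SQOOP_HOME" || true
```

On the master node this could be run over SSH, or appended to the install script so that a failed copy shows up in the step logs.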

Page 6: Install Sqoop on Amazon EMR (Elastic Map Reduce)

sqoop-import-all.sh

    #!/bin/bash
    cd /home/hadoop/sqoop-1.4.4.bin__hadoop-2.0.4-alpha/bin

    # Import the User_Profile table from the MySQL (RDS) database into a
    # timestamped folder in the imported-data bucket
    ./sqoop import \
      --connect jdbc:mysql://db.c5zzejm1gdnx.us-west-1.rds.amazonaws.com/test \
      --username root \
      --password password \
      --table User_Profile \
      --target-dir s3://synerzip-imported-data/User_Profile-`date +"%m-%d-%y_%T"`
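The backtick expression at the end of the import command is what gives each run its own folder, such as `User_Profile-12-12-12_10:10:10` in the imported-data bucket. A small sketch of that naming logic follows; the `target_dir` function is a hypothetical helper introduced for illustration, not something the original script defines.

```shell
#!/bin/bash
# Sketch: how the import script derives a unique --target-dir per run.
target_dir() {
  # $1 = table name
  # $2 = timestamp (defaults to now, in the same format the script uses:
  #      month-day-year_HH:MM:SS)
  local table="$1"
  local stamp="${2:-$(date +"%m-%d-%y_%T")}"
  echo "s3://synerzip-imported-data/${table}-${stamp}"
}

# With a fixed timestamp, the result matches the folder name on the S3 slide.
target_dir User_Profile "12-12-12_10:10:10"

# Without one, the current time is used, so repeated runs do not collide
# unless they start within the same second.
target_dir User_Profile
```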

Page 7: Install Sqoop on Amazon EMR (Elastic Map Reduce)

Step 2 – MySQL Database

Page 8: Install Sqoop on Amazon EMR (Elastic Map Reduce)

User_Profile Table

Page 9: Install Sqoop on Amazon EMR (Elastic Map Reduce)

Step 3 – Start EMR Cluster

Install Sqoop Step
  Jar location: s3://us-west-1.elasticmapreduce/libs/script-runner/script-runner.jar
  Arguments: s3://synerzip-sqoop-scripts/install-sqoop.sh

Import Sqoop Step
  Jar location: s3://us-west-1.elasticmapreduce/libs/script-runner/script-runner.jar
  Arguments: s3://synerzip-sqoop-scripts/sqoop-import-all.sh
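The slides configure these two steps in the EMR console. As an alternative, the same script-runner steps could be added to an already running cluster with the AWS CLI. This is a sketch under assumptions: `CLUSTER_ID` is a placeholder, the step names and the `CANCEL_AND_WAIT` failure action are illustrative choices, and the import script is referenced as `sqoop-import-all.sh`, the file name listed in the scripts bucket.

```shell
#!/bin/bash
# Sketch: add the install and import steps with `aws emr add-steps`.
# DRY_RUN=1 (the default here) only prints the commands for review.
CLUSTER_ID="${CLUSTER_ID:-j-XXXXXXXXXXXXX}"   # placeholder cluster id
DRY_RUN="${DRY_RUN:-1}"
RUNNER="s3://us-west-1.elasticmapreduce/libs/script-runner/script-runner.jar"

add_script_step() {
  # $1 = step name, $2 = S3 path of the shell script for script-runner to execute
  local spec="Type=CUSTOM_JAR,Name=$1,ActionOnFailure=CANCEL_AND_WAIT,Jar=${RUNNER},Args=[$2]"
  if [ "$DRY_RUN" = "1" ]; then
    echo aws emr add-steps --cluster-id "$CLUSTER_ID" --steps "$spec"
  else
    aws emr add-steps --cluster-id "$CLUSTER_ID" --steps "$spec"
  fi
}

# Order matters: Sqoop has to be installed before the import can run.
add_script_step InstallSqoop s3://synerzip-sqoop-scripts/install-sqoop.sh
add_script_step ImportSqoop  s3://synerzip-sqoop-scripts/sqoop-import-all.sh
```

Stopping the queue on failure means a broken install does not go on to run an import that cannot succeed.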

Page 10: Install Sqoop on Amazon EMR (Elastic Map Reduce)

Install Sqoop Step

Jar location - s3://us-west-1.elasticmapreduce/libs/script-runner/script-runner.jar

Page 11: Install Sqoop on Amazon EMR (Elastic Map Reduce)

Import Sqoop

Jar location - s3://us-west-1.elasticmapreduce/libs/script-runner/script-runner.jar

Page 12: Install Sqoop on Amazon EMR (Elastic Map Reduce)

EMR Steps

Page 13: Install Sqoop on Amazon EMR (Elastic Map Reduce)

Step 4 – See Imported Data

Page 14: Install Sqoop on Amazon EMR (Elastic Map Reduce)

part-m-00000
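Besides browsing the part files in the S3 console, a quick sanity check is to copy them locally and count the imported rows. The sketch below assumes Sqoop's default text output (one record per line, with no newlines embedded in field values); `count_rows` is a hypothetical helper, and the local directory name in the usage comment is a placeholder.

```shell
#!/bin/bash
# Sketch: count imported rows in part files copied to a local directory.
count_rows() {
  # $1 = local directory holding part-m-* files
  # Each line of a part file is treated as one imported record.
  cat "$1"/part-m-* | wc -l | tr -d ' '
}

# Possible usage, with the timestamped folder name from an actual run:
#   aws s3 cp --recursive s3://synerzip-imported-data/User_Profile-12-12-12_10:10:10/ ./User_Profile/
#   count_rows ./User_Profile
```

Comparing the result against `SELECT COUNT(*) FROM User_Profile` on the MySQL side would show whether every row arrived.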