Amazon Elastic MapReduce @ Hadoop Conference Japan 2011 Fall

  • Date posted: 30-Jun-2015
  • Category: Technology

Transcript of Amazon Elastic MapReduce @ Hadoop Conference Japan 2011 Fall

  • 1. Elastic MapReduce - Amazon Hadoop - Amazon Data Service Japan, Shinpei Ohtani

  • 2. Twitter: @shot6, Facebook: facebook.com/shot6, Mail: [email protected]
  • 3. Amazon Web Services (AWS); Big Data; Hadoop; Amazon Elastic MapReduce (EMR); EMR; EMR
  • 4. 3 E Amazon.co.jp; Amazon Services; & IT Amazon Web Services
  • 5. Amazon Web Services (AWS): 10, Amazon
  • 6. The Living AWS Cloud: Tools to access services; Cross Service features; High-level building blocks; Low-level building blocks
  • 7. Big data: Big Data
  • 8. like / don't like
  • 9. Hadoop, Big Data 2: Apache Hadoop; HDFS; MapReduce; Cost / TB is a fraction of traditional options; PB
  • 10. Amazon Elastic MapReduce (EMR)
  • 11. Amazon Elastic MapReduce: Hadoop; AWS; MapReduce; AWS; Hadoop; S3; S3
  • 12. Amazon Elastic MapReduce (2): Big Data; Hadoop; Hadoop; Hadoop; AWS
  • 13. EMR Hadoop. With Hadoop 0.18: Pig 0.3, Hive 0.4, Cascading 1.1. With Hadoop 0.20: Pig 0.6, Hive 0.5/0.7, Cascading 1.1
  • 14. EMR
  • 15. EMR, AWS: Amazon EC2 (EMR Master, Core); Amazon S3 (Web, 99.999999999%, EMR); SimpleDB (Amazon NoSQL, EMR)
  • 16. [Diagram: EMR. Labels: Amazon S3 (Input Data, Output Data, Code/Scripts); Amazon SimpleDB (Metadata); Amazon Elastic MapReduce Service; Amazon Elastic MapReduce Hadoop Cluster (Master Node, Core Nodes, Task Nodes, HDFS); MapReduce, HiveQL, Pig Latin, Cascading; BI Apps; JDBC; ODBC]
  • 17. AWS REST API
  • 18. EMR: Job Flow (4): 14 Hours; Job Flow (9): 7 Hours; Job Flow (25): 3 Hours
  • 19. EMR: 9, 25, 9
  • 20. EMR + Spot: EMR; EMR, Spot; Spot
  • 21. + 1: $56; $10 / $0.03; $0.01; $0.005; $0.105; AWS EC2
  • 22. Spot, EC2: EC2; EMR; AWS
  • 23. M1.XLARGE, Amazon EC2: $0.60
  • 24. EMR: Spot (m2.xlarge, 4 + 5). Job Flow without Spot: 4 instances * 14 hrs * $0.50 = $28. Job Flow with Spot: 4 instances * 7 hrs * $0.50 = $14 + 5 instances * 7 hrs * $0.25 = $8.75, Total = $22.75. Time (14 hrs to 7 hrs): 50% less; cost: ~19% less
  • 25. US AWS; Hadoop, Hive 0.7: HAVING, IN; S3
  • 26. EMR / BI; Web
  • 27. EMR
  • 28. Razorfish: 35, 7100, 170 (170); EMR, S3; 100; 28
  • 29. Razorfish - EMR: SAN/30/SQL3; 40,000,000; 2; 48; EMR: EMR/S3/Cascading; 0; 100; 0 (6); 8; ROAS 500%
  • 30. [Diagram: Razorfish. Labels: Aggregate Log File; Export APIs; Ad Serving data; Files; Internet; Client-Provided Data; Data Sources; Presentation Layer; Direct Analytics Processing via EMR; Web Application Layer; Talend Data Flow Manager; Cache; Edge Provisioning; OLAP DB; ODBC; Cloud Storage S3; Elastic MapReduce; HBase/SDB]
  • 31. Sonet: 10GB per day, 3.65TB per year; 15TB; S3, EMR; EMR+S3: 50 (600); 201; 50%
  • 32. Sonet, EMR
  • 33. EMR GUI: BI (MicroStrategy, Pentaho); Datameer, Karmasphere, Quest; Beeswax
  • 34. EMR
  • 35. Q. Hadoop vs EMR: EMR = (EC2/S3) (Hadoop); Hadoop
  • 36. Q. Hadoop: Hadoop; HDFS; HDFS (HDFS); EMR, S3
  • 37. S3: Total Number of Objects Stored in Amazon S3: Q4 2006: 2.9 Billion; Q4 2007: 14 Billion; Q4 2008: 40 Billion; Q4 2009: 102 Billion; Q4 2010: 262 Billion; Q2 2011: 449 Billion. Peak Requests: 290,000+ per second
  • 38. Q. Hadoop: EMR, Hadoop; Hadoop 0.18.3/0.20.2; BootstrapAction; EMR
  • 39. Q. AWS: 200027.6; AWS, EMR; 20; http://aws.amazon.com/jp/contact-us/ec2-request/
  • 40. Beyond Hadoop: Hadoop; Hadoop IN/OUT; Hadoop; Hadoop
  • 41. AWS Big Data Enterprise Stack
  • 42. Batch Tier, Speed Tier: S3, Hadoop, HBase, EMR, SimpleDB, Cassandra, MongoDB, RDBMS
  • 43. Hadoop, EMR, Hadoop, AWS; Hadoop; S3
  • 44. http://aws.amazon.com/jp/
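The Spot comparison on slide 24 is plain arithmetic, so it can be checked with a short script. The sketch below is only an illustration: it takes the instance counts, hours and hourly rates quoted on that slide, and the names `cluster_cost`, `ON_DEMAND_RATE` and `SPOT_RATE` are hypothetical helpers introduced here, not part of any AWS API. Evaluating the products as written gives $14.00 for the four on-demand instances over 7 hours, so the Spot run totals $22.75 and the cost reduction comes to roughly 19%.

```python
def cluster_cost(groups):
    """Return total cost in USD for (instances, hours, hourly_rate) groups."""
    return sum(n * hours * rate for n, hours, rate in groups)


# Illustrative rates taken from slide 24 (m2.xlarge); not current prices.
ON_DEMAND_RATE = 0.50  # USD per instance-hour
SPOT_RATE = 0.25       # USD per instance-hour

# Job flow without Spot: 4 on-demand instances for 14 hours.
baseline_hours = 14
baseline_cost = cluster_cost([(4, baseline_hours, ON_DEMAND_RATE)])

# Job flow with Spot: the same 4 on-demand instances plus 5 Spot
# instances, finishing in 7 hours.
spot_hours = 7
spot_cost = cluster_cost([
    (4, spot_hours, ON_DEMAND_RATE),
    (5, spot_hours, SPOT_RATE),
])

time_savings = 1 - spot_hours / baseline_hours
cost_savings = 1 - spot_cost / baseline_cost

print(f"baseline: ${baseline_cost:.2f} over {baseline_hours} h")
print(f"with Spot: ${spot_cost:.2f} over {spot_hours} h")
print(f"time savings: {time_savings:.0%}, cost savings: {cost_savings:.2%}")
```

Changing the tuples passed to `cluster_cost` lets the same check be repeated for other cluster sizes, such as the 4, 9 and 25 instance job flows on slide 18, once an hourly rate is assumed for them.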