CH7: Databases with AWS - wmich.edu · 7/20/17


CH7: Databases with AWS

By: Dan Peekstok

Overview

•  Quick database overview

•  Amazon Relational Database Service (RDS)

•  Amazon Redshift

•  Amazon DynamoDB

Database Primer – Relational Databases

•  Most common database system

•  Must have a primary key and predefined data types

•  Defined as either Online Transaction Processing (OLTP) or Online Analytical Processing (OLAP)

    •  OLTP is used for frequent reads/writes, such as an application database

    •  OLAP is designed for data warehouses or analyzing large data sets

•  Amazon RDS handles both easily
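A minimal sketch of the points above, using Python's built-in sqlite3 module as a stand-in relational engine. This is an assumption for illustration only: SQLite is not one of the engines RDS offers, and the orders table and its columns are hypothetical. It shows a primary key, declared column types, an OLTP-style lookup, and an OLAP-style aggregate.

```python
import sqlite3

# In-memory SQLite stands in for a relational engine; it is NOT what RDS runs.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,  -- primary key
        customer TEXT NOT NULL,        -- declared column types
        amount   REAL NOT NULL
    )
    """
)

# OLTP-style work: many small writes and single-row lookups by key.
conn.executemany(
    "INSERT INTO orders (order_id, customer, amount) VALUES (?, ?, ?)",
    [(1, "alice", 20.0), (2, "bob", 35.5), (3, "alice", 4.5)],
)
one_order = conn.execute(
    "SELECT customer, amount FROM orders WHERE order_id = ?", (2,)
).fetchone()

# OLAP-style work: an aggregate that scans the whole data set.
totals = conn.execute(
    "SELECT customer, SUM(amount) FROM orders GROUP BY customer ORDER BY customer"
).fetchall()

print(one_order)
print(totals)
```

The same two query shapes apply on a real engine; only the connection setup would differ.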


Database Primer – Data Warehouses

•  Data can come from many sources

•  Usually a specific type of relational database that uses OLAP for analysis and reporting

•  Amazon Redshift is a high-performance data warehouse designed specifically for OLAP use cases

•  Can also be combined with a traditional Amazon RDS database for transactions

Database Primer – NoSQL

•  NoSQL databases have become more popular recently due to their flexibility and performance

•  Easier to scale beyond a single server than traditional relational databases

•  Amazon DynamoDB helps you create and scale a NoSQL database

Amazon RDS – Overview of Features

•  The main idea of RDS is to make database setup and maintenance easy so you can focus on the application instead

•  RDS instances aren’t created like EC2 instances. You don’t get shell access to the instance, just the SQL endpoint.

•  However, you can use the AWS CLI to manage much of the service from your own computer, such as creating, modifying, or deleting a database
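As a rough illustration of managing RDS from your own machine, the sketch below assembles an `aws rds create-db-instance` command line in Python. It only builds and prints the command and does not run it. The instance identifier, user name, and password are placeholder values assumed for the example, and the helper function name is hypothetical.

```python
import shlex


def create_db_instance_command(identifier, instance_class, engine,
                               storage_gb, username, password):
    """Build (but do not run) an `aws rds create-db-instance` command line."""
    return [
        "aws", "rds", "create-db-instance",
        "--db-instance-identifier", identifier,
        "--db-instance-class", instance_class,
        "--engine", engine,
        "--allocated-storage", str(storage_gb),
        "--master-username", username,
        "--master-user-password", password,
    ]


# Placeholder values: the smallest instance class and storage size from the slides.
cmd = create_db_instance_command(
    "demo-db", "db.t2.micro", "mysql", 5, "admin", "change-me"
)
print(shlex.join(cmd))
```

Running the printed command would require the AWS CLI to be installed and configured with credentials, and it would create a billable resource.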


Amazon RDS – Instance Sizes

•  Database instance sizes

    •  Smallest is a db.t2.micro instance with 1 virtual CPU and 1 GB of RAM

    •  Largest is a db.r3.8xlarge instance with 32 vCPUs and 244 GB of RAM

•  Database storage

    •  Can get databases ranging from 5 GB up to 64 TB of storage

•  You can change the size of your instance over time should your needs change. This makes scaling a small program into a large production system easy

    •  Some database engines even allow for scaling across many servers, so each can run an instance of the database that updates the rest

Amazon RDS

•  Comparison of who is responsible for different aspects of the database with different services

Amazon RDS – Database Engines

•  Supported database engines

    •  MySQL

    •  PostgreSQL

    •  MariaDB

    •  Oracle

    •  Microsoft SQL Server

    •  Amazon Aurora

•  For Oracle and Microsoft SQL Server, you have to choose a license option

    •  License Included costs more, but Amazon deals with the license for you

    •  Bring Your Own License (BYOL) means you have to purchase and manage the license


Amazon RDS – Amazon Aurora

•  Aurora is Amazon’s own database system that offers the simplicity of an open source database and the technology of an enterprise-grade database

•  Built off of MySQL with a more service-oriented approach

•  Because of that, it’s compatible with any MySQL interface

•  Due to the behind-the-scenes changes, it can deliver up to 5 times the performance of MySQL without any changes from the user or application

•  When you create an Aurora instance, you actually create a cluster. This allows a copy of the database to be kept in each of several Availability Zones

Amazon RDS – Options

•  Storage options

    •  Magnetic disk: cheap, poorest performance, and smaller sizes

    •  General Purpose SSD: moderate cost, better performance, and larger sizes

    •  Provisioned IOPS: expensive, best performance, same sizes as General Purpose SSDs

•  For most applications, General Purpose SSDs are the best choice. They offer good performance and lower costs

Amazon RDS – Backup and Recovery

•  Most projects have a Recovery Point Objective (RPO), a Recovery Time Objective (RTO), or both. The RPO specifies how much data loss is acceptable, and the RTO specifies how long a recovery may take

•  Amazon offers Automated Backups for you. These will back up the entire DB instance, not just the data. You can change how often it backs up the instance

•  In addition, Amazon offers Manual DB Snapshots that can be taken at any time through the AWS CLI.

•  Restoring from either a snapshot or a backup is very easy through the AWS console
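The relationship between a backup schedule and the RPO/RTO targets can be sketched with simple arithmetic. The example below assumes, purely for illustration, that periodic backups are the only recovery source, so a failure just before the next backup loses one full interval of data. The function names and the sample durations are hypothetical.

```python
from datetime import timedelta


def worst_case_data_loss(backup_interval: timedelta) -> timedelta:
    """With only periodic backups, the worst case loses one full interval."""
    return backup_interval


def meets_targets(backup_interval: timedelta, restore_time: timedelta,
                  rpo: timedelta, rto: timedelta) -> bool:
    """True when both the data-loss (RPO) and downtime (RTO) targets hold."""
    return worst_case_data_loss(backup_interval) <= rpo and restore_time <= rto


rpo = timedelta(hours=4)   # at most 4 hours of lost data is acceptable
rto = timedelta(hours=2)   # service must be back within 2 hours
restore = timedelta(hours=1)

daily_ok = meets_targets(timedelta(hours=24), restore, rpo, rto)
hourly_ok = meets_targets(timedelta(hours=1), restore, rpo, rto)
print(daily_ok, hourly_ok)  # prints: False True
```

A daily backup fails the 4-hour RPO here, while an hourly one satisfies both targets.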


Amazon RDS – Using Multi-AZ

•  Multi-AZ is a way to help with data recovery by copying your database across other Availability Zones to help meet RPO and RTO targets

•  You will have one master database and a number of slaves. Your application will update the master, which will then update the slaves

•  Since the database is cloned across Availability Zones, if one goes down, you still have access to the others for recovery

Amazon Redshift

•  Petabyte-scale data warehouse service

•  Relational database designed for OLAP scenarios with a focus on performance

•  Based on PostgreSQL, so existing SQL clients can easily use it with a few small changes

•  Can create snapshots just like with Amazon RDS

•  Uses IAM users for primary security, with network security to prevent unapproved IP addresses from connecting

Amazon Redshift – Clusters and Nodes

•  Applications only interact with the leader node, which talks with the other nodes

•  6 types of nodes are available in 2 categories:

    •  Dense Compute

    •  Dense Storage


Amazon Redshift – Tables

•  Uses standard table creation statements like CREATE TABLE and also adds options such as compression encoding, distribution strategy, and sort keys

•  Compression encoding is how to compress the data. Redshift can choose it on its own, or you can specify how

•  Distribution strategy is how to distribute the data across the nodes

•  Sort keys are the columns the data is sorted by, making searching and sorting on those keys faster
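To make the three table options concrete, here is a small Python sketch that assembles a Redshift CREATE TABLE statement with a per-column ENCODE clause, a DISTKEY, and a SORTKEY. The sales table, its columns, and the helper function are hypothetical, and the statement is only built as text, not sent to a cluster.

```python
def redshift_create_table(table, columns, dist_key, sort_keys):
    """Build a Redshift CREATE TABLE statement.

    columns is a list of (name, sql_type, encoding) tuples; an encoding of
    None leaves the compression choice to Redshift.
    """
    col_lines = []
    for name, sql_type, encoding in columns:
        line = f"    {name} {sql_type}"
        if encoding:
            line += f" ENCODE {encoding}"  # explicit compression encoding
        col_lines.append(line)
    return (
        f"CREATE TABLE {table} (\n"
        + ",\n".join(col_lines)
        + "\n)\n"
        + "DISTSTYLE KEY\n"                      # distribute rows by one column
        + f"DISTKEY ({dist_key})\n"
        + f"SORTKEY ({', '.join(sort_keys)});"   # columns the data is sorted by
    )


ddl = redshift_create_table(
    "sales",
    [
        ("sale_id", "INTEGER", None),
        ("customer_id", "INTEGER", None),
        ("sale_date", "DATE", None),
        ("notes", "VARCHAR(256)", "zstd"),
    ],
    dist_key="customer_id",
    sort_keys=["sale_date"],
)
print(ddl)
```

Distributing on customer_id keeps each customer's rows on one node, and sorting on sale_date suits date-range queries; both choices are assumptions about a workload the slides do not describe.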

Amazon Redshift – Data

•  Redshift adds a new command, COPY, that is faster for loading data into a table

•  Built to load data in from bulk files and can read many files at once

•  After loading data, run VACUUM to reorganize the data and reclaim space from deletes

•  Supports standard SELECT statements for queries

•  Can use CloudWatch to analyze performance
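The load-then-vacuum sequence described above might look like the following sketch, which builds the two SQL statements as Python strings. The bucket name, IAM role ARN, and table name are placeholders assumed for the example, and the CSV format is also an assumption. Nothing here connects to a cluster.

```python
def copy_statement(table, s3_prefix, iam_role_arn):
    """Build a COPY that loads every file under an S3 prefix into a table."""
    return (
        f"COPY {table}\n"
        f"FROM '{s3_prefix}'\n"
        f"IAM_ROLE '{iam_role_arn}'\n"
        "FORMAT AS CSV;"
    )


def vacuum_statement(table):
    """Build the VACUUM that re-sorts rows and reclaims space after a load."""
    return f"VACUUM {table};"


# Placeholder bucket and role; substitute real values before use.
statements = [
    copy_statement(
        "sales",
        "s3://example-bucket/sales/",
        "arn:aws:iam::123456789012:role/ExampleRedshiftRole",
    ),
    vacuum_statement("sales"),
]
print("\n\n".join(statements))
```

Because the FROM clause names a prefix instead of a single object, one COPY can pick up many files at once.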

Amazon DynamoDB

•  Provides a NoSQL database with fast, low-latency performance that is easy to scale vertically and horizontally

•  Since it is managed by Amazon, you don’t have to worry about scaling it, just using it

•  Each item has a primary key and one or more attributes

    •  Each item is a key/value pair

    •  No predefined data types on the non-key attributes

    •  Maximum of 400 KB per item

•  Connect via an HTTP/S endpoint and provide JSON data for reads/writes
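As an illustration of the JSON sent to the endpoint, the sketch below converts a plain Python dict into the typed attribute-value form that DynamoDB's low-level API uses, where each value is wrapped in a type descriptor such as S (string), N (number, sent as a string), or BOOL. The Users table, the item contents, and the helper names are hypothetical, and only three types are handled here.

```python
import json


def to_attribute_value(value):
    """Wrap a Python value in a DynamoDB type descriptor (S, N, or BOOL only)."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return {"BOOL": value}
    if isinstance(value, (int, float)):
        return {"N": str(value)}  # numbers travel as strings
    if isinstance(value, str):
        return {"S": value}
    raise TypeError(f"unsupported type in this sketch: {type(value).__name__}")


def to_dynamodb_json(item):
    """Convert a flat dict into DynamoDB's attribute-value JSON shape."""
    return {name: to_attribute_value(value) for name, value in item.items()}


item = {"user_id": "u-100", "visits": 3, "active": True}
request_body = {"TableName": "Users", "Item": to_dynamodb_json(item)}
print(json.dumps(request_body, indent=2))
```

In practice the SDKs usually perform this conversion for you, so application code rarely builds these payloads by hand.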


Amazon DynamoDB – Primary Keys and Data Types

•  Each primary key must have a data type

•  Primary keys are one of 2 options

    •  Primary Key: just one field, which is hashed

    •  Primary Key and Sort Key: two fields that are used like a double primary key. Items must have a unique combination of the two values

•  Can define secondary indexes that can be used for lookups on each table
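The "double primary key" idea can be modeled in a few lines of Python. This is a toy in-memory stand-in, not DynamoDB's implementation: the class, its method names, and the sample data are all hypothetical. It shows that the pair of values must be unique while either value alone may repeat.

```python
class CompositeKeyTable:
    """Toy table keyed by a (primary key, sort key) pair."""

    def __init__(self):
        self._items = {}

    def put(self, primary_key, sort_key, attributes):
        # Writing the same pair again overwrites the earlier item.
        self._items[(primary_key, sort_key)] = dict(attributes)

    def get(self, primary_key, sort_key):
        return self._items.get((primary_key, sort_key))

    def query(self, primary_key):
        """Return (sort_key, attributes) pairs for one primary key, in sort-key order."""
        return [
            (sk, self._items[(pk, sk)])
            for pk, sk in sorted(self._items)
            if pk == primary_key
        ]


orders = CompositeKeyTable()
orders.put("alice", "2017-07-02", {"total": 12})
orders.put("alice", "2017-07-01", {"total": 30})
orders.put("bob", "2017-07-01", {"total": 8})      # same sort key, different primary key
orders.put("alice", "2017-07-01", {"total": 31})   # same pair: replaces the earlier item

print(orders.query("alice"))
```

Here "alice" keeps two items because their sort keys differ, and the repeated pair leaves only the latest write.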

Amazon DynamoDB – Reading and Writing Data

•  Most users use the SDKs Amazon provides in their applications

•  4 main operations

    •  PutItem creates a new item, or replaces the item if it already exists

    •  UpdateItem changes the attributes of an existing item

    •  DeleteItem removes an item from the table

    •  GetItem retrieves an item based on the provided primary key

•  DynamoDB is referred to as eventually consistent, meaning that a read just after a write may not show the data

    •  This is due to how the data is spread across multiple servers

•  A Strongly Consistent option is available that will force updates before the next query, but it costs some efficiency
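The difference between an eventually consistent read and a strongly consistent one can be shown with a toy model. The sketch below assumes a single leader copy and one lagging replica, which is a simplification for illustration and not a description of DynamoDB's internals. The class and method names are hypothetical.

```python
class ReplicatedStore:
    """Toy model: writes land on a leader and reach the replica only on replicate()."""

    def __init__(self):
        self.leader = {}
        self.replica = {}

    def put_item(self, key, item):
        self.leader[key] = dict(item)

    def replicate(self):
        # Stands in for the background propagation between servers.
        self.replica = {key: dict(item) for key, item in self.leader.items()}

    def get_item(self, key, consistent_read=False):
        # A consistent read goes to the up-to-date copy; otherwise the replica answers.
        source = self.leader if consistent_read else self.replica
        return source.get(key)


store = ReplicatedStore()
store.put_item("user-1", {"name": "Ada"})

stale = store.get_item("user-1")                         # replica has not caught up
strong = store.get_item("user-1", consistent_read=True)  # reads the latest write
store.replicate()
eventual = store.get_item("user-1")                      # replica now has the item

print(stale, strong, eventual)
```

The first read misses the item, the consistent read sees it immediately, and the default read sees it once replication has run.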

Amazon DynamoDB – Scaling and Partitioning

•  Each table’s items are stored on multiple partitions for faster reads

•  Which partition the data is stored on is based on the hashed primary key

•  Each partition can hold up to 10 GB of data, and DynamoDB automatically makes more partitions based on need and splits the current data

•  Each partition has room for usage spikes, so it can serve data faster for a short period when needed
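A rough sketch of hash-based placement follows. DynamoDB's actual hash function and partition layout are internal, so the MD5-modulo scheme here is an assumption chosen only to illustrate the idea. The size estimate uses the 10 GB per-partition figure above and ignores throughput, which can also drive the partition count.

```python
import hashlib
import math

PARTITION_CAPACITY_GB = 10  # per-partition storage limit quoted in the slides


def partition_for(key: str, partition_count: int) -> int:
    """Map a primary key to a partition index by hashing it (illustrative scheme)."""
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest, "big") % partition_count


def partitions_needed(table_size_gb: float) -> int:
    """Minimum partitions required by storage alone."""
    return max(1, math.ceil(table_size_gb / PARTITION_CAPACITY_GB))


count = partitions_needed(25)  # 25 GB of data needs 3 partitions of 10 GB
placement = {key: partition_for(key, count) for key in ("user-1", "user-2", "user-3")}
print(count, placement)
```

The same key always hashes to the same partition, which is what lets a lookup by primary key go straight to one server.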


Amazon DynamoDB – DynamoDB Streams

•  Many NoSQL applications need to track new items or changes for additional data processing

•  DynamoDB provides access to a DynamoDB Stream that will show a list of all the changes to items in the last 24 hours. This provides easy access to a list of recent changes for processing

•  Streams can be enabled, disabled, and accessed from the AWS Console, CLI, and SDKs
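The 24-hour window can be pictured with a small in-memory change log. This is a toy model and not the Streams API: the class and its methods are hypothetical, though INSERT, MODIFY, and REMOVE match the event names that stream records carry.

```python
from datetime import datetime, timedelta

RETENTION = timedelta(hours=24)  # window of changes a stream exposes


class ChangeStream:
    """Toy change log that only reports records from the retention window."""

    def __init__(self):
        self._records = []  # (timestamp, event_name, key), in write order

    def record(self, timestamp, event_name, key):
        self._records.append((timestamp, event_name, key))

    def recent(self, now):
        cutoff = now - RETENTION
        return [r for r in self._records if r[0] >= cutoff]


now = datetime(2017, 7, 20, 12, 0)
stream = ChangeStream()
stream.record(now - timedelta(hours=30), "INSERT", "user-1")   # too old to appear
stream.record(now - timedelta(hours=2), "MODIFY", "user-1")
stream.record(now - timedelta(minutes=5), "REMOVE", "user-2")

recent_events = [(event, key) for _, event, key in stream.recent(now)]
print(recent_events)
```

A consumer that processes changes at least once a day would see every record before it ages out of the window.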