Advanced Data Migration Techniques for Amazon RDS (DAT308) | AWS re:Invent 2013
© 2013 Amazon.com, Inc. and its affiliates. All rights reserved. May not be copied, modified, or distributed in whole or in part without the express consent of Amazon.com, Inc.
DAT308 – Advanced Data Migration Techniques for Amazon RDS
Shakil Langha, Abdul Sathar Sait, Bharanendra Nallamotu
Amazon Web Services
November 13, 2013
Next 60 minutes …
• What's new in Amazon Relational Database Service
• Types of data migration
• General considerations
• Advanced migration techniques for Oracle
• Near-zero downtime migration for MySQL
Amazon RDS Recent Releases
• Oracle Transparent Data Encryption
• MySQL 5.6
• MySQL replication to RDS
• cr1.8xlarge instance type for MySQL 5.6
• Oracle Statspack
• Cross-region snapshot copy
• One-Time Migration
• Periodic Migration
• Ongoing Replication
General Considerations
RDS Pre-Migration Steps
• Stop applications accessing the DB
• Take a snapshot
• Disable backups (see the sketch below)
• Use Single-AZ instances
• Choose the optimum instance type for load performance
• Configure security groups for cross-DB traffic
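A minimal sketch of the backup toggle using the RDS CLI of that era; the instance name mydbinstance is a placeholder. Setting the backup retention period to 0 disables automated backups:

PROMPT> rds-modify-db-instance mydbinstance --backup-retention-period 0 --apply-immediately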
RDS Post-Migration Steps
• Turn on backups (see the sketch below)
• Turn on Multi-AZ
• Create read replicas
• Tighten down security
• Set up notifications via Amazon CloudWatch and RDS DB events
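A matching sketch for reversing the load-time settings, again with the era's RDS CLI; the identifiers are placeholders, and the exact option spellings varied across CLI versions, so treat them as assumptions:

PROMPT> rds-modify-db-instance mydbinstance --backup-retention-period 7 --multi-az true --apply-immediately
PROMPT> rds-create-db-instance-read-replica myreadreplica --source-db-instance-identifier mydbinstance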
Data Migration Approaches for Oracle
Let's Move Some Data
How about 500 GB?
Migration Process
Data Pump Export
expdp demoreinv/demo full=y \
  dumpfile=data_pump_exp1:reinvexp1%U.dmp,data_pump_exp2:reinvexp2%U.dmp,data_pump_exp3:reinvexp3%U.dmp \
  filesize=20G parallel=8 logfile=data_pump_exp1:reinvexpdp.log \
  compression=all job_name=reInvExp
Data Pump Export Start
Data Pump Export Done
Data Pump Files
Compression Shrinks 500 GB to 175 GB
57 GB + 62 GB + 56 GB = 175 GB across the three dump directories
Upload Files to EC2 Using UDP
$ yum -y install make automake gcc autoconf cvs
$ wget -O tsunami.tar.gz http://sourceforge.net/projects/tsunami-udp/files/latest/download
$ tar -xzf tsunami.tar.gz
$ cd tsunami-udp*
$ ./recompile.sh
$ make install
Install Tsunami on both the source database server and the Amazon EC2 instance
Open port 46224 for Tsunami communication (a security-group sketch follows)
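One possible way to open the port on the EC2 side, assuming the staging instance uses the MySecurityGroup security group created later, and that <source-ip> stands in for the database server's public IP; Tsunami uses TCP 46224 for its control channel and UDP 46224 for data:

$ aws ec2 authorize-security-group-ingress --group-name MySecurityGroup --protocol tcp --port 46224 --cidr <source-ip>/32
$ aws ec2 authorize-security-group-ingress --group-name MySecurityGroup --protocol udp --port 46224 --cidr <source-ip>/32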
Using UDP Tool Tsunami

On the source database server, start the Tsunami server:
$ cd /mnt/expdisk1
$ tsunamid *

On the destination EC2 instance, start the Tsunami client and pull the files:
$ cd /mnt/data_files
$ tsunami
tsunami> connect source.db.server
tsunami> get *
Export and Upload in Parallel
• No need to wait until all 18 files are done to start the upload
• Start uploading as soon as the first set of 3 files is done (see the sketch below)
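A rough sketch of the overlap on the source server: wait until expdp has closed the first chunk on a disk, then start serving the finished files. The first-chunk name follows the %U template above, and the fuser-based completion check is an illustrative assumption, not from the talk:

$ cd /mnt/expdisk1
$ while fuser reinvexp101.dmp > /dev/null 2>&1; do sleep 30; done
$ tsunamid reinvexp1*.dmp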
Total time to upload 175 GB
Transfer Files to the Amazon RDS DB Instance
The Amazon RDS DB instance has an externally accessible Oracle directory object, DATA_PUMP_DIR
Use a script to move the data files to the Amazon RDS DATA_PUMP_DIR
Perl Script to Transfer Files to the DB Instance

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# RDS instance info
my $RDS_PORT = 4080;
my $RDS_HOST = "myrdshost.xxx.us-east-1-devo.rds-dev.amazonaws.com";
my $RDS_LOGIN = "orauser/orapwd";
my $RDS_SID = "myoradb";

my $dirname = "DATA_PUMP_DIR";
my $fname = $ARGV[0];
my $data = "dummy";
my $chunk = 8192;

# PL/SQL snippets: a package to hold the file handle across calls, plus
# open/write/close wrappers around UTL_FILE
my $sql_global = "create or replace package perl_global as fh utl_file.file_type; end;";
my $sql_open = "BEGIN perl_global.fh := utl_file.fopen(:dirname, :fname, 'wb', :chunk); END;";
my $sql_write = "BEGIN utl_file.put_raw(perl_global.fh, :data, true); END;";
my $sql_close = "BEGIN utl_file.fclose(perl_global.fh); END;";

my $conn = DBI->connect('dbi:Oracle:host='.$RDS_HOST.';sid='.$RDS_SID.';port='.$RDS_PORT, $RDS_LOGIN, '') || die ($DBI::errstr . "\n");
my $updated = $conn->do($sql_global);

# Open the remote file in DATA_PUMP_DIR for binary write
my $stmt = $conn->prepare($sql_open);
$stmt->bind_param_inout(":dirname", \$dirname, 12);
$stmt->bind_param_inout(":fname", \$fname, 12);
$stmt->bind_param_inout(":chunk", \$chunk, 4);
$stmt->execute() || die ($DBI::errstr . "\n");

open (INF, $fname) || die "\nCan't open $fname for reading: $!\n";
binmode(INF);

# Stream the local file to the RDS instance in 8 KB raw chunks
$stmt = $conn->prepare($sql_write);
my %attrib = (ora_type => 24);   # bind the buffer as LONG RAW
while (read(INF, $data, $chunk) > 0) {
    $stmt->bind_param(":data", $data, \%attrib);
    $stmt->execute() || die ($DBI::errstr . "\n");
}
die "Problem copying: $!\n" if $!;
close INF || die "Can't close $fname: $!\n";

$stmt = $conn->prepare($sql_close);
$stmt->execute() || die ($DBI::errstr . "\n");
Transfer Files as They Are Received
• No need to wait until all 18 files are received on the EC2 instance
• Start the transfer to the RDS instance as soon as the first file is received (see the sketch below)
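A minimal way to drive the Perl script above once per dump file; the script name copy_to_rds.pl is a placeholder, not from the talk:

$ cd /mnt/data_files
$ for f in reinvexp*.dmp; do perl copy_to_rds.pl "$f"; done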
Total time to Transfer Files to RDS
Import Data into the Amazon RDS Instance
• Import from within the Amazon RDS instance using the DBMS_DATAPUMP package
• Submit a job using a PL/SQL script
Import Data into the RDS DB Instance

declare
h1 NUMBER;
begin
h1 := dbms_datapump.open (operation => 'IMPORT', job_mode => 'FULL', job_name => 'REINVIMP', version => 'COMPATIBLE');
dbms_datapump.set_parallel(handle => h1, degree => 8);
dbms_datapump.add_file(handle => h1, filename => 'IMPORT.LOG', directory => 'DATA_PUMP_DIR', filetype => 3);
dbms_datapump.set_parameter(handle => h1, name => 'KEEP_MASTER', value => 0);
dbms_datapump.add_file(handle => h1, filename => 'reinvexp1%U.dmp', directory => 'DATA_PUMP_DIR', filetype => 1);
dbms_datapump.add_file(handle => h1, filename => 'reinvexp2%U.dmp', directory => 'DATA_PUMP_DIR', filetype => 1);
dbms_datapump.add_file(handle => h1, filename => 'reinvexp3%U.dmp', directory => 'DATA_PUMP_DIR', filetype => 1);
dbms_datapump.set_parameter(handle => h1, name => 'INCLUDE_METADATA', value => 1);
dbms_datapump.set_parameter(handle => h1, name => 'DATA_ACCESS_METHOD', value => 'AUTOMATIC');
dbms_datapump.set_parameter(handle => h1, name => 'REUSE_DATAFILES', value => 0);
dbms_datapump.set_parameter(handle => h1, name => 'SKIP_UNUSABLE_INDEXES', value => 0);
dbms_datapump.start_job(handle => h1, skip_current => 0, abort_step => 0);
dbms_datapump.detach(handle => h1);
end;
/
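To monitor the job from a SQL client, one option is to poll the Data Pump views; reading the log file through the RDS-provided rdsadmin.rds_file_util package is shown here as an assumption, since the talk does not demonstrate it:

SELECT owner_name, job_name, state FROM dba_datapump_jobs;
SELECT text FROM table(rdsadmin.rds_file_util.read_text_file('DATA_PUMP_DIR', 'IMPORT.LOG'));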
Total Time to Import Data into Amazon RDS
Time Taken to Migrate the Database
Optimize the Data Pump Export
• Reduce the data set to its optimal size; avoid exporting indexes
• Use compression and parallel processing
• Use multiple disks with independent I/O
Optimize Data Upload
• Use Tsunami for UDP-based file transfer
• Use a large Amazon EC2 instance with SSD or PIOPS volumes
• Use multiple disks with independent I/O
• You could use multiple Amazon EC2 instances for parallel upload
Optimize Data File Upload to RDS
• Use the largest Amazon RDS DB instance possible during the import process
• Avoid using the Amazon RDS DB instance for any other load during this time
• Provision enough storage in the Amazon RDS DB instance for both the uploaded files and the imported data
Periodic Upload
• Oracle Data Pump network mode
• Oracle materialized views (see the sketch below)
• Custom triggers
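For the materialized-view approach, a minimal sketch: a database link from the target back to the source and a fast-refreshing materialized view over it. All object names are illustrative, and fast refresh requires a materialized view log on the source table:

create database link source_db connect to demoreinv identified by demo
  using '(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=source.db.server)(PORT=1521))(CONNECT_DATA=(SID=ORCL)))';
create materialized view orders_mv build immediate refresh fast
  as select * from orders@source_db;
-- pull the latest changes on each periodic run:
exec dbms_mview.refresh('ORDERS_MV');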
For a Small Dataset, One-Time Load
• Oracle Import/Export
• Oracle Data Pump network mode
• Oracle SQL*Loader
• Oracle materialized views
Data Migration Approaches for MySQL
Importing from a MySQL DB Instance
[Architecture diagram: the application writes to the source MySQL DB; a mysqldump backup is shipped via scp or Tsunami UDP to a staging server on Amazon EC2 in the AWS Region; the data is loaded into the RDS DB instance, which is then kept in sync with the master via replication]
Check the Size of the Master Database
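The deck shows this as a screenshot; a common way to do it, as a sketch, is to total data and index sizes from information_schema:

mysql> SELECT table_schema, ROUND(SUM(data_length + index_length)/1024/1024/1024, 2) AS size_gb
       FROM information_schema.tables GROUP BY table_schema;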
Create the Backup File
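Also shown as a screenshot in the deck; a hedged example for the mytest schema configured in my.cnf below, with --master-data recording the binlog coordinates in the dump:

$ mysqldump -u root -p --single-transaction --master-data=2 --databases mytest > mytest.sql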
Create a DB Instance for MySQL and EC2
PROMPT> rds-create-db-instance mydbinstance -s 1024 -c db.m3.2xlarge -e MySQL -u <masterawsuser> -p <secretpassword> --backup-retention-period 3
Create the DB instance for MySQL using the AWS Management Console or CLI

aws ec2 run-instances --image-id ami-xxxxxxxx --count 1 --instance-type m3.2xlarge --key-name MyKeyPair --security-groups MySecurityGroup
Create the Amazon EC2 staging server using the AWS Management Console or CLI
mysql> GRANT SELECT, REPLICATION SLAVE, REPLICATION CLIENT ON *.* TO repluser@'<RDS endpoint>' IDENTIFIED BY '<password>';
Create the replication user on the master
Update /etc/my.cnf on the Master Server
[mysqld]
server-id = 1
binlog-do-db=mytest
relay-log = /var/lib/mysql/mysql-relay-bin
relay-log-index = /var/lib/mysql/mysql-relay-bin.index
log-error = /var/lib/mysql/mysql.err
master-info-file = /var/lib/mysql/mysql-master.info
relay-log-info-file = /var/lib/mysql/mysql-relay-log.info
log-bin = /var/lib/mysql/mysql-bin
Enable the MySQL binlog
This enables binary logging, which creates a file recording the changes that have occurred on the master; the slave uses this file to replicate the data.
Configure the Master Database
$ sudo /etc/init.d/mysqld restart
Restart the master database after /etc/my.cnf is updated
$ mysql -h localhost -u root -p
mysql> show master status\G
*************************** 1. row ***************************
File: mysql-bin.000023
Position: 107
Binlog_Do_DB: mytest
Binlog_Ignore_DB:
1 row in set (0.00 sec)
Record the "File" and the "Position" values.
Upload Files to Amazon EC2 Using UDP
• Tar and compress the MySQL dump file to ship to the Amazon EC2 staging server.
• Update the Amazon EC2 security group to allow UDP connections from the server where the dump file is created to your new MySQL staging server.
• On the Amazon EC2 staging instance, untar the .tgz file.
(A condensed sketch of these steps follows.)
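A condensed sketch of the three steps; host and file names are placeholders, and the Tsunami commands mirror the Oracle section above:

# On the master: compress the dump, then serve it
$ tar czf mytest.tgz mytest.sql
$ tsunamid mytest.tgz

# On the EC2 staging instance: fetch and unpack
$ tsunami
tsunami> connect master.db.server
tsunami> get mytest.tgz
$ tar xzf mytest.tgz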
Configure the Amazon RDS database
mysql> create database bench;
Create the database
mysql> load data local infile '/reinvent/tables/customer_address.txt' into table customer_address fields terminated by ',';
mysql> load data local infile '/reinvent/tables/customer.txt' into table customer fields terminated by ',';
Import the data that you previously exported from the master database
mysql> call mysql.rds_set_external_master('<master server>', 3306, '<replication user>', '<password>', 'mysql-bin.000023', 107, 0);
mysql> call mysql.rds_start_replication;
Configure the RDS DB instance for MySQL as a slave and start replication; the binlog file and position are the values recorded from "show master status"
Amazon RDS for MySQL replication status
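One way to check progress from a SQL client (the deck shows a screenshot): the standard slave status output. Slave_IO_Running and Slave_SQL_Running should be "Yes", and Seconds_Behind_Master should reach 0 before cutover:

mysql> show slave status\G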
Make the Amazon RDS DB Instance the Master
Switch over to the RDS DB instance:
– Stop the service/application that is pointing at the master database
– Once all changes have been applied to the new RDS database, stop replication with "call mysql.rds_stop_replication"
– Point the service/application at the new RDS database
– Once the migration is complete, run "call mysql.rds_reset_external_master"
Please give us your feedback on this presentation
As a thank you, we will select prize winners daily for completed surveys!
DAT308
References
• RDS Home Page
• RDS FAQs
• RDS Webinars
• RDS Best Practices
• Data Import Guides
  – MySQL
  – Oracle
  – SQL Server