from AWS RDS to EC2 PostgreSQL migration -...
● Technology lover
● Worked as Software Engineer, Team lead, DevOps, DBA, Data analyst
● Sr. Tech Architect at Coverfox
● Email me at [email protected]
● Tweet me at @hitul007
“Everything is possible but, it takes time”- Hitul Mistry
● Database evolution
● Self-hosted database
● AWS RDS
● Why we migrated from Database as a Service to a self-hosted database
● Challenges in migration
● How we planned the migration and points to be noted
● Current architecture
● Demo
● Simple GUI
● HA, Multi-AZ, encryption, backup, recovery, disaster recovery, security, compliance
● SLAs
● Performance optimizations by self
● GUI for version upgrades
AWS RDS
● PostgreSQL functionality is similar to stock PostgreSQL
● Cannot install extensions other than those provided
● Cannot replicate to a self-hosted server
● Cannot install a custom logical decoding plugin
● Cannot install custom data types
● Upgrades can be done with a few clicks in the GUI
Self hosted PostgreSQL
● PostgreSQL behaves exactly like original PostgreSQL
● 100% control over functionality
● Upgrades need to be done manually
AWS RDS
● HA, fault tolerance, disaster recovery and backups can be set up with a few GUI clicks
● You still have to monitor the common conditions under which PostgreSQL can crash, such as a full disk or high CPU usage. RDS restarts PostgreSQL automatically, including when the disk is full.
● SLA
Self hosted PostgreSQL
● HA, fault tolerance, disaster recovery and backups have to be implemented by configuring PostgreSQL yourself
● We have to monitor every threat that can occur and be ready to fix it
● Upgrades are done by yourself
● SLA
AWS RDS
● Everything is available in the GUI
● PostgreSQL usage knowledge and some architectural knowledge required
● AWS-controlled performance tuning; not use-case dependent
Self hosted PostgreSQL
● Expert PostgreSQL knowledge required
● Everything has to be done by setting up configs
AWS RDS
● A few performance parameters can be tuned via the GUI
● Cannot use custom hardware for performance
Self hosted PostgreSQL
● Performance can be tuned via the basic PostgreSQL config and also via other parameters such as kernel, OS and disk settings
● Many performance parameters are available to tune as per the application
AWS RDS
● To identify faults in PostgreSQL, a GUI is provided where all PostgreSQL logs are shown
● New version upgrades can be done with a few clicks
● Cannot go deeper than what the DBaaS provides
Self hosted PostgreSQL
● Can read the PostgreSQL logs directly
● Can go as deep as you want
AWS RDS
● Operating environment cannot be changed
Self hosted PostgreSQL
● It can be moved to any operating environment
● Cost to scale vertically or horizontally on AWS is high
● Many open-source plugins required by the application cannot be installed on RDS
● New logical decoding plugins for replication or other use cases are not supported by RDS
● AWS takes time to upgrade to the latest PostgreSQL versions
● Almost zero-downtime server upgrades are possible with self hosting
● Database auto scaling
● Performance tuning as per application needs
AWS RDS Cost (On Demand)
Instance Type    CPU    RAM (GiB)    Pricing/Yr
m4.2xlarge       8      32           $9014.04
m4.4xlarge       16     64           $18045.6
m4.10xlarge      40     160          $45122.76
AWS EC2 Cost (On Demand)
Instance Type    CPU    RAM (GiB)    Pricing/Yr
m4.2xlarge       8      32           $3679.2
m4.4xlarge       16     64           $7358.4
m4.10xlarge      40     160          $18396
● If we buy a Multi-AZ setup, the cost doubles
● Reserved instances can save 12-64% of the cost
● Rack servers and different cloud infrastructures can be used for cost cutting
● Zero-downtime upgrades
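The gap between the two on-demand price lists can be made explicit with a little arithmetic. The sketch below only uses the yearly prices quoted in the RDS and EC2 tables; it ignores storage, backup and operations costs, which the slides do not quantify, and the function name is purely illustrative.

```python
# Yearly on-demand prices in USD, copied from the RDS and EC2 tables in the slides.
RDS_PRICE = {"m4.2xlarge": 9014.04, "m4.4xlarge": 18045.6, "m4.10xlarge": 45122.76}
EC2_PRICE = {"m4.2xlarge": 3679.2, "m4.4xlarge": 7358.4, "m4.10xlarge": 18396.0}


def rds_to_ec2_ratio(instance_type):
    """Return how many times more expensive RDS is than EC2 for one instance type."""
    return RDS_PRICE[instance_type] / EC2_PRICE[instance_type]


for instance_type in RDS_PRICE:
    ratio = rds_to_ec2_ratio(instance_type)
    difference = RDS_PRICE[instance_type] - EC2_PRICE[instance_type]
    print(f"{instance_type}: RDS is {ratio:.2f}x EC2, difference ${difference:,.2f}/yr")
```

For all three instance sizes the ratio comes out at roughly 2.45, so on these list prices the RDS premium is independent of instance size.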
Migration required extra hands, but self-hosted maintenance has not increased the load on the DevOps team!
● RDS supports a limited set of plugins. wal2json was added only recently.
Database Operation
INSERT INTO data(data) VALUES('1');
INSERT INTO data(data) VALUES('2');
Format inserts
table public.data: INSERT: id[integer]:1 data[text]:'1'
table public.data: INSERT: id[integer]:2 data[text]:'2'
BEGIN 89283
table public.core_tracker: UPDATE: id[bigint]:63899671 session_key[text]:'w84fhz6c8b5jpc1ufesnegbxrfmnehh8' user_id[bigint]:23573 extra[text]:'{"h":100,"no_show":true}' created[timestamp with time zone]:'2018-01-05 16:03:23.654652+05:30' fd_id[integer]:null
COMMIT 89283
BEGIN 89285
table public.core_tracker: UPDATE: id[bigint]:63899671 session_key[text]:'w84fhz6c8b5jpc1ufesnegbxrfmnehh8' user_id[bigint]:23573 extra[text]:'{"h":100,"no_show":true,"hello-world":{"sfs":"sdf\"2''3"}}' created[timestamp with time zone]:'2018-01-05 16:03:23.654652+05:30' fd_id[integer]:null
COMMIT 89285
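A consumer of this output has to turn each text line back into structured data. The following is a minimal sketch of such a parser, assuming the `name[type]:value` layout shown in the lines above; the function name and the returned dict shape are illustrative choices, and richer cases (for example UPDATE lines that carry both old and new tuples) are not handled.

```python
import re

# One column looks like  name[type]:value  where value is either a
# single-quoted string ('' escapes a quote) or a bare token such as 42 or null.
COLUMN_RE = re.compile(r"(\w+)\[([^\]]+)\]:('(?:[^']|'')*'|\S+)")
CHANGE_RE = re.compile(r"^table (\S+): (INSERT|UPDATE|DELETE): (.*)$")


def parse_change(line):
    """Parse one row-change line into a dict; return None for BEGIN/COMMIT lines."""
    match = CHANGE_RE.match(line.strip())
    if match is None:
        return None
    table, operation, rest = match.groups()
    columns = {}
    for name, col_type, raw in COLUMN_RE.findall(rest):
        if raw == "null":
            value = None
        elif raw.startswith("'"):
            # Strip the surrounding quotes and undo the '' escaping.
            value = raw[1:-1].replace("''", "'")
        else:
            value = raw  # bare tokens are kept as text, not converted
        columns[name] = {"type": col_type, "value": value}
    return {"table": table, "operation": operation, "columns": columns}


change = parse_change("table public.data: INSERT: id[integer]:1 data[text]:'1'")
print(change["table"], change["operation"], change["columns"]["data"]["value"])
```

Values are deliberately left as strings; converting them according to the reported column type would be a separate step.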
● 5M unique quotes a month
● 45M unique quotes from insurance companies
● 5GB write on DB and logs combined
● Most reliable
● We cannot access pg_hba.conf
● You don’t have enough permission to execute pg_start_backup
● Functions, indexes and constraints are not migrated
● JSON is treated as CLOB and values are truncated
● varchar / character varying values are truncated
● DDL is ignored
● Partitioned tables are not replicated
● All tables must have a modified date
● All tables must have a primary key, but some tables had non-numeric primary keys
● Disable foreign key validations on the self-hosted PostgreSQL DB
● Create triggers on the AWS RDS database
● Take a backup of the PostgreSQL RDS instance
- MongoDB Schema
{
  "table_name": "schema_name.table_name",
  "primary_key": "{primary_key_value}",
  "created_at": "timestamp",
  "operation": "Insert/Update/Delete"
}
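One way to use change documents of this shape is to collapse them to the latest operation per row and then decide which rows to re-copy and which to delete on the target. The sketch below is an assumption about how such a log could be consumed, not the sync code used in the migration: it takes plain dicts with the four fields of the schema, assumes `created_at` values sort chronologically (for example ISO 8601 strings), and uses hypothetical function names.

```python
def latest_change_per_row(change_log):
    """Collapse a change log to the most recent entry per (table, primary key)."""
    latest = {}
    for entry in sorted(change_log, key=lambda e: e["created_at"]):
        # Later entries overwrite earlier ones for the same row.
        latest[(entry["table_name"], entry["primary_key"])] = entry
    return latest


def plan_sync(change_log):
    """Split changed rows into those to re-copy and those to delete on the target."""
    to_upsert, to_delete = [], []
    for row_key, entry in latest_change_per_row(change_log).items():
        if entry["operation"] == "Delete":
            to_delete.append(row_key)
        else:
            to_upsert.append(row_key)
    return to_upsert, to_delete


# Made-up sample entries for illustration only.
log = [
    {"table_name": "public.data", "primary_key": "1",
     "created_at": "2018-01-05T10:00:00", "operation": "Insert"},
    {"table_name": "public.data", "primary_key": "1",
     "created_at": "2018-01-05T10:05:00", "operation": "Delete"},
    {"table_name": "public.data", "primary_key": "2",
     "created_at": "2018-01-05T10:01:00", "operation": "Update"},
]
print(plan_sync(log))
```

Keeping only the last operation per row means each changed row is touched once during the final sync, regardless of how many times it changed.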
● Reset sequences
● Enable foreign key validations
● Stop the AWS RDS instance
● Run basic sanity scripts which verify the data on a sample
● Stop the website so that it can be opened by internal users only
● Run QA tests
● Take a backup of AWS RDS
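The sequence reset step matters because rows copied with explicit ids do not advance the sequences behind serial columns. A minimal sketch of generating the reset statements is shown here; it assumes each table has a serial or identity column that owns its sequence, that identifiers come from a trusted list (no quoting is done), and the helper name and default column name are hypothetical.

```python
def reset_sequence_sql(table, pk_column="id"):
    """Return SQL that moves the column's sequence to the current maximum value.

    For an empty table the sequence is set so that the next value is 1.
    """
    return (
        f"SELECT setval(pg_get_serial_sequence('{table}', '{pk_column}'), "
        f"COALESCE(MAX({pk_column}), 1), MAX({pk_column}) IS NOT NULL) "
        f"FROM {table};"
    )


# Table names taken from the decoding examples in the slides; the real list
# of tables would come from the schema being migrated.
for table in ["public.data", "public.core_tracker"]:
    print(reset_sequence_sql(table))
```

The generated statements would then be run on the self-hosted database before the application is pointed at it.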
● How much downtime can be accepted?
● SLA
● What is the worst thing that can happen?
● Which services can be affected?
● How soon can we recover?
● How much data will be lost, and how much can be recovered?
● What will be the long-term ROI?
● What data will be affected?
● The plan should be written as a sequence of steps to execute
● Execute the plan once on a sandbox environment before going live
● The plan should include a rollback strategy
● High availability
● Fault tolerance
● Disaster recovery
● Backup & Recovery
● Hardware & Software updates
● Security
● Monitoring
● Testing
● Rollback
● Compliance
● Promote master
pg_ctl promote -D /data-dir-path/
● Add cluster to PostgreSQL
rm -rf /data-dir-path/ && pg_basebackup ….
● Run pg_rewind after promote
pg_rewind -D /data-dir-path --source-server=... host=...
● Service discovery is the automatic detection of devices and services offered by these devices on a computer network. - Wikipedia
● service/coverfox/optime/leader
  ○ Leader node name
● service/coverfox/members/master-a
  ○ Member of cluster
● service/coverfox/members/master-b
  ○ Member of cluster
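A client can use these keys to find out where to send writes. The sketch below uses an in-memory dict as a stand-in for the key-value store; the key names come from the slide, while the stored values (a node name under the leader key, host and port under each member key), the addresses, and the function names are assumptions made for illustration.

```python
# In-memory stand-in for the key-value store; all values are made up.
KV_STORE = {
    "service/coverfox/optime/leader": "master-a",
    "service/coverfox/members/master-a": {"host": "10.0.0.1", "port": 5432},
    "service/coverfox/members/master-b": {"host": "10.0.0.2", "port": 5432},
}

LEADER_KEY = "service/coverfox/optime/leader"
MEMBERS_PREFIX = "service/coverfox/members/"


def cluster_members(store):
    """Return the names of all registered cluster members, sorted."""
    return sorted(
        key[len(MEMBERS_PREFIX):] for key in store if key.startswith(MEMBERS_PREFIX)
    )


def leader_address(store):
    """Resolve the current leader's name to its (host, port) pair."""
    leader_name = store[LEADER_KEY]
    member = store[MEMBERS_PREFIX + leader_name]
    return member["host"], member["port"]


print(cluster_members(KV_STORE))
print(leader_address(KV_STORE))
```

After a failover only the leader key changes, so clients that resolve the address through it follow the new master without a config change.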
● Errors
  ○ data corruption
  ○ system failure (including hardware failure)
  ○ human error
  ○ natural disaster
● Tool for disaster recovery, backups and recovery by 2ndQuadrant
● Remote backup
● Remote restore
● WAL log recovery
Barman Backup
/usr/bin/barman backup --jobs 6 mumbai-master-a
Barman restore db
barman recover --target-time "2017-12-15 22:22:00" --remote-ssh-command "ssh [email protected]" mumbai-master-c 20171214T190201 /pg-data-dir-path/ -j 10
What to monitor?
● RAM/CPU usage
● Disk I/O
● Process info
● Bandwidth
● Vacuum running
● DB space
● Active connections
● Active transactions
● Open files
● Replication diff
● PgBouncer client connections
● PgBouncer stats
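Of these metrics, the replication diff is the least obvious to compute. One common reading is the WAL byte distance between the master's current position and the replica's replayed position. The sketch below assumes both positions are available as PostgreSQL LSN strings (for example from `pg_current_wal_lsn()` on the master and `pg_last_wal_replay_lsn()` on the replica in PostgreSQL 10 or later); the function names and sample values are illustrative.

```python
def lsn_to_bytes(lsn):
    """Convert an LSN such as '16/B374D848' to an absolute byte position.

    An LSN is two hexadecimal numbers: the high and low 32 bits of a
    64-bit position in the write-ahead log.
    """
    high, low = lsn.split("/")
    return (int(high, 16) << 32) + int(low, 16)


def replication_lag_bytes(master_lsn, replica_lsn):
    """Return how many bytes of WAL the replica is behind the master."""
    return lsn_to_bytes(master_lsn) - lsn_to_bytes(replica_lsn)


# Sample positions, made up for illustration.
print(replication_lag_bytes("0/3000060", "0/3000000"))
```

A monitoring check would compare this number against a threshold chosen for the workload and alert when the replica falls too far behind.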