Data Antipatterns

Post on 18-Dec-2014

description

Are you running a database in the cloud? Worried that you're doing it wrong? Engine Yard supports a broad set of databases with flexibility for customers to modify and configure. However, freedom to adapt and extend standard functionality comes with unexpected negative consequences: modifications can seriously affect durability and performance. I've observed common problems, patterns and best practices with big (and not so big) data. I'll highlight the most common pitfalls and discuss how to avoid them. Video for this talk is available here: http://vimeo.com/83755776

Transcript of Data Antipatterns

DATA ANTI-PATTERNS

ines@engineyard.com @Randommood

And I’m a happy dog!

INES SOMBRA

I work with Databases

Engine Yard

ZOMG, the horror!

BACKUPS

yes, we are going there

“I know you. You know you. And I know you know that I know you”

White Goodman (no relationship to White October)

Boring Definition #1

Backups

Copy and archiving of data

Goal is to restore the state of a DB

Many types - blah

Anti-Pattern #1

Taking too many backups

Backups are not free; they require resources

Full backup every hour, really? What about backup retention?
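The retention question can be made concrete with a small sketch. This is only an illustration, not a policy from the talk: the function name and both window sizes are assumptions. It keeps every backup from the last day and thins older ones down to one per calendar day.

```python
from datetime import datetime, timedelta


def backups_to_keep(timestamps, now, hourly_window_hours=24, daily_window_days=7):
    """Return the backup timestamps a simple, hypothetical retention policy keeps.

    Everything newer than `hourly_window_hours` is kept. Between that and
    `daily_window_days`, only the newest backup of each calendar day is kept.
    Anything older is left out, meaning it is eligible for deletion.
    """
    keep = set()
    newest_per_day = {}
    for ts in timestamps:
        age = now - ts
        if age <= timedelta(hours=hourly_window_hours):
            keep.add(ts)  # recent: keep every backup
        elif age <= timedelta(days=daily_window_days):
            day = ts.date()
            # older: remember only the newest backup seen for this day
            if day not in newest_per_day or ts > newest_per_day[day]:
                newest_per_day[day] = ts
    keep.update(newest_per_day.values())
    return sorted(keep)
```

With hourly backups, a policy like this cuts ten days of files down to a few dozen, which is the resource trade-off the slide is pointing at.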

Anti-Pattern #2

Taking too few backups

Take enough to minimize the risk of data loss due to corrupted backup files

yes, this totes happens!

The untested backup

Anti-Pattern #3

Doing backups right

Logically test backups

An error-free restore is not enough. Test the logical data too
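One way to test logical data is to compare a cheap fingerprint of each table in the source and in the restored copy. The sketch below uses SQLite from the Python standard library purely as a stand-in; the function names, and the choice of row count plus maximum key as the fingerprint, are assumptions for illustration rather than anything prescribed in the talk.

```python
import sqlite3


def logical_fingerprint(conn, table, key_column):
    """Return (row_count, max_key) for a table as a cheap logical fingerprint."""
    # Identifiers cannot be bound as parameters, so `table` and `key_column`
    # are assumed to be trusted names chosen by the operator.
    query = f'SELECT COUNT(*), MAX("{key_column}") FROM "{table}"'
    return conn.execute(query).fetchone()


def restore_mismatches(source, restored, tables):
    """Return the tables whose fingerprint differs between source and restore.

    `tables` maps each table name to its key column.
    """
    return [
        table
        for table, key in tables.items()
        if logical_fingerprint(source, table, key)
        != logical_fingerprint(restored, table, key)
    ]
```

A restore can finish without errors and still come back with a non-empty mismatch list, for example when the backup was taken from a lagging replica. Real checks usually go further, such as per-table checksums or application-level queries.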

Doing backups right

Know your types & tools

Take logical and binary backups

Continuous archiving & hot backup utilities

Doing backups right

Practice restores

Backups alone do not constitute disaster recovery (DR). Have a plan & practice it

Server extensions and configuration matter when restoring

“I want a ridiculously good looking Database”

Derek Zoolander

(honestly, Ben Stiller rules)

Obvious statement #1

Many DB choices

Cargo culting your database

Anti-Pattern #4

Failure to understand use case, strengths & weaknesses of a new database

RDBMS for Session Data

Anti-Pattern #5

Often means at least one write per request

Any DB issue/task may cause app to hang

Tables have a tendency to bloat

Modeling, it’s all the same

Anti-Pattern #6

Data Model

Consistency needs

Availability needs

Scaling needs

Operational story & cost

Doing it right

Know your needs

Doing it right

Spike it, forealsies

Spike it with your data and traffic. Best way to gain operational experience

Doing it right

Leverage new features

Relational databases are getting quite versatile

Evaluate clustered MySQL options

We have a cloud deployment!

Happy team on shipping day, lmfao if you don’t celebrate like this

Cloud-based databases, they are real

Obvious statement #2

Databases can live in the cloud quite well

Many IaaS, PaaS, & DBaaS options

Easy to get started & may be economical

Where did my instance go?

Anti-Pattern #7

Anti-Pattern #8

Cloud, it’s just like hardware

It’s not. Cloud resources are virtualized

Capacity planning and monitoring matter. A lot

Anti-Pattern #9

Shit doesn’t happen

You are not immune to infrastructure failures. Plan for it

Anti-Pattern #10

Storage is the same

Instance storage is not persistent (use EBS)

Data locality matters

Don’t run your cloud DBs too hot!

Doing cloud right

Know your cloud deployments

Replication in the cloud is a must-have

Put DB master & replicas in different AZs
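The placement rule is easy to check mechanically. The sketch below assumes an inventory of nodes described by plain dicts with hypothetical `name`, `role`, and `az` keys. It does not call any cloud provider API; how the inventory is obtained is left open.

```python
def placement_problems(nodes):
    """Return human-readable problems with master/replica placement.

    `nodes` is a list of dicts with hypothetical keys 'name', 'role'
    ('master' or 'replica'), and 'az' (availability zone).
    """
    problems = []
    masters = [n for n in nodes if n["role"] == "master"]
    replicas = [n for n in nodes if n["role"] == "replica"]
    if not replicas:
        problems.append("no replicas configured")
    for master in masters:
        # A zone outage takes everything down if every replica sits with the master.
        if replicas and all(r["az"] == master["az"] for r in replicas):
            problems.append(
                f"all replicas share {master['az']} with master {master['name']}"
            )
    return problems
```

Running a check like this as part of deployment is one way to catch a replica that was quietly rebuilt in the master's zone.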

Doing cloud right

Learn high availability & disaster recovery

Get good at replica promotions (some work involved)

Understand and invest in DR/HA. Know your options

Doing cloud right

Know your system

Invest in monitoring

Know your data distribution & querying patterns

Know baseline behavior
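Knowing baseline behavior can start as simply as comparing a new reading against recent history. The sketch below is an assumption-laden illustration: the function name, the choice of mean and standard deviation, and the default tolerance of three standard deviations are all hypothetical, and real workloads with daily or weekly cycles need something less naive.

```python
from statistics import mean, stdev


def outside_baseline(history, current, tolerance=3.0):
    """Return True if `current` is more than `tolerance` standard deviations
    away from the mean of `history` (which needs at least two samples)."""
    baseline = mean(history)
    spread = stdev(history)
    return abs(current - baseline) > tolerance * spread
```

For example, `history` could be queries per second sampled at the same hour over previous days, with `current` as the latest sample.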

Questions?