Practical Distributed Processing Using MySQL Built-In Functionality Presentation

46
Practical Distributed Processing Using MySQL Built-In Functionality Bob Burgess radian 6 Technologies MySQL User Conference 2010

Transcript of Practical Distributed Processing Using MySQL Built-In Functionality Presentation

Page 1: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 1/46

Practical Distributed

ProcessingUsing MySQL Built-In Functionality 

Bob Burgess – radian6 Technologies

MySQL User Conference 2010

Page 2: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 2/46

Who Am I

Database Analyst, radian6

Several Years Experience:

◦ Sybase, Oracle, MySQL

◦ AIX, Linux

◦ programming

Page 3: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 3/46

Scope

MySQL concepts used in this system

How I solved our particular problem

Practical set-up of our distributed

system

Page 4: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 4/46

Scope – NOT

Complete course on distributedprocessing

Complete coverage of related MySQL

functionality

Page 5: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 5/46

The Problem

“Influencer” calculation: complex andgetting more complex

More and more data

More and more customers and profiles

Not scalable to run the stored

procedure on just the main database

Page 6: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 6/46

Design Goals

Move calculation off of main database

Keep existing stored procedure

No single controller

Horizontally scalable!

Page 7: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 7/46

“Influencer” calculation

Who’s most influential? (For this topic.)

subset ofdatabase

sphinx index

do stuffinfluencer data

(two MyISAM tables)

Page 8: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 8/46

MySQL Functionality Used

Page 9: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 9/46

MySQL Functionality Used

Replication

Blackhole Storage Engine

Federated Storage Engine

Public-Key Authentication

Page 10: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 10/46

Replication

MasterDatabase

ReplicaDatabase

binlogs

INSERT...UPDATE...INSERT...DELETE...

relaylogs

INSERT...UPDATE...INSERT...

DELETE...

Page 11: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 11/46

Replication

MasterDatabase

binlogs

INSERT...UPDATE...INSERT...DELETE...

server_id = 1log_bin = /store/log/mysql-bin

grant replication slave on *.*

to 'repl'@'%' identified by 'secret';

 /store/log/mysql-bin.000001

Page 12: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 12/46

Replication

ReplicaDatabase

relaylogs

INSERT...UPDATE...INSERT...

DELETE...

server_id = 2relaylog = /store/relaylog/relaylog

change master tomaster_host="masterserver",master_port=3306,master_user="repl",master_password="secret",master_log_file="mysql-bin.000001",master_log_pos=0;

start slave; /store/relaylog/relaylog.000001

Page 13: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 13/46

Black Hole Tables

> create table t (a int) engine=blackhole;Query OK, 0 rows affected (0.02 sec)

> insert into t values (1);Query OK, 1 row affected (0.00 sec)

> select * from t;Empty set (0.00 sec)

Page 14: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 14/46

real table

Black Hole Tables

Before-Insert triggercreate trigger t_bibefore insert on tfor each rowinsert into real_tablevalues (new.a);

Page 15: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 15/46

server2

Federated Tables

Points to a table in another database

federatedtable `tm`

server1

table `t`

create table tm (a int) engine=federatedconnection="mysql://user:pass@server1/schemaname/t"

create table t (a int)

Not enabled by default in 5.1! 

Page 16: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 16/46

server2

Federated Tables

Points to a table in another database

federatedtable `tm`

server1

table `t`

create table tm (a int) engine=federatedconnection="mysql://user:pass@server1/schemaname/t"

create table t (a int)

server3

federatedtable `tm`

server4

federatedtable `tm`

Page 17: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 17/46

Public-Key Authentication

Secure shell (ssh) or secure copy(scp) without a password 

Page 18: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 18/46

worker

Public-Key Authentication

main db

# scp file mysql@main:/...

# cd /root/.ssh

# ssh-keygen# cat id_rsa.pub

# vi /home/mysql/.ssh/authorized_keys

Page 19: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 19/46

Distributed System –

Why It Works For Us

All involved database data fits on the

worker database

Calculation produces a self-contained

data set

MyISAM can be used for the resultingdata

Page 20: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 20/46

worker 1

Distributed System –

General Architecture

main db server

worker db

main db

MYD & MYI

MYD & MYI

r  e

 pl  i   c  a t  i   on

f   e d  er  a t   e d 

 s  c  p f  i  l   e  c  o p y 

Page 21: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 21/46

Distributed System –

General Architecture

main db server

worker 1

worker 2

worker 3

main db

worker db

worker db

worker db MYD & MYI

MYD & MYI

MYD & MYI

MYD & MYI MYD & MYIMYD & MYI

Page 22: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 22/46

The Non-Queue... Queue

--- Topics ---No. Name

101 – Apples102 – Oranges103 – Grapes104 – Bananas

105–Pears

--- To-do List ---101104105

Page 23: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 23/46

The Non-Queue... Queue

--- Topics ---101 – Apples102 – Oranges103 – Grapes104 – Bananas105 – Pears

--- Topics ---Recalc Recalc

No. Name -Reqd -Date Priority Running

101 – Apples Full 10:00am high 101102 – Oranges No 8:00am 0

103–Grapes No 7:30am 0

104 – Bananas Partial 10:10am high 302105 – Pears Full 10:05am low 0

• not running• highest priority• earliest recalc date

Page 24: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 24/46

Processing Cycle

Get nextTopic to do

is thereany?

sleepno

yes

Get work...

Page 25: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 25/46

Processing Cycle

replicationstopped?

sleep 10 min.yes

replicationbehind?

yessleep 1 min.

Is data current? Let replication catch up!

Page 26: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 26/46

Processing Cycle

waited forreplication?

yes get a newTopic to do

Replace stale Topic Profile – someone else likely has it...

Page 27: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 27/46

Processing Cycle

markTopic with

my ID

sleep shortrandom time

Topic stillhas my

ID?

nosleep 1 min.

begin again

Playing Nice with Others

Page 28: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 28/46

Processing Cycle

run the stored proc

success?

nolog the error

sleep 1 min.

begin again

yes

Page 29: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 29/46

Processing Cycle

flush the data table forthis Profile (locally)

copy to main dbserver

flush the data table for

this Profile (on main db)

update Profile tableon main: not running 

Page 30: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 30/46

Processing Cycle

back to the top

Page 31: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 31/46

Code – main loop 

trap 'STOP=1' TERMSTOP=0

while [ $STOP -eq 0 ] ; do

done

Page 32: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 32/46

Code – who am I? 

NODEID_QUERY="select @@server_id-1000;"INSTANCE=$2

NODEID=`$MYSQL -e"$NODEID_QUERY"`UNIQUE_INSTANCEID=$((NODEID*100+INSTANCE))

Page 33: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 33/46

Code – who’s next? 

NEEDY=`$MYSQL_MASTER -e"$NEEDY_TP_QUERY" \2>/tmp/mysql.err`

set -- $NEEDY

TPID=$1; RECALC_SCOPE=$2; RECALC_DATE=$3

Page 34: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 34/46

Code – who’s next, the SQL

NEEDY_TP_QUERY="select topicFilterId,influencerRecalcReqd,unix_timestamp(influencerRecalcDate) RecalcDatefrom TopicFilterwhere influencerRecalcReqd>0

and influencerRunning=0and influencerRecalcDate<=now()and active in (1,2)and influencerPriority>=$MINPRIORITY

order by influencerPriority desc,influencerRecalcDate asc,topicFilterId

limit 1;"

Page 35: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 35/46

Code – inner loop 

while [ ! -z "${TPID}" -a $STOP -eq 0 ]; do

done

Page 36: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 36/46

Code – how far behind? 

REPLICATION_DELAY=`$MYSQL -e'show slave status\G'| grep "Seconds_Behind_Master"

| cut -f2 -d':'`

Page 37: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 37/46

Code – call the proc 

$MYSQL -e"call proc(${TPID},${RECALC_SCOPE})">${INFLUENCER_LOG}${TPID}.${INSTANCE}.log2>&1

RUN_SUCCESS=`tail -1${INFLUENCER_LOG}${TPID}.${INSTANCE}.log| grep -c SUMMARY`

if [ $RUN_SUCCESS -eq 1 ] ; then copy tables

else report error

fi

Page 38: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 38/46

Code – send the result back 

$MYSQL -e"use influencer;flush tables Dynamics_${TPID};"

if scp -p -B ${TABLE_SOURCE}/Dynamics_${TPID}.*${TABLE_MASTER_PUSH_LOC} 2>&1 && \

scp -p -B ${TABLE_SOURCE}/Post_${TPID}.*${TABLE_MASTER_PUSH_LOC} 2>&1then

 copy files to backup

 remove tables from worker

else report error

fi

Page 39: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 39/46

Code – was I killed? 

if [ $STOP -eq 1 ]; thenecho "---- Script stopped by Kill

(TERM signal) ----"fi

Page 40: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 40/46

A Blackhole Table Trigger

create table BlogPost (...)engine=blackhole;

create table PostAuthor (blogPostId bigint not null primary key,author varchar(189))

default character set=utf8;

create trigger BlogPost_birbefore insert on BlogPost for each rowbegin

if new.mediaProviderId>1 theninsert ignore into PostAuthor (blogPostId,author)values (new.blogPostId, new.author);

end if;end;

Page 41: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 41/46

Launch Script

Shell script to launch workers On

◦ Starts several workers

Off◦ sends a signal to each worker

◦ stops after this stored proc call

Kill◦ abort and clean up database

Page 42: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 42/46

Monitor Scripts

Replication Disk space

Errors from the worker script

◦ Procedure fails

◦ scp fails

◦ gzip fails

◦ replication is behind

Page 43: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 43/46

Design Decisions

MyISAM vs. InnoDB◦ Size

◦ Tables easily copied

◦ Table locking / Replication: SmallerQueries Required

SSD Storage

Page 44: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 44/46

Development Directions

Migrate stored proc to Java More workers

Problems in Code – not atomic:

◦ flush table / copy table

◦ copy result tables to main db

Page 45: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 45/46

Take Away...

Replication and Federated tablesaren't hard.

Thousands of MyISAM tables isn't

crazy.Distributed processing

isn't rocket surgery.

Page 46: Practical Distributed Processing Using MySQL Built-In Functionality Presentation

8/3/2019 Practical Distributed Processing Using MySQL Built-In Functionality Presentation

http://slidepdf.com/reader/full/practical-distributed-processing-using-mysql-built-in-functionality-presentation 46/46

Thank you! 

Bob Burgess

[email protected] 

Slides on the conference site.

Email me for sample scripts.