The Directions Pipeline at Mapbox - AWS Meetup Berlin June 2015

The Directions Pipeline at Mapbox

Goal

• Always have the freshest map data available for routing

• Allows timely fixes for data problems

Overview

1. Get OpenStreetMap data

2. Pre-process for Directions

3. Load new data into API servers

4. Repeat

• Every step is its own CloudFormation stack.

OpenStreetMap

• Planet.osm is all OpenStreetMap data in one file

• New version released every week

• Big file: 576 GB uncompressed, 28 GB as compressed protobuf

• Incremental diff updates available every minute

OpenStreetMap

• 1 EC2 c3.2xlarge instance

• Replays latest changesets to generate a new planet file

• New planet every 2 1/2 hours, uploaded to S3

• Updates a file /latest with a reference to the most recent planet

• Old planets are purged via S3 Lifecycle Object Expiration
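
The /latest pointer described above can be sketched in a few lines of Python. This is only an illustration of the pattern, not the code used in the pipeline: `FakeBucket` is an in-memory stand-in for an S3 bucket, and the key layout `planet/<timestamp>.osm.pbf` is an assumption made for the example.

```python
class FakeBucket:
    """In-memory stand-in for an S3 bucket (hypothetical, for illustration only)."""

    def __init__(self):
        self.objects = {}

    def put(self, key, body):
        self.objects[key] = body

    def get(self, key):
        return self.objects[key]


def publish_planet(bucket, timestamp, data):
    """Upload a new planet file, then repoint /latest at it."""
    key = f"planet/{timestamp}.osm.pbf"  # assumed key layout, not from the talk
    bucket.put(key, data)                # 1. upload the large file first
    bucket.put("latest", key.encode())   # 2. only then update the pointer
    return key


def fetch_latest_planet(bucket):
    """Resolve /latest and download the planet it references."""
    key = bucket.get("latest").decode()
    return key, bucket.get(key)


bucket = FakeBucket()
publish_planet(bucket, "2015-06-01T00-00", b"older planet")
publish_planet(bucket, "2015-06-01T02-30", b"newer planet")
key, data = fetch_latest_planet(bucket)
print(key)
```

Writing the pointer last means a reader never resolves /latest to a file that is still uploading. The older planet stays in the bucket until it expires, so a consumer that resolved the pointer just before an update can still finish its download.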

Directions pre-processing

• 1 EC2 r3.4xlarge on Spot Pricing per profile

• Fetch latest OSM planet from S3 (discovery via /latest)

• Run pre-processing for profile

• Car: 6 hours

• Bicycle: 15 hours

• Walk: 23 hours

Directions pre-processing

• Upload results to S3 (and update /latest)

• Update CloudFormation stack of API
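
The pre-processing times bound how fresh the served routing data can be. The following back-of-the-envelope sketch rests on assumptions that are not stated in the talk: each worker starts its next run as soon as the previous one finishes, the planet it picks up is between 0 and 2.5 hours old, and upload and rollout time are ignored.

```python
PLANET_INTERVAL_H = 2.5  # a new planet is built every 2 1/2 hours
PREPROCESS_H = {"car": 6, "bicycle": 15, "walk": 23}


def data_age_bounds(preprocess_hours, planet_interval=PLANET_INTERVAL_H):
    """Return (best age at deploy, worst age at deploy, worst age before next deploy).

    All values are in hours, measured from the moment the planet file was built.
    """
    best_at_deploy = preprocess_hours                     # planet was brand new at start
    worst_at_deploy = preprocess_hours + planet_interval  # planet was about to be replaced
    worst_before_next = worst_at_deploy + preprocess_hours  # served until the next run ends
    return best_at_deploy, worst_at_deploy, worst_before_next


for profile, hours in PREPROCESS_H.items():
    print(profile, data_age_bounds(hours))
```

Under these assumptions the car profile serves data that is between 6 and 14.5 hours old, while the walk profile can lag by roughly two days.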

Directions API

• One CloudFormation stack per profile

• Several EC2 r3.2xlarge instances

• ELB/AutoScaling

• On start, downloads latest Directions data from S3

• EC2 instances are cycled on CloudFormation parameter update

Directions pre-processing

aws cloudformation update-stack \
  --stack-name "$ApiDirectionsStack" \
  --use-previous-template \
  --capabilities "CAPABILITY_IAM" \
  --parameters $params ParameterKey=LatestTimstamp,ParameterValue="$TimeStamp"
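
The same update can be expressed in Python. This is a minimal sketch, assuming that `$params` holds the stack's other parameters and that they should keep their current values. `build_update_args`, `update_api_stack` and `other_parameter_keys` are hypothetical names for this example; the keyword arguments are those of the boto3 CloudFormation client's `update_stack` call.

```python
def build_update_args(stack_name, timestamp, other_parameter_keys):
    """Build keyword arguments for a CloudFormation update_stack call."""
    # Assumption: every other parameter keeps its previous value.
    parameters = [
        {"ParameterKey": key, "UsePreviousValue": True}
        for key in other_parameter_keys
    ]
    # Only the timestamp changes; the parameter name is spelled as in the template.
    parameters.append(
        {"ParameterKey": "LatestTimstamp", "ParameterValue": timestamp}
    )
    return {
        "StackName": stack_name,
        "UsePreviousTemplate": True,
        "Capabilities": ["CAPABILITY_IAM"],
        "Parameters": parameters,
    }


def update_api_stack(client, stack_name, timestamp, other_parameter_keys):
    """Send the update through a CloudFormation client object."""
    args = build_update_args(stack_name, timestamp, other_parameter_keys)
    return client.update_stack(**args)


# Usage, assuming boto3 is installed and credentials are configured:
#   import boto3
#   client = boto3.client("cloudformation")
#   update_api_stack(client, "directions-api-car", "2015-06-01T02-30", ["InstanceType"])
```

Keeping the argument construction separate from the call makes the request easy to inspect or test without touching AWS.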

Directions API CloudFormation Template

{
  "Parameters": {
    "LatestTimstamp": { "Type": "String" }
  },
  "Resources": {
    "LaunchConfiguration": {
      "Type": "AWS::AutoScaling::LaunchConfiguration",
      "Properties": {
        "UserData": { "Fn::Base64": { "Ref": "LatestTimstamp" } }
      }
    },
    "AutoScalingGroup": {
      "Type": "AWS::AutoScaling::AutoScalingGroup",
      "UpdatePolicy": {
        "AutoScalingRollingUpdate": {}
      }
    }
  }
}

The Good

• Few moving parts and a simple concept

• Updating a stack parameter mirrors our existing deployment flow

• Data-driven approach decouples stacks

• No queues

• Easy 1:n distribution

• Starting new stacks is easy; they just pick up the latest state

The Bad

• Updating CloudFormation stack parameters needs a lot of IAM permissions

• Potential security problem if an instance is hacked

• AutoScaling permissions cannot be scoped to specific resources

• Potential for UPDATE_ROLLBACK_FAILED

• Problem with UpdatePolicy, AutoScaling and non-booting EC2s
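
One way a pipeline could guard against a stuck stack is to inspect the stack status before requesting an update. The sketch below is an illustration, not part of the talk: the function name and the three-way classification are hypothetical, while the status strings are standard CloudFormation stack statuses as reported by describe-stacks.

```python
# Statuses from which an update can be started.
UPDATABLE = {"CREATE_COMPLETE", "UPDATE_COMPLETE", "UPDATE_ROLLBACK_COMPLETE"}


def next_action(stack_status):
    """Decide what to do with a stack before calling update-stack.

    Returns "update", "wait" or "manual".
    """
    if stack_status in UPDATABLE:
        return "update"
    if stack_status.endswith("_IN_PROGRESS"):
        return "wait"    # another operation is still running
    return "manual"      # e.g. UPDATE_ROLLBACK_FAILED needs an operator


for status in ("UPDATE_COMPLETE", "UPDATE_IN_PROGRESS", "UPDATE_ROLLBACK_FAILED"):
    print(status, "->", next_action(status))
```

A stack in UPDATE_ROLLBACK_FAILED rejects further updates until it has been repaired, so flagging it early avoids a pre-processing run whose result cannot be deployed.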

Open Source

• https://wiki.openstreetmap.org/wiki/Osmosis

• Process and forward OpenStreetMap data

• http://project-osrm.org/

• Open Source Routing Machine for doing Directions queries

• Load OpenStreetMap data

• Pre-process the data for a profile

• Run queries against the pre-processed data
