How LinkedIn uses memcached, a spoonful of SOA, and a sprinkle of SQL to scale
JDBC – We don’t need no stinking JDBC. How LinkedIn uses memcached, a spoonful of SOA, and a sprinkle of SQL to scale.
David Raccah & Dhananjay Ragade, LinkedIn Corporation

Goal of this Presentation
What you will learn
How LinkedIn built a cheap and scalable system to store our members’ profiles, and how you can do the same

Agenda
> Review system ilities
> What happened to databases?
> SOA What
> Discuss existing Best Practices
> Pixie Dust and Kool-Aid are not so bad
> What LinkedIn’s got up their sleeve
> How it all came together…
> Q&A

Terminology of the ilities
The terms of large successful systems
> Performance
  Not an “ility” but without it, no ility will save you
> Availability
  Availability is the proportion of time a system is in a functioning condition
> Reliability
  The probability that a functional unit will perform its required function for a specified interval under stated conditions.
  The ability of something to "fail well" (fail without catastrophic consequences)

Terminology of the ilities
The terms of large successful systems
> Scalability
  The system is slow with multiple users vs. a single user
> Manageability
  The ability to manage all parts of a large moving system
> Serviceability
  The ability to service an arm of the system without bleeding to death (e.g. change out a database from a working system). Bleeding is OK in a high performance system; death is NOT acceptable.

Databases
The systems that drive the enterprise … or….
> RDBMS – Relational Database Management System
> KVSS – Key Value Storage System
> Enterprise Search Engines

Database Server History….
Database mindset has changed…
From data access to data management to….
> Initially it was all about remote data access with an index
> Then it moved to ACID data management and tooling
> Then it became an Application Server with data affinity
> Now we have come full circle and people have figured out that scaling is more important than relationships, transactions, and data and behavioral affinity.

Database Mantra that Rules the Roost
ACID
> Atomicity – All or nothing
> Consistency – Data in the system should never get in a contradictory state.
> Isolation – Two requests cannot interfere with one another.
> Durability – No do over: once a write is committed, it stays persisted and is not lost, even after a failure.

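To make the A and C concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The `account` table, the `transfer` helper, and the names `alice` and `bob` are hypothetical and chosen only for illustration; the deck does not describe LinkedIn's schema. The second transfer trips a CHECK constraint halfway through, so the credit that already happened is rolled back with it.

```python
import sqlite3

def transfer(conn, src, dst, amount):
    """Credit dst and debit src in one transaction: both updates commit or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst))
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src))
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected a negative balance; the credit was undone too.
        return False

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    "id TEXT PRIMARY KEY, balance INTEGER NOT NULL CHECK (balance >= 0))")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()

ok = transfer(conn, "alice", "bob", 30)         # fits within alice's balance
rejected = transfer(conn, "alice", "bob", 500)  # would overdraw alice
balances = dict(conn.execute("SELECT id, balance FROM account"))
```

After both calls the balances reflect only the first transfer, which is the "all or nothing" behavior that becomes hard to keep once the two rows live on different partitions.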
Anti-Database Rules
BASE
> Basically Available
  Support partial failures within your architecture (e.g. sharding)
> Soft state
  State may be out of sync for some time
> Eventually consistent
  Eventually all data is made consistent (as long as the hardware is reliable)

Database Scalability
Or lack thereof…
> Databases work. Look at:
  Hotmail
  Facebook
  eBay
> Databases scale with hardware
> They do not scale horizontally well
  Partition management is nonexistent and RYO (roll your own) is a mess
  Many use them as ISAM and not even relational

Database Tools and Language
Duh…
> De facto standards for tools and languages abound for relational databases
> Easy to manage the data within a partition and easy to write code to operate on said data
> Terrifying but nice-to-use extensions include running Java within the Data Engine, so that you could run your application within the big iron

Database’s other features
Which are the pain points….
> Constraints – Nice idea until you start partitioning. 2PC is the anti-scalability pattern (Pat Helland)
> Computation – this feature turns out to cause more pain as cost rises with scale, and it is incompatible with most languages and tools.
> Replication & backup
  Nice tools that are indeed important and useful
> ACL support & Data Engine optimizations
  Used for sure, but exist to circumvent deficiencies

Key Value Storage Systems
BigTable, Hive, Dynamo – the Wild Wild West
> Reliable – Proven on web
> Available – redundant (locally)
> Scalable – no constraints
> Limited ACIDity
> No Standard and not portable
> Almost no:
  Constraints or relationships
  Computation or transactions

Enterprise Search Engines
Index yes – storage device no
> A great inverted index
> Finds data quickly
> However, what it returns is commonly an ID to the entity(s) in question
> Real-Time solutions are available but not fully deployed today
> Limited ACIDity/transactions
> Scalable, available, reliable

SOA
Service Oriented Architecture
> SOA may be overkill for most enterprises
> Still, a tiered and layered architecture, which is what SOA hoped to formulate and standardize, is a solid approach
> Services (not SOA) allow for efficient reuse of business processes and aggregation services within a complex development organization

Best Practices
Storage and architecture
> Store critical data redundantly and reliably with a cluster
  Google via BigTable, Facebook via MySQL, eBay via replicated & sharded DB
> Layer services on top of the storage device to manage data integrity and complexity
  LinkedIn, Amazon, eBay

Best Practices
Storage and architecture
> Create a bus to route replicated data to consumers – e.g. search, data mining, etc.
  Almost all sites
> Parallelization via things like scatter/gather
  Almost all search topologies (Google, Yahoo, Live), Facebook, etc.

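Scatter/gather can be sketched in a few lines. This is an illustration only, assuming three in-memory shards of an inverted index; the `SHARDS` data and the `search_shard` and `scatter_gather` names are made up for the example, and a real topology would replace the dictionary lookup with a network call.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical shards: each holds a slice of an inverted index (term -> document IDs).
SHARDS = [
    {"java": [1, 4], "sql": [2]},
    {"java": [7], "cache": [8]},
    {"sql": [11], "java": [12, 15]},
]

def search_shard(shard, term):
    """Query a single shard; stands in for a remote call to one index node."""
    return shard.get(term, [])

def scatter_gather(term):
    """Scatter the query to every shard in parallel, then gather and merge the hits."""
    with ThreadPoolExecutor(max_workers=len(SHARDS)) as pool:
        partials = pool.map(search_shard, SHARDS, [term] * len(SHARDS))
        return sorted(doc_id for hits in partials for doc_id in hits)

result = scatter_gather("java")
```

The caller waits for the slowest shard, so in practice the gather step usually carries a timeout and returns partial results when a shard is late.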
Best Practices
Storage and architecture
> Keep the system stateless
  eBay, Google, etc.
> Partition data and services
  Facebook, eBay
> Cache data
> Replicate your data
> Route requests to where the behavior and/or data exists
> Degrade gracefully with load

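Partitioning and routing go together: every tier has to agree on which partition owns a key. A minimal sketch, assuming four partitions and a member ID as the routing key (the partition names and the `partition_for` helper are hypothetical, not taken from the deck):

```python
import hashlib

# Hypothetical partition names; a real deployment would map these to hosts.
PARTITIONS = ["profiles-0", "profiles-1", "profiles-2", "profiles-3"]

def partition_for(member_id):
    """Pick a partition from a stable hash of the key so every caller routes the same way."""
    digest = hashlib.sha256(str(member_id).encode("utf-8")).digest()
    # Use the first 8 bytes of the digest as an integer, then wrap to the partition count.
    return PARTITIONS[int.from_bytes(digest[:8], "big") % len(PARTITIONS)]
```

Plain modulo hashing moves most keys when the number of partitions changes, which is one reason sites move to a lookup table or consistent hashing once partitions are added regularly.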
Best Practices
Storage and architecture
> Tiering systems
  Latency vs. Affinity
    Traversal versus affinity – you need to analyze the cost and make a decision
  Scaling vs. parallelizing
    Do you need to keep tiering all systems to keep the scalability uniform?
  Complexity vs. diminished dependencies
    Do the reduced dependencies make up for the increased system complexity?

Pixie Dust and Kool-Aid
Building on the past
> So what do we want:
  Reliable
  Available
  Scalable
  ACIDity on simple transactions
  Standard and portable interface
  Data Optimizations
  Cache and replicate
  Low cost
  BASE architecture

LinkedIn’s Data Services
Mixture of standards and pixie dust
> Front a database with a service
> Cache data
> Route to and partition the data service
> Scale and replicate services in a horizontal manner
> Keep all writes ACID and subsequent reads ACID as well

LinkedIn’s Data Services
Mixture of standards and pixie dust
> Databases are reliable
> Scale out at the service
> Replicate and cache
> Partitioning comes from the front tier and business servers that front the data services

LinkedIn’s Data Services
Immediate replication vs. eventual replication
> Caching needs a consistency algorithm
> Techniques for immediate replication
  Paxos
    Chubby, Microsoft AutoPilot, ZooKeeper
  N Phase Commit (2PC and 3PC)
> Techniques for eventual consistency
  BASE (Basically Available, Soft-state, Eventual Consistency)
    Inktomi, Dynamo, AWS

LinkedIn’s Data Services
LinkedIn’s approach
> Keep core data ACID
> Keep replicated and cached data BASE
> Replicate data via the data bus
> Cache data on cheap memory (memcached)
> Use a hint to route the client to his or her ACID data

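The split between an ACID core and BASE replicas can be sketched as follows. This is a toy model under stated assumptions: `CoreStore` stands in for the database, its `log` stands in for the data bus, and `CacheConsumer` stands in for a memcached-backed replica. None of these names come from the deck, and plain dictionaries replace the real stores.

```python
class CoreStore:
    """Stand-in for the ACID database: every write gets the next sequence number."""

    def __init__(self):
        self.rows = {}
        self.log = []  # ordered change log that the bus relays to consumers

    def write(self, key, value):
        seq = len(self.log) + 1
        self.rows[key] = value
        self.log.append((seq, key, value))
        return seq


class CacheConsumer:
    """Stand-in for a cache replica that applies changes from the bus in order."""

    def __init__(self):
        self.cache = {}
        self.applied_seq = 0  # sequence number of the last change applied

    def catch_up(self, log):
        # Entries are numbered from 1, so slicing at applied_seq skips those already applied.
        for seq, key, value in log[self.applied_seq:]:
            self.cache[key] = value
            self.applied_seq = seq


core = CoreStore()
replica = CacheConsumer()

core.write("member:1", "v1")
replica.catch_up(core.log)

core.write("member:1", "v2")
stale = replica.cache["member:1"]  # the replica has not seen the second write yet
replica.catch_up(core.log)
fresh = replica.cache["member:1"]  # now consistent with the core
```

Between the second write and the second `catch_up`, the replica is in the "soft state" that BASE allows: available, but briefly behind the core.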
LinkedIn’s Data Services
Databus – the linchpin of our replication

LinkedIn’s Data Services
LinkedIn’s approach

LinkedIn’s Data Services
Core DS
> Keep core data ACID in the DB
> All writes come here.
> Databus source for all replication
> The last line of defense for a cache miss
> Manages sharding

LinkedIn’s Data Services
RepDS
> Manages cache consistency and replication
> Manages the freshness of the caller
> Reads come from cache

LinkedIn’s Data Services
RepReader
> RepReader is the typical tip of the iceberg problem
> All read operations are sourced from the cache unless the caller’s freshness token is out of the window

LinkedIn’s Data Services
Freshness Token (AKA Pixie Dust)
> The freshness token = Pixie Dust for CUD (create, update, delete) operations
> It also allows us to give the caller control over whether they are content with BASE data, even if they did no CUD operation.

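The slides do not say what the freshness token contains, so the sketch below assumes the simplest possible form: the sequence number of the caller's last write. `ProfileService` and its methods are hypothetical names, and dictionaries stand in for the database and for memcached. A read is served from the cache only when the cache has caught up to the caller's token; otherwise it falls back to the authoritative store.

```python
class ProfileService:
    """Toy read path: serve from cache unless the caller's token is ahead of it."""

    def __init__(self):
        self.db = {}        # authoritative (ACID) store
        self.cache = {}     # replicated copy that may lag behind
        self.db_seq = 0     # sequence number of the last committed write
        self.cache_seq = 0  # sequence number the cache has caught up to
        self.pending = []   # committed changes not yet replicated

    def write(self, key, value):
        """Commit to the store and hand the caller a freshness token."""
        self.db_seq += 1
        self.db[key] = value
        self.pending.append((self.db_seq, key, value))
        return self.db_seq

    def replicate(self):
        """Apply pending changes to the cache, as the replication bus would."""
        for seq, key, value in self.pending:
            self.cache[key] = value
            self.cache_seq = seq
        self.pending.clear()

    def read(self, key, token=0):
        """Return (value, source); token 0 means the caller accepts BASE data."""
        if token <= self.cache_seq and key in self.cache:
            return self.cache[key], "cache"
        return self.db.get(key), "db"
```

A caller who has just written passes the returned token and is routed to the store until replication catches up; everyone else keeps the default and reads the cache. A caller that made no write but wants fresh data could pass the current `db_seq` to force the same path.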
LinkedIn’s Data Services
For the love of Pixie dust and Kool-Aid
> We use commodity hardware and software to run our service
> We use Pixie Dust to keep costs down and keep our customers happy
> We keep OPS and the exec-staff happy with our special brand of Kool-Aid

Profile Re-architecture
Changing planes in mid-flight
> Original LinkedIn System
> Use of XML for i18n
> Phased Transition

Problems from the original system
Anthropology 101
> Be fair… it worked well for a startup
> Many tables in one big DB
> Too many similar object hierarchies
> No well defined domains

Why XML?
Flexibility
> Profile has many fields
> 1NF for i18n ==> too many tables
> StAX for fast parsing
> Easier to version the profile
> Human readable
> JSON? ProtoBuf?

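StAX is a Java pull-parsing API. The sketch below shows the same streaming style with Python's `xml.etree.ElementTree.iterparse`, so it is an analogy rather than StAX itself. The profile layout, the use of `xml:lang` to carry localized fields in one document instead of one table per language, and the `localized_fields` helper are all assumptions made for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Hypothetical profile document: one element per field, localized via xml:lang.
PROFILE_XML = """<profile version="2">
  <headline xml:lang="en">Software Engineer</headline>
  <headline xml:lang="de">Softwareentwickler</headline>
  <summary xml:lang="en">Builds data services.</summary>
</profile>"""

# ElementTree exposes the xml: prefix under its fixed namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"

def localized_fields(xml_text, lang):
    """Stream through the document and keep only the fields in the requested language."""
    fields = {}
    source = io.BytesIO(xml_text.encode("utf-8"))
    for _event, elem in ET.iterparse(source, events=("end",)):
        if elem.get(XML_LANG) == lang:
            fields[elem.tag] = elem.text
        elem.clear()  # release the element's contents once it has been handled
    return fields

english = localized_fields(PROFILE_XML, "en")
german = localized_fields(PROFILE_XML, "de")
```

Adding a language means adding elements, not tables, and the `version` attribute on the root gives readers a place to branch when the layout changes.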
Issues with XML
<good/> <bad/> <ugly/>
> XML schema design tradeoffs and analytics impact
> XML is verbose
> StAX is unfriendly
> XML in the DB caused us some performance headaches

Phased Transition
Evolving a living, breathing organism
> Successive iterations avoid breakages
> No major site downtime
> Easier to sanity check
> Does not hold other teams hostage
> Phases LinkedIn went through

Double Writes Topology
Safety first

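The topology diagram for this slide is not part of the transcript. As a rough sketch of the general double-write pattern, and not LinkedIn's actual code, the class below writes every change to both the legacy store and the new one, keeps reading from the legacy store until a flag is flipped, and offers a comparison to run before cutting over. `ProfileMigration` and its fields are hypothetical.

```python
class ProfileMigration:
    """Phased cut-over: write to both stores, read from legacy until the flag flips."""

    def __init__(self):
        self.legacy = {}            # stand-in for the original tables
        self.new = {}               # stand-in for the new profile store
        self.read_from_new = False  # flipped once the two stores agree

    def save(self, member_id, profile):
        self.legacy[member_id] = profile  # legacy stays authoritative for now
        self.new[member_id] = profile     # shadow write to the new store

    def load(self, member_id):
        store = self.new if self.read_from_new else self.legacy
        return store.get(member_id)

    def mismatches(self):
        """Sanity check before cut-over: IDs whose two copies differ or are missing."""
        return sorted(k for k in self.legacy if self.new.get(k) != self.legacy[k])
```

Rows that existed before double writes began show up in `mismatches()` until they are backfilled, which is the signal that it is not yet safe to flip `read_from_new` or drop the legacy tables.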
After Legacy Tables Dropped
Auld Lang Syne

Wrap up
The moral of the story is…
> Keep your system BASE
> Use commodity hardware
> Use pixie dust (AKA data freshness token)
> Evolve slowly - no big bang!

Q&A
Appendix
Performance
Often mixed up with scalability
> Performance
  A numerical value given to a single system when asked to do a task under nominal load
  If the system responds poorly without load, it will assuredly continue its molasses response time under load

Availability
Often mixed up with reliability
> Availability
  A numerical value given to a system that defines the proportion of time a system is in a functioning condition.
  The most common scoring system is called nines, defined as uptime divided by the sum of uptime and downtime; five nines = 0.99999

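The nines arithmetic is easy to check by hand. The two helpers below are illustrative names, not part of the deck; they compute availability from uptime and downtime, and the yearly downtime budget implied by a given number of nines, assuming a 365-day year.

```python
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year

def availability(uptime, downtime):
    """Uptime as a fraction of total time; both arguments use the same unit."""
    return uptime / (uptime + downtime)

def allowed_downtime_minutes_per_year(nines):
    """Downtime budget for an availability with the given number of nines (5 -> 0.99999)."""
    unavailability = 10 ** -nines
    return MINUTES_PER_YEAR * unavailability

five_nines_budget = allowed_downtime_minutes_per_year(5)    # about 5.3 minutes a year
three_nines_budget = allowed_downtime_minutes_per_year(3)   # about 8.8 hours a year
```

Each extra nine cuts the budget by a factor of ten, which is why the step from three nines to five usually requires redundancy rather than just better hardware.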
Reliability
The ability for a system to perform its functionality
> Reliability
  A system can be 100% available and still be 100% unreliable (e.g. inconsistent caching)
  A person can consistently give you the wrong answer
Architecture is defined as the balance of the ilities and cost

Scalability
The term that many think is the holy grail
> Scalability
  The ability for a system to manage more traffic or to be “scaled” as more traffic appears
  System slows with multiple users vs. single user
  Route, Partition, Orchestrate, replicate, and go asynch
  Split the system horizontally
  Rarely scale vertically

The rest of the ilities
The ones that people tend to ignore till it’s too late
> Manageability
  It is a double-edged sword which can be easily ignored
> Serviceability
  Here complexity starts to rear its ugly head
> Maintainability
  Of course maintainability tends to run upstream of complexity