Breaking The Monolith: Fast Distributed Web Services Using Sets (Feb13, NUS)
Breaking the Monolith
Cristobal Viedma
Fast Distributed Web Services Using Sets
What is Viki?
Cross-cultural communication and understanding
Technology
Web: viki.com, viki.56.com, viki.pptv.com, youtube.com, yahoo.com, msn.com...
Mobile: Android, iOS, BlackBerry, Windows Phone, Samsung Bada, Kindle Fire...
TV: Google TV, Samsung SmartTV, Roku...
Top 5 cities: Singapore, Santiago, La Victoria, Jakarta, New York
Data: viki.com Oct'12
~22M users, 2500% mobile growth in 2012
...viewers start leaving if a video doesn't play in 2 seconds, and with every additional second of delay about 6% more viewers jump ship!
Video Stream Quality Impacts Viewer Behavior: Inferring Causality Using Quasi-Experimental Designs. S. Shunmuga Krishnan, Ramesh K. Sitaraman, 2012
Large ecosystem
Global audience
User growth
Performance
Breaking the monolith
Platform
Scalability
Availability
Performance
Performance
Network time
Generation time
Render time
Read heavy (writes can wait)
Everything fits in memory
Goal: 25ms uncached (10-100x better)
Essence
Hyperion (platform cluster)
Data structure
Bitmaps
genre:1 -> 010101001
genre:2 -> 010010011
type:music -> 011000001
intersect -> 010000001 (ids: 2 and 9)
Good: speed for intersect & memory efficiency
Bad: getting the ids and the real data, sorting, paging...
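The bitmap example above can be sketched in a few lines of Python. This is only an illustration, not Viki's implementation: the helper names are hypothetical, and bit strings are read left to right with ids starting at 1, as on the slide.

```python
def intersect_bitmaps(*bitmaps: str) -> str:
    """AND together equal-length bit strings such as '010101001'."""
    width = len(bitmaps[0])
    result = (1 << width) - 1          # start with every bit set
    for bits in bitmaps:
        result &= int(bits, 2)
    return format(result, f"0{width}b")


def bitmap_to_ids(bits: str) -> list[int]:
    """Return the 1-based positions of the set bits, left to right."""
    return [pos for pos, bit in enumerate(bits, start=1) if bit == "1"]


genre_1 = "010101001"
genre_2 = "010010011"
type_music = "011000001"

matches = intersect_bitmaps(genre_1, genre_2, type_music)
ids = bitmap_to_ids(matches)
```

The AND itself is cheap; the extra pass in bitmap_to_ids to recover the ids is the step the slide lists as a downside.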
Everything is a set
genre:1 = [1, 2, 4]
type:1 = [1, 2, 5]
type:2 = [3, 4]
Good: Sparse (too many 0s with bitmaps)
Complexity: O(n*m) (m is the number of sets, n the number of elements in the smallest set)
genre:1 -> 10 elements
videos -> 100K elements
complexity: O(10)
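A minimal sketch of why the cost follows the smallest set rather than the largest one. The function name and the sample data are made up for illustration; the real intersection runs inside Redis.

```python
def intersect_smallest_first(*sets: set[int]) -> set[int]:
    """Intersect sets by scanning only the smallest one."""
    smallest, *others = sorted(sets, key=len)
    # Each element of the smallest set is probed against the other sets,
    # so the work is bounded by len(smallest) * number of sets.
    return {item for item in smallest if all(item in other for other in others)}


genre_1 = set(range(1, 11))          # 10 video ids (made-up data)
videos = set(range(1, 100_001))      # 100K video ids (made-up data)

result = intersect_smallest_first(videos, genre_1)
```

Only the 10 ids in genre_1 are visited, no matter how large videos grows.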
Building our own indexes
Data stored
Keeping track of the indexes
How do we find data...
redis.call('sort', my_set, 'BY', 'v:*->created_at', 'desc', 'LIMIT', offset, count, 'GET', 'v:*->details')
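To make the query above easier to read, here is a pure-Python emulation of what that SORT call does: order the ids in a set by the created_at field of each v:<id> hash, newest first, take one page, and return each hash's details field. The in-memory store dict, the function name, and the sample data are assumptions for illustration only.

```python
def sort_by_created_at(store, my_set, offset, count):
    """Emulate: SORT my_set BY v:*->created_at DESC LIMIT offset count GET v:*->details."""
    ordered = sorted(
        store[my_set],
        key=lambda vid: float(store[f"v:{vid}"]["created_at"]),
        reverse=True,                  # DESC: newest first
    )
    page = ordered[offset:offset + count]
    return [store[f"v:{vid}"]["details"] for vid in page]


# Hypothetical data: one index set plus one hash per video.
store = {
    "genre:1": {1, 2, 4},
    "v:1": {"created_at": "1356998400", "details": '{"title": "A"}'},
    "v:2": {"created_at": "1359676800", "details": '{"title": "B"}'},
    "v:4": {"created_at": "1357084800", "details": '{"title": "C"}'},
}

first_page = sort_by_created_at(store, "genre:1", offset=0, count=2)
```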
Old system: not in (id, id, id)
First attempt: hash with a list of rules
Permutations (too many countries in the world!)
CAP -> 10GB. meh~
Alias matching permutations: 800MB ;)
Redis 32-bit, even better!
CAP is just another set! (well, but a DIFF!)
Holdbacks
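The "just another set, but a DIFF" idea can be sketched as a set difference. This assumes a holdback set holds the ids that must be hidden for a given country; the variable names and data are hypothetical. Redis exposes the same operation as SDIFF.

```python
def apply_holdbacks(candidates: set[int], held_back: set[int]) -> set[int]:
    """Remove held-back ids from the candidate ids (a set difference)."""
    return candidates - held_back


genre_1 = {1, 2, 4}
type_1 = {1, 2, 5}
holdbacks_sg = {2}                     # hypothetical: ids hidden for one country

visible = apply_holdbacks(genre_1 & type_1, holdbacks_sg)
```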
Hacking Redis
Vfind: Building our own Redis function
Setlets: Pre-calculated sorted lists
Most requests 18~20ms, some cases 100ms (depends on the bigger set)
Vfind only gets content to fill 1 page: 15ms
Paging just showing more: 9ms
Serialization of JSON: 5ms
Enough... for now!
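The slides do not show how Vfind works internally, so the following is only a guess at the general idea: walk a pre-calculated sorted list, keep the ids that pass every filter, and stop as soon as one page is full. All names and data here are hypothetical.

```python
from itertools import islice


def find_page(sorted_ids, filters, excluded, offset, count):
    """Walk a pre-sorted id list lazily and stop once the page is full."""
    matching = (
        vid for vid in sorted_ids
        if all(vid in f for f in filters) and vid not in excluded
    )
    # islice consumes the generator only up to offset + count matches.
    return list(islice(matching, offset, offset + count))


newest_first = [9, 8, 7, 6, 5, 4, 3, 2, 1]   # hypothetical pre-sorted list
genre_1 = {1, 2, 4, 6, 8, 9}
type_1 = {1, 2, 5, 6, 8}
held_back = {8}

page_1 = find_page(newest_first, [genre_1, type_1], held_back, offset=0, count=2)
page_2 = find_page(newest_first, [genre_1, type_1], held_back, offset=2, count=2)
```

Because the list is already in display order, no sort is needed at request time, and the scan ends early when the first page fills up.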
Lists
A list is just a sorted set. E.g. a list of subscriptions, list of featured content...
Is a set!
You can apply holdbacks or any other filter.
The platform
Each vertical is a source of truth
Logical and operational reasons
Everything routed through api.viki.io
Many web services
Oceanus (Videos)
Gaia (Users)
Aphrodite (Community)
Activities (Behaviour)
Queue
Centralized queue for events and messages
Event-driven web services
Messages must be idempotent
Message / Events Queue
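One common way to make a consumer safe against repeated deliveries is to remember which message ids it has already applied. The sketch below assumes each message carries a unique id; the class, field names, and event are hypothetical and not taken from Viki's services.

```python
class SubscriberCountHandler:
    """Hypothetical consumer that tolerates duplicate deliveries."""

    def __init__(self):
        self.processed_ids = set()
        self.subscribers = {}

    def handle(self, message: dict) -> bool:
        """Apply a message once; return False if it was already applied."""
        if message["id"] in self.processed_ids:
            return False
        channel = message["channel"]
        self.subscribers[channel] = self.subscribers.get(channel, 0) + 1
        self.processed_ids.add(message["id"])
        return True


handler = SubscriberCountHandler()
event = {"id": "evt-1", "channel": "kdrama"}
first = handler.handle(event)
second = handler.handle(event)      # redelivery of the same event
```

Processing the same event twice leaves the count unchanged, which is the property the queue relies on.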
Prometheus
Central logging system
Graphics, stats & logs
Improve visibility
Logs: Logstash (collector), ElasticSearch (backend), Kibana (visualization)
Stats: StatsD (collector), Graphite (backend), Tasseo / Cubism (visualizations)
Architecture diagram layers: Clients, Platform, Content, Knowledge
Clients: Partners (Samsung, YouTube, Renren...), Devices (Android, iPhone, TV...), Third-party developers (Open API), Viki.com (distributed thin client)
Services: Hyperion (platform cluster), Oceanus (Videos), Gaia (Users), Aphrodite (Community), Subber (Subs, segs), Message / Events Queue, Prometheus (Logs)
Tools: Analytics, Baboon (CMS), Subtitling tool, Segmenting tool
Web services and APIs
Centralized queue for messages
Nothing new...
... just having fun and learning.
:)
[email protected] btw, we're hiring!