Cloud Computing Class 7
Miguel Saez @masaez
Johnny Halife @johnnyhalife
Matias Woloski @woloski
Based on a slide deck by Steve Huffman, presented in May 2010
Lessons Learned at Reddit
• Site: Reddit.com
• Goal: understand what it takes to build a web application that serves 270 million page views per month
• http://vimeo.com/10506751
• Key points
– Open schema
– Asynchronous processing
– Stateless
– Caching
Reddit.com
A brief history of reddit
• Founded in June 2005
• Acquired by Condé Nast in October 2007
• 7.5 million users / month
• 270 million page views / month
• Many mistakes along the way
Lesson 1: Crash!
• …and restart.
• Daemontools (supervise)
• Single greatest improvement to uptime we ever made.
• When in doubt, let it die.
• Don’t forget to read the logs!
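The crash-and-restart idea can be illustrated with a very small supervisor loop. This is only a sketch of the pattern, not how daemontools is implemented: the function name `supervise` and its `max_restarts` and `delay` parameters are assumptions made for the example, and a real supervisor would restart the process indefinitely rather than stop after a fixed count.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts, delay=1.0):
    """Run cmd and restart it every time it exits.

    Hypothetical helper: stops after max_restarts restarts so the
    example terminates. Returns the exit code of every run.
    """
    exit_codes = []
    while True:
        code = subprocess.run(cmd).returncode
        exit_codes.append(code)
        # Log every exit so there is something to read afterwards.
        print(f"{cmd[0]} exited with code {code}", file=sys.stderr)
        if len(exit_codes) > max_restarts:
            return exit_codes
        time.sleep(delay)


if __name__ == "__main__":
    # A child process that always crashes, to show the restart loop.
    crashing = [sys.executable, "-c", "raise SystemExit(1)"]
    supervise(crashing, max_restarts=2, delay=0.1)
```

The application itself contains no recovery logic; it simply dies and the supervisor brings it back, while the log keeps a record of each exit.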
Lesson 2: Separation of services
• Often, going from one to two machines more than doubles performance.
• Group similar processes together.
• Group similar types of data together.
• Better caching.
• Less contention for CPU.
• Avoid threads. Processes are easier to separate later.
Lesson 3: Open Schema
ID     UPS  DOWNS  TITLE                                   URL
12345  120  34     Boffins Create Zombie Dog!              www.someaussiesite.co.au/dog.html
12346  3    24     Check out my new blog!                  noobspamer.blogspot.com
12347  509  167    Pee in a sink if you’ve ever voted up.  self
Lesson 3: Open Schema
In the early days:
• Too much time spent thinking about the database.
• Every feature required a schema update.
• Schema updates became more painful as we grew.
• Maintaining replication was difficult.
• Deployment was complex.
Lesson 3: Open Schema
Thing
ID     UPS  DOWNS  TYPE
12345  120  34     Link
12346  3    24     Link

Data
THING_ID  KEY    VALUE
12345     Title  Boffins Create Zombie Dog!
12345     URL    www.someaussiesite.com.au/zombiedog.html
12346     Title  Pee in a sink if you’ve ever voted up.
12346     URL    self
Lesson 3: Open Schema
With an open schema:
• Faster development
• Easier deployment
• Maintainable database replication
• No joins = easy to distribute
• Must be careful to maintain consistency
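The Thing/Data layout can be tried out with SQLite from the Python standard library. This is a minimal sketch under assumptions: the slides do not say which database or column types reddit used, and the helper names `open_db`, `save_thing` and `load_thing` are hypothetical.

```python
import sqlite3


def open_db():
    """Create an in-memory database with the two open-schema tables."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE thing (
            id INTEGER PRIMARY KEY, ups INTEGER, downs INTEGER, type TEXT);
        CREATE TABLE data (
            thing_id INTEGER, key TEXT, value TEXT,
            PRIMARY KEY (thing_id, key));
    """)
    return conn


def save_thing(conn, thing_id, type_, ups, downs, attrs):
    """Store the fixed columns in thing and everything else as key/value rows."""
    conn.execute("INSERT INTO thing VALUES (?, ?, ?, ?)",
                 (thing_id, ups, downs, type_))
    conn.executemany("INSERT INTO data VALUES (?, ?, ?)",
                     [(thing_id, k, v) for k, v in attrs.items()])


def load_thing(conn, thing_id):
    """Rebuild one thing with two simple lookups and no join."""
    row = conn.execute(
        "SELECT id, ups, downs, type FROM thing WHERE id = ?",
        (thing_id,)).fetchone()
    if row is None:
        return None
    thing = dict(zip(("id", "ups", "downs", "type"), row))
    rows = conn.execute(
        "SELECT key, value FROM data WHERE thing_id = ?", (thing_id,))
    thing.update(dict(rows))
    return thing


if __name__ == "__main__":
    conn = open_db()
    save_thing(conn, 12345, "Link", 120, 34, {
        "Title": "Boffins Create Zombie Dog!",
        "URL": "www.someaussiesite.com.au/zombiedog.html",
    })
    # A new attribute is just another row: no schema update is needed.
    conn.execute("INSERT INTO data VALUES (?, ?, ?)",
                 (12345, "Thumbnail", "pending"))
    print(load_thing(conn, 12345))
```

Nothing in the database checks that a given key exists or has the right type, so that responsibility moves into the application code, which is the consistency caveat from the slide.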
Lesson 4: Keep it stateless
• Goal: any app server can handle any request
• App server failure/restart is no big deal
• Scaling is straightforward
• Caching must be independent from a specific app server.
Lesson 5: Memcache everything
• Database data
• Session data
• Rendered pages
• Memoizing internal functions
• Rate-limiting (user actions, crawlers)
• Storing pre-computed listings/pages
• Global locking
• Memcachedb for persistence
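Two of the uses listed above, memoizing internal functions and rate-limiting, can be sketched as follows. To keep the example self-contained it uses `LocalCache`, a hypothetical in-process stand-in and not a real memcached client; `memoize` and `allow_action` are likewise names invented for this sketch. With a real memcached server the counter would rely on its atomic add and incr commands instead of the read-then-write shown here.

```python
import functools
import time


class LocalCache:
    """In-process stand-in for a shared cache (hypothetical, not a client API)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._items = {}  # key -> (value, expires_at)

    def get(self, key):
        item = self._items.get(key)
        if item is None or item[1] <= self._clock():
            self._items.pop(key, None)  # drop expired entries lazily
            return None
        return item[0]

    def set(self, key, value, ttl):
        self._items[key] = (value, self._clock() + ttl)

    def incr(self, key, ttl):
        """Increment a counter; a new counter expires ttl seconds from now."""
        current = self.get(key)
        if current is None:
            self.set(key, 1, ttl)
            return 1
        expires_at = self._items[key][1]  # keep the original expiry
        self._items[key] = (current + 1, expires_at)
        return current + 1


def memoize(cache, ttl):
    """Cache a function's results by name and arguments for ttl seconds."""
    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args):
            key = f"{func.__name__}:{args!r}"
            cached = cache.get(key)
            if cached is not None:  # note: None results are never cached
                return cached
            value = func(*args)
            cache.set(key, value, ttl)
            return value
        return wrapper
    return decorator


def allow_action(cache, user, action, limit, window):
    """Fixed-window limit: at most `limit` actions per `window` seconds."""
    return cache.incr(f"rate:{action}:{user}", ttl=window) <= limit
```

Because the counters and memoized values live in the cache rather than in an app server's memory, any app server can answer any request, which matches the stateless goal of Lesson 4.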
Lesson 6: Store redundant data
• Recipe for slow: keep data normalized until you need it.
• If data has multiple presentations, store it multiple times in multiple formats.
• Disk and memory are less costly than making your users wait.
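As an illustration of storing one piece of data once per presentation, the sketch below keeps two pre-sorted copies of the same link IDs, one ordered by score and one by creation time, so that reads never sort. The `Listings` class, the use of ups minus downs as the score, and the timestamps are assumptions for the example; the slides do not describe reddit's actual listing format.

```python
import bisect


class Listings:
    """Keeps one pre-sorted copy of the link IDs per presentation."""

    def __init__(self):
        self.by_score = []  # sorted list of (-score, link_id)
        self.by_date = []   # sorted list of (-created, link_id)

    def add(self, link_id, score, created):
        # Pay the sorting cost at write time, once per presentation.
        bisect.insort(self.by_score, (-score, link_id))
        bisect.insort(self.by_date, (-created, link_id))

    def top(self, n):
        """Highest-scored links first; a plain slice, no sorting."""
        return [link_id for _, link_id in self.by_score[:n]]

    def newest(self, n):
        """Most recently created links first."""
        return [link_id for _, link_id in self.by_date[:n]]


if __name__ == "__main__":
    listings = Listings()
    # Scores are ups minus downs from the example table;
    # the creation times are made up for the example.
    listings.add(12345, 120 - 34, created=100)
    listings.add(12346, 3 - 24, created=200)
    listings.add(12347, 509 - 167, created=300)
    print(listings.top(2), listings.newest(2))
```

The same IDs are stored twice, which costs memory and means every write touches both copies, in exchange for reads that are a single slice.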
Lesson 7: Work offline
• Do the minimum amount of work to end the request.
• Everything else can be done offline.
• An architecture of queues is simple and easy to scale.
• AMQP/RabbitMQ.
Lesson 7: Work offline
• Pre-computing listings
• Fetching thumbnails
• Detecting cheating
• Removing spam
• Computing awards
• Updating the “search” index
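The split between the request path and offline work can be sketched with the standard library's `queue.Queue` standing in for an AMQP broker such as RabbitMQ. All names here (`handle_submit`, `fetch_thumbnail`, `check_spam`, `run_worker`) are hypothetical, and the job bodies are placeholders. In a real deployment the worker would be a separate process consuming from the broker.

```python
import queue

jobs = queue.Queue()  # stand-in for a broker queue


def handle_submit(link_id, url, store):
    """Request path: save the link, enqueue the slow work, return at once."""
    store[link_id] = {"url": url, "thumbnail": None, "spam_checked": False}
    jobs.put(("thumbnail", link_id))
    jobs.put(("spam_check", link_id))
    return {"status": "ok", "id": link_id}


def fetch_thumbnail(link_id, store):
    # Placeholder for downloading and resizing an image.
    store[link_id]["thumbnail"] = f"thumb-{link_id}.png"


def check_spam(link_id, store):
    # Placeholder for a real spam check.
    store[link_id]["spam_checked"] = True


HANDLERS = {"thumbnail": fetch_thumbnail, "spam_check": check_spam}


def run_worker(jobs, store):
    """Worker loop: process jobs until the queue is empty.

    Returns the number of jobs processed.
    """
    done = 0
    while True:
        try:
            kind, link_id = jobs.get_nowait()
        except queue.Empty:
            return done
        HANDLERS[kind](link_id, store)
        jobs.task_done()
        done += 1


if __name__ == "__main__":
    store = {}
    print(handle_submit(12345, "www.someaussiesite.com.au/zombiedog.html", store))
    print(run_worker(jobs, store), store[12345])
```

The user gets a response as soon as the link is saved; the thumbnail and spam status appear later, once a worker has drained the queue.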
Lesson 7: Work offline
[Architecture diagram. Labels: Request, App Servers, Cache, Master Databases, Worker Databases, Queue, Precomputer, Thumbnailer, Spam]
Assignment
• Reimplement the ranking functionality of “el Prode” using what you learned from this presentation