HBase Feed Aggregator Wurbe 25

Post on 06-May-2015

feed aggregator powered by hbase & python

Andrei Savu

wurbe #25

Objectives

Highly scalable feed aggregator
Play with python & thrift
Provide some sample code
Provide detailed install instructions
Learn new stuff

Table Structure

3 tables: Feeds, Urls, UrlsIndex

Feeds: all feeds
Urls: data extracted from feeds
UrlsIndex: index table

Source code

http://github.com/andreisavu/feedaggregator

detailed install instructions

Lessons learned

Lesson #1: HBase Game Rules

Not relational
No joins
No sophisticated query engine
No column typing
No transactions
No secondary indices

... all done in application code
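
A minimal sketch of what "done in application code" can look like for a secondary index. It is not taken from the feedaggregator repository: plain dicts stand in for the Urls and UrlsIndex tables, and the column names (`Content:url`, `Content:title`, `Url:key`), the MD5 row key and the helper names are assumptions made for illustration. The index key follows the `<cat>/<w3c_timestamp>` shape from Lesson #2.

```python
import hashlib

# In-memory stand-ins for the two HBase tables (assumption: row key -> columns).
urls = {}
urls_index = {}


def store_url(url, title, category, w3c_timestamp):
    """Write the data row, then write the index row that points at it."""
    # Hypothetical row key scheme: MD5 of the URL.
    row_key = hashlib.md5(url.encode("utf-8")).hexdigest()
    urls[row_key] = {"Content:url": url, "Content:title": title}
    # No secondary indices in HBase: the application maintains this row itself.
    # Two entries with the same category and timestamp would overwrite each other.
    urls_index[category + "/" + w3c_timestamp] = {"Url:key": row_key}
    return row_key


def latest(category, limit=10):
    """Emulate a prefix scan on the index, newest first, then fetch the data rows."""
    prefix = category + "/"
    keys = sorted(k for k in urls_index if k.startswith(prefix))
    newest_first = list(reversed(keys))[:limit]
    # The "join" between index and data happens here, in application code.
    return [urls[urls_index[k]["Url:key"]] for k in newest_first]
```

Because there are no transactions, the two writes in `store_url` are not atomic; a real application has to tolerate an index row without a data row, or the reverse.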

Lesson #2: Design your index

<cat>/<w3c_timestamp>

time sorting = lexicographic sorting
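
A short sketch of building such a key. The function name `index_key` is hypothetical, and normalizing to UTC with a fixed-width format is an assumption: the equivalence between time order and string order only holds when every timestamp has the same width and the same offset.

```python
from datetime import datetime, timezone


def index_key(category, when):
    """Build a '<cat>/<w3c_timestamp>' row key from a timezone-aware datetime."""
    if when.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Same offset (UTC) and same width for every key, so that comparing
    # the strings byte by byte gives the same order as comparing the times.
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return category + "/" + stamp
```

With keys built this way, sorting the index rows of one category as plain strings also sorts them by time, which is the order HBase keeps row keys in.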

Lesson #3: No charsets

convert everything to bytes

... but store the original charset
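
One reading of this lesson, as a sketch: keep the feed payload as raw bytes and put the charset name in a column next to it, so the text can be decoded again on the way out. The column names (`Content:raw`, `Content:charset`) and the helper names `pack` and `unpack` are assumptions, not taken from the project.

```python
def pack(raw, charset):
    """Return the cells to store for a payload received as bytes."""
    # Fail early if the declared charset does not match the bytes.
    raw.decode(charset)
    return {
        "Content:raw": raw,                          # stored untouched, as bytes
        "Content:charset": charset.encode("ascii"),  # the original charset, also as bytes
    }


def unpack(cells):
    """Rebuild the text from the stored cells."""
    charset = cells["Content:charset"].decode("ascii")
    return cells["Content:raw"].decode(charset)
```

For example, `unpack(pack("café".encode("latin-1"), "latin-1"))` gives back the original text without HBase ever knowing which charset was involved.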

Questions?

http://www.andreisavu.ro

http://twitter.com/andreisavu

contact@andreisavu.ro