HBase Feed Aggregator Wurbe 25
Uploaded by andrei-savu
Transcript of HBase Feed Aggregator Wurbe 25
Feed aggregator powered by HBase & Python
Andrei Savu
wurbe #25
Objectives
Highly scalable feed aggregator
Play with Python & Thrift
Provide some sample code
Provide detailed install instructions
Learn new stuff
Table Structure
3 tables: Feeds, Urls, UrlsIndex
Feeds: all feeds
Urls: data extracted from feeds
UrlsIndex: index table
Source code
http://github.com/andreisavu/feedaggregator
detailed install instructions
Lessons learned
Lesson #1: HBase Game Rules
Not relational
No joins
No sophisticated query engine
No column typing
No transactions
No secondary indices
... all done in application code
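As a rough illustration of what "done in application code" can mean, the sketch below keeps a data table and a hand-maintained index table in sync. Plain dicts stand in for the Urls and UrlsIndex tables; the column names and the helper functions add_url and urls_in_category are hypothetical and are not taken from the project's source.

```python
# In-memory stand-ins for two HBase tables. A real client would issue
# put and scan calls over Thrift; the column names here are made up.
urls = {}        # row key: url -> {column: value}
urls_index = {}  # row key: "<category>/<timestamp>" -> url


def add_url(url, category, timestamp, title):
    """Write the data row and the index row; the store will not do the second write for us."""
    urls[url] = {"category": category, "timestamp": timestamp, "title": title}
    urls_index[f"{category}/{timestamp}"] = url


def urls_in_category(category):
    """Emulate a prefix scan on the index table, then do the 'join' by hand."""
    prefix = category + "/"
    keys = sorted(k for k in urls_index if k.startswith(prefix))
    return [urls[urls_index[k]]["title"] for k in keys]
```

Two entries with the same category and timestamp would overwrite each other in this sketch; a real key would need an extra distinguishing suffix.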
Lesson #2: Design your index
<cat>/<w3c_timestamp>
time sorting = lexicographic sorting
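A minimal sketch of such a key, assuming the timestamp is normalised to UTC and written in a fixed-width W3C form so that string order matches time order. The function name make_index_key is hypothetical, and the datetime passed in is assumed to be timezone-aware.

```python
from datetime import datetime, timezone


def make_index_key(category, when):
    """Build '<cat>/<w3c_timestamp>' using a fixed-width UTC timestamp."""
    # Fixed-width fields (zero-padded month, day, hour, ...) are what make
    # lexicographic comparison agree with chronological comparison.
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return f"{category}/{stamp}"


older = make_index_key("tech", datetime(2009, 5, 1, 10, 0, tzinfo=timezone.utc))
newer = make_index_key("tech", datetime(2009, 12, 1, 9, 0, tzinfo=timezone.utc))
print(sorted([newer, older]))
```

Mixing time zone offsets in the stored string would break the ordering, which is why the sketch converts everything to UTC first.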
Lesson #3: No charsets
convert everything to bytes
... but store the original charset
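One way to read this lesson, sketched under the assumption that the charset name is stored as a second value next to the data. The helpers encode_cell and decode_cell are hypothetical names, not part of any HBase client.

```python
def encode_cell(text, charset="utf-8"):
    """Return the two byte values to store: the encoded text and the charset name."""
    return text.encode(charset), charset.encode("ascii")


def decode_cell(data, charset_bytes):
    """Recover the text using the charset name that was stored with it."""
    return data.decode(charset_bytes.decode("ascii"))


data, charset = encode_cell("caf\u00e9", "latin-1")
print(data, charset, decode_cell(data, charset))
```

Without the stored charset, the byte 0xE9 above could not be decoded reliably, since it is not valid UTF-8 on its own.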