Scalable PostgreSQL as your data platform -...
![Page 2: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/2.jpg)
1. What is a data platform?
2. Why PostgreSQL?
3. Extensions, Extensions, Extensions
• HSTORE – semi-structured data in your DB
• HLL – distinct counts using mathematical magic
• CSTORE – a fast columnar store for PostgreSQL
4. How CitusDB lets you scale PostgreSQL
![Page 3: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/3.jpg)
A Data Platform solves Data Problems
![Page 4: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/4.jpg)
Data Problems
• LOTS of data
• Changing needs
• Need a way to work with data I understand
![Page 5: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/5.jpg)
Data Platform
• Store and query ALL your data
• Scalable
• Cost effective
• Extensible
![Page 6: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/6.jpg)
1. What is a data platform?
2. Why PostgreSQL?
3. Extensions, Extensions, Extensions
• HSTORE – semi-structured data in your DB
• HLL – distinct counts using mathematical magic
• CSTORE – a fast columnar store for PostgreSQL
4. How CitusDB lets you scale PostgreSQL
![Page 7: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/7.jpg)
PostgreSQL
• http://www.postgresql.org/
• Used by:
![Page 8: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/8.jpg)
1. What is a data platform?
2. Why PostgreSQL?
3. Extensions, Extensions, Extensions
• HSTORE – semi-structured data in your DB
• HLL – distinct counts using mathematical magic
• CSTORE – a fast columnar store for PostgreSQL
4. How CitusDB lets you scale PostgreSQL
![Page 9: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/9.jpg)
![Page 10: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/10.jpg)
1. What is a data platform?
2. Why PostgreSQL?
3. Extensions, Extensions, Extensions
• HSTORE – semi-structured data in your DB
• HLL – distinct counts using mathematical magic
• CSTORE – a fast columnar store for PostgreSQL
4. How CitusDB lets you scale PostgreSQL
![Page 11: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/11.jpg)
What is HSTORE?
A data type that stores semi-structured data as key/value pairs. Compare to JSON.
{
  "referrer" => "…",
  "ua" => "Mozilla/5…",
  "cc" => "max-age…",
  "ae" => "UTF-8",
  "host" => "www…"
}
![Page 12: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/12.jpg)
24.84.219.32 [10/Apr/2003:04:20:07 -0700] "GET /archive/cat/games/index.shtml HTTP/1.1" "http://www.google.ca/search?q=games+speed&ie=UTF-8&start=40" "Mozilla/4.0 (MSIE 6.0; Windows 98; .NET CLR 1.0.3705)"
…
![Page 13: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/13.jpg)
![Page 14: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/14.jpg)
postgres=# CREATE EXTENSION hstore;
...
postgres=# \d access_logs
Table "public.access_logs"
┌──────────────────────┬─────────────────────────────┬───────────┐
│ Column               │ Type                        │ Modifiers │
├──────────────────────┼─────────────────────────────┼───────────┤
│ time                 │ timestamp without time zone │           │
│ request_ip           │ inet                        │           │
│ request_path         │ text                        │           │
│ referer_domain       │ text                        │           │
│ referer_path         │ text                        │           │
│ referer_query_params │ hstore                      │           │
└──────────────────────┴─────────────────────────────┴───────────┘
postgres=#
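How a referrer URL from the access log could be split into the referer_domain, referer_path, and referer_query_params columns can be sketched in Python. This is an illustration only, not the loader used in the talk: the helper names `split_referer` and `to_hstore_literal` are hypothetical, and values are left URL-encoded rather than decoded.

```python
from urllib.parse import urlsplit


def split_referer(referer):
    """Split a referrer URL into (domain, path, query parameters)."""
    parts = urlsplit(referer)
    params = {}
    for pair in parts.query.split("&"):
        if not pair:
            continue
        key, _, value = pair.partition("=")
        params[key] = value  # kept URL-encoded on purpose
    return parts.netloc, parts.path, params


def to_hstore_literal(params):
    """Render a dict in hstore's text form: "key"=>"value", ..."""
    def quote(text):
        # Escape backslashes and double quotes inside keys and values.
        return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'

    return ", ".join(f"{quote(k)}=>{quote(v)}" for k, v in params.items())


# Referrer taken from the sample log line in this deck.
referer = "http://www.google.ca/search?q=games+speed&ie=UTF-8&start=40"
domain, path, params = split_referer(referer)
print(domain, path)
print(to_hstore_literal(params))
```

Keeping the parameters as one key/value column means a new query-string key needs no schema change.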
![Page 15: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/15.jpg)
postgres=# select referer_query_params from access_logs where referer_domain LIKE '%google%' LIMIT 5;
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                            referer_query_params                                            │
├────────────────────────────────────────────────────────────────────────────────────────────────────────────┤
│ "q"=>"cd+rom+drive+open+html", "hl"=>"en", "ie"=>"ISO-8859-1"                                              │
│ "q"=>"Heart+of+the+Alien", "hl"=>"it", "ie"=>"UTF-8", "oe"=>"UTF-8", "sourceid"=>"navclient"               │
│ "q"=>"star+war+kid+download", "hl"=>"en", "ie"=>"ISO-8859-1", "lr"=>"", "safe"=>"off"                      │
│ "q"=>"stupid+cubs+fan+video", "hl"=>"en", "ie"=>"UTF-8", "lr"=>"", "oe"=>"UTF-8", "sa"=>"N", "start"=>"10" │
│ "q"=>"star+wars+kid", "hl"=>"en", "ie"=>"UTF-8", "oe"=>"UTF-8"                                             │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(5 rows)
postgres=#
![Page 16: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/16.jpg)
postgres=# select count(*),
  CASE WHEN referer_query_params->'q' ~ 'star.*wars' THEN 'star wars' ELSE 'not star wars' END AS search_type
  FROM access_logs
  WHERE referer_domain='www.google.com' AND referer_query_params ? 'q'
  GROUP BY search_type;
┌───────┬───────────────┐
│ count │ search_type   │
├───────┼───────────────┤
│ 25316 │ star wars     │
│ 38251 │ not star wars │
└───────┴───────────────┘
(2 rows)
postgres=#
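In this query, `->` fetches the value stored under a key, `?` tests whether a key exists, and `~` is an unanchored regular-expression match. The Python sketch below mirrors that logic with plain dicts over the five sample rows from the earlier output, plus one hypothetical row that has no q key. It is an analogy for reading the query, not a description of how PostgreSQL executes it.

```python
import re
from collections import Counter

# Five sample rows from the deck, plus one hypothetical row without "q".
rows = [
    {"q": "cd+rom+drive+open+html", "hl": "en", "ie": "ISO-8859-1"},
    {"q": "Heart+of+the+Alien", "hl": "it", "ie": "UTF-8", "oe": "UTF-8",
     "sourceid": "navclient"},
    {"q": "star+war+kid+download", "hl": "en", "ie": "ISO-8859-1", "lr": "",
     "safe": "off"},
    {"q": "stupid+cubs+fan+video", "hl": "en", "ie": "UTF-8", "lr": "",
     "oe": "UTF-8", "sa": "N", "start": "10"},
    {"q": "star+wars+kid", "hl": "en", "ie": "UTF-8", "oe": "UTF-8"},
    {"hl": "en"},  # hypothetical: filtered out by the key-existence test
]


def search_type(params):
    """CASE WHEN params->'q' ~ 'star.*wars' THEN ... ELSE ... END"""
    if re.search(r"star.*wars", params["q"]):
        return "star wars"
    return "not star wars"


# WHERE params ? 'q' ... GROUP BY search_type, with count(*) per group.
counts = Counter(search_type(p) for p in rows if "q" in p)
print(counts)
```

Note that "star+war+kid+download" lands in the "not star wars" group, because the pattern requires the literal text "wars".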
![Page 17: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/17.jpg)
HSTORE goodness
• Support for indexing on HSTORE
• Still under development, coming soon: nested HSTORE support
![Page 18: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/18.jpg)
?
![Page 19: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/19.jpg)
![Page 20: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/20.jpg)
[Figure: funnel bar chart "Users from Homepage"; x-axis: Homepage, Search, View, Purchase; y-axis: Num Users (0 to 40); labels on the chart: 25%, 84%, 40%]
![Page 21: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/21.jpg)
Original Model
![Page 22: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/22.jpg)
WITH "login" AS (
  SELECT event.user_id, min(event.time) AS mintime
  FROM event_pruned "event"
  WHERE (("type" = 'click' AND ("target_tag" = 'input' AND "target_id" = 'item_complete'))
    AND "app_id" = 3212870315 AND "event".app_id = 3212870315)
  GROUP BY event.user_id
),
"view_article" AS (
  SELECT event.user_id, min(event.time) AS mintime
  FROM event_pruned "event" NATURAL JOIN "login"
  WHERE (("type" = 'submit' AND ("target_class" ilike '%well%' AND "target_tag" = 'form'))
    AND "app_id" = 3212870315 AND "login".user_id = "event".user_id
    AND "login".mintime < "event".time AND "event".app_id = 3212870315)
  GROUP BY event.user_id
),
"funnel" AS (
  SELECT *,
    CASE WHEN user_id IN (select user_id from "view_article") THEN 'View article'
         WHEN user_id IN (select user_id from "login") THEN 'Login'
         ELSE null END AS stage
  FROM "user"
  WHERE ("user".app_id = 3212870315)
)
SELECT count(*) AS count, stage
FROM "funnel"
WHERE ("funnel".app_id = 3212870315)
GROUP BY stage;
![Page 23: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/23.jpg)
New Model
events = [
  {
    "url" => "http://…",
    "query_string" => "…",
    "event_type" => "click",
    "event_tag" => "…"
  },
  { … }
]
![Page 24: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/24.jpg)
Add new custom function
contains_elements(events hstore[], search_sequence text[]);
![Page 25: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/25.jpg)
contains_elements(
  [
    { "event" => "home" },
    { "event" => "find" },
    { "event" => "view" },
    { "event" => "find" },
    { "event" => "find" },
    { "event" => "rate" },
    { "event" => "home" },
    { "event" => "add" },
    { "event" => "buy" },
    { "event" => "view" }
  ],
  [ 'event=>home', 'event=>find', 'event=>buy' ]
);
![Page 26: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/26.jpg)
![Page 27: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/27.jpg)
![Page 28: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/28.jpg)
![Page 29: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/29.jpg)
New User Defined Function
contains_elements(A hstore[], B text[])
Returns 1 if A has a subsequence A[i_1], ..., A[i_|B|], with i_1 < i_2 < ... < i_|B|, such that A[i_j] matches B[j] for all 1 <= j <= |B|; otherwise it returns 0.
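The matching rule can be sketched in Python with a single left-to-right pass. This is a minimal sketch under stated assumptions, not the actual user-defined function from the talk: events are modeled as dicts, each pattern is a single 'key=>value' string, and "matches" is taken to mean that the event holds that exact key/value pair.

```python
def parse_pattern(spec):
    """Split a 'key=>value' pattern into a (key, value) tuple."""
    key, _, value = spec.partition("=>")
    return key.strip(), value.strip()


def contains_elements(events, search_sequence):
    """Return 1 if the patterns match a subsequence of events, in order; else 0."""
    patterns = [parse_pattern(spec) for spec in search_sequence]
    matched = 0  # number of patterns matched so far
    for event in events:
        if matched == len(patterns):
            break
        key, value = patterns[matched]
        # Greedy matching: taking the earliest possible match never hurts,
        # because it leaves the most events available for later patterns.
        if event.get(key) == value:
            matched += 1
    return 1 if matched == len(patterns) else 0


# The session from the slides.
session = [{"event": name} for name in
           ["home", "find", "view", "find", "find",
            "rate", "home", "add", "buy", "view"]]

print(contains_elements(session, ["event=>home", "event=>find", "event=>buy"]))
print(contains_elements(session, ["event=>buy", "event=>home"]))
```

The first call finds home, then a later find, then a later buy. The second fails because no home event occurs after the buy. Each session is scanned once, so the cost is linear in the number of events.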
![Page 30: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/30.jpg)
![Page 31: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/31.jpg)
SELECT
  sum(contains_elements(events, ARRAY['type=>home'])),
  sum(contains_elements(events, ARRAY['type=>home', 'type=>find'])),
  sum(contains_elements(events, ARRAY['type=>click', 'type=>find', 'type=>buy']))
FROM event_sessions
WHERE app_id = 3212870315;
![Page 32: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/32.jpg)
So what?
[Figure: bar chart "Funnel query over 100M+ events"; time in seconds (axis 0 to 120) for Redshift and CitusDB (HSTORE); annotation: >10x]
![Page 33: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/33.jpg)
1. What is a data platform?
2. Why PostgreSQL?
3. Extensions, Extensions, Extensions
• HSTORE – semi-structured data in your DB
• HLL – distinct counts using mathematical magic
• CSTORE – a fast columnar store for PostgreSQL
4. How CitusDB lets you scale PostgreSQL
![Page 34: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/34.jpg)
24.84.219.32 [10/Apr/2003:04:20:07 -0700] "GET /archive/cat/games/index.shtml HTTP/1.1" "http://www.google.ca/search?q=games+speed&ie=UTF-8&start=40" "Mozilla/4.0 (MSIE 6.0; Windows 98; .NET CLR 1.0.3705)"
…
![Page 35: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/35.jpg)
Unique Viewer Counts Daily data Monthly data
![Page 36: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/36.jpg)
How to compute?
| Time | IP | URL |
|---|---|---|
| 1/2/14 12:03:04 | 87.32.182.4 | / |
| 1/2/14 12:03:04 | 243.32.93.2 | /prod… |
| 1/2/14 12:03:05 | 243.32.93.2 | /prod… |
| … | … | … |

SELECT count(distinct(ip)) FROM requests WHERE time >= '2014-01-01 12:00' AND time < '2014-01-02 12:00';
![Page 37: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/37.jpg)
How to compute?
| Time | IP | URL |
|---|---|---|
| 1/2/14 12:03:04 | 87.32.182.4 | / |
| 1/2/14 12:03:04 | 243.32.93.2 | /prod… |
| 1/2/14 12:03:05 | 243.32.93.2 | /prod… |
| … | … | … |

SELECT count(distinct(ip)) FROM requests WHERE time >= '2014-01-01 00:00' AND time < '2014-02-01 00:00';
![Page 38: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/38.jpg)
What happens when you scale?
[Figure: daily data partitions (01/01/14 through 01/06/14, 02/07/14 through 02/09/14) spread across Machine #1, Machine #2, … Machine N]
![Page 39: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/39.jpg)
[Figure: the same daily partitions on Machine #1, Machine #2, … Machine N, now with a master node]
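The difficulty with exact distinct counts in this layout is that per-partition results cannot simply be added: an IP seen on two days or two machines would be counted twice, so the master needs the distinct values themselves. A small Python sketch makes the point. The partition names and most IPs are hypothetical (only the first two addresses appear in the sample table).

```python
# Hypothetical sets of distinct client IPs, one per daily partition.
partitions = {
    "machine1/2014-01-01": {"87.32.182.4", "243.32.93.2"},
    "machine2/2014-01-02": {"243.32.93.2", "10.0.0.7"},
    "machineN/2014-01-03": {"87.32.182.4", "10.0.0.9"},
}


def sum_of_partition_counts(parts):
    """Naive approach: add up each partition's own distinct count."""
    return sum(len(ips) for ips in parts.values())


def exact_distinct(parts):
    """Correct approach: union the actual values, then count."""
    return len(set().union(*parts.values()))


print(sum_of_partition_counts(partitions))  # over-counts repeat visitors
print(exact_distinct(partitions))
```

With real traffic the sets to be shipped and merged grow with the number of distinct visitors, which is what makes the exact approach expensive at scale.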
![Page 40: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/40.jpg)
?
![Page 41: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/41.jpg)
Academia to the rescue!
• HyperLogLog (Flajolet et al.) paper from 2007
![Page 42: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/42.jpg)
HLL extension
• https://github.com/aggregateknowledge/postgresql-hll
• Enables approximate distinct counts
• Adds new type, aggregation functions, and estimation functions
![Page 43: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/43.jpg)
HLL extension
• hll_hash – create HLL given a value
• hll_union – combine two HLLs
• hll_cardinality – provide an estimate of the number of distinct values in an HLL
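To show why these three operations are enough for distributed distinct counts, here is a toy HyperLogLog in Python. It is a teaching sketch, not the extension's implementation or storage format: the register count (2^12), the SHA-1 based hash, and the function names (which only echo the list above) are all assumptions made for illustration.

```python
import hashlib
import math

P = 12      # index bits (arbitrary choice for this sketch)
M = 1 << P  # number of registers


def hash64(value):
    """Deterministic 64-bit hash of a string (first 8 bytes of SHA-1)."""
    digest = hashlib.sha1(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


def hll_empty():
    return [0] * M


def hll_add(registers, value):
    """Fold one value into the sketch, in place."""
    h = hash64(value)
    index = h >> (64 - P)                    # top P bits select a register
    rest = h & ((1 << (64 - P)) - 1)         # remaining 64 - P bits
    rank = (64 - P) - rest.bit_length() + 1  # position of the leftmost 1-bit
    registers[index] = max(registers[index], rank)


def hll_union(a, b):
    """Merging two sketches is a register-wise maximum."""
    return [max(x, y) for x, y in zip(a, b)]


def hll_cardinality(registers):
    """Estimate the number of distinct values folded into the sketch."""
    alpha = 0.7213 / (1 + 1.079 / M)
    raw = alpha * M * M / sum(2.0 ** -r for r in registers)
    zeros = registers.count(0)
    if raw <= 2.5 * M and zeros:
        return M * math.log(M / zeros)       # small-range correction
    return raw


def sketch(values):
    registers = hll_empty()
    for value in values:
        hll_add(registers, value)
    return registers


# Two hypothetical days of traffic sharing 5,000 visitors (25,000 distinct).
day1 = sketch(f"ip-{i}" for i in range(15000))
day2 = sketch(f"ip-{i}" for i in range(10000, 25000))
both_days = hll_union(day1, day2)
print(round(hll_cardinality(both_days)))
```

The union of two sketches is identical to the sketch of the combined data, so each machine can ship a fixed-size array instead of its full set of distinct values, and daily sketches can be merged into weekly or monthly estimates. With 4,096 registers the expected relative error is roughly 1.04 / sqrt(4096), or about 1.6%.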
![Page 44: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/44.jpg)
[Figure: the partition diagram again (Machine #1, Machine #2, … Machine N and the master), with HLL labels on the daily partitions and on the master]
![Page 45: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/45.jpg)
1. What is a data platform?
2. Why PostgreSQL?
3. Extensions, Extensions, Extensions
• HSTORE – semi-structured data in your DB
• HLL – distinct counts using mathematical magic
• CSTORE – a fast columnar store for PostgreSQL
4. How CitusDB lets you scale PostgreSQL
![Page 46: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/46.jpg)
[Figure: a wide table with columns Id, Sz, Ln, Ht, …; 30M rows, 700 columns]
![Page 47: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/47.jpg)
SELECT id, AVG(price), MAX(price)
FROM items
WHERE quantity > 100 AND last_stock_date < '2013-10-01'
GROUP BY weight
![Page 48: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/48.jpg)
Row-oriented store
| Id | … | price | … | quan… | … | last_st… | … | weight |
|---|---|---|---|---|---|---|---|---|
| 1 | … | 3.90 | … | 31 | … | 2013-… | … | 0.6 |
| 2 | … | 13 | … | 70 | … | 2010-… | … | 0.8 |
| 3 | … | 4.25 | … | 432 | … | 2013-… | … | 1 |
| 4 | … | 4 | … | 45 | … | 2013-… | … | 6 |
| … | | | | | | | | |
| 4… | … | 95 | … | 37 | … | 2013-… | … | 0.6 |
| 4… | … | 59 | … | 90 | … | 2012-… | … | 1.5 |
![Page 49: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/49.jpg)
![Page 50: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/50.jpg)
![Page 51: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/51.jpg)
![Page 52: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/52.jpg)
Cost of row storage
• Read 700 columns instead of 5
• >39 GB of unnecessary I/O

| Input Type | Estimated Input Rate | Cost to query performance |
|---|---|---|
| Memory | 10 GB/s | 3.9 seconds |
| SSD | 600 MB/s | >60 seconds |
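The timings in the table follow from dividing the unnecessary I/O by the input rate. A short Python check of that arithmetic, using the slide's 39 GB figure and assuming 1 GB = 1000 MB when converting the SSD rate:

```python
def scan_seconds(size_gb, rate_gb_per_s):
    """Time to read size_gb at a sustained rate of rate_gb_per_s."""
    return size_gb / rate_gb_per_s


UNNECESSARY_IO_GB = 39.0  # from the slide (stated there as ">39 GB")

# 600 MB/s is taken as 0.6 GB/s (assumption: 1 GB = 1000 MB).
RATES_GB_PER_S = {"Memory": 10.0, "SSD": 0.6}

costs = {name: scan_seconds(UNNECESSARY_IO_GB, rate)
         for name, rate in RATES_GB_PER_S.items()}

# The query touches 5 of the table's 700 columns.
columns_needed_fraction = 5 / 700

for name, seconds in costs.items():
    print(f"{name}: {seconds:.1f} s")
print(f"columns actually needed: {columns_needed_fraction:.2%}")
```

Since the slide gives the I/O as a lower bound, both times are lower bounds as well; the SSD case works out to about 65 seconds, consistent with the ">60 seconds" entry.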
![Page 53: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/53.jpg)
SELECT id, AVG(price), MAX(price)
FROM items
WHERE quantity > 100 AND last_stock_date < '2013-10-01'
GROUP BY weight
![Page 54: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/54.jpg)
Column-oriented store
| Id | sz | price | … | quan… | … | last_st… | … | weight |
|---|---|---|---|---|---|---|---|---|
| 1 | 4 | 3.90 | … | 31 | … | 2013-… | … | 0.6 |
| 2 | 3 | 13 | … | 70 | … | 2010-… | … | 0.8 |
| 3 | 2 | 4.25 | … | 432 | … | 2013-… | … | 1 |
| 4 | 4 | 4 | … | 45 | … | 2013-… | … | 6 |
| … | | | | | | | | |
| 4… | 19 | 95 | … | 37 | … | 2013-… | … | 0.6 |
| 4… | 2 | 59 | … | 90 | … | 2012-… | … | 1.5 |
![Page 55: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/55.jpg)
![Page 56: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/56.jpg)
![Page 57: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/57.jpg)
Benefits
• Less I/O if reading subset of columns
• Better compression:
  – Less disk usage
  – Less I/O
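Both benefits can be illustrated with a toy layout in Python. The rows below are hypothetical (loosely modeled on the sample table, with weights chosen to repeat), and run-length encoding stands in for "compression" purely as an example; it is not a claim about which codec a particular columnar store uses.

```python
from itertools import groupby

COLUMNS = ("id", "price", "quantity", "weight")

# Hypothetical rows, laid out row by row as a row-oriented store keeps them.
rows = [
    (1, 3.90, 31, 0.6),
    (2, 13.0, 70, 0.6),
    (3, 4.25, 432, 0.6),
    (4, 4.0, 45, 0.8),
    (5, 95.0, 37, 0.8),
    (6, 59.0, 90, 1.5),
]


def to_columns(row_tuples):
    """Transpose row tuples into one list per column (the columnar layout)."""
    return {name: list(values)
            for name, values in zip(COLUMNS, zip(*row_tuples))}


def run_length_encode(values):
    """Collapse consecutive repeats into (value, run_length) pairs."""
    return [(value, len(list(group))) for value, group in groupby(values)]


def values_read(n_rows, n_columns_touched):
    """Number of stored values a full scan has to read."""
    return n_rows * n_columns_touched


columns = to_columns(rows)
print(run_length_encode(columns["weight"]))

# Row store scans all 700 columns; column store scans only the 5 needed.
ratio = values_read(30_000_000, 700) // values_read(30_000_000, 5)
print(ratio)
```

Values within one column share a type and often repeat, which is why a column stored contiguously tends to compress better than the same values interleaved with unrelated fields.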
![Page 58: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/58.jpg)
But how in PostgreSQL?
• Foreign Data Wrappers (FDW) allow you to easily connect to any external data source
• Exist already for Mongo, Redis, CouchDB, JSON, Oracle, MySQL, others
![Page 59: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/59.jpg)
Announcing CSTORE
• Open source columnar store built by Citus Data
• Releases March 14th for PostgreSQL and CitusDB users
![Page 60: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/60.jpg)
CSTORE features
• ORC inspired data layout benefits from years of learning with RCFile format
• Integrated with PostgreSQL FDW APIs:
  – Statistics collection for optimal query planning
  – Support for all PostgreSQL types and user defined types
![Page 61: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/61.jpg)
CSTORE benchmarks
• TPC-H is the common benchmark
• Performed benchmarks with 4 GB of data on m1.xlarge instance
• Compared vanilla PostgreSQL, CSTORE, CSTORE with compression
![Page 62: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/62.jpg)
TPC-H Query 3: Mean Query Times
[Bar chart: query time in seconds (axis 0 to 45) for PostgreSQL, CSTORE, and CSTORE (compressed); annotated "2x"]
4GB of data on PostgreSQL 9.3 on m1.xlarge
![Page 63: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/63.jpg)
Other TPC-H Queries: Query Time
[Bar chart: query time in seconds (axis 0 to 45) for TPC-H #3, #5, #6, and #10; series: PostgreSQL, CSTORE, CSTORE (compressed)]
4GB of data on PostgreSQL 9.3 on m1.xlarge
![Page 64: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/64.jpg)
TPC-H Query 3: Disk I/O
[Bar chart: data transferred in GB (axis 0 to 5) for PostgreSQL, CSTORE, and CSTORE (compressed); annotated ">10x"]
4GB of data on PostgreSQL 9.3 on m1.xlarge
![Page 65: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/65.jpg)
Other TPC-H Queries: Disk I/O
[Bar chart: I/O in MB (axis 0 to 5000) for TPC-H #3, #5, #6, and #10; series: PostgreSQL, CSTORE, CSTORE (compressed)]
4GB of data on PostgreSQL 9.3 on m1.xlarge
![Page 66: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/66.jpg)
1. What is a data platform?
2. Why PostgreSQL?
3. Extensions, Extensions, Extensions
• HSTORE – semi-structured data in your DB
• HLL – distinct counts using mathematical magic
• CSTORE – a fast columnar store for PostgreSQL
4. How CitusDB lets you scale PostgreSQL
![Page 67: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/67.jpg)
Data Platform • Store and query ALL your data • Scalable • Cost effective • Extensible
![Page 68: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/68.jpg)
Scaling Performance
• Minimize network I/O through advanced query parser integration
• Push compute to nodes
![Page 69: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/69.jpg)
SELECT avg(price), max(price)
FROM items
WHERE quantity > 10
![Page 70: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/70.jpg)
[Diagram: Master and worker machines (Machine #1, Machine #2, … Machine N), labeled "Row Data"]
![Page 71: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/71.jpg)
SELECT avg(price), max(price)
FROM items
WHERE quantity > 10
![Page 72: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/72.jpg)
SELECT sum(price), count(*), max(price)
FROM items
WHERE quantity > 10
![Page 73: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/73.jpg)
[Diagram: Machine #1, Machine #2, … Machine N each return sum, count, and max to the Master]
![Page 74: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/74.jpg)
[Diagram: the Master combines the per-machine sum, count, and max values]
Σ sum_i, Σ count_i, max({max_1 ... max_N})
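The rewrite above can be tried end to end with a small simulation. This is only a sketch under stated assumptions: in-memory SQLite databases stand in for the worker machines (not PostgreSQL or CitusDB), the shard contents are made up, and `run_on_worker` and `merge_on_master` are hypothetical helper names. Each worker runs the rewritten `sum`/`count`/`max` query, and the master derives `avg(price)` as Σ sum_i / Σ count_i.

```python
import sqlite3

# Made-up shards of the "items" table, one list of (price, quantity) per worker.
SHARDS = [
    [(5.0, 20), (7.0, 3), (9.0, 15)],
    [(2.0, 50), (4.0, 11)],
    [(10.0, 12), (1.0, 1)],
]

# The rewritten per-worker query from the slide.
WORKER_QUERY = """
    SELECT sum(price), count(*), max(price)
    FROM items
    WHERE quantity > 10
"""


def run_on_worker(rows):
    """Run the rewritten query on one shard (SQLite stands in for a worker)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE items (price REAL, quantity INTEGER)")
    conn.executemany("INSERT INTO items VALUES (?, ?)", rows)
    result = conn.execute(WORKER_QUERY).fetchone()
    conn.close()
    return result  # (sum, count, max)


def merge_on_master(partials):
    """Combine (sum, count, max) triples into avg(price), max(price)."""
    # Shards with no matching rows return NULL for sum and max; skip them.
    partials = [p for p in partials if p[1] > 0]
    if not partials:
        return None, None
    total_sum = sum(p[0] for p in partials)
    total_count = sum(p[1] for p in partials)
    overall_max = max(p[2] for p in partials)
    return total_sum / total_count, overall_max


partials = [run_on_worker(rows) for rows in SHARDS]
avg_price, max_price = merge_on_master(partials)
print(partials)
print(avg_price, max_price)
```

Only three numbers per machine cross the network, regardless of how many rows each shard holds. Averaging the per-machine averages instead would give the wrong answer whenever shards match different numbers of rows, which is why `avg` is split into `sum` and `count`.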
![Page 75: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/75.jpg)
Scaling Fault Tolerance
• Advanced query parser integration allows for partial query retries
• Block storage allows for improved node failure handling
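One way to picture a partial query retry is sketched below. It is an illustration, not CitusDB's implementation: it assumes each block is stored on two machines, and the block names, machine names, placement map, and `run_fragment` helper are all hypothetical. When a machine is down, only the fragment for the affected block is retried on another replica; fragments that already succeeded are not rerun.

```python
# Assumed placement: each block of data is stored on two machines.
BLOCK_PLACEMENT = {
    "block_1": ["machine_1", "machine_2"],
    "block_2": ["machine_2", "machine_3"],
    "block_3": ["machine_3", "machine_1"],
}

# Made-up per-block partial results: (sum, count, max).
BLOCK_RESULTS = {
    "block_1": (14.0, 2, 9.0),
    "block_2": (6.0, 2, 4.0),
    "block_3": (10.0, 1, 10.0),
}

FAILED_MACHINES = {"machine_2"}  # simulate one node being down


class NodeDown(Exception):
    """Raised when a fragment is sent to a machine that is unavailable."""


def run_fragment(machine, block):
    """Pretend to run the per-block query fragment on one machine."""
    if machine in FAILED_MACHINES:
        raise NodeDown(machine)
    return BLOCK_RESULTS[block]


def run_query():
    partials = []
    attempts = []  # (block, machine) pairs tried, kept for inspection
    for block, replicas in BLOCK_PLACEMENT.items():
        for machine in replicas:
            attempts.append((block, machine))
            try:
                partials.append(run_fragment(machine, block))
                break  # this block is done; other blocks are unaffected
            except NodeDown:
                continue  # retry only this fragment on the next replica
        else:
            raise RuntimeError(f"no live replica for {block}")
    return partials, attempts


partials, attempts = run_query()
print(attempts)
```

In this run only `block_2` needs a second attempt, and the query still returns one partial result per block.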
![Page 76: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/76.jpg)
[Diagram: fixed size blocks of data placed on Machine 1, Machine 6, …]
![Page 77: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/77.jpg)
SELECT avg(price), max(price)
FROM items
WHERE quantity > 10
![Page 78: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/78.jpg)
![Page 79: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/79.jpg)
![Page 80: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/80.jpg)
![Page 81: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/81.jpg)
Citus 3.0
• Support for large table joins
• Includes PostgreSQL 9.3 features (improved JSON support, etc.)
• Available Feb 23rd
![Page 82: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/82.jpg)
Summary
• PostgreSQL makes a great extensible single node solution
• For it to be a data platform it needs to scale
• Citus makes PostgreSQL scale
![Page 83: Scalable PostgreSQL as your data platform - Huihoodocs.huihoo.com/.../big-data...PostgreSQL-as-your-data-platform.pdf · 1. What is a data platform? 2. Why PostgreSQL? 3. Extensions,](https://reader034.fdocuments.us/reader034/viewer/2022051720/5a751f3f7f8b9a4b538c3aa4/html5/thumbnails/83.jpg)
Acknowledgements
• PostgreSQL http://www.postgresql.org/
• Heap https://heapanalytics.com/
• HLL http://www.aggregateknowledge.com/
• Icons http://www.iconfinder.com/tmthymllr