Big datalab

BIGDATA LAB BigQuery & Query Visualization


Transcript of Big datalab

Page 1: Big datalab

BIGDATA LAB

BigQuery & Query Visualization

Page 2: Big datalab

Outline

• BigQuery

• BigQuery Visualization

• BigData Lab Open Source!

Page 3: Big datalab

About me

David Chen

Tagtoo CTO

PyCon APAC 2014 PR

Taipei.py Co-organizer

GDE

Speaker: PyCon APAC 2014

PyCon 2013 Google Festival

Google Launch Event

Page 4: Big datalab

BigQuery

Page 5: Big datalab

Real Time

BigQuery: Big Data Analytics in the cloud

BigData SQL

Page 6: Big datalab

SQL

Page 7: Big datalab

Basic Characteristics

• Data types: STRING, INTEGER, FLOAT, BOOLEAN, TIMESTAMP, RECORD

• Schema: supports repeated / nested fields (JSON)

• Import / (parallel) Export with CSV / JSON

• Streaming (real time insert)

• 100,000 rows/s
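
A minimal sketch of how rows might be grouped for streaming inserts. It only builds request bodies in the shape used by the BigQuery `tabledata.insertAll` REST method (`rows`, each with an `insertId` and a `json` payload) and makes no network call. The helper name `build_insert_batches`, the batch size of 500, and the sample rows are assumptions for illustration.

```python
import json
import uuid


def build_insert_batches(rows, batch_size=500):
    """Split rows into insertAll-style request bodies (hypothetical helper)."""
    if batch_size < 1:
        raise ValueError("batch_size must be positive")
    batches = []
    for start in range(0, len(rows), batch_size):
        chunk = rows[start:start + batch_size]
        batches.append({
            "kind": "bigquery#tableDataInsertAllRequest",
            "rows": [
                # A unique insertId lets the service de-duplicate retried rows.
                {"insertId": str(uuid.uuid4()), "json": row}
                for row in chunk
            ],
        })
    return batches


# Sample data, made up for this sketch.
events = [{"user": "u%d" % i, "clicks": i} for i in range(1200)]
batches = build_insert_batches(events)
print(len(batches), [len(b["rows"]) for b in batches])
print(json.dumps(batches[0]["rows"][0]["json"]))
```

Each batch would then be sent as the body of one streaming request; smaller batches lower latency, larger ones reduce the number of requests.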

Page 8: Big datalab

Big Join

Page 9: Big datalab

Nested / Repeated
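
To make the nested / repeated idea concrete, here is a small sketch of a schema in BigQuery's JSON schema format, where a `RECORD` field with mode `REPEATED` holds a list of sub-records, together with one matching row serialized as newline-delimited JSON for import. The table layout, the field names, and the helper `leaf_paths` are hypothetical.

```python
import json

# Schema in BigQuery's JSON schema format; the layout itself is made up.
schema = [
    {"name": "name", "type": "STRING", "mode": "REQUIRED"},
    {"name": "visits", "type": "RECORD", "mode": "REPEATED", "fields": [
        {"name": "page", "type": "STRING", "mode": "NULLABLE"},
        {"name": "ts", "type": "TIMESTAMP", "mode": "NULLABLE"},
    ]},
]


def leaf_paths(fields, prefix=""):
    """Return dotted paths of all non-RECORD fields, e.g. 'visits.page'."""
    paths = []
    for field in fields:
        path = prefix + field["name"]
        if field["type"] == "RECORD":
            # Recurse into the nested record's own field list.
            paths.extend(leaf_paths(field["fields"], path + "."))
        else:
            paths.append(path)
    return paths


# One row: the repeated record becomes a JSON array of objects.
row = {
    "name": "alice",
    "visits": [
        {"page": "/home", "ts": "2014-05-01 10:00:00"},
        {"page": "/cart", "ts": "2014-05-01 10:05:00"},
    ],
}
ndjson_line = json.dumps(row, sort_keys=True)  # one JSON object per line

print(leaf_paths(schema))
print(ndjson_line)
```

The dotted paths are how nested leaf fields are referenced in queries, for example `visits.page`.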

Page 10: Big datalab

Table wildcard / decorators
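
As a rough illustration of both features, the sketch below builds query fragments in BigQuery's legacy SQL syntax: `TABLE_DATE_RANGE` selects all daily tables that share a prefix, and a table decorator (`@-<ms>` for a snapshot, `@-<ms>-` for a range up to now) limits a query to a point or window in time. The helper names and the dataset and table names (`logs.events_`, `logs.clicks`) are assumptions for illustration.

```python
from datetime import date


def date_range_source(prefix, start, end):
    """Wildcard over daily tables named <prefix>YYYYMMDD (legacy SQL)."""
    return "TABLE_DATE_RANGE([%s], TIMESTAMP('%s'), TIMESTAMP('%s'))" % (
        prefix, start.isoformat(), end.isoformat())


def snapshot_decorator(table, ms_ago):
    """Table as it looked ms_ago milliseconds before now."""
    return "[%s@-%d]" % (table, ms_ago)


def range_decorator(table, ms_ago):
    """Only data added during the last ms_ago milliseconds."""
    return "[%s@-%d-]" % (table, ms_ago)


ONE_HOUR_MS = 60 * 60 * 1000

# Hypothetical dataset and table names.
wildcard_query = "SELECT COUNT(*) FROM " + date_range_source(
    "logs.events_", date(2014, 3, 25), date(2014, 3, 27))
recent_query = "SELECT COUNT(*) FROM " + range_decorator("logs.clicks", ONE_HOUR_MS)

print(wildcard_query)
print(recent_query)
print(snapshot_decorator("logs.clicks", ONE_HOUR_MS))
```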

Page 11: Big datalab

User-defined functions

Page 12: Big datalab

BigQuery Visualization

Page 13: Big datalab

BigQuery Taiwan

http://littleq0903.github.io/bq-taiwan/

Page 14: Big datalab

With Google Charts

https://gcdc2013-coder.appspot.com/app#

Page 15: Big datalab

http://googlegeodevelopers.blogspot.tw/2013/09/visualizing-airport-delay-correlations.html

BigQuery + Map

Page 16: Big datalab

http://nbviewer.ipython.org/gist/fhoffa/6459195

BigQuery + IPython Notebook

Page 17: Big datalab

Even More

• BigQuery with R: http://thinktostart.wordpress.com/2014/03/10/using-google-bigquery-with-r/

• BigQuery with Pandas: http://pandas.pydata.org/pandas-docs/stable/io.html#google-bigquery-experimental

• BigQuery with Hadoop: http://googlecloudplatform.blogspot.tw/2014/04/google-bigquery-and-datastore-connectors-for-hadoop.html

• Excel Connector: https://developers.google.com/bigquery/bigquery-connector-for-excel

Page 18: Big datalab

Real Time

Page 19: Big datalab

BigQuery + Hadoop

Page 20: Big datalab

https://www.youtube.com/watch?v=yKBHEznag-g#t=231

Live Dashboard

Page 21: Big datalab

Big Data Lab

Open Source

Page 22: Big datalab

Google Developer Challenge 2013

Page 23: Big datalab

• AppEngine: manipulate data with MapReduce

• Cloud Storage: low-priced storage with high consistency

• Predict API*: machine learning in the cloud

• BigQuery: ad hoc queries to Google Sheets & visualization

• No deployment / configuration needed

• Easy to use (even for kids) but still powerful

• Open source

Page 24: Big datalab

Big Data Pipeline

Page 25: Big datalab

[Diagram: Big Data Pipeline architecture. Labels as they appear:]

• AppEngine (shown twice): Task Client, Pipeline Worker, Virtual Env, MapReduce

• GCE: Task Client, Hadoop

• GCE Task Controller: Cron Tab, Task Graph, Controller UI, Virtual Env

• Currently uses Luigi
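
The "Task Graph" idea can be sketched without any framework. The example below is not Luigi and not the TaskWorker code; it is a standard-library illustration (`graphlib.TopologicalSorter`, Python 3.9+) of running tasks only after their dependencies have finished. All task names and the `run_pipeline` helper are hypothetical.

```python
from graphlib import TopologicalSorter

# Each task maps to the set of tasks it depends on (names are made up).
task_graph = {
    "export_logs": set(),
    "map_reduce_clean": {"export_logs"},
    "load_to_bigquery": {"map_reduce_clean"},
    "train_model": {"map_reduce_clean"},
    "dashboard_query": {"load_to_bigquery"},
}


def run_pipeline(graph, run_task):
    """Run every task after its dependencies; return the execution order."""
    order = []
    # static_order() yields a valid dependency order and raises
    # graphlib.CycleError if the graph contains a cycle.
    for name in TopologicalSorter(graph).static_order():
        run_task(name)
        order.append(name)
    return order


# A no-op stands in for dispatching work to a worker.
executed = run_pipeline(task_graph, lambda name: None)
print(executed)
```

A controller along these lines could be triggered by cron and hand each task to an AppEngine or GCE worker; a scheduler such as Luigi adds retries, targets, and a UI on top of the same dependency ordering.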

Page 26: Big datalab

• Task Worker: https://github.com/Tagtoo/TaskWorker

• Predefined Pipeline: https://github.com/Tagtoo/TaskWorker

• Virtual Package: https://github.com/Tagtoo/BigDataLabWorker


Page 27: Big datalab

Reference

• https://cloud.google.com/events/google-cloud-platform-live/