CloudML talk at DevFest Madurai 2016

Transcript of CloudML talk at DevFest Madurai 2016

Page 1: CloudML talk at DevFest Madurai 2016

● BigData
● BigQuery
● CloudML
● Cloud API

Page 2: CloudML talk at DevFest Madurai 2016

Karthik Padmanabhan, Developer Relations

@karthik_padman

Page 3: CloudML talk at DevFest Madurai 2016

Big data and machine learning at Google

BigQuery: anything you can ask in SQL

Cloud Dataflow: parallel processing, batch and stream

Cloud ML: machine learning, neural networks

Page 5: CloudML talk at DevFest Madurai 2016

Big data and machine learning at Google

BigQuery, Cloud Dataflow, Cloud ML

Open source: Apache Beam, TensorFlow

Page 6: CloudML talk at DevFest Madurai 2016

Big data and machine learning at Google

BigQuery, Cloud Dataflow, Cloud ML

Open source: Apache Beam, TensorFlow

Pre-trained models: Vision API, Speech API, Translate API

Page 7: CloudML talk at DevFest Madurai 2016

Cloud Dataflow demo

Page 8: CloudML talk at DevFest Madurai 2016

Vision API demo

Page 9: CloudML talk at DevFest Madurai 2016

TensorFlow demo

Page 10: CloudML talk at DevFest Madurai 2016

Photo credits: Matt Chan; isaiah115 on Flickr

Page 11: CloudML talk at DevFest Madurai 2016

Photo credit: Matt Chan

Page 12: CloudML talk at DevFest Madurai 2016

Google Research Publications

Page 14: CloudML talk at DevFest Madurai 2016

Open Source Implementations

Bigtable

Flume

Dremel

Page 15: CloudML talk at DevFest Madurai 2016

Managed Cloud Versions

Bigtable → Bigtable

Flume → Dataflow

Dremel → BigQuery

Page 16: CloudML talk at DevFest Madurai 2016

BigQuery demo

Page 17: CloudML talk at DevFest Madurai 2016

Google BigQuery

Page 18: CloudML talk at DevFest Madurai 2016

02 Count some stuff

Page 19: CloudML talk at DevFest Madurai 2016

SELECT count(word)
FROM publicdata:samples.shakespeare

Words in Shakespeare

Page 20: CloudML talk at DevFest Madurai 2016

SELECT sum(requests) as total
FROM [fh-bigquery:wikipedia.pagecounts_20150212_01]

Wikipedia hits over 1 hour

Page 21: CloudML talk at DevFest Madurai 2016

SELECT sum(requests) as total
FROM [fh-bigquery:wikipedia.pagecounts_201505]

Wikipedia hits over 1 month

Page 22: CloudML talk at DevFest Madurai 2016

Several years of Wikipedia data

SELECT sum(requests) as total
FROM [fh-bigquery:wikipedia.pagecounts_201105],
     [fh-bigquery:wikipedia.pagecounts_201106],
     [fh-bigquery:wikipedia.pagecounts_201107],

...

Page 23: CloudML talk at DevFest Madurai 2016

SELECT SUM(requests) AS total
FROM TABLE_QUERY(
  [fh-bigquery:wikipedia],
  'REGEXP_MATCH(table_id, r"pagecounts_2015[0-9]{2}$")')

Several years of Wikipedia data

Page 24: CloudML talk at DevFest Madurai 2016

How about a RegExp

SELECT SUM(requests) AS total
FROM TABLE_QUERY(
  [fh-bigquery:wikipedia],
  'REGEXP_MATCH(table_id, r"pagecounts_2015[0-9]{2}$")')
WHERE (REGEXP_MATCH(title, '.*[dD]inosaur.*'))
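The query on this slide uses two regular expressions: one picks which tables to read, the other picks which rows to count. The following Python sketch mimics that logic on a tiny in-memory dataset. The table contents and request counts are made up for illustration, and `re.search` is used on the assumption that REGEXP_MATCH matches anywhere in the string; it is not a description of how BigQuery executes the query.

```python
import re

# Hypothetical toy data: table_id -> list of (title, requests) rows.
TABLES = {
    "pagecounts_201412": [("Dinosaur", 7)],
    "pagecounts_201501": [("Dinosaur", 10), ("Cat", 3)],
    "pagecounts_201502": [("List of dinosaurs", 5), ("Dog", 2)],
    "pagecounts_20150212_01": [("Dinosaur", 1)],
}

# The two patterns from the slide.
TABLE_PATTERN = re.compile(r"pagecounts_2015[0-9]{2}$")
TITLE_PATTERN = re.compile(r".*[dD]inosaur.*")


def total_requests(tables):
    """Sum requests over matching tables, keeping only matching titles."""
    total = 0
    for table_id, rows in tables.items():
        # Table selection: the role TABLE_QUERY plays in the slide's query.
        if not TABLE_PATTERN.search(table_id):
            continue
        for title, requests in rows:
            # Row filter: the role of the WHERE clause.
            if TITLE_PATTERN.search(title):
                total += requests
    return total


print(total_requests(TABLES))
```

Only the two monthly 2015 tables pass the table pattern; the hourly table `pagecounts_20150212_01` is rejected because of the `$` anchor.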

Page 25: CloudML talk at DevFest Madurai 2016

03 How did it do that? o_O

Page 26: CloudML talk at DevFest Madurai 2016

Qualities of a good RDBMS

Page 27: CloudML talk at DevFest Madurai 2016

Qualities of a good RDBMS

● Inserts & locking
● Indexing
● Cache
● Query planning

Page 32: CloudML talk at DevFest Madurai 2016

Storing data

[Diagram: Table → Columns → Disks]

Page 33: CloudML talk at DevFest Madurai 2016

Reading data: Life of a BigQuery

SELECT sum(requests) as sum
FROM (
  SELECT requests, title
  FROM [fh-bigquery:wikipedia.pagecounts_201501]
  WHERE (REGEXP_MATCH(title, '[Jj]en.+'))
)

Page 34: CloudML talk at DevFest Madurai 2016

Life of a BigQuery

[Diagram: query tree with levels Mixer, Leaf, Storage]

Page 35: CloudML talk at DevFest Madurai 2016

Life of a BigQuery

[Diagram: query tree with levels Root Mixer, Mixer, Leaf, Storage]

Page 36: CloudML talk at DevFest Madurai 2016

Life of a BigQuery

[Diagram: query tree with levels Root Mixer, Mixer, Leaf, Storage]

Annotations on the diagram:
Query

Page 37: CloudML talk at DevFest Madurai 2016

Life of a BigQuery

[Diagram: query tree with levels Root Mixer, Mixer, Leaf, Storage]

Annotations on the diagram:
SELECT requests, title

Page 38: CloudML talk at DevFest Madurai 2016

Life of a BigQuery

[Diagram: query tree with levels Root Mixer, Mixer, Leaf, Storage]

Annotations on the diagram:
5.4 Bil
SELECT requests, title
WHERE (REGEXP_MATCH(title, '[Jj]en.+'))

Page 39: CloudML talk at DevFest Madurai 2016

Life of a BigQuery

[Diagram: query tree with levels Root Mixer, Mixer, Leaf, Storage]

Annotations on the diagram:
5.4 Bil
SELECT requests, title
5.8 Mil
WHERE (REGEXP_MATCH(title, '[Jj]en.+'))
SELECT sum(requests)

Page 40: CloudML talk at DevFest Madurai 2016

Life of a BigQuery

[Diagram: query tree with levels Root Mixer, Mixer, Leaf, Storage]

Annotations on the diagram:
5.4 Bil
SELECT requests, title
5.8 Mil
WHERE (REGEXP_MATCH(title, '[Jj]en.+'))
SELECT sum(requests)
SELECT sum(requests)
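One way to read the tree on these slides is as a hierarchical sum: leaves scan their share of the rows and filter them, and each mixer level adds up the partial results from below until the root holds the answer. The Python sketch below illustrates that reading only. Assigning the filter to the leaves and the sums to the mixers is an assumption drawn from the order of the slide annotations, and the shard data, the `fan_in` parameter, and all function names are hypothetical.

```python
import re

# Title filter from the slide's WHERE clause.
TITLE_PATTERN = re.compile(r"[Jj]en.+")

# Hypothetical toy shards: each leaf scans one list of (title, requests) rows.
SHARDS = [
    [("Jennifer", 4), ("Cat", 9)],
    [("jenga", 1)],
    [("Jen", 100)],  # no match: the pattern needs a character after "en"
    [("Big Jens", 2)],
]


def leaf(shard):
    """Scan one shard, keep matching titles, and return a partial sum."""
    return sum(req for title, req in shard if TITLE_PATTERN.search(title))


def mixer(partial_sums):
    """Combine the partial sums received from the level below."""
    return sum(partial_sums)


def run_query(shards, fan_in=2):
    """Leaves produce partial sums; mixers merge them level by level."""
    if fan_in < 2:
        raise ValueError("fan_in must be at least 2")
    level = [leaf(shard) for shard in shards]
    while len(level) > 1:
        # Each mixer merges up to fan_in results; the last pass is the root.
        level = [mixer(level[i:i + fan_in]) for i in range(0, len(level), fan_in)]
    return level[0] if level else 0


print(run_query(SHARDS))
```

Because addition is associative, the grouping of leaves under mixers does not change the result, which is what lets the work be spread over many machines.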

Page 41: CloudML talk at DevFest Madurai 2016

04 Something Useful: use Wikipedia data to pick a movie

Page 42: CloudML talk at DevFest Madurai 2016

1. Wikipedia edits
2. ???
3. Movie recommendation

Page 43: CloudML talk at DevFest Madurai 2016

Follow the edits

Page 44: CloudML talk at DevFest Madurai 2016

Follow the edits

Same editor

Page 45: CloudML talk at DevFest Madurai 2016

select title, id, count(id) as edits
from [publicdata:samples.wikipedia]
where title contains 'Hackers'
  and title contains '(film)'
  and wp_namespace = 0
group by title, id
order by edits
limit 10

Pick a great movie

Page 46: CloudML talk at DevFest Madurai 2016

select title, id, count(id) as edits
from [publicdata:samples.wikipedia]
where contributor_id in (
    select contributor_id
    from [publicdata:samples.wikipedia]
    where id=264176
      and contributor_id is not null
      and is_bot is null
      and wp_namespace = 0
      and title CONTAINS '(film)'
    group by contributor_id)
  and wp_namespace = 0
  and id != 264176
  and title CONTAINS '(film)'
group each by title, id
order by edits desc
limit 100

Find edits in common

Page 47: CloudML talk at DevFest Madurai 2016

Discover the most broadly popular films

select id from (
  select id, count(id) as edits
  from [publicdata:samples.wikipedia]
  where wp_namespace = 0
    and title CONTAINS '(film)'
  group each by id
  order by edits desc
  limit 20)

Page 48: CloudML talk at DevFest Madurai 2016

Edits in common, minus broadly popular

select title, id, count(id) as edits
from [publicdata:samples.wikipedia]
where contributor_id in (
    select contributor_id
    from [publicdata:samples.wikipedia]
    where id=264176
      and contributor_id is not null
      and is_bot is null
      and wp_namespace = 0
      and title CONTAINS '(film)'
    group by contributor_id)
  and wp_namespace = 0
  and id != 264176
  and title CONTAINS '(film)'
  and id not in (
    select id from (
      select id, count(id) as edits
      from [publicdata:samples.wikipedia]
      where wp_namespace = 0
        and title CONTAINS '(film)'
      group each by id
      order by edits desc
      limit 20 ) )
group each by title, id
order by edits desc
limit 100
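The recommendation query above has three parts: find the people who edited the seed film, find the films that everybody edits, and count what the first group edited once the second group is excluded. Here is a minimal Python sketch of the same idea on a made-up edit log. The contributor names, film names, and the `top_popular` parameter are hypothetical, and the bot and namespace filters from the query are left out for brevity.

```python
from collections import Counter

# Hypothetical toy edit log: (contributor, film) pairs, one per edit.
EDITS = [
    ("alice", "Hackers"), ("bob", "Hackers"),
    ("alice", "Sneakers"), ("bob", "Sneakers"),
    ("alice", "WarGames"),
    ("alice", "Titanic"), ("bob", "Titanic"),
    ("carol", "Titanic"), ("dave", "Titanic"),
]


def recommend(edits, seed_film, top_popular=1, limit=100):
    """Rank films co-edited by the seed film's editors, minus broad hits."""
    # Step 1: contributors who edited the seed film (the inner subquery).
    fans = {who for who, film in edits if film == seed_film}

    # Step 2: the most-edited films overall (the "broadly popular" subquery;
    # the slide uses the top 20, the toy data uses the top 1).
    popularity = Counter(film for _, film in edits)
    broadly_popular = {film for film, _ in popularity.most_common(top_popular)}

    # Step 3: count the fans' edits to other films, skipping popular ones.
    in_common = Counter(
        film
        for who, film in edits
        if who in fans and film != seed_film and film not in broadly_popular
    )
    return in_common.most_common(limit)


print(recommend(EDITS, "Hackers"))
```

Without step 2, "Titanic" would tie for first place in this toy data simply because everyone edits it, which is the problem the "minus broadly popular" version of the query addresses.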

Page 49: CloudML talk at DevFest Madurai 2016

Interesting challenges await

Page 50: CloudML talk at DevFest Madurai 2016

The plan

01 A (very) brief overview of machine learning

02 Vision API

03 Speech API

04 Natural Language API

05 Tears (of joy)

Page 51: CloudML talk at DevFest Madurai 2016

Confidential & Proprietary, Google Cloud Platform

Machine Learning is

using many examples to answer questions

Page 53: CloudML talk at DevFest Madurai 2016

Why the sudden explosion in machine learning?

Page 57: CloudML talk at DevFest Madurai 2016

Google Cloud is

The Datacenter as a Computer

Page 62: CloudML talk at DevFest Madurai 2016

So what's special?

● Sound → Text

● Pixels → Meaning

Understanding the real world is hard

Page 63: CloudML talk at DevFest Madurai 2016

How can we make it easier?

Page 64: CloudML talk at DevFest Madurai 2016

Cloud Speech API Cloud Vision API

Page 65: CloudML talk at DevFest Madurai 2016

Speech API

● Speech to text transcription in over 80 languages

● Supports streaming and non-streaming recognition

● Filters inappropriate content

● Demo!

Page 67: CloudML talk at DevFest Madurai 2016

Label Detection

{
  "labelAnnotations": [
    { "mid": "/m/0c9ph5", "description": "Flower", "score": 98 },
    { "mid": "/m/05s2s", "description": "Plant", "score": 93 },
    { "mid": "/m/03bmqb", "description": "Flora", "score": 83 },
    { "mid": "/m/0k3b9", "description": "Hydrangea", "score": 81 }
  ]
}
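A label-detection response like the one on this slide is plain JSON, so a client can filter it with a few lines of code. The sketch below parses the slide's response (with the trailing comma removed so that it is valid JSON) and keeps the labels at or above a threshold. The function name and the threshold are illustrative choices, and the scores are taken exactly as shown on the slide.

```python
import json

# The response from the slide, as valid JSON.
RESPONSE = """
{
  "labelAnnotations": [
    { "mid": "/m/0c9ph5", "description": "Flower", "score": 98 },
    { "mid": "/m/05s2s", "description": "Plant", "score": 93 },
    { "mid": "/m/03bmqb", "description": "Flora", "score": 83 },
    { "mid": "/m/0k3b9", "description": "Hydrangea", "score": 81 }
  ]
}
"""


def labels_above(response_text, min_score):
    """Return label descriptions whose score is at least min_score."""
    annotations = json.loads(response_text).get("labelAnnotations", [])
    return [a["description"] for a in annotations if a["score"] >= min_score]


print(labels_above(RESPONSE, 90))
```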

Page 68: CloudML talk at DevFest Madurai 2016

Landmark Detection

{
  "landmarkAnnotations" : [
    {
      "boundingPoly" : {
        "vertices" : [
          {
            "x" : 52,
            "y" : 25
          },
          ...
        ]
      },
      "mid" : "\/m\/0b__kbm",
      "score" : 0.4231607,
      "description" : "The Wizarding World of Harry Potter",
      "locations" : [
        {
          "latLng" : {
            "longitude" : -81.471261,
            "latitude" : 28.473
          }
        }
      ]
    }
  ]
}

Page 69: CloudML talk at DevFest Madurai 2016

Knowledge Graph sidebar

GET https://kgsearch.googleapis.com/v1/entities:search?ids=%2Fm%2F0b__kbm&key={API_KEY}

{
  ...
  "itemListElement": [
    {
      "@type": "EntitySearchResult",
      "result": {
        "@id": "kg:/m/0b__kbm",
        "name": "The Wizarding World of Harry Potter",
        ...
        "detailedDescription": {
          "articleBody": "The Wizarding World of Harry Potter is a themed area spanning two theme parks – Islands of Adventure and Universal Studios Florida – at the Universal Orlando Resort in Orlando, Florida, USA.\n",
          ...
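The GET request on this slide looks up the same `mid` that the landmark response returned. As a rough sketch, the URL can be assembled with the Python standard library as below. The endpoint and parameter names are copied from the slide; the helper name is hypothetical, and `"API_KEY"` is a placeholder rather than a real key. No request is sent.

```python
import json
from urllib.parse import urlencode

# Endpoint as shown on the slide.
KG_ENDPOINT = "https://kgsearch.googleapis.com/v1/entities:search"


def kg_lookup_url(mid, api_key):
    """Build the lookup URL for one machine ID (mid)."""
    # urlencode percent-encodes the slashes in the mid ("/" becomes "%2F").
    return KG_ENDPOINT + "?" + urlencode({"ids": mid, "key": api_key})


# The landmark response escapes slashes ("\/"); JSON decoding undoes that.
mid = json.loads(r'"\/m\/0b__kbm"')
print(mid)
print(kg_lookup_url(mid, "API_KEY"))
```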

Page 70: CloudML talk at DevFest Madurai 2016

Face Detection

"faceAnnotations" : [
  {
    "headwearLikelihood" : "VERY_UNLIKELY",
    "surpriseLikelihood" : "VERY_UNLIKELY",
    "rollAngle" : 8.5484314,
    "angerLikelihood" : "VERY_UNLIKELY",
    "detectionConfidence" : 0.9996134,
    "joyLikelihood" : "VERY_LIKELY",
    "panAngle" : 18.178885,
    "sorrowLikelihood" : "VERY_UNLIKELY",
    "tiltAngle" : -12.244568,
    "underExposedLikelihood" : "VERY_UNLIKELY",
    "blurredLikelihood" : "VERY_UNLIKELY",
    "landmarks" : [
      {
        "type" : "LEFT_EYE",
        "position" : {
          "x" : 268.25815,
          "y" : 491.55255,
          "z" : -0.0022390306
        }
      },
      ...
      {
        "type" : "RIGHT_EYE",
        "position" : {
          "x" : 418.42868,
          "y" : 508.22632,
          "z" : 49.302765
        }
      },
      {
        "type" : "MIDPOINT_BETWEEN_EYES",
        "position" : {
          "x" : 359.86551,
          "y" : 500.2868,
          "z" : -7.9241152
        }
      },
      {
        "type" : "NOSE_TIP",
        "position" : {
          "x" : 358.51404,
          "y" : 611.80286,
          "z" : -31.350466
        }
      },
      ...
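The face annotation reports emotions as likelihood strings rather than numbers, so a client usually has to rank them before comparing. The sketch below does that for the four emotion fields shown on the slide. The numeric ranks and the function name are illustrative assumptions, not part of the API response; only the field names and values come from the slide.

```python
# Assumed ordering of the likelihood strings, lowest to highest.
LIKELIHOOD_RANK = {
    "UNKNOWN": 0,
    "VERY_UNLIKELY": 1,
    "UNLIKELY": 2,
    "POSSIBLE": 3,
    "LIKELY": 4,
    "VERY_LIKELY": 5,
}

# Emotion name -> field name in the face annotation on the slide.
EMOTION_FIELDS = {
    "joy": "joyLikelihood",
    "sorrow": "sorrowLikelihood",
    "anger": "angerLikelihood",
    "surprise": "surpriseLikelihood",
}

# Emotion fields copied from the slide's response.
FACE = {
    "surpriseLikelihood": "VERY_UNLIKELY",
    "angerLikelihood": "VERY_UNLIKELY",
    "joyLikelihood": "VERY_LIKELY",
    "sorrowLikelihood": "VERY_UNLIKELY",
}


def dominant_emotion(face):
    """Return the emotion with the highest likelihood (first one wins ties)."""
    ranked = {
        emotion: LIKELIHOOD_RANK[face.get(field, "UNKNOWN")]
        for emotion, field in EMOTION_FIELDS.items()
    }
    return max(ranked, key=ranked.get)


print(dominant_emotion(FACE))
```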

Page 74: CloudML talk at DevFest Madurai 2016

How about some meaning in those words?

Page 75: CloudML talk at DevFest Madurai 2016

Natural Language API

Three methods:

1. Analyze entities - Montreal is a city in Canada

2. Analyze sentiment - I love Montreal

3. Analyze syntax - Michelle Obama is married to Barack Obama
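To give a feel for how the three methods are called, the sketch below builds one REST request per method using the example sentences from the slide. It assumes the `v1` endpoint layout and request body of the Natural Language REST API as I understand them; the version available at the time of the talk may have differed, so treat the URL and field names as assumptions to check against the documentation at cloud.google.com/nl. Nothing is sent over the network.

```python
import json

# Assumed base URL for the REST API (v1 layout); verify before use.
BASE_URL = "https://language.googleapis.com/v1/documents:"

# The three methods named on the slide, with the slide's example sentences.
EXAMPLES = {
    "analyzeEntities": "Montreal is a city in Canada",
    "analyzeSentiment": "I love Montreal",
    "analyzeSyntax": "Michelle Obama is married to Barack Obama",
}


def build_request(method, text):
    """Return (url, json_body) for one Natural Language API call."""
    if method not in EXAMPLES:
        raise ValueError("unknown method: " + method)
    body = {
        "document": {"type": "PLAIN_TEXT", "content": text},
        "encodingType": "UTF8",
    }
    return BASE_URL + method, json.dumps(body)


for method, sentence in EXAMPLES.items():
    url, body = build_request(method, sentence)
    print("POST", url)
    print(body)
```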

Page 76: CloudML talk at DevFest Madurai 2016

https://cloud.google.com/nl

Page 77: CloudML talk at DevFest Madurai 2016

Free tears!

Page 78: CloudML talk at DevFest Madurai 2016

Free tiers! (the slide strikes out "tears")

● Vision API - 1,000 requests / month

● Speech API - 60 minutes / month

● Natural Language API - 5,000 units / month (1 unit = 1,000 Unicode characters)
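The Natural Language quota is counted in units rather than requests, so it helps to estimate how many units a piece of text uses. The sketch below takes the two numbers from the slide (1,000 characters per unit, 5,000 free units per month). Rounding a partial unit up to a whole one is an assumption made here, not something the slide states.

```python
import math

CHARS_PER_UNIT = 1000        # from the slide
FREE_UNITS_PER_MONTH = 5000  # from the slide


def units_for(text):
    """Estimate units for one document, assuming partial units round up."""
    # len() counts Unicode code points in Python 3.
    return math.ceil(len(text) / CHARS_PER_UNIT)


def fits_free_tier(texts):
    """True if the estimated total stays within the monthly free units."""
    return sum(units_for(t) for t in texts) <= FREE_UNITS_PER_MONTH


print(units_for("I love Montreal"))
print(units_for("a" * 1001))
```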

Page 79: CloudML talk at DevFest Madurai 2016

Thank you! @karthik_padman

Resources:

Speech API: cloud.google.com/speech

Vision API: cloud.google.com/vision

Natural Language API: cloud.google.com/nl

Page 80: CloudML talk at DevFest Madurai 2016

Thank you!

Karthik Padmanabhan, Developer Relations, Google Cloud Platform
@karthik_padman

Try BigQuery: bigquery.google.com

Cloud API

CloudML

Slides:

Page 82: CloudML talk at DevFest Madurai 2016

About you

● Game developers?
● Data people?
● Students?
● Not techies at all?