Google Cloud for Developers - Devfest Manila
Google Cloud for Developers
Apps, Apps Marketplace, App Engine for Business, Storage, Prediction, BigQuery
Patrick Chanezon
Developer [email protected]
http://twitter.com/chanezon
#devfest Manila, July 6 2010
Tuesday, July 6, 2010
The benefits of Cloud Computing
• Economics: pay for only what you use; TCO; OPEX vs CAPEX
• Operations: day to day, no maintenance; fighting fires, no pagers
• Elasticity
• Focus on your business
Build your own
Google App Engine for Business
Buy from others
Google Apps Marketplace
Enterprise Firewall
Enterprise Data Authentication Enterprise Services User Management
Google Apps Platform
Buy from Google
Google Apps for Business
Build and Buy all your enterprise cloud apps...
Customers want more Apps
Google Apps
Business in the cloud
Google Apps Marketplace
Marketplace Overview
Google Apps is Focused on Messaging & Collaboration
Gives every employee powerful messaging and collaboration tools without the usual IT hassle and cost
These customers want more great cloud apps...
Messaging
Collaboration
Accounting & Finance
Admin Tools
Calendaring / Meetings
Customer Management
Document Management
Productivity
Project Management
Sales & Marketing
Security & Compliance
Workflow
Steps to sell your app to Google Apps customers
1. Build your app:
- with any tools and hosting provider you want
2. Integrate your app:
- add Single Sign On using OpenID (required)
- access over a dozen integration points from Calendar, Contacts, Docs, etc. using OAuth (optional)
3. Sell your app:
- to 2M+ businesses & 25M+ users
- through Google Apps resellers
- to wherever Google Apps goes
- One-time fee of $100, 20% rev share starting 2H'10
Everything in the Cloud
Integrating with Google Apps & the Marketplace
Manifest
Demo Google Apps Marketplace
How it's done - Single Sign On
Single Sign On - OpenID with Google Apps
Sounds complicated, but not hard in practice!
Single Sign On - OpenID Libraries
Language   Libraries
Java       OpenID4Java, Step2
.NET       DotNetOpenAuth
PHP        php-openid, php-openid-apps-discovery
Ruby       ruby-openid, ruby-openid-apps-discovery
Any        RPX, Ping Identity
Single Sign On - Code
Step 1 of 3 - Initialize library
function getOpenIDConsumer() {
  // Initialize client, storing associations in memcache
  $store = new Auth_OpenID_MemcachedStore($MEMCACHE);
  $consumer = new Auth_OpenID_Consumer($store);
  // Enable Google Apps support
  GApps_OpenID_EnableDiscovery($consumer);
  return $consumer;
}
Single Sign On - Code
Step 2 of 3 - Make the request
// Create an auth request to the user's domain
$auth_request = getOpenIDConsumer()->begin($domain);

// Request email address & name during login
$ax = new Auth_OpenID_AX_FetchRequest;
// Each attribute must be added to the fetch request, otherwise it is never sent
$ax->add(Auth_OpenID_AX_AttrInfo::make($AX_SCHEMA_EMAIL, 1, 1, 'email'));
$ax->add(Auth_OpenID_AX_AttrInfo::make($AX_SCHEMA_FIRSTNAME, 1, 1, 'first'));
$ax->add(Auth_OpenID_AX_AttrInfo::make($AX_SCHEMA_LASTNAME, 1, 1, 'last'));
$auth_request->addExtension($ax);

// Render Javascript/form to post request
$form_html = $auth_request->htmlMarkup($REALM, $RETURN_TO, false,
    array('id' => 'openid_message'));
print $form_html;
Single Sign On - Code
Step 3 of 3 - Handle response
// Parse the response from identity provider
$response = getOpenIDConsumer()->complete($RETURN_TO);
if ($response->status == Auth_OpenID_SUCCESS) {
  // Extract data from response
  $openid = $response->getDisplayIdentifier();
  $ax = new Auth_OpenID_AX_FetchResponse();
  $ax_resp = $ax->fromSuccessResponse($response);
  $email = $ax_resp->data[$AX_SCHEMA_EMAIL][0];
  $firstName = $ax_resp->data[$AX_SCHEMA_FIRSTNAME][0];
  $lastName = $ax_resp->data[$AX_SCHEMA_LASTNAME][0];
  // Map to user in DB
  $user = fetch_user($openid, $email, $firstName, $lastName);
}
Single Sign On - Code
Manifest
<ApplicationManifest>
  <Name>SaaSy Voice</Name>
  <Description>Voice Mail &amp; Messaging</Description>
  <Admin>
    <Link rel="setup">http://voice.saasyapp.com/setup.php?domain=${DOMAIN_NAME}</Link>
  </Admin>
  <Extension id="navlink" type="link">
    <Name>SaaSy Voice</Name>
    <Url>http://voice.saasyapp.com/login.php?domain=${DOMAIN_NAME}</Url>
  </Extension>
  <Extension id="realm" type="openIdRealm">
    <Url>http://voice.saasyapp.com/</Url>
  </Extension>
</ApplicationManifest>
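The manifest URLs carry a ${DOMAIN_NAME} placeholder that stands for the customer's Google Apps domain. As a rough illustration only (Python 3; how the Marketplace actually performs the substitution is not shown in the slides, and the helper name is hypothetical), the expansion can be sketched with the standard library:

```python
from string import Template

# URL pattern copied from the manifest's navlink <Url> element.
LOGIN_URL = Template("http://voice.saasyapp.com/login.php?domain=${DOMAIN_NAME}")


def expand_manifest_url(template, domain):
    """Replace ${DOMAIN_NAME} with the customer's Google Apps domain.

    Hypothetical helper for illustration; not part of any Google API.
    """
    return template.substitute(DOMAIN_NAME=domain)


if __name__ == "__main__":
    print(expand_manifest_url(LOGIN_URL, "example.com"))
```

The app's login.php then reads the domain query parameter to decide which OpenID provider to start discovery against.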
Single Sign On - What we learned about users
Claimed ID: http://example.com/openid?id=12345
Email: [email protected]*
First Name: Bob
Last Name: Dobbs
* Be sure to confirm or white-list trusted providers
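The white-list advice above can be sketched as a host check on the OpenID provider endpoint before the asserted email is trusted. This is a minimal Python 3 sketch under stated assumptions: the set of trusted hosts, the endpoint URLs and the function name are all illustrative and do not come from the slides.

```python
from urllib.parse import urlparse

# Hypothetical white-list: provider hosts whose email assertions we accept.
TRUSTED_PROVIDER_HOSTS = {"www.google.com"}


def email_is_trusted(op_endpoint, email):
    """Accept an asserted email only when it came from a white-listed provider."""
    host = urlparse(op_endpoint).hostname
    return bool(email) and host in TRUSTED_PROVIDER_HOSTS


if __name__ == "__main__":
    print(email_is_trusted("https://www.google.com/a/example.com/o8/ud", "bob@example.com"))
    print(email_is_trusted("http://openid.example.net/server", "bob@example.com"))
```

Emails from any other provider would need a separate confirmation step (for example a verification mail) before being mapped to an existing account.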
How it's done - Data Access
2-Legged OAuth - Delegating data access
<ApplicationManifest>
  ...
  <Scope id="userFeed">
    <Url>https://apps-apis.google...
    <Reason>To get a list of user...
  </Scope>
  <Scope id="contactsFeed">
    <Url>https://www.google.com...
    <Reason>To display names of...
  </Scope>
  <Scope id="docsFeed">
    <Url>https://docs.google.com...
    <Reason>To export a call log...
  </Scope>
</ApplicationManifest>
Data Access - Code
Step 1 of 2 - Initialize the client
function getOauthClient() {
  $options = array(
    'consumerKey' => $CONSUMER_KEY,
    'consumerSecret' => $CONSUMER_SECRET,
    'signatureMethod' => 'HMAC-SHA1',
    'requestScheme' => Zend_Oauth::REQUEST_SCHEME_HEADER,
    'version' => '1.0');
  $consumer = new Zend_Oauth_Consumer($options);
  $token = new Zend_Oauth_Token_Access();
  $httpClient = $token->getHttpClient($options);
  return $httpClient;
}
Data Access - Code
Step 2 of 2 - Fetching Users with Provisioning API
// Initialize client
$userClient = new Zend_Gdata_Gapps(getOauthClient());
// Query feed for current user's domain
$userQuery = new Zend_Gdata_Gapps_UserQuery(getCurrentUserDomain());
$usersFeed = $userClient->getUserFeed($userQuery);
// Extract data from user feed
$users = array();
foreach ($usersFeed as $userEntry) {
  $login = $userEntry->getLogin();
  $name = $userEntry->getName();
  $users[] = array(
    'username' => $login->getUsername(),
    'firstName' => $name->getGivenName(),
    'lastName' => $name->getFamilyName(),
    'admin' => $login->getAdmin());
}
Integration Recap
With not a lot of code we added:
• Single Sign On with OpenID
• Quicker setup with Attribute Exchange, Provisioning API
• Integrated user data with Contacts API
• Uploading spreadsheets with Docs API
Gadgets - your real estate in Google Apps
Gadgets
• Many types of gadgets
  o Gmail sidebar
  o Calendar sidebar
  o Sites
  o Spreadsheets
• Two new types of gadgets available this week!
  o Gmail contextual!
  o Wave!
• All use the OpenSocial gadgets specification
Gmail contextual gadgets
• Detect e-mail content via regular expressions
• Display and collect actionable business information directly in Gmail, below each message
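The regular-expression detection mentioned above can be pictured with a small Python 3 sketch. It shows only the matching idea: real contextual gadgets declare their extractors in the application manifest, and the "Order #" pattern and function name here are hypothetical.

```python
import re

# Hypothetical pattern: order references such as "Order #12345" in a message body.
ORDER_PATTERN = re.compile(r"\bOrder #(\d{4,})\b")


def extract_order_ids(message_body):
    """Return every order number found in the message text."""
    return ORDER_PATTERN.findall(message_body)


if __name__ == "__main__":
    body = "Your Order #12345 has shipped. Order #987 is still pending."
    print(extract_order_ids(body))
```

Each match would then be handed to the gadget, which can render the related business record below the message.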
Calendar sidebar gadgets
• Can detect:
  o currently-displayed date range
  o currently-selected event: title, attendees, date/time
Sell
Complete Manifest
<ApplicationManifest xmlns="http://schemas.google.com/ApplicationManifest/2009">
  <Name>SaaSy Voice</Name>
  <Description>SaaSy Voice</Description>
  <Support>
    <Link rel="setup" href="https://.../setup.php?domain=${DOMAIN_NAME}"/>
    <Link rel="manage" href="https://.../config.php?domain=${DOMAIN_NAME}"/>
    <Link rel="deletion-policy" href="http://.../deletion-policy.php"/>
    <Link rel="support" href="http://.../support.php"/>
  </Support>
Complete Manifest (continued)
  <!-- Link in universal navigation -->
  <Extension id="oneBarLink" type="link">
    <Name>SaaSy Voice</Name>
    <Url>http://.../index.php?domain=${DOMAIN_NAME}</Url>
    <Scope ref="provisioningFeed"/>
    <Scope ref="contactsFeed"/>
    <Scope ref="docsFeed"/>
  </Extension>
  <!-- Declare our openid.realm -->
  <Extension id="realm" type="openIdRealm">
    <Url>http://voice.saasyapp.com/marketplace/</Url>
  </Extension>
Complete Manifest (continued)
  <Scope id="provisioningFeed">
    <Url>https://apps-apis.google.com/a/feeds/user/#readonly</Url>
    <Reason>To get a list of users to provision accounts</Reason>
  </Scope>
  <Scope id="contactsFeed">
    <Url>https://www.google.com/m8/feeds/</Url>
    <Reason>To display names of people who called</Reason>
  </Scope>
  <Scope id="docsFeed">
    <Url>https://docs.google.com/feeds/</Url>
    <Reason>To export a call log</Reason>
  </Scope>
</ApplicationManifest>
Listing in the Google Apps Marketplace
Reach 25 million Google Apps Users
"We've seen a very meaningful increase in high quality, non-paid lead flow directly attributable to Google Apps customers...Our customers cite Smartsheet's tight integration with Google's Data APIs as a key factor in their decision to purchase." Brent Frei, Smartsheet
Roadmap
Roadmap - General
• Recent Launches
  o Gmail contextual gadgets
  o OAuth support for Gmail's IMAP and SMTP
Roadmap - Billing
• Goals
  o Simplified invoicing & payments for customers
  o Consistent experience across apps
• Planned 2nd-half of 2010
  o Required for installable apps after transition period
  o Revenue sharing starts once APIs adopted
• Features
  o Unified billing for apps purchased in marketplace
  o Support for a variety of pricing & licensing models
  o APIs for customization and up-selling
Learn More
Resources
• Business and Marketing
  o http://developer.googleapps.com/marketplace
• Technical and Code
  o http://code.google.com/googleapps/marketplace
• Shopping!
  o http://www.google.com/appsmarketplace
• Don't have Google Apps?
  o http://www.google.com/a
Google App Engine
Leveraging Google's Leadership in Cloud Computing
• Massive data center operations
• Purpose built hardware
• Multi tenant software platform at Internet scale
Google App Engine
- Easy to build
- Easy to maintain
- Easy to scale
Cloud development in a box
• SDK & “The Cloud”
• Hardware
• Networking
• Operating system
• Application runtime
  o Java, Python
• Static file serving
• Services
• Fault tolerance
• Load balancing
App Engine Services
• Blobstore
• Images
• Mail
• XMPP
• Task Queue
• Memcache
• Datastore
• URL Fetch
• User Service
Always free to get started
~5M pageviews/month
• 6.5 CPU hrs/day
• 1 GB storage
• 650K URL Fetch calls/day
• 2,000 recipients emailed
• 1 GB/day bandwidth
• 100,000 tasks enqueued
• 650K XMPP messages/day
Purchase additional resources *
* free monthly quota of ~5 million page views still in full effect
By the numbers
• 0.5B+ daily pageviews
• 250,000+ developers
• 100,000+ apps
App Engine
Socialwok
Social networking at scale
>62M Users
3.6M DAUs on Facebook
1.9M DAUs on MySpace
Orkut, Bebo, Hi5, Friendster, Hyves, Ning, …
Chillingo Crystal: Gaming meets Social
Cogs
Guerilla Bob
Zombie Dash Angry Birds LITE Underground Meltdown
Mission Deep Sea Speed Forge Extreme
Ravensword: The Fallen King
Angry Birds
Gigya Socialize
Gigya Socialize - traffic
App Engine
Google App Engine for Business
Build your Enterprise Apps on Google
• Easy to Build - Java standards
• Easy to Deploy - push-button deployment
• Easy to Scale - from small apps to millions of users
Google App Engine for Business
• Centralized administration - controls
• Reliability and support - SLA, Premium support
• Secure by default - only your users
• Pricing that makes sense - pay only for what you use
• Enterprise features - hosted SQL, SSL on your domain
Understanding the Cloud Computing Landscape
IaaS
PaaS
SaaS
Source: Gartner AADI Summit Dec 2009
Google's Cloud Offerings
Google Storage for Devs
Machine Learning
BigQuery
Google App Engine
IaaS
PaaS
SaaS
Your Apps
1. Our Apps
2. 3rd party Apps: Google Apps Marketplace
3. ________
Domain Console
Like the regular admin console
Designed to manage enterprises with a portfolio of apps
• Keep track of all apps in a domain
• Access Control: view apps, deploy
• Global Settings: apply to all apps in the domain
• Billing rolling up to single account
• DNS configuration done only once: *.ext.example.com
• All apps by default for logged in users from domain
Google Apps Integration
• SSO/SSO delegation
• APIs for most Google Apps for integration
Federate your on-premise data
Secure Data Connector (SDC)
Using Secure Data Connector
Installation
- Determine access rules
- Configure and install SDC
Getting ready to serve
- SDC opens SSL tunnel
Serving
- User request sent to App Engine
- User authenticated
- App makes request through tunnel
- SDC performs access checks
- Results returned
App Engine for Business Pricing
Intranet apps:
• Each app costs $8 / active user / month
• Capped at $1,000 / month (i.e. users above 125 are free)
• Apps are auth-restricted to domain users
• Development is free
• Overage charges on Background Analysis/Storage
Non intranet apps (external/public/ISV apps):
• Pricing TBD
• Postpaid (i.e. billed at the end of month)
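The intranet pricing rule above is simple arithmetic: $8 per active user, capped at $1,000 per app per month. A minimal Python 3 sketch, using only the two figures from the slide (overage charges are left out, and the function name is hypothetical):

```python
PRICE_PER_USER = 8    # USD per active user per month, from the slide
MONTHLY_CAP = 1000    # USD per app per month, from the slide


def monthly_app_cost(active_users):
    """Monthly cost in USD of one intranet app, before any overage charges."""
    if active_users < 0:
        raise ValueError("active_users must be non-negative")
    return min(active_users * PRICE_PER_USER, MONTHLY_CAP)


if __name__ == "__main__":
    for users in (10, 124, 125, 500):
        print(users, monthly_app_cost(users))
```

Since 125 x 8 = 1,000, the cap is reached at exactly 125 active users, which is why every user beyond that is free.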
App Engine for Business Support and SLA
Paid Support
• Email based
• $1,000/month
• 1h response time on operational issues
• 8h on development issues
SLA
• 99.9% uptime
• Service credits from 10% to 100% refund of monthly bill
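To put the 99.9% figure in perspective, the downtime it permits can be worked out directly. The Python 3 sketch below assumes a 30-day month and a simple wall-clock definition of uptime; the slides do not say how the SLA actually measures it, and the function name is hypothetical.

```python
def allowed_downtime_minutes(uptime_percent, days_in_month=30):
    """Minutes of downtime per month that still satisfy the given uptime target."""
    total_minutes = days_in_month * 24 * 60
    return total_minutes * (100 - uptime_percent) / 100


if __name__ == "__main__":
    print(round(allowed_downtime_minutes(99.9), 1))
```

Under those assumptions a 99.9% target leaves roughly 43 minutes of downtime in a 30-day month.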
Google Developer Qualification
Gadgets JS Maps API Search KML 3D App Engine
Chrome Extensions
Two years in review
Apr 2008 Python launch
May 2008 Memcache, Images API
Jul 2008 Logs export
Aug 2008 Batch write/delete
Oct 2008 HTTPS support
Dec 2008 Status dashboard, quota details
Feb 2009 Billing, larger files
Apr 2009 Java launch, DB import, cron support, SDC
May 2009 Key-only queries
Jun 2009 Task queues
Aug 2009 Kindless queries
Sep 2009 XMPP
Oct 2009 Incoming email
Dec 2009 Blobstore
Feb 2010 Datastore cursors, Appstats
Mar 2010 Read policies, IPv6
May 2010 App Engine for Business
An evolving platform
App Engine Roadmap
• SSL for your domain
• Background servers
• Reserved instances
• Control datastore availability vs. latency trade-offs
• Mapping operations across datasets
• Datastore dump and restore facility
• Raise request/response size limits for some APIs
• Improved monitoring/alerting
• Channel API
• Built-in support for OAuth & OpenID
Google Storage
Google Storage
• Cloud-based binary object store
  o Structured as buckets and objects
  o Many buckets, many objects, large objects
• You control your data
  o Private, shared, or public
  o Get your data back out at any time
• For developers
  o RESTful API
  o Many SDKs + tools
  o Integration with other Google services
Google Storage Benefits
High Performance and Scalability backed by Google infrastructure
Flexible Authentication & Sharing Models
Get Started Fast with Google & 3rd Party Utilities
Demo: Get Started In < 1 Minute
• click emailed invitation link
• read & accept Terms Of Service
• start using GS Manager
Google services using Google Storage
• Partner Reporting
• Data Liberation
• Haiti Relief Imagery
• Google BigQuery
• Google Prediction API
Some current users
Technical Overview
Google Storage Overview
• Fast, scalable, highly available object store
  o Objects of any type and practically any size
  o Lots of objects, lots of buckets
  o All data replicated to multiple US data centers
  o Read-your-writes data consistency
• Easy, flexible authentication and sharing
  o Key-based authentication
  o Authenticated downloads from a web browser
  o Sharing with individuals and groups
• Google products and 3rd party tools/services
  o Compatible with many available tools and libraries
  o Getting started toolkit
API Concepts
• RESTful API
  o Verbs: GET, PUT, POST, HEAD, DELETE
  o Resources: identified by URI
• Buckets
  o Flat containers
• Objects
  o Any type, practically any size
• Access Control for Google Accounts
  o Google Groups
• Two Ways to Authenticate Requests
  o Sign request using access keys
  o Web browser login
Sample Signed Request
PUT /mybucket/My/Long/Object/Name HTTP/1.1
Host: commondatastorage.googleapis.com:443
Accept-Encoding: identity
Date: Sat, 08 May 2010 19:04:21 GMT
Content-Length: 28
Content-Type: text/plain
Authorization: GOOG1 GOOG4622809698762217:J+y3mj5GThfI6Ed1MqLi7JpCq5Y=

This is my object's content.
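The Authorization header in the sample request pairs an access key with a signature. The Python 3 sketch below shows one plausible way such a header could be produced. It assumes, without confirmation from the slides, an S3-style scheme in which the verb, Content-MD5, Content-Type, Date and resource path are joined with newlines and signed with HMAC-SHA1; the secret and the function name are hypothetical, so the output will not match the signature on the slide.

```python
import base64
import hashlib
import hmac


def goog1_authorization(access_key, secret, verb, content_type, date,
                        resource, content_md5=""):
    """Build a 'GOOG1 key:signature' header value (assumed S3-style signing)."""
    # Assumed message layout; extension headers are omitted in this sketch.
    message = "\n".join([verb, content_md5, content_type, date, resource])
    digest = hmac.new(secret.encode("utf-8"), message.encode("utf-8"),
                      hashlib.sha1).digest()
    signature = base64.b64encode(digest).decode("ascii")
    return "GOOG1 %s:%s" % (access_key, signature)


if __name__ == "__main__":
    print(goog1_authorization(
        "GOOG4622809698762217",
        "not-the-real-secret",  # hypothetical secret
        "PUT",
        "text/plain",
        "Sat, 08 May 2010 19:04:21 GMT",
        "/mybucket/My/Long/Object/Name",
    ))
```

In practice a tool such as gsutil or the boto library performs this signing, so application code rarely needs to build the header by hand.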
Sharing and ACLs
• Data can be private or shared
• Bucket ACL determines:
  o who can list objects (READ)
  o who can create / delete objects (WRITE)
  o who can read / write bucket ACL (FULL_CONTROL)
• Object ACL determines:
  o who can read objects (READ)
  o who can read / write object ACL (FULL_CONTROL)
Read-Your-Writes Consistency
• Once a write succeeds, all future reads will see a snapshot that includes that write... no matter what replica they talk to
• Once any reader sees a result (even if the write previously appeared to fail) then all future readers will see a snapshot that includes the write
Interoperability Gives You Choice
• Data Liberation
  o You shouldn't be locked in by your choice of storage provider
• Choice of tools
  o You should be able to use the same tools to manage your data, regardless of where you keep it
Demo: Using Google Storage
Computing in the Cloud: App Engine Sample
import os
os.environ['BOTO_CONFIG'] = 'boto.cfg'
from boto import storage_uri
from google.appengine.ext import webapp
from google.appengine.ext.webapp.util import run_wsgi_app

class MainPage(webapp.RequestHandler):
    def get(self):
        self.response.out.write('<html><body>')
        # Read an object from Google Storage through boto
        uri = storage_uri('gs://pub/shakespeare/rose.txt')
        poem = uri.get_contents_as_string()
        self.response.out.write('<pre>' + poem + '</pre>')
        self.response.out.write('</body></html>')

def main():
    application = webapp.WSGIApplication([('/', MainPage)])
    run_wsgi_app(application)

if __name__ == "__main__":
    main()
Pricing and Availability
• Pay as you go pricing
• Storage - $0.17/GB/month
• Network
  o Upload data to Google: $0.10/GB
  o Download data from Google: $0.15/GB for Americas and EMEA, $0.30/GB for APAC
• Requests
  o PUT, POST, LIST - $0.01 per 1,000 requests
  o GET, HEAD - $0.01 per 10,000 requests
• Free storage (up to 100GB) during preview period
  o No SLA
• http://code.google.com/apis/storage
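The price list above can be combined into a rough monthly estimate. The Python 3 sketch below uses only the rates quoted on the slide and deliberately ignores the free preview-period storage; the function, parameter and region key names are hypothetical.

```python
from decimal import Decimal

# Rates quoted on the slide, in USD.
STORAGE_PER_GB_MONTH = Decimal("0.17")
UPLOAD_PER_GB = Decimal("0.10")
DOWNLOAD_PER_GB = {"americas_emea": Decimal("0.15"), "apac": Decimal("0.30")}
WRITE_REQUESTS_PER_1000 = Decimal("0.01")   # PUT, POST, LIST
READ_REQUESTS_PER_10000 = Decimal("0.01")   # GET, HEAD


def estimate_monthly_cost(stored_gb, uploaded_gb, downloaded_gb, region,
                          write_requests, read_requests):
    """Estimate one month's bill in USD, ignoring the free preview quota."""
    cost = (
        STORAGE_PER_GB_MONTH * stored_gb
        + UPLOAD_PER_GB * uploaded_gb
        + DOWNLOAD_PER_GB[region] * downloaded_gb
        + WRITE_REQUESTS_PER_1000 * write_requests / 1000
        + READ_REQUESTS_PER_10000 * read_requests / 10000
    )
    return cost.quantize(Decimal("0.01"))


if __name__ == "__main__":
    print(estimate_monthly_cost(200, 50, 100, "apac", 100000, 1000000))
```

Decimal is used instead of float so that cents add up exactly; for the Manila audience the APAC download rate is the relevant one.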
What's Coming Up
• Service Level Agreement
• Support
• Available to Premium Apps Customers
• Technical Features:
  o Group support in ACLs
  o Resumable uploads
  o Additional regions
Google BigQuery and Prediction APIs
Overview
• Big Data - Challenging and Important
• Google has tools for deep data analysis
• Now you can use these tools
• Announcing two new APIs to get more from your data:
  1. BigQuery
  2. Prediction API
Benefits
• Built on Google technology
• Scalability
• Security
• Sharing
• Easy integration with Google App Engine, Google Spreadsheets, ....
Using Your Data with BigQuery & Prediction API
1. Upload: upload your data to Google Storage
2. Process: import to tables (BigQuery), train a model (Prediction API)
3. Act: run queries (BigQuery), make predictions (Prediction API)
Your Data, BigQuery, Prediction API, Your Apps
BigQuery
Interactive Analysis of Big Data
Big Data is Challenging
Starts with Scale
Many Use Cases ...
Spam Trends Detection
Web Dashboards
Network Optimization
Interactive Tools
Demo: Analyzing M-Lab
An open platform for advanced network research
http://www.measurementlab.net/
Demo: Exploring M-Lab
Key Capabilities of BigQuery
• Scalable: Billions of rows
• Fast: Response in seconds
• Simple: Queries in SQL
• Web Service
  o REST
  o JSON-RPC
  o Google App Scripts
Using Your Data with BigQuery
1. Upload
Upload to Google Storage
2. Import
Import data into a BigQuery Table
- No need to define indices, keys, etc.
3. Query
Execute queries via APIs
- No provisioning machines or resources
Writing Queries
Compact subset of SQL
  o SELECT ... FROM ... WHERE ... GROUP BY ... ORDER BY ... LIMIT ...;
Common functions
  o Math, String, Time, ...
Statistical approximations
  o TOP
  o COUNT DISTINCT
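The query shape listed above (SELECT ... FROM ... WHERE ... GROUP BY ... ORDER BY ... LIMIT) can be tried locally. The Python 3 sketch below runs it against an in-memory SQLite database purely to illustrate the clause order; it does not talk to BigQuery, and the table, columns and rows are made up.

```python
import sqlite3

# In-memory SQLite stands in for a BigQuery table; all data here is made up.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE requests (country TEXT, latency_ms INTEGER)")
conn.executemany(
    "INSERT INTO requests VALUES (?, ?)",
    [("PH", 120), ("PH", 80), ("US", 40), ("US", 60), ("US", 50), ("FR", 300)],
)

QUERY = """
SELECT country, COUNT(*) AS hits
FROM requests
WHERE latency_ms < 200
GROUP BY country
ORDER BY hits DESC
LIMIT 2;
"""

rows = conn.execute(QUERY).fetchall()

if __name__ == "__main__":
    for country, hits in rows:
        print(country, hits)
```

BigQuery-specific features such as TOP and approximate COUNT DISTINCT have no direct SQLite equivalent and are not shown.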
API in a Minute
GET /bigquery/v1/tables/{table name}
GET /bigquery/v1/query?q={query}
Sample JSON Reply:
{
  "results": {
    "fields": { [
      {"id":"COUNT(*)","type":"uint64"}, ...
    ] },
    "rows": [
      {"f":[{"v":"2949"}, ...]},
      {"f":[{"v":"5387"}, ...]},
      ...
    ]
  }
}
Also supports JSON-RPC
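A client for the REST endpoint above mostly needs two things: a URL-encoded query string and a way to unpack the rows/f/v nesting of the reply. The Python 3 sketch below covers both without any network access. The function names are hypothetical, and the sample reply is a trimmed version of the slide's JSON with the elided parts left out.

```python
import json
from urllib.parse import quote


def build_query_path(sql):
    """Return the request path for a query, with the SQL percent-encoded."""
    return "/bigquery/v1/query?q=" + quote(sql, safe="")


def rows_from_reply(reply_text):
    """Flatten the reply's rows into plain lists of cell values."""
    reply = json.loads(reply_text)
    return [[cell["v"] for cell in row["f"]] for row in reply["results"]["rows"]]


# Trimmed sample reply; the "fields" part is omitted because the slide elides it.
SAMPLE_REPLY = '{"results": {"rows": [{"f": [{"v": "2949"}]}, {"f": [{"v": "5387"}]}]}}'

if __name__ == "__main__":
    print(build_query_path("SELECT COUNT(*) FROM t"))
    print(rows_from_reply(SAMPLE_REPLY))
```

Values arrive as strings even for numeric columns such as uint64, so callers would convert them based on the field types.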
Security and Privacy
Standard Google Authentication• Client Login• OAuth• AuthSub
HTTPS support• protects your credentials• protects your data
Use Google Storage for Developers to manage access
Large Corpus Analysis
Wikimedia revision history data from: http://download.wikimedia.org/enwiki/latest/enwiki-latest-pages-meta-history.xml.7z
Using BigQuery Shell
Python DB API 2.0 + B. Clapper's sqlcmd
http://www.clapper.org/software/python/sqlcmd/
BigQuery from a Spreadsheet
BigQuery Recap
• Interactive analysis of very large data sets• Simple SQL query language
• APIs enable a variety of use cases
Google Prediction API
Machine learning as a web service
Prediction API 101
• Google's sophisticated machine learning algorithms
• Available as an on-demand RESTful HTTP web service
• Train a model offline/asynchronously
• Predict results in real-time
Example: "Tous pour un, un pour tous, c'est notre devise." → Prediction API → "french"
How does it work?
"english" The quick brown fox jumped over the lazy dog.
"english" To err is human, but to really foul things up you need a computer.
"spanish" No hay mal que por bien no venga.
"spanish" La tercera es la vencida.
? To be or not to be, that is the question.
? La fe mueve montañas.
The Prediction API finds relevant features in the sample data during training.
The Prediction API later searches for those features during prediction.
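The train-then-predict idea described above can be illustrated with a toy word-count classifier over the slide's own sentences. This Python 3 sketch is not how the Prediction API works internally (its algorithms are not described in the slides); it only shows features being collected during training and looked up during prediction. All function names are hypothetical.

```python
import re
from collections import Counter, defaultdict

# Labelled examples taken from the slide.
TRAINING = [
    ("english", "The quick brown fox jumped over the lazy dog."),
    ("english", "To err is human, but to really foul things up you need a computer."),
    ("spanish", "No hay mal que por bien no venga."),
    ("spanish", "La tercera es la vencida."),
]


def words(text):
    """Lower-cased word tokens; these are the 'features' of this toy model."""
    return re.findall(r"\w+", text.lower())


def train(examples):
    """Count how often each word appears under each label."""
    model = defaultdict(Counter)
    for label, text in examples:
        model[label].update(words(text))
    return model


def predict(model, text):
    """Pick the label whose training words overlap most with the input."""
    scores = {
        label: sum(counts[w] for w in words(text))
        for label, counts in model.items()
    }
    return max(scores, key=scores.get)


if __name__ == "__main__":
    model = train(TRAINING)
    print(predict(model, "To be or not to be, that is the question."))
    print(predict(model, "La fe mueve montañas."))
```

With only four training sentences the model is fragile; the hosted service exists precisely so that feature selection and model choice are handled for you.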
A virtually endless number of applications...
• Customer Sentiment
• Transaction Risk
• Species Identification
• Message Routing
• Legal Docket Classification
• Suspicious Activity
• Work Roster Assignment
• Recommend Products
• Political Bias
• Uplift Marketing
• Email Filtering
• Diagnostics
• Inappropriate Content
• Career Counselling
• Churn Prediction
... and many more ...
Three simple steps to use the Prediction API
1. Upload: upload your training data to Google Storage
  o Use the API, gsutil or any compatible utility to upload your data to Google Storage
2. Train: build a model from your data
  o prediction/v1/train/{}
  o POST: a training request
3. Predict: make new predictions
  o prediction/v1/query/{}
  o GET: model info
  o POST: a prediction request
Prediction API Demo
Automatically categorize and respond to emails by language
• Customer: ACME Corp, a multinational organization
• Goal: Respond to customer emails in their language
• Data: Many emails, tagged with their languages
• Outcome: Predict language and respond accordingly
Step 1: Upload
Upload your training data to Google Storage
• Training data: outputs and input features
• Data format: comma separated value format (CSV)
$ head -n 2 ${data}
"english","To err is human, but to really ..."
"spanish","No hay mal que por bien no venga."
Upload to Google Storage
$ gsutil cp ${data} gs://io10/${data}
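The training file shown above is plain CSV with the label first and the text second, both quoted. A minimal Python 3 sketch for producing that layout from labelled pairs follows; the function name is hypothetical and the upload itself is left to gsutil as on the slide.

```python
import csv
import io


def to_training_csv(examples):
    """Serialise (label, text) pairs as fully quoted CSV, label first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for label, text in examples:
        writer.writerow([label, text])
    return buffer.getvalue()


if __name__ == "__main__":
    print(to_training_csv([
        ("english", "To err is human, but to really foul things up you need a computer."),
        ("spanish", "No hay mal que por bien no venga."),
    ]), end="")
```

Using the csv module rather than string concatenation keeps commas and quotes inside the text from breaking the file.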
Step 2: Train
Create a new model by training on data
To train a model:
POST prediction/v1/train/${data}
Training runs asynchronously. To see if it has finished:
GET prediction/v1/query/${data}
{"data": {
  "resource": {
    "data": "${data}",
    "modelinfo": "estimated accuracy: ${acc}"}}}
Step 3: Predict
Apply the trained model to make predictions on new data
POST prediction/v1/query/${data}
{ "data" : {
  "instance" : {
    "input" : { "text" : [ "J'aime X! C'est le meilleur" ]},
    "output" : { "output_label" : "french" }}}}
Step 3: Predict
Apply the trained model to make predictions on new data
import httplib

header = {"Content-Type": "application/json"}
# ...put new data in JSON format in params variable
conn = httplib.HTTPConnection("www.googleapis.com")
conn.request("POST", "/prediction/v1/query/${data}", params, header)
# Read the body; printing the response object itself would not show the prediction
print conn.getresponse().read()
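The "put new data in JSON format" step above, and reading the label back, can be sketched as follows in Python 3. The JSON shapes mirror the request and response shown on the Step 3 slides; the function names are hypothetical, and no network call is made here.

```python
import json


def build_request_body(text):
    """JSON body for a prediction request, shaped like the slide's example."""
    return json.dumps({"data": {"instance": {"input": {"text": [text]}}}})


def label_from_response(response_text):
    """Extract the predicted label from a reply shaped like the slide's example."""
    reply = json.loads(response_text)
    return reply["data"]["instance"]["output"]["output_label"]


if __name__ == "__main__":
    print(build_request_body("J'aime X! C'est le meilleur"))
    print(label_from_response(
        '{"data": {"instance": {"output": {"output_label": "french"}}}}'
    ))
```

The string returned by build_request_body is what would be passed as the request body in the POST call.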
Prediction API Capabilities
Data• Input Features: numeric or unstructured text• Output: up to 100s of discrete categories
Training• Many machine learning techniques• Automatically selected • Performed asynchronously
Access from many platforms:• Web app from Google App Engine• Apps Script (e.g. from Google Spreadsheet)• Desktop app
Prediction API and BigQuery Demo: Tagger
Input Data: http://delic.io.us/chanezon
  o 6000 urls, 14000 tags in 6 years
Analyze my delicious tags
  o use delicious API to get all tagged urls
  o cleanup data, resize (100Mb limit)
  o PUT data in Google storage
  o Define table
  o analyze
Predict how I would tag a technology article
  o input is tag,url,text
  o send new url and text
  o get predicted tag
Get the BigQuery & Prediction APIs
• Preview, opened to a limited number of developers
• You need a Google Storage for Developers account
• To request access and get more information, go to:
  o http://code.google.com/apis/bigquery
  o http://code.google.com/apis/prediction
Acknowledgement
Thanks to the many Googlers whose slides I borrowed, most of them available on the Google I/O 2010 website
Thank you
Read more
http://code.google.com/appengine/
Contact info
Patrick Chanezon
Developer [email protected]
http://twitter.com/chanezon
Questions?