Chris Schalk
Google Developer Advocate


Building Integrated Applications on
Google's Cloud Technologies
Agenda

● Introduction

● Introduction to Google's Cloud Technologies

● App Engine Recap

● Google's new Cloud Technologies
   ○ Google Storage
   ○ Prediction API
   ○ BigQuery

● Summary, Q&A
Google's Cloud Technologies

 ● Google App Engine
 ● Google BigQuery
 ● Google Prediction API
 ● Google Storage
Google App Engine




An App Engine recap...
Cloud Development in a Box
● Downloadable SDK

● Application runtimes
   ○ Java, Python, (Go)

● Local development tools
    ○ Eclipse plugin
    ○ App Engine Launcher

● Specialized application services

● Cloud based dashboard

● Ready to scale
   ○ Built-in fault tolerance, load balancing
Quick GAE Demo!




Building, testing and deploying a new cloud app
Specialized Services


Memcache         Datastore    URL Fetch




Mail             XMPP         Task Queue




Images           Blobstore    User Service

                             But, is that it?
No!
App Engine now has access to new
specialized cloud services...
Google's new Cloud Technologies
New Google Cloud Technologies


 ● Google Storage
    ○ Store your data in Google's cloud

 ● Prediction API
    ○ Google's machine learning tech in an API

 ● BigQuery
    ○ High-speed data analysis at massive scale
Google Storage for Developers
       Store your data in Google's cloud
What Is Google Storage?



 ● Store your data in Google's cloud
    ○ any format, any amount, any time

 ● You control access to your data
    ○ private, shared, or public

 ● Access via Google APIs or 3rd party tools/libraries
Sample Use Cases

 Static content hosting
 e.g. static html, images, music, video

 Backup and recovery
 e.g. personal data, business records

 Sharing
 e.g. share data with your customers

 Data storage for applications
 e.g. storage backend for Android, App Engine, and cloud-based apps

 Storage for Computation
 e.g. BigQuery, Prediction API
Google Storage Benefits

 ● High Performance and Scalability
    ○ Backed by Google infrastructure

 ● Strong Security and Privacy
    ○ Control access to your data

 ● Easy to Use
    ○ Get started fast with Google & 3rd party tools
Google Storage Technical Details
 ● RESTful API
    ○ Verbs: GET, PUT, POST, HEAD, DELETE
    ○ Resources: identified by URI
    ○ Compatible with S3

 ● Buckets
    ○ Flat containers
 ● Objects
    ○ Any type
    ○ Size: up to 100 GB per object

 ● Access Control for Google Accounts
    ○ For individuals and groups
 ● Two Ways to Authenticate Requests
    ○ Sign request using access keys
    ○ Web browser login
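
To make the bucket and object model concrete, here is a minimal Python sketch that splits a gs:// URI (the form gsutil uses) into a bucket and an object name. The helper name split_gs_uri and the example path are hypothetical, not part of any Google library. Because buckets are flat containers, any slashes after the bucket name are treated as part of the object name.

```python
from urllib.parse import urlsplit

def split_gs_uri(uri):
    """Split 'gs://bucket/object' into (bucket, object_name)."""
    parts = urlsplit(uri)
    if parts.scheme != "gs" or not parts.netloc:
        raise ValueError("expected a URI of the form gs://bucket/object")
    # Buckets are flat: everything after the bucket is one object name.
    return parts.netloc, parts.path.lstrip("/")

# Hypothetical bucket and object path, for illustration only.
bucket, object_name = split_gs_uri("gs://yourbucket/reports/2011/data.csv")
print(bucket, object_name)
```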
Security and Privacy Features

  ● Key-based authentication
  ● Authenticated downloads from a web browser

  ● Sharing with individuals
  ● Group sharing via Google Groups

  ● Access control for buckets and objects
  ● Set Read/Write/List permissions
Demo


● Tools:
   ○ GSUtil
   ○ GS Manager

● Upload / Download
Google Storage usage within Google

 ● Google BigQuery
 ● Google Prediction API
 ● Haiti Relief Imagery
 ● USPTO data
 ● Partner Reporting
Some Early Google Storage Adopters
Google Storage - Pricing
    ○ Free trial quota until Dec 31, 2011
        ■ For first project
        ■ 5 GB of storage
        ■ 25 GB download/upload data
            ■ 20 GB to Americas/EMEA, 5 GB to APAC
        ■ 25K GET, HEAD requests
        ■ 2.5K PUT, POST, LIST* requests
    ○ Production Storage
        ■ $0.17/GB/Month (Location US, EU)
        ■ Upload - $0.10/GB
        ■ Download - $0.15/GB Americas/EMEA, $0.30/GB APAC
        ■ Requests
            ■ PUT, POST, LIST - $0.01 / 1000 Requests
            ■ GET, HEAD - $0.01 / 10,000 Requests
        ■ 99.9% uptime SLA
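
As a worked example of the production prices above, the following Python sketch estimates a monthly bill. The function name, the region keys, and the usage figures are hypothetical, and the sketch assumes simple pro-rata billing (no rounding up to whole request blocks, no free quota applied).

```python
# Production prices copied from the list above (US dollars).
STORAGE_PER_GB_MONTH = 0.17
UPLOAD_PER_GB = 0.10
DOWNLOAD_PER_GB = {"americas_emea": 0.15, "apac": 0.30}
WRITE_REQUESTS_PER_1000 = 0.01   # PUT, POST, LIST
READ_REQUESTS_PER_10000 = 0.01   # GET, HEAD

def monthly_storage_cost(stored_gb, upload_gb, download_gb, region,
                         write_requests, read_requests):
    """Estimate one month's bill, assuming simple pro-rata billing."""
    return (stored_gb * STORAGE_PER_GB_MONTH
            + upload_gb * UPLOAD_PER_GB
            + download_gb * DOWNLOAD_PER_GB[region]
            + write_requests / 1000 * WRITE_REQUESTS_PER_1000
            + read_requests / 10000 * READ_REQUESTS_PER_10000)

# Hypothetical usage for one month.
cost = monthly_storage_cost(stored_gb=100, upload_gb=20, download_gb=50,
                            region="americas_emea",
                            write_requests=10_000, read_requests=1_000_000)
print(f"${cost:.2f}")
```

For this usage the terms are 17.00 (storage) + 2.00 (upload) + 7.50 (download) + 0.10 (writes) + 1.00 (reads), or 27.60 in total.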
Google Storage Summary


 ● Store any kind of data using Google's cloud infrastructure

 ● Easy to Use APIs

 ● Many available tools and libraries
    ○ gsutil, GS Manager
    ○ 3rd party:
        ■ Boto, CloudBerry, CyberDuck, JetS3t, and more
Google Prediction API
Google's prediction engine in the cloud
Google Prediction API as a simple example



      Predicts outcomes based on 'learned' patterns
How does it work?

The Prediction API finds relevant features in the sample data during training.

   "english"  The quick brown fox jumped over the lazy dog.
   "english"  To err is human, but to really foul things up you need a computer.
   "spanish"  No hay mal que por bien no venga.
   "spanish"  La tercera es la vencida.

The Prediction API later searches for those features during prediction.

   ?          To be or not to be, that is the question.
   ?          La fe mueve montañas.
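
The idea of learning features during training and looking for them during prediction can be illustrated with a toy word-overlap classifier in Python. This is only an illustration of the concept: the Prediction API selects its own machine learning techniques, and nothing here reflects its actual algorithm. The function names are hypothetical.

```python
import re
from collections import defaultdict

def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())

def train(examples):
    """Collect the set of words seen for each label (the 'features')."""
    vocab = defaultdict(set)
    for label, text in examples:
        vocab[label].update(tokenize(text))
    return vocab

def predict(vocab, text):
    """Return the label whose vocabulary overlaps most with the text."""
    tokens = tokenize(text)
    scores = {label: sum(t in words for t in tokens)
              for label, words in vocab.items()}
    return max(scores, key=scores.get)

model = train([
    ("english", "The quick brown fox jumped over the lazy dog."),
    ("english", "To err is human, but to really foul things up you need a computer."),
    ("spanish", "No hay mal que por bien no venga."),
    ("spanish", "La tercera es la vencida."),
])
print(predict(model, "To be or not to be, that is the question."))
print(predict(model, "La fe mueve montañas."))
```

With these four training sentences, the first query shares "to", "is", and "the" with the English examples and the second shares "la" with the Spanish ones, so the toy model labels them english and spanish respectively.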
A virtually endless number of applications...

 ● Customer Sentiment
 ● Transaction Risk
 ● Species Identification
 ● Message Routing
 ● Diagnostics
 ● Churn Prediction
 ● Legal Docket Classification
 ● Suspicious Activity
 ● Work Roster Assignment
 ● Inappropriate Content
 ● Recommend Products
 ● Political Bias
 ● Uplift Marketing
 ● Email Filtering
 ● Career Counselling

... and many more ...
Using the Prediction API

A simple three step process...

 1. Upload: Upload your training data to Google Storage
 2. Train: Build a model from your data
 3. Predict: Make new predictions
Step 1: Upload
 Upload your training data to Google Storage
● Training data: outputs and input features
● Data format: comma separated value format (CSV)

   "english","To err is human, but to really ..."
   "spanish","No hay mal que por bien no venga."
   ...

   Upload to Google Storage
   gsutil cp ${data} gs://yourbucket/${data}
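
A training file in the format shown above could be produced with Python's csv module before uploading it with gsutil. This is a minimal sketch: the function name to_training_csv is hypothetical, and quoting every field is a choice made here to match the sample rows.

```python
import csv
import io

def to_training_csv(examples):
    """Serialize (label, text) pairs as CSV: label first, every field quoted."""
    buf = io.StringIO()
    writer = csv.writer(buf, quoting=csv.QUOTE_ALL, lineterminator="\n")
    for label, text in examples:
        writer.writerow([label, text])
    return buf.getvalue()

examples = [
    ("english", "To err is human, but to really foul things up you need a computer."),
    ("spanish", "No hay mal que por bien no venga."),
]
training_csv = to_training_csv(examples)
print(training_csv)
```

Letting the csv module handle quoting keeps commas and quotation marks inside the text from breaking the column layout.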
Step 2: Train
Create a new model by training on data

To train a model:

POST prediction/v1.3/training
{"id":"mybucket/mydata"}
Training runs asynchronously. To see if it has finished:

GET prediction/v1.3/training/mybucket%2Fmydata

{"kind": "prediction#training", ..., "training status": "DONE"}
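
The two requests above differ mainly in how the model id is carried: as a JSON body when training starts, and percent-encoded in the path when checking status. A small Python sketch of building both, assuming the v1.3 paths shown on this slide (the helper names are hypothetical and no network call is made):

```python
import json
from urllib.parse import quote

API_BASE = "/prediction/v1.3/training"

def train_request(bucket, data):
    """Return (method, path, body) for starting a training job."""
    body = json.dumps({"id": f"{bucket}/{data}"})
    return "POST", API_BASE, body

def status_request(bucket, data):
    """Return (method, path) for checking the training status."""
    # The slash in the model id is percent-encoded as %2F in the path.
    model_id = quote(f"{bucket}/{data}", safe="")
    return "GET", f"{API_BASE}/{model_id}"

print(train_request("mybucket", "mydata"))
print(status_request("mybucket", "mydata"))
```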
Step 3: Predict
 Apply the trained model to make predictions on new data
POST
prediction/v1.3/training/mybucket%2Fmydata/predict
{ "data":{
   "input": { "text" : [
    "J'aime X! C'est le meilleur" ]}}}
Sample response:

{ "data" : {
 "kind" : "prediction#output",
 "outputLabel":"French",
 "outputMulti" :[
   {"label":"French", "score": x.xx},
   {"label":"English", "score": x.xx},
   {"label":"Spanish", "score": x.xx}]}}
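
Once a response like the sample above has been received, the per-label scores can be ranked in a few lines of Python. The scores below are made up for illustration (the slide leaves them as x.xx), and ranked_labels is a hypothetical helper name.

```python
import json

# Response shaped like the sample above; the scores are invented.
raw = """
{"data": {
  "kind": "prediction#output",
  "outputLabel": "French",
  "outputMulti": [
    {"label": "French", "score": 0.75},
    {"label": "English", "score": 0.20},
    {"label": "Spanish", "score": 0.05}]}}
"""

def ranked_labels(response):
    """Return (label, score) pairs sorted from most to least likely."""
    candidates = response["data"]["outputMulti"]
    return sorted(((c["label"], c["score"]) for c in candidates),
                  key=lambda pair: pair[1], reverse=True)

ranking = ranked_labels(json.loads(raw))
print(ranking[0])
```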
Step 3: Predict
   Apply the trained model to make predictions on new data

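import http.client  # Python 3 name for httplib
import json

header = {"Content-Type": "application/json"}
# Put the new data in JSON format in the params variable
params = json.dumps({"data": {"input": {"text": ["J'aime X! C'est le meilleur"]}}})
conn = http.client.HTTPSConnection("www.googleapis.com")
# Same predict path as in the REST example above
conn.request("POST", "/prediction/v1.3/training/bucket%2Fdata/predict",
             params, header)
print(conn.getresponse().read())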
Prediction API - Pricing
    ○ Free Quota
       ■ Free trial quota for first 6 months (per project)
       ■ 100 predictions/day
       ■ 5 MB trained/day
       ■ 100 Streaming updates
       ■ Lifetime cap: 20,000 predictions
    ○ Paid Usage
       ■ 99.9% availability SLA
       ■ Base fee: $10 monthly fee per project
       ■ Prediction:
           ■ 10,000 predictions/month: $0.00 (free)
           ■ $0.50/1,000 predictions (beyond initial 10k)
       ■ Training
           ■ $0.002/MB bulk trained (dataset max size: 250MB)
           ■ 0-10k streaming updates: $0.00 (free)
           ■ $0.05/1,000 updates (beyond initial 10k)
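
The paid-usage prices above can be combined into a rough monthly estimate. The Python sketch below assumes pro-rata billing beyond the free allowances (it does not round up to whole blocks of 1,000); the function name and the usage figures are hypothetical.

```python
# Paid-usage prices copied from the list above (US dollars).
BASE_FEE = 10.00            # per project, per month
FREE_PREDICTIONS = 10_000   # included each month
PER_1000_PREDICTIONS = 0.50
PER_MB_TRAINED = 0.002
FREE_UPDATES = 10_000       # streaming updates included
PER_1000_UPDATES = 0.05

def monthly_prediction_cost(predictions, mb_trained, streaming_updates):
    """Estimate one month's bill for a single project."""
    billable_predictions = max(0, predictions - FREE_PREDICTIONS)
    billable_updates = max(0, streaming_updates - FREE_UPDATES)
    return (BASE_FEE
            + billable_predictions / 1000 * PER_1000_PREDICTIONS
            + mb_trained * PER_MB_TRAINED
            + billable_updates / 1000 * PER_1000_UPDATES)

# Hypothetical usage for one month.
cost = monthly_prediction_cost(predictions=25_000, mb_trained=100,
                               streaming_updates=12_000)
print(f"${cost:.2f}")
```

Here the terms are 10.00 (base fee) + 7.50 (15,000 billable predictions) + 0.20 (100 MB trained) + 0.10 (2,000 billable updates), or 17.80 in total.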
Demos!


 ● Command line Demos
    ○ Training a model
    ○ Checking training status
    ○ Making predictions


 ● A complete Web application using the JavaScript
   API for Prediction
Prediction API Capabilities
Data
 ● Input Features: numeric or unstructured text
 ● Output: up to hundreds of discrete categories

Training
 ● Many machine learning techniques
 ● Automatically selected
 ● Performed asynchronously

Access from many platforms:
 ● Web app from Google App Engine
 ● Apps Script (e.g. from Google Spreadsheet)
 ● Desktop app
Prediction API - key features



 ● Multi-category prediction
    ○ Tag entry with multiple labels

 ● Continuous Output
    ○ Finer grained prediction rankings based on multiple labels

 ● Mixed Inputs
    ○ Both numeric and text inputs are now supported


Can combine continuous output with mixed inputs
Google BigQuery
Interactive analysis of large datasets in Google's cloud
Introducing Google BigQuery


 ● Google's technology for ad hoc analysis of large data sets
    ○ Analyze massive amounts of data in seconds

 ● Simple SQL-like query language

 ● Flexible access
     ○ REST APIs, JSON-RPC, Google Apps Script
Many Use Cases ...

 ● Interactive Tools
 ● Spam
 ● Trends Detection
 ● Web Dashboards
 ● Network Optimization
Key Capabilities of BigQuery

 ● Scalable: Billions of rows

 ● Fast: Response in seconds

 ● Simple: Queries in SQL

 ● Web Service
    ○ REST
    ○ JSON-RPC
    ○ Google Apps Script
Using BigQuery

Another simple three step process...

 1. Upload: Upload your raw data to Google Storage
 2. Import: Import raw data into BigQuery table
 3. Query: Perform SQL queries on table
Writing Queries

Compact subset of SQL
   ○ SELECT ... FROM ...
     WHERE ...
     GROUP BY ... ORDER BY ...
     LIMIT ...;

Common functions
   ○ Math, String, Time, ...

Statistical approximations
     ○ TOP
     ○ COUNT DISTINCT
BigQuery via REST
GET /bigquery/v1/tables/{table name}

GET /bigquery/v1/query?q={query}
Sample JSON Reply:
{
    "results": {
      "fields": { [
       {"id":"COUNT(*)","type":"uint64"}, ... ]
      },
      "rows": [
       {"f":[{"v":"2949"}, ...]},
       {"f":[{"v":"5387"}, ...]}, ... ]
    }
}
Also supports JSON-RPC
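
To show how the query endpoint and the reply fit together, here is a short Python sketch that URL-encodes a query into the GET path and flattens the rows of a reply. The table name mytable and the helper names are hypothetical, no request is sent, and the reply is a trimmed version of the sample above with "fields" written as a plain list, which is an assumption about the intended shape.

```python
from urllib.parse import urlencode

def query_path(sql):
    """Build the GET path for a query, URL-encoding the SQL text."""
    return "/bigquery/v1/query?" + urlencode({"q": sql})

def row_values(reply):
    """Flatten each row's {"f": [{"v": ...}]} cells into a list of values."""
    return [[cell["v"] for cell in row["f"]]
            for row in reply["results"]["rows"]]

# Reply trimmed to one field per row; values copied from the sample above.
reply = {"results": {
    "fields": [{"id": "COUNT(*)", "type": "uint64"}],
    "rows": [{"f": [{"v": "2949"}]}, {"f": [{"v": "5387"}]}],
}}

path = query_path("SELECT COUNT(*) FROM mytable")  # mytable is hypothetical
print(path)
print(row_values(reply))
```

Note that the values arrive as strings even for a uint64 field, so a caller would convert them using the type given in "fields".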
BigQuery Security and Privacy

Standard Google Authentication
 ● Client Login
 ● AuthSub
 ● OAuth

HTTPS support
 ● protects your credentials
 ● protects your data

Relies on Google Storage to manage access
Large Data Analysis Example
Wikimedia Revision History




Wikimedia Revision history data from:
http://download.wikimedia.org/enwiki/latest/enwiki-latest-pages-meta-history.xml.7z
BigQuery from a Spreadsheet
Recap
 ● Google App Engine
    ○ Application development platform for the cloud

 ● Google Storage
    ○ High speed cloud data storage on Google's
      infrastructure

 ● Prediction API
    ○ Google's machine learning technology able to predict
      outcomes based on sample data

 ● BigQuery
    ○ Interactive analysis of very large data sets
    ○ Simple SQL query language access
Further info available at:

● Google App Engine
   ○ http://code.google.com/appengine

● Google Storage for Developers
   ○ http://code.google.com/apis/storage

● Prediction API
   ○ http://code.google.com/apis/predict

● BigQuery
   ○ http://code.google.com/apis/bigquery
Thank you very much!


Questions?




         Contact: @cschalk
