Getting Started with
BigQuery ML
Dan Sullivan
GDG Great North DevFest 2020
October 17, 2020
OVERVIEW
• INTRODUCTION TO BIGQUERY
• MACHINE LEARNING BASICS
• BUILDING MACHINE LEARNING MODELS IN BIGQUERY
Bio
• Principal Engineer, PEAK6 Technologies
• Author
• Instructor
• Udemy
• Google Cloud
• LinkedIn Learning
• Data Science
• Machine Learning
• Databases & Data Modeling
Introduction to
BigQuery
What is BigQuery?
• Serverless data warehouse
• Petabyte scale
• Uses SQL but is not a relational database
• Analytical database
• Other features
• BigQuery ML
• BigQuery BI Engine
• BigQuery GIS
Datasets & Tables
• Datasets
• Collection of tables and views
• Access control set at dataset and table level
• Tables
• Supports scalar and nested structures
• Stored in columnar format
• Partitioning
• No indexes
Views
• Projection of one or more tables
• Tables can be joined
• Views can be materialized
Command Line
Federated Data Access
• Federated query
• Cloud Storage
• Parquet
• ORC
• Bigtable and Cloud SQL
• Spreadsheets in Google Drive
Machine Learning
Basics
Machine Learning Problem Categories
• Unsupervised Learning
• Supervised Learning
• Reinforcement Learning*
Unsupervised Learning
• Draw inference from data
• Previously undetected patterns
• Examples
• Clustering
• Anomaly detection
• Principal component analysis
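Clustering, the first example above, can be sketched in a few lines. This is a minimal illustration of k-means on one-dimensional data, assuming made-up values and hand-picked starting centroids; the function name `kmeans_1d` is hypothetical and not part of any library.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Cluster 1-D values around the given starting centroids."""
    centroids = list(centroids)
    for _ in range(iterations):
        # Assignment step: each value joins its nearest centroid.
        clusters = [[] for _ in centroids]
        for v in values:
            nearest = min(range(len(centroids)),
                          key=lambda i: abs(v - centroids[i]))
            clusters[nearest].append(v)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(c) / len(c) if c else centroids[i]
            for i, c in enumerate(clusters)
        ]
    return centroids, clusters


# Made-up, unlabeled data with two visible groups.
values = [1.0, 1.2, 0.8, 9.0, 9.5, 10.1]
centroids, clusters = kmeans_1d(values, centroids=[0.0, 5.0])
print(centroids)
print(clusters)
```

No labels are supplied; the grouping emerges from the data alone, which is what distinguishes this from supervised learning.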
Supervised Learning
• Learn from examples
• Goal is to predict category or value
• Examples
• Classifying tumors from images
• Predicting housing prices
• Identifying fraudulent credit card transactions
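As a rough illustration of learning from examples to predict a value, the sketch below fits a straight line to labeled data with ordinary least squares. The floor areas and prices are made up for illustration, and `fit_line` is a hypothetical helper.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Made-up labeled examples: floor area (m^2) and price (thousands).
areas = [50, 80, 100, 120]
prices = [150, 240, 300, 360]

slope, intercept = fit_line(areas, prices)

# Predict the price of an unseen 90 m^2 house.
predicted = slope * 90 + intercept
print(slope, intercept, predicted)
```

The labels (prices) are what make this supervised: the model is fit to known answers and then applied to a new input.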
2 Approaches to Machine Learning
• Symbolic Artificial Intelligence
• Neural networks and deep learning
Symbolic Artificial Intelligence
• Symbols represent entities and attributes
• Manipulate symbols to make inferences
• Variety of ways to manipulate symbols
Symbolic Machine Learning
• Domain of interest is modeled using symbols
• Set of features, e.g.
• Length of stay in a hospital
• Type of operation
• Age at time of operation
Variety of Symbolic ML Algorithms
• Decision trees
• Random forests
• Naïve Bayes
• Support vector machines (SVMs)
• K Nearest Neighbor
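Of the algorithms listed, k nearest neighbor is the simplest to sketch. The example below assumes two numeric features (length of stay in days, age in years) and made-up risk labels; `knn_predict` is a hypothetical helper, and a real model would normally scale the features first so that one does not dominate the distance.

```python
import math
from collections import Counter


def knn_predict(train, query, k=3):
    """Return the majority label among the k training points nearest to query.

    train is a list of (features, label) pairs.
    """
    by_distance = sorted(train, key=lambda item: math.dist(item[0], query))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


# Made-up examples: (length of stay in days, age in years) -> risk label.
train = [
    ((2, 30), "low"),
    ((3, 35), "low"),
    ((1, 28), "low"),
    ((10, 70), "high"),
    ((12, 75), "high"),
    ((9, 68), "high"),
]

print(knn_predict(train, (11, 72)))
print(knn_predict(train, (2, 31)))
```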
Deep Learning Builds on Neuron-like Abstraction
• Inputs are numbers
• Weights assign importance to inputs
• Sum of weighted inputs becomes the input to a non-linear function
• Output is the result of that function
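The neuron abstraction can be written out directly. This is a minimal sketch that assumes a sigmoid as the non-linear function and adds an optional bias term, which the slide does not mention; the function name `neuron` is illustrative.

```python
import math


def neuron(inputs, weights, bias=0.0):
    """Weighted sum of the inputs passed through a sigmoid."""
    # Weights assign importance to each input.
    weighted_sum = sum(x * w for x, w in zip(inputs, weights)) + bias
    # The weighted sum is the input to a non-linear function (assumed sigmoid).
    return 1.0 / (1.0 + math.exp(-weighted_sum))


# 1.0 * 0.5 + 2.0 * (-0.25) = 0.0, and sigmoid(0.0) = 0.5
output = neuron([1.0, 2.0], [0.5, -0.25])
print(output)
```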
Activation Function
• Non-linear function known as activation function
• Examples
• Sigmoid
• TanH
• ReLU
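The three activation functions named above can be defined with the standard library alone. A small sketch for comparison:

```python
import math


def sigmoid(x):
    """Squashes any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def tanh(x):
    """Squashes any real number into the range (-1, 1)."""
    return math.tanh(x)


def relu(x):
    """Passes positive values through and clips negatives to zero."""
    return max(0.0, x)


for x in (-2.0, 0.0, 2.0):
    print(x, sigmoid(x), tanh(x), relu(x))
```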
Neural Networks
• Input layer
• Hidden layer
• Output
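A forward pass through these layers might look like the sketch below. The weights, biases, and layer sizes are arbitrary values chosen for illustration, and the choice of ReLU in the hidden layer and sigmoid at the output is an assumption.

```python
import math


def relu(x):
    return max(0.0, x)


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def layer(inputs, weights, biases, activation):
    """Compute one layer: each row of weights defines one neuron."""
    return [
        activation(sum(x * w for x, w in zip(inputs, row)) + b)
        for row, b in zip(weights, biases)
    ]


# Arbitrary illustrative parameters: 2 inputs -> 2 hidden neurons -> 1 output.
hidden_weights = [[1.0, -1.0], [0.5, 0.5]]
hidden_biases = [0.0, 0.0]
output_weights = [[1.0, -1.0]]
output_biases = [0.0]

inputs = [2.0, 1.0]                                          # input layer
hidden = layer(inputs, hidden_weights, hidden_biases, relu)  # hidden layer
output = layer(hidden, output_weights, output_biases, sigmoid)  # output layer
print(hidden, output)
```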
Deep Learning Networks
• More than 3 layers
• Challenging to learn weights
• Backpropagation algorithm used to adjust weights
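The weight adjustment can be illustrated on the smallest possible case. The sketch below is not full multi-layer backpropagation; it applies the same chain-rule gradient step to a single sigmoid neuron, assuming a squared-error loss, one made-up training example, and an arbitrary learning rate.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def train_step(weights, bias, inputs, target, learning_rate=0.5):
    """One gradient-descent update for a single sigmoid neuron."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    output = sigmoid(z)
    loss = 0.5 * (output - target) ** 2
    # Chain rule: dLoss/dz = (output - target) * sigmoid'(z),
    # where sigmoid'(z) = output * (1 - output).
    delta = (output - target) * output * (1.0 - output)
    # dz/dw_i = x_i and dz/dbias = 1, so each weight moves against its gradient.
    new_weights = [w - learning_rate * delta * x
                   for w, x in zip(weights, inputs)]
    new_bias = bias - learning_rate * delta
    return new_weights, new_bias, loss


weights, bias = [0.0, 0.0], 0.0
losses = []
for _ in range(50):
    weights, bias, loss = train_step(weights, bias,
                                     inputs=[1.0, 2.0], target=1.0)
    losses.append(loss)

print(losses[0], losses[-1])
```

In a deep network the same error signal is propagated backward through every layer, which is where the difficulty of learning weights comes from.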
Steps to Creating and Using ML Models
• Data engineering
• Collection
• Quality assessment and cleansing
• Transform and format
• Feature engineering
• Training
• Evaluating
• Predicting
Commonly Used ML Tools
BigQuery ML
Why Do ML in BigQuery?
• Support for data engineering
• No need to export data
• Use SQL
https://commons.wikimedia.org/wiki/File:Supervised_machine_learning_in_a_nutshell.svg
BigQuery ML
• Create machine learning models in SQL
• Several kinds of models
• Linear regression
• Binary and multiclass logistic regressions
• K-means clustering
• Time series forecast
• Matrix factorization
• Boosted Tree and XGBoost
• TensorFlow (imported)
• AutoML Tables
Example CREATE MODEL
Source: https://cloud.google.com/bigquery-ml/docs/bigqueryml-natality
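A CREATE MODEL statement along the lines of the cited natality tutorial might look like the sketch below, held here as a Python string so it can be inspected without a BigQuery project. The dataset name `bqml_tutorial`, the model name, and the exact column list are assumptions and may differ from what the slide showed.

```python
# Assumed names: dataset `bqml_tutorial`, model `natality_model`.
CREATE_MODEL_SQL = """
CREATE MODEL `bqml_tutorial.natality_model`
OPTIONS (
  model_type = 'linear_reg',
  input_label_cols = ['weight_pounds']
) AS
SELECT
  weight_pounds,
  is_male,
  gestation_weeks,
  mother_age,
  CAST(mother_race AS STRING) AS mother_race
FROM
  `bigquery-public-data.samples.natality`
WHERE
  weight_pounds IS NOT NULL
  AND RAND() < 0.001
"""

print(CREATE_MODEL_SQL.strip())
```

The statement can be pasted into the BigQuery console. With the google-cloud-bigquery package installed and credentials configured, it could presumably also be submitted with `bigquery.Client().query(CREATE_MODEL_SQL)`.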
CREATE MODEL Options
• Model Type
• Input Label Columns
• Regularization
• Learning rate
• Early Stop
• Standardize Features
• Max Tree Depth
• More …
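These options are passed in the OPTIONS clause of CREATE MODEL. The sketch below shows one way to render such a clause from Python keyword arguments; `render_options` is a hypothetical helper, the values are arbitrary, and which options are valid depends on the model type, so the BigQuery ML documentation should be checked for each one.

```python
def render_options(**options):
    """Render keyword arguments as a BigQuery ML OPTIONS(...) clause."""
    parts = []
    for name, value in options.items():
        if isinstance(value, bool):
            rendered = "TRUE" if value else "FALSE"
        elif isinstance(value, str):
            rendered = f"'{value}'"
        elif isinstance(value, (list, tuple)):
            rendered = "[" + ", ".join(f"'{v}'" for v in value) + "]"
        else:
            rendered = str(value)
        parts.append(f"{name}={rendered}")
    return "OPTIONS(" + ", ".join(parts) + ")"


# Arbitrary illustrative values for a linear regression model.
clause = render_options(
    model_type="linear_reg",
    input_label_cols=["weight_pounds"],
    l2_reg=0.1,
    learn_rate_strategy="constant",
    learn_rate=0.05,
    early_stop=True,
)
print(clause)
```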
Example EVALUATE
Source: https://cloud.google.com/bigquery-ml/docs/bigqueryml-natality
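Evaluation uses the ML.EVALUATE table function, which takes a model and a query supplying labeled rows. A sketch modeled on the cited tutorial follows, again as a Python string; the model name `bqml_tutorial.natality_model` is an assumption.

```python
# Assumed model name: `bqml_tutorial.natality_model`.
EVALUATE_SQL = """
SELECT
  *
FROM
  ML.EVALUATE(
    MODEL `bqml_tutorial.natality_model`,
    (
      SELECT
        weight_pounds,
        is_male,
        gestation_weeks,
        mother_age,
        CAST(mother_race AS STRING) AS mother_race
      FROM
        `bigquery-public-data.samples.natality`
      WHERE
        weight_pounds IS NOT NULL
    )
  )
"""

print(EVALUATE_SQL.strip())
```

The inner query must return the label column (`weight_pounds` here) along with the same feature columns used in training, so predictions can be compared with actual values.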
Example PREDICT
Source: https://cloud.google.com/bigquery-ml/docs/bigqueryml-natality
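Prediction uses the ML.PREDICT table function. The sketch below is modeled on the cited tutorial and held as a Python string; the model name and the `state = 'WY'` filter are assumptions chosen for illustration.

```python
# Assumed model name: `bqml_tutorial.natality_model`.
PREDICT_SQL = """
SELECT
  predicted_weight_pounds
FROM
  ML.PREDICT(
    MODEL `bqml_tutorial.natality_model`,
    (
      SELECT
        is_male,
        gestation_weeks,
        mother_age,
        CAST(mother_race AS STRING) AS mother_race
      FROM
        `bigquery-public-data.samples.natality`
      WHERE
        state = 'WY'
    )
  )
"""

print(PREDICT_SQL.strip())
```

The output column is named by prefixing the label column with `predicted_`, which is why the query selects `predicted_weight_pounds`. The inner query supplies only feature columns, since the label is what is being predicted.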
