Building a Machine Learning Recommendation Engine in SQL
@garyorenstein @memsql
MemSQL 1
Today’s Talk
1. State of Data 2018 according to Gartner
2. Rise of Machine Learning
3. Live Demo - A SQL Recommendation Engine
SECTION 1
The State of Data According to Gartner 2018
Hype Cycle for Data Management
26 July 2017
Donald Feinberg
Adam M. Ronthal
G00313950
Multimodel has the potential to support both relational and nonrelational use cases while reducing the number of disparate DBMS products in an organization.
the idea of a Hadoop distribution will become obsolete before it reaches the Plateau of Productivity
Penetration continues to increase and organizations
should be evaluating these resources for
— cost-efficiency
— infrastructure simplification and
— new use cases, such as Hybrid Transactional/
Analytical Processing (HTAP)
Build Your Digital Business Platform Around Data and Analytics
31 January 2018
Andrew White
W. Roy Schulte
Roxane Edjlali
Joao Tapadinhas
Svetlana Sicular
G00350435
Select Challenges
Data and analytics investments that are tied to measurable business outcomes are more likely to produce reportable benefits.
Magic Quadrant for Data Management Solutions for Analytics
13 February 2018
Adam M. Ronthal
Roxane Edjlali
Rick Greenwald
G00326691
We define four primary use cases for DMSAs that reflect
this diversity of data and use cases:
— Traditional data warehouse
— Real-time data warehouse
— Context-independent data warehouse
— Logical data warehouse
Real-Time Data Warehouse
This use case adds a real-time component to analytics
use cases, with the aim of reducing latency — the time
lag between when data is generated and when it can be
analyzed.
Other Vendors to Consider for Operational DBMSs
23 November 2017
Donald Feinberg
Merv Adrian
Nick Heudecker
G00327284
Other Vendors to Consider for Operational DBMSs
Actian
Aerospike
Alibaba Cloud
Altibase
ArangoDB
Cloudera
Clustrix
Couchbase
FairCom
Fujitsu
General Data Technology
Hortonworks
MariaDB
MemSQL
MongoDB
Neo4j
NuoDB
Percona
Redis Labs
SequoiaDB
TmaxSoft
VoltDB
Other Vendors to Consider for Operational DBMSs
that are also listed as a Challenger or Leader in the Magic Quadrant
for Data Management Solutions for Analytics:
MemSQL
Over the next five years, the OPDBMS and DMSA markets converge to a single DBMS market.
Look to your operational DBMS vendor for both transactional and analytical workloads.
SECTION 2
Rise of Machine Learning
2018 Outlook Survey
MemSQL and O’Reilly
1600+ respondents
memsql.com/MLsurvey
Machine Learning and Databases
SECTION 3
DEMO with Yelp Dataset
Can you build a machine learning recommendation engine in SQL?
Yes
Should you?
For training? Maybe, maybe not.
For operational scoring? Absolutely!
Secret Weapons to Machine Learning in SQL
— Extensibility
— Stored Procedures
— User Defined Functions
— User Defined Aggregates
— DOT_PRODUCT
— Compare two vectors
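To make the "compare two vectors" idea concrete, here is a minimal sketch of dot-product scoring inside a SQL query. It is not the demo from the talk: it uses Python's built-in sqlite3 module as a stand-in database, registers a hypothetical `dot_product` user-defined function, and stores vectors as JSON text. The table name, business names, and vector values are all invented for illustration.

```python
import json
import sqlite3

# Hypothetical feature vectors, e.g. learned elsewhere from review data.
SAMPLE_VECTORS = [
    ("Taco Stand", [0.9, 0.1, 0.0]),
    ("Sushi Bar", [0.1, 0.8, 0.3]),
    ("Coffee Cart", [0.2, 0.2, 0.9]),
]


def dot_product(a_json, b_json):
    """Dot product of two equal-length vectors stored as JSON arrays."""
    a = json.loads(a_json)
    b = json.loads(b_json)
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    return float(sum(x * y for x, y in zip(a, b)))


def top_matches(user_vector, limit=2):
    """Rank businesses by dot product with the user's preference vector."""
    conn = sqlite3.connect(":memory:")
    # Extensibility: expose the Python function to SQL as a two-argument UDF.
    conn.create_function("dot_product", 2, dot_product)
    conn.execute("CREATE TABLE business_vectors (name TEXT, vec TEXT)")
    conn.executemany(
        "INSERT INTO business_vectors VALUES (?, ?)",
        [(name, json.dumps(vec)) for name, vec in SAMPLE_VECTORS],
    )
    rows = conn.execute(
        "SELECT name, dot_product(vec, ?) AS score "
        "FROM business_vectors ORDER BY score DESC LIMIT ?",
        (json.dumps(user_vector), limit),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    for name, score in top_matches([1.0, 0.0, 0.5]):
        print(name, round(score, 2))
```

The ranking itself is a single `SELECT ... ORDER BY score DESC`, so the application only has to send the user's vector and read back the top rows.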
Sequel Pro: Mac app for MySQL databases
MemSQL in one slide
— Distributed SQL database
— Massively parallel, lock-free, fast
— Full ACID features
— In-memory and on-disk
— JSON, key-value, geospatial, full-text search
— Robust security
— Built for transactions and analytics
Why do ML in SQL?
— Train in any number of systems
— Score in the database for applications from real-time
drilling to fraud detection to personalization
— Complete certain functions within the database to
radically simplify operational infrastructure
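The "train anywhere, score in the database" split can be sketched as follows. This is an illustrative assumption, not the talk's implementation: the weights stand in for a linear model trained in some other system, the table and feature names are hypothetical, and Python's built-in sqlite3 module plays the role of the database. Scoring is a plain join and `SUM`, with no model code running in the application.

```python
import sqlite3


def score_in_database(weights, feature_rows):
    """Score events with a linear model whose weights live in a table.

    weights: dict mapping feature name to weight (trained elsewhere).
    feature_rows: iterable of (event_id, feature, value) tuples.
    Returns a dict mapping event_id to its score.
    """
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE model_weights (feature TEXT PRIMARY KEY, weight REAL);
        CREATE TABLE event_features (event_id INTEGER, feature TEXT, value REAL);
        """
    )
    conn.executemany(
        "INSERT INTO model_weights VALUES (?, ?)", list(weights.items())
    )
    conn.executemany(
        "INSERT INTO event_features VALUES (?, ?, ?)", list(feature_rows)
    )
    # The score is the weighted sum of each event's feature values.
    rows = conn.execute(
        """
        SELECT f.event_id, SUM(f.value * w.weight) AS score
        FROM event_features AS f
        JOIN model_weights AS w ON w.feature = f.feature
        GROUP BY f.event_id
        ORDER BY f.event_id
        """
    ).fetchall()
    conn.close()
    return dict(rows)


if __name__ == "__main__":
    # Hypothetical fraud-style features and weights.
    trained = {"amount": 0.5, "foreign": 2.0}
    events = [
        (1, "amount", 1.0), (1, "foreign", 0.0),
        (2, "amount", 4.0), (2, "foreign", 1.0),
    ]
    print(score_in_database(trained, events))
```

Retraining then amounts to replacing rows in the weights table; the scoring query stays unchanged.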
“It is a fine line between
a well executed SQL query on
live data and ML/AI”
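As one illustration of how close a plain query can come to a recommender, the sketch below finds businesses liked by the same users who liked a given business, using only a self-join and `GROUP BY`. The review rows, the 4-star threshold, and the table layout are assumptions made for this example (loosely modeled on a reviews table), and Python's built-in sqlite3 module stands in for the database.

```python
import sqlite3

# Hypothetical review rows: (user_id, business, stars).
SAMPLE_REVIEWS = [
    ("ann", "Taco Stand", 5), ("ann", "Sushi Bar", 4),
    ("bob", "Taco Stand", 4), ("bob", "Sushi Bar", 5),
    ("bob", "Coffee Cart", 4),
    ("cat", "Taco Stand", 5), ("cat", "Coffee Cart", 2),
]


def also_liked(reviews, business, limit=3):
    """Return (business, shared_fans) pairs liked by fans of `business`."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE reviews (user_id TEXT, business TEXT, stars INTEGER)"
    )
    conn.executemany("INSERT INTO reviews VALUES (?, ?, ?)", reviews)
    # Self-join: pair each positive review of the seed business with the
    # same user's positive reviews of other businesses, then count users.
    rows = conn.execute(
        """
        SELECT other.business, COUNT(DISTINCT other.user_id) AS shared_fans
        FROM reviews AS seed
        JOIN reviews AS other
          ON other.user_id = seed.user_id
         AND other.business <> seed.business
        WHERE seed.business = ?
          AND seed.stars >= 4
          AND other.stars >= 4
        GROUP BY other.business
        ORDER BY shared_fans DESC, other.business
        LIMIT ?
        """,
        (business, limit),
    ).fetchall()
    conn.close()
    return rows


if __name__ == "__main__":
    print(also_liked(SAMPLE_REVIEWS, "Taco Stand"))
```

Because the query runs against whatever rows are in the table at that moment, a new review changes the recommendations on the next call.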
Thank you!
Please visit our booth
www.memsql.com
@garyorenstein
@memsql
Abstract: Building a Machine Learning Recommendation Engine in SQL
Modern businesses constantly seek deeper customer relationships and more
compelling experiences.
To accomplish this, companies are looking to machine learning and artificial
intelligence solutions; however, that often involves a host of new systems and
approaches.
With a modern database architecture, it is possible to build compelling machine
learning solutions with SQL, deliver real-time engagements, and rapidly move to
operational applications.
See live how a modern database can accomplish these feats within a single
integrated solution.
