SlideShare a Scribd company logo
100

SQL SERVER: Data Mining
Types of analysis
• Ad-hoc query/Reporting/Analysis
  – What is the purpose?
     • Simple reports
     • Key Performance Indicators
     • OLAP cubes – Slice & Dice
  – In Real time - What happens now?
     • Events/Triggers

• Data Mining
  – How do we do it?
  – What happens?
What does Data Mining Do?
 Explores
Your Data

             Finds
            Patterns

                        Performs
                       Predictions
Data Mining Algorithms
•   Classification
•   Regression
•   Segmentation
•   Association
•   Forecasting
•   Text Analysis
•   Advanced Data Exploration
Mining Process
Training data                    Data to be
                  Mining Model   predicted




    DM Engine




Mining Model
                                      With
                                      predictions
Data Mining Process
                                                                     SSAS
                                                                     (OLAP)
                  Business                          Data
                                                                     DSV
                Understanding                   Understanding



                                                                              SSIS
                                                                              SSAS
                                                                   Data
                                Data                                          (OLAP)
                                                                Preparation

SSIS
SSAS(OLAP)
SSRS             Deployment
Flexible APIs                                                                   SSAS
                                                                 Modeling      (Data
                                                                              Mining)

                                   Evaluation

                                                                  www.crisp-dm.org
Data Mining in SQL Server 2008
• New algorithms developed in conjunction
  with Microsoft Research
• Data mining is made accessible and easy to
  use through integrated user interface, cross-
  product integration and familiar, standard APIs
• Complete framework for building and
  deploying intelligent applications on the fly
• Integration into the cloud.
Top New Features in SQL Server 2008

• Test multiple data mining models simultaneously with statistical
  scores of error and accuracy and confirm their stability with cross
  validation
• Build multiple, incompatible mining models within a single
  structure; apply model analysis over filtered data; query against
  structure data to present complete information, all enabled by
  enhanced mining structures
• Combine the best of both worlds by blending optimized near-term
  predictions (ARTXP) and stable long-term predictions (ARIMA) with
  Better Time Series Support
• Discover the relationship between items that are frequently
  purchased together by using Shopping Basket Analysis; generate
  interactive forms for scoring new cases with Predictive Calculator,
  delivered with Microsoft SQL Server 2008 Data Mining Add-ins for
  Office 2007
Rich and Innovative Algorithms
•   Benefit from many rich and innovative data mining algorithms, most developed by Microsoft Research to
    support common business problems promptly and accurately.
•   Market Basket Analysis - Discover which items tend to be bought together to create recommendations on-
    the-fly and to determine how product placement can directly contribute to your bottom line
•   Churn Analysis - Anticipate customers who may be considering canceling their service and identify benefits
    that will keep them from leaving
•   Market Analysis - Define market segments by automatically grouping similar customers together. Use
    these segments to seek profitable customers
•   Forecasting - Predict sales and inventory amounts and learn how they are interrelated to foresee
    bottlenecks and improve performance
•   Data Exploration - Analyze profitability across customers, or compare customers who prefer different
    brands of the same product to discover new opportunities
•   Unsupervised Learning - Identify previously unknown relationships between various elements of your
    business to better inform your decisions
•   Web Site Analysis - Understand how people use your Web site and group similar usage patterns to offer a
    better experience
•   Campaign Analysis - Spend marketing dollars more effectively by targeting the customers most likely to
    respond to a promotion
•   Information Quality - Identify and handle anomalies during data entry or data loading to improve the
    quality of information
•   Text Analysis - Analyze feedback to find common themes and trends that concern your customers or
    employees, informing decisions with unstructured input
Value of Data Mining
                           Business Knowledge

                                                            SQL Server 2008
Business value




                                                                           Data Mining


                                                          OLAP



                                     Reports (Adhoc)

                           Reports (static)

                  Simple                                         Complex
                                              Usability
Data Mining User Interface
• SQL Server BI Development Studio
  – Environment for creation and data exploration
  – Data Mining projects in Visual Studio solutions, tightly
    integrated
  – Source Control Integration
• SQL Server Management Studio
  – One tool for all administrative tasks
  – Manage, view and query mining models
BI Integration
• Integration Services
  – Data Mining processing and results integrate
    directly in IS pipeline
• OLAP
  – Processing of mining models directly from
    cubes
  – Use of mining results as dimensions
• Reporting Services
  – Embed Data Mining results directly in
    Reporting Services Reports
Applied Data Mining
• Make Decisions without Coding
   – Learn business rules directly from data
• Client Customization
   – Learn logic customized for each client
• Automatic Update
   – Data mining application logic updated by model re-
     processing
   – Applications do not need to be rewritten, recompiled, re-
     deployed
Server Mining Architecture
      BI Dev        Your Application
      Studio
      (Visual
      Studio)        OLE DB/ ADOMD/ XMLA
                                            App
Deploy                                      Data


Analysis Services   Mining Model
Server
                    Data Mining Algorithm           Data
                                                   Source
Data Mining EXtensions
• OLE DB for Data Mining specification
   – Now part of XML/A specification
   – See www.xmla.org for XML/A details
• Connect to Analysis Server
   – OLEDB, ADO, ADO.Net, ADOMD.Net, XMLA
   Dim cmd as ADOMD.Command
   Dim reader as ADOMD.DataReader
   Cmd.Connection = conn
   Set reader =
     Cmd.ExecuteReader(“Select
     Predict(Gender)…”)
Typical DM Process Using DMX
Define a model:
CREATE MINING MODEL ….

                                    Data Mining
Train a model:                   Management System
INSERT INTO dmm ….                   (DMMS)
         Training Data




Prediction using a model:          Mining Model
SELECT …
FROM dmm PREDICTION JOIN …
         Prediction Input Data
DMX Commands
• Definition (DDL)
   –   CREATE – Make new model
   –   SELECT INTO – Create model by copying existing
   –   EXPORT – Save model as .abf file
   –   IMPORT – Retrieve model from .abf file
• Manipulation (DML)
   –   INSERT INTO – Train model
   –   UPDATE – Change content of model
   –   DELETE – Clear content
   –   SELECT – Browse model
DMX SELECT Elements
•   SELECT [FLATTENED] [TOP] <columns>
•   FROM <model>
•   PREDICTION JOIN <table>
•   ON <mapping>
•   WHERE <filter>
•   ORDER BY <sort expression>
    – Use query builder to create SELECT statement
Training a DM Model: Simple
INSERT INTO CollegePlanModel
  (StudentID, Gender, ParentIncome,
   Encouragement, CollegePlans)
OPENROWSET(‘<provider>’, ‘<connection>’,
      ‘SELECT    StudentID,
                 Gender,
                 ParentIncome,
                 Encouragement,
                 CollegePlans
       FROM CollegePlansTrainData’)
Prediction Using a DM Model
• PREDICTION JOIN
  SELECT t.ID, CPModel.Plan
  FROM CPModel PREDICTION JOIN
      OPENQUERY(…,„SELECT * FROM NewStudents‟) AS t
  ON CPModel.Gender = t.Gender AND
     CPModel.IQ = t.IQ
Visit more self help tutorials

• Pick a tutorial of your choice and browse
  through it at your own pace.
• The tutorials section is free, self-guiding and
  will not involve any additional support.
• Visit us at www.dataminingtools.net

More Related Content

What's hot (20)

Facebook Presto presentation
Facebook Presto presentation
Cyanny LIANG
 
Mapping Data Flows Training deck Q1 CY22
Mapping Data Flows Training deck Q1 CY22
Mark Kromer
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
James Serra
 
SQL Database on Azure
SQL Database on Azure
Thurupathan Vijayakumar
 
Azure SQL Database
Azure SQL Database
Palash Debnath
 
Big data architectures and the data lake
Big data architectures and the data lake
James Serra
 
High Performance Mysql
High Performance Mysql
liufabin 66688
 
Microsoft Azure Data Factory Data Flow Scenarios
Microsoft Azure Data Factory Data Flow Scenarios
Mark Kromer
 
Azure Synapse Analytics
Azure Synapse Analytics
WinWire Technologies Inc
 
Data platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptx
CalvinSim10
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform System
James Serra
 
Koalas: Making an Easy Transition from Pandas to Apache Spark
Koalas: Making an Easy Transition from Pandas to Apache Spark
Databricks
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
Databricks
 
Building End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCP
Databricks
 
Microsoft Azure Data Factory Hands-On Lab Overview Slides
Microsoft Azure Data Factory Hands-On Lab Overview Slides
Mark Kromer
 
Fast analytics kudu to druid
Fast analytics kudu to druid
Worapol Alex Pongpech, PhD
 
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Cathrine Wilhelmsen
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Apache Druid 101
Apache Druid 101
Data Con LA
 
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Simplilearn
 
Facebook Presto presentation
Facebook Presto presentation
Cyanny LIANG
 
Mapping Data Flows Training deck Q1 CY22
Mapping Data Flows Training deck Q1 CY22
Mark Kromer
 
Is the traditional data warehouse dead?
Is the traditional data warehouse dead?
James Serra
 
Big data architectures and the data lake
Big data architectures and the data lake
James Serra
 
High Performance Mysql
High Performance Mysql
liufabin 66688
 
Microsoft Azure Data Factory Data Flow Scenarios
Microsoft Azure Data Factory Data Flow Scenarios
Mark Kromer
 
Data platform modernization with Databricks.pptx
Data platform modernization with Databricks.pptx
CalvinSim10
 
Modern Data Warehousing with the Microsoft Analytics Platform System
Modern Data Warehousing with the Microsoft Analytics Platform System
James Serra
 
Koalas: Making an Easy Transition from Pandas to Apache Spark
Koalas: Making an Easy Transition from Pandas to Apache Spark
Databricks
 
Introduction SQL Analytics on Lakehouse Architecture
Introduction SQL Analytics on Lakehouse Architecture
Databricks
 
Building End-to-End Delta Pipelines on GCP
Building End-to-End Delta Pipelines on GCP
Databricks
 
Microsoft Azure Data Factory Hands-On Lab Overview Slides
Microsoft Azure Data Factory Hands-On Lab Overview Slides
Mark Kromer
 
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Pipelines and Data Flows: Introduction to Data Integration in Azure Synapse A...
Cathrine Wilhelmsen
 
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
5 Critical Steps to Clean Your Data Swamp When Migrating Off of Hadoop
Databricks
 
Apache Druid 101
Apache Druid 101
Data Con LA
 
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Hadoop Architecture | HDFS Architecture | Hadoop Architecture Tutorial | HDFS...
Simplilearn
 

Viewers also liked (20)

SQL Server Data Mining - Taking your Application Design to the Next Level
SQL Server Data Mining - Taking your Application Design to the Next Level
Mark Ginnebaugh
 
AI: Planning and AI
AI: Planning and AI
DataminingTools Inc
 
Lf conditionals
Lf conditionals
Olaf Du Pont
 
Lecture no 15
Lecture no 15
zaidshaidzaid
 
Microsoft Data Mining 2012
Microsoft Data Mining 2012
Mark Ginnebaugh
 
Data Mining: Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
Data Mining: Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
Salah Amean
 
38475471 qa-and-software-testing-interview-questions-and-answers
38475471 qa-and-software-testing-interview-questions-and-answers
Maria FutureThoughts
 
Microsoft azure service 소개자료
Microsoft azure service 소개자료
Alvin You
 
Interview Questions for Mobile application Testing
Interview Questions for Mobile application Testing
Rahul S Singh
 
Preparing your QA team for mobile testing
Preparing your QA team for mobile testing
Geoffrey Goetz
 
Webservices(or)SoapUI Interview Questions
Webservices(or)SoapUI Interview Questions
H2kInfosys
 
Portavocía en redes sociales
Portavocía en redes sociales
Muévete en bici por Madrid
 
Quick Look At Clustering
Quick Look At Clustering
DataminingTools Inc
 
Norihicodanch
Norihicodanch
Filip Yang
 
LISP: Errors In Lisp
LISP: Errors In Lisp
DataminingTools Inc
 
LISP: Scope and extent in lisp
LISP: Scope and extent in lisp
DataminingTools Inc
 
Matlab: Saving And Publishing
Matlab: Saving And Publishing
DataminingTools Inc
 
LISP:Predicates in lisp
LISP:Predicates in lisp
DataminingTools Inc
 
Matlab: Discrete Linear Systems
Matlab: Discrete Linear Systems
DataminingTools Inc
 
Data-Applied: Technology Insights
Data-Applied: Technology Insights
DataminingTools Inc
 
SQL Server Data Mining - Taking your Application Design to the Next Level
SQL Server Data Mining - Taking your Application Design to the Next Level
Mark Ginnebaugh
 
Microsoft Data Mining 2012
Microsoft Data Mining 2012
Mark Ginnebaugh
 
Data Mining: Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
Data Mining: Concepts and Techniques_ Chapter 6: Mining Frequent Patterns, ...
Salah Amean
 
38475471 qa-and-software-testing-interview-questions-and-answers
38475471 qa-and-software-testing-interview-questions-and-answers
Maria FutureThoughts
 
Microsoft azure service 소개자료
Microsoft azure service 소개자료
Alvin You
 
Interview Questions for Mobile application Testing
Interview Questions for Mobile application Testing
Rahul S Singh
 
Preparing your QA team for mobile testing
Preparing your QA team for mobile testing
Geoffrey Goetz
 
Webservices(or)SoapUI Interview Questions
Webservices(or)SoapUI Interview Questions
H2kInfosys
 
Data-Applied: Technology Insights
Data-Applied: Technology Insights
DataminingTools Inc
 
Ad

Similar to SQL Server: Data Mining (20)

BI 2008 Simple
BI 2008 Simple
llangit
 
SQL Server 2008 Data Mining
SQL Server 2008 Data Mining
llangit
 
SQL Server 2008 Data Mining
SQL Server 2008 Data Mining
llangit
 
SQL Server 2008 Data Mining
SQL Server 2008 Data Mining
llangit
 
Data mining applications
Data mining applications
Dr. C.V. Suresh Babu
 
Data Mining 2008
Data Mining 2008
llangit
 
Data Mining for Developers
Data Mining for Developers
llangit
 
Decision support system
Decision support system
Bhuwneshwar Pandaya
 
Mine craft:
Mine craft:
Mark Tabladillo
 
Data mining
Data mining
Akannsha Totewar
 
Microsoft SQL Server_2012_predictive_analytics
Microsoft SQL Server_2012_predictive_analytics
David J Rosenthal
 
Sql server 2008 r2 predictive analysis data sheet
Sql server 2008 r2 predictive analysis data sheet
Klaudiia Jacome
 
SQL Saturday 108 -- Enterprise Data Mining with SQL Server
SQL Saturday 108 -- Enterprise Data Mining with SQL Server
Mark Tabladillo
 
SSAS Design &amp; Incremental Processing - PASSMN May 2010
SSAS Design &amp; Incremental Processing - PASSMN May 2010
Dan English
 
Data mining
Data mining
Ahmed Moussa
 
SQL Saturday 109 -- Enterprise Data Mining with SQL Server
SQL Saturday 109 -- Enterprise Data Mining with SQL Server
Mark Tabladillo
 
SQL Saturday 86 -- Enterprise Data Mining with SQL Server
SQL Saturday 86 -- Enterprise Data Mining with SQL Server
Mark Tabladillo
 
Introduction To Sql Server Data Mining
Introduction To Sql Server Data Mining
Hugo Olivera Alonso
 
Lecture2 (1).ppt
Lecture2 (1).ppt
Minakshee Patil
 
Data mining
Data mining
RajThakuri
 
BI 2008 Simple
BI 2008 Simple
llangit
 
SQL Server 2008 Data Mining
SQL Server 2008 Data Mining
llangit
 
SQL Server 2008 Data Mining
SQL Server 2008 Data Mining
llangit
 
SQL Server 2008 Data Mining
SQL Server 2008 Data Mining
llangit
 
Data Mining 2008
Data Mining 2008
llangit
 
Data Mining for Developers
Data Mining for Developers
llangit
 
Microsoft SQL Server_2012_predictive_analytics
Microsoft SQL Server_2012_predictive_analytics
David J Rosenthal
 
Sql server 2008 r2 predictive analysis data sheet
Sql server 2008 r2 predictive analysis data sheet
Klaudiia Jacome
 
SQL Saturday 108 -- Enterprise Data Mining with SQL Server
SQL Saturday 108 -- Enterprise Data Mining with SQL Server
Mark Tabladillo
 
SSAS Design &amp; Incremental Processing - PASSMN May 2010
SSAS Design &amp; Incremental Processing - PASSMN May 2010
Dan English
 
SQL Saturday 109 -- Enterprise Data Mining with SQL Server
SQL Saturday 109 -- Enterprise Data Mining with SQL Server
Mark Tabladillo
 
SQL Saturday 86 -- Enterprise Data Mining with SQL Server
SQL Saturday 86 -- Enterprise Data Mining with SQL Server
Mark Tabladillo
 
Introduction To Sql Server Data Mining
Introduction To Sql Server Data Mining
Hugo Olivera Alonso
 
Ad

More from DataminingTools Inc (20)

Terminology Machine Learning
Terminology Machine Learning
DataminingTools Inc
 
Techniques Machine Learning
Techniques Machine Learning
DataminingTools Inc
 
Machine learning Introduction
Machine learning Introduction
DataminingTools Inc
 
Areas of machine leanring
Areas of machine leanring
DataminingTools Inc
 
AI: Logic in AI 2
AI: Logic in AI 2
DataminingTools Inc
 
AI: Logic in AI
AI: Logic in AI
DataminingTools Inc
 
AI: Learning in AI 2
AI: Learning in AI 2
DataminingTools Inc
 
AI: Learning in AI
AI: Learning in AI
DataminingTools Inc
 
AI: Introduction to artificial intelligence
AI: Introduction to artificial intelligence
DataminingTools Inc
 
AI: Belief Networks
AI: Belief Networks
DataminingTools Inc
 
AI: AI & Searching
AI: AI & Searching
DataminingTools Inc
 
AI: AI & Problem Solving
AI: AI & Problem Solving
DataminingTools Inc
 
Data Mining: Text and web mining
Data Mining: Text and web mining
DataminingTools Inc
 
Data Mining: Outlier analysis
Data Mining: Outlier analysis
DataminingTools Inc
 
Data Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence data
DataminingTools Inc
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlations
DataminingTools Inc
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysis
DataminingTools Inc
 
Data warehouse and olap technology
Data warehouse and olap technology
DataminingTools Inc
 
Data Mining: Data processing
Data Mining: Data processing
DataminingTools Inc
 
Data Mining: clustering and analysis
Data Mining: clustering and analysis
DataminingTools Inc
 
AI: Introduction to artificial intelligence
AI: Introduction to artificial intelligence
DataminingTools Inc
 
Data Mining: Text and web mining
Data Mining: Text and web mining
DataminingTools Inc
 
Data Mining: Mining stream time series and sequence data
Data Mining: Mining stream time series and sequence data
DataminingTools Inc
 
Data Mining: Mining ,associations, and correlations
Data Mining: Mining ,associations, and correlations
DataminingTools Inc
 
Data Mining: Graph mining and social network analysis
Data Mining: Graph mining and social network analysis
DataminingTools Inc
 
Data warehouse and olap technology
Data warehouse and olap technology
DataminingTools Inc
 
Data Mining: clustering and analysis
Data Mining: clustering and analysis
DataminingTools Inc
 

Recently uploaded (20)

Reducing Conflicts and Increasing Safety Along the Cycling Networks of East-F...
Reducing Conflicts and Increasing Safety Along the Cycling Networks of East-F...
Safe Software
 
Oracle Cloud and AI Specialization Program
Oracle Cloud and AI Specialization Program
VICTOR MAESTRE RAMIREZ
 
Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...
Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...
NTT DATA Technology & Innovation
 
“Why It’s Critical to Have an Integrated Development Methodology for Edge AI,...
“Why It’s Critical to Have an Integrated Development Methodology for Edge AI,...
Edge AI and Vision Alliance
 
FIDO Alliance Seminar State of Passkeys.pptx
FIDO Alliance Seminar State of Passkeys.pptx
FIDO Alliance
 
Enabling BIM / GIS integrations with Other Systems with FME
Enabling BIM / GIS integrations with Other Systems with FME
Safe Software
 
Oracle Cloud Infrastructure Generative AI Professional
Oracle Cloud Infrastructure Generative AI Professional
VICTOR MAESTRE RAMIREZ
 
FME for Distribution & Transmission Integrity Management Program (DIMP & TIMP)
FME for Distribution & Transmission Integrity Management Program (DIMP & TIMP)
Safe Software
 
“From Enterprise to Makers: Driving Vision AI Innovation at the Extreme Edge,...
“From Enterprise to Makers: Driving Vision AI Innovation at the Extreme Edge,...
Edge AI and Vision Alliance
 
PyData - Graph Theory for Multi-Agent Integration
PyData - Graph Theory for Multi-Agent Integration
barqawicloud
 
Providing an OGC API Processes REST Interface for FME Flow
Providing an OGC API Processes REST Interface for FME Flow
Safe Software
 
FIDO Seminar: Perspectives on Passkeys & Consumer Adoption.pptx
FIDO Seminar: Perspectives on Passkeys & Consumer Adoption.pptx
FIDO Alliance
 
MuleSoft for AgentForce : Topic Center and API Catalog
MuleSoft for AgentForce : Topic Center and API Catalog
shyamraj55
 
No-Code Workflows for CAD & 3D Data: Scaling AI-Driven Infrastructure
No-Code Workflows for CAD & 3D Data: Scaling AI-Driven Infrastructure
Safe Software
 
June Patch Tuesday
June Patch Tuesday
Ivanti
 
Viral>Wondershare Filmora 14.5.18.12900 Crack Free Download
Viral>Wondershare Filmora 14.5.18.12900 Crack Free Download
Puppy jhon
 
Artificial Intelligence in the Nonprofit Boardroom.pdf
Artificial Intelligence in the Nonprofit Boardroom.pdf
OnBoard
 
Data Validation and System Interoperability
Data Validation and System Interoperability
Safe Software
 
TrustArc Webinar - 2025 Global Privacy Survey
TrustArc Webinar - 2025 Global Privacy Survey
TrustArc
 
Crypto Super 500 - 14th Report - June2025.pdf
Crypto Super 500 - 14th Report - June2025.pdf
Stephen Perrenod
 
Reducing Conflicts and Increasing Safety Along the Cycling Networks of East-F...
Reducing Conflicts and Increasing Safety Along the Cycling Networks of East-F...
Safe Software
 
Oracle Cloud and AI Specialization Program
Oracle Cloud and AI Specialization Program
VICTOR MAESTRE RAMIREZ
 
Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...
Can We Use Rust to Develop Extensions for PostgreSQL? (POSETTE: An Event for ...
NTT DATA Technology & Innovation
 
“Why It’s Critical to Have an Integrated Development Methodology for Edge AI,...
“Why It’s Critical to Have an Integrated Development Methodology for Edge AI,...
Edge AI and Vision Alliance
 
FIDO Alliance Seminar State of Passkeys.pptx
FIDO Alliance Seminar State of Passkeys.pptx
FIDO Alliance
 
Enabling BIM / GIS integrations with Other Systems with FME
Enabling BIM / GIS integrations with Other Systems with FME
Safe Software
 
Oracle Cloud Infrastructure Generative AI Professional
Oracle Cloud Infrastructure Generative AI Professional
VICTOR MAESTRE RAMIREZ
 
FME for Distribution & Transmission Integrity Management Program (DIMP & TIMP)
FME for Distribution & Transmission Integrity Management Program (DIMP & TIMP)
Safe Software
 
“From Enterprise to Makers: Driving Vision AI Innovation at the Extreme Edge,...
“From Enterprise to Makers: Driving Vision AI Innovation at the Extreme Edge,...
Edge AI and Vision Alliance
 
PyData - Graph Theory for Multi-Agent Integration
PyData - Graph Theory for Multi-Agent Integration
barqawicloud
 
Providing an OGC API Processes REST Interface for FME Flow
Providing an OGC API Processes REST Interface for FME Flow
Safe Software
 
FIDO Seminar: Perspectives on Passkeys & Consumer Adoption.pptx
FIDO Seminar: Perspectives on Passkeys & Consumer Adoption.pptx
FIDO Alliance
 
MuleSoft for AgentForce : Topic Center and API Catalog
MuleSoft for AgentForce : Topic Center and API Catalog
shyamraj55
 
No-Code Workflows for CAD & 3D Data: Scaling AI-Driven Infrastructure
No-Code Workflows for CAD & 3D Data: Scaling AI-Driven Infrastructure
Safe Software
 
June Patch Tuesday
June Patch Tuesday
Ivanti
 
Viral>Wondershare Filmora 14.5.18.12900 Crack Free Download
Viral>Wondershare Filmora 14.5.18.12900 Crack Free Download
Puppy jhon
 
Artificial Intelligence in the Nonprofit Boardroom.pdf
Artificial Intelligence in the Nonprofit Boardroom.pdf
OnBoard
 
Data Validation and System Interoperability
Data Validation and System Interoperability
Safe Software
 
TrustArc Webinar - 2025 Global Privacy Survey
TrustArc Webinar - 2025 Global Privacy Survey
TrustArc
 
Crypto Super 500 - 14th Report - June2025.pdf
Crypto Super 500 - 14th Report - June2025.pdf
Stephen Perrenod
 

SQL Server: Data Mining

  • 2. Types of analysis • Ad-hoc query/Reporting/Analysis – What is the purpose? • Simple reports • Key Performance Indicators • OLAP cubes – Slice & Dice – In Real time - What happens now? • Events/Triggers • Data Mining – How do we do it? – What happens?
  • 3. What does Data Mining Do? Explores Your Data Finds Patterns Performs Predictions
  • 4. Data Mining Algorithms • Classification • Regression • Segmentation • Association • Forecasting • Text Analysis • Advanced Data Exploration
  • 5. Mining Process Training data Data to be Mining Model predicted DM Engine Mining Model With predictions
  • 6. Data Mining Process SSAS (OLAP) Business Data DSV Understanding Understanding SSIS SSAS Data Data (OLAP) Preparation SSIS SSAS(OLAP) SSRS Deployment Flexible APIs SSAS Modeling (Data Mining) Evaluation www.crisp-dm.org
  • 7. Data Mining in SQL Server 2008 • New algorithms developed in conjunction with Microsoft Research • Data mining is made accessible and easy to use through integrated user interface, cross- product integration and familiar, standard APIs • Complete framework for building and deploying intelligent applications on the fly • Integration into the cloud.
  • 8. Top New Features in SQL Server 2008 • Test multiple data mining models simultaneously with statistical scores of error and accuracy and confirm their stability with cross validation • Build multiple, incompatible mining models within a single structure; apply model analysis over filtered data; query against structure data to present complete information, all enabled by enhanced mining structures • Combine the best of both worlds by blending optimized near-term predictions (ARTXP) and stable long-term predictions (ARIMA) with Better Time Series Support • Discover the relationship between items that are frequently purchased together by using Shopping Basket Analysis; generate interactive forms for scoring new cases with Predictive Calculator, delivered with Microsoft SQL Server 2008 Data Mining Add-ins for Office 2007
  • 9. Rich and Innovative Algorithms • Benefit from many rich and innovative data mining algorithms, most developed by Microsoft Research to support common business problems promptly and accurately. • Market Basket Analysis - Discover which items tend to be bought together to create recommendations on- the-fly and to determine how product placement can directly contribute to your bottom line • Churn Analysis - Anticipate customers who may be considering canceling their service and identify benefits that will keep them from leaving • Market Analysis - Define market segments by automatically grouping similar customers together. Use these segments to seek profitable customers • Forecasting - Predict sales and inventory amounts and learn how they are interrelated to foresee bottlenecks and improve performance • Data Exploration - Analyze profitability across customers, or compare customers who prefer different brands of the same product to discover new opportunities • Unsupervised Learning - Identify previously unknown relationships between various elements of your business to better inform your decisions • Web Site Analysis - Understand how people use your Web site and group similar usage patterns to offer a better experience • Campaign Analysis - Spend marketing dollars more effectively by targeting the customers most likely to respond to a promotion • Information Quality - Identify and handle anomalies during data entry or data loading to improve the quality of information • Text Analysis - Analyze feedback to find common themes and trends that concern your customers or employees, informing decisions with unstructured input
  • 10. Value of Data Mining Business Knowledge SQL Server 2008 Business value Data Mining OLAP Reports (Adhoc) Reports (static) Simple Complex Usability
  • 11. Data Mining User Interface • SQL Server BI Development Studio – Environment for creation and data exploration – Data Mining projects in Visual Studio solutions, tightly integrated – Source Control Integration • SQL Server Management Studio – One tool for all administrative tasks – Manage, view and query mining models
  • 12. BI Integration • Integration Services – Data Mining processing and results integrate directly in IS pipeline • OLAP – Processing of mining models directly from cubes – Use of mining results as dimensions • Reporting Services – Embed Data Mining results directly in Reporting Services Reports
  • 13. Applied Data Mining • Make Decisions without Coding – Learn business rules directly from data • Client Customization – Learn logic customized for each client • Automatic Update – Data mining application logic updated by model re- processing – Applications do not need to be rewritten, recompiled, re- deployed
  • 14. Server Mining Architecture BI Dev Your Application Studio (Visual Studio) OLE DB/ ADOMD/ XMLA App Deploy Data Analysis Services Mining Model Server Data Mining Algorithm Data Source
  • 15. Data Mining EXtensions • OLE DB for Data Mining specification – Now part of XML/A specification – See www.xmla.org for XML/A details • Connect to Analysis Server – OLEDB, ADO, ADO.Net, ADOMD.Net, XMLA Dim cmd as ADOMD.Command Dim reader as ADOMD.DataReader Cmd.Connection = conn Set reader = Cmd.ExecuteReader(“Select Predict(Gender)…”)
  • 16. Typical DM Process Using DMX Define a model: CREATE MINING MODEL …. Data Mining Train a model: Management System INSERT INTO dmm …. (DMMS) Training Data Prediction using a model: Mining Model SELECT … FROM dmm PREDICTION JOIN … Prediction Input Data
  • 17. DMX Commands • Definition (DDL) – CREATE – Make new model – SELECT INTO – Create model by copying existing – EXPORT – Save model as .abf file – IMPORT – Retrieve model from .abf file • Manipulation (DML) – INSERT INTO – Train model – UPDATE – Change content of model – DELETE – Clear content – SELECT – Browse model
  • 18. DMX SELECT Elements • SELECT [FLATTENED] [TOP] <columns> • FROM <model> • PREDICTION JOIN <table> • ON <mapping> • WHERE <filter> • ORDER BY <sort expression> – Use query builder to create SELECT statement
  • 19. Training a DM Model: Simple INSERT INTO CollegePlanModel (StudentID, Gender, ParentIncome, Encouragement, CollegePlans) OPENROWSET(‘<provider>’, ‘<connection>’, ‘SELECT StudentID, Gender, ParentIncome, Encouragement, CollegePlans FROM CollegePlansTrainData’)
  • 20. Prediction Using a DM Model • PREDICTION JOIN SELECT t.ID, CPModel.Plan FROM CPModel PREDICTION JOIN OPENQUERY(…,„SELECT * FROM NewStudents‟) AS t ON CPModel.Gender = t.Gender AND CPModel.IQ = t.IQ
  • 21. Visit more self help tutorials • Pick a tutorial of your choice and browse through it at your own pace. • The tutorials section is free, self-guiding and will not involve any additional support. • Visit us at www.dataminingtools.net