© 2016 IBM Corporation
Introducing Big SQL Federation
Created by C. M. Saracco, IBM Silicon Valley Lab
June 2016
Executive summary
§ What’s Big SQL federation?
− Integration technology for Hadoop and remote data sources
− Transparently query Big SQL (Hadoop) and RDBMS tables with standard SQL
− Query optimization, security mapping, other critical features built in
§ Why federate?
− Not always practical to move / replicate data from one source to another
− Hadoop programmers need access to corporate RDBMS data to enhance analytics,
integrate public and proprietary data, etc.
§ What’s supported?
− Big SQL tables (and views) in DFS, HBase, or Hive warehouse
− RDBMS tables (and views) in Oracle, Teradata, MS SQL Server, DB2, Informix,
Netezza, . . .
− Query data across all sources (project, restrict, join, union, wide range of sub-queries,
wide range of built-in functions)
− INSERT INTO … SELECT FROM …
− Issue data-source specific SQL
− Collect statistics and inspect detailed data access plan
− . . . .
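The INSERT INTO … SELECT FROM … item above can be sketched as follows. This is a minimal illustration, not taken from the deck: it assumes a nickname NICK1 that points at a remote RDBMS table (as in the deck's Oracle example), and the target table SALES_HADOOP and column COL2 are hypothetical names.

```sql
-- Hypothetical target table stored in Hadoop; its name and columns are assumptions.
CREATE HADOOP TABLE SALES_HADOOP (COL1 INT, COL2 VARCHAR(50));

-- Copy a slice of remote RDBMS data (read through nickname NICK1) into the Hadoop table.
INSERT INTO SALES_HADOOP (COL1, COL2)
  SELECT COL1, COL2 FROM NICK1 WHERE COL1 < 10;
```

Copying only a filtered slice is one way to bring the rows an analysis needs into Hadoop without replicating the whole source table.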
Agenda
§ Overview
− Key features
− When to federate
§ Technology
− Architecture
− Set up, usage examples
− Supported data sources
§ Summary
Big SQL query federation = virtualized data access
Transparent
§ Appears to be one source
§ Programmers don’t need to know how /
where data is stored
Heterogeneous
§ Accesses data from diverse sources
High Function
§ Full query support against all data
§ Capabilities of sources as well
Autonomous
§ Non-disruptive to data sources, existing
applications, systems.
High Performance
§ Optimization of distributed queries
[Figure labels: SQL tools, applications; Virtualized data; Data sources]
When to federate…
Physical integration not always a requirement/option
Barriers
§ Budget
§ Resources
§ Time
§ Ownership
§ Too ad hoc, temporary
§ Too proprietary
§ Too recent
§ Too big
Federation architecture and components
[Diagram: Federation server (Big SQL) containing a wrapper, servers, and nicknames]
§ Federated server: Big SQL database enabled for federation.
§ Wrapper: library allowing access to a particular class of data sources or protocols (Net8, DRDA, etc.). Contains information about data source characteristics.
§ Server: represents a specific data source.
§ Nickname: a local alias to data on a remote server (e.g., a specific table or view).
Federation catalog
§ Stores information about
− Wrappers, servers, nicknames
− Server attributes
− Nickname attributes
− Remote functions
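To make the federation catalog more concrete, the queries below sketch how the registered wrappers, servers, and nicknames might be listed. They assume that Big SQL exposes the same SYSCAT catalog views and columns as DB2 for Linux, UNIX, and Windows; that is an assumption here, not something the deck states.

```sql
-- Wrappers registered in the federated database and the library each one loads.
SELECT WRAPNAME, LIBRARY FROM SYSCAT.WRAPPERS;

-- Server definitions and the wrapper each one uses.
SELECT SERVERNAME, SERVERTYPE, SERVERVERSION, WRAPNAME FROM SYSCAT.SERVERS;

-- Nicknames and the remote objects they point to.
SELECT TABSCHEMA, TABNAME, SERVERNAME, REMOTE_SCHEMA, REMOTE_TABLE
  FROM SYSCAT.NICKNAMES;
```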
Federation in practice
§ Admin enables federation
§ Apps connect to Big SQL database
§ Nicknames look like tables to the app
§ Big SQL optimizer creates global data access plan with cost analysis, query push down
§ Query fragments executed remotely
[Diagram: the application connects to bigsql. The federation server (Big SQL) holds nicknames and a local table, a cost-based optimizer that produces local + remote execution plans, and wrappers with client libraries that reach the remote sources in their native dialect.]
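One way to see the global data access plan is to capture it with an EXPLAIN statement. The sketch below assumes DB2-style explain facilities are available in Big SQL, that the explain tables already exist in the current schema, and that NICK1 is a nickname like the one in this deck's Oracle example.

```sql
-- Assumes the explain tables have already been created in the current schema.
-- Captures the access plan for a query against a nickname without running the query.
EXPLAIN PLAN FOR
  SELECT * FROM NICK1 WHERE COL1 < 10;
```

In DB2, a captured plan is typically formatted with a tool such as db2exfmt, and work pushed down to a remote source appears under a SHIP operator. Whether a given predicate is actually pushed down depends on the server options and the optimizer's cost analysis.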
Creating and using federated objects (example)
-- Create wrapper to identify client library (Oracle Net8)
CREATE WRAPPER ORA LIBRARY 'libdb2net8.so';

-- Create server for Oracle data source
CREATE SERVER ORASERV TYPE ORACLE VERSION 11 WRAPPER ORA
  AUTHORIZATION "orauser" PASSWORD "orauser"
  OPTIONS (NODE 'TNSNODENAME', PUSHDOWN 'Y', COLLATING_SEQUENCE 'N');

-- Map the local user 'orauser' to the Oracle user 'orauser' / password 'orauser'
CREATE USER MAPPING FOR orauser SERVER ORASERV
  OPTIONS (REMOTE_PASSWORD 'orauser');

-- Create nickname for Oracle table / view
CREATE NICKNAME NICK1 FOR ORASERV.ORAUSER.TABLE1;

-- Query the nickname
SELECT * FROM NICK1 WHERE COL1 < 10;
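The executive summary also mentions issuing data-source specific SQL. In DB2 federation this is done through a pass-through session; the sketch below assumes Big SQL supports the same SET PASSTHRU statement and reuses the ORASERV server from the example above. The ROWNUM predicate is Oracle syntax, chosen only to show a statement that would not be valid Big SQL.

```sql
-- Open a pass-through session to the Oracle server defined above.
SET PASSTHRU ORASERV;

-- This statement is sent to Oracle as written, so Oracle-only syntax (ROWNUM) is allowed.
SELECT * FROM ORAUSER.TABLE1 WHERE ROWNUM <= 5;

-- End the pass-through session and return to normal federated processing.
SET PASSTHRU RESET;
```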
Joining data across sources
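The original slide shows this as an image only. As a rough sketch of what such a join might look like, the query below combines the Oracle nickname NICK1 from the previous example with a local Big SQL table; ORDERS_HADOOP and its columns ORDER_ID and AMOUNT are hypothetical names introduced for illustration.

```sql
-- NICK1 is the Oracle nickname; ORDERS_HADOOP is a hypothetical local Big SQL (Hadoop) table.
SELECT h.ORDER_ID, h.AMOUNT, n.COL1
FROM ORDERS_HADOOP h
JOIN NICK1 n
  ON h.COL1 = n.COL1
WHERE n.COL1 < 10;
```

Because nicknames look like tables to the application, the join is written as ordinary SQL. The optimizer decides which fragments, such as the filter on NICK1, are executed at the remote source.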
Data sources supported by Big SQL Federation Server
§ Current list of supported data sources available at
https://www-304.ibm.com/support/entdocview.wss?uid=swg27044495
Data Source            Supported Versions                                      Notes
DB2®                   DB2 for Linux, UNIX, and Windows 9.7, 9.8, 10.1, 10.5
                       DB2 for z/OS 8.x, 9.x, and 10.x
Oracle                 11g, 11g R1, 11g R2, 12c
Teradata               12, 13, 14                                              Not supported on POWER systems.
Netezza                4.6, 5.0, 6.0, 7.2                                      Not supported on POWER systems.
Informix               11.5
Microsoft SQL Server   2012, 2014
Big SQL federation
– Easily access information on demand
– Combine Big Data in Hadoop with RDBMS data
– Quickly extend your data warehouse
Benefits
– Cost-effective
– Fast time to value
– Agile and flexible
– Versatile
– Low risk, seamless, and transparent

Big Data: SQL query federation for Hadoop and RDBMS data
