SQL vs. NoSQL 
Making the right choice 
18 September 2014
The contenders 
SQL NoSQL 
© Copyright Dimension Data 18 September 2014 2
SQL Databases 
• RDBMS 
• Standardized 
• Mature 
• Reliable 
• Well understood 
• Queryable 
• ACID 
NoSQL scalability argument 
• Scale-Up vs Scale-Out 
• Use of commodity hardware 
• Locking / Latching 
• Consistency over partitions 
• Availability of partitions 
• Referential integrity 
[Chart: cost of scaling, SQL vs. NoSQL]
Other RDBMS / SQL Database drawbacks 
• One-solution-fits-all 
• Slow for certain tasks 
• ACID is not always needed 
• ORM required 
• Lack of flexibility 
• Rigid schema 
• Management complexity 
• Add-on solutions 
• XML-fields, Filestreams 
• Full-text indexes 
CAP theorem (Brewer's theorem) 
NoSQL Use Cases 
• Bigness / Avoid hitting the wall 
• Massive write performance 
• Write availability 
• Fast key-value access 
• Flexible schema and flexible datatypes 
• Schema migration 
• No single point of failure 
• Generally available parallel computing 
• Easier maintainability, administration and operations 
• Programmer ease of use 
• Use the right data model for the right problem 
• Tunable CAP tradeoffs 
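"Tunable CAP tradeoffs" is often illustrated with replica quorums. The following is a minimal sketch, not taken from the slides: it assumes a store that keeps N replicas of each value, acknowledges a write after W replicas accept it, and answers a read after consulting R replicas. The function names are hypothetical. The brute-force check confirms the usual rule of thumb that every read overlaps the latest write when R + W > N.

```python
from itertools import combinations


def quorums_overlap(n: int, w: int, r: int) -> bool:
    """Brute force: does every possible write set share a replica with every read set?"""
    replicas = range(n)
    return all(
        set(write_set) & set(read_set)
        for write_set in combinations(replicas, w)
        for read_set in combinations(replicas, r)
    )


def satisfies_quorum_rule(n: int, w: int, r: int) -> bool:
    """Rule of thumb: reads are guaranteed to touch the latest write when R + W > N."""
    return r + w > n


if __name__ == "__main__":
    n = 3
    for w in range(1, n + 1):
        for r in range(1, n + 1):
            print(f"N={n} W={w} R={r} overlap={quorums_overlap(n, w, r)}")
```

Lower values of W and R favour availability and latency; higher values favour consistency.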
ACID Transactions 
Atomicity 
Consistency 
Isolation 
Durability 
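The all-or-nothing behaviour behind Atomicity can be sketched with Python's built-in sqlite3 module. This is an illustration only: the `account` table, the `transfer` helper and the balances are made up for the example. A failed debit rolls back the credit that preceded it, so the database state is left unchanged.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " name TEXT PRIMARY KEY,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [("alice", 100), ("bob", 0)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between accounts; either both updates apply or neither does."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint rejected an overdraft; the credit was rolled back too.
        return False


def balances(conn):
    return dict(conn.execute("SELECT name, balance FROM account"))


if __name__ == "__main__":
    print(transfer(conn, "alice", "bob", 500), balances(conn))
    print(transfer(conn, "alice", "bob", 40), balances(conn))
```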
NoSQL ACID Trade-offs 
• Dropping Atomicity lets you shorten the time tables (sets of data) are locked. Examples: MongoDB, CouchDB. 
• Dropping Consistency lets you scale out writes across cluster nodes. Examples: Riak, Cassandra. 
• Dropping Durability lets you respond to write commands without flushing to disk. Examples: Memcached, Redis. 
NoSQL Database Main Types 
• Key-Value Store 
• A basic dictionary design storing values under unique keys 
• The database does not care about the structure of the value 
• Examples: 
• Memcached 
• Riak 
• Azure Blob Storage 
• Good at: 
• Handles size well 
• Processing a constant stream of small reads and writes 
• Fast 
• Programmer friendly 
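The "basic dictionary design" of a key-value store can be sketched in a few lines. This is an in-memory toy, assuming a hypothetical `KeyValueStore` class; real products add persistence, replication and expiry. Values are opaque bytes, reflecting the point that the database does not care about their structure.

```python
class KeyValueStore:
    """Toy in-memory key-value store: unique keys map to opaque byte values."""

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        if not isinstance(value, bytes):
            raise TypeError("values are stored as opaque bytes")
        self._data[key] = value  # overwrites any previous value for the key

    def get(self, key, default=None):
        return self._data.get(key, default)

    def delete(self, key):
        # Returns True if the key existed, False otherwise.
        return self._data.pop(key, None) is not None


if __name__ == "__main__":
    store = KeyValueStore()
    store.put("session:42", b'{"user": "alice"}')
    print(store.get("session:42"))
```

The only lookup path is the key itself; querying by the content of a value is left to the application.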
NoSQL Database Main Types 
• Column Store 
• A column is a tuple of 3 elements: a unique name, a typed value, and a timestamp 
• Columns may be part of column families 
• Columns need not appear in every record 
• Examples: 
• HBase 
• Hypertable 
• Cassandra 
• Azure Table Storage 
• Good at: 
• Handles size well 
• Streaming massive write loads 
• High availability 
• Multiple data centers 
• MapReduce 
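The column model described on this slide can be sketched as follows. The `Column` and `ColumnFamily` names are hypothetical, and using the timestamp to let the newest write win is an assumption of this sketch (one common use of the timestamp), not a statement about any particular product. Rows are sparse: each row holds only the columns that were written to it.

```python
from typing import Any, NamedTuple


class Column(NamedTuple):
    name: str
    value: Any
    timestamp: int


class ColumnFamily:
    """Toy column family: row key -> {column name -> Column}."""

    def __init__(self):
        self._rows = {}

    def insert(self, row_key, name, value, timestamp):
        row = self._rows.setdefault(row_key, {})
        current = row.get(name)
        # Assumption: the write with the newest timestamp wins.
        if current is None or timestamp >= current.timestamp:
            row[name] = Column(name, value, timestamp)

    def get(self, row_key):
        """Return the row as a plain {column name: value} mapping."""
        return {c.name: c.value for c in self._rows.get(row_key, {}).values()}


if __name__ == "__main__":
    users = ColumnFamily()
    users.insert("u1", "email", "a@example.com", timestamp=1)
    users.insert("u1", "city", "Antwerp", timestamp=1)
    users.insert("u2", "email", "b@example.com", timestamp=1)  # no "city" column
    users.insert("u1", "email", "stale@example.com", timestamp=0)  # older, ignored
    print(users.get("u1"), users.get("u2"))
```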
NoSQL Database Main Types 
• Document Store 
• Use a unique key to store and retrieve a JSON document 
• Documents are schemaless 
• Metadata is added to the document to aid querying 
• Indexing of documents and metadata speeds up retrieval 
• Examples: 
• CouchDB 
• MongoDB 
• RavenDB 
• Azure DocumentDB service (Preview) 
• Good at: 
• Natural data modeling 
• Programmer friendly 
• Rapid development 
• Web friendly 
• CRUD 
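The document-store idea (a unique key per JSON document, no fixed schema, indexes to speed up retrieval) can be sketched with the standard library. `DocumentStore` and its methods are hypothetical names for this illustration, and the index covers only top-level fields with hashable values such as strings and numbers.

```python
import json
import uuid


class DocumentStore:
    """Toy document store: JSON documents under unique keys, with optional field indexes."""

    def __init__(self, indexed_fields=()):
        self._docs = {}  # key -> JSON text
        self._indexes = {field: {} for field in indexed_fields}  # field -> value -> keys

    def save(self, doc, key=None):
        key = key or uuid.uuid4().hex
        self._remove_from_indexes(key)  # drop stale index entries on overwrite
        self._docs[key] = json.dumps(doc)
        for field, index in self._indexes.items():
            if field in doc:
                index.setdefault(doc[field], set()).add(key)
        return key

    def load(self, key):
        return json.loads(self._docs[key])

    def find(self, field, value):
        """Look up documents through the index for `field` (must be an indexed field)."""
        keys = self._indexes[field].get(value, set())
        return [self.load(k) for k in sorted(keys)]

    def _remove_from_indexes(self, key):
        if key not in self._docs:
            return
        old = json.loads(self._docs[key])
        for field, index in self._indexes.items():
            if field in old:
                index[old[field]].discard(key)


if __name__ == "__main__":
    store = DocumentStore(indexed_fields=["city"])
    store.save({"name": "Alice", "city": "Antwerp"}, key="p1")
    store.save({"name": "Bob", "city": "Antwerp", "tags": ["admin"]}, key="p2")
    print(store.find("city", "Antwerp"))
```

The two documents have different shapes, yet both are stored and found through the same index.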
NoSQL Database Main Types 
• Graph Database 
• Uses graph structures with nodes, edges, and properties to represent and store data 
• Every element contains a direct pointer to its adjacent elements 
• Examples: 
• AllegroGraph 
• InfoGrid 
• Neo4j 
• Good at: 
• Complicated graph problems 
• Topological data 
• Fast 
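The point that every element holds direct references to its adjacent elements can be sketched with an adjacency map and a breadth-first search. The `Graph` class is a hypothetical, in-memory illustration with undirected edges and no properties; it is not how any listed product is implemented.

```python
from collections import deque


class Graph:
    """Toy undirected graph: each node keeps a set of its adjacent nodes."""

    def __init__(self):
        self._adjacent = {}

    def add_edge(self, a, b):
        self._adjacent.setdefault(a, set()).add(b)
        self._adjacent.setdefault(b, set()).add(a)

    def shortest_path_length(self, start, goal):
        """Number of edges on a shortest path, or None if the nodes are not connected."""
        if start not in self._adjacent or goal not in self._adjacent:
            return None
        seen = {start}
        queue = deque([(start, 0)])
        while queue:
            node, distance = queue.popleft()
            if node == goal:
                return distance
            for neighbour in self._adjacent[node]:  # follow direct references
                if neighbour not in seen:
                    seen.add(neighbour)
                    queue.append((neighbour, distance + 1))
        return None


if __name__ == "__main__":
    friends = Graph()
    friends.add_edge("alice", "bob")
    friends.add_edge("bob", "carol")
    friends.add_edge("carol", "dave")
    print(friends.shortest_path_length("alice", "dave"))
```

Each hop follows a stored reference instead of a join, which is why traversal-heavy queries suit this model.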
NoSQL Database Type Comparison 
Data Model               Performance  Scalability      Flexibility  Complexity  Functionality
Key–Value Store          high         high             high         none        variable (none)
Column-Oriented Store    high         high             moderate     low         minimal
Document-Oriented Store  high         variable (high)  high         low         variable (low)
Graph Database           variable     variable         high         high        graph theory
Relational Database      variable     variable         low          moderate    relational algebra
Things to consider when choosing 
• Where are you starting from? 
• What are you trying to accomplish? 
• Things to Consider... 
• Your Problem 
• Access pattern, scalability, consistency, durability 
• Money 
• Scaling, admins, license, operating cost 
• Programming 
• Flexible schema, JSON, REST, language, graphs 
• Performance 
• Reads, writes, consistency, workload, eventual consistency 
• Features 
• Cross datacenter, upgrades, indexes, persistence, tunability 
• The vendor 
• Viability, future direction, responsiveness, partnerships 
Big Data – Petabyte range 
Microsoft HDInsight = Hadoop as a service on Azure (+ .NET) 
Hadoop components 
Using Hadoop 
Hadoop cluster size 
Yahoo! wins with a massive 42,000-node cluster 
Questions 
USE [Euricom] 
SELECT [Question] 
FROM [dbo].[FAQ] 
WHERE [Answer] IS NULL 
(0 row(s) affected) 



Editor's Notes

  • #7: The three CAP properties:
      • Consistency: all nodes see the same data at the same time.
      • Availability: a guarantee that every request receives a response about whether it was successful or failed.
      • Partition tolerance: the system continues to operate despite arbitrary message loss or failure of part of the system.
  • #9: Atomicity requires that each transaction is "all or nothing": if one part of the transaction fails, the entire transaction fails, and the database state is left unchanged. An atomic system must guarantee atomicity in each and every situation, including power failures, errors, and crashes. To the outside world, a committed transaction appears (by its effects on the database) to be indivisible ("atomic"), and an aborted transaction does not happen.

    The consistency property ensures that any transaction will bring the database from one valid state to another. Any data written to the database must be valid according to all defined rules, including constraints, cascades, triggers, and any combination thereof. This does not guarantee correctness of the transaction in all ways the application programmer might have wanted (that is the responsibility of application-level code) but merely that any programming errors do not violate any defined rules.

    The isolation property ensures that the concurrent execution of transactions results in a system state that would be obtained if transactions were executed serially, i.e. one after the other. Providing isolation is the main goal of concurrency control. Depending on concurrency control method, the effects of an incomplete transaction might not even be visible to another transaction.

    Durability means that once a transaction has been committed, it will remain so, even in the event of power loss, crashes, or errors. In a relational database, for instance, once a group of SQL statements execute, the results need to be stored permanently (even if the database crashes immediately thereafter). To defend against power loss, transactions (or their effects) must be recorded in a non-volatile memory.
  • #18: Apache Hadoop is a framework that allows for the distributed processing of large data sets across clusters of machines. At its core, Apache Hadoop consists of two sub-projects: Hadoop MapReduce and the Hadoop Distributed File System (HDFS). Hadoop MapReduce is a programming model and software framework for writing applications that rapidly process vast amounts of data in parallel on large clusters of compute nodes. HDFS is the primary storage system used by Hadoop applications. HDFS creates multiple replicas of data blocks and distributes them on compute nodes throughout a cluster to enable reliable, extremely rapid computations. Other Hadoop-related projects at Apache include Chukwa, Hive, HBase, Mahout, Sqoop and ZooKeeper.
      • HDFS: Filesystems that manage the storage across a network of machines are called distributed filesystems. HDFS is designed for storing very large files with write-once-read-many-times patterns, running on clusters of commodity hardware.
      • MapReduce: A framework for processing highly distributable problems across huge datasets using a large number of computers (nodes), collectively referred to as a cluster. The framework is inspired by the map and reduce functions commonly used in functional programming.
      • Chukwa: A Hadoop subproject devoted to large-scale log collection and analysis. Chukwa is built on top of HDFS and the MapReduce framework and inherits Hadoop's scalability and robustness.
      • Hive: A data warehouse infrastructure built on top of Hadoop for providing data summarization, query and analysis. HiveServer provides a Thrift interface and a JDBC / ODBC server.
      • HBase: The Hadoop application to use when you require real-time read/write random access to very large datasets. It is a distributed column-oriented database built on top of HDFS.
      • Mahout: An open source machine learning library from Apache. It is highly scalable. Mahout aims to be the machine learning tool of choice when the collection of data to be processed is very large, perhaps far too large for a single machine.
      • Sqoop/Flume: Sqoop allows easy import and export of data from structured data stores such as relational databases, enterprise data warehouses, and NoSQL systems. The dataset being transferred is sliced up into different partitions and a map-only job is launched with individual mappers responsible for transferring a slice of this dataset.
      • ZooKeeper: A distributed, open-source coordination service for distributed applications. It exposes a simple set of primitives that distributed applications can build upon to implement higher level services for synchronization, configuration maintenance, and groups and naming.
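The map and reduce functions that inspired MapReduce can be sketched in a single process. This is not the Hadoop API: the function names are hypothetical, and the `shuffle` step stands in for the grouping that a cluster would perform between the map and reduce phases. The example counts words.

```python
from collections import defaultdict


def map_phase(line):
    """Map: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single result."""
    return key, sum(values)


def word_count(lines):
    pairs = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


if __name__ == "__main__":
    print(word_count(["SQL or NoSQL", "NoSQL scales out"]))
```

Because each line is mapped independently and each key is reduced independently, both phases can be spread across many nodes.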