Fundamental software – Chapter 24
NOSQL Databases and Big Data Storage Systems
Prepared by: Ayoub S. Al Sahafi - Ref. No.: CL186
Supervisor: Assistant Professor Dr. Mahmoud Alkhasawneh
Al-Madinah International University
Faculty of Computer and Information Technology
 24.1 Introduction to NOSQL Systems
 24.1.1 Emergence of NOSQL Systems
 24.1.2 Characteristics of NOSQL Systems
 24.1.3 Categories of NOSQL Systems
 24.2 The CAP Theorem
 24.3 Document-Based NOSQL Systems and MongoDB
 24.3.1 MongoDB Data Model
 24.3.2 MongoDB CRUD Operations
 24.3.3 MongoDB Distributed Systems Characteristics
 24.4 NOSQL Key-Value Stores
 24.4.1 DynamoDB Overview
 24.4.2 Voldemort Key-Value Distributed Data Store
 24.4.3 Examples of Other Key-Value Stores
 24.5 Column-Based or Wide Column NOSQL Systems
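 24.5.1 Hbase Data Model and Versioning
 24.5.2 Hbase CRUD Operations
 24.5.3 Hbase Storage and Distributed System Concepts
 24.6 NOSQL Graph Databases and Neo4j
 24.6.1 Neo4j Data Model
 24.6.2 The Cypher Query Language of Neo4j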
 24.6.3 Neo4j Interfaces and Distributed System Characteristics
 24.7 Summary
Introduction
In this chapter we will describe the following:
 An introduction to NOSQL systems, their characteristics, and how they
differ from SQL systems.
 Four general categories of NOSQL systems.
 How NOSQL systems approach the issue of consistency among multiple
replicas (copies) by using the paradigm known as eventual consistency.
 The CAP theorem.
 An overview of each category of NOSQL systems.
 Systems have been developed to manage large amounts of data in
organizations such as Google, Amazon, Facebook, and Twitter and in
applications such as social media, Web links, and e-mail.
 NoSQL is an approach to database design that accommodates a wide
variety of data models, including key-value, document, columnar, and
graph formats.
 NoSQL, which stands for "Not only SQL," is an alternative to traditional
relational databases, in which data is placed in tables and the data schema is
carefully designed before the database is built. NoSQL databases are
especially useful for working with large sets of distributed data.
 Most NOSQL systems are distributed databases or distributed storage
systems, with a focus on semistructured data storage, high performance,
availability, data replication, and scalability as opposed to an emphasis on
immediate data consistency, powerful query languages, and structured
data storage.
Terminology
No. Term: Definition
1. NOSQL: Not Only SQL; the class of databases and big data storage systems described in this chapter.
2. CAP: Consistency, Availability, Partition tolerance.
3. CRUD: Create, Read, Update, Delete.
4. BigTable: A compressed, high-performance, proprietary data storage system built on the Google File System.
5. DynamoDB: A hosted NoSQL database offered by Amazon Web Services (AWS).
6. Cassandra: A wide-column store based on ideas of BigTable and DynamoDB.
7. Availability: The condition wherein a given resource can be accessed by its consumers.
8. Scalability: The ability of a database to handle changing demands by adding or removing resources.
9. Graph-based: A graph database (GDB) is a database that uses graph structures for semantic queries, with nodes, edges, and properties to represent and store data.
10. Hbase: A column-oriented non-relational database management system that runs on top of the Hadoop Distributed File System (HDFS).
11. Object databases: Database management systems in which information is represented in the form of objects, as used in object-oriented programming.
12. XML databases: Data persistence software systems that allow data to be specified, and sometimes stored, in XML format.
13. Document-based (document-oriented): A document-oriented database, or document store, is a computer program designed for storing, retrieving, and managing document-oriented information, also known as semistructured data.
14. Object-relational: An object-relational database (ORD), or object-relational database management system (ORDBMS), is a database management system (DBMS) similar to a relational database, but with an object-oriented database model.
15. Hybrid NOSQL systems: Systems combining relational and NoSQL approaches.
16. JSON: JavaScript Object Notation.
17. Range partitioning: A type of database partitioning wherein the partition is based on a predefined range for a specific data field, such as uniquely numbered IDs, dates, or simple values like currency.
18. Hash partitioning: A partitioning technique where a hash key is used to distribute rows evenly across the different partitions (sub-tables).
19. Primary key: A field in a table that uniquely identifies each row/record in a database table.
24.1.1 Emergence of NOSQL Systems
 Many companies and organizations are faced with applications that store
vast amounts of data.
 Millions of web application users submit posts, many with
images and videos.
 User profiles, user relationships, and posts must all be stored in a huge
collection of data stores.
 Some of the data for this type of application is not suitable for a traditional
relational system, so such applications typically need multiple types of
databases and data storage systems.
 For this reason, some organizations decided to develop their own
systems:
 Google developed a proprietary NOSQL system known as BigTable.
 Amazon developed a NOSQL system called DynamoDB.
 Facebook developed a NOSQL system called Cassandra.
24.1.2 Characteristics of NOSQL Systems
 We divide the characteristics into two categories:
a) those related to distributed databases and distributed systems:
1. Availability.
2. Scalability.
3. Replication models.
4. Sharding of files.
5. High-performance data access.
b) those related to data models and query languages.
The following diagram summarizes these categories:
[Diagram: NOSQL characteristics categories. (a) Related to distributed databases and distributed systems: scalability (horizontal, vertical), availability, replication models (master-slave, master-master), sharding of files, and high-performance data access (hashing on object keys). (b) Related to data models and query languages.]
24.1.3 Categories of NOSQL Systems
 NOSQL systems have been characterized into four major categories:
1. Document-based NOSQL systems.
2. NOSQL key-value stores.
3. Column-based or wide column NOSQL systems.
4. Graph-based NOSQL systems.
Additional categories can be added as follows to include some systems that are not easily
categorized into these four categories:
5. Hybrid NOSQL systems.
6. Object databases.
7. XML databases.
24.2 The CAP Theorem
 The CAP theorem can be used to explain some competing requirements
in a distributed system with replication.
 The three letters in CAP refer to three desirable properties of
distributed systems with replicated data:
C: Consistency (among replicated copies).
A: Availability (of the system for read and write operations).
P: Partition tolerance (in the face of the nodes in the system
being partitioned by a network fault).
 The theorem states that it is not possible to guarantee all three of these
properties at the same time in a distributed system with data replication.
Stated from the point of view of clients:
 Consistency: all clients see the same view of the data, even right
after an update or delete.
 Availability: all clients can find a replica of the data, even in case
of partial node failures.
 Partition tolerance: the system continues to work, even in the presence
of partial network failure.
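The trade-off among the three properties can be illustrated with a toy model. The following Python sketch is an illustration only: the Replica class and the write function are hypothetical names, not part of any real system. Two replicas hold a copy of one value. When a network partition leaves only one replica reachable, a write is either rejected (keeping the copies consistent but giving up availability) or accepted on one side only (staying available but giving up immediate consistency).

```python
class Replica:
    """One copy of a single replicated value (hypothetical toy model)."""
    def __init__(self, value):
        self.value = value


def write(replicas, reachable, new_value, prefer_consistency):
    """Try to write new_value through the replicas this client can reach.

    reachable: indexes of the replicas on the client's side of the partition.
    Returns True if the write was accepted, False if it was rejected.
    """
    partitioned = len(reachable) < len(replicas)
    if partitioned and prefer_consistency:
        # Consistency first: refuse the write, so no replica diverges,
        # but the system is unavailable for writes during the partition.
        return False
    # Availability first (or no partition): update every reachable replica.
    for i in reachable:
        replicas[i].value = new_value
    return True


# Consistency preferred: the write is rejected while partitioned.
cp = [Replica("old"), Replica("old")]
cp_accepted = write(cp, reachable=[0], new_value="new", prefer_consistency=True)

# Availability preferred: the write succeeds, but the replicas now disagree.
ap = [Replica("old"), Replica("old")]
ap_accepted = write(ap, reachable=[0], new_value="new", prefer_consistency=False)

print(cp_accepted, [r.value for r in cp])  # False ['old', 'old']
print(ap_accepted, [r.value for r in ap])  # True ['new', 'old']
```

In the second case the stale replica would have to be brought up to date once the partition heals, which is the idea behind eventual consistency.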
24.3 Document-Based NoSQL Systems and MongoDB
 Document-based or document-oriented NoSQL systems store data as
collections of similar documents. These types of systems are also
sometimes known as document stores.
 A major difference between document-based systems and object,
object-relational, and XML systems is that there is no requirement
to specify a schema; rather, the documents are specified as
self-describing data.
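What "self-describing" means can be sketched in a few lines of Python. This is only an illustration: the collection name, field names, and values below are assumptions made up for the example. Each document carries its own field names, so two documents in the same collection do not have to share the same structure.

```python
import json

# Two documents in the same hypothetical "project" collection.
# Field names travel with each document, so no schema is declared up front
# and the two documents need not have the same fields.
project_collection = [
    {"_id": "P1", "Pname": "ProductX", "Plocation": "Bellaire"},
    {"_id": "P2", "Pname": "ProductY",
     "Workers": [{"Ename": "John Smith", "Hours": 32.5}]},
]

for doc in project_collection:
    # Each document describes its own structure.
    print(sorted(doc.keys()), json.dumps(doc))

# The set of fields differs from one document to the next.
field_sets = [set(doc) for doc in project_collection]
```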
24.3.1 MongoDB Data Model
 MongoDB documents are stored in BSON (Binary JSON) format, which
is a variation of JSON with some additional data types and is more
efficient for storage than JSON.
 Individual documents are stored in a collection.
As a simple example: COMPANY database.
The following command can be used to create a collection called project to hold
PROJECT objects from the COMPANY database:
db.createCollection("project", { capped : true, size : 1310720, max : 500 } )
* The collection is capped; this means it has upper limits on its storage space (size) and
number of documents (max).
24.3.2 MongoDB CRUD Operations
 MongoDB has several CRUD operations, where CRUD stands for
create, read, update, delete.
 Documents can be created and inserted into their collections using
the insert operation, whose format is:
db.<collection_name>.insert(<document(s)>)
 The delete operation is called remove, and the format is:
db.<collection_name>.remove(<condition>)
 For read queries, the main command is called find, and the format is:
db.<collection_name>.find(<condition>)
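The behavior of these three operations can be sketched with an in-memory stand-in. The ToyCollection class below is hypothetical and is not the MongoDB API. It only mimics the shape of insert, find, and remove, and it assumes that a condition is a set of field: value pairs that must all match exactly, which is far simpler than real query conditions.

```python
class ToyCollection:
    """In-memory stand-in for a document collection (not the MongoDB API)."""

    def __init__(self):
        self.docs = []

    def insert(self, documents):
        # Accept one document or a list of documents, like <document(s)>.
        if isinstance(documents, dict):
            documents = [documents]
        self.docs.extend(dict(d) for d in documents)

    @staticmethod
    def _matches(doc, condition):
        # A condition is a dict of field: value pairs that must all be equal.
        return all(doc.get(k) == v for k, v in condition.items())

    def find(self, condition=None):
        condition = condition or {}
        return [d for d in self.docs if self._matches(d, condition)]

    def remove(self, condition):
        self.docs = [d for d in self.docs if not self._matches(d, condition)]


project = ToyCollection()
project.insert([
    {"_id": "P1", "Pname": "ProductX"},
    {"_id": "P2", "Pname": "ProductY"},
])
found = project.find({"Pname": "ProductX"})
project.remove({"_id": "P1"})
remaining = project.find()
print(found, remaining)
```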
24.3.3 MongoDB Distributed Systems Characteristics
 The concept of a replica set is used in MongoDB to create multiple
copies of the same data set on different nodes in the distributed
system, and it uses a variation of the master-slave approach for
replication.
 For example, suppose that we want to replicate a particular
document collection C. A replica set will have one primary copy of
the collection C stored in one node N1, and at least one secondary
copy (replica) of C stored at another node N2. Additional copies can
be stored in nodes N3, N4, etc.
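The N1, N2, N3 example can be sketched as follows. This is a toy model, not MongoDB code: the Node and ReplicaSet classes are hypothetical, and the sketch assumes, as in a master-slave scheme, that writes are applied to the primary copy first and copied to the secondaries afterwards.

```python
class Node:
    """A node holding one copy of collection C (hypothetical toy model)."""
    def __init__(self, name):
        self.name = name
        self.collection = []


class ReplicaSet:
    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = secondaries

    def write(self, document):
        # All writes go to the primary copy first.
        self.primary.collection.append(document)

    def replicate(self):
        # Secondaries copy whatever they are missing from the primary.
        for node in self.secondaries:
            missing = self.primary.collection[len(node.collection):]
            node.collection.extend(missing)


n1, n2, n3 = Node("N1"), Node("N2"), Node("N3")
rs = ReplicaSet(primary=n1, secondaries=[n2, n3])
rs.write({"_id": 1})
stale_read = list(n2.collection)   # read from a secondary before replication
rs.replicate()
fresh_read = list(n2.collection)   # the same read after replication
print(stale_read, fresh_read)
```

A read served by a secondary before replication completes can return older data than the primary holds, which is why the choice of where to read matters.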
 There are two ways to partition a collection into shards in MongoDB:
1. range partitioning
2. hash partitioning
 The partitioning field, known as the shard key in MongoDB, must
have two characteristics:
1. it must exist in every document in the collection
2. it must have an index
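The difference between the two partitioning methods can be sketched in Python. The number of shards, the range boundaries, and the function names are assumptions chosen for the example, and the hashing shown is not the hash function MongoDB itself uses.

```python
import bisect
import hashlib

NUM_SHARDS = 3  # hypothetical cluster size

# Range partitioning: each shard owns a contiguous range of shard-key values.
# Upper bounds (exclusive) of the first two ranges; the last shard takes the rest.
RANGE_BOUNDS = [100, 200]


def range_shard(shard_key):
    """Pick the shard whose key range contains shard_key."""
    return bisect.bisect_right(RANGE_BOUNDS, shard_key)


def hash_shard(shard_key):
    """Pick the shard from a hash of the shard key."""
    digest = hashlib.sha256(str(shard_key).encode("utf-8")).hexdigest()
    return int(digest, 16) % NUM_SHARDS


keys = [5, 99, 100, 150, 250]
range_placement = [range_shard(k) for k in keys]
hash_placement = [hash_shard(k) for k in keys]
print(range_placement)  # [0, 0, 1, 1, 2]
print(hash_placement)
```

Range partitioning keeps neighboring key values on the same shard, which suits queries over a range of keys. Hash partitioning scatters them, which tends to spread the load more evenly.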
24.4 NoSQL Key-Value Stores
 Key-value stores focus on high performance, availability, and scalability
by storing data in a distributed storage system.
 The main characteristic of key-value stores is the fact that every value
(data item) must be associated with a unique key, and that retrieving
the value by supplying the key must be very fast.
24.4.1 DynamoDB Overview
 The DynamoDB system is an Amazon product and is available as part
of Amazon’s AWS/SDK platforms (Amazon Web Services/Software
Development Kit).
 The basic data model in DynamoDB uses the concepts of tables, items,
and attributes.
• When a table is created, it is required to specify a table name and
a primary key.
• The primary key can be one of the following two types:
1. A single attribute.
2. A pair of attributes.
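The two kinds of primary key can be sketched with a toy table. The ToyTable class, the table names, and the attribute names below are hypothetical and are not the DynamoDB API. The sketch only shows that an item is located by the value of one attribute, or by the combined values of a pair of attributes.

```python
class ToyTable:
    """Toy model of a table of items (not the DynamoDB API)."""

    def __init__(self, name, key_attributes):
        # key_attributes: one attribute name, or a pair of attribute names.
        self.name = name
        self.key_attributes = tuple(key_attributes)
        self.items = {}

    def _key_of(self, item):
        return tuple(item[a] for a in self.key_attributes)

    def put_item(self, item):
        # An item with the same primary key replaces the previous one.
        self.items[self._key_of(item)] = item

    def get_item(self, *key_values):
        return self.items.get(tuple(key_values))


# Primary key made of a single attribute.
users = ToyTable("USER", ["UserId"])
users.put_item({"UserId": "u1", "Name": "Sara"})

# Primary key made of a pair of attributes.
posts = ToyTable("POST", ["UserId", "PostTime"])
posts.put_item({"UserId": "u1", "PostTime": "2018-10-01", "Text": "first"})
posts.put_item({"UserId": "u1", "PostTime": "2018-10-02", "Text": "second"})

print(users.get_item("u1"))
print(posts.get_item("u1", "2018-10-02"))
```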
24.4.2 Voldemort Key-Value Distributed Data Store
 Voldemort is an open source system available through Apache 2.0
open source licensing rules. It is based on Amazon’s DynamoDB.
 Some of the features of Voldemort are as follows:
1. Simple basic operations. For example, for a store s:
o s.put(k, v) inserts an item as a key-value pair with key k and value v.
o s.delete(k) deletes the item whose key is k from the store.
o v = s.get(k) retrieves the value v associated with key k.
2. High-level formatted data values. The values v in the (k, v) items can be specified in JSON
(JavaScript Object Notation), and the system will convert between JSON and the internal storage format.
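The three operations and the JSON conversion can be sketched as follows. The ToyStore class is hypothetical and is not the Voldemort client API. It assumes, purely for illustration, that the internal storage format is UTF-8 bytes.

```python
import json


class ToyStore:
    """Toy key-value store (hypothetical; not the Voldemort client API)."""

    def __init__(self):
        self._data = {}  # key -> value in an internal storage format (bytes here)

    def put(self, k, v):
        # Convert the JSON-compatible value to the internal format.
        self._data[k] = json.dumps(v).encode("utf-8")

    def get(self, k):
        # Convert back from the internal format; None if the key is absent.
        raw = self._data.get(k)
        return None if raw is None else json.loads(raw.decode("utf-8"))

    def delete(self, k):
        self._data.pop(k, None)


s = ToyStore()
s.put("emp:1", {"Fname": "John", "Lname": "Smith"})
v = s.get("emp:1")
s.delete("emp:1")
after_delete = s.get("emp:1")
print(v, after_delete)
```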
24.4.3 Examples of Other Key-Value Stores
 Oracle key-value store. Oracle has one of the well-known SQL relational database systems, and
Oracle also offers a system based on the key-value store concept; this system is called the Oracle
NoSQL Database.
24.5 Column-Based or Wide Column NOSQL Systems
 Another category of NOSQL systems is known as column-based or wide column systems.
 The Google distributed storage system for big data, known as BigTable, is a well-known example of this
class of NOSQL systems, and it is used in many Google applications that require large amounts of data
storage, such as Gmail.
 BigTable uses the Google File System (GFS) for data storage and distribution. An open source system
known as Apache Hbase is somewhat similar to Google BigTable, but it typically uses HDFS (Hadoop
Distributed File System) for data storage.
24.5.1 Hbase Data Model and Versioning
 The data model in Hbase organizes data using the concepts of namespaces, tables, column
families, column qualifiers, columns, rows, and data cells.
 A column is identified by a combination of (column family:column qualifier).
 Data is stored in a self-describing form by associating columns with data values, where data
values are strings.
Examples:
Creating a table called EMPLOYEE with three column families: Name, Address, and Details:
create 'EMPLOYEE', 'Name', 'Address', 'Details'
Some Hbase basic CRUD operations:
Creating a table: create <tablename>, <column family>, <column family>, …
Inserting data: put <tablename>, <rowid>, <column family>:<column qualifier>, <value>
Reading data (all data in a table): scan <tablename>
Retrieving data (one item): get <tablename>, <rowid>
Examples:
Inserting some row data in the EMPLOYEE table:
put 'EMPLOYEE', 'row1', 'Name:Fname', 'Ahmad'
put 'EMPLOYEE', 'row1', 'Name:Lname', 'Ali'
put 'EMPLOYEE', 'row1', 'Name:Nickname', 'Khalidi'
put 'EMPLOYEE', 'row1', 'Details:Job', 'Engineer'
put 'EMPLOYEE', 'row1', 'Details:Review', 'Good'
put 'EMPLOYEE', 'row2', 'Name:Fname', 'Sara'
put 'EMPLOYEE', 'row2', 'Name:Lname', 'Rami'
put 'EMPLOYEE', 'row2', 'Name:MName', 'S'
put 'EMPLOYEE', 'row2', 'Details:Job', 'IT'
put 'EMPLOYEE', 'row2', 'Details:Supervisor', 'Sead Noor'
put 'EMPLOYEE', 'row3', 'Name:Fname', 'Hasan'
put 'EMPLOYEE', 'row3', 'Name:Minit', 'E'
put 'EMPLOYEE', 'row3', 'Name:Lname', 'Mohammad'
put 'EMPLOYEE', 'row3', 'Name:Suffix', 'Mr.'
put 'EMPLOYEE', 'row3', 'Details:Job', 'CEO'
put 'EMPLOYEE', 'row3', 'Details:Salary', '1,000,000'
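The way these put commands build up self-describing rows can be sketched with a toy model. The ToyHbaseTable class is hypothetical and is not the Hbase client API. It assumes that column families are fixed when the table is created, while column qualifiers can differ from row to row, as the Nickname, MName, and Supervisor columns in the data above suggest.

```python
from collections import defaultdict


class ToyHbaseTable:
    """Toy model of an Hbase-style table (not the Hbase client API)."""

    def __init__(self, name, column_families):
        self.name = name
        self.column_families = set(column_families)
        # row id -> {"family:qualifier": value}
        self.rows = defaultdict(dict)

    def put(self, rowid, column, value):
        family = column.split(":", 1)[0]
        # Assumption: families are fixed at table creation; qualifiers are not.
        if family not in self.column_families:
            raise KeyError(f"unknown column family: {family}")
        self.rows[rowid][column] = value

    def get(self, rowid):
        return dict(self.rows.get(rowid, {}))

    def scan(self):
        # Rows come back in lexicographic order of their row ids.
        return [(rowid, dict(self.rows[rowid])) for rowid in sorted(self.rows)]


employee = ToyHbaseTable("EMPLOYEE", ["Name", "Address", "Details"])
employee.put("row1", "Name:Fname", "Ahmad")
employee.put("row1", "Name:Nickname", "Khalidi")
employee.put("row2", "Name:Fname", "Sara")
employee.put("row2", "Details:Supervisor", "Sead Noor")

print(employee.get("row1"))
print([rowid for rowid, _ in employee.scan()])
```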
24.5.2 Hbase CRUD Operations
 Hbase has low-level CRUD (create, read, update, delete) operations, as in many of the NoSQL
systems.
24.5.3 Hbase Storage and Distributed System Concepts
 Each Hbase table is divided into a number of regions, where each region will hold a range of the
row keys in the table; this is why the row keys must be lexicographically ordered.
 Hbase uses the Apache Zookeeper open source system for services related to managing the
naming, distribution, and synchronization of the Hbase data on the distributed Hbase server
nodes, as well as for coordination and replication services.
 Hbase also uses Apache HDFS (Hadoop Distributed File System) for distributed file services.
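How a row key is mapped to a region can be sketched as a lookup over sorted split points. The split points and the function name below are assumptions made up for the example, not values or calls taken from Hbase.

```python
import bisect

# Hypothetical split points: each region holds a contiguous range of row keys.
# Region 0: keys < "row2"; region 1: "row2" <= keys < "row3"; region 2: the rest.
REGION_START_KEYS = ["row2", "row3"]


def region_for(row_key):
    """Return the index of the region whose key range contains row_key."""
    return bisect.bisect_right(REGION_START_KEYS, row_key)


placement = {key: region_for(key)
             for key in ["row1", "row2", "row25", "row3", "row9"]}
print(placement)
# {'row1': 0, 'row2': 1, 'row25': 1, 'row3': 2, 'row9': 2}
```

Note that "row25" lands in the same region as "row2" because the comparison is lexicographic, not numeric.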
24.6 NOSQL Graph Databases and Neo4j
 Another category of NOSQL systems is known as graph databases or graph-oriented
NOSQL systems.
 The data is represented as a graph, which is a collection of vertices (nodes) and edges.
 Both nodes and edges can be labeled to indicate the types of entities and relationships they
represent, and it is generally possible to store data associated with both individual nodes and
individual edges.
24.6.1 Neo4j Data Model
 The data model in Neo4j organizes data using the concepts of nodes and relationships.
 Both nodes and relationships can have properties, which store the data items associated with
nodes and relationships.
 Nodes can have labels; the nodes that have the same label are grouped into a collection that
identifies a subset of the nodes in the database graph for querying purposes.
 Each relationship has a start node and an end node as well as a relationship type, which serves a
similar role to a node label by identifying similar relationships that have the same relationship
type.
 In conventional graph theory, nodes and relationships are generally called vertices and edges.
The Neo4j graph data model somewhat resembles how data is represented in the ER and EER
models.
 Properties can be specified via a map pattern, which is made of one or more "name : value" pairs
enclosed in curly brackets; for example {Lname : 'Smith', Fname : 'John', Minit : 'B'}.
Labels and properties:
 When a node is created, the node label can be specified. It is also possible to create nodes
without any labels.
Indexing and node identifiers:
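 When a node is created, the Neo4j system creates an internal unique system-defined identifier for
each node.
 To retrieve individual nodes efficiently using other properties, the user can create indexes for the
collection of nodes that have a particular label.
 For example, Empid can be used to index nodes with the EMPLOYEE label, Dno to index the
nodes with the DEPARTMENT label, and Pno to index the nodes with the PROJECT label.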
24.6.2 The Cypher Query Language of Neo4j
 Neo4j has a high-level query language, Cypher.
 There are declarative commands for creating nodes and relationships, as well as for finding nodes
and relationships based on specifying patterns.
 Deletion and modification of data is also possible in Cypher.
Examples in Neo4j using the Cypher language
Creating some nodes for the COMPANY data:
CREATE (e1:EMPLOYEE {Empid: '1', Lname: 'Smith', Fname: 'John', Minit: 'B'})
CREATE (e2:EMPLOYEE {Empid: '2', Lname: 'Wong', Fname: 'Franklin'})
CREATE (e3:EMPLOYEE {Empid: '3', Lname: 'Zelaya', Fname: 'Alicia'})
CREATE (e4:EMPLOYEE {Empid: '4', Lname: 'Wallace', Fname: 'Jennifer', Minit: 'S'})
...
CREATE (d1:DEPARTMENT {Dno: '5', Dname: 'Research'})
CREATE (d2:DEPARTMENT {Dno: '4', Dname: 'Administration'})
...
CREATE (p1:PROJECT {Pno: '1', Pname: 'ProductX'})
CREATE (p2:PROJECT {Pno: '2', Pname: 'ProductY'})
CREATE (p3:PROJECT {Pno: '10', Pname: 'Computerization'})
CREATE (p4:PROJECT {Pno: '20', Pname: 'Reorganization'})
...
CREATE (loc1:LOCATION {Lname: 'Houston'})
CREATE (loc2:LOCATION {Lname: 'Stafford'})
CREATE (loc3:LOCATION {Lname: 'Bellaire'})
CREATE (loc4:LOCATION {Lname: 'Sugarland'})
...
Creating some relationships for the COMPANY data:
CREATE (e1)-[:WorksFor]->(d1)
CREATE (e3)-[:WorksFor]->(d2)
...
CREATE (d1)-[:Manager]->(e2)
CREATE (d2)-[:Manager]->(e4)
...
CREATE (d1)-[:LocatedIn]->(loc1)
CREATE (d1)-[:LocatedIn]->(loc3)
CREATE (d1)-[:LocatedIn]->(loc4)
CREATE (d2)-[:LocatedIn]->(loc2)
...
CREATE (e1)-[:WorksOn {Hours: '32.5'}]->(p1)
CREATE (e1)-[:WorksOn {Hours: '7.5'}]->(p2)
CREATE (e2)-[:WorksOn {Hours: '10.0'}]->(p1)
CREATE (e2)-[:WorksOn {Hours: 10.0}]->(p2)
CREATE (e2)-[:WorksOn {Hours: '10.0'}]->(p3)
CREATE (e2)-[:WorksOn {Hours: 10.0}]->(p4)
...
Basic simplified syntax of some common Cypher clauses:
Finding nodes and relationships that match a pattern: MATCH <pattern>
Specifying aggregates and other query variables: WITH <specifications>
Specifying conditions on the data to be retrieved: WHERE <condition>
Specifying the data to be returned: RETURN <data>
Ordering the data to be returned: ORDER BY <data>
Limiting the number of returned data items: LIMIT <max number>
Creating nodes: CREATE <node, optional labels and properties>
Creating relationships: CREATE <relationship, relationship type and optional properties>
Deletion: DELETE <nodes or relationships>
Specifying property values and labels: SET <property values and labels>
Removing property values and labels: REMOVE <property values and labels>
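What MATCH, WHERE, RETURN, and ORDER BY do together can be sketched over a small in-memory property graph. This is a toy model in Python, not Neo4j code: the match function and the data layout are assumptions for illustration, with a few nodes and relationships borrowed from the COMPANY examples.

```python
# Nodes: id -> (label, properties). Relationships: (start, type, end, properties).
nodes = {
    "e1": ("EMPLOYEE", {"Empid": "1", "Lname": "Smith", "Fname": "John"}),
    "e2": ("EMPLOYEE", {"Empid": "2", "Lname": "Wong", "Fname": "Franklin"}),
    "d1": ("DEPARTMENT", {"Dno": "5", "Dname": "Research"}),
    "p1": ("PROJECT", {"Pno": "1", "Pname": "ProductX"}),
}
relationships = [
    ("e1", "WorksFor", "d1", {}),
    ("d1", "Manager", "e2", {}),
    ("e1", "WorksOn", "p1", {"Hours": "32.5"}),
    ("e2", "WorksOn", "p1", {"Hours": "10.0"}),
]


def match(start_label, rel_type, end_label, where=lambda s, r, e: True):
    """Find (start)-[rel_type]->(end) patterns, roughly like MATCH ... WHERE."""
    results = []
    for start, rtype, end, rprops in relationships:
        s_label, s_props = nodes[start]
        e_label, e_props = nodes[end]
        pattern_ok = (s_label, rtype, e_label) == (start_label, rel_type, end_label)
        if pattern_ok and where(s_props, rprops, e_props):
            results.append((s_props, rprops, e_props))
    return results


# Roughly: MATCH (e:EMPLOYEE)-[w:WorksOn]->(p:PROJECT)
#          WHERE p.Pno = '1' RETURN e.Lname, w.Hours ORDER BY e.Lname
rows = sorted(
    (s["Lname"], r["Hours"])
    for s, r, e in match("EMPLOYEE", "WorksOn", "PROJECT",
                         where=lambda s, r, e: e["Pno"] == "1")
)
print(rows)  # [('Smith', '32.5'), ('Wong', '10.0')]
```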
24.6.3 Neo4j Interfaces and Distributed System Characteristics
 Neo4j has other interfaces that can be used to create, retrieve, and update nodes and
relationships in a graph database.
 It also has two main versions:
1. Enterprise edition.
2. Community edition.
 Both editions support the Neo4j graph data model and storage system, as well as the Cypher graph query
language and several other interfaces, including a high-performance native API and language drivers for
several popular programming languages, such as Java, Python, and PHP.
 In addition, both editions support ACID properties.
24.7 Summary
 In this chapter, we discussed the class of database systems known as NOSQL systems, which focus on efficient
storage and retrieval of large amounts of “big data.” Applications that use these types of systems include social
media, Web links, user profiles, marketing and sales, posts and tweets, road maps and spatial data, and e-mail.
 The term NOSQL is generally interpreted as Not Only SQL—rather than NO to SQL—and is meant to convey
that many applications need systems other than traditional relational SQL systems to augment their data
management needs.
 These systems are distributed databases or distributed storage systems, with a focus on semistructured data
storage, high performance, availability, data replication, and scalability rather than an emphasis on immediate
data consistency, powerful query languages, and structured data storage.
 We started with an introduction to NOSQL systems, their characteristics, and how they differ from SQL
systems. Four general categories of NOSQL systems are document-based, key-value stores, column-based,
and graph-based.
 We discussed how NOSQL systems approach the issue of consistency among multiple replicas (copies) by using the
paradigm known as eventual consistency. We discussed the CAP theorem, which can be used to understand the
emphasis of NOSQL systems on availability.
 We then gave an overview of the four main categories of NOSQL systems:
1. document-based systems
2. key-value stores
3. column-based systems
4. graph-based systems
 We also noted that some NOSQL systems may not fall
neatly into a single category but rather use techniques
that span two or more categories.
References
1. Ramez Elmasri and Shamkant B. Navathe. Fundamentals of Database Systems, Seventh Edition.
2. Peter W. Resnick. "Internet Message Format". tools.ietf.org. Retrieved 2018-10-02.
3. "JSON Objects". www.w3schools.com. Retrieved 2018-10-02.

More Related Content

PDF
Comparative study of no sql document, column store databases and evaluation o...
PDF
EVALUATION CRITERIA FOR SELECTING NOSQL DATABASES IN A SINGLE-BOX ENVIRONMENT
PDF
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
PDF
Experimental evaluation of no sql databases
PPTX
Robust Module based data management system
DOCX
Robust module based data management
PDF
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
DOCX
Deep semantic understanding
Comparative study of no sql document, column store databases and evaluation o...
EVALUATION CRITERIA FOR SELECTING NOSQL DATABASES IN A SINGLE-BOX ENVIRONMENT
CASSANDRA A DISTRIBUTED NOSQL DATABASE FOR HOTEL MANAGEMENT SYSTEM
Experimental evaluation of no sql databases
Robust Module based data management system
Robust module based data management
A NOVEL APPROACH FOR HOTEL MANAGEMENT SYSTEM USING CASSANDRA
Deep semantic understanding

What's hot (20)

PDF
Quantitative Performance Evaluation of Cloud-Based MySQL (Relational) Vs. Mon...
PDF
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
PDF
Attachmate DATABridge Glossary
PDF
Redis Cashe is an open-source distributed in-memory data store.
PDF
BigData Behind-the-Scenes~20150827
PDF
Datastores
PDF
Iaetsd mapreduce streaming over cassandra datasets
PDF
DSM - Comparison of Hbase and Cassandra
PDF
Data Storage and Management project Report
PDF
Analysis on NoSQL: MongoDB Tool
PPTX
Technical Background
PPTX
Scabiv0.2
PDF
A Study of Performance NoSQL Databases
PDF
A request skew aware heterogeneous distributed
PPTX
Force11 JDDCP workshop presentation, @ Force2015, Oxford
PDF
Dsm project-h base-cassandra
PDF
Nosql intro
PPTX
A Standard Data Format for Computational Chemistry: CSX
Quantitative Performance Evaluation of Cloud-Based MySQL (Relational) Vs. Mon...
SURVEY ON IMPLEMANTATION OF COLUMN ORIENTED NOSQL DATA STORES ( BIGTABLE & CA...
Attachmate DATABridge Glossary
Redis Cashe is an open-source distributed in-memory data store.
BigData Behind-the-Scenes~20150827
Datastores
Iaetsd mapreduce streaming over cassandra datasets
DSM - Comparison of Hbase and Cassandra
Data Storage and Management project Report
Analysis on NoSQL: MongoDB Tool
Technical Background
Scabiv0.2
A Study of Performance NoSQL Databases
A request skew aware heterogeneous distributed
Force11 JDDCP workshop presentation, @ Force2015, Oxford
Dsm project-h base-cassandra
Nosql intro
A Standard Data Format for Computational Chemistry: CSX
Ad

Similar to Softwae and database in data communication network (20)

PPTX
UNIT I Introduction to NoSQL.pptx
PPTX
UNIT I Introduction to NoSQL.pptx
PDF
NoSQL BIg Data Analytics Mongo DB and Cassandra .pdf
PPTX
Nosql databases
PPT
NoSQL Fundamentals PowerPoint Presentation
PPT
No SQL Databases as modern database concepts
PPTX
Introduction to Data Science NoSQL.pptx
PPT
6269441.ppt
PPTX
Master.pptx
PPT
No sql databases
PPT
NoSQL - 05March2014 Seminar
PPTX
No sq lv2
PDF
NoSql and it's introduction features-Unit-1.pdf
PPTX
NOSQL PRESENTATION ON INTRRODUCTION Intro.pptx
PPTX
PPSX
A Seminar on NoSQL Databases.
PPTX
Introduction to asdfghjkln b vfgh n v
PPTX
No sql database
PPTX
NoSQL Basics and MongDB
PPT
No SQL Databases.ppt
UNIT I Introduction to NoSQL.pptx
UNIT I Introduction to NoSQL.pptx
NoSQL BIg Data Analytics Mongo DB and Cassandra .pdf
Nosql databases
NoSQL Fundamentals PowerPoint Presentation
No SQL Databases as modern database concepts
Introduction to Data Science NoSQL.pptx
6269441.ppt
Master.pptx
No sql databases
NoSQL - 05March2014 Seminar
No sq lv2
NoSql and it's introduction features-Unit-1.pdf
NOSQL PRESENTATION ON INTRRODUCTION Intro.pptx
A Seminar on NoSQL Databases.
Introduction to asdfghjkln b vfgh n v
No sql database
NoSQL Basics and MongDB
No SQL Databases.ppt
Ad

Recently uploaded (20)

PDF
Empathic Computing: Creating Shared Understanding
PDF
Dropbox Q2 2025 Financial Results & Investor Presentation
PDF
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
PPTX
MYSQL Presentation for SQL database connectivity
PDF
NewMind AI Weekly Chronicles - August'25 Week I
PDF
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
PDF
Unlocking AI with Model Context Protocol (MCP)
PDF
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
PPT
“AI and Expert System Decision Support & Business Intelligence Systems”
PPT
Teaching material agriculture food technology
PDF
Spectral efficient network and resource selection model in 5G networks
PDF
Advanced Soft Computing BINUS July 2025.pdf
PDF
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
DOCX
The AUB Centre for AI in Media Proposal.docx
PDF
Machine learning based COVID-19 study performance prediction
PDF
Reach Out and Touch Someone: Haptics and Empathic Computing
PDF
Review of recent advances in non-invasive hemoglobin estimation
PDF
KodekX | Application Modernization Development
PDF
Advanced methodologies resolving dimensionality complications for autism neur...
PDF
Advanced IT Governance
Empathic Computing: Creating Shared Understanding
Dropbox Q2 2025 Financial Results & Investor Presentation
Build a system with the filesystem maintained by OSTree @ COSCUP 2025
MYSQL Presentation for SQL database connectivity
NewMind AI Weekly Chronicles - August'25 Week I
Peak of Data & AI Encore- AI for Metadata and Smarter Workflows
Unlocking AI with Model Context Protocol (MCP)
Optimiser vos workloads AI/ML sur Amazon EC2 et AWS Graviton
“AI and Expert System Decision Support & Business Intelligence Systems”
Teaching material agriculture food technology
Spectral efficient network and resource selection model in 5G networks
Advanced Soft Computing BINUS July 2025.pdf
How UI/UX Design Impacts User Retention in Mobile Apps.pdf
The AUB Centre for AI in Media Proposal.docx
Machine learning based COVID-19 study performance prediction
Reach Out and Touch Someone: Haptics and Empathic Computing
Review of recent advances in non-invasive hemoglobin estimation
KodekX | Application Modernization Development
Advanced methodologies resolving dimensionality complications for autism neur...
Advanced IT Governance

Softwae and database in data communication network

  • 1. Fundamental software – Chapter 24 NOSQL Databases and Big Data Storage Systems prepared by : Ayoub S. Al Sahafi - Ref. No. : CL186 Supervisor: Assistance Professor Dr. Mahmoud Alkhasawneh Al-Madinah International University Faculty of computer and information Technology
  • 2. 24.1 Introduction to NOSQL Systems; 24.1.1 Emergence of NOSQL Systems; 24.1.2 Characteristics of NOSQL Systems; 24.1.3 Categories of NOSQL Systems; 24.2 The CAP Theorem; 24.3 Document-Based NOSQL Systems and MongoDB
  • 3. 24.3.1 MongoDB Data Model; 24.3.2 MongoDB CRUD Operations; 24.3.3 MongoDB Distributed Systems Characteristics; 24.4 NOSQL Key-Value Stores; 24.4.1 DynamoDB Overview; 24.4.2 Voldemort Key-Value Distributed Data Store
  • 4. 24.4.3 Examples of Other Key-Value Stores; 24.5 Column-Based or Wide Column NOSQL Systems; 24.5.1 Hbase Data Model and Versioning; 24.5.2 Hbase CRUD Operations; 24.5.3 Hbase Storage and Distributed System Concepts; 24.6 NOSQL Graph Databases and Neo4j; 24.6.1 Neo4j Data Model; 24.6.2 The Cypher Query Language of Neo4j; 24.6.3 Neo4j Interfaces and Distributed System Characteristics; 24.7 Summary
  • 5. Introduction
      In this chapter we will describe the following:
      ▪ An introduction to NOSQL systems, their characteristics, and how they differ from SQL systems.
      ▪ The four general categories of NOSQL systems.
      ▪ How NOSQL systems approach the issue of consistency among multiple replicas (copies) by using the paradigm known as eventual consistency.
      ▪ The CAP theorem.
      ▪ An overview of each category of NOSQL systems.
  • 6. Introduction
      ▪ NOSQL systems were developed to manage large amounts of data in organizations such as Google, Amazon, Facebook, and Twitter, and in applications such as social media, Web links, and e-mail.
      ▪ NoSQL is an approach to database design that can accommodate a wide variety of data models, including key-value, document, columnar, and graph formats.
  • 7. Introduction
      ▪ NoSQL, which stands for "Not only SQL," is an alternative to traditional relational databases, in which data is placed in tables and the data schema is carefully designed before the database is built. NoSQL databases are especially useful for working with large sets of distributed data.
      ▪ Most NOSQL systems are distributed databases or distributed storage systems, with a focus on semistructured data storage, high performance, availability, data replication, and scalability, as opposed to an emphasis on immediate data consistency, powerful query languages, and structured data storage.
  • 8. Terminology
      No | Term | Definition
      1 | NOSQL | Not Only SQL; distributed databases and storage systems for big data that complement traditional relational SQL systems.
      2 | CAP | Consistency, Availability, Partition tolerance.
      3 | CRUD | Create, Read, Update, Delete.
      4 | BigTable | A compressed, high-performance, proprietary data storage system built on the Google File System.
      5 | DynamoDB | A hosted NoSQL database offered by Amazon Web Services (AWS).
      6 | Cassandra | A wide-column store based on ideas of BigTable and DynamoDB.
      7 | Availability | The condition wherein a given resource can be accessed by its consumers.
      8 | Scalability | The ability of a database to handle changing demands by adding or removing resources.
      9 | Graph-based | A graph database (GDB) is a database that uses graph structures for semantic queries, with nodes, edges, and properties to represent and store data.
      10 | Hbase | A column-oriented non-relational database management system that runs on top of the Hadoop Distributed File System (HDFS).
  • 9. Terminology
      No | Term | Definition
      1 | Object databases | A database management system in which information is represented in the form of objects, as used in object-oriented programming.
      2 | XML databases | A data persistence software system that allows data to be specified, and sometimes stored, in XML format.
      3 | Document-based (document-oriented) | A document-oriented database, or document store, is a computer program designed for storing, retrieving, and managing document-oriented information, also known as semi-structured data.
      4 | Object-relational | An object-relational database (ORD), or object-relational database management system (ORDBMS), is a database management system (DBMS) similar to a relational database, but with an object-oriented database model.
      5 | Hybrid NOSQL systems | Systems that combine relational and NoSQL features.
  • 10. Terminology
      No | Term | Definition
      1 | JSON | JavaScript Object Notation.
      2 | Range partitioning | A type of database partitioning wherein the partition is based on a predefined range for a specific data field, such as uniquely numbered IDs, dates, or simple values like currency.
      3 | Hash partitioning | A partitioning technique where a hash key is used to distribute rows evenly across the different partitions (sub-tables).
      4 | Primary key | A field in a table that uniquely identifies each row/record in a database table.
  • 11. 24.1.1 Emergence of NOSQL Systems
      ▪ Many companies and organizations are faced with applications that store vast amounts of data.
      ▪ Millions of web application users submit posts, many with images and videos.
      ▪ User profiles, user relationships, and posts must all be stored in a huge collection of data stores.
      ▪ Some of the data for this type of application is not suitable for a traditional relational system, so such applications typically need multiple types of databases and data storage systems.
  • 12. 24.1.1 Emergence of NOSQL Systems
      ▪ For this reason, some organizations decided to develop their own systems:
      ▪ Google developed a proprietary NOSQL system known as BigTable; Apache Hbase is an open source system somewhat similar to it.
      ▪ Amazon developed a NOSQL system called DynamoDB.
      ▪ Facebook developed a NOSQL system called Cassandra.
  • 13. 24.1.2 Characteristics of NOSQL Systems
      ▪ We divide the characteristics into two categories:
        a) those related to distributed databases and distributed systems:
           1. Availability.
           2. Scalability.
           3. Replication models.
           4. Sharding of files.
           5. High-performance data access.
        b) those related to data models and query languages.
      As we show in the following diagram:
  • 14. 24.1.2 Characteristics of NOSQL Systems
      [Diagram: NOSQL characteristics in two categories. (a) Related to distributed databases and distributed systems: scalability (horizontal, vertical), availability, replication models (master-slave, master-master), sharding of files, and high-performance data access (techniques such as hashing of object keys). (b) Related to data models and query languages.]
  • 15. 24.1.3 Categories of NOSQL Systems
      ▪ NOSQL systems have been characterized into four major categories:
        1. Document-based NOSQL systems.
        2. NOSQL key-value stores.
        3. Column-based or wide column NOSQL systems.
        4. Graph-based NOSQL systems.
      Additional categories can be added to include some systems that are not easily categorized into these four categories:
        5. Hybrid NOSQL systems.
        6. Object databases.
        7. XML databases.
  • 16. 24.2 The CAP Theorem
      ▪ The CAP theorem can be used to explain some competing requirements in a distributed system with replication. The three letters in CAP refer to three desirable properties of distributed systems with replicated data:
        C: consistency (among replicated copies)
        A: availability (of the system for read and write operations)
        P: partition tolerance (in the face of the nodes in the system being partitioned by a network fault)
  • 17. 24.2 The CAP Theorem
      ▪ Consistency: all clients see the same view of the data, even right after an update or delete.
      ▪ Availability: all clients can find a replica of the data, even in case of partial node failures.
      ▪ Partition tolerance: the system continues to work, even in the presence of a partial network failure.
      ▪ The theorem states that a distributed system with replicated data cannot guarantee all three properties at the same time.
  • 18. 24.3 Document-Based NoSQL Systems and MongoDB
      ▪ Document-based or document-oriented NoSQL systems store data as collections of similar documents. These types of systems are also sometimes known as document stores.
      ▪ A major difference between document-based systems and object, object-relational, and XML systems is that there is no requirement to specify a schema; rather, the documents are specified as self-describing data.
  • 19. 24.3.1 MongoDB Data Model
      ▪ MongoDB documents are stored in BSON (Binary JSON) format, which is a variation of JSON with some additional data types and is more efficient for storage than JSON.
      ▪ Individual documents are stored in a collection. As a simple example, consider the COMPANY database. The following command can be used to create a collection called project to hold PROJECT objects from the COMPANY database:
        db.createCollection("project", { capped : true, size : 1310720, max : 500 } )
      * The collection is capped; this means it has upper limits on its storage space (size) and number of documents (max).
  • 20. 24.3.2 MongoDB CRUD Operations
      ▪ MongoDB has several CRUD operations, where CRUD stands for create, read, update, delete.
      ▪ Documents can be created and inserted into their collections using the insert operation, whose format is:
        db.<collection_name>.insert(<document(s)>)
      ▪ The delete operation is called remove, and the format is:
        db.<collection_name>.remove(<condition>)
      ▪ For read queries, the main command is called find, and the format is:
        db.<collection_name>.find(<condition>)
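The three operation formats above can be pictured with a small in-memory model. This is only a Python sketch of the semantics, not the MongoDB shell or a MongoDB driver: the class name ToyCollection, the generated _id counter, and the equality-only conditions are assumptions made for illustration, and the Plocation field is a hypothetical attribute.

```python
class ToyCollection:
    """In-memory stand-in for a document collection (illustrative only)."""

    def __init__(self):
        self._docs = []
        self._next_id = 1  # assumption: a simple counter plays the role of _id

    def insert(self, *documents):
        # Store a copy of each document, adding an _id if none was supplied.
        for doc in documents:
            stored = dict(doc)
            stored.setdefault("_id", self._next_id)
            self._next_id += 1
            self._docs.append(stored)

    @staticmethod
    def _matches(doc, condition):
        # Equality match on every field named in the condition.
        return all(doc.get(k) == v for k, v in condition.items())

    def find(self, condition=None):
        condition = condition or {}
        return [dict(d) for d in self._docs if self._matches(d, condition)]

    def remove(self, condition):
        # Delete matching documents and report how many were removed.
        before = len(self._docs)
        self._docs = [d for d in self._docs if not self._matches(d, condition)]
        return before - len(self._docs)


project = ToyCollection()
project.insert({"Pname": "ProductX", "Plocation": "Bellaire"},
               {"Pname": "ProductY", "Plocation": "Sugarland"})
found = project.find({"Plocation": "Bellaire"})
removed = project.remove({"Pname": "ProductY"})
```

Documents in the same collection need not share the same fields, which is the self-describing property mentioned earlier; the sketch imposes no schema for that reason.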
  • 21. 24.3.3 MongoDB Distributed Systems Characteristics
      ▪ The concept of a replica set is used in MongoDB to create multiple copies of the same data set on different nodes in the distributed system, and it uses a variation of the master-slave approach for replication.
      ▪ For example, suppose that we want to replicate a particular document collection C. A replica set will have one primary copy of the collection C stored in one node N1, and at least one secondary copy (replica) of C stored at another node N2. Additional copies can be stored in nodes N3, N4, etc.
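The primary/secondary arrangement in the example can be sketched as follows. This is a simplified Python model, not MongoDB code: the Node and ReplicaSet classes are hypothetical, and it assumes that every write is applied at the primary first and then copied to the secondaries synchronously, which a real system need not do.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.collection = {}  # this node's copy of collection C: doc id -> document


class ReplicaSet:
    def __init__(self, primary, secondaries):
        self.primary = primary
        self.secondaries = list(secondaries)

    def write(self, doc_id, document):
        # Assumption: writes are applied at the primary copy first...
        self.primary.collection[doc_id] = dict(document)
        # ...and then propagated to every secondary copy (synchronously here).
        for node in self.secondaries:
            node.collection[doc_id] = dict(document)

    def read(self, doc_id, node=None):
        # Read from the given node, or from the primary by default.
        node = node or self.primary
        return node.collection.get(doc_id)


n1, n2, n3 = Node("N1"), Node("N2"), Node("N3")
rs = ReplicaSet(primary=n1, secondaries=[n2, n3])
rs.write(1, {"Pname": "ProductX"})
```

If propagation were delayed instead of synchronous, a read from N2 or N3 could briefly return older data, which is the situation that eventual consistency describes.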
  • 22. 24.3.3 MongoDB Distributed Systems Characteristics
      ▪ There are two ways to partition a collection into shards in MongoDB:
        1. range partitioning,
        2. and hash partitioning.
      ▪ The partitioning field, known as the shard key in MongoDB, must have two characteristics:
        1. it must exist in every document in the collection,
        2. and it must have an index.
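The difference between the two partitioning methods can be shown with a short Python sketch. It does not reproduce how MongoDB assigns documents to shards: the function names, the boundary values, and the choice of MD5 as the hash are assumptions for illustration.

```python
import bisect
import hashlib


def range_shard(key, boundaries):
    """Shard index for key, given sorted boundaries between key ranges."""
    # Keys below boundaries[0] go to shard 0, keys from boundaries[0]
    # up to (not including) boundaries[1] go to shard 1, and so on.
    return bisect.bisect_right(boundaries, key)


def hash_shard(key, num_shards):
    """Shard index derived from a stable hash of the key."""
    digest = hashlib.md5(str(key).encode("utf-8")).hexdigest()
    return int(digest, 16) % num_shards


# Hypothetical boundaries: shard 0 holds keys < 100, shard 1 holds 100..199, shard 2 the rest.
boundaries = [100, 200]
by_range = [range_shard(k, boundaries) for k in (99, 100, 150, 250)]
by_hash = [hash_shard(k, 3) for k in (99, 100, 150, 250)]
```

Range partitioning keeps neighboring key values on the same shard, which suits range queries; hash partitioning spreads keys without regard to their order, which suits lookups by exact key value.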
  • 23. 24.4 NoSQL Key-Value Stores
      ▪ Key-value stores focus on high performance, availability, and scalability by storing data in a distributed storage system.
      ▪ The main characteristic of key-value stores is the fact that every value (data item) must be associated with a unique key, and that retrieving the value by supplying the key must be very fast.
  • 24. 24.4.1 DynamoDB Overview
      ▪ The DynamoDB system is an Amazon product and is available as part of Amazon's AWS/SDK platforms (Amazon Web Services/Software Development Kit).
      ▪ The basic data model in DynamoDB uses the concepts of tables, items, and attributes.
      ▪ When a table is created, it is required to specify a table name and a primary key.
      ▪ The primary key can be one of the following two types:
        1. A single attribute.
        2. A pair of attributes.
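The two kinds of primary key can be illustrated with a toy table in Python. This is not the AWS SDK: the KeyedTable class and its methods are hypothetical, and the WORKS_ON table with Empid, Pno, and Hours attributes is an assumed example.

```python
class KeyedTable:
    """Toy table whose items are identified by a one- or two-attribute primary key."""

    def __init__(self, name, key_attributes):
        if len(key_attributes) not in (1, 2):
            raise ValueError("primary key must be one attribute or a pair of attributes")
        self.name = name
        self.key_attributes = tuple(key_attributes)
        self.items = {}  # primary key tuple -> item (a dict of attributes)

    def put_item(self, item):
        # The key is built from the item's own attribute values.
        key = tuple(item[a] for a in self.key_attributes)
        self.items[key] = dict(item)

    def get_item(self, **key):
        return self.items.get(tuple(key[a] for a in self.key_attributes))


# Single-attribute primary key.
employee = KeyedTable("EMPLOYEE", ["Empid"])
employee.put_item({"Empid": "1", "Lname": "Smith"})

# Primary key made of a pair of attributes.
works_on = KeyedTable("WORKS_ON", ["Empid", "Pno"])
works_on.put_item({"Empid": "1", "Pno": "1", "Hours": "32.5"})
works_on.put_item({"Empid": "1", "Pno": "2", "Hours": "7.5"})
```

With a pair of attributes, several items can share the first attribute value as long as the second differs, as the two WORKS_ON items show.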
  • 25. 24.4.2 Voldemort Key-Value Distributed Data Store
      ▪ Voldemort is an open source system available through Apache 2.0 open source licensing rules. It is based on Amazon's DynamoDB.
      ▪ Some of the features of Voldemort are as follows:
        1. Simple basic operations. Example:
           ◦ s.put(k, v) inserts an item as a key-value pair with key k and value v.
           ◦ s.delete(k) deletes the item whose key is k from the store.
           ◦ v = s.get(k) retrieves the value v associated with key k.
        2. High-level formatted data values. The values v in the (k, v) items can be specified in JSON (JavaScript Object Notation), and the system will convert between JSON and the internal storage format.
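The three operations and the JSON conversion can be sketched in a few lines of Python. This is a toy model and not the Voldemort client API: the ToyStore class is hypothetical, and UTF-8 encoded JSON text stands in for the internal storage format.

```python
import json


class ToyStore:
    """Minimal key-value store holding JSON-serializable values."""

    def __init__(self):
        self._data = {}  # key -> bytes (the assumed internal storage format)

    def put(self, k, v):
        # Convert the value from its JSON-style form to the stored bytes.
        self._data[k] = json.dumps(v).encode("utf-8")

    def get(self, k):
        # Convert the stored bytes back, or return None for a missing key.
        raw = self._data.get(k)
        return None if raw is None else json.loads(raw.decode("utf-8"))

    def delete(self, k):
        self._data.pop(k, None)


s = ToyStore()
s.put("emp:1", {"Fname": "John", "Lname": "Smith"})
v = s.get("emp:1")
s.delete("emp:1")
```

The store never inspects the value; all access goes through the key, which is what makes retrieval by key fast and queries on value contents unavailable in this simple form.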
  • 26. 24.4.3 Examples of Other Key-Value Stores
      ▪ Oracle key-value store. Oracle has one of the well-known SQL relational database systems, and Oracle also offers a system based on the key-value store concept; this system is called the Oracle NoSQL Database.
      24.5 Column-Based or Wide Column NOSQL Systems
      ▪ Another category of NOSQL systems is known as column-based or wide column systems.
      ▪ The Google distributed storage system for big data, known as BigTable, is a well-known example of this class of NOSQL systems, and it is used in many Google applications that require large amounts of data storage, such as Gmail.
      ▪ BigTable uses the Google File System (GFS) for data storage and distribution. An open source system known as Apache Hbase is somewhat similar to Google BigTable, but it typically uses HDFS (Hadoop Distributed File System) for data storage.
  • 27. 24.5.1 Hbase Data Model and Versioning
      ▪ The data model in Hbase organizes data using the concepts of namespaces, tables, column families, column qualifiers, columns, rows, and data cells.
      ▪ A column is identified by a combination of (column family:column qualifier).
      ▪ Data is stored in a self-describing form by associating columns with data values, where data values are strings.
      Example: creating a table called EMPLOYEE with three column families (Name, Address, and Details):
        create 'EMPLOYEE', 'Name', 'Address', 'Details'
      Some basic Hbase CRUD operations:
        Creating a table: create <tablename>, <column family>, <column family>, …
        Inserting data: put <tablename>, <rowid>, <column family>:<column qualifier>, <value>
        Reading data (all data in a table): scan <tablename>
        Retrieving data (one item): get <tablename>, <rowid>
  • 28. 24.5.1 Hbase Data Model and Versioning
      Example: inserting some row data in the EMPLOYEE table:
        put 'EMPLOYEE', 'row1', 'Name:Fname', 'Ahmad'
        put 'EMPLOYEE', 'row1', 'Name:Lname', 'Ali'
        put 'EMPLOYEE', 'row1', 'Name:Nickname', 'Khalidi'
        put 'EMPLOYEE', 'row1', 'Details:Job', 'Engineer'
        put 'EMPLOYEE', 'row1', 'Details:Review', 'Good'
        put 'EMPLOYEE', 'row2', 'Name:Fname', 'Sara'
        put 'EMPLOYEE', 'row2', 'Name:Lname', 'Rami'
        put 'EMPLOYEE', 'row2', 'Name:MName', 'S'
        put 'EMPLOYEE', 'row2', 'Details:Job', 'IT'
        put 'EMPLOYEE', 'row2', 'Details:Supervisor', 'Sead Noor'
        put 'EMPLOYEE', 'row3', 'Name:Fname', 'Hasan'
        put 'EMPLOYEE', 'row3', 'Name:Minit', 'E'
        put 'EMPLOYEE', 'row3', 'Name:Lname', 'Mohammad'
        put 'EMPLOYEE', 'row3', 'Name:Suffix', 'Mr.'
        put 'EMPLOYEE', 'row3', 'Details:Job', 'CEO'
        put 'EMPLOYEE', 'row3', 'Details:Salary', '1,000,000'
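The structure these put commands build up can be modeled in Python as nested maps. This is a sketch of the data model only, not the Hbase shell or client library: the ToyTable class is hypothetical, the logical counter used as a timestamp is an assumption, and the second value written to Details:Job ('Manager') is invented to show a newer version of a cell.

```python
import itertools


class ToyTable:
    """Rows map 'family:qualifier' columns to a list of (timestamp, value) versions."""

    def __init__(self, name, *families):
        self.name = name
        self.families = set(families)      # column families are fixed at creation
        self.rows = {}                     # rowid -> {column: [(ts, value), ...]}
        self._clock = itertools.count(1)   # assumption: logical timestamps

    def put(self, rowid, column, value):
        family = column.split(":", 1)[0]
        if family not in self.families:
            raise KeyError(f"unknown column family: {family}")
        # Column qualifiers are created on the fly; each put adds a version.
        cell = self.rows.setdefault(rowid, {}).setdefault(column, [])
        cell.append((next(self._clock), value))

    def get(self, rowid):
        # Latest version of every column in one row.
        return {col: versions[-1][1]
                for col, versions in self.rows.get(rowid, {}).items()}

    def scan(self):
        # All rows, in lexicographic row-key order.
        return [(rowid, self.get(rowid)) for rowid in sorted(self.rows)]


employee = ToyTable("EMPLOYEE", "Name", "Address", "Details")
employee.put("row1", "Name:Fname", "Ahmad")
employee.put("row1", "Details:Job", "Engineer")
employee.put("row2", "Name:Fname", "Sara")
employee.put("row1", "Details:Job", "Manager")  # hypothetical update: a newer version
```

Rows in the same table can carry different column qualifiers (Nickname, MName, Suffix in the example above), while the set of column families stays fixed; the sketch enforces only the latter.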
  • 29. 24.5.2 Hbase CRUD Operations
      ▪ Hbase has low-level CRUD (create, read, update, delete) operations, as in many of the NoSQL systems.
      24.5.3 Hbase Storage and Distributed System Concepts
      ▪ Each Hbase table is divided into a number of regions, where each region will hold a range of the row keys in the table; this is why the row keys must be lexicographically ordered.
      ▪ Hbase uses the Apache Zookeeper open source system for services related to managing the naming, distribution, and synchronization of the Hbase data on the distributed Hbase server nodes, as well as for coordination and replication services.
      ▪ Hbase also uses Apache HDFS (Hadoop Distributed File System) for distributed file services.
  • 30. 24.6 NOSQL Graph Databases and Neo4j
      ▪ Another category of NOSQL systems is known as graph databases or graph-oriented NOSQL systems.
      ▪ The data is represented as a graph, which is a collection of vertices (nodes) and edges.
      ▪ Both nodes and edges can be labeled to indicate the types of entities and relationships they represent, and it is generally possible to store data associated with both individual nodes and individual edges.
  • 31. 24.6.1 Neo4j Data Model
      ▪ The data model in Neo4j organizes data using the concepts of nodes and relationships.
      ▪ Both nodes and relationships can have properties, which store the data items associated with nodes and relationships.
      ▪ Nodes can have labels; the nodes that have the same label are grouped into a collection that identifies a subset of the nodes in the database graph for querying purposes.
      ▪ Each relationship has a start node and an end node as well as a relationship type, which serves a similar role to a node label by identifying similar relationships that have the same relationship type.
      ▪ In conventional graph theory, nodes and relationships are generally called vertices and edges. The Neo4j graph data model somewhat resembles how data is represented in the ER and EER models.
      ▪ Properties can be specified via a map pattern, which is made of one or more "name : value" pairs enclosed in curly brackets; for example {Lname : 'Smith', Fname : 'John', Minit : 'B'}.
  • 32. 24.6.1 Neo4j Data Model
      Labels and properties:
      ▪ When a node is created, the node label can be specified. It is also possible to create nodes without any labels.
      Indexing and node identifiers:
      ▪ When a node is created, the Neo4j system creates an internal unique system-defined identifier for each node.
      ▪ To retrieve nodes efficiently by other properties, indexes can be created on the nodes that share a label. For example, Empid can be used to index nodes with the EMPLOYEE label, Dno to index the nodes with the DEPARTMENT label, and Pno to index the nodes with the PROJECT label.
  • 33. 24.6.2 The Cypher Query Language of Neo4j
      ▪ Neo4j has a high-level query language, Cypher.
      ▪ There are declarative commands for creating nodes and relationships, as well as for finding nodes and relationships based on specifying patterns.
      ▪ Deletion and modification of data is also possible in Cypher.
      Examples in Neo4j using the Cypher language: creating some nodes for the COMPANY data
        CREATE (e1:EMPLOYEE {Empid: '1', Lname: 'Smith', Fname: 'John', Minit: 'B'})
        CREATE (e2:EMPLOYEE {Empid: '2', Lname: 'Wong', Fname: 'Franklin'})
        CREATE (e3:EMPLOYEE {Empid: '3', Lname: 'Zelaya', Fname: 'Alicia'})
        CREATE (e4:EMPLOYEE {Empid: '4', Lname: 'Wallace', Fname: 'Jennifer', Minit: 'S'})
        ...
        CREATE (d1:DEPARTMENT {Dno: '5', Dname: 'Research'})
        CREATE (d2:DEPARTMENT {Dno: '4', Dname: 'Administration'})
        ...
        CREATE (p1:PROJECT {Pno: '1', Pname: 'ProductX'})
        CREATE (p2:PROJECT {Pno: '2', Pname: 'ProductY'})
        CREATE (p3:PROJECT {Pno: '10', Pname: 'Computerization'})
        CREATE (p4:PROJECT {Pno: '20', Pname: 'Reorganization'})
        ...
        CREATE (loc1:LOCATION {Lname: 'Houston'})
        CREATE (loc2:LOCATION {Lname: 'Stafford'})
        CREATE (loc3:LOCATION {Lname: 'Bellaire'})
        CREATE (loc4:LOCATION {Lname: 'Sugarland'})
        ...
  • 34. 24.6.2 The Cypher Query Language of Neo4j
      Examples in Neo4j using the Cypher language: creating some relationships for the COMPANY data
        CREATE (e1)-[:WorksFor]->(d1)
        CREATE (e3)-[:WorksFor]->(d2)
        ...
        CREATE (d1)-[:Manager]->(e2)
        CREATE (d2)-[:Manager]->(e4)
        ...
        CREATE (d1)-[:LocatedIn]->(loc1)
        CREATE (d1)-[:LocatedIn]->(loc3)
        CREATE (d1)-[:LocatedIn]->(loc4)
        CREATE (d2)-[:LocatedIn]->(loc2)
        ...
        CREATE (e1)-[:WorksOn {Hours: '32.5'}]->(p1)
        CREATE (e1)-[:WorksOn {Hours: '7.5'}]->(p2)
        CREATE (e2)-[:WorksOn {Hours: '10.0'}]->(p1)
        CREATE (e2)-[:WorksOn {Hours: '10.0'}]->(p2)
        CREATE (e2)-[:WorksOn {Hours: '10.0'}]->(p3)
        CREATE (e2)-[:WorksOn {Hours: '10.0'}]->(p4)
        ...
      Basic simplified syntax of some common Cypher clauses:
        Finding nodes and relationships that match a pattern: MATCH <pattern>
        Specifying aggregates and other query variables: WITH <specifications>
        Specifying conditions on the data to be retrieved: WHERE <condition>
        Specifying the data to be returned: RETURN <data>
        Ordering the data to be returned: ORDER BY <data>
        Limiting the number of returned data items: LIMIT <max number>
        Creating nodes: CREATE <node, optional labels and properties>
        Creating relationships: CREATE <relationship, relationship type and optional properties>
        Deletion: DELETE <nodes or relationships>
        Specifying property values and labels: SET <property values and labels>
        Removing property values and labels: REMOVE <property values and labels>
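To make the idea of pattern matching concrete, the following Python sketch models labeled nodes, typed relationships, and a single fixed pattern of the form (start label)-[relationship type]->(end label). It is not Neo4j or a Neo4j driver: the GNode, Rel, and ToyGraph names are hypothetical, and the matcher handles only this one pattern shape, which is far less than MATCH supports.

```python
from dataclasses import dataclass, field


@dataclass
class GNode:
    labels: set
    props: dict = field(default_factory=dict)


@dataclass
class Rel:
    start: GNode
    rel_type: str
    end: GNode
    props: dict = field(default_factory=dict)


class ToyGraph:
    def __init__(self):
        self.nodes, self.rels = [], []

    def create_node(self, label, **props):
        node = GNode({label}, props)
        self.nodes.append(node)
        return node

    def create_rel(self, start, rel_type, end, **props):
        rel = Rel(start, rel_type, end, props)
        self.rels.append(rel)
        return rel

    def match(self, start_label, rel_type, end_label):
        """Return (start, rel, end) triples for one directed pattern."""
        return [(r.start, r, r.end) for r in self.rels
                if r.rel_type == rel_type
                and start_label in r.start.labels
                and end_label in r.end.labels]


g = ToyGraph()
e1 = g.create_node("EMPLOYEE", Empid="1", Lname="Smith", Fname="John", Minit="B")
d1 = g.create_node("DEPARTMENT", Dno="5", Dname="Research")
p1 = g.create_node("PROJECT", Pno="1", Pname="ProductX")
g.create_rel(e1, "WorksFor", d1)
g.create_rel(e1, "WorksOn", p1, Hours="32.5")

result = [(s.props["Lname"], r.props["Hours"], e.props["Pname"])
          for s, r, e in g.match("EMPLOYEE", "WorksOn", "PROJECT")]
```

The last statement plays the role of a query that matches employees working on projects and returns the employee's last name, the hours, and the project name; in Cypher this would be expressed with MATCH and RETURN clauses rather than a list comprehension.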
  • 35. 24.6.3 Neo4j Interfaces and Distributed System Characteristics
      ▪ Neo4j has other interfaces that can be used to create, retrieve, and update nodes and relationships in a graph database.
      ▪ It also has two main versions:
        1. Enterprise edition.
        2. Community edition.
      ▪ Both editions support the Neo4j graph data model and storage system, as well as the Cypher graph query language and several other interfaces, including a high-performance native API and language drivers for several popular programming languages, such as Java, Python, and PHP.
      ▪ In addition, both editions support ACID properties.
  • 36. 24.7 Summary
      ▪ In this chapter, we discussed the class of database systems known as NOSQL systems, which focus on efficient storage and retrieval of large amounts of "big data." Applications that use these types of systems include social media, Web links, user profiles, marketing and sales, posts and tweets, road maps and spatial data, and e-mail.
      ▪ The term NOSQL is generally interpreted as Not Only SQL (rather than NO to SQL) and is meant to convey that many applications need systems other than traditional relational SQL systems to augment their data management needs.
      ▪ These systems are distributed databases or distributed storage systems, with a focus on semistructured data storage, high performance, availability, data replication, and scalability rather than an emphasis on immediate data consistency, powerful query languages, and structured data storage.
      ▪ We started with an introduction to NOSQL systems, their characteristics, and how they differ from SQL systems. The four general categories of NOSQL systems are document-based, key-value stores, column-based, and graph-based.
      ▪ We discussed how NOSQL systems approach the issue of consistency among multiple replicas (copies) by using the paradigm known as eventual consistency. We also discussed the CAP theorem, which can be used to understand the emphasis of NOSQL systems on availability.
  • 37. 24.7 Summary
      The four main categories of NOSQL systems:
        1. document-based systems
        2. key-value stores
        3. column-based systems
        4. graph-based systems
      ▪ We also noted that some NOSQL systems may not fall neatly into a single category but rather use techniques that span two or more categories.
  • 38. References
      1. Ramez Elmasri and Shamkant B. Navathe. Fundamentals of Database Systems, Seventh Edition.
      2. Peter W. Resnick. "Internet Message Format". tools.ietf.org. Retrieved 2018-10-02.
      3. "JSON Objects". www.w3schools.com. Retrieved 2018-10-02.