The openCypher project
Michael Hunger
Topics
• Property Graph Model
• Cypher - A language for querying graphs
• Cypher History
• Cypher Demo
• Current implementation in Neo4j
• User Feedback
• Opening up - The openCypher project
• Governance, Contribution Process
• Planned Deliverables
The Property-Graph-Model
You know it, right?
Labeled Property Graph Model Components
Nodes
• The objects in the graph
• Can have name-value properties
• Can be labeled
Relationships
• Relate nodes by type and direction
• Can have name-value properties
[Figure: example graph. Two PERSON nodes (name: “Dan”, born: May 29, 1970, twitter: “@dan”; name: “Ann”, born: Dec 5, 1975) connected by LOVES, LOVES and LIVES WITH relationships, and a CAR node (brand: “Volvo”, model: “V70”). One relationship carries the property since: Jan 10, 2011.]
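The components above can be sketched as plain data structures. This is a minimal illustration, not how Neo4j stores data: the class names, the `outgoing` helper and the direction of the LOVES relationship are assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """An object in the graph: labels plus name-value properties."""
    labels: frozenset = frozenset()
    properties: dict = field(default_factory=dict)


@dataclass
class Relationship:
    """Relates two nodes by type and direction; may carry properties too."""
    type: str
    start: Node
    end: Node
    properties: dict = field(default_factory=dict)


# Values taken from the slide; the LOVES direction is an assumption.
dan = Node(frozenset({"PERSON"}), {"name": "Dan", "twitter": "@dan"})
ann = Node(frozenset({"PERSON"}), {"name": "Ann"})
car = Node(frozenset({"CAR"}), {"brand": "Volvo", "model": "V70"})
relationships = [Relationship("LOVES", dan, ann)]


def outgoing(node, rels, rel_type):
    """Hypothetical helper: follow relationships of one type away from a node."""
    return [r.end for r in rels if r.start is node and r.type == rel_type]
```

Calling `outgoing(dan, relationships, "LOVES")` returns the node for Ann; direction matters, so the same call on `ann` returns an empty list.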
Relational Versus Graph Models
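[Figure: the same friendship data in both models. Relational Model: Person, Person-Friend and Person tables holding ANDREAS, DELIA, TOBIAS and MICA. Graph Model: nodes ANDREAS, TOBIAS, MICA and DELIA linked by KNOWS relationships.]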
Cypher Query Language
Why, How, When?
Why Yet Another Query Language (YAQL)?
• SQL and SPARQL hurt our brains
• Our brains crave patterns
• It's all about patterns
• Creating a query language is fun (and hard work)
What is Cypher?
• A graph query language that allows for expressive and efficient
querying of graph data
• Intuitive, powerful and easy to learn
• Write graph queries by describing patterns in your data
• Focus on your domain not the mechanics of data access.
• Designed to be a human-readable query language
• Suitable for developers and operations professionals
What is Cypher?
• Cypher is declarative, which means it lets users express what
data to retrieve
• The guiding principle behind Cypher is to make simple things
easy and complex things possible
• A humane query language
• Stolen from SQL (common keywords), SPARQL (pattern
matching), Python and Haskell (collection semantics)
Why Cypher?
Compared to:
• SPARQL (Cypher came from real-world use, not academia)
• Gremlin (declarative vs imperative)
• SQL (graph-specific vs set-specific)
(Cypher)-[:LOVES]->(ASCII Art)
A language should be readable, not just writable. You will read your code
dozens more times than you write it. Regular expressions, for example, are write-only.
Querying the Graph
Some Examples With Cypher
Basic Query: Who do people report to?
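MATCH (:Employee {firstName: "Steven"})-[:REPORTS_TO]->(:Employee {firstName: "Andrew"})
[Figure: the pattern annotated. Each parenthesised part is a NODE with a LABEL (Employee) and a PROPERTY (firstName); the arrow between them is the REPORTS_TO relationship from Steven to Andrew.]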
Basic Query Comparison: Who do people report to?
SQL:
SELECT *
FROM Employee AS e
JOIN Employee_Report AS er ON (e.id = er.manager_id)
JOIN Employee AS sub ON (er.sub_id = sub.id)
Cypher:
MATCH (e:Employee)-[:REPORTS_TO]->(mgr:Employee)
RETURN *
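To make the comparison concrete, the SQL side can be run against an in-memory SQLite database. This is a sketch under assumptions: the two-row dataset and the column list are invented for the example (the slide uses SELECT *), and the dictionary at the end only stands in for the graph's direct REPORTS_TO link.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Employee (id INTEGER PRIMARY KEY, firstName TEXT);
    CREATE TABLE Employee_Report (manager_id INTEGER, sub_id INTEGER);
    INSERT INTO Employee VALUES (1, 'Andrew'), (2, 'Steven');
    INSERT INTO Employee_Report VALUES (1, 2);  -- Steven reports to Andrew
""")

# Relational model: two joins through the Employee_Report link table.
rows = conn.execute("""
    SELECT sub.firstName, e.firstName
    FROM Employee AS e
    JOIN Employee_Report AS er ON (e.id = er.manager_id)
    JOIN Employee AS sub ON (er.sub_id = sub.id)
""").fetchall()

# Graph model, reduced to its essence: the relationship is stored directly,
# so "who reports to whom" is a lookup rather than a join.
reports_to = {"Steven": "Andrew"}
pairs = list(reports_to.items())
```

Both `rows` and `pairs` contain the single pair (Steven, Andrew); the difference lies in how many structures have to be combined to get there.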
Cypher Syntax
Only Tip of the Iceberg
Syntax: Patterns
( )-->( )
(node:Label {key:value})
(node1)-[rel:REL_TYPE {key:value}]->(node2)
(node1)-[:REL_TYPE1]->(node2)<-[:REL_TYPE2]-(node3)
(node1)-[:REL_TYPE*m..n]->(node2)
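The variable-length form `*m..n` in the last pattern can be read as "follow between m and n relationships of this type". A minimal sketch of that expansion over an edge list follows; the function name, the data and the reporting line from Robert to Steven are assumptions for illustration, and a real engine would plan this very differently.

```python
def expand(edges, start, rel_type, min_hops, max_hops):
    """Yield node paths from `start` that follow min_hops..max_hops edges.

    `edges` is a list of (source, type, target) triples. As in Cypher
    pattern matching, a relationship is not traversed twice in one path.
    """
    def walk(path, used):
        hops = len(path) - 1
        if hops >= min_hops:
            yield list(path)
        if hops == max_hops:
            return
        for i, (src, typ, dst) in enumerate(edges):
            if src == path[-1] and typ == rel_type and i not in used:
                yield from walk(path + [dst], used | {i})

    yield from walk([start], frozenset())


# Hypothetical reporting line: Robert -> Steven -> Andrew.
edges = [
    ("Robert", "REPORTS_TO", "Steven"),
    ("Steven", "REPORTS_TO", "Andrew"),
]
paths = list(expand(edges, "Robert", "REPORTS_TO", 1, 2))
```

With bounds 1..2 this yields the one-hop path to Steven and the two-hop path to Andrew; with a lower bound of 0 the start node alone is also a match.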
Patterns are used in
• (OPTIONAL) MATCH
• CREATE, MERGE
• shortestPath()
• Predicates
• Expressions
• (Comprehensions)
Syntax: Structure
(OPTIONAL) MATCH <patterns>
WHERE <predicates>
RETURN <expression> AS <name>
ORDER BY <expression>
SKIP <offset> LIMIT <size>
Syntax: Automatic Aggregation
MATCH <patterns>
RETURN <expr>, collect([distinct] <expression>) AS <name>,
count(*) AS freq
ORDER BY freq DESC
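"Automatic" here means there is no GROUP BY clause: the non-aggregating expressions in RETURN act as the grouping key. The sketch below imitates that behaviour in plain Python; the rows and names are made up, and the distinct values are sorted only to make the output deterministic.

```python
from collections import defaultdict

# Hypothetical rows produced by a MATCH: (employee, product) pairs.
rows = [("Ann", "Tea"), ("Ann", "Chocolade"), ("Ann", "Tea"), ("Dan", "Tea")]

groups = defaultdict(list)
for employee, product in rows:
    # The plain (non-aggregate) expression becomes the grouping key.
    groups[employee].append(product)

result = [
    {
        "employee": key,
        "products": sorted(set(values)),  # collect(DISTINCT product)
        "freq": len(values),              # count(*)
    }
    for key, values in groups.items()
]
result.sort(key=lambda row: row["freq"], reverse=True)  # ORDER BY freq DESC
```

Here `result` lists Ann first with three rows and two distinct products, then Dan with one.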
DataFlow: WITH
WITH <expression> AS <name>, ....
• controls data flow between query segments
• separates reads from writes
• can also
• aggregate
• sort
• paginate
• replacement for HAVING
• as many WITHs as you like
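One way to picture WITH is as the hand-over point between two pipeline stages: it aggregates, passes on only the named values, and lets the next stage filter on them, which is what HAVING does in SQL. A rough Python analogy, with invented rows and names:

```python
from collections import Counter

# Hypothetical (manager, report) rows from a first MATCH segment.
rows = [("Andrew", "Steven"), ("Andrew", "Nancy"), ("Steven", "Robert")]

# WITH manager, count(*) AS reports  -> aggregate, pass on only these names
with_rows = Counter(manager for manager, _ in rows).items()

# WHERE reports > 1                   -> filter on the aggregate (SQL's HAVING)
filtered = [(manager, n) for manager, n in with_rows if n > 1]

# ORDER BY reports DESC LIMIT 10      -> sort and paginate before moving on
page = sorted(filtered, key=lambda row: row[1], reverse=True)[:10]
```

Only Andrew survives the filter, because he is the only manager with more than one report in this toy dataset.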
Structure: Writes
CREATE <pattern>
MERGE <pattern> ON CREATE ... ON MATCH ...
(DETACH) DELETE <entity>
SET <property,label>
REMOVE <property,label>
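MERGE is the least obvious of these clauses: it matches the pattern if it exists and creates it otherwise, with ON CREATE and ON MATCH running in the respective case. The sketch below imitates this get-or-create behaviour for single nodes only; the dictionary layout, the callback style and the `visits` property are assumptions for the example.

```python
def merge(nodes, label, key, on_create=None, on_match=None):
    """Return the node with this label and key properties, creating it if absent."""
    for node in nodes:
        same_key = all(node["props"].get(k) == v for k, v in key.items())
        if label in node["labels"] and same_key:
            if on_match:
                on_match(node)       # ON MATCH SET ...
            return node
    node = {"labels": {label}, "props": dict(key)}
    if on_create:
        on_create(node)              # ON CREATE SET ...
    nodes.append(node)
    return node


def created(node):
    node["props"]["visits"] = 1


def matched(node):
    node["props"]["visits"] += 1


nodes = []
first = merge(nodes, "Person", {"name": "Ann"}, created, matched)
second = merge(nodes, "Person", {"name": "Ann"}, created, matched)
```

The second call finds the node created by the first, so the store still holds one node and its `visits` counter is 2.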
Data Import
[USING PERIODIC COMMIT <count>]
LOAD CSV [WITH HEADERS] FROM "URL" AS row
... any Cypher clauses, mostly match + updates ...
Collections
UNWIND (range(1,10) + [11,12,13]) AS x
WITH collect(x) AS coll
WHERE any(x IN coll WHERE x % 2 = 0)
RETURN size(coll), coll[0], coll[1..-1] ,
reduce(a = 0, x IN coll | a + x),
extract(x IN coll | x*x), filter(x IN coll WHERE x > 10),
[x IN coll WHERE x > 10 | x*x ]
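Since the collection functions borrow from Python, the query above translates almost line by line. The sketch below is a Python rendering for readers who want to check the expected values; note that Cypher's `range(1,10)` includes the upper bound, so the Python equivalent is `range(1, 11)`.

```python
from functools import reduce

coll = list(range(1, 11)) + [11, 12, 13]       # UNWIND ... collect(x) AS coll

has_even = any(x % 2 == 0 for x in coll)       # any(x IN coll WHERE x % 2 = 0)
size = len(coll)                               # size(coll)
first = coll[0]                                # coll[0]
middle = coll[1:-1]                            # coll[1..-1]
total = reduce(lambda a, x: a + x, coll, 0)    # reduce(a = 0, x IN coll | a + x)
squares = [x * x for x in coll]                # extract(x IN coll | x*x)
big = [x for x in coll if x > 10]              # filter(x IN coll WHERE x > 10)
big_squares = [x * x for x in coll if x > 10]  # [x IN coll WHERE x > 10 | x*x]
```

The last line shows where the list comprehension syntax comes from: filter and extract collapse into a single expression.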
Maps & Entities
WITH {age:42, name: "John", male:true} AS data
WHERE exists(data.name) AND data["age"] = 42
CREATE (n:Person) SET n += data
RETURN [k IN keys(n) WHERE k CONTAINS "a"
| {key: k, value: n[k] } ]
Optional Schema
CREATE INDEX ON :Label(property)
CREATE CONSTRAINT ON (n:Label) ASSERT n.property IS UNIQUE
CREATE CONSTRAINT ON (n:Label) ASSERT exists(n.property)
CREATE CONSTRAINT ON (:Label)-[r:REL]->(:Label2)
ASSERT exists(r.property)
And much more ...
neo4j.com/docs/stable/cypher-refcard
More Examples
Express Complex Queries Easily with Cypher
Find all direct reports and how many people they manage, each up to 3 levels down
Cypher Query
MATCH (sub)-[:REPORTS_TO*0..3]->(boss),
      (report)-[:REPORTS_TO*1..3]->(sub)
WHERE boss.firstName = 'Andrew'
RETURN sub.firstName AS Subordinate,
       count(report) AS Total;
SQL Query
[Figure: the SQL query shown on the slide is not part of this transcript.]
Who is in Robert’s (direct, upwards) reporting chain?
MATCH
path=(e:Employee)<-[:REPORTS_TO*]-(sub:Employee)
WHERE
sub.firstName = 'Robert'
RETURN
path;
Product Cross-Sell
MATCH
(choc:Product {productName: 'Chocolade'})
<-[:ORDERS]-(:Order)<-[:SOLD]-(employee),
(employee)-[:SOLD]->(o2)-[:ORDERS]->(other:Product)
RETURN
employee.firstName, other.productName, count(distinct o2) as count
ORDER BY
count DESC
LIMIT 5;
Neo4j‘s Cypher Implementation
History of Cypher
• 1.4 - Cypher initially added to Neo4j
• 1.6 - Cypher becomes part of REST API
• 1.7 - Collection functions, global search, pattern predicates
• 1.8 - Write operations
• 1.9 - Type system, traversal matcher, caches, string functions, more powerful WITH, laziness, profiling, execution plan
• 2.0 - Label support, label-based indexes and constraints, MERGE, transactional HTTP endpoint, literal maps, slices, new parser, OPTIONAL MATCH
• 2.1 - LOAD CSV, COST planner, reduced eagerness, UNWIND, versioning
• 2.2 - COST planner as default, EXPLAIN, PROFILE, visual query plan, IDP
• 2.3 -
Try it out!
APIs
• Embedded
• graphDb.execute(query, params);
• HTTP – transactional Cypher endpoint
• :POST /db/data/transaction[/commit] {statements:[{statement: "query",
parameters: params, resultDataContents:["row"], includeStats:true}, ...]}
• Bolt – binary protocol
• Driver driver = GraphDatabase.driver( "bolt://localhost" );
Session session = driver.session();
Result rs = session.run("CREATE (n) RETURN n");
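For the HTTP option, the request body is plain JSON, so any language can talk to the endpoint. The sketch below only builds the request and does not send it; the helper name, host and port are assumptions (7474 is Neo4j's default HTTP port), and the `{name}` parameter placeholder follows the Neo4j 2.x syntax of the time.

```python
import json
from urllib.request import Request


def build_commit_request(base_url, statement, parameters):
    """Build a POST for the transactional endpoint that commits immediately."""
    payload = {
        "statements": [{
            "statement": statement,
            "parameters": parameters,
            "resultDataContents": ["row"],
            "includeStats": True,
        }]
    }
    return Request(
        base_url + "/db/data/transaction/commit",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json", "Accept": "application/json"},
        method="POST",
    )


req = build_commit_request(
    "http://localhost:7474",  # assumed host and port
    "MATCH (e:Employee {firstName: {name}}) RETURN e",
    {"name": "Steven"},
)
```

Passing `req` to `urllib.request.urlopen` would execute the statement against a running server; keeping values in `parameters` instead of concatenating them into the query string lets the server reuse the query plan.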
Cypher Today - Neo4j Implementation
• Convert the input query into an abstract syntax tree (AST)
• Optimise and normalise the AST (alias expansion, constant folding, etc.)
• Create a query graph - a high-level, abstract representation of the query -
from the normalised AST
• Create a logical plan, consisting of logical operators, from the query graph,
using the statistics store to calculate the cost. The cheapest logical plan is
selected using IDP (iterative dynamic programming)
• Create an execution plan from the logical plan by choosing a physical
implementation for logical operators
• Execute the query
http://neo4j.com/blog/introducing-new-cypher-query-optimizer/
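To give a feel for the first two steps (parse into an AST, then rewrite it), here is a small constant-folding pass. It is only an analogy: it uses Python's own `ast` module on Python expressions as a stand-in for a Cypher parser, and it is not Neo4j's implementation. It needs Python 3.9 or later for `ast.unparse`.

```python
import ast
import operator

# Arithmetic operators this sketch knows how to evaluate at "compile time".
_OPS = {ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul}


def _is_number(node):
    return isinstance(node, ast.Constant) and isinstance(node.value, (int, float))


class ConstantFolder(ast.NodeTransformer):
    """Replace arithmetic on two numeric literals with its result."""

    def visit_BinOp(self, node):
        self.generic_visit(node)  # fold the children first
        if _is_number(node.left) and _is_number(node.right) and type(node.op) in _OPS:
            value = _OPS[type(node.op)](node.left.value, node.right.value)
            return ast.copy_location(ast.Constant(value=value), node)
        return node


def fold(expression):
    """Parse an expression, fold its constants and return the rewritten text."""
    tree = ast.parse(expression, mode="eval")
    folded = ast.fix_missing_locations(ConstantFolder().visit(tree))
    return ast.unparse(folded)
```

For example, `fold("x > 1 + 2 * 3")` rewrites the predicate to `x > 7`, so the arithmetic is done once during planning instead of once per row.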
Neo4j Query Planner
Cost-based query planner since Neo4j 2.2
• Uses database statistics to select the best plan
• Currently for read operations
• Query Plan Visualizer helps find
• Non-optimal queries
• Cartesian Product
• Missing Indexes, Global Scans
• Typos
• Massive Fan-Out
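The idea of "using statistics to select the best plan" can be illustrated with a toy cost model. The operator names are ones Neo4j shows in query plans; everything else (the function, the statistics layout, and the cost formula of one unit per row touched) is an assumption made for this sketch and far simpler than the real planner.

```python
def choose_plan(stats, label, has_index, selectivity):
    """Pick the cheapest way to find nodes of `label` matching one property value."""
    candidates = {
        # Touch every node in the store.
        "AllNodesScan": stats["nodes"],
        # Touch only the nodes carrying the label.
        "NodeByLabelScan": stats["labels"][label],
    }
    if has_index:
        # Touch only the fraction of labelled nodes the index returns.
        candidates["NodeIndexSeek"] = stats["labels"][label] * selectivity
    return min(candidates, key=candidates.get)


# Hypothetical statistics: one million nodes, five thousand of them employees.
stats = {"nodes": 1_000_000, "labels": {"Employee": 5_000}}
with_index = choose_plan(stats, "Employee", has_index=True, selectivity=0.01)
without_index = choose_plan(stats, "Employee", has_index=False, selectivity=0.01)
```

With an index the seek wins; without one the planner falls back to a label scan, which is the kind of "missing index" situation the plan visualizer makes visible.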
openCypher
An open graph query language
Why?
We love Cypher!
Our users love Cypher.
We want to make everyone happy by using it,
and to have Cypher run on their data(base).
We want to collaborate with community and industry partners to
create the best graph query language possible!
We love the love
Future of (open)Cypher
• Decouple the language from Neo4j
• Open up and make the language design process transparent
• Encourage use within databases, tools, highlighters, etc.
• Deliver language docs, tools and an implementation
• Governed by the Cypher Language Group (CLG)
CIP (Cypher Improvement Proposal)
• A CIP is a semi-formal specification
providing a rationale for new language
features and constructs
• Contributions are welcome:
submit either a CIP (as a pull request)
or a feature request (as an issue) at
the openCypher GitHub repository
• See "Resources" for
• accepted CIPs
• Contribution Process
• Template
github.com/opencypher/openCypher
CIP structure
• Sections include:
• motivation,
• background,
• proposal (including the
syntax and semantics),
• alternatives,
• interactions with existing
features,
• benefits,
• drawbacks
• Example of the
“STARTS WITH / ENDS
WITH / CONTAINS” CIP
Deliverables
✔ Improvement Process
✔ Governing Body
✔ Language grammar (Jan-2016)
Technology Compatibility Kit (TCK)
Cypher Reference Documentation
Cypher language specification
Reference implementation (under Apache 2.0)
Cypher style guide
Opening up the CLG
Cypher language specification
• EBNF Grammar
• Railroad diagrams
• Semantic specification
• Licensed under a Creative Commons license
Language Grammar (RELEASED Jan-30-2016)
…
Match = ['OPTIONAL', SP], 'MATCH', SP, Pattern, {Hint}, [Where] ;
Unwind = 'UNWIND', SP, Expression, SP, 'AS', SP, Variable ;
Merge = 'MERGE', SP, PatternPart, {SP, MergeAction} ;
MergeAction = ('ON', SP, 'MATCH', SP, SetClause)
| ('ON', SP, 'CREATE', SP, SetClause);
...
github.com/opencypher/openCypher/blob/master/grammar.ebnf
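To show how a grammar rule maps to code, here is a recognizer for the `Unwind` production above. It is a deliberate simplification and not the openCypher parser: `Expression` is treated as arbitrary text and `Variable` as a plain identifier, and the function name and result shape are invented for the example.

```python
import re

# Unwind = 'UNWIND', SP, Expression, SP, 'AS', SP, Variable ;
# Simplification: Expression is "any text", Variable is a plain identifier.
UNWIND = re.compile(
    r"^UNWIND\s+(?P<expression>.+?)\s+AS\s+(?P<variable>[A-Za-z_][A-Za-z0-9_]*)$",
    re.IGNORECASE,
)


def parse_unwind(clause):
    """Return a small AST-like dict for an UNWIND clause, or raise ValueError."""
    match = UNWIND.match(clause.strip())
    if match is None:
        raise ValueError(f"not an UNWIND clause: {clause!r}")
    return {"clause": "Unwind", **match.groupdict()}


node = parse_unwind("UNWIND (range(1,10) + [11,12,13]) AS x")
```

The result holds the expression text and the variable name `x`; a full parser would recurse into `Expression` using the remaining grammar rules.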
Technology Compatibility Kit (TCK)
● Validates a Cypher implementation
● Certifies that it complies with a given version of Cypher
● Based on a given dataset
● Executes a set of queries
● Verifies the expected outputs
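The steps above amount to a test harness: load a dataset, run a query, compare the result with the expected one. The sketch below shows that loop under stated assumptions; the scenario format, the function names and the one-query `toy_engine` are hypothetical and do not reflect the actual TCK format.

```python
def run_tck(execute, scenarios):
    """Run each scenario's query through `execute` and compare with expectations."""
    report = {}
    for name, scenario in scenarios.items():
        try:
            actual = execute(scenario["setup"], scenario["query"])
            report[name] = "pass" if actual == scenario["expected"] else "fail"
        except Exception as error:
            report[name] = f"error: {error}"
    return report


def toy_engine(dataset, query):
    """Stand-in for a real implementation: understands exactly one query."""
    if query == "MATCH (n) RETURN count(n) AS c":
        return [{"c": len(dataset)}]
    raise NotImplementedError(query)


scenarios = {
    "count all nodes": {
        "setup": ["a", "b", "c"],
        "query": "MATCH (n) RETURN count(n) AS c",
        "expected": [{"c": 3}],
    },
    "follow a relationship": {
        "setup": [],
        "query": "MATCH (n)-->(m) RETURN m",
        "expected": [],
    },
}
report = run_tck(toy_engine, scenarios)
```

The toy engine passes the first scenario and errors on the second, which is how a compliance report would expose the parts of the language an implementation does not yet cover.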
Cypher Reference Documentation
• Style Guide
• User documentation describing the use of Cypher
• Example datasets with queries
• Tutorials
• GraphGists
Style Guide
• Labels are CamelCase
• Properties and functions are lowerCamelCase
• Keywords and Relationship-Types are ALL_CAPS
• Patterns should be complete and left to right
• Put anchored nodes first
• .... to be released ...
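The naming rules lend themselves to automatic checking. A minimal sketch with regular expressions follows; the patterns are an approximation written for this example (for instance, an all-uppercase label still passes the CamelCase check) and are not part of any released style guide tooling.

```python
import re

RULES = {
    "label": re.compile(r"^[A-Z][A-Za-z0-9]*$"),                        # CamelCase
    "property": re.compile(r"^[a-z][A-Za-z0-9]*$"),                     # lowerCamelCase
    "relationship_type": re.compile(r"^[A-Z][A-Z0-9]*(_[A-Z0-9]+)*$"),  # ALL_CAPS
}


def follows_style(kind, name):
    """Return True if `name` matches the naming rule for its kind."""
    return bool(RULES[kind].match(name))


checks = {
    "Employee": follows_style("label", "Employee"),
    "firstName": follows_style("property", "firstName"),
    "REPORTS_TO": follows_style("relationship_type", "REPORTS_TO"),
    "reportsTo": follows_style("relationship_type", "reportsTo"),
}
```

The identifiers used throughout this deck (`Employee`, `firstName`, `REPORTS_TO`) pass; `reportsTo` as a relationship type does not.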
Reference implementation (ASL 2.0)
• A fully functional implementation of key parts of the stack
needed to support Cypher inside a platform or tool
• First deliverable: parser taking a Cypher statement and parsing
it into an AST (abstract syntax tree)
• Future deliverables:
• Rule-based query planner
• Query runtime
• Distributed under the Apache 2.0 license
• Can be used as an example or as an implementation foundation
The Cypher Language Group (CLG)
• The steering committee for language evolution
• Reviews feature requests and proposals (CIP)
• Caretakers of the language
• Focus on guiding principles
• Long term focus, no quick fixes & hacks
• Currently a group of Cypher authors, developers and users
• Publishes meeting minutes at opencypher.github.io/meeting-minutes/
“Graph processing is becoming an indispensable part of the modern big data stack. Neo4j’s Cypher
query language has greatly accelerated graph database adoption.
We are looking forward to bringing Cypher’s graph pattern matching capabilities into the Spark
stack, making it easier for masses to access query graph processing.”
- Ion Stoica, CEO & Founder Databricks
“Lots of software systems could be improved by using a graph datastore. One thing holding back the
category has been the lack of a widely supported, standard graph query language. We see the
appearance of openCypher as an important step towards the broader use of graphs across the
industry.”
- Rebecca Parsons, ThoughtWorks, CTO
Some people like it
And support openCypher
Resources
• http://www.opencypher.org/
• https://github.com/opencypher/openCypher
• https://github.com/opencypher/openCypher/blob/master/CONTRIBUTING.adoc
• https://github.com/opencypher/openCypher/tree/master/cip
• https://github.com/opencypher/openCypher/pulls
• http://groups.google.com/group/openCypher
• @openCypher
Please contribute
Feedback, Ideas, Proposals
Implementations
Thank You!
Questions?

The openCypher Project - An Open Graph Query Language

  • 2. Topics • Property Graph Model • Cypher - A language for querying graphs • Cypher History • Cypher Demo • Current implementation in Neo4j • User Feedback • Opening up - The openCypher project • Governance, Contribution Process • Planned Deliverables
  • 4. CAR name: “Dan” born: May 29, 1970 twitter: “@dan” name: “Ann” born: Dec 5, 1975 since: Jan 10, 2011 brand: “Volvo” model: “V70” Labeled Property Graph Model Components Nodes • The objects in the graph • Can have name-value properties • Can be labeled Relationships • Relate nodes by type and direction • Can have name-value properties LOVES LOVES LIVES WITH PERSON PERSON
  • 5. Relational Versus Graph Models Relational Model Graph Model KNOWS ANDREAS TOBIAS MICA DELIA Person PersonPerson-Friend ANDREAS DELIA TOBIAS MICA
  • 7. Why Yet Another Query Language (YAQL)? • SQL and SparQL hurt our brains • Our brains crave patterns • It‘s all about patterns • Creating a query language is fun (and hard work)
  • 8. What is Cypher? • A graph query language that allows for expressive and efficient querying of graph data • Intuitive, powerful and easy to learn • Write graph queries by describing patterns in your data • Focus on your domain not the mechanics of data access. • Designed to be a human-readable query language • Suitable for developers and operations professionals
  • 9. What is Cypher? • Cypher is declarative, which means it lets users express what data to retrieve • The guiding principle behind Cypher is to make simple things easy and complex things possible • A humane query language • Stolen from SQL (common keywords), SPARQL (pattern matching), Python and Haskell (collection semantics)
  • 10. Why Cypher? Compared to: • SPARQL (Cypher came from real-world use, not academia) • Gremlin (declarative vs imperative) • SQL (graph-specific vs set-specific) (Cypher)-[:LOVES]->(ASCII Art) A language should be readable, not just writable. You will read your code dozens more times than you write it. Regex for example are write-only.
  • 11. Querying the Graph Some Examples With Cypher
  • 12. Basic Query: Who do people report to? MATCH (:Employee {firstName:”Steven”} ) -[:REPORTS_TO]-> (:Employee {firstName:“Andrew”} ) REPORTS_TO Steven Andrew LABEL PROPERTY NODE NODE LABEL PROPERTY
  • 13. Basic Query Comparison: Who do people report to? SELECT * FROM Employee as e JOIN Employee_Report AS er ON (e.id = er.manager_id) JOIN Employee AS sub ON (er.sub_id = sub.id) MATCH (e:Employee)-[:REPORTS_TO]->(mgr:Employee) RETURN *
  • 14. Basic Query: Who do people report to?
  • 15. Basic Query: Who do people report to?
  • 16. Cypher Syntax Only Tip of the Iceberg
  • 17. Syntax: Patterns ( )-->( ) (node:Label {key:value}) (node1)-[rel:REL_TYPE {key:value}]->(node2) (node1)-[:REL_TYPE1]->(node2)<-[:REL_TYPE2]-(node3) (node1)-[:REL_TYPE*m..n]->(node2)
  • 18. Patterns are used in • (OPTIONAL) MATCH • CREATE, MERGE • shortestPath() • Predicates • Expressions • (Comprehensions)
  • 19. Syntax: Structure (OPTIONAL) MATCH <patterns> WHERE <predicates> RETURN <expression> AS <name> ORDER BY <expression> SKIP <offset> LIMIT <size>
  • 20. Syntax: Automatic Aggregation MATCH <patterns> RETURN <expr>, collect([distinct] <expression>) AS <name>, count(*) AS freq ORDER BY freq DESC
  • 21. DataFlow: WITH WITH <expression> AS <name>, .... • controls data flow between query segments • separates reads from writes • can also • aggregate • sort • paginate • replacement for HAVING • as many WITHs as you like
  • 22. Structure: Writes CREATE <pattern> MERGE <pattern> ON CREATE ... ON MATCH ... (DETACH) DELETE <entity> SET <property,label> REMOVE <property,label>
  • 23. Data Import [USING PERODIC COMMIT <count>] LOAD CSV [WITH HEADERS] FROM „URL“ AS row ... any Cypher clauses, mostly match + updates ...
  • 24. Collections UNWIND (range(1,10) + [11,12,13]) AS x WITH collect(x) AS coll WHERE any(x IN coll WHERE x % 2 = 0) RETURN size(coll), coll[0], coll[1..-1] , reduce(a = 0, x IN coll | a + x), extract(x IN coll | x*x), filter(x IN coll WHERE x > 10), [x IN coll WHERE x > 10 | x*x ]
  • 25. Maps & Entities WITH {age:42, name: „John“, male:true} as data WHERE exists(data.name) AND data[„age“] = 42 CREATE (n:Person) SET n += data RETURN [k in keys(n) WHERE k CONTAINS „a“ | {key: k, value: n[k] } ]
  • 26. Optional Schema CREATE INDEX ON :Label(property) CREATE CONSTRAINT ON (n:Label) ASSERT n.property IS UNIQUE CREATE CONSTRAINT ON (n:Label) ASSERT exists(n.property) CREATE CONSTRAINT ON (:Label)-[r:REL]->(:Label2) ASSERT exists(r.property)
  • 27. And much more ... neo4j.com/docs/stable/cypher-refcard
  • 29. MATCH (sub)-[:REPORTS_TO*0..3]->(boss), (report)-[:REPORTS_TO*1..3]->(sub) WHERE boss.firstName = 'Andrew' RETURN sub.firstName AS Subordinate, count(report) AS Total; Express Complex Queries Easily with Cypher Find all direct reports and how many people they manage, each up to 3 levels down Cypher Query SQL Query
  • 30. Who is in Robert’s (direct, upwards) reporting chain? MATCH path=(e:Employee)<-[:REPORTS_TO*]-(sub:Employee) WHERE sub.firstName = 'Robert' RETURN path;
  • 31. Who is in Robert’s (direct, upwards) reporting chain?
  • 32. Product Cross-Sell MATCH (choc:Product {productName: 'Chocolade'}) <-[:ORDERS]-(:Order)<-[:SOLD]-(employee), (employee)-[:SOLD]->(o2)-[:ORDERS]->(other:Product) RETURN employee.firstName, other.productName, count(distinct o2) as count ORDER BY count DESC LIMIT 5;
  • 35. History of Cypher • 1.4 - Cypher initially added to Neo4j • 1.6 - Cypher becomes part of REST API • 1.7 - Collection functions, global search, pattern predicates • 1.8 - Write operations • 1.9 Type System, Traversal Matcher, Caches, String functions, more powerful WITH, Lazyness, Profiling, Execution Plan • 2.0 Label support, label based indexes and constraints, MERGE, transactional HTTP endpoint, literal maps, slices, new parser, OPTIONAL MATCH • 2.1 – LOAD CSV, COST Planner, reduce eagerness, UNWIND, versioning • 2.2 – COST Planner default, EXPLAIN, PROFILE, vis. Query Plan, IDP • 2.3 -
  • 37. APIs • Embedded • graphDb.execute(query, params); • HTTP – transactional Cypher endpoint • :POST /db/data/transaction[/commit] {statements:[{statement: „query“, parameters: params, resultDataContents:[„row“], includeStats:true},....]} • Bolt – binary protocol • Driver driver = GraphDatabase.driver( "bolt://localhost" ); Session session = driver.session(); Result rs = session.run("CREATE (n) RETURN n");
  • 38. Cypher Today - Neo4j Implementation • Convert the input query into an abstract syntax tree (AST) • Optimise and normalise the AST (alias expansion, constant folding etc) • Create a query graph - a high-level, abstract representation of the query - from the normalised AST • Create a logical plan, consisting of logical operators, from the query graph, using the statistics store to calculate the cost. The cheapest logical plan is selected using IDP (iterative dynamic programming) • Create an execution plan from the logical plan by choosing a physical implementation for logical operators • Execute the query http://neo4j.com/blog/introducing-new-cypher-query-optimizer/
  • 39. Cypher Today - Neo4j Implementation
  • 40. Neo4j Query Planner Cost based Query Planner since Neo4j 2.2 • Uses database stats to select best plan • Currently for Read Operations • Query Plan Visualizer, finds • Non optimal queries • Cartesian Product • Missing Indexes, Global Scans • Typos • Massive Fan-Out
  • 41. openCypher An open graph query language
  • 42. Why ? We love Cypher! Our users love Cypher. We want to make everyone happy through using it. And have Cypher run on their data(base). We want to collaborate with community and industry partners to create the best graph query language possible!
  • 43. We love the love
  • 44. Future of (open)Cypher • Decouple the language from Neo4j • Open up and make the language design process transparent • Encourage use within of databases/tools/highlighters/etc • Delivery of language docs, tools and implementation • Governed by the Cypher Language Group (CLG)
  • 45. CIP (Cypher Improvement Proposal) • A CIP is a semi-formal specification providing a rationale for new language features and constructs • Contributions are welcome: submit either a CIP (as a pull request) or a feature request (as an issue) at the openCypher GitHub repository • See „Ressources“ for • accepted CIPs • Contribution Process • Template github.com/opencypher/openCypher
  • 46. CIP structure • Sections include: • motivation, • background, • proposal (including the syntax and semantics), • alternatives, • interactions with existing features, • benefits, • drawbacks • Example of the “STARTS WITH / ENDS WITH / CONTAINS” CIP
  • 47. Deliverables ✔ Improvement Process ✔ Governing Body ✔ Language grammar (Jan-2016) Technology Compatibility Kit (TCK) Cypher Reference Documentation Cypher language specification Reference implementation (under Apache 2.0) Cypher style guide Opening up the CLG
  • 48. Cypher language specification • EBNF Grammar • Railroad diagrams • Semantic specification • Licensed under a Creative Commons license
  • 49. Language Grammar (RELEASED Jan-30-2016) … Match = ['OPTIONAL', SP], 'MATCH', SP, Pattern, {Hint}, [Where] ; Unwind = 'UNWIND', SP, Expression, SP, 'AS', SP, Variable ; Merge = 'MERGE', SP, PatternPart, {SP, MergeAction} ; MergeAction = ('ON', SP, 'MATCH', SP, SetClause) | ('ON', SP, 'CREATE', SP, SetClause); ... github.com/opencypher/openCypher/blob/master/grammar.ebnf
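
To show how one of these rules reads in practice, here is a minimal recognizer for the `Unwind` rule only. It is a sketch under simplifying assumptions: `SP`, `Expression` and `Variable` are replaced by crude regular expressions rather than the real grammar rules, and `parse_unwind` is a hypothetical helper, not part of any openCypher tooling.

```python
import re

# Simplified stand-ins for the grammar's SP, Expression and Variable rules.
SP = r"\s+"
EXPRESSION = r"(?P<expr>.+?)"                  # assumption: any non-empty text
VARIABLE = r"(?P<var>[A-Za-z_][A-Za-z0-9_]*)"  # assumption: a plain identifier

# Unwind = 'UNWIND', SP, Expression, SP, 'AS', SP, Variable ;
UNWIND = re.compile(rf"UNWIND{SP}{EXPRESSION}{SP}AS{SP}{VARIABLE}", re.IGNORECASE)

def parse_unwind(clause: str):
    """Return a tiny AST-like dict for an UNWIND clause, or None if it does not match."""
    match = UNWIND.fullmatch(clause.strip())
    if match is None:
        return None
    return {"clause": "Unwind", "expression": match["expr"], "variable": match["var"]}
```

For example, `parse_unwind("UNWIND [1, 2, 3] AS x")` yields a dict whose expression is `[1, 2, 3]` and whose variable is `x`, while a clause without `AS` is rejected.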
  • 50. Technology Compatibility Kit (TCK) • Validates a Cypher implementation • Certifies that it complies with a given version of Cypher • Based on a given dataset • Executes a set of queries and • Verifies expected outputs
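
The idea of running a fixed set of queries and verifying the expected outputs can be sketched as a small test harness. This does not reflect the format of the actual TCK: the `Engine` interface, `run_suite`, `toy_engine` and the canned data are all hypothetical and exist only to illustrate the principle.

```python
from typing import Callable, Iterable, List, Tuple

Rows = List[dict]
Engine = Callable[[str], Rows]  # hypothetical interface: query text in, result rows out

def _normalise(rows: Rows) -> list:
    """Make rows order-insensitive so that only their content is compared."""
    return sorted(tuple(sorted(row.items())) for row in rows)

def run_suite(engine: Engine, cases: Iterable[Tuple[str, Rows]]) -> dict:
    """Execute each query against `engine` and compare with the expected rows."""
    passed, failures = 0, []
    for query, expected in cases:
        try:
            actual = engine(query)
        except Exception as exc:  # an unsupported query counts as a failure
            failures.append((query, f"raised {type(exc).__name__}"))
            continue
        if _normalise(actual) == _normalise(expected):
            passed += 1
        else:
            failures.append((query, "unexpected result"))
    return {"passed": passed, "failed": len(failures),
            "failures": failures, "compliant": not failures}

def toy_engine(query: str) -> Rows:
    """Toy implementation under test: answers one canned query over a fixed dataset."""
    canned = {"MATCH (p:Person) RETURN p.name AS name":
              [{"name": "Dan"}, {"name": "Ann"}]}
    return canned[query]
```

An implementation would count as compliant in this sketch only if every case passes; a single unsupported or wrongly answered query marks it as non-compliant.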
  • 51. Cypher Reference Documentation • Style Guide • User documentation describing the use of Cypher • Example datasets with queries • Tutorials • GraphGists
  • 52. Style Guide • Labels are CamelCase • Properties and functions are lowerCamelCase • Keywords and relationship types are ALL_CAPS • Patterns should be complete and left to right • Put anchored nodes first • .... to be released ...
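
The three naming rules above lend themselves to a mechanical check. The following is a rough sketch, assuming that CamelCase means capitalised words without separators and that ALL_CAPS allows underscores; the regular expressions and the `style_violations` helper are illustrative assumptions, not part of the style guide itself.

```python
import re

# Assumed interpretations of the naming rules on this slide.
LABEL = re.compile(r"^[A-Z][a-z0-9]+(?:[A-Z][a-z0-9]+)*$")     # CamelCase, e.g. Person
PROPERTY = re.compile(r"^[a-z][a-z0-9]*(?:[A-Z][a-z0-9]*)*$")  # lowerCamelCase, e.g. firstName
REL_TYPE = re.compile(r"^[A-Z][A-Z0-9]*(?:_[A-Z0-9]+)*$")      # ALL_CAPS, e.g. LIVES_WITH

def style_violations(labels, properties, rel_types):
    """Return (kind, name) pairs for every name that breaks its naming rule."""
    checks = [
        ("label", LABEL, labels),
        ("property", PROPERTY, properties),
        ("relationship type", REL_TYPE, rel_types),
    ]
    return [(kind, name)
            for kind, pattern, names in checks
            for name in names
            if not pattern.match(name)]
```

Under these assumptions, a query such as `MATCH (p:Person)-[:LIVES_WITH]->(q:Person) RETURN p.firstName` passes all three checks, whereas a label written as `PERSON` or a relationship type written as `livesWith` would be reported.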
  • 53. Reference implementation (ASL 2.0) • A fully functional implementation of key parts of the stack needed to support Cypher inside a platform or tool • First deliverable: parser taking a Cypher statement and parsing it into an AST (abstract syntax tree) • Future deliverables: • Rule-based query planner • Query runtime • Distributed under the Apache 2.0 license • Can be used as an example or as an implementation foundation
  • 54. The Cypher Language Group (CLG) • The steering committee for language evolution • Reviews feature requests and proposals (CIPs) • Caretakers of the language • Focus on guiding principles • Long-term focus, no quick fixes or hacks • Currently a group of Cypher authors, developers and users • Publishes meeting minutes at opencypher.github.io/meeting-minutes/
  • 55. Some people like it “Graph processing is becoming an indispensable part of the modern big data stack. Neo4j’s Cypher query language has greatly accelerated graph database adoption. We are looking forward to bringing Cypher’s graph pattern matching capabilities into the Spark stack, making it easier for masses to access query graph processing.” - Ion Stoica, CEO & Founder Databricks “Lots of software systems could be improved by using a graph datastore. One thing holding back the category has been the lack of a widely supported, standard graph query language. We see the appearance of openCypher as an important step towards the broader use of graphs across the industry.” - Rebecca Parsons, ThoughtWorks, CTO
  • 57. Resources • http://www.opencypher.org/ • https://github.com/opencypher/openCypher • https://github.com/opencypher/openCypher/blob/master/CONTRIBUTING.adoc • https://github.com/opencypher/openCypher/tree/master/cip • https://github.com/opencypher/openCypher/pulls • http://groups.google.com/group/openCypher • @openCypher
  • 58. Please contribute: Feedback, Ideas, Proposals, Implementations. Thank You! Questions?

Editor's Notes

  • #53: you are the first to see it. permissive license Apache / CC
  • #62: In the near future, many of your apps will be driven by data relationships and not transactions You can unlock value from business relationships with Neo4j