Nuxeo: from SQL to MongoDB
Florent Guillaume — Director of R&D, Nuxeo
2014-07-03
The Nuxeo Model
[Diagram: Nuxeo Platform; Document (metadata, blobs); Repository and BlobStore; store / read; cache; persistence engine (insert, update, select); backends: FS for blobs, SQL DB via VCS, MongoDB via DBS]
Nuxeo Core — Rich Documents
• Scalars
• Strings, Integers, Floats, Booleans, Dates
• Binary blobs (stored using separate BinaryStore service)
• Arrays of scalars
• Complex properties (sub-documents)
• Lists of complex properties
• System properties
• Id, type, facets, lifecycle state, ACL, version flags...
Nuxeo Core — Rich Documents
• Scalar properties and arrays
• dc:title = "My Document"
• dc:contributors = ["bob", "pete", "mary"]
• dc:created = 2014-07-03T12:15:07+0200
• ecm:uuid = 52a7352b-041e-49ed-8676-328ce90cc103
• ecm:primaryType = "MyFile"
• ecm:majorVersion = 2, ecm:minorVersion = 0
• ecm:isLatestMajorVersion = true, ecm:isLatestVersion = false
Nuxeo Core — Rich Documents
• Complex properties and lists of them
• primaryAddress = { street = "1 rue René Clair", zip = "75018",
  city = "Paris", country = "France" }
• files = [
  { name = "doc.txt", length = 1234, mime-type = "text/plain",
    data = 0111fefdc8b14738067e54f30e568115 },
  { name = "doc.pdf", length = 29344, mime-type = "application/pdf",
    data = 20f42df3221d61cb3e6ab8916b248216 }
  ]
Nuxeo Core — Rich Operations
• CRUD
• Create
• Retrieve
• Update
• Delete
• Move
• Copy
• ... but in a Hierarchy
Nuxeo Core — Rich Features
• Security based on ACLs and inheritance
• block bob for Write, allow members for Read
• Proxies (multi-filing)
• Versioning
• Placeless documents (versions, tags, relations...)
• Facets (dynamic typing)
• Locking
• Search (NXQL)

SELECT * FROM File WHERE files/*/name = 'doc.txt'
Nuxeo Core — Hierarchy
• Parent-child relationship
• Recursion
• Find all the children to change something
• Lifecycle state
• Security
• Search on a subset of the hierarchy
• ... AND ecm:path STARTSWITH '/workspaces/receipts'
SQL vs DBS/MongoDB
Storage — SQL
• Stores data in a set of JOINed tables
• Star schema, around the main hierarchy
• Lists as JOINed table with item/pos
• Complex properties as sub-documents (children)
• Lists of complex properties as ordered sub-documents
• Id generated by application or database
• String / native UUID / serial integer
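The star layout above can be sketched with an in-memory SQLite database. This is only an illustration: the table and column names (hierarchy, dublincore, dc_contributors, pos, item) are assumptions chosen to mirror the bullets, not a statement of the actual Nuxeo schema.

```python
import sqlite3

# Illustrative schema only: table and column names are assumptions for this sketch.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hierarchy (          -- main table, one row per document
    id TEXT PRIMARY KEY,
    parentid TEXT,
    name TEXT,
    primarytype TEXT
);
CREATE TABLE dublincore (         -- simple properties of one schema
    id TEXT PRIMARY KEY REFERENCES hierarchy(id),
    title TEXT
);
CREATE TABLE dc_contributors (    -- list property: one row per item, ordered by pos
    id TEXT REFERENCES hierarchy(id),
    pos INTEGER,
    item TEXT
);
""")

doc_id = "52a7352b-041e-49ed-8676-328ce90cc103"
conn.execute("INSERT INTO hierarchy VALUES (?, ?, ?, ?)",
             (doc_id, None, "my-document", "MyFile"))
conn.execute("INSERT INTO dublincore VALUES (?, ?)", (doc_id, "My Document"))
conn.executemany(
    "INSERT INTO dc_contributors VALUES (?, ?, ?)",
    [(doc_id, pos, name) for pos, name in enumerate(["bob", "pete", "mary"])],
)


def load_document(conn, doc_id):
    """Reassemble one document from the JOINed tables."""
    row = conn.execute(
        "SELECT h.primarytype, d.title FROM hierarchy h "
        "JOIN dublincore d ON d.id = h.id WHERE h.id = ?",
        (doc_id,),
    ).fetchone()
    contributors = [r[0] for r in conn.execute(
        "SELECT item FROM dc_contributors WHERE id = ? ORDER BY pos", (doc_id,))]
    return {"ecm:primaryType": row[0], "dc:title": row[1],
            "dc:contributors": contributors}


document = load_document(conn, doc_id)
```

Reading a single document already takes one query per table involved, which is the cost the document-oriented storage avoids.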
Storage — SQL (base hierarchy)
Storage — SQL (simple props)
Storage — SQL (complex props)
Storage — MongoDB
• Standard JSON documents
• Property names fully prefixed
• Lists as arrays of scalars
• Complex properties as sub-documents
• Complex lists as arrays of sub-documents
• Id generated by MongoDB
• Counter using findAndModify, $inc and returnNew
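A counter built on findAndModify with $inc and returnNew can be sketched as follows. To stay runnable without a server, the sketch uses a hypothetical in-memory FakeCollection that supports only what the counter needs; the collection name, the "seq" field and the next_id helper are assumptions, not Nuxeo code.

```python
import threading


class FakeCollection:
    """In-memory stand-in for a MongoDB collection (hypothetical, illustration only)."""

    def __init__(self):
        self._docs = {}
        self._lock = threading.Lock()

    def find_and_modify(self, query, update, new=False, upsert=False):
        # Supports only a query on _id and a $inc update.
        with self._lock:  # the server applies findAndModify atomically per document
            key = query["_id"]
            doc = self._docs.get(key)
            created = doc is None
            if created:
                if not upsert:
                    return None
                doc = {"_id": key}
                self._docs[key] = doc
            before = None if created else dict(doc)
            for field, amount in update["$inc"].items():
                doc[field] = doc.get(field, 0) + amount
            # new=True corresponds to the returnNew option: return the updated document
            return dict(doc) if new else before


def next_id(counters, name="main"):
    """Atomically increment the named counter and return its new value."""
    doc = counters.find_and_modify(
        {"_id": name}, {"$inc": {"seq": 1}}, new=True, upsert=True)
    return doc["seq"]
```

Because the increment and the read happen in one atomic operation, two concurrent callers cannot receive the same value.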
Storage — MongoDB
"ecm:id": "52a7352b-041e-49ed-8676-328ce90cc103",
"dc:title": "My Document",
"dc:contributors": ["bob", "pete", "mary"],
"dc:created": ISODate("2014-07-03T12:15:07+0200"),
"ecm:primaryType": "MyFile",
"ecm:majorVersion": NumberLong(2),
"ecm:minorVersion": NumberLong(0),
"ecm:isLatestMajorVersion": true,
"ecm:isLatestVersion": false,
Storage — MongoDB
primaryAddress: { street: "1 rue René Clair", zip: "75018",
                  city: "Paris", country: "France" },
files: [{ name: "doc.txt", length: 1234, "mime-type": "text/plain",
          data: "0111fefdc8b14738067e54f30e568115" },
        { name: "doc.pdf", length: 29344, "mime-type": "application/pdf",
          data: "20f42df3221d61cb3e6ab8916b248216" }],
"ecm:acp": [{
  name: "local",
  acl: [{ grant: false, perm: "Write", user: "bob" },
        { grant: true, perm: "Read", user: "pete" },
        { grant: true, perm: "Read", user: "members" }]
}]
Hierarchy — SQL
• Parent-child relationship
• hierarchy.parentid column
• Recursion optimized through ancestors table
• For each document list all its ancestors
• Maintained by database triggers (create, delete, move, copy)
• Alternative for PostgreSQL: array column with all ancestors
Hierarchy — MongoDB
• Parent-child relationship
• ecm:parentId field
• Recursion optimized through ecm:ancestorIds array
• Maintained by framework (create, delete, move, copy)
Hierarchy — MongoDB
"ecm:parentId": "afb488e7",
"ecm:ancestorIds": ["00000000", "18ba9e90", "afb488e7"],
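How a framework might maintain ecm:ancestorIds on create and move can be sketched on plain dictionaries. The helper names (create, move, descendants) and the in-memory docs mapping are assumptions made for illustration; only the field names come from the slides.

```python
def create(docs, doc_id, parent_id):
    """Create a document whose ancestor list is the parent's list plus the parent."""
    parent = docs.get(parent_id)
    ancestors = parent["ecm:ancestorIds"] + [parent_id] if parent else []
    docs[doc_id] = {"ecm:id": doc_id, "ecm:parentId": parent_id,
                    "ecm:ancestorIds": ancestors}


def descendants(docs, folder_id):
    """Recursion-free subtree lookup, in the spirit of {"ecm:ancestorIds": folder_id}."""
    return sorted(d["ecm:id"] for d in docs.values()
                  if folder_id in d["ecm:ancestorIds"])


def move(docs, doc_id, new_parent_id):
    """Move a document and rewrite the ancestor prefix of its whole subtree."""
    doc = docs[doc_id]
    new_parent = docs[new_parent_id]
    if doc_id == new_parent_id or doc_id in new_parent["ecm:ancestorIds"]:
        raise ValueError("move would create a cycle")
    old_prefix = doc["ecm:ancestorIds"] + [doc_id]
    new_ancestors = new_parent["ecm:ancestorIds"] + [new_parent_id]
    new_prefix = new_ancestors + [doc_id]
    # Collect the subtree before changing anything.
    subtree = [d for d in docs.values() if doc_id in d["ecm:ancestorIds"]]
    doc["ecm:parentId"] = new_parent_id
    doc["ecm:ancestorIds"] = new_ancestors
    for d in subtree:
        d["ecm:ancestorIds"] = new_prefix + d["ecm:ancestorIds"][len(old_prefix):]


docs = {}
create(docs, "00000000", None)          # root
create(docs, "18ba9e90", "00000000")
create(docs, "afb488e7", "18ba9e90")
create(docs, "52a7352b", "afb488e7")    # ancestors match the example above
```

A move touches every document in the moved subtree, which is why this bookkeeping matters for the consistency concerns discussed later in the deck.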

Proxies — SQL
• Reference to target document
• proxies.targetid column
• Holds only hierarchy-based information, no content
• Parent, name, ACL...
• Additional JOIN during search
Proxies — MongoDB
• Copy of the target document
• ecm:proxyTargetId field
• Target document knows who's pointing to it
• ecm:proxyIds field
• Maintained by framework
• Copy needs to be kept up to date when target changes
• Maintained by framework
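The copy-based proxy model can be sketched as below. The helpers create_proxy and update_target and the in-memory docs mapping are hypothetical; the sketch only shows the two bookkeeping duties listed above (the target tracks its proxies, and content changes are propagated to each copy).

```python
import copy


def create_proxy(docs, proxy_id, target_id, parent_id):
    """Create a proxy as a copy of the target, filed under another parent."""
    target = docs[target_id]
    proxy = copy.deepcopy(target)
    proxy.pop("ecm:proxyIds", None)  # a proxy is not itself a proxy target here
    proxy.update({"ecm:id": proxy_id,
                  "ecm:parentId": parent_id,
                  "ecm:proxyTargetId": target_id})
    docs[proxy_id] = proxy
    # The target remembers who points to it.
    target.setdefault("ecm:proxyIds", []).append(proxy_id)
    return proxy


def update_target(docs, target_id, changes):
    """Apply content changes to the target and to every proxy copy."""
    target = docs[target_id]
    target.update(changes)
    for proxy_id in target.get("ecm:proxyIds", []):
        docs[proxy_id].update(changes)
```

Since the proxy carries its own copy of the content, a search can match it directly without the extra JOIN the SQL storage needs.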
Proxies — Semantics
• What to do when:
• Target removed (→ forbid)
• Proxy removed
• Proxy + target removed at the same time (→ ok)
• Target copied
• Proxy copied (→ new proxy to original target)
• Proxy + target copied at the same time (todo)
Security — SQL
• Generic ACP stored in acls table
• Precomputed Read ACLs needed for search
• Ordered list of identities having access, with blocking

["Management", "Supervisors", "-Temps", "bob"]
• Read ACLs are given an identifier
• Which identities have access to which Read ACL is precomputed
• Maintained by database triggers
• Search matches using JOIN
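One plausible reading of the ordered Read ACL with blocking is that entries are evaluated in order and the first entry matching one of the user's identities decides. The sketch below assumes exactly that semantics and a "-" prefix for blocking entries; the function name can_read is hypothetical.

```python
def can_read(read_acl, identities):
    """Evaluate an ordered Read ACL: the first entry matching an identity decides.

    Assumption: an entry prefixed with "-" blocks that identity; any other
    entry grants access. No matching entry means no access.
    """
    for entry in read_acl:
        if entry.startswith("-"):
            if entry[1:] in identities:
                return False
        elif entry in identities:
            return True
    return False


read_acl = ["Management", "Supervisors", "-Temps", "bob"]
```

Under this reading, bob loses access if he is also a member of Temps, because the blocking entry comes first, while a member of both Management and Temps keeps access.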
Security — MongoDB
• Generic ACP stored in ecm:acp field
• Precomputed Read ACLs needed for search
• Simple set of identities having access

ecm:racl: ["Management", "Supervisors", "bob"]
• Semantic restrictions on blocking
• Maintained by framework
• Search matches if intersection

{"ecm:racl": {"$in": ["bob", "members", "Everyone"]}}
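The intersection test can be sketched in plain Python. The read_filter helper builds the same kind of query as shown above, and matches is a minimal evaluator for $in on an array field; both names are hypothetical, and adding "Everyone" to every user's identities is an assumption taken from the example query.

```python
def read_filter(user, groups):
    """Build the security clause for a user and the groups the user belongs to."""
    return {"ecm:racl": {"$in": [user, *groups, "Everyone"]}}


def matches(doc, query):
    """Minimal evaluator for {field: {"$in": [...]}} applied to an array field.

    On an array field, $in matches when at least one element of the array is
    in the given list, i.e. when the two sets intersect.
    """
    for field, condition in query.items():
        wanted = set(condition["$in"])
        if wanted.isdisjoint(doc.get(field, [])):
            return False
    return True


doc = {"ecm:racl": ["Management", "Supervisors", "bob"]}
```

A plain set like this cannot express an ordered blocking entry, which is consistent with the "semantic restrictions on blocking" noted above.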
Search — SQL
• Translated from NXQL to SQL
• JOIN of all required star/list/complex properties tables
• Additional UNION + JOINs for proxies
• Additional JOIN for security
• Can have correlations (reuse same JOIN)
• Fulltext index(es) on fulltext.simpletext / fulltext.binarytext columns
Search — MongoDB
• Translated from NXQL to MongoDB syntax
• Proxies queried directly
• Security queried by set intersection
• One fulltext index for ecm:fulltextSimple / ecm:fulltextBinary fields
• Some limitations
Search — MongoDB Limitations
• Only one fulltext search per query, restrictions on position
• No generic boolean NOT, must be pushed down as negative operators
• Search is field/value based
• No multi-field operators (title = description, expirationDate > modificationDate)
• No multi-field arithmetic (amount + bonus < 1000)
• Subdocument correlation with $elemMatch is less generic than full JOINs
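Pushing a boolean NOT down to negative operators can be sketched with De Morgan's laws over a small expression tree. The tuple-based tree format and the function names are assumptions made for this illustration, not the NXQL translator itself. Negating a range comparison this way also ignores documents where the field is missing, which a real translator would have to consider.

```python
NEGATED = {"=": "<>", "<>": "=", "<": ">=", ">=": "<", ">": "<=", "<=": ">",
           "IN": "NOT IN", "NOT IN": "IN"}
MONGO_OP = {"=": "$eq", "<>": "$ne", "<": "$lt", "<=": "$lte",
            ">": "$gt", ">=": "$gte", "IN": "$in", "NOT IN": "$nin"}


def push_not(expr, negate=False):
    """Remove NOT nodes by negating the operators underneath them.

    Nodes are tuples: ("NOT", e), ("AND", e1, e2, ...), ("OR", e1, e2, ...)
    or ("CMP", field, op, value).
    """
    kind = expr[0]
    if kind == "NOT":
        return push_not(expr[1], not negate)
    if kind in ("AND", "OR"):
        if negate:  # De Morgan: NOT (a AND b) == (NOT a) OR (NOT b)
            kind = "OR" if kind == "AND" else "AND"
        return (kind,) + tuple(push_not(e, negate) for e in expr[1:])
    _, field, op, value = expr
    return ("CMP", field, NEGATED[op] if negate else op, value)


def to_mongo(expr):
    """Translate a NOT-free tree into a MongoDB-style query document."""
    kind = expr[0]
    if kind in ("AND", "OR"):
        return {"$" + kind.lower(): [to_mongo(e) for e in expr[1:]]}
    _, field, op, value = expr
    return {field: {MONGO_OP[op]: value}}


expr = ("NOT", ("AND", ("CMP", "dc:title", "=", "x"),
                       ("CMP", "ecm:majorVersion", "<", 2)))
query = to_mongo(push_not(expr))
```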
Transactions — SQL
• Standard SQL database capabilities
• Atomic commit
• Two-phase commit (prepare/commit) also usable, although costly
• Rollback
• Transient data is data modified in the database but not yet committed
• Transient data is visible alongside committed data for retrieval and search
Transactions — MongoDB
• No atomic commit beyond a single document
• Commit using a big batch of create/delete/update accumulated in-memory
• Not atomic, others can see partial state
• No transient space
• Emulate transient space in-memory, flush at commit time
• All accesses and searches must check the transient space as well as MongoDB
Transactions — MongoDB
• No rollback
• Rollback by dropping the in-memory transient space
• Operations involving several documents in relation
• Move, delete, copy, ancestors or recursion checks
• Using transient space + MongoDB for them is too complex
• Flush to MongoDB before doing them (commit)
• Must be able to be rolled back if needed (transaction compensation)
• Others can see state that's eventually invalid
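The emulated transient space described in the two slides above can be sketched as a session object that buffers changes in memory. The Session class and its method names are hypothetical, and a plain dict stands in for the MongoDB collection.

```python
import copy

_DELETED = object()  # marker for documents deleted in the transient space


class Session:
    """Buffers writes in memory; reads see the transient state over the store."""

    def __init__(self, store):
        self.store = store      # dict standing in for the MongoDB collection
        self.transient = {}     # ecm:id -> document, or _DELETED

    def get(self, doc_id):
        if doc_id in self.transient:
            doc = self.transient[doc_id]
            return None if doc is _DELETED else doc
        doc = self.store.get(doc_id)
        return copy.deepcopy(doc) if doc is not None else None

    def save(self, doc):
        self.transient[doc["ecm:id"]] = doc

    def delete(self, doc_id):
        self.transient[doc_id] = _DELETED

    def query(self, predicate):
        # Searches must consider the transient space as well as the store.
        ids = sorted(set(self.store) | set(self.transient))
        found = (self.get(doc_id) for doc_id in ids)
        return [d for d in found if d is not None and predicate(d)]

    def commit(self):
        # Flushed one document at a time: not atomic, others may see partial state.
        for doc_id, doc in self.transient.items():
            if doc is _DELETED:
                self.store.pop(doc_id, None)
            else:
                self.store[doc_id] = copy.deepcopy(doc)
        self.transient.clear()

    def rollback(self):
        # Nothing was written yet, so dropping the buffer is enough.
        self.transient.clear()
```

This only covers rollback before a flush. Once an intermediate flush has happened, undoing it requires compensation, which the sketch does not attempt.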
MongoDB — Restrictions
• Eventual consistency and no transactions
• Prevents strong checks
• Duplicate name in a folder
• Move creating cycles
• Remove target before proxy
• Create document in a deleted folder
• Prevents full consistency of hierarchical processing
• Read ACLs, quotas
• Needs background jobs that check consistency
MongoDB — Features
• Bulk operations
• Map-reduce for aggregations
• Quotas / count / folder content last modified
• Conditional updates
• Locks
• Prevent dirty writes
• GridFS to store binaries
• Sharding
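A lock built on a conditional update can be sketched as a compare-and-set: the write only happens if the document is currently unlocked. The field name ecm:lockOwner and the helpers try_lock and unlock are assumptions for illustration, and a dict stands in for the collection. On a real server the condition and the write would be a single atomic update on one document.

```python
def try_lock(docs, doc_id, user):
    """Set the lock owner only if nobody holds the lock.

    Emulates a conditional update of the form
    update({"ecm:id": doc_id, "ecm:lockOwner": None},
           {"$set": {"ecm:lockOwner": user}}).
    """
    doc = docs.get(doc_id)
    if doc is None or doc.get("ecm:lockOwner") is not None:
        return False  # condition not met, nothing written
    doc["ecm:lockOwner"] = user
    return True


def unlock(docs, doc_id, user):
    """Release the lock only if it is held by the given user."""
    doc = docs.get(doc_id)
    if doc is None or doc.get("ecm:lockOwner") != user:
        return False
    doc["ecm:lockOwner"] = None
    return True
```

The same pattern, with a version or change marker in the condition instead of a lock owner, is one way to prevent dirty writes.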
DBS — Future Work
Future Work
• DBS used for more services
• Directories / Vocabularies / User database
• Audit log
• DBS for other backends
• Elasticsearch
• Redis
• PostgreSQL / JSON
• Other...
Thanks!
We're Hiring!
