Flowable + MongoDB
+ Machine Learning
Joram Barrez
-Flowable Core Developer-
Transactions
• Flowable relies on the transactional semantics of a
relational db
• “Atomically” moving from one stable state to another
• This doesn’t free you from thinking about service failures,
but understanding the transactional model of Flowable sure
makes it easier to write resilient processes
• MongoDB 4.0 added support for multi-document transactions (June 2018)
MongoDB
• Open-source NoSQL JSON document store
• Short history
• Started in 2007 by 10gen as a component of their PaaS
• Was known in the early days (version 2.2 and earlier) as the
dev/null db
• Acquired WiredTiger at the end of 2014
• WiredTiger became the default storage engine in 3.2
• WiredTiger enables transactional semantics (ACID) on multi-document
operations in 4.0 (*)
(*) “Path to transactions” series on https://www.youtube.com/user/MongoDB/videos
Flowable – MongoDB
• All code: https://github.com/flowable/flowable-mongodb
• Engine layers, from top to bottom:
• Service call
• Command interceptor / commands
• Agenda / operations (engine core logic)
• EntityManagers (high-level data functions)
• DataManagers (low-level data access)
Implementation
• Replace the lowest layer
• MongoDB’s transactions follow a familiar programming
model
• Concept of clientSession
• Matches Flowable’s low-level session concept nicely
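The idea of swapping only the lowest layer while keeping a session-scoped transaction can be sketched as follows. This is a minimal illustration in Python rather than Flowable's Java, and every class and function name in it is hypothetical: it is not the Flowable DataManager API or the MongoDB driver's ClientSession, only a stand-in that shows the shape of the design.

```python
from abc import ABC, abstractmethod


class TaskDataManager(ABC):
    """Hypothetical lowest-layer interface: pure data access, no engine logic."""

    @abstractmethod
    def insert(self, session, task): ...

    @abstractmethod
    def find_by_id(self, session, task_id): ...


class InMemorySession:
    """Stand-in for a low-level session: buffers writes until commit."""

    def __init__(self, store):
        self.store = store      # committed data, shared between sessions
        self.pending = {}       # writes made inside this session only

    def commit(self):
        self.store.update(self.pending)
        self.pending.clear()

    def abort(self):
        self.pending.clear()


class DocumentTaskDataManager(TaskDataManager):
    """One possible implementation; a relational one would expose the same methods."""

    def insert(self, session, task):
        session.pending[task["id"]] = dict(task)

    def find_by_id(self, session, task_id):
        # A session sees its own uncommitted writes first.
        return session.pending.get(task_id) or session.store.get(task_id)


def complete_in_one_transaction(data_manager, session, tasks):
    """Upper-layer logic: all writes commit together or not at all."""
    try:
        for task in tasks:
            data_manager.insert(session, task)
        session.commit()
    except Exception:
        session.abort()
        raise
```

Because the upper layers only see the interface and the session, they do not need to know which database sits underneath.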
Demo
Relational vs MongoDB implementation
Implementation
• Replace all DataManager interface implementations with a
MongoDB counterpart
• Alpha releases
• Gather interest/feedback
• Use the existing test suite to validate the implementation
• Once complete -> beta / stable release
• (Almost) 1-1 translation of the relational data structure
• Optimizations along the way
• MongoDB-specific structure optimizations will surely follow
Challenges
• com.mongodb.MongoCommandException: Command
failed with error 112 (WriteConflict): 'WriteConflict' on server
exethanter.local:27017. The full response is { "errorLabels" :
["TransientTransactionError"], "operationTime" : {
"$timestamp" : { "t" : 1537701066, "i" : 3 } }, "ok" : 0.0, "errmsg"
: "WriteConflict", "code" : 112, "codeName" : "WriteConflict",
"$clusterTime" : { "clusterTime" : { "$timestamp" : { "t" :
1537701066, "i" : 3 } }, "signature" : { "hash" : { "$binary" :
"AAAAAAAAAAAAAAAAAAAAAAAAAAA=", "$type" : "00" },
"keyId" : { "$numberLong" : "0" } } } }
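The WriteConflict error above carries the "TransientTransactionError" label, which MongoDB documents as a signal that the transaction as a whole can be retried. A minimal sketch of such a retry loop, assuming a hypothetical exception class in place of the real driver error (the names, attempt count, and delays are illustrative, not taken from the Flowable code):

```python
import random
import time


class TransientTransactionError(Exception):
    """Hypothetical stand-in for a driver error labeled TransientTransactionError."""


def run_with_retry(transaction_body, max_attempts=5, base_delay_s=0.01):
    """Run transaction_body(); retry the whole body on a transient error."""
    for attempt in range(1, max_attempts + 1):
        try:
            return transaction_body()
        except TransientTransactionError:
            if attempt == max_attempts:
                raise
            # Back off with jitter so conflicting writers do not retry in lockstep.
            time.sleep(base_delay_s * attempt * random.random())
```

The important point is that the whole transaction body is re-run, not just the statement that failed, because the conflicting write invalidates everything done in that transaction.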
Challenges
• Taking joins for granted
• Denormalization needed
• Way more work for the developer to guarantee data consistency
• Simple example: the ‘latest’ version of a process definition
• Exchange writes/updates for faster reads
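The ‘latest’ example can be sketched like this. Without joins or subqueries, one way to find the newest version of a process definition cheaply is to store a flag on each document and maintain it on every deployment. The field names and the list-of-dicts "collection" below are assumptions made for illustration, not the actual flowable-mongodb schema.

```python
def deploy(definitions, key, resource):
    """Add a new version of a process definition and maintain the 'latest' flag.

    `definitions` is a plain list of dicts standing in for a document collection.
    """
    previous = [d for d in definitions if d["key"] == key]
    for d in previous:
        d["latest"] = False                      # extra write on every deploy
    new_definition = {
        "key": key,
        "version": max((d["version"] for d in previous), default=0) + 1,
        "resource": resource,
        "latest": True,
    }
    definitions.append(new_definition)
    return new_definition


def find_latest(definitions, key):
    """Cheap read: one equality filter, no max(version) subquery or join."""
    return next(d for d in definitions if d["key"] == key and d["latest"])
```

The extra update on deploy is the price paid for the faster read, and both writes have to happen in the same transaction to keep the flag consistent.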
Luckily
• Over the past years
• We’ve made Flowable a lot faster by keeping in mind that one
exchange over a network is extremely expensive
• Denormalization, prefetching, entity counts
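As an illustration of why entity counts save network exchanges, the sketch below keeps a denormalized counter on an execution and uses it to skip a lookup entirely when there is nothing to fetch. The store class and field names are hypothetical and only simulate round trips; they are not Flowable's implementation.

```python
class CountingStore:
    """Hypothetical data store that counts simulated network round trips."""

    def __init__(self):
        self.round_trips = 0
        self.variables = {}          # execution id -> list of variables

    def find_variables(self, execution_id):
        self.round_trips += 1
        return self.variables.get(execution_id, [])


def delete_execution(store, execution):
    """Skip the variable lookup when the denormalized count says there are none."""
    if execution["variable_count"] > 0:
        for _ in store.find_variables(execution["id"]):
            pass  # a real engine would delete each variable here
    # the execution itself would be deleted here
```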
Performance
• Is the performance acceptable?
• Benchmark on AWS
• Setup (see GitHub repo)
Process Service
Postgres
MongoDB
m5d.2xlarge (8 cores / 32 GB RAM), 100 GB SSD
m5d.2xlarge (8 cores / 32 GB RAM), 100 GB SSD
t3.2xlarge (8 cores / 32 GB RAM), 100 GB SSD
max_connections = 100
shared_buffers = 8GB
effective_cache_size = 24GB
maintenance_work_mem = 2GB
checkpoint_completion_target = 0.7
wal_buffers = 16MB
default_statistics_target = 100
random_page_cost = 1.1
effective_io_concurrency = 200
work_mem = 20971kB
min_wal_size = 1GB
max_wal_size = 2GB
max_worker_processes = 8
max_parallel_workers_per_gather = 4
max_parallel_workers = 8
listen_addresses = '*'
Process Service
• Bring process into a stable state
• One transaction
• Fixed threadpool of 8 threads
- 6 executions
- 2 user tasks
- 1 historic process instance
- 11 historic activities
- 2 historic user tasks
- 31 variables
- 31 historic variables
- 1 timer job
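A benchmark of this shape (a fixed pool of 8 threads, each call bringing one process to a stable state) could be driven by a small harness like the one below. This is a sketch, not the harness from the GitHub repo: `start_process` is a hypothetical callable standing in for the call to the Process Service.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def run_benchmark(start_process, runs, threads=8):
    """Call start_process() `runs` times on a fixed pool; return per-call seconds."""

    def timed_call(_):
        started = time.perf_counter()
        start_process()
        return time.perf_counter() - started

    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(timed_call, range(runs)))


def summarize(durations):
    """Average and worst-case latency in milliseconds."""
    millis = [d * 1000 for d in durations]
    return {"runs": len(millis), "avg_ms": sum(millis) / len(millis), "max_ms": max(millis)}
```

Measuring each call individually matters here, because the differences being compared are below a millisecond and would disappear in a single wall-clock total.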
Results
• Reverse of what we expected :)
Results
• Although the graphs seem to indicate a relatively large
difference, we’re talking about sub-millisecond differences!
• Relational databases have not been standing still
• See our recent performance benchmarks
• https://blog.flowable.org/2018/03/05/flowable-6-3-0-performance-benchmark/
• https://blog.flowable.org/2018/03/13/async-history-performance-benchmark/
Conclusion
• The transactional support in MongoDB is impressive
• Data consistency perspective
• Performance perspective
• Using Flowable on MongoDB is a valid alternative
Current limitations
• Reads and writes go to the primary only
• Adding replica nodes seemed to have a negative effect
• Even though all reads and writes go to the primary (a current limitation of
MongoDB transactions)
• MongoDB transactions are still under development
https://www.youtube.com/watch?v=dQh03YLkmyg
Future work
• MongoDB is designed for horizontal scale
• (Yes, Postgres, for example, has partitioning, but …)
• Sharded clusters + Flowable → interesting use cases
• Shard by tenant
• Shard on process definition key
• BigData use cases … like ML!
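To make "shard by tenant" concrete: with a hashed shard key, all documents sharing a key value land on the same shard, so work for a single tenant stays local. In a real sharded cluster the database does this routing itself; the sketch below only imitates the effect, and the `tenantId` field and function names are assumptions for illustration.

```python
import hashlib


def shard_for(shard_key, shard_count):
    """Map a shard key value (e.g. a tenant ID) to a shard index, stably."""
    digest = hashlib.sha256(shard_key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % shard_count


def group_by_shard(process_instances, key_field, shard_count):
    """Group documents by the shard their key value hashes to."""
    shards = {}
    for instance in process_instances:
        shards.setdefault(shard_for(instance[key_field], shard_count), []).append(instance)
    return shards
```

Sharding on the process definition key works the same way with a different `key_field`; the trade-off is that a very popular definition would concentrate load on one shard.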
Machine Learning
• Process/Case engines are in a prime position
• End-user data through forms
• Service invocation data
• (Semi-)Structured models
Machine Learning
• MongoDB being “BigData” (e.g. better suited for streaming,
reactive, etc.) opens up use cases for ML
• Demo
• Run many processes from start to end
• Feed historical data into ML
• See if human work is repetitive and suggest optimizations
Machine Learning
1. Look for Human Decision patterns
2. Gather possible data inputs and backtrack
3. Use machine learning (Spark decision tree algorithm) to
calculate potential patterns in the data
• I.e. which data at the start leads to a certain path later on
(with a certain % of confidence)
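The core question in step 3, which start-of-process data best predicts a later human decision, can be illustrated without Spark. The sketch below computes information gain, one common split criterion for decision trees, over historical rows in pure Python. It is a stand-in for the idea, not the Spark MLlib decision tree used in the demo, and the variable names in the usage note are invented.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of outcome labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, feature, outcome):
    """How much knowing `feature` reduces uncertainty about `outcome`."""
    groups = {}
    for row in rows:
        groups.setdefault(row[feature], []).append(row[outcome])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([row[outcome] for row in rows]) - remainder


def best_predictor(rows, features, outcome):
    """The start-of-process variable that best predicts the later human decision."""
    return max(features, key=lambda f: information_gain(rows, f, outcome))
```

For example, if every historical instance with a "high" amount was rejected and every "low" one approved, regardless of region, `best_predictor(history, ["amount", "region"], "decision")` picks `amount`, and the engine could suggest automating that decision.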
Architecture
[Diagram: several Process Service instances and a UI, several Decision Analysis Service instances, and Spark (cluster) + MLlib; arrows labeled “Stream as RDD” and “suggestions”]
Architecture
• vs last year
Demo
Processes + Mongo + Machine Learning
Thank you!
