When and how to migrate from a relational database to Cassandra
Introduction
• Ben Slater, Chief Product Officer, Instaclustr
• Cassandra as a managed service on AWS, Azure & IBM
SoftLayer
• 20 years' experience as a developer, architect and dev team
lead
© 2015. All Rights Reserved.
1 Introduction
2 When to consider migration
3 Preparing your application
4 Migration approaches
5 Conclusion
When to consider migration
• Reaching physical scalability limits
• Licensing costs becoming prohibitive
• Need 100% availability
• Increasing DBA time to maintain performance / availability
• Active/active multi-DC / disaster recovery requirements
• Weigh against costs:
- Initial migration
- Additional logic maintained in app (e.g. maintaining denormalised duplicate
data)
Preparing your application
Some approaches while still using relational can help reduce
migration costs:
• Abstract data access layer (service oriented architecture)
• Denormalise within relational DB
• Minimise logic implemented in DB
• Build data validation checks & data profiles
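One way to picture the abstract data access layer is sketched below. This is only an illustration: `UserStore`, `InMemoryUserStore` and `rename_user` are hypothetical names, and the in-memory backend stands in for a real relational or C* implementation.

```python
from abc import ABC, abstractmethod
from typing import Dict, Optional


class UserStore(ABC):
    """Hypothetical data-access interface; the application depends only on this."""

    @abstractmethod
    def save(self, user_id: str, profile: Dict[str, str]) -> None:
        ...

    @abstractmethod
    def load(self, user_id: str) -> Optional[Dict[str, str]]:
        ...


class InMemoryUserStore(UserStore):
    """Stand-in backend; a relational or C* implementation would replace it."""

    def __init__(self) -> None:
        self._rows: Dict[str, Dict[str, str]] = {}

    def save(self, user_id: str, profile: Dict[str, str]) -> None:
        self._rows[user_id] = dict(profile)

    def load(self, user_id: str) -> Optional[Dict[str, str]]:
        row = self._rows.get(user_id)
        return dict(row) if row is not None else None


def rename_user(store: UserStore, user_id: str, new_name: str) -> None:
    """Application logic written against the interface, not a specific database."""
    profile = store.load(user_id) or {}
    profile["name"] = new_name
    store.save(user_id, profile)
```

Because `rename_user` only sees the interface, swapping the backend later should not require changes to the application logic.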
Migration approaches
• Big bang cutover
• Parallel run
• Table by table
Big bang cutover
• Build & test version of app using C* and convert data from
relational to C*
• Shut down relational, convert data, start up on Cassandra
• Requires downtime, high risk but likely lowest effort option
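The "convert data" step usually means reshaping joined relational rows into query-shaped C* rows. A minimal sketch follows, assuming a hypothetical `customers` lookup and `orders` list and using plain Python structures in place of real database connections.

```python
from collections import defaultdict
from typing import Dict, List


def denormalise(customers: Dict[str, str],
                orders: List[dict]) -> Dict[str, List[dict]]:
    """Group orders by customer and copy the customer name into every row.

    The returned mapping is keyed by customer_id, which would play the role
    of the partition key in a hypothetical orders-by-customer C* table.
    """
    by_customer: Dict[str, List[dict]] = defaultdict(list)
    for order in orders:
        customer_id = order["customer_id"]
        by_customer[customer_id].append({
            "order_id": order["order_id"],
            # Duplicated on purpose: no join is available at read time.
            "customer_name": customers[customer_id],
            "total": order["total"],
        })
    return dict(by_customer)
```

The duplication of `customer_name` is the kind of denormalised data the application has to keep consistent after the cutover.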
Parallel run
• Build C* tables
• Modify application to write to both C* and relational
• Develop & run a tool to perform the initial sync and reconciliation of the two databases
• Run and regularly reconcile
• Migrate reads to C*
• More complex to build and manage
• Lower risk and can be done with no downtime
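The parallel-run steps above can be sketched roughly as follows. Plain dictionaries stand in for the relational database and C*, and `DualWriter`, `initial_sync` and `reconcile` are hypothetical names. The sketch assumes the relational side remains the source of truth until reads are migrated.

```python
from typing import Dict, List, Optional

Row = Dict[str, str]


class DualWriter:
    """Writes every change to both stores; reads stay on relational for now."""

    def __init__(self, relational: Dict[str, Row], cassandra: Dict[str, Row]) -> None:
        self.relational = relational
        self.cassandra = cassandra

    def write(self, key: str, row: Row) -> None:
        self.relational[key] = dict(row)
        self.cassandra[key] = dict(row)

    def read(self, key: str) -> Optional[Row]:
        return self.relational.get(key)


def initial_sync(source: Dict[str, Row], target: Dict[str, Row]) -> int:
    """Copy rows that predate dual writes; returns the number of rows copied."""
    copied = 0
    for key, row in source.items():
        if key not in target:  # rows already dual-written are left alone
            target[key] = dict(row)
            copied += 1
    return copied


def reconcile(source: Dict[str, Row], target: Dict[str, Row]) -> List[str]:
    """Return the sorted keys whose rows differ or exist on one side only."""
    keys = set(source) | set(target)
    return sorted(k for k in keys if source.get(k) != target.get(k))
```

Once `reconcile` repeatedly returns an empty list, `read` could be switched to the C* side.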
Table by table / function by function
• Either big-bang or parallel run approaches can be done on a
table-by-table basis
• Need to be able to isolate subject areas with minimal joins in
relational DB (likely to correspond to denormalised C* tables)
• Allows staged implementation, gradually moving load from
relational to C* - useful if relational environment is under
immediate capacity pressure
• Incrementally reduce pressure on relational
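A staged, table-by-table move can be tracked with a small routing table, as in the sketch below. `TableRouter` and the table names are hypothetical. A real application would hand back a connection or repository rather than a string.

```python
from typing import Dict, Iterable, List


class TableRouter:
    """Tracks which backend currently serves each table (hypothetical helper)."""

    def __init__(self, tables: Iterable[str]) -> None:
        # Every table starts on the relational database.
        self._backend: Dict[str, str] = {name: "relational" for name in tables}

    def cut_over(self, table: str) -> None:
        """Mark one table as served by C* once its data has been migrated."""
        if table not in self._backend:
            raise KeyError(f"unknown table: {table}")
        self._backend[table] = "cassandra"

    def backend_for(self, table: str) -> str:
        return self._backend[table]

    def still_relational(self) -> List[str]:
        """Tables that have not been moved yet, in sorted order."""
        return sorted(t for t, b in self._backend.items() if b == "relational")
```

Moving the heaviest, least-joined tables first is one way to relieve capacity pressure early.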
Estimating Guide
Work Items
• Revise & test operational procedures
• Performance test & soak test
• Trial conversions
• Execute production migration
• Application changes & regression test
• Build migration tool
• Build reconciliation tool
• Build C* schema
Effort Drivers
• # of source tables
• # of access paths
• migration approach
• Level of “preparedness” (see “Preparing your application”, slide 5)
Considerations
• Don’t forget analytics/ad-hoc querying requirements
• Denormalise – it should feel wrong
• Keep in mind common C* data modelling traps:
– Partition keys
– Tombstones
– Secondary indexes
• Make sure your reads work before migrating / writing
• Upserts make migration easier
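Why upserts help: in C*, a write to an existing primary key overwrites the given columns rather than failing, so a migration batch can be replayed safely. The sketch below mimics that behaviour with a dictionary. `upsert` and `replay` are hypothetical helpers, not driver calls.

```python
from typing import Dict, Iterable, Tuple

Row = Dict[str, str]


def upsert(table: Dict[str, Row], key: str, columns: Row) -> None:
    """Insert the row if absent, otherwise overwrite only the given columns."""
    table.setdefault(key, {}).update(columns)


def replay(table: Dict[str, Row], batch: Iterable[Tuple[str, Row]]) -> None:
    """Apply a migration batch; running it again leaves the table unchanged."""
    for key, columns in batch:
        upsert(table, key, columns)
```

If a migration run fails part-way, the same batch can simply be run again without first checking which rows already arrived.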
Conclusion
• It has been done!
• Putting it off won’t make it any easier!
Thank you
