Migrating From Cassandra to DynamoDB: The Whys and Hows
Oct 19, 2021
IT teams everywhere are asking a common question: should we migrate from Apache Cassandra to Amazon DynamoDB? It’s a question worth asking, and the decision shouldn’t be taken lightly. Going from Cassandra to DynamoDB requires much more than a “lift-and-shift” effort, and the migration can be time-consuming and challenging.
But there are real advantages to going from one NoSQL database to the other. The key lies in understanding the differences between the two database solutions and knowing how to get the most out of DynamoDB on the Amazon Web Services (AWS) cloud if you do go that route.
Cassandra vs. DynamoDB
Cassandra is an open-source, distributed NoSQL database. Companies choose Cassandra for its high availability, speed, cross-datacenter replication capabilities, and scalability. On the downside, Cassandra’s architecture carries significant operational overhead. It can also be difficult and expensive to find IT professionals with Cassandra expertise. For both of these reasons, DynamoDB is often the better database solution for enterprises.
DynamoDB is a fully managed NoSQL database service from AWS that provides single-digit millisecond performance at any scale. As a fully managed solution, DynamoDB doesn’t require engineers to provision, manage, or configure hardware. AWS also ensures resiliency and durability with its multi-AZ setup: each write is committed synchronously to multiple Availability Zones (AZs) and replicated asynchronously to an additional one.
Furthermore, DynamoDB performance does not degrade as data volumes increase. AWS divides data automatically into partitions, and the capacity that users assign to a table is distributed across those partitions. The solution also integrates seamlessly with other AWS services, enabling engineers to develop modern applications, leverage AI/ML, implement big data analytics, and more.
Finally, DynamoDB is serverless and requires no specialized licensing. Users can consume capacity on demand instead of overprovisioning for peak load. According to AWS, customers report up to 70% cost savings after switching to DynamoDB to support their applications.
Given these features and performance metrics, it’s clear that DynamoDB is a superior NoSQL solution, and many enterprise leaders reach this conclusion. Migration momentum tends to slow when it’s time to plan the actual cutover.
Fortunately, AWS has taken care of this as well through Database Migration Service (DMS) and Schema Conversion Tool (SCT), which help organizations migrate data while keeping applications online. Using AWS DMS and AWS SCT makes a lot of sense for companies with in-house database migration expertise.
But what happens when businesses don’t have the capacity or experience for a database migration? They typically turn to an AWS Premier Consulting Partner like ClearScale, which has helped companies all over the world set up DynamoDB databases for their needs.
Real-world Cassandra to DynamoDB Migration
One ClearScale engagement, in particular, reveals how much organizations can benefit from migrating to DynamoDB from Cassandra. Previously, ClearScale’s client was storing and processing large volumes of data on AWS EC2-hosted Cassandra NoSQL clusters. The client’s largest cluster was handling more than 15 billion reads and writes per day.
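To put the 15 billion figure in perspective, it helps to convert it to a per-second rate. The short calculation below is only an illustration: it assumes traffic is spread evenly across the day, which real workloads rarely are, so peak rates would be higher. The source does not give the split between reads and writes, so the sketch treats them together.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def average_ops_per_second(ops_per_day: int) -> float:
    """Average request rate, assuming perfectly even traffic (an assumption)."""
    return ops_per_day / SECONDS_PER_DAY


daily_ops = 15_000_000_000  # reads and writes per day, from the engagement
avg_rate = average_ops_per_second(daily_ops)
print(f"{avg_rate:,.0f} operations per second on average")
# prints "173,611 operations per second on average"
```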
However, the company didn’t want to be responsible for managing these clusters any longer. After explaining the benefits of adopting a fully managed database solution like DynamoDB, ClearScale convinced the client to move forward with a migration MVP. As part of the project, ClearScale designed the potential architecture and a three-stage plan.
In the first stage of the project, ClearScale set out to move Cassandra data to DynamoDB. The team set up a data pipeline to run in parallel to the existing pipeline. Then, ClearScale moved data from the source data stores (Cassandra) to the target (DynamoDB) data stores. ClearScale also kept the dual pipelines active to maintain ongoing replication between the datasets.
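The article does not describe how the parallel pipeline was built, so the following is only a minimal sketch of one common way to keep two stores in step: a dual writer. The names `InMemoryStore` and `DualWriter` are hypothetical, and the in-memory dictionaries stand in for real Cassandra and DynamoDB clients. The sketch assumes Cassandra remains the source of truth during the migration, so a failed write to the target is recorded for later replay rather than failing the request.

```python
from typing import Any, Dict, List


class InMemoryStore:
    """Hypothetical stand-in for a real database client."""

    def __init__(self) -> None:
        self.items: Dict[str, Dict[str, Any]] = {}

    def put(self, key: str, item: Dict[str, Any]) -> None:
        self.items[key] = dict(item)


class DualWriter:
    """Writes every item to the source store, then mirrors it to the target."""

    def __init__(self, source: Any, target: Any) -> None:
        self.source = source
        self.target = target
        self.failed_keys: List[str] = []  # keys to replay into the target later

    def put(self, key: str, item: Dict[str, Any]) -> None:
        # The source is authoritative: let its errors propagate to the caller.
        self.source.put(key, item)
        try:
            self.target.put(key, item)
        except Exception:
            # Assumption: a target failure must not break production traffic.
            self.failed_keys.append(key)


cassandra = InMemoryStore()  # stands in for the existing Cassandra cluster
dynamodb = InMemoryStore()   # stands in for the new DynamoDB table
writer = DualWriter(cassandra, dynamodb)
writer.put("user#1", {"name": "Ada"})
```

Keeping the failed keys makes it possible to reconcile the target afterwards, which matters because the validation stage depends on both stores holding the same data.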
In the second stage, ClearScale validated that the data in DynamoDB was consistent with the original data in Cassandra. To do that, ClearScale conducted:
- Identity and correctness testing. This involved using a special system to ensure the data written to Cassandra and DynamoDB was consistent, though not identical due to formatting reasons and the unique characteristics of each database.
- Clustered synthetic load testing to prove the hypothesis that DynamoDB would sustain more than 15 billion reads and writes per day.
- Cost testing with the goal of showing that DynamoDB would cost much less than Cassandra in terms of both human resources and maintenance costs.
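The "consistent, though not identical" check in the first bullet can be illustrated with a small sketch. ClearScale's actual validation system is not described in the article, so everything below is an assumption: `normalize` and `find_mismatches` are hypothetical helpers, and the sample rows are made up. The idea is to convert each store's native representation (for example, numbers returned as `Decimal` on the DynamoDB side, or timestamps stored as ISO 8601 strings) into a common form before comparing.

```python
from datetime import datetime, timezone
from decimal import Decimal
from typing import Any, Dict, List


def normalize(value: Any) -> Any:
    """Convert store-specific types into a common, comparable form."""
    if isinstance(value, Decimal):
        return int(value) if value == value.to_integral_value() else float(value)
    if isinstance(value, datetime):
        # Assumes timezone-aware datetimes; naive ones would be read as local time.
        return value.astimezone(timezone.utc).isoformat()
    if isinstance(value, (set, frozenset)):
        return sorted(normalize(v) for v in value)
    if isinstance(value, dict):
        return {k: normalize(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [normalize(v) for v in value]
    return value


def find_mismatches(source: Dict[str, Any], target: Dict[str, Any]) -> List[str]:
    """Return keys that are missing from one side or differ after normalizing."""
    mismatched = []
    for key in sorted(source.keys() | target.keys()):
        if key not in source or key not in target:
            mismatched.append(key)
        elif normalize(source[key]) != normalize(target[key]):
            mismatched.append(key)
    return mismatched


# Made-up sample data: the same order as each store might return it.
cassandra_rows = {
    "order#1": {"total": 42, "created": datetime(2021, 10, 19, tzinfo=timezone.utc),
                "tags": {"a", "b"}},
    "order#2": {"total": 7},
}
dynamodb_items = {
    "order#1": {"total": Decimal("42"), "created": "2021-10-19T00:00:00+00:00",
                "tags": {"b", "a"}},
    "order#2": {"total": Decimal("8")},
}
print(find_mismatches(cassandra_rows, dynamodb_items))  # prints ['order#2']
```

Here `order#1` passes even though the two representations differ in type, while `order#2` is flagged because the values genuinely disagree.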
After successfully validating ClearScale’s proof of concept, the client gave the go-ahead for the third stage: a full-scale migration. Throughout the migration process, the client’s Cassandra source databases remained completely online and functional. Once the data in DynamoDB was validated, ClearScale helped the client officially cut over from Cassandra. This required modifying the readers to access data only from DynamoDB and decommissioning the Cassandra writers and nodes.
The Results
Through this engagement, ClearScale was able to demonstrate tangible benefits on a number of fronts. First, DynamoDB’s on-demand and provisioned capacity were less costly for the client than maintaining EC2 fleets for Cassandra. The company’s administration and infrastructure expenses also came down, thanks to the fully managed nature of DynamoDB. In addition, the new database solution was more resilient, scalable, secure, and fault-tolerant.
Interested in learning more about ClearScale’s cloud database services?
Get in touch today to speak with a cloud expert and discuss how we can help:
Call us at 1-800-591-0442
Send us an email at sales@clearscale.com