Leveraging Overlay Multicast on AWS for Active-Active Disaster Recovery
Dec 26, 2016


Cloud computing architecture has grown in popularity since Amazon first introduced AWS in 2006. As more and more companies recognize the benefit of moving away from data center-driven architectures with high costs of maintenance, the need to grow functionality in the cloud to support complex business needs has increased as well.
Because critical business operations depend on robust architectures, companies have relied heavily on disaster recovery paradigms that can take over if the primary system is out of commission. An organization with an extensive, high-demand CRM application reached out to ClearScale, an AWS Premier Consulting Partner, to see if there was a way to migrate to AWS. A migration is normally straightforward, but the client's requirements made this one more challenging.
The Challenge: Implementing Disaster Recovery
The company needed not only to migrate all 52 applications and 138 database servers to AWS for disaster recovery purposes, but also to run Active-Active Disaster Recovery (DR) between its Phoenix data center and the AWS Cloud for the majority of its platform. Without this in place, the client was unwilling to migrate to the cloud, given its need for in-sync DR as well as robust security protocols.
Unfortunately, there was no clear approach for accomplishing this, especially given the client’s need for an Active-Active DR solution. ClearScale was asked to build out a detailed architectural design and migration plan based on the client’s requirements, culminating in a Proof of Concept (PoC) to determine its feasibility.
The Solution: Overlay Multicast PoC
Since ultimately the DR solution had to live in AWS, the issue lay in how the environment could maintain an Active-Active in-sync status with the data center in Phoenix. ClearScale determined, based on the customer requirements, that in order to achieve this the ideal approach would be to use a Multicast implementation. However, AWS did not support Multicast natively for this particular need, so ClearScale determined that building a PoC around the concepts of Overlay Multicast could potentially solve their issue.
Most AWS customers do not need Multicast to support their operations. However, ClearScale was able to demonstrate that IP-level Multicast can be carried over unicast IP routing, like that found in AWS Virtual Private Cloud (VPC), by using point-to-point network tunnels between AWS EC2 peers. Using a Packer template, we created an AMI with the MCD daemon preinstalled and preconfigured. This daemon automatically creates GRE tunnels between instances, using EC2 tags to discover the instances within the same multicast group. The omping and iperf tools were then installed to verify that multicast worked, along with the iptables rules needed to allow GRE and multicast traffic.
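To make the tunnel mesh more concrete, here is a minimal Python sketch of the kind of commands such a daemon might generate on one instance. It is not the MCD daemon's actual implementation. The function name, the `mcgre` interface prefix, and the hard-coded peer list (standing in for the EC2 tag lookup) are all assumptions made for illustration; only the `ip tunnel add ... mode gre` and `ip link set ... up` command forms come from standard iproute2 usage.

```python
import ipaddress


def gre_tunnel_commands(local_ip, peer_ips):
    """Build iproute2 commands for one GRE tunnel per remote peer.

    peer_ips is a hypothetical stand-in for the list of private IPs that a
    discovery step (for example an EC2 tag lookup) would return for one
    multicast group. The local address is skipped if it appears in the list.
    """
    local = ipaddress.ip_address(local_ip)
    commands = []
    index = 1
    for peer in sorted(set(peer_ips), key=ipaddress.ip_address):
        if ipaddress.ip_address(peer) == local:
            continue  # never tunnel to ourselves
        name = f"mcgre{index}"  # hypothetical interface naming scheme
        commands.append(
            f"ip tunnel add {name} mode gre local {local} remote {peer} ttl 255"
        )
        commands.append(f"ip link set {name} up")
        index += 1
    return commands


if __name__ == "__main__":
    # Example addresses are made up for the sketch.
    for line in gre_tunnel_commands("10.0.0.1", ["10.0.0.2", "10.0.0.1", "10.0.0.3"]):
        print(line)
```

The sketch only prints commands instead of running them, so it can be inspected safely. A real deployment would also need multicast forwarding configured across the tunnels, which is outside what this example shows.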
Testing Process Logical Diagram
As the individual applications create and transmit a Multicast data packet, that packet is received by the local instance kernel and replicated for each subscriber or consumer of that information. For example, if there are five subscribers or consumers of application data and the instance is transmitting a stream of around 1,000 packets per second (pps) to these five subscribers, it would consume about 5,000 pps of the instance’s network capacity.
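The fan-out arithmetic in that example can be written as a small Python helper. The function name is hypothetical, and the model assumes one full copy of every packet per subscriber, which is the behavior described above.

```python
def replicated_pps(stream_pps, subscribers):
    """Packets per second the sender emits when every packet in the
    stream is copied once for each subscriber."""
    if stream_pps < 0 or subscribers < 0:
        raise ValueError("stream_pps and subscribers must be non-negative")
    return stream_pps * subscribers


if __name__ == "__main__":
    # The example from the text: a 1,000 pps stream and five subscribers.
    print(replicated_pps(1000, 5))
```

The sender's packet load grows linearly with the number of subscribers, so doubling the subscriber count doubles the pps the instance has to sustain.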
The Results
After building out the PoC, ClearScale set up a producer of data as well as consumers of the Overlay Multicast implementation. With message lengths of around 1,500 bytes and a test bandwidth of 500 megabits per second, ClearScale ran a 60-minute test to determine the feasibility of the implementation.
The test showed that with the Overlay Multicast approach, the consumers’ actual bandwidth averaged around 59 megabytes per second, with only about 0.5% of packets lost and a jitter of about 0.012 milliseconds. Ultimately, ClearScale determined that because the Multicast implementation was using GRE tunnels, the producer nodes would need to be larger than the consumer nodes, because the Multicast packets would be sent to each consumer node directly.
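For readers comparing the figures above, the following Python sketch converts between the units used in the test and shows why the producer's outbound load scales with the number of consumers. It assumes decimal units (1 megabit = 1,000,000 bits) and ignores GRE and IP header overhead. The helper names and the consumer count of 3 are hypothetical, since the number of consumers in the test is not stated.

```python
BITS_PER_BYTE = 8


def mbit_to_mbyte(mbit_per_s):
    """Convert megabits per second to megabytes per second (decimal units)."""
    return mbit_per_s / BITS_PER_BYTE


def packets_per_second(mbit_per_s, packet_bytes):
    """Approximate packet rate for a stream of fixed-size packets."""
    return mbit_per_s * 1_000_000 / (packet_bytes * BITS_PER_BYTE)


def producer_egress_mbit(stream_mbit_per_s, consumers):
    """Outbound bandwidth on the producer when each consumer gets its own
    unicast copy of the stream over a tunnel."""
    return stream_mbit_per_s * consumers


if __name__ == "__main__":
    print(mbit_to_mbyte(500))                    # test rate in megabytes per second
    print(round(packets_per_second(500, 1500)))  # approximate packets per second
    print(producer_egress_mbit(500, 3))          # hypothetical: 3 consumers
```

Under these assumptions, a 500 megabit per second stream corresponds to 62.5 megabytes per second, which gives a reference point for the roughly 59 megabytes per second measured at the consumers.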
The End Result
The PoC proved successful and ultimately met the needs of the client. ClearScale verified that the architectural design and deployment methods used for the PoC could be applied to a full deployment of the client's infrastructure within AWS by leveraging an Overlay Multicast Network implementation.
This client engagement allowed us to demonstrate our ability to think beyond what is natively available in AWS and to look at alternative models for implementing a unique solution. Since 2011, ClearScale has shown that our cumulative experience in the cloud space often reveals unique solutions to complex problems that meet the needs of our clientele.