For many organizations, having so much data at their fingertips can be both a blessing and a curse. While lots of data provide teams with valuable insights that can guide effective decision-making, such an avalanche of information can also be a struggle to sift through and make sense of.
Data is also ever-changing — and it’s up to teams to keep up so that they’re always gleaning the most relevant insights. The teams that stand apart are the ones that can zoom in and out seamlessly, focusing on individual data points and then broadening their view for a big picture perspective.
Leveraging the right solutions is step one when trying to keep data accessible and flexible. ClearScale, an AWS Premier Consulting Partner, heard from a client who had to collect thousands of requests per second while also updating individual records. They needed a complex but flexible infrastructure to handle these tasks, so ClearScale sprang into action with a Big Data analytics solution.
This client faced a common problem: they had access to lots and lots of data but were struggling to handle the data in an organized, efficient fashion. ClearScale often has a pretty straightforward solution for moving data in batches. But this situation was complicated by the fact that this client also needed to be able to access individual records for updating — and they needed to do both of these tasks as quickly as possible.
But the client didn’t just want to work fast — they also wanted to work cost-efficiently. They wanted to use all their data, which meant they wanted a solution that could help them easily prune raw data for valuable insights. It was clear that a multi-faceted approach would be needed to address the client’s complex set of challenges.
The ClearScale Solution – AWS Data and Analytics
Moving data in batches isn’t too complicated. Typically, ClearScale starts with an Amazon Kinesis stream, which collects and processes large streams of data in real time. This data is then imported in bulk into Amazon Redshift, a data warehouse that can handle large-scale migrations. But that wouldn’t work in this case, because the client also needed to access individual records for updating. That meant ClearScale had to get creative and find a solution that enabled granular, record-level access in addition to large-scale collection.
ClearScale decided to use a combination of Amazon Kinesis streams, Amazon Aurora, and Amazon DynamoDB to solve this problem. First, data arrives in a Kinesis stream. Then, a consumer AWS Lambda function inserts this data into either a DynamoDB table or an Aurora cluster. In this design, DynamoDB holds the raw data, while Aurora holds the rolled-up data, so each of the client’s use cases is met. Lambda is ultimately the missing link that ties the entire solution together, forwarding the requests into these different databases.
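As a rough illustration of the consumer step described above, the following Python sketch decodes the records in a Kinesis-triggered Lambda event and forwards each one to either a raw store or a rollup store. It is not ClearScale's actual code. The routing rule (a `kind` field equal to `"rollup"`), the function names, and the two writer callables are assumptions made for this example; in a real deployment the writers would wrap the DynamoDB and Aurora client calls.

```python
import base64
import json
from typing import Any, Callable, Dict, List

Record = Dict[str, Any]
Writer = Callable[[Record], None]


def decode_kinesis_records(event: Dict[str, Any]) -> List[Record]:
    """Decode the base64-encoded JSON payloads of a Kinesis-triggered Lambda event."""
    decoded = []
    for rec in event.get("Records", []):
        payload = base64.b64decode(rec["kinesis"]["data"])
        decoded.append(json.loads(payload))
    return decoded


def route(record: Record) -> str:
    """Pick a destination for one record.

    Hypothetical rule: records marked kind == "rollup" go to Aurora,
    everything else is treated as raw data for DynamoDB.
    """
    return "aurora" if record.get("kind") == "rollup" else "dynamodb"


def handle(event: Dict[str, Any], write_raw: Writer, write_rollup: Writer) -> Dict[str, int]:
    """Forward every record in the event and return a per-destination count."""
    counts = {"dynamodb": 0, "aurora": 0}
    for record in decode_kinesis_records(event):
        target = route(record)
        if target == "aurora":
            write_rollup(record)  # assumed wrapper around an Aurora insert
        else:
            write_raw(record)  # assumed wrapper around a DynamoDB put
        counts[target] += 1
    return counts
```

Passing the writers in as arguments keeps the routing logic testable without AWS credentials: in a test they can be plain list `append` methods, and in Lambda they can be thin functions around the real database clients.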
Kinesis Data Flow Diagram
This approach successfully leveraged multiple tools to collect data in a way that’s both flexible and comprehensive. There is a Kinesis stream in each region, so data from all over the world flows into the pipeline. With help from Lambda, the client could control and manage the movement of this large-scale data in an easy, scalable way.
But speed and performance aren’t the only considerations for major companies that want to handle their data better. There’s also a financial component. The suite of solutions devised by ClearScale keeps costs down for this client by setting a TTL (Time to Live) on the DynamoDB table, so that raw data is pruned automatically after a set amount of time passes.
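To make the TTL idea concrete, here is a minimal Python sketch. DynamoDB's TTL feature reads a numeric attribute holding an expiry time in epoch seconds and deletes items some time after that moment passes. The attribute name `expires_at`, the seven-day retention period, and the helper names are assumptions for this example, not details from the client's setup. `enable_ttl` expects a boto3 DynamoDB client to be passed in and uses its `update_time_to_live` call.

```python
import time
from typing import Any, Dict, Optional

TTL_ATTRIBUTE = "expires_at"  # hypothetical attribute name
RETENTION_SECONDS = 7 * 24 * 60 * 60  # hypothetical retention: 7 days


def with_ttl(item: Dict[str, Any],
             now: Optional[float] = None,
             retention: int = RETENTION_SECONDS) -> Dict[str, Any]:
    """Return a copy of the item stamped with an expiry time in epoch seconds."""
    current = int(time.time()) if now is None else int(now)
    stamped = dict(item)
    stamped[TTL_ATTRIBUTE] = current + retention
    return stamped


def ttl_specification(attribute: str = TTL_ATTRIBUTE) -> Dict[str, Any]:
    """Build the TimeToLiveSpecification payload that enables TTL on a table."""
    return {"Enabled": True, "AttributeName": attribute}


def enable_ttl(client: Any, table_name: str) -> Any:
    """Turn on TTL for a table; `client` is assumed to be a boto3 DynamoDB client."""
    return client.update_time_to_live(
        TableName=table_name,
        TimeToLiveSpecification=ttl_specification(),
    )
```

Each raw record would be passed through `with_ttl` before being written, so storage for raw data stays bounded without a separate cleanup job. Expired items are removed in the background rather than at the exact expiry second, so readers that need strict cutoffs should still filter on the attribute.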
Years of experience and broad technical expertise help ClearScale confidently undertake projects like this one, which might not have a clear, straightforward solution. Carefully leveraging different tools to meet clients’ needs has led to recognition by Amazon Web Services as a Premier Consulting Partner and has helped ClearScale satisfy clients across industries.