As the breadth of demographic and financial transaction information about your customers grows, the volume of associated data can become very large, very quickly. From how they use their credit cards, to which vehicles they purchase or lease and when, to how often they move, rent, or buy a place to live, and even how often they choose checks over debit or credit cards, consumers’ financial footprints are varied and considerable.
This volume of information can be overwhelming when viewed case by case. Organizations that must determine credit behavior, foresee spending patterns, or identify populations at risk of exiting the market quickly learn that tracking this type of data by hand across an enormous customer database is impractical.
Traditional approaches to sifting through this much information relied on data tables, queries, pivots, data matching, and manual review to glean the patterns and cues that determine the appropriate course of action. As more data points become available every day, it is unreasonable to assume that any one person or team can keep up with this sort of analysis in a timely manner, let alone act on it by the time they reach their conclusions.
The Problem: Produce Customer Data Insights
Recently, a client approached ClearScale with this dilemma. They needed not only to sift through their customer data and the associated data points to infer consumer credit behavior, but also to make it a regular, painless process that produced insights as needed. With few options, they asked ClearScale, an AWS Premier Consulting Partner with Big Data Competency, to review their situation and recommend how they could use AWS services to reduce processing bottlenecks and increase throughput.
The Solution: Amazon Machine Learning
Once we investigated the scope of what they needed, our solution for the client was clear: leverage the Amazon Machine Learning service. Machine learning uses complex algorithms to perform predictive analytic modeling against indexed datasets. Amazon makes this process easier for developers by providing visualization tools and wizards for creating machine learning models, whose predictions are then obtained through simple APIs that other services can consume.
Making this process even more effective, the Amazon machine learning models not only evaluate a given indexed dataset to find usable patterns, but also adjust their approach to fine-tune those patterns, making the results more actionable.
Because it runs on AWS infrastructure, the machine learning service scales easily and can generate billions of predictions daily, served in real time through the APIs. The low cost of operation, as well as the service’s integrated data visualization tools, made this solution even more attractive for the client’s needs.
Although AWS provides several options for storing the data to be analyzed, including the results of an Amazon Redshift query or Amazon Relational Database Service (RDS), ClearScale set up a dedicated S3 bucket to store the client’s data within an Amazon Virtual Private Cloud (VPC). After setting up, fine-tuning, and testing the initial machine learning model, we ran a batch analysis against the entire dataset to see how the predictive analytics would perform. We then reviewed with our client the four sets of results that came back: entries that satisfied the data conditions sought and were marked as positive (true positives), entries that did not satisfy the data conditions and were marked as negative (true negatives), entries that did not satisfy the data conditions but were marked as positive (false positives), and entries that satisfied the data conditions but were flagged as negative (false negatives).
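The four result sets described above correspond to the cells of a confusion matrix for a binary classifier. As a minimal sketch of how such a review could be tallied locally, the following Python assumes each entry has been reduced to an (actual, predicted) pair of booleans; the function name and the sample data are hypothetical and are not taken from the client engagement.

```python
from typing import Dict, Iterable, Tuple


def confusion_counts(pairs: Iterable[Tuple[bool, bool]]) -> Dict[str, int]:
    """Tally the four outcome groups from (actual, predicted) pairs."""
    counts = {
        "true_positive": 0,   # satisfied the conditions, marked positive
        "true_negative": 0,   # did not satisfy the conditions, marked negative
        "false_positive": 0,  # did not satisfy the conditions, marked positive
        "false_negative": 0,  # satisfied the conditions, marked negative
    }
    for actual, predicted in pairs:
        if actual and predicted:
            counts["true_positive"] += 1
        elif not actual and not predicted:
            counts["true_negative"] += 1
        elif predicted:
            counts["false_positive"] += 1
        else:
            counts["false_negative"] += 1
    return counts


# Hypothetical sample of (actual, predicted) outcomes, for illustration only.
sample = [(True, True), (False, False), (False, True), (True, False), (True, True)]
counts = confusion_counts(sample)

# Precision: share of positive predictions that were correct.
precision = counts["true_positive"] / (
    counts["true_positive"] + counts["false_positive"]
)
# Recall: share of actual positives that the model found.
recall = counts["true_positive"] / (
    counts["true_positive"] + counts["false_negative"]
)
print(counts, precision, recall)
```

Precision and recall computed from these counts give a compact way to discuss with a client whether false positives or false negatives are the more costly error for their use case.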
Reviewing these datasets was critical to the success of our client. Had we not checked the various outputs from the testing phase, we would have risked feeding bad or incomplete data into the model. That would increase the likelihood of the model degrading, which would, in turn, provide inaccurate information to the client. In fact, AWS maintains a threshold that prevents the model from running if roughly 10% of the data fails due to bad or missing values.
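A check of this kind can also be run locally before data is uploaded. The sketch below is an illustration only, not AWS's own validation logic: it assumes records are dictionaries of string fields, treats a record as invalid when any required field is missing or empty, and compares the invalid share against a 10% limit that mirrors the figure mentioned above. The field names and rows are hypothetical.

```python
from typing import Dict, Optional, Sequence

Record = Dict[str, Optional[str]]


def invalid_fraction(records: Sequence[Record], required: Sequence[str]) -> float:
    """Return the share of records with a missing or empty required field."""
    if not records:
        return 0.0
    bad = sum(
        1
        for rec in records
        if any(rec.get(field) in (None, "") for field in required)
    )
    return bad / len(records)


# Hypothetical field names and rows, for illustration only.
REQUIRED_FIELDS = ["customer_id", "card_usage", "label"]
rows = [
    {"customer_id": "1", "card_usage": "12", "label": "1"},
    {"customer_id": "2", "card_usage": "", "label": "0"},    # empty field
    {"customer_id": "3", "card_usage": "7", "label": None},  # missing label
    {"customer_id": "4", "card_usage": "3", "label": "1"},
]

# Assumed local limit, chosen to mirror the roughly 10% figure in the text.
MAX_INVALID = 0.10

fraction = invalid_fraction(rows, REQUIRED_FIELDS)
ok_to_upload = fraction <= MAX_INVALID
print(f"invalid share: {fraction:.0%}, ok to upload: {ok_to_upload}")
```

Catching bad rows before upload keeps the review focused on model behavior rather than on data-quality failures.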
Once the datasets were reviewed with the client, ClearScale made final changes to the machine learning models and configured AWS to provide redundancy across several Availability Zones. Data is encrypted in transit and at rest, and all requests to the batch or real-time APIs are made over SSL.
Furthermore, up to 200 transactions per second (TPS) can be requested, allowing usage in any desktop or mobile application that needs near real-time results. All of the setup and configuration was created and is managed using AWS Identity and Access Management (IAM), so that the client can determine which users have access to manage these machine learning models and algorithms in the future.
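A client application that sends many prediction requests needs to stay within that request rate. The following is a coarse, fixed-window throttle sketched in Python under stated assumptions: `send` is a hypothetical callable standing in for whatever issues one real-time prediction request, and the 200 limit is taken from the figure above. It caps each one-second window that starts with a batch; it is not a precise sliding-window limiter.

```python
import time
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")

MAX_TPS = 200  # request rate quoted in the text


def per_second_batches(items: Iterable[T], max_tps: int = MAX_TPS) -> Iterator[List[T]]:
    """Split items into batches of at most max_tps items each."""
    if max_tps < 1:
        raise ValueError("max_tps must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == max_tps:
            yield batch
            batch = []
    if batch:
        yield batch


def send_throttled(
    items: Iterable[T],
    send: Callable[[T], object],  # hypothetical: issues one prediction request
    max_tps: int = MAX_TPS,
    sleep: Callable[[float], object] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> int:
    """Send items one batch per second and return how many were sent."""
    sent = 0
    for batch in per_second_batches(items, max_tps):
        started = clock()
        for item in batch:
            send(item)
            sent += 1
        # Wait out the rest of the one-second window before the next batch.
        remaining = 1.0 - (clock() - started)
        if remaining > 0:
            sleep(remaining)
    return sent
```

The `sleep` and `clock` parameters are injectable so the pacing logic can be exercised in tests without real delays; in production the defaults would normally be left in place.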
By fully evaluating our client’s needs before researching and recommending a course of action and implementation, ClearScale was able to develop and deploy a machine learning solution that produces customer insights as needed. The expertise we have demonstrated with the AWS platform and services has been reinforced time and again, no matter how complex the challenge.
At ClearScale, we believe that in order for our clients to succeed, we must make certain not only that our provided solution meets their current or future needs, but also that we are setting them on the road to success by completing an in-depth review of their concerns and delivering the training they need to guide their own destiny.