The applications for matchmaking powered by artificial intelligence (AI) go far beyond online dating. The use of machine learning (ML) can help match job seekers to jobs, startups to venture capitalists, or nonprofit organizations to grants. It can even match prospective college students to institutions of higher education.

However, it’s not just the match that matters. AI offers opportunities to do more with the data gathered for these applications. For example, the data gathered for matching students to colleges and universities could also be used to tailor interventions to help ensure each student’s long-term success.

That was one of the capabilities a ClearScale client, SeligoAI, envisioned for the university admissions and student-progress-tracking platform it asked us to help develop.

The idea was to gather a broad range of detailed data on individuals and run it against ML algorithms. This would generate continuously refined, probability-based intelligence that colleges and universities could use in their recruitment and retention efforts.

As an AWS Premier Consulting Partner, ClearScale drew on numerous AWS services to build the Seligo platform. The platform was deployed successfully and has performed well on the market.

SeligoAI then requested ClearScale’s assistance in enhancing the platform, particularly in terms of optimizing the use of the data gathered. Among the AWS services ClearScale is using to meet the client’s request, there are two that aren’t typically associated with recruitment and retention-related applications: Amazon Personalize and Amazon Forecast. Their unique capabilities are enabling us to develop a tool that our client can use beyond its original purpose.

Amazon Personalize Overview

Amazon Personalize is a fully managed service that trains, tunes, and deploys ML models in the cloud. It’s commonly used for applications such as product recommendations for eCommerce, hotel recommendations for travel websites, and matches for dating sites.

Users provide an activity stream from their applications and websites, an inventory of the information they want to recommend, and optional demographic information. Amazon Personalize processes and examines the data, identifies what is meaningful or relevant, and makes appropriate recommendations.
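To make the inputs concrete, here is a minimal sketch of how an activity stream and an item inventory can be turned into recommendations. It is a toy co-occurrence recommender written for illustration only; it is not the algorithm Amazon Personalize uses, and the student IDs, item names, and function names are all hypothetical.

```python
from collections import Counter, defaultdict
from itertools import combinations

def build_cooccurrence(interactions):
    """Count how often two items appear in the same user's activity."""
    items_by_user = defaultdict(set)
    for user_id, item_id in interactions:
        items_by_user[user_id].add(item_id)
    cooccur = defaultdict(Counter)
    for items in items_by_user.values():
        for a, b in combinations(sorted(items), 2):
            cooccur[a][b] += 1
            cooccur[b][a] += 1
    return items_by_user, cooccur

def recommend(user_id, items_by_user, cooccur, catalog, k=3):
    """Rank unseen catalog items by how often they co-occur with seen items."""
    seen = items_by_user.get(user_id, set())
    scores = Counter()
    for item in seen:
        for other, count in cooccur[item].items():
            if other in catalog and other not in seen:
                scores[other] += count
    return [item for item, _ in scores.most_common(k)]

# Hypothetical activity stream: (student, item the student interacted with).
interactions = [
    ("s1", "engineering_open_day"), ("s1", "robotics_club"),
    ("s2", "engineering_open_day"), ("s2", "robotics_club"),
    ("s2", "scholarship_webinar"),
    ("s3", "engineering_open_day"),
]
# Hypothetical inventory of items that may be recommended.
catalog = {"engineering_open_day", "robotics_club", "scholarship_webinar"}

items_by_user, cooccur = build_cooccurrence(interactions)
suggestions = recommend("s3", items_by_user, cooccur, catalog)
```

In this sketch, student "s3" has only attended the open day, so the items most often seen alongside it in other students' activity are suggested first.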

The data remains private and secure and is only used for customized recommendations. Customers only pay for what they use.

Amazon Personalize makes it possible to create truly individualized notifications, marketing activities, student recommendations, and re-ranked results. In our case, ClearScale uses it to help schools answer the questions “How can we attract students?” and “How can we find the most prospective students?”

Amazon Forecast Overview

Amazon Forecast is a fully managed service that uses ML to combine historical time series with other variables to build forecasts. It is typically used in applications for forecasting product demand, manufacturing demand, travel demand, IT capacity, logistics, and web traffic.

Users import data into a database. From there, Amazon Forecast automatically loads the data, inspects it, and identifies what’s meaningful. It then produces a custom forecasting model capable of making predictions that are up to 50% more accurate than looking at time series data alone.

As with Amazon Personalize, users pay only for what they use.

Amazon Forecast allows our client to address the questions “Should we conduct a marketing activity for the specific prospective student and which activity in particular?” and “When should it occur?”

The Base Solution

The initial solution ClearScale created was based on SeligoAI’s specific desire for a college/university recruitment and retention tool. An overview of it, including its architecture and the AWS services used to build it, can be found here.

In brief, the platform integrates probability analysis into the funnel management system employed at universities and colleges. It gathers a wide range of data about prospective students. It then uses AI algorithms enabled by Amazon ML to analyze the data against predetermined indices.

The resulting probability values simplify prospect evaluation, providing contextual intelligence to better assess student potential. The platform is accessible to college and university staff via web and mobile applications.
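As a rough illustration of what a probability value built from indices can look like, the sketch below combines a few index scores into a single probability with a logistic function and ranks prospects by it. This is an assumption made for illustration, not the model used in the Seligo platform; the index names, weights, and bias are hypothetical.

```python
import math

def match_probability(indices, weights, bias=0.0):
    """Map weighted index scores to a probability with the logistic function."""
    z = bias + sum(weights[name] * value for name, value in indices.items())
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical weights; in practice these would be learned from labeled data.
weights = {"academic_fit": 1.2, "engagement": 0.8, "financial_fit": 0.5}

# Hypothetical index scores for two prospects.
prospects = {
    "prospect_a": {"academic_fit": 1.0, "engagement": 0.5, "financial_fit": 0.2},
    "prospect_b": {"academic_fit": -0.5, "engagement": 0.1, "financial_fit": 0.0},
}

probabilities = {
    name: match_probability(indices, weights, bias=-1.0)
    for name, indices in prospects.items()
}
# Highest probability first, so staff can see the strongest prospects at a glance.
ranked = sorted(probabilities, key=probabilities.get, reverse=True)
```

A single number per prospect is what makes the funnel easy to sort and filter, whatever model produces it.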

Opening the Door for Amazon Forecast and Personalize

ClearScale is addressing a key challenge SeligoAI has been facing: a lack of labeled historical data. That’s a problem for ML-powered applications because many algorithms can only learn to predict a target attribute if a human has already labeled examples of it.

It’s also an expensive proposition for companies wishing to fix the problem. That’s because labeling by humans is a slow, labor-intensive process. Many companies can’t handle labeling in-house or afford to outsource it.

ClearScale is meeting the challenge by employing Amazon SageMaker Ground Truth. It uses an ML model to automatically label raw data, producing high-quality training datasets quickly at a fraction of the cost of manual labeling. Data is only routed to humans if an active learning model can’t confidently label it. Using Ground Truth will substantially lower costs for our client.
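The routing idea can be sketched in a few lines: keep the machine label when the model is confident, and send everything else to a human. This is only an illustration of confidence-based routing, not the service's internal logic; the threshold value, record IDs, and labels are hypothetical.

```python
def route_for_labeling(predictions, threshold=0.9):
    """Split model predictions into auto-labeled items and items for human review.

    Each prediction is (item_id, predicted_label, confidence between 0 and 1).
    """
    auto_labeled, needs_human = [], []
    for item_id, label, confidence in predictions:
        if confidence >= threshold:
            auto_labeled.append((item_id, label))
        else:
            needs_human.append(item_id)
    return auto_labeled, needs_human

# Hypothetical model output for three raw records.
predictions = [
    ("rec-001", "enrolled", 0.97),
    ("rec-002", "withdrew", 0.55),
    ("rec-003", "enrolled", 0.91),
]

auto_labeled, needs_human = route_for_labeling(predictions)
```

Raising the threshold sends more records to people and costs more; lowering it saves money but accepts more machine-labeling errors.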

The labeled data can then be fed to Amazon Forecast. The service will use machine learning to predict if a particular student will be a good match for a particular institution and have the potential to succeed there. College and university personnel can then choose which targeted marketing and recruitment activities to conduct.

To improve the effectiveness of these efforts, the data can then be passed on to Amazon Personalize. This will enable the delivery of highly personalized, relevant information and experiences to specific students quickly.

Amazon Personalize also works with Amazon Pinpoint. The digital user engagement service is being integrated into the solution and will be used for push notifications. Pinpoint makes it easy to power personalized content recommendations and targeted marketing promotions while capturing insights about how targets are interacting with engagements across all channels.

Using Amazon Forecast and Amazon Personalize will provide SeligoAI with a way to create data-informed strategies for student recruitment, engagement, retention, and success. It can also be used for similar applications in other industries where large amounts of disparate data can be leveraged to forecast a probability and then create highly personalized campaigns to exploit it.

Data Gathering

The best-designed platform for any kind of matching application is only as good as the data it uses. The quantity of data matters as well. The more data to work with, the greater the ability to generate the desired, highly accurate results.

Data is available from a wide variety of sources, including mobile devices. The problem is that the ability to extract data and prepare it for analysis is often limited to large companies with large budgets.

ClearScale is helping to resolve this issue, in part, with the use of Amazon Kinesis Data Streams. Kinesis Data Streams continuously captures gigabytes of data per second from hundreds of thousands of sources to enable real-time analytics. It’s used in conjunction with Amazon EMR, a cloud-native Big Data platform that allows for processing vast amounts of data quickly and cost-effectively at scale.
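One reason a stream can absorb data from so many sources is that records are spread across shards by partition key. Kinesis Data Streams maps each partition key to a 128-bit integer with an MD5 hash and assigns the record to the shard whose hash key range contains it. The sketch below imitates that mapping locally, assuming the shards split the key space evenly and that the hash is read as a big-endian integer; the device IDs are hypothetical, and no AWS call is made.

```python
import hashlib
from collections import Counter

HASH_KEY_SPACE = 2 ** 128  # an MD5 digest is a 128-bit value

def shard_for_key(partition_key: str, num_shards: int) -> int:
    """Return the index of the shard whose hash range contains this key.

    Assumes num_shards shards that divide the hash key space evenly.
    """
    digest = hashlib.md5(partition_key.encode("utf-8")).digest()
    hash_key = int.from_bytes(digest, "big")
    return hash_key * num_shards // HASH_KEY_SPACE

# Hypothetical sources: 1,000 devices, each using its own ID as partition key.
device_ids = [f"device-{n}" for n in range(1000)]
load_per_shard = Counter(shard_for_key(d, num_shards=4) for d in device_ids)
```

Because the same key always lands on the same shard, events from one device stay in order, while different devices spread across the available shards.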

Data Storage and Preparation

With the ability to capture more data from disparate sources comes the need for storage for unrelated datasets.

All data sources have their own schema, making it difficult to place them in traditional relational databases. Data also doesn’t have explicit similarities. To deal with these issues, ClearScale is using AWS Glue, AWS Lake Formation, and AWS Glue ML Transforms.

Amazon S3 was selected as the primary storage platform for the solution’s associated data lake because of its virtually unlimited scalability. Its capacity grows seamlessly and non-disruptively, with the client only paying for what is used. All data types can be stored in their native formats. It also integrates with AWS Glue to query and process data.

AWS Lake Formation makes it quick and easy to set up a secure data lake that breaks down data silos and combines different types of analytics to gain insights and guide better business decisions.

AWS Glue, a fully managed, pay-as-you-go extract, transform, and load (ETL) service, automates the time-consuming steps of data preparation for analytics. It automatically discovers and profiles data via the Glue Data Catalog.

Because SeligoAI must work with disparate sets of data drawn from numerous sources, fuzzy matching, which entails identifying non-exact matches of a target item, will be employed to align similar categories and entities in the datasets, as well as to join datasets on specific entities. Duplicate records can be a problem in applications that require fuzzy matching. The AWS Glue FindMatches ML Transform used in the solution helps identify matching records. It can also find duplicate or linked records in unrelated datasets.
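To show what a non-exact match means in practice, here is a small sketch that compares names from different sources using Python's standard difflib module. It is a simple string-similarity stand-in, not the FindMatches transform, which learns from labeled examples; the record IDs, names, and the 0.85 threshold are hypothetical.

```python
import re
from difflib import SequenceMatcher
from itertools import combinations

def normalize(text):
    """Lowercase and drop punctuation so formatting differences do not count."""
    return re.sub(r"[^a-z0-9 ]", "", text.lower()).strip()

def similarity(a, b):
    """Return a similarity ratio between 0 and 1 for two strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()

def find_matches(records, threshold=0.85):
    """Return pairs of record IDs whose names are similar enough to link."""
    pairs = []
    for (id_a, name_a), (id_b, name_b) in combinations(records, 2):
        if similarity(name_a, name_b) >= threshold:
            pairs.append((id_a, id_b))
    return pairs

# Hypothetical records for the same people arriving from different sources.
records = [
    ("crm-1", "Jon Smith"),
    ("app-7", "John Smith"),
    ("crm-2", "Maria Garcia"),
]

linked = find_matches(records)
```

Here "Jon Smith" and "John Smith" are linked despite the spelling difference, while "Maria Garcia" stays separate. Comparing every pair does not scale to large datasets, which is one reason a managed transform is attractive.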

On-Device Marketing

In 2019, 16% of internet users in the United States alone will use a mobile phone exclusively to go online. The number of mobile phone internet users in the United States is estimated to increase from 237 million in 2017 to 275 million by 2022, according to mobile device usage statistics in 2018. That means mobile devices represent a huge opportunity for reaching individual users with marketing tactics.

ClearScale’s solution also integrates the necessary components to take advantage of this opportunity. It will enable on-device marketing capabilities that colleges and universities can use remotely. This includes:

  • Adding mentions of schools into mobile device searches via Spotlight for iPhone and iPad. Spotlight is a tool for quickly searching a mobile device, the web, the App Store, and Maps.
  • Using geofencing marketing, which can show local notifications without internet connectivity based on geo-position, daily visit activity, and other variables.
  • Using advertising beacons based on individual devices, which can broadcast notifications even after the application is deleted.
  • System-level integration with social networks, enabling students to share information on institutions and related topics.
  • Enabling institutions and their partners to offer promos for students via Apple Wallet.
  • Using Apple StoreKit, along with Amazon Forecast and Amazon Personalize, to show ratings or reviews only when it’s likely that a user will provide the most accurate and positive feedback.
  • Using Amazon Forecast to predict which specific university-provided communications should be routed to Apple News to have the most impact on a target audience.
  • Triggering Siri suggestions about an institution’s events, offers, and activities.
  • Using the iMessage app for text messages, just-in-time offers, reviews, polls, and other activities without installing the customer’s app.
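The geofencing item in the list above comes down to one check: is the device within a given radius of a point of interest? The sketch below shows that check with the haversine great-circle distance. It is written in Python for illustration only; on a device, the platform's location services would normally perform this monitoring, and the coordinates and 500-meter radius are hypothetical.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two latitude/longitude points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def inside_geofence(device, center, radius_m):
    """True if the device position lies within radius_m of the fence center."""
    return haversine_m(*device, *center) <= radius_m

# Hypothetical campus location and device positions (latitude, longitude).
campus = (40.0, -75.0)
near_device = (40.001, -75.0)  # roughly 111 m north of the campus point
far_device = (40.01, -75.0)    # roughly 1.1 km north of the campus point

notify_near = inside_geofence(near_device, campus, radius_m=500)
notify_far = inside_geofence(far_device, campus, radius_m=500)
```

Because the check needs only the device's own position and a stored list of fences, it can run without a network connection, which is what allows local notifications to be shown offline.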

Other activities using these and similar capabilities can help customers in other industries also expand their marketing activities and deliver them to end-user devices.

More from AWS

Beyond services to create and deliver marketing capabilities, ClearScale’s solution incorporates components for easier management of the workflows entailed.

This includes using AWS Step Functions, a workflow management tool, for data pipeline orchestration. AWS Step Functions coordinates multiple services into serverless workflows composed of a series of steps, with the output of one acting as the input for the next.

Step Functions automatically triggers and tracks each step and retries when there are errors. Step Functions also logs the state of each step so any problems can be diagnosed and debugged quickly.
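Step Functions workflows are described in the Amazon States Language, a JSON format. As a sketch of what a three-step pipeline with retries might look like, the code below builds such a definition as a Python dictionary. The state names, the placeholder Lambda ARNs, and the retry settings are hypothetical and are not taken from the actual solution.

```python
import json

# Placeholder ARN prefix; a real definition would reference deployed resources.
ARN_PREFIX = "arn:aws:lambda:us-east-1:123456789012:function:"

def task(function_name, next_state=None):
    """Build a Task state that retries on failure with exponential backoff."""
    state = {
        "Type": "Task",
        "Resource": ARN_PREFIX + function_name,
        "Retry": [{
            "ErrorEquals": ["States.TaskFailed"],
            "IntervalSeconds": 5,   # wait before the first retry
            "MaxAttempts": 3,       # give up after three retries
            "BackoffRate": 2.0,     # double the wait after each attempt
        }],
    }
    if next_state:
        state["Next"] = next_state  # output of this step feeds the next one
    else:
        state["End"] = True         # last step in the workflow
    return state

definition = {
    "Comment": "Hypothetical data pipeline sketch",
    "StartAt": "IngestData",
    "States": {
        "IngestData": task("ingest", "PrepareData"),
        "PrepareData": task("prepare", "ScoreProspects"),
        "ScoreProspects": task("score"),
    },
}

def execution_order(defn):
    """Follow the Next links from StartAt to list steps in run order."""
    order, name = [], defn["StartAt"]
    while name:
        order.append(name)
        name = defn["States"][name].get("Next")
    return order

definition_json = json.dumps(definition, indent=2)
steps = execution_order(definition)
```

Keeping the retry policy in the definition rather than in each function means error handling is declared once and applied consistently by the service.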

Other key AWS services used in the solution include:

  • Amazon Neptune — a fast, reliable, fully-managed graph database that makes it easy to work with highly connected datasets.
  • Amazon Redshift — a data warehouse that allows for querying data in a data lake without moving it or transforming it into a set schema.
  • Amazon Aurora — used as the online transactional processing endpoint in the solution, the MySQL and PostgreSQL-compatible relational database combines the performance and availability of traditional enterprise databases with the simplicity and cost-effectiveness of open-source databases.

The resulting platform is now used by the client in human resources-oriented applications. It’s also utilized for college and university recruitment and retention.

Learn More About ClearScale

ClearScale delivers best-of-breed cloud systems integration and app development. As an AWS Premier Consulting Partner, we design and implement custom solutions using a wide range of AWS services. Our team of AWS-certified and experienced engineers, architects, and developers has a long track record of successful projects for cloud-based application development, cloud migration, Big Data, and much more.

To learn how ClearScale can help your company leverage Amazon Forecast, Personalize, or other AWS services in your next project, contact us.