By Anthony Loss, Lead Solutions Architect, ClearScale
There’s no denying the breadth of services that AWS offers developers. The problem is that some of these services offer similar capabilities and work for the same use cases. That can make it difficult to determine the optimal one to employ. Consider the AWS Step Functions vs. Lambda dilemma.
You can use both services for many of the same types of projects and tasks, such as building web apps and creating workflows. Each can help automate tasks and, in doing so, optimize budgets. Both are serverless, so there’s no infrastructure to manage. And that’s just a few of the similarities.
This all leads to the AWS Step Functions vs. Lambda quandary. When should you use one or the other? Are there advantages and disadvantages of using one service over the other?
The answers aren’t clear-cut, even though the two services really are very different. AWS Lambda is an event-driven compute service; AWS Step Functions is a visual workflow service. The information that follows helps distinguish the two services and provides some of the considerations for determining which to use when.
AWS Lambda Overview
AWS Lambda is a serverless computing service that runs code in response to events, such as changes in state or an update, and automatically manages the underlying compute resources. You can run functions independently of other code and output a result directly to users or other functions, or for consumption by other services.
Because it’s serverless, AWS Lambda performs all the operational and administrative activities for you. This includes provisioning capacity, deploying your code, monitoring and logging your code, and applying security patches to the underlying compute resources. (That’s right, there’s no need to install CloudWatch agents like you would on EC2. It’s already done for you.)
Pricing is based on the number of requests and how long your code runs. You pay only for the compute time used; there’s no charge when your code isn’t running.
AWS Lambda Specifications
AWS Lambda supports the following languages and runtimes: Node.js 10 and 12; Python 2.7, 3.6, 3.7, and 3.8; Ruby 2.5 and 2.7; Go 1.x; .NET Core 2.1 and 3.1; and Java 8 and 11.
Functions run inside AWS-managed containers that you don’t need to manage or configure, each based on a 64-bit Amazon Linux AMI image. This limits you to binaries compiled for that environment. However, you can use environment variables to bring in outside libraries that Lambda may not support natively, for example in C# applications. You can allocate between 128 MB and 3,008 MB of memory in 64 MB increments. CPU is allocated in linear proportion to memory, with 1,792 MB of memory granting access to the equivalent of one vCPU.
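To make the memory-to-CPU relationship concrete, here is a minimal sketch that estimates the vCPU share for a given memory setting. It assumes strictly linear scaling and uses only the figures quoted above; the function name and constants are illustrative, not part of any AWS API.

```python
# Figures quoted in the text above; treated here as assumptions.
MIN_MEMORY_MB = 128
MAX_MEMORY_MB = 3008
MB_PER_VCPU = 1792


def approx_vcpus(memory_mb: int) -> float:
    """Estimate the vCPU share for a memory setting, assuming linear scaling."""
    if not MIN_MEMORY_MB <= memory_mb <= MAX_MEMORY_MB:
        raise ValueError(
            f"memory must be between {MIN_MEMORY_MB} and {MAX_MEMORY_MB} MB"
        )
    return memory_mb / MB_PER_VCPU


if __name__ == "__main__":
    for memory_mb in (128, 1792, 3008):
        print(f"{memory_mb} MB -> ~{approx_vcpus(memory_mb):.2f} vCPU")
```

In practice, this means a CPU-bound function may finish faster, and sometimes cost no more, with a higher memory setting.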
AWS Lambda allows a function to run for a limited time before stopping. The default is three seconds. The maximum allowed time is 15 minutes.
The /tmp directory is used as ephemeral disk space. Don’t rely on files written there being available to subsequent invocations. Uncompressed function packages are limited to 250 MB, and compressed packages to 50 MB.
How AWS Lambda Works
To understand how AWS Lambda works, it’s important to understand its components and execution model. The components include Lambda functions and packaging functions.
Lambda functions are standalone functions invoked by the AWS Lambda engine and terminated when they finish their work. Packaging functions compress Lambda functions, including their dependencies, and transfer them to AWS.
The execution model employs:
- Function code. This is the function that is run when Lambda is triggered.
- Configuration specs. These are the details that define how functions are executed and what resources are required.
- Environment variables. These variables enable you to store runtime information in key-value pairs and define the function configuration externally.
- Events. These are requests that are served by one instance of a Lambda function. Events typically stream from other parts of the Amazon ecosystem — for example from databases like DynamoDB or changes to files on S3 — and trigger a Lambda function that processes the data in the event.
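The pieces of the execution model listed above can be seen together in a minimal Python handler sketch. It assumes the function is triggered by an S3 notification, whose event carries a list of records with the object key under `s3.object.key`. The environment variable name `OUTPUT_PREFIX` and the response shape are hypothetical choices made for this example.

```python
import json
import os


def handler(event, context):
    """Function code: runs each time Lambda is triggered with an event."""
    # Environment variable: configuration kept outside the code.
    # OUTPUT_PREFIX is a hypothetical name used only in this sketch.
    prefix = os.environ.get("OUTPUT_PREFIX", "processed/")

    # Event: assumed here to be an S3 notification with a "Records" list.
    records = event.get("Records", [])
    keys = [record["s3"]["object"]["key"] for record in records]

    # A real function would read and transform each object; this sketch
    # only reports the names it would write to.
    body = {"processed": [prefix + key for key in keys]}
    return {"statusCode": 200, "body": json.dumps(body)}
```

The configuration specs (memory, timeout, IAM role) do not appear in the code at all; they are set on the function itself when you create or update it.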
The process starts when you create a function by uploading your code or building it in the AWS Lambda console. You choose the memory, timeout period, and AWS IAM role. Next, you specify the AWS or third-party source to trigger the function.
AWS Lambda allocates a container and loads the code resources necessary to execute the function. Each function runs in its own container. When an event triggers a function, AWS Lambda runs the function, launching and managing the compute resources as needed to keep up with incoming requests.
Each instance can only process one invocation at a time. To handle simultaneous invocations, the number of instances can autoscale to meet request demands.
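Because each instance handles one invocation at a time, the number of instances needed at steady state can be roughly estimated as the request rate multiplied by the average duration. The sketch below illustrates that arithmetic only; the function name is hypothetical, and real traffic is burstier than an average suggests.

```python
import math


def estimated_concurrency(requests_per_second: float,
                          avg_duration_seconds: float) -> int:
    """Rough number of instances busy at once under steady traffic."""
    if requests_per_second < 0 or avg_duration_seconds < 0:
        raise ValueError("rate and duration must be non-negative")
    # One instance serves one invocation at a time, so concurrent
    # instances ~= arrivals per second x seconds each one stays busy.
    return math.ceil(requests_per_second * avg_duration_seconds)


if __name__ == "__main__":
    print(estimated_concurrency(100, 0.5))
```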
AWS Lambda Features and Benefits
Lambda features and benefits include:
- The ability to use any third-party or native libraries. You can also package frameworks, SDKs, libraries, and more as a Lambda layer, and manage and share them across multiple functions.
- Completely automated administration, freeing you to focus on building differentiated backend services.
- Function packaging and deployment as container images, making it easy to build Lambda-based apps using familiar container image tooling, workflows, and dependencies.
- Built-in fault tolerance by virtue of AWS Lambda maintaining compute capacity across multiple Availability Zones (AZs) in each AWS region to help protect code against individual machine or data center facility failures.
- Automatic scaling to support the rate of incoming requests without any manual configuration.
- The ability to add custom logic to AWS resources, so you can easily apply compute to data as it enters or moves through the cloud.
- The ability to create new backend application services triggered on demand using the AWS Lambda API or custom API endpoints built using Amazon API Gateway.
- Built-in SDK that integrates with AWS IAM to ensure secure code access to other AWS services.
- AWS Lambda extensions that enable easy integration with various monitoring, observability, security, and governance tools.
AWS Lambda Challenges
Despite all its benefits, AWS Lambda has its share of challenges. For example, in terms of logging and monitoring apps in AWS Lambda, you can’t count on background daemons to monitor a web server. The code runs on ephemeral containers. Application logs can’t be persisted locally for later inspection or syncing with external platforms.
Debugging AWS Lambda functions can be difficult. Logs from different invocations are mixed together in AWS CloudWatch logs. Microservices and event-driven architectures also make debugging harder since the execution of a given business rule or job is scattered through multiple functions and message buffer systems.
In addition, a Lambda function invocation can only last up to 15 minutes. Longer-running processes may require more time. In these cases, you must break down tasks into chunks that can be processed within the timeout period.
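One simple way to break work into chunks, sketched below, is to split the input into fixed-size batches and hand each batch to a separate invocation. The helper name and the idea of sizing batches by item count are assumptions for illustration; the right chunk size depends on how long each item takes to process.

```python
from typing import Iterator, List, Sequence, TypeVar

T = TypeVar("T")


def chunked(items: Sequence[T], chunk_size: int) -> Iterator[List[T]]:
    """Yield consecutive batches of at most chunk_size items."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    for start in range(0, len(items), chunk_size):
        yield list(items[start:start + chunk_size])


if __name__ == "__main__":
    # Each batch would be sent to its own invocation so that no single
    # run has to process the whole input within the timeout.
    for batch in chunked(["a", "b", "c", "d", "e"], 2):
        print(batch)
```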
The maximum memory that can be allocated to a Lambda function is 3 GB. If an application requires more, you must break down the compute processing task into chunks that can fit within the memory limit. Yet another issue: the deployment package can be at most 250 MB when uncompressed.
When to Use AWS Lambda
AWS Lambda is an ideal compute service for many use cases, as long as you can run your code using the AWS Lambda standard runtime environment and with the resources AWS Lambda provides.
Common use cases include data analytics, edge computing, file processing, mobile backends, web apps, rapid document conversion, predictive page rendering, log analysis on the fly, automated backups, processing uploaded S3 objects, backend cleaning, and bulk real-time data processing.
AWS Step Functions Overview
AWS Step Functions is a serverless orchestration service that lets you coordinate multiple Lambda functions into flexible workflows that are easy to debug and change. Think of AWS Step Functions as Lambda orchestration. You can easily manage a serverless application that is loosely coupled and cloud-native, at any size. Through the graphical console, you see your app’s workflow as a series of event-driven steps. At each step, AWS Step Functions manages input, output, error handling, and retries. It logs the state of each step. If things go wrong, you can diagnose and debug problems quickly.
You can choose a standard workflow for processes that are long-running or that require human intervention. Or you can use express workflows for high-volume, short-running (under five minutes) processes.
The billing model is volume-based: what you pay depends on the number of times a step in your workflow is executed.
AWS Step Functions Specifications
State machines in AWS Step Functions are defined in JSON using the declarative Amazon States Language.
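As an illustration of what such a definition looks like, the sketch below builds a two-step state machine as a Python dictionary and prints it as Amazon States Language JSON. The state names, account ID, and function ARNs are hypothetical placeholders.

```python
import json

# Hypothetical ARNs and state names, used only for illustration.
definition = {
    "Comment": "Two Lambda functions run in sequence",
    "StartAt": "ValidateOrder",
    "States": {
        "ValidateOrder": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:validate-order",
            "Next": "ChargeCustomer",
        },
        "ChargeCustomer": {
            "Type": "Task",
            "Resource": "arn:aws:lambda:us-east-1:123456789012:function:charge-customer",
            "End": True,
        },
    },
}

if __name__ == "__main__":
    # The printed JSON is the text you would supply as the definition.
    print(json.dumps(definition, indent=2))
```

Building the definition in code rather than by hand makes it easier to check, for example, that every "Next" target names a state that exists.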
To create an activity worker, you can use any programming language as long as you can communicate with AWS Step Functions using web service APIs. You can also use an AWS SDK in your choice of language.
Step Functions has two workflow types. With standard workflows, each step will execute exactly once and can run for up to one year. With express workflows, one or more steps may execute more than once and can run for up to five minutes.
Standard workflows:
- 2,000 per second execution rate
- 4,000 per second state transition rate
- Priced per state transition
- Shows execution history and visual debugging
- Supports all service integrations and patterns
Express workflows:
- 100,000 per second execution rate
- Nearly unlimited state transition rate
- Priced per number and duration of executions
- Sends execution history to Amazon CloudWatch
- Supports all service integrations
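The choice between the two workflow types can be reduced to a rough rule of thumb, sketched below. It considers only the duration limits and the exactly-once guarantee described above; the function name is hypothetical, and a real decision would also weigh execution rate and pricing.

```python
FIVE_MINUTES_SECONDS = 5 * 60
ONE_YEAR_SECONDS = 365 * 24 * 60 * 60


def suggest_workflow_type(max_duration_seconds: int,
                          needs_exactly_once: bool) -> str:
    """Suggest a workflow type from duration and execution guarantees."""
    if max_duration_seconds > ONE_YEAR_SECONDS:
        raise ValueError("no workflow type runs longer than one year")
    # Express steps may run more than once and must finish within five
    # minutes, so either requirement points to a standard workflow.
    if needs_exactly_once or max_duration_seconds > FIVE_MINUTES_SECONDS:
        return "STANDARD"
    return "EXPRESS"


if __name__ == "__main__":
    print(suggest_workflow_type(30, needs_exactly_once=False))
    print(suggest_workflow_type(3600, needs_exactly_once=False))
```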
How AWS Step Functions Works
AWS Step Functions employs two main concepts: state machines and tasks. A state machine is a workflow. A task is a state in a workflow that represents a single unit of work that another service performs. Think of a task as a single Lambda function. Each step in a workflow is a state. A state can do one of the following:
- Performs some work in the state machine
- Makes a choice between branches of execution
- Stops execution with failure or success
- Passes its input to its output or injects some fixed data
- Provides a delay for a certain amount of time or until a specified time/date
- Begins parallel branches of execution
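Several of the behaviors listed above correspond to state types in the Amazon States Language: Choice, Wait, Pass, Succeed, and Fail. The sketch below wires them into a small definition and checks that every transition points at a state that exists. Task and Parallel states are left out to keep it short, and all state names and values are hypothetical.

```python
import json

# Every name and value below is hypothetical and for illustration only.
definition = {
    "StartAt": "CheckApproval",
    "States": {
        # Makes a choice between branches of execution.
        "CheckApproval": {
            "Type": "Choice",
            "Choices": [
                {"Variable": "$.approved", "BooleanEquals": True, "Next": "CoolDown"}
            ],
            "Default": "Rejected",
        },
        # Provides a delay for a certain amount of time.
        "CoolDown": {"Type": "Wait", "Seconds": 30, "Next": "AddStatus"},
        # Passes its input to its output and injects some fixed data.
        "AddStatus": {
            "Type": "Pass",
            "Result": {"status": "approved"},
            "ResultPath": "$.review",
            "Next": "Done",
        },
        # Stops execution with success or failure.
        "Done": {"Type": "Succeed"},
        "Rejected": {"Type": "Fail", "Error": "NotApproved", "Cause": "Request was not approved"},
    },
}


def referenced_states(machine: dict) -> set:
    """Collect every state name that the definition transitions to."""
    targets = {machine["StartAt"]}
    for state in machine["States"].values():
        for key in ("Next", "Default"):
            if key in state:
                targets.add(state[key])
        for choice in state.get("Choices", []):
            targets.add(choice["Next"])
    return targets


if __name__ == "__main__":
    missing = referenced_states(definition) - set(definition["States"])
    print("undefined states:", sorted(missing))
    print(json.dumps(definition, indent=2))
```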
A task represents a single unit of work performed by a state machine. A task works by:
- Invoking an AWS Lambda function
- Using an activity, which is the code that awaits input from an operator
- Calling the API of another service
Using AWS Step Functions, you define state machines that describe your workflow as a series of steps, their relationships, and their inputs and outputs. When a state machine is created, AWS Step Functions stitches the components together and shows you how your system is configured.
The visual console automatically graphs each state in the order of execution, making it easy to design multi-step apps. The console highlights the real-time status of each step and provides a detailed history of every execution.
AWS Step Functions Features and Benefits
Features and benefits of AWS Step Functions include:
- Workflow configuration. You define your workflows as state machines, which transform complex code into easy-to-understand statements and diagrams.
- Built-in service primitives. AWS Step Functions provides ready-made steps for your workflow that implement basic service primitives, so you can remove that logic from your app.
- AWS service integrations. You can use service integrations to call over 200 AWS services.
- Coordination of distributed components. AWS Step Functions can coordinate any app that can make an HTTPS connection, regardless of where it is hosted.
- Component reuse. AWS Step Functions coordinates your existing Lambda functions and microservices into robust apps, so you can rewire them into new compositions.
- Workflow abstraction. AWS Step Functions separates the logic of your app from its implementation. You can add, move, swap, and reorder steps without having to make changes to your business logic.
- State management. AWS Step Functions maintains the state of your app during execution, so you don’t have to manage state yourself with data stores or by building complex state management into your tasks.
- Built-in error handling. AWS Step Functions automatically handles errors and exceptions with built-in try/catch and retry.
- History of each execution. AWS Step Functions delivers real-time diagnostics and dashboards, integrates with Amazon CloudWatch and AWS CloudTrail, and logs every execution.
- Visual monitoring. You can watch the steps execute visually and quickly verify that everything operates as expected. The console clearly highlights errors, so you can identify root causes and troubleshoot issues.
- High availability. AWS Step Functions has built-in fault tolerance and maintains service capacity across multiple Availability Zones in each region to protect apps against individual machine or data center failures.
- Automatic scaling. AWS Step Functions automatically scales the operations and underlying compute to run the steps of your app in response to changing workloads.
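The built-in error handling mentioned in the list above is expressed in the Amazon States Language through "Retry" and "Catch" fields on a state. The sketch below shows one task state with both, plus a small helper that works out the wait before each retry from the interval and backoff rate. The state name, ARN, fallback state, and helper function are hypothetical.

```python
import json

# Hypothetical task state: the ARN and the "NotifyFailure" state it falls
# back to are placeholders for this sketch.
process_file = {
    "Type": "Task",
    "Resource": "arn:aws:lambda:us-east-1:123456789012:function:process-file",
    "Retry": [
        {
            "ErrorEquals": ["States.Timeout"],
            "IntervalSeconds": 2,
            "MaxAttempts": 3,
            "BackoffRate": 2.0,
        }
    ],
    "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "NotifyFailure"}],
    "End": True,
}


def retry_delays(retrier: dict) -> list:
    """Seconds waited before each retry: the interval grows by the backoff rate."""
    return [
        retrier["IntervalSeconds"] * retrier["BackoffRate"] ** attempt
        for attempt in range(retrier["MaxAttempts"])
    ]


if __name__ == "__main__":
    print(json.dumps(process_file, indent=2))
    print(retry_delays(process_file["Retry"][0]))
```

Keeping retry and fallback rules in the workflow definition means the Lambda function itself can stay free of retry loops.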
AWS Step Functions Challenges
Although AWS Step Functions makes it easier to create and manage complex workflows, it also presents some challenges. Among them:
- By decoupling business logic from workflow logic, your application code can be more difficult to understand for others on your team who may need to modify or update it.
- State machines can only be defined in the proprietary, JSON-based Amazon States Language.
- A maximum of 256 KB of data can pass through your workflows. The maximum execution time for a state machine is one year. Execution history is retained for only 90 days.
- AWS Step Functions has limits per workflow and per AWS account. You can change some, but not all, by submitting a limit increase request via the AWS support center. For example, a workflow can’t have more than 25,000 state transitions in a single execution.
- A Step Functions request can’t have a payload larger than 1 MB. If some parts of a workflow use the AWS API inefficiently, a large number of requests from your workflow can trigger the API limits.
- If you move away from AWS Step Functions in the future, you must redefine your application workflows manually or with a different vendor.
- Each state machine exposes data to AWS CloudWatch, but this built-in observability isn’t sufficient to monitor all functions and microservices.
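The data-size limit in the list above is easy to hit without noticing, so it can help to check payloads before they enter a workflow. The sketch below measures the JSON-encoded size against 256 KB, interpreted here as 256 × 1,024 bytes, which is an assumption; the function name is hypothetical.

```python
import json

# 256 KB interpreted as 256 * 1024 bytes; an assumption of this sketch.
MAX_PAYLOAD_BYTES = 256 * 1024


def fits_payload_limit(payload) -> bool:
    """Return True if the JSON-encoded payload is within the size limit."""
    encoded = json.dumps(payload).encode("utf-8")
    return len(encoded) <= MAX_PAYLOAD_BYTES


if __name__ == "__main__":
    small = {"order_id": 42}
    large = {"blob": "x" * (300 * 1024)}
    print(fits_payload_limit(small), fits_payload_limit(large))
```

When a payload is too large, a common workaround is to store the data elsewhere, such as in Amazon S3, and pass only a reference to it between states.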
When to Use AWS Step Functions
AWS Step Functions works well for use cases where the priority is rapid iteration on state transitions, and where AWS Lambda functions perform most or all of the actions in the app. It’s also excellent for delayed or long-running workflows. A workflow can run for up to a year, including time spent in wait states.
In addition, the standard workflow is excellent for business-critical workflows and provides better error-handling logic than Lambda functions.
Common use cases include data processing and ETL orchestration, e-commerce, machine learning operations, media processing, microservices orchestration, DevOps, security and IT automation, web apps, transcoding media files, and sequencing batch processing jobs.
ClearScale has used both AWS Step Functions and AWS Lambda in developing client solutions. Sometimes we use one or the other. Sometimes we use them together. The AWS Step Functions vs. Lambda decision is usually predicated on the specific project and business requirements.
Extensive expertise and experience help our team employ the best-suited service for the job. As an AWS Premier Tier Services Partner, we have well-established success in developing solutions using AWS services. You can read relevant case studies here.
For more information on how ClearScale can apply our expertise to your project, contact us today.