AWS Lambda Performance and Cold Starts
This article was written by Clay Smith, a Developer Advocate at New Relic. The original article is located here.
Among the most discussed components of serverless compute architecture are Function-as-a-Service (FaaS) products such as Amazon Web Services (AWS) Lambda. AWS Lambda and competitors like Google Cloud Functions or Microsoft Azure Functions are designed to let developers write scalable code without having to think about the details of the container, operating system, or infrastructure that actually runs the program.
While this offers a less complex (and potentially much less expensive) way to build systems, it presents a new challenge to operators and developers: How do you build fast and resilient functions when many traditional system and application metrics are either unavailable or no longer relevant?
To help answer that question and make more informed performance decisions, let’s look at metrics from AWS Lambda functions that respond to external API requests. We’ll analyze function invocation time and HTTP request timing data for Lambda functions behind an API Gateway to understand the latencies of different components.
Observing Cold Start Time in the Real World
Cold start time refers to the increased invocation time that can occur when a Lambda function is invoked after sitting idle for some time. Observing cold start time is relatively straightforward. In this example, which uses data from New Relic Infrastructure’s AWS Lambda integration, we can see how long each of the four example functions created for this post took to execute:
In the chart above, the function in green executes faster than the others. It finishes in less than 100 milliseconds in most cases. (This threshold is important since, as of early 2017, AWS bills Lambda functions in 100-millisecond intervals.)
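To make the 100-millisecond threshold concrete, here is a minimal sketch of the rounding involved. The function name `billed_duration_ms` is hypothetical, and the increment is a parameter because it reflects the early-2017 billing rule described above, which may not match current pricing.

```python
import math

# Billing increment described in the article (as of early 2017).
BILLING_INCREMENT_MS = 100


def billed_duration_ms(duration_ms, increment_ms=BILLING_INCREMENT_MS):
    """Round an execution time up to the next billing increment."""
    return math.ceil(duration_ms / increment_ms) * increment_ms
```

Under this rule, a function that finishes in 95 milliseconds is billed for 100, while one that takes 101 milliseconds is billed for 200. This is why staying just under the threshold matters more than shaving time elsewhere.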
Execution times vary, yet all four functions are identical and run in the same region. The only difference is how often they are triggered from an external API request. Using New Relic Synthetics monitors, the green function (memInfo0) is invoked once a minute, while the other functions are invoked every 10, 30, and 60 minutes, respectively. We can also count the number of invocations (to stay safely within the AWS Lambda free tier):
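As a rough cross-check on those invocation counts, the schedule can be turned into a monthly total. This is a back-of-the-envelope sketch: the 30-day month and the free-tier request allowance are assumptions, and the allowance should be checked against current AWS pricing.

```python
MINUTES_PER_DAY = 24 * 60

# Assumption: monthly free-tier request allowance; verify against AWS pricing.
ASSUMED_FREE_TIER_REQUESTS = 1_000_000


def invocations_per_month(interval_minutes, days=30):
    """Number of scheduled invocations in a month of `days` days."""
    return (days * MINUTES_PER_DAY) // interval_minutes


# Intervals from the article: every 1, 10, 30, and 60 minutes.
intervals_minutes = [1, 10, 30, 60]
total = sum(invocations_per_month(i) for i in intervals_minutes)
within_free_tier = total <= ASSUMED_FREE_TIER_REQUESTS
```

For a 30-day month, the once-a-minute function accounts for 43,200 invocations and the four schedules together come to 49,680.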
In theory, because the green function is kept warm by Lambda’s internal scheduler so that it can more quickly respond to frequent requests, it should execute faster in response to its event trigger. While it’s nice to have a Lambda function execute slightly faster, does it actually matter in the real world when responding to an external API request?
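One common way to tell warm and cold invocations apart from inside the function itself is a module-level flag, since module-level code runs once when a new execution environment loads the function. The sketch below is not the code behind the example functions in this post; the handler and field names are hypothetical.

```python
import time

# Module-level code runs once per execution environment, so this flag is
# True only for the first invocation handled by a freshly started one.
_is_cold_start = True
_loaded_at = time.time()


def handler(event, context):
    """Hypothetical handler that reports whether this invocation was cold."""
    global _is_cold_start
    cold = _is_cold_start
    _is_cold_start = False
    return {
        "cold_start": cold,
        "environment_age_seconds": round(time.time() - _loaded_at, 3),
    }
```

Logging the `cold_start` value alongside the invocation time makes it possible to compare the two populations directly instead of inferring them from invocation frequency.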
End-to-End Request Visibility Using Synthetic Checks
To better understand how Lambda functions can respond to real-world requests (including network latency, TLS negotiation, and connection time), we’ve set up and collected data from Virginia-based synthetic check monitors invoking identical Lambda functions behind an API Gateway running in a West Coast AWS region.
This data provides a more detailed view of requests powered by Lambda functions than function invocation time alone. Surprisingly, the performance benefit we received from warming the green function had no significant impact on how quickly the external request completed, and the warmed Lambda function sometimes responded more slowly than the cold Lambda functions did.
The data also reveals that the time needed to establish a secure TLS connection to the API Gateway from across North America (more than 200 milliseconds in some cases) is more significant than the Lambda execution time itself.
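The data in this post came from synthetic check monitors, but a similar breakdown can be approximated by hand. The following is a minimal sketch using only the Python standard library. The function names are hypothetical, and the TCP connect figure also includes DNS resolution, so treat the numbers as rough.

```python
import socket
import ssl
import time


def time_connection_phases(host, port=443, timeout=10.0):
    """Return TCP connect and TLS handshake times in milliseconds."""
    context = ssl.create_default_context()
    start = time.perf_counter()
    # create_connection resolves the hostname and opens the TCP connection.
    with socket.create_connection((host, port), timeout=timeout) as raw_sock:
        connected = time.perf_counter()
        # wrap_socket performs the TLS handshake before returning.
        with context.wrap_socket(raw_sock, server_hostname=host):
            handshake_done = time.perf_counter()
    return {
        "tcp_connect_ms": (connected - start) * 1000,
        "tls_handshake_ms": (handshake_done - connected) * 1000,
    }


def slowest_phase(phases):
    """Return the name of the phase with the largest duration."""
    return max(phases, key=phases.get)
```

Calling `time_connection_phases` with the API Gateway hostname from the client's location, and adding the function's own invocation time to the resulting dictionary, gives `slowest_phase` enough information to point at the component worth optimizing first.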
Because these functions rarely operate in isolation, understanding the latencies of different components — including clients, network latency, gateways, and other service dependencies — is important to understanding what (and what not) to optimize.
Lambda Optimization: Good Questions to Ask
In our example, the benefit of keeping a simple Lambda function warm was overwhelmed by other considerations. However, this is not necessarily true for all Lambda functions. Here’s a checklist of questions to ask when analyzing performance:
- Is the function performing a CPU-intensive task or making many network requests? For CPU-heavy functions, how does boosting the memory setting affect invocation time (and does it save money)?
- How does overall Lambda function size affect invocation time? Can any heavyweight library dependencies be removed, and does that cut invocation time?
- What’s the performance impact of using different serverless frameworks? Does it make sense to change, update, or remove a framework for certain functions?
- Does function warming have any meaningful impact when compared to an identical cold function in the same region?
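The memory question in the first item can be framed as simple arithmetic. Lambda compute charges scale with configured memory multiplied by billed duration, so a higher memory setting only saves money if it shortens execution enough. The sketch below uses hypothetical function names, takes the price per GB-second as a parameter rather than assuming a value, and ignores per-request charges.

```python
import math


def compute_cost(memory_mb, duration_ms, price_per_gb_second, increment_ms=100):
    """Estimate the compute charge for one invocation.

    Duration is rounded up to the billing increment (100 ms per the article,
    as of early 2017) and multiplied by configured memory in gigabytes.
    """
    billed_seconds = math.ceil(duration_ms / increment_ms) * increment_ms / 1000
    return (memory_mb / 1024) * billed_seconds * price_per_gb_second


def is_cheaper(candidate, baseline, price_per_gb_second):
    """Compare two (memory_mb, duration_ms) settings measured for one function."""
    return (compute_cost(*candidate, price_per_gb_second)
            < compute_cost(*baseline, price_per_gb_second))
```

For example, doubling memory from 128 MB to 256 MB pays off if a measured 250-millisecond invocation drops to 100 milliseconds, but not if it only drops to 150, because the billed duration then falls from 300 to 200 milliseconds while the memory cost doubles.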
Don’t Think Servers; Still Think Performance
As many others have said: Serverless does not mean no servers. Instead, as noted in the AWS console, serverless is more a question of not having to think about servers. While a serverless approach can simplify some computing tasks, the need to collect, understand, and analyze performance data remains as important as ever.
Function-as-a-Service products are still new, and as developers acquire better tools and get more familiar with designing, operating, and troubleshooting FaaS systems, we should get better at building more impressive serverless architectures and apps. Until then, looking at the data and asking hard questions is a good place to start.