How to measure and improve your serverless application's health?

Taavi Rehemägi - Nov 15 '21 - Dev Community

This article will cover how the health of your serverless application can be measured and improved.

Technology and the way it is implemented evolve rapidly. Cost efficiency and productivity are the key drivers of that evolution these days. With the advent of the cloud, infrastructure costs have been brought down significantly. Serverless technology adds icing to the cake! Serverless, or in other words, pay-as-you-go computing, means users do not pay for infrastructure while apps are sitting idle.

AWS Lambda and serverless computing are often treated as synonyms, but that's not exactly true. AWS Lambda is a compute service on AWS, while serverless stands for any and every service you can use to serve your app without managing your servers. AWS offers many such services, like Kinesis, S3, API Gateway, and Lambda. The same applies to other cloud providers such as Azure and Google Cloud!

If you choose to use AWS Lambda to create functions or any serverless architecture-based service, you will have to deal with some trade-offs.

To name one, you lose some flexibility, mainly because you cannot connect to the instance as you would with EC2 or EKS. But the main problem is the difficulty of monitoring issues, diagnosing where they are happening, and debugging them. Considering these limitations, this article will cover how the health of your AWS Lambda functions can be measured and improved.

Before moving on, let's mention some upsides too. The main one is that you do not have to manage any servers. You just deploy the code, and the cloud provider does the rest. You don't have to scale anything because it will auto-scale to keep up with spikes in usage, meaning you can sleep at night and not worry about downtime. I like to sleep. Very much.

How to measure the health of your AWS Lambda functions

Here are a few metrics to consider when measuring how well your Lambda functions are performing. It is important to note that their relevance varies from function to function depending on the function's type. For example, user-facing functions need to respond quickly to synchronous requests, while a calculation-intensive, non-user-facing function will have different priorities.

  • How many times are functions invoked in a given period?
  • What is the average duration of these function calls, and how does it break down between the Lambda service and the function code?
  • What is the memory usage of these functions?
  • Which functions are invoked most, and which cause the most errors?
  • How many functions were throttled?
  • How frequently does a function have a cold start?
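Several of these numbers can be derived from the REPORT line that Lambda writes to CloudWatch Logs at the end of each invocation. The following is a minimal sketch of that idea, not Dashbird's implementation: the names `REPORT_RE` and `summarize` are hypothetical, and the sketch assumes the log lines have already been fetched as plain strings. Errors and throttles do not appear in REPORT lines, so they would have to come from elsewhere, for example the `Errors` and `Throttles` CloudWatch metrics.

```python
import re
from statistics import mean

# Matches the REPORT line Lambda emits after every invocation.
# "Init Duration" is only present when the invocation was a cold start.
REPORT_RE = re.compile(
    r"REPORT RequestId: (?P<request_id>\S+)\s+"
    r"Duration: (?P<duration_ms>[\d.]+) ms\s+"
    r"Billed Duration: (?P<billed_ms>[\d.]+) ms\s+"
    r"Memory Size: (?P<memory_size_mb>\d+) MB\s+"
    r"Max Memory Used: (?P<max_memory_mb>\d+) MB"
    r"(?:\s+Init Duration: (?P<init_ms>[\d.]+) ms)?"
)


def summarize(log_lines):
    """Aggregate basic health metrics from an iterable of log lines.

    Returns None when no REPORT line is found.
    """
    reports = []
    for line in log_lines:
        match = REPORT_RE.search(line)
        if match:
            reports.append(match.groupdict())
    if not reports:
        return None

    cold_starts = sum(1 for r in reports if r["init_ms"] is not None)
    return {
        "invocations": len(reports),
        "avg_duration_ms": mean(float(r["duration_ms"]) for r in reports),
        "peak_memory_used_mb": max(int(r["max_memory_mb"]) for r in reports),
        "cold_starts": cold_starts,
        "cold_start_rate": cold_starts / len(reports),
    }
```

Calling `summarize` on the log lines of one function for a given period gives a rough per-function picture of invocations, duration, memory usage, and cold-start frequency.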

Dashbird helps aggregate and analyze the metrics above to diagnose issues that surface in a serverless environment. Dashbird parses CloudWatch logs and presents an overview that enables users to analyze Lambda functions effectively, including their health, errors, and invocations, among many other vital factors.

Here is an image of the main dashboard on Dashbird, a convenient tool that lets you analyze the health of your Lambda functions. As you can see, metrics such as the number of invocations, duration, and memory usage are displayed up front.

1. Fixing / Improving bad lambdas

Dashbird lets you analyze individual Lambda functions. You can narrow down your analysis by clicking on a specific Lambda function above. Say you click on the function named "timeout" in the view above; it will show a detailed picture of the timed-out function as shown below.

[Image: debug AWS Lambda]

Now, if you wish to further analyze the logs or other details of a specific invocation, you can do so by clicking on that invocation in the list, which shows this.

[Image: analyze AWS Lambda]

2. Allocating more memory to resource-hungry functions

Dashbird also provides memory usage for each Lambda function. The memory usage in the screenshot below helps you identify whether you need to allocate more memory to improve a function's health.

[Image: troubleshoot AWS Lambda]
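As a rough illustration of how memory usage figures can drive that decision, the sketch below compares a function's peak memory usage with its configured memory. The function name `memory_recommendation` and the 90% and 30% thresholds are assumptions chosen for illustration, not values recommended by AWS or Dashbird.

```python
def memory_recommendation(memory_size_mb, max_memory_used_mb, high=0.9, low=0.3):
    """Classify a memory setting from the ratio of peak usage to configured size.

    The `high` and `low` thresholds are illustrative assumptions.
    """
    if memory_size_mb <= 0:
        raise ValueError("memory_size_mb must be positive")

    usage_ratio = max_memory_used_mb / memory_size_mb
    if usage_ratio >= high:
        # Peak usage is close to the limit: the function risks running out of memory.
        return "increase"
    if usage_ratio <= low:
        # Most of the configured memory is never used.
        return "consider decreasing"
    return "ok"
```

For example, `memory_recommendation(128, 120)` flags a function that peaks at 120 MB of a 128 MB allocation. Because Lambda allocates CPU in proportion to configured memory, it is worth checking duration as well before lowering a setting.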

3. Unused libraries/frameworks

The size of the deployed archive can impact the performance of your Lambda function. Removing any unused dependencies, like libraries or frameworks, can shorten cold-start and invocation time. If none of the libraries can be removed, you can look for lightweight, efficient alternatives to replace the ones currently in use.
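One way to spot removal candidates in a Python function is to compare the declared dependencies with the modules the code actually imports. This is a minimal sketch under stated assumptions: `imported_modules` and `unused_dependencies` are hypothetical helper names, and the comparison assumes each package's import name matches its declared name, which is not always the case (Pillow, for instance, is imported as `PIL`). Treat the result as a list of candidates to review, not a definitive answer.

```python
import ast


def imported_modules(source):
    """Return the top-level module names imported by a piece of Python source."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # level == 0 skips relative imports, which refer to local code.
            modules.add(node.module.split(".")[0])
    return modules


def unused_dependencies(declared, sources):
    """List declared dependencies that none of the given sources import."""
    used = set()
    for source in sources:
        used |= imported_modules(source)
    return sorted(set(declared) - used)
```

Passing the contents of every source file in the deployment package as `sources` and the names from your dependency file as `declared` yields the packages that no file imports directly.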

4. Avoid cold starts, keep Lambdas warm

AWS Lambda is billed per request and for execution duration. This also includes the first, slower invocation of a new execution environment, called a cold start. If you have already applied the previous three points in this list, and some Lambda functions still have slow cold starts, you can configure provisioned concurrency for them.

Provisioned concurrency lets you keep a specified number of execution environments initialized and warm, so that new requests served by them avoid the cold-start delay.

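This is the AWS CLI command to set provisioned concurrency to 100 for a function. The qualifier must be a published version or an alias; provisioned concurrency cannot be configured on $LATEST.

$ aws lambda put-provisioned-concurrency-config \
    --function-name <function-name> \
    --qualifier <version-or-alias> \
    --provisioned-concurrent-executions 100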

5. X-Ray SDK

With AWS X-Ray, developers can analyze, debug, and troubleshoot performance issues. You can use AWS X-Ray SDKs to create your own trace segments, annotate your traces, and view trace segments for downstream calls made from your Lambda functions.

Luckily, Dashbird has an X-Ray integration, making it incredibly easy to trace the logs!

[Image: X-Ray integration]

Takeaways

Serverless comes with the limitation of less control over your application infrastructure; however, with analysis driven by the right metrics and with the tools provided by Dashbird, you can overcome this limitation.

I hope you liked reading this short overview of monitoring the health of your Lambda functions. Feel free to let us know in the comments below if you have any questions or remarks!

We aim to improve Dashbird every day, and user feedback is essential for that, so please let us know if you have any feedback about these improvements and new features! We would appreciate it!

Further reading:

6 quick ways to cut costs on your Lambdas

Complete AWS Lambda handbook for beginners (part 3)

Exploring AWS Lambda deployment limits
