In this blog post, we'll explore how to use OpenTelemetry to instrument and monitor serverless applications running on AWS Lambda. We'll look at the OpenTelemetry SDK, a component that simplifies the configuration and deployment of OpenTelemetry agents for Lambda functions.
What is OpenTelemetry?
OpenTelemetry is an open-source project that provides a unified framework for collecting, processing, and exporting observability data from distributed systems. Observability data includes metrics, traces, and logs that help developers and operators understand the behavior and performance of their applications.
OpenTelemetry supports multiple languages, platforms, and vendors, and offers a vendor-neutral API and SDK for instrumenting applications. It also provides a set of exporters that allow users to send their data to various backends, such as New Relic, Jaeger, Zipkin, AWS X-Ray, and many more.
Why use OpenTelemetry for serverless?
Serverless applications are composed of ephemeral, event-driven functions that run on demand in response to triggers. This makes them scalable, cost-effective, and easy to deploy but also introduces new challenges for observability.
For example, serverless functions have short lifespans and may not have access to persistent storage or network resources. This means traditional methods of collecting and exporting observability data, such as using agents or sidecars, may not work well for serverless functions. Additionally, serverless functions may have complex invocation patterns and dependencies, which make it difficult to trace the end-to-end execution flow and identify bottlenecks or errors.
OpenTelemetry addresses these challenges by providing a lightweight and flexible solution for instrumenting serverless functions. With OpenTelemetry you can:
Add minimal code changes to the functions to enable automatic instrumentation of common libraries and frameworks.
Configure the sampling rate and span attributes to control the amount and quality of trace data collected.
Choose from a variety of exporters to send your data to the backend of your choice.
Leverage the OpenTelemetry Lambda Collector to simplify the deployment and configuration of OpenTelemetry agents for Lambda functions (not covered in this blog).
We'll focus on the automatic instrumentation of the Lambda function using the OpenTelemetry libraries and SDKs for Node.js.
Prerequisites
What do you need to follow this guide successfully?
New Relic ingest license key. (Don’t have an account? Sign up for a free New Relic account).
Basic knowledge of OpenTelemetry
Libraries overview
These are the key libraries and APIs for instrumenting an AWS Lambda function on the Node.js runtime.
@opentelemetry/instrumentation-aws-lambda: This module provides automatic instrumentation for AWS Lambda. It's included in the @opentelemetry/auto-instrumentations-node bundle as a plug-in and is necessary to instrument our AWS Lambda functions.
@opentelemetry/sdk-trace-node: This module provides automated instrumentation and tracing for Node.js applications. It exposes an API called NodeTracerProvider, which can be used with the registerInstrumentations API from @opentelemetry/instrumentation to load individual instrumentation plugins.
OTLPTraceExporter: This is a common API provided in the @opentelemetry/exporter-trace-otlp-http and @opentelemetry/exporter-trace-otlp-proto packages for exporting the collected telemetry to a collector or a backend of your choice. We'll use New Relic's native OpenTelemetry Protocol (OTLP) endpoint to export our telemetry data.
Instrumenting our serverless functions with OpenTelemetry
In this example, we're using the Serverless Framework to quickly set up the Lambda function along with an API Gateway as the entry point. The Lambda function is a simple Koa REST API with a few functional endpoints.
Step 1: Install packages
First, install the packages required to instrument a Lambda function on the Node.js runtime. Copy the command below and run it in the root of your Node.js app:
npm install --save \
  @opentelemetry/auto-instrumentations-node \
  @opentelemetry/sdk-trace-node \
  @opentelemetry/exporter-trace-otlp-proto
Step 2: Configure SDK
Next, create a new file named otel-wrapper.js in the root or src folder of your function in the Serverless Framework project. Now, copy and paste the following code into the newly created file:
const { NodeTracerProvider } = require("@opentelemetry/sdk-trace-node");
const { registerInstrumentations } = require("@opentelemetry/instrumentation");
const {
  getNodeAutoInstrumentations,
} = require("@opentelemetry/auto-instrumentations-node");
const { BatchSpanProcessor } = require("@opentelemetry/sdk-trace-base");
const { Resource } = require("@opentelemetry/resources");
const {
  SemanticResourceAttributes,
} = require("@opentelemetry/semantic-conventions");
const {
  OTLPTraceExporter,
} = require("@opentelemetry/exporter-trace-otlp-proto");
const {
  CompositePropagator,
  W3CBaggagePropagator,
  W3CTraceContextPropagator,
} = require("@opentelemetry/core");
// For troubleshooting, set the log level to DiagLogLevel.DEBUG
const { diag, DiagConsoleLogger, DiagLogLevel } = require("@opentelemetry/api");
diag.setLogger(new DiagConsoleLogger(), DiagLogLevel.WARN);
// Full OTLP/HTTP traces URL, built from the base endpoint in the environment
const COLLECTOR_STRING = `${process.env.OTEL_EXPORTER_OTLP_ENDPOINT}/v1/traces`;
/**
 * The `newRelicExporter` is an instance of OTLPTraceExporter
 * configured to send traces to New Relic's OTLP-compatible backend.
 * Make sure you have added your New Relic ingest license key to the NR_LICENSE env var.
 */
const newRelicExporter = new OTLPTraceExporter({
  url: COLLECTOR_STRING,
  headers: {
    "api-key": `${process.env.NR_LICENSE}`,
  },
});
const provider = new NodeTracerProvider({
  resource: new Resource({
    [SemanticResourceAttributes.SERVICE_NAME]: process.env.OTEL_SERVICE_NAME,
  }),
});
provider.addSpanProcessor(
  new BatchSpanProcessor(
    newRelicExporter,
    // Optional BatchSpanProcessor configuration
    {
      // The maximum queue size. After the size is reached, spans are dropped.
      maxQueueSize: 1000,
      // The maximum batch size of every export. It must be smaller than or equal to maxQueueSize.
      maxExportBatchSize: 500,
      // The interval between two consecutive exports
      scheduledDelayMillis: 500,
      // How long the export can run before it is cancelled
      exportTimeoutMillis: 30000,
    }
  )
);
provider.register({
  propagator: new CompositePropagator({
    propagators: [new W3CBaggagePropagator(), new W3CTraceContextPropagator()],
  }),
});
registerInstrumentations({
  instrumentations: [
    /**
     * Auto-instrumentations meta package for Node.js;
     * it bundles instrumentations for most common libraries and frameworks
     */
    getNodeAutoInstrumentations({
      "@opentelemetry/instrumentation-fs": {
        // Environment variables are strings, so compare explicitly:
        // the string "false" would otherwise be truthy.
        enabled: process.env.ENABLE_FS_INSTRUMENTATION === "true",
        requireParentSpan: true,
      },
      // Enable instrumentation for the Koa framework
      "@opentelemetry/instrumentation-koa": {
        enabled: true,
      },
      /**
       * Enable instrumentation for AWS Lambda.
       * Use the requestHook and responseHook to
       * add additional attributes to the traces
       */
      "@opentelemetry/instrumentation-aws-lambda": {
        enabled: true,
        disableAwsContextPropagation: true,
        requestHook: (span, { event, context }) => {
          span.setAttribute("faas.name", context.functionName);
          if (event.requestContext && event.requestContext.http) {
            span.setAttribute(
              "faas.http.method",
              event.requestContext.http.method
            );
            span.setAttribute(
              "faas.http.target",
              event.requestContext.http.path
            );
          }
          if (event.queryStringParameters) {
            span.setAttribute(
              "faas.http.queryParams",
              JSON.stringify(event.queryStringParameters)
            );
          }
        },
        responseHook: (span, { err, res }) => {
          if (err instanceof Error) {
            span.setAttribute("faas.error", err.message);
          }
          if (res) {
            span.setAttribute("faas.http.status_code", res.statusCode);
          }
        },
      },
    }),
  ],
});
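One detail worth guarding against: the template literal that builds COLLECTOR_STRING produces the string "undefined/v1/traces" when OTEL_EXPORTER_OTLP_ENDPOINT is not set, and the exporter then fails quietly at export time. A minimal sketch of a stricter builder follows; buildTracesUrl is a hypothetical helper introduced here for illustration, not part of any OpenTelemetry package.

```javascript
// Hypothetical helper (not an OpenTelemetry API): builds the OTLP/HTTP
// traces URL from a base endpoint and fails fast when it is missing.
function buildTracesUrl(endpoint) {
  if (typeof endpoint !== "string" || endpoint.trim() === "") {
    throw new Error("OTEL_EXPORTER_OTLP_ENDPOINT is not set");
  }
  // Drop trailing slashes so the result never contains "//v1/traces".
  const base = endpoint.trim().replace(/\/+$/, "");
  return `${base}/v1/traces`;
}

// Example with the endpoint used in this post (trailing slash added on purpose).
const tracesUrl = buildTracesUrl("https://otlp.nr-data.net:4317/");
console.log(tracesUrl);
```

In otel-wrapper.js this could replace the template literal, for example `url: buildTracesUrl(process.env.OTEL_EXPORTER_OTLP_ENDPOINT)`, so a missing variable surfaces as a clear error during cold start.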
A few things are worth highlighting in otel-wrapper.js. In the configuration for instrumentation-aws-lambda, the property disableAwsContextPropagation is set to true. This is required to skip AWS X-Ray parent extraction, which may conflict with the OpenTelemetry traceparent headers.
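For context, the traceparent header comes from the W3C Trace Context format: four dash-separated, lowercase hex fields (version, trace ID, parent span ID, flags). The sketch below shows how such a header breaks down. parseTraceparent is a hypothetical helper written for illustration; in the real setup the W3CTraceContextPropagator does this work, and this simplified version only accepts the four-field layout.

```javascript
// Sketch of W3C Trace Context `traceparent` parsing.
// `parseTraceparent` is a hypothetical helper, not an OpenTelemetry API.
const TRACEPARENT_RE =
  /^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$/;

function parseTraceparent(header) {
  const match = TRACEPARENT_RE.exec(String(header).trim());
  if (!match) return null;
  const [, version, traceId, parentId, flags] = match;
  // Version "ff" and all-zero trace or parent IDs are invalid.
  if (version === "ff" || /^0+$/.test(traceId) || /^0+$/.test(parentId)) {
    return null;
  }
  // The lowest bit of the flags field is the "sampled" flag.
  return { version, traceId, parentId, sampled: (parseInt(flags, 16) & 1) === 1 };
}

// Example header value taken from the W3C Trace Context specification.
const parsed = parseTraceparent(
  "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
);
console.log(parsed);
```

When a caller sends this header, the Lambda span should join that trace. Extracting a parent from X-Ray as well would give the span two competing parents, which is the conflict the setting avoids.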
Most Node.js OpenTelemetry instrumentation libraries offer two hooks, requestHook and responseHook, for adding custom attributes to spans. These hooks can also be used to add conditional attributes to spans. Take note of the sample responseHook below, which adds an extra span attribute called faas.error containing the message of the error thrown by the code.
responseHook: (span, { err, res }) => {
  if (err instanceof Error)
    span.setAttribute("faas.error", err.message);
  if (res) {
    span.setAttribute("faas.http.status_code", res.statusCode);
  }
},
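The hook can be tried out locally before deploying by calling it with a stand-in span. The sketch below assumes only that a span exposes setAttribute(key, value); createFakeSpan is a hypothetical test helper, not an OpenTelemetry API, and the error message and status code are made-up inputs.

```javascript
// Stand-in for a span that records attributes in a plain object.
// `createFakeSpan` is a hypothetical test helper, not an OpenTelemetry API.
function createFakeSpan() {
  const attributes = {};
  return {
    attributes,
    setAttribute(key, value) {
      attributes[key] = value;
      return this;
    },
  };
}

// Same logic as the responseHook shown above.
const responseHook = (span, { err, res }) => {
  if (err instanceof Error) span.setAttribute("faas.error", err.message);
  if (res) span.setAttribute("faas.http.status_code", res.statusCode);
};

// Successful invocation: only the status code is recorded.
const okSpan = createFakeSpan();
responseHook(okSpan, { err: undefined, res: { statusCode: 200 } });

// Failed invocation: only the error message is recorded.
const failedSpan = createFakeSpan();
responseHook(failedSpan, { err: new Error("upstream timeout"), res: undefined });

console.log(okSpan.attributes);
console.log(failedSpan.attributes);
```

Because both branches are conditional, a given span carries faas.error, faas.http.status_code, both, or neither, depending on how the invocation ended.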
Step 3: Configure environment
The code in otel-wrapper.js refers to a few environment variables, which we need to define in our function configuration. Since we're using the Serverless Framework in this example, we add them to our serverless.yml configuration file.
Most importantly, to start our function with the OpenTelemetry instrumentation, we need to launch it a little differently. Notice the NODE_OPTIONS environment variable in the list of defined variables. It tells Node.js to preload otel-wrapper.js with --require, so the instrumentation is initialized before the function handler is loaded.
functions:
  api:
    handler: index.handler
    environment:
      NODE_OPTIONS: "--require ./otel-wrapper"
      OTEL_SERVICE_NAME: "Node-Lambda-Otel-SDK"
      # External API for distributed traces
      EXPRESS_OTEL_API_ENDPOINT: "http://3.230.230.121/v3/api"
      # License key stored in an SSM parameter
      NR_LICENSE: ${ssm:/nr_experiments_ingest_key}
      OTEL_EXPORTER_OTLP_ENDPOINT: https://otlp.nr-data.net:4317
      # Disabling FS instrumentation to reduce noise
      ENABLE_FS_INSTRUMENTATION: "false"
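Keep in mind that every environment variable reaches the Node.js process as a string. A value of false in the configuration arrives as the string "false", which is truthy in JavaScript, so a flag such as ENABLE_FS_INSTRUMENTATION needs an explicit conversion wherever it is read. A minimal sketch, where envFlag and its accepted spellings ("true", "1", "yes") are assumptions made for illustration:

```javascript
// Hypothetical helper: interprets an environment variable as a boolean.
// The accepted spellings below are an assumption, not a standard.
function envFlag(value, defaultValue = false) {
  if (value === undefined || value === "") return defaultValue;
  return ["true", "1", "yes"].includes(String(value).trim().toLowerCase());
}

// Any non-empty string is truthy, including "false".
const naive = Boolean("false");
const parsedFlag = envFlag("false");

console.log(naive, parsedFlag);
```

With a helper like this, the instrumentation option could read `enabled: envFlag(process.env.ENABLE_FS_INSTRUMENTATION)`.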
Step 4: Explore data
Once your function is deployed, it's time to explore the telemetry data from the Lambda in the New Relic platform.
Lambda requests metrics from Spans
Attributes from Distributed Traces
Look at the highlighted attributes in the screenshot above: faas.http.method, faas.http.target, and faas.name. These attributes were added by the requestHook method. Not all of the hook's attributes are present in this trace because some are conditional and depend on the incoming request.
Our Lambda has two endpoints exposed through API Gateway, /path and /weather. Let's take a look at the traces from the second endpoint, called as /weather?location=bangalore.
The /weather endpoint in the Lambda function calls an external entity that's also instrumented with OpenTelemetry. As a result, the distributed traces capture both entities, and the New Relic platform automatically generates a service map that shows the complete journey of the request.
Lambda distributed traces and service map
In this specific trace, the additional attribute faas.http.queryParams is now visible in the list of attributes. Refer to the screenshot below.
Conditional attributes from requestHook
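The conditional behavior can be reproduced locally with a pure version of the requestHook logic. In the sketch below, requestAttributes is a hypothetical helper that returns the attributes instead of writing them to a span. The two events are trimmed, made-up examples in the API Gateway HTTP API (payload version 2.0) shape, and the function name "api" is an assumption.

```javascript
// Pure version of the requestHook logic: returns the attributes instead of
// writing them to a span. `requestAttributes` is a hypothetical helper.
function requestAttributes(event, context) {
  const attrs = { "faas.name": context.functionName };
  const http = event.requestContext && event.requestContext.http;
  if (http) {
    attrs["faas.http.method"] = http.method;
    attrs["faas.http.target"] = http.path;
  }
  // Only added when the request carries query string parameters.
  if (event.queryStringParameters) {
    attrs["faas.http.queryParams"] = JSON.stringify(event.queryStringParameters);
  }
  return attrs;
}

// Made-up Lambda context and trimmed events for the two endpoints.
const lambdaContext = { functionName: "api" };
const pathAttrs = requestAttributes(
  { requestContext: { http: { method: "GET", path: "/path" } } },
  lambdaContext
);
const weatherAttrs = requestAttributes(
  {
    requestContext: { http: { method: "GET", path: "/weather" } },
    queryStringParameters: { location: "bangalore" },
  },
  lambdaContext
);

console.log(pathAttrs);
console.log(weatherAttrs);
```

Only the /weather request produces faas.http.queryParams, which matches the difference between the two traces.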
When exploring the spans in the New Relic UI, you can also check the attributes of the called entity from the distributed traces screen.
Attributes from the called OTEL entity
Conclusion
We've seen how OpenTelemetry can help us instrument and monitor serverless applications running on AWS Lambda. By following the steps in this post, you can trace each invocation of your Lambda functions and see their performance, behavior, and dependencies. Exporting those traces to New Relic also lets you take advantage of its features, including service map visualizations, transactions, alerting, and much more.
Next steps?
Check out the source code for the application used in this blog on this GitHub repository: github.com/zmrfzn/lambda-opentelemetry
If you're interested in learning more about OpenTelemetry and serverless observability, you can check out these resources: