The DevOps mindset and the “shift-left” mentality impact how you work as a back-end engineer.
With more power comes more responsibility. You’ll pick up new processes and tools and handle more operational tasks. The task is not done when you commit code to GitHub! You need to monitor how an application behaves once your CI/CD pipeline deploys it to production.
These new responsibilities include observability and testing, which back-end engineers traditionally didn’t always need to implement themselves.
Observability tools help you measure the internal state of a distributed system by examining distributed traces. Tracing is a fast-growing and immediately valuable resource for those who work in distributed systems.
Testing tools help you understand, through automation, whether your distributed application or service performs according to its design and business requirements. You can test whether different services work together as expected (integration tests), whether your application returns the correct output from an action (functional tests), whether it holds up when you replicate user behavior (end-to-end tests), and more.
It gets harder when your organization uses a distributed infrastructure, which complicates your choice of tools for observability and testing. The sooner you integrate both observability and testing tools into your workflows, the better. Instrumenting your back-end services early in the development process makes it easier to troubleshoot issues and release high-quality code.
Let’s look at the landscape of available observability and testing tools today, with an emphasis on the open-source ecosystem, in search of those that can help you do both observability and testing.
Tracetest
Tracetest is an open-source testing tool based on distributed tracing that enables you to test your distributed application by asserting on spans within a distributed trace. It uses the trace data generated by your OpenTelemetry-instrumented code to check whether your application behaves as your test definitions specify.
It’s designed to help back-end engineers implement observability-driven development, where back-end engineers instrument their services with distributed tracing during development for high-quality observability.
You can leverage trace-based testing to build, execute, and view tests against your code in one place. Tracetest generates end-to-end tests automatically based on any distributed system instrumented with distributed tracing like OpenTelemetry, and integrates easily with Jaeger, Grafana Tempo, New Relic, Lightstep, OpenSearch, Datadog, and more, with even more planned for the future.
Tracetest is a new addition to the CNCF landscape, and is 100% open source, with code first published on GitHub in February 2022. If you like what you are seeing from Tracetest, give the project a star on GitHub!
Tracetest features for observability
- Get value from trace data you’re already collecting.
- Out-of-the-box integrations with the most popular trace data stores.
- Bake observability into your back-end code by adding OpenTelemetry instrumentation.
- Find the “unknown unknowns” in your infrastructure with visibility into communication between services.
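To make "instrumentation" concrete, here is a toy sketch of what a tracing library records when you wrap units of work in spans. It uses only the Python standard library and is not the OpenTelemetry API; the `span` helper, the `FINISHED_SPANS` list, and the `handle_checkout` handler are all hypothetical names chosen for illustration.

```python
import time
import uuid
from contextlib import contextmanager

# Finished spans land here; a real SDK would export them to a trace store.
FINISHED_SPANS = []
_active = []  # stack of currently open spans (single-threaded sketch)


@contextmanager
def span(name, **attributes):
    """Record one unit of work as a span nested under the current span."""
    parent = _active[-1] if _active else None
    record = {
        # Child spans share the trace ID of their parent.
        "trace_id": parent["trace_id"] if parent else uuid.uuid4().hex,
        "span_id": uuid.uuid4().hex[:16],
        "parent_id": parent["span_id"] if parent else None,
        "name": name,
        "attributes": dict(attributes),
    }
    _active.append(record)
    start = time.perf_counter()
    try:
        yield record
    finally:
        record["duration_ms"] = (time.perf_counter() - start) * 1000
        _active.pop()
        FINISHED_SPANS.append(record)


def handle_checkout(order_id):
    # Hypothetical handler: each step it performs becomes a child span.
    with span("POST /checkout", order_id=order_id):
        with span("db.insert_order", table="orders"):
            pass  # the database call would go here
        with span("payment.charge", provider="example"):
            pass  # the payment call would go here


handle_checkout(42)
for s in FINISHED_SPANS:
    print(s["name"], "parent:", s["parent_id"])
```

The shared trace ID and the parent links are what let a tracing tool reassemble one request's path across services, which is the visibility the list above refers to.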
Tracetest features for testing
- Create tests against your traces to ensure your distributed system handles requests between microservices as expected and demanded.
- Define assertions against both the response and distributed trace, which ensures both your response and underlying process work without error.
- Help QA engineers write valuable end-to-end tests with a visual UI.
- Reuse tests and assertions across multiple microservices with a powerful filtering engine.
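The idea of asserting on both the response and the trace can be sketched in a few lines of plain Python. This is not Tracetest's test definition format; the span layout, the `select` and `check_trace` helpers, the attribute names, and the 100 ms threshold are all assumptions made for illustration.

```python
# Hypothetical trace for one "POST /checkout" request, flattened to dicts.
TRACE = [
    {"name": "POST /checkout", "kind": "http", "duration_ms": 180.0,
     "attributes": {"http.status_code": 201}},
    {"name": "db.insert_order", "kind": "database", "duration_ms": 35.0,
     "attributes": {"db.system": "postgresql"}},
    {"name": "payment.charge", "kind": "http", "duration_ms": 120.0,
     "attributes": {"http.status_code": 200}},
]


def select(spans, **criteria):
    """Return the spans whose top-level fields match every criterion."""
    return [s for s in spans if all(s.get(k) == v for k, v in criteria.items())]


def check_trace(response_status, spans):
    """Check the response and the trace; return a list of failure messages."""
    failures = []
    # Assertion on the response the caller saw.
    if response_status != 201:
        failures.append(f"expected response 201, got {response_status}")
    # Assertions on the work that happened behind the response.
    db_spans = select(spans, kind="database")
    if not db_spans:
        failures.append("no database span found")
    for s in db_spans:
        if s["duration_ms"] >= 100:
            failures.append(f"{s['name']} took {s['duration_ms']} ms")
    for s in select(spans, kind="http"):
        if s["attributes"].get("http.status_code", 0) >= 400:
            failures.append(f"{s['name']} returned an error status")
    return failures


print(check_trace(201, TRACE))  # prints []
```

A response-only test would pass as long as the status is 201. The span checks also catch a slow database write or a failing downstream call that the response hides.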
Malabi
Malabi is an open-source test framework. With Malabi, you can write integration tests on distributed systems by collecting data from a microservice during a test run, then exposing an endpoint to make assertions on that data. The maintainers say Malabi implements trace-based testing, similar to Tracetest. Malabi uses OpenTelemetry to collect your trace data.
When you pick out any product or platform—open source, closed SaaS web app, or anything in-between—it’s important to consider its development velocity. Malabi hasn’t seen a commit to GitHub in a year, which might signal that it won’t get more features or technical support if you run into an issue.
Malabi features for observability
- Malabi isn’t designed with observability in mind, which means it has no features in this area.
Malabi features for testing
- Validate any integration between parts of a distributed system before you push to production.
- Add a simple JavaScript-based assertion library to any microservice you want to test.
Prometheus
Prometheus is the de facto standard for monitoring, one aspect of observability, focusing on gathering metrics and enabling alerts. It uses a robust time-series database for storing high-resolution metrics data and multiple modes for visualizing what you’ve collected from your back-end services.
Prometheus is an enormously popular open-source project, with 46.5k stars on GitHub and full graduated status from the Cloud Native Computing Foundation (CNCF), which also helps manage its governance and roadmap. There is undoubtedly a ton of community support and love for the value Prometheus delivers for back-end engineers who need robust observability tools.
Prometheus features for observability
- Store long-term metrics data, for historical analysis, with an efficient time-series database and scaling functionality through sharding and federation.
- Create powerful alerts with PromQL, a flexible query language that maintains dimensional information.
- Push metrics and/or alerts to other tools in your observability infrastructure with open-source client libraries and integrations.
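As a rough illustration of the "dimensional" data mentioned above, the sketch below counts requests per label combination and renders them in the Prometheus text exposition format that a scrape endpoint serves. It deliberately avoids any client library; the metric name, the label names, and the `observe` and `render` helpers are assumptions for this example.

```python
from collections import Counter

# Hypothetical metric: requests counted per (method, status) label pair.
http_requests = Counter()


def observe(method, status):
    """Increment the counter for one label combination."""
    http_requests[(method, str(status))] += 1


def render(name, help_text, samples):
    """Render a counter in the Prometheus text exposition format."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} counter"]
    for (method, status), value in sorted(samples.items()):
        lines.append(f'{name}{{method="{method}",status="{status}"}} {value}')
    return "\n".join(lines) + "\n"


observe("GET", 200)
observe("GET", 200)
observe("POST", 500)
print(render("http_requests_total", "Total HTTP requests.", http_requests), end="")
```

Because each sample keeps its labels, a PromQL expression such as `sum by (status) (rate(http_requests_total[5m]))` can slice the same counter by status code when you build an alert.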
Prometheus features for testing
- Since Prometheus is only a metrics collection and alerting tool, it doesn’t help back-end developers who want to test their services.
Jaeger
Jaeger is an open-source end-to-end tracing tool designed to help developers monitor and troubleshoot transactions in distributed environments. The goal is to simplify how developers debug a set of distributed services, which is far more complex than dealing with a single monolith.
Jaeger is fully open source! The project started at Uber, which released the source code, and eventually donated the project to CNCF.
Jaeger features for observability
- Monitor transactions between distributed services to understand the health and performance of your infrastructure.
- Perform root cause analysis by drilling down into single transactions that cause user-facing issues.
- Optimize for performance and latency by discovering which services respond slowest to requests.
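The kind of analysis the last bullet describes can be approximated offline once you have span data in hand. The following is a minimal sketch that ranks services by mean span duration; the span fields and the `slowest_services` helper are hypothetical and do not reflect Jaeger's own query API or export format.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical spans exported from a tracing backend.
SPANS = [
    {"service": "gateway", "operation": "GET /cart", "duration_ms": 210.0},
    {"service": "cart", "operation": "load_cart", "duration_ms": 40.0},
    {"service": "pricing", "operation": "quote", "duration_ms": 150.0},
    {"service": "pricing", "operation": "quote", "duration_ms": 170.0},
    {"service": "cart", "operation": "load_cart", "duration_ms": 60.0},
]


def slowest_services(spans, top=3):
    """Rank services by mean span duration, slowest first."""
    by_service = defaultdict(list)
    for s in spans:
        by_service[s["service"]].append(s["duration_ms"])
    ranked = sorted(((mean(d), name) for name, d in by_service.items()), reverse=True)
    return [(name, avg) for avg, name in ranked[:top]]


for name, avg in slowest_services(SPANS):
    print(f"{name}: {avg:.1f} ms")
```

Note that a span's duration includes the time spent waiting on its children, so an entry-point service like the gateway here will usually rank first. Subtracting child durations to get self time would be the natural next refinement.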
Jaeger features for testing
- Jaeger is designed for end-to-end tracing, but it doesn’t have any tools to help you develop tests for your back-end services.
Grafana Tempo
Grafana Tempo is an open-source, high-scale distributed tracing back-end responsible for collecting and storing trace data. The project is open source under the AGPLv3 license. It’s built and maintained by Grafana Labs, the company behind other open-source projects like Loki for logs, Grafana for visualizing and alerting on metrics data, and Mimir for storing metrics data. It was first announced in October 2020 and became generally available in 2021.
Grafana Tempo features for observability
- Ingest trace data from the most popular open-source tracing protocols, including OpenTelemetry, Jaeger, and Zipkin.
- Affordable long-term storage for trace data to unlock historical data trends and analysis.
Grafana Tempo features for testing
- While Grafana Tempo helps you implement tracing in your back-end services, it doesn’t have tools for writing or executing tests.
OpenSearch
OpenSearch is an open-source database to ingest, search, visualize, and analyze data. It’s built on top of Apache Lucene, a FOSS library for indexing and search, which OpenSearch leverages for more advanced analytics capabilities, like anomaly detection, machine learning, full-text search, and more.
OpenSearch was born from a bit of open-source controversy. In early 2021, Elastic announced they would change the licensing model for their popular Elasticsearch and Kibana projects. AWS responded by forking those projects into OpenSearch and OpenSearch Dashboards, respectively, under a more permissive ALv2 license.
OpenSearch features for observability
- Ingest trace data from OpenTelemetry or Jaeger, which can be used to visualize and identify performance problems.
- Leverage community plugins to gather observability data from Prometheus and customize the output with rich visualizations.
- Filter, transform, normalize, and aggregate data to make your analytics and visualizations more relevant and less complex.
OpenSearch features for testing
- While OpenSearch can collect metrics, traces, and logs, all of which can be used to validate tests, it doesn’t have any features to help developers create, deploy, or manage those tests. You’ll need to find a separate tool and connect its outputs to OpenSearch.
SigNoz
The team behind SigNoz describes itself as an open-source alternative to enterprise-level observability platforms like Datadog, New Relic, and more. Unlike some of the more generalist tools on this list, SigNoz focuses on application performance monitoring (APM), which attempts to measure performance from the end-user experience perspective, helping developers fix issues before real users are affected.
Since SigNoz started in January 2021, the project has amassed nearly 12k GitHub stars and offers a paid ($200/mo) cloud-based version of its software that’s managed by their team.
SigNoz features for observability
- Support for OpenTelemetry as the foundation for instrumentation and generating trace data from your application.
- A unified UI for metrics, traces, and logs, which reduces the need to context-switch between other observability tools, like Prometheus and Jaeger, to debug and troubleshoot issues.
- Flamegraphs and individual request traces to help discover the root of a performance problem.
- Build dashboards and alerts based on attributes within your logs.
- Quickly visualize the slowest endpoints in your application.
SigNoz features for testing
- Because SigNoz is an observability-only tool, it doesn’t currently have any specific features that help back-end developers write or run tests against a distributed system.
Postman
Postman is a departure from the tracing- and observability-focused tools we just covered. Instead, Postman is a cloud platform for building and using APIs. Once your back-end team is on Postman, it acts like an API repository, giving you a single place to create, document, mock, and test your APIs across their entire lifecycle.
Postman itself is not open source—it’s a closed cloud platform—but the company has an established open source philosophy and maintains a handful of open-source projects like Newman, for running and testing a Postman Collection on the CLI, or SDKs and code generators in a variety of programming languages. As proof of Postman’s staying power, the company most recently received funding in August 2021, a series D for $225M, valuing the company at $5.6 billion.
Postman features for observability
- As an API development platform, Postman doesn’t offer any observability features.
Postman features for testing
- Store and manage all your organization’s API specifications, documentation, test cases, metrics, and more in one centralized location.
- Debug and test your APIs with a client that supports complex requests using HTTP, REST, SOAP, GraphQL, and Websockets, which can be bundled into Postman Collections for reuse.
- Integrate your API lifecycle with source control, CI/CD pipelines, API gateways, and application performance monitoring (APM) platforms.
Wrapping up
I’ve tried to cover some of the key players in a fast-moving space with tons of variety. Most are free and open-source software available on GitHub; Postman is the exception, though it maintains open-source projects of its own. Some focus exclusively on observability, others on testing, while a select few bridge the gap between those two to help back-end engineers like yourself ship higher-quality deployments through observability-based testing.
One clear takeaway is the enormous value in instrumenting your back-end code with distributed tracing and OpenTelemetry sooner rather than later! Many of these popular observability and testing tools integrate with OpenTelemetry’s collector or SDK, which means you can instrument once and test out multiple tools to find the workflows that work best for back-end development at your organization.
If having both observability and testing functionality in a single tool and using tracing to enable observability-driven development sound like wins to you, check out Tracetest. And once you're generating valuable end-to-end tests faster than ever, let us know your tracing successes on Discord!
If you like our direction and what you are seeing from Tracetest, give us a star on GitHub!