Introduction
In this post I’ll continue my series on real-time data integration in the cloud. This time I’ll explain how I built a real-time telemetry ingestion pipeline on AWS EKS to process UDP data from the F1 2023 game on PlayStation 4. The solution handles the ingestion, processing and visualization of live racing telemetry while remaining scalable, reliable and cost-effective.
I chose this use case because F1 car telemetry is an excellent example of high-frequency data. Combined with the complexity of UDP as a protocol, that makes it an ideal scenario to showcase the solution.
As a Cloud Architect, I can’t go on without remarking that I review and apply the AWS Well-Architected Framework in every project, including personal ones like this. Most of the architectural decisions follow the best practices and design principles of the Well-Architected Framework: for example, Spot Instances for cost efficiency, EKS/Kubernetes for scalability, and Grafana to maintain operational excellence through monitoring.
Design Considerations
As mentioned in the Introduction, the architecture was designed with several key principles of the Well-Architected Framework (WAF) in mind:
Scalability: The system can scale to ingest and process real-time data without performance degradation. AWS EKS and Kubernetes provide elasticity at two levels: the cluster itself can scale in and out, and the pods can scale horizontally by leveraging HPA (Horizontal Pod Autoscaling) in K8s (Kubernetes).
Reliability: A TCP HealthCheck sidecar ensures that the UDP listener service remains available, preventing data loss.
Cost Optimization: Spot instances and ARM-based instances (C6g) were chosen for their lower cost, while maintaining the necessary performance.
Performance efficiency: Low latency and high throughput are required. For that reason, the combination of Spot Instances, a Network Load Balancer and an ARM-based architecture is well suited to high-performance network operations.
Security: A custom private network, with load balancers and fine-grained security groups that allow only the specified access and permissions. In addition, AWS IAM policies and EKS Teams ensure that access is managed securely.
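To illustrate the reliability point, here is a minimal sketch of what a TCP health check does: it simply tries to open a TCP connection to a port and reports whether that worked. UDP itself gives no acknowledgment, which is why a TCP port (served by the sidecar in this project) is probed instead. The function name tcp_health_check and the default timeout are my own assumptions for illustration, not code from the project.

```python
import socket


def tcp_health_check(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper: it mimics the kind of probe a load balancer
    performs against a health check port.
    """
    try:
        # create_connection performs the TCP handshake; success means
        # something is listening and accepting connections on that port.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable or timed out: treat the target as unhealthy.
        return False
```

A probe like this only proves that the health check port answers, so the sidecar has to live and die with the UDP listener pod for the signal to be meaningful.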
Why UDP?
Before going deeper into the architecture details, I’d like to pause briefly and explain the protocol we are going to ingest into AWS: UDP (User Datagram Protocol), which carries the telemetry data generated by the F1 2023 game.
UDP is a connectionless protocol, meaning it sends data without establishing a persistent connection between the sender and receiver, unlike TCP (Transmission Control Protocol), which maintains a connection and ensures reliable delivery through acknowledgments and retransmissions.
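The connectionless behaviour is easy to see in a few lines of Python. This is a minimal sketch over the loopback interface, where the function name send_and_receive and the payload are assumptions made up for the example:

```python
import socket


def send_and_receive(payload: bytes) -> bytes:
    """Send one UDP datagram to a local receiver and return what arrived."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
    receiver.settimeout(2.0)  # UDP gives no delivery guarantee, so never block forever
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        # No connect() and no handshake: the datagram is addressed and sent.
        sender.sendto(payload, receiver.getsockname())
        data, _addr = receiver.recvfrom(2048)
        return data
    finally:
        sender.close()
        receiver.close()
```

Note that sendto() returns as soon as the datagram is handed to the network stack. The sender never learns whether anyone received it.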
For the scope of this article, it’s worth briefly explaining why UDP is used, not only in this case (the F1 2023 game) but also as a common choice for sending telemetry or sensor/IoT data. There are several reasons:
Low latency: In a fast-paced environment like racing, timing is crucial. UDP allows data to be sent without waiting for an ACK from the receiver, unlike TCP, which requires confirmation that each packet has been received successfully. That makes UDP faster by considerably reducing latency.
Real-time data: Telemetry data becomes outdated very quickly. If a packet is lost, it’s often more important to receive the next one as soon as possible than to retransmit old data.
High throughput: UDP can handle a large volume of data without the overhead of managing connections, error recovery, or ordered delivery.
Lightweight: The F1 telemetry can be broadcast to multiple clients (such as applications or external systems). UDP is more efficient for broadcast and multicast communication because it doesn’t require setting up individual connections (as TCP would).
On the other hand, there are a few drawbacks that should be taken into account:
No guaranteed delivery: As UDP does not provide ACK of received packets, there’s no guarantee that the data will reach the destination.
No error correction: Corrupted data packets may be received without any mechanism to request a retransmission.
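Out-of-order delivery: Packets may arrive in a different order than they were sent, and UDP does nothing to reorder them.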
Lack of congestion and flow control: Unlike TCP, UDP doesn’t adjust its transmission rate based on network conditions. The quantity of data sent at once is not controlled either, meaning that there’s a risk of overwhelming the receiver if the data stream is too fast and too frequent.
Note that, for the purpose of this project, some of these drawbacks are not handled, as I consider them out of scope. Still, I think they are worth taking into account and addressing if this architecture is used as a blueprint for a new workload. It’s also important to note that, for the specific case of real-time F1 car telemetry, the drawbacks are mitigated by the fact that the data is constantly being generated and is not persisted; it’s only monitored in real time.
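If a new workload does need to deal with lost or out-of-order packets, one common approach is to track a per-packet counter on the receiving side. The sketch below is not part of this project: it assumes each packet carries a monotonically increasing frame_id, and the class name FrameTracker is hypothetical. It keeps only the freshest data, which matches the "newer is better" nature of telemetry.

```python
from dataclasses import dataclass


@dataclass
class FrameTracker:
    """Tracks an assumed increasing frame id to spot stale or missing packets."""

    last_frame: int = -1
    lost: int = 0   # frames skipped so far (late arrivals are counted here too)
    stale: int = 0  # packets that arrived late or duplicated and were dropped

    def accept(self, frame_id: int) -> bool:
        """Return True if the packet is newer than anything seen so far."""
        if frame_id <= self.last_frame:
            self.stale += 1  # out-of-order or duplicate: not worth displaying
            return False
        if self.last_frame >= 0:
            # Any gap between consecutive accepted ids means skipped frames.
            self.lost += frame_id - self.last_frame - 1
        self.last_frame = frame_id
        return True
```

For example, feeding the frame ids 1, 2, 5, 4, 6 accepts every packet except 4, which arrives after 5 and is discarded as stale, while the gap between 2 and 5 is recorded as two skipped frames.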
Architecture
High-level architecture diagram of the solution:
Key Components
PlayStation 4 with F1 2023: Sends real-time telemetry data via UDP to a specific IP and port. The game allows for the data to be broadcast to the entire network or to a targeted endpoint.
EKS Cluster: Deployed in the eu-central-1 region via AWS CDK (EKS Blueprints). The cluster is composed of Spot Instances, specifically ARM-based C6g instances.
Network Load Balancer (NLB): Exposes the UDP listener service to the public internet, mapping to a static Elastic IP.
Application Load Balancer (ALB): Provides secure access to the WebSocket server for external clients and to Grafana for real-time telemetry visualization.
UDP Listener Service: Receives telemetry data from the PlayStation and processes the UDP packets according to the F1 game’s telemetry specification. A sidecar service (Nginx) performs TCP health checks to ensure availability.
WebSocket Server: Publishes the telemetry data to connected clients. For this project, only speed data is transmitted.
Grafana OSS: Configured with a WebSocket plugin to allow real-time visualization of telemetry data, enabling users to monitor key metrics like vehicle speed.
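To give an idea of what "processing the UDP packets" means for the listener, here is a minimal sketch of decoding a binary datagram with Python’s struct module. The layout is hypothetical and heavily simplified (it is NOT the real F1 telemetry packet format, which is far more extensive), and the names PACKET, SPEED_PACKET_ID and parse_speed are assumptions for illustration only.

```python
import struct
from typing import NamedTuple, Optional

# Hypothetical, simplified layout (NOT the real F1 packet format):
# little-endian uint16 format, uint8 packet id, uint32 frame id, uint16 speed.
PACKET = struct.Struct("<HBIH")
SPEED_PACKET_ID = 6  # assumed id of the packet type that carries speed


class SpeedSample(NamedTuple):
    frame_id: int
    speed_kmh: int


def parse_speed(datagram: bytes) -> Optional[SpeedSample]:
    """Decode one datagram, or return None if it is not a speed packet."""
    if len(datagram) < PACKET.size:
        return None  # truncated or foreign datagram: ignore it
    _fmt, packet_id, frame_id, speed = PACKET.unpack_from(datagram)
    if packet_id != SPEED_PACKET_ID:
        return None  # some other packet type that this sketch does not handle
    return SpeedSample(frame_id, speed)
```

Returning None for anything unexpected keeps the listener tolerant of malformed input, which matters for a port exposed to the public internet.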
Workflow
The diagram is labeled with the main steps of the telemetry ingestion workflow, which are briefly explained below:
The PlayStation 4 with F1 2023 sends UDP telemetry data to the Elastic IP associated with the Network Load Balancer.
The NLB forwards this data to the UDP listener service running on AWS EKS.
A sidecar TCP health check ensures the UDP listener's availability by monitoring the service.
The UDP listener processes the telemetry data and forwards it to the WebSocket server.
The WebSocket server broadcasts the telemetry data to connected clients.
Grafana, connected via the WebSocket plugin, visualizes the data, providing real-time insights into the car's speed.
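The listener-to-clients part of this workflow can be sketched with asyncio from the standard library. This is not the project’s implementation: in-memory queues stand in for WebSocket clients (a real WebSocket server needs an additional library), and the names Broadcaster, TelemetryProtocol and demo are assumptions made up for the example.

```python
import asyncio


class Broadcaster:
    """Fans each message out to every subscriber queue.

    The queues are a stand-in for connected WebSocket clients.
    """

    def __init__(self) -> None:
        self._subscribers: set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        queue: asyncio.Queue = asyncio.Queue(maxsize=100)
        self._subscribers.add(queue)
        return queue

    def publish(self, message: bytes) -> None:
        for queue in self._subscribers:
            if queue.full():
                queue.get_nowait()  # drop the oldest sample: fresh data matters more
            queue.put_nowait(message)


class TelemetryProtocol(asyncio.DatagramProtocol):
    """UDP listener that forwards every datagram to the broadcaster."""

    def __init__(self, broadcaster: Broadcaster) -> None:
        self._broadcaster = broadcaster

    def datagram_received(self, data: bytes, addr) -> None:
        self._broadcaster.publish(data)


async def demo() -> bytes:
    """Send one datagram through the pipeline and return what a client sees."""
    loop = asyncio.get_running_loop()
    broadcaster = Broadcaster()
    client = broadcaster.subscribe()
    listener, _ = await loop.create_datagram_endpoint(
        lambda: TelemetryProtocol(broadcaster), local_addr=("127.0.0.1", 0)
    )
    host, port = listener.get_extra_info("sockname")[:2]
    sender, _ = await loop.create_datagram_endpoint(
        asyncio.DatagramProtocol, remote_addr=(host, port)
    )
    try:
        sender.sendto(b"speed=312")  # plays the role of the game console
        return await asyncio.wait_for(client.get(), timeout=2.0)
    finally:
        sender.close()
        listener.close()
```

Running asyncio.run(demo()) pushes a single datagram through the listener and hands it to the subscribed client. Dropping the oldest sample when a client falls behind is a deliberate choice here, since a slow dashboard should never delay the ingestion path.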
Demo / Showcase
After deploying the project in my personal AWS account, I recorded myself playing a short race at the official Formula 1 Circuit de Barcelona-Catalunya. Obviously, it’s my favourite circuit, as I was born a few kilometres away from it.
Below we can see the recorded gameplay and, to monitor the generated telemetry, the WebSocket server logs produced in real time and the Grafana dashboard with a speed gauge configured:
Conclusion
This architecture demonstrates how AWS EKS, in combination with other services, can be used to build a scalable, reliable, and cost-effective solution for ingesting UDP data. In my opinion, this open source project serves well as a blueprint or template for integrating UDP-based real-time communication into the cloud. The workload is simple and covers only one use case (the car’s speed), but it can be considered a starting point and adapted for other high-frequency data streaming use cases, making it a flexible and scalable template.
From an operational perspective, I want to mention that I’ve used Infrastructure as Code (IaC), defining the resources through CDK and the Kubernetes manifests via Kustomize and ArgoCD for easy GitOps. This ensures a repeatable and maintainable process for scaling and evolving the solution over time.
For local testing, I used Minikube to emulate Kubernetes, which allowed me to test the deployments before rolling them out to AWS, ensuring a smooth integration with the target environment (develop, for this project).
Next Steps
- The full code of this solution is available in my GitHub repository:
https://github.com/acriado-dev/aws-eks-udp-telemetry
- Future improvements could include processing all of the car’s telemetry or adding data analytics for the ingested data. I’m always open to adding new functionality to the project.