Concurrency & Fault Tolerance in Distributed Systems

Peter Mbanugo - Nov 4 - Dev Community

In the landscape of modern distributed systems, different programming languages have taken varied approaches to handling concurrency. While the Erlang VM (BEAM) has long been celebrated for its "let it crash" philosophy and robust actor model, other languages are now catching up with their own implementations. Let's explore how Go, Rust, Elixir, and JavaScript tackle the challenges of building resilient, distributed systems.

Concurrency Models: A Brief Overview

BEAM (Erlang/Elixir)

The Erlang VM's actor model remains unmatched in its native support for fault-tolerant distributed systems. Processes in BEAM are lightweight, isolated, and communicate through message passing. The supervision tree model ensures robustness by allowing parts of the system to fail and recover gracefully.
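To make the isolation and message-passing idea concrete, here is a minimal sketch in Python, used only as neutral notation. The `CounterActor` class, the message tuples, and the `ask_count` helper are all hypothetical names invented for this illustration. A Python thread with a queue only approximates a BEAM process: it does not get a private heap, preemptive per-process scheduling, or supervision.

```python
import queue
import threading


class CounterActor:
    """Hypothetical actor: private state, reachable only through its mailbox."""

    def __init__(self):
        self._mailbox = queue.Queue()  # the only way to reach the actor
        self._count = 0                # private state, touched by one thread only
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def send(self, message):
        """Asynchronous send: enqueue the message and return immediately."""
        self._mailbox.put(message)

    def join(self):
        """Wait for the actor's thread to finish after a "stop" message."""
        self._thread.join()

    def _run(self):
        # Messages are handled one at a time, in arrival order.
        while True:
            message = self._mailbox.get()
            if message[0] == "increment":
                self._count += message[1]
            elif message[0] == "get":
                # Replies are messages too: the caller supplies a reply queue.
                message[1].put(self._count)
            elif message[0] == "stop":
                return


def ask_count(actor, timeout=1.0):
    """Send a "get" request and block until the reply message arrives."""
    reply_to = queue.Queue()
    actor.send(("get", reply_to))
    return reply_to.get(timeout=timeout)


if __name__ == "__main__":
    actor = CounterActor()
    actor.send(("increment", 2))
    actor.send(("increment", 3))
    print(ask_count(actor))  # prints 5
    actor.send(("stop",))
    actor.join()
```

Because the mailbox is first-in, first-out and only the actor's own thread touches `_count`, no lock is needed around the counter. That is the property the actor model relies on.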

Go

Go takes a different approach with goroutines and channels. While not strictly an actor model, Go's "share memory by communicating" philosophy aligns well with actor principles. Goroutines are lightweight and efficient, but lack the built-in supervision and fault tolerance of BEAM.
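For contrast with actors, the channel style can be sketched as a small pipeline. This is an approximation in Python rather than Go: threads stand in for goroutines, a bounded `queue.Queue` stands in for a buffered channel, and the `DONE` sentinel stands in for closing a channel. The function names are hypothetical. Note that the stages are anonymous and only the channels are named, whereas an actor is addressed directly.

```python
import queue
import threading

DONE = object()  # sentinel standing in for closing a channel


def produce(out, n):
    """First stage: emit 0..n-1, then signal completion."""
    for i in range(n):
        out.put(i)  # blocks while the bounded queue is full
    out.put(DONE)


def square(inp, out):
    """Middle stage: square each item and forward the completion signal."""
    while True:
        item = inp.get()
        if item is DONE:
            out.put(DONE)
            return
        out.put(item * item)


def run_pipeline(n):
    """Wire the stages together and collect the results in order."""
    numbers = queue.Queue(maxsize=1)
    squares = queue.Queue(maxsize=1)
    workers = [
        threading.Thread(target=produce, args=(numbers, n)),
        threading.Thread(target=square, args=(numbers, squares)),
    ]
    for worker in workers:
        worker.start()

    results = []
    while True:
        item = squares.get()
        if item is DONE:
            break
        results.append(item)

    for worker in workers:
        worker.join()
    return results


if __name__ == "__main__":
    print(run_pipeline(4))  # prints [0, 1, 4, 9]
```

Nothing here restarts a stage that raises an exception. A crashed stage would simply leave the rest of the pipeline blocked, which is the missing supervision described above.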

Rust

Rust's fearless concurrency is built on its ownership system and type safety. While it provides excellent low-level control and zero-cost abstractions, concurrent programming in Rust typically requires more explicit handling of thread safety and error recovery.

JavaScript

JavaScript's event loop and single-threaded nature make it an outlier. While Web Workers in browsers and worker threads in Node.js exist, JavaScript's concurrency model is fundamentally centered around asynchronous I/O rather than true parallelism.
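The difference between concurrency and parallelism on an event loop can be illustrated with Python's `asyncio`, which follows the same single-threaded, cooperative model. This is an analogy rather than JavaScript itself, and the `worker` and `run_demo` names are hypothetical. Each task runs until it awaits, at which point the loop switches to the next ready task.

```python
import asyncio
import threading


async def worker(name, steps, log):
    """Record each step with the thread it ran on, yielding after each one."""
    for step in range(steps):
        log.append((name, step, threading.get_ident()))
        await asyncio.sleep(0)  # hand control back to the event loop


async def main():
    log = []
    # Both workers are in flight at once, but only one runs at any instant.
    await asyncio.gather(worker("a", 2, log), worker("b", 2, log))
    return log


def run_demo():
    return asyncio.run(main())


if __name__ == "__main__":
    for name, step, thread_id in run_demo():
        print(name, step, thread_id)
```

The log entries of the two workers interleave, yet every entry carries the same thread id. The tasks make progress concurrently without ever executing in parallel.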

| Aspect | Erlang (BEAM) | Go | Rust |
|---|---|---|---|
| Concurrency Model | Actor Model | Communicating Sequential Processes (CSP) | Concurrency with Ownership Model |
| Concurrency Primitives | Lightweight Processes (Actors) | Goroutines | Threads and async/await |
| Communication | Asynchronous Message Passing | Synchronous/Asynchronous Channels | Channels and message passing |
| State Sharing | No Shared State (Isolation) | Mutable Shared Memory (with synchronization) | Mutable Shared Memory (with ownership and borrowing) |
| Fault Tolerance | Built-in with Supervision Trees | Manual Handling with error checking | Safety guarantees with strict compile-time checks and Option/Result types |
| Scalability | Extremely high (millions of processes) | Very high (millions of goroutines) | High (efficient memory and thread management) |
| Use Cases | Telecommunication systems, distributed systems, real-time applications | Web servers, network services, concurrent applications | Systems programming, performance-critical applications, safe concurrency |

P.S. This table was generated by GPT-4o.

The Quest for Fault Tolerance

The journey to achieve fault tolerance has led to the adoption of actor-like patterns in various programming languages, inspired by the success of Erlang/Elixir. In the .NET ecosystem, Microsoft Orleans offers virtual actor abstractions, providing a framework for developers to build distributed systems. However, it doesn't fully replicate the lightweight process model inherent in the BEAM.

In Go, Ergo is an ambitious attempt to close the gap between Go’s native goroutine-based concurrency and the robust actor-based systems of the Erlang VM. It seeks to combine Go's efficiency with the actor model’s strengths.

Similarly, in the Rust ecosystem, frameworks like Actix and Ractor have emerged, leveraging Rust’s safety guarantees while aiming to mimic BEAM’s ability to manage failures with supervision hierarchies. I don’t think Actix allows easy communication across nodes, but Ractor is actively working toward a production-ready cluster module.

On a different note, Cloudflare Workers introduced actor-like patterns through Durable Objects, which, while innovative, are constrained by the single-threaded nature of the V8 JavaScript engine. This approach showcases how JavaScript environments can mimic actor patterns, albeit with limitations.

These efforts emphasize the industry's recognition of the actor model’s contributions to building resilient and robust systems across different programming landscapes, even as they face challenges in performance and runtime integration.

The Actor Model's Enduring Appeal

The actor model has proven its worth over decades, particularly in distributed systems. Key advantages include:

  1. Isolation: Actors maintain private state and communicate only through messages

  2. Location Transparency: The same code works locally or distributed

  3. Fault Tolerance: Supervision hierarchies manage failure gracefully

  4. Scalability: Actors can be easily distributed across nodes
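The fault-tolerance point in the list above is the one most easily shown in code. Below is a deliberately small sketch of the "let it crash" idea in Python. It is not how OTP supervisors are implemented, and `summing_worker`, `supervise`, and `max_restarts` are hypothetical names chosen for the example. The worker contains no defensive error handling. Recovery is entirely the supervisor's job.

```python
import queue

STOP = object()  # sentinel asking the worker to finish normally


def summing_worker(mailbox, results):
    """Hypothetical worker: sums numbers and does no error handling."""
    total = 0  # fresh state on every (re)start
    while True:
        message = mailbox.get()
        if message is STOP:
            results.put(total)
            return
        total += message  # a non-number raises TypeError: let it crash


def supervise(worker, mailbox, results, max_restarts=3):
    """Restart the worker whenever it crashes; give up after max_restarts."""
    restarts = 0
    while True:
        try:
            worker(mailbox, results)
            return restarts  # the worker exited normally
        except Exception:
            restarts += 1
            if restarts > max_restarts:
                raise  # escalate, as a supervisor would to its own parent


def run_demo():
    mailbox, results = queue.Queue(), queue.Queue()
    for message in (1, "boom", 2, STOP):
        mailbox.put(message)
    restarts = supervise(summing_worker, mailbox, results)
    return restarts, results.get()


if __name__ == "__main__":
    print(run_demo())  # prints (1, 2)
```

The bad message crashes the worker once. The restarted worker begins from clean state, so the total accumulated before the crash is lost and only the message that arrived afterwards is counted. Trading in-memory state for a known-good restart is the essence of the approach. A real supervision tree adds restart strategies, restart-rate limits, and nested supervisors on top of it.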

While frameworks like Ergo and Ractor show promise in bringing actor-style concurrency to more languages, they may face fundamental challenges:

  1. Performance Overhead: Actor frameworks often introduce additional overhead compared to native concurrency primitives.

  2. Mental Model: Developers must adapt to thinking in terms of message passing rather than shared state.

Conclusion

I’m curious to see how applications built on Ergo, Ractor, or Actix perform under the same workload as an equivalent application running on the Erlang VM.

The BEAM runtime demonstrates the power of building concurrency and fault tolerance into the core runtime. While other languages can approximate these capabilities through frameworks, the elegance and robustness of having them built into the runtime remain compelling. I believe that’s why Gleam was built to run on the BEAM.
