Ruby: Thread and Fiber

hungle00 - Nov 9 - - Dev Community

In a previous post, I briefly introduced Fiber and then built a simple asynchronous HTTP server using the Async gem. In this post, I'll compare Fiber and Thread and explain why Fiber is the better choice in some cases.

  1. Fibers and Threads
  2. Example: HTTP requests
  3. Example: HTTP server
  4. Concluding

Fibers and Threads

Thread

thread = Thread.new do
  #...
end
thread.join

Fiber

fiber = Fiber.new do
  #...
end
fiber.resume # transfer / Fiber.schedule

As you can see, they have quite similar syntax, so what are the differences between them?

  • The level:
    • Each Ruby thread maps 1:1 to an OS thread.
    • Fibers are implemented at the programming language level; multiple fibers can run inside a single thread.
  • Scheduling mechanism:
    • Threads are scheduled preemptively by almost every modern OS.
    • Fibers are a mechanism for cooperative concurrency.
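
The first point can be checked with a small sketch using only the standard library. It asks three fibers and three threads which thread they are running on; the variable names are mine, chosen for illustration.

```ruby
main = Thread.current

# Each fiber reports the thread it runs on; resume returns the block's value.
fiber_threads = Array.new(3) { Fiber.new { Thread.current } }.map(&:resume)

# Each thread reports itself; value waits for the thread and returns the block's value.
thread_threads = Array.new(3) { Thread.new { Thread.current } }.map(&:value)

fibers_share_main = fiber_threads.all? { |t| t.equal?(main) }
distinct_threads  = thread_threads.uniq.size

p fibers_share_main # => true
p distinct_threads  # => 3
```

All three fibers ran inside the main thread, while every Thread.new produced a separate thread.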

Threads run automatically; they are scheduled by the OS.
With Thread, programmers can only create new threads, give them tasks, and call join to wait for them to finish (or value to get the block's return value). The OS decides when each thread runs and when it pauses in order to achieve concurrency.

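threads = [
  Thread.new { puts "task 1" }, # the block holds the work for this thread
  Thread.new { puts "task 2" }
]
threads.each(&:join) # wait for both threads to finish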
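
As a small illustration of the difference between join and value (a sketch with made-up variable names): join waits and returns the thread itself, while value waits and returns the result of the block.

```ruby
# Two threads, each computing a square.
threads = [2, 3].map { |n| Thread.new { n * n } }

# join blocks until the thread finishes and returns the Thread object.
joined = threads.first.join

# value also blocks until the thread finishes, but returns the block's result.
squares = threads.map(&:value)

p joined.equal?(threads.first) # => true
p squares                      # => [4, 9]
```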

Meanwhile, Fiber gives us more control.
With Fiber, programmers are free to start, pause, and resume them.

  • Fiber.new { }: creates a new fiber, which is started with resume.
  • Fiber.yield: pauses the current fiber and returns control to where the fiber was resumed.
  • After suspension, a fiber can be resumed later at the same point with the same execution state.

The following example passes control between two fibers with transfer, Fiber.yield, and resume:

fib2 = nil

fib = Fiber.new do
  puts "1 - fib started"
  fib2.transfer
  Fiber.yield
  puts "3 - fib resumed"
end

fib2 = Fiber.new do
  puts "2 - control moved to fib2"
  fib.transfer
end

fib.resume
puts ""
fib.resume

Output:

1 - fib started
2 - control moved to fib2

3 - fib resumed
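
Because a suspended fiber keeps its execution state, Fiber.yield can also hand a value back to the caller on each pause. Here is a minimal sketch of a counter built that way; the name counter is only for illustration.

```ruby
counter = Fiber.new do
  n = 0
  loop do
    Fiber.yield n # pause here and hand n back to the caller of resume
    n += 1        # execution continues from this line on the next resume
  end
end

# Each resume runs the fiber until its next Fiber.yield.
values = 3.times.map { counter.resume }

p values # => [0, 1, 2]
```

The local variable n survives between calls to resume, which is what "the same execution state" means in practice.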

Fiber over Thread

  • A fiber is lighter-weight than a thread, so we can spawn more fibers than threads.
  • Less context-switching time (an advantage of cooperative scheduling compared to preemptive scheduling).
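
As a rough illustration rather than a benchmark, the sketch below keeps a thousand fibers suspended at the same time inside a single thread and then finishes them one by one. The count of 1,000 is an arbitrary number picked for the example.

```ruby
# Create 1,000 fibers; each pauses once and then returns its own index.
fibers = Array.new(1_000) do |i|
  Fiber.new do
    Fiber.yield # suspend until resumed a second time
    i
  end
end

fibers.each(&:resume)          # start every fiber; each stops at Fiber.yield
results = fibers.map(&:resume) # resume each one so it can finish

p results.size  # => 1000
p results.first # => 0
p results.last  # => 999
```

Every switch here happens exactly where the code asks for it (resume and Fiber.yield), which is the cooperative scheduling the second bullet refers to.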

Fiber scheduler

In a previous article, I noted that Fibers used to lack a scheduler implementation, which limited their usefulness. A scheduler interface has been officially supported since Ruby 3. To enable fiber scheduling, you need to set a Fiber scheduler object.

Fiber.set_scheduler(scheduler)

You can find a list of Fiber scheduler implementations in the Fiber Scheduler List project.
I suggest using the Async gem. It was created by Samuel Williams and has a robust API for writing concurrent code.

The next part shows how to use Thread, Fiber, and the Async gem to make concurrent HTTP requests.

HTTP requests example

As an example, we will fetch a list of UUIDs from httpbin.org.

require "net/http"
require "json" # needed for JSON.parse below

def get_uuid
  url = "https://httpbin.org/uuid"
  response = Net::HTTP.get(URI(url))
  JSON.parse(response)["uuid"]
end

This request will take about 1s to finish.

Sequential version

def get_http_sequentially
  # Each call blocks until its response arrives
  10.times.map { get_uuid }
end

now = Time.now
puts get_http_sequentially
puts "Sequential runtime: #{Time.now - now}" # about 11-12s

One request takes about 1s, so ten sequential calls take at least 10s.

Ruby 3 concurrency tools

Concurrent version with threads

def get_http_via_threads
  # Start one thread per request
  threads = 10.times.map do
    Thread.new { get_uuid }
  end

  # Thread#value waits for the thread and returns its block's result
  threads.map(&:value)
end
# => 1.3s

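Concurrent version with fibers

require "async"

def get_http_via_fibers
  # Install a scheduler so blocking I/O inside a fiber lets other fibers run
  Fiber.set_scheduler(Async::Scheduler.new)
  results = []

  10.times do
    # Each block runs in its own non-blocking fiber
    Fiber.schedule do
      results << get_uuid
    end
  end
  results
ensure
  # Removing the scheduler closes it, which waits for the scheduled fibers to finish
  Fiber.set_scheduler(nil)
end
# => 1.2s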

Because all requests run concurrently, the total time is roughly the time of the slowest request.
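
This can be reproduced without a network. The sketch below swaps the real request for a hypothetical fake_request helper that just sleeps, runs three of them in threads, and measures the elapsed time; the durations are arbitrary values chosen for the example.

```ruby
# Stand-in for an HTTP call: sleeps for the given number of seconds.
def fake_request(seconds)
  sleep(seconds)
  seconds
end

durations = [0.1, 0.2, 0.3]

started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
results = durations.map { |d| Thread.new { fake_request(d) } }.map(&:value)
elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started

puts "sum of durations: #{durations.sum.round(2)}s"
puts "slowest request:  #{durations.max}s"
puts "elapsed:          #{elapsed.round(2)}s"
```

The elapsed time should land close to the slowest request (0.3s) rather than the sum of all three (0.6s).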


More about Async

Another implementation with the Async gem uses the Kernel#Async method instead of Async::Scheduler:

def get_http_via_async
  results = []

  Async do
    10.times do
      Async do
        results << get_uuid
      end
    end
  end
  results
end

The general structure of Async Ruby programs:

  • You always start with an Async block which is passed a task.
  • That main task is usually used to spawn more Async tasks with task.async.
  • These tasks run concurrently with each other and the main task.


Each task is built on top of a Fiber.

HTTP server example

A minimal HTTP server in Ruby can be implemented with the built-in TCPServer class. It looks like this:

socket = TCPServer.new(HOST, PORT)
socket.listen(SOCKET_READ_BACKLOG)

loop do
  conn = socket.accept # wait for a client to connect
  request = RequestParser.call(conn)
  #... status, headers, body
end
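
The snippet above relies on pieces that are not shown here (HOST, PORT, SOCKET_READ_BACKLOG, RequestParser). As a self-contained sketch of the same accept-and-respond cycle, the example below serves a single request and reads the request line by hand instead of using a parser. The port 0 trick, the response body, and the in-process client are my assumptions for the sake of a runnable example.

```ruby
require "socket"

server = TCPServer.new("127.0.0.1", 0) # port 0 lets the OS pick a free port
port = server.addr[1]

# Serve exactly one request in a background thread.
server_thread = Thread.new do
  conn = server.accept # wait for a client to connect
  request_line = conn.gets # e.g. "GET / HTTP/1.1\r\n"

  # Skip the request headers, which end with an empty line.
  loop do
    line = conn.gets
    break if line.nil? || line == "\r\n"
  end

  body = "Hello\n"
  conn.write("HTTP/1.1 200 OK\r\n" \
             "Content-Type: text/plain\r\n" \
             "Content-Length: #{body.bytesize}\r\n" \
             "Connection: close\r\n\r\n" \
             "#{body}")
  conn.close
  request_line
end

# A tiny client so the example can run on its own.
client = TCPSocket.new("127.0.0.1", port)
client.write("GET / HTTP/1.1\r\nHost: localhost\r\n\r\n")
response = client.read # read until the server closes the connection
client.close

request_line = server_thread.value
server.close

puts request_line
puts response
```

A real server would wrap the accept call in a loop, as in the article's version, and would parse the full request instead of discarding the headers.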

Now we'll make the server handle more than one request at a time.

Thread pool version

pool = ThreadPool.new(size: 5)
loop do
  conn = socket.accept # wait for a client to connect
  pool.schedule do
    # handle each request
    request = RequestParser.call(conn)
  end
end

The idea is to use a thread pool to limit the number of threads running concurrently.
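
ThreadPool is not a class from Ruby's standard library, and its implementation is not shown in this post. One possible shape for it, as a minimal sketch under that assumption, is a fixed set of worker threads pulling jobs from a Queue. The class name SimpleThreadPool and the shutdown method are hypothetical, and error handling is left out (an exception inside a job would end that worker).

```ruby
class SimpleThreadPool
  def initialize(size:)
    @jobs = Queue.new
    # Start a fixed number of workers that wait for jobs.
    @workers = Array.new(size) do
      Thread.new do
        # Queue#pop returns nil once the queue is closed and empty.
        while (job = @jobs.pop)
          job.call
        end
      end
    end
  end

  # Queue a block to be run by the next free worker.
  def schedule(&block)
    @jobs << block
  end

  # Stop accepting jobs and wait for the queued ones to finish.
  def shutdown
    @jobs.close
    @workers.each(&:join)
  end
end

pool = SimpleThreadPool.new(size: 2)
results = Queue.new # thread-safe collection for the job results

5.times { |i| pool.schedule { results << i * 2 } }
pool.shutdown

doubled = Array.new(results.size) { results.pop }.sort
p doubled # => [0, 2, 4, 6, 8]
```

With size: 2, at most two jobs run at the same time no matter how many are scheduled, which is the limit the thread pool is meant to enforce.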

Async version

Async do
  loop do
    conn = socket.accept # wait for a client to connect
    Async do
      # handle each request
      request = RequestParser.call(conn)
    end
  end
end

Falcon is the best-known app server built on Async.

More details about the implementation and benchmark testing are in this repo.

Concluding

  1. Threads and fibers both let programmers write concurrent code, which is very useful for handling blocking I/O operations.
  2. As Ruby developers, we rarely use Thread directly. In practice, though, many web development tools use threads:
    • A web server like Puma or WEBrick
    • A background job processor like Sidekiq, GoodJob, or SolidQueue
    • An ORM like ActiveRecord or Sequel
    • An HTTP client like HTTParty or RestClient
  3. Fiber combined with the Fiber scheduler has only been available since Ruby 3 and may have a bright future thanks to its advantages over Thread. Here are a couple of the most useful tools built on top of fibers:
    • async-http, a featureful HTTP client
    • falcon, an HTTP server built around the Async core
    • ...

The content of this article comes from my latest tech sharing session with my team at https://pixta.vn/. You can see the full sharing here.
