Let's build a code execution engine

Arik - Sep 10 '23 - Dev Community

Have you ever wondered what happens behind the scenes when you hit "Run" on a code snippet in online development environments like Go Playground or OneCompiler?

By the end of this post, you will have a front-end and a backend for a very bare-bones implementation that looks something like this:

screenshot

If you just want to see the code, you can find it here.

It's important to note that while skeletal, the implementation is by no means a "toy". We will address the most important considerations for building the core of such a platform, namely:

  1. Security - We are letting users execute arbitrary code on our servers, so we need a way to isolate the code execution and limit the possibility of abuse as much as possible.

  2. Scalability - We need a way to scale our system as the number of users grows.

  3. Limits - We want to limit the resources allocated to any given code execution so that it neither overloads our servers nor degrades other users' experience.

What are we going to use

In order to address all the considerations mentioned above, we are going to use Tork to do all the heavy lifting for us.

In a nutshell, Tork is a general purpose, distributed workflow engine that I've been working on for the past couple of months.

It uses Docker containers for the execution of workflow tasks which addresses point #1 and point #3 for us - we'll see exactly how in a minute.

It also supports a distributed setup to scale task processing to an arbitrary number of worker nodes which addresses point #2.

There are two ways we can go about this:

  1. Download and install vanilla Tork and write a thin API service that sits between the client and Tork. We don't want to expose Tork's native API to clients directly, so that we keep tight control over which parameters are sent to Tork. The advantage of this approach is that the "middleware" server can be written in any language.

  2. Extend Tork to expose our custom API endpoint and disable all other endpoints. This requires knowledge of Go programming.

For the purposes of this demo I'll be going with option #2.

OK, let's write some code

You'll need:

  1. Docker installed on the machine that you're running the demo on.
  2. Go 1.19 or later

Create a new directory for the project:



mkdir code-execution-demo
cd code-execution-demo



Initialize the project:



go mod init example.com/code-execution-demo



Get the Tork dependency:



go get github.com/runabol/tork



Create a main.go file at the root of the project with the minimum boilerplate necessary to start Tork:



package main

import (
    "fmt"
    "os"

    "github.com/runabol/tork/cli"
    "github.com/runabol/tork/conf"
)

func main() {
   // Load the Tork config file (if exists) 
   if err := conf.LoadConfig(); err != nil {
     fmt.Println(err)
     os.Exit(1)
   }

   // Start the Tork CLI
   app := cli.New()
   if err := app.Run(); err != nil {
     fmt.Println(err)
     os.Exit(1)
   }
}



Start Tork:



go run main.go



If all goes well, you should see something like this:



 _______  _______  ______    ___   _ 
|       ||       ||    _ |  |   | | |
|_     _||   _   ||   | ||  |   |_| |
  |   |  |  | |  ||   |_||_ |      _|
  |   |  |  |_|  ||    __  ||     |_ 
  |   |  |       ||   |  | ||    _  |
  |___|  |_______||___|  |_||___| |_|
...



Let's use the RegisterEndpoint hook to register our custom endpoint:



package main

import (
    "fmt"
    "net/http"
    "os"

    "github.com/runabol/tork/cli"
    "github.com/runabol/tork/conf"
    "github.com/runabol/tork/middleware/web"
)

func main() {
    // removed for brevity

    app.RegisterEndpoint(http.MethodPost, "/execute", handler)

    // removed for brevity
}

func handler(c web.Context) error {
    return c.String(http.StatusOK, "OK")
}




Start Tork in standalone (not distributed) mode:



go run main.go run standalone



Call the new endpoint from another terminal window:



% curl -X POST http://localhost:8000/execute
OK



So far so good.

Let's assume the client is going to send us the following JSON object:



{
  "language":"python|bash|go|etc.",
  "code":"the source code to execute"
}



Let's write a struct that we can bind these values to:



type ExecRequest struct {
    Code     string `json:"code"`
    Language string `json:"language"`
}


Next, update the handler to bind the request body to the struct:


func handler(c web.Context) error {
  req := ExecRequest{}
  if err := c.Bind(&req); err != nil {
    c.Error(http.StatusBadRequest, err)
    return nil
  }  

  return c.JSON(http.StatusOK,req)
}



At this point we just echo the request back to the user. But it's a good stepping stone to make sure the binding logic works. Let's try it:



% curl -X POST -H "content-type:application/json" -d '{"language":"bash","code":"echo hello world"}' http://localhost:8000/execute

{"code":"echo hello world","language":"bash"}
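
Before going further, it can be worth rejecting obviously bad requests right after binding. The following is a minimal, standalone sketch of that idea and not part of the demo code: the `validate` function and the `maxCodeBytes` limit are hypothetical names and values chosen for illustration.

```go
package main

import (
	"errors"
	"fmt"
)

// ExecRequest mirrors the struct from the post.
type ExecRequest struct {
	Code     string `json:"code"`
	Language string `json:"language"`
}

// maxCodeBytes is a hypothetical size limit picked for this sketch.
const maxCodeBytes = 64 * 1024

// validate rejects requests with missing fields or an oversized snippet.
func validate(req ExecRequest) error {
	if req.Language == "" {
		return errors.New("language is required")
	}
	if req.Code == "" {
		return errors.New("code is required")
	}
	if len(req.Code) > maxCodeBytes {
		return fmt.Errorf("code exceeds %d bytes", maxCodeBytes)
	}
	return nil
}

func main() {
	// A well-formed request passes validation.
	fmt.Println(validate(ExecRequest{Language: "bash", Code: "echo hello world"}))
	// A request without code is rejected.
	fmt.Println(validate(ExecRequest{Language: "bash"}))
}
```

In the handler, a check like this would sit between the bind step and whatever comes next, returning a 400 response when it fails.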



OK, next we need to convert the request to a Tork task:



func buildTask(er ExecRequest) (input.Task, error) {
        var image string
        var run string
        var filename string

        switch er.Language {
        case "":
                return input.Task{}, errors.Errorf("require: language")
        case "python":
                image = "python:3"
                filename = "script.py"
                run = "python script.py > $TORK_OUTPUT"
        case "go":
                image = "golang:1.19"
                filename = "main.go"
                run = "go run main.go > $TORK_OUTPUT"
        case "bash":
                image = "alpine:3.18.3"
                filename = "script"
                run = "sh ./script > $TORK_OUTPUT"
        default:
                return input.Task{}, errors.Errorf("unknown language: %s", er.Language)
        }

        return input.Task{
                Name:    "execute code",
                Image:   image,
                Run:     run,
                Files: map[string]string{
                        filename: er.Code,
                },
        }, nil
}



So we are doing three things here essentially:

  1. Map the language field to a Docker image.
  2. Write the code to an appropriate file in the container depending on the language.
  3. Run the necessary command to execute the code in the container.
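
As the list of supported languages grows, the switch statement could be swapped for a lookup table. Here is a rough, standalone sketch of that alternative, using the same images, filenames and commands as above. The `langRuntime` type, the `runtimes` map and `lookupRuntime` are hypothetical names, and the sketch deliberately avoids Tork's types so that it runs on its own.

```go
package main

import "fmt"

// langRuntime describes how to run a snippet written in one language.
type langRuntime struct {
	Image    string // Docker image to run the code in
	Filename string // file the code is written to inside the container
	Run      string // command that executes the file
}

// runtimes maps the request's language field to its runtime settings.
var runtimes = map[string]langRuntime{
	"python": {Image: "python:3", Filename: "script.py", Run: "python script.py > $TORK_OUTPUT"},
	"go":     {Image: "golang:1.19", Filename: "main.go", Run: "go run main.go > $TORK_OUTPUT"},
	"bash":   {Image: "alpine:3.18.3", Filename: "script", Run: "sh ./script > $TORK_OUTPUT"},
}

// lookupRuntime returns the settings for a language, or an error if the
// language is missing or unsupported.
func lookupRuntime(language string) (langRuntime, error) {
	if language == "" {
		return langRuntime{}, fmt.Errorf("require: language")
	}
	rt, ok := runtimes[language]
	if !ok {
		return langRuntime{}, fmt.Errorf("unknown language: %s", language)
	}
	return rt, nil
}

func main() {
	rt, err := lookupRuntime("python")
	fmt.Println(rt.Image, rt.Filename, err)
}
```

With this shape, adding a language is a one-line change to the map, and the task can be built from the returned fields in the same way as in buildTask.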

Let's use it in our handler:



task, err := buildTask(req)
if err != nil {
  c.Error(http.StatusBadRequest, err)
  return nil
}



And finally, let's submit the job:



input := &input.Job{
  Name:  "code execution",
  Tasks: []input.Task{task},
}   

job, err := engine.SubmitJob(c.Request().Context(), input)
if err != nil {
  return err
}

fmt.Printf("job %s submitted!\n", job.ID)   



Let's try to run our updated handler:



go run main.go run standalone




curl -X POST -H "content-type:application/json" -d '{"language":"bash","code":"echo hello world"}' http://localhost:8000/execute



If all goes well, you should see something like this in the logs:



job 5488620e9bc34e09b6ec3677ea28a067 submitted!



Next, we want to get the execution output so we can return it to the client. But since Tork operates asynchronously, we need a way to have Tork tell us when the job is done (or has failed).

This is where JobListener comes in:



result := make(chan string)

listener := func(j *tork.Job) {
  if j.State == tork.JobStateCompleted {
    result <- j.Execution[0].Result
  } else {
    result <- j.Execution[0].Error
  }
}

// pass the listener to the submit job call
job, err := engine.SubmitJob(c.Request().Context(), input, listener)
if err != nil {
  return err
}

return c.JSON(http.StatusOK, map[string]string{"output": <-result})



Since the job listener does not execute in the same goroutine as the handler, we need a way to pass the result back to the handler. Luckily, Go has a really convenient construct called a channel which does exactly that.
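
One thing to keep in mind is that receiving from a channel blocks until something is sent, so the handler would wait indefinitely if the listener never fired. Below is a minimal, standalone sketch of the same hand-off with a buffered channel and a timeout. It does not use Tork's API: `runJob` and `awaitResult` are hypothetical helpers, a plain goroutine stands in for the engine calling the listener, and the listener is assumed to be called at most once.

```go
package main

import (
	"context"
	"fmt"
	"time"
)

// awaitResult waits for a value on result and gives up when ctx is done.
func awaitResult(ctx context.Context, result <-chan string) (string, error) {
	select {
	case out := <-result:
		return out, nil
	case <-ctx.Done():
		return "", ctx.Err()
	}
}

// runJob simulates an asynchronous job that reports its output through a
// listener after delay, and waits at most timeout for that output.
func runJob(delay, timeout time.Duration, output string) (string, error) {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	// Buffered so the listener can send without blocking, even if the
	// caller has already stopped waiting.
	result := make(chan string, 1)
	listener := func(out string) { result <- out }

	// Stand-in for the engine invoking the listener from another goroutine.
	go func() {
		time.Sleep(delay)
		listener(output)
	}()

	return awaitResult(ctx, result)
}

func main() {
	// The job finishes well within the timeout.
	out, err := runJob(10*time.Millisecond, time.Second, "hello world\n")
	fmt.Printf("%q %v\n", out, err)

	// The job takes longer than the timeout, so the wait is abandoned.
	_, err = runJob(time.Second, 10*time.Millisecond, "too late")
	fmt.Println(err)
}
```

In the real handler, the request's own context could play the role of ctx, so that a client disconnecting also ends the wait.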

OK, let's see if this works:



curl -X POST -H "content-type:application/json" -d '{"language":"bash","code":"echo hello world"}' http://localhost:8000/execute
{"output":"hello world\n"}



Nice!
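
If you'd rather call the endpoint from Go than from curl, a small client could look roughly like the sketch below. The `execute` and `newFakeServer` functions are hypothetical helpers; the fake server only imitates the response shape shown above so the example runs without Tork. Against the real demo you would pass http://localhost:8000 as the base URL instead.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
)

// ExecRequest mirrors the struct from the post.
type ExecRequest struct {
	Code     string `json:"code"`
	Language string `json:"language"`
}

// execute posts a snippet to baseURL/execute and returns the "output" field
// of the JSON response.
func execute(baseURL string, req ExecRequest) (string, error) {
	body, err := json.Marshal(req)
	if err != nil {
		return "", err
	}
	resp, err := http.Post(baseURL+"/execute", "application/json", bytes.NewReader(body))
	if err != nil {
		return "", err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return "", fmt.Errorf("unexpected status: %s", resp.Status)
	}
	var out struct {
		Output string `json:"output"`
	}
	if err := json.NewDecoder(resp.Body).Decode(&out); err != nil {
		return "", err
	}
	return out.Output, nil
}

// newFakeServer returns a stand-in server that always answers with a fixed
// output, so the client can be tried without a running backend.
func newFakeServer() *httptest.Server {
	return httptest.NewServer(http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		w.Header().Set("Content-Type", "application/json")
		json.NewEncoder(w).Encode(map[string]string{"output": "hello world\n"})
	}))
}

func main() {
	srv := newFakeServer()
	defer srv.Close()

	out, err := execute(srv.URL, ExecRequest{Language: "bash", Code: "echo hello world"})
	fmt.Printf("%q %v\n", out, err)
}
```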

Security

Let's update our Task definition to enforce a stricter set of security constraints:



input.Task{
  Name:  "execute code",
  Image: image,
  Run:   run,
  Limits: &input.Limits{
    CPUs:   ".5",  // no more than half a CPU
    Memory: "20m", // no more than 20MB of RAM
  },
  Timeout:  "5s",             // terminate container after 5 seconds
  Networks: []string{"none"}, // disable networking
  Files: map[string]string{
    filename: er.Code,
  },
}



Let's disable Tork's built-in endpoints:

Create a file named config.toml in the root of your project with the following contents:



# config.toml
[coordinator.api]
endpoints.health = true
endpoints.jobs = false
endpoints.tasks = false
endpoints.nodes = false
endpoints.queues = false
endpoints.stats = false



Now when you start the project you should see that Tork picked up the config:



% go run main.go run standalone          
7:08PM INF Config loaded from config.tom
...



Frontend

Let's try to get the frontend to talk to our backend:



git clone git@github.com:runabol/code-execution-demo.git
cd code-execution-demo/frontend
npm i
npm run dev



And open http://localhost:3000

Scaling Up

The last point we are left to address is scalability.

There are many ways to tweak Tork for scalability, but for our purposes here we'll keep it simple and do the bare minimum of starting a RabbitMQ broker which will allow us to distribute the task processing:

Start a RabbitMQ broker. Example:



docker run \
  -d \
  --name=tork-rabbit \
  -p 5672:5672 \
  -p 15672:15672 \
  rabbitmq:3-management



Next, add the following to your config.toml:



[broker]
type = "rabbitmq"

[broker.rabbitmq]
url = "amqp://guest:guest@localhost:5672"



Stop Tork if it's currently running.

Start the Tork Coordinator:



go run main.go run coordinator



From a separate terminal window start a worker. You can also start additional workers if you like:



go run main.go run worker



Using curl or the frontend, try submitting a code snippet.

Conclusion

Hope you enjoyed this tutorial as much as I did.

The full source code can be found on Github.
