Have you ever wondered what happens behind the scenes when you hit "Run" on a code snippet in online development environments like Go Playground or OneCompiler?
By the end of this post, you will have the front-end and backend for a very bare-bones implementation of such a platform.
If you just want to see the code, you can find it here.
It's important to note that, while skeletal, the implementation is by no means a "toy". We will address the most important considerations for building the core of such a platform, namely:
1. Security - we are letting users execute arbitrary code on our servers, so we need a way to isolate the code execution in order to limit the possibility of abuse as much as possible.
2. Scalability - we need a way to scale our system as the number of users grows.
3. Limits - we want to limit the amount of resources we allocate to a given code execution so that it neither taxes our servers nor hurts other users' experience.
What are we going to use
In order to address all the considerations mentioned above, we are going to use Tork to do all the heavy lifting for us.
In a nutshell, Tork is a general purpose, distributed workflow engine that I've been working on for the past couple of months.
It uses Docker containers for the execution of workflow tasks which addresses point #1 and point #3 for us - we'll see exactly how in a minute.
It also supports a distributed setup to scale task processing to an arbitrary number of worker nodes which addresses point #2.
There are two ways we can go about this:
1. Download and install vanilla Tork and write a new thin API service that sits between the client and Tork. We don't necessarily want to expose Tork's native API to clients, because we want tight control over which parameters are sent to Tork. The advantage of this approach is that you can write the "middleware" server in a language of your choice.
2. Extend Tork to expose our custom API endpoint and disable all other endpoints. This requires knowledge of Go programming.
For the purposes of this demo I'll be going with option #2.
OK, let's write some code
You'll need:
- Docker installed on the machine that you're running the demo on.
- Go >= 1.19
Create a new directory for the project:
mkdir code-execution-demo
cd code-execution-demo
Initialize the project:
go mod init example.com/code-execution-demo
Get the Tork dependency:
go get github.com/runabol/tork
Create a main.go file at the root of the project with the minimum boilerplate necessary to start Tork:
package main

import (
	"fmt"
	"os"

	"github.com/runabol/tork/cli"
	"github.com/runabol/tork/conf"
)

func main() {
	// Load the Tork config file (if exists)
	if err := conf.LoadConfig(); err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
	// Start the Tork CLI
	app := cli.New()
	if err := app.Run(); err != nil {
		fmt.Println(err)
		os.Exit(1)
	}
}
Start Tork:
go run main.go
If all goes well, you should see something like this:
_______ _______ ______ ___ _
| || || _ | | | | |
|_ _|| _ || | || | |_| |
| | | | | || |_||_ | _|
| | | |_| || __ || |_
| | | || | | || _ |
|___| |_______||___| |_||___| |_|
...
Let's use the RegisterEndpoint hook to register our custom endpoint:
package main

import (
	"fmt"
	"net/http"
	"os"

	"github.com/runabol/tork/cli"
	"github.com/runabol/tork/conf"
	"github.com/runabol/tork/middleware/web"
)

func main() {
	// removed for brevity
	app.RegisterEndpoint(http.MethodPost, "/execute", handler)
	// removed for brevity
}

func handler(c web.Context) error {
	return c.String(http.StatusOK, "OK")
}
Start Tork in standalone (not distributed) mode:
go run main.go run standalone
Call the new endpoint from another terminal window:
% curl -X POST http://localhost:8000/execute
OK
So far so good.
Let's assume the client is going to send us the following JSON object:
{
"language":"python|bash|go|etc.",
"code":"the source code to execute"
}
Let's write a struct that we can bind these values to:
type ExecRequest struct {
	Code     string `json:"code"`
	Language string `json:"language"`
}

func handler(c web.Context) error {
	req := ExecRequest{}
	if err := c.Bind(&req); err != nil {
		c.Error(http.StatusBadRequest, err)
		return nil
	}
	return c.JSON(http.StatusOK, req)
}
At this point we just echo the request back to the user. But it's a good stepping stone to make sure the binding logic works. Let's try it:
% curl -X POST -H "content-type:application/json" -d '{"language":"bash","code":"echo hello world"}' http://localhost:8000/execute
{"code":"echo hello world","language":"bash"}
OK, next we need to convert the request to a Tork task:
// Note: errors.Errorf comes from github.com/pkg/errors, not the standard library.
func buildTask(er ExecRequest) (input.Task, error) {
	var image string
	var run string
	var filename string
	switch er.Language {
	case "":
		return input.Task{}, errors.Errorf("require: language")
	case "python":
		image = "python:3"
		filename = "script.py"
		run = "python script.py > $TORK_OUTPUT"
	case "go":
		image = "golang:1.19"
		filename = "main.go"
		run = "go run main.go > $TORK_OUTPUT"
	case "bash":
		image = "alpine:3.18.3"
		filename = "script"
		run = "sh ./script > $TORK_OUTPUT"
	default:
		return input.Task{}, errors.Errorf("unknown language: %s", er.Language)
	}
	return input.Task{
		Name:  "execute code",
		Image: image,
		Run:   run,
		Files: map[string]string{
			filename: er.Code,
		},
	}, nil
}
So we are essentially doing three things here:
- Map the language field to a Docker image.
- Write the code to an appropriate file in the container, depending on the language.
- Run the necessary command to execute the code in the container.
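As the list of languages grows, the switch statement can become unwieldy. One possible alternative, sketched here with the standard library only, is a lookup table holding the same three values per language. The `langRuntime` type and `lookupRuntime` function are hypothetical names for this illustration and are not part of Tork.

```go
package main

import (
	"errors"
	"fmt"
)

// langRuntime describes how one language is executed.
type langRuntime struct {
	Image    string // Docker image to run
	Filename string // file the submitted code is written to
	Run      string // command executed inside the container
}

// runtimes holds the same values as the switch statement in buildTask.
var runtimes = map[string]langRuntime{
	"python": {Image: "python:3", Filename: "script.py", Run: "python script.py > $TORK_OUTPUT"},
	"go":     {Image: "golang:1.19", Filename: "main.go", Run: "go run main.go > $TORK_OUTPUT"},
	"bash":   {Image: "alpine:3.18.3", Filename: "script", Run: "sh ./script > $TORK_OUTPUT"},
}

// lookupRuntime returns the runtime for a language, or an error if the
// language is empty or not in the table.
func lookupRuntime(language string) (langRuntime, error) {
	if language == "" {
		return langRuntime{}, errors.New("require: language")
	}
	rt, ok := runtimes[language]
	if !ok {
		return langRuntime{}, fmt.Errorf("unknown language: %s", language)
	}
	return rt, nil
}

func main() {
	rt, err := lookupRuntime("python")
	if err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(rt.Image, rt.Filename)
}
```

With this layout, supporting another language means adding one map entry, and buildTask only has to copy the three fields into the task.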
Let's use it in our handler:
task, err := buildTask(req)
if err != nil {
	c.Error(http.StatusBadRequest, err)
	return nil
}
And finally, let's submit the job:
input := &input.Job{
	Name:  "code execution",
	Tasks: []input.Task{task},
}
job, err := engine.SubmitJob(c.Request().Context(), input)
if err != nil {
	return err
}
fmt.Printf("job %s submitted!\n", job.ID)
Let's try to run our updated handler:
go run main.go run standalone
curl -X POST -H "content-type:application/json" -d '{"language":"bash","code":"echo hello world"}' http://localhost:8000/execute
If all goes well, you should see something like this in the logs:
job 5488620e9bc34e09b6ec3677ea28a067 submitted!
Next, we want to get the execution output so we can return it to the client. But since Tork operates asynchronously, we need a way to tell Tork to let us know when the job is done (or has failed).
This is where JobListener comes in:
result := make(chan string)
listener := func(j *tork.Job) {
	if j.State == tork.JobStateCompleted {
		result <- j.Execution[0].Result
	} else {
		result <- j.Execution[0].Error
	}
}
// pass the listener to the submit job call
job, err := engine.SubmitJob(c.Request().Context(), input, listener)
if err != nil {
	return err
}
return c.JSON(http.StatusOK, map[string]string{"output": <-result})
Since the job listener does not run in the handler's goroutine, we need a way to pass the result back to the handler. Luckily, Go has a really convenient construct for exactly that: the channel.
OK, let's see if this works:
curl -X POST -H "content-type:application/json" -d '{"language":"bash","code":"echo hello world"}' http://localhost:8000/execute
{"output":"hello world\n"}
Nice!
Security
Let's update our Task definition to enforce a stricter set of security constraints:
input.Task{
	Name:  "execute code",
	Image: image,
	Run:   run,
	Limits: &input.Limits{
		CPUs:   ".5",  // no more than half a CPU
		Memory: "20m", // no more than 20MB of RAM
	},
	Timeout:  "5s",             // terminate container after 5 seconds
	Networks: []string{"none"}, // disable networking
	Files: map[string]string{
		filename: er.Code,
	},
}
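A quick way to check that these constraints are in effect is to submit snippets that deliberately violate them. The sketch below only builds the request bodies for two such probes, which can then be sent with curl as before. The probe snippets and the `probe` and `payload` names are made up for this illustration, and what the endpoint returns for each probe is not shown here.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ExecRequest mirrors the JSON body expected by the /execute endpoint.
type ExecRequest struct {
	Code     string `json:"code"`
	Language string `json:"language"`
}

// probe pairs a request with the constraint it is meant to exercise.
type probe struct {
	Constraint string
	Request    ExecRequest
}

// probes are hypothetical snippets, one per constraint.
var probes = []probe{
	// Loops forever, so it should run into the 5 second timeout.
	{"timeout", ExecRequest{Language: "bash", Code: "while true; do :; done"}},
	// Needs network access, which the "none" network should deny.
	{"network", ExecRequest{Language: "bash", Code: "wget -q -O- http://example.com"}},
}

// payload renders a probe's request as a JSON body.
func payload(p probe) (string, error) {
	b, err := json.Marshal(p.Request)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

func main() {
	for _, p := range probes {
		body, err := payload(p)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: %s\n", p.Constraint, body)
	}
}
```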
Let's disable Tork's built-in endpoints:
Create a file named config.toml in the root of your project with the following contents:
# config.toml
[coordinator.api]
endpoints.health = true
endpoints.jobs = false
endpoints.tasks = false
endpoints.nodes = false
endpoints.queues = false
endpoints.stats = false
Now when you start the project you should see that Tork picked up the config:
% go run main.go run standalone
7:08PM INF Config loaded from config.toml
...
Frontend
Let's try to get the frontend to talk to our backend:
git clone git@github.com:runabol/code-execution-demo.git
cd code-execution-demo/frontend
npm i
npm run dev
And open http://localhost:3000
Scaling Up
The last point we are left to address is scalability.
There are many ways to tweak Tork for scalability, but for our purposes here we'll keep it simple and do the bare minimum of starting a RabbitMQ broker which will allow us to distribute the task processing:
Start a RabbitMQ broker. Example:
docker run \
-d \
--name=tork-rabbit \
-p 5672:5672 \
-p 15672:15672 \
rabbitmq:3-management
Next, add the following to your config.toml:
[broker]
type = "rabbitmq"
[broker.rabbitmq]
url = "amqp://guest:guest@localhost:5672"
Stop Tork if it's currently running.
Start the Tork Coordinator:
go run main.go run coordinator
From a separate terminal window start a worker. You can also start additional workers if you like:
go run main.go run worker
Using curl or the frontend, try submitting a code snippet.
Conclusion
Hope you enjoyed this tutorial as much as I did.
The full source code can be found on GitHub.