A new graphics API was born

There are a few popular "low-level" graphics APIs available, like OpenGL, DirectX, Metal and Vulkan. Depending on your requirements you have to pick one, or use some higher-level graphics abstraction to be able to develop for multiple platforms with a single code base.

For a long time, before Vulkan, only OpenGL was "cross-platform". OpenGL wasn't famous for its performance and was therefore often not supported by AAA games. Most of them use DirectX on Windows.

Now it's 202x. It feels a bit like a waste of time that we need different graphics APIs on different operating systems while using exactly the same hardware under the hood. For a long time a single code base was only possible with OpenGL, but the resource overhead of running, for example, a game with OpenGL compared to the same settings on Windows with DirectX seems to be a dealbreaker for most game companies.

The next milestone towards cross-platform graphics APIs was set in 2016 with Vulkan. Behind its development is none other than the Khronos Group, well known for OpenGL and many other open standards. In 2018 MoltenVK, a previously commercial implementation that runs Vulkan on top of Apple's Metal, was open-sourced and has been free to use ever since.

Vulkan is also well suited for mobile devices. So far we were talking about resources and performance on desktop machines, but mobile app developers should care about performance just as much, and Vulkan can help here, especially as the successor to OpenGL ES.

How to start writing a Vulkan application?

I think most of us, when using a new API, just skim some specs and then get started. That's possible here too, but without a good background you will not understand much. On top of that, a simple "hello world" triangle in a Vulkan C++ application takes about 600 lines of code. Changing its color per frame, double/triple buffering or input vertex buffers easily take you beyond a thousand lines. And I think that's the moment when developers suddenly start feeling fine with OpenGL. But don't despair, it's not too hard.

1. Compare some Vulkan and OpenGL code

Comparing 600 lines of C++ code using Vulkan to 100 lines for OpenGL should make us wonder what OpenGL is doing behind the scenes. One of the big changes coming from OpenGL is that state changes are now expensive not only in performance, but also in the amount of code to write. In Vulkan we kind of pre-bake graphics pipelines and run pre-recorded command buffers, so e.g. switching to wireframe rendering would probably mean another graphics pipeline with an almost identical setup. It's certainly not just one gl... call in the render loop.

The reason everything seems to get more complicated is that we give the Vulkan driver more detailed information up front, so it can optimize better. In OpenGL everything can happen at any time and the driver needs to be very smart. The good thing, on the other hand, is that the render loop is much more compact.
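To make the "pre-baked pipeline" idea a bit more tangible, here is a toy model in plain C++ that does not use the Vulkan headers at all. Everything in it (PolygonMode, PipelineDesc, BakedPipelines) is a made-up stand-in for illustration, not the real API. It only shows the principle: the full state description is the key, so solid and wireframe rendering end up as two separate baked objects.

```cpp
#include <cstddef>
#include <map>
#include <tuple>

// Hypothetical stand-ins for illustration, NOT the real Vulkan types.
enum class PolygonMode { Fill, Line };

// The complete fixed state a pipeline is baked from (heavily shortened).
struct PipelineDesc {
    PolygonMode polygon_mode = PolygonMode::Fill;
    bool depth_test = true;

    // Ordering so that a description can be used as a map key.
    bool operator<(const PipelineDesc& other) const {
        return std::tie(polygon_mode, depth_test) <
               std::tie(other.polygon_mode, other.depth_test);
    }
};

using PipelineHandle = std::size_t;

// Bakes one pipeline object per distinct description and reuses it afterwards.
class BakedPipelines {
public:
    PipelineHandle get_or_create(const PipelineDesc& desc) {
        auto it = baked_.find(desc);
        if (it != baked_.end()) {
            return it->second;  // already baked, cheap to reuse
        }
        // In a real application the expensive pipeline creation would go here.
        const PipelineHandle handle = baked_.size() + 1;
        baked_.emplace(desc, handle);
        return handle;
    }

    std::size_t baked_count() const { return baked_.size(); }

private:
    std::map<PipelineDesc, PipelineHandle> baked_;
};
```

Asking for a description with polygon_mode set to PolygonMode::Line returns a different handle than the default one, while asking for the same description twice returns the same handle. In the render loop only the cheap "bind handle" step would remain.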

2. Understand why some Vulkan boiler-plate is necessary to get the most out of the hardware

A good starting point, rather than copying code from tutorials without knowing what it does, is to understand how today's hardware works. E.g. before Vulkan I hadn't heard of tiling. I could not understand why reading an image buffer linearly, as if it were stored like a screenshot, might not be possible. But after reading about hardware and caching it absolutely makes sense.
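A small sketch may help to see why tiling matters for caching. The real tiled layouts are driver- and hardware-specific and not something you compute yourself, so the square-tile scheme below is purely an assumption for illustration, and both function names are made up. It compares where a pixel lands in memory for a row-major ("screenshot-like") layout and for a simple tiled layout.

```cpp
#include <cstddef>

// Row-major layout: pixels of one row are neighbours in memory,
// vertical neighbours are a full row (width) apart.
inline std::size_t linear_offset(std::size_t x, std::size_t y, std::size_t width) {
    return y * width + x;
}

// Hypothetical tiled layout: the image is cut into tile x tile squares and
// each square is stored contiguously. Assumes width is a multiple of tile.
inline std::size_t tiled_offset(std::size_t x, std::size_t y,
                                std::size_t width, std::size_t tile) {
    const std::size_t tiles_per_row = width / tile;
    const std::size_t tile_index = (y / tile) * tiles_per_row + (x / tile);
    const std::size_t offset_in_tile = (y % tile) * tile + (x % tile);
    return tile_index * tile * tile + offset_in_tile;
}
```

For an image 8 pixels wide with 4x4 tiles, the pixel directly below (0,0) sits 8 elements away in the linear layout but only 4 away in the tiled one, and that gap stays at 4 no matter how wide the image gets. Pixels that are close on screen stay close in memory, which is what a cache likes, and it also explains why such an image cannot simply be read line by line.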

Kudos to Samsung's Galaxy GameDev pages. You can find some good resources there about how graphics hardware works and how Vulkan works.

3. Getting comfortable with the base Vulkan architecture and the first steps

There are a lot of good resources on this, too, but I'll try to give a much more compact overview.

Vulkan C API

Like OpenGL, Vulkan is a dynamic API with spec versions. Depending on the version, you load function pointers dynamically.
There are 3 API tiers. Every program starts by discovering the hardware setup top-down and finally picks one or more suitable devices for rendering or computation:
- entry-point functions
- instance functions
- device functions
Almost every function returns either a Vulkan result code (VkResult) or void. Any other results are handed back through out parameters. Functions with a void return value are mostly validated, if at all, at a subsequent API call.
Functions may return error codes for expected errors, or simply crash on unexpected input or state. The driver often doesn't even spend time on null-checking pointers and just crashes. For debugging and tracking down bad usage, such as destroying handles in the wrong order, there are optional validation layers.
There is no thread context at all, unlike in OpenGL. Handles can be passed to other threads as well. Multi-threading is possible and good for performance, primarily for command buffer operations.
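The "result code plus out parameters" convention is easiest to see with the two-call enumeration idiom that real calls such as vkEnumeratePhysicalDevices follow: ask for the count first, then fetch the items. The sketch below imitates that shape with a mock so it runs without a Vulkan SDK. Result, DeviceId, enumerate_devices and list_devices are hypothetical names, and the three fake devices are invented.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Hypothetical stand-ins for illustration, NOT the real Vulkan types.
enum class Result { Success, Incomplete };
using DeviceId = std::uint32_t;

// Mock of a driver-side enumeration function.
// devices == nullptr: only the total count is written to *count.
// otherwise: up to *count items are written and *count is set to that number.
inline Result enumerate_devices(std::uint32_t* count, DeviceId* devices) {
    static const DeviceId available[] = {10, 20, 30};  // invented device ids
    const std::uint32_t total = 3;
    if (devices == nullptr) {
        *count = total;
        return Result::Success;
    }
    const std::uint32_t written = std::min(*count, total);
    std::copy(available, available + written, devices);
    *count = written;
    // A too small array is an expected situation, reported via the result code.
    return written < total ? Result::Incomplete : Result::Success;
}

// Caller side: first call asks for the count, second call fetches the data.
inline Result list_devices(std::vector<DeviceId>& out) {
    std::uint32_t count = 0;
    Result result = enumerate_devices(&count, nullptr);
    if (result != Result::Success) {
        return result;
    }
    out.resize(count);
    result = enumerate_devices(&count, out.data());
    out.resize(count);  // the count may have shrunk between the two calls
    return result;
}
```

Note that the mock, just like a real driver might, does not check count for null. Passing a null pointer there is the kind of "unexpected input" that crashes instead of returning an error code.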

Vulkan architecture overview

At the top we have so-called Vulkan instances, which act like a context boundary.
Inside an instance we have physical devices.
Physical devices are used to discover hardware features and limits, to determine a suitable target device.
Out of a physical device we create a logical device handle, which will be the most used handle in the application.
(Logical) devices have so-called (command) queues to push work items to. There are queues for graphics and for presentation operations; it may also be a single queue doing both. Queues run asynchronously to the CPU, work inside a queue may overlap, and the queues themselves run in parallel to each other. So we may need synchronization:
- host sync(hronization) via fences => CPU thread blocking
- inter-queue sync via semaphores => sync queues
- intra-queue sync via events and pipeline barriers => disallow overlapping
There are a lot of other objects, like pipelines, buffers, memory, descriptors and so on, but they are just building blocks for recording command buffers, which are finally submitted once or multiple times.
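The "one queue or two" decision from the overview can be sketched as a small selection routine. In real Vulkan, queues come grouped in queue families that you query from the physical device. The sketch below replaces that query with a plain list, so QueueFamily, QueueChoice and pick_queue_families are hypothetical names and the capability flags are simplified to two booleans.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical, simplified description of what one queue family can do.
struct QueueFamily {
    bool graphics;  // can run graphics operations
    bool present;   // can present to our surface
};

// Chosen family indices, -1 means "nothing suitable found".
struct QueueChoice {
    int graphics = -1;
    int present = -1;

    bool complete() const { return graphics >= 0 && present >= 0; }
    bool shared() const { return complete() && graphics == present; }
};

// Prefers a single family doing both jobs, otherwise takes the first
// graphics-capable and the first present-capable family it sees.
inline QueueChoice pick_queue_families(const std::vector<QueueFamily>& families) {
    QueueChoice choice;
    for (std::size_t i = 0; i < families.size(); ++i) {
        const int index = static_cast<int>(i);
        if (families[i].graphics && families[i].present) {
            choice.graphics = index;
            choice.present = index;
            return choice;
        }
        if (families[i].graphics && choice.graphics < 0) {
            choice.graphics = index;
        }
        if (families[i].present && choice.present < 0) {
            choice.present = index;
        }
    }
    return choice;
}
```

If shared() is true, one queue handles drawing and presenting. If not, the two queues run in parallel and the work handed from one to the other has to be ordered explicitly, which is where the synchronization primitives listed above come in.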

Vulkan pseudo code

setup()                     // A
loop {
  i = wait_for_next_image() // B
  update(i)                 // C
  draw_and_present(i)       // D
}

wait_idle()                 // E
teardown()                  // F

A: set up device, queues, buffers, memory etc.
- e.g. if we calculate frames ahead, we may need multiple buffers/memory allocations so that rendering can overlap with writing buffers
- static immutable buffers can be shared
B: we block here until the swapchain has an image that has already been presented and is not queued anymore
C: we update whatever we need for index i
D: we draw and present the i-th image
E: before we destroy everything, we need to wait until nothing is in use anymore
F: destroy in LIFO fashion, i.e. in reverse order of creation
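The steps A to F can be played through in a tiny simulation that needs no GPU. It is only a sketch under loud assumptions: DeletionStack and run_frames are made-up names, the "swapchain" hands out images in simple round-robin order (a real one may not), and all Vulkan work is replaced by log entries. What it does show for real is the per-image slot idea from A and the LIFO teardown from F.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// F: remembers how to destroy things and runs that in reverse creation order.
class DeletionStack {
public:
    void push(std::function<void()> destroy) {
        destroyers_.push_back(std::move(destroy));
    }
    void flush() {
        while (!destroyers_.empty()) {
            std::function<void()> destroy = std::move(destroyers_.back());
            destroyers_.pop_back();
            destroy();
        }
    }
private:
    std::vector<std::function<void()>> destroyers_;
};

// Simulates the loop and returns a log of everything that happened.
inline std::vector<std::string> run_frames(std::size_t image_count, std::size_t frames) {
    assert(image_count > 0);
    std::vector<std::string> log;
    DeletionStack cleanup;

    // A: setup, every creation registers its matching destruction.
    log.push_back("create device");
    cleanup.push([&log] { log.push_back("destroy device"); });
    log.push_back("create swapchain");
    cleanup.push([&log] { log.push_back("destroy swapchain"); });
    std::vector<std::size_t> per_image_data(image_count, 0);  // one slot per image
    log.push_back("create per-image buffers");
    cleanup.push([&log] { log.push_back("destroy per-image buffers"); });

    std::size_t next_image = 0;
    for (std::size_t frame = 0; frame < frames; ++frame) {
        // B: a real application would block here until an image is free.
        const std::size_t i = next_image;
        next_image = (next_image + 1) % image_count;
        // C: only the slot that belongs to image i is touched.
        per_image_data[i] = frame;
        // D: stand-in for submitting the command buffer and presenting.
        log.push_back("draw and present image " + std::to_string(i));
    }

    log.push_back("wait idle");  // E
    cleanup.flush();             // F
    return log;
}
```

Calling run_frames(2, 3) logs the three creations, then draws to images 0, 1 and 0 again, waits, and finally destroys the per-image buffers, the swapchain and the device in exactly that order.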

4. Go through some tutorials

Vulkan Tutorial eBook + GitHub repo
- step by step source files and shaders of that tutorial
SaschaWillems Vulkan example triangle
Vulkan Guide for more

5. Do your thing

Well, this is where I stop for now. I think I'm somewhere between tutorials and guides and still cannot "do my thing" yet, but the more time I spend reading, understanding code and iterating, the less I have to copy code, because "it makes sense".

Conclusion

The Vulkan API is not a "natural" thing like a web shop API. A lot of internals are exposed and it feels like boilerplate all over, but it's actually about performance, a good memory design, smart and small updates, and a clear picture of how to stage work so that the hardware never sits idle. All in all, progress is sometimes not that easy, but I think adopting Vulkan is a good choice for future development. It was, after all, first known as glNext, the successor to OpenGL.

There are a lot of beginner tutorials and expert stuff like ray tracing, and I see a gap in the middle, so I'm planning to share some experience and some of my best practices as intermediate Vulkan content.

Vulkan - some introductory words