I recently had a bad day (or week) with some code that was particularly slow, but after spending all my time looking in the wrong place, a few hours of coding fixed the issue to absurd levels.
My Code
As a preface, I made a half-baked post about this recently, but this is going to be the nicer version with a new update.
My code is a raycaster, which is how you turn a 2D tile grid into a "3D" looking world. It's primitive, but that's because back when they were first implemented they had to be, they didn't have the computing power to churn out 3D frames at real time.
My code was casting rays until it hit a wall, for every pixel column, which is 1,280 times. Not bad so far.
I also need to draw the ceiling and the floor, which is every pixel on the screen, so 1,280 times 640, which is 819,200.
This is more of an issue as you could probably tell.
Blaming Something Else
Once I've calculated what the ceiling and floor should look like, I have myself a nice byte slice. These are just RGBA values, but my graphics library doesn't talk in that language, it needs triangles. So, it has to translate all those bytes into triangles, a taxing job.
Or So I Thought
This is when I decided to count how many times I was calling expensive functions, such as square root and trigonometric functions. If you want to see the results they are in this post.
In short, I was calling these functions millions of times per frame, which is obviously too many.
Cache
I figured I could pre-calculate most of these values before runtime, but how do you store them? Obviously you can't store all the possible inputs to cosine, no one has the memory for that, however we could convert the input to an integer so that we have 360 possible answers. This greatly saves on memory while also giving us that pre-calculation.
It Looks Bad
That's great and all for when you're really close to a wall, so it's using all 360 of those cached values, but what about when I'm looking at a wall far away? Well, one point of the wall would be using the same cosine value as another point a meter further along the wall, which is clearly wrong. I needed a way to increase the accuracy, while still maintaining low memory.
Times It By Ten
Let's say I hold my cosine values in this array.
cosMap := [360]float64{}
You can now imagine how I would easily be able to populate this array and retrieve items from it. So how do we increase its accuracy? Well, we times it by ten!
cosMap := [3600]float64{}
Now we have one decimal place of precision!
Here's how you would access this array.
func Cos(x float64) float64 {
return cosMap[int(x*10)]
}
Pretty simple right? And you could probably imagine how to populate the array.
It Looks Great!
Turns out 1 decimal place of precision is all you need to make your raycaster look fine. There was heavy artifacts when using integers, but now they're barely noticeable. Personally, at some point I would double the array size, but that's it, no more. Often we think we need all the precision that we can get, and yes, there are cases where we do, but in general, you can do with 1 or 2 decimal places.