So recently a thought struck me that we use random functions in our software all the time right? Either in games or in machine learning models, cryptographic software, etc.
But are these "randomly" generated numbers actually random? I mean, there must be some algorithm that calculates this so-called random number, right? And if it's calculated, it's not really "random" now, is it?
So today I'll try to explain how random a random number really is in programming languages.
So all the random functions in different programming languages, like Math.random() in JavaScript, are not "true" random number generators; they are called PRNGs (Pseudo Random Number Generators).
The word Pseudo tells you that these generators are not truly random. So now the question that arises is: okay, random number generators are not truly random, so to what extent are these numbers random?
This is the question that measures the quality of a PRNG.
There are many algorithms that are used to generate random numbers, but what all of them have in common is a Period and a Seed.
Seed
A seed is an initial value that is passed to the random function to start the sequence of random numbers.
To understand how a seed works in a PRNG, let's take a look at an algorithm called LCG (Linear Congruential Generator).
LCG
For those familiar with random number generator algorithms, do you have a favorite? Whether it's LCG, Xorshift, or another algorithm, do share your preferences and experiences with different generators.
- Back to the example of LCG
So the formula of an LCG is X = (a * Xcurrent + c) mod m
where,
X is the next number in the sequence of pseudo-random numbers
m, (m > 0) the modulus
a, (0 < a < m) the multiplier
c, (0 <= c < m) the increment
Xcurrent, (0 <= Xcurrent < m) the current seed
So what happens is that once the initial seed is selected, it is passed into this equation as Xcurrent. Xcurrent is the only thing that varies; everything else is a constant. Once the answer is generated, that answer is used as the seed the next time, and so on.
Here is an example of how it looks.
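The recurrence above can be sketched in a few lines of Python. This is only a minimal illustration: the function name lcg and the small constants are placeholders picked for readability, not values any real library uses.

```python
from itertools import islice
from typing import Iterator


def lcg(seed: int, a: int, c: int, m: int) -> Iterator[int]:
    """Yield successive values of X = (a * Xcurrent + c) mod m, starting from seed."""
    x = seed
    while True:
        x = (a * x + c) % m  # the previous output becomes the next input
        yield x


# Toy constants for illustration only; far too small for real use.
first_values = list(islice(lcg(seed=3, a=7, c=5, m=32), 10))
print(first_values)
```

Because the generator keeps only one number of state (x), restarting it with the same seed and constants always replays the same values.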
Let's consider an LCG with a specific set of constants:
a = 7,
c = 5,
m = 32,
and initial seed as 3
so
1. X = (7*3 + 5) mod 32 = 26 => X = 26
2. X = (7*26 + 5) mod 32 = 27 => X = 27
3. X = (7*27 + 5) mod 32 = 2 => X = 2
4. X = (7*2 + 5) mod 32 = 19 => X = 19
5. X = (7*19 + 5) mod 32 = 10 => X = 10
6. X = (7*10 + 5) mod 32 = 11 => X = 11
7. X = (7*11 + 5) mod 32 = 18 => X = 18
8. X = (7*18 + 5) mod 32 = 3 => X = 3
9. X = (7*3 + 5) mod 32 = 26 => X = 26
10. X = (7*26 + 5) mod 32 = 27 => X = 27
This is how, once a seed is decided, each generated value is used as the seed to create the next random number. Now I'm sure you must have noticed something strange in this sequence, right? At point 8 we land back on the initial seed, 3, so from point 9 the sequence starts to repeat. First 26, then 27, and so on.
Period
So there are multiple algorithms to generate a PRN, like LCG or xorshift128+ (used by the V8 engine), but what all of them have in common is that they can only produce so many numbers before the sequence starts to repeat. The length of that repeating cycle is what is known as a Period. Every PRNG repeats after a point, and how long the sequence runs before the numbers start to repeat is one of the things that tests the quality of the PRNG.
Why is there a Period?
But why does this happen? Why do numbers start to repeat?
It might sound like a silly question, but it was bugging me so I tried to understand this,
what I understood is that this happens because a computer can only represent a finite range of numbers. Sure, there are infinitely many numbers, but a signed 64-bit integer, for example, can only hold values from −9,223,372,036,854,775,808 to 9,223,372,036,854,775,807, so the generator needs to keep its number within that limit.
So algorithms try to keep this number in range. In an LCG, for example, after the multiplication and addition on the seed, a modulus is performed so that the number does not keep growing past the point that the computer can represent.
And if there are only finitely many numbers that the computer can generate through an algorithm, then after a while it's inevitable that a number that has already appeared will be generated again. Since each number depends only on the previous one, once that happens the whole sequence from that point repeats.
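To see that argument in action, here is a small sketch that runs an LCG until it produces a value it has already produced. The helper name find_cycle and the constants are made up for this illustration.

```python
from typing import Tuple


def find_cycle(seed: int, a: int, c: int, m: int) -> Tuple[int, int]:
    """Return (steps_before_cycle, cycle_length) for X = (a * X + c) mod m."""
    first_seen = {seed: 0}  # value -> step at which it first appeared
    x = seed
    step = 0
    while True:
        x = (a * x + c) % m
        step += 1
        if x in first_seen:
            return first_seen[x], step - first_seen[x]
        first_seen[x] = step


# Only m distinct values exist, so the loop always ends within m steps.
print(find_cycle(seed=1, a=5, c=3, m=32))  # (0, 32): visits all 32 values, then returns to the seed
print(find_cycle(seed=1, a=2, c=0, m=32))  # (5, 1): gets stuck on 0 and never returns to the seed
```

The second call shows that a badly chosen set of constants can collapse into a tiny cycle, which is why the choice of a, c and m matters so much for the period.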
So yeah, if you know the seed and the algorithm, you can predict the whole sequence.
Here is a replit link for you to try. I have set a seed and run a loop for 10 values; no matter how many times you run the loop, it will have the same result.
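A similar experiment can be sketched with Python's standard random module, whose default generator is the Mersenne Twister. The helper name ten_values and the seed 42 are arbitrary choices for this example.

```python
import random


def ten_values(seed: int) -> list:
    """Return ten pseudo-random integers in [0, 99] from a generator seeded with `seed`."""
    rng = random.Random(seed)  # private generator, independent of the global one
    return [rng.randint(0, 99) for _ in range(10)]


run_1 = ten_values(42)
run_2 = ten_values(42)
print(run_1)
print(run_1 == run_2)  # True: same seed, same sequence
```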
Now if you are a Minecraft player you must have heard "seed" before as well. When you enter a seed that is shared by someone else you also get spawned in the same area as them. Why? Because Minecraft uses a random generator to create an area, and if it's random then it also has some seed, and if you have that same seed then yes you will get the same output.
Have you ever played a game that allowed you to input a 'seed' for generating the game world? Share your experience and any interesting outcomes!
While talking about PRNGs, it is important to note that general-purpose PRNGs like Math.random() are not cryptographically secure, as also mentioned in the V8 docs.
They are not recommended for hashing, signature generation, and encryption/decryption.
For those purposes you can use window.crypto.getRandomValues, but at the cost of performance.
So why do we use PRNGs if they have so many flaws?
I mean, you know the seed and voilà, you know the sequence. If it's not cryptographically secure, then where is it used?
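window.crypto.getRandomValues is a browser API. As a rough counterpart in Python, the standard secrets module draws from the operating system's secure random source. The variable names below are just for illustration.

```python
import secrets

token = secrets.token_bytes(16)       # 16 bytes from the OS's secure source
dice_roll = secrets.randbelow(6) + 1  # integer in [1, 6]
print(token.hex(), dice_roll)
```

Unlike the seeded generators discussed earlier, there is no seed to set here, so the output cannot be replayed.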
Applications of PRNGs
Well, PRNGs are very efficient and fast, which makes them very useful in games. Say you want to spawn the player at a random place and spawn some buildings randomly: you don't need true randomness for that, you need speed, because games have to render everything fast.
If we go back to the example of Minecraft, Minecraft does not create the whole environment as soon as you enter the game, because if that had been the case there would have been a great toll on performance and a huge amount of storage space used by the game. Instead, it creates environments on the fly as players start to move in a direction.
Minecraft features randomly generated structures such as villages, temples, and dungeons. The placement of these structures adds surprises to the landscape, and their designs vary based on procedural algorithms.
Advantages of Procedural Generation:
Scalability: Minecraft's procedural generation allows the game to scale infinitely without the need for massive storage space for pre-designed maps.
Exploration: As players traverse the world, new chunks of terrain are generated dynamically, providing a sense of discovery and unpredictability.
Replayability: Different seed values and the randomness in terrain generation enhance the replayability of the game, as players can experience entirely new worlds with each playthrough.
Impact on Performance:
Generating terrain in real-time as players explore reduces the initial loading time and minimizes the demand on system resources compared to loading an entire pre-designed world.
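To make the idea of generating terrain on demand concrete, here is a toy sketch. It is not how Minecraft actually works; the function generate_chunk, the way the seed is combined with the chunk coordinates, and the height range are all assumptions made up for this example.

```python
import random


def generate_chunk(world_seed: int, chunk_x: int, chunk_z: int, size: int = 4) -> list:
    """Return a size x size grid of terrain heights for one chunk.

    Hypothetical scheme: the chunk's generator is seeded from the world seed
    plus the chunk coordinates, so a chunk can be rebuilt whenever it is needed.
    """
    rng = random.Random(f"{world_seed}:{chunk_x}:{chunk_z}")
    return [[rng.randint(60, 70) for _ in range(size)] for _ in range(size)]


# Nothing is stored: visiting the same chunk again regenerates identical terrain.
first_visit = generate_chunk(world_seed=12345, chunk_x=2, chunk_z=-1)
second_visit = generate_chunk(world_seed=12345, chunk_x=2, chunk_z=-1)
print(first_visit == second_visit)  # True
```

The point of the design is that only the world seed has to be saved or shared; every chunk can be derived from it when a player gets close.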
PRNGs are also used for debugging, where engineers can pre-define the seed and rerun their tests on the same fixed sequence every time.
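One common way to do this is to pass the generator into the code under test, so that a test can hand it a fixed seed. A minimal sketch, with a hypothetical function pick_spawn_point and an arbitrary seed:

```python
import random
from typing import Tuple


def pick_spawn_point(rng: random.Random, width: int, height: int) -> Tuple[int, int]:
    """Pick a random (x, y) cell on a width x height map using the given generator."""
    return rng.randrange(width), rng.randrange(height)


# In a test, pass a generator with a fixed seed so every run sees the same values.
expected = pick_spawn_point(random.Random(2024), 100, 100)
for _ in range(3):
    assert pick_spawn_point(random.Random(2024), 100, 100) == expected
print("spawn point is reproducible:", expected)
```

In production the same function can be given an unseeded random.Random(), so only the tests are pinned to a known sequence.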
Truly Random Number Generators are great but they come at the cost of performance so it's up to us to decide the tradeoff. Overall it was a great topic to read and learn about.
As a developer, how often do you use random number generators in your projects? Do you have any favorite algorithms or tips for ensuring randomness in your applications?
In conclusion, while pseudorandom number generators (PRNGs) play a crucial role in various applications, from gaming to simulations, understanding their limitations is equally important. The predictable nature of PRNGs, marked by periods and seed-dependent sequences, raises questions about their suitability for cryptographic purposes. As developers, we navigate the trade-off between efficiency and true randomness, opting for PRNGs in scenarios where rapid, reproducible results are paramount. Reflect on the challenges posed by the inherent predictability of PRNGs and consider: How do we balance the convenience of PRNGs with the need for cryptographic security in our applications?
If there is anything that I missed about PRNGs that could have made the blog better or you have any questions related to it please let me know in the comments.
Thanks for reading this far😁.