Premature optimization, where software thrives unless you kill it first - a tale of Java GC

wasteofserver - Jul 12 - Dev Community



This post was originally published on wasteofserver.com; you may find newer revisions and additional comments there.


Before diving head-on into Java and the ways to tackle interference, whether from the garbage collector or from context switching, let's first glance over the fundamentals of writing code for your future self.

Premature optimization is the root of all evil.

You've heard it before: premature optimization is the root of all evil. Well, sometimes. When writing software, I'm a firm believer in being:

1) as descriptive as possible; you should try to narrate intentions as if you were writing a story.

2) as optimal as possible; which means you should know the fundamentals of the language and apply them accordingly.

As descriptive as possible

Your code should speak intention, and a lot of it pertains to the way you name methods and variables.

int[] array1 = new int[10];        // bad
int[] numItems = new int[10];      // better
int[] backPackItems = new int[10]; // great

Just by the variable name, you can already infer functionality.

While numItems is abstract, backPackItems tells you a lot about expected behaviour.

Or say you have this method:

List<Countries> visitedCountries() {
    if (noCountryVisitedYet) {
        return new ArrayList<>(0);
    }
    // (...)
    return listOfVisitedCountries;
}

As far as code goes, this looks more or less ok.

Can we do better? We definitely can!

List<Countries> visitedCountries() {
    if (noCountryVisitedYet) {
        return Collections.emptyList();
    }
    // (...)
    return listOfVisitedCountries;
}

Reading Collections.emptyList() is much more descriptive than new ArrayList<>(0);

Imagine you're reading the above code for the first time and stumble on the guard clause that checks whether the user has actually visited any countries. Now imagine it's buried in a lengthy class: reading Collections.emptyList() is definitely more descriptive than new ArrayList<>(0). You're also returning an immutable list, making sure client code can't modify it.
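As a small illustration (a throwaway sketch, not from the original method), the list returned by Collections.emptyList() fails fast if a caller tries to mutate it:

import java.util.Collections;
import java.util.List;

public class EmptyListDemo {
    public static void main(String[] args) {
        List<String> visited = Collections.emptyList();

        System.out.println(visited.size()); // reading is fine: prints 0

        try {
            visited.add("Portugal"); // writing is not: the shared empty list is immutable
        } catch (UnsupportedOperationException e) {
            System.out.println("Client code can't modify it");
        }
    }
}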

As optimal as possible

Know your language and use it accordingly. If you need a double, there's no need to wrap it in a Double object. The same goes for using a List when all you actually need is an array.

Know that you should concatenate Strings in a loop using StringBuilder (or StringBuffer if you're sharing the builder between threads).

// don't do this
String votesByCounty = "";
for (County county : counties) {
    votesByCounty += county.toString();
}

// do this instead
StringBuilder votesByCounty = new StringBuilder();
for (County county : counties) {
    votesByCounty.append(county.toString());
}


Know how to index your database. Anticipate bottlenecks and cache accordingly. All the above are optimizations. They are the kind of optimizations that you should be aware of and implement as first-class citizens.
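To make "cache accordingly" a bit more concrete, here's a minimal sketch of memoizing an expensive lookup with ConcurrentHashMap.computeIfAbsent (the lookupCapital method and its cost are invented for illustration):

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class CapitalCache {
    private final Map<String, String> cache = new ConcurrentHashMap<>();

    // Hypothetical expensive call (database, remote service, ...)
    private String lookupCapital(String country) {
        return "capital of " + country;
    }

    public String capitalOf(String country) {
        // Computed once per key, served from memory afterwards
        return cache.computeIfAbsent(country, this::lookupCapital);
    }
}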

How do you kill it first?

I'll never forget a hack I read about a couple of years ago. Truth be told, the author backtracked quickly, but it goes to show how a lot of evil can spring from good intentions.

// do not do this, ever!
int i = 0;
while (i < 10000000) {
    // business logic

    if (i % 3000 == 0) { // prevent long gc
        try {
            Thread.sleep(0);
        } catch (InterruptedException e) { /* ignored */ }
    }
    i++;
}

A garbage collector hack from hell!

You can read more on why and how the above code works in the original article. While the exploit is definitely interesting, this is one of those things you should never, ever do.

  • Works by side effect: Thread.sleep(0) has no purpose of its own in this block
  • Works by exploiting a deficiency of code downstream
  • For anyone inheriting this code, it's obscure and magical

Only start forging something a bit more involved if, after writing with all the default optimizations the language provides, you've hit a bottleneck. Even then, steer away from concoctions like the above.

An interpretation of Java's future Garbage Collector "imagined" by Microsoft Copilot

How to tackle that Garbage Collector?

If, after all's done, the Garbage Collector is still the piece offering resistance, these are some of the things you may try:

  • If your service is so latency-sensitive that you can't allow for GC at all, run with "Epsilon GC" and avoid GC altogether: -XX:+UnlockExperimentalVMOptions -XX:+UseEpsilonGC. This will obviously grow your heap until you get an OutOfMemoryError, so either it's a short-lived scenario or your program is optimized not to create objects.

  • If your service is somewhat latency-sensitive, but the allowed tolerance permits some leeway, run G1 and feed it something like -XX:MaxGCPauseMillis=100 (the default is 200ms).

  • If the issue stems from external libraries, say one of them calls System.gc() or Runtime.getRuntime().gc(), which trigger stop-the-world full collections, you can neutralize the offending behaviour by running with -XX:+DisableExplicitGC.

  • If you're running on JDK 11 or newer, do try the Z Garbage Collector (ZGC); the performance improvements are monumental! -XX:+UnlockExperimentalVMOptions -XX:+UseZGC. You may also want to check this JDK 21 GC benchmark. A quick way to verify which collector your flags actually enabled is sketched right after this list.
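As a sanity check (a minimal sketch using the standard java.lang.management API; nothing here is specific to any one collector), you can ask the running JVM which collectors it actually registered:

import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

public class WhichGc {
    public static void main(String[] args) {
        // Prints names such as "G1 Young Generation" or "ZGC Cycles",
        // confirming whether your -XX flags took effect
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            System.out.println(gc.getName());
        }
    }
}

Run it with, say, java -XX:+UseZGC WhichGc.java; with no flags you'll see whatever the table below says is the default for your JDK.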

Version Start | Version End | Default GC
Java 1        | Java 4      | Serial Garbage Collector
Java 5        | Java 8      | Parallel Garbage Collector
Java 9        | ongoing     | G1 Garbage Collector

Note 1: since Java 15, ZGC is production ready, but you still have to explicitly activate it with -XX:+UseZGC.

Note 2: The VM considers a machine server-class if it detects at least two available processors and at least 1792 MB of physical memory. If not server-class, it will default to the Serial GC.

In essence, opt for GC tuning when it's clear that the application's performance constraints are directly tied to garbage collection behavior and you have the necessary expertise to make informed adjustments. Otherwise, trust the JVM's default settings and focus on optimizing application-level code.

u/shiphe - you'll want to read the full comment

Other relevant libraries you may want to explore:

Java Microbenchmark Harness (JMH)

If you're optimizing on gut feeling without any real benchmarking, you're doing yourself a disservice. JMH is the de facto Java library for testing your algorithms' performance. Use it.
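As a minimal sketch (assuming the jmh-core dependency and annotation processor are on the classpath, e.g. via the official JMH Maven archetype), a benchmark comparing the two String-concatenation approaches from earlier could look like this:

import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Param;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.MICROSECONDS)
public class ConcatBenchmark {

    @Param({"10", "1000"})
    int count;

    @Benchmark
    public String plusInLoop() {
        String result = "";
        for (int i = 0; i < count; i++) {
            result += i; // allocates a new String every iteration
        }
        return result;
    }

    @Benchmark
    public String stringBuilder() {
        StringBuilder result = new StringBuilder();
        for (int i = 0; i < count; i++) {
            result.append(i); // amortized appends into one buffer
        }
        return result.toString();
    }
}

Build the archetype project and run java -jar target/benchmarks.jar to get numbers instead of guesses.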

Java-Thread-Affinity

Pinning a process to a specific core may improve cache hits. It will depend on the underlying hardware and how your routine deals with data. Nonetheless, this library makes it so easy to implement that, if a CPU-intensive method is dragging you down, you'll want to test it.
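A rough sketch of what that looks like with OpenHFT's net.openhft.affinity API (the acquire/release pattern below follows the library's README; the hot loop is just a stand-in):

import net.openhft.affinity.AffinityLock;

public class PinnedWorker {
    public static void main(String[] args) {
        // Reserve a CPU for the current thread, release it when done
        AffinityLock lock = AffinityLock.acquireLock();
        try {
            // ... CPU- and cache-sensitive routine goes here ...
        } finally {
            lock.release();
        }
    }
}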

LMAX Disruptor

This is one of those libraries that, even if you don't need it, you'll want to study. The idea is to allow for ultra-low-latency concurrency. But the way it's implemented, from mechanical sympathy to the ring buffer, brings a lot of new concepts. I still remember when I first discovered it, seven years ago, pulling an all-nighter to digest it.
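For a taste, here's a minimal sketch following the Disruptor's getting-started pattern (assumes the com.lmax:disruptor dependency; LongEvent is a trivial carrier class defined here just for the example):

import com.lmax.disruptor.RingBuffer;
import com.lmax.disruptor.dsl.Disruptor;
import com.lmax.disruptor.util.DaemonThreadFactory;

public class DisruptorSketch {

    // Mutable event instances are pre-allocated and reused by the ring buffer
    static class LongEvent {
        long value;
    }

    public static void main(String[] args) {
        int bufferSize = 1024; // must be a power of two

        Disruptor<LongEvent> disruptor =
                new Disruptor<>(LongEvent::new, bufferSize, DaemonThreadFactory.INSTANCE);

        // Consumer runs on its own thread, fed from the ring buffer
        disruptor.handleEventsWith((event, sequence, endOfBatch) ->
                System.out.println("Got " + event.value));

        disruptor.start();

        // Producer: claim a slot, fill it, publish it
        RingBuffer<LongEvent> ringBuffer = disruptor.getRingBuffer();
        for (long i = 0; i < 10; i++) {
            long sequence = ringBuffer.next();
            try {
                ringBuffer.get(sequence).value = i;
            } finally {
                ringBuffer.publish(sequence);
            }
        }
    }
}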

Netflix jvmquake

The premise of jvmquake is that when things go sideways with the JVM, you want it to die rather than hang. A couple of years ago, I was running simulations on an HTCondor cluster under tight memory constraints, and sometimes jobs would get stuck due to "out of memory" errors. This library forces the JVM to die, allowing you to deal with the actual error. In this specific case, HTCondor would automatically re-schedule the job.

Final thoughts

The code that made me write this post? I've written way worse. I still do. The best we can hope for is to continuously mess up less.

I'm expecting to be disgruntled looking at my own code a few years down the road.

And that's a good sign.



Given the nature of this post, I found it appropriate to promote a product I've had for quite some time, a home shredder!

This is the 3rd different model/brand I've had and I can definitely attest to Amazon Basics' sturdiness.

While I do prefer signed and encrypted PDFs, there are a lot of financial institutions that still share data via paper. I shred those. Then I shred some other trivial stuff just to make it safe by obscurity.

In all honesty, I find it soothing.

Edits & Thank You:
