Deploying your Java code to production limits your troubleshooting options. Connecting to your app in production with a debugger is usually out of the question, and you might not even be able to get console access. So even with monitoring, you’re going to end up troubleshooting many problems post-mortem. This means looking at logs and, if you’re lucky, working with a Java stack trace.
That’s right, I said you’re lucky if you have a stack trace. It’s like getting a compass, a map, and a first-class airplane ticket handed to you all at once! Let’s talk about what a Java stack trace is and how you can use it.
What's a Java Stack Trace?
A stack trace, also called a stack backtrace or even just a backtrace, is a list of stack frames. These frames represent a moment during an application’s execution. A stack frame is information about a method or function that your code called. So the Java stack trace is a list of frames that starts at the current method and extends to when the program started.
Sometimes there’s confusion between a stack and the Stack. A stack is a data structure that works like a pile of papers on your desk: it’s last-in, first-out. You add documents to the pile and take them off in the reverse order you put them there. The Stack, more accurately called the runtime stack or call stack, is the set of stack frames a program creates as it executes, organized in a stack data structure.
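If the pile-of-papers picture feels abstract, here is a minimal sketch of the data structure itself, assuming Java's standard ArrayDeque as the stack. The class name PaperStack and the three document names are made up for illustration; they are not part of the sample program used in the rest of this post.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical demo class; the name and the document titles are illustrative only.
class PaperStack {

    // Pushes three documents, then pops them all and returns the removal order.
    static List<String> popOrder() {
        Deque<String> pile = new ArrayDeque<>();
        pile.push("first");
        pile.push("second");
        pile.push("third");

        List<String> removed = new ArrayList<>();
        while (!pile.isEmpty()) {
            // pop() always removes the most recently pushed item
            removed.add(pile.pop());
        }
        return removed;
    }

    public static void main(String[] args) {
        System.out.println(popOrder()); // prints [third, second, first]
    }
}
```

The call stack behaves the same way: the method that was called last is the first one to return.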
Let’s look at an example.
Java Stack Trace Example
Let’s take a look at a Java program. This class calls four methods and prints a stack trace to the console from the last one.
    public class StackTrace {

        public static void main(String[] args) {
            a();
        }

        static void a() {
            b();
        }

        static void b() {
            c();
        }

        static void c() {
            d();
        }

        static void d() {
            Thread.dumpStack();
        }
    }
When you run the class, you’ll see something like this:
    java.lang.Exception: Stack trace
        at java.base/java.lang.Thread.dumpStack(Thread.java:1383)
        at com.ericgoebelbecker.stacktraces.StackTrace.d(StackTrace.java:23)
        at com.ericgoebelbecker.stacktraces.StackTrace.c(StackTrace.java:19)
        at com.ericgoebelbecker.stacktraces.StackTrace.b(StackTrace.java:15)
        at com.ericgoebelbecker.stacktraces.StackTrace.a(StackTrace.java:11)
        at com.ericgoebelbecker.stacktraces.StackTrace.main(StackTrace.java:7)
The d() method is at the top of the stack, just below the call to dumpStack() itself, because that’s where the app generated the trace. The main() method is at the bottom because that’s where the program started. When the program started, the Java runtime executed the main() method. main() called a(). a() called b(), and b() called c(), which called d(). Finally, d() called dumpStack(), which generated the output. This Java stack trace gives us a picture of what the program did: read it from the bottom up to follow the calls in the order they happened.
A Java stack trace is a snapshot of a moment in time. You can see where your application was and how it got there. That’s valuable insight that you can use a few different ways.
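You can also read that snapshot from code instead of from console output. The sketch below is one way to do it, assuming the standard Throwable.getStackTrace() method; the FrameNames class and its a(), b(), and c() methods are hypothetical stand-ins for the sample program.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo class that mirrors the a() -> b() -> c() call chain.
class FrameNames {

    static List<String> a() {
        return b();
    }

    static List<String> b() {
        return c();
    }

    static List<String> c() {
        // A Throwable records the call stack at the point where it is created.
        StackTraceElement[] frames = new Throwable().getStackTrace();

        List<String> names = new ArrayList<>();
        for (StackTraceElement frame : frames) {
            names.add(frame.getMethodName());
        }
        return names;
    }

    public static void main(String[] args) {
        // The list starts at the top of the stack: c, then b, then a.
        System.out.println(a());
    }
}
```

Each StackTraceElement also exposes the class name, file name, and line number, which is the same information you see on every "at" line of a printed trace.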
How to Use Java Stack Traces
Now that you’ve seen what Java stack traces show you, how can you use them?
Java Exceptions
Stack traces and exceptions are often associated with each other. When you see a Java application throw an exception, you usually see a stack trace logged with it. This is because of how exceptions work.
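When Java code throws an exception, the runtime looks up the stack for a method that has a handler that can process it. If it finds one, it passes the exception to it. If it doesn’t, the thread that threw the exception terminates and, by default, the runtime prints the exception with its stack trace. In a single-threaded program like our sample, that means the program exits. So exceptions and the call stack are linked directly. Understanding this relationship will help you figure out why your code threw an exception.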
Let’s change our sample code.
First, modify the d() method:
    static void d() {
        throw new NullPointerException("Oops!");
    }
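Then, change main() and a() so main() can catch an exception. You'll need to declare a checked exception on a() so the code will compile, because Java rejects a catch block for a checked exception that nothing in the try block can throw. InvalidClassException lives in java.io, so import it at the top of the file.
    public static void main(String[] args) {
        try {
            a();
        } catch (InvalidClassException ice) {
            System.err.println(ice.getMessage());
        }
    }

    static void a() throws InvalidClassException {
        b();
    }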
You’re deliberately catching the “wrong” exception. Run this code and watch what happens.
    Exception in thread "main" java.lang.NullPointerException: Oops!
        at com.ericgoebelbecker.stacktraces.StackTrace.d(StackTrace.java:29)
        at com.ericgoebelbecker.stacktraces.StackTrace.c(StackTrace.java:24)
        at com.ericgoebelbecker.stacktraces.StackTrace.b(StackTrace.java:20)
        at com.ericgoebelbecker.stacktraces.StackTrace.a(StackTrace.java:16)
        at com.ericgoebelbecker.stacktraces.StackTrace.main(StackTrace.java:9)
The exception bubbled up the stack past main() because you were trying to catch a different exception. With no handler left to try, the runtime printed the exception and terminated the application. You can still see a stack trace though, so it’s easy to determine what happened.
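What happens to an exception that nobody catches is something you can hook into. The sketch below assumes the standard Thread.setUncaughtExceptionHandler() method and uses a hypothetical worker thread in place of the sample's main thread, so you can watch the runtime hand over an exception that escaped every frame on the stack.

```java
import java.util.concurrent.atomic.AtomicReference;

// Hypothetical demo class; the thread name "worker" is illustrative only.
class UncaughtDemo {

    // Runs a task that throws on a separate thread and returns what the handler received.
    static Throwable runAndCapture() {
        AtomicReference<Throwable> seen = new AtomicReference<>();

        Thread worker = new Thread(() -> {
            throw new NullPointerException("Oops!");
        }, "worker");

        // Called by the runtime when no catch block on the thread's stack matches.
        worker.setUncaughtExceptionHandler((thread, error) -> seen.set(error));

        worker.start();
        try {
            worker.join(); // the handler has finished by the time join() returns
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return seen.get();
    }

    public static void main(String[] args) {
        Throwable error = runAndCapture();
        System.out.println(error.getMessage()); // prints Oops!
    }
}
```

When you install your own handler like this, the runtime no longer prints the trace for you on that thread, so a real handler should log the Throwable it receives rather than only storing it.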
Now, change main() to catch a NullPointerException instead. You can remove the checked exception from a() too.
    public static void main(String[] args) {
        try {
            a();
        } catch (NullPointerException npe) {
            System.err.println(npe.getMessage());
        }
    }

    static void a() {
        b();
    }
Rerun the program.
Oops!
We lost the stack trace! By only printing the message attached to the exception, you missed some vital context. Unless you can remember why you wrote Oops! in that message, tracking down this problem is going to be complicated. Let’s try again.
    public static void main(String[] args) {
        try {
            a();
        } catch (NullPointerException npe) {
            npe.printStackTrace();
        }
    }
Rerun the application.
    java.lang.NullPointerException: Oops!
        at com.ericgoebelbecker.stacktraces.StackTrace.d(StackTrace.java:28)
        at com.ericgoebelbecker.stacktraces.StackTrace.c(StackTrace.java:24)
        at com.ericgoebelbecker.stacktraces.StackTrace.b(StackTrace.java:20)
        at com.ericgoebelbecker.stacktraces.StackTrace.a(StackTrace.java:16)
        at com.ericgoebelbecker.stacktraces.StackTrace.main(StackTrace.java:9)
That’s better! We see the stack trace, and its top frame is d(), where the exception occurred, even though main() printed it. The trace is captured when the exception is created, not when it is printed.
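Sometimes you want the trace as text rather than on the console, for example to attach it to an error report. Here is a minimal sketch, assuming the standard printStackTrace(PrintWriter) overload; the TraceText class and its asString() helper are hypothetical names, not part of the sample program.

```java
import java.io.PrintWriter;
import java.io.StringWriter;

// Hypothetical helper class for turning a stack trace into a String.
class TraceText {

    // Renders a throwable the same way printStackTrace() would, but into a String.
    static String asString(Throwable error) {
        StringWriter buffer = new StringWriter();
        try (PrintWriter out = new PrintWriter(buffer)) {
            error.printStackTrace(out);
        }
        return buffer.toString();
    }

    public static void main(String[] args) {
        try {
            throw new NullPointerException("Oops!");
        } catch (NullPointerException npe) {
            // First line is the exception class and message; "at ..." frames follow.
            System.out.print(asString(npe));
        }
    }
}
```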
Logging Java Stack Traces
What if you don’t want to print an error message to the console but to a log file instead? The good news is that most loggers, including Log4j and Logback, will write exceptions with stack traces if you call them with the right arguments.
Pass in the exception object as the last argument to the message, without a formatting directive. So if you used Log4j or Logback with the sample code like this:
    logger.error("Something bad happened:", npe);
You would see this in your log file:
    Something bad happened:
    java.lang.NullPointerException: Oops!
        at com.ericgoebelbecker.stacktraces.StackTrace.d(StackTrace.java:28)
        at com.ericgoebelbecker.stacktraces.StackTrace.c(StackTrace.java:24)
        at com.ericgoebelbecker.stacktraces.StackTrace.b(StackTrace.java:20)
        at com.ericgoebelbecker.stacktraces.StackTrace.a(StackTrace.java:16)
        at com.ericgoebelbecker.stacktraces.StackTrace.main(StackTrace.java:9)
One of the best things you can do with exceptions and stack traces is to log them so you can use them to isolate a problem. If you get in the habit of writing useful log messages that include details like stack traces, then log indexing and search tools, like Scalyr, become some of the most powerful tools in your troubleshooting tool bag.
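Log4j and Logback are separate libraries, so to keep things self-contained, the sketch below swaps in java.util.logging from the JDK. The idea is the same: pass the exception as its own argument so the logger keeps the whole Throwable, stack trace included. The JulTraceDemo class, the CapturingHandler, and the logger name are hypothetical, and the handler only exists so the example can inspect what was logged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical demo using java.util.logging in place of Log4j or Logback.
class JulTraceDemo {

    // Collects log records in memory so the example can look at them afterwards.
    static class CapturingHandler extends Handler {
        final List<LogRecord> records = new ArrayList<>();

        @Override
        public void publish(LogRecord entry) {
            records.add(entry);
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    static LogRecord logFailure() {
        Logger logger = Logger.getLogger("stacktraces.demo"); // illustrative logger name
        logger.setUseParentHandlers(false); // keep the demo's record off the console
        CapturingHandler handler = new CapturingHandler();
        logger.addHandler(handler);
        try {
            throw new NullPointerException("Oops!");
        } catch (NullPointerException npe) {
            // The Throwable is a separate argument, not formatted into the message.
            logger.log(Level.SEVERE, "Something bad happened:", npe);
        } finally {
            logger.removeHandler(handler);
        }
        return handler.records.get(0);
    }

    public static void main(String[] args) {
        LogRecord entry = logFailure();
        System.out.println(entry.getMessage());
        entry.getThrown().printStackTrace();
    }
}
```

If you had concatenated npe.getMessage() into the message string instead, the record would carry no Throwable and the stack trace would be gone, just like in the Oops! example earlier.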
The Java Debugger
Debuggers work by taking control of a program's runtime and letting you both observe and control it. To do this, a debugger shows you the program stack and enables you to traverse it in either direction. When you’re in a debugger, you get a more complete picture of a stack frame than you do when looking at stack traces in a log message.
Let’s make a small code change and then throw the sample code into a debugger.
First, add a local variable to the d() method:
    static void d() {
        String message = "Oops.";
        throw new NullPointerException(message);
    }
Then add a breakpoint where d() throws the exception in your debugger. I’m using IntelliJ's debugger for this image.
Here you can see that the string we added to d() is part of the stack frame because it’s a local variable. Debuggers operate inside the Stack and give you a detailed picture of each frame.
Forcing a Thread Dump
Thread dumps are great post-mortem tools, but they can be useful for runtime issues too. If your application stops responding or is consuming more CPU or memory than you expect, you can retrieve information about the running app with jstack.
Modify main() so the application will run until killed:
    public static void main(String[] args) throws Exception {
        try {
            while (true) {
                Thread.sleep(1000);
            }
        } catch (NullPointerException npe) {
            npe.printStackTrace();
        }
    }
Run the app, determine its pid (the JDK's jps tool lists running Java processes), and then run jstack. On Windows, you can also press Ctrl+Break in the console window you're running your code in, which prints the thread dump to that console.
$ jstack <pid>
jstack will generate a lot of output.
    2019-05-13 10:06:17
    Full thread dump OpenJDK 64-Bit Server VM (12+33 mixed mode, sharing):

    Threads class SMR info:
    _java_thread_list=0x00007f8bb2727190, length=10, elements={
    0x00007f8bb3807000, 0x00007f8bb2875000, 0x00007f8bb2878000, 0x00007f8bb4000800,
    0x00007f8bb300a800, 0x00007f8bb287b800, 0x00007f8bb287f000, 0x00007f8bb28ff800,
    0x00007f8bb300b800, 0x00007f8bb3805000
    }

    "main" #1 prio=5 os_prio=31 cpu=60.42ms elapsed=103.32s tid=0x00007f8bb3807000 nid=0x2503 waiting on condition [0x0000700001a0e000]
       java.lang.Thread.State: TIMED_WAITING (sleeping)
        at java.lang.Thread.sleep(java.base@12/Native Method)
        at com.ericgoebelbecker.stacktraces.StackTrace.main(StackTrace.java:9)

    "Reference Handler" #2 daemon prio=10 os_prio=31 cpu=0.08ms elapsed=103.29s tid=0x00007f8bb2875000 nid=0x4603 waiting on condition [0x0000700002123000]
       java.lang.Thread.State: RUNNABLE
        at java.lang.ref.Reference.waitForReferencePendingList(java.base@12/Native Method)
        at java.lang.ref.Reference.processPendingReferences(java.base@12/Reference.java:241)
        at java.lang.ref.Reference$ReferenceHandler.run(java.base@12/Reference.java:213)

    "Finalizer" #3 daemon prio=8 os_prio=31 cpu=0.13ms elapsed=103.29s tid=0x00007f8bb2878000 nid=0x3903 in Object.wait() [0x0000700002226000]
       java.lang.Thread.State: WAITING (on object monitor)
        at java.lang.Object.wait(java.base@12/Native Method)
        - waiting on <0x000000070ff02770> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@12/ReferenceQueue.java:155)
        - locked <0x000000070ff02770> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@12/ReferenceQueue.java:176)
        at java.lang.ref.Finalizer$FinalizerThread.run(java.base@12/Finalizer.java:170)

    "Signal Dispatcher" #4 daemon prio=9 os_prio=31 cpu=0.27ms elapsed=103.28s tid=0x00007f8bb4000800 nid=0x3e03 runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "C2 CompilerThread0" #5 daemon prio=9 os_prio=31 cpu=6.12ms elapsed=103.28s tid=0x00007f8bb300a800 nid=0x5603 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
       No compile task

    "C1 CompilerThread0" #7 daemon prio=9 os_prio=31 cpu=12.01ms elapsed=103.28s tid=0x00007f8bb287b800 nid=0xa803 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE
       No compile task

    "Sweeper thread" #8 daemon prio=9 os_prio=31 cpu=0.73ms elapsed=103.28s tid=0x00007f8bb287f000 nid=0xa603 runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "Service Thread" #9 daemon prio=9 os_prio=31 cpu=0.04ms elapsed=103.27s tid=0x00007f8bb28ff800 nid=0xa503 runnable [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "Common-Cleaner" #10 daemon prio=8 os_prio=31 cpu=0.27ms elapsed=103.27s tid=0x00007f8bb300b800 nid=0xa303 in Object.wait() [0x000070000293b000]
       java.lang.Thread.State: TIMED_WAITING (on object monitor)
        at java.lang.Object.wait(java.base@12/Native Method)
        - waiting on <0x000000070ff91690> (a java.lang.ref.ReferenceQueue$Lock)
        at java.lang.ref.ReferenceQueue.remove(java.base@12/ReferenceQueue.java:155)
        - locked <0x000000070ff91690> (a java.lang.ref.ReferenceQueue$Lock)
        at jdk.internal.ref.CleanerImpl.run(java.base@12/CleanerImpl.java:148)
        at java.lang.Thread.run(java.base@12/Thread.java:835)
        at jdk.internal.misc.InnocuousThread.run(java.base@12/InnocuousThread.java:134)

    "Attach Listener" #11 daemon prio=9 os_prio=31 cpu=0.72ms elapsed=0.10s tid=0x00007f8bb3805000 nid=0x5e03 waiting on condition [0x0000000000000000]
       java.lang.Thread.State: RUNNABLE

    "VM Thread" os_prio=31 cpu=3.83ms elapsed=103.29s tid=0x00007f8bb2874800 nid=0x3703 runnable

    "GC Thread#0" os_prio=31 cpu=0.13ms elapsed=103.31s tid=0x00007f8bb282b800 nid=0x3003 runnable

    "G1 Main Marker" os_prio=31 cpu=0.26ms elapsed=103.31s tid=0x00007f8bb2845000 nid=0x3103 runnable

    "G1 Conc#0" os_prio=31 cpu=0.04ms elapsed=103.31s tid=0x00007f8bb3810000 nid=0x3303 runnable

    "G1 Refine#0" os_prio=31 cpu=0.39ms elapsed=103.31s tid=0x00007f8bb2871000 nid=0x3403 runnable

    "G1 Young RemSet Sampling" os_prio=31 cpu=13.60ms elapsed=103.31s tid=0x00007f8bb2872000 nid=0x4d03 runnable

    "VM Periodic Task Thread" os_prio=31 cpu=66.44ms elapsed=103.27s tid=0x00007f8bb2900800 nid=0xa403 waiting on condition

    JNI global refs: 5, weak refs: 0
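My application was running 10 Java threads (the thread list at the top of the dump says length=10), plus a handful of internal VM and garbage collector threads, and jstack reported on all of them. The first thread, helpfully named main, is the one we're concerned with. You can see it sleeping in Thread.sleep(), in the TIMED_WAITING state.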
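If you can't run jstack against a process, you can collect similar information from inside the application. This is a minimal sketch, assuming the standard Thread.getAllStackTraces() method; the InProcessThreadDump class and its output layout are made up for illustration and carry far less detail than a real jstack dump.

```java
import java.util.Map;

// Hypothetical helper that builds a simplified thread dump from inside the JVM.
class InProcessThreadDump {

    static String dump() {
        StringBuilder out = new StringBuilder();
        // One map entry per live thread, each with that thread's current stack frames.
        for (Map.Entry<Thread, StackTraceElement[]> entry
                : Thread.getAllStackTraces().entrySet()) {
            Thread thread = entry.getKey();
            out.append('"').append(thread.getName()).append("\" state=")
               .append(thread.getState()).append('\n');
            for (StackTraceElement frame : entry.getValue()) {
                out.append("    at ").append(frame).append('\n');
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        System.out.print(dump());
    }
}
```

You could call a helper like this from an admin endpoint or a shutdown hook and write the result to your log, which keeps the dump next to the rest of your troubleshooting data.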
Java Stack Traces: Your Roadmap
A stack trace is more than just a picture inside your application. It's a snapshot of a moment in time that includes every step your code took to get there. There's no reason to dread seeing one in your logs because they're a gift from Java that tells you exactly what happened. Make sure you're logging them when an error crops up and send them to a tool like Scalyr so they're easy to find.
Now that you understand what a Java stack trace is and how to use it, take a look at your code. Are you throwing away critical information about errors and exceptions in your code? Is there a spot where a call to Thread.dumpStack() might help you isolate a recurring bug? Perhaps it's time to run your app through the debugger a few times with some strategically chosen breakpoints.