Introduction
In the previous parts of this series we took many measurements with AWS Lambda using the Java 17 runtime, with and without AWS SnapStart, and additionally with SnapStart and priming of the DynamoDB invocation:
- cold starts using different deployment artifact sizes
- cold starts and deployment time using different Lambda memory settings
- warm starts using different Lambda memory settings
We took all those measurements using the following JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=1", defined in the AWS SAM template.yaml. This means that client compilation (C1) without profiling is applied. It was considered the best choice based on the article Optimizing AWS Lambda function performance for Java by Mark Sailes. However, the measurements in that article were done for Java 11 and before Lambda SnapStart was released. So now it's time to revisit this topic and measure cold and warm start times with different Java compilation options without SnapStart enabled, with SnapStart enabled, and additionally with priming. In this article we'll do it for the Java 17 runtime and compare the results with the same measurements for Java 21 already performed in the article Measuring cold and warm starts with Java 21 using different compilation options.
Meaning of Java compilation options
This picture shows the Java compilation options available.
If you don't specify any options, tiered compilation is applied by default. You can read more about it in the article Tiered Compilation in JVM, or more generally about client (C1) and server (C2) compilation in the article Client, Server, and Tiered Compilation. There are also many other settings that you can apply to each of the compilation options. You can read more about them in the article JVM c1, c2 compiler thread – high CPU consumption?
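To double-check which of these settings is actually in effect inside a running JVM (for example in a Lambda function where JAVA_TOOL_OPTIONS is set), the flags can be read at runtime. The following is a minimal sketch and not part of the sample application: the class name CompilationFlags and the helper flagValue are made up for illustration, and it assumes a HotSpot-based JVM that exposes the HotSpotDiagnosticMXBean.

```java
import com.sun.management.HotSpotDiagnosticMXBean;
import java.lang.management.ManagementFactory;

public class CompilationFlags {

    // Returns the current value of a HotSpot flag as a string,
    // or "unavailable" if this JVM does not know the flag.
    // (Hypothetical helper, introduced only for this illustration.)
    static String flagValue(String name) {
        try {
            HotSpotDiagnosticMXBean bean =
                    ManagementFactory.getPlatformMXBean(HotSpotDiagnosticMXBean.class);
            if (bean == null) {
                return "unavailable";
            }
            return bean.getVMOption(name).getValue();
        } catch (IllegalArgumentException e) {
            // Thrown when the flag (or the MXBean) does not exist on this JVM.
            return "unavailable";
        }
    }

    public static void main(String[] args) {
        System.out.println("TieredCompilation = " + flagValue("TieredCompilation"));
        System.out.println("TieredStopAtLevel = " + flagValue("TieredStopAtLevel"));
    }
}
```

If JAVA_TOOL_OPTIONS contains -XX:TieredStopAtLevel=1, the second line should reflect that value. Logging these two flags once during function initialization is a cheap way to confirm that the template setting really reached the JVM.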
Measuring cold starts and deployment time with Java 17 using different compilation options
In our experiment we'll re-use the application introduced in part 8. Here is the code for the sample application. There are basically 2 Lambda functions, both of which respond to API Gateway requests and retrieve a product from DynamoDB by the id received from API Gateway. One Lambda function, GetProductByIdWithPureJava17Lambda, can be used with and without SnapStart, and the second one, GetProductByIdWithPureJava17LambdaAndPriming, uses SnapStart and DynamoDB request invocation priming.
The results of the experiment below are based on reproducing more than 100 cold and approximately 100,000 warm starts. For this (and for the experiments from my previous article) I used the load test tool hey, but you can use whatever tool you want, like Serverless-artillery or Postman. I ran all these experiments with 5 different compilation options defined in the template.yaml. This happens in the Globals section, where a variable named "JAVA_TOOL_OPTIONS" is defined in the Environment section of the Lambda function:
    Globals:
      Function:
        CodeUri: target/aws-pure-lambda-snap-start-17-1.0.0-SNAPSHOT.jar
        Runtime: java17
        ....
        Environment:
          Variables:
            JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=1"
The 5 compilation options measured are:
1. no options (tiered compilation will take place)
2. JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=1" (client/C1 compilation without profiling)
3. JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=2" (client/C1 compilation with basic profiling)
4. JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=3" (client/C1 compilation with full profiling)
5. JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=4" (server/C2 compilation)
For their meaning, see the explanations above. In the tables below, the column "Compilation Option" refers to these compilation options by their number; for example, number 5 stands for JAVA_TOOL_OPTIONS: "-XX:+TieredCompilation -XX:TieredStopAtLevel=4". The abbreviation c stands for cold start and w for warm start.
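The columns p50 to p99.9 in the tables below are percentiles of the measured durations. As a hedged illustration of what such a value means, here is a minimal sketch that derives percentiles from raw durations with the nearest-rank method. Both the choice of method and the sample durations are assumptions made for this example; the numbers in the tables were not necessarily computed this way.

```java
import java.util.Arrays;

public class Percentiles {

    // Nearest-rank percentile: the smallest sample such that at least
    // p percent of all samples are less than or equal to it.
    static double percentile(double[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        if (p <= 0 || p > 100) {
            throw new IllegalArgumentException("p must be in (0, 100]");
        }
        double[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p * sorted.length / 100.0);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        // Made-up warm start durations in ms, for illustration only.
        double[] durationsMs = {5.7, 6.1, 5.9, 20.4, 6.5, 7.8, 6.0, 5.8, 49.6, 6.3};
        for (double p : new double[] {50, 75, 90, 99}) {
            System.out.println("p" + p + " = " + percentile(durationsMs, p) + " ms");
        }
    }
}
```

With only a handful of samples the high percentiles collapse onto the maximum, which is one reason why the experiment relies on a large number of warm starts.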
Cold (c) and warm (w) start times without SnapStart in ms:
Compilation Option | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 2831.33 | 2924.85 | 2950.12 | 3120.34 | 3257.03 | 3386.67 | 5.73 | 6.50 | 7.88 | 20.49 | 49.62 | 1355.08 |
2 | 2880.53 | 2918.79 | 2974.45 | 3337.29 | 3515.86 | 3651.65 | 6.11 | 7.05 | 8.94 | 23.54 | 62.99 | 1272.96 |
3 | 2906.39 | 2950.59 | 3016.8 | 3283.31 | 3409.65 | 3593.65 | 5.73 | 6.61 | 7.87 | 21.07 | 53.74 | 1548.95 |
4 | 3247.9 | 3348.82 | 3481.41 | 3673.51 | 3798.97 | 3904.13 | 6.72 | 7.75 | 9.38 | 24.69 | 72.67 | 1494.98 |
5 | 4146.66 | 4231.9 | 4377.42 | 4557.21 | 4699.03 | 4780.63 | 6.11 | 7.27 | 10.15 | 29.87 | 103.03 | 2062.84 |
Cold (c) and warm (w) start times with SnapStart without Priming in ms:
Compilation Option | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 1506.20 | 1577.06 | 1845.01 | 2010.62 | 2280.46 | 2281 | 5.82 | 6.72 | 8.39 | 22.81 | 798.46 | 1377.54 |
2 | 1521.33 | 1578.64 | 1918.35 | 2113.65 | 2115.77 | 2117.42 | 6.01 | 7.05 | 8.94 | 23.92 | 101.41 | 1077.45 |
3 | 1463.16 | 1532.00 | 1886.03 | 1990.62 | 2020.69 | 2021.39 | 5.92 | 6.72 | 8.00 | 22.09 | 95.17 | 1179.13 |
4 | 1657.88 | 1755.07 | 2057.37 | 2158.49 | 2169.30 | 2170.65 | 6.41 | 7.27 | 8.80 | 24.30 | 96.69 | 1374.43 |
5 | 2269.10 | 2340.50 | 2581.36 | 2762.91 | 2807.45 | 2808.89 | 6.41 | 7.75 | 11.34 | 32.86 | 1506.60 | 1941.26 |
Cold (c) and warm (w) start times with SnapStart and with DynamoDB invocation Priming in ms:
Compilation Option | c p50 | c p75 | c p90 | c p99 | c p99.9 | c max | w p50 | w p75 | w p90 | w p99 | w p99.9 | w max |
---|---|---|---|---|---|---|---|---|---|---|---|---|
1 | 708.90 | 790.50 | 960.61 | 1041.61 | 1148.80 | 1149.91 | 5.64 | 6.61 | 8.38 | 21.07 | 141.53 | 373.37 |
2 | 692.79 | 758.00 | 1003.80 | 1204.06 | 1216.15 | 1216.88 | 6.21 | 7.27 | 9.38 | 25.09 | 103.03 | 256.65 |
3 | 670.98 | 720.33 | 1007.82 | 1072.25 | 1200.45 | 1200.64 | 5.38 | 6.11 | 7.27 | 19.15 | 99.81 | 303.52 |
4 | 732.99 | 828.88 | 1030.07 | 1271.24 | 1350.41 | 1390.03 | 6.30 | 7.05 | 8.52 | 23.17 | 103.03 | 469.45 |
5 | 937.84 | 1056.29 | 1227.14 | 1422.78 | 1445.72 | 1447.09 | 6.30 | 7.75 | 11.16 | 32.86 | 122.69 | 381.03 |
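
To make the differences between compilation options easier to compare across the three tables, the relative slowdown against the default option can be computed. The sketch below does this for the p50 cold start of option 5 versus option 1, using the values from the tables above; the class name ColdStartComparison and the helper percentSlower are made up for illustration.

```java
import java.util.Locale;

public class ColdStartComparison {

    // Relative slowdown of the candidate against the baseline, in percent.
    static double percentSlower(double baselineMs, double candidateMs) {
        return (candidateMs - baselineMs) / baselineMs * 100.0;
    }

    public static void main(String[] args) {
        // p50 cold start times in ms, taken from the three tables above:
        // {option 1 (default tiered compilation), option 5 (TieredStopAtLevel=4)}
        double[] withoutSnapStart = {2831.33, 4146.66};
        double[] snapStartNoPriming = {1506.20, 2269.10};
        double[] snapStartWithPriming = {708.90, 937.84};

        String[] labels = {"without SnapStart", "SnapStart without priming", "SnapStart with priming"};
        double[][] rows = {withoutSnapStart, snapStartNoPriming, snapStartWithPriming};

        for (int i = 0; i < rows.length; i++) {
            double slower = percentSlower(rows[i][0], rows[i][1]);
            System.out.println(String.format(Locale.ROOT,
                    "%s: option 5 is %.1f%% slower than option 1 (c p50)", labels[i], slower));
        }
    }
}
```

For these inputs, option 5 comes out roughly 46%, 51% and 32% slower than option 1 at the p50 cold start. The same helper can be applied to any other pair of cells in the tables.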
Conclusions
In all measurements for Java 17 we found that the compilation options -XX:+TieredCompilation -XX:TieredStopAtLevel=3 and 4 produced much worse cold and warm starts than tiered compilation or -XX:TieredStopAtLevel=1 and 2 (client compilation without or with basic profiling). With Java 21 we observed much worse cold and warm starts already starting with -XX:TieredStopAtLevel=2, which is not the case for Java 17.
For the Lambda function with Java 17 without SnapStart enabled, tiered compilation (the default one) is the better option for having lower cold and warm starts for nearly all percentiles for our use case.
For the Lambda function with Java 21 without SnapStart enabled it's different (see my article Measuring cold and warm starts with Java 21 using different compilation options): client compilation without profiling (-XX:+TieredCompilation -XX:TieredStopAtLevel=1) is the better option for having lower cold and warm starts for nearly all percentiles for our use case.
For the Lambda function with Java 17 with SnapStart enabled (with or without priming of the DynamoDB invocation), tiered compilation and -XX:TieredStopAtLevel=1 and 2 produced very close cold and warm start times, which vary a bit depending on the compilation option and percentile.
For the Lambda function with Java 21 with SnapStart enabled (with or without priming of the DynamoDB invocation) it was different: tiered compilation (the default one) outperformed client compilation without profiling (-XX:+TieredCompilation -XX:TieredStopAtLevel=1), with lower cold start times and also lower warm start times for nearly all percentiles for our use case.
So please review and re-measure the cold and warm start times for your use case if you use Java 17, as tiered compilation can be a better choice if you enable SnapStart for your Lambda function(s). In our case we didn't use any framework like Spring Boot, Micronaut or Quarkus, which may also impact the measurements.