Optimize Size of a Native Executable using Build Reports
You can optimize your native executable by taking advantage of different tools provided with Native Image. This guide demonstrates how to use the Build Report tool to better understand the contents of a produced native executable, and how a small alteration in an application, without any semantic change, can influence the final binary size.
Note: Build Report is not available in GraalVM Community Edition.
Prerequisites
Make sure you have installed a GraalVM JDK. The easiest way to get started is with SDKMAN!. For other installation options, visit the Downloads section.
For the demo, you will run a simple Java application that extracts the i-th word from an input string. The words are delimited by commas and may be enclosed by an arbitrary number of whitespace characters.
- Save the following Java code to a file named IthWord.java:
public class IthWord {
    public static String input = "foo \t , \t bar , baz";

    public static void main(String[] args) {
        if (args.length < 1) {
            System.out.println("Word index is required, please provide one first.");
            return;
        }
        int i = Integer.parseInt(args[0]);
        // Extract the word at the given index.
        String[] words = input.split("\\s+,\\s+");
        if (i >= words.length) {
            System.out.printf("Cannot get the word #%d, there are only %d words.%n", i, words.length);
            return;
        }
        System.out.printf("Word #%d is %s.%n", i, words[i]);
    }
}
- Compile the application:
javac IthWord.java
(Optional) Test the application with some arbitrary argument to see the result:
java IthWord 1
The output should be:
Word #1 is bar.
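The word extraction can also be checked in isolation. The following minimal sketch applies the same delimiter pattern to the same input; the class name SplitSketch and the helper splitWords are hypothetical and not part of the guide.

```java
import java.util.Arrays;

class SplitSketch {
    // Splits the input with the same pattern that IthWord uses:
    // a comma surrounded by one or more whitespace characters on each side.
    static String[] splitWords(String input) {
        return input.split("\\s+,\\s+");
    }

    public static void main(String[] args) {
        String[] words = splitWords("foo \t , \t bar , baz");
        System.out.println(Arrays.toString(words)); // prints [foo, bar, baz]
    }
}
```

Note that \s+ requires at least one whitespace character on each side of the comma, so an input such as "a,b" is not split by this pattern.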
- Build a native executable from the class file along with a Build Report:
native-image IthWord --emit build-report
The command generates an executable file, ithword, in the current working directory. The Build Report file, ithword-build-report.html, is automatically created alongside the native executable. A link to the report is also listed in the Build artifacts section at the end of the build output. You can specify a different filename or path for the report by appending it to the build-report argument, for example, --emit build-report=/tmp/custom-name-build-report.html.
(Optional) Run this executable with the same argument:
./ithword 1
The output should be identical to the former one:
Word #1 is bar.
- A Build Report is an HTML file. Open the report in a browser. First, you are greeted with a general summary of the image build. You can see the total image size above the Image Details chart in the top-right corner.
The initial size looks as expected but, for reference, the size of a HelloWorld application is around 7 MB. So the difference is substantial, even though the code is quite straightforward. Continue with the investigation.
- Go to the Code Area tab, either by clicking its tab in the navigation or the corresponding bar in the chart.
The breakdown chart you see now visualizes how different packages relate to each other in terms of their bytecode size. Note that the shown packages contain only the methods found to be reachable by the static analysis. This means that the shown packages (and their classes) are the only ones that end up being compiled and included in the final binary.
The first conclusion you can draw is that most of the code originates from either the JDK or Native Image internal code: the IthWord class contributes only 0.013% of the total bytecode size of all the reachable methods.
- Drill down to the java package by clicking it. Most of the reachable code (almost half) comes from the java.util package. Also, notice that the java.text and java.time packages contribute almost 20% of the java package size. But does the application use these packages?
- Drill down to the text package: You now see that most of the reachable classes are used for text formatting (see the list of packages and classes below). By now, you can suspect that the included formatting classes can only be reachable (although not actually used) from one place: System.out.printf.
- Go back to the java package (by clicking the central circle or the java name at the top of the chart).
- Next, drill down to the time package: Almost half of the package size comes from its format subpackage (similar to the situation in the java.text package). So, System.out.printf is your first opportunity for improving the binary size.
- Go back to the initial application and simply switch from using System.out.printf to System.out.println:
public class IthWord {
    public static String input = "foo \t , \t bar , baz";

    public static void main(String[] args) {
        if (args.length < 1) {
            System.out.println("Word index is required, please provide one first.");
            return;
        }
        int i = Integer.parseInt(args[0]);
        // Extract the word at the given index.
        String[] words = input.split("\\s+,\\s+");
        if (i >= words.length) {
            // Use System.out.println instead of System.out.printf.
            System.out.println("Cannot get the word #" + i + ", there are only " + words.length + " words.");
            return;
        }
        // Use System.out.println instead of System.out.printf.
        System.out.println("Word #" + i + " is " + words[i] + ".");
    }
}
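The switch does not change what the application prints. As a minimal sketch (the class FormatSketch and its helpers are hypothetical), the following compares the format-string variant with plain concatenation. It uses String.format with Locale.ROOT, an assumption made here to keep the comparison deterministic, whereas System.out.printf uses the default locale. Both go through java.util.Formatter, which also supports locale-aware numbers and date/time conversions; that is a plausible reason why java.text and java.time become reachable.

```java
import java.util.Locale;

class FormatSketch {
    // Format-string variant, mirroring the printf call in the first version of IthWord.
    // Locale.ROOT is an assumption of this sketch; printf itself uses the default locale.
    static String viaFormat(int i, String word) {
        return String.format(Locale.ROOT, "Word #%d is %s.%n", i, word);
    }

    // Concatenation variant, mirroring the println call in the second version.
    static String viaConcat(int i, String word) {
        return "Word #" + i + " is " + word + "." + System.lineSeparator();
    }

    public static void main(String[] args) {
        // Both variants produce the same text for the same arguments.
        System.out.println(viaFormat(1, "bar").equals(viaConcat(1, "bar"))); // prints true
    }
}
```

Concatenation only needs the String machinery that is reachable anyway, so no formatting classes have to be included for it.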
- Repeat steps 2-4 (compile the class file, build the native executable, and open the new report).
- See in the Summary section that the total binary size is reduced by almost 40%.
- Go to the Code Area tab again and drill down to the java package. You can see that the initial assumption is correct: both the java.text and java.time packages are not reachable anymore.
Continue to see if there is more reachable code that the application does not necessarily need.
As you may have guessed already, the other candidate resides in the java.util package: the regex subpackage. The package alone now contributes nearly 15% of the java package size. Notice that the regular expression (\\s+,\\s+) is used to split the original input into words. Although very convenient, it makes the aforementioned regex package an unnecessary dependency. The regular expression itself is not complex, and the splitting could be implemented differently.
- Next, go to the Image Heap tab to continue the exploration. The section provides a list of all object types that are part of the image heap: the heap that contains reachable objects such as static application data, metadata, and byte arrays for different purposes. In this case, the list looks as usual: most of the size comes from the raw string values stored in their dedicated byte arrays (around 20%), String and Class objects (around 20%), and code metadata (20%).
There are no specific object types that heavily contribute to the image heap in this application. But there is one unexpected entry: a small size contribution (~2%) is due to the resources that are embedded into the image heap. The application does not use any explicit resources, so this is unexpected.
- Switch to the Resources tab to continue the investigation. This section provides a list of all the resources that are explicitly requested through the configuration file(s). There is also the option to toggle other kinds of resources (Missing resources, Injected resources, and Directory resources); however, this is beyond the scope of this guide. Learn more in Native Image Build Report.
To conclude this part, there is only one resource (java/lang/uniName.dat), which comes from the java.base module, that also contributes to the image heap but is not requested explicitly from the application code. You cannot do anything about this, but keep in mind that the JDK code (indirectly reachable from the user code) can also use additional resources, which then adversely affect the size.
- Now go back to the application code and implement a new approach that does not use regular expressions.
The following code uses String.substring and String.indexOf to preserve the semantics while keeping the logic relatively simple:
public class IthWord {
    public static String input = "foo \t , \t bar , baz";

    public static void main(String[] args) {
        if (args.length < 1) {
            System.out.println("Word index is required, please provide one first.");
            return;
        }
        int i = Integer.parseInt(args[0]);
        // Extract the word at the given index using String.substring and String.indexOf.
        String word = input;
        int j = i, index;
        // Skip the first i words by cutting the input after each comma.
        while (j > 0) {
            index = word.indexOf(',');
            if (index < 0) {
                // Use System.out.println instead of System.out.printf.
                System.out.println("Cannot get the word #" + i + ", there are only " + (i - j + 1) + " words.");
                return;
            }
            word = word.substring(index + 1);
            j--;
        }
        // Cut off everything from the next comma on; a comma at position 0 means an empty word.
        index = word.indexOf(',');
        if (index >= 0) {
            word = word.substring(0, index);
        }
        word = word.trim();
        // Use System.out.println instead of System.out.printf.
        System.out.println("Word #" + i + " is " + word + ".");
    }
}
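To gain some confidence that the rewrite preserves the behavior for the demo input, the two extraction strategies can be compared side by side. This is a minimal sketch, not part of the guide: the class ExtractSketch and the helpers viaRegex and viaIndexOf are hypothetical, and both return null for an index that is out of range instead of printing a message.

```java
import java.util.Objects;

class ExtractSketch {
    static final String INPUT = "foo \t , \t bar , baz";

    // Regex-based extraction, as in the earlier versions of IthWord.
    static String viaRegex(String input, int i) {
        String[] words = input.split("\\s+,\\s+");
        return i < words.length ? words[i] : null;
    }

    // Regex-free extraction in the spirit of the final version of IthWord.
    static String viaIndexOf(String input, int i) {
        String word = input;
        for (int j = i; j > 0; j--) {
            int index = word.indexOf(',');
            if (index < 0) {
                return null; // fewer than i + 1 words
            }
            word = word.substring(index + 1);
        }
        int index = word.indexOf(',');
        if (index >= 0) {
            word = word.substring(0, index);
        }
        return word.trim();
    }

    public static void main(String[] args) {
        // Compare both strategies for every valid index and one index past the end.
        for (int i = 0; i <= 3; i++) {
            String a = viaRegex(INPUT, i);
            String b = viaIndexOf(INPUT, i);
            System.out.println(i + ": " + a + " / " + b + " (same: " + Objects.equals(a, b) + ")");
        }
    }
}
```

The two strategies agree for the demo input, but they are not equivalent for arbitrary strings: for example, the regular expression does not split "a,b", whereas the indexOf-based variant does.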
- Repeat steps 2-4 again (compile the class file, build the native executable, and open the new report).
- Once more, you can see the improvement in the total binary size (around 15%) in the Summary section.
Additionally, a previously registered resource is not part of the generated binary anymore (see the Resources section again to confirm).
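One plausible explanation, which is an assumption here and not something the report states, is that java/lang/uniName.dat holds the Unicode character names used by Character.getName and by the \N{...} construct of java.util.regex.Pattern, so dropping the regular expression also makes the resource unreachable. The sketch below (hypothetical class CharNameSketch) only illustrates the two JDK features that rely on character names.

```java
import java.util.regex.Pattern;

class CharNameSketch {
    // Looks up the Unicode name of a code point.
    static String nameOf(int codePoint) {
        return Character.getName(codePoint);
    }

    // Matches a character by its Unicode name using the \N{...} regex construct (JDK 9+).
    static boolean matchesByName(String s) {
        return Pattern.matches("\\N{LATIN SMALL LETTER A}", s);
    }

    public static void main(String[] args) {
        System.out.println(nameOf('a'));        // prints LATIN SMALL LETTER A
        System.out.println(matchesByName("a")); // prints true
    }
}
```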
This guide demonstrated how to optimize the size of a native executable using Build Reports. Build Reports allow you to explore the contents of the generated native executables in greater detail. A better understanding of which code is reachable enables you to implement the application in a way that preserves its semantics while removing unnecessary JDK dependencies.