By default, GraalVM Native Image gives a Java application running as a native executable fast startup and a low memory footprint. You can optimize this native executable further, for additional performance and higher throughput, by applying Profile-Guided Optimizations (PGO).
With PGO, you collect profiling data in advance and then feed it to the native-image tool, which uses this information to optimize the performance of the resulting binary.
Note: PGO is not available in GraalVM Community Edition.
This guide shows how to apply PGO and transform your Java application into an optimized native executable.
For the demo part, you will run a Java application that performs queries implemented with the Java Streams API. The application expects two integer arguments: the number of iterations and the length of the data array. It creates the data set with a deterministic random seed and runs the benchmark 10 times. Each benchmark run executes the query the given number of iterations, and the time taken for each run and its checksum are printed to the console.
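The role of the deterministic seed can be shown with a minimal sketch. The class and method names below (SeededDataDemo, generateAges, checksum) are hypothetical and not part of the demo application. The sketch only assumes that java.util.Random produces the same sequence for the same seed, which is why repeated runs of the demo should print identical checksums.

```java
import java.util.Random;

// Hypothetical helper, not part of Streams.java: shows that a fixed seed
// makes generated data, and therefore a checksum over it, reproducible.
final class SeededDataDemo {

    // Builds `length` pseudo-random ages in [0, 100) from the given seed.
    static int[] generateAges(long seed, int length) {
        Random random = new Random(seed);
        int[] ages = new int[length];
        for (int i = 0; i < length; i++) {
            ages[i] = random.nextInt(100);
        }
        return ages;
    }

    // Folds the values into a checksum with a multiply-by-31 scheme,
    // similar in spirit to the one used by the demo's benchmark method.
    static long checksum(int[] values) {
        long checksum = 1;
        for (int value : values) {
            checksum = checksum * 31 + value;
        }
        return checksum;
    }

    public static void main(String[] args) {
        long first = checksum(generateAges(42, 200));
        long second = checksum(generateAges(42, 200));
        System.out.println(first == second); // prints true
    }
}
```

Because the data is identical from run to run, differences in the reported times come from how the program was built and run, not from the input.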
Below is the stream expression to optimize:
Arrays.stream(persons)
    .filter(p -> p.getEmployment() == Employment.EMPLOYED)
    .filter(p -> p.getSalary() > 100_000)
    .mapToInt(Person::getAge)
    .filter(age -> age >= 40)
    .average()
    .getAsDouble();
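To make clear what this query computes, here is a minimal sketch that compares the stream pipeline with a hand-written loop. It is only an illustration: the Worker class is a hypothetical stand-in for the Person class defined in Streams.java, reduced to the three values the query reads, and the age condition follows the `age >= 40` form used in Streams.java.

```java
import java.util.Arrays;
import java.util.NoSuchElementException;

// Hypothetical illustration, not part of Streams.java.
final class LoopEquivalent {

    // Stand-in for the guide's Person class.
    static final class Worker {
        final boolean employed;
        final int salary;
        final int age;

        Worker(boolean employed, int salary, int age) {
            this.employed = employed;
            this.salary = salary;
            this.age = age;
        }
    }

    // The query expressed as a stream pipeline.
    static double averageWithStream(Worker[] workers) {
        return Arrays.stream(workers)
                .filter(w -> w.employed)
                .filter(w -> w.salary > 100_000)
                .mapToInt(w -> w.age)
                .filter(age -> age >= 40)
                .average()
                .getAsDouble();
    }

    // The same query as a plain loop: sum and count the matching ages.
    static double averageWithLoop(Worker[] workers) {
        long sum = 0;
        int count = 0;
        for (Worker w : workers) {
            if (w.employed && w.salary > 100_000 && w.age >= 40) {
                sum += w.age;
                count++;
            }
        }
        if (count == 0) {
            throw new NoSuchElementException("No value present");
        }
        return (double) sum / count;
    }

    public static void main(String[] args) {
        Worker[] workers = {
            new Worker(true, 150_000, 50),
            new Worker(true, 150_000, 60),
            new Worker(false, 150_000, 70),
            new Worker(true, 50_000, 80),
            new Worker(true, 150_000, 30)
        };
        System.out.println(averageWithStream(workers)); // prints 55.0
        System.out.println(averageWithLoop(workers));   // prints 55.0
    }
}
```

Only the first two workers pass all three filters, so both methods return the average of 50 and 60.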
Follow these steps to build an optimized native executable using PGO.
Note: Make sure you have installed a GraalVM JDK. The easiest way to get started is with SDKMAN!. For other installation options, visit the Downloads section.
Save the following code to the file named Streams.java:
import java.util.Arrays;
import java.util.Random;

public class Streams {

    static final double EMPLOYMENT_RATIO = 0.5;
    static final int MAX_AGE = 100;
    static final int MAX_SALARY = 200_000;

    public static void main(String[] args) {
        int iterations;
        int dataLength;
        try {
            iterations = Integer.parseInt(args[0]);
            dataLength = Integer.parseInt(args[1]);
        } catch (ArrayIndexOutOfBoundsException | NumberFormatException ex) {
            // Missing or non-numeric arguments
            System.out.println("Expected 2 integer arguments: number of iterations, length of data array");
            return;
        }

        // A fixed seed makes the generated data set identical on every run
        Random random = new Random(42);
        Person[] persons = new Person[dataLength];
        for (int i = 0; i < dataLength; i++) {
            persons[i] = new Person(
                    random.nextDouble() >= EMPLOYMENT_RATIO ? Employment.EMPLOYED : Employment.UNEMPLOYED,
                    random.nextInt(MAX_SALARY),
                    random.nextInt(MAX_AGE));
        }

        // Run the benchmark 10 times and report the time of each run
        long totalTime = 0;
        for (int i = 1; i <= 10; i++) {
            long startTime = System.currentTimeMillis();
            long checksum = benchmark(iterations, persons);
            long iterationTime = System.currentTimeMillis() - startTime;
            totalTime += iterationTime;
            System.out.println("Iteration " + i + " finished in " + iterationTime + " milliseconds with checksum " + Long.toHexString(checksum));
        }
        System.out.println("TOTAL time: " + totalTime);
    }

    // Executes the query `iterations` times and folds the results into a checksum
    static long benchmark(int iterations, Person[] persons) {
        long checksum = 1;
        for (int i = 0; i < iterations; ++i) {
            double result = getValue(persons);
            checksum = checksum * 31 + (long) result;
        }
        return checksum;
    }

    // Average age of employed persons aged 40 or older who earn more than 100,000
    public static double getValue(Person[] persons) {
        return Arrays.stream(persons)
                .filter(p -> p.getEmployment() == Employment.EMPLOYED)
                .filter(p -> p.getSalary() > 100_000)
                .mapToInt(Person::getAge)
                .filter(age -> age >= 40)
                .average()
                .getAsDouble();
    }
}

enum Employment {
    EMPLOYED, UNEMPLOYED
}

class Person {
    private final Employment employment;
    private final int age;
    private final int salary;

    public Person(Employment employment, int salary, int age) {
        this.employment = employment;
        this.salary = salary;
        this.age = age;
    }

    public int getSalary() {
        return salary;
    }

    public int getAge() {
        return age;
    }

    public Employment getEmployment() {
        return employment;
    }
}
Compile the application:
$JAVA_HOME/bin/javac Streams.java
(Optional) Run the demo application, providing some arguments to observe performance.
$JAVA_HOME/bin/java Streams 100000 200
Build a native executable from the class file:
$JAVA_HOME/bin/native-image Streams
An executable file, streams, is created in the current working directory.
Now run it with the same arguments to see the performance:
./streams 100000 200
This version of the program is expected to run slower than the same program on the GraalVM JDK or any other regular JDK.
Build an instrumented native executable by passing the --pgo-instrument option to native-image:
$JAVA_HOME/bin/native-image --pgo-instrument Streams
Run it to collect the code-execution-frequency profiles:
./streams 100000 20
Notice that you can profile with a much smaller data size. Profiles collected from this run are stored by default in the default.iprof file.
Note: You can specify where to collect the profiles when running an instrumented native executable by passing the -XX:ProfilesDumpFile=YourFileName option at run time.
Finally, build an optimized native executable by specifying the path to the collected profiles:
$JAVA_HOME/bin/native-image --pgo=default.iprof Streams
Note: You can also collect multiple profile files, by specifying different filenames, and pass them to the native-image tool at build time as a comma-separated list.
Run this optimized native executable under the time command to see its elapsed time and CPU usage:
time ./streams 100000 200
You should see performance comparable to, or better than, the Java version of the program. For example, on a machine with 16 GB of memory and 8 cores, the TOTAL time for 10 iterations dropped from ~2200 to ~270 milliseconds.
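To put those two figures in perspective, the arithmetic can be sketched as follows. The class and method names are hypothetical, and the inputs are simply the approximate TOTAL times quoted above, so the result is only as precise as those estimates.

```java
import java.util.Locale;

// Hypothetical helper for comparing two measured TOTAL times.
final class Speedup {

    // How many times faster the optimized run is than the baseline.
    static double speedup(double baselineMillis, double optimizedMillis) {
        if (baselineMillis <= 0 || optimizedMillis <= 0) {
            throw new IllegalArgumentException("Times must be positive");
        }
        return baselineMillis / optimizedMillis;
    }

    // Percentage of the baseline run time that was saved.
    static double percentReduction(double baselineMillis, double optimizedMillis) {
        if (baselineMillis <= 0 || optimizedMillis <= 0) {
            throw new IllegalArgumentException("Times must be positive");
        }
        return 100.0 * (baselineMillis - optimizedMillis) / baselineMillis;
    }

    public static void main(String[] args) {
        double baseline = 2200;  // approximate TOTAL time before PGO, in milliseconds
        double optimized = 270;  // approximate TOTAL time after PGO, in milliseconds
        System.out.printf(Locale.ROOT, "%.1fx faster, %.0f%% less time%n",
                speedup(baseline, optimized),
                percentReduction(baseline, optimized));
        // prints: 8.1x faster, 88% less time
    }
}
```

Your own numbers will differ with hardware and workload, so substitute the TOTAL times printed by your runs.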
This guide showed how to optimize native executables for additional performance and higher throughput. Oracle GraalVM offers extra benefits for building native executables, such as Profile-Guided Optimizations (PGO). With PGO, you “train” your application for specific workloads and can significantly improve its performance.