A lot of my research has involved Java bytecode instrumentation with ASM and, more recently, JVMTI development. For instance, with VMVM, we instrumented bytecode to enable efficient Java class reinitialization. With Phosphor, we instrumented bytecode to track metadata with every single variable, enabling performant and portable dynamic taint tracking. In ElectricTest, we combined bytecode instrumentation with JVMTI to efficiently detect data dependencies between test cases running within the same JVM.
As I built each of these projects, I leaned heavily on code snippets that I found across the internet – while I had done a lot of Java development before starting these projects, I’d done nothing with bytecode instrumentation. I found it particularly difficult to find examples of how to use JVMTI (aside from the basic man-pages, and a few excellent blog posts by Kelly O’Hair [1, 2, 3]).
To try to make it easier for others to use the same tools to build their own systems, I’m compiling some examples here that I think might be useful. I don’t intend for this to serve as a beginner’s resource – there are plenty of bytecode instrumentation tutorials out there – instead, I plan to collect some interesting examples (mostly related to JVMTI) that I think would be useful. If you have any particular requests, please let me know (as a comment here, or via email, or Twitter).
Bytecode rewriting can be used to change and insert instructions in code, and JVMTI can be used to interact with low-level events in the JVM (such as objects being freed or garbage collection beginning). JVMTI also allows you to assign a relatively efficient tag to any reference type (object or array). Each one of these examples has some interesting trick, though, that I thought was worth sharing. Each one is a Maven project, and you can build and run the tests with mvn verify (the JVMTI projects should work on Mac OS X and Linux, but are not configured to build on Windows – it’s possible to do, but the scripts aren’t there). All of the examples are in a GitHub project. To import them into Eclipse, first run mvn eclipse:eclipse in the project to generate Eclipse project configuration files.
- Method Coverage recording – efficiently records per-method coverage information (e.g. which methods in an application under test are executed by each test). Bytecode is instrumented dynamically as it’s loaded into the JVM (using a Java agent). There is a local cache within each class that records whether a method was hit during the current test case, and a global collection that stores the hits too. This local + global cache is much more performant than just keeping a global cache, because when each method is executed we can first check a local boolean field (which is easily optimized by the JIT compiler), and only if it hasn’t been hit do we store the fact that the method was executed in a global set (which is much more expensive).
- Static Instrumenter – applies bytecode instrumentation statically, rather than at load time. This technique is needed if you want to instrument various core JRE classes that are already loaded (and can no longer be changed) by the time your javaagent gets called.
- Heap Tagging – uses JVMTI and bytecode instrumentation to allow you to apply an arbitrary object “label” to every reference type (objects or arrays). Doing this for instances of most classes is trivial: we just add a field to each class to store the tag, and generate some code to set and fetch it for each class (every such class is made to implement the interface Tagged). However, you can’t do this for all classes: the JVM hard-codes the offsets of some fields of some classes (like Object, Long, Byte, etc.), so adding a field to them is not safe; plus you can’t do this for arrays (which aren’t instances of classes). For these cases, we use JVMTI’s GetTag and SetTag functions. Tagger provides an abstraction to get and set the label of an object. The JVMTI implementation is mostly bookkeeping that makes sure that we don’t leak memory from these object labels. The JVMTI code is largely inspired by another excellent example by Kelly O’Hair.
- Heap Walking – uses JVMTI for a slightly contrived (but still somewhat interesting) example of heap walking and tagging. It crawls the heap (using FollowReferences), and for every object, builds a list of the static fields that can reach that object. After crawling, the library can return the list of static fields that point (perhaps indirectly) to the requested object. This example also shows off how to calculate the internal JVM field offsets for classes (which was a pain to write out my first time…).
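To make the local + global cache idea from the Method Coverage example more concrete, here is a rough sketch of what an instrumented class could look like if the inserted bytecode were written out as Java source. The names CoverageRegistry and InstrumentedExample, and the string used to identify the method, are assumptions made up for this illustration; the real project generates the equivalent instructions with ASM and may organize things differently.

```java
import java.util.Collections;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical global cache: a thread-safe set of methods hit so far. */
final class CoverageRegistry {
    private static final Set<String> HIT = ConcurrentHashMap.newKeySet();

    /** Relatively expensive: hashes the key and touches shared state. */
    static void recordHit(String method) {
        HIT.add(method);
    }

    /** Read-only view of everything recorded so far. */
    static Set<String> hits() {
        return Collections.unmodifiableSet(HIT);
    }
}

/** Hypothetical class as it might look after instrumentation. */
final class InstrumentedExample {
    // Local cache: one boolean per method, added by the instrumenter.
    private static boolean otherMethodHit;

    static int otherMethod() {
        // Inserted prologue: a cheap field check on every call...
        if (!otherMethodHit) {
            otherMethodHit = true;
            // ...and the expensive global update only on the first call.
            CoverageRegistry.recordHit("InstrumentedExample.otherMethod");
        }
        return 42; // original method body
    }
}
```

Calling InstrumentedExample.otherMethod() many times touches the global set only once. A per-test version would also need some way to clear both the global set and every local flag between test cases, which this sketch leaves out.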
Let me know (email, twitter, or comment on this blog post) if you have any questions or requests.
I imported the Method Coverage recording Maven project. When I run the JUnit test, it is unable to identify the method call (otherMethod), so the assert fails. Can you please let me know how to run the test?
The tests are configured to instrument the code as it is loaded into the JVM with a javaagent. This configuration is in Maven (you can see it in the pom.xml). If you try running the tests in Eclipse, they will not run in this special configuration and will fail. If you want to run this outside of Maven, you would have to specify the javaagent argument as in the maven-failsafe configuration. Are you running the tests in Eclipse, or with “mvn verify”?
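For readers who haven’t used a javaagent before, the load-time hook can be sketched roughly as below. This is not the agent from the project: SketchAgent and LoggingTransformer are names invented for this illustration, and the transformer only records class names instead of rewriting anything.

```java
import java.lang.instrument.ClassFileTransformer;
import java.lang.instrument.Instrumentation;
import java.security.ProtectionDomain;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

/** Hypothetical agent, for illustration only. */
final class SketchAgent {
    /** Names of the classes offered to the transformer so far. */
    static final List<String> SEEN = new CopyOnWriteArrayList<>();

    /** Called by the JVM for each class as it is loaded. */
    static final class LoggingTransformer implements ClassFileTransformer {
        @Override
        public byte[] transform(ClassLoader loader, String className,
                                Class<?> classBeingRedefined,
                                ProtectionDomain protectionDomain,
                                byte[] classfileBuffer) {
            if (className != null) {
                SEEN.add(className); // internal form, e.g. "java/util/ArrayList"
            }
            // Returning null tells the JVM to keep the class bytes unchanged.
            // A real instrumenter would return rewritten bytes here instead.
            return null;
        }
    }

    /** Entry point the JVM invokes before main() when the agent is attached. */
    public static void premain(String agentArgs, Instrumentation inst) {
        inst.addTransformer(new LoggingTransformer());
    }
}
```

The JVM only calls premain when it is started with the -javaagent: option pointing at a jar whose manifest names the agent class in its Premain-Class attribute. An IDE test run that doesn’t pass that option never installs the transformer, so no instrumentation happens.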
hi jon, is there a way to capture the state of the application when an exception occurs?
Yup – the easiest way would be to register an uncaught exception handler to receive otherwise unhandled exceptions and then you can implement some code in there to capture whatever state it is that you need. If you are concerned with a specific type of exception only (which might also be caught by the app), you could instrument the constructor of that exception to trap into some code similarly, recording whatever state you are interested in.
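As a minimal sketch of the first option, the handler below records a short report whenever a thread dies from an unhandled exception. CrashStateCapture and the report format are made up for this example; what “state” means is up to you, and here it is just the thread name, the exception, and the top stack frame.

```java
import java.util.concurrent.atomic.AtomicReference;

/** Hypothetical helper, for illustration only. */
final class CrashStateCapture {
    /** Most recent report produced by the handler, or null if none yet. */
    static final AtomicReference<String> LAST_REPORT = new AtomicReference<>();

    /** Installs a JVM-wide handler for otherwise unhandled exceptions. */
    static void install() {
        Thread.setDefaultUncaughtExceptionHandler((thread, error) -> {
            StackTraceElement[] trace = error.getStackTrace();
            String top = trace.length > 0 ? trace[0].toString() : "<no frames>";
            // Capture whatever application state you need at this point.
            LAST_REPORT.set("thread=" + thread.getName()
                    + " exception=" + error.getClass().getName()
                    + " message=" + error.getMessage()
                    + " at=" + top);
        });
    }
}
```

After calling CrashStateCapture.install() once at startup, any thread that terminates with an uncaught exception runs the handler before it exits. Exceptions that the application catches itself never reach this handler, which is where instrumenting the exception’s constructor comes in.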
Hi Jon, is there a way to modify the Heap Walking example to build a list of all fields, and not only static fields?
A list of all fields of what? Right now it’s finding all static fields, then listing everything reachable from those (e.g. as fields, or members of an array). All fields of an object? Why do you want to do this with JVMTI instead of reflection?
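If the goal is just the fields of one object, reflection may be enough. A minimal sketch, assuming only instance fields are wanted, inherited ones included (FieldLister, Base, and Child are hypothetical names used for the illustration):

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper, for illustration only. */
final class FieldLister {
    /** Small sample hierarchy to run the helper against. */
    static class Base {
        int id;
        static int counter; // static, so it is skipped below
    }

    static class Child extends Base {
        String name;
    }

    /** Lists the instance fields of obj as "DeclaringClass.fieldName". */
    static List<String> instanceFields(Object obj) {
        List<String> names = new ArrayList<>();
        // Walk up the hierarchy: getDeclaredFields() only returns the fields
        // declared by that one class, not the inherited ones.
        for (Class<?> c = obj.getClass(); c != null; c = c.getSuperclass()) {
            for (Field f : c.getDeclaredFields()) {
                if (!Modifier.isStatic(f.getModifiers()) && !f.isSynthetic()) {
                    names.add(c.getSimpleName() + "." + f.getName());
                }
            }
        }
        return names;
    }
}
```

For a Child instance this finds Child.name and Base.id and skips the static counter. The order of fields within a class is not guaranteed by getDeclaredFields(), and reading the values of private fields would additionally require setAccessible(true), which newer JDKs may refuse for classes in other modules.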