Generating JVM bytecode 4

GCJ lets you compile your Java program to a native executable. This can be handy, but this isn't as handy as you might think in relation to generating JVM bytecode.

A GCJ-compiled program starts quickly, and can be used in places where you can't use a JAR file or a shell script wrapper (the two most common choices for launching Java programs), but when you start to execute your generated bytecode, you'll notice that code runs pretty slowly. About the same speed as most interpreters seem to run. This is not a coincidence.

So that's a shame. If you're not generating bytecode, and just using a tree-walking interpreter, this won't be a problem, and you might still want to consider GCJ. Just watch out for GNU Classpath: there are lots of bugs and unimplemented features (including unimplemented features that just silently do the wrong thing), and performance characteristics can be very different. BigInteger, for example, is way slower at the moment.

One idea I haven't yet investigated is that of using GCJ to make a native binary that uses CNI to start Sun's JVM. I'd use GCJ-compiled code to generate the bytecode and hand over to HotSpot for the actual execution. Given that you'd have to pay for the Sun JVM initialization in the end, and you'd miss out on the benefit of having the compiler "warm up" the JVM for execution of the generated code, I'm not sure this would really gain much over just writing a little C++ JNI-based launcher. Maybe start the Sun JVM on another thread while compiling the bytecodes on the main thread?

The start-up time only matters for scripts that take microseconds to execute, because the JVM soon more than makes up the lost time, but it would be nice to have our cake and eat it.


Generating JVM bytecode 3

ASM, gnu.bytecode, and org.mozilla.classfile: a comparison
I explained earlier that I chose ASM over BCEL because the latter has a reputation for being big and slow, while the former makes a big thing of being smaller and faster. Plus ASM has better documentation.

There are other choices, though. You can write your own library, for example. From the various implementations I've seen, I reckon that's about 3000 lines of work. If that end of things interests you, go for it. But it's not obvious that there's anything sufficiently wrong with the existing libraries that you'd really want to write your own.

The only other alternative I'd seriously considered was Per Bothner's gnu.bytecode, written as part of Kawa. Per describes Kawa as "The Kawa Language Framework", but most people who've heard of it probably think of it as a Java implementation of Scheme. (Such a thing is indeed part of Kawa, most of the documentation is about Kawa Scheme, and most people who use Kawa use it as a Scheme implementation, even though it also implements XQuery and Emacs Lisp, so it's easy to see how this comes about.)

So. Why didn't I try gnu.bytecode?

1. It's not packaged and available in Ubuntu's repositories. Despite the fact that nothing I'm the major author of and still use fits this criterion, I mostly shy away from stuff I can't apt-get install. If no-one cares enough to maintain a package in the enormous Debian repositories, my thinking goes, how good can it be? (At the same time, right now, the only GUI application I'm running that's available on packages.ubuntu.com is Firefox. And I'm running all the stuff I usually run. But still, my gut feeling is always that "not in Debian" is a bad thing. ASM is. BCEL is.) Per points out that Kawa is packaged in Fedora.

2. It's easy to fall into the trap of thinking it's part of a Scheme implementation. Because the JavaDoc on the web is collected together, it's not obvious that gnu.bytecode is completely stand-alone. If you download the source and grep the imports, or download the source, delete everything outside gnu/bytecode/ and compile what's left,you can confirm that gnu.bytecode is, indeed, completely stand-alone. But you'd be forgiven (by me, at least) for assuming you'd get all the Scheme junk too. Moreover, it's not obvious that gnu.bytecode is sufficiently complete for compiling arbitrary languages. How do you know, given its bundling, that it's not just whatever was necessary for Scheme? (I have an existence proof for you, now, but more on that later.)

3. It's effectively undocumented. There's no tutorial or even a simple example, most of the classes and methods are undocumented, and those few methods that are "documented" tend to be sufficiently telegraphic that you really need to look at the source to understand what they're trying to tell you. (And a lot of the seemingly obvious methods hide little tricks that you'd do well to examine the source to learn more about, too. More on this, too, later.)

I was in a hurry to dump my old AST-walking interpreter, and ASM seemed fine, so gnu.bytecode really didn't seem like a tempting proposition, given the above. Then, earlier this week, Patrick Wright asked me "why not gnu.bytecode?", and Per Bothner (who was cc:ed) convinced me that point 2 above wasn't really an issue: gnu.bytecode is stand-alone, is intended to be complete, and is intended to be used in other language implementations. I stuck to my "but it's undocumented" guns until Patrick sent me a link to Peter Sestoft's Runtime Code Generation with JVM and CLR, section 2.4 of which is by far the most useful gnu.bytecode documentation I've seen: it's a rough equivalent of the trivial example given by most other bytecode libraries of how to create a class and add a method with a little bit of code in it.

I'd complained in an earlier post that, as far as I know, there's no decent comparison of the bytecode libraries (though Sestoft's paper comes closest, it's pretty old, and isn't really a comparison so much as a couple of data points), so in a moment of foolhardiness I decided that the almost exactly 1000 lines of my language's JvmCodeGenerator.java would be an interesting experiment.

After I'd rewritten my code generator to use gnu.bytecode, I came across org.mozilla.classfile and, either because of a subconscious desire to procrastinate or a desire to be thorough, I rewrote my code generator again.

It's worth pointing out that ASM and gnu.bytecode and org.mozilla.classfile don't really address the same problem. ASM is a bytecode manipulation library, gnu.bytecode is a bytecode generation library that's the lowest-level part of a high-level framework for language implementations, and org.mozilla.classfile is pretty much "the simplest thing that could possibly work", used as the back-end for their Rhino Javascript compiler.

ASM is designed to be equally at home modifying existing classes as writing new ones. This is why it's quite awkward to use for compilation, in comparison to something like gnu.bytecode which was expressly designed for that purpose. If you need to do both, your time is probably better invested in learning ASM. If you only need to compile new classes, gnu.bytecode is worth considering. If you want something small and simple (or you're specifically looking for a GPL-licensed library), org.mozilla.classfile is also worth considering.

The mixed blessing of more checking
Unlike ASM, which mostly signals the fact that you've made a mistake by throwing ArrayIndexOutOfBoundsException or NullPointerException from its bowels, or by letting you output bad/unverifiable code, gnu.bytecode and org.mozilla.classfile make a bit of an effort to keep you on the straight and narrow. Great, you think. And sometimes it is. But at other times I found myself thinking that "mistakes like this were actually easier to understand within the context of bad code". As usual in life, everything's a trade-off, and there's no One True Answer here, either.

Of course, this is one area in which I can't really make a fair comparison, because I've hunted down a lot more bugs in my code generator with ASM, and the bugs in my gnu.bytecode code generator were potentially different kinds of bugs, being errors in my translation between the two APIs rather than the kind of bug you'd see in real life as you tried to write a new code generator. Rewriting again, I made more errors because I was using a lower-level toolkit and because I felt confident doing a lot of the rewrite with search-and-replace. This caused a little bit of trouble because I was moving to a lower-level system, so something that had been a single concept in the higher-level libraries actually needed to be made more explicit.

Naming consistency
In general, gnu.bytecode's naming seems more in line with the JVMS than ASM's. I've been guilty of the mistake of thinking I have better terminology myself often enough, and it never works out unless you can completely isolate people from what lies beneath. And you rarely can, so it always just ends up confusing having to remember which term to use where.

Of course, the fact that gnu.bytecode mostly avoids this trap makes it all the more annoying when it falls right into it, dragging you after it. Calling the constant for the gnu.bytecode.Type representing java.lang.Object "pointer_type", for example, is unfortunate. (It's that kind of unidiomatic language that makes the Sun verifier so unintelligible without reference to the source. Here too, I found pointer_type by searching for java.lang.Object, after my guess "object_type" failed. I might have tried "reference_type", but "pointer_type"?)

The gnu.bytecode "if" instructions are a bit confusing, too. ASM has ifCmp, and gnu.bytecode has a huge range of methods, some with names that make them seem like implementation details rather than things you should be calling. For the common case where you want "ifeq label", gnu.bytecode is slightly more compact, convenient, and obvious, using "code.emitGotoIfEq(label)" instead of "mg.ifCmp(type, GeneratorAdapter.EQ, label)". In the general case, though, where you're implementing all your relational operators with code that's exactly the same apart from the "if" bytecode, it's a pain that you can't just pass a conveniently-named constant as you can with ASM. With gnu.bytecode you either end up passing magic numbers (there's no gnu.bytecode equivalent of ASM's Opcode, which has a named constant for each bytecode) or, as I did, having a secondary switch inside the general-purpose method that invokes the relevant CodeAttr method by name.

Finally, there's no emitNop in gnu.bytecode. You have to use put1(0) instead. The "nop" instruction is bytecode 0, you see. Again, something like ASM's Opcode would have helped here. The source just uses the appropriate magic numbers each time it needs them.

As for org.mozilla.classfile, the interface is way smaller. You deal with a single class (plus a poor-man's enum), and a summary of the public API fits comfortably on my screen. You can emit any instruction you like, and it'll look something like "cfw.add(ByteCode.NOP)". You'd be right to infer from the name "add" that org.mozilla.classfile goes mad for overloading. Personally, I'd rather have gnu.bytecode-like addPushInt, addPushBoolean, addPushLong, addPushDouble, and addPushString than the actual five overloads of addPush, but it's not too bad.

The uniformity of passing ByteCode bytes around in org.mozilla.classfile makes for trivial abstracting-out of common sequences. So in contrast to gnu.bytecode, I had no trouble translating the ASM GeneratorAdapter.EQ example I just mentioned.

One minor drawback of org.mozilla.classfile is that, being lower-level, it's sometimes slightly more verbose than gnu.bytecode, but not all the time. The fact that there's only one class (ClassFileWriter) to deal with, for example, removes some verbiage, because you don't have to deal with Method and CodeAttr objects. (One side effect of this is that this is the only library where you can only generate code for one method at a time. That required minor fiddling in my code generator, because I needed to explicitly delay generating code for function definitions at global scope until I'd finished generating code for code at global scope, where with gnu.bytecode or ASM I could just push a new object representing the new method's code onto a stack. But it makes little difference.)

Label handling is pretty similar in all three libraries. I preferred ASM's verb "mark" to gnu.bytecode's "define" for the method that fixes a label to the current location, but I preferred gnu.bytecode's "new Label()" to ASM's "mg.newLabel()" for creating a free label, because the latter reads to me as if it ought to mean "new label fixed to the current location". org.mozilla.classfile uses acquireLabel and markLabel. Perfect.

A hierarchy of Types?
In a break from its usual over-designed style, ASM has one class Type that's used to represent primitive types, reference types, and array types alike. gnu.bytecode has a Type class, but it also has ArrayType and ClassType and ObjectType. The last of which isn't what you think. All this is a minor pain. Primarily (in my case) because it means that I can no longer treat "void" uniformly, even though all my other types are ClassType, void is just a Type. So my method that translates my language's types to gnu.bytecode types has to return Type to cope with the "void" case, and almost all callers have to cast that to ClassType to be able to use it. A minor annoyance, but no-one likes seeing casts in their code, and ASM's uniformity was nice and clean.

Unrelated and minor, but one of the few head-scratching gnu.bytecode mistakes I made was that I accidentally created two ClassType instances for the class I was generating. Obviously, stuff added to one isn't visible in the other. It was my fault for coding so late at night, but it did make me wonder about the wisdom of a public constructor on ClassType.

Where gnu.bytecode uses a more complicated solution than ASM, org.mozilla.classfile goes in the opposite direction and uses a simpler solution than ASM. It has no type for types at all, using String instead. This is convenient, but the lack of distinction or automatic conversion between class names and signatures can be annoying at times. Especially because parameter names in the API don't usually make the distinction either, so you'll be relying on your wits (or the JVMS) to work out which you should supply. org.mozilla.classfile has no user-visible abstractions for field references or method references. This, to my mind, actually improves one-off calls to fixed fields/methods. The temptation to cache a field or method reference doesn't exist, and the code reads as clearly as I can imagine. If you keep calling the same method or accessing the same field, though, it's easy to find yourself wishing there were a convenient way to say this just the once. (In the most frequent case in my code, I went for a method that just emits the get/put/invoke in question.)

One other note: org.mozilla.bytecode forces you to pay more attention to return types because you're constructing descriptors manually. This is a price you pay for going low-level.

Where the lack of documentation for gnu.bytecode really bites you is when there are a bunch of deprecated alternatives lying around. Sometimes they're literally @deprecated, sometimes they have a comment pointing you in the right direction, but mostly you're just supposed to know.

The Sestoft paper, for example, uses CodeAttr.initCode and then calls getCode, and the reader's left wondering why getCode doesn't just initCode if code is null. It turns out that you're probably much better off calling CodeAttr.startCode, which does this and also creates a Variable instance for each of your method's parameters. (You can get at the Variables by numeric index, so it's hard to imagine when you'd be harmed by this behavior.) Funnily enough, this is documented. The JavaDoc for initCode clearly says most people are better off with getCode. The trouble is that documentation's so sparse that you just don't bother reading it. Where with ASM I worked mainly from the documentation, needing the source pretty much only to work out how to use their verifier (as you saw in an earlier post), with gnu.bytecode I needed to work from the source.

It would be nice to see a clear out of some of the old stuff. It wouldn't solve the documentation problem, but it would help, just by making the forest smaller.

The org.mozilla.classfile API, on the other hand, is pretty lean and mean. It depends on a few other org.mozilla classes for maps with integer keys and/or values that don't require boxing and a weird non-generic variant of ArrayList that keeps the first few elements in fields instead of an array, but swapping these out for their closest JDK equivalents is easy. If you know the JVMS reasonably well, things are pretty self-explanatory, and there's pretty much only one way to do anything.

Debugging information
I've been putting off adding debugging information – local variable tables and line number tables – to my ASM-based code generator for ages. It's way harder than I think it ought to be, unless I'm missing a trick. By contrast, gnu.bytecode has exactly the interface I'd like to see. Local variable tables pretty much added themselves in the translation (I certainly wrote no new code) and line number information required little more than a line per visit method in my code generator's AST node visitor. The resultant line number tables are long and repetitive, but I soon decided that the seemingly obvious optimization (removing any entry that maps a bytecode index to a line number if the previous entry mapped to the same line number) would be invalid in the presence of branches, so long tables is just a fact of life. It would be nice if the disassembler could be easily told to not dump line number tables. It takes an int "flags" parameter, but doesn't currently use it in any way. The disassembler, by the way, is really nice. I've encountered it before and still think it's the best "javap" going; complete and with highly readable output.

It's also a lot easier to deal with "register allocation" using gnu.bytecode. You can pushScope and popScope on the CodeAttr, and local slots allocated during that scope will be freed for reuse. With ASM, I had to manage this myself, and was using a trivial scheme where I never reused slots. This didn't seem to affect HotSpot much in terms of performance, but it probably increased the potential for unwanted heap retention as references hung around in local slots that would never again be accessed.

Turning to org.mozilla.classfile, you get the same simple interface to the line number table, and a seemingly convenient interface to the local variable table, but one that's broken in that it assumes that all variables live from their point of definition to the end of the method. You also get no help when it comes to reusing local slots.

If you want or need stack maps, neither gnu.bytecode or org.mozilla.classfile offer them at the time of writing. They seem to be on the roadmap for gnu.bytecode, but I've no idea about org.mozilla.classfile; judging by the revision history, it's not been actively worked on for years.

I imagine this is where many people stop skimming and start reading. Which is silly. The stuff above is far more important. Basically, in all aspects of performance, there's pretty much nothing in it between ASM and gnu.bytecode, as far as my totally unscientific tests have shown.

Size-wise, gnu.bytecode initially seems quite large. It's about a 100KiB JAR file for just gnu.bytecode. asm-3.1.jar, by contrast, is 44KiB, and [my modified variant of] org.mozilla.classfile is 23KiB. But adding together all the ASM JAR files I ended up dragging in, as I realized that to get a semi-convenient API, I needed more and more of the "optional" JAR files, I find I was using 160KiB of ASM JAR files, 60KiB more than I'm using of Kawa. If you really want or need small, though, org.mozilla.classfile is your winner. (If you're scratching your head about why anyone would care about this in 2008, that code needs compiling at run-time, and J2ME users don't necessarily have the space to spare. But the main reason I bring it up is that ASM seems to consider its size a feature. BCEL, I'm led to believe, is relatively huge.)

One reason why org.mozilla.classfile is so small is that it doesn't include a disassembler, and it doesn't include a verifier. While you don't necessarily need either of these in your shipping product, you'll want both during development.

Despite doing slightly more, gnu.bytecode compiles slightly faster than ASM. org.mozilla.classfile is slightly faster still. When I say slightly, though, I mean it. If ASM's too slow for you, I seriously doubt that gnu.bytecode's going to change your life. But again, if you're looking at J2ME or the like, you might want to save every scrap you can.

As I said at the beginning, which is best for you depends on what you're doing. The only sensible "conclusion" I can offer is a summary of some of their best features.

ASM's best features:

  • Only option if you need stack maps.
  • Best verifier (better than BCEL's, too), though not 100% the same as Sun's.
  • Most extensive documentation.
  • Also able to manipulate existing classes; potentially handy if you're not writing a compiler.

gnu.bytecode's best features:

  • Does several compilation-related tasks for you.
  • Ties in with even higher-level libraries if you're interested.
  • Best disassembler (better than javap(1), too).

org.mozilla.classfile's best features:

  • Very little API to learn, very little need for documentation.
  • Tiny code size.
  • MPL1.1/GPLv2 or later.