Generating JVM bytecode 2

Constant pools
If you're compiling a language other than Java, you may find the fact that the class file constant pool can only contain JVM primitive types or instances of java.lang.String somewhat frustrating. I wanted a constant pool for my boxed integers and boxed reals.

Disappointingly, the "getstatic", "bipush", "aaload" to load a boxed real constant from my private static final Object[] home-made "constant pool" seems to perform about the same as the "ldc" and "invokestatic" for boxing a JVM double (using something like java.lang.Double).

If you've got a more expensive constructor, though, it's a win. BigInteger constants benefit, and I imagine Pattern constants would too, if you have regular expression literals in your language. I'm not sure why none of the bytecode generation libraries seem to have any support for this. Surely it's a common enough need? Mine is only 100 lines of code including comments, but that still seems a bit weak.

Strange performance problem of the week
I finally tracked down a huge slowdown that had been dogging me for ages now. My most modest test case (not Java) which finally produced little enough bytecode that I could spot the problem by inspection looked like this:

X1 := 0.1;
for (p := 1.0; p > 0.003; p *= 0.98) {
X2 := 0.1;
for (y := -1.2; y < 1.2; y += 0.07) {
for (x := -1.6; x < 1.0; x += 0.03) {
for (c := 6.0; c < 20.0; ++c) {
tmp := X2;

Running this on Java 6 took about 2s on my machine. Commenting out the definition of X1 (which you'll note is unused) took the run time down to 0.2s. (The types are all effectively java.lang.Double, but that's being inferred by the compiler in this example.)

Running with -Xint increased the run time, but the presence or absence of X1 no longer made a difference.

So why was the unused variable making my program 10x slower? It turned out to be a bug in my code generator. There was no "pop" to get rid of the value of X1 after computing it for its initialization. Leaving that on the operand stack somehow combines with a "sufficiently complicated" method (changing those nested loops to just be one loop that counts to 30,000,000, for example, makes the problem disappear) to upset Hotspot.

I don't really have any particularly useful advice here. I don't know of a tool that spots such mistakes, and I don't know the exact conditions to reproduce the problem (and, since it seems to involve leaving junk on the operand stack, I can't provide a Java example). But if you're having weird performance problems with your generated code, I guess this is one extra avenue to explore.

In my case, Hotspot doesn't appear to compile the method at all. You can see this with the output of +XX:PrintOptoAssembly on a fastdebug build.

Annoyingly vague JVM output of the week
Why is ArrayStoreException so vague? It needs the same battering with the clue stick as those annoying human bug reporters who give you no more to go on than "something's gone wrong". Tell me what type you expected, and what type you saw, damn it! Sun's JVM tells you one of those things, but doesn't make it clear which (and the JavaDoc for the exception doesn't help; I ended up looking at the Hotspot source, which really shouldn't be necessary). jamvm(1) was even less helpful, with no detail message at all. I've sent a patch to the author.

My mistake was the usual gotcha: "anewarray" does what it says on the tin, and creates a new array of the reference type you give it. That is, you don't give it an array type if you want a single-dimensional array. Adding an example to that instruction's section in the JVMS would go a long way.

Building jamvm(1) on Ubuntu
To hack better diagnostics into jamvm(1), I needed to build from source. It took me a while to work out that to get a working VM, you need to configure with --with-classpath-install-dir=/usr (and not /usr/share/classpath). It's clear if you look at the source, but not very clear from the documentation.


Generating JVM bytecode

If you've been wondering where the useful code snippets and hints have been hiding the last few months, the good news is they're back in this post. The bad news is that I'll be talking mostly about objectweb.org's ASM all-purpose Java bytecode manipulation and analysis framework, and don't have any significant experience of the alternatives. When I chose ASM, I'd have loved to read a good comparison of the choices, but I didn't find one. And I'm sad to report that I won't be writing one, at least not right now. Maybe if someone points me to a particularly tempting alternative?

objectweb.org's ASM
I chose ASM because it prides itself on being small and fast, and because it's still frequently maintained. I don't honestly know enough about the competition to know what kind of state they're in, and whether the ASM team's comparative benchmarks are fair and representative. ASM is fast, but I don't know that BCEL (the best-known alternative) would be significantly slower. The documentation for both is pretty weak, but ASM's seems better, and the example code for ASM seems more direct and less verbose.

That said, small does not imply simple, and ASM is a little hairy. The authors went a bit pattern-mad, and it's very "designed" in ways that often seem to be to the detriment of convenience and simplicity. (If you think java.io is awkward, you won't like ASM: it's like java.io, only more so.) In addition, the "convenience" layer in the "commons" package isn't high-level enough that you won't need a good understanding of Java bytecode, but it is far enough from the interface presented by the basic "asm" package (which is a pretty direct mapping), and badly-enough documented that, to be honest, it seems more of a hindrance than a help. I've given up fighting it and am starting to use the underlying functionality instead, even to the extent of reverting some of my existing "commons"-using code. (One problem in particular is that, as in a C++ program that uses stdio and iostreams, you need to be careful that each layer knows what's going on. Now imagine that iostreams was badly documented and seemingly incomplete such that you repeatedly found yourself falling back to stdio. That's the ASM situation. It's possibly nothing that good documentation couldn't fix, but that doesn't help us in the here and now.)

Stick to the basic "asm" package, though, and things don't seem too bad. I do have this sneaking feeling there's an even skinnier guy struggling to get out, though. And I for one would like to make his acquaintance. If you see him, let me know.

You'll want the ASM3 API documentation and it's probably worth looking at the asm-guide.pdf overview, though it's incomplete and, if you're interested in generating bytecode, will seem too focused on processing existing bytecode (ASM supports both, and you'll notice that they describe it as a "manipulation and analysis framework" rather than the skinny "generation framework" I really wanted).

More importantly, you'll need to have read The Java Virtual Machine Specification (2e) by Lindholm and Yellin, which is also available on dead tree, but is equally outdated there. The web version would be greatly improved by a table of contents with direct links to the individual pages for opcodes starting with each letter of the alphabet, and even more so from having the material from the JVMS maintenance page worked in to the main text. (Those PDFs suggest Sun has the source to the book, so it's a real shame they don't re-issue the HTML version or an all-in-one PDF version.)

The only other book on JVM bytecode I've read was Joshua Engel's "Programming for the Java Virtual Machine" from the late 1990s. My only retained memory was that it was full of mistakes, and re-reading it recently, I can confirm that to be a reasonable summary of the book. You can safely ignore it, unless you want to see the most amusing typo in any book covering compilers. You can be sure I'll be building a "poophole optimizer" in every compiler I write. (That's the funniest error, but it's also the most trivial. Most of the code examples contain at least one error, and there are numerous statements that make you wonder to what extent this "acknowledged expert in the Java virtual machine" knew what he was talking about. He certainly didn't know how to pick technical reviewers, that's for sure.)

You know what I'd love? Someone like Sun's John Rose to go over the JVMS "Compiling for the Java Virtual Machine" chapter and tell us what kind of code we should be generating for the modern HotSpot JVM. Which idioms and bytecode-level optimizations are good, and which idioms we should avoid. Especially balanced against issues like code size, verifiability, and future-proofing.

The verifier in Sun's JVM may be great for keeping us (and our programs) out of mischief, but it totally sucks in terms of the quality of its diagnostics. Have a look at OpenJDK's "check_code.c" for the details, but suffice it to say that the errors are lacking in clarity, lacking in detail, lacking in context, and don't even manage to use the usual terminology associated with the JVM and its bytecode. Some messages don't even manage to be valid English sentences. "unitialized" indeed.

Normally this isn't a problem because javac(1) is generating the bytecode and although there have been cases where javac(1) has generated unverifiable bytecode, I can only remember it causing me trouble personally in one instance during the last decade-and-a-bit. If you're writing your own bytecode-generating programs, though, you're a lot more likely to see VerifyError.

Free JVMs
There are a bunch of Free JVMs, and for a laugh I told apt-get(1) to install sablevm, kaffe, and jamvm (in that order, because that was the order in which they came to mind). jamvm(1) turned out to be the most useful, despite the fact (or because of the fact) that it doesn't have a verifier. This, combined with the fact that it seemed to be pretty robust, meant that I could often turn a verification error into a run-time error, and that can actually be useful. The other two weren't much use to me. IIRC, sablevm(1) did verify but with diagnostics no more useful than Sun's, and kaffe(1) mostly crashed on unverifiable code. I've kept jamvm(1) installed because it might come in handy again, but the other two are long gone.

BCEL's verifier
Assuming you get far enough to write a .class file, BCEL includes a fancy verifier called JustIce. I link to a page that proclaims itself obsolete because there's a link on that page to the author's Diplomarbeit, which is potentially useful to you. (It's in English.)

Anyway, apt-get(1) libbcel-java and you can run:

java -cp $bcel_cp org.apache.bcel.verifier.Verifier YourGeneratedClass

JustIce is pretty verbose, which is mildly annoying when it's shouting "EVERYTHING IS OKAY!", but comes in handy when something's broken. Unlike Sun's verifier, you always get a bytecode index, you'll automatically get a disassembly, and you get plain English explanations using the usual JVM bytecode terminology. (Yes, you'll still have to have a rough idea of what you're doing, but I don't see how that's unavoidable. If you're that lost, you should probably be getting someone else to write your bytecode-generating code for you.)

ASM's verifier
ASM also comes with a verifier, and it's less loquacious, but it's under-documented and it took me ages to work out how to make it work on the byte[] from my ClassWriter, despite it being a mere two lines (the ASM 3.1 JavaDoc actually explains this, but earlier versions don't):

byte[] bytecode = ...
ClassReader cr = new ClassReader(bytecode);
CheckClassAdapter.verify(cr, false, new PrintWriter(System.err));

The beauty of this verifier is that you get nice little stack pictures (that show the local slots too). The disassembly is nice and clear, and the explanations are fine too, and are somewhat less wordy than BCEL's.

(Note that, although I show BCEL's verifier being run as an external program and ASM's verifier being run as part of an application, that's only because I'm using ASM and not BCEL. You could equally well run BCEL's verifier as part of your application and ASM's verifier as an external application, and probably should if you're using BCEL.)

It's a shame that ASM's CheckClassAdapter does so little checking for the kind of errors that actually seem likely. I realize error-checking's not free, but ASM feels like it's sacrificed too much. Mostly you'll just see ArrayIndexOutOfBoundsException and NullPointerException exceptions thrown from its bowels. You'd be well advised to "apt-get source asm3" (I didn't work out how to use CheckClassAdapter before reading the source, for example).

Anyway, back to work. I've got bytecode to generate...


.mp3 id3 tags

I mentioned recently that I'd switched to Rhythmbox on Linux for my music needs. One thing I didn't mention was the fruitless struggle I'd had trying to persuade it to let me fix some of the metadata. I've since had some success, which I'd like to share.

First things first: if you're using Rhythmbox, you'll probably be tempted by Rhythmbox's UI for editing the odd title or artist. My needs were borderline between that and needing the command line, but it turns out that using Rhythmbox's UI doesn't edit the tags: it just overrides the copy of the tags in Rhythmbox's own XML store.

But if you don't really care about the quality of your tags, you can just use the Rhythmbox UI and forget about all this id3 tag nonsense. If you're happy with that, stop reading now.

I'd tried using id3(1) on the command-line, and that had worked for some .mp3s but not others. For a long time, I suspected that Rhythmbox just wasn't noticing the change. I tried removing Rhythmbox's on-disk XML "database", but it would regenerate it with the same content. I asked strace(1), and it reckoned there wasn't any secret back-up. I looked for duplicates of the offending .mp3 files, but didn't find any. I looked at the tags of one of the offending files, and they looked fine. Yet Rhythmbox was showing something different. Eventually I actually looked at the .mp3 file itself, using less(1), which clearly showed an incorrect id3 tag at the start of the file.

At the start of the file? I thought the whole thing about id3 tags was that they come at the end of the file? Looking at the end of the file, I saw a different id3 tag, with the new data. Wikipedia's id3 article explains that there are two unrelated versions, id3v1 and id3v2. Sadly, their whole "neutral point of view" thing means they don't tell you which to use.

Rhythmbox, then, can use either id3v1 or id3v2 tags, but given both it silently uses the latter.

id3(1) only handles id3v1 tags and completely ignores id3v2 tags (they aren't shown, and you can't delete them).

id3v2(1), despite the name, is able to handle both kinds of id3 tag. It can convert from id3v1 to id3v2, though doing so won't automatically delete the id3v1 tag. It can remove either or both kinds of tag from a file. It can edit the data in id3v2 tags (but not id3v1 tags).

I've duly apt-get removed id3, apt-get installed id3v2, converted and deleted my id3v1 tags, and fixed the mistakes. Rhythmbox was running while I did this, and refreshed its display as things changed on disk, which was nice of it.

In future, I hope Rhythmbox's UI is changed to actually edit tags. Until then, I'll be using id3v2(1) to remove any stray id3v1 tags I happen to run across and fix any errors.