2012-01-01

Beware convenience methods

The Android documentation links from various methods to "Beware the default locale". It's great advice, but most people don't know they need it.

I've never been a great fan of convenience methods (or default parameters), and methods that supply a default locale are a prime example of when not to offer a [supposed] convenience method.

The problem here is that most developers don't even understand the choice that's being made for them. Developers tend to live in a US-ASCII world and think that, even if there are special cases for other languages, they don't affect the ASCII subset of Unicode. This is not true. Turkish distinguishes between dotted and dotless I, which means Turkish has two capital 'I's (one with and one without a dot) and two lowercase 'i's (one with and one without a dot). Most importantly, it means that "i".toUpperCase() does not return "I" in a Turkish locale.
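To make that concrete, here's a minimal sketch (the class name and the explicit Turkish locale object are just for illustration) of how toUpperCase's result depends on the locale:
import java.util.Locale;

// Sketch only: the class name is illustrative.
public class TurkishI {
  public static void main(String[] args) {
    Locale turkish = new Locale("tr", "TR");
    // In the root locale, 'i' uppercases to 'I' as you'd expect...
    System.out.println("i".toUpperCase(Locale.ROOT)); // "I"
    // ...but in a Turkish locale it uppercases to dotted capital I (U+0130).
    System.out.println("i".toUpperCase(turkish)); // "İ"
    // The no-argument overload silently uses the user's locale, so what it
    // returns depends on the device's settings.
    System.out.println("i".toUpperCase()); // "I" or "İ", depending
  }
}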

The funny thing is, toLowerCase and toUpperCase just aren't that useful in a localized context. How often do you want to perform these operations other than for reflective/code generation purposes? (If you answered "case-insensitive comparison", go to the back of the class; you really need to use String.equalsIgnoreCase or String.CASE_INSENSITIVE_ORDER to do this correctly.)
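For the record, here's a minimal sketch of the correct alternatives (the strings and the map are just for illustration); neither depends on the user's locale:
import java.util.TreeMap;

// Sketch only: class name, strings, and map are illustrative.
public class CaseInsensitive {
  public static void main(String[] args) {
    // Locale-independent case-insensitive comparison of two strings.
    System.out.println("WiFi".equalsIgnoreCase("wifi")); // true, even in a Turkish locale
    // Locale-independent case-insensitive ordering for a sorted container.
    TreeMap<String, Integer> counts = new TreeMap<String, Integer>(String.CASE_INSENSITIVE_ORDER);
    counts.put("Title", 1);
    System.out.println(counts.containsKey("TITLE")); // true
  }
}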

So given that you're doing something reflective like translating a string from XML/JSON to an Enum value, toUpperCase will give you the wrong result if your user's device is in a Turkish locale and your string contains an 'i'. You need toUpperCase(Locale.ROOT) (or toUpperCase(Locale.US) if Locale.ROOT isn't available). Why doesn't Enum.valueOf just do the right thing? Because enum values are only uppercase by convention, sadly.
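Here's a minimal sketch of the XML/JSON-to-enum case (the enum and the input string are hypothetical, not from any real API):
import java.util.Locale;

// Sketch only: the enum and input are hypothetical.
public class EnumParsing {
  enum ConnectionType { WIFI, MOBILE }

  static ConnectionType parse(String s) {
    // s.toUpperCase() would use the user's locale, so on a Turkish device
    // "wifi" becomes "WİFİ" and valueOf throws IllegalArgumentException.
    // Locale.ROOT gives the locale-independent mapping we actually want.
    return ConnectionType.valueOf(s.toUpperCase(Locale.ROOT));
  }

  public static void main(String[] args) {
    System.out.println(parse("wifi")); // WIFI, regardless of the device's locale
  }
}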

(The exception you'll see thrown includes the bad string you passed in, which usually contains a dotted capital I, but you'd be surprised how many people are completely blind to that. Perhaps because monitor resolutions are so high that it looks like little more than a dirt speck: I versus İ.)

In an ideal world, the convenience methods would have been deprecated and removed long ago, but Sun had an inordinate fondness for leaving broken glass lying in the grass.

The rule of thumb I like, that would have prevented this silliness in the first place, is similar to Josh Bloch's rule of thumb for using overloading. In this case, it's only acceptable to offer a convenience method/default parameter when there's only one possible default. So the common C++ idiom of f(Type* optional_thing = NULL) is usually reasonable, but there are two obvious default locales: the user's locale (which Java confusingly calls the "default" locale) and the root locale (that is, the locale that gives locale-independent behavior).

If you think you're safe from these misbegotten methods because you'd never be silly enough to use them, you still need to watch out for anything that takes a printf(3)-style format string. Sun made the mistake of almost always offering those methods in pairs, one of which uses the user's locale. That's fine for formatting text for human consumption, but not for formatting text for computer consumption. Computers aren't tolerant of local conventions regarding the use of ',' as the decimal separator (in Germany, for example, "1,234.5" would be "1.234,5"), and it's surprisingly easy to write out a file yourself and then be unable to read it back in! (There are a lot more of these locales, though, so in my experience these bugs get found sooner. The Enum.valueOf bug pattern in particular regularly makes it into shipping code.)
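Here's a minimal sketch of the distinction using String.format's paired overloads (the value and format string are just for illustration):
import java.util.Locale;

// Sketch only: the value and format string are illustrative.
public class Formatting {
  public static void main(String[] args) {
    double value = 1234.5;
    // For human consumption, the user's locale is what you want.
    String forDisplay = String.format("%.1f", value); // "1234,5" in a German locale
    // For computer consumption, pin the locale so the file can always be parsed back.
    String forFile = String.format(Locale.ROOT, "%.1f", value); // always "1234.5"
    System.out.println(forDisplay + " / " + forFile);
  }
}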

Where locales are concerned, there's really no room for convenience methods. Sadly, there's lots of this broken API around, so you should be aware of it. Especially if you're developing for Android where it's highly likely that you actually have many users in non-en_US locales (unlike traditional Java which ran on your server where you controlled the locale anyway).

2011-09-22

How can I get a thread's stack bounds?

I've had to work out how to get the current thread's stack bounds a couple of times lately, so it's time I wrote it down.

If you ask the internets, the usual answer you'll get back is to look at the pthread_attr_t you supplied to pthread_create(3). One problem with this is that the values you get will be the values you supplied, not the potentially rounded and/or aligned values that were actually used. More importantly, it doesn't work at all for threads you didn't create yourself; this can be a problem if, for example, you're a library called by arbitrary code on an arbitrary thread.

The real answer is pthread_getattr_np(3). Here's the synopsis:
#include <pthread.h>

int pthread_getattr_np(pthread_t thread, pthread_attr_t* attr);

You call pthread_getattr_np(3) with an uninitialized pthread_attr_t, make your queries, and destroy the pthread_attr_t as normal.
pthread_attr_t attributes;
// Note: pthread functions return an error number rather than setting errno;
// stashing the result in errno lets PLOG (which reports errno) say why the call failed.
errno = pthread_getattr_np(thread, &attributes);
if (errno != 0) {
  PLOG(FATAL) << "pthread_getattr_np failed";
}

void* stack_address;
size_t stack_size;
errno = pthread_attr_getstack(&attributes, &stack_address, &stack_size);
if (errno != 0) {
  PLOG(FATAL) << "pthread_attr_getstack failed";
}

errno = pthread_attr_destroy(&attributes);
if (errno != 0) {
  PLOG(FATAL) << "pthread_attr_destroy failed";
}

Note that you don't want the obsolete pthread_attr_getstackaddr(3) and pthread_attr_getstacksize(3) functions; they were sufficiently poorly defined that POSIX replaced them with pthread_attr_getstack(3).

Note also that the returned address is the lowest valid address, so your current stack pointer is (hopefully) a lot closer to stack_address + stack_size than to stack_address. (I actually check that my current sp is in bounds as a sanity check that I got the right values. Better to find out sooner rather than later.)
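That check can be as simple as comparing the address of a local variable (a reasonable approximation of the current stack pointer) against the returned bounds. A sketch, with an illustrative function name:
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

// Sketch only: the function name is illustrative.
void CheckStackBounds(void* stack_address, size_t stack_size) {
  int local_on_stack = 0;
  uintptr_t sp = reinterpret_cast<uintptr_t>(&local_on_stack); // roughly the current sp
  uintptr_t lo = reinterpret_cast<uintptr_t>(stack_address);
  uintptr_t hi = lo + stack_size;
  if (sp < lo || sp >= hi) {
    fprintf(stderr, "stack bounds look wrong: sp not in [stack_address, stack_address + stack_size)\n");
    abort();
  }
}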

Why should you call pthread_getattr_np(3) even if you created the thread yourself? Because what you ask for and what you get aren't necessarily exactly the same, and because you might want to know what you got without wanting to specify anything other than the defaults.

Oh, yeah... The "_np" suffix means "non-portable". The pthread_getattr_np(3) function is available on glibc and bionic, but not (afaik) on Mac OS. But let's face it: any app that genuinely needs to query stack addresses and sizes is probably going to contain an #ifdef or two anyway...

2011-05-04

signal(2) versus sigaction(2)

Normally, I'm pretty gung-ho about abandoning old API. I don't have head space for every crappy API I've ever come across, so any time there's a chance to clear out useless old junk, I'll take it.

signal(2) and sigaction(2) have been an interesting exception. I've been using the former since the 1980s, and I've been hearing that it's not portable and that I should be using the latter since the 1990s, but it was just the other day, in 2010, that I first had an actual problem. (I also knew that sigaction(2) was more powerful than signal(2), but had never needed the extra power before.) If you've also been in the "there's nothing wrong with signal(2)" camp, here's my story...

The problem

I have a bunch of pthreads, some of which are blocked on network I/O. I want to wake those threads forcibly so I can give them something else to do. I want to do this by signalling them. Their system calls will fail with EINTR, my threads will notice this, check whether this was from "natural causes" or because I'm trying to wake them, and do the right thing. So that the signal I send doesn't kill them, I call signal(2) to set a dummy signal handler. (This is distinct from SIG_IGN: I want my userspace code to ignore the signal, not for the kernel to never send it. I might not have any work to do in the signal handler, but I do want the side-effect of being signalled.)

So imagine my surprise when I don't see EINTR. I check, and the signals are definitely getting sent, but my system calls aren't getting interrupted. I read the Linux signal(2) man page and notice the harsh but vague:
The only portable use of signal() is to set a signal's disposition to
SIG_DFL or SIG_IGN. The semantics when using signal() to establish a
signal handler vary across systems (and POSIX.1 explicitly permits this
variation); do not use it for this purpose.

POSIX.1 solved the portability mess by specifying sigaction(2), which
provides explicit control of the semantics when a signal handler is
invoked; use that interface instead of signal().

It turns out that, on my system, using signal(2) to set a signal handler is equivalent to using the SA_RESTART flag with sigaction(2). (The Open Group documentation for sigaction(2) actually gives an example that's basically the code you'd need to implement signal(2) in terms of sigaction(2).) The SA_RESTART flag basically means you won't see EINTR "unless otherwise specified". (For a probably untrue and outdated list of exceptions on Linux, see "man 7 signal". The rule of thumb would appear to be "anything with a timeout fails with EINTR regardless of SA_RESTART", presumably because any moral equivalent of TEMP_FAILURE_RETRY is likely to lead to trouble in conjunction with any syscall that has a timeout parameter.)

Anyway, switching to sigaction(2) and not using the SA_RESTART flag fixed my problem, and I'll endeavor to use sigaction(2) in future.

Assuming I can't stay the hell away from signals, that is.
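For concreteness, here's a minimal sketch of the fix (the signal number and handler name are my illustrations, not necessarily what shipped): install a do-nothing handler without SA_RESTART, so a signalled thread's blocking system call fails with EINTR.
#include <signal.h>
#include <string.h>

// Sketch only; deliberately empty: all we want is the EINTR side-effect of being signalled.
static void WakeupHandler(int unused_signal_number) {
}

static void InstallWakeupHandler() {
  struct sigaction sa;
  memset(&sa, 0, sizeof(sa));
  sigemptyset(&sa.sa_mask);
  sa.sa_handler = WakeupHandler;
  sa.sa_flags = 0; // crucially, no SA_RESTART
  sigaction(SIGUSR1, &sa, NULL); // error checking omitted for brevity
}

// A thread blocked in recv(2) or similar can then be woken with:
//   pthread_kill(blocked_thread, SIGUSR1);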

Alternative solutions

At this point, you might be thinking I'm some kind of pervert, throwing signals about like that. But here's what's nice about my solution: I use a doubly-linked list of pthreads blocked on network I/O, and the linked list goes through stack-allocated objects, so I've got no allocation/deallocation, and O(1) insert and remove overhead on each blocking I/O call. A close is O(n) in the number of threads currently blocked, but in my system n is currently very small anyway. Often zero. (There's also a global lock to be acquired and released for each of these three operations, of course.) So apart from future scalability worries, that's not a bad solution.
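A minimal sketch of that bookkeeping (all names are illustrative, and error handling is omitted): each blocking I/O call stack-allocates a node and links it into a global intrusive list under a lock, and the close path walks the list signalling any thread blocked on the fd being closed.
#include <pthread.h>
#include <signal.h>

// Sketch only: names are illustrative.
struct BlockedThread {
  pthread_t thread;
  int fd;
  BlockedThread* prev;
  BlockedThread* next;
};

static pthread_mutex_t g_blocked_lock = PTHREAD_MUTEX_INITIALIZER;
static BlockedThread* g_blocked_head = NULL;

// O(1): called on entry to a blocking I/O call, with a stack-allocated node.
static void AddBlockedThread(BlockedThread* node) {
  pthread_mutex_lock(&g_blocked_lock);
  node->prev = NULL;
  node->next = g_blocked_head;
  if (g_blocked_head != NULL) g_blocked_head->prev = node;
  g_blocked_head = node;
  pthread_mutex_unlock(&g_blocked_lock);
}

// O(1): called when the blocking I/O call returns.
static void RemoveBlockedThread(BlockedThread* node) {
  pthread_mutex_lock(&g_blocked_lock);
  if (node->prev != NULL) node->prev->next = node->next; else g_blocked_head = node->next;
  if (node->next != NULL) node->next->prev = node->prev;
  pthread_mutex_unlock(&g_blocked_lock);
}

// O(n) in the number of blocked threads: called by the close path.
static void SignalThreadsBlockedOn(int fd) {
  pthread_mutex_lock(&g_blocked_lock);
  for (BlockedThread* p = g_blocked_head; p != NULL; p = p->next) {
    if (p->fd == fd) pthread_kill(p->thread, SIGUSR1);
  }
  pthread_mutex_unlock(&g_blocked_lock);
}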

One alternative would be to dump Linux for that alt chick BSD. The internets tell me that at least some BSDs bend over backwards to do the right thing: close a socket in one thread and blocked I/O on that socket fails, courtesy of a helpful kernel. (Allegedly. I haven't seen BSD since I got a job.) Given Linux's passive-aggressive attitude to userspace, it shouldn't come as a surprise that Linux doesn't consider this to be its problem, but changing kernel is probably not an option for most people.

Another alternative would be to use shutdown(2) before close(2), but that has slightly different semantics regarding SO_LINGER, and can be difficult to distinguish from a remote close.

Another alternative would be to use select(2) to avoid actually blocking. You may, like me, have been laboring under the misapprehension that the third fd set, the one for "exceptional conditions", is for reporting exactly this kind of thing. It isn't. (It's actually for OOB data or reporting the failure of a non-blocking connect.) So you either need to use a timeout so that you're actually polling, checking whether you should give up between each select(2), or you need to have another fd to select on, which your close operation can write to. This costs you up to an fd per thread (I'm assuming you try to reuse them, rather than opening and closing them for each operation), plus at least all the bookkeeping from the signal-based solution, plus it doubles the number of system calls you make (not including the pipe management, or writing to the pipe/pipes when closing an fd). I've seen others go this route, but I'd try incest and morris dancing first.
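For completeness, here's roughly what the extra-fd variant looks like (a sketch, with illustrative names); you can see where the extra fd and the extra system call per blocking operation come from:
#include <sys/select.h>
#include <unistd.h>

// Sketch only: returns 1 if the socket became readable, 0 if we were woken up
// to give up, -1 on error (including EINTR).
static int WaitForDataOrWakeup(int socket_fd, int wakeup_read_fd) {
  fd_set read_fds;
  FD_ZERO(&read_fds);
  FD_SET(socket_fd, &read_fds);
  FD_SET(wakeup_read_fd, &read_fds);
  int max_fd = (socket_fd > wakeup_read_fd) ? socket_fd : wakeup_read_fd;
  if (select(max_fd + 1, &read_fds, NULL, NULL, NULL) == -1) {
    return -1;
  }
  if (FD_ISSET(wakeup_read_fd, &read_fds)) {
    char unused;
    read(wakeup_read_fd, &unused, 1); // drain the byte written by the close path
    return 0;
  }
  return 1;
}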

I actually wrote the first draft of this post last August, and the SA_RESTART solution's been shipping for a while. Still, if you have a better solution, I'd love to hear it.

2009-08-21

Java on a thousand cores

Cliff Click wrote a recent blog post that started "warning: no technical content", which wasn't true. Buried 3/4 of the way through was a link to the PDF slides for the "Java on 1000 cores" talk he's been giving recently. I was in the audience of the specific instance of the talk he mentioned, but if you weren't, you can now watch the video on YouTube.

I tend to avoid talks because so many of them are a complete waste of time. This one, though, was so good it had me thinking about why I've mostly given up on talks.

For one thing, most speakers speak far too slowly. I know one of the rules everyone's taught about public speaking is to speak more slowly: because you're nervous you'll speak faster than you realize, and that sounds like you're rushing, and that's bad. Which might have made sense in the 1700s when public speaking likely meant boring your congregation to sleep every Sunday morning, where the point was less the content than the submission to authority. Going slow might make sense if you're trying to educate beginners (though there I think repetition and usage is what matters, not slow presentation). I can even admit that going slow is downright necessary if you're walking someone through a physical procedure, but it's an awful way to present summaries or reviews (in the sense of "review paper"). And most talks are, one way or another, summaries or reviews.

There's also a less literal way to speak too slowly: few speakers have a good feel for how many words any particular point deserves. You know these people. You've been in their talks. These people have their four obvious bullet points on their slide, four bullet points that you grasp in the second or two after the slide is up, and somehow they manage to spend five minutes laboriously wading their way through these points while adding nothing to your understanding. And all the time, you stare at the screen willing the next slide to appear. Wishing you'd brought your Apple remote. (They're not paired by default, you know.)

Don't be afraid to have a lot of stuff on your slides, either. The people who tell you not to have more than a couple of points per slide have nothing to say. At best, they're salesmen. (See Steve Jobs' slides and presentations. Things of beauty both, but designed for selling, not transfer of technical information and ideas.) The naysayers also assume you're making all the usual mistakes, when they should instead tell you to speak fast and only speak useful words, in which case it won't matter how much stuff is on each slide, because no-one will be bored enough that they're sat there just reading the slides. If they're not listening, taking away the slides is not a fix for your problem.

Another bad habit is not asking anything of your audience. I don't mean questions. I mean prerequisites. Seriously, those five people at your talk about building STL-like containers who don't know what the STL is? Tell them to fuck off and stop wasting everybody's time. Or, if you think that's a bit much, just start with "this talk assumes a basic familiarity with X, Y, and Z; if you don't have this, I'm afraid you probably won't get much out of this talk". You're not helping anyone by spending ten minutes rambling on about stuff that anyone who's going to get anything out of your talk already knows anyway. Unless you're someone like Steve Jones, you probably don't need to explain science to the layman. It's much more likely you're talking to an audience of your peers, and you should respect their time as you'd expect them to respect yours.

Also, please don't waste my time with talks where the only thing I learn is how much you like the sound of your own voice, or talks that only exist because you get some kind of merit badge for having given a talk. In the former case, get a blog. Then people can decide for themselves whether they like the sound of your voice, and subscribe if they do and ignore you if they don't. In the merit badge case, corporate life is full of bogus achievements; I'm sure you can find one that minimizes waste of other people's time. Hell, if it helps, I'll make you a little "I didn't waste anyone's time giving a gobshite talk" certificate.

Et cetera.

Anyway, getting back to Cliff Click's talk... his was a great example of what talks should be like. His content was interesting and concentrated. He assumed his audience knew the basics of the areas he was talking about. He spoke fast. Fast enough that I was often still digesting his previous point while he was on to the next. (And for all you MBAs: this is a good thing. Better too much content than too little. I can always re-read the slides/re-watch the video afterwards. And sometimes thinking about the last thing I was interested in helps me coast through a bit I'm less interested in.)

One thing I particularly like about Cliff Click's stuff in general is his practitioner's point of view. He asks the important practical questions: "Have you successfully built one?", "Did it do what you expected?", "What is it good for?", "What isn't it good for?", "Is it worth it?".

As for this particular talk, there were several new ideas to me (I hadn't heard of the optimistic escape detection [as opposed to escape analysis] he mentioned in passing, for example), several things I found surprising (that write buffers turned out to be unimportant with Azul's CLZ [unrelated to ARM's CLZ], for example), and a lot of familiar stories about mixed hardware-software designs, interesting mainly because it hadn't occurred to me that they might be common to all hardware-software companies.

Plus it was a model of how to give a good talk.