2011-05-29

Language-induced brain damage is better than the alternative

Whenever old uncle Edsger had had a skinful, he'd rant about the harmful effects of bad languages on programmers. As a kid, I was never particularly convinced. He was too old to be all that concerned about C, and Ada hadn't even been invented yet, let alone Java, so his rants about Fortran, PL/I, Cobol and APL all seemed comically anachronistic. The languages he'd talk about were all pretty moribund anyway, at least for the purposes of a software engineer (as opposed to, say, a physicist).

BASIC, his other bête noire, never really seemed that bad. I grew up with a pretty good BASIC, the main deficiency of which seemed to be the lack of a garbage collected heap (except for strings, which were special). Even as a kid, it was pretty clear that fixed-size arrays were evil, so it was distressing that one's language forced one into the habit. But to me, that disproved Ed's claim: I was clearly a child of the BASIC era, and yet wasn't I sophisticated enough to recognize the problems? Wasn't this ability to recognize the flaws of one's language almost a test of one's latent aptitude, and thus useful in distinguishing those with real potential from those without?

In the years since, I've used a lot of languages. It's hard to imagine a well-rounded programmer who hasn't been exposed to an assembly language, a Lisp, Haskell or an ML, SQL, C, C++, and a Java-like managed language. And it's probably a shame too few encounter a Smalltalk. Even if you don't like these languages, and even if you wouldn't use them commercially, I think they each influence the way we think about computation and programming. And I think that that makes us better at doing our jobs, regardless of which language we're actually using.

(I deliberately omitted logic programming languages -- both deductive and the even less common inductive -- because if they did have an effect on me or my thinking, I've absolutely no idea what it was, and if they didn't I've absolutely no idea what I've missed.)

So it seems to me like there's a trade-off. Yes, learning a new class of language will change the way you think, but it will be both for better and worse. I don't think you can avoid this, and I think that deliberately remaining ignorant is worse than just accepting the mental scarring as a fact of life. Hell, I even think that learning absolutely appalling languages like Ada, S, and Javascript is an important learning experience. Those who cannot remember the past are condemned to repeat it.

But what I think is really interesting, and another reason it was hard to believe Ed's claim, is that pretty much by definition you can't see the damage a language does to you as clearly as you can see the good. You're likely to remember that language X taught you Y, but you don't even know that it failed to expose you to Z. So back in my BASIC days, I never bemoaned the lack of a sequence type or a map type. I almost missed the former, but would have been over-specific in my demands: I wanted to dynamically size arrays. What I thought I wanted was something like C's realloc(3), not C++'s std::vector. It wasn't until I was a C programmer and had realloc(3) that I realized how small an advance that is, and it wasn't until I was a C++ programmer that I realized that, really, I wanted a managed heap. (Not always, of course, because someone has to implement the managed language's runtime, but another thing that learning plenty of languages teaches you is the importance of always using the highest-level one you can afford for any given task.)
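To make that concrete, here's a minimal sketch (an invented example, obviously, not code from anything real) of what realloc(3)-style "dynamically sized arrays" actually entail: all the capacity bookkeeping and failure handling that a growable container or a managed heap would otherwise do for you.

#include <stdio.h>
#include <stdlib.h>

int main(void) {
    size_t capacity = 8;
    size_t count = 0;
    int* values = malloc(capacity * sizeof(int));
    if (values == NULL) return EXIT_FAILURE;
    for (int i = 0; i < 1000; ++i) {
        if (count == capacity) {
            /* The "advance": you can grow, but you pick the policy, you handle
             * failure, and you mustn't lose the old block if realloc fails. */
            size_t new_capacity = capacity * 2;
            int* new_values = realloc(values, new_capacity * sizeof(int));
            if (new_values == NULL) {
                free(values);
                return EXIT_FAILURE;
            }
            values = new_values;
            capacity = new_capacity;
        }
        values[count++] = i;
    }
    printf("stored %zu ints\n", count);
    free(values);
    return EXIT_SUCCESS;
}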

I was reminded of this topic recently when someone sent round a link to a Javascript x86 emulator. The interesting part to me was Javascript Typed Arrays. Javascript is very definitely in the class of languages that I'd never voluntarily use, but that doesn't mean I'm not interested to see what they're up to. And, as maintainer of an implementation of java.nio buffers, I was interested to see the equivalent functionality that Javascript users are getting.

If you don't know java.nio buffers, they're one of Java's many ill-conceived APIs. I say this as a fan of managed languages in general, and a long-time Java user, but having both used and implemented java.nio buffers, there's very little love lost between me and them. They're almost exactly not what I would have done. Surprisingly to me, given my admitted dislike of Javascript, Javascript's typed arrays are pretty much exactly what I would have done.

If I were asked to point to the most damaging design error in java.nio buffers, it would be one that I think was a side-effect of the kind of brain damage that C causes. Specifically, I contend that C programmers don't usually have a clear mental distinction between containers and iterators. I think that was one of the things that C++'s STL really taught us: that containers and iterators (and algorithms) are distinct, and that it's important to maintain these distinctions to get a high-quality library. The design of ICU4C suffers greatly from an ignorance of this idea (ICU4C is the C/C++ equivalent of the heinous java.text classes and such all-time API war crimes as java.util.Calendar, brought to you by the same people).
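If that sounds abstract, here's the distinction in C terms (an invented sketch, nothing to do with ICU4C's or java.nio's actual code): the first struct bakes a current position into the container itself, so two readers can't share it without trampling on each other; the second keeps the container dumb and puts the position in a separate, cheap cursor.

#include <stddef.h>

/* Conflated: the "container" carries an implicit position. */
struct conflated_buffer {
    const char* data;
    size_t size;
    size_t position;  /* iterator state hiding inside the container */
};

/* Separated: the container is just the data... */
struct buffer {
    const char* data;
    size_t size;
};

/* ...and each reader gets its own independent cursor. */
struct cursor {
    const struct buffer* buffer;
    size_t position;
};

static int cursor_next(struct cursor* c) {
    if (c->position == c->buffer->size) return -1;  /* end of data */
    return (unsigned char) c->buffer->data[c->position++];
}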

Java programmers ought not to be ignorant of this important lesson, but it took two attempts to get half-decent collections in the library (three if you count the late addition of generics), and iteration has been such a dog's breakfast in Java that I don't think the lesson to students of Java is nearly as clear as it is to students of C++.

(Dog's breakfast? Enumeration versus Iterator versus int indexes; raw arrays versus collections; the awful and verbose Iterator interface; and most of all the modern Iterable turd, which makes the "enhanced" for loop less generally useful than it should have been and encourages the confusion between collections and iterators because the modern style involves an anonymous and invisible iterator. From arguments I've had with them, I think those responsible were hampered by the brain damage inflicted by C and their ignorance of C++, an ignorance of which they're bizarrely boastful.)

But java.nio buffers are far far worse. There, rather than offering any kind of iterators, the collections themselves (that is, the buffers) have an implicit position. (Buffers have other state that really belongs in an iterator, and that is inconsistently inherited by related buffers, but that's beyond the scope of this discussion.) You can simulate iterators by making new buffers (with Buffer.duplicate, say) but it's awkward and ass-backward, leading to ugly and intention-obscuring calling code, and leading to another generation of programmers with this particular kind of brain damage.

(At this point you might argue that the ideal model is one of collections and ranges rather than iterators, since C++ iterators do tend to come in pairs, and from there you might argue that a range is really just another way of expressing a view, and from there that a view is best expressed as a collection, and from there that the containers-are-iterators model I'm complaining about actually makes sense. It's one of those "how did we get into Vietnam"-style arguments, where any individual step isn't entirely unreasonable in itself, but where the final result is utterly fucked. The problem here being not so much a land war in Asia as having all collections carry an implicit position to support iterator-style usage. Which in practice means that you've got a new orthogonal axis of "constness" to worry about, and that it's a lot harder to share containers. It's actively working against what I think most people consider to be one of the big lessons Java taught us: design for immutability. In a functional language, always working with views and relying on referential transparency might be fine, but Java is not that language, and many of the mistakes in the collections API are, I think, down to trying to pretend that it is. Which I hope makes it clear that I'm not blaming C any more than I'm blaming Haskell: I'm just giving examples of mistakes caused by transferring concepts into situations where they no longer make sense.)

The Javascript DataView class is sorely missing from java.nio's mess too. It's a really common use case that's very poorly served by java.nio. My java.nio implementation has something similar internally, but it's really nice to see the Javascript guys exposing what appears to be a lean, sane, task-focused API.

I do think there's a really nice comparative programming languages book to be written, and I think one of the most interesting chapters would be the one about iteration styles. I don't know whether it's surprising that something so fundamental should differ so wildly between languages (I didn't even touch on the Smalltalk style, which is something completely different again from any of the styles I did touch on), or whether it's exactly in the fundamentals that you'd expect to find the greatest differences.

If this is the brain-damage uncle Ed was so keen to warn me about, all I can say is "bring it on!". As far as I can tell, more different kinds of brain damage seems to lead to better outcomes than staying safe at home with the first thing that ever hit you over the head.

2011-05-19

Fuck Sony

So the only reason I use my PS3 any more is for Netflix. It sucks hard when it comes to anything else. It's a crap DVD player compared to the 360, cross-platform games are usually better on the 360, the platform exclusives are usually more interesting on the 360, and the PS3 doesn't even work as reliably with my Harmony remote as the 360 does.

And that's ignoring the PS3's UI shitfest.

There's nothing wrong with the 360's Netflix player (though it does tend to lag behind the PS3's because it's beholden to Microsoft's slow OS update schedule rather than downloading each time you start it). I use the PS3 for Netflix because I resent paying Microsoft $60/year protection money for Netflix on the Xbox 360.

The flaw in my logic is that paying Microsoft's protection money only upsets me one day each year. Using the PS3 upsets me every single night.

If I'm not being forced to install some stupid update that benefits Sony but not me, I'm being forced to fail to log in twice to their broken online gaming network so I can use Netflix, the only thing on the PS3 that doesn't suck.

Tonight, I'm told my password is no longer valid. (A not unreasonable thing to declare when your broken online gaming network has just lost millions of your users' passwords and credit card numbers.) I'm told that a link has been mailed to my email address. I wait for it to arrive, click it, and get:
PLAYSTATION

Site Maintenance Notice

The server is currently down for maintenance.

We apologize for the inconvenience. Please try again later.

(The same notice follows in Japanese: "Maintenance notice: the server is currently undergoing maintenance. We apologize for the inconvenience; please try connecting again later.")

So now my PS3 is completely useless. I've no idea when they'll fix this, but if it's anything like as fast as they fixed their last network problem, I've got about 30 days to wait.

Typical fucking Sony bullshit. And I'm finally sick of it. The PS3 is dead to me. I don't know whether I'm actually going to support Microsoft's protection racket, but I'd rather watch Netflix on my Nexus S than put up with any more of the PS3's passive-aggressive nonsense.

2011-05-15

Ubuntu 11.04

If Mac OS is the continuing evolution of Steve Jobs' vision of how we should use our computers, it's becoming increasingly clear that Ubuntu is Mark Shuttleworth's indirect request that we all just fuck off and get ourselves an OS from someone who actually gives a shit.

Rise
I was a big fan of Ubuntu in the beginning. I liked Debian in principle, but hated having to choose between the "stable" and "testing" branches, the former of which was literally years out of date, while the latter was too unstable for my taste (leading me to dub the choice "stale" or "broken"). Ubuntu at the time seemed to strike a happy medium: a reasonably well-tested 6-month snapshot of Debian "testing". As far as I recall, my only real complaint in the early days was that its color scheme had been decided upon by someone we can only assume was legally blind. Turd brown with safety orange highlights: no sighted person's first choice.

It also seemed, in those early days, as if Canonical was adding some value. They were acting as editors, shielding us from the petty internecine freetard religious wars. So, for example, those of us who just wanted to be able to play our mp3s didn't have to have an opinion on exactly which of the 100 not-quite-shippable music apps to choose, nor did we have to trawl through them all trying to find one that we'd consider acceptable: someone at Canonical had made a good-enough choice for us.

Decline...
Then things turned bad. Each release was less stable than the last. Only the LTS ("Long Term Support") releases were even half-way reasonable, and then they started fucking them up too, changing major components shortly before release, swapping in things that couldn't be considered stable. (And, of course, the user who restricts themselves to LTS releases gets to relive the old Debian "stable" days. Given that Debian is no longer as pathologically bad at shipping as it once was, such a user would have to ask themselves "why not Debian?".)

The usual volunteer disease afflicted Ubuntu too: people would only work on stuff that interested them. Which basically means that the same three components (window manager, desktop, music system) get rewritten over and over and over, each one being replaced before it can actually mature to the state where an impartial observer might call it good.

...and Fall
And now we have Ubuntu 11.04. The worst release yet. A release so bad even noted free software apologist Ryan Paul hates it.

I've no idea what the underlying stuff is like, because the surface layer of crap is so bad that it's taken away all my will to use it, and I'm spending my time surfing the web trying to decide which other distro to jump ship for. (Presumably Debian, but if I'm going to go to all the trouble of reinstalling, I may as well do the legwork.)

Misguided netbook focus
What sucks? There's yet another implementation of a dock from someone who appears to know nothing of the competition beyond what can be gleaned from screenshots. The old task-bar-thing has moved to the top of the screen (and apparently can't be moved back). The old menus are gone, and so are the buttons representing windows (the latter of which never worked very well anyway, compared to Mac OS or Windows). My system monitor and weather thingies disappeared (and if they can be added back, it's not in any way I can find), the rather nice world map used for the world clock is gone, and my launcher for Chrome was replaced by Firefox and random crap I've never used, like OpenOffice (and if I can add my own launchers, I couldn't work out how). The replacement for the apps menu appears to be an enormous search box that -- despite using almost a quarter of the area of my 30" display -- somehow only manages to show four apps at a time.

(Despite all this upheaval, there's no attempt to introduce users to the new system.)

The reason for moving the task-bar-thing to the top of the screen is that they've tried to switch to a Mac-style per-screen menu bar (rather than a Windows-style per-window menu bar). Unfortunately, this doesn't work with any of the apps I actually use. The only thing I've found that it does work with is the on-line help, which I tried to use, but which inexplicably starts in some kind of full-screen mode, making it really frustrating to actually follow its instructions.

I'm sure some of this must be semi-reasonable on a 10" netbook screen, but I can only assume that none of the freetards responsible was able to get their mothers to buy them a 30" display. For example, even on Mac OS, the per-screen menu doesn't work very well on a 30" display. The screen's just too damned big.

ChromeOS: the netbook done right
But why would I be running Ubuntu on a netbook? Why wouldn't I be running ChromeOS? The only reason I can think of is if the netbook was my only computer. But that would be pretty stupid for the kind of person who even considers Linux. Sure, I have the Linux kernel on my Android phone, my Android tablet, my ChromeOS netbook (sorry, "Chromebook"), and my big-ass make -j16 desktop. But there's only one of those devices I'd consider using a Linux distro or desktop on, and honestly that's only for lack of an alternative.

I was hugely skeptical of ChromeOS until I acquired a Cr-48 and started using it. It's replaced my MacBook Pro at work. It hasn't replaced any of my Android devices, nor my work or home desktops, but that's fine and hardly unexpected. An Android-based netbook might be an interesting idea, but it would represent a different trade-off. For example, ChromeOS' multi-account model is its multi-user model. Pro: you can safely let friends or strangers log in to your Chromebook. Con: if you personally have multiple accounts (one for work, one for talking to the wife, and one for talking to the mistress, say), it's awkward to switch between them because you have to actively log back in. Android doesn't have a multi-user model, but supports multiple accounts being logged in simultaneously. Pro: you don't have to log in and out. Con: you can't log in and out, so an Android device is something you no more want to hand out than you would your wallet.

This whole Ubuntu netbook mania just seems like a way to screw your real users with no realistic hope of gaining new users. Not happy ones, anyway. Sadly, it looks like we're going to have this stuff forced down our throats whether we like it or not; GNOME Shell looks to be pretty much the same.

A work-around
As a work-around until you install something less lossy, here's how to go back to the pre-11.04 desktop. Click the "power off" button to get to "System Settings". Why wasn't I able to find that myself? I must be stupid, not trying the "power off" button!

2011-05-04

signal(2) versus sigaction(2)

Normally, I'm pretty gung-ho about abandoning old API. I don't have head space for every crappy API I've ever come across, so any time there's a chance to clear out useless old junk, I'll take it.

signal(2) and sigaction(2) have been an interesting exception. I've been using the former since the 1980s, and I've been hearing that it's not portable and that I should be using the latter since the 1990s, but it was just the other day, in 2010, that I first had an actual problem. (I also knew that sigaction(2) was more powerful than signal(2), but had never needed the extra power before.) If you've also been in the "there's nothing wrong with signal(2)" camp, here's my story...

The problem

I have a bunch of pthreads, some of which are blocked on network I/O. I want to wake those threads forcibly so I can give them something else to do. I want to do this by signalling them. Their system calls will fail with EINTR, my threads will notice this, check whether this was from "natural causes" or because I'm trying to wake them, and do the right thing. So that the signal I send doesn't kill them, I call signal(2) to set a dummy signal handler. (This is distinct from SIG_IGN: I want my userspace code to ignore the signal, not for the kernel to never send it. I might not have any work to do in the signal handler, but I do want the side-effect of being signalled.)
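Concretely, the scheme looked roughly like this (a sketch with invented names rather than the real code): a do-nothing handler for some wake-up signal, pthread_kill(3) from the thread doing the close, and an EINTR check in the blocking call.

#include <errno.h>
#include <pthread.h>
#include <signal.h>
#include <sys/socket.h>
#include <sys/types.h>

#define WAKEUP_SIGNAL SIGUSR1  /* invented for this sketch */

static void wakeup_handler(int sig) {
    (void) sig;  /* deliberately empty: we only want the EINTR side-effect */
}

static void install_wakeup_handler(void) {
    signal(WAKEUP_SIGNAL, wakeup_handler);  /* the interesting line, as we'll see */
}

/* The blocking thread: EINTR means "check whether you're being woken". */
static ssize_t interruptible_recv(int fd, void* buf, size_t len) {
    ssize_t n = recv(fd, buf, len, 0);
    if (n == -1 && errno == EINTR) {
        /* The real code checks per-thread state here to distinguish
         * "natural causes" from a deliberate wake-up. */
    }
    return n;
}

/* The closing thread wakes anyone blocked on the fd it's about to close. */
static void wake(pthread_t blocked_thread) {
    pthread_kill(blocked_thread, WAKEUP_SIGNAL);
}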

So imagine my surprise when I don't see EINTR. I check, and the signals are definitely getting sent, but my system calls aren't getting interrupted. I read the Linux signal(2) man page and notice the harsh but vague:
The only portable use of signal() is to set a signal's disposition to
SIG_DFL or SIG_IGN. The semantics when using signal() to establish a
signal handler vary across systems (and POSIX.1 explicitly permits this
variation); do not use it for this purpose.

POSIX.1 solved the portability mess by specifying sigaction(2), which
provides explicit control of the semantics when a signal handler is
invoked; use that interface instead of signal().

It turns out that, on my system, using signal(2) to set a signal handler is equivalent to using the SA_RESTART flag with sigaction(2). (The Open Group documentation for sigaction(2) actually gives an example that's basically the code you'd need to implement signal(2) in terms of sigaction(2).) The SA_RESTART flag basically means you won't see EINTR "unless otherwise specified". (For a probably untrue and outdated list of exceptions on Linux, see "man 7 signal". The rule of thumb would appear to be "anything with a timeout fails with EINTR regardless of SA_RESTART", presumably because any moral equivalent of TEMP_FAILURE_RETRY is likely to lead to trouble in conjunction with any syscall that has a timeout parameter.)
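Roughly speaking (this isn't the Open Group's actual code, just the shape of it), that equivalence looks like this, and the SA_RESTART is the part that bit me:

#include <signal.h>

typedef void (*handler_t)(int);

/* A rough implementation of signal(2) in terms of sigaction(2). */
static handler_t my_signal(int sig, handler_t handler) {
    struct sigaction new_action, old_action;
    new_action.sa_handler = handler;
    sigemptyset(&new_action.sa_mask);
    new_action.sa_flags = SA_RESTART;  /* silently asked for on my system */
    if (sigaction(sig, &new_action, &old_action) == -1) {
        return SIG_ERR;
    }
    return old_action.sa_handler;
}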

Anyway, switching to sigaction(2) and not using the SA_RESTART flag fixed my problem, and I'll endeavor to use sigaction(2) in future.
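For completeness, the fix is just to say what you mean (the same invented WAKEUP_SIGNAL and handler as in the earlier sketch):

#include <signal.h>
#include <string.h>

#define WAKEUP_SIGNAL SIGUSR1  /* as in the earlier sketch */

static void wakeup_handler(int sig) {
    (void) sig;  /* still deliberately empty */
}

static void install_wakeup_handler(void) {
    struct sigaction sa;
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = wakeup_handler;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;  /* crucially, no SA_RESTART */
    sigaction(WAKEUP_SIGNAL, &sa, NULL);
}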

Assuming I can't stay the hell away from signals, that is.

Alternative solutions

At this point, you might be thinking I'm some kind of pervert, throwing signals about like that. But here's what's nice about my solution: I use a doubly-linked list of pthreads blocked on network I/O, and the linked list goes through stack-allocated objects, so I've got no allocation/deallocation, and O(1) insert and remove overhead on each blocking I/O call. A close is O(n) in the number of threads currently blocked, but in my system n is currently very small anyway. Often zero. (There's also a global lock to be acquired and released for each of these three operations, of course.) So apart from future scalability worries, that's not a bad solution.
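To make "not a bad solution" concrete, the bookkeeping is something like this sketch (invented names; the real thing obviously does more):

#include <pthread.h>
#include <signal.h>
#include <stddef.h>

/* One of these lives on the stack of each thread for the duration of a
 * blocking I/O call, so there's no allocation and insert/remove are O(1). */
struct blocked_thread {
    pthread_t thread;
    int fd;
    struct blocked_thread* prev;
    struct blocked_thread* next;
};

static pthread_mutex_t g_lock = PTHREAD_MUTEX_INITIALIZER;
static struct blocked_thread* g_head = NULL;

static void blocked_thread_add(struct blocked_thread* bt) {
    pthread_mutex_lock(&g_lock);
    bt->prev = NULL;
    bt->next = g_head;
    if (g_head != NULL) g_head->prev = bt;
    g_head = bt;
    pthread_mutex_unlock(&g_lock);
}

static void blocked_thread_remove(struct blocked_thread* bt) {
    pthread_mutex_lock(&g_lock);
    if (bt->prev != NULL) bt->prev->next = bt->next; else g_head = bt->next;
    if (bt->next != NULL) bt->next->prev = bt->prev;
    pthread_mutex_unlock(&g_lock);
}

/* Closing an fd is O(n) in the number of currently blocked threads. */
static void wake_threads_blocked_on(int fd, int wakeup_signal) {
    pthread_mutex_lock(&g_lock);
    for (struct blocked_thread* bt = g_head; bt != NULL; bt = bt->next) {
        if (bt->fd == fd) pthread_kill(bt->thread, wakeup_signal);
    }
    pthread_mutex_unlock(&g_lock);
}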

One alternative would be to dump Linux for that alt chick BSD. The internets tell me that at least some BSDs bend over backwards to do the right thing: close a socket in one thread and blocked I/O on that socket fails, courtesy of a helpful kernel. (Allegedly. I haven't seen BSD since I got a job.) Given Linux's passive-aggressive attitude to userspace, it shouldn't come as a surprise that Linux doesn't consider this to be its problem, but changing kernel is probably not an option for most people.

Another alternative would be to use shutdown(2) before close(2), but that has slightly different semantics regarding SO_LINGER, and can be difficult to distinguish from a remote close.
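A sketch of that alternative, for comparison:

#include <sys/socket.h>
#include <unistd.h>

/* shutdown(2) wakes any thread blocked on the socket, but note the caveats
 * above: it interacts with SO_LINGER differently from a plain close(2), and
 * the woken reader can't easily tell this apart from a remote close. */
static void wakeful_close(int fd) {
    shutdown(fd, SHUT_RDWR);
    close(fd);
}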

Another alternative would be to use select(2) to avoid actually blocking. You may, like me, have been laboring under the misapprehension that the third fd set, the one for "exceptional conditions", is for reporting exactly this kind of thing. It isn't. (It's actually for OOB data or reporting the failure of a non-blocking connect.) So you either need to use a timeout so that you're actually polling, checking whether you should give up between each select(2), or you need to have another fd to select on, which your close operation can write to. This costs you up to an fd per thread (I'm assuming you try to reuse them, rather than opening and closing them for each operation), plus at least all the bookkeeping from the signal-based solution, plus it doubles the number of system calls you make (not including the pipe management, or writing to the pipe/pipes when closing an fd). I've seen others go this route, but I'd try incest and morris dancing first.
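For completeness, here's the shape of that solution as a sketch (invented names again): each thread select(2)s on both its socket and the read end of a per-thread wake-up pipe, and whoever does the close writes a byte to that pipe.

#include <sys/select.h>
#include <unistd.h>

/* Returns 1 if fd is readable, 0 if we were deliberately woken via the pipe,
 * -1 on error. wakeup_read_fd is the read end of a per-thread pipe, reused
 * across calls (the real code would make it non-blocking). */
static int wait_for_readable(int fd, int wakeup_read_fd) {
    fd_set read_fds;
    FD_ZERO(&read_fds);
    FD_SET(fd, &read_fds);
    FD_SET(wakeup_read_fd, &read_fds);
    int max_fd = (fd > wakeup_read_fd) ? fd : wakeup_read_fd;
    if (select(max_fd + 1, &read_fds, NULL, NULL, NULL) == -1) {
        return -1;
    }
    if (FD_ISSET(wakeup_read_fd, &read_fds)) {
        char buf[8];
        (void) read(wakeup_read_fd, buf, sizeof(buf));  /* drain the wake-up byte(s) */
        return 0;
    }
    return 1;  /* the socket is readable; recv(2) won't block now */
}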

I actually wrote the first draft of this post last August, and the SA_RESTART solution's been shipping for a while. Still, if you have a better solution, I'd love to hear it.