2005-07-17

Book: "Exceptional C++ Style"

I'm a Java programmer. I might have been employed as a C++ programmer for the last 5 years, but I still write a lot of Java, I still care more about Java, and I still think like a Java programmer. Non-type template parameters? Why would you give up the run-time configurability?

The main thing I've got from working in C++ is greater respect for the zero-overhead principle. (In Stroustrup's own words: "what you don't use, you don't pay for" and "what you use can be implemented without overhead compared to hand coding".) I like the idealism of "let's do the right thing, and then worry about how to reduce the overhead", but I've come to rather like the honesty and discipline that the zero-overhead principle encourages. It can be a useful additional constraint when I'm struggling to choose between alternative solutions.

This doesn't change the fact that C++ is a pain in the ass, and that's the reason why there are so many books about avoiding C++-induced rectal trauma.

When I first came (back) to C++ after Java, I read "Effective C++", "More Effective C++", "Exceptional C++", "More Exceptional C++", and "Effective STL" in what seems like quick succession. I may have disliked both authors' weak attempts at humor, but the books were useful and interesting.

It used to bother me that the authors didn't seem particularly critical of misfeatures in the language or library; it offended me slightly that they gave the impression of being the kind of people who delight in the obscure and tricksy. The last kind of person you want to have to work with. But these are also probably the kinds of people most likely to be attracted to C++, to stick with C++, and to have a desire to write books about C++.

"Exceptional C++ Style" seems a little more reflective than the C++ books of old. It does contain items that are the old-style "here's some code; spot the mistakes". Item 1 "Uses and Abuses of vector", for example, which gets the book off to a boring start. There's also an increasing amount of sameness. We've seen this very item before, if not in one of Sutter's other books, then in "Effective STL".

Items 2 and 3, though, suddenly get much better. They take you on a tour of the alternatives to sprintf, and the trade-offs. Best of all, there's even mention of boost::lexical_cast. As I've said before, C++ without boost is like crossing the desert without a camel. It's nice to see it given the same status as the rather basic standard library.
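For a feel of the trade-offs, here's a minimal sketch of two of the standard-library alternatives to sprintf. The helper names are made up for illustration and aren't from the book, and boost::lexical_cast is left out only to keep the example free of dependencies.

```cpp
#include <cstdio>
#include <sstream>
#include <string>

// Hypothetical helper names, for illustration only.

// snprintf: compact, and bounded by the buffer size, but still not type-safe
// (the format string and the argument type have to agree).
std::string format_with_snprintf(int value) {
    char buffer[32];  // comfortably large enough for any int in decimal
    std::snprintf(buffer, sizeof buffer, "%d", value);
    return std::string(buffer);
}

// std::ostringstream: type-safe and with no fixed length limit, at the cost
// of more machinery.
std::string format_with_stringstream(int value) {
    std::ostringstream out;
    out << value;
    return out.str();
}
```

Both produce the same text for the same int; the difference is in what can go wrong and how much you pay for the safety.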

Items 9 and 10 cover export, and why even if your compiler implemented it, it's not what you want anyway.

Item 13 "A Pragmatic Look at Exception Specifications" is a C++ analog to the Java advice "don't use checked exceptions". Hopefully whoever designs the next successful language will have used C++ and/or Java enough to realize that this is another of those ideas that seems great on paper but just doesn't work out in practice. And this is what I mean when I say this book seems more reflective than earlier similar books: it's able to take a long look back at something that's been implemented and used for years now, and ask "how did that work out?".

Two highlights of the middle section are item 18, an argument against public virtual (non-final, in Java terms) functions, and item 19, on enforcing rules for derived classes, including the static-typer's guideline of "prefer compile-time errors to run-time errors". A belief C++ and Java programmers have in common.
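The usual shape of the "no public virtuals" advice is a public non-virtual function that forwards to a private virtual one. A rough sketch, assuming C++11 for override, with class names invented here rather than taken from the book:

```cpp
#include <string>

// Hypothetical classes, for illustration only.
class Shape {
public:
    virtual ~Shape() {}

    // Public and non-virtual: the base class owns the interface, so it can
    // add checks or decoration here without touching derived classes.
    std::string describe() const {
        return "shape: " + do_describe();
    }

private:
    // Private and virtual: the customization point for derived classes.
    virtual std::string do_describe() const = 0;
};

class Circle : public Shape {
private:
    std::string do_describe() const override { return "circle"; }
};
```

Callers only ever see describe(); derived classes only ever override do_describe().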

Then there's a whole section of "Traps, Pitfalls, and Puzzlers"; exactly the kind of thing I don't like.

The book ends with a section of "Style Case Studies", which are much better. It's one thing to hear "know the library", "prefer iterators", "use standard idioms", "be compatible with the STL", and so on over and over again, but it's much more interesting to see actual code be improved. I particularly liked item 35 "Generic Callbacks", which inspired me to spend several evenings coming up with the best, safest C++ wrapper possible for dlopen(3) and friends. (The source is in salma-hayek if you're interested.) Item 36 "Construction Unions" taught me that all names containing __ are reserved, no matter where in the name the double underscore appears.[1]

The last four items take a close look at std::string, the basic premise being a Java anathema: that as much functionality as possible should be provided by non-friend non-member functions. (Or, conversely, that a class should have as few member functions as possible.)

It's interesting that interpreting the zero-overhead principle solely in terms of performance, and ignoring the cognitive cost of having lots of member functions, has led to std::string being one of the largest, ugliest classes you've ever seen. And the real beauty of it is that it's not very functional. Where Java has stuff that's actually useful, such as startsWith, endsWith, replaceAll, and matches, C++ tends to have four different forms of the same function because it wants to support several distinct kinds of argument. So instead of insisting "thou shalt use two iterators", we also have (const char*, size_t), (const basic_string&, size_t, size_t), and (char, size_t) forms of various functions. Ugly. Unnecessary. Confusing.
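For what it's worth, the Java-style conveniences are easy to write in the non-friend non-member style, using nothing but std::string's public interface. A minimal sketch; the function names are borrowed from Java and are not part of the standard library:

```cpp
#include <string>

// Hypothetical helpers, written only against std::string's public interface.
bool startsWith(const std::string& s, const std::string& prefix) {
    return s.size() >= prefix.size() &&
           s.compare(0, prefix.size(), prefix) == 0;
}

bool endsWith(const std::string& s, const std::string& suffix) {
    // The size check matters here: without it the subtraction would wrap.
    return s.size() >= suffix.size() &&
           s.compare(s.size() - suffix.size(), suffix.size(), suffix) == 0;
}
```

Neither needs to be a member or a friend, which is rather the point.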

(Of particular annoyance to me is that Java, with no particular tradition of begin and end iterators, uses begin and end indexes for substring operations, yet C++ uses a begin index and a count. And in functions such as std::string::copy, as Sutter points out, it's actually a count followed by a begin index. Perfect.)
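To spell out the two conventions: std::string::substr takes (begin index, count), while std::string::copy takes (destination, count, begin index). A small sketch with hypothetical wrapper names, including a Java-style [begin, end) substring built on substr:

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: Java-style substring taking [begin, end) indexes,
// built on std::string::substr, which takes (begin index, count).
std::string substring(const std::string& s, std::size_t begin, std::size_t end) {
    return s.substr(begin, end - begin);
}

// Hypothetical helper wrapping std::string::copy, whose argument order is
// (destination, count, begin index).
std::string copy_range(const std::string& s, std::size_t begin, std::size_t count) {
    std::string out(count, '\0');                        // room for up to count characters
    std::size_t copied = s.copy(&out[0], count, begin);  // count first, then begin
    out.resize(copied);                                  // copy() may copy fewer than asked
    return out;
}
```

Both substr and copy throw std::out_of_range if the begin index is past the end of the string, so the wrappers inherit that behavior.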

What I don't understand is why Sutter thinks that you need erase, insert, and replace. I've always liked the way that "replace" is the only mutating operation you need on text. (I also didn't understand why Sutter felt that the (size_t, char) form was worthy of a special case to avoid the (size_t, char) constructor. Why couldn't we just use a special-purpose iterator rather than insist on actually having a char[] to iterate over?)
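The "replace is all you need" claim is easy to illustrate: erasing is replacing a range with nothing, and inserting is replacing an empty range with something. A minimal sketch, with helper names invented for the purpose:

```cpp
#include <cstddef>
#include <string>

// Hypothetical non-member helpers showing erase and insert as special cases
// of std::string::replace(pos, count, text).

// Erasing is replacing a range with nothing.
void erase_via_replace(std::string& s, std::size_t pos, std::size_t count) {
    s.replace(pos, count, "");
}

// Inserting is replacing an empty range with something.
void insert_via_replace(std::string& s, std::size_t pos, const std::string& text) {
    s.replace(pos, 0, text);
}
```

This says nothing about whether the dedicated member functions are faster in any given implementation; it only shows that they add no expressive power.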

It sounds odd to a Java developer to hear someone campaigning for non-friend non-member functions. Almost like a call for a return to C programming. Dig the 70s-style global functions, baby! But when you look at the mess made of std::string, you can see the guy's point. And overloading and templates make it a reasonably sensible proposition in C++. You could also do this with a mixin class, though, thanks to multiple implementation inheritance.

Welcome to C++, a Larry Wall world where there's more than one way to do it.

Why would you choose the globals over mixins? The one reason I can think of is that with a mixin, the implementor has to have the mixin available, and realize that the mixin would be useful. A global function is a global function. No invitation necessary.
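Here's one way the difference could look in code. This is only a sketch: it uses a CRTP-style mixin, which is just one of several ways to write a mixin in C++, and every name in it (PrefixOps, Label, Title, text) is invented for the example.

```cpp
#include <string>

// Mixin route: the class author has to know about the mixin and opt in
// by inheriting from it.
template <typename Derived>
class PrefixOps {
public:
    bool startsWith(const std::string& prefix) const {
        const std::string& s = static_cast<const Derived&>(*this).text();
        return s.compare(0, prefix.size(), prefix) == 0;
    }
};

class Label : public PrefixOps<Label> {  // opted in
public:
    explicit Label(const std::string& text) : text_(text) {}
    const std::string& text() const { return text_; }
private:
    std::string text_;
};

class Title {  // written by someone who never heard of PrefixOps
public:
    explicit Title(const std::string& text) : text_(text) {}
    const std::string& text() const { return text_; }
private:
    std::string text_;
};

// Free-function route: works with any type that has a text() member,
// whether or not its author was invited.
template <typename T>
bool startsWith(const T& t, const std::string& prefix) {
    return t.text().compare(0, prefix.size(), prefix) == 0;
}
```

Label gets label.startsWith("he") because its author opted in; Title only gets startsWith(title, "he"), but it gets it for free.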

I wonder how it's supposed to work in stuff other than standard collection classes, though? I'd like to see an example of a GUI program written in this maximally non-friend non-member style. What would it look like? Would it be useful?

In the meantime, I wish I could define private member functions in C++ that I didn't declare as part of the class. Objective-C++ lets me do this (because I don't have to declare the class all at once; I can have part of its declaration in the .mm file). In Java, I'd like a kind of static import (damn, they already used that name!) similar to Objective-C categories, where a non-instantiable class full of static methods can be imported, and the methods become available as if they were methods on the class of their first parameter. For example:

// StringExtras.java
class StringExtras {
    public static int parseAsBase(String s, int base) { ... }
}

// C.java
import static StringExtras;
class C {
    int m(String s) {
        return s.parseAsBase(2);
    }
}

Why does Objective-C seem to have had so little influence, despite having a lot of useful ideas?

Anyway. What was I doing before I started ranting? Oh yes: I was talking about "Exceptional C++ Style". Interesting book. I'd like to see some more reflective books on C++. Those were the high points of this book. Puzzles bore me. Or worse, they frustrate me when I can't just fix the design flaw that gave rise to them.

[1. That paragraph used to end with the sentence "Something Sun's dtrace(1) developers don't appear to know". I was referring to an example of "Putting developer-defined DTrace probe points in an application", but Bryan Cantrill points out that "DTrace is written entirely in C; the libdtrace API is entirely an extern "C" API, and we don't support USDT probes in C++. (Due to regrettable restrictions around the extern "C" construct.) The C++ naming conventions therefore do not apply to us — and there is no such restriction around __ in the middle of C identifiers." If you check the C standard, you'll see he's right. I'm not sure why this is missing from annex C, "Compatibility", of the C++ standard. Anyway, my apologies to the dtrace(1) team.]