Book: "Mac OS X Internals: A Systems Approach"

The advertising blurb on the back of the book claims that "Mac OS X Internals: A Systems Approach is the first book that dissects the internals of the system, presenting a detailed picture that grows incrementally as you read". In a sense this is true, but the picture doesn't grow in a way that suggests the author started with a map or itinerary. "A Brownian Motion Tour" would seem an equally apt title. I'm not sure whether the lack of a plan was the cause of the book's weakness, though, or just a symptom. I've a hunch that the real disease was the attempt to serve too many masters.

I'll come back to that, but speaking of growth, this book is gigantic. It has 1641 numbered pages, is printed on reasonably thick paper, and has a hard cover. It's too heavy to comfortably hold while reading. Really, you need to rest it on a table or lap. The book's bulk would be annoying in and of itself, but the fact that so much of the book's content seems like filler makes it all the more annoying.

The first couple of chapters give an overview of where Mac OS X came from, and its major components. There's some interesting stuff for the newcomer to Mac OS in the second chapter, "An Overview of Mac OS X", such as dyld interposing, Mac OS' equivalent of LD_PRELOAD (see Apple's Dynamic Library Programming Topics and Dave Dribin's recent blog entry Tracing Objective-C Messages), but I'd have liked a quicker introduction and to have left the good bits for later, where they could be presented a little less out of the blue.
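For readers who haven't met dyld interposing, here is a minimal sketch of the general shape. It is not taken from the book. The names `traced_fopen`, `traced_fopen_calls`, `interpose_pair`, and `interposers` are hypothetical; the `__DATA,__interpose` section and the `DYLD_INSERT_LIBRARIES` environment variable are the real dyld pieces. The interposing table is guarded with `__APPLE__` so the file still compiles elsewhere, where it simply does nothing special.

```c
#include <stdio.h>

/* Hypothetical counter, present only so the effect is observable. */
int traced_fopen_calls = 0;

/* Replacement with the same signature as fopen(3): log the call, then
 * hand off to the real function. dyld is expected to leave the
 * interposing library's own references alone, so the fopen() call
 * below should reach the original rather than recursing. */
FILE *traced_fopen(const char *path, const char *mode)
{
    traced_fopen_calls++;
    fprintf(stderr, "fopen(\"%s\", \"%s\")\n",
            path ? path : "(null)", mode ? mode : "(null)");
    return fopen(path, mode);
}

#ifdef __APPLE__
/* Each entry in the __DATA,__interpose section is a pair of pointers:
 * the replacement first, then the function being replaced. */
struct interpose_pair {
    const void *replacement;
    const void *original;
};

__attribute__((used, section("__DATA,__interpose")))
static const struct interpose_pair interposers[] = {
    { (const void *)traced_fopen, (const void *)fopen },
};
#endif
```

Assuming the file is called trace.c (a made-up name), something like `cc -dynamiclib trace.c -o libtrace.dylib` followed by `DYLD_INSERT_LIBRARIES=./libtrace.dylib ./some_program` should print a line for each fopen call the target makes, provided the system allows that program to load inserted libraries.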

If you need an overview of Mac OS X technologies, you're better off with Apple's Mac OS X Technology Overview.

If you want to know about Mac OS debugging, see TN2124: Mac OS X Debugging Magic (and some of the documents it links to) and Performance Overview.

Chapter 3, "Inside an Apple", is a 100-page introduction to Apple's PowerPC architecture. That is, the architecture Apple's just completed its transition away from. Appendix A, "Mac OS X on x86-Based Macintosh Computers", says that it's beyond the appendix's scope to detail the hardware differences, so if you're primarily interested in the future, you're out of luck. Chapter 3's probably not a bad introduction to PowerPC Macs. I didn't know before reading it that the G5 has no BAT registers, for example, and I'd not previously heard of the processor's softpatch facility. There was also a nice overview of the PowerPC ABI. Again, though, Appendix A has no overview of the x86 ABI, which for most current Mac programmers would be more useful. PowerPC may be something of an oddity in the wider world, but for Mac programmers, x86 is the strange new world. For those Linux/Windows programmers who're used to x86 but don't know PowerPC, this might be of some use, but even then: PowerPC is over for Apple, and the future is in the far more familiar shape of x86. I couldn't help but feel that this chapter was mainly written as nerd porn. A better editor would have cut most of it and left us with PowerPC and x86 ABI overviews.

As it is, you can read Apple's own perfectly good Mac OS X ABI documentation (for all Mac OS X architectures).

The next chapter, "The Firmware and the Bootloader", is worse. In the increasingly unlikely event you need to know more about Open Firmware, you can find plenty on the web. Likewise if you want to write Towers of Hanoi in Forth, seemingly a special interest of the author's that he couldn't resist sharing. I look forward to another author showing me how to knock up a train-spotting database in Z80 assembler some day. I'm not sure many readers will gain anything from 100 pages about this legacy technology. The last 10 pages of this chapter talk about Intel's EFI, which replaces Open Firmware in x86 Apple hardware, and there's some discussion of the bootloader, but my dream editor would still have cut most of this chapter, or possibly have asked for an EFI rewrite with a 10-page addendum on Open Firmware.

Chapters 5 and 6 ("Kernel and User-Level Startup", and "The xnu Kernel") are fairly tedious walk-throughs of parts of the kernel. This is nothing like as clear or directed as "Lions' Commentary on UNIX 6th Edition", because the book doesn't claim to be an introductory text. That's one of the big problems for me with this book: it's not clear who it's really suited for. The preface's "Who This Book Is For" suggests it's useful to application programmers, system programmers, users, system administrators, technical support staff, experts in other OSes who want to know how Mac OS differs from their OS, and students in advanced OS courses. Phew! If only he could have slipped in a section for the third-world children who assemble computers, then there would have been something for everyone. The trouble is, in attempting to serve all these masters, the book flounders. Application and system programmers are not well served because although the book takes you on a long tour of Mac OS facilities, it makes little or no attempt to motivate the different choices. A lot of what's here but not in standard Unix texts like "Advanced Programming in the Unix Environment" is obviously harmful to portability, and often restricted to Mac OS on PowerPC (though rarely to the Intel variant, because coverage of that is mostly restricted to a 10-page appendix). There would need to be a good reason to use this stuff instead of traditional Unix facilities, but we're not helped out on that front. Writers of kernel extensions would probably be better served by a more focused book. For experts in other OSes, the book is limited by its huge size and the paucity of direct comparison. Also lacking is the kind of comparison that motivates the trade-offs between different systems. As for sysadmins and support staff (other than those who'd rather be kernel hackers), I'm mystified. There are occasional mentions of things like launchd(8), but this is no "Mac OS for Unix Geeks".

Another failing of this book is that it's very heavy on implementation, very heavy on history, and very light on design. If you're used to books such as Bach's "The Design of the UNIX Operating System", McKusick et al.'s "The Design and Implementation of the 4.4BSD Operating System", or Tanenbaum's books, you'll be disappointed. I don't know if the author isn't interested in design, or just doesn't understand the difference between history and design, but this book is let down either way.

The most interesting parts of these kernel chapters for me were the description of the commpage and the description of the kernel support for the Mac OS 9 emulation (the "Classic environment"), and the virtual machine monitor. I hadn't realized that Classic support was so deeply ingrained, or so seemingly general-purpose, but then I've never had cause to think about it: I've owned several Macs since 2001, and I've never once run the Classic environment. (Yes, I'm a Unix snob. Here's a nickel, kid.) Moreover, none of this exists on Mac OS X for Intel, so it's at best only of historical interest. The modern "Rosetta" system for running PowerPC Mac OS X binaries on Intel Macs does get a mention, taking two pages of the 10-page Intel appendix, but I'd have liked to have seen much more concentration on that.

Apple's meager Rosetta documentation is better, and their Universal Binary Programming Guidelines is a better guide to the differences between Mac OS on Intel and on PowerPC than this book's appendix.

Chapter 7, "Processes", which actually covers all kinds of processes, threads, and tasks, has a good example of the kind of worthless filler that keeps cropping up in the book: the section "Java Threads". There are several interesting things to be said about Java threading on Mac OS. For one thing, there's the fact that although Sun explicitly tell you never to create your UI other than on the Java event dispatch thread, it's easy to screw up because you can usually get away with it on systems other than Mac OS. Many Java programs that ignore Sun's rule work fine on other systems and deadlock on Mac OS. (Sun's own installer for the JDK 6 source had this problem at one point, for example.) The other interesting thing about Java threading on Mac OS is the difference between the thread that Java's main method runs on, the Java event dispatch thread, and the AppKit "main thread". Confusion between these threads seems to be a common cause of deadlocks and crashes in Java programs using JNI to access Cocoa on Mac OS. What does "Mac OS X Internals" give us? A short Java program that starts three Java Threads with three different priorities. That's about as instructive as it sounds. The author even explicitly states that the example code isn't useful in conjunction with the tool he uses to list Mach threads.

I don't know why there's no mention of Java's Runtime.exec, because the next example is a Cocoa program using NSTask. This is followed by an equally uninstructive example of NSThread before the author wanders off to present alternative Carbon threading mechanisms, with the usual lack of explanation of what advantages and disadvantages they have. The chapter finishes with a walk-through of the execve implementation that fails to mention Rosetta, and fails to detail the exact interpretation of "#!" lines (one of the things that varies quite a bit from one Unix to another).

Chapter 8, "Memory", has a good example of the book collapsing under its own weight. The implementation of prebinding is described, with a note that prebinding is deprecated as of Mac OS 10.4, but if you want to know why it's deprecated, you have to go back to chapter 2, "An Overview of Mac OS X", where there's more information on prebinding than in the "Memory" chapter itself. This despite the fact that prebinding has long been deprecated, and so is probably not essential for an "overview". Presumably the information in chapter 2 was written while Mac OS 10.3 was still current, and only lightly fixed up afterwards, rather than being merged in chapter 8. Of course, the book is too large for this kind of error to be easily spotted.

There's a decent look at the implementation of malloc(3) in the section "Memory Allocation in User Space", though it would have been better if Mac OS' heap implementation had been contrasted with at least the Doug Lea heap-based implementation in glibc. There's also a quick overview of Mac OS' 64-bit support, but since most libraries won't be 64-bit until 10.5, there's not much to say there. (It would have been a good idea to at least mention the Apple-suggested work-around of having a 32-bit process for any UI and a 64-bit process for number crunching, though.)

Chapter 9 is "Interprocess Communication", another apparent failure of the book's "systems approach". It starts with a tour of Mach ports and messages, switches to Unix signals with quick detours into asynchronous I/O and ptrace(2), touches on pipes and FIFOs, briefly mentions POSIX semaphores and shared memory, veers abruptly into Cocoa's Distributed Objects, moves on to AppleEvents, then back to Cocoa for mention of NSNotificationCenter, back to Mac OS' notify(3), across to BSD's kqueue(2), loops back through Cocoa to CFNotificationCenter and Core Foundation run loops, and ends with a section that mentions spin locks, Mach mutexes, Mach locks/lock groups/lock sets, Mach semaphores, and ends somewhat jarringly on advisory file locks. This is pretty much as bewildering as it sounds, being quite a hack through the jungle with little overall plan. Worst of all, there's the usual lack of comparison of the alternatives or discussion of their strengths and weaknesses. Sometimes there is obviously only one choice for what you're trying to accomplish, but often there are choices that might seem like they'll work but don't, and always there are choices that will lead to less portability, worse performance, or more complexity. This is ignored.
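To make the portable end of that list concrete, here is a minimal sketch of the plainest mechanism on it, a pipe between a parent and a forked child. It uses only POSIX calls (pipe, fork, read, write, waitpid), so nothing in it is specific to Mac OS. The helper name `pipe_round_trip` is hypothetical, and the sketch assumes a short message that fits comfortably in the pipe's buffer.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical helper: a forked child writes msg into a pipe and the
 * parent reads it back into buf (always NUL-terminated on success).
 * Returns the number of bytes received, or -1 on any failure. */
ssize_t pipe_round_trip(const char *msg, char *buf, size_t buflen)
{
    int fds[2];
    if (buflen == 0 || pipe(fds) == -1)
        return -1;

    pid_t pid = fork();
    if (pid == -1) {
        close(fds[0]);
        close(fds[1]);
        return -1;
    }

    if (pid == 0) {
        /* Child: keep only the write end, send the message, exit. */
        close(fds[0]);
        size_t len = strlen(msg);
        ssize_t written = write(fds[1], msg, len);
        close(fds[1]);
        _exit(written == (ssize_t)len ? 0 : 1);
    }

    /* Parent: keep only the read end and read until EOF or buf is full. */
    close(fds[1]);
    size_t total = 0;
    ssize_t n = 0;
    while (total < buflen - 1 &&
           (n = read(fds[0], buf + total, buflen - 1 - total)) > 0)
        total += (size_t)n;
    buf[total] = '\0';
    close(fds[0]);

    /* Reap the child and make sure it reported success. */
    int status;
    if (waitpid(pid, &status, 0) == -1 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0)
        return -1;
    return n < 0 ? -1 : (ssize_t)total;
}
```

The same exchange could be built on Mach messages or Distributed Objects, at the cost of portability. That is exactly the kind of trade-off the chapter leaves undiscussed.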

It's not a direct replacement for this chapter, but if you want a good book on the portable user-space subset of this chapter, Bill Gallmeister's "POSIX.4 Programmers Guide" from O'Reilly is one of the finest technical books I've ever read. If you're interested in kernel mechanisms, I actually think Apple's Kernel Programming Guide is better than this. (Apple's documentation quality is wildly variable. Some bits are perfunctory, others detailed and well-written. You never know which you're going to get. Sadly, there are few references to Apple's documentation in this book.)

The last few chapters are better than the rest. "Extending the Kernel" is the first time you feel you're benefiting from any kind of insight from the author, rather than just listening to someone list the order of functions called during some operation, something you could instrument the compiler to do if you cared. Apple's IOKit documentation is pretty good, but example code is always welcome, and semi-useful examples especially so. This is where the source to several of the author's various slashdotted examples appears. It's hard to shake off the feeling that you're really holding a two- or three-hundred-page book on these articles, with a thousand pages of filler in front. Despite the C masquerading as C++ (the use of #define instead of typedef had me spluttering), these chapters are the best in the book. The "File Systems" chapter is good if you have an interest in Spotlight and its implementation, and even covers an alternative way in which it could have been implemented. The final chapter, "The HFS Plus File System", is along the lines of Apple's TN1150: HFS Plus Volume Format. The book is on the whole a little less detailed than the technote, but it does use a companion hfsdebug program (available from the author's web site) to show the various features in action.
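On the #define-versus-typedef complaint: the snippet below is not the book's code, just the textbook illustration of why a macro makes a poor type alias. All the names in it (`CHAR_PTR_MACRO`, `char_ptr_t`, and the two helper functions) are hypothetical. The macro is substituted as text, so the `*` binds only to the first declarator, whereas the typedef is a real type and applies to both.

```c
#include <stddef.h>

#define CHAR_PTR_MACRO char *   /* textual substitution only */
typedef char *char_ptr_t;       /* a genuine type alias */

/* "CHAR_PTR_MACRO a, b" expands to "char * a, b", so b is a plain
 * char, not a pointer. Returns the size of that second variable. */
size_t macro_second_size(void)
{
    CHAR_PTR_MACRO a = NULL, b = 'x';
    (void)a;
    return sizeof b;
}

/* With the typedef, both a and b are char pointers. */
size_t typedef_second_size(void)
{
    char_ptr_t a = NULL, b = NULL;
    (void)a;
    return sizeof b;
}
```

Calling the two functions shows the difference: the first returns the size of a char, the second the size of a pointer.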

The quality of the book's diagrams varied. Some were clear enough, but others would give Tufte a fatal apoplexy and start him spinning in his grave. If you like ten different sizes and styles of text per diagram, five different sizes and styles of arrowed lines, and text at a bewildering number of different angles, you'll love some of the examples in this book. I don't know what software the author used for this task, but it wasn't well suited to it. It certainly increased my appreciation of the usually very clear diagrams I've seen in O'Reilly books. Addison-Wesley need to pay more attention to this; the quality of the diagrams in their recent "File System Forensic Analysis" was exemplary.

The use of large numbers of short footnotes was also distracting. A more careful editor would have pressed for them to be eliminated or worked into the text. The information content per footnote was very low; they served mainly as interruptions.

I realize that I'm something of a dissenter here. The book's web site currently has five glowing reviews, from Dominic Giampaolo, Jim Mauro, David Butenhof, Marc Rochkind, and Ulfar Erlingsson. I'm not convinced. In his praise, David Butenhof says "I've read just a few sections in detail, but skimmed through many". Rochkind's review admits "I'm only up to page 50". Page 50 is less than 10 pages into chapter 2. What kind of a review is it, that doesn't involve actually reading the book? Giampaolo extrapolates from having read the file system and Spotlight stuff, but wasn't to know that they're the high points of the book. Erlingsson's "much of this information is not available anywhere else in a accessible form for an OSX audience, as far as I know" is probably because he's a Microsoft guy who doesn't spend a lot of time reading Apple documentation. Butenhof's "excellent comparison of HFS+ and NTFS"? Maybe he read a different book to me, because in my copy there's just a two-page table that might better be described as "superficial".

There are nuggets of gold in this book, but they're hidden in a wrist-breakingly dense forest of dead tree. This is neither a good introductory work nor a good reference work. There are several good books still to be written on Mac OS, but this is none of them, though it might be the beginnings of a good book on Mac OS kernel extensions and/or file systems.