2005-03-24

Classes per package in the JDK

I don't often get to see how many files are in a directory. Edit pretty much takes care of the whole directories thing for me. I just have to remember some fragment of a file's name, and it finds it, Google-fast.

Today I had reason to see how many files were in a particular directory, and I wondered how that measured up to other directories. In particular, was it as large as javax.swing?

Here's a bash(1) one-liner (broken down for ease of reading) to count the number of .java files, and so roughly the number of top-level classes, in the current directory and each directory beneath it:

dirs=`find . -type d` ; \
for d in $dirs ; \
do \
count=`find "$d" -maxdepth 1 -name '*.java' | wc -l` ; \
echo -e "$count\t$d" ; \
done > /tmp/classes-per-directory
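
The loop above runs find once per directory. As a rough sketch of an alternative, the same tab-separated "count, directory" listing can be produced in a single pass by stripping the file name from each path and letting uniq do the counting. The function name count_java_per_dir is made up for illustration, and the sketch assumes no path contains a newline.

```shell
# Count .java files per directory in a single find pass.
# count_java_per_dir is a hypothetical helper name, not from the original post.
count_java_per_dir() {
    find "${1:-.}" -type f -name '*.java' \
        | sed 's|/[^/]*$||' \
        | sort \
        | uniq -c \
        | awk '{ n = $1; sub(/^ *[0-9]+ /, ""); print n "\t" $0 }'
}
```

One difference from the loop: directories containing no .java files don't appear at all here, whereas the loop lists them with a count of 0. That doesn't affect a top-20 listing.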

I ran this at the root of both the 1.4 and 1.5 JDKs' source trees, renaming the output to classes-per-directory-1.4 and classes-per-directory-1.5 respectively.

Here's 1.5's top 20 largest packages:

hydrogen:/tmp$ sort -nr classes-per-directory-1.5 | \
head -20
215 ./com/sun/org/apache/bcel/internal/generic
191 ./org/omg/CORBA
138 ./javax/swing
130 ./com/sun/org/apache/xalan/internal/xsltc/compiler
108 ./java/awt
107 ./java/lang
91 ./com/sun/corba/se/PortableActivationIDL
89 ./java/util
89 ./com/sun/corba/se/spi/activation
80 ./java/nio
80 ./java/io
78 ./javax/management
69 ./javax/swing/text
69 ./javax/print/attribute/standard
69 ./java/security
67 ./com/sun/jmx/snmp
66 ./com/sun/org/apache/xerces/internal/dom
64 ./javax/swing/plaf/basic
62 ./com/sun/org/apache/html/internal/dom
61 ./java/net

There were plenty of packages unfamiliar to me there, so here's the top 20 without the non-java/javax stuff:

hydrogen:/tmp$ sort -nr classes-per-directory-1.5 | \
grep -v com/sun/ | grep -v ./org/ | head -20
138 ./javax/swing
108 ./java/awt
107 ./java/lang
89 ./java/util
80 ./java/nio
80 ./java/io
78 ./javax/management
69 ./javax/swing/text
69 ./javax/print/attribute/standard
69 ./java/security
64 ./javax/swing/plaf/basic
61 ./java/net
60 ./javax/swing/plaf/synth
52 ./java/awt/image
44 ./java/awt/event
43 ./javax/swing/event
43 ./java/security/cert
42 ./javax/swing/plaf/metal
42 ./javax/naming
41 ./javax/swing/plaf
hydrogen:/tmp$
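
The two grep -v filters work by excluding what's unwanted. A sketch of the opposite approach, keeping only paths that start with ./java/ or ./javax/, might look like this. The function name java_only is hypothetical, and the input is assumed to be the tab-separated "count, directory" format produced earlier.

```shell
# Keep only rows whose path starts with ./java/ or ./javax/, largest first.
# java_only is a hypothetical helper name; $1 is a listing file, $2 an optional row limit.
java_only() {
    sort -nr "$1" | awk -F'\t' '$2 ~ /^\.\/javax?\//' | head -n "${2:-20}"
}
```

Run as java_only classes-per-directory-1.5, this should give the same kind of listing, with the difference that any directory outside java and javax is dropped rather than just those under com/sun and org.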

That looks a lot more like what I expected. Still a couple of surprises, though. I'm surprised that a new package, javax.management, should make the top 10. And I'm surprised that a package I've never even heard of, javax.print.attribute.standard, should also make the top 10.

A quick glance at the javax.print.attribute.standard JavaDoc had me confused. What's the difference between the first two classes, Chromaticity and ColorSupported, other than the former's beautiful name? Well, it turns out that the author thought it was confusing too, and explains:

Don't confuse the ColorSupported attribute with the Chromaticity attribute. Chromaticity is an attribute the client can specify for a job to tell the printer whether to print a document in monochrome or color, possibly causing the printer to print a color document in monochrome. ColorSupported is a printer description attribute that tells whether the printer can print in color regardless of how the client specifies to print any particular document.

There turn out to be three distinct sets of attributes in this package, the largest of which seems to have about 30 members. Counted separately, that would take it out of the top 20.

Here, for comparison, is how 1.4 looks:

hydrogen:/tmp$ sort -nr classes-per-directory-1.4 | \
head -20
182 ./org/omg/CORBA
137 ./javax/swing
106 ./java/awt
89 ./com/sun/corba/se/ActivationIDL
86 ./java/lang
80 ./java/nio
77 ./java/io
69 ./javax/swing/text
69 ./javax/print/attribute/standard
64 ./java/security
63 ./javax/swing/plaf/basic
61 ./java/util
56 ./org/w3c/dom/html
54 ./org/omg/PortableServer
54 ./org/apache/xalan/templates
54 ./java/net
52 ./java/awt/image
52 ./com/sun/corba/se/internal/ior
49 ./org/omg/DynamicAny
49 ./com/sun/java/swing/plaf/windows
hydrogen:/tmp$ sort -nr classes-per-directory-1.4 | \
grep -v com/sun/ | grep -v ./org/ | head -20
137 ./javax/swing
106 ./java/awt
86 ./java/lang
80 ./java/nio
77 ./java/io
69 ./javax/swing/text
69 ./javax/print/attribute/standard
64 ./java/security
63 ./javax/swing/plaf/basic
61 ./java/util
54 ./java/net
52 ./java/awt/image
44 ./java/text
44 ./java/awt/event
43 ./javax/swing/event
42 ./javax/naming
42 ./java/security/cert
41 ./javax/swing/plaf
39 ./javax/swing/plaf/metal
39 ./java/beans
hydrogen:/tmp$

So apart from java.util shooting up the chart (from 61 files to 89), things have been pretty static on the java/javax front. Interestingly, just over half the new classes in java.util are new exceptions relating to formatting.
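
Comparing the two listings by eye is fiddly. A minimal sketch of computing the per-directory change directly, assuming both files use the tab-separated "count, directory" format from earlier, could look like this (package_growth is a made-up name):

```shell
# Print "delta<TAB>old<TAB>new<TAB>path" for every directory present in both
# listings, biggest growth first. package_growth is a hypothetical helper name.
package_growth() {
    awk -F'\t' '
        NR == FNR { old[$2] = $1 + 0; next }   # first file: remember the old counts
        $2 in old { printf "%d\t%d\t%d\t%s\n", $1 - old[$2], old[$2], $1, $2 }
    ' "$1" "$2" | sort -nr
}
```

It would be called as package_growth classes-per-directory-1.4 classes-per-directory-1.5. Directories that exist in only one of the two trees are skipped, so packages that are entirely new in 1.5 would need to be listed separately.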

Perhaps the most interesting fact – and it's one I can't easily show here without a graph – is that most packages don't contain many top-level classes.
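
Short of a real graph, a crude text histogram gives some idea of the distribution. The following is only a sketch: the name size_histogram and the default bucket width of 10 are arbitrary choices, and directories with no .java files are ignored on the assumption that they aren't packages.

```shell
# Bucket directories by how many .java files they hold and print
# "range<TAB>directories<TAB>bar". size_histogram is a hypothetical helper name;
# $1 is a "count<TAB>path" listing, $2 an optional bucket width (default 10).
size_histogram() {
    awk -F'\t' -v width="${2:-10}" '
        $1 + 0 > 0 {
            b = int(($1 - 1) / width)     # counts 1..width land in bucket 0, and so on
            bucket[b]++
            if (b > max) max = b
        }
        END {
            for (b = 0; b <= max; b++) {
                bar = ""
                for (i = 0; i < bucket[b]; i++) bar = bar "#"
                printf "%d-%d\t%d\t%s\n", b * width + 1, (b + 1) * width, bucket[b], bar
            }
        }
    ' "$1"
}
```

Each # stands for one directory, so on a tree the size of the JDK the first bar will be long; the middle column gives the same number as a plain count.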