16 October 2008

Static Analysis Resources

I compiled these links while putting together a talk for the Capital District Java Developers Network (CDJDN) near Albany, NY. My presentation slides are posted on the group's website.

01 August 2008

Compiling package-info.java over and over again...

Java packages may have a corresponding package-info.java source file. This file provides a central location for package-wide information. A Java compiler will compile the package-info.java source file just like any other Java source file... sort of.

The thing is, the package-info.java source file is a special sort of source file:

  • Its contents typically consist of Javadoc documentation comments that describe the package. In fact, the package-info.java feature replaces the legacy package.html file that served the same purpose.
  • Sometimes a package is annotated in its package-info.java file.
  • Usually, package-info.java is devoid of source code proper.

See The Java Language Specification, 3rd ed., section "Package Annotations", for more information on the intent of the package-info.java feature.

No corresponding package-info.class file is produced by the compiler when a package-info.java file contains only Javadoc comments. Why is this an issue? It's an issue because most build tools interpret the presence of an up-to-date .class file as a reason not to compile a corresponding .java file. Without a package-info.class, a build system might compile package-info.java every time the build is invoked regardless of whether package-info.java has changed. Such unnecessary compiler invocations have the effect of increasing build execution time. This is not such a big deal for small projects, but the impact on build time can become annoying as the size of a codebase grows.
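To make the problem concrete, here is a minimal sketch of the kind of timestamp check a build tool might apply. The class name, method name, and file paths are my own illustrations, not the code of any particular build tool. With no package-info.class on disk, the check can never report "up to date".

```java
import java.io.File;

/** Simplified model of a build tool's staleness check (illustrative only). */
final class StalenessCheck {

    /**
     * Returns true when the source must be recompiled: either the class
     * file is missing, or it is older than the source file.
     */
    static boolean needsCompile(File source, File classFile) {
        return !classFile.exists()
                || classFile.lastModified() < source.lastModified();
    }

    public static void main(String[] args) {
        // Hypothetical paths; substitute the layout of your own project.
        File source = new File("src/com/example/foo/package-info.java");
        File classFile = new File("build/com/example/foo/package-info.class");
        System.out.println("recompile? " + needsCompile(source, classFile));
    }
}
```

A package-info.java that contains only Javadoc never gets a matching class file, so needsCompile() returns true on every build.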

In my environment a corresponding package-info.class file is created only if at least one package annotation exists. Intuitively, it would seem that such annotations would need a retention policy "stronger" than java.lang.annotation.RetentionPolicy.SOURCE, that is, RetentionPolicy.CLASS (the default) or RetentionPolicy.RUNTIME. However, I observe that a package-info.class file is generated even when none of the package annotations has a stronger retention policy than SOURCE.

The presence of source code proper does not cause a package-info.class file to be produced. Rather, class files with names appropriate to the types defined in the source code are generated. For example, if I write a package-private Foo class in package-info.java, then Foo.class is generated.

So then, how can a build be protected from unnecessarily invoking the compiler due to the lack of a package-info.class file? One solution to this issue is to annotate each package with an annotation that serves no purpose other than to force the generation of a corresponding package-info.class file.

The annotation I wrote for this solution is called ForcePackageByteCode. I gave it a retention policy of RetentionPolicy.CLASS, which clearly states that it needs to be in a .class file. Just annotate the package statement in a package-info.java file with ForcePackageByteCode:

/**
 * This package handles all of the foo.
 */
@ForcePackageByteCode
package com.example.foo;
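The annotation itself needs only a few lines. What follows is a minimal sketch of how such a marker annotation could be declared, assuming a PACKAGE-only target; the actual source in Virtual Team Tools may differ.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/**
 * Marker annotation whose only job is to make the compiler emit a
 * package-info.class file for the annotated package.
 *
 * Sketch only: it is package-private here to keep the example in one
 * file. For real use, declare it public in its own source file
 * (ForcePackageByteCode.java) inside a named package, so that every
 * package-info.java in the project can reference it.
 */
@Retention(RetentionPolicy.CLASS)   // recorded in the .class file, not visible at run time
@Target(ElementType.PACKAGE)        // assumption: only package declarations may carry it
@interface ForcePackageByteCode {
}
```

Restricting the target to ElementType.PACKAGE is a design choice on my part in this sketch; it keeps the marker from being applied to classes or methods by accident.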

Admittedly, this solution is not ideal because it creates extraneous output for the sole purpose of optimizing future build invocations. On the plus side, the size of the extraneous files is relatively small. This minor cost in size can be well worth the build time savings over the course of a project.

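It would be nice if release candidate builds could "turn off" the extraneous file generation by simply changing the ForcePackageByteCode retention policy to RetentionPolicy.SOURCE. As noted above, though, in my environment even a source-retention annotation causes package-info.class to be generated. Instead, a global search-and-replace operation that changes all occurrences of the annotation to a comment in all package-info.java files will do the trick:

/**
 * This package handles all of the foo.
 */
// @ForcePackageByteCode
package com.example.foo;
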
There are other solutions to this problem. The Apache Ant build tool is addressing this issue with built-in tool support. The Ant team has devised some heuristics for deciding when to ignore package-info.java files during compilation. Such work is helpful, but relying on support from a particular build tool has the drawback of being a partial solution if source is built in several build environments.

Alternatively, the need for a solution like the ForcePackageByteCode annotation is reduced if a project has another reason to annotate its packages. For example, the static bytecode analysis tool FindBugs supports the edu.umd.cs.findbugs.annotations.DefaultAnnotation annotation, which declares that the specified annotations be applied to all classes, fields, and methods of a package. Annotating a package with @DefaultAnnotation(NonNull.class), for instance, tells FindBugs to treat every class, field, and method in that package as though it were annotated with NonNull. Applying DefaultAnnotation in a package-info.java file causes a package-info.class to be produced, obviating the need for ForcePackageByteCode.

ForcePackageByteCode source code is available as part of the Virtual Team Tools project.

31 July 2008

Java ArrayWrapperList

I have some Java networking code that uses the Java NIO (java.nio) package and the Java Concurrency utilities (java.util.concurrent) package. My code is structured to enqueue pending output on a write buffer queue. The write buffer queue is a sequence of ByteBuffers implemented as a java.util.concurrent.LinkedBlockingDeque<ByteBuffer>.

Now, I want to perform a gathering write operation of pending ByteBuffers in the write buffer queue. The BlockingDeque has a drainTo(Collection<? super E>) method that removes elements from the Deque and places them in a specified Collection of the appropriate type. Here's the snag: the java.nio.channels.SocketChannel write() method requires a ByteBuffer[], not a Collection<ByteBuffer>.

Converting from a Collection<ByteBuffer> to a ByteBuffer[] seems as simple as calling the java.util.Arrays.asList(T...) method. Just wrap the target array as a List then call write() with the filled array:

ByteBuffer[] writeBufferArray = new ByteBuffer[BUFFER_COUNT]; // write() needs this
List<ByteBuffer> writeBufferList = Arrays.asList(writeBufferArray); // drainTo() needs this
int writeBufferCount = writeBufferDeque.drainTo(writeBufferList); // fills writeBufferArray?
socketChannel.write(writeBufferArray, 0, writeBufferCount);

Unfortunately this does not work because an UnsupportedOperationException is raised on the call to drainTo(). This happens because the List produced by Arrays.asList(T...) is a fixed-size view of the array and does not permit append operations. It enforces this restriction by not overriding the java.util.AbstractList.add(int, E) method, whose inherited implementation throws UnsupportedOperationException.
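The behavior is easy to reproduce in isolation. In this small sketch (the class and method names are mine), set() writes through to the backing array, while add() is rejected:

```java
import java.util.Arrays;
import java.util.List;

/** Demonstrates that an Arrays.asList view allows set() but rejects add(). */
final class AsListDemo {

    /** Returns true if appending to the list throws UnsupportedOperationException. */
    static boolean rejectsAdd(List<String> list) {
        try {
            list.add("x");
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        String[] backing = new String[2];
        List<String> view = Arrays.asList(backing);

        view.set(0, "a");                      // allowed: writes through to the array
        System.out.println(backing[0]);        // the array sees the change
        System.out.println(rejectsAdd(view));  // appending is not supported
    }
}
```

Since drainTo() appends each drained element with add(), it fails on the first element it tries to transfer.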

It would be easy enough to copy the ByteBuffers to a "temporary" Collection<ByteBuffer>, then call the toArray() method, but that results in a "double copy." Each ByteBuffer reference would be copied from the BlockingDeque to the temporary Collection<ByteBuffer>, then copied from the temporary Collection<ByteBuffer> to the array produced by toArray().
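For comparison, the double-copy approach might look like the sketch below. The class and method names are illustrative, and the buffer limit passed in main() is arbitrary.

```java
import java.nio.ByteBuffer;
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;

/** Drains a deque into an array by way of a temporary list (two copies). */
final class DoubleCopyDrain {

    static ByteBuffer[] drainToArray(BlockingDeque<ByteBuffer> deque, int maxBuffers) {
        List<ByteBuffer> temp = new ArrayList<ByteBuffer>(maxBuffers);
        deque.drainTo(temp, maxBuffers);                   // copy 1: deque -> temporary list
        return temp.toArray(new ByteBuffer[temp.size()]);  // copy 2: temporary list -> array
    }

    public static void main(String[] args) {
        BlockingDeque<ByteBuffer> deque = new LinkedBlockingDeque<ByteBuffer>();
        deque.add(ByteBuffer.wrap(new byte[] {1}));
        deque.add(ByteBuffer.wrap(new byte[] {2}));

        ByteBuffer[] buffers = drainToArray(deque, 16);
        System.out.println(buffers.length + " buffers ready for a gathering write");
    }
}
```

Only references are copied, not buffer contents, so the cost is modest; it is still a second pass and an extra allocation on every write.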

Avoiding the double copy means that some way must be found to write directly to a ByteBuffer[] from drainTo(). Since drainTo() demands a Collection, some alternative array wrapper must be used.

One potential candidate is the venerable java.util.ArrayList. An ArrayList properly encapsulates its array: it copies the elements of any collection given to it into an internal array, and it produces a defensive copy when toArray() is invoked on it. ArrayList does its job well, but it's not suited to this task because it performs copy operations to and from its own array, not a client-provided array.

My solution for a List that writes-through to an externally accessible array is called ArrayWrapperList. It works like this:

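ByteBuffer[] writeBufferArray = new ByteBuffer[BUFFER_COUNT];
// wrap the client array (constructor argument shown for illustration; see the project source for the exact signature)
List<ByteBuffer> writeBufferList = new ArrayWrapperList<ByteBuffer>(writeBufferArray);
int writeBufferCount = writeBufferDeque.drainTo(writeBufferList); // writes-through directly to writeBufferArray
socketChannel.write(writeBufferArray, 0, writeBufferCount);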

Conceptually, the code behaves as though writeBufferList.toArray(writeBufferArray) is invoked immediately after the call to drainTo(). It should be noted that in my production version I invoke drainTo(Collection<? super E>, int) not drainTo(Collection<? super E>). The upper bound on the drained elements ensures that the size of writeBufferArray is not exceeded.

Unlike java.util.ArrayList, ArrayWrapperList does not encapsulate the array it manages. Rather, ArrayWrapperList trades off encapsulation for the ability to write directly to a client-accessible array.
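To illustrate the idea, here is a minimal sketch of a write-through list built on java.util.AbstractList. It is not the Virtual Team Tools implementation: the class name ArrayWrapperListSketch, the constructor signature, and the choice to throw IllegalStateException when the array is full are all my assumptions for this example.

```java
import java.util.AbstractList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.LinkedBlockingDeque;

/** Hypothetical sketch of a List that stores its elements in a client-owned array. */
final class ArrayWrapperListSketch<E> extends AbstractList<E> {
    private final E[] array; // client-owned; deliberately not copied
    private int size;        // number of leading slots currently in use

    ArrayWrapperListSketch(E[] array) {
        if (array == null) throw new NullPointerException("array");
        this.array = array;
    }

    @Override
    public E get(int index) {
        if (index < 0 || index >= size) throw new IndexOutOfBoundsException("index: " + index);
        return array[index];
    }

    @Override
    public E set(int index, E element) {
        E old = get(index); // also validates the index
        array[index] = element;
        return old;
    }

    @Override
    public void add(int index, E element) {
        if (index < 0 || index > size) throw new IndexOutOfBoundsException("index: " + index);
        if (size == array.length) throw new IllegalStateException("backing array is full");
        System.arraycopy(array, index, array, index + 1, size - index); // shift the tail right
        array[index] = element;
        size++;
        modCount++;
    }

    @Override
    public E remove(int index) {
        E old = get(index); // also validates the index
        System.arraycopy(array, index + 1, array, index, size - index - 1); // shift the tail left
        array[--size] = null; // clear the vacated slot
        modCount++;
        return old;
    }

    @Override
    public int size() {
        return size;
    }

    public static void main(String[] args) {
        LinkedBlockingDeque<String> pending = new LinkedBlockingDeque<String>();
        pending.add("a");
        pending.add("b");
        pending.add("c");

        String[] target = new String[2];
        List<String> wrapper = new ArrayWrapperListSketch<String>(target);
        int drained = pending.drainTo(wrapper, target.length); // bounded by the array length
        System.out.println(drained + " " + Arrays.toString(target));
    }
}
```

Because drainTo() appends through add(), each drained element lands directly in the client's array. Calling clear() on the wrapper resets it so the same array can be reused for the next write.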

ArrayWrapperList source code is available as part of the Virtual Team Tools project.