There was some flux in life around the time that I finished my degree. Our daughter was born and my academic adviser left the country within a week or two after I presented my work. All of that was followed by a job search, and, well, the report got lost in the shuffle.
14 October 2009
13 October 2009
Software systems often have parts that are more "foundational" than others. Lots of other parts of the system depend on these foundational portions. A base class that has many descendant types is a good example of foundational code.
Now, in a properly architected, designed, and factored system, such foundational portions should not be a problem. Best practices will be observed; anti-patterns will be avoided. If you work in such systems: congratulations! Write books and blogs and go forth. Stop reading now. Have a nice day.
For the rest of us: eventually, some change will be suggested for a foundational portion of the system. How can this be done? The mere suggestion will cause other developers to say that too many things are touched by the foundational portions, best not to change things.
Others will say that the fundamental part should be improved because the portion in question is so foundational. Best to build a house on a solid foundation. Surely the political debates will rage in the wake of the suggested change.
How can the debate be steered towards best-practices and higher-quality software? Is there some middle ground between not touching the foundational code and making a radical change that could cause the already shaky system to fail in unanticipated ways?
What if the code is improved, but not the "whole way?" Let's consider an example (I'm thinking about Java as an implementation language here, but that's just a detail): suppose that you have some base class B. B was written years ago by a programmer armed with a humanities degree and a "Learn Java in X days" book where X is far less time than you spent working on your CS degree (and even you're still not totally sure what the volatile keyword does).
Class B was prolific and has on the order of 50 descendant classes. Changing the poorly written B could be disastrous.
Suppose an issue arises with a particular field F of B. Field F is not initialized at construction time and so begins life as null. Now, B has an accessor and a mutator for F, so other parts of the code can change F. To make things interesting, our code poet skipped day three of his book in which he was instructed to validate his data. This means that the mutator has no guard clauses to ensure that changes to F occur with valid values.
Then one fine day, suppose that the dreaded NullPointerException rears its ugly head because some calculation assumed that F is non-null, but alas, it is null. What happens next?
The race is on to "fix the bug," and eventually some change is made in some descendant class for the particular case in which F is null - some believe that all is well again. Others know that little has been gained because F might be null in other cases.
Then comes the suggestion, "why not simply initialize field F at B construction-time and then ensure the validity of field F by proper checks when it is changed?" Crazy Talk! Type B is too foundational! Too many things depend on it!! Who knows what will happen?!! Changing it could cause the lights to go out!!!
How do we firm up the foundation while allaying fears? Suppose the "change" is "soft." Suppose that a parameter is added to all B constructors that accepts a value for F so that F can be properly initialized. Now recompile and look for errors. All of the B subtypes should be broken: fix them by passing null, or a valid value if you can easily find one at the construction point. Also, augment the B constructor and the F mutator with guard clauses that check for null. But here's the soft part: the guard clauses don't do anything drastic like throw an IllegalArgumentException; instead, they generate a log message, or raise some system operator alarm, or send an email to some portion of the development team.
The soft change shouldn't change the current behavior of the system. Those who were afraid to make the change win - because nothing really changed (ok, the time taken to write to the log, or send the email could cause some crazy timing bug. Technically, there was a change, but only a tiny one that is not likely to result in catastrophe).
These checks allow you to study the system as it runs and determine where null values are generated. Analyze them one at a time in the child classes and refactor so that null is no longer passed. Eventually, you should work through them, and then, when the team can see that nulls are not being passed anymore, you can confidently change the guard clauses in B to be "hard" and throw an exception like IllegalArgumentException.
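Here is a minimal sketch of what such a soft guard might look like. The class below is a hypothetical stand-in for B, the field f stands in for F, and the hardGuards switch is my own assumption for illustration; editing the guard clause by hand once the logs are clean works just as well.

```java
import java.util.logging.Level;
import java.util.logging.Logger;

/** Hypothetical stand-in for the base class B; the field "f" plays the role of F. */
class B {
    private static final Logger LOG = Logger.getLogger(B.class.getName());

    /** Assumed switch: flip to true once the logs show that null is no longer passed. */
    static volatile boolean hardGuards = false;

    private Object f;

    B(Object f) {
        checkF(f, "constructor");
        this.f = f;
    }

    Object getF() {
        return f;
    }

    void setF(Object f) {
        checkF(f, "setF");
        this.f = f;
    }

    private static void checkF(Object value, String where) {
        if (value != null) {
            return;
        }
        if (hardGuards) {
            throw new IllegalArgumentException("F must not be null (" + where + ")");
        }
        // Soft mode: log the offending call site (via the stack trace) and carry on as before.
        LOG.log(Level.WARNING, "null passed for F in " + where, new Throwable("call site"));
    }
}
```

In soft mode, new B(null) still succeeds exactly as it did before the change, but the log now records a stack trace that points at the subclass that passed the null.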
07 October 2009
This is my reading list for "enterprise" projects that use Java as the primary programming language. The list is short, but it has served me well.
This "bare-bones" reading list is designed primarily for programmers who (like me) work on large-scale, "legacy" Java code-bases. The list could serve as the foundation for a more expanded list with "more advanced topics."
- "Effective Java," 2nd edition, Bloch - this is the one book to have if you have only one book.
- "Java Concurrency in Practice," Goetz et al. - this is a must-have for concurrent enterprise apps.
- "The Java Programming Language," 4th ed. Refer to it often.
- (freely available) "The Java Language Spec," http://java.sun.com/docs/books/jls/ Refer to it just as often.
That's it for the core, "must-have" Java stuff, but you'll need to understand patterns as well:
- "Design Patterns," Gamma et al.
- "Patterns of Enterprise Application Architecture," Fowler
- "xUnit Test Patterns," Meszaros
Then there's the "recommended" section for Java, a little dated, but still relevant:
- "Hardcore Java," Simmons
- "Better, Faster, Lighter Java," Tate & Gehtland
That's it. Happy reading!
16 October 2008
- Wikipedia List of tools for static code analysis
- Free/Open Source
- Java Compiler
- Overview (Compiler Tree API)
- The 12-steps plan (for extending the Javac compiler) - Java: Developing On The Streets
- Tom Ball's Blog: Hacking javac
- JDK 6 Java Compiler (javac)-related APIs & Developer Guides -- from Sun Microsystems
- The Java Compiler API
- java.net: The Open Road: Superpackages
- The Java Community Process(SM) Program - JSRs: Java Specification Requests - detail JSR# 199
- List of Java virtual machines - Wikipedia, the free encyclopedia
- Talks, Blogs, etc.
- my take on things: Why is Static Analysis Uncommon?
- Greg - Stack Overflow
- What code analysis tools do you use for your Java projects? - Stack Overflow
- JavaOne Technical Sessions - Improving Software Quality with Static Analysis - 2008
- TS-2007.pdf
- java.net: The Open Road: javax.annotation
- Gregg Sporar's Blog: Talking About Static Analysis
- Fabiano Cruz's Blog: Java Static Analysis made easy with Apache Maven 2
- YouTube - Using Static Analysis For Software Defect Detection
01 August 2008
Java packages may have a corresponding package-info.java source file. This file provides a central location for package-wide information. A Java compiler will compile the package-info.java source file just like any other Java source file... sort of.
The thing is, the package-info.java source file is a special sort of source file:
- Its contents typically consist of Javadoc documentation comments that describe the package. In fact, the package-info.java feature replaces the legacy package.html file that served the same purpose.
- Sometimes a package is annotated in its package-info.java file.
- Typically, package-info.java is devoid of source code proper.
See The Java Language Specification, 3rd ed., sec. 7.4.1.1 "Package Annotations" for more information on the intention of the package-info.java file.
No package-info.class file is produced by the compiler when a package-info.java file contains only Javadoc comments. Why is this an issue? It's an issue because most build tools interpret the presence of an up-to-date .class file as a reason not to compile a corresponding .java file. Without a package-info.class, a build system might compile package-info.java every time the build is invoked regardless of whether package-info.java has changed. Such unnecessary compiler invocations have the effect of increasing build execution time. This is not such a big deal for small projects, but the impact on build time can become annoying as the size of a codebase grows.
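To make the up-to-date logic concrete, here is a rough sketch of the kind of timestamp check a build tool might perform. The class and method names are hypothetical, and real build tools use their own (often more elaborate) rules.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper that mimics a build tool's staleness check. */
class UpToDateCheck {
    /** Returns true when the source must be compiled: no class file, or a class file older than the source. */
    static boolean needsCompile(Path source, Path classFile) throws IOException {
        if (!Files.exists(classFile)) {
            // Always the case for a package-info.java that never yields a package-info.class.
            return true;
        }
        return Files.getLastModifiedTime(source)
                .compareTo(Files.getLastModifiedTime(classFile)) > 0;
    }
}
```

With a check like this, a package-info.java that never produces a class file looks stale on every single build.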
In my environment a corresponding package-info.class file is created only if at least one package annotation exists. Intuitively, it would seem that such annotations would need to have a retention policy "stronger" than java.lang.annotation.RetentionPolicy.SOURCE. However, I observe the generation of a package-info.class file even if no annotation exists with a stronger retention policy (RetentionPolicy.CLASS (the default) or RetentionPolicy.RUNTIME).
The presence of source code proper does not cause a package-info.class file to be produced. Rather, class files with names appropriate to the types defined in the source code are generated. For example, if I write a package-private Foo class in package-info.java, Foo.class is generated.
So then, how can a build be protected from unnecessarily invoking the compiler due to the lack of a package-info.class file? One solution to this issue is to annotate each package with an annotation that serves no purpose other than to force the generation of a corresponding package-info.class file.
The annotation I wrote for this solution is called ForcePackageByteCode. I gave it a retention policy of RetentionPolicy.CLASS which clearly states that it needs to be in a .class file. Just annotate the package statement in a package-info.java file with @ForcePackageByteCode:

    /**
     * This package handles all of the foo.
     */
    @ForcePackageByteCode
    package com.example.foo;
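For reference, a declaration along these lines would fit the description above. Treat it as a sketch: the CLASS retention policy comes from the text, but the @Target restriction to packages is my assumption, and the type is package-private only so the example fits in one file.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

/**
 * Marker annotation whose only job is to make the compiler emit package-info.class.
 * Declared package-private to keep this sketch in one file; a shared version would be
 * public and live in its own ForcePackageByteCode.java.
 */
@Retention(RetentionPolicy.CLASS)
@Target(ElementType.PACKAGE) // assumption: only package declarations may carry it
@interface ForcePackageByteCode {
}
```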
Admittedly, this solution is not ideal because it creates extraneous output for the sole purpose of optimizing future build invocations. On the plus side, the size of the extraneous files is relatively small. This minor cost in size can be well worth the build time savings over the course of a project.
It would be nice if release candidate builds could "turn off" the extraneous file generation by simply changing the ForcePackageByteCode retention policy to RetentionPolicy.SOURCE. Instead, a global search and replace operation that changes all occurrences of the annotation to a comment for all package-info.java files will do the trick.
There are other solutions to this problem. The Apache Ant build tool is addressing this issue with built-in tool support. The Ant team has devised some heuristics for deciding when to ignore package-info.java files during compilation. Such work is helpful, but relying on support from a particular build tool has the drawback of being a partial solution if source is built in several build environments.
Alternatively, the need for a solution like the ForcePackageByteCode annotation is reduced if there is another reason for a project to annotate packages. For example, the static byte code analysis tool FindBugs supports the edu.umd.cs.findbugs.annotations.DefaultAnnotation annotation. This annotation declares that specified annotations be applied to all classes, fields, and methods of a package. For example, annotating a package with @DefaultAnnotation(NonNull.class) tells FindBugs that the NonNull annotation should be applied to all classes, fields, and methods of a package. Applying DefaultAnnotation to a package-info.java file causes a package-info.class to be produced, obviating the need for ForcePackageByteCode.
31 July 2008
I have some Java networking code that uses the Java NIO (java.nio) package and the Java Concurrency utilities (java.util.concurrent) package. My code is structured to enqueue pending output on a write buffer queue. The write buffer queue is a sequence of ByteBuffers implemented as a BlockingDeque<ByteBuffer>.
Now, I want to perform a gathering write operation of pending ByteBuffers in the write buffer queue. The BlockingDeque has a drainTo(Collection<? super E>) method that removes elements from the Deque and places them in a specified Collection of the appropriate type. Here's the snag: the write() method requires a ByteBuffer[], not a Collection<ByteBuffer>.
Converting from a Collection<ByteBuffer> to a ByteBuffer[] seems as simple as calling the java.util.Arrays.asList(T...) method. Just wrap the target array as a List, drain the queue into it, then call write() with the filled array:

    ByteBuffer[] writeBufferArray = new ByteBuffer[BUFFER_COUNT];         // write() needs this
    List<ByteBuffer> writeBufferList = Arrays.asList(writeBufferArray);   // drainTo() needs this
    int writeBufferCount = writeBufferDeque.drainTo(writeBufferList);     // fills writeBufferArray?
    socketChannel.write(writeBufferArray, 0, writeBufferCount);
Unfortunately, this does not work because an UnsupportedOperationException is raised on the call to drainTo(). This happens because the List produced by Arrays.asList(T...) is a fixed-size view of the array and does not permit append operations. It enforces this restriction by not overriding the java.util.AbstractList.add(int, E) method, whose default implementation throws the exception.
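The failure is easy to reproduce in isolation. The following sketch uses a LinkedBlockingDeque as the queue implementation (an assumption on my part; any BlockingDeque that drains by calling add() on the target should behave the same way), and the class and method names are made up for the demonstration.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.BlockingDeque;
import java.util.concurrent.LinkedBlockingDeque;

class AsListDrainDemo {
    /** Returns true if draining into an Arrays.asList view fails with UnsupportedOperationException. */
    static boolean drainIntoAsListFails() {
        BlockingDeque<ByteBuffer> deque = new LinkedBlockingDeque<ByteBuffer>();
        deque.add(ByteBuffer.allocate(8)); // an empty deque would drain nothing and never call add()

        ByteBuffer[] array = new ByteBuffer[4];
        List<ByteBuffer> fixedSizeView = Arrays.asList(array); // set() works, add() does not

        try {
            deque.drainTo(fixedSizeView); // drainTo() appends to the target collection
            return false;
        } catch (UnsupportedOperationException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("drainTo failed: " + drainIntoAsListFails());
    }
}
```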
It would be easy enough to copy the ByteBuffers to a "temporary" Collection<ByteBuffer>, then call the toArray() method, but that results in a "double copy." Each ByteBuffer reference would be copied from the BlockingDeque to the temporary Collection<ByteBuffer>, then copied from the temporary Collection<ByteBuffer> to the array produced by toArray().
Avoiding the double copy means that some way must be found to write directly to a ByteBuffer[]. Since drainTo() demands a Collection, some alternative array wrapper must be used.
One potential candidate is the venerable ArrayList. ArrayList properly encapsulates an array by copying any elements given to it into its own internal array and producing a defensive copy when toArray() is invoked on it. ArrayList does its job well, but it's not suited to this task because it performs copy operations to and from its own array, not a client-provided array.
My solution for a List that writes through to an externally accessible array is called ArrayWrapperList. It works like this:

    ByteBuffer[] writeBufferArray = new ByteBuffer[BUFFER_COUNT];
    List<ByteBuffer> writeBufferList = new ArrayWrapperList<ByteBuffer>(writeBufferArray);
    int writeBufferCount = writeBufferDeque.drainTo(writeBufferList); // writes through directly to writeBufferArray
    socketChannel.write(writeBufferArray, 0, writeBufferCount);
Conceptually, the code behaves as though writeBufferList.toArray(writeBufferArray) is invoked immediately after the call to drainTo(). It should be noted that in my production version I invoke drainTo(Collection<? super E>, int) not drainTo(Collection<? super E>). The upper bound on the drained elements ensures that the size of writeBufferArray is not exceeded.
ArrayWrapperList does not encapsulate the array it manages like ArrayList does. ArrayWrapperList trades off encapsulation for the ability to write directly to a client-accessible array.
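To give a feel for the idea, here is one way such a write-through list could be put together on top of AbstractList. This is a sketch under my own assumptions, not the production class: the constructor signature, the behavior when the array is full, and the ArrayWrapperListDemo helper are all invented for illustration.

```java
import java.nio.ByteBuffer;
import java.util.AbstractList;
import java.util.concurrent.BlockingDeque;

/** Hypothetical write-through list: elements added to the list land directly in the caller's array. */
class ArrayWrapperList<E> extends AbstractList<E> {
    private final E[] array; // shared with the caller on purpose, never copied
    private int size;

    ArrayWrapperList(E[] array) {
        this.array = array;
    }

    @Override
    public E get(int index) {
        checkIndex(index, size);
        return array[index];
    }

    @Override
    public E set(int index, E element) {
        checkIndex(index, size);
        E old = array[index];
        array[index] = element;
        return old;
    }

    /** AbstractList.add(E) delegates here with index == size(), so appends go straight into the array. */
    @Override
    public void add(int index, E element) {
        checkIndex(index, size + 1);
        if (size == array.length) {
            throw new IllegalStateException("wrapped array is full");
        }
        System.arraycopy(array, index, array, index + 1, size - index);
        array[index] = element;
        size++;
        modCount++;
    }

    @Override
    public int size() {
        return size;
    }

    private static void checkIndex(int index, int bound) {
        if (index < 0 || index >= bound) {
            throw new IndexOutOfBoundsException("index " + index + ", bound " + bound);
        }
    }
}

class ArrayWrapperListDemo {
    /** Drains at most target.length buffers straight into target and returns how many were drained. */
    static int drain(BlockingDeque<ByteBuffer> deque, ByteBuffer[] target) {
        return deque.drainTo(new ArrayWrapperList<ByteBuffer>(target), target.length);
    }
}
```

Passing target.length as the bound mirrors the drainTo(Collection<? super E>, int) call mentioned above, so the wrapped array can never overflow during a drain.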