Sunday, July 16, 2006

Missing API details in Java: Null references

Even though Java APIs tend to be well documented thanks to the Javadoc, there are some details that are quite often missing, causing developers to program by coincidence. One of the main issues is the handling of null references.

Although there are guaranteed to be no dangling references on the Java platform, a reference can still be null and cause the infamous NullPointerException (aka NPE) when passed to an unwary piece of code. Null references are very convenient for expressing the absence of something, but these special cases are often not well documented. There are three main cases of null references that are commonly used but seldom documented: optional arguments, member variables, and return values.

Optional arguments

Instead of overloading a method name to cover the case where one or more of the arguments are unavailable, the method can allow some of its arguments to be null. This is especially common for constructors that accept optional configuration options. The practice is very convenient, but disturbingly often not documented, leaving the client developer to wonder whether it is OK to pass a null reference as a seemingly optional method argument. Often the solution is to just pass the null reference and rely on coincidence to keep it working.

A good example is the DocumentBuilder.parse(InputStream stream, String systemId) method in JAXP. It is explicitly documented that an IllegalArgumentException is thrown when the stream argument is null, but the systemId argument is just documented as "Provide a base for resolving relative URIs". There is also an overloaded DocumentBuilder.parse(InputStream stream) method, and incidentally it happens that calling the former method with a null systemId is equivalent to calling the latter method.

Now a JAXP client developer who has an InputStream and a system identifier string that might be null could either do the right thing and program defensively:
if (systemId == null) {
    builder.parse(stream);
} else {
    builder.parse(stream, systemId);
}

or rely on the coincidence that a null system identifier is actually allowed:
builder.parse(stream, systemId);

The latter case is, in my experience, what most developers would do, and thus the JAXP implementation is in practice required to keep allowing null system identifiers. The systemId argument should therefore be documented as "Optional base for resolving relative URIs" or even more explicitly as "Base for resolving relative URIs, or null".
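As a minimal sketch of what such documentation could look like, the hypothetical BaseResolver class below states the null contract of each argument in its Javadoc and backs it with an explicit check. The class and its methods are invented for illustration and are not part of JAXP.

```java
import java.net.URI;

/** Hypothetical helper used only to illustrate a documented optional argument. */
final class BaseResolver {

    private BaseResolver() {
    }

    /**
     * Resolves a URI reference against an optional base.
     *
     * @param reference the URI reference to resolve, must not be null
     * @param systemId base for resolving relative URIs, or null if there is no base
     * @return the resolved URI, never null
     * @throws IllegalArgumentException if reference is null
     */
    static URI resolve(String reference, String systemId) {
        if (reference == null) {
            throw new IllegalArgumentException("reference must not be null");
        }
        URI uri = URI.create(reference);
        if (systemId == null) {
            // Documented case: no base available, return the reference as given.
            return uri;
        }
        return URI.create(systemId).resolve(uri);
    }

    /** Equivalent to calling resolve(reference, null). */
    static URI resolve(String reference) {
        return resolve(reference, null);
    }
}
```

With the contract spelled out like this, a caller holding a possibly null systemId can pass it straight through without relying on coincidence.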

Member variables

A good practice in Java is to keep all member variables private or at least protected. Unfortunately this allows the developer to be lazy in documenting the permitted states of the variable. After all, a private member is not a part of the public interface of a class, so why bother documenting it? A member variable can be null either by having explicitly been set so or by having been passed as null to a constructor or a setter. Often you need to search through the sources of a class to determine the possible states of a member variable. This is especially important when using the JavaBean conventions, where a private member variable is often exposed through a getter method with a template Javadoc that contains no mention of the valid states of the underlying variable.

The JavaBean case is actually especially troublesome as the common pattern for JavaBean properties is:
private Object something;

/** Returns something */
public Object getSomething() {
    return something;
}

/** Sets something */
public void setSomething(Object something) {
    this.something = something;
}

It is most often not documented whether null references are allowed in the setter or whether the client is required to explicitly set the property before doing anything with a bean instance. This is in my experience the main cause of NullPointerExceptions in component-based systems.
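One way to close this gap, sketched here with a hypothetical Connection bean (the class and its host and description properties are made up for illustration), is to state for each property whether null is a valid state and to enforce that in the setter:

```java
/** Hypothetical bean illustrating documented null contracts for its properties. */
final class Connection {

    /** Host name; never null once the bean has been constructed. */
    private String host;

    /** Optional description, or null if none has been set. */
    private String description;

    /**
     * Creates a connection bean.
     *
     * @param host the host name, must not be null
     * @throws IllegalArgumentException if host is null
     */
    Connection(String host) {
        setHost(host);
    }

    /** Returns the host name, never null. */
    public String getHost() {
        return host;
    }

    /**
     * Sets the host name.
     *
     * @param host the host name, must not be null
     * @throws IllegalArgumentException if host is null
     */
    public void setHost(String host) {
        if (host == null) {
            throw new IllegalArgumentException("host must not be null");
        }
        this.host = host;
    }

    /** Returns the description, or null if no description has been set. */
    public String getDescription() {
        return description;
    }

    /**
     * Sets the description.
     *
     * @param description the description, or null to clear it
     */
    public void setDescription(String description) {
        this.description = description;
    }
}
```

The required property fails fast at the point where the null is introduced, and the optional one tells the client up front that a null check is needed after calling the getter.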

Return values

Null references are commonly used to represent the absence of some value. For example the Map.get(Object key) method returns null when an entry for the given key is not found in the map. Such cases are usually well documented (the Map.get method returns "the value to which this map maps the specified key, or null if the map contains no mapping for this key"), but in some cases it is just implicitly assumed that a client developer will expect a null return value.

The most common causes of undocumented null return values are the JavaBean getters described above, but sometimes a genuine processing method forgets to mention that the return value might be null. A good example is the ZipInputStream.getNextEntry() method, which returns "the ZipEntry just read" but fails to mention that the "ZipEntry just read" is null if no more Zip entries are available. A clever developer will of course assume that this is the case, since the method doesn't throw a NoSuchElementException like the Iterator.next() method does, but the only way to know for sure is to read the ZipInputStream sources, and even then you are left with the bad feeling that the implementation might well be changed in a future release.

The return value of the getNextEntry() method should therefore be documented as "the ZipEntry just read, or null if no more entries are available".
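Once that contract is written down, the usual read loop is safe to rely on. The sketch below assumes Java 7 or later for try-with-resources; the ZipEntryNames class and its sampleArchive() helper are hypothetical and exist only to make the example self-contained.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

/** Hypothetical helper that lists entry names using the null-terminated read loop. */
final class ZipEntryNames {

    private ZipEntryNames() {
    }

    /** Returns the names of all entries in the given Zip stream, in order. */
    static List<String> list(InputStream in) throws IOException {
        List<String> names = new ArrayList<>();
        try (ZipInputStream zip = new ZipInputStream(in)) {
            ZipEntry entry;
            // getNextEntry() returns null when no more entries are available.
            while ((entry = zip.getNextEntry()) != null) {
                names.add(entry.getName());
            }
        }
        return names;
    }

    /** Builds a small in-memory archive with two entries for demonstration. */
    static byte[] sampleArchive() throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry("a.txt"));
            zip.write(new byte[] {1});
            zip.closeEntry();
            zip.putNextEntry(new ZipEntry("b.txt"));
            zip.closeEntry();
        }
        return bytes.toByteArray();
    }
}
```

Calling ZipEntryNames.list(new ByteArrayInputStream(ZipEntryNames.sampleArchive())) walks both entries and stops cleanly when getNextEntry() returns null.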

5 comments:

  1. > The return value of the getNextEntry() method should therefore be documented as
    > “the ZipEntry just read, or null if no more entries are available”.

    Now that I checked it, this has already been fixed in Java 5: "the next ZIP file entry, or null if there are no more entries". Good work, Sun!

  2. Java could do much better at preventing NPEs than it currently does. See:

    "Add Nice Option types to Java to prevent NullPointerExceptions"
    http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5030232

  3. You should check out the Nice programming language (nice.sourceforge.net), which is based on Java but adds a much stronger typing mechanism. Among other things, it gets rid of the NullPointerException by forcing you to declare when a parameter or variable is allowed to be null, and then checking before dereferencing it. The compiler can then ensure at any given point that no NPE will be thrown.

  4. Nice looks nice, thanks for the pointers. It however doesn't solve the issue for existing Java code. The Nully project (http://nully.dev.java.net/) mentioned on the referenced Java feature request seems like another good approach (using Java 5 annotations) to help remedy this issue. I'm also wondering whether Checkstyle has some static inference modules for flagging common null reference issues.

  5. Related to this is http://www.c2.com/cgi/wiki?NoNullBeyondMethodScope
