Mitigating the Billion-Dollar Mistake
Dereferencing null pointers is one of the most common software errors. It is so infamous that Tony Hoare called the invention of null pointers his billion-dollar mistake. Fortunately, we can significantly reduce the chance of a null pointer dereference with the right techniques, which we’re going to explore later in this blog post. But first, let’s talk about why null pointers exist at all.
Null pointers represent a default value which is used when something unexpected happens. As this is a common need, they have been included in many programming languages over the last sixty years. For example, imagine that we’re working on a banking application and need to look up the account balance for a bank account with a given number. What do we do if there is no account for the given account number? In this case, we need to return some value which clearly indicates that nothing was found, i.e., a null pointer.
The problem with null pointers is that they shift the burden of dealing with the unexpected situation back to the caller: it is very easy to return a null pointer, but more difficult to handle one. Null pointers also have the nasty habit of spreading through the codebase, as it is very tempting to react to receiving a null pointer by just returning one as well. As a result, null pointers are very common in many codebases.
Careless usage of null pointers can lead to many problems, so how do we use them properly? The first thing we should keep in mind is that we don’t have to use them.
Null Pointer Alternatives
There are alternatives to using null pointers. For example, when we return a data collection of some sort, e.g., a list, it is easier and better to return an empty list instead of a null pointer. This enables the caller to use the data without any special care, as interacting with empty lists is (mostly) safe. The downside of this technique is that it is a little less expressive. Let’s assume we’re looking up the list of transactions made on a bank account this month. If we return an empty list when the account cannot be found, then our method returns the same result as when the account is found but has no transactions in the given month. As a result, the caller cannot distinguish between those two cases. This is fine most of the time, but we should keep it in mind.
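The empty-list technique can be sketched in Java as follows. This is only a minimal illustration: TransactionStore, addAccount and findTransactions are hypothetical names, and the in-memory map stands in for whatever storage a real application would use.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class EmptyListDemo {
    public static void main(String[] args) {
        TransactionStore store = new TransactionStore();
        store.addAccount("A-1", List.of());

        // Both loops are safe to run without a null check. They are also
        // indistinguishable: a known account without transactions and an
        // unknown account both yield an empty list.
        for (String transaction : store.findTransactions("A-1")) {
            System.out.println(transaction);
        }
        for (String transaction : store.findTransactions("no-such-account")) {
            System.out.println(transaction);
        }
    }
}

// Hypothetical in-memory store, used only for illustration.
class TransactionStore {
    private final Map<String, List<String>> transactionsByAccount = new HashMap<>();

    void addAccount(String accountNumber, List<String> transactions) {
        transactionsByAccount.put(accountNumber, List.copyOf(transactions));
    }

    // Returns an empty list instead of null when the account is unknown.
    List<String> findTransactions(String accountNumber) {
        return transactionsByAccount.getOrDefault(accountNumber, List.of());
    }
}
```

If the caller does need to tell the two cases apart, one of the other alternatives in this section is probably a better fit.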
If our language supports exceptions, we can throw one instead of returning a null pointer if we get bad input. That can be better than returning a null pointer but also means that we have to answer these questions:
- Which exception should we throw? Should it be a custom exception or a generic one?
- How should the caller handle this exception?
- Should it be a checked or an unchecked exception?
- How often will this exception occur? Conventional programming wisdom is that exceptions should occur rarely and shouldn’t be used for regular code flow as this makes real errors harder to spot.
To better understand this, let’s again use the bank account transaction lookup as an example. If we assume that only valid bank account numbers should exist in our application, then throwing an exception when the bank account cannot be found might be an option. In that case, we should use an unchecked exception: we expect a valid bank account number, and there is nothing meaningful the caller can do to recover from the exception because we’re in an invalid system state. However, if we think that non-existing bank accounts are a common problem and there is something meaningful the caller can do to recover in this situation, then we should throw a checked exception.
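The two options might look roughly like this in Java. It is a sketch under assumptions: AccountLookup, AccountNotFoundException and both method names are made up for this example, and IllegalStateException is just one possible choice of unchecked exception.

```java
import java.util.Map;

class ExceptionDemo {
    public static void main(String[] args) {
        AccountLookup lookup = new AccountLookup(Map.of("A-1", 250L));

        // Checked variant: the compiler forces the caller to handle the case.
        try {
            lookup.balanceOrRecoverable("A-2");
        } catch (AccountNotFoundException e) {
            System.out.println("Recovering: " + e.getMessage());
        }

        // Unchecked variant: no handler is required, a missing account is a bug.
        System.out.println(lookup.balanceOrFail("A-1"));
    }
}

// Hypothetical checked exception for the recoverable case.
class AccountNotFoundException extends Exception {
    AccountNotFoundException(String accountNumber) {
        super("No account with number " + accountNumber);
    }
}

// Hypothetical lookup class; balances are kept in cents.
class AccountLookup {
    private final Map<String, Long> balancesInCents;

    AccountLookup(Map<String, Long> balancesInCents) {
        this.balancesInCents = Map.copyOf(balancesInCents);
    }

    // Use when a missing account means the system is in an invalid state.
    long balanceOrFail(String accountNumber) {
        Long balance = balancesInCents.get(accountNumber);
        if (balance == null) {
            throw new IllegalStateException("No account with number " + accountNumber);
        }
        return balance;
    }

    // Use when a missing account is expected and the caller can recover.
    long balanceOrRecoverable(String accountNumber) throws AccountNotFoundException {
        Long balance = balancesInCents.get(accountNumber);
        if (balance == null) {
            throw new AccountNotFoundException(accountNumber);
        }
        return balance;
    }
}
```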
The null object pattern is another alternative to returning a null pointer. Here, a specially prepared object is returned which has the desired return type but doesn’t do anything. So, if we try to look up a bank account which doesn’t exist, we get a dummy bank account object in return. It isn’t null, so we don’t crash when interacting with it, but it doesn’t do anything meaningful either. This pattern is rarely useful because it can introduce very subtle bugs. We can easily tell that we accidentally dereferenced a null pointer because we crash immediately. However, we might never realize that we’re interacting with a null object until something unexpected happens. For example, what should happen if we try to transfer money from our dummy bank account to a different one? The null object pattern sounds good on paper, but it only masks the problem instead of solving it.
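The transfer problem can be made concrete with a small Java sketch. All types here (Account, RealAccount, NullAccount, Accounts) are hypothetical and deliberately simplified; the point is only to show how a null object can hide an error.

```java
class NullObjectDemo {
    public static void main(String[] args) {
        Account source = Accounts.find("no-such-account"); // a NullAccount
        Account target = new RealAccount(0);

        // Nothing crashes, but the withdrawal silently does nothing while the
        // deposit succeeds: the transfer creates money out of thin air.
        source.withdraw(100);
        target.deposit(100);
        System.out.println(target.balance());
    }
}

interface Account {
    void deposit(long cents);
    void withdraw(long cents);
    long balance();
}

class RealAccount implements Account {
    private long balanceInCents;

    RealAccount(long balanceInCents) {
        this.balanceInCents = balanceInCents;
    }

    public void deposit(long cents) { balanceInCents += cents; }
    public void withdraw(long cents) { balanceInCents -= cents; }
    public long balance() { return balanceInCents; }
}

// The null object: it has the right type but ignores every operation.
class NullAccount implements Account {
    public void deposit(long cents) { /* intentionally does nothing */ }
    public void withdraw(long cents) { /* intentionally does nothing */ }
    public long balance() { return 0; }
}

// Hypothetical lookup that returns a null object for unknown accounts.
class Accounts {
    static Account find(String accountNumber) {
        if ("A-1".equals(accountNumber)) {
            return new RealAccount(500);
        }
        return new NullAccount();
    }
}
```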
The last alternative to returning a null pointer is to return a dedicated type which clearly indicates that there might be no value. Here, we wrap the actual return type in a container. For example, in Java this is done via the Optional container object. The big advantage of this pattern is that the possibly missing value becomes part of the method signature, so the caller has to unwrap the container before using the value and the compiler can help enforce proper handling. The downside is that it can make the code more verbose. Similar to null pointers, Optional has the nasty habit of spreading through the codebase. Also, mixing Optional and null pointers in the same codebase can be quite confusing.
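A minimal sketch of the container approach with java.util.Optional could look like this. BalanceService and findBalance are hypothetical names chosen for the example; only the Optional calls are part of the Java standard library.

```java
import java.util.Map;
import java.util.Optional;

class OptionalDemo {
    public static void main(String[] args) {
        BalanceService service = new BalanceService(Map.of("A-1", 250L));

        // The caller has to unwrap the Optional before it can use the value.
        service.findBalance("A-1")
               .ifPresent(balance -> System.out.println("Balance: " + balance));

        // A fallback has to be chosen explicitly for the missing case.
        long fallback = service.findBalance("A-2").orElse(0L);
        System.out.println("Fallback: " + fallback);
    }
}

// Hypothetical service; balances are kept in cents.
class BalanceService {
    private final Map<String, Long> balancesInCents;

    BalanceService(Map<String, Long> balancesInCents) {
        this.balancesInCents = Map.copyOf(balancesInCents);
    }

    // The return type tells the caller that there may be no balance.
    Optional<Long> findBalance(String accountNumber) {
        return Optional.ofNullable(balancesInCents.get(accountNumber));
    }
}
```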
Now we’ve seen various alternatives to using null pointers. We should use one of these when it is appropriate, to circumvent the inherent danger of null pointers. However, there are cases when a null pointer is actually the best solution. In these cases, we should use it despite its dangers. Next, let’s explore best practices on how to use null pointers.
Never Pass Null Pointers
We should never pass any null pointers into other classes or methods, as doing this can lead to hard-to-understand problems. Even if the method we call directly can deal with null pointers, it might call other methods, and as soon as one of these isn’t ready for a null pointer we will crash. In that case, it is not easy to figure out where the null pointer originally came from. To make absolutely sure that we don’t pass null pointers, we might want to add assert statements or throw an exception when someone passes a null pointer to our code. This way, incorrect calls can be found quickly and easily.
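In Java, one way to reject null arguments early is Objects.requireNonNull from the standard library, which throws a NullPointerException with the given message at the point where the null value enters our code. The sketch below uses a hypothetical TransferService class. Note that Java's assert statement is only evaluated when assertions are enabled (for example with the -ea JVM flag), so an explicit check like this one also works in production.

```java
import java.util.Objects;

class FailFastDemo {
    public static void main(String[] args) {
        TransferService service = new TransferService("Example Bank");
        System.out.println(service.describe("A-1"));

        try {
            service.describe(null);
        } catch (NullPointerException e) {
            // The message names the offending parameter right at the entry point.
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}

// Hypothetical class that refuses null input at its boundaries.
class TransferService {
    private final String bankName;

    TransferService(String bankName) {
        this.bankName = Objects.requireNonNull(bankName, "bankName must not be null");
    }

    String describe(String accountNumber) {
        Objects.requireNonNull(accountNumber, "accountNumber must not be null");
        return bankName + ":" + accountNumber;
    }
}
```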
Use Static Code Checks
When we have decided to return a null pointer, we want to make sure that this is handled by our callers. The best way to do this is by using static code checks. For example, this can be done in Java with SpotBugs and its CheckForNull annotation. We can configure SpotBugs so that it breaks our continuous integration build when we forget to consider a potential null pointer. This way, we can be confident that null pointers are correctly handled. However, this also has the downside that marking a method with CheckForNull can be very disruptive if not all callers are ready for this change, as it will break the build. Hence, this feature needs to be used with care.
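The following sketch shows what an annotated method and a well-behaved caller might look like. To keep it compilable without extra dependencies, it declares its own stand-in CheckForNull annotation; in a real project we would instead import edu.umd.cs.findbugs.annotations.CheckForNull from the spotbugs-annotations library, and the stand-in by itself triggers no analysis. OwnerDirectory and OwnerReport are hypothetical names.

```java
import java.lang.annotation.Documented;
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.util.Map;

class CheckForNullDemo {
    public static void main(String[] args) {
        OwnerDirectory directory = new OwnerDirectory(Map.of("A-1", "Alice"));
        System.out.println(OwnerReport.ownerLabel(directory, "A-1"));
        System.out.println(OwnerReport.ownerLabel(directory, "A-2"));
    }
}

// Stand-in for the SpotBugs annotation (assumption: used only so that this
// example compiles on its own; it is not recognized by any tool).
@Documented
@Retention(RetentionPolicy.CLASS)
@Target(ElementType.METHOD)
@interface CheckForNull {}

// Hypothetical directory of account owners.
class OwnerDirectory {
    private final Map<String, String> ownersByAccount;

    OwnerDirectory(Map<String, String> ownersByAccount) {
        this.ownersByAccount = Map.copyOf(ownersByAccount);
    }

    // The annotation documents that callers must expect null here.
    @CheckForNull
    String findOwner(String accountNumber) {
        return ownersByAccount.get(accountNumber);
    }
}

class OwnerReport {
    // A caller that handles the null case before using the result.
    static String ownerLabel(OwnerDirectory directory, String accountNumber) {
        String owner = directory.findOwner(accountNumber);
        if (owner == null) {
            return "unknown";
        }
        return owner.toUpperCase();
    }
}
```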
Validate on the Database
Not all null pointers are created in the application code: some are the result of missing data in the database. When we use an object-relational mapping framework, the database data is automatically transformed into objects. If we interact with an object where a property is missing in the database, then we can easily crash with a null pointer dereference. For example, let’s assume we look up a bank account by its bank account number. We get a bank account object back and want to read the owner of the bank account from this object. Sadly, this information is missing in the database, so we get a null pointer back and crash as soon as we interact with it. We cannot easily fix this in the application layer, as we don’t want to add null pointer checks for data which should always exist. Instead, we need to prevent this kind of data corruption. In addition to our data consistency checks in the application layer, we should also validate the data on the database layer, as the database is the last line of defense. We should have some kind of schema validation so that required fields are always present and data types are as expected. This way, we can avoid a lot of trouble caused by corrupt data.
Conclusion
Dereferencing null pointers can be avoided if we minimize our usage of null pointers in return types, never pass null pointers to other methods, use static code checks to find any missing null pointer checks, and validate our data on the database level as well.
If you liked this blog post, please share it with somebody. You can also follow me on Twitter/X.