Dealing with Java linking problems

Dependency Hell

Most Java developers have probably run into problems where their code throws a NoSuchMethodError or a NoClassDefFoundError at runtime, despite compiling perfectly well. These issues can be very frustrating and hard to solve. This post tries to explain how they happen and explores some things that can be done to fix them.

Why oh Why?

A typical scenario is this:

  1. Your application is built via Maven, and uses library A and library X.
  2. You’ve specified in your pom file that you depend on version 1 of library X.
  3. You upgrade to a new version of library A. The new version of A calls a method in X that was added in version 2.
  4. Boom, NoSuchMethodError!

[Figure: Dependency Hell]

What has happened is that Maven’s dependency resolution mechanism has picked version 1 of X, because that’s what you told it to do in your pom.xml. But the new version of library A was compiled against version 2 of X and invokes the new method. That method doesn’t exist on the classpath of your application when it runs, since you’ve chosen to use X version 1.

This is analogous to the difference between compilation and linking in C/C++, but Java doesn’t (yet) have an explicit linker. In this case, units were compiled with different versions, and when the JVM tries to link them up at runtime, it doesn’t work.

The terribly insidious thing about errors of this kind is that they only show up when particular code paths are executed. If you’re unlucky, the problem could be in code that isn’t exercised by your integration tests, and things may blow up when you least want them to. For instance, there could be a NoSuchMethodError lurking in code that your Cassandra client runs only when a Cassandra node crashes in some spectacular fashion. When there’s a Cassandra incident, all your service instances suddenly crash as well, making everything worse.
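One reason these failures tend to take a whole service down is that NoSuchMethodError and NoClassDefFoundError are subclasses of LinkageError, which is an Error and not an Exception, so an ordinary catch-all handler does not intercept them. The following is a minimal sketch of that behaviour. The error is thrown by hand to simulate what the JVM would do at a broken call site, and the class and method names (LinkageErrorDemo, rareCodePath, com.example.x.Foo) are made up for illustration.

```java
// Sketch: a linkage failure is an Error, so `catch (Exception e)` does not stop it.
final class LinkageErrorDemo {

    /** Stands in for a rarely executed code path, e.g. a failover handler. */
    static void rareCodePath() {
        // Simulated: in a real application the JVM throws this itself when it
        // cannot resolve the method at the call site. The name is hypothetical.
        throw new NoSuchMethodError("com.example.x.Foo.bar(I)V");
    }

    /** Returns a description of which handler actually saw the failure. */
    static String run() {
        try {
            try {
                rareCodePath();
                return "completed";
            } catch (Exception e) {
                // A typical "catch everything" handler: not reached for linkage errors.
                return "handled as Exception";
            }
        } catch (LinkageError e) {
            // NoSuchMethodError and NoClassDefFoundError are both LinkageErrors.
            return "escaped as " + e.getClass().getSimpleName();
        }
    }

    public static void main(String[] args) {
        System.out.println(run()); // prints "escaped as NoSuchMethodError"
    }
}
```

Catching LinkageError is done here only to make the behaviour visible. Whether production code should catch it at all is a separate design question.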

In the graph above, if the fact that Library A uses Library X had been an implementation detail of A (that is, if Library X were encapsulated inside A), there would have been no problem. But with Java, a single class loader can only load one version of a class, and since libraries/jar files are normally loaded by the same class loader, that means there can only be one version of a class in an application. Whoever gets there first (based on how Maven/your build tool sets up your class path) will win. Note that Java’s class path mechanism predates Maven artifacts and their versioning solution, so there is no special treatment of JAR files on the class path depending on their artifact ID or version.
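When troubleshooting, it often helps to ask the running JVM which JAR actually "got there first" for a given class. Here is a small sketch that does this using the class’s protection domain. The helper name WhereFrom is an assumption made for this example, and the lookup can return nothing useful for classes loaded by the bootstrap loader.

```java
import java.net.URL;
import java.security.CodeSource;

// Sketch: report which JAR or directory a class was loaded from.
final class WhereFrom {

    /** Returns the location a class was loaded from, or a marker if unknown. */
    static String locate(Class<?> type) {
        CodeSource source = type.getProtectionDomain().getCodeSource();
        // JDK classes loaded by the bootstrap loader typically have no code source.
        URL location = (source == null) ? null : source.getLocation();
        return (location == null) ? "<bootstrap or unknown>" : location.toString();
    }

    public static void main(String[] args) throws ClassNotFoundException {
        // Pass a fully qualified class name, or default to this class itself.
        String name = args.length > 0 ? args[0] : WhereFrom.class.getName();
        System.out.println(name + " -> " + locate(Class.forName(name)));
    }
}
```

Running it inside the application with the name of the class from the stack trace shows which artifact won on the class path.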

Trying to solve this is the main focus of Java 9 – see project Jigsaw. But that’s going to take a long time before it takes effect, since it will mean that library maintainers will have to repackage their libraries as modules.

Some Non-Solutions

I’LL JUST USE THE LATEST VERSION!

Sorry, not good enough. Here’s an example that breaks:

  1. Library A is compiled against version 1 of X, and invokes method Foo.bar(int).
  2. In version 2, the maintainers of X add a parameter to Foo.bar, making the signature Foo.bar(int, String).
  3. You use version 2 in your application, and as soon as A tries to call Foo.bar, things blow up.
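The reason step 3 fails is that the JVM links a call site by the method’s exact name and descriptor (parameter types and return type), not by "something called bar". The sketch below uses reflection as a stand-in for that lookup. FooV2 is a hypothetical class representing Foo as shipped in version 2 of X, and reflection only checks parameter types, so this is an approximation of what the JVM does.

```java
// Sketch: a lookup by exact parameter types fails once the signature changes.
final class SignatureLookupDemo {

    /** Hypothetical stand-in for Foo in version 2 of X: bar now takes two parameters. */
    static final class FooV2 {
        public static void bar(int count, String label) {
            // Body is irrelevant for the lookup.
        }
    }

    /** Returns true if a public method with exactly these parameter types exists. */
    static boolean hasMethod(Class<?> type, String name, Class<?>... parameterTypes) {
        try {
            type.getMethod(name, parameterTypes);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // What library A was compiled to call:
        System.out.println("bar(int): " + hasMethod(FooV2.class, "bar", int.class));
        // What version 2 of X actually provides:
        System.out.println("bar(int, String): " + hasMethod(FooV2.class, "bar", int.class, String.class));
    }
}
```

This prints "bar(int): false" followed by "bar(int, String): true". The old signature is simply gone as far as the caller is concerned.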

Even worse, there’s no guarantee that a class with a particular name is only going to be defined in a single artifact. As an example, take org.hamcrest.Matcher, which will typically be defined by org.hamcrest:hamcrest-core as well as org.mockito:mockito-all on most class paths in Spotify projects. That one is generally harmless, but there’s no guarantee that all such conflicts, where different JAR files include the same class names, will be harmless.
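A quick way to check for this kind of overlap is to ask the class loader for every location that provides a given class file. This is a minimal sketch, assuming the class is visible as a resource through the context class loader. The class name DuplicateClassFinder is made up for this example.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

// Sketch: list every class path entry that contains a definition of a class.
final class DuplicateClassFinder {

    /** Takes a binary class name such as "org.hamcrest.Matcher". */
    static List<URL> findDefinitions(String className) {
        String resource = className.replace('.', '/') + ".class";
        ClassLoader loader = Thread.currentThread().getContextClassLoader();
        if (loader == null) {
            loader = ClassLoader.getSystemClassLoader();
        }
        try {
            return Collections.list(loader.getResources(resource));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        String name = args.length > 0 ? args[0] : "org.hamcrest.Matcher";
        List<URL> locations = findDefinitions(name);
        System.out.println(name + " is defined in " + locations.size() + " location(s):");
        for (URL location : locations) {
            System.out.println("  " + location);
        }
    }
}
```

More than one location in the output means several JAR files ship the same class name, and the class path order decides which one is used.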

SEMANTIC VERSIONING WILL RESCUE ME!

Afraid not. Semantic versioning is great and all, but if you have transitive dependencies that need two incompatible versions of a library that uses semantic versioning, things will still break. Example:

  1. Library A uses version 1.0.1 of X, and uses class Foo.
  2. In version 2.0.0 of X, Foo is gone, and there is a new class, Bar.
  3. Library B is compiled against version 2 of X, and uses Bar.
  4. Your application needs both A and B.

[Figure: Two conflicting dependencies needed by transitive dependencies]

There is no way to solve this without changing A, B or your application (removing the dependency on either A or B).

Some Almost-Solutions

THE ENFORCER PLUGIN

The Maven Enforcer plugin can be configured to enforce a number of different rules for your build. One of the most commonly used rules here at Spotify is requireUpperBoundDeps, which basically forces you to ‘always use the latest version that any dependency requires’. The hypothesis is that APIs will generally evolve in a way that doesn’t break backwards compatibility, so picking the latest will be the best bet. That’s probably true in general, but it comes with some problems:

  • Picking the latest version doesn’t guarantee freedom from runtime conflicts (see above).
  • It frequently leads to false positives that force you to do unnecessary work to determine which is the latest version and ensure that Maven picks that – even if there were no actual conflicts in the code.
  • It tends to pollute your pom.xml files with dependencies that have been added there just to make the enforcer plugin happy, not because they’re needed.

THE SHADE PLUGIN

Using the shade plugin, you can relocate classes from libraries you depend on, and include them in your JAR file. This means that you can make your library dependencies an implementation detail, unless you expose classes from those libraries in your API. This is great, but also comes with some problems:

  • It requires the library developers to correctly relocate dependencies that are implementation details, and many or most don’t in fact do that. (I’m usually one of the library developers who don’t do shading right).
  • It clutters the class namespace: you can end up with multiple shaded copies of the same class, which sometimes leads to incorrect imports.

[Figure: Shaded Guava dependencies]

THE MISSINGLINK PLUGIN

Earlier this year Matt Brown, Axel Liljencrantz, Kristofer Karlsson and I did a hack week project where we tried to mitigate these problems. Kristofer’s idea was to navigate through the byte code, find all methods that are invoked and verify that they exist in the final binary/classpath. This became the missinglink plugin. The missinglink plugin should lead to fewer false positives than, say, the enforcer plugin, because it (almost) only checks code that is actually executed. However, there are problems with it:

  • It’s not battle-tested – but we’re using it in quite a few internal projects, and it has helped us find and fix a few previously undetected issues.
  • It, too, leads to false positives, due to the way that we traverse the code. Basically, if a class is referenced by something, we traverse everything that any method in that class can call. So if you use, say, Futures.getUnchecked() in your code, missinglink will also check MoreExecutors for missing links, because other methods in Futures refer to MoreExecutors, even though getUnchecked() itself never uses it.
  • There are kinds of problems it doesn’t find, in particular relating to things loaded via reflection/annotations. The ones we’re aware of are listed in the project.
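To give a feel for the "verify that the invoked methods exist" idea, here is a much simplified sketch. It is not how missinglink works: the plugin reads byte code to discover the references, whereas this example checks a hand-written list of references using reflection, and only sees public methods. The names ReferenceCheck and MethodRef are invented for this example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Sketch: check that a list of expected methods exists on the current class path.
final class ReferenceCheck {

    /** A method the application expects to find at runtime (hypothetical helper type). */
    static final class MethodRef {
        final String className;
        final String methodName;
        final Class<?>[] parameterTypes;

        MethodRef(String className, String methodName, Class<?>... parameterTypes) {
            this.className = className;
            this.methodName = methodName;
            this.parameterTypes = parameterTypes;
        }

        @Override
        public String toString() {
            return className + "." + methodName + Arrays.toString(parameterTypes);
        }
    }

    /** Returns one message per reference that cannot be resolved. */
    static List<String> findMissing(List<MethodRef> refs) {
        List<String> problems = new ArrayList<>();
        ClassLoader loader = ReferenceCheck.class.getClassLoader();
        for (MethodRef ref : refs) {
            try {
                // initialize=false: load the class without running static initializers.
                Class<?> owner = Class.forName(ref.className, false, loader);
                owner.getMethod(ref.methodName, ref.parameterTypes);
            } catch (ClassNotFoundException | NoClassDefFoundError e) {
                problems.add("missing class for " + ref);
            } catch (NoSuchMethodException e) {
                problems.add("missing method " + ref);
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<MethodRef> refs = Arrays.asList(
                new MethodRef("java.lang.String", "length"),
                new MethodRef("java.lang.String", "noSuchMethod", int.class));
        for (String problem : findMissing(refs)) {
            System.out.println(problem);
        }
    }
}
```

A check like this could run at build time or at service startup, so that a missing method is reported before the rarely used code path is hit.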

INTEGRATION TESTING

Since errors of this kind only show up at runtime, and what’s more, only if you happen to trigger the right code path, they are elusive. It’s a very good idea to invest in automated integration testing to ensure that you exercise as many code paths as is feasible in the final artifact. However, the test pyramid says these kinds of tests are the most expensive to write and maintain, so you don’t want too many of them. So while integration testing is necessary, it isn’t going to be a complete solution.

All right, so what can I do?

Transitive dependency management is an unsolved problem. Maybe Java 9 will make things better, but that’s going to take many years before it takes effect. For now, understanding how things work and being able to troubleshoot and solve this kind of error is a necessary skill. Hopefully this post has taught you some tricks you can use to figure out what’s wrong if you run into a linking problem and how it can be solved. If you have questions, corrections or tips, please comment on this post!