Tuesday, November 8, 2011

Dealing with library version clashes in Impala

One of the problems that occasionally crops up in Java application development is library class version clashes. For some reason, you have to use a particular version of some library, but that library, or some of dependencies, clashes with other libraries that you are using for different parts of the application. I encountered an example of this occurring on our project, where we were forced to use the old Axis 1.4 library to connect with an integration partner. (For those who are interested, the reason is that the our partner is using the rpc/encoded SOAP API configuration, which is generally regarded as obsolete, and is not supported by any of the modern SOAP libraries, from CXF to Metro, as well as the Axis successor, Axis 2. See here and here for more on this problem.)

The ability to deal with these kinds of problems is often cited as one of the main reasons for using OSGi. In my development of Impala, I have steered clear from tackling these kinds of problems, on the basis that while they are theoretically quite valid to consider, in practice they occur quite rarely, and there generally is a workaround. Unfortunately, in my recent project, the workarounds seemed quite unpalatable. One would have been to resort a plain text based client for this SOAP API. For a small API, this would have been satisfactory, but for a large API it would have involved quite a bit of extra work. Ultimately, it seemed like a sensible point to add to Impala some capability for dealing with library version clashes.

The new mechanism simply involves the following:
  • add any module-specific library in the module's lib directory. For our project, this involved adding axis-1.4.jar and a couple of its dependencies
  • add the setting supports.module.libraries=true to impala.properties
This feature is currently in the code base and will be available in the next release, 1.0.2, which should be available soon.

How does it work?

In the normal Impala application, a class is loaded first by using the web application or system class loader (typically to load third party libraries), then using a topologically sorted list of the dependent module class loaders (typically to find application classes).

With module-specific libraries, the scheme is a bit more complex. For any module which has module-specific libraries, the module-specific library locations are searched first. Suppose we are loading module A, and we need to load the class Foo is in both the shared library area (WEB-INF/lib) and in a module-specific location for module A. In this case, the class Foo will be found in and loaded from the module-specific location.

Caveats

The main caveat is that the class Foo should never be 'published' from module A, for example as the return type in a method which might be called from another module, otherwise ClassCastExceptions or LinkageErrors can result. Instead, the interfaces to module A should ensure that appropriate abstractions are in place to ensure that this does not occur. In our Axis example, we need to ensure that instances of classes which hold references to Axis libraries are not passed around between modules.


More details on how the class versioning feature works, and how it can be configured, will be added to the documentation wiki.

The solution is by no means the solution to any kind of class versioning requirement which may crop up. Multiple versions for particular classes is still seen as the exception rather than the rule, and the scope as well as the complexity of the solution reflects this. That being said, there is at least a workable practical solution when you are stuck in a corner on this problem.