Jukka Zitting: Analyzing the Jackrabbit architecture with Lattix LDM

Tim Bray pointed to David Berlind who pointed to the Lattix company. Lattix makes a tool called Lattix LDM that uses a Dependency Structure Matrix to work with software architecture. I watched the nice Lattix demo and decided to try the software out.

After receiving my Community license and struggling for a while to get the software running on Linux (you need to include both the jars and the Lattix directory in the Java classpath!), I loaded the latest Jackrabbit jar file for analysis. The dependency matrix of the top-level packages after an initial partitioning is shown below:

The matrix contains all the package dependencies. The number in a cell tells how many dependencies the package in that vertical column has on the package in that horizontal row. Reading across a package's row shows how widely the package is used; reading down its column shows which other packages it uses or depends on. In general, a healthy architecture only contains dependencies located below the diagonal.
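To make the row and column convention concrete, here is a minimal sketch of how such a matrix could be built from a list of "user, used" pairs. It is not how Lattix LDM works internally; the function names and the dependency pairs are made up for illustration, and only the package names are borrowed from Jackrabbit.

```python
from collections import Counter


def build_dsm(packages, dependencies):
    """Build a dependency structure matrix.

    matrix[row][col] is the number of dependencies that the package in
    column `col` has on the package in row `row`.
    """
    index = {name: i for i, name in enumerate(packages)}
    size = len(packages)
    matrix = [[0] * size for _ in range(size)]
    for (user, used), count in Counter(dependencies).items():
        if user != used:  # ignore dependencies inside a single package
            matrix[index[used]][index[user]] += count
    return matrix


def above_diagonal(matrix):
    """Return (row, col, count) for every non-empty cell above the diagonal."""
    size = len(matrix)
    return [
        (row, col, matrix[row][col])
        for row in range(size)
        for col in range(row + 1, size)
        if matrix[row][col]
    ]


# Invented example data: (user, used) pairs, NOT real Jackrabbit dependencies.
PACKAGES = ["jndi", "core", "util"]
DEPENDENCIES = [
    ("jndi", "core"),
    ("core", "util"),
    ("core", "util"),
    ("util", "core"),  # a lower layer using a higher one
]

dsm = build_dsm(PACKAGES, DEPENDENCIES)
violations = above_diagonal(dsm)
```

With the packages ordered from highest layer to lowest, the invented `util` to `core` dependency is the only entry that lands above the diagonal, which is the kind of cell that stands out in the matrix view.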

The packages 2-6 form the general purpose Jackrabbit commons module, while the more essential parts of the Jackrabbit architecture are found within the core module. I grouped the commons packages and expanded the core module to get a more complete view of the Jackrabbit internals:

There was no immediate structure appearing, so I used the automatic DSM partitioning tool on the core module to sort out the package dependencies:

Jackrabbit core after initial partitioning
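What Lattix LDM does internally for partitioning is not documented here, so the following is only a sketch of one common way to get a similar ordering: group packages that depend on each other cyclically (strongly connected components, found with Tarjan's algorithm) and list the groups so that users come before the packages they use. The function name and the example graph are assumptions for illustration.

```python
def partition(graph):
    """Group packages into cyclic clusters, ordered users first.

    `graph` maps a package name to the set of packages it uses.
    Returns a list of groups; each group with more than one member is a
    set of mutually dependent packages.
    """
    index = {}
    lowlink = {}
    stack = []
    on_stack = set()
    components = []
    counter = 0

    def visit(node):
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for used in sorted(graph.get(node, ())):
            if used not in index:
                visit(used)
                lowlink[node] = min(lowlink[node], lowlink[used])
            elif used in on_stack:
                lowlink[node] = min(lowlink[node], index[used])
        if lowlink[node] == index[node]:
            # `node` is the root of a strongly connected component.
            group = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                group.append(member)
                if member == node:
                    break
            components.append(sorted(group))

    for node in sorted(graph):
        if node not in index:
            visit(node)

    # Tarjan's algorithm emits the most-used groups first; reverse the list
    # so that users come first and utility packages come last.
    components.reverse()
    return components


# Invented example graph, NOT the real Jackrabbit dependencies.
EXAMPLE_GRAPH = {
    "jndi": {"core"},
    "core": {"state", "util"},
    "state": {"core", "util"},
    "util": set(),
}

layers = partition(EXAMPLE_GRAPH)
```

In this made-up graph, `core` and `state` end up in one group because they use each other, with `jndi` above them and `util` below, which mirrors the "utility layer, interdependent middle, deployment layer" shape described in the text.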

The value, config, fs, and util packages form a lower utility layer and the jndi package a higher deployment tool layer. The most interesting part lies between those layers, in the large interdependent section in the middle. The key to the architecture seems to be the main core package that both uses and is used by other packages in this section. I opened a separate view for examining the contents of the main core package:

Partitioning classes within the main core package

The partitioning suggests that it might make sense to split the package in two parts. Without concern for semantic grouping, I just grouped the classes in the upper half as core.A and the classes in the lower half as core.B. This seems useful as the core.B package seems to be a bit better in terms of external dependencies:

Jackrabbit core after splitting the main core package

Running the package partitioning again, I got somewhat more balanced results, although the main section is still heavily interdependent:

Jackrabbit core partitioning after splitting the main core package

Looking at the vertical columns, it seems that the main culprits for the interdependencies are the nodetype, state, version, and virtual core.A packages. Both the nodetype and state packages contain subpackages, so I wanted to see if the dependencies could be localized to just a part of those packages:

Contents of the Jackrabbit state and nodetype packages

This is interesting: the interdependencies of the state package come from the main state package itself, while the nodetype interdependencies only affect the nodetype.virtual subpackage. I split both packages along those dependency relations, and partitioned the core module again:

Jackrabbit core partitioning after splitting the state and nodetype packages

The persistence managers in the state subpackages are now outside the main section just like the non-virtual nodetype classes. After a short while of further research on the dependencies I found that the partitioning of the main state package would suggest that the item state managers be split to a separate package:

After creating a new statemanager package for containing the item state managers, the partitioning of the core module starts to look better. The only remaining circular dependencies are for the virtual core.A and core.B packages:

Jackrabbit core partitioning after moving the state managers into a new statemanager package

Looking at the virtual core.B package we find that only the NodeId, PropertyId, and ItemId classes depend on the state package:

In fact it seems that it might make sense to move the classes there. After doing that the core module partitioning looks even better:

Jackrabbit core partitioning after moving the ItemId classes to the state package
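The question asked of the virtual core.B package (which of its classes are responsible for the dependency on another package) can be sketched as a simple filter over class-level dependencies. The helper names and the dependency pairs below are hypothetical; only the NodeId, PropertyId, and ItemId class names come from the analysis above.

```python
def package_of(qualified_name):
    """Return the package part of a dotted class name."""
    return qualified_name.rpartition(".")[0]


def classes_depending_on(class_deps, source_package, target_package):
    """List classes in `source_package` that use any class in `target_package`."""
    offenders = {
        user
        for user, used in class_deps
        if package_of(user) == source_package
        and package_of(used) == target_package
    }
    return sorted(offenders)


# Hypothetical (user, used) pairs; the target class names are placeholders.
CLASS_DEPS = [
    ("core.B.NodeId", "state.SomeStateClass"),
    ("core.B.PropertyId", "state.SomeStateClass"),
    ("core.B.ItemId", "state.SomeStateClass"),
    ("core.B.OtherClass", "util.SomeHelper"),
]

movable = classes_depending_on(CLASS_DEPS, "core.B", "state")
```

If the resulting list is short and the classes have few other ties to their current package, moving them next to what they depend on is a candidate refactoring, as with the ItemId classes here.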

The only remaining source of cyclic dependencies is the virtual core.A package, which I won't go any deeper into at this moment. Even so, the analysis has already provided a number of suggestions for reducing the number of cyclic dependencies and thus improving the design quality of the Jackrabbit core:

Split the main core package into subpackages

Move the nodetype.virtual package to a higher level

Move the state subpackages to a separate package

Make a separate package for the item state managers

Move the NodeId, PropertyId, and ItemId classes to the state package

Note that these suggestions are just initial ideas based on a quick walkthrough of the Jackrabbit architecture using a Dependency Structure Matrix as a tool. As such, the approach only gives structural insight into the architecture, and in this short analysis I made little use of knowledge about the semantic roles and relationships of the Jackrabbit classes and packages.

Jukka Zitting

Saturday, January 7, 2006
