Dropwizard Cassandra Updates

Details on recent changes in the library, and some plans for upcoming releases.
I recently wrote about a project that allows you to easily integrate Cassandra with Dropwizard, and wanted to give an update on some of the recent changes in the project.
First, I need to acknowledge Nick Telford for his contributions to the library. He's brought some very helpful ideas and improvements that I'm sure will benefit others, and his experience with Dropwizard has been invaluable in maintaining consistency.
Recent Changes

These are some of the highlights. For full details, please see the project's release notes.
One of Nick's improvements was to separate the building of the Cluster from the bootstrapping phase. The original implementation wasn't really necessary, and with Nick's changes the library is now both simpler and easier to use. The old class has been marked as deprecated and will be removed in a future release, so I'd recommend switching to the more direct usage (as documented in the project's README) if you haven't already.
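To illustrate the shape of the change, here's a minimal sketch of the on-demand building pattern. These are stand-in classes, not the library's real API; the names and the configuration field are illustrative assumptions:

```java
// Stand-in for the driver's Cluster type (illustrative only).
class Cluster {
    private final String contactPoint;
    Cluster(String contactPoint) { this.contactPoint = contactPoint; }
    String getContactPoint() { return contactPoint; }
}

// Stand-in for a configuration-backed factory. The point of the change is
// that nothing is wired up during bootstrap; the Cluster is built only when
// the application asks for it (typically in the run phase).
class CassandraFactory {
    private String contactPoint = "127.0.0.1"; // would come from YAML config

    Cluster build() {
        return new Cluster(contactPoint);
    }
}

public class DirectUsageDemo {
    public static void main(String[] args) {
        CassandraFactory factory = new CassandraFactory();
        Cluster cluster = factory.build(); // no bundle, no bootstrap phase
        System.out.println(cluster.getContactPoint());
    }
}
```

Because the factory is just configuration plus a build method, it can be called from wherever the cluster is actually needed, rather than being tied to application startup.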
In the first release, the health check simply established and initialised a session to the Cassandra cluster. This approach was chosen as it appeared to be a lightweight and reliable way to force a connection to the Cassandra nodes without needing any knowledge of the data contained within them. This would essentially prove that the cluster is up and we can connect to it (but obviously not confirm whether it's the correct cluster).
Unfortunately, after a period of extended use, it emerged that there was a memory leak in the application. After some investigation, this was narrowed down to the DataStax Cassandra driver itself, and the way Session instances are managed. The Cluster maintains a set of session instances and holds on to them until the cluster is garbage collected, regardless of whether the session or cluster is closed. To spell it out a bit more clearly: in a typical application, every session you create will remain in memory until your application terminates.
An issue has been raised on the DataStax issue tracker, but I've not yet seen any feedback or progress at the time of writing. I'm hoping this can be picked up soon and rectified, but until then please be careful how you use Session instances in your application.
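The retention behaviour can be modelled in a few lines of plain Java. This is not the driver's code, just a sketch of the problem: a cluster-like object that strongly references every session it creates, so that closing a session does not make it collectable.

```java
import java.util.HashSet;
import java.util.Set;

// Sketch of the retention behaviour (not the driver's actual code):
// closing a session does not remove it from the cluster's internal set.
class LeakySession {
    private boolean closed;
    void close() { closed = true; }
    boolean isClosed() { return closed; }
}

class LeakyCluster {
    private final Set<LeakySession> sessions = new HashSet<>();

    LeakySession connect() {
        LeakySession session = new LeakySession();
        sessions.add(session); // never removed, even when closed
        return session;
    }

    int retainedSessions() { return sessions.size(); }
}

public class LeakDemo {
    public static void main(String[] args) {
        LeakyCluster cluster = new LeakyCluster();
        for (int i = 0; i < 1000; i++) {
            cluster.connect().close(); // e.g. one session per health check
        }
        // All 1000 closed sessions are still strongly reachable.
        System.out.println(cluster.retainedSessions()); // prints 1000
    }
}
```

This is exactly why a health check that opened a fresh session on every execution turned into a leak: each check added another entry to a set that only the cluster's own garbage collection could empty.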
As a result of this bug, an initial change was pushed out to use the cluster metadata (retrieved from the driver) as a means of determining whether any nodes were available. Unfortunately, this proved to be an unreliable measure of application health, as it was not responsive enough and didn't involve any direct communication with Cassandra at the time the health check was executed. This meant there were a few (sometimes many) false positives, which is not acceptable.
The JDBC-style validation query approach was finally chosen for the health check, with a default query on a system table that can be overridden in configuration. This approach has several advantages:
- The health check requires communication with the Cassandra cluster at the time of execution
- Each application can determine the query that is most appropriate for their use case
- The health check can validate not only that Cassandra is up, but that we're connected to the correct cluster (this relies on a custom validation query, of course)
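As a sketch of the idea, a validation-query health check boils down to something like the following. The class and query names are illustrative, and QueryExecutor is a stand-in for a real Cassandra session, not the library's actual API:

```java
// Stand-in for a Cassandra session: returns true if the query succeeded.
interface QueryExecutor {
    boolean execute(String query);
}

class CassandraHealthCheck {
    // Illustrative default: a cheap query against a system table.
    private static final String DEFAULT_QUERY = "SELECT key FROM system.local";

    private final QueryExecutor executor;
    private final String validationQuery;

    CassandraHealthCheck(QueryExecutor executor, String validationQuery) {
        this.executor = executor;
        this.validationQuery =
                validationQuery != null ? validationQuery : DEFAULT_QUERY;
    }

    boolean check() {
        // The query runs at execution time, so the check cannot report
        // healthy from stale state without talking to the cluster.
        return executor.execute(validationQuery);
    }
}

public class HealthCheckDemo {
    public static void main(String[] args) {
        QueryExecutor reachable = query -> true;    // cluster responds
        QueryExecutor unreachable = query -> false; // cluster down
        System.out.println(new CassandraHealthCheck(reachable, null).check());
        System.out.println(new CassandraHealthCheck(unreachable, null).check());
    }
}
```

Passing a custom query in place of null is how an application would assert it is talking to the correct cluster, for example by selecting from a table only that cluster should have.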
Another improvement by Nick is an overload of the build method that does not require a Dropwizard Environment. An example of when this might be useful is when creating a component that requires interaction with a Cassandra cluster outside the normal application lifecycle (e.g. schema migration). Of course, there are many other scenarios where this might be useful.
The only behavioural difference between the two overloads is in who manages the Cluster. If you provide an Environment, the Cluster will be managed for you via Dropwizard's Managed objects; if not, you will be responsible for closing the Cluster appropriately, as there is no environment lifecycle to hook into.
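In the Environment-free case, that responsibility can be discharged with a simple try-with-resources. This sketch uses a stand-in class (the assumption being that, like the DataStax driver's Cluster, it is closeable):

```java
// Stand-in for a closeable Cluster (illustrative only).
class FakeCluster implements AutoCloseable {
    private boolean closed;
    boolean isClosed() { return closed; }
    @Override public void close() { closed = true; }
}

public class ManualLifecycleDemo {
    public static void main(String[] args) {
        FakeCluster cluster = new FakeCluster();
        // No Environment means no Managed wrapper will close this for us;
        // try-with-resources guarantees the cluster is closed on exit.
        try (FakeCluster c = cluster) {
            // ... run a schema migration, a one-off command, etc.
        }
        System.out.println(cluster.isClosed()); // prints true
    }
}
```

With the Environment overload, the equivalent close call happens automatically when Dropwizard shuts the application down.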
Until now, the library has been versioned without reference to the version of Dropwizard that it is built against. Going forward, this will need to change.
I've seen some suggestions on the Dropwizard mailing list around versioning strategies, with one proposal being a scheme that leads with the Dropwizard version. While this might work, I'm personally more a fan of inverting it so the library version comes first (as in the Scala community, where libraries are suffixed with the version of Scala they're compiled against).
Either way, the distinction is less important than the desire to standardise this across all Dropwizard contrib libraries. There doesn't appear to be an agreed versioning strategy yet, but I plan to pursue this as a priority. Any suggestions are welcome!
Beyond the versioning changes, please feel free to make suggestions or log bugs on the project's issue tracker.