Saturday, June 20, 2015

Distributed Hibernate Search with Apache ActiveMQ and Spring

Despite the catchy title, this post is not really about distributed Hibernate Search, but about something very close to it :)

The case I recently solved was somewhat different. I have a frontend application that accesses a database through Hibernate and uses Hibernate Search, and I needed to add another application, let's call it an integration server, which exposes web service APIs for the overall system. The integration server uses the same database as the frontend application and lets clients put data into the database through its web services. The two applications also run on two different physical servers.

Everything looks simple until you start thinking about updating the Hibernate Search index from the integration server: the index lives solely on the frontend application side, because only that part of the system uses it. When the integration server puts data into the database, the frontend application's full-text index is, of course, not updated. I was looking for a simple solution to this problem.

First, let's take a look at what Hibernate Search itself offers. It supports a distributed index with master-slave replication, where the nodes are connected using JMS. I didn't really need this, because only one node uses the index for searching. Moreover, this solution is based on periodic index replication, which means the index on each node is up to date only after some interval. Finally, I didn't like it because it uses JNDI, which is not really the Spring way of solving problems (I don't really like JEE; I prefer to work with lightweight Java application stacks).

So I came up with a solution based on the following requirements:
  1. I don't want to use replication, because I only need search on the frontend application side.
  2. I want to keep the index physically on the side that really uses it, i.e. on the frontend application side.
  3. When the integration server updates data in the database, the index needs to be updated too.
  4. I don't want to use JNDI.
  5. We are already using Apache ActiveMQ (AMQ), so I want to use it for this solution as well. The AMQ broker is already located on the integration server side.
OK, let's delve into the solution. Here is the Spring config of the important beans on the frontend side:


What do we have here? A standard Hibernate session factory, on which I'm showing the Hibernate Search config that creates and uses a local Lucene index. Then an AMQ connection factory that connects to the broker running somewhere else. Nothing special. The only interesting bean is RemoteHibernateSearchController, which is derived from the standard AbstractJMSHibernateSearchController. That class already is a JMS message listener, so the subclass only needs to provide a Hibernate session from our session factory:


Now let's take a look at the integration server config, which is a little more interesting:


The same session factory, but configured differently. As the backend we use AMQBackendQueueProcessor, our own implementation (shown in a moment), instead of the standard "lucene" implementation. Our implementation delegates all Hibernate Search insert/update requests to the listening RemoteHibernateSearchController on the frontend, through the JMS queue named "queue.search".
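To make the division of labour concrete, here is a minimal, dependency-free sketch of the idea, not the actual Hibernate Search or AMQ code. A `BlockingQueue` stands in for the "queue.search" JMS queue, a plain map stands in for the Lucene index, and all class names (`IndexWork`, `ForwardingBackend`, `IndexOwner`) are hypothetical and exist only for illustration.

```java
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical message describing one index change. The real backend sends
// serialized index work through JMS instead of this simplified pair.
final class IndexWork {
    final String documentId;
    final String text; // null means "remove this document from the index"

    IndexWork(String documentId, String text) {
        this.documentId = documentId;
        this.text = text;
    }
}

// Integration-server side: never touches an index, only forwards the work.
final class ForwardingBackend {
    private final BlockingQueue<IndexWork> queue; // stand-in for "queue.search"

    ForwardingBackend(BlockingQueue<IndexWork> queue) {
        this.queue = queue;
    }

    void applyWork(IndexWork work) {
        queue.add(work);
    }
}

// Frontend side: owns the only real index and applies whatever arrives.
final class IndexOwner {
    private final Map<String, String> index = new ConcurrentHashMap<>();
    private final BlockingQueue<IndexWork> queue;

    IndexOwner(BlockingQueue<IndexWork> queue) {
        this.queue = queue;
    }

    // Applies everything currently queued; returns the number of items applied.
    int drain() {
        int applied = 0;
        IndexWork work;
        while ((work = queue.poll()) != null) {
            if (work.text == null) {
                index.remove(work.documentId);
            } else {
                index.put(work.documentId, work.text);
            }
            applied++;
        }
        return applied;
    }

    String lookup(String documentId) {
        return index.get(documentId);
    }
}
```

In this model the integration server calls `applyWork(...)` after writing to the database, and the change becomes searchable on the frontend only once the frontend has consumed the message, which mirrors what the JMS listener does in the real setup.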

One thing needs mentioning here. The integration server doesn't use Hibernate Search for searching at all (it is purely insert/update oriented). But once Hibernate Search is enabled on the integration server's session factory, it needs at least some index to work with. This index will never be updated (AMQBackendQueueProcessor delegates all updates to the frontend index through JMS) and will never be read. So I decided to use the "ram" provider, which holds this few-byte index entirely in memory. You can use any implementation you want, though, since this is only a fake index.

The AMQ configuration comes next, and we define only one queue here (this part of the config is redundant, but defining the queue this way lets you parametrize it). This is the part that actually starts the broker, using the TCP transport (in the test environment both servers run on localhost). Note that the AMQ connection factory connects to the (local) broker using the VM transport and doesn't start a broker of its own.

Now a few words about the initialization order. We will use a little trick, shown in a moment, to bind AMQBackendQueueProcessor to Spring, so the order is important. When the session factory bean creates the session factory, the Hibernate Search worker backend needs to be able to work right away. Our backend works through AMQ, so AMQ needs to be initialized before the session factory bean is. In the example this is done with the depends-on attribute: the sessionFactory bean depends on the amqBroker bean (indirectly, through the appContextProvider bean, which also has to be initialized before the session factory bean starts).

The ApplicationContextProvider bean just implements the common hack for accessing Spring beans from code that is not Spring-aware:
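A minimal sketch of that pattern, kept free of dependencies so it can be run as is. It is an illustration under assumptions, not the original class: in the real application the holder would implement Spring's `ApplicationContextAware` and store the `ApplicationContext`, whereas here a plain map stands in for the context. The names `ContextHolder`, `ReflectivelyCreatedBackend` and the bean name "brokerUrl" are hypothetical.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Static holder: the container fills it during startup, and code that the
// container does not manage reads from it later.
final class ContextHolder {
    private static final Map<String, Object> BEANS = new ConcurrentHashMap<>();

    private ContextHolder() {
    }

    // Called from container-managed code while the context starts up.
    static void register(String name, Object bean) {
        BEANS.put(name, bean);
    }

    // Called from code that cannot receive injected dependencies.
    static <T> T getBean(String name, Class<T> type) {
        Object bean = BEANS.get(name);
        if (bean == null) {
            // This is exactly the initialization-order problem: the holder
            // has to be populated before anybody asks it for a bean.
            throw new IllegalStateException("No bean named '" + name + "' registered yet");
        }
        return type.cast(bean);
    }
}

// Hypothetical component that a framework creates itself through a no-arg
// constructor, so the container has no chance to inject anything into it.
class ReflectivelyCreatedBackend {
    String connectionUrl() {
        return ContextHolder.getBean("brokerUrl", String.class);
    }
}
```

The failure branch in `getBean` shows why the initialization order discussed earlier matters: if the component asks for a bean before the holder has been filled, the lookup has nothing to return.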


And finally, AMQBackendQueueProcessor overrides the JNDI-based JMS connection factory lookup of the standard JmsBackendQueueProcessor with a Spring-based implementation, using our container config and the appContextProvider hack: