Wednesday, January 28, 2009

Adding container level locking to your H.A. Servicemix 4 deployment.

As an improvement to the High Availability features in the latest versions of the Servicemix 4 Kernel, container level locking has been introduced.

The Container Level locking mechanism allows bundles to be loaded into slave kernel instances in order to provide faster failover performance (when a slave instance becomes the master it will have fewer bundles to load before starting operation). The Container Level refers to the starting priority assigned to each bundle in the OSGi container. These start levels are specified in $SERVICEMIX_HOME/etc/startup.properties, in the format jar.name=level. The core system bundles have levels below 50, whereas user bundles have levels greater than 50.

Level: 1 Behavior: A 'cold' standby instance. Core bundles are not loaded into the container. Slaves will wait until the lock is acquired to start the server.

Level: <50 Behavior: A 'hot' standby instance. Core bundles are loaded into the container. Slaves will wait until the lock is acquired to start user level bundles. The console will be accessible for each slave instance at this level.

Level: >50 Behavior: This setting is not recommended, as user bundles will be started.
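
To make the relationship between the lock level and the bundle start levels concrete, here is a minimal sketch in Python. It is not Servicemix code: the function names and the sample jar names are hypothetical, and it assumes standard OSGi start-level semantics, where a framework held at level N starts every bundle whose start level is at most N.

```python
def parse_startup_properties(text):
    """Parse 'jar.name=level' lines into a {jar_name: start_level} dict."""
    levels = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, level = line.rpartition("=")
        levels[name.strip()] = int(level)
    return levels


def split_by_lock_level(levels, lock_level):
    """Return (bundles started while waiting, bundles started after the lock).

    Assumption: a bundle starts before the lock is acquired when its
    start level is less than or equal to the lock level.
    """
    before = sorted(n for n, lvl in levels.items() if lvl <= lock_level)
    after = sorted(n for n, lvl in levels.items() if lvl > lock_level)
    return before, after


# Hypothetical entries in the jar.name=level format described above.
SAMPLE = """
# core bundles
core-a.jar=10
core-b.jar=30
# user bundle
user-app.jar=60
"""
```

With a lock level of 50 the two core jars in this sample would start on a waiting slave and the user jar would be held back; with a lock level of 1 nothing in the sample would start until the lock is acquired.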

Container Level locking is supported in both of the current failover deployment schemes:
  • Simple Lock File
  • JDBC Locking
To make use of this capability, the following must be set on each system in the master/slave setup:
  • $SERVICEMIX_HOME/etc/system.properties file updated to include the entries below, in addition to the other configuration entries that enable H.A.
servicemix.lock.level=50
servicemix.lock.delay=10

For more information on how to use the High Availability features supported with Servicemix 4, please visit the user's guide.

Saturday, January 10, 2009

Setting up Servicemix 4 with H.A. in mind.


Looking to set up Servicemix 4 with High Availability in mind? Then follow this guide to setting up a JDBC Master Slave deployment (requires Servicemix 4 Kernel 1.1.0-SNAPSHOT or higher).

What is a JDBC Master Slave deployment?
A JDBC Master Slave deployment is a collection of Servicemix instances configured to act as one logical instance (which I refer to as a cluster). Under this High Availability model one instance node provides services (Master) while all other instance nodes (Slaves) wait on the sidelines for that node to release its JDBC lock. When the JDBC lock is not held by any node, the first node to obtain the lock becomes active and will fully start its instance of Servicemix.

In this deployment scenario the JDBC connection to a locking table hosted on a database server becomes a point of failure for the cluster. As such, it is highly recommended that the database also be provided in a highly available manner. A lock monitor is implemented in Servicemix to ensure that, on a master node, the loss of its lock will force a graceful shutdown of the node.



The above figure depicts three servers, each hosting an instance of Servicemix in a JDBC Master Slave setup. The DB server hosts an Apache Derby database instance. The instance labeled "Master" has obtained the lock on the database table; the Servicemix instances on servers B and C are left waiting to obtain the lock.

What happens during the initial start of the cluster?
When Servicemix nodes are started they will read the Servicemix lock configuration and attempt to connect to the specified database. The first cluster node to establish a connection to the locking table will become the master. The remaining instances will wait, retrying their connection periodically.

What happens when a master fails?
In the event that the master node experiences a failure it will release its hold on the locking table. This allows the waiting slave nodes to attempt to obtain the lock. Once the master node has failed it will require a manual restart, upon which it will join the slave node pool.

What if a database failure occurs?
In the event of database failure the current master node will detect the loss of connection and shut down. The slave nodes will continue to await connection to the database. The former master node will require a manual restart. This feature was developed to ensure that a temporary loss of lock would not allow two master nodes to operate at one time.

What happens when an instance restarts?
When a node is started as part of a running cluster it will attempt to connect to the database to access the lock. If it fails to obtain the lock it will retry periodically.
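
The startup, failover and restart behaviour described in the questions above can be sketched as a small simulation. This is only an illustrative model, not the Servicemix implementation: `LockTable` is a hypothetical in-memory stand-in for the database locking table, `Cluster` and its methods are invented names, and nodes retry in list order here, whereas in a real cluster whichever waiting node reaches the lock first wins.

```python
class LockTable:
    """Hypothetical in-memory stand-in for the shared locking table."""

    def __init__(self):
        self.holder = None

    def try_acquire(self, node):
        # The lock can only be taken when no node currently holds it.
        if self.holder is None:
            self.holder = node
            return True
        return False

    def release(self, node):
        if self.holder == node:
            self.holder = None


class Cluster:
    """Tracks which node is master and which nodes are waiting."""

    def __init__(self, table):
        self.table = table
        self.master = None
        self.waiting = []  # slaves, in the order they joined

    def start(self, node):
        """Start a node: master if the lock is free, otherwise a slave."""
        if self.table.try_acquire(node):
            self.master = node
        else:
            self.waiting.append(node)

    def fail_master(self):
        """The master fails and releases the lock; it does not rejoin
        until it is started again by hand."""
        self.table.release(self.master)
        self.master = None

    def retry(self):
        """One periodic retry round by the waiting slaves."""
        for node in list(self.waiting):
            if self.table.try_acquire(node):
                self.waiting.remove(node)
                self.master = node
                break
```

Starting nodes A, B and C in that order makes A the master; after `fail_master()` and one `retry()` round B takes over, and a manually restarted A joins the back of the waiting pool.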

How to configure Servicemix 4 JDBC Master Slave:
To set up the JDBC Master Slave deployment you will need to configure the following on each node to be included in the cluster:

  • Classpath updated to include JDBC driver.
  • $SERVICEMIX_HOME/etc/system.properties file updated to reflect the below entries.
servicemix.lock=true
servicemix.lock.class=org.apache.servicemix.kernel.main.DefaultJDBCLock
servicemix.lock.jdbc.url=jdbc:derby://dbserver:1527/sample
servicemix.lock.jdbc.driver=org.apache.derby.jdbc.ClientDriver
servicemix.lock.jdbc.user=user
servicemix.lock.jdbc.password=password
servicemix.lock.jdbc.table=SERVICEMIX_LOCK
servicemix.lock.jdbc.clustername=smx4
servicemix.lock.jdbc.timeout=30
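
Because every node needs the same set of entries, a quick check before startup can catch a missing or empty one. The sketch below is a hypothetical helper, not part of Servicemix. It treats every entry listed above as required, which is an assumption, since the kernel may supply defaults for some of them.

```python
REQUIRED_KEYS = (
    "servicemix.lock",
    "servicemix.lock.class",
    "servicemix.lock.jdbc.url",
    "servicemix.lock.jdbc.driver",
    "servicemix.lock.jdbc.user",
    "servicemix.lock.jdbc.password",
    "servicemix.lock.jdbc.table",
    "servicemix.lock.jdbc.clustername",
    "servicemix.lock.jdbc.timeout",
)


def load_properties(text):
    """Parse simple key=value lines (no escapes or line continuations)."""
    props = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def missing_lock_settings(props):
    """Return the required keys that are absent or empty."""
    return [key for key in REQUIRED_KEYS if not props.get(key)]


# The entries from this post, used as sample input.
SAMPLE_CONFIG = """
servicemix.lock=true
servicemix.lock.class=org.apache.servicemix.kernel.main.DefaultJDBCLock
servicemix.lock.jdbc.url=jdbc:derby://dbserver:1527/sample
servicemix.lock.jdbc.driver=org.apache.derby.jdbc.ClientDriver
servicemix.lock.jdbc.user=user
servicemix.lock.jdbc.password=password
servicemix.lock.jdbc.table=SERVICEMIX_LOCK
servicemix.lock.jdbc.clustername=smx4
servicemix.lock.jdbc.timeout=30
"""
```

Running `missing_lock_settings(load_properties(...))` over the contents of each node's system.properties should return an empty list when all of the entries are present.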

Note:

  • Startup will fail if the JDBC driver is not on the classpath.
  • The database name "sample" will be created if it does not exist on the database server.
  • The first SMX4 instance to acquire the locking table is the master instance.
  • If the connection to the DB is lost then the master instance will attempt to shut down gracefully, allowing a slave instance to become master when the DB service is restored. The former master will require a manual restart.
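
The last note describes the job of the lock monitor: the master keeps checking that it still holds the lock and shuts itself down when it does not. A minimal sketch of that loop follows. It is not the Servicemix lock monitor; `monitor_lock`, `check_lock` and `shutdown` are hypothetical names, and a lost database connection is modelled as a Python `ConnectionError`.

```python
import time


def monitor_lock(check_lock, shutdown, max_checks, delay_seconds=0.0,
                 sleep=time.sleep):
    """Poll check_lock() up to max_checks times.

    On the first failed check (False result or ConnectionError) call
    shutdown() and return the number of checks performed. If every check
    passes, return max_checks without shutting down.
    """
    for attempt in range(1, max_checks + 1):
        try:
            still_held = check_lock()
        except ConnectionError:
            still_held = False  # treat a lost connection as a lost lock
        if not still_held:
            shutdown()
            return attempt
        sleep(delay_seconds)
    return max_checks
```

Shutting down on the first failed check, rather than waiting to see whether the lock comes back, matches the stated goal of never letting two masters run at once after a temporary loss of lock.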