Implementing a complete high availability Alfresco solution using open source technologies

As a proof of concept, I did some research and experimentation to determine the best way of clustering Alfresco using only open source components. I wanted a solution that offered load balancing as well as fault tolerance. Three components outside of Alfresco are needed to achieve this:

  1. Load balancer
  2. File system
  3. Database

The load balancer is the simplest component of the three, and the one with the most options available. We just need a load balancer that can handle sticky sessions. A dumb load balancer that simply round-robins connections will not work for this scenario.

Alfresco stores all content as regular files. (Unlike SharePoint, which puts content in the database. Yikes!) To achieve HA for the content repository, we need some sort of clustered or replicating file system. Not long ago, clustered file systems were out of reach for the open source community. It is great that we now have some viable open source options.

The last component needed, of course, is the database. Unfortunately, there is no viable open source multi-master option. Many projects are working towards this, such as Bucardo, but nothing is currently a drop-in replacement and/or production ready. The good news is that a master-slave(s) setup can still achieve HA and some degree of load balancing.

Here is the complete solution I implemented:

Alfresco: Alfresco Enterprise 4.0.2

I used the latest version of Alfresco Enterprise at the time of writing, simply because it is what I work with most. I believe the Community Edition would work just as well in this scenario, since the heart of Alfresco clustering is Ehcache.

Load balancer: HAProxy

HAProxy is known to be very stable and is currently used on some very high-traffic web sites. It also lets us keep track of sessions via the JSESSIONID cookie. Another great feature is that we can take fault detection further and request a web script page in Alfresco to determine whether Alfresco is actually running, rather than only checking that the port is open. (http://admin:passwd@server1/alfresco/wcs/s is a great page to check.)
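
To make the two ideas above concrete (stickiness on JSESSIONID and a health check against a web script page), here is a minimal sketch that writes an HAProxy configuration to a scratch file. The host names server1 and server2, port 8080, the section names, and the timeouts are assumptions for illustration, not the configuration I actually ran. The check below does not send the credentials shown in the URL above; if the page requires authentication, the check would need extra configuration.

```shell
#!/bin/sh
# Sketch only: writes a minimal HAProxy configuration to a scratch file for review.
# Host names, ports, section names, and timeouts are assumptions.
HAPROXY_CFG="${HAPROXY_CFG:-${TMPDIR:-/tmp}/haproxy-alfresco.cfg}"

write_haproxy_cfg() {
    cat > "$1" <<'EOF'
defaults
    mode http
    timeout connect 5s
    timeout client  60s
    timeout server  60s

frontend alfresco_in
    bind *:80
    default_backend alfresco_nodes

backend alfresco_nodes
    balance roundrobin
    # Sticky sessions: prefix mode prepends the server's cookie value (alf1, alf2)
    # to the application's JSESSIONID so later requests return to the same node.
    cookie JSESSIONID prefix
    # Health check: request an Alfresco page instead of only testing the TCP port.
    option httpchk GET /alfresco/wcs/s
    server alf1 server1:8080 cookie alf1 check
    server alf2 server2:8080 cookie alf2 check
EOF
}

write_haproxy_cfg "$HAPROXY_CFG"
echo "wrote $HAPROXY_CFG"
```

New sessions are still spread round robin; only requests that already carry a prefixed JSESSIONID are pinned to a node.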

A small portion of readers will look at this diagram and say to themselves, “But there is a single point of failure!” HAProxy is a very simple component, and it would be easy to set up an automatic active/passive failover for it. Very stable physical and virtual load balancer options also exist.

I should also note that we have tested HAProxy with single sign-on authentication via Active Directory Kerberos. I assume NTLM would work just fine as well.

Clustered file system: GlusterFS

I had read good things about GlusterFS, but this was my first hands-on experience with it. I was shocked at how simple and quick it was to get up and running: one command to add the second server, and another to get the replication going. No messing with configuration files. You can even have 4 servers and enable both replication and striping, similar to the way RAID 10 (or 0+1) works, but across servers. This is a perfect fit for storing Alfresco’s content, giving us load balancing and seamless fault tolerance.
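
For reference, the two-server replicated setup described above can be sketched roughly as follows. The volume name alfdata, the brick path /export/alfdata, the mount point, and the host names are assumptions. The script only prints the commands unless DRY_RUN=0 is set, so it can be read and adjusted before anything touches a real server.

```shell
#!/bin/sh
# Sketch only: dry run by default, commands are printed rather than executed.
# Set DRY_RUN=0 to actually run them (requires the gluster CLI and root).
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# gluster_replica_setup VOLUME FIRST_HOST SECOND_HOST BRICK_PATH MOUNT_POINT
gluster_replica_setup() {
    vol="$1"; first="$2"; second="$3"; brick="$4"; mnt="$5"
    # Add the second server to the trusted pool (run on the first server).
    run gluster peer probe "$second"
    # Create a volume that keeps one copy of every file on each server.
    run gluster volume create "$vol" replica 2 "$first:$brick" "$second:$brick"
    run gluster volume start "$vol"
    # Mount the volume with the native client on an Alfresco node.
    run mount -t glusterfs "$first:/$vol" "$mnt"
}

gluster_replica_setup alfdata server1 server2 /export/alfdata /mnt/alfdata
```

Each Alfresco node would then point its content store directory at the mounted path.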

Replicating database: PostgreSQL + pgpool-II

MySQL is still an option, but I chose to go with Postgres here. I liked some of the HA features Postgres provides that seemed lacking in MySQL. Unfortunately, either way we have to use a master-slave replication configuration.

In order to achieve load balancing and fault tolerance, we need to put pgpool-II in front of the databases. It takes read-only queries and load balances them between the master and slave(s). Commands that involve any kind of update or write are forwarded to the master, and the changes are in turn streamed to the slaves. This makes writes slower than on a standalone database, but the average Alfresco install should be doing primarily reads. Pgpool can also be configured to use parallel queries, which means large queries can be split up amongst servers.
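
As a rough illustration of the read load balancing described above, the sketch below prints a pgpool.conf fragment for one master and one streaming slave. The host names db1 and db2, the ports, and the equal weights are assumptions. The parameter names are the ones used by the pgpool-II 3.x series; some were renamed in later releases, so check them against the version you install.

```shell
#!/bin/sh
# Sketch only: prints a pgpool.conf fragment for one master and one streaming slave.
# Host names, ports, and weights are assumptions; parameter names follow the
# pgpool-II 3.x series and may differ in later releases.

pgpool_backend() {
    # $1 = node index, $2 = host, $3 = port, $4 = load-balancing weight
    printf "backend_hostname%s = '%s'\n" "$1" "$2"
    printf "backend_port%s = %s\n" "$1" "$3"
    printf "backend_weight%s = %s\n" "$1" "$4"
}

pgpool_fragment() {
    # Port that clients (Alfresco) connect to instead of PostgreSQL itself.
    echo "port = 9999"
    # Rely on PostgreSQL streaming replication between master and slave.
    echo "master_slave_mode = on"
    echo "master_slave_sub_mode = 'stream'"
    # Spread read-only queries across the backends according to their weights.
    echo "load_balance_mode = on"
    pgpool_backend 0 db1 5432 1
    pgpool_backend 1 db2 5432 1
}

pgpool_fragment
```

With equal weights, reads are split roughly evenly between the two nodes; lowering the master's weight would push more of the read traffic to the slave.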

Pgpool will also detect faults, so if any of the slaves go down it will simply take them out of the pool. If the master goes down, it will take one of the slaves and promote it to be the new master. To guard against a failure of Pgpool itself, an active/passive configuration similar to the one suggested for HAProxy can be used to add some redundancy.
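
The promotion step is typically driven by a script that pgpool-II calls through its failover_command setting, passing placeholders such as %d (failed node id), %P (old primary node id), and %H (new master host). The sketch below is one possible shape for such a script, not the one I used. It assumes the slave's recovery settings watch a trigger file, that the postgres user can ssh between nodes, and that the script is registered with something like failover_command = '/usr/local/bin/failover.sh %d %P %H /tmp/postgresql.trigger'. The script path and trigger path are hypothetical.

```shell
#!/bin/sh
# Sketch only: dry run by default, the ssh command is printed rather than executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# handle_failover FAILED_NODE_ID OLD_PRIMARY_ID NEW_MASTER_HOST TRIGGER_FILE
handle_failover() {
    failed_id="$1"; old_primary_id="$2"; new_master="$3"; trigger="$4"
    # A failed slave is simply dropped from the pool; no promotion is needed.
    if [ "$failed_id" != "$old_primary_id" ]; then
        echo "node $failed_id was a standby; nothing to promote"
        return 0
    fi
    # The master failed: create the trigger file so the chosen slave leaves recovery.
    run ssh "postgres@$new_master" touch "$trigger"
}

# Example: node 0 was the master and has failed, db2 is the promotion candidate.
handle_failover 0 0 db2 /tmp/postgresql.trigger
```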

Enjoy your content management uptime! And feel free to drop me a comment.

Alfresco startup script for Ubuntu/Debian

If you have used the startup script that comes with Alfresco, you have most likely already written your own. I created this one for Ubuntu, but it should work on other Debian variants.

Features

  • NEW! Added support for JPDA debugging and JMX console
  • NEW! Also support for status (/etc/init.d/alfresco status)
  • NEW! Precise doesn’t come with ‘bc’. (Really Canonical?) I removed that dependency.
  • NEW! Added support for Alfresco 4.X!
  • Can configure script to run Alfresco as root or a non-privileged user.
  • Firewall rules will be set up at system boot if configured to run as a non-privileged user.
  • Will attempt to shut down all instances of Alfresco cleanly, and will kill any that are still hung after a set time (30 seconds by default).
  • Cleans up rogue processes such as OpenOffice
  • Able to change the umask that Alfresco creates files as. (Handy for multiuser environments.)
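
The clean shutdown with a kill timeout listed above boils down to a pattern like the following. This is a simplified sketch of the idea rather than an excerpt from the script, and the function name is made up for illustration.

```shell
#!/bin/sh
# stop_with_timeout PID [SECONDS]
# Asks the process to exit with SIGTERM, polls once per second, and falls back
# to SIGKILL if it is still running after SECONDS (default 30).
# Returns 0 on a clean exit, 1 if the process had to be killed.
stop_with_timeout() {
    pid="$1"
    timeout="${2:-30}"
    waited=0

    # If the signal cannot be delivered, the process is already gone.
    kill -TERM "$pid" 2>/dev/null || return 0

    # kill -0 sends no signal; it only tests whether the process still exists.
    while kill -0 "$pid" 2>/dev/null; do
        if [ "$waited" -ge "$timeout" ]; then
            kill -KILL "$pid" 2>/dev/null
            return 1
        fi
        sleep 1
        waited=$((waited + 1))
    done
    return 0
}

# Usage (the PID file path is hypothetical):
#   stop_with_timeout "$(cat /var/run/alfresco.pid)" 30
```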

Limitations

  • Only tested on Ubuntu Server 10.04 and 12.04 (Lucid and Precise LTS releases)
  • The script calls Java directly, bypassing alfresco.sh and even catalina.sh. It was the only way I could get everything working smoothly, including umasks. The biggest downside is that future versions of Alfresco may need different Tomcat startup parameters.
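
To show why launching the process directly makes the umask easy to control, here is a small sketch of the idea. The helper name is made up, and the demonstration uses touch on a scratch file in place of the real Java command line.

```shell
#!/bin/sh
# run_with_umask MASK COMMAND [ARGS...]
# Runs COMMAND in a subshell so the umask change does not leak into the caller.
run_with_umask() {
    mask="$1"
    shift
    ( umask "$mask" && exec "$@" )
}

# Demonstration with a scratch file instead of a real Tomcat launch.
demo_dir="$(mktemp -d)"
run_with_umask 027 touch "$demo_dir/created-by-alfresco"
# Show the permission bits of the file that was just created.
ls -l "$demo_dir/created-by-alfresco" | cut -c1-10
```

In the real script the command would be the Java invocation of Tomcat's org.apache.catalina.startup.Bootstrap class with whatever options your Tomcat version expects, which is exactly the part that may change between Alfresco releases.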

Installation

  1. Copy and paste the script into the file: /etc/init.d/alfresco
  2. Edit the “### Configurable variables” section in the script to suit your environment
  3. chmod +x /etc/init.d/alfresco
  4. update-rc.d alfresco start 99 2 3 4 5 . stop 99 0 1 6 .

Feel free to email me any bugs, feedback, or requests. My email address is the user name you see here at the top of the blog @abstractive.ca.

The Script

Here is a link with proper indentation and easier copying:

http://dl.dropbox.com/u/7658190/util/alfresco.sh