Wednesday 20 March 2013

Using Ansible for configuration management

In a previous life, I used to work as a research engineer in HP Labs Bristol (UK).

Over the 11 years or so I was there, I worked on some very cool projects (including a talking pot plant).

By far the project I had most fun with was FrameFactory, a Cloud Computing CGI rendering service that we rolled out as part of the SE3D showcase. FrameFactory was essentially a fairly complex distributed system with various moving parts that had to be configured and coordinated properly. To that end, we designed the system around SmartFrog and Anubis. This was handy as the engineering team for both of those projects was sitting in the nearby cubicles.

Anubis let us do 10 years ago what you now take for granted with ZooKeeper, that is, directory services and coordination of distributed systems. SmartFrog is a very generic configuration management tool with a very nice DSL for expressing configuration data and a deployment engine for taking that data and pushing it onto remote nodes. I used SmartFrog in a lot of projects, and I even wrote a compiler plug-in for it that would auto-generate Java code from EMF. So as you can see, it is very generic.

The downside of SmartFrog is that it is too generic: if you wanted to use it to install software and configure a Linux node with a particular role, you would still have to write a lot of low-level drivers yourself. Which we did, and I remember writing SmartFrog scripts to deploy Xen VMs and move them around using live migration. This was fun, but it felt like you had to write too much (Java) code.

Fast forward to 2013, and I find myself working on another large-scale distributed system that eventually has to be configured and deployed on Linux. I am not a Linux sysadmin, and although I knew about Chef and Puppet, given the workload on that project (I have plenty on my plate), I got slightly worried just by reading about the various ways Chef or Puppet (Solo, master, knife, etc.) could be used.

Then I came across Ansible, and I was really pleased to find myself up and running within 30 minutes of reading the tutorial. The main web page does a good job of describing what Ansible does, so I won't bother repeating it here. What I found is that Ansible makes it possible to turn Linux how-to documents (like how to enable EPEL on CentOS) into workable, reusable scripts that are still very easy to understand.
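
As a rough illustration of the how-to-into-script idea (this is not the author's original script), a minimal playbook for the EPEL case might look like the sketch below. It uses current Ansible syntax rather than the 2013 syntax. The inventory group name `centos` is hypothetical, and the sketch assumes the `epel-release` package is available from the distribution's default repositories, which depends on the CentOS release.

```yaml
# Hypothetical playbook: enable the EPEL repository on CentOS hosts.
# Assumptions: an inventory group called "centos" exists, and the
# epel-release package can be installed from the default repositories.
- hosts: centos
  become: true          # run tasks with root privileges
  tasks:
    - name: Install the EPEL repository definition
      ansible.builtin.package:
        name: epel-release
        state: present  # idempotent: does nothing if already installed
```

Saved as, say, `epel.yml` (a hypothetical file name), it could be run with `ansible-playbook -i inventory epel.yml`. Running it a second time should report no changes.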

Ansible has a comprehensive set of modules for installing packages, copying files over SSH, and tweaking remote text files via its template mechanism. As a more complex example, I tried to see if I could use Ansible to deploy the lower stack required for high availability on Linux. Basically, this means installing Corosync and Pacemaker and ensuring that they are properly configured with the right IP addresses and so on.

What I had done so far was to capture all the required steps in documentation that could be reproduced (i.e. retyped). However, I realized I could just come up with Ansible playbooks (recipes) to do the same, and they would be just as clear. Here is a set of scripts that is sufficient to set up Corosync and Pacemaker working in UDP mode.
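
For readers without access to the original scripts, the general shape of such a playbook can be sketched as follows. This is not the author's playbook. The group name `ha_nodes`, the template file `corosync.conf.j2`, and the `bindnetaddr` value are all assumptions made for illustration, and the syntax is current Ansible rather than the 2013 syntax.

```yaml
# Hypothetical sketch: install and configure Corosync and Pacemaker.
# Assumptions: an inventory group "ha_nodes", a Jinja2 template named
# corosync.conf.j2 next to the playbook, and an example network address.
- hosts: ha_nodes
  become: true
  vars:
    # Illustrative value only; the template would reference it as
    # {{ bindnetaddr }}. Use the network address of the cluster interface.
    bindnetaddr: 192.168.1.0
  tasks:
    - name: Install Corosync and Pacemaker
      ansible.builtin.package:
        name: "{{ item }}"
        state: present
      loop:
        - corosync
        - pacemaker

    - name: Render corosync.conf from the template
      ansible.builtin.template:
        src: corosync.conf.j2
        dest: /etc/corosync/corosync.conf
        owner: root
        group: root
        mode: "0644"
      notify: Restart corosync

    - name: Ensure Corosync and Pacemaker are enabled and running
      ansible.builtin.service:
        name: "{{ item }}"
        state: started
        enabled: true
      loop:
        - corosync
        - pacemaker

  handlers:
    # Runs only when the template task reports a change.
    - name: Restart corosync
      ansible.builtin.service:
        name: corosync
        state: restarted
```

Keeping the addresses in variables means the same template can serve every node, with per-host or per-group values supplied from the inventory.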

I also recently created playbooks to deploy CouchDB and Solr. I think tweaking those playbooks to set up replicated CouchDB or Solr clusters should not be too hard either.

Although I could (and should) give Puppet and Chef a closer look, I greatly value simplicity and I've been really impressed by Ansible's ease of use (the community is very active and friendly as well).

I've recommended that we use it as the Configuration Management solution for our current project.

6 comments:

  1. I have a gist for a basic Solr master/slave replication setup using ansible here: https://gist.github.com/dstoflet/5204421

    I have found it useful to dump the ansible variables (https://coderwall.com/p/4p_vka?i=6&p=1&q=sort%3Ascore+desc&t[]=ansible) when trying to do moderate to complex jinja templates for ansible.

  2. Thanks @CodeBleep
    I was more thinking along the lines of Solr cloud (http://wiki.apache.org/solr/SolrCloud) which now takes care of leader election automatically.
    Setting up Solr cloud with Ansible should be straightforward.

  3. "Although I could (and should) give Puppet and Chef a closer look, I greatly value simplicity.." take a look at saltstack.. simple, scalable, rapid deployment http://blog.smartbear.com/software-quality/bid/283535/A-Taste-of-Salt-Like-Puppet-Except-It-Doesn-t-Suck

    Replies
    1. I actually took a cursory look at Saltstack, and my first impression was that it did not seem as easy to set up as Ansible. With Ansible, you can really get started within a few minutes, as there is very little to install. Saltstack may have a faster, more scalable protocol, but I am not really in the business of managing 1000s of servers with little latency. SSH is fine for my use case. But I can certainly see how Saltstack would work at scale. Plus it supports Windows, while currently, AFAIK, Ansible does not.

    2. where does {{ bindnetaddr }} come from?

  4. hello can you post your solrcloud playbook please
