I’ve been reading some interesting blog posts about service discovery and some of the tools and techniques available to make it happen. There’s a great post by Jason Wilder that gives an overview of some of the well-known tools out there, along with some of the challenges involved in service discovery.

Some environments are very fluid; IP addresses change as VMs are cycled, the number of nodes providing a service may change depending on load, there’s planned maintenance, unplanned downtime and so on. One way to handle this might be to rerun your config management system to update your machines so they all know who provides what service. But sometimes the state of your environment is too transient to justify changing and releasing new config/infrastructure code, especially if you’re doing something dynamic like autoscaling.

One of the reasons I like Consul is that it provides a dynamic DNS server on localhost so you can look up services by name (read Gareth Rushgrove’s post on using Consul with Dnsmasq). You can use these names in your app config files and you’ve got plug-and-play service discovery for almost any application.

The problem though is that lots of applications cache DNS lookups for a long time, so their view of the environment gets stale. Over the last week or so I’ve been looking at Airbnb’s Synapse, which can watch for DNS changes and will reconfigure a local HAProxy instance on the fly. HAProxy is configured to bind to a port on localhost and can load balance traffic between all the hosts that provide a desired service. So now it’s just a case of configuring apps to send traffic to localhost on the appropriate port, and the traffic will be routed dynamically as the environment changes.

Here’s an overview of the setup. I’ve also put the Vagrantfile and some puppet manifests up on github for people to take a look at if interested.

The setup

Here’s an overview of the configuration I used for the load balancer and web servers.

Load balancer

I created an RPM to install Consul which you can download from here, or you can build it yourself using the spec file I wrote for it. Otherwise the consul.io site has some great manual installation instructions you can follow here.
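To start the agent on the load balancer you need something along these lines; the data directory, node name and bind address below are just the values from my setup, so change them to suit yours:

```sh
# Run Consul as a server in bootstrap mode so it forms a new single-node cluster
# that the web servers can join later. Paths and addresses are examples only.
consul agent -server -bootstrap \
  -data-dir /var/lib/consul \
  -config-dir /etc/consul.d \
  -node lb01 \
  -bind 192.168.33.10
```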

This will run a DNS server on port 8600. By installing Dnsmasq we have a local service that responds to both normal DNS queries and Consul DNS queries by forwarding them to the right place.
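The Dnsmasq side is a one-line config file that forwards anything under the .consul domain to the local Consul agent; the filename is my choice, any file in /etc/dnsmasq.d will do:

```
# /etc/dnsmasq.d/10-consul
# Forward *.consul lookups to the local Consul agent's DNS port
server=/consul/127.0.0.1#8600
```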

Make sure you add nameserver 127.0.0.1 to the top of /etc/resolv.conf so lookups go through Dnsmasq first.

To install Synapse you need HAProxy and the Synapse gem. I also had to install a few prerequisite packages on my minimal Fedora box to make the gem installation work.
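Roughly, that looked like the following; the exact package list will depend on your base image, so treat it as a starting point:

```sh
# HAProxy from the Fedora repos
yum install -y haproxy

# Ruby plus the build tools needed to compile the gem's native extensions
yum install -y ruby ruby-devel rubygems gcc make

# Synapse itself
gem install synapse
```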

To configure Synapse you need to create a JSON file that defines the DNS watcher and the HAProxy configuration defaults. I use Synapse to resolve the A record for ‘web.service.consul’, which returns the IPs of all nodes running the service named ‘web’, and then configure HAProxy to round-robin the traffic across those servers.
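The sketch below shows roughly what that file looks like; the key names follow the Synapse README, but double-check them against the version you install, and note that the HAProxy port (80) and the file paths are my own choices:

```json
{
  "services": {
    "web": {
      "discovery": {
        "method": "dns",
        "servers": [{ "host": "web.service.consul", "port": 80 }],
        "check_interval": 5.0
      },
      "haproxy": {
        "port": 80,
        "server_options": "check inter 2s rise 3 fall 2",
        "listen": ["mode http", "balance roundrobin"]
      }
    }
  },
  "haproxy": {
    "config_file_path": "/etc/haproxy/haproxy.cfg",
    "reload_command": "systemctl reload haproxy",
    "do_writes": true,
    "do_reloads": true,
    "do_socket": false,
    "global": ["daemon", "maxconn 4096"],
    "defaults": ["log global", "timeout connect 5s", "timeout client 1m", "timeout server 1m"]
  }
}
```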

Finally, to run Synapse as a service I wrote a systemd unit file:
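Something along these lines; the path to the synapse binary and the config file location depend on how the gem was installed and where you put the config, so adjust to match:

```ini
# /etc/systemd/system/synapse.service
[Unit]
Description=Airbnb Synapse
After=network.target

[Service]
# Binary and config paths are assumptions - check where `gem install` put the synapse executable
ExecStart=/usr/local/bin/synapse --config /etc/synapse/synapse.conf.json
Restart=on-failure

[Install]
WantedBy=multi-user.target
```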

Which you can run and enable:
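Assuming the unit file above lives at /etc/systemd/system/synapse.service:

```sh
systemctl daemon-reload
systemctl enable synapse
systemctl start synapse
```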

Web server

Install Consul as you did with the load balancer. You also need to serve some basic web content, so I installed nginx and overwrote the index.html with the hostname of the machine to make tests easier.
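That’s just a couple of commands; /usr/share/nginx/html is the default docroot for the Fedora nginx package, so adjust the path if yours differs:

```sh
yum install -y nginx

# Serve the machine's hostname so it's obvious which backend answered a request
hostname > /usr/share/nginx/html/index.html

systemctl enable nginx
systemctl start nginx
```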

Consul runs in bootstrap mode on the load balancer to create a new cluster, but on the web servers we need to join them to that existing cluster, which means the command options differ slightly. Here are the options for Consul on the web servers. Unless you’re running this on the Vagrant boxes I linked above, you will need to set the -join and -bind addresses to the load balancer’s IP and the local IP respectively.
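Something like the following; the addresses are placeholders for the load balancer and this particular web server:

```sh
# Join the cluster created by the load balancer rather than bootstrapping a new one.
# 192.168.33.10 is the load balancer, 192.168.33.21 is this web server's own address.
consul agent \
  -data-dir /var/lib/consul \
  -config-dir /etc/consul.d \
  -node web01 \
  -join 192.168.33.10 \
  -bind 192.168.33.21
```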

You can put config files for Consul in /etc/consul.d. I’ve written one that defines the ‘web’ service along with a couple of tags and a basic check using curl to test that it is running.
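Saved as something like /etc/consul.d/web.json, it looks roughly like this; the tags are just examples, and the script-style check is what Consul used at the time, so check the docs if you’re on a newer release:

```json
{
  "service": {
    "name": "web",
    "tags": ["nginx", "fedora"],
    "port": 80,
    "check": {
      "script": "curl --silent --fail http://localhost/",
      "interval": "10s"
    }
  }
}
```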

Then just enable and start the Consul service.
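Assuming the Consul RPM installed a systemd unit, that’s just:

```sh
systemctl enable consul
systemctl start consul
```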

Putting it together

If you have the load balancer up and running and have just started Consul on the web servers, you should see HAProxy being reconfigured immediately as they come online. Have a look at the logs on the load balancer and you should see something like this:

A quick test with curl should show we can round-robin load balance between the web servers:
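From the load balancer, something like this should do it, assuming HAProxy is listening on port 80 as in the Synapse config sketch above; each response should come back with a different web server’s hostname:

```sh
# Three requests through HAProxy; the body is the index.html we overwrote with `hostname`
for i in 1 2 3; do curl -s http://localhost/; done
```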

If you were to create a fourth web server and bring it online, it should automatically register with the load balancer and start accepting traffic.

I’d like to play with Consul more over the next few weeks as I imagine there are quite a few different use cases. The tags look interesting and I’d like to try using them for blue/green deployments or canary releasing.