Monthly Archives: June 2013

A sign of life in a busy period

I’m quite busy for the time being, which is why I haven’t posted for the past eleven days. Real estate trading, visiting friends and many active projects at work have demanded my attention. Not something to complain about, to put it in the typical, cup-half-empty Northern Jutland style 🙂

Regarding real estate trading, I’m in the process of buying a house. That includes getting the paperwork right, taking out a mortgage and vacating my current rented apartment. It’s a big, big thing. At the moment, everything is on track. I have done my homework, and I will continue to do so. I’m very much looking forward to having my own place. It will probably be easier to grasp once I’m past this uncertain phase. Three years of savings will be converted to bricks, and I will be in debt for the next 15 years. I have never been in debt for more than a few consecutive months, so that’s another new experience. Also, I will be driving to work. For the past three and a half years, I’ve lived 1–1.5 km from work and used my bike or walked. My Aygo will probably be happy to drive more often.

More details will follow in a later post 😛

Let me share a few pictures… First, look at the view I had on Wednesday, June 19, when I was hunting for roe bucks with my dad:

20130619-waiting-for-a-buck

I didn’t see or shoot a buck, but I saw a doe twice (that’s a female).

On Saturday, June 22, I cooked roe deer venison with bacon and new potatoes for two visiting friends:

20130622-roe-deer-venison-with-bacon-1   20130622-roe-deer-venison-with-bacon-2

20130622-roe-deer-venison-with-bacon-3   20130622-new-potatoes-with-peel

It was delicious.

Last night I assembled my new grill, which I hope to inaugurate soon:

20130626-my-new-grill-and-starter

I should have looked around the store for a smaller starter 🙂

… or for a bigger grill, as some people would probably say 😛

High Availability with CARP and HAProxy

Let me start by setting the scene. You have five mail servers providing a relatively large customer base with SMTP (relay to the world), POP3(S), IMAP(S) and webmail (HTTP(S)). Other servers, which are not part of the scene, receive mail from the world to the customers and do spam filtering. Mail from the customers to the world is also piped through this filtering – you can stop your facepalm – but it is also not a part of the scene. At the moment, the five mail servers are loadbalanced using DNS round robin, which provides load sharing but no redundancy. The protocols are in charge of the redundancy.

Since only SMTP provides redundancy (delivery is retried a number of times), the customers will have a bad experience with POP3, IMAP and webmail in the event of a server crash. If their MUA does not have an outgoing queue from which it retries delivery, they will also notice problems with SMTP and risk losing outgoing mail. Additionally, many customers do not hesitate to call the service desk if their newly written mail does not disappear from their MUA’s queue immediately after they have clicked send. This increases the service desk load, which we also want to avoid.

It is clear that the situation is not ideal. There is no redundancy. Both uncontrolled and controlled downtime (scheduled maintenance) will disturb the customers’ use of mail services. The servers run FreeBSD, so we will utilize CARP and HAProxy to add redundancy without adding extra mail servers next to the existing ones or adding loadbalancer servers in front of the mail servers. Originally, my mind was set on keepalived, as I have some experience with it on Linux, but it seems that CARP is the weapon of choice in the *BSD world. As a curiosity, the FreeBSD port of keepalived expired in November 2011. The idea behind the improved setup is that CARP lets us have a number of virtual IP addresses that float between the servers. HAProxy provides service monitoring on each server and directs connections to the other servers if the local services are down.

It is important for me to stress that the presented setup in this post is only one out of many setups that can achieve load sharing and redundancy. Its pros and cons must be compared to the pros and cons of other ways that could be chosen in a given situation. Before mentioning a few cons, I want to start with the big, obvious pro. In a typical CARP/keepalived and HAProxy setup, in my experience, you have two dedicated loadbalancer servers in active/passive mode, i.e. at a particular point in time, one server sits idle, while all traffic flows through the other server. Failover to the idle server provides redundancy, and loadbalancing over a number of application servers provides load sharing. The single active server, however, is still a bottleneck. If your infrastructure provides 1 or even 10 Gbps connectivity, and your setup only handles e.g. 200 Mbps of traffic, this might not be a problem. Nonetheless, the dedicated loadbalancer servers are still at least two extra servers that can be avoided, if the application servers themselves, in cooperation with the underlying network infrastructure, are able to perform load sharing and redundancy. The setup presented here enables the application servers to do so.

The cons are that the solution is a bit complex, the procedures for maintenance and scaling are a bit complex as well, and the chosen loadbalancer, HAProxy, lacks some useful features. I have only looked at version 1.4 of HAProxy, so the features might not be lacking in newer versions. In addition, I might have missed something. I operate a Riverbed Stingray Traffic Manager on a daily basis, which means that I am used to many possibilities for session persistence, extensive traffic scripting, built-in checks for many types of services, SSL offloading, etc. It would be nice to offload the SSL to HAProxy and to have built-in, deep checks for POP3 and IMAP. We have to do without these things.

The setup consists of six servers:

Hostname                Address         Virtual address 1  Virtual address 2
mail01.test.netic.dk    192.168.42.11   192.168.42.21      192.168.42.26
mail02.test.netic.dk    192.168.42.12   192.168.42.22      192.168.42.27
mail03.test.netic.dk    192.168.42.13   192.168.42.23      192.168.42.28
mail04.test.netic.dk    192.168.42.14   192.168.42.24      192.168.42.29
mail05.test.netic.dk    192.168.42.15   192.168.42.25      192.168.42.30
client01.test.netic.dk  192.168.42.100  -                  -

We have DNS round robin over the ten virtual addresses for each of the records {relay, mail, smtp, pop3, pop3s, imap, imaps, webmail}.test.netic.dk. Example:

[root@client01 ~]# dig mail.test.netic.dk a | grep ^mail | sort
mail.test.netic.dk.	900	IN	A	192.168.42.21
mail.test.netic.dk.	900	IN	A	192.168.42.22
mail.test.netic.dk.	900	IN	A	192.168.42.23
mail.test.netic.dk.	900	IN	A	192.168.42.24
mail.test.netic.dk.	900	IN	A	192.168.42.25
mail.test.netic.dk.	900	IN	A	192.168.42.26
mail.test.netic.dk.	900	IN	A	192.168.42.27
mail.test.netic.dk.	900	IN	A	192.168.42.28
mail.test.netic.dk.	900	IN	A	192.168.42.29
mail.test.netic.dk.	900	IN	A	192.168.42.30

It might be okay to use a greater TTL. Caching resolvers should cache the entire record sets and answer clients in a round robin fashion. We are not interested in changing the records frequently. The six servers are FreeBSD 9.1 amd64 guests in VirtualBox on my laptop:

20130615-high-availability-with-carp-and-haproxy-01

The terminal multiplexer tmux makes it easy to give an overview:

20130615-high-availability-with-carp-and-haproxy-02

In FreeBSD 9.1, CARP is available as a kernel module. I just added the line 'if_carp_load="YES"' to /boot/loader.conf and rebooted. The device configuration takes place in /etc/rc.conf. The handbook has an example – I will post configuration details at the end of the post.
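As a minimal sketch, assuming vhid numbers, a password and a /24 netmask that are not given in the post, mail01’s /etc/rc.conf entries could look like this (the interface names carp0 and carp5 match the crontab shown later):

cloned_interfaces="carp0 carp5"
# mail01 is the preferred master (lowest advskew) for both of its own
# virtual addresses.
ifconfig_carp0="vhid 1 advskew 0 pass s3cret 192.168.42.21/24"
ifconfig_carp5="vhid 6 advskew 0 pass s3cret 192.168.42.26/24"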

I started out with one virtual address per server rather than two. This is shown in the following two images:

20130615-high-availability-with-carp-and-haproxy-03   20130615-high-availability-with-carp-and-haproxy-04

The advskew values are chosen such that the server “to the right” takes over in case of a crash, i.e. mail02 takes over for mail01, mail03 for mail02, …, mail01 for mail05. Think of the five servers being placed in a ring. One virtual address per server, however, has the disadvantage that a lot of traffic forwarding work might be put on one server, which might cause a domino effect. The second image shows the uneven distribution after mail01, mail02 and mail03 have crashed. Their virtual addresses have moved to mail04, while mail05 still only has one address.

Rather than building some sort of service which continuously distributes the virtual addresses evenly between operational servers, I decided to upgrade to two virtual addresses per server. Both solutions are complex, but the one that I have chosen does not require an extra server or any scripting. The trade-off is that it probably does not scale as well as the other solution.

The following six images show the resulting setup and test the most obvious failover scenarios. The two servers “on each side” of a server take over in the event of a crash. In this way, the traffic forwarding work is divided somewhat more evenly (see the configuration sketch after the images).

20130615-high-availability-with-carp-and-haproxy-05   20130615-high-availability-with-carp-and-haproxy-06   20130615-high-availability-with-carp-and-haproxy-07

20130615-high-availability-with-carp-and-haproxy-08   20130615-high-availability-with-carp-and-haproxy-09   20130615-high-availability-with-carp-and-haproxy-10
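To sketch how the neighbours express this, still with my assumed vhids, password and interface names: CARP elects the interface with the lowest advskew as master, so giving the backup interfaces a higher advskew than the owner’s makes each of mail01’s addresses fail over to one of its neighbours:

# On mail02, a backup interface for mail01's first virtual address:
ifconfig_carp1="vhid 1 advskew 100 pass s3cret 192.168.42.21/24"
# On mail05, a backup interface for mail01's second virtual address:
ifconfig_carp1="vhid 6 advskew 100 pass s3cret 192.168.42.26/24"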

Note that a CARP interface starts out in the backup state when it is brought up. We want virtual addresses to float back to their original server when the server becomes operational. My solution, which is not thoroughly tested at this point, is to have a cronjob at boot and every ten minutes that sets the state, e.g. on mail01:

[root@mail01 ~]# tail -n 2 /etc/crontab 
@reboot root /bin/sleep 30; /sbin/ifconfig carp0 state master; /sbin/ifconfig carp5 state master
*/10 * * * * root /sbin/ifconfig carp0 state master; /sbin/ifconfig carp5 state master

If you decide to create the aforementioned virtual address distribution service, these cronjobs become unnecessary (and disruptive).

The following image shows that HAProxy has entered the scene and that Postfix answers on the virtual addresses:

20130615-high-availability-with-carp-and-haproxy-11

The haproxy.conf shown on the image has been changed a bit. It turned out that checks every ten seconds, coming from five HAProxy instances, generate a lot of useless log lines. So far, I have chosen only to monitor the backup servers/services once every minute. (Correction: The uploaded haproxy.conf files at the bottom of this post still check all servers every ten seconds. It is left as an exercise to the reader to adjust this for his/her particular setup.)

HAProxy on a given server is configured such that it only forwards to services on the other servers if its own services are down. If one of its own services is down, it loadbalances in a round robin fashion over the corresponding service on the other servers. The idea is that we do not want to generate extra traffic between the servers if we do not have to. It should be more the exception than the rule that a local service is down.
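As a sketch, assuming check intervals and server names of my own choosing (ports 25 and 8025 match the netstat output further down), the SMTP part of haproxy.conf on mail01 could look like this. The “option allbackups” directive is what makes HAProxy loadbalance over all backup servers instead of only using the first one:

listen smtp
    bind 0.0.0.0:25
    mode tcp
    option tcplog
    balance roundrobin
    option allbackups
    # The local service is the only non-backup server, so all traffic
    # stays on this machine as long as the local Postfix answers.
    server local  127.0.0.1:8025     check inter 10s
    # The other servers are only used while the local service is down.
    server mail02 192.168.42.12:8025 check inter 60s backup
    server mail03 192.168.42.13:8025 check inter 60s backup
    server mail04 192.168.42.14:8025 check inter 60s backup
    server mail05 192.168.42.15:8025 check inter 60s backup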

The following two images show tests of CARP failover, from a client’s point of view:

20130615-high-availability-with-carp-and-haproxy-12   20130615-high-availability-with-carp-and-haproxy-13

The following two images show a HAProxy failover test and the HATop ncurses client, respectively:

20130615-high-availability-with-carp-and-haproxy-14   20130615-high-availability-with-carp-and-haproxy-15

Two things about the image to the right: a) HATop is useful for gathering statistics and for toggling service maintenance. b) The glimpse of Postfix’s lines in /var/log/maillog reveals that services no longer see the real source address of a client; they only see the source addresses of the different HAProxy instances.

The remark about source addresses is especially important. Many connections/requests from one source address might trigger an abuse detection/prevention mechanism in a service. Lookups in e.g. RBLs will no longer make sense. Most protection mechanisms must be migrated from the services to HAProxy. Finally, the HAProxy access log is necessary to link internal and external addresses. Intelligent log collection tools like Splunk can be configured to do this, which means that it might not be a problem.
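For completeness, the per-connection log lines that make this correlation possible come from ordinary HAProxy logging; a sketch, with the syslog target and facility being my choice:

global
    log 127.0.0.1 local0

defaults
    log global
    # One log line per connection, including the client's source address
    # and the backend server the connection was forwarded to.
    option tcplog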

The services only listen on 127.0.0.1 and 192.168.42.1[1-5], while HAProxy listens on all interfaces, i.e. also on the floating virtual addresses:

[root@mail01 ~]# netstat -an | grep LISTEN | sort
tcp4       0      0 *.110                  *.*                    LISTEN
tcp4       0      0 *.143                  *.*                    LISTEN
tcp4       0      0 *.25                   *.*                    LISTEN
tcp4       0      0 *.587                  *.*                    LISTEN
tcp4       0      0 10.0.3.15.22           *.*                    LISTEN
tcp4       0      0 127.0.0.1.22           *.*                    LISTEN
tcp4       0      0 127.0.0.1.8025         *.*                    LISTEN
tcp4       0      0 127.0.0.1.8110         *.*                    LISTEN
tcp4       0      0 127.0.0.1.8143         *.*                    LISTEN
tcp4       0      0 127.0.0.1.8587         *.*                    LISTEN
tcp4       0      0 192.168.42.11.22       *.*                    LISTEN
tcp4       0      0 192.168.42.11.8025     *.*                    LISTEN
tcp4       0      0 192.168.42.11.8110     *.*                    LISTEN
tcp4       0      0 192.168.42.11.8143     *.*                    LISTEN
tcp4       0      0 192.168.42.11.8587     *.*                    LISTEN
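The post does not show how the services were bound to those sockets, but as a hypothetical Postfix example, the extra smtpd listener on port 8025 could be declared in master.cf like this (the exact lines are my guess, not taken from the setup):

# /etc/postfix/master.cf (excerpt): run smtpd on port 8025, bound to
# loopback and to the server's own address only.
127.0.0.1:8025      inet  n  -  n  -  -  smtpd
192.168.42.11:8025  inet  n  -  n  -  -  smtpd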

I know I also mentioned the protocols POP3S, IMAPS and HTTP(S) (webmail) at the beginning, but these services are not yet installed, and the corresponding frontends and backends in HAProxy are disabled. In haproxy.conf, “balance source” is added to the HTTP(S) backends, as session persistence is needed in the event of round robin loadbalancing over backup servers/services. As is evident from the current, non-redundant setup, persistence is not needed with pure DNS round robin, as a browser gets a single DNS reply from the operating system and remembers it for at least some time (15–30 minutes have been observed). This is yet another reason for keeping addresses alive rather than editing DNS records when performing scheduled maintenance or when working around a crashed server.
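As a sketch of what such a disabled webmail section could contain, with the backend port 8080 and everything else about the HTTP service being my assumption, since it is not installed yet:

listen webmail
    bind 0.0.0.0:80
    mode http
    option httplog
    # Hash the client's source address so a given client sticks to the
    # same backup server while the local service is down.
    balance source
    option allbackups
    server local  127.0.0.1:8080     check inter 10s
    server mail02 192.168.42.12:8080 check inter 60s backup
    server mail03 192.168.42.13:8080 check inter 60s backup
    server mail04 192.168.42.14:8080 check inter 60s backup
    server mail05 192.168.42.15:8080 check inter 60s backup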

Let me end this saga with links to scripts and configuration files:

Easter egg on the occasion of Copenhell:

As I Lay Dying – An Ocean Between Us

One of those weeks

I’m slacking on my couch thinking back on the past week… I have been quite busy, at least work-wise. For some reason, however, my list of assigned tasks/changes/service requests/etc. has grown from 18 to 34. *sigh* 🙂

I’m not really accustomed to saying no, or saying “Listen, that sounds mighty exciting, and I would love to set that up, but I’m swamped as it is” or “What’s the priority? Is it allowed to preempt tasks X and Y? Because it sort of has to, if you want it done.” In my experience, most people want to please others, and I’m no different. Saying no hurts. You also feel a bit inadequate. People are poking at your limits.

Maybe the title of this post is misleading. “One of those weeks” sounds like a recurring event, and this week has actually been somewhat special. My musings above should indicate that. The only recurring thing, which probably everybody can relate to, is that I’m a little worn out and need the weekend 🙂

Being busy isn’t all bad. It indicates that I have something to do, which isn’t a given in any industry these days. Also, during the week I got the chance to play with FreeBSD and CARP for a customer. Lovely software. Easy to set up and it just works. No matter what you think about virtualization in the server room, it’s an awesome invention for testing stuff. In this case I installed VirtualBox and created three guests with FreeBSD 9.1. Using the camera in my phone, I created the following two videos:

Simple test of FreeBSD 9.1 and CARP – part 1 of 2

Simple test of FreeBSD 9.1 and CARP – part 2 of 2

Have a nice weekend wherever you are! I will be cleaning my apartment, washing clothes, getting a visit from my parents (and maybe my brother) and looking at a house. The house is located in the small village Store Brøndum (“Store” is Danish for “big”, which makes it a funny name for a rather small village…) out in the countryside, approximately midway between Aalborg and Hadsund. I don’t want to unveil more at this point.

Party on if you’re at Copenhell \m/

Pictures

Here is a glimpse of the end of May and the beginning of June.

My parents’ two dogs, Asterix and Birko:


A few pictures from Aalborg Carnival 2013 on May 25:


Georg bought – and almost ate – a Whopper burger with nine beef patties 🙂
According to this article, that’s 9 x 113.4 = 1,020.6 grams of beef patty!

On June 5, I ran 10 km (actually 9.85 km) in 49 minutes and 22 seconds at the Grundlovsløbet event in the town of Hjørring:

Hopefully that will mark the time in 2013 when I got back into a somewhat stable running rhythm.