
SSH auto-reconnect

You're connected via SSH to a remote machine, and you need to reboot it. The reboot, as expected, kicks you out. After a while, you think "it should be back up by now", so you try to SSH again. "Ok, it's not up yet". Retry with up arrow, enter. "Uhm, not yet". Up arrow, enter. "You know that old chestnut, it's slow". Up arrow, enter, up arrow, enter...at some point you get "connection refused". "Ah, now network is up, probably it's doing fsck". Up arrow, enter. "Darn, this is taking quite some time". Up arrow, enter. "Come on, let me in already!". And so on, trying and trying, until the SSH attempt is successful and you're finally back in.

Certainly not a big deal, but it can be made a bit simpler. If you've ever found yourself in this situation, here's a simple function to make the above automatic. The idea is: keep reconnecting as long as the ssh exit status is non-zero.

sssh(){
  # try to connect every 0.5 secs (modulo timeouts)
  while true; do command ssh "$@"; [ $? -eq 0 ] && break || sleep 0.5; done
}

Just use sssh (for "sticky ssh") in place of ssh and you're done.

SSH can terminate with non-zero exit status if it fails to connect, or if the remote shell exits with non-zero exit status (which includes being kicked out due to a reboot).

Inside the function, ssh is invoked as command ssh, so if you're brave you can even rename the function to be called ssh.

Obviously, there are caveats if you use sssh in "normal" (non-reboot) scenarios: since it cannot tell the reason for a non-zero exit status, if the very last command executed in the remote shell happens to fail just before the session is terminated, sssh will duly reconnect. The same happens if a non-existent host or other wrong parameters are specified: sssh will just keep failing. Whether this is a problem has to be considered on a case-by-case basis.

So a useful variation could be to only try reconnecting if the exit status is 255 (which also works for the reboot-kicks-you-out case):

sssh(){
  # try to connect every 0.5 secs (modulo timeouts)
  while true; do command ssh "$@"; [ $? -ne 255 ] && break || sleep 0.5; done
}

However, this still doesn't help with the "ssh wrong@parameters" case, which also exits with status 255.
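A further variation can mitigate that: cap the number of attempts, so that sssh eventually stops when the failure is permanent. Here's a sketch (the SSSH_MAX_TRIES knob and its default of 240 attempts, ie roughly two minutes plus connect timeouts, are arbitrary additions of mine):

```shell
sssh(){
  # retry only while ssh exits 255 ("failed to connect"), and give up
  # after a maximum number of attempts so that a wrong invocation
  # cannot loop forever; SSSH_MAX_TRIES and its default are arbitrary
  local tries=0 rc max="${SSSH_MAX_TRIES:-240}"
  while true; do
    command ssh "$@"; rc=$?
    [ "$rc" -ne 255 ] && return "$rc"
    tries=$((tries + 1))
    if [ "$tries" -ge "$max" ]; then
      echo "sssh: still failing after $tries attempts, giving up" >&2
      return "$rc"
    fi
    sleep 0.5
  done
}
```

Like the original, it propagates any non-255 exit status immediately, so the remote command's status is preserved.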

Sample session excerpt (yours may be different of course):

local ~ $ sssh root@scooter
# ... do a few things ...
scooter ~ # reboot

Broadcast message from root@scooter.mps (pts/0) (Fri Apr 19 15:44:40 2013):
The system is going down for reboot NOW!
scooter ~ # Connection to scooter closed by remote host.
Connection to scooter closed.
ssh: connect to host scooter port 22: Connection refused
ssh: connect to host scooter port 22: Connection refused
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: Connection refused
ssh: connect to host scooter port 22: Connection refused
ssh: connect to host scooter port 22: Connection refused
ssh: connect to host scooter port 22: Connection refused
Last login: Fri Apr 19 13:55:30 2013 from local
scooter ~ #

Update 13/09/2013: Since some people are actually replacing their regular ssh command with the hack described here, and then of course something does not work as expected (for example, the ssh-copy-id that comes with recent versions of ssh does not work), here's a script that can be used to run a command or script using the native ssh binary:

#!/bin/bash
# rssh: runs a command using the "real" ssh
unset -f ssh
exec "$@"

So one can do, for example, rssh ssh-copy-id -i ~/.ssh/id_dsa.pub machine and have ssh-copy-id run using the real ssh. If you want to run other commands with the overridden ssh afterwards in the same shell, the following can be used instead:

#!/bin/bash
 
# save the definition of the overriding function
ssh_func=$(declare -f ssh)
 
# remove it, from here on "ssh" means the real ssh
unset -f ssh
 
# run whatever we have to run
"$@"
 
# override "ssh" again
eval "$ssh_func"
 
# here we run again with the overridden ssh
...

Firewall HA with conntrackd and keepalived

As mentioned earlier, let's see how to add HA to a linux/iptables-based firewall by means of keepalived and conntrackd.

There are a few scenarios for firewall HA. Probably, the most common one is the "classic" active-backup case where, at any time, one firewall is active and manages traffic, and the other is a "hot standby" ready to take over if the active one fails. In principle, since all the tools we're going to use can communicate over multicast, it should be possible to extend the setup described here to more than two firewalls.

We're going to assume this network setup:

(network diagram: two firewalls, fw1 and fw2, each with eth0 facing the Internet, eth1 on the internal LAN, and eth2 as a dedicated synchronization link between them)

The two firewalls have a dedicated link (interface eth2 on both machines) to exchange session table synchronization messages, which is the recommended setup. If that is not possible, another interface can be used (for example, the internal LAN interface eth1). In that case, the configuration shown below should be adapted accordingly (essentially, use eth1 and 172.16.10.x instead of eth2 and 10.0.0.x, where x varies depending on the firewall). However, beware that the recommendation of using a dedicated link exists for a reason: conntrackd can produce a lot of traffic. On a moderately busy firewall (about 33K connections on average), a quick test showed up to 1.6 Mbit/s of conntrackd synchronization traffic between the firewalls.

keepalived

The basic idea is that keepalived manages the failover of the (virtual) IPs using the standard VRRP protocol: at any time, the firewall that owns the virtual IPs replies to ARP requests (neighbor solicitations for IPv6) and thus receives the traffic. This is accomplished by sending gratuitous ARPs for IPv4 and "gratuitous" neighbor advertisements for IPv6 when the firewall becomes active; any HA product that has to move IPs uses this method.

Since the VRRP protocol performs failover of the virtual IPs, one may think that it's all that we need to get HA. For some applications, this may be true; however, in the case of stateful firewalls a crude VRRP-only failover would disrupt existing sessions. The keyword here is stateful, that is, the firewall keeps a table of active sessions with various pieces of metadata about each one. When a previously idle firewall becomes active, it suddenly starts receiving packets belonging to established sessions, which however it knows nothing about. Thus, it would kill them, or try to handle the packets locally; in all cases, sessions would be disrupted. (We will see later that this problem can still occur for short times even when using conntrackd, but can be easily solved). For small setups it may be relatively fine, but if the firewall is a busy one the failover can kill hundreds of sessions. If we're serious about HA, VRRP alone is not enough; the connection tracking table has to be kept in sync among firewalls, and this is where conntrackd comes into play.

conntrackd

Conntrackd is a complex tool. It can be used to collect traffic statistics on the firewalls, but also (and this is what we want here) to keep the stateful session table synchronized between the firewalls, so that at any time they have the same information. Session information can be exchanged in a few different ways; here we're going to use the recommended method (called FTFW), which uses a reliable messaging protocol. In turn, FTFW can use multicast or unicast UDP as its transport; here we're using unicast. The sample configuration files that come with conntrackd have comments that explain how to set up multicast UDP if one wants to.

By default, there are two locations where session information is stored. The so-called internal cache is where the firewall stores its local session table (ie, sessions for which it's passing traffic); this is a (partial) copy of the kernel session table, which can be inspected with tools like conntrack (without the trailing d). The external cache, on the other hand, is where the firewall stores sessions it learns from the other firewall(s). During normal operation, the firewalls continuously exchange messages to inform the peer(s) about their session tables and any changes to them, so at any time each firewall knows its own and the other firewall's sessions. When using two firewalls, one firewall's internal cache should match the other's external one, and vice versa.

When a firewall becomes active following a failover, it invokes a script that commits the external cache into the kernel table, and then resyncs the internal cache using the kernel table as the origin; the result is that from that moment on the firewall can start managing sessions for which it had not seen a single packet until then, just as if it had been managing them from their beginning. This is much better than what we would get if using only pure VRRP failover.
The commit script is invoked by keepalived when it detects that the firewall is changing state. The script is called primary-backup.sh and comes with conntrackd; most distributions put it into the documentation directory (eg /usr/share/doc/conntrackd or similar). The same script is invoked upon any state change (when the firewall becomes active, becomes backup, or fails); it knows what happened because it's passed a different argument for each possible state.
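For reference, here's a heavily simplified sketch of what the "primary" branch of that script boils down to (the conntrackd_notify name is mine, and the real primary-backup.sh also does logging, error checking, and more careful handling of the backup and fault states):

```shell
conntrackd_notify() {
  # invoked by keepalived with the new state as the only argument
  case "$1" in
    primary)
      conntrackd -c   # commit the external cache into the kernel table
      conntrackd -f   # flush the userspace caches
      conntrackd -R   # resync the internal cache with the kernel table
      conntrackd -B   # send a bulk update to the peer(s)
      ;;
    backup)
      conntrackd -n   # request a full resync from the current primary
      ;;
  esac
}
```

The key step is the commit (-c), which is exactly the "commit the external cache into the kernel table" described above.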

Note that it is also possible to disable the external cache (see the DisableExternalCache configuration directive). This way, all the sessions (local and learned) will always be stored directly in the kernel table/internal cache. This means that nothing needs to be done upon failover (or at most, only resyncing the internal cache with the kernel table), as the information the firewall needs to take over is already where it should be (the internal cache). So one may wonder why bother with the external cache at all; the official documentation mentions efficiency and resource usage concerns. Personally, I've found that using the external cache works fairly well, so I never needed to mess about and disable it.

Configuration files

Here are the configuration files used for the scenario described here. keepalived.conf:

vrrp_sync_group G1 {
    group {
        E1
        I1
    }
    notify_master "/etc/conntrackd/primary-backup.sh primary"
    notify_backup "/etc/conntrackd/primary-backup.sh backup"
    notify_fault "/etc/conntrackd/primary-backup.sh fault"
}

vrrp_instance E1 {
    interface eth0
    state BACKUP
    virtual_router_id 61
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass zzzz
    }
    virtual_ipaddress {
        10.15.7.100/24 dev eth0
        2001:db8:15:7::100/64 dev eth0 
    }
    nopreempt
    garp_master_delay 1
}

vrrp_instance I1 {
    interface eth1
    state BACKUP
    virtual_router_id 62
    priority 100
    advert_int 1
    authentication {
        auth_type PASS
        auth_pass zzzz
    }
    virtual_ipaddress {
        172.16.10.100/24 dev eth1
        2001:db8:16:10::100/64 dev eth1 
    }
    nopreempt
    garp_master_delay 1
}

The above is from fw1; on fw2 it's the same but the priority of each instance is 50 instead of 100.

conntrackd.conf (comments removed):

Sync {
    Mode FTFW {
        DisableExternalCache Off
        CommitTimeout 1800
        PurgeTimeout 5
    }

    UDP {
        IPv4_address 10.0.0.1
        IPv4_Destination_Address 10.0.0.2
        Port 3780
        Interface eth2
        SndSocketBuffer 1249280
        RcvSocketBuffer 1249280
        Checksum on
    }
}

General {
    Nice -20
    HashSize 32768
    HashLimit 131072
    LogFile on
    Syslog on
    LockFile /var/lock/conntrack.lock
    UNIX {
        Path /var/run/conntrackd.ctl
        Backlog 20
    }
    NetlinkBufferSize 2097152
    NetlinkBufferSizeMaxGrowth 8388608
    Filter From Userspace {
        Protocol Accept {
            TCP
            UDP
            ICMP # This requires a Linux kernel >= 2.6.31
        }
        Address Ignore {
            IPv4_address 127.0.0.1 # loopback
            IPv4_address 10.0.0.1
            IPv4_address 10.0.0.2
            IPv4_address 172.16.10.100
            IPv4_address 172.16.10.101
            IPv4_address 172.16.10.102
            IPv4_address 10.15.7.100
            IPv4_address 10.15.7.101
            IPv4_address 10.15.7.102
            IPv6_address 2001:db8:15:7::100
            IPv6_address 2001:db8:15:7::101
            IPv6_address 2001:db8:15:7::102
            IPv6_address 2001:db8:16:10::100
            IPv6_address 2001:db8:16:10::101
            IPv6_address 2001:db8:16:10::102
        }
    }
}

Again, the above is taken from fw1; on fw2, the UDP section has the source/destination IP addresses inverted.

The "Address Ignore" block should list ALL the IPs the firewall has (or can have) on local interfaces, including the VIPs. It doesn't hurt to include some extra IP (eg those of the other firewall).

The "well-formed ruleset"

(Just in case you're testing with everything set to ACCEPT and it doesn't work)

One thing that is mentioned in the documentation but imho not stressed enough is the fact that the firewall MUST have what they call a "well-formed ruleset", which essentially means that the firewall must DROP (not accept nor reject) any packet it doesn't know about. It's explained better in this email from the netfilter mailing list.

We briefly touched on this issue earlier: even with conntrackd, it may still happen that during a failover the firewall that is becoming active receives a packet belonging to a session it doesn't yet know about (eg because failover isn't instantaneous and the firewall hasn't finished committing the external cache). Under normal conditions, the firewall's local TCP/IP stack may try to process such packets, almost certainly ending up sending TCP RST or ICMP errors to one or both connection parties, and thus disrupting the session. One case is especially critical; it goes like this: an internal client is receiving data (eg downloading) from an external server, a failover happens, some of the packets the server is sending hit the firewall that is becoming active, which isn't fully synced yet, so it sends a RST to the server. Result: the server closes its side, but the client in the LAN still thinks the connection is valid, and hangs waiting for data. If it's the client that gets the RST, what happens depends on the specific application; it may exit, or retry.

The moral of the story thus is that, for the failover to be seamless, it's critical that the firewall ignore (drop, not reject) packets it doesn't know about. In particular, a packet coming from the outside belonging to a NATed connection looks just like a packet addressed to the firewall, if the firewall has no state for the connection; so those packets have to be DROPped in the INPUT chain. In practice, this probably means a default DROP policy for the INPUT chain (ok, being a firewall it probably does that anyway, but better be explicit). Similarly, a DROP policy for the FORWARD chain will also help.

All this works because, if the firewall drops unknown traffic, TCP or whatever protocol the communicating parties are using will notice the loss and sort it out (eg by retransmitting packets).
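In iptables terms, a minimal skeleton of such a ruleset might look like the following (a sketch only: the specific ACCEPT rules are illustrative, and a real firewall obviously has many more):

```shell
# Default policies: silently drop anything not explicitly allowed
iptables -P INPUT   DROP
iptables -P FORWARD DROP

# Let through packets belonging to known (or conntrackd-synced) sessions
iptables -A INPUT   -m state --state ESTABLISHED,RELATED -j ACCEPT
iptables -A FORWARD -m state --state ESTABLISHED,RELATED -j ACCEPT

# ... explicit ACCEPT rules for the new connections you want to allow ...

# Note what is absent: no catch-all REJECT. Unknown packets fall through
# to the DROP policies, so a firewall that isn't fully synced yet stays
# silent and the endpoints simply retransmit.
```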

Testing

So, for example, we can download a Debian DVD image on two or more clients, to keep them busy with a long-running TCP connection:

wget -O/dev/null 'http://cdimage.debian.org/debian-cd/6.0.7/amd64/iso-dvd/debian-6.0.7-amd64-DVD-1.iso'

Open some other less intensive task, like ssh or telnet sessions, and perhaps watch some Internet video. In short, create many connections to the Internet through the active firewall. Once all this is in place, log into the active firewall (the one that has the VIPs), and stop or restart keepalived, to force a failover to the other firewall (if you stop keepalived, remember to start it again later before doing further tests). If everything is set up correctly, the VIPs should move to the other box and the active sessions in the LAN should keep working flawlessly. That's it! For more thorough testing, the failover process can be repeated many times (within reason), and every time it should be transparent to clients.

Here's a script that forces a failover between fw1 and fw2 and viceversa every N seconds, where N is a random number between 61 and 120 (of course, this is just for testing purposes):

#!/bin/bash
 
declare -a fws
fws=( 172.16.10.101 172.16.10.102 )
 
i=0
maxi=$(( ${#fws[@]} - 1 ))
 
while true; do
  [ $i -gt $maxi ] && i=0
  fw=${fws[$i]}
 
  #echo "deactivating $fw"
  ssh root@"${fw}" '/etc/init.d/keepalived restart'
 
  # interval between 61 and 120 seconds
  period=$(($RANDOM % 60 + 61))
  #echo "sleeping $period seconds..."
  sleep $period
 
  ((i++))
done

IPv6 NAT, the day has come

So they finally made it:

router # ip6tables -t nat -A POSTROUTING -s 2001:db8:ffff::/64 -o eth1 -j MASQUERADE
client # ip -6 addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 16436
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
2: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qlen 1000
    inet6 2001:db8:ffff::cafe/64 scope global
       valid_lft forever preferred_lft forever
    inet6 fe80::216:3eff:fec8:871e/64 scope link
       valid_lft forever preferred_lft forever
client # ping6 -c 4 -n www.google.com
PING www.google.com(2a00:1450:4004:803::1010) 56 data bytes
64 bytes from 2a00:1450:4004:803::1010: icmp_seq=1 ttl=54 time=78.6 ms
64 bytes from 2a00:1450:4004:803::1010: icmp_seq=2 ttl=54 time=70.9 ms
64 bytes from 2a00:1450:4004:803::1010: icmp_seq=3 ttl=54 time=71.2 ms
64 bytes from 2a00:1450:4004:803::1010: icmp_seq=4 ttl=54 time=72.5 ms

--- www.google.com ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3003ms
rtt min/avg/max/mdev = 70.986/73.343/78.619/3.105 ms

Ah right, because NAT gives you, um, more security. Oh yeah.

IPv6 address normalization

So we have this application that sends us IPv6 addresses in some odd (though valid) format, like

2001:db8:0:0:0:0:cafe:1111
2001:db8::a:1:2:3:4
2001:0DB8:AAAA:0000:0000:0000:0000:000C
2001:db8::1:0:0:0:4

and we need to compare them to addresses stored elsewhere (for example, in the output of ip6tables-save). However we find that, though the same addresses do already exist "elsewhere", there they look like this:

2001:db8::cafe:1111
2001:db8:0:a:1:2:3:4
2001:db8:aaaa::c
2001:db8:0:1::4

So our code thinks they're not present, and processes them again (or adds them twice, or whatever). With IPv4, of course, there was no such problem.

What we need is a way to "normalize" all these addresses so that if two addresses really are the same, their string representation must be the same as well, so they can be compared.

There is a proposed standard (RFC 5952) for IPv6 address text representation, which essentially boils down to these simple rules:

  1. Always suppress leading zeros in each field, eg "0004" becomes just "4", "0000" becomes "0"
  2. Always use lowercase letters ("1AB4" becomes "1ab4")
  3. Regarding replacement of runs of zero-valued 16-bit fields with "::" (as long as the run is composed of more than one field), always replace the longest run; in case of runs of the same length, replace the first (leftmost) run. As said, a single zero-valued 16-bit field is not considered a run and is not touched by this algorithm (but rule 1 above still applies).

The two functions we want are inet_pton() and inet_ntop(), which (at least in glibc) do implement the above rules. Although they are C library functions, the most popular scripting languages expose an interface to them so they can be called from scripts.

inet_pton() takes a string and converts it into an internal format (failing if the string is invalid, so it can also be used to validate the address string), and inet_ntop() takes an address in internal format and converts it into a string representation that follows the rules above.
I'm being purposely vague about the "internal format" because its representation may be platform-dependent, and scripting languages could obfuscate it even more, so it's better not to mess with it and just treat it as a sort of black box. At least with Perl and Python, it's the binary representation of the address.

Perl

#!/usr/bin/perl
use warnings;
use strict;
use Socket qw(AF_INET6 inet_ntop inet_pton);
 
while(<>){
  chomp;
  my $internal = inet_pton(AF_INET6, $_);
  if (defined($internal)) {
    print inet_ntop(AF_INET6, $internal), "\n";
  } else {
    print "Invalid address $_\n";
  }
}

If we're 100% sure that the input address is valid, then the main loop can be shortened to

while(<>){
  chomp;
  print inet_ntop(AF_INET6, inet_pton(AF_INET6, $_)), "\n";
}

Though extra checking surely can't hurt and avoids surprises.
Note that this needs a relatively recent Perl; for example, the Socket module of Perl 5.10.1 does not export AF_INET6 and friends (though ISTR there used to be a Socket6 module which did, but it had to be installed separately).

Python 2

#!/usr/bin/python
 
import os
import sys
import socket
 
for addr in sys.stdin:
  addr = addr.rstrip(os.linesep)
 
  try:
    internal = socket.inet_pton(socket.AF_INET6, addr)
    print socket.inet_ntop(socket.AF_INET6, internal)
 
  except socket.error:
    print "Invalid address " + addr

Again, if validation can be skipped, it can be written as

for addr in sys.stdin:
  print socket.inet_ntop(socket.AF_INET6, socket.inet_pton(socket.AF_INET6, addr.rstrip(os.linesep)))

which in turn can be turned into a one-liner by inlining the loop as a list comprehension:

print "\n".join([socket.inet_ntop(socket.AF_INET6, socket.inet_pton(socket.AF_INET6, addr.rstrip(os.linesep))) for addr in sys.stdin])

Code in action

So now running the original list of addresses through the normalization code we get:

$ echo '2001:db8:0:0:0:0:cafe:1111
2001:db8::a:1:2:3:4
2001:0DB8:AAAA:0000:0000:0000:0000:000C
2001:db8::1:0:0:0:4' | normalize.py
2001:db8::cafe:1111
2001:db8:0:a:1:2:3:4
2001:db8:aaaa::c
2001:db8:0:1::4

Check signal handling of a process

Here's a quick and dirty program to check what a given process (specified by its PID) is doing with regard to signal handling. It gets its information from the virtual file /proc/<PID>/status, reading the SigBlk:, SigIgn:, and SigCgt: values. Each one is a 64-bit mask indicating which signals the process is blocking, ignoring, and handling (catching), respectively.

#!/usr/bin/perl
 
# use as: $0 <PID>
 
use warnings;
use strict;
use bignum;
use Config;
 
defined $Config{sig_name} or die "Cannot find signal names in Config";
my @sigs = map { "SIG$_" } split(/ /, $Config{sig_name});  
 
my $statfile = "/proc/$ARGV[0]/status";
 
open(S, "<", $statfile) or die "Cannot open status file $statfile";
 
while(<S>) {
  chomp;
  if (/^Sig(Blk|Ign|Cgt):\s+(\S+)/) {
    if (my @list = grep { oct("0x$2") & (1 << ($_ - 1)) } (1..64) ) {
      print "$1: " . join(", ", map { "$sigs[$_] ($_)" } @list) . "\n";
    }
  }
}
 
close(S);

Sample usage:

$ checksignals.pl 1747
Ign: SIGPIPE (13)
Cgt: SIGHUP (1), SIGINT (2), SIGQUIT (3), SIGTERM (15), SIGCHLD (17), SIGWINCH (28), SIGNUM32 (32), SIGNUM33 (33)
$ checksignals.pl $$
Blk: SIGCHLD (17)
Ign: SIGQUIT (3), SIGTERM (15), SIGTSTP (20), SIGTTIN (21), SIGTTOU (22)
Cgt: SIGHUP (1), SIGINT (2), SIGILL (4), SIGTRAP (5), SIGABRT (6), SIGBUS (7), SIGFPE (8), SIGUSR1 (10), SIGSEGV (11), SIGUSR2 (12), SIGPIPE (13), SIGALRM (14), SIGCHLD (17), SIGXCPU (24), SIGXFSZ (25), SIGVTALRM (26), SIGWINCH (28), SIGSYS (31)

Of course it's racy, like almost anything that messes with files in /proc, but it may be good enough for most situations.
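If all you need is a one-off check for a single signal, the same can be done from the shell without Perl. A sketch, assuming Linux's /proc layout as above (here checking whether the current shell blocks SIGCHLD):

```shell
# Does the current shell block SIGCHLD (signal 17)? Read the SigBlk
# mask from /proc and test the corresponding bit (bash arithmetic is
# 64-bit, which is just enough for the 64-signal mask).
pid=$$
sig=17
blk=$(awk '/^SigBlk:/ {print $2}' "/proc/$pid/status")
if (( (16#$blk >> (sig - 1)) & 1 )); then
  echo "signal $sig is blocked"
else
  echo "signal $sig is not blocked"
fi
```

Substitute SigIgn: or SigCgt: in the awk pattern to check the other two masks.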