SSH auto-reconnect

You're connected via SSH to a remote machine, and you need to reboot it. The reboot, as expected, kicks you out. After a while, you think "it should be back up by now", so you try to SSH again. "Ok, it's not up yet". Retry with up arrow, enter. "Uhm, not yet". Up arrow, enter. "You know that old chestnut, it's slow". Up arrow, enter, up arrow, enter...at some point you get "connection refused". "Ah, now network is up, probably it's doing fsck". Up arrow, enter. "Darn, this is taking quite some time". Up arrow, enter. "Come on, let me in already!". And so on, trying and trying, until the SSH attempt is successful and you're finally back in.

Certainly not a big deal, but it can be made a bit simpler. If you've ever found yourself in this situation, here's a simple function to make the above automatic. The idea is: keep reconnecting as long as the ssh exit status is non-zero.

sssh(){
  # try to connect every 0.5 secs (modulo timeouts)
  while true; do
    command ssh "$@" && break
    sleep 0.5
  done
}

Just use sssh (for "sticky ssh") in place of ssh and you're done.

SSH can terminate with non-zero exit status if it fails to connect, or if the remote shell exits with non-zero exit status (which includes being kicked out due to a reboot).

Inside the function, ssh is invoked as command ssh, which skips shell functions when looking up the name. So if you're brave you can even rename the function to be called ssh, and it will not call itself recursively.
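To see what command does without touching ssh at all, here is a small sketch. The function name demo_command_builtin is made up for illustration, and pwd is only a harmless stand-in for ssh. The body runs in a subshell, so the shadowing function disappears when the demo ends.

```shell
# demo_command_builtin: hypothetical demo, not part of the original post.
# The parentheses make the body run in a subshell, so the fake "pwd"
# defined inside does not leak into the calling shell.
demo_command_builtin() (
  # shadow the standard pwd with a function of the same name
  pwd() { echo "shadowed"; }

  # a plain call finds the function first
  echo "plain:   $(pwd)"

  # "command" skips function lookup and runs the real pwd
  echo "command: $(command pwd)"
)

demo_command_builtin
```

The first line of output shows the function's text, the second shows the actual working directory. A function named ssh that calls command ssh relies on the same lookup rule.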

Obviously, there are caveats when using sssh in "normal" (non-reboot) scenarios. Since it cannot tell the reason for a non-zero exit status, it will duly reconnect if the very last command executed in the remote shell fails and the session is then terminated. The same happens if a nonexistent host or other wrong parameters are specified: sssh will keep failing and retrying until interrupted. Whether this is an issue has to be considered on a case-by-case basis.
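The "last command failed" caveat can be reproduced locally, without any remote machine. This is only a sketch: session_status is a made-up helper, and sh -c stands in for the remote login shell. It relies on the fact that exit without an argument reuses the status of the last command executed.

```shell
# session_status: hypothetical helper, not part of the original post.
# Runs a one-command "session" in a child shell and reports how it ended.
session_status() {
  local status=0
  # "exit" with no argument returns the status of the previous command
  sh -c "$1; exit" || status=$?
  echo "session ended with status $status"
}

session_status true    # prints: session ended with status 0
session_status false   # prints: session ended with status 1
```

In the second case the first version of sssh would see a non-zero status and reconnect, even though nothing went wrong with the connection itself.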

So a useful variation could be to only try reconnecting if the exit status is 255, which is what ssh itself returns when an error occurs, such as a failed or dropped connection (so it also works for the reboot-kicks-you-out case):

sssh(){
  # try to connect every 0.5 secs (modulo timeouts)
  while true; do
    command ssh "$@"
    [ $? -ne 255 ] && break
    sleep 0.5
  done
}

However, this still retries forever in the "ssh wrong@parameters" case, because ssh also returns 255 for errors such as a host name that cannot be resolved.
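One possible way to soften the "keeps failing forever" problem is to cap the number of attempts. The following is a sketch only: the name sssh_max and its first argument (the maximum number of attempts) are invented here and are not part of the original function.

```shell
# sssh_max: hypothetical variant of sssh that gives up after a fixed
# number of attempts. Usage: sssh_max MAX_ATTEMPTS [ssh arguments...]
sssh_max(){
  local max="$1" attempt=0 status
  shift
  while [ "$attempt" -lt "$max" ]; do
    status=0
    command ssh "$@" || status=$?
    # 255 is ssh's own error status; any other value ends the loop
    if [ "$status" -ne 255 ]; then
      return "$status"
    fi
    attempt=$((attempt + 1))
    sleep 0.5
  done
  echo "sssh_max: giving up after $max attempts" >&2
  return 255
}
```

For example, sssh_max 120 root@scooter would keep trying for roughly a minute (plus any connection timeouts) before giving up, which is usually enough for a reboot but does not loop forever on a mistyped host name.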

Sample session excerpt (yours may be different of course):

local ~ $ sssh root@scooter
# ... do a few things ...
scooter ~ # reboot

Broadcast message from root@scooter.mps (pts/0) (Fri Apr 19 15:44:40 2013):
The system is going down for reboot NOW!
scooter ~ # Connection to scooter closed by remote host.
Connection to scooter closed.
ssh: connect to host scooter port 22: Connection refused
ssh: connect to host scooter port 22: Connection refused
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: No route to host
ssh: connect to host scooter port 22: Connection refused
ssh: connect to host scooter port 22: Connection refused
ssh: connect to host scooter port 22: Connection refused
ssh: connect to host scooter port 22: Connection refused
Last login: Fri Apr 19 13:55:30 2013 from local
scooter ~ #

Update 13/09/2013: Since some people are actually replacing their regular ssh command with the hack described here, and of course then something does not work as expected (for example, the ssh-copy-id that comes with the recent versions of ssh does not work), here's a script that can be used to run some command or script using the native ssh binary:

#!/bin/bash
# rssh: runs a command using the "real" ssh
unset -f ssh
exec "$@"

So one can run, for example, rssh ssh-copy-id -i ~/.ssh/id_dsa.pub machine and have ssh-copy-id use the real ssh. If one wants to run other commands with the overridden ssh afterwards in the same shell, then the following can be used instead:

#!/bin/bash
 
# save the definition of the overriding function
ssh_func=$(declare -f ssh)
 
# remove it, from here on "ssh" means the real ssh
unset -f ssh
 
# run whatever we have to run
"$@"
 
# override "ssh" again
eval "$ssh_func"
 
# here we run again with the overridden ssh
...

One Comment

  1. Toni says:

    Stumbled upon this and it is really useful. To avoid these messages:

    ssh: connect to host scooter port 22: Connection refused
    ssh: connect to host scooter port 22: No route to host

    just add "-q"

    sssh(){
    while true; do command ssh -q "$@"; [ $? -eq 0 ] && break || sleep 0.5; done
    }