
On-demand tar copy across the network

Anyone who has worked with the command line for a while knows the classic trick to move files between machines using tar and ssh:

source$ tar -cf - -C /src/dir . | ssh user@target 'tar -xf - -C /dest/dir'

(I'll use -C here; with a tar that doesn't support it, one can always do cd /src/dir && tar ... instead.)

While this works fine, it carries encryption overhead that is unnecessary when both machines are on the same LAN. So one could use netcat (or equivalent) instead of ssh, something like the following:

target$ nc -l -p 1234 | tar -xf - -C /dest/dir
source$ tar -cf - -C /src/dir . | nc target 1234

(Adapt the netcat syntax to your specific variant, of which there are many. In particular, you may need to add a timeout or an option to handle half-closes, so that netcat terminates correctly once it has received all the input, if it doesn't do so by default.)

However, this requires setting up things in advance on the target machine, before the transfer can be started on the source. It would be nice if the target machine part could be somehow automated, so everything could be controlled entirely from the source machine. Here are some ideas.

xinetd

xinetd is packaged for most distributions, and on some it is already installed and running, so this seems like a good fit. Let's create a new service managed by xinetd (adapt as needed, of course):

service nettar
{
  port = 1234
  type = UNLISTED
  socket_type = stream
  instances = 1
  wait = no
  user = someuser
  server = /usr/local/bin/nettar.sh
  log_on_success += USERID PID HOST EXIT DURATION
  log_on_failure += USERID HOST ATTEMPT
  disable = no
}

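It is important to specify wait = no if we want xinetd to pass data to our program on standard input: with this setting xinetd accepts each connection itself and hands the connected socket to the program, whereas with wait = yes the program would receive the listening socket and have to accept the connection on its own. This is not clearly documented, see this page and this page for more information.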
The script /usr/local/bin/nettar.sh looks something like this:

#!/bin/bash
 
# this reads just one line of input
IFS= read -r destdir
 
if ! cd "$destdir"; then
  echo "Error entering destination directory $destdir" >&2
  exit 1
fi
 
# just untar what we get on stdin from xinetd
tar -xf -
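
Since xinetd simply hands the connection to the script as standard input, the logic can be tried out locally with a plain pipe before touching the xinetd configuration. The following is only a sketch: it wraps the same steps in a function (the name nettar_receive is made up for this example) and relies on the fact that the shell's read, when reading from a pipe, consumes its input one byte at a time, so nothing past the first newline is taken away from tar.

```shell
#!/bin/bash

# nettar_receive: same steps as nettar.sh, wrapped in a function so it
# can be fed from a local pipe. The function name is hypothetical.
# The body is a subshell, so the cd does not affect the caller.
nettar_receive() (
  # first line of input: the destination directory
  IFS= read -r destdir || exit 1

  if ! cd "$destdir"; then
    echo "Error entering destination directory $destdir" >&2
    exit 1
  fi

  # everything after the first line is the tar stream
  tar -xf -
)

# Local dry run, no network involved (paths are examples only):
#   { echo "/tmp/dest"; tar -cf - -C /tmp/src . ; } | nettar_receive
```

If the dry run extracts the files correctly, the same script should behave the same way when xinetd feeds it from a socket.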

So on the source machine we can then do

source$ { echo "/dest/dir"; tar -cf - -C /src/dir . ; } | nc target 1234

and wait for the transfer to complete. Obviously this can all be put inside a wrapper script that does the necessary sanity checks to ensure that the first line of what we send to the target machine is indeed the name of the remote directory.
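
A wrapper along those lines might look like the sketch below. The function name nettar_stream and the specific checks are assumptions made for illustration; the only thing taken from the setup above is the format of the stream (one line with the destination directory, then the tar data).

```shell
#!/bin/bash

# nettar_stream SRCDIR DESTDIR
# Writes to stdout what the xinetd service expects: one line with the
# destination directory, followed by the tar stream.
# The name and the checks are illustrative only.
nettar_stream() {
  local srcdir=$1 destdir=$2
  local nl='
'

  if [ ! -d "$srcdir" ]; then
    echo "Source directory $srcdir not found" >&2
    return 1
  fi

  # The remote side reads exactly one line, so the directory name must
  # fit on one line; requiring an absolute path avoids surprises about
  # the remote working directory.
  case $destdir in
    *"$nl"*) echo "Destination must not contain newlines" >&2; return 1 ;;
    /*)      ;;
    *)       echo "Destination must be an absolute path" >&2; return 1 ;;
  esac

  printf '%s\n' "$destdir"
  tar -cf - -C "$srcdir" .
}

# Usage (host and port as in the examples above):
#   nettar_stream /src/dir /dest/dir | nc target 1234
```

Keeping the function free of any netcat call means its output can be inspected or piped into whatever netcat variant is available.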

To transfer to multiple machines at once, process substitution can be used, as in

{ echo "/dest/dir"; tar -cf - -C /src/dir . ; } | tee >(nc target1 1234) >(nc target2 1234) | nc target3 1234  # etc

Assuming of course that our xinetd service is running on all target machines.

Start netcat by ssh

This method is more kludgy than the previous one, but has the advantage that it does not require xinetd on the target machine, only ssh (and netcat, of course). The idea is to create a wrapper script that connects to the target machine using ssh, starts netcat in listen mode there, and finally starts the transfer on the source machine. Needless to say, having public key authentication set up for ssh helps significantly here.
A sample script could be something like:

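#!/bin/bash
 
# nettar2.sh: usage nettar2.sh srcdir targethost:destdir
 
if [ $# -ne 2 ] || [[ $2 != *:* ]]; then
  echo "Usage: $0 srcdir targethost:destdir" >&2
  exit 1
fi
 
sourcedir=$1
target=$2
 
# split "host:dir" at the first colon
targetmachine=${target%%:*}
targetdir=${target#*:}
 
# Start the listener on the target, in the background. Because of the
# single quotes around $targetdir, the directory name must not itself
# contain single quotes.
if ssh user@"$targetmachine" "cd '$targetdir' || exit 1; { nc -l -p 1234 | tar -xf - ; } </dev/null >/dev/null 2>&1 &"; then
  tar -cf - -C "$sourcedir" . | nc "$targetmachine" 1234
else
  echo "Error connecting to $targetmachine or changing to $targetdir" >&2
  exit 1
fi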

Here we do an explicit cd on target, to make sure the directory exists and fail immediately if not. Of course, everything can be changed and/or adapted.

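We must make sure that ssh returns once netcat has been started on the target, instead of hanging, hence the /dev/null redirections and the trailing &. The pipe between nc and tar is not affected by them: connections set up inside the braces take precedence over the redirections applied to the group as a whole, so only nc's standard input and tar's standard output (plus standard error for both) end up on /dev/null.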
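
The way the redirections interact with the pipe can be checked locally with harmless stand-ins for nc and tar. This is just a small demonstration; the function name demo is arbitrary.

```shell
#!/bin/bash

# Stand-ins for the remote pipeline: printf plays the part of netcat
# (it writes into the pipe) and cat plays the part of tar (it reads
# from the pipe).
demo() {
  { printf 'payload\n' | cat ; } </dev/null
}

# cat still receives "payload": its stdin is the pipe, not /dev/null.
# Only the first command of the group (printf here, nc in the script)
# really gets /dev/null as standard input.
demo
```

Running it prints payload, which shows that the outer </dev/null did not cut the reader off from the pipe.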

So with this we can do

source$ nettar2.sh /src/dir target:/dest/dir

and transfer our files at full speed without having to connect manually to the target machine to prepare the transfer.
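
One thing the script leaves to chance is timing: ssh returns as soon as the remote pipeline has been put in the background, so in principle the local nc could try to connect before the remote one is listening. A possible way to make this more robust is to retry the sending side a few times, as sketched below. The helper names retry and send_dir are invented for this example, and the approach assumes that your netcat variant exits with a non-zero status when the connection is refused.

```shell
#!/bin/bash

# retry ATTEMPTS DELAY COMMAND [ARGS...]
# Runs COMMAND until it succeeds, at most ATTEMPTS times, sleeping
# DELAY seconds between attempts. Hypothetical helper.
retry() {
  local attempts=$1 delay=$2 n=1
  shift 2
  until "$@"; do
    [ "$n" -ge "$attempts" ] && return 1
    n=$((n + 1))
    sleep "$delay"
  done
}

# send_dir SRCDIR HOST: the sending half of the transfer, wrapped in a
# function so that tar is started afresh on every attempt.
send_dir() {
  tar -cf - -C "$1" . | nc "$2" 1234
}

# Possible use in place of the plain "tar | nc" line:
#   retry 5 1 send_dir "$sourcedir" "$targetmachine"
```

Since the remote listener accepts a single connection, a retry can only succeed if the earlier attempts were refused outright; a transfer that breaks halfway will not be resumed by this.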

Conclusion

The tricks described here are a bit kludgy and possibly not always worth the effort, but there are scenarios where they can be employed successfully and make life easier than the more conventional methods.

Compression can be added to tar if desired (it is worth running a few tests to see whether it makes things better or worse, or making it a command-line option of the script).
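
As a rough sketch of what such an option could look like, the two helpers below take an optional second argument that switches gzip compression on. The names nettar_pack and nettar_unpack are made up, and the -z flag is assumed to be supported by the tar in use (GNU tar and bsdtar both have it). Both ends must of course agree on whether the stream is compressed.

```shell
#!/bin/bash

# nettar_pack SRCDIR [yes]
# Writes a tar stream of SRCDIR to stdout, gzip-compressed if the
# second argument is "yes". Hypothetical helper.
nettar_pack() {
  local srcdir=$1 compress=${2:-no}
  if [ "$compress" = yes ]; then
    tar -czf - -C "$srcdir" .
  else
    tar -cf - -C "$srcdir" .
  fi
}

# nettar_unpack DESTDIR [yes]
# Extracts a tar stream from stdin into DESTDIR; pass "yes" if the
# sender compressed it. Hypothetical helper.
nettar_unpack() {
  local destdir=$1 compress=${2:-no}
  if [ "$compress" = yes ]; then
    tar -xzf - -C "$destdir"
  else
    tar -xf - -C "$destdir"
  fi
}

# Sending side:   nettar_pack /src/dir yes | nc target 1234
# Receiving side: nc -l -p 1234 | nettar_unpack /dest/dir yes
```

On a fast LAN the CPU time spent compressing can easily outweigh the bandwidth saved, which is why measuring both variants on real data is the only reliable guide.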