CUCC Expedition Handbook - rsync Quick

Quick Reminder - rsync

Manual Version Control using rsync

This is NOT a tutorial. This is a set of reminders for people who already know all this stuff.

expofiles (all the big files and documents)

Photos and scans (logbooks, drawn-up cave segments). This was about 40GB of stuff in 2019, which you probably don't actually need locally.

If you don't need an entire copy of all of it, then it is probably best to use Filezilla/ftp to copy just a small part of the filesystem to your own machine and to upload the bits you add to or edit. Instructions for installing and using Filezilla are found in the expo user instructions for uploading photographs: uploading.html.

To sync all the files from the server to your local expofiles directory on your laptop:

rsync -nazv --delete-after --exclude="thumbs/" --exclude="*.???.xml" --exclude="*.jpeg.xml" expo@expo.survex.com:expofiles/ /home/expo/expofiles

To sync the local expofiles directory back to the server after you have made changes (e.g. scanned some hand-drawn surveys into expofiles/surveyscans/), but only if your machine runs Linux:

rsync -nazv --delete-after /home/expo/expofiles/surveyscans/2019/ expo@expo.survex.com:expofiles/surveyscans/2019
Then CHECK that the list of files it produces matches what you intend to change, and above all that every file it says it will delete is one you absolutely intend to delete forever! ONLY THEN run it again without the "-n" option. "-n" is the same as "--dry-run": it shows you what would be overwritten or deleted but doesn't actually do it.

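Note that the source folder has a trailing slash but the target folder does not. This is important: the trailing slash on the source tells rsync to copy the contents of that folder, whereas without it rsync would create a new 2019 folder inside the target.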
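If you keep getting the slashes wrong, a small guard can check them before rsync runs. This is only a sketch: check_slashes is a made-up helper name, not part of rsync or of any expo script, and it does nothing except look at the two strings you give it.

```shell
# Hypothetical guard (not part of rsync): insist that the source ends in "/"
# and that the target does not, as in the commands in this handbook.
check_slashes() {
    case "$1" in
        */) ;;   # source ends in a slash: good
        *)  echo "source '$1' should end with a slash" >&2; return 1 ;;
    esac
    case "$2" in
        */) echo "target '$2' should not end with a slash" >&2; return 1 ;;
    esac
    return 0
}

# Example use (commented out so that pasting this block runs nothing):
# check_slashes "$SRC" "$DST" && rsync -nazv --delete-after "$SRC" "$DST"
```

SRC and DST in the commented example are placeholders for your own source and target.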

Always:

Do be incredibly careful not to delete piles of stuff and then rsync back, or to get the directory level of the command wrong, as it will all get deleted on the server too, and we will not have backups if these are recent files! It's absolutely vital to use rsync --dry-run --delete-after first to check what would be deleted.
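One way to make the dry run hard to skip is to wrap both passes in a function that always does the -n pass first and then asks before doing anything real. The following is a minimal sketch, not an existing expo script: the name sync_checked and the RSYNC variable are invented here, and the count relies on the assumption that your rsync, with -v, reports each removal on a line starting with "deleting ".

```shell
# Hypothetical helper, not an existing expo script.
# RSYNC is an assumed override hook so the function can be tried with a stand-in command.
RSYNC="${RSYNC:-rsync}"

sync_checked() {
    src="$1"   # source folder, with trailing slash
    dst="$2"   # target folder, without trailing slash

    # Pass 1: -n (dry run) only reports what would be transferred or deleted.
    plan=$("$RSYNC" -nazv --delete-after "$src" "$dst") || return 1
    printf '%s\n' "$plan"

    # Assumption: with -v, rsync reports each removal on a line starting "deleting ".
    ndel=$(printf '%s\n' "$plan" | grep -c '^deleting ' || true)
    printf '%s item(s) would be DELETED. Type yes to go ahead: ' "$ndel"
    read -r answer

    if [ "$answer" = "yes" ]; then
        # Pass 2: the same command without -n, so the changes are real.
        "$RSYNC" -azv --delete-after "$src" "$dst"
    else
        echo "Aborted, nothing changed."
    fi
}

# Example call (commented out so that pasting this block changes nothing):
# sync_checked /home/expo/expofiles/surveyscans/2019/ expo@expo.survex.com:expofiles/surveyscans/2019
```

The count is only a prompt to look harder; you still need to read the full list yourself before typing yes.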

If your version of rsync produces output for every folder it sees, even if it is not updated, then pipe the output through

| grep -v "/$"
to hide the lines for folders, which end in a trailing slash.
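To see what that filter does without touching the server, you can feed it a few made-up lines. The file names below are invented for illustration and sample_output and hide_folders are hypothetical names; only the grep itself comes from the reminder above.

```shell
# Made-up sample of the kind of lines rsync -v prints; the names are invented.
sample_output() {
    printf '%s\n' \
        'surveyscans/' \
        'surveyscans/2019/' \
        'surveyscans/2019/notes1.jpg' \
        'deleting surveyscans/2019/old-plan.png'
}

# The filter from the reminder: drop every line that ends in a slash.
hide_folders() {
    grep -v '/$'
}

# Prints only the notes1.jpg line and the "deleting ... old-plan.png" line.
sample_output | hide_folders
```

Bear in mind that the filter hides any line ending in a slash, so a line such as "deleting somefolder/" would be hidden too. When checking deletions, look at the unfiltered output as well.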

If you are using rsync from an NTFS folder on a Windows machine (even if you are using WSL to do it) you will not necessarily get all the files cleanly as some legal Linux filenames are incompatible with Windows. What will happen is that

  1. rsync will invisibly change the names as it downloads them from the Linux expo server to your Windows machine, but
  2. then it forgets what it has done, and
  3. when you next try to synchronise using rsync, it will re-upload all the renamed files and maybe delete the originals, even if you have touched none of them.
  4. This pollutes the server and would break links between survex files and drawing files.
Now there won't be any problems with simple filenames using all lowercase letters and no funny characters (except for "con.jpg" and similar of course*), but we have nothing in place to stop anyone creating (using Linux) an incompatible filename of that sort somewhere in that 40GB, or to detect the problem at the time. So be extra, extra careful and religiously use the -n (DRY RUN) setting and manually check all changes before running rsync without -n.
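Since nothing detects these names automatically, a rough check can be run on a Linux copy of the files. The sketch below uses a hypothetical function name, find_windows_unsafe, and an incomplete set of rules assumed here: the characters < > : " \ | ? *, names ending in a dot or space, and the reserved device names CON, PRN, AUX, NUL, COM1 to COM9 and LPT1 to LPT9, with or without an extension. It does not catch names that differ only in upper or lower case, and it will misbehave on filenames containing newlines.

```shell
# Hypothetical helper: list paths under a folder whose last component
# Windows is unlikely to accept. The rules are a partial, assumed list.
# Returns non-zero (grep's exit status) when nothing suspicious is found.
find_windows_unsafe() {
    find "$1" -mindepth 1 | grep -iE \
        -e '[<>:"\|?*][^/]*$' \
        -e '[. ]$' \
        -e '(^|/)(con|prn|aux|nul|com[1-9]|lpt[1-9])(\.[^/]*)?$'
}

# Example call (commented out; adjust the path to your own copy):
# find_windows_unsafe /home/expo/expofiles/surveyscans
```

Each pattern looks only at the last part of the path, so a folder with a bad name is reported once on its own line rather than once for every file inside it.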

(We may also have an issue with rsync not using the appropriate user:group attributes for files pushed back to the server. This may not cause any problems, but watch out for it.)

* CON is an MS-DOS identifier for the CONSOLE and it is still an illegal Windows filename. It's not the only thing like that.