BACKMEUP.PL - Do backups using a configuration file

NAME

backmeup.pl - Do backups using a configuration file


SYNOPSIS


  backmeup.pl homemachine
  backmeup.pl -cfg work.cfg workserver workclient


DESCRIPTION

The goal of this program is to make it easy to centralize backups in one (or a few) configuration files. Backups can be done using a wide variety of schemes and commands. Some of the more common commands to back up data include:

  cp
  scp
  rsync

While each of these commands has the basic format of command from destination, each has a wide variety of options that might be used for some data but not for others. Just a few of the backup commands I personally use include:

  cp -p  /home/mail    /backups/mail
  cp -rp /home/mozilla /backups/mozilla
  rsync -az --delete --force /etc /backups/etc
  rsync -az --delete --force --bwlimit 500 /home2 /backups/home2
  rsync -az --delete --force -e 'ssh -p 77777' \
      /home/[a-c]* BAK@back:/backups/home
  rsync -az --delete --force --exclude Simulations \
      /home/tpg /backups/home

Complicating this is that the commands and options change all the time based on the space available at the destination or on the whim of my users or myself about what SHOULD be backed up.

Sometimes I need a command to prepare the data to be backed up, like this:

  tar czf /tmp/b.tgz /home/mozilla
  cp -rp /tmp/b.tgz /backups/mozilla
  rm -f /tmp/b.tgz

Like many, I started with crontab entries to issue the specific command for a given set of data. Multiply this by a number of machines and it starts getting confusing, and many times the commands to be used were very similar.

Next I moved to writing a shell script to generate the correct command. The same shell script running on many machines could implement the rule of the day for a given set of data. Eventually this too got confusing. Perhaps I just get too confused.

What I want in a backup tool is:

  * Single command passed a few keywords to indicate what to do
  * Centralized configuration files. Preferably one file.
  * Enough flexibility so I don't have to write a wrapper script
  * Simple to understand and remember

backmeup.pl is my current attempt to satisfy these requirements. This program reads a configuration file. The parameters to the command are simple, and it is intended to be invoked from crontab entries like:

  11 20 3 * * /home/tpg/backmeup.pl -cfg /home/tpg/bin/my.cfg etc home2 s2v

The script reads the configuration file specified and calls the commands defined for the keys 'etc', 'home2' and 's2v'.

The crontab time provides the scheduling of the backups. This keeps backmeup.pl simple since it does not need to decide when to run. The parameters to backmeup.pl specify what to back up and how.

This keeps my life simple. If backups for 'etc' need to move from 8PM to 1AM, I just change the crontab entry. If the destination for 'etc' changes, I edit the configuration file and not some program. If I mess something up, most likely just one configuration clause will fail - and the rest of the backups can continue.


OPTIONS

-add2cfg string

Specifies a string that will be added to the beginning of the configuration file. This allows one to, for instance, define a variable which is referenced in the rules. Semicolons in string can be used to define multiple lines. For instance -add2cfg 'a==3;cdf==6' would result in two variables being defined within the configuration file, allowing one to reference %a% and %cdf%.
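
As a rough illustration of the semicolon handling described above, this shell sketch splits an -add2cfg string into one definition per line and places those lines ahead of the configuration text. The helper name prepend_defs and the temporary file are assumptions made for the example; backmeup.pl itself may do this differently.

```shell
#!/bin/sh
# Sketch only: mimic how an -add2cfg string could become definition
# lines placed ahead of the configuration text.
# prepend_defs is a hypothetical helper, not part of backmeup.pl.
prepend_defs() {
    # $1 = -add2cfg string, $2 = path to the configuration file
    printf '%s\n' "$1" | tr ';' '\n'   # one definition per line
    cat "$2"                           # followed by the original file
}

# Throwaway configuration file used only for this demonstration
cfg=$(mktemp)
printf 'etc {\n  from=/etc\n  to=%%a%%/etc\n}\n' > "$cfg"

result=$(prepend_defs 'a==3;cdf==6' "$cfg")
printf '%s\n' "$result"
rm -f "$cfg"
```

The two definitions end up as the first two lines, so they appear before any subset that references %a% or %cdf%.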

-add2opts string

Specifies a string that will be added to the value of the opts config value. Use this to subtly modify the parameters to rsync (for instance). A value like -add2opts --verbose would enable the verbose mode when the method command is 'rsync'.

-cfg file

Specifies the configuration file to read. The default is backmeup.cfg in the current working directory. See the section CONFIG FILE FORMAT for details.

-help

Displays this help text.

-quiet

Specifies no informational messages should be generated under any circumstances. The default is to generate messages indicating what is going on.

-verbose

Generates additional debugging information.


PARAMETERS

key1 ... keyN

Specifies keywords for sections of the configuration file. Only commands specified by each keyword are run.


CONFIG FILE FORMAT

The config file for this program provides a simple means to identify named subsets of data to be backed up and the commands to be used for each. Simple variables can be defined for global use. Examples will be the most useful way to see what's going on.

You may split your configuration data into multiple files if it is convenient and combine them with a simple statement like:

  include /home/backups/defines.cfg

An include file cannot have another include.

Subsets of data are defined by sequences of lines that start with a unique name followed by {. All keyword/values (see CONFIG KEYS below) apply only to this subset. The subset is terminated with a lone }. For example:

  mymail {
    from=/var/spool/mail/tpg
    to=/backups/tpg/mail
    method=cp
  }

This identifies 'mymail' which results in this command being executed:

  cp /var/spool/mail/tpg /backups/tpg/mail

Many times subsets will have common values. To keep this simple and easily modified, the program allows variables to be defined like this:

  backdir == /backups/tpg
  today == `date +%a`

These must appear before the subset in which they are used and must appear outside a subset definition. Note the use of the operator '=='. These variables are referenced in subsets with '%' surrounding the name of the variable. For example:

  backdir == /backups/tpg
  today == `date +%a`
  mymail {
    from=/var/spool/mail/tpg
    to=%backdir%/mail-%today%
    method=cp
  }
  etc {
    from=/etc
    to=%backdir%/etc
    opts = -a
    method=cp
  }

This results in two commands being executed:

  cp /var/spool/mail/tpg /backups/tpg/mail-Tue
  cp -a /etc /backups/tpg/etc

Note that this program does not attempt to create any target directories. See precmd for a way to create target directories.
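
The variable substitution described in this section can be sketched in shell as follows. This is only an approximation under stated assumptions: expand_vars is a hypothetical helper, definitions are read from a separate file of "name == value" lines, and backtick values such as `date +%a` are not evaluated here (a fixed value is used instead).

```shell
#!/bin/sh
# Sketch only: replace %name% references with values taken from
# "name == value" definition lines.
# expand_vars is a hypothetical helper, not part of backmeup.pl.
expand_vars() {
    # $1 = file of definitions, $2 = text containing %name% references
    text=$2
    while IFS= read -r line; do
        case $line in
            *' == '*) ;;          # a definition line
            *) continue ;;        # ignore anything else
        esac
        name=${line%% == *}       # text before the first ' == '
        value=${line#* == }       # text after the first ' == '
        # Escape characters that are special in a sed replacement
        esc=$(printf '%s\n' "$value" | sed 's/[\\&|]/\\&/g')
        text=$(printf '%s\n' "$text" | sed "s|%$name%|$esc|g")
    done < "$1"
    printf '%s\n' "$text"
}

# Throwaway definitions file used only for this demonstration
defs=$(mktemp)
printf '%s\n' 'backdir == /backups/tpg' 'today == Tue' > "$defs"

expanded=$(expand_vars "$defs" 'to=%backdir%/mail-%today%')
printf '%s\n' "$expanded"
rm -f "$defs"
```

Because the definitions are applied in file order, a variable has to be defined before the line that references it, which mirrors the rule stated above.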


CONFIG KEYS

Subsets in a config file support a small number of keywords which identify information important in constructing the commands:

desthost=name

Specifies the SSH-style hostname to be used for the destination of the data. This might be used to construct a command like: scp -rp /etc backhost:/backups

Note that no password is provided here. For use with a crontab entry you'd want to set up the scp command so it uses public/private keys and you are not prompted for a password. If on the other hand you are running this script by hand, then a password prompt might be acceptable.

destuser=userid

Specifies the SSH-style userid to be used for the destination of the data. This requires desthost to be specified to be useful. This might be used to construct a command like: scp -rp /etc joe@backhost:/backups
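
How desthost and destuser combine with the to path can be sketched like this. The helper name dest_target is hypothetical, and ignoring destuser when desthost is empty is an assumption based on the note that destuser requires desthost to be useful.

```shell
#!/bin/sh
# Sketch only: build the destination argument from the to, desthost
# and destuser values.
# dest_target is a hypothetical helper, not part of backmeup.pl.
dest_target() {
    # $1 = to path, $2 = desthost (may be empty), $3 = destuser (may be empty)
    if [ -z "$2" ]; then
        printf '%s\n' "$1"                   # purely local destination
    elif [ -z "$3" ]; then
        printf '%s:%s\n' "$2" "$1"           # host:path
    else
        printf '%s@%s:%s\n' "$3" "$2" "$1"   # user@host:path
    fi
}

dest_target /backups backhost joe   # prints joe@backhost:/backups
dest_target /backups backhost ''    # prints backhost:/backups
dest_target /backups '' ''          # prints /backups
```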

from=path

Specifies the file, directory or set of directories for the from part of the command. The path should usually be a fully qualified path (i.e. begins with '/'). from=xyz could be a single file or single directory.

Very simple regular expressions can be used to match more than one directory or file. from=/home/[A-Fa-f] can be used to match anything in /home which begins with an upper or lower case letter from 'a' through 'f'. This expression is restricted to match just those files/directories whose first letter matches the letters within the square brackets. This is nothing like a full Perl regular expression. Using a regular expression will result in a separate command being executed for each match. If this gets you too many files/directories, you may be interested in omit.

method=command

Specifies the base command to be used. This is typically cp, scp, or rsync, but it could be any command that has a similar syntax to these commands. The default command is rsync -avz.

mount=string

Specifies a command to be executed before we look for data to be backed up. As the name suggests, this is intended to allow a mount to be issued. This could be something like mkdir -p /mnt/home; mount prod:/home /mnt/home.

omit=name1 [name2 .. nameN]

Specifies a white space delimited list of files or directories that might be matched by from=expression. Anything which matches a name in this list is not backed up.
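
The interaction between a from pattern and omit can be approximated with a shell glob. In this sketch select_sources is a hypothetical helper, the pattern is expanded by appending '*' to it (an assumption about how the first-letter match behaves), and the directory names are created under a temporary directory purely for demonstration.

```shell
#!/bin/sh
# Sketch only: expand a from= pattern and drop names listed in omit=.
# select_sources is a hypothetical helper, not part of backmeup.pl.
select_sources() {
    # $1 = from pattern such as /home/[a-k]
    # $2 = white space delimited omit list
    for path in $1*; do
        if [ ! -e "$path" ]; then continue; fi   # pattern matched nothing
        name=${path##*/}
        skip=no
        for o in $2; do
            if [ "$name" = "$o" ]; then skip=yes; fi
        done
        if [ "$skip" = no ]; then printf '%s\n' "$path"; fi
    done
}

# Throwaway directory tree used only for this demonstration
root=$(mktemp -d)
mkdir -p "$root/home"
for u in amy carl george gman karl larry sally; do
    mkdir "$root/home/$u"
done

selected=$(select_sources "$root/home/[a-k]" 'gman larry sally')
printf '%s\n' "$selected"
rm -rf "$root"
```

Each path printed would correspond to one separate invocation of the method command.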

opts=options

Specifies a string of options to be added to the method command.
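
Putting method, opts, from and to together, the command line appears to be assembled in that order (compare cp -a /etc /backups/tpg/etc in CONFIG FILE FORMAT). A minimal sketch, assuming a hypothetical helper named build_command and the documented default of rsync -avz when no method is given:

```shell
#!/bin/sh
# Sketch only: assemble a command line from method, opts, from and to.
# build_command is a hypothetical helper, not part of backmeup.pl.
build_command() {
    # $1 = method, $2 = opts, $3 = from, $4 = to
    method=${1:-"rsync -avz"}    # documented default when method is not set
    if [ -n "$2" ]; then
        printf '%s %s %s %s\n' "$method" "$2" "$3" "$4"
    else
        printf '%s %s %s\n' "$method" "$3" "$4"
    fi
}

build_command cp -a /etc /backups/tpg/etc                 # cp -a /etc /backups/tpg/etc
build_command cp '' /var/spool/mail/tpg /backups/tpg/mail # cp /var/spool/mail/tpg /backups/tpg/mail
build_command '' '' /etc /backups/etc                     # rsync -avz /etc /backups/etc
```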

postcmd=string

Specifies a command to be executed after method is called. string may contain semicolons so that multiple commands are executed (e.g. postcmd=rm -f /tmp/%TMP%; rm -rf %TMPDIR%). The return code for string is the return code from the last command. This command is always executed, regardless of errors returned from method.

precmd=string

Specifies a command to be executed before method is called. If this fails, method and postcmd commands are not called. string may contain semicolons so that multiple commands are executed (e.g. precmd=rm -f /tmp/%TMP%; tar czf %TMP%; true). The return code for string is the return code from the last command.
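
The ordering rules for precmd, method and postcmd can be summarized in a short shell sketch. run_subset is a hypothetical helper, each string is handed to sh -c so that semicolons work as described, and the way the final status is combined is an assumption rather than documented behavior.

```shell
#!/bin/sh
# Sketch only: the order in which precmd, method and postcmd run for
# one subset. run_subset is a hypothetical helper, not part of backmeup.pl.
run_subset() {
    # $1 = precmd, $2 = method command line, $3 = postcmd
    if ! sh -c "$1"; then
        echo "precmd failed; skipping method and postcmd" >&2
        return 1
    fi
    rc=0
    sh -c "$2" || rc=$?
    # postcmd runs even when the method command reported an error
    sh -c "$3" || rc=$?
    return "$rc"
}

# Harmless echo commands stand in for real backup commands
log=$(mktemp)
first=0
run_subset "echo pre >> $log" "echo copy >> $log; false" "echo post >> $log" || first=$?
second=0
run_subset "false" "echo copy >> $log" "echo post >> $log" || second=$?

log_contents=$(cat "$log")
printf '%s\n' "$log_contents"
echo "first status: $first, second status: $second"
rm -f "$log"
```

In the first call the method fails but postcmd still runs; in the second call the failing precmd prevents both method and postcmd from running.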

to=path

Specifies the destination path to a file or directory to which the from data will be copied. The destination path must already exist. The path should usually be a fully qualified path (i.e. begins with '/').

umount=string

Specifies a command to be executed after all data is backed up. As the name suggests, this is intended to allow a mount to be unmounted. This could be something like umount -f /mnt/home.


COMPLEX CONFIG EXAMPLE

  #=====================================================================
  #   Define constants   Note: use '==', not '='
  #=====================================================================
  #   These definitions could be split off into another included file
  sshopt == -e 'ssh -p 77777'
  bwlimit == --bwlimit 500 
  excludes == --exclude Simulations
  #   Add -v or --progress if you want to see what's going on
  rsyncopt == -az --delete --force
  backuphostg == backghost.georgia.edu
  backuphostp == backphost.penn.edu
  backupuser == copyuser
  blue1 == /home/blue1
  blue2 == /home/blue2
  g2 == /BAK2/backups
  g3 == /BAK3/backups
  h == `hostname`
  today == `date +%a`
  #   These users backed up separately
  fomits == gman larry sally
  #=====================================================================
  #   Backup sets of data from my machine
  #=====================================================================
  etc {
    from = /etc
    to = %g2%/%h%
    method = rsync
    opts = %rsyncopt% %sshopt%
    desthost=%backuphostg%
    destuser=%backupuser%
  }
  mail {
    from = /tmp/%today%.%h%.mail.tgz
    to = %g2%/%h%/mail
    method = cp
    opts = -p
    precmd = tar -czf /tmp/%today%.%h%.mail.tgz /var/spool/mail
    postcmd = rm -f /tmp/%today%.%h%.mail.tgz
  }
  home2 {
    from = /home/.home2
    to = %g2%/%h%
    method = rsync
    opts = %rsyncopt% %sshopt% %excludes% %bwlimit%
    desthost=%backuphostp%
    destuser=%backupuser%
  }
  a2k {
    from = /home/[a-k]
    to = %blue1%/%h%/home
    method = rsync
    opts = %rsyncopt% %sshopt% %excludes% %bwlimit%
    desthost=%backuphostp%
    destuser=%backupuser%
    omit=%fomits%
  }
  #=====================================================================
  #   These users backed up separately
  #=====================================================================
  #   Added for easy testing
  sally {
    from=/home/sally
    to=%blue2%/%h%/home
    method=rsync
    opts = %rsyncopt% %sshopt%
    desthost=%backuphostg%
    destuser=%backupuser%
    omit=%fomits%
  }
  gman {
    from=/home/gman
    to=%blue2%/%h%/home
    method=rsync
    opts = %rsyncopt% %sshopt% %bwlimit%
    desthost=%backuphostp%
    destuser=%backupuser%
  }

The command backmeup.pl -cfg example.cfg mail sally etc home2 a2k gman would result in the following commands being executed. Errors will be reported, but will not stop the others from running. The commands are executed in the order of the keywords you specify.

  #   mail
  tar -czf /tmp/Fri.myhost.mail.tgz /var/spool/mail
  cp -p /tmp/Fri.myhost.mail.tgz /BAK2/backups/myhost/mail
  rm -f /tmp/Fri.myhost.mail.tgz
  #  sally
  rsync -az --delete --force -e 'ssh -p 77777' /home/sally \
     copyuser@backghost.georgia.edu:/home/blue2/myhost/home
  #  etc
  rsync -az --delete --force -e 'ssh -p 77777' /etc \
     copyuser@backghost.georgia.edu:/BAK2/backups/myhost
  #  home2
  rsync -az --delete --force -e 'ssh -p 77777' \
     --exclude Simulations --bwlimit 500 \
     /home/.home2 copyuser@backphost.penn.edu:/BAK2/backups/myhost
  # a2k
  rsync -az --delete --force -e 'ssh -p 77777' \
     --exclude Simulations --bwlimit 500 \
     /home/amy copyuser@backphost.penn.edu:/home/blue1/myhost/home
  rsync -az --delete --force -e 'ssh -p 77777' \
     --exclude Simulations --bwlimit 500 \
     /home/carl copyuser@backphost.penn.edu:/home/blue1/myhost/home
  rsync -az --delete --force -e 'ssh -p 77777' \
     --exclude Simulations --bwlimit 500 \
     /home/george copyuser@backphost.penn.edu:/home/blue1/myhost/home
  rsync -az --delete --force -e 'ssh -p 77777' \
     --exclude Simulations --bwlimit 500 \
     /home/karl copyuser@backphost.penn.edu:/home/blue1/myhost/home
  #  gman
  rsync -az --delete --force -e 'ssh -p 77777' --bwlimit 500 \
     /home/gman copyuser@backphost.penn.edu:/home/blue2/myhost/home


EXIT

If no fatal errors are detected, the program exits with a return code of 0. All commands are executed, but any error results in a non-zero return code from this program.
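
The exit behavior described here, where every command is attempted and any failure makes the final return code non-zero, can be sketched as follows. run_all is a hypothetical helper and the echo and false commands merely stand in for real backup commands.

```shell
#!/bin/sh
# Sketch only: run every command even when one fails, and report a
# non-zero status at the end if any of them failed.
# run_all is a hypothetical helper, not part of backmeup.pl.
run_all() {
    overall=0
    for cmd in "$@"; do
        if ! sh -c "$cmd"; then
            echo "error running: $cmd" >&2
            overall=1
        fi
    done
    return "$overall"
}

final=0
run_all 'echo etc' 'false' 'echo home2' || final=$?
echo "exit code: $final"
```

A crontab wrapper or monitoring script can therefore rely on the single return code to detect that at least one subset failed.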


AUTHOR

Written by Terry Gliedt <tpg@hps.com> in 2007 and made available under the terms of the GNU General Public License.