|
JUMP2NR.PL
jump2nr.pl - Automate moving to a new release of your distribution
SYNOPSIS
jump2nr.pl -cfg nextrel.cfg testsys
DESCRIPTION
I manage a cluster of machines. Some of the machines are 'gateways', which have many users and whose systems are large and complex. Most of the machines are 'client' nodes, which are user-less (only have root) and data-less (contain no user data). jump2nr.pl was written to give me a simple means to replace the operating system on these clients.

Conventional approaches to doing this usually require some sort of TFTP server which provides special software to do the updating. This requires a boot from the network (PXE boot). Some of my systems cannot do a PXE boot, and those that can require console access to press a key (PF12) to initiate it. What I really want is a command I can issue from a central site which goes through a set of machinations and results in a new system on some client node.

Updating the operating system is not just a case of installing the latest version of some packages. It might well mean a change of distributions or moving the root partition from sda2 to sda3.

As I was describing this to someone, I realized that my systems all had just enough resources to make this work:
* they all have plenty of memory so they will not swap
* /boot was a separate partition
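Both of those conditions can be checked on a client before committing to anything. The following is only a sketch, not part of jump2nr.pl: the function names are made up for illustration, and it assumes a Linux host where /proc/mounts and /proc/swaps are available (a file in the same format can be passed in for rehearsal).

```shell
#!/bin/bash
# Hypothetical preflight helpers; they are NOT part of jump2nr.pl.

# Succeed if /boot is its own mount point.
# $1: file in /proc/mounts format (device mountpoint fstype options ...)
boot_is_separate() {
    awk '$2 == "/boot" { found = 1 } END { exit !found }' "${1:-/proc/mounts}"
}

# Succeed if no swap area currently holds any data.
# $1: file in /proc/swaps format (header line, then: name type size used priority)
swap_is_idle() {
    awk 'NR > 1 && $4 > 0 { busy = 1 } END { exit busy + 0 }' "${1:-/proc/swaps}"
}
```

Something like `boot_is_separate && swap_is_idle || echo "not a good jump candidate"` would then flag hosts that need a closer look.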
THE TRICK
The keys to doing this update came when I realized:
* I could steal the swap partition
* The swap partition was big enough to run a small Linux system
* A distribution can be installed as a tar file plus a small set of changes

I've named this process 'jumping' because it reminds me of a leap off a cliff - think of the ending of Butch Cassidy and the Sundance Kid. Hopefully your result will not be as traumatic. If everything is done correctly, you hit the ground, roll and come up. If not, well, you were going to have to put a CD in and install manually anyway, right? :-)

Here is a poor man's diagram of what we'll do:
   target           jumpsystem          newtarget          completed
+-----------+      +-----------+      +-----------+      +-----------+
| /boot     |      | /boot jump|      | /boot     |      | /boot     |
+-----------+      +-----------+      +-----------+      +-----------+
| /         |      | unused    |      | / new     |      | / new     |
|           | boot |           | boot |           | -->  |           |
|           |      |           |      |           |      |           |
+-----------+      +-----------+      +-----------+      +-----------+
| swap      |      | / jump    |      | unused    |      | swap      |
|           |      |           |      |           |      |           |
+-----------+      +-----------+      +-----------+      +-----------+
The jump process is to install a separate, carefully tailored operating system on the target system's swap partition and configure the /boot partition to boot the new root (old swap) partition. After the first reboot the host is running our jump system, which uses only the modified original /boot partition and the original swap partition. This makes all the other partitions available to be changed.

The second step is to 'install' the new distribution using the other target partitions. The /boot partition can be re-used now for the new distribution since our jump system will not use /boot again. We take advantage of the fact that Linux systems can mostly be installed by formatting a new root partition and extracting a tar file into it (e.g. on the correct unused partition from the original system). After installing most of the files, a small set of file changes needs to be made, mostly dealing with a hostname change. The boot configuration file in /boot/grub/menu.lst is then corrected to point to the new root partition.

A second reboot boots the new target system with the new distribution. One last detail must be handled by the new system: setting up the swap partition again.
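That last detail might look something like the sketch below. It is an assumption about what a final rc.local hook on the new target could do, not the actual contents of any rc.local shipped with jump2nr.pl; the function name and the RUN dry-run switch are invented for illustration. By default it only prints the commands it would run.

```shell
#!/bin/bash
# Hypothetical last-step hook for the new target system.
# RUN defaults to "echo", so commands are only printed; set RUN= to execute them.
RUN="${RUN-echo}"

# $1: flag file left behind by the restore step
# $2: partition that served as the jump system root and becomes swap again
finish_swap() {
    local flag="$1" swap_dev="$2"
    [ -e "$flag" ] || return 0             # ordinary boot, nothing to do
    $RUN mkswap "$swap_dev" || return 1    # write a fresh swap signature
    $RUN swapon "$swap_dev" || return 1    # start using it
    $RUN rm -f "$flag"                     # make sure this only happens once
}
```

A call such as `finish_swap /etc/jump.initialize.myself /dev/sda2` would print the three commands; the flag file name matches the one touched by the sample shell script, and the device name is just an example.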
VARIATIONS
In the section above I've presented a very simple scenario, but it does not need to be quite so simple. I've shown the root partition of the target being reused, but of course the new target partitions might be quite different from the original target's. Once the jump system is running, your scripts are free to do anything: re-partition the drives, reformat, copy data from one partition to another, fetch data from another system, etc. As long as it all works, you are fine. If your scripts fail, you probably have a mostly unusable system on your hands, so this process is not reasonable for a single upgrade, but is aimed at doing system upgrades to many hosts.
DETAILS
The actual steps to do a jump are pretty simple:
* Copy some files to the target system
* On the target system do:
  * Save some details from the target
  * Take the swap partition offline
  * Format the swap partition for the jump system root
  * Set up jump system to complete the upgrade
  * Set up grub to boot the jump partition
  * Reboot
* On the jump system, rc.local begins the restore process:
  * Re-partition if necessary
  * Reformat a new root partition
  * Mount it and restore a new distribution
  * Copy saved files from the original system to the correct places
  * Change important files on the new distribution
  * Set up grub to boot the newly installed target system
  * Set up target system to complete the upgrade
  * Reboot
* On the newly installed target system do:
  * Format the original swap partition (formerly the jump system root partition)
  * Enable the swap partition for the newly installed system
Conceptually this is pretty simple, but of course, the devil is in the details. Don't underestimate what's going on. It's not hard, as long as you understand the details. This is not for the faint of heart. jump2nr.pl is no magic bullet; nothing beats understanding what's going on.
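To make the first group of steps (the ones run on the target) a little more concrete, here is a sketch that only prints a plan and runs nothing. The function name and argument order are invented for illustration, and the command sequence is my assumption about what these steps boil down to; it is not a dump of what jump2nr.pl actually executes.

```shell
#!/bin/bash
# Hypothetical planner: prints the target-side commands, executes none of them.
# $1: swap partition to steal      $2: disk that holds the boot loader
# $3: mount point for the jump root   $4: tar file of the jump system
plan_jump_setup() {
    local swap_dev="$1" disk="$2" mnt="$3" tarfile="$4"
    cat <<EOF
swapoff $swap_dev
mkfs -t ext3 -F $swap_dev
mkdir -p $mnt
mount -t ext3 $swap_dev $mnt
tar xzf $tarfile -C $mnt
grub-install --root-directory=$mnt $disk
reboot
EOF
}
```

Running `plan_jump_setup /dev/sda2 /dev/sda /media/jump /jumpfiles/jump.system.tgz` and reading the output before doing anything for real is a cheap way to catch a wrong device name.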
FILES
jump2nr.pl is designed to use several files:
* a configuration file
* a shell script
* a tar file of a jump system
* a tar file of a new target system

Configuration File
The configuration file is read by jump2nr.pl and is, essentially, a simple shell script which copies files and issues commands to be executed on the jump system. Comment lines begin with '#' in column one and blank lines are ignored. Lines must use one of these formats:
* key=value                        Defines variable 'key'
* key=`cmd`                        Sets variable 'key' to the STDOUT value of cmd
* print string                     Prints something to STDOUT
* local cmd                        Command to be run on the local host
* remote cmd                       Command to run on the target host
* jump(function, parms)            Invoke a builtin to do something complex
* jump(usepartition, swap)         Set jump root partition on swap partition
* jump(usepartition, /dev/sda2)    Set jump root partition to a particular partition

'cmd' may be a conventional command line with shell special characters, including parentheses, e.g. (cd /etc; ls). Note that quote characters ARE REMOVED from cmd.

Any occurrence of a variable surrounded with % (e.g. %xx%) will be replaced by the value of that variable, which must have been previously defined. This means variables need to be defined at the top of the file, before the lines that use them. The following variables are used internally by jump2nr.pl: _JUMPPARTITION, _JUMPDEV, _JUMPDEVICEID, _HOST, _LOGFILE, _MOUNTDIR.

An example configuration file is provided below.

Shell Script
There are quite a few commands to be run on the target machine, both when setting up the jump system and after the reboot, while the new installation is being put in place. The examples provided use the script name jump2.sh for both of these parts, thereby consolidating all the information in one place. You are, of course, free to do this any way you can imagine.

An example shell script is provided below.

Tar of Jump System
This might seem complex, but it is really quite easy.
Install any version of Linux you want on some system and then simply create a tar file of it, e.g.
    tar --one-file-system --exclude proc -czf jump.system.tgz /
This creates the file 'jump.system.tgz' which can be extracted on the target system to create a bootable jump system. Keep this as small as you can manage (mine is 290MB). Once you have a working system image, you may well never need to re-create this. Sorry, no example is provided.

Tar of New Target System
This tar image is created just like the one above, except this should be an image of the final upgraded system you want. This might be very different from the Jump System, or it might be almost exactly like the Jump System. Once it is restored (see the Shell Script), there will be various files you need to update, including /etc/fstab, /etc/passwd, /etc/rc.local and others. While all the changes might be confusing, once you get a runnable system and can SSH to it, you can easily derive what more needs fixing and add that to the restore step in your shell script. Sorry, no example is provided.
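The %name% substitution described under Configuration File is easy to picture with a small sketch. This is not how jump2nr.pl (a Perl program) implements it; it is a bash 4+ illustration with an invented CFG array and expand_vars function, and it assumes variable names made of letters, digits and underscores. Undefined names expand to nothing here, which may differ from the real tool.

```shell
#!/bin/bash
# Hypothetical illustration of %NAME% expansion; requires bash 4 or later.
declare -A CFG    # variable name -> value, filled in as key=value lines are read

# Print $1 with every %NAME% replaced by the value stored in CFG[NAME].
expand_vars() {
    local text="$1" out=""
    local re='^([^%]*)%([A-Za-z_][A-Za-z0-9_]*)%(.*)$'
    while [[ $text =~ $re ]]; do
        out+="${BASH_REMATCH[1]}${CFG[${BASH_REMATCH[2]}]}"   # text before + value
        text="${BASH_REMATCH[3]}"                             # keep scanning the rest
    done
    printf '%s\n' "$out$text"
}
```

After `CFG[_HOST]=testsys` and `CFG[TARGET_DIR]=/jumpfiles`, calling `expand_vars 'root@%_HOST%:%TARGET_DIR%'` gives the scp destination used in the sample configuration file.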
OPTIONS
PARAMETERS
SAMPLE FILES
Sample Configuration File

# Target on target system where files get copied
TARGET_DIR=/jumpfiles

# Define a script to be run BEFORE we get going. Remote location will be TARGET_DIR
TARGET_SETUP = jump2.sh

# Define what partition is used for the target system
TARGET_PARTITION = /dev/sda1

# Tar file of new target system image to be copied and extracted
TARGET_TAR=target.pe1950.system.tgz
NEW_TARGET_TAR=`basename %TARGET_TAR%`

# New target system rc.local. Used to boot strap and complete new target
NEW_TARGET_RC1=rc.local.client
NEW_TARGET_RC2=rc.local.reconfigure.client

# Define what partition to use on the jump system
JUMP_PARTITION = swap

# Where partition on jump system is mounted
JUMP_MNT=/media/jump

# Tar file of jump system image. Copy this to target and extract
JUMP_TAR=jump.system.tgz

# Target on jump system where files will be found
JUMP_DIR=%TARGET_DIR%

# GRUB menu. LILO is similar, but has not been tried
GRUBMENU=/boot/grub/menu.lst

# Show values of all variables
#Print Dumping config variables in %_LOGFILE%
#Showvars()
#remote echo At %_HOST% it should be %D% too

# Copy local files to remote system so commands run there have what they need
print Copying files to %_HOST%
remote mkdir -p %TARGET_DIR%
local sudo scp -q %JUMP_TAR% %TARGET_TAR% %TARGET_SETUP% %NEW_TARGET_RC1% %NEW_TARGET_RC2% root@%_HOST%:%TARGET_DIR%

print Run target setup on %_HOST%
remote %TARGET_DIR%/%TARGET_SETUP% prep

# Mount the swap partition on the target system
print Prepare to use %JUMP_PARTITION% on %_HOST%
jump(usepartition, %JUMP_PARTITION%, %JUMP_MNT%)

#
# Build the jump system on the target. We will soon be booting this
#
# Extract tar file in %JUMP_MNT%
print Extracting jump image tar file in %JUMP_MNT% at %_HOST%
remote (cd %JUMP_MNT%; tar xzf %TARGET_DIR%/%JUMP_TAR%)

# Tailor some details for the jump system
print Run target setup processing on %_HOST%
remote %TARGET_DIR%/%TARGET_SETUP% save %TARGET_PARTITION% %JUMP_MNT% %TARGET_DIR% %JUMP_DIR% %NEW_TARGET_TAR%

# Fix JUMP boot menu to specify %_JUMPPARTITION% and %_JUMPDEVICEID%
remote cp -p %JUMP_MNT%%GRUBMENU% %JUMP_MNT%%GRUBMENU%.saved
remote (cat %JUMP_MNT%%GRUBMENU%.saved | sed -e s:%TARGET_PARTITION%:%_JUMPPARTITION%: -e s:hd0,0:%_JUMPDEVICEID%: > %JUMP_MNT%%GRUBMENU%)

# Fix fstab to use %_JUMPPARTITION% for / and not as swap
remote cp -p %JUMP_MNT%/etc/fstab %JUMP_MNT%/etc/fstab.saved
remote (cat %JUMP_MNT%/etc/fstab.saved | grep -v %_JUMPPARTITION% | sed -e s:%TARGET_PARTITION%:%_JUMPPARTITION%: > %JUMP_MNT%/etc/fstab)
remote grub-install --root-directory=%JUMP_MNT% %_JUMPDEV%

print Rebooting target system %_HOST% ... Courage grasshopper
remote reboot

Sample Shell Script

#!/bin/bash
#
# jump2nr.pl script to help setup jump and new target systems
#
# Syntax: jump2.sh prep | save | restore
#
# If something is wrong, the script should exit with a non-zero return code
# All exits should be 'exit 0' or 'exit 1' or jump2nr.pl might incorrectly
# conclude the remote command failed
#
me=`basename $0`
FILES2SAVE="/etc/fstab /etc/hosts /etc/hostname /etc/passwd /etc/shadow /etc/ssh /etc/network/interfaces /root /etc/postfix/main.cf";
NEW_TARGET_RC1=rc.local.client
NEW_TARGET_RC2=rc.local.reconfigure.client
LOGFILE="/var/log/restore.log"
############################################################
# prep
# Runs on the target system before we try to set up a jump system.
# E.g. Stop applications, get users off, etc
############################################################
if [ "$1" = "prep" ]; then
    exit 0
fi
############################################################
# save TARGET_PARTITION JUMP_MNT TARGET_DIR JUMP_DIR TARGET_TAR
# Runs on the target system after the jump root dir is set up
# This should copy important files from the target system (/) to the jump system.
# These files will be used in a 'restore' step
############################################################
if [ "$1" = "save" ]; then
    shift
    if [ "$#" != 5 ]; then
        echo "Invalid '$me save' syntax: $0 $*"
        exit 1
    fi
    TARGET_PARTITION="$1"   # /dev/sda1          partition for new target
    JUMP_MNT="$2"           # /media/jump        where jump system image is mounted
    TARGET_DIR="$3"         # /jumpfiles         where jump files are on target system
    JUMP_DIR="$4"           # /jumpfiles         where saved files are on jump system
    TARGET_TAR="$5"         # target.system.tgz  tar image of new target system to restore
    if [ ! -r $TARGET_DIR/$TARGET_TAR ]; then
        echo "$me - Target system tar image ($TARGET_DIR/$TARGET_TAR) does not exist"
        exit 1
    fi
    echo "Copying files to $JUMP_MNT/$JUMP_DIR"
    mkdir -p $JUMP_MNT/$JUMP_DIR
    cd $JUMP_MNT/$JUMP_DIR || exit 1
    for f in $0 $FILES2SAVE $TARGET_DIR/$TARGET_TAR $TARGET_DIR/$NEW_TARGET_RC1 $TARGET_DIR/$NEW_TARGET_RC2; do
        echo " $f"
        cp -rp $f . || exit 1
    done
    # Create rc.local on jump system to invoke me to restore files
    echo "Set up rc.local on jump system to complete process"
    f="$JUMP_MNT/etc/rc.local"
    if [ ! -r $f ]; then
        echo "#!/bin/sh" > $f
    fi
    echo "$JUMP_DIR/$me restore $TARGET_PARTITION $JUMP_DIR $TARGET_TAR 2>&1 | tee $LOGFILE" >> $f
    echo "reboot" >> $f
    chmod 755 $f
    echo "Jump system ready for reboot"
    exit 0
fi
############################################################
# restore TARGET_PARTITION JUMP_DIR TARGET_TAR
# Runs on the jump system
# This should restore a new target system image
# This should copy saved important files from the old target
# system (/) to the new target system
############################################################
if [ "$1" = "restore" ]; then
    shift
    echo "#================================================="
    echo "# jump2.sh restore"
    echo "#================================================="
    if [ "$#" != 3 ]; then
        echo "Invalid '$me restore' syntax: $0 $*"
        exit 1
    fi
    TARGET_PARTITION="$1"   # /dev/sda1          partition for new target
    JUMP_DIR="$2"           # /jumpfiles         where saved files are on jump system
    TARGET_TAR="$3"         # target.system.tgz  tar image of new target system to restore
    # Clean out the target partition, restore the new TAR image
    # This is where we seriously make a commitment
    mkfs -t ext3 -F $TARGET_PARTITION || exit 1
    mount -t ext3 $TARGET_PARTITION /mnt || exit 1
    cd /mnt || exit 1
    tar xzf $JUMP_DIR/$TARGET_TAR || exit 1
    echo "If you are lucky, you have a runnable system now"
    df
    # Now restore those files saved by 'save'
    echo "Restoring files to the new target"
    for f in $FILES2SAVE; do
        echo " $f"
        if [ "$f" = "/etc/passwd" ]; then
            echo "$f requires special handling, cannot just blast a new one in. Replace this code."
        elif [ "$f" = "/etc/shadow" ]; then
            echo "$f requires special handling, cannot just blast a new one in. Replace this code."
        else
            fb=`basename $f`
            fd=`dirname $f`
            cp -rp $JUMP_DIR/$fb ./$fd
        fi
    done
    # Files restored. Prep new target so it can be booted
    # The new target system better have /boot/grub/menu.lst set up properly
    echo "Prepare new target system for reboot"
    # Derive device from partition by dropping the last character
    # (only correct for single-digit partitions such as /dev/sda1)
    d=`perl -e 'chop($ARGV[0]); print $ARGV[0];' $TARGET_PARTITION`
    grub-install --root-directory=/mnt $d
    # Last hook - replace existing rc.local support file with this one
    echo "Enable my local hooks so the new target system finishes the last bit"
    cp -p $JUMP_DIR/$NEW_TARGET_RC1 $JUMP_DIR/$NEW_TARGET_RC2 /mnt/etc
    touch /mnt/etc/jump.initialize.myself
    exit 0
fi
echo "Syntax: ${me} prep | save | restore"
exit 1
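The restore step above derives the disk name by chopping one character off the partition name, which is fine for /dev/sda1 but not for /dev/sda10 or for names like /dev/nvme0n1p1. If your hosts have such devices, something along these lines could replace the perl one-liner. It is only a sketch: the function name is made up, and it assumes the usual Linux naming rules (trailing digits, with a 'p' separator when the disk name itself ends in a digit).

```shell
#!/bin/bash
# Hypothetical helper: print the disk that a partition device belongs to.
disk_of_partition() {
    local part="$1"
    local re_p='^(.*[0-9])p[0-9]+$'      # /dev/nvme0n1p1, /dev/mmcblk0p2
    local re_plain='^(.*[^0-9])[0-9]+$'  # /dev/sda1, /dev/sda10
    if [[ $part =~ $re_p ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    elif [[ $part =~ $re_plain ]]; then
        printf '%s\n' "${BASH_REMATCH[1]}"
    else
        return 1                          # no partition number found
    fi
}
```

In the script this would be used as `d=$(disk_of_partition $TARGET_PARTITION) || exit 1`.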
EXIT
If no fatal errors are detected, the program exits with a return code of 0. Any error will set a non-zero return code.
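Since the point of all this is upgrading many hosts, the return code is what a driving loop would key on. The sketch below is one way to do that; jump_hosts and JUMP_CMD are invented names, and the command line simply mirrors the SYNOPSIS. Keep in mind that a return code of 0 presumably only covers what jump2nr.pl itself saw before the target rebooted, so a host that "completed" here still deserves a login check afterwards.

```shell
#!/bin/bash
# Hypothetical driver: run the jump once per host and list the hosts that failed.
# JUMP_CMD is an array so a harmless stand-in can be substituted for rehearsal.
JUMP_CMD=(jump2nr.pl -cfg nextrel.cfg)

jump_hosts() {
    local host failed=()
    for host in "$@"; do
        if ! "${JUMP_CMD[@]}" "$host"; then
            failed+=("$host")            # non-zero return code: remember the host
        fi
    done
    if [ "${#failed[@]}" -gt 0 ]; then
        echo "failed: ${failed[*]}"
        return 1
    fi
    echo "all hosts completed"
}
```

For a rehearsal, set `JUMP_CMD=(echo would jump)` before calling `jump_hosts node01 node02`; the host names are of course placeholders.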
AUTHOR
Written by Terry Gliedt <tpg@hps.com> in 2009. Copyright (C) Terry Gliedt and the University of Michigan. This is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation. See http://www.gnu.org/copyleft/gpl.html |