Automating Subversion Backups

From PeformIQ Upgrade
Revision as of 13:29, 12 December 2007 by PeterHarding

Automate subversion backups

Vincent Danen, TechRepublic

Subversion is growing in popularity and usage, and while it makes a fantastic version control system, it needs to be properly backed up like anything else. If Subversion is used as a version control or backup system for other data (e.g., configuration files, as we've seen in a previous tip), then that data becomes exceptionally important, and "backing up the backup" is a wise thing to do.

Fortunately, Subversion gives us the tools and flexibility to do this easily. With a monthly full backup and on-the-fly backups of each commit, backing up Subversion can be painless, and it protects against the rare case of data corruption.

The first step is to add a post-commit hook that dumps each commit as it is made. Assuming the Subversion repository is stored in /subversion/repos/mystuff, edit /subversion/repos/mystuff/hooks/post-commit and add:

svnadmin dump "$REPOS" --revision "$REV" --incremental > /subversion/backup/incremental/mystuff/commit-$REV

The backup directory is /subversion/backup/incremental/mystuff, which must already exist and be writable by the user that runs the commit. Each commit is dumped there under its revision number, so revision 52 becomes "commit-52".

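If you do not already have the post-commit file, copy the skeleton post-commit.tmpl file in the same directory to create it, and make it executable. The template sets REPOS and REV from the hook's first two arguments.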
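For reference, a complete hook file might look like the sketch below. It is not taken from the original tip: the function name dump_commit, the SVNADMIN and BACKUPDIR variables, and the /usr/bin location of svnadmin are assumptions made for illustration. Subversion runs hooks with an empty environment, which is why the sketch calls svnadmin by absolute path rather than relying on PATH.

```shell
#!/bin/sh
# Sketch of hooks/post-commit. Subversion calls the hook with the repository
# path as $1 and the new revision number as $2.

# Assumptions (not from the article): svnadmin is installed in /usr/bin and
# the hook writes to the backup directory used in the text.
SVNADMIN="${SVNADMIN:-/usr/bin/svnadmin}"
BACKUPDIR="${BACKUPDIR:-/subversion/backup/incremental/mystuff}"

# dump_commit REPOS REV: write one revision as an incremental dump file.
dump_commit() {
    mkdir -p "$BACKUPDIR" || return 1
    "$SVNADMIN" dump "$1" --revision "$2" --incremental > "$BACKUPDIR/commit-$2"
}

# Only act when Subversion supplied both arguments.
if [ "$#" -ge 2 ]; then
    dump_commit "$1" "$2"
fi
```

Writing with ">" rather than ">>" means a rerun for the same revision replaces the file instead of appending a second copy of the dump to it.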

The monthly backup does two things: it takes a full hot copy of the Subversion repository, and it sets a new dump baseline for subsequent commits. On a busy Subversion server the incremental dumps add up quickly, so every month a new full dump is written as a baseline and the previous month's commit dumps are removed (they are already part of the new baseline). The script below accomplishes this. To automate it, create it as /etc/cron.monthly/svn-fullbackup:

#!/bin/sh

# Monthly job: hot-copy each repository, then replace its dump baseline.
date="$(date +%Y-%m)"
svnbasedir="/subversion/repos"
svnfullbkdir="/subversion/backup/full"
svnincbkdir="/subversion/backup/incremental"
echo "+++ Backing up subversion repositories..."
for repo in mystuff ; do
    echo ""
    mkdir -p "${svnfullbkdir}/${date}"
    if ! /usr/bin/hot-backup.py "${svnbasedir}/${repo}" "${svnfullbkdir}/${date}"; then
        echo "!! Hot backup failed on repository: ${repo}"
    fi
    lastrev="$(svnlook youngest "${svnbasedir}/${repo}")"
    echo "** Creating new baseline for ${repo} (at revision ${lastrev})..."
    # Dump into a temporary directory so a failed dump never replaces good backups.
    mkdir -p "${svnincbkdir}/${repo}" "${svnincbkdir}/${repo}.tmp"
    if svnadmin dump -q "${svnbasedir}/${repo}" > "${svnincbkdir}/${repo}.tmp/baseline-${lastrev}"; then
        # The new baseline supersedes the old baseline and all earlier commit dumps.
        rm -f "${svnincbkdir}/${repo}"/baseline-* "${svnincbkdir}/${repo}"/commit-*
        mv -f "${svnincbkdir}/${repo}.tmp/baseline-${lastrev}" "${svnincbkdir}/${repo}/"
    else
        echo "!! Creating new baseline failed!  Left the old baseline and commits intact!"
    fi
    rm -rf "${svnincbkdir}/${repo}.tmp"
done

Depending on how subversion is installed, the path to hot-backup.py may need to be adjusted; that script comes with subversion and its resulting installed location may depend on how the vendor packaged subversion for your particular Linux distribution.
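If you are not sure where your distribution put the script, a small helper along these lines can look for it. This is only a sketch: the function name find_hot_backup is made up for illustration, and searching under /usr by default is an assumption about where packages usually install files.

```shell
#!/bin/sh
# Sketch: find hot-backup.py when it is not in /usr/bin.

# find_hot_backup [DIR...]: print the first hot-backup.py found on PATH or
# under the given directories (default: /usr); return non-zero if none is found.
find_hot_backup() {
    if command -v hot-backup.py >/dev/null 2>&1; then
        command -v hot-backup.py
        return 0
    fi
    [ "$#" -gt 0 ] || set -- /usr
    found="$(find "$@" -type f -name 'hot-backup.py' 2>/dev/null | head -n 1)"
    [ -n "$found" ] && echo "$found"
}
```

Calling find_hot_backup with no arguments prints a path you can substitute for /usr/bin/hot-backup.py in the backup script.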

Now, every month, the entire repository is backed up in two forms: a text dump baseline, which can easily be reloaded if the repository develops problems, and a hot backup of the full repository. If worst comes to worst, the repository can be restored from the hot backup, and subsequent changes can then be reloaded from the incremental dumps written on every commit.
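Restoring from the text dumps alone is a matter of loading the baseline and then each commit dump in revision order. The sketch below shows one way to do that; the function names list_dumps and restore_repo are hypothetical, and it assumes the file names used in this tip (baseline-N and commit-N) and paths without spaces.

```shell
#!/bin/sh
# Sketch: rebuild a repository from the baseline plus incremental dumps.

# list_dumps DIR: print the dump files in the order they must be loaded:
# the baseline first, then commit-N files sorted by revision number.
list_dumps() {
    ls "$1"/baseline-* 2>/dev/null
    ls "$1" | sed -n 's/^commit-\([0-9][0-9]*\)$/\1/p' | sort -n |
        while read -r rev; do
            echo "$1/commit-$rev"
        done
}

# restore_repo DUMPDIR NEWREPO: create NEWREPO and load every dump into it.
# Assumes the paths contain no whitespace.
restore_repo() {
    svnadmin create "$2" || return 1
    for dump in $(list_dumps "$1"); do
        svnadmin load -q "$2" < "$dump" || return 1
    done
}
```

For example, restore_repo /subversion/backup/incremental/mystuff /subversion/repos/mystuff-restored would rebuild the repository in a new location, leaving the damaged one untouched until you have checked the result. The numeric sort matters because a plain alphabetical listing would load commit-100 before commit-41.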

Also, because each hot backup goes to its own dated directory, you can roll the repository back to a specific point in time if required. For high-activity repositories, the script could be made more robust by stopping Apache before the backup and restarting it afterward (if feasible, and assuming Apache is the front end to the repository in the first place), which prevents commits during the backup. You could also verify the repository just before backing it up to ensure the backup is good; however, I recommend a daily verify so that you spot problems as soon as they arise. This can be done by creating /etc/cron.daily/svn-verify with:

#!/bin/sh

# Daily check: run "svnadmin verify" on each repository and report the result.
svnbasedir="/subversion/repos"
echo "Verifying subversion repositories..."
for repo in mystuff ; do
    printf 'Verifying: %s\t\t' "${repo}"
    if svnadmin verify "${svnbasedir}/${repo}" >/dev/null 2>&1; then
        printf '  ok\n'
    else
        printf '  FAILED!\n'
    fi
done

Of course, in either script you can verify or back up more than one Subversion repository by changing "for repo in mystuff" to "for repo in mystuff stuff private", which performs the actions on the repositories mystuff, stuff, and private.
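If every repository lives directly under the same base directory, the list could also be built automatically instead of being typed by hand. The following is a rough sketch, not part of the original tip: list_repos is a hypothetical helper, and it assumes that a repository directory can be recognized by its "format" file and "db" subdirectory.

```shell
#!/bin/sh
# Sketch: build the repository list from the directories under svnbasedir.

# list_repos BASEDIR: print the name of every subdirectory that looks like a
# Subversion repository. Assumption: a repository directory contains a
# "format" file and a "db" directory.
list_repos() {
    for dir in "$1"/*/; do
        dir="${dir%/}"
        if [ -f "$dir/format" ] && [ -d "$dir/db" ]; then
            basename "$dir"
        fi
    done
}
```

With this in place, the loop header in either script could become for repo in $(list_repos "${svnbasedir}"), provided the repository names contain no spaces.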