Data backup with crontab and iput

This week we had a system error which led to catastrophic data loss in my lab. Fortunately it was restricted to a single virtual machine, but unfortunately that virtual machine happened to be my personal workbench. Inconvenient losses include data, scripts, and notes from recent projects and from just about all of my graduate coursework. The absolutely devastating loss, however, was my electronic lab notebook, which was hosted as a wiki on the machine. Luckily I had done some backups of my lab notebook, but the most recent one I could find was from May 2013. As happy as I am to have avoided losing my entire graduate lab notebook, losing 8-9 months’ worth is still heartbreaking.

So I just finished doing what I should have done a long time ago: automate my backup procedure. Let me introduce the crontab command. The crontab command lets you edit a system file which specifies commands to be run at regular intervals by your system. Using crontab, you can set up cron jobs to run hourly, daily, or weekly, with lots of flexibility. Here are a few examples.

# Execute 'somecommand' at the beginning of every hour
0 * * * * somecommand

# Execute 'somecommand' at 1:00am and 1:00pm every day.
0 1,13 * * * somecommand

# Execute '/home/standage/check-submissions' every minute from 12:00am
# through 2:59am on Mondays and Wednesdays.
* 0-2 * * 1,3 /home/standage/check-submissions

You can run man 5 crontab on your system for a complete description of the file format and available options (man crontab describes the command itself).
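As a quick reference for reading the examples above, here is the five-field layout that each entry follows. This is a sketch based on the common Vixie-style cron format; the accepted ranges and names may differ slightly on your system, so check the man page if in doubt.

```shell
# Field layout of a crontab entry (the lines here are comments only):
#
#   minute  hour  day-of-month  month  day-of-week  command
#   0-59    0-23  1-31          1-12   0-7          (anything you could type at a prompt)
#
# An asterisk (*) matches every value for that field.
# A comma-separated list (1,13) matches each listed value.
# A range (0-2) matches every value from the first to the last, inclusive.
# For day-of-week, 0 and 7 both mean Sunday on most implementations,
# so Monday is 1 and Saturday is 6.
```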

As for the specifics of my backup procedure, I decided a weekly backup would be sufficient: specifically, at 2am on Saturday morning, when the chances of me doing any research are pretty slim. So I ran crontab -e to open my crontab file and added the following entry.

0 2 * * 6 /home/standage/bin/wiki-backup

The file /home/standage/bin/wiki-backup is an executable bash script that includes the commands needed to perform each backup. This particular script creates a gzip-compressed tar archive of my lab notebook, and then copies it over the network to my directory in the iPlant data store using the iput command. If I had Box or Dropbox installed on that machine, I could just as easily have replaced the iput command with a cp command that copies the data backup to the directory on my local system that syncs with the cloud.

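#!/usr/bin/env bash

# Stop at the first failed command so a broken archive is never uploaded
set -e

cdate=$(date '+%F')
# Assumed values: the definitions of these two variables are not shown
# in the original listing, so the directory and file name are placeholders
backdir=/home/standage/backups
backfile="labnotebook-$cdate.tar.gz"

# Archive the wiki directory using a path relative to the web root
cd /var/www/html
tar czf "$backdir/$backfile" labnotebook

# Upload the archive to the Backups collection in the data store
cd "$backdir"
iput -V "$backfile" Backups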
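For the Box or Dropbox variant mentioned above, something like the following sketch might work. The function name backup_to_sync_dir and every path passed to it are hypothetical, not part of my actual setup; the only real assumption about the sync client is that it watches an ordinary local directory.

```shell
#!/usr/bin/env bash
# Sketch: archive a directory, check the archive, and copy it into a
# locally synced folder. All names and paths here are illustrative.

backup_to_sync_dir() {
    local parent="$1"   # directory that contains the data, e.g. /var/www/html
    local name="$2"     # directory to archive, e.g. labnotebook
    local backdir="$3"  # where the archive is first written (assumed path)
    local syncdir="$4"  # folder watched by the sync client (assumed path)
    local backfile

    backfile="$name-$(date '+%F').tar.gz"

    mkdir -p "$backdir" "$syncdir" || return 1
    # -C switches to the parent directory so the archive stores relative paths
    tar czf "$backdir/$backfile" -C "$parent" "$name" || return 1
    # Listing the contents is a cheap check that the archive is readable
    tar tzf "$backdir/$backfile" > /dev/null || return 1
    cp "$backdir/$backfile" "$syncdir/" || return 1

    # Print the final location so a caller or a log can record it
    echo "$syncdir/$backfile"
}
```

A wiki-backup script built this way would end with a single call such as backup_to_sync_dir /var/www/html labnotebook "$HOME/backups" "$HOME/Dropbox/Backups", where both destination directories are placeholders.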

Hopefully this example gives a clear idea of what is possible with cron jobs and how easy it is to set up automatic backups for your critical research data.

