This week we had a system error which led to catastrophic data loss in my lab. Fortunately it was restricted to a single virtual machine, but unfortunately that virtual machine happened to be my personal workbench. Inconvenient losses include data, scripts, and notes from recent projects and from just about all of my graduate coursework. The absolutely devastating loss, however, was my electronic lab notebook, which was hosted as a wiki on the machine. Luckily I had done some backups of my lab notebook, but the most recent one I could find was from May 2013. As happy as I am to have avoided losing my entire graduate lab notebook, losing 8-9 months’ worth is still heartbreaking.
So I just finished doing what I should have done a long time ago: automating my backup procedure. Let me introduce the crontab command. The crontab command lets you edit a system file that specifies commands to be run at regular intervals by your system. Using crontab, you can set up cron jobs to run hourly, daily, or weekly, with lots of flexibility. Here are a few examples.
```
# Execute 'somecommand' at the beginning of every hour
0 * * * * somecommand

# Execute 'somecommand' at 1:00am and 1:00pm every day.
0 1,13 * * * somecommand

# Execute '/home/standage/check-submissions' every minute between the hours
# of 12am-2am on Mondays and Wednesdays.
* 0-2 * * 1,3 /home/standage/check-submissions
```
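For reference, each crontab entry consists of five time fields followed by the command to run. The fields are, in order:

```shell
# ┌───────── minute (0-59)
# │ ┌─────── hour (0-23)
# │ │ ┌───── day of month (1-31)
# │ │ │ ┌── month (1-12)
# │ │ │ │ ┌─ day of week (0-7; both 0 and 7 mean Sunday)
# │ │ │ │ │
# * * * * *  command-to-run
```

An asterisk in a field means "every value", a comma separates a list (1,13), and a dash gives a range (0-2), as in the examples above.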
You can run man crontab on your system for a complete description of available options.
As for the specifics of my backup procedure, I decided a weekly backup would be sufficient: specifically, at 2am on Saturday morning, when the chances of me doing any research are pretty slim. So I ran crontab -e to open my crontab file and added the following entry.
```
0 2 * * 6 /home/standage/bin/wiki-backup
```
/home/standage/bin/wiki-backup is an executable bash script that includes the commands needed to perform each backup. This particular script creates a gzip-compressed tar archive of my lab notebook, and then copies it over the network to my directory in the iPlant data store using the iput command. If I had Box or Dropbox installed on that machine, I could just as easily have replaced the iput command with a cp command that copies the backup to the directory on my local system that syncs with the cloud.
```
#!/usr/bin/env bash
cdate=$(date '+%F')
backdir=/home/standage/Backups/labnotebook
backfile="labnotebook.$cdate.tar.gz"
cd /var/www/html
tar czf "$backdir/$backfile" labnotebook
cd "$backdir"
iput -V "$backfile" Backups
```
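For the cp-based variant mentioned above, only the last step changes. Here is a self-contained sketch that uses a temporary directory as a stand-in for the notebook, backup, and sync folders (the real paths, such as your actual Dropbox folder, are assumptions):

```shell
#!/usr/bin/env bash
# Sketch of the cp-based variant: build the dated archive exactly as in the
# script above, then copy it into a folder that a sync client watches.
# All paths here are temporary stand-ins so the sketch runs anywhere.
set -e
work=$(mktemp -d)
mkdir -p "$work/labnotebook" "$work/Backups/labnotebook" "$work/Dropbox/Backups"
echo "sample notebook page" > "$work/labnotebook/notes.txt"

cdate=$(date '+%F')
backdir="$work/Backups/labnotebook"
backfile="labnotebook.$cdate.tar.gz"
cd "$work"                                       # stand-in for /var/www/html
tar czf "$backdir/$backfile" labnotebook
cp "$backdir/$backfile" "$work/Dropbox/Backups/" # instead of iput
ls "$work/Dropbox/Backups"
```

The sync client then uploads the new tarball on its own schedule, so no network transfer command is needed in the script at all.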
Hopefully this example gives a clear idea of what is possible with cron jobs and how easy it is to set up automatic backups for your critical research data.