Ops Guy Corner: MySQL cloud backup solution

I’m Kevin, and I wanted to take a couple minutes to introduce myself and talk a little about the state of operations when I first came to UserVoice. I was hired on as the sole Operations Engineer. It sounds like a fancy title, but it really means I wear many different hats. On top of handling the day-to-day operations of our systems, I’m the DBA (oh no!), the network guy, the office IT guy, and, of course, the resident geek. Whether it be the WiFi interference in our loft office, our MongoDB installation, or the water cooler that was oddly plugged into a battery-backed UPS, I’ve been fixing problems since I got here.

My first real task here at UserVoice was to come up with a backup solution for our MySQL database. As a quick first hack, I wrote a bash script that did the necessary work and ran from ‘cron’ once a day. Since we already had a master/slave setup for our database, it was simple to take a ‘mysqldump’ of the slave.

The basics of the script are:

mysqldump thisisourdbname | gzip -9 > /backups/mysql_dump-`date +%m%d%y-%H%M`.sql.gz
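For reference, the cron entry that ran it looked something like this (the schedule and script path here are illustrative, not our exact setup):

# Run the backup script on the slave every night at 3:00 AM
0 3 * * * /usr/local/bin/mysql_backup.sh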

This ran successfully for a few days, until I noticed the disk was filling up too quickly. I needed something better and more robust: a more permanent storage solution, plus a rotation plan so I wouldn’t have to clean up old backups by hand.

We already use Amazon S3 for different things here at UserVoice, so it was easy to create a new bucket in our existing account to hold the backups. But before I put our entire database in the cloud, I wanted to make sure it was encrypted with a password, split into multiple files for easy upload and retrieval, and compressed as much as possible. I chose the 7z archive format, which uses the LZMA compression algorithm, because it met all of these requirements and is free, open source, and cross-platform. I added the following to my existing bash script to handle the compression, splitting, and encryption:

mysqldump thisisourdbname | 7z a -si -v250m -psillypass /backups/mysql_dump-`date +%m%d%y-%H%M`.sql.7z
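As a sanity check, it’s worth knowing how to get a dump back out of these volumes. Pointing 7z at the first volume streams the reassembled archive to stdout, which can go straight into ‘mysql’ (the filename below is just an example of the date-stamped pattern):

# Extract the split, encrypted archive to stdout and load it into MySQL
7z x -so -psillypass /backups/mysql_dump-010112-0300.sql.7z.001 | mysql thisisourdbname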

The new command further compressed our dumps and split them into tidy 250 MB volumes, but the dump now took about 4.5 hours to complete, and the slave wouldn’t replicate any data from the master for that entire window. The culprit is the pipe: ‘mysqldump’ can only write as fast as 7z compresses, and LZMA compression is slow, so the table locks are held for the whole 4.5 hours rather than just the length of the dump itself. This was obviously not good, so I split the process into two steps, first running the dump with light, fast compression:

DATETIME=`date +%m%d%y-%H%M`
mysqldump thisisourdbname | gzip -1 > /backups/mysql_dump-$DATETIME.sql.gz

…then decompressing it and recompressing it into 7z format:

gzcat /backups/mysql_dump-$DATETIME.sql.gz | 7z a -si -v250m -psillypass /backups/mysql_dump-$DATETIME.sql.7z && rm -f /backups/mysql_dump-$DATETIME.sql.gz

This shortened the ‘mysqldump’ portion of the process, the part that pauses replication, to only 30-40 minutes; the 7z recompression still takes its time, but the slave keeps replicating while it runs. I was comfortable with these figures and ready to wrap it up with weekly uploads to Amazon S3.
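Putting the pieces together, the nightly portion of the script now boils down to something like this (the database name, paths, and password are placeholders, as above):

#!/bin/bash
set -e
DATETIME=`date +%m%d%y-%H%M`

# Step 1: fast dump; gzip -1 keeps pace with mysqldump, so the slave
# is only paused for the 30-40 minutes the dump itself takes
mysqldump thisisourdbname | gzip -1 > /backups/mysql_dump-$DATETIME.sql.gz

# Step 2: recompress into encrypted, 250 MB 7z volumes, then drop the gzip copy
gzcat /backups/mysql_dump-$DATETIME.sql.gz | 7z a -si -v250m -psillypass /backups/mysql_dump-$DATETIME.sql.7z
rm -f /backups/mysql_dump-$DATETIME.sql.gz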

I polished my Ruby skills and wrote a rudimentary script that uses the ‘aws-s3’ library to read a list of files as command-line arguments and upload them to a bucket in our Amazon account. To run this script once a week from ‘cron’:

find /backups -name \*.7z.\* -type f -mtime -1 | xargs s3_backup.rb

This finds all the files modified in the last day that match the naming pattern 7z uses for split volumes, then hands them to my S3 script as command-line arguments courtesy of ‘xargs’. The S3 script handles the upload and notifies the rest of the team via a Campfire notification.
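For the curious, here’s a minimal sketch of what s3_backup.rb does with the ‘aws-s3’ library; the bucket name and credential handling are placeholders, and the real script also handles the Campfire notification:

#!/usr/bin/env ruby
require 'rubygems'
require 'aws/s3'

# Credentials pulled from the environment; adjust to taste
AWS::S3::Base.establish_connection!(
  :access_key_id     => ENV['AMAZON_ACCESS_KEY_ID'],
  :secret_access_key => ENV['AMAZON_SECRET_ACCESS_KEY']
)

BUCKET = 'our-mysql-backups' # hypothetical bucket name

# Each command-line argument is a backup volume to upload,
# stored in the bucket under its basename
ARGV.each do |path|
  AWS::S3::S3Object.store(File.basename(path), open(path), BUCKET)
  puts "Uploaded #{path}"
end

# (Campfire notification to the team goes here; omitted in this sketch.)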

This isn’t a perfect solution yet, but it works just fine as long as you monitor disk usage. I have yet to implement automatic rotation, but I plan on using ‘logrotate’ to rotate and manage the backup files on the database server. I will also set up another host with more space and ‘rsync’ the /backups folder to it daily for archiving, alongside the weekly uploads to S3.
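Until logrotate is wired up, even a simple age-based cleanup in the same cron job would keep the disk in check; combined with the planned daily ‘rsync’, it could look roughly like this (the retention window and archive host are hypothetical):

# Keep two weeks of backups locally, then mirror everything to the archive host
find /backups -name '*.7z.*' -type f -mtime +14 -delete
rsync -az /backups/ backup-archive:/backups/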

This is all part of the excitement that goes on here at UserVoice. I’ll continue to update you on my progress and all the fun problems I get to work out. I’d love to hear your feedback on this subject, or any suggestions for further discussion topics.