<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>randys.org &#187; Amazon S3</title>
	<atom:link href="http://www.randys.org/tag/amazon-s3/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.randys.org</link>
	<description>wasting your precious bandwidth since 1998</description>
	<lastBuildDate>Wed, 16 Nov 2011 23:40:23 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>How-To: Automated Backups to Amazon&#039;s S3 with Duplicity</title>
		<link>http://www.randys.org/2007/11/16/how-to-automated-backups-to-amazon-s-s3-with-duplicity/?utm_source=rss&#038;utm_medium=rss&#038;utm_campaign=how-to-automated-backups-to-amazon-s-s3-with-duplicity</link>
		<comments>http://www.randys.org/2007/11/16/how-to-automated-backups-to-amazon-s-s3-with-duplicity/#comments</comments>
		<pubDate>Fri, 16 Nov 2007 00:28:00 +0000</pubDate>
		<dc:creator>randy</dc:creator>
				<category><![CDATA[Code Chunks]]></category>
		<category><![CDATA[General Nerdery]]></category>
		<category><![CDATA[Home]]></category>
		<category><![CDATA[How-To]]></category>
		<category><![CDATA[Amazon S3]]></category>
		<category><![CDATA[Duplicity]]></category>
		<category><![CDATA[GPG]]></category>
		<category><![CDATA[Slicehost]]></category>

		<guid isPermaLink="false">http://www.randys.org//2007/11/16/how-to-automated-backups-to-amazon-s-s3-with-duplicity</guid>
		<description><![CDATA[I&#8217;ve been using Amazon&#8217;s S3 service for a couple months now. It was working OK using s3sync and a cron job, but it seemed like it wasn&#8217;t actually making incremental backups and I wasn&#8217;t 100% sure that it was backing &#8230; <a href="http://www.randys.org/2007/11/16/how-to-automated-backups-to-amazon-s-s3-with-duplicity/">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I&#8217;ve been using Amazon&#8217;s <a href="http://www.amazon.com/S3-AWS-home-page-Money/b/ref=sc_fe_l_2?ie=UTF8&amp;node=16427261&amp;no=3435361&amp;me=A36L942TSJ2AJA">S3</a> service for a couple of months now. It was working OK using <a href="http://s3sync.net/wiki">s3sync</a> and a cron job, but it seemed like it wasn&#8217;t actually making incremental backups and I wasn&#8217;t 100% sure that it was backing up everything (i.e. it appeared to be crapping out once in a while). I searched around for various S3 backup solutions and found a handy utility called <a href="http://duplicity.nongnu.org/">duplicity</a>. Even handier, it is available for most distributions (Arch Linux, <a href="http://www.ubuntu.com/">the</a> <a href="http://www.debian.org/">debs</a>, and Fedora anyway).</p>
<p>From the duplicity home page:</p>
<blockquote>
<p>Duplicity backs directories by producing encrypted tar-format volumes and uploading them to a remote or local file server. Because duplicity uses <a href="http://sourceforge.net/projects/librsync">librsync</a>, the incremental archives are space efficient and only record the parts of files that have changed since the last backup. Because duplicity uses <a href="http://www.gnupg.org/">GnuPG</a> to encrypt and/or sign these archives, they will be safe from spying and/or modification by the server.</p>
</blockquote>
<h3>What you&#8217;ll need</h3>
<p>You&#8217;ll need a few things installed before you install duplicity, namely librsync and GnuPG. Luckily, if the duplicity package is available for your distribution, the package manager will most likely pull them in as dependencies.</p>
<p>Here&#8217;s a rundown of the steps involved:</p>
<ol>
<li>Generate a new GnuPG key</li>
<li>Create a simple shell script wrapper</li>
<li>Create a cron job</li>
</ol>
<h3>Generating a new Key</h3>
<p>Start by generating a new GPG key for duplicity, or use an existing one if you have it.</p>
<p><strong>N.B.</strong> <em>I set this up on a <a href="https://manage.slicehost.com/customers/new?referrer=396224371">Slice</a> running Arch64 and had problems generating a new key (<code>gpg --gen-key</code>). Apparently, it could not gather enough entropy. Not a problem though: just generate the key elsewhere and import it later if this happens to you.</em></p>
<pre><code>#~ gpg --gen-key
gpg (GnuPG) 1.4.7; Copyright (C) 2006 Free Software Foundation, Inc.
This program comes with ABSOLUTELY NO WARRANTY.
This is free software, and you are welcome to redistribute it
under certain conditions. See the file COPYING for details.

Please select what kind of key you want:
(1) DSA and Elgamal (default)
(2) DSA (sign only)
(5) RSA (sign only)
Your selection?
</code></pre>
<p>Default (DSA and Elgamal) is fine here.</p>
<pre><code>DSA keypair will have 1024 bits.
ELG-E keys may be between 1024 and 4096 bits long.
What keysize do you want? (2048)
</code></pre>
<p>The default (2048) is more than enough for this. Change it to whatever you want.</p>
<pre><code>Requested keysize is 2048 bits
Please specify how long the key should be valid.
         0 = key does not expire
      &lt;n&gt;  = key expires in n days
      &lt;n&gt;w = key expires in n weeks
      &lt;n&gt;m = key expires in n months
      &lt;n&gt;y = key expires in n years
Key is valid for? (0)
</code></pre>
<p>Unless you want the key to expire (I don&#8217;t see why one would want that), the default is what we want.</p>
<pre><code>Key does not expire at all
Is this correct? (y/N)
</code></pre>
<p>Um, yes, this is correct.</p>
<pre><code>You need a user ID to identify your key; the software constructs the user ID
from the Real Name, Comment and Email Address in this form:
    "Heinrich Heine (Der Dichter) &lt;heinrichh@duesseldorf.de&gt;"

Real name: DuplicityBackup
Email address: duplicity@mydomain.com
Comment: Key for Duplicity
You selected this USER-ID:
    "DuplicityBackup (Key for Duplicity) &lt;duplicity@mydomain.com&gt;"

Change (N)ame, (C)omment, (E)mail or (O)kay/(Q)uit?
</code></pre>
<p>Enter whatever information you want here and type &#8216;O&#8217; for &#8216;Okay&#8217;</p>
<pre><code>You need a Passphrase to protect your secret key.

Enter Passphrase:
</code></pre>
<p>Enter something. Anything. The more complex the better. This is your private data. Remember that it&#8217;s being transferred over HTTP to a server you don&#8217;t own. I don&#8217;t care if it is Amazon. Remember what you type because you&#8217;ll need it later while creating the wrapper script.</p>
<pre><code>gpg: key 9929DAB1 marked as ultimately trusted
public and secret key created and signed.

gpg: checking the trustdb
gpg: 3 marginal(s) needed, 1 complete(s) needed, PGP trust model
gpg: depth: 0  valid:   2  signed:   0  trust: 0-, 0q, 0n, 0m, 0f, 2u
pub   1024D/9929DAB1 2007-11-15
      Key fingerprint = 3378 8E93 4349 0E7F 44F3  7C81 2460 5A11 9929 DAB1
uid                  DuplicityBackup (Key for Duplicity) &lt;duplicity@mydomain.com&gt;
sub   2048g/5385A6BB 2007-11-15
</code></pre>
<p>And you&#8217;re done. Make note of the key ID (in this case, 9929DAB1) as we&#8217;ll need that later too.</p>
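<p>If key generation stalls for lack of entropy, as mentioned in the note above, moving a key generated elsewhere onto the backup server might look roughly like the sketch below. The function names and the <code>.asc</code> file names are made up for this example; only the <code>gpg</code> options themselves are real.</p>

```shell
#!/bin/bash
# Sketch: move a GnuPG keypair from the machine where it was generated
# to the server that will run duplicity. File names are hypothetical.

# On the machine that generated the key: write both halves to out_dir.
export_keypair() {
    local key_id=$1 out_dir=$2
    gpg --armor --export "$key_id" > "$out_dir/duplicity-public.asc"
    # umask 077 so the secret key file is never readable by other users
    ( umask 077
      gpg --armor --export-secret-keys "$key_id" > "$out_dir/duplicity-secret.asc" )
}

# On the backup server, after copying the two files over (e.g. with scp).
import_keypair() {
    local in_dir=$1
    gpg --import "$in_dir/duplicity-public.asc"
    gpg --import "$in_dir/duplicity-secret.asc"
}

# Usage:
#   export_keypair 9929DAB1 /tmp/keys      # machine with enough entropy
#   import_keypair /tmp/keys               # backup server
```

<p>An imported key will most likely show up as untrusted. Running <code>gpg --edit-key 9929DAB1</code> and using its <code>trust</code> command lets you mark it as ultimately trusted, so gpg doesn&#8217;t balk when encrypting to it from a script. Delete the exported secret key file once the import has worked.</p>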
<h4>But I already have a key I want to use</h4>
<p>OK, fine, but chances are, if you have a key already, you know how to get it. However, if you don&#8217;t know how to get your key ID, run <code>gpg --list-keys</code>. You want the ID on the &#8216;pub&#8217; line, after the forward slash &#8216;/&#8217;.</p>
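<p>If you would rather have a script pick out the key ID, gpg&#8217;s machine-readable <code>--with-colons</code> listing is easier to parse than the human-readable one. The helper below is only a sketch (the function name is made up); it relies on the long key ID sitting in the fifth field of each <code>pub</code> record and prints its last eight hex digits, which is the short ID used throughout this post.</p>

```shell
#!/bin/bash
# Print the short (last 8 hex digits) key ID of every public key found in
# "gpg --with-colons" output read from stdin.
# In that format, "pub" records carry the long key ID in field 5.
short_key_ids() {
    awk -F: '$1 == "pub" { print substr($5, length($5) - 7) }'
}

# Usage (assumes gpg is installed and has at least one key):
#   gpg --list-keys --with-colons | short_key_ids
```

<p>This also helps with newer GnuPG releases, which by default print a full fingerprint in the key listing rather than the <code>1024D/9929DAB1</code> form shown above.</p>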
<h3>The Wrapper</h3>
<p>This can be written in any language really. I chose shell because it&#8217;s easy and basic. You could run <code>duplicity</code> directly on the command line now, but writing a wrapper is much more convenient and makes adding a cron job later a lot easier. Here&#8217;s what you&#8217;ll need:</p>
<ul>
<li>Your Amazon S3 Access Key ID and Secret Access Key. If you don&#8217;t have one, you&#8217;ll have to <a href="http://www.amazonaws.com/">sign up for one</a>.</li>
<li>Your GPG key</li>
<li>Your GPG key&#8217;s passphrase</li>
<li>A list of directories you want to back up</li>
</ul>
<p>Here&#8217;s a basic script that works for me:</p>
<pre><code>#!/bin/bash
# Export some ENV variables so you don't have to type anything
export AWS_ACCESS_KEY_ID='&lt;your-access-key-id&gt;'
export AWS_SECRET_ACCESS_KEY='&lt;your-secret-access-key&gt;'
export PASSPHRASE='&lt;your-gpg-passphrase&gt;'

GPG_KEY='&lt;your-gpg-key&gt;'

# The source of your backup
SOURCE=/

# The destination
# Note that the bucket need not exist
# but does need to be unique amongst all
# Amazon S3 users. So, choose wisely.
DEST='s3+http://&lt;your-bucket-name&gt;'

# Back up only the listed directories; everything else under / is excluded
duplicity \
    --encrypt-key="${GPG_KEY}" \
    --sign-key="${GPG_KEY}" \
    --include=/boot \
    --include=/etc \
    --include=/home \
    --include=/root \
    --include=/var/lib/mysql \
    --exclude='/**' \
    "${SOURCE}" "${DEST}"

# Reset the ENV variables. Don't need them sitting around
unset AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY PASSPHRASE
</code></pre>
<p>And that&#8217;s pretty much it. Save the file as something creative, like <code>backup</code>, and make it executable (<code>chmod 700 backup</code>). If you want to test it first (and you have the disk space), change the destination to a local directory (for example <code>file:///tmp/duplicity-test</code>) or an external HDD. Once you&#8217;ve got it working the way you want, set it up as a cron job. Daily, weekly, monthly&#8230; doesn&#8217;t matter.</p>
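<p>One way to switch between a local test run and the real thing without editing the script each time is sketched below. <code>BACKUP_TEST_DIR</code> and <code>backup_dest</code> are names invented for this example, not anything duplicity knows about, and the crontab line in the comments assumes the wrapper was saved as <code>/root/backup</code>.</p>

```shell
#!/bin/bash
# Pick the backup destination. BACKUP_TEST_DIR is a hypothetical variable
# introduced for this sketch; duplicity itself does not read it.
backup_dest() {
    local bucket=$1
    if [ -n "${BACKUP_TEST_DIR:-}" ]; then
        # duplicity addresses local targets with a file:// URL
        printf 'file://%s\n' "$BACKUP_TEST_DIR"
    else
        printf 's3+http://%s\n' "$bucket"
    fi
}

# In the wrapper:   DEST=$(backup_dest my-bucket)
# Local test run:   BACKUP_TEST_DIR=/tmp/duplicity-test ./backup
#
# Example crontab entry (added via "crontab -e") that runs the wrapper
# daily at 03:15 and appends its output to a log file:
#   15 3 * * * /root/backup >> /var/log/duplicity-backup.log 2>&1
```

<p>Cron runs jobs with a minimal environment, so it is worth using absolute paths inside the wrapper for anything that is not in a standard location.</p>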
<p>Duplicity is a nice backup solution for any situation, not just Amazon&#8217;s S3. It can handle HTTP, SCP and local backups as well. I highly recommend reading the <a href="http://duplicity.nongnu.org/duplicity.1.html">duplicity man page</a> and checking out the various command line arguments and available options.</p>
<p><em>Thanks</em> go out to <a href="http://www.brainonfire.net/2007/08/11/remote-encrypted-backup-duplicity-amazon-s3/">Tim McCormack&#8217;s</a> and <a href="http://www.sysadminschronicles.com/archives/2007/10/21/backing_up_with_amazon_s3/">Ben and Ron&#8217;s</a> articles, which got me started.</p>
<hr/>
<p>Tim points out that adding your GPG passphrase to the shell script might not be the most secure method, especially in a shared environment. I agree; however, it kind of defeats the purpose of automated backups if you have to enter your passphrase (twice) on the command line every time the wrapper script runs. One way I managed to get around this is to create a simple C program that prints the passphrase.</p>
<p>Here&#8217;s the C code:</p>
<pre><code>#include &lt;stdio.h&gt;
int main()
{
    printf("your-gpg-passphrase");
    return 0;
}
</code></pre>
<p>Compile</p>
<pre><code>#~ gcc gpg-passphrase.c -o gpg-passphrase
</code></pre>
<p>Restrict it to your user, so no one else can execute it or read the passphrase out of the binary</p>
<pre><code>#~ chmod 700 gpg-passphrase
</code></pre>
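<p>Modify the wrapper script to use the binary for the passphrase. Use the full path to wherever you put it, since cron&#8217;s <code>PATH</code> is minimal</p>
<pre><code>export PASSPHRASE=$(/path/to/gpg-passphrase)
</code></pre>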
<p>You might go as far as to do the same thing for your AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY as well. There are probably other ways around this, but this was a quick and dirty way to not have readable strings in shell scripts. I figure, if someone has rooted my server, I&#8217;ve got bigger problems to worry about than my data sitting on Amazon&#8217;s S3.</p>
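<p>One of those other ways, sketched here as an alternative rather than as what I actually run, is to keep the passphrase in a file only your user can read and have the wrapper refuse to use it if the permissions are looser than that. The function name and the file path are hypothetical, and <code>stat -c</code> assumes GNU coreutils (BSD and macOS <code>stat</code> take different options).</p>

```shell
#!/bin/bash
# Print the contents of a secret file, but only if group and others
# have no access to it (mode 600 or 400).
read_secret() {
    local file=$1 mode
    mode=$(stat -c '%a' "$file") || return 1   # GNU stat syntax
    case "$mode" in
        600|400) cat "$file" ;;
        *) echo "refusing to read $file: mode $mode is too open" >&2
           return 1 ;;
    esac
}

# Usage in the wrapper (path is hypothetical):
#   PASSPHRASE=$(read_secret /root/.duplicity-passphrase) || exit 1
#   export PASSPHRASE
```

<p>This offers about the same protection as the compiled binary: anyone who can read the file as your user, or as root, can still get at the passphrase.</p>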
			]]></content:encoded>
			<wfw:commentRss>http://www.randys.org/2007/11/16/how-to-automated-backups-to-amazon-s-s3-with-duplicity/feed/</wfw:commentRss>
		<slash:comments>34</slash:comments>
		</item>
	</channel>
</rss>

