Saving Images with the Google App Engine using S3

Note: After several updates, some of this code is obsolete.

In this tutorial I will attempt to show you how to use the Google App Engine (GAE) with Amazon's S3. It's not for the faint of heart, so be prepared to get a little dirty.

First of all, why would you want to use Amazon's S3 with the GAE? My reasoning is to avoid storing images in BigTable (since you're prevented from writing to the file system) and to circumvent Google's 1MB file size limit. If you haven't tried to provide a way to upload images in your GAE app, I suggest you follow this tutorial. I can't say that I'm a fan.

In order to complete this tutorial, you'll need an Amazon S3 account, the Google App Engine SDK, the App-Engine-Patch, Boto, and (optionally) the S3 Firefox Organizer extension.

Getting Started with S3
You'll obviously need an S3 account in order to upload files to this service. It's fairly simple; all you need is an Amazon account. I'll assume that you can figure this out on your own.

Before attempting to integrate S3 with the GAE, play around with the service first. Be sure to download the Firefox extension S3 Firefox Organizer. You can find the credentials for accessing your S3 account within your Amazon Web Services developer account (account information > access identifiers).

Keep track of those first two identifiers: your Access Key ID and your Secret Key. You'll need these to connect to S3 later on using Boto.

Create a bucket using S3 Firefox Organizer and give it a universally unique name. A bucket is a lot like a domain name: there's a high likelihood that you cannot create a bucket named "test," but you may be able to name one "donnie123." Throughout the rest of this tutorial we'll use the bucket name "donnie123," but you will have to create your own bucket with your own special and unique name.
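If you'd rather script this step than click through the Firefox extension, here's a minimal sketch of the same thing done with Boto from a plain Python shell (the credentials and the "donnie123" bucket name are placeholders for your own):

from boto.s3.connection import S3Connection

AWS_ACCESS_KEY_ID = "youraccesskey"
AWS_SECRET_ACCESS_KEY = "yoursecretkey"

# open a connection with your credentials
connection = S3Connection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)

# create the bucket (or use connection.get_bucket('donnie123') if it already exists)
bucket = connection.create_bucket('donnie123')
print bucket.name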

The App-Engine-Patch
In order to use Boto on the GAE, you could patch it to use Google's urlfetch rather than Python's httplib. However, since I'm already using the App-Engine-Patch, I already have access to a patched version of httplib that will work with the GAE.

Download the App-Engine-Patch and try running the sample. We'll build off the sample to connect to S3 using Boto.

I won't do anything tricky here; I'll simply drop the "app-engine-patch-sample" folder into my "google_appengine" folder. You can run it with your regular GAE command; mine is "python25 app-engine-patch-sample/"

Make sure that everything works for you by hitting the URL "http://localhost:8080". You should be able to perform CRUD actions on person objects.

Hacking the App-Engine and Patch
At the time of writing this tutorial, you will have to make a few small adjustments.

In %GAE_DIR%/google/appengine/api/
Comment out the logging.debug statement:
logging.debug('Making HTTP request: host = %s, '
              'url = %s, payload = %s, headers = %s',
              host, url, payload, adjusted_headers)

Then in %GAE_DIR%/app-engine-patch-sample/common/appenginepatcher/lib/
Comment out the logging.debug statement again and replace it with a pass (within def getresponse):
# 'Calling urlfetch.fetch(url=%r, body=%r, method=%r, '
# 'headers=%r, allow_truncated=%r)'
# % (url, self._body, self._method, headers, self.allow_truncated))

And modify the putheader method to:
def putheader(self, header, *lines):
    # FIXME: there's no good way to send multiple lines
    try: line = ', '.join(lines)
    except: line = ""  # D.DEMUTH catching a current error
    self.headers.append((header, line))

I suspect the putheader error will be corrected in the future, but for now it's something that may or may not affect your code.

By the way, we're commenting out the Debug messages because when you run your GAE instance, your console will be bombarded with the string output of the uploaded file.
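If you'd rather not touch the SDK source at all, an alternative I haven't verified with this setup (so treat it as a sketch) is to raise the logging level so that DEBUG messages are simply suppressed:

# sketch: hide logging.debug(...) noise instead of commenting it out in the SDK
# put this somewhere that runs early, e.g. your settings module
import logging
logging.getLogger().setLevel(logging.INFO)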

Adding Boto
Adding Boto to your App Engine project is as easy as pie. Just drop the main "boto" folder into %GAE_DIR%/app-engine-patch-sample.
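As a quick sanity check (just a sketch, assuming the layout described above), you can confirm that Python picks up the bundled copy:

# run from a Python shell inside the app-engine-patch-sample folder;
# __file__ should point at the "boto" folder you just dropped in
import boto
from boto.s3.connection import S3Connection
print boto.__file__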

Reading from your S3 Bucket
Let's try reading the pictures that you saved in your S3 bucket. First, modify the urls file in %GAE_DIR%/app-engine-patch-sample/myapp/ and add:
# tests
(r'^boto_read/$', 'boto_read'),
Now modify the views file in %GAE_DIR%/app-engine-patch-sample/myapp/ and add:
from boto.s3.connection import S3Connection
from boto.s3.key import Key
# render_to_response here is app-engine-patch's helper (ragendja.template),
# which the sample's views already use

AWS_ACCESS_KEY_ID = "youraccesskey"
AWS_SECRET_ACCESS_KEY = "yoursecretkey"

def boto_read(request):
    # connect to S3 with the credentials above
    connection = S3Connection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)

    bucket = connection.get_bucket('donnie123')
    keys = bucket.get_all_keys()
    key_urls = []
    for fk in keys:
        # build a url for each key by hand
        fk.make_public()  # just in case it's not public
        key_urls.append("http://%s.%s/%s" % (bucket.name, connection.server, fk.key))

    values = {'key_urls': key_urls}
    return render_to_response(request, "boto_read.html", values)
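As a side note, instead of gluing the URL together by hand, Boto can generate (signed) URLs for you, depending on your Boto version; the generate_url call below is only a sketch, not what I'm doing above:

# alternative sketch: have boto build a signed url that's valid for one hour
for fk in keys:
    key_urls.append(fk.generate_url(3600))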

Within your templates directory, add a simple template named boto_read.html
{% for k_url in key_urls %}
<p><img src="{{k_url}}" /></p>
{% endfor %}

With the above, the link http://localhost:8080/person/boto_read will display all of the images that you have in your S3 Bucket.

Writing to your S3 Bucket
This is a bit more difficult. Add one more line to your urls file.
# tests
(r'^boto_read/$', 'boto_read'),
(r'^simple_upload/$', 'simple_upload'),

Create a new template named simple_upload.html
<form action="/simple_upload/" enctype="multipart/form-data" method="post">
<div><input name="img" type="file"></div>
<div><input value="upload image" type="submit"></div>
</form>

And in the same views file, add:
def simple_upload(request):
    if request.method == "GET":
        return render_to_response(request, "simple_upload.html")

    if request.method == "POST":
        if 'img' in request.FILES:
            img = request.FILES['img']

            # connect and push the uploaded file up to S3
            connection = S3Connection(AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
            bucket = connection.get_bucket('donnie123')
            new_key = Key(bucket)
            new_key.key = img.name  # use the uploaded file's name as the key
            new_key.set_contents_from_string(img.read())
            new_key.make_public()

        # HttpResponseRedirect comes from django.http
        return HttpResponseRedirect("/boto_read")

Here, the url http://localhost:8080/person/simple_upload will present you with a form to upload a file. When you submit the form, the code connects to S3 using the supplied credentials and attempts to upload the file.

If you left the debug statements in, you may see a lot of debugging info on your console. In order to make the code run faster during testing, I also removed all of the boto.log statements and replaced them with pass.
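Rather than ripping out Boto's log statements, you might get the same effect by turning its logger down; again this is only a sketch and assumes Boto routes its logging through the standard 'boto' logger:

# sketch: quiet boto's logger instead of deleting its log statements
import logging
logging.getLogger('boto').setLevel(logging.CRITICAL)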

  1. # by Todd Hoff - February 9, 2009 at 8:00 AM

    I'm curious how big a file you can really upload given GAE's limits, because you have to loop reading and writing to S3, which I assume will take quite a while.

  2. # by Donnie Demuth - February 15, 2009 at 10:21 PM

    Hi Todd, I haven't run into any issues so far. I've been able to upload 3MB photos and push them to my S3 bucket.

    GAE has a limit of storing 1MB files in BigTable, and you can't write to the file system either. Apparently, though, they don't stop you from routing a larger file to some other destination.

  3. # by Donnie Demuth - February 15, 2009 at 10:25 PM

    Oh and the wait isn't too noticeable. I think I'll just have the handler return immediately and not actually wait for the response after pushing the file off to S3.