Reducing Web Server load using Amazon S3

Anyone who runs a website will know that, hopefully, it eventually becomes so large and popular that one server is simply not enough to host all the content or handle the load thrown at it. A common remedy is to add more servers and load balance across them. But what if you can’t afford more servers? There is a very cheap alternative: Amazon S3, a cloud storage service provided by Amazon Web Services that offers useful extras such as access control, public access, and custom headers. The ultimate goal would be a fully fledged Content Delivery Network (CDN), but for starters Amazon S3 easily does the trick, and all you pay for is the storage space you use and the data you actually transfer.

So how does this help? By placing your content (images, video, even CSS) on Amazon S3 and linking to it with an S3 address, the end user pulls that content straight from Amazon S3. That reduces the number of connections your server has to accept and the amount of data it has to send, so it can answer other requests faster. On top of that, you can set Cache-Control headers on the files so clients cache them locally, which saves you the cost of the end user requesting the same file over and over, and makes the site faster for them too.
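
For example, uploading a single file with boto, marking it public, and giving it a week-long cache header takes only a few lines. This is a minimal sketch; the credentials, bucket name, and file names are placeholders:

:::python
from boto.s3.connection import S3Connection
from boto.s3.key import Key

connection = S3Connection('API_KEY', 'API_SECRET')
bucket = connection.get_bucket('static.example.com')  # hypothetical bucket

key = Key(bucket, 'images/logo.png')  # hypothetical object name
key.set_contents_from_filename(
    'logo.png',
    headers={'Cache-Control': 'max-age=604800'},  # cache for 7 days
    policy='public-read'  # let browsers fetch it directly
)
print(key.generate_url(0, query_auth=False))  # plain, unsigned public URL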

I use Amazon S3 on my blog, and by naming the S3 bucket after a CNAME on my domain I can use a nice URL for the content, making it look highly personalised. If you use WordPress, there are a number of plugins that provide Amazon S3 integration; my favourite is W3 Total Cache, which uploads the files it thinks should be served statically and automatically rewrites the URLs to them. If you later switch to Amazon CloudFront, it makes that change easy as well.
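
For the CNAME trick to work, the bucket name has to match the hostname exactly. A rough sketch, assuming a hypothetical static.example.com subdomain:

:::python
from boto.s3.connection import S3Connection

connection = S3Connection('API_KEY', 'API_SECRET')
# the bucket name must be identical to the hostname you will serve from
connection.create_bucket('static.example.com')

# then point the hostname at S3 in your DNS, e.g.:
#   static.example.com.  CNAME  static.example.com.s3.amazonaws.com.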

So if your website is struggling under heavy traffic, especially for images and other static files, try moving them to Amazon S3. And once the content is in there, if you decide you need the added power of a CDN, Amazon CloudFront is extremely simple to set up with your pre-existing S3 content as the origin.

Updating Metadata on Amazon S3 objects

So I host the static content for my blog on Amazon S3, the Simple Storage Service. This takes some of the static-content load off my server. However, it means that over time I pay for the S3 hosting, and with a lot of requests this could end up costly. So how do I get around this? By setting the Cache-Control header on the objects in S3, I can ensure that the static content is cached by the remote user for however long I want; in this case I have set it to 7 days. Updating all the files in S3 manually would take a long time, so I use the Python code below to update the objects in my S3 bucket.

I had to modify it to add Content-Encoding support, as I gzip some of the static content to reduce the amount of data that needs to be transferred.

:::python
from boto.s3.connection import S3Connection

connection = S3Connection('API_KEY', 'API_SECRET')

buckets = connection.get_all_buckets()

for bucket in buckets:
    for key in bucket.list():
        print('%s' % key)
        # pick a Content-Type (and, for pre-gzipped files, a
        # Content-Encoding) from the extension; skip anything else
        encoding = None
        if key.name.endswith('.jpg'):
            contentType = 'image/jpeg'
        elif key.name.endswith('.gif'):
            contentType = 'image/gif'
        elif key.name.endswith('.png'):
            contentType = 'image/png'
        elif key.name.endswith('.css.gzip'):
            encoding = 'gzip'
            contentType = 'text/css'
        elif key.name.endswith('.js.gzip'):
            contentType = 'application/x-javascript'
            encoding = 'gzip'
        elif key.name.endswith('.css'):
            contentType = 'text/css'
        elif key.name.endswith('.js'):
            contentType = 'application/x-javascript'
        else:
            continue
        # always set the type and a 7 day cache lifetime (604800 seconds)
        key.metadata.update({
            'Content-Type': contentType,
            'Cache-Control': 'max-age=604800'
        })
        if encoding is not None:
            key.metadata['Content-Encoding'] = encoding
        # S3 metadata cannot be edited in place, so copy the object over
        # itself with the new metadata, then restore the public ACL
        key.copy(key.bucket.name, key.name, key.metadata)
        key.set_acl('public-read')
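
To confirm the copy took effect, you can re-fetch a key and inspect its headers: boto's get_key() issues a HEAD request and populates the corresponding attributes. A quick check, assuming a hypothetical bucket and object name:

:::python
bucket = connection.get_bucket('static.example.com')  # hypothetical names
key = bucket.get_key('css/style.css')
print(key.content_type, key.cache_control, key.content_encoding)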