Synching Your Amazon S3 Asset Host using Capistrano
Note: This article is out of date. The latest version of this article is on the new permanent page in the projects section.
So you’ve got multiple asset hosts running in your Rails application, and you’re using Amazon’s S3 to host your assets. Now you want to make sure that your assets are kept up to date. This plugin is a Capistrano recipe that keeps the asset hosts synchronized with the public directory in your Subversion repository.
Usage
After you get everything set up and do your first deploy, just run cap deploy as normal. All changed files in RAILS_ROOT/public will be uploaded to all of your asset host buckets before the final deploy:symlink task.
The following tasks are also available:
- cap s3_asset_host:get_s3_revision
- cap s3_asset_host:find_changed
- cap s3_asset_host:list_changed
- cap s3_asset_host:find_all
- cap s3_asset_host:upload_changed
- cap s3_asset_host:upload_all
- cap s3_asset_host:upload
- cap s3_asset_host:reset_and_upload
- cap s3_asset_host:setup
- cap s3_asset_host:create_buckets
- cap s3_asset_host:delete_all
- cap s3_asset_host:connect
You can get documentation on these tasks by running cap -T.
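To make "changed files" concrete, here is a minimal sketch of one way to pick out uploadable files from Subversion output. It assumes the status-plus-path line format printed by svn diff --summarize, and changed_public_files is a hypothetical helper; it is not the code the plugin uses in lib/scm.rb.

```ruby
# Hypothetical helper (not part of the plugin): given lines in the
# "STATUS   path" format printed by `svn diff --summarize`, return the
# added or modified paths that live under public/.
def changed_public_files(summary_lines, prefix = "public/")
  summary_lines.map do |line|
    # Column 1: A (added), D (deleted), M (modified) or blank.
    # Optional column 2: M for a property change. Trailing whitespace is dropped.
    match = line.match(/\A([ADM ])[M ]?\s+(.+?)\s*\z/)
    next unless match
    status, path = match[1], match[2]
    # Deleted files have nothing to upload; files outside public/ are not assets.
    next unless %w[A M].include?(status) && path.start_with?(prefix)
    path
  end.compact
end

lines = [
  "M       public/stylesheets/site.css",
  "A       public/images/logo.png ",
  "D       public/old.js",
  "M       app/models/user.rb"
]
puts changed_public_files(lines)
```

Stripping trailing whitespace from each path before using it avoids false "file not found" results when the list is later checked against the file system.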
Requirements
This plug-in is a Capistrano extension. It requires Capistrano 2.0.0 or greater.
You will also require the aws-s3 gem.
So far, this plug-in:
- assumes that you are using the ‘checkout’ method of deployment.
- only works with svn.
If you are using another version control system, I think all you’ll have to change are the two methods in lib/scm.rb. If you do get something other than svn working, please let me know.
If you want to use more than one asset host, then you have to either install the multiple asset hosts plugin or upgrade to Rails 2.0 (see setting up multiple asset hosts in Rails).
Setup
To set up, you need to do the following:
- Install the plug-in
- Install the AWS-S3 gem.
- Set up your Rails application to use asset hosts.
- Set up your asset hosts.
- Configure Capistrano.
Installing the plug-in
From RAILS_ROOT, run:
script/plugin install svn://svn.spattendesign.com/svn/plugins/synch_s3_asset_host
Installing the AWS-S3 gem
You need to do this on both your local computer and the computer that is defined as the asset_host_syncher (see Capistrano Configuration, below).
$> sudo gem install aws-s3
Setting up your Rails app to use asset hosts
Single asset host
For a single asset host, simply add the following line to RAILS_ROOT/config/environments/production.rb:
config.action_controller.asset_host = "http://assets.example.com"
Multiple asset hosts
Follow the instructions in setting up multiple asset hosts in Rails.
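As a rough illustration of what multiple asset hosts do, the sketch below spreads asset paths over four hosts. It assumes the Rails 2.0 convention of a %d placeholder in the asset host that expands to 0 through 3. The asset_host_for helper and the use of Zlib.crc32 are assumptions made for this example; Rails picks the number with its own hashing.

```ruby
require 'zlib'

# Assumed values for illustration only.
ASSET_HOST_PATTERN = "http://assets%d.example.com"
NUM_ASSET_HOSTS = 4

# Hypothetical helper: map an asset path to one of the hosts. The same
# path always maps to the same host, so browsers can cache each asset.
def asset_host_for(source, pattern = ASSET_HOST_PATTERN, count = NUM_ASSET_HOSTS)
  pattern % (Zlib.crc32(source) % count)
end

puts asset_host_for("/images/logo.png")
puts asset_host_for("/stylesheets/site.css")
```

Because any asset can be requested from any of the hosts, every bucket needs a full copy of RAILS_ROOT/public, which is what this plugin takes care of.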
Setting up your asset hosts
Set up a CNAME entry for each asset host pointing to s3.amazonaws.com. How you do this depends on your domain host. Here’s what it looks like on easydns

You may need to wait up to 24 hours for the DNS entries for these new hosts to propagate.
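Each asset host needs its own CNAME entry, so it can help to list the host names up front. A small sketch, assuming the assets%d.example.com pattern from the Capistrano configuration further down and four hosts (adjust both to your setup); asset_host_names is a hypothetical helper, not part of the plugin.

```ruby
# Hypothetical helper: expand an asset host pattern into concrete host names.
# A pattern without %d is treated as a single asset host.
def asset_host_names(pattern, count = 4)
  return [pattern] unless pattern.include?("%d")
  (0...count).map { |n| pattern % n }
end

# Each printed name needs a CNAME entry.
asset_host_names("assets%d.example.com").each { |host| puts host }
```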
Configuring Capistrano
Capistrano installation
This plugin requires Capistrano 2.0.0 or greater.
To upgrade to the latest version (currently 2.1.0):
$> gem install capistrano
Once the plug-in is installed, make sure that Capistrano can see the recipes:
$> cap -T | grep s3_asset_host
should return a bunch of tasks. If you don’t see anything listed, then you need to update your Capfile by doing the following (this is from Jamis Buck):
$> cd RAILS_ROOT
$> rm Capfile
$> capify .
If you do not want to delete your Capfile, or if you are using Capistrano 2.0.0, add the following line to your Capfile:
Dir['vendor/plugins/*/recipes/*.rb'].each { |plugin| load(plugin) }
Capistrano configuration
Create a new file in RAILS_ROOT/config called synch_s3_asset_host.rb. Add the following lines to it, and edit to suit:
# =============================================================================
# S3 ASSET HOST OPTIONS
# =============================================================================
set :asset_host_name, "assets%d.example.com"
set :aws_access_key, "your Amazon AWS access key" # You can also set this in your environment as AMAZON_ACCESS_KEY_ID
set :amazon_secret_access_key, "your Amazon AWS secret" # You can also set this in your environment as AMAZON_SECRET_ACCESS_KEY
# set :dry_run, false # Set to true if you want to test the asset_host uploading without doing anything on Amazon S3
before "deploy:symlink", "s3_asset_host:upload_changed"
You have to do one more thing: in RAILS_ROOT/config/deploy.rb, specify one of your web hosts as an “asset_host_syncher”, like this:
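The comments in that file mention that the keys can come from the environment instead. Here is a minimal sketch of such a lookup, assuming an explicit setting wins over the environment variable. s3_credential is a hypothetical helper, and the plugin's own lookup order may differ.

```ruby
# Hypothetical helper: prefer an explicitly configured value, fall back to
# the named environment variable, and fail loudly if neither is set.
def s3_credential(explicit_value, env_name, env = ENV)
  value = explicit_value
  value = env[env_name] if value.nil? || value.strip.empty?
  if value.nil? || value.strip.empty?
    raise ArgumentError, "Set #{env_name} in your environment or in config/synch_s3_asset_host.rb"
  end
  value
end

# Example with a stand-in environment hash instead of the real ENV.
fake_env = { "AMAZON_ACCESS_KEY_ID" => "key-from-environment" }
puts s3_credential(nil, "AMAZON_ACCESS_KEY_ID", fake_env)
```

Keeping the secret in the environment also keeps it out of your repository.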
role :web, webserver1, :asset_host_syncher => true
The first deploy
Commit all changes to your Rails application and do the initial bucket setup:
$> cap s3_asset_host:setup
$> svn commit -m "Adding synch_s3_asset_host plugin"
$> cap deploy
This will do the following:
- Create your Amazon S3 buckets.
- Upload everything in RAILS_ROOT/public (in your svn repository) to each bucket.
- Set the revision in each bucket to the latest revision in your repository.
This could take a while if you have lots of images or other big files.
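The three steps above can be sketched in plain Ruby against an in-memory stand-in for S3. Everything here is hypothetical (the FakeS3 class, the first_deploy method and the REVISION key name); the plugin's real tasks talk to Amazon S3 and may store the revision differently.

```ruby
# In-memory stand-in for S3, used only to illustrate the order of operations.
class FakeS3
  attr_reader :buckets

  def initialize
    @buckets = {}
  end

  # Creating a bucket that already exists leaves its contents alone.
  def create_bucket(name)
    @buckets[name] ||= {}
  end

  def store(bucket, key, data)
    @buckets.fetch(bucket)[key] = data
  end
end

REVISION_KEY = "REVISION" # assumed name for the per-bucket revision marker

# One bucket per asset host: create it, upload every file, record the revision.
def first_deploy(s3, bucket_names, files, revision)
  bucket_names.each do |bucket|
    s3.create_bucket(bucket)
    files.each { |key, data| s3.store(bucket, key, data) }
    s3.store(bucket, REVISION_KEY, revision.to_s)
  end
end

s3 = FakeS3.new
files = { "images/logo.png" => "PNG bytes", "stylesheets/site.css" => "body {}" }
first_deploy(s3, ["assets0.example.com", "assets1.example.com"], files, 42)
puts s3.buckets.keys.inspect
```

Recording the revision in each bucket is what lets later deploys upload only the files that changed since that revision.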
You’re done!
That should do it. Now, every time you run cap deploy, your asset hosts should be updated with any changes to files in RAILS_ROOT/public.
Let me know if you have any problems, suggestions or comments.
Comments
-
Scott, The plugin looks awesome. I plan to try it out on my project. But before I use this, I need to upgrade the project to Capistrano 2. The project is still using capistrano 1.4.1 Thanks for the plugin. Will post my experiences here.
-
Manik, That's great! Please do let me know how it goes. Hopefully the Capistrano upgrade goes smoothly. If you run into snags, feel free to send me an e-mail; I just did it a month ago, so it's still pretty fresh. Scott
-
Hi, I'm having some problems with the plugin. When it generates the FILES_TO_UPLOAD file, it lists some directories that don't exist in my public directory, such as: /var/www/myapp/releases/20071121005146/public/trunk/public /var/www/myapp/releases/20071121005146/public/trunk/vendor/rails/actionpack/test/fixtures/public /var/www/myapp/releases/20071121005146/public/trunk/myapp/trunk/public and so on.... So I get errors from the upload_to_s3 script that these paths don't exist. Any idea why I'm getting this or how to fix it? I'm using svn 1.4.4 and cap 2.1.0. Thanks, Chris
-
Well, I edited the code that generates the FILES_TO_UPLOAD file and had it ignore any file with the word trunk in it. That fixed some of the errors. However, now I'm still getting file not found errors for other files in the list. I notice that it's listing files that were in my repository years ago but were deleted a long time ago. This also causes file not found errors when it tries to upload to s3. My guess is that it has something to do with lib/scm.rb file in the plugin? -Chris
-
Hi Chris, Thanks for the bug reports. I made a change to the upload_to_multiple_buckets script that should fix both problems. Can you try it out and let me know how it goes? You can update the plugin by running:
script/plugin install --force svn://svn.spattendesign.com/svn/plugins/synch_s3_asset_host
-
Hi Scott, We're getting there ;-). All the file not found errors are gone now, except now I've noticed a few of my files aren't being uploaded to s3. Upon more investigation, I noticed that some of the files in the FILES_TO_UPLOAD file have a trailing space at the end of the name. With your new fix that checks for file existence, the trailing space in the name causes it to return false, i.e.
>> File.exist?('/Users/chris/test.log')
=> true
>> File.exist?('/Users/chris/test.log ')
=> false
Anyways, everything should be good if you trim the trailing spaces in the file name before checking if it exists. Best, Chris
-
Chris, thanks again for your help here. I've made a change that I think should squash that bug, but can't really test it. Can you try it again and let me know? Scott
-
Hi Scott, Perfect, works like a charm now!! One other suggestion I had: have you considered gzipping text files like the css, js, etc. before uploading them to s3? You can serve them from S3 compressed (cheaper and faster), and then the client's browser will decompress them. Here's an example of someone doing it on s3: http://devblog.famundo.com/articles/2007/03/02/serving-compressed-content-from-amazons-s3 I tried it out and it seems to work well. Only downside I guess is that it doesn't work in really old browsers. -Chris
-
Chris, Great! Glad to hear it works for you. I took a look at the compression idea. It looks pretty simple to do, but I'm going to have to think about it a bit. I was thinking of migrating to using s3synch.rb to do the uploading, and I'd need to figure out an elegant way to do the compression with s3synch.rb. Thanks, Scott
-
Hi Scott, Actually I spoke too soon. I was going through my site with s3 and noticed that there were some images that it didn't upload. I'm not sure why unfortunately. The files don't appear in the FILES_TO_UPLOAD file, so it must have something to do with the information it's getting from svn. What I can tell you is that these images once resided in 'public/images/folder'. Over the course of the project, I issued an svn move command so they now reside in public/images/folder/another_folder. The directory public/images/folder/another_folder/ does appear in the FILES_TO_UPLOAD file, but none of the files inside the folder appear in the list...
-
Hi Scott, Still trying to figure out why those files aren't being included. My best guess is that it has something to do with issuing svn move commands on directories. It looks like you have code to support moves of files but not directories... I had it print out the name of the files on line 34 on scm.rb and it never printed out those missing files. What does appear though is this: A /trunk/myapp/trunk/public/images/folder/another_folder (from /trunk/myapp/trunk/public/images/another_folder:591) Hopefully this helps a bit. -Chris
-
Hi Chris, Hmm. I was never happy with the way I was finding the files to upload, and it looks like you're paying the price. Sorry about that. I'm going to spend some time over the next couple of days updating the code to use s3synch.rb. This should take care of your problems, and make it work with version control systems other than svn as well. I'll post something when I get it worked out. Thanks for all of your help. Scott
-
Hi Scott, Ya, you're right, it seems like svn isn't the best route for finding files to upload after all. I'm not sure there's any clean solution to the svn move directory problem. There were some other problems I ran into as well if you're curious. Since I use the asset packager plugin for rails, the plugin generates all the compressed css/js in another capistrano recipe before deploying, so the assets it creates are never committed to svn (thus never caught by your plugin). I ended up modifying the packager plugin to upload its assets to S3 as well. If you haven't checked it out: http://synthesis.sbecker.net/pages/asset_packager the packager plugin is pretty handy. Anyways, the s3sync sounds like a better route. As I understand it, it's like rsync for s3, right? Look forward to the update! Best, -Chris
-
Hey Scott, I was playing around with s3sync. Pretty nifty and easy. I whipped up a quick capistrano task using it. Thought I'd paste it below, might save you some time if you haven't delved into it yet. I'm having it ignore files/directories that start with ., or contain .svn, and .DS_Store (ya i'm a mac guy ;-). Anyways hope it helps. Seems like line breaks don't work in your blog, so hopefully the code below is readable after copy&pasting. Unfortunately s3sync doesn't do gzipping of assets, so I'm gonna look into how to add it to s3sync.
NUM_ASSET_HOSTS = 4
ASSET_HOSTS = "assets%d.mysite.com"
namespace :s3 do
  desc "Sync S3"
  task :sync, :roles => :web, :only => {:asset_host_syncher => true} do
    (0...NUM_ASSET_HOSTS).each do |n|
      run "cd #{release_path}/vendor/gems/s3sync/ && ./s3sync.rb -sprv --exclude='(\\.svn)|(\\.DS_Store)|(^\\.)' --cache-control='max-age=604800' #{release_path}/public/ #{ASSET_HOSTS % n}:"
    end
  end
end
-
Chris, thanks for the code. That looks like a nice way to do it. I may just steal it verbatim if that's okay with you. Let me know if you get it zipping assets, too. -- Scott
-
Hey Scott, Sure, feel free to use it! I'll let you know about the gzipping once I look into the s3sync.rb code some more. I don't see any way to really do it without hacking s3sync.rb up unfortunately. But I will post my findings once I look into it more. Best, Chris
-
Just in case you didn't notice at the top, I've updated this article and put it in a new, permanent location at http://spattendesign.com/projects/synching-your-amazon-s3-asset-host-using-capistrano. The latest version uses s3sync.rb and should work much better. -- Scott
-
This did everything perfect until I tried to load my site. The ACLs on all the files did not allow them to be world-readable. I don't know why. I just went in with irb and AWS::S3 to fix the ACLs after the fact, but I get the feeling that I must have done something wrong.
-
Fixed my problem with the ACLs not being set to public access by adding the --public-read flag to the options passed to s3sync.rb on line 183 of recipes/synch_s3_asset_host.rb. The plugin still saved me lots of time. Thanks.
-
Oops. Matt, thanks for pointing that out. I made the same change to the plugin in the svn repository. Stuff like this makes me wish I had found a better way to unit test this plugin. -- Scott
