Revolution On Rails: deployment

Monday, May 14, 2007

Capistrano Off the Beaten Path

Introduction

If you use Capistrano, you most likely use it to deploy Rails applications by running it from the project directory. The plugem management tool piggybacks on Capistrano to execute recipes that do not depend on the current path, because they operate on gems. The recipes do not ignore the current directory completely, however: they use it for optional customization. Customization is not limited to a deployment recipe in the current directory; it can also be environment-specific across multiple sites if each site has a special extension gem installed.

Background

We use a bunch of Ruby scripts at RHG for deployment and delivery of our numerous applications and shared components. When we started preparing some of that functionality for a public release, we needed to provide a way to customize the scripts. Warren Konkel suggested migrating them to Capistrano, since it is the tool most familiar to Rails developers and its recipe hierarchy can be used for customization. So we wrote the plugem command tool, which feeds its parameters to Capistrano via Capistrano::CLI.new(plugem_converted_arguments).execute! after loading its own recipes.

The Code

    1 module Capistrano
    2   class Configuration
    3
    4     alias :standard_cap_load :load
    5
    6     def load(*args, &block)
    7
    8       standard_cap_load(*args, &block)
    9      
   10       if args == ["standard"]
   11
   12         load_plugem_deploy_recipes(File.dirname(__FILE__) + '/../..')
   13
   14         begin
   15           require 'plugems_deploy_ext'
   16           load_plugem_deploy_recipes(PLUGEMS_DEPLOY_EXT_DIR) # Overriding from extensions
   17         rescue Exception
   18           # No extension is loaded
   19         end
   20
   21       end
   22
   23     end
   24    
   25     def load_plugem_deploy_recipes(dir)
   26       Dir[File.join(dir, 'recipes', '*')].each { |f| standard_cap_load(f) }
   27     end
   28      
   29   end
   30 end

Ignoring lines 14-19 for now, all the code does is inject the plugem deployment gem's recipes into Capistrano, making them available for execution. So when I call plugem update my_app, it is translated to cap plugem_update -s plugem_name=my_app (we use the plugem_ namespace for tasks and variables to avoid clashing with standard recipes). Since plugem recipes are purposely not loaded via Capistrano's -f flag, standard deployment recipes such as config/deploy.rb are still loaded.
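
The command translation described above could look roughly like the following sketch. The method name plugem_to_cap_args is hypothetical and this is not the actual plugem source; only the plugem_ task prefix and the -s plugem_name=... form come from the text.

```ruby
# Hypothetical sketch: convert "plugem <command> <name>" arguments into
# the argument list handed to Capistrano. Not the actual plugem source.
def plugem_to_cap_args(argv)
  command, name, *rest = argv
  raise ArgumentError, 'usage: plugem <command> [name]' if command.nil?

  args = ["plugem_#{command}"]                   # tasks live in the plugem_ namespace
  args += ['-s', "plugem_name=#{name}"] if name  # pass the gem name as a variable
  args + rest                                    # leave any other arguments untouched
end
```

With this sketch, plugem_to_cap_args(%w[update my_app]) yields the arguments for cap plugem_update -s plugem_name=my_app.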

Customization

The package extends Capistrano, but we wanted it to be customizable too. For example, you might want to define your own list of gem servers to download gems from. Capistrano lets you define recipes per project, user, or host. plugems_deploy takes it a step further and provides a way to customize recipes per deployment environment, which might contain many hosts. It does this by looking for an optional plugems_deploy_ext gem installed on the system (lines 14-19). If it finds the extension gem, it loads the recipes from there, overriding the default ones.

Conclusion

Capistrano is not only a great deployment tool; it can also serve as the base for highly customizable general-purpose tools.

Friday, March 09, 2007

DRYing Up Configuration Files

Introduction

Our post about the deployment process explains that some configuration files are overridden at deployment time. Maintaining those files can quickly become a nightmare unless the development team constantly reviews them so that only the pieces that differ end up in the deployment-specific configs, i.e. keeps them DRY. It has been a long journey for us, and we are still adjusting the config file usage for our project.

ConfigurationLoader

We access all configuration files from our code via ConfigurationLoader (and, yes, it is a plugem itself). It provides some convenience methods for major configs. For example, to establish a non-default connection from a model, this piece of code could be used:

establish_connection ConfigLoader.load_db_config['secondary_db']

The load_db_config method knows that database.yml is sectioned per environment, so it loads the file and returns the section corresponding to the current RAILS_ENV. ConfigLoader is an instance of ConfigurationLoader with caching enabled. Since we use a lot of configuration-driven parameters in our models and controllers, caching this info saves unnecessary file system calls and ERB parsing.
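
A minimal sketch of such a caching loader is shown below. The real ConfigurationLoader API is not shown in this post, so the class name SketchConfigLoader, its constructor arguments, and the internal cache are all assumptions made for illustration; only load_db_config and load_file are names that appear in the posts.

```ruby
require 'erb'
require 'yaml'

# Hypothetical sketch of a caching loader; every name here other than
# load_db_config and load_file is an assumption, not the real plugem API.
class SketchConfigLoader
  def initialize(config_dir, env)
    @config_dir = config_dir
    @env = env
    @cache = {}
  end

  # Read a file, run it through ERB, parse the YAML, and remember the result
  # so later calls skip both the file system and the ERB pass.
  def load_file(name)
    @cache[name] ||= begin
      raw = File.read(File.join(@config_dir, name))
      parse_yaml(ERB.new(raw).result)
    end
  end

  # database.yml is sectioned per environment, so return only our section.
  def load_db_config
    load_file('database.yml')[@env]
  end

  private

  # Newer Psych versions reject YAML aliases in YAML.load, so prefer
  # unsafe_load when it exists (the config files here are trusted).
  def parse_yaml(text)
    YAML.respond_to?(:unsafe_load) ? YAML.unsafe_load(text) : YAML.load(text)
  end
end
```

In a Rails application the env argument would presumably come from RAILS_ENV, and a model would index the returned section by connection name, as in the establish_connection line above.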

DRYing Up By Keeping Configs Close to The Source

Another property of ConfigurationLoader is that it looks not only in the application config directory but also in the configs of plugins and gems. All configs found are merged (for the usual Ruby config structures like Hash and Array) in the order gems<-plugins<-application. This lets us keep the default configuration within a gem so it can be shared between multiple applications while still letting applications overwrite it if needed. Together with the power of deployment-time overriding, which applies to shared components as well, it helps us keep only application-specific entries in an application's config files.

The following is a concrete example from today: We have been using the Browser Logger as a plugin in our applications. Aaron decided to convert it into a plugem. We obviously don't want it enabled in our production deployment environment, but we want to be able to see logs through a browser on developers' workstations and in QA. The gem initializer loads browser_loggger.yml via ConfigurationLoader to determine whether it is enabled or not. It is not DRY to put browser_loggger.yml in every application that uses Browser Logger. Instead, the gem itself contains config/browser_loggger.yml, with the property enabled; config/deployment/prod-browser_loggger.yml, with the property disabled; and config/deployment.yml, with a single entry 'files: browser_loggger.yml', which defines that config/browser_loggger.yml will be overridden on production boxes with the content of the second file (see When Capistrano Is Not Enough for details on how deployment-time configuration works).
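
The gems<-plugins<-application merge order described in this section can be sketched as follows. The method names are hypothetical, and the post does not say how the real loader combines arrays; concatenation is assumed here.

```ruby
# Hypothetical sketch of the gems <- plugins <- application merge order.
# How the real loader combines arrays is not stated; concatenation is assumed.
def merge_config(base, override)
  if base.is_a?(Hash) && override.is_a?(Hash)
    base.merge(override) { |_key, old, new| merge_config(old, new) }
  elsif base.is_a?(Array) && override.is_a?(Array)
    base + override
  else
    override # scalars or mismatched types: the later source wins
  end
end

# Sources are passed from lowest to highest precedence; missing ones are nil.
def merged_config(gem_cfg, plugin_cfg, app_cfg)
  [gem_cfg, plugin_cfg, app_cfg].compact.inject { |acc, cfg| merge_config(acc, cfg) }
end
```

Under these assumptions an application only needs to list the keys it changes; everything else falls through to the defaults shipped in the gem.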

DRYing Up By Extracting Changing Parts in A Separate File

Our Rails applications call a lot of backend services. Service configuration is stored in the service.yml file. Service endpoints usually follow the same naming conventions for port and path across deployment environments. So at some point we extracted the host names into a separate file (service-hosts.yml) and changed our deployment.yml to override that file instead of service.yml at deployment time. Since service.yml is ERB-processed, we leverage that to load the host file:


<% hosts = ConfigLoader.load_file('service-hosts.yml') %>

service_cfg: &service_cfg

  foo:
    url: http://<%= hosts['foo'] %>:8080/services/foo
    timeout: 5

  bar:
    url: http://<%= hosts['bar'] %>:8080/services/bar
    timeout: 5

production: *service_cfg
development: *service_cfg
test: *service_cfg
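
To illustrate why only the hosts file needs to change between environments, here is a small sketch that renders a trimmed-down copy of the template above with plain ERB. The hosts hash stands in for the parsed service-hosts.yml; the method name and the host names used with it are made up for the example.

```ruby
require 'erb'
require 'yaml'

# Trimmed-down copy of the service.yml template above. The hosts hash
# stands in for the parsed service-hosts.yml; nothing here is RHG code.
TEMPLATE = <<~YML
  foo:
    url: http://<%= hosts['foo'] %>:8080/services/foo
    timeout: 5
YML

# Render the template against one environment's host list and parse it.
def render_services(hosts)
  YAML.load(ERB.new(TEMPLATE).result(binding))
end
```

Calling render_services with a QA host list and then with a production one returns the same structure with different URLs, while the port, path, and timeout stay in one place.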

Conclusion

Our experience has shown that the combination of deployment-time and runtime configuration is powerful enough to handle most use cases we have had so far. There is always room for improvement, however. For example, we recently ran into a problem when one of the production boxes needed one of the parameters in service.yml set to a different value than on the rest of the production boxes. We can address it within the current framework by copying the content of prod-service-hosts.yml to prod-box-1-service-hosts.yml and changing the parameter there. Since the host name (prod-box-1) takes precedence over the host class name (prod), that would fix the issue, but the maintenance cost of two almost identical files is high. An alternative solution would be to enhance the framework to allow additional runtime overriding of service.yml values from service-override.yml. Then we could create prod-box-1-service-override.yml containing a single entry with the parameter it needs to override. Work in progress...

Wednesday, March 07, 2007

Gem-based Deployment and Delivery: Part 2 - Distribution via Gem-servers

Part of the Gem-based Development Process series

The Idea

Once we packaged all of our applications and shared components as gems, it did not take us long to realize that we could leverage the native method of gem distribution - gem servers - to build an infrastructure for pushing products through the dev/qa/production deployment stack. The idea of gem promotion was born.

There are three layers of gem servers - one per deployment environment (dev/qa/production). Gems can be promoted from the upstream gem server to the downstream one. The direction of promotion is dev->qa->prod. The individual boxes within the deployment environment install gems only from the server that serves their environment. For example, when a QA person installs gems on qa7-rails & qa8-rails boxes, he uses the qa-gems server as the source (as in gem install xyz --source qa-gems --remote). The dev gem server is the base one where freshly built gems are pushed to.

The Implementation

Since our QA and Operations teams wanted to have full control over what they have on their gem servers, we adopted a push-pull model instead of a simpler push-only one (in which a gem is promoted by pushing it from the upstream server to the downstream one). In our current model, developers first push gems to the base (dev) gem server. After a gem is tested and QA-ready, a manifest file containing the specific gem name and version (together with the names and versions of all dependent gems) is generated and sent to QA. A QA installer uses the manifest to pull gems from the upstream dev gem server to the downstream QA one. He then goes to the individual QA boxes and upgrades the gems there. When QA clears the build, the manifest is sent to Operations to repeat the procedure in production, with the QA gem server being the upstream one.

We greatly simplified the usage by hiding decisions about upstream/downstream, the source gem, etc. behind a single tool that was deployment environment aware. For example, to pull down gems based on the manifest to the qa gem server, the installer issues a command rhg pull manifest_file on the qa gem server box. The rhg tool queries the hostname of the machine it is being run on, and based on its name (qaX-YYY), it picks the upstream server (dev-gem). Running rhg up gem_name gets the gem (and all of its dependencies) from the environment specific gem server, with the base (dev) gem server being the default option so developers can use the same command for updating their local gem repositories.
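
The hostname-based choice of an upstream server could be sketched like this. The real rhg implementation is not shown in the post: the method names, the rule for deriving the environment from the host name, and the server names below are all placeholders.

```ruby
# Hypothetical sketch: derive the deployment environment from a host name
# such as "qa7-rails" and choose the gem server to pull from. The server
# names are placeholders, not the real RHG host names.
UPSTREAM = {
  'qa'   => 'dev-gems.example.com', # QA pulls from dev
  'prod' => 'qa-gems.example.com'   # production pulls from QA
}.freeze
DEFAULT_SERVER = 'dev-gems.example.com' # everyone else uses the base server

# Assumed rule: the environment is the leading run of letters.
def host_class(hostname)
  hostname[/\A[a-z]+/] # "qa7-rails" -> "qa"
end

def upstream_server(hostname)
  UPSTREAM.fetch(host_class(hostname), DEFAULT_SERVER)
end
```

Keeping the mapping in one table means the person running the tool never has to name a source server, which is the simplification described above.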

Under this scheme, each environment has full control over which gems are installed there and the only piece of information needed for promotion is the manifest file declaring specific versions to promote.

The Reality

We built the tools and the dev/qa gem servers and started pushing gems to QA using the described approach. When the time came to implement it in production, our operations team rejected it. They had a reason: they wanted a unified approach for delivering all the packages they need to install. Since we have Java and many other non-Ruby applications, and the ops requirement was to use RPMs for package distribution, we had to adapt. We retained the dev gem server and continued to allow developers to push gems to it. We continue to generate the manifest file, but instead of delivering it as is, we use it to pick up the specific versions of gems and repackage them as RPMs (using a modified version of gem2rpm). They are then delivered to QA and later to production.

The Future

We still believe that using the 'gem servers' hierarchy is the right way for plugem-based applications distribution. Most likely the original scripts will be a part of the near future plugem public release. We hope that one day it might become the preferred method of rails applications distribution.

Tuesday, March 06, 2007

Gem-based Deployment and Delivery: Part 1 - When Capistrano Is Not Enough

Part of the Gem-based Development Process series

Introduction

In the beginning, we, like many other Rails projects, were using Capistrano for deployment. This changed when it came time to start preparing builds for QA and production. Having formal QA and Operations teams, we had to adjust our approach to meet their requirements for deployment. Another force pushing us off Capistrano--even for deployment in the development environment--was the existence of multiple servers where different versions of the application could be deployed by anyone from the large development team. As a result, we built our own deployment tools to be used by QA/Operations (and occasionally the development team.) We also use push-button builds (via luntbuild) to deploy applications to dev.

The Tools

A New RHG Developer's Illustrated Primer provides insight into how deployment tools are used on our projects. Since it stops at the point when the build is ready for QA, it does not show all use cases; however, it gives enough to see what might be going on later in QA and production. The teams there use rhgcontrol for any application or shared component deployment task. Wrapping up all deployment activities in a self-contained tool gives the development team better control over how applications are installed and deployed in those environments. In addition, it lets us add new features to the existing command set without changing the installation instructions.

A couple of examples:

When our DBAs asked us to provide an automated way to update the application history table with the current application upon deployment, we added auto-generation of a db migration that does just that to the rhgcontrol migrate command.
We want to know what versions of shared components are used by the running version of an application. Instead of building a runtime configuration querying system a-la JMX, we simply fixed rhgcontrol start to dump its runtime configuration right after the application was started.

The Deployment

Any application or shared component is potentially deployable. All that is required is a single file--deployment.yml--in the config directory of the gem (since all applications at RHG are plugems). The content of the file is read and executed at deployment time when rhg deploy is run. The deployment configuration is simpler than a Capistrano recipe and contains just a few sections.

Sample deployment.yml:

# For servers
server: &server_cfg

  files:
    - database.yml
    - log4r.xml

  execs:
    - ln -nfs <%= @app_home %> <%= @app_install_dir %>/<%= @app_name %>
    - <%= @cmd_host_gems %>
    - rake some:action

qa: *server_cfg
dev: *server_cfg

# For developer's workstations
default:

  files:
    - database.yml

  execs:
    - <%= @cmd_link_to_app %>

First, there is a separation by deployment environment - dev/qa/etc. Those names are based on the hostname of the machine where rhg deploy is executed. We adopted unified DNS naming conventions for our Rails boxes (e.g. all of our QA machine names start with qa.) This allows us to define host classes and put, say, QA-specific tasks in their own section. If the hostname does not match any class, the default section is used.

There are two types of deployment tasks: copying over configuration files (the files subsection) and executing commands (the execs subsection).

Some configuration files, such as database.yml, are often deployment environment specific. The deployment-time configuration directory (config/deployment) may contain such files, prefixed with either the host class (like dev) or the full hostname (like dev8-rails.) They are used to overwrite the copies under the config/ directory at deployment time.
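
The two host-based lookups described above, choosing a section of deployment.yml and finding an override file under config/deployment, can be sketched as follows. All method names are hypothetical, and the rule for deriving a host class (the leading letters of the host name) is an assumption. That a full host name beats a host class is taken from the DRYing Up Configuration Files post.

```ruby
# Hypothetical sketch of the host-based lookups described above. The host
# class is assumed to be the leading letters of the host name.
def host_class(hostname)
  hostname[/\A[a-z]+/] # "dev8-rails" -> "dev"
end

# Choose the deployment.yml section, falling back to "default".
def deployment_section(config, hostname)
  config.fetch(host_class(hostname)) { config['default'] }
end

# Find the override for a config file in config/deployment. A file prefixed
# with the full host name beats one prefixed with the host class.
def override_file(deployment_dir, hostname, file)
  [hostname, host_class(hostname)].compact
    .map { |prefix| File.join(deployment_dir, "#{prefix}-#{file}") }
    .find { |path| File.exist?(path) }
end
```

override_file returns nil when neither prefixed file exists, in which case the copy already under config/ would presumably be left alone.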

Configuration-file overrides are a small piece of the multiple-configuration puzzle we've had to solve at RHG. We plan to have a separate post on this subject, which will cover runtime configuration as well.

There are two types of execution commands--regular UNIX commands and macros. Macros are commands that are bundled with the rhg tool (when it makes sense to share them between different applications.) For example, the macro for hosting gems (@cmd_host_gems) looks like this:

- rm -rf gems
- tar -xf <%= @deployment_gem_dir %>/templates/gems-bare.tar

Instead of putting those commands in the deployment.yml file of every application that uses the locking mechanism, we share them via macros. They also provide us a single place to change the implementation of these common tasks.

The execs section leverages deployment-specific variables, resulting in a system flexible enough to handle every deployment task we've encountered.
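
As a rough illustration of how an execs entry might be expanded, the sketch below renders a command through ERB with deployment variables exposed as instance variables. The variable names follow the sample deployment.yml; the class name ExecContext and the values passed to it are made up, and how the real tool expands multi-command macros is not shown in the post.

```ruby
require 'erb'

# Hypothetical sketch: expand the ERB placeholders of an execs entry.
# Variable names follow the sample deployment.yml; the class itself and
# the values passed in are assumptions for illustration only.
class ExecContext
  def initialize(vars)
    # Expose each deployment variable as an instance variable (@app_home, ...).
    vars.each { |name, value| instance_variable_set("@#{name}", value) }
  end

  # Render one command line; the template sees this object's instance variables.
  def render(command)
    ERB.new(command).result(binding)
  end
end
```

A context built with app_home, app_install_dir, and app_name would turn the first execs entry of the sample into a plain ln -nfs command ready to be passed to the shell.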

Conclusion

While Capistrano is suitable for most of the rails projects out there, we had to build our own deployment and delivery tools to wrangle our multi-environment, multi-application, multi-component portal. We plan to extract and release the useful parts of the tool for Rails development teams that are isolated from their QA/Production environments.

Wednesday, February 14, 2007

Locking down a deployed application

A single host might have multiple applications installed, each on a different release track. Since all our applications are distributed as a bunch of gems, there is a potential issue with shared components. Even though the whole portal is tested when an application or component is pushed through the stack, there is always a possibility that one of the upgraded shared components is incompatible with another dependent application. In some other environments there would be only one solution - to roll back the whole new application. In our environment we have another option - to lock down the affected application. When an application is deployed, one of the deployment steps is to create a gem repository inside the application structure. The application gem repository has the sources gem installed and the rest of the gems sym-linked, as in this example:

actionmailer-1.2.5 -> /usr/lib/ruby/gems/1.8/gems/actionmailer-1.2.5
actionpack-1.12.5 -> /usr/lib/ruby/gems/1.8/gems/actionpack-1.12.5
...
sources-0.0.1

The gems put into the internal gem repository are those the application was tested with, so they are guaranteed to work. All that is needed to lock down an application is to set the GEM_HOME environment variable on the web server instance serving the application and point it to the application's internal gem repository. After that, the application won't use the latest versions of the gems but the specific versions it is locked to.
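
The sym-linking step could be sketched as below. This is not the real deployment code: the method name, its arguments, and the idea of passing the tested gems as a plain list of directory names are assumptions, and the sketch covers only the links, not the installed sources gem.

```ruby
require 'fileutils'

# Hypothetical sketch: build an application-local gem directory whose
# entries are symlinks to the system gems the application was tested with.
# Paths and the list format are assumptions, not the real deployment step.
def link_tested_gems(system_gems_dir, app_gems_dir, gem_names)
  FileUtils.rm_rf(app_gems_dir)   # start clean; removes links, not their targets
  FileUtils.mkdir_p(app_gems_dir)
  gem_names.each do |name|
    # e.g. actionmailer-1.2.5 -> <system gems>/actionmailer-1.2.5
    File.symlink(File.join(system_gems_dir, name), File.join(app_gems_dir, name))
  end
end
```

Because the links name exact versions, installing a newer gem on the host does not change what a locked application sees once its GEM_HOME points at the internal repository.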

Thursday, January 25, 2007

Plugems

Part of the Gem-based Development Process series

Implementations:

Runtime

Deployment

*****

We started with a small application. A few controllers, a few models. Soon the company grew, the development team grew, and with that the application grew. Luckily, we namespaced applications early on, so it was mostly clear where one application ended and another began, but the Rails project became too big. As one coworker (Eddie) put it, "My TextMate file find is getting too slow!" It was time to split applications. But for a portal with a single look and feel, how do you share things across applications? Here is how we do it (this week).

We wanted to use gems, as described in Part I, but without gems being true 1st class citizens in rails, there was no way to make it happen. So what was important to share and more importantly version?

I. Code
That's obvious; RubyGems already does (most of) this for us.

II. Views
With a common header, or commonly shared assets, we'd like to leverage those.

III. Rake Tasks
We have lots of rake tasks to do all sorts of things, we need to share those too.

IV. Assets
We'll cover this later. This gets complicated, and needs deployment input.

V. Plugems as Plugins
Using gems, as plugins, in your plugin dir, and the A/B/C problem.

I. Code
We didn't do (much of) anything to get this. Thanks to the guys who wrote RubyGems!

II. Views
We needed to load views from a few places, and it took some diving into rails internals to figure out how. There were a couple things we needed to figure out.

Where to look for views.
The order in which we look.
Caching. Globbing for files is slow... especially across many directories. Can't do this a lot.

1. So we first overrode ActionView::Base's implementation of finding files. That was straightforward. We also fixed ActionController::Layout to allow layouts to live in gems as well. Once we provided ActionView and ActionController with a bigger list of places to look, they could load a partial from anywhere. Now you'll see the path to a file in a gem in certain stack traces, which shows where the view was found.

2. So why is the order important and powerful? Let's say you decide to use our base_ui gem and its default layout.

<%= render :partial => "header"%>
<%= @content_for_layout -%>
<%= render :partial => "footer"%>

In our gem,
/gems/base_ui/
  /views/layouts/
    default.rhtml
    _header.rhtml
    _footer.rhtml

You've been told to build the FooBar application, whose pages don't have the same header. So in your application, add a /app/views/layouts/_header.rhtml, and put whatever you want there. It just works.

So the dependency order is: app ==> plugins ==> gems. This is almost always true. I'll explain the exception later when "developing" things that are gems.

Note: The compiled template method names get kinda ugly, but who cares.. You only see those in some stack traces.

3. Caching quickly became important. Disclaimer: We only look for views in gems and plugins at startup. After that, bounce your app to pick up new files. Thanks to Zed, mongrel_rails restart is wicked fast.
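
The lookup order and the startup-time caching can be sketched together as a small index. This is only an illustration of the idea: the actual changes were made inside ActionView::Base and ActionController::Layout and are not shown here, and the class ViewIndex is hypothetical.

```ruby
# Hypothetical sketch: index view files once at startup, with roots listed
# from highest to lowest precedence (app, then plugins, then gems).
class ViewIndex
  def initialize(view_roots)
    @paths = {}
    view_roots.each do |root|
      # Glob relative to the root so keys look like "layouts/_header.rhtml".
      Dir.glob('**/*.rhtml', base: root).sort.each do |relative|
        @paths[relative] ||= File.join(root, relative) # first root wins
      end
    end
  end

  # Look up a template without touching the file system again.
  def find(relative_path)
    @paths[relative_path]
  end
end
```

Since the globbing happens only in the constructor, files added later are not seen until the index is rebuilt, which matches the need to bounce the app to pick up new views.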

III. Gem Rake Tasks
So the easiest way to include rake tasks from gems was to add a bootstrap.rake into /lib/tasks/bootstrap. We grab the manifest file for the application and, for each of the gems defined, 'load' the rake files. This was a quick fix and could probably be more elegant.

IV. Assets
This conversation gets complicated quickly. We'll cover this later.

V. Plugems ( Plugin + Gem = Plugem!)
So now you've taken all your favorite plugins and made them gems. In the next article, I'll cover developing gems as plugins, and the A --> B --> C dependency loading problem, where A and C are gems, and B is a plugin.

Tuesday, January 23, 2007

A New RHG Developer's Illustrated Primer

Part of the Gem-based Development Process series

*****

Dan had just completed his last project and was looking for another job where he could apply his skills in Rails and AJAX when he ran into a blog discussing the challenges of an enterprise RoR-based application. He had never worked on a large-scale Rails project, so he was eager to try. He applied for a position at Revolution Health Group and, after going through a series of interviews, received an offer. The people he talked to on campus, and his own scrutiny of the web site on which he was supposed to work, made him decide to accept the offer. He started on Monday.

Now it's Wednesday. Dan has spent his first two days at RHG reading the development wiki, talking to coworkers, and configuring his new development workstation, a MacBook Pro. He was just assigned a bug, so he is eager to prove that he was the right choice for the job.

First, he decides to install a copy of one of the applications (rop) locally. Armed with instructions from the wiki, he installs the deployment support tools:

$ gem install rhg_deployment --remote --source http://gems.revolutionhealth.com:8808

He now has two new commands - rhg and rhgcontrol. He sets up a runtime environment for the application:

$ rhgcontrol setup
Setting up the runtime environment
mkdir -p /opt/rhg/applications/etc
mkdir -p /opt/rhg/applications/tmp
...
mkdir -p /opt/rhg/applications/log

He downloads the latest version of the application with all its dependencies to his workstation:

$ rhg update rop
Bulk updating Gem source index for: http://gems.revolutionhealth.com:8808
Installing [ actionwebservice, 1.1.6 ]
Installing [ activesupport, 1.3.1 ]
Installing [ rails, 1.1.6 ]
...
Installing [ rhg_ui, 1.5.40766 ]
Installing [ rhg_migrations, 1.0.37541 ]
Installing [ rop, 1.4.40837 ]

Dan has all he needs now so he deploys the application, converting it from a static gem to a live site:

$ rhg deploy rop
Deploying rop-1.4.40837
Loading /usr/lib/ruby/gems/1.8/gems/rop-1.4.40837/config/deployment/database.yml as database.yml
Executing: ln -nfs /usr/lib/ruby/gems/1.8/gems/rop-1.4.40837 /opt/rhg/applications/rop
...
Executing: ln -nfs /opt/rhg/applications/log/rop log

It is a Rails app, so it needs a DB, which he creates using mysqladmin. He then issues a command to create the schema in his development DB and populate it with data:

$ rhgcontrol migrate rop
Running migrations for rop
cd /opt/rhg/applications/rop
Executing: rake db:migrate
...

Dan plans to use both lighttpd and mongrel to host the application, but first he goes with the default one, lighttpd, using a config supplied with the deployment tools:

$ rhgcontrol add rop
Adding lighttpd configuration for rop_8001

$ rhgcontrol start rop
Executing on rop_8001
Executing: /usr/sbin/lighttpd -f /opt/rhg/applications/etc/rop_8001.conf

Dumping the runtime manifest

He points his browser to http://localhost:8001 and plays with the application. He feels like doing some coding. He checks out the latest application code from subversion and runs mongrel from the top of the source tree:

$ mongrel_rails start -d

He fixes some code and navigates his browser to http://localhost:3000 to see the changes. It works but he needs to do some more fixes, this time in a shared component rhg_ui. He checks out the latest component code, sym-links it to the vendor/plugin directory of the application making it temporarily a plugin, and changes some code to see the immediate result. The bug is resolved, and, after running unit tests, he checks in the modified code for both the application and the component.

The changes he made are in a latent state. They are in the source tree but no gems were built off them yet. Dan knows that QA usually builds application gems off the latest code but not component ones. He decides to build a gem for rhg_ui himself. He navigates to the top of the component source tree and runs a command to tag, build, package, and publish the component as a gem to our local gem server:

$ rhg publish
...
Committed revision 40967.
...
Checking out tag to /tmp/rhg_ui-1.5.40966
Changing directory to /tmp/rhg_ui-1.5.40966
Building gem from tag
...
Successfully installed rhg_ui, version 1.5.40966
Pushing gem to development gem server via rhg tool: rhg push rhg_ui
Bulk updating Gem source index for: http://gems.revolutionhealth.com:8808
Bulk updating Gem source index for: http://gems.revolutionhealth.com:8808/archive
SSH User: dsmith
SSH Password: XXXXXXX
Publishing gems...
...
Refreshing the gem server indexes at /opt/rhg/gems

His changes are now packaged and ready to be picked up by QA for testing and pushing to production. How that part is done is for another time...
