Friday, March 09, 2007

DRYing Up Configuration Files


Our post about the deployment process explains that some configuration files are overridden at deployment time. Maintaining those files can quickly become a nightmare unless the development team constantly reviews them and moves only the pieces that differ into the deployment-specific configs, i.e. keeps them DRY. It has been a long journey for us, and we are still adjusting how our project uses config files.


We access all configuration files from our code via ConfigurationLoader (and, yes, it is a plugem itself). It provides some convenience methods for major configs. For example, to establish a non-default connection from a model, this piece of code could be used:

establish_connection ConfigLoader.load_db_config['secondary_db']

The load_db_config method knows that database.yml is sectioned per environment, so it loads the file and returns the section corresponding to the current RAILS_ENV. ConfigLoader is an instance of ConfigurationLoader with caching enabled. Since we use a lot of configuration-driven parameters in our models and controllers, caching this info saves unnecessary file system calls and ERB parsing.
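To make the caching and per-environment lookup concrete, here is a rough sketch of how such a loader could work. It is not the plugem's actual code: the class name SketchConfigLoader, its constructor arguments, and the cache layout are all assumptions made for illustration.

```ruby
require 'yaml'
require 'erb'

# Hypothetical stand-in for ConfigurationLoader (assumed design, not the real plugem).
class SketchConfigLoader
  def initialize(config_dir, env, caching: true)
    @config_dir = config_dir
    @env = env
    @caching = caching
    @cache = {}
  end

  # Reads a config file, runs it through ERB, then parses the result as YAML.
  # With caching enabled, the file is read and parsed only once.
  def load_file(name)
    return @cache[name] if @caching && @cache.key?(name)

    raw = File.read(File.join(@config_dir, name))
    data = YAML.load(ERB.new(raw).result)
    @cache[name] = data if @caching
    data
  end

  # Returns only the section of database.yml that belongs to the current environment.
  def load_db_config
    load_file('database.yml')[@env]
  end
end
```

With an instance created as SketchConfigLoader.new('config', 'development'), a call such as load_db_config['secondary_db'] would return the same kind of hash that establish_connection receives in the line above.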

DRYing Up By Keeping Configs Close to The Source

Another property of ConfigurationLoader is that it looks not only in the application config directory but also in the configs of plugins and gems. All configs found are merged (for the usual Ruby config structures such as Hash and Array) in the order gems<-plugins<-application. This allows us to keep the default configuration within a gem, where it can be shared between multiple applications, while still letting an application overwrite it if needed. Together with the power of deployment-time overriding, which applies to shared components as well, it helps us keep only application-specific entries in an application's config files.

The following is a concrete example from today: We have been using the Browser Logger as a plugin in our applications. Aaron decided to convert it into a plugem. We obviously don't want it to be enabled in our production deployment environment, but we want to be able to see logs through a browser on developers' workstations and in QA. The gem initializer loads browser_loggger.yml via ConfigurationLoader to determine whether it is enabled or not. It is not DRY to put browser_loggger.yml in every application that uses Browser Logger. Instead, the gem itself contains config/browser_loggger.yml, with the property enabled; config/deployment/prod-browser_loggger.yml, with the property disabled; and config/deployment.yml, with a single entry 'files: browser_loggger.yml', which specifies that config/browser_loggger.yml will be overridden on production boxes with the content of the second file (see When Capistrano Is Not Enough for details on how deployment-time configuration works).
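The gems<-plugins<-application merge order described above can be sketched roughly as follows. The helper names deep_merge and merge_configs are hypothetical, and concatenating arrays is an assumption; the post only says that Hash and Array structures are merged, not how.

```ruby
# Hypothetical helper: the later (more specific) value wins.
# Hashes are merged key by key, recursively.
def deep_merge(base, override)
  if base.is_a?(Hash) && override.is_a?(Hash)
    base.merge(override) { |_key, old, new| deep_merge(old, new) }
  elsif base.is_a?(Array) && override.is_a?(Array)
    base + override # assumption: arrays are concatenated
  else
    override
  end
end

# Merges configs in the order gem <- plugin <- application,
# skipping any level where the file was not found (nil).
def merge_configs(gem_cfg, plugin_cfg, app_cfg)
  [gem_cfg, plugin_cfg, app_cfg].compact.reduce { |merged, cfg| deep_merge(merged, cfg) }
end
```

Under these assumptions, an application that ships no config of its own simply gets the gem defaults, and one that overrides a single nested key keeps every other default untouched.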

DRYing Up By Extracting Changing Parts in A Separate File

Our Rails applications call a lot of backend services. Service configuration is stored in the service.yml file. Service endpoints usually follow the same naming conventions with respect to port and path across deployment environments. So at some point, we extracted the host names into a separate file (service-hosts.yml) and changed our deployment.yml to override that file instead of service.yml at deployment time. Since service.yml is ERB-processed, we leverage that to load the hosts file:

<% hosts = ConfigLoader.load_file('service-hosts.yml') %>

service_cfg: &service_cfg
  # service names as keys (assumed; the original nesting was lost)
  foo:
    url: http://<%= hosts['foo'] %>:8080/services/foo
    timeout: 5
  bar:
    url: http://<%= hosts['bar'] %>:8080/services/bar
    timeout: 5

production: *service_cfg
development: *service_cfg
test: *service_cfg
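As an illustration of what the ERB step does, the following sketch renders a service.yml-style template against a hash of hosts and parses the result. The method name render_service_config and the sample host name are hypothetical, and nesting each service under its own key is an assumption; in our setup the hosts hash would come from service-hosts.yml rather than being passed in.

```ruby
require 'erb'
require 'yaml'

# Hypothetical helper: renders an ERB-processed YAML template.
# The template can refer to the local variable `hosts` via the binding.
def render_service_config(template, hosts)
  rendered = ERB.new(template).result(binding)
  # aliases: true is needed because the template uses &service_cfg / *service_cfg
  YAML.safe_load(rendered, aliases: true)
end

template = <<~TEMPLATE
  service_cfg: &service_cfg
    foo:
      url: http://<%= hosts['foo'] %>:8080/services/foo
      timeout: 5
  production: *service_cfg
TEMPLATE

config = render_service_config(template, { 'foo' => 'foo-host.example.com' })
puts config['production']['foo']['url']
```

Only the host name changes between environments, so the file that deployment overrides shrinks to a flat list of hosts.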


Our experience has shown that the combination of deployment-time and run-time configuration is powerful enough to handle most use cases we have had so far. There is always room for improvement, however. For example, we recently ran into a problem where one of the production boxes needed one of the parameters in service.yml set to a different value than the rest of the production boxes. We can address it within the current framework by copying the contents of prod-service-hosts.yml to prod-box-1-service-hosts.yml and changing the parameter there. Since the host name (prod-box-1) takes precedence over the host class name (prod), this would fix the issue, but the maintenance cost of two almost identical files is high. An alternative solution would be to enhance the framework to allow additional runtime overriding of values in service.yml from service-override.yml. Then we could create prod-box-1-service-override.yml, which would contain a single entry with the parameter it needs to override. Work in progress...
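The precedence rule mentioned above (host name before host class) could be sketched like this. The method name pick_override and its arguments are hypothetical, and the sketch assumes that override files live in one directory and are named by prefixing the host name or host class to the base file name, as in the examples in this post.

```ruby
# Hypothetical helper: returns the path of the most specific override file
# that exists, or nil when there is none.
def pick_override(dir, file, host_name, host_class)
  # Most specific first: prod-box-1-service-hosts.yml, then prod-service-hosts.yml
  candidates = ["#{host_name}-#{file}", "#{host_class}-#{file}"]
  candidates.map { |name| File.join(dir, name) }
            .find { |path| File.exist?(path) }
end
```

The same lookup would apply to a prod-box-1-service-override.yml file, which is why a one-entry override is cheaper to maintain than a full copy of the hosts file.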
