Data Driven Application Deployment with Chef

We are pleased to announce the release of two new cookbooks, designed to be completely data driven, that aid in deploying applications.

The new cookbooks are application and database.

They are designed to work with a Chef Server such as the Opscode Platform, and be driven by Data Bags and Search Indexes. Describe the application to deploy with information in a data bag, create and assign roles to the appropriate nodes, and they will be configured automatically with search discovery.

The application cookbook itself will support recipes for handling various types of application stack deployments. Currently supported is Ruby on Rails with Unicorn as the application server. The database cookbook can be used to configure master and slave database nodes, with Amazon Elastic Block Store volumes and snapshot-based backups. It currently supports MySQL, and uses the Opscode mysql::server recipe to install MySQL.

Both cookbooks are documented in their README files and posted to the Opscode Cookbooks site, and development happens in the Opscode cookbooks GitHub repository. For issues, improvements or feature requests, please file a ticket in the COOK project on the Opscode Open Source ticket system.

The remainder of this blog post will describe how to set up an environment to start deploying database and application nodes with these cookbooks.

This example environment will use the Opscode Platform as the Chef Server. The clients will be Amazon EC2 instances launched from the Ubuntu 9.10 AMIs that we rebundle with Chef, utilizing the instance metadata feature to automatically provision the instance as a Chef Client. We follow the Chef Repository workflow, using the Opscode cookbooks git repository. The application we’re deploying for example purposes is a simple Rails app called “myapp”. It lives in GitHub.

Knife has already been set up – we have a platform account with validation and user certificates. We have already run “rake install” in the chef-repo to upload the cookbooks and roles to the Platform. Let’s examine what we have.

These are the default cookbooks straight from the Opscode repository with no local modifications.

% knife cookbook list

We’re going to work with two specific roles, one for the application server itself, and one for the database server. We also have a role that will be used on both nodes for defining the application environment.

% knife role list

The application role contains the git and application default recipes in the run list. Git is used by the application recipe to deploy the application from a Git repository.

Note that we’re not going to specify a particular application recipe; more on that in a few minutes.

% knife role show myapp
{
  "name": "myapp",
  "chef_type": "role",
  "json_class": "Chef::Role",
  "default_attributes": {
  },
  "description": "",
  "run_list": [
    "recipe[git]",
    "recipe[application]"
  ],
  "override_attributes": {
  }
}

Next, we have created a database master role. The role name is important, because the recipes search for it using the role name recorded in the application data bag. More on that when we get to the data bag.

% knife role show myappdatabasemaster
{
  "name": "myappdatabasemaster",
  "chef_type": "role",
  "json_class": "Chef::Role",
  "default_attributes": {
  },
  "description": "",
  "run_list": [
    ...
  ],
  "override_attributes": {
  }
}

We’re going to deploy as a “production” application in a production environment. We handle that with a role named production, which simply sets a role default attribute, “app_environment”, to production. We can set up other environments such as staging, development, qa and so forth, naming each role after its app_environment value. We use a default attribute at the role level because that lets us “move” a node to another environment by changing the attribute.

% knife role show production
{
  "name": "production",
  "chef_type": "role",
  "json_class": "Chef::Role",
  "default_attributes": {
    "app_environment": "production"
  },
  "description": "production environment role",
  "run_list": [

  ],
  "override_attributes": {
  }
}

These recipes are data driven primitives that allow us to dynamically configure the application and the database on the fly. We do this by using a feature of Chef 0.8 called “Data Bags”. For more on Data Bags, see the Chef Wiki.

We have a bag called “apps”, and the example app is an “item” in the bag.

The item is a substantial data structure that defines a large amount of information about our application, including where its source code lives, the roles that are associated with it, and the recipes required to get it running. Everything in the recipes is driven with this data; we do not hard-set any values in roles or on the nodes.

Let’s take a look at the entire data bag in its unmodified JSON, and then we will break it down by section in detail. For more information on the data bag, see the README in the application cookbook.

% knife data bag list

% knife data bag show apps
[
  "myapp"
]

% knife data bag show apps myapp
{
  "id": "myapp",
  "server_roles": [
    "myapp"
  ],
  "type": {
    "myapp": [
      "rails",
      "unicorn"
    ]
  },
  "database_master_role": [
    "myappdatabasemaster"
  ],
  "repository": "git://",
  "deploy_to": "/srv/myapp",
  "revision": {
    "production": "master"
  },
  "force": {
    "production": false
  },
  "migrate": {
    "production": false
  },
  "owner": "nobody",
  "group": "nogroup",
  "gems": {
    "rails": "2.3.5",
    "rspec": ""
  },
  "packages": {
  },
  "mysql_root_password": {
    "production": "p@ssw0rd1root"
  },
  "mysql_debian_password": {
    "production": "p@ssw0rd1debian"
  },
  "mysql_repl_password": {
    "production": "p@ssw0rd1repl"
  },
  "databases": {
    "production": {
      "reconnect": "true",
      "encoding": "utf8",
      "username": "myapp",
      "adapter": "mysql",
      "password": "p@ssw0rd1app",
      "database": "myappproduction"
    }
  }
}

First, we need a name; this is the “item” in the “bag”. We name it after our application.

"id": "myapp",

Next we list the server roles for the application under server_roles, and under type we map each of those roles to the list of recipes it runs. In this example, we have one role, “myapp”, which has two recipes, “rails” and “unicorn”. We could have other server_roles that support this application, each with its own corresponding recipes.

The rails recipe does the main amount of work, including setting up the deployment and actually deploying the application. The unicorn recipe sets up Unicorn for the application and starts it as a Runit service, utilizing the Opscode Unicorn and Runit cookbooks.

"server_roles": [
  "myapp"
],
"type": {
  "myapp": [
    "rails",
    "unicorn"
  ]
},
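
As a rough illustration of this mapping, here is a plain-Ruby sketch (not the cookbook's actual code) that picks the recipes for a node from the server roles it holds. The recipes_for helper and the example role lists are hypothetical, and the key names are assumed to be spelled with underscores as in the application cookbook's README.

```ruby
# Hypothetical sketch: choose application recipes for a node from the
# "server_roles" and "type" keys of the data bag item.
def recipes_for(app, node_roles)
  # Only roles that are both listed in the item and held by the node count.
  matching_roles = app["server_roles"] & node_roles
  matching_roles.flat_map { |role| app["type"].fetch(role, []) }
end

app = {
  "id" => "myapp",
  "server_roles" => ["myapp"],
  "type" => { "myapp" => ["rails", "unicorn"] }
}

# An application node gets both recipes; a database node gets none.
puts recipes_for(app, ["production", "myapp"]).inspect
puts recipes_for(app, ["production", "myappdatabasemaster"]).inspect
```

This is also why the role's run list only needs the default application recipe: the specific recipes come from the data.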

The database_master_role is used to determine which node is the database server. This can be a role on the application server or a role on another system. The recipe is smart enough to figure out whether the role is in the node’s own run list, and otherwise searches for a node that has it. There should only be one “database master” at a time. We don’t have support for master/master configurations (yet).

"database_master_role": [
  "myappdatabasemaster"
],
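
The lookup described above can be sketched in plain Ruby, with hashes standing in for Chef node objects and for the server's search index. The find_database_master helper and the node names db1 and app1 are hypothetical, and raising when the number of masters is not exactly one is a choice made for this sketch rather than documented cookbook behavior.

```ruby
# Hypothetical sketch: find the database master for an application.
def find_database_master(app, node, all_nodes)
  master_role = app["database_master_role"].first

  # If this node holds the role itself, it is the database master.
  return node if node["roles"].include?(master_role)

  # Otherwise emulate a search for nodes that have the role.
  matches = all_nodes.select { |n| n["roles"].include?(master_role) }
  unless matches.size == 1
    raise "expected exactly one database master, found #{matches.size}"
  end
  matches.first
end

app = { "database_master_role" => ["myappdatabasemaster"] }
db  = { "name" => "db1",  "roles" => ["production", "myappdatabasemaster"] }
web = { "name" => "app1", "roles" => ["production", "myapp"] }

puts find_database_master(app, web, [db, web])["name"]  # prints db1
puts find_database_master(app, db,  [db, web])["name"]  # prints db1
```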

The next set of values corresponds to the application source code repository and how to control the deployment behavior. This is handled in the “deploy_revision” resource of the Rails recipe. The repository is the actual Git repository we’re going to use. At a later date we may add support for other SCMs. The deploy_to is the filesystem location where the application will live. The revision is passed to the deploy resource and can be specified on a per-environment basis. For example, we’re deploying master, but we could also specify a version tag. Force is a special value that tells the recipe to deploy the revision whether or not that release already exists on the node. Be careful with this setting, as it can make the recipe no longer idempotent. The migrate value specifies whether we’re going to run the migration command, which in the case of a Rails application is “rake db:migrate”. Finally, the owner and group set the file ownership.

We can always edit the data bag to force a redeployment, run database migrations or switch to a new revision quickly and easily. This gives us tremendous flexibility, because we know that the Chef client will finish the run and everything works as we specified in this data.

See the Deploy resource documentation for more information on how that resource behaves.

"repository": "git://",
"deploy_to": "/srv/myapp",
"revision": {
  "production": "master"
},
"force": {
  "production": false
},
"migrate": {
  "production": false
},
"owner": "nobody",
"group": "nogroup",
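
To make the per-environment lookup concrete, here is a minimal plain-Ruby sketch of how these values might be resolved for a node's environment. The deploy_settings helper is hypothetical. Mapping force to the deploy resource's force_deploy action is an assumption about how a recipe would use it, and the underscored key names are assumed to match the application cookbook's README.

```ruby
# Hypothetical sketch: resolve deployment settings for one environment.
def deploy_settings(app, environment)
  {
    repository: app["repository"],
    deploy_to:  app["deploy_to"],
    owner:      app["owner"],
    group:      app["group"],
    # fetch raises KeyError if the environment has no entry, so a node in
    # an undefined environment fails loudly instead of deploying nil.
    revision:   app["revision"].fetch(environment),
    migrate:    app["migrate"].fetch(environment),
    action:     app["force"].fetch(environment) ? :force_deploy : :deploy
  }
end

app = {
  "repository" => "git://",
  "deploy_to"  => "/srv/myapp",
  "revision"   => { "production" => "master" },
  "force"      => { "production" => false },
  "migrate"    => { "production" => false },
  "owner"      => "nobody",
  "group"      => "nogroup"
}

settings = deploy_settings(app, "production")
puts settings[:revision]  # prints master
puts settings[:action]    # prints deploy
```

Flipping "force" to true for an environment in the data bag would switch the action on the next Chef run, which is what makes a redeployment a data edit rather than a code change.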

We can also specify gems and packages specific to the application in the data bag; these are installed at run time based on the version specified. If no particular version is given, the latest available version is installed. Note that we developed our example application with Rails 2.3.5, so we “lock” that version here to avoid breaking backwards compatibility by accidentally upgrading Rails.

The packages can be specified the same way as gems. If we wanted, we could have specified the “git-core” package here instead of including that whole recipe.

"gems": {
  "rails": "2.3.5",
  "rspec": ""
},
"packages": {
},
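
The “version or latest” rule can be sketched as follows. This plain-Ruby example only builds the equivalent gem install command lines for illustration; the gem_install_args helper is hypothetical, and the cookbook itself would use Chef's package resources rather than shelling out.

```ruby
# Hypothetical sketch: turn the "gems" hash into gem install arguments.
# An empty version string means "install the latest available version".
def gem_install_args(gems)
  gems.map do |name, version|
    version.to_s.empty? ? [name] : [name, "--version", version]
  end
end

gems = { "rails" => "2.3.5", "rspec" => "" }

gem_install_args(gems).each do |args|
  puts (["gem", "install"] + args).join(" ")
end
# prints:
#   gem install rails --version 2.3.5
#   gem install rspec
```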

The next few settings are for the database. The mysql cookbook has the ability to set random passwords if the node doesn’t already have a password set, for the root, replication and debian maintenance users. If these values aren’t set, when the database master is launched it will print a message about storing the generated passwords in this data bag. These are clearly trite example passwords and shouldn’t be used in the real world.

"mysql_root_password": {
  "production": "p@ssw0rd1root"
},
"mysql_debian_password": {
  "production": "p@ssw0rd1debian"
},
"mysql_repl_password": {
  "production": "p@ssw0rd1repl"
},
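
The fallback between stored and generated passwords might look roughly like this in plain Ruby. The mysql_password helper is hypothetical and uses SecureRandom as a stand-in; how the mysql cookbook actually generates its random passwords is not shown in this post.

```ruby
require "securerandom"

# Hypothetical sketch: prefer the password stored in the data bag for this
# environment, and fall back to a freshly generated one if none is set.
def mysql_password(app, key, environment)
  stored = app.fetch(key, {})[environment]
  return stored unless stored.nil? || stored.empty?

  # Stand-in for the cookbook's random password; 32 hex characters.
  SecureRandom.hex(16)
end

app = { "mysql_root_password" => { "production" => "p@ssw0rd1root" } }

puts mysql_password(app, "mysql_root_password", "production")        # prints p@ssw0rd1root
puts mysql_password(app, "mysql_repl_password", "production").length # prints 32
```

A generated password is only useful if it is then written back to the data bag, which is why the database master prints a reminder to store it.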

The following structure is a hash of databases. This looks strangely like database.yml for those familiar with Rails projects, because it is. This will be rendered out in the actual database.yml on the node. We also use this data in the database recipe to create the actual databases and set up the permissions for the application’s database user.

"databases": {
  "production": {
    "reconnect": "true",
    "encoding": "utf8",
    "username": "myapp",
    "adapter": "mysql",
    "password": "p@ssw0rd1app",
    "database": "myappproduction"
  }
}
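
As an illustration of how this hash becomes database.yml, the following plain-Ruby sketch serializes the entry for one environment with the standard YAML library. The render_database_yml helper is hypothetical, and rendering only the node's own environment is an assumption; the cookbook renders the file from a template, so its exact output will differ.

```ruby
require "yaml"

# Hypothetical sketch: render database.yml content for one environment
# from the "databases" hash in the data bag item.
def render_database_yml(app, environment)
  # fetch raises KeyError if the environment has no database defined.
  { environment => app["databases"].fetch(environment) }.to_yaml
end

app = {
  "databases" => {
    "production" => {
      "reconnect" => "true",
      "encoding"  => "utf8",
      "username"  => "myapp",
      "adapter"   => "mysql",
      "password"  => "p@ssw0rd1app",
      "database"  => "myappproduction"
    }
  }
}

puts render_database_yml(app, "production")
```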

Now that we have uploaded the cookbooks and roles and created the data bag, it is time to set up our EC2 instances to run these recipes to deploy the application.

First, create the JSON to pass to the instances, using knife. We’ll create a JSON file for each of our instances we’re going to launch.

% knife ec2 instance data role[production] role[myappdatabasemaster] > myappdb.json
% knife ec2 instance data role[production] role[myapp] > myapp.json

Run the EC2 instances, using the Ubuntu 9.10 AMI with Chef preinstalled by Opscode.

% ec2-run-instances ami-69987600 -k $EC2KEYPAIRUSEAST1 -t m1.small -f myappdb.json
% ec2-run-instances ami-69987600 -k $EC2KEYPAIRUSEAST1 -t m1.small -f myapp.json

The instances will boot, start Chef using the instance data to set the validation certificate and server URL, register with the Chef Server, and be automatically configured by Chef, and the app will be deployed. In other words, after some prep work, we have an end-to-end generic application deployment mechanism driven entirely by data structures stored on the Chef Server.

We will continue development of these cookbooks to add support for other application stacks, other version control systems, and other databases.

If you have feedback, or need help using these cookbooks or workflow, please post to the Chef mailing list, or #chef IRC channel. File a ticket (COOK project) if you encounter problems or want to contribute improvements.

  • Andreas Schacke

    Thanks for creating chef. A really great tool. Coming from Capistrano-based deployment, I still don’t get one point. With capistrano I cd into my app folder and deploy from there whenever I like. Is it correct that with chef deployment I have to log in to every app server and run ‘chef-client’?

    Thanks for the reply Andi

    • Tom Thomas

      Hi Andi,

      Though one approach would be to log into every app server and run ‘chef-client’, that particular approach isn’t required.

      Many construct an automated method of running chef periodically to pick up any changes in cookbooks, roles assigned to the server, or results from dynamic searches performed by the recipes. The chef-client could be run from cron or some other scheduler, or can be started as a persistent daemon with its own interval and splay. There is further information on chef-client within our Chef Wiki.

      Opscode now also has a Chef-Client cookbook to automate the configuration of a system as a chef client located within our github repository. This cookbook includes setting the chef-client to run as a daemon.

      For more directive deploys, where you want to push changes out to the chef clients rather than waiting for a scheduled run to pick up any changes, you can also use knife to ssh to a set of hosts and run chef-client.

      knife ssh "role:webserver" "sudo chef-client"

      This will search the Chef Server for all nodes with role “webserver” applied using the Search index feature, then open an SSH connection to all the results and run the command sudo chef-client. For more information see the Knife SSH examples within the Chef Wiki.

      One of the advantages of Chef is its flexibility, to work the way that you want it to!

      Thanks Tom

  • Edmund Haselwanter

    I’ve done a rework on that. and
