Automating IAM Credentials with Ruby and Chef

This post was originally published on SysAdvent.

Chef, née Opscode, has long used Amazon Web Services. In fact, the original iteration of “Hosted Enterprise Chef,” “The Opscode Platform,” was deployed entirely in EC2. In the time since, AWS has introduced many excellent features, along with libraries to work with them, including Identity and Access Management (IAM) and the AWS SDK. Especially relevant to our interests is the Ruby SDK, which is available as the aws-sdk RubyGem. Additionally, the operations team at Nordstrom has released a gem for managing encrypted data bags called chef-vault. In this post, I will describe how we use the AWS IAM feature, how we automate it with the aws-sdk gem, and how we store secrets securely using chef-vault.


First, here are a few definitions and references for readers.

  • Hosted Enterprise Chef – Enterprise Chef as a hosted service.
  • AWS IAM – management system for authentication/authorization to Amazon Web Services resources such as EC2, S3, and others.
  • AWS SDK for Ruby – RubyGem providing Ruby classes for AWS services.
  • Encrypted Data Bags – Feature of Chef Server and Enterprise Chef that allows users to encrypt data content with a shared secret.
  • Chef Vault – RubyGem to encrypt data bags using public keys of nodes on a chef server.


We have used AWS for a long time, since before the IAM feature existed. Originally, with The Opscode Platform, we used EC2 to run all the instances. While we have since moved our production systems to a dedicated hosting environment, we still have non-production services in EC2, as well as some external monitoring systems. Hosted Enterprise Chef uses S3 to store cookbook content. Those with an account can see this with knife cookbook show COOKBOOK VERSION, and note the URLs for the files. We also use S3 for storing the packages from our omnibus build tool; the omnitruck metadata API service exposes this.

All these AWS resources – EC2 instances, S3 buckets – are distributed across a few different AWS accounts. Before IAM, there was no way to segregate access to data, because a single set of credentials was shared across an entire account. For (hopefully obvious) security reasons, we need to keep customer content separate from our non-production EC2 instances. Similarly, we need to keep the metadata about the omnibus packages separate from the packages themselves. To manage all these different accounts and their credentials, which need to be automatically distributed to the systems that need them, we use IAM users, encrypted data bags, and Chef.

Unfortunately, using separate accounts adds management complexity, but the tooling I’m about to describe makes it much easier to handle than it was in the past: a fairly simple JSON data file format, and a Ruby script that uses the AWS SDK RubyGem. I’ll describe the parts of the JSON file first, and then the script.


IAM allows customers to create separate groups, which are containers of users that grant permissions to different AWS resources. Customers can manage these through the AWS console, or through the API. The API uses JSON documents to describe the policy statement of permissions a user has for AWS resources. Here’s an example:

  {
    "Statement": [
      {
        "Action": "s3:*",
        "Effect": "Allow",
        "Resource": [
          "arn:aws:s3:::an-s3-bucket",
          "arn:aws:s3:::an-s3-bucket/*"
        ]
      }
    ]
  }
Granted to an IAM user, this will allow that user to perform all S3 actions on the bucket an-s3-bucket and all the files it contains. Without the /*, only operations against the bucket itself would be allowed. To set read-only permissions, use only the List and Get actions:

  "Action": [
    "s3:List*",
    "s3:Get*"
  ],

Since this is JSON data, we can easily parse and manipulate this through the API. I’ll cover that shortly.
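For instance, turning the full-access statement above into a read-only one is a small transformation. This is a sketch using only Ruby’s standard library; the bucket name an-s3-bucket is illustrative:

```ruby
require 'json'

# An illustrative full-access policy for a hypothetical bucket.
policy = {
  'Statement' => [
    {
      'Action'   => 's3:*',
      'Effect'   => 'Allow',
      'Resource' => [
        'arn:aws:s3:::an-s3-bucket',
        'arn:aws:s3:::an-s3-bucket/*'
      ]
    }
  ]
}

# Restrict every statement to the read-only List and Get actions.
policy['Statement'].each do |statement|
  statement['Action'] = ['s3:List*', 's3:Get*']
end

puts JSON.pretty_generate(policy)
```

The same approach works for any programmatic policy edit: parse, modify the Ruby data structure, and serialize back to JSON.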

See the IAM policy documentation for more information.


We use data bags to store secret credentials we want to configure through Chef recipes. In order to protect these secrets further, we encrypt the data bags, using chef-vault. As I have previously written about chef-vault in general, this section will describe what we’re interested in from our automation perspective.

Chef Vault itself is concerned with three things:

  1. The content to encrypt.
  2. The nodes that should have access (via a search query).
  3. The administrators (human users) who should have access.

“Access” means that those entities are allowed to decrypt the encrypted content. In the case of our IAM users, the content to encrypt is the AWS access key ID and the AWS secret access key. The nodes will come from a search query against the Chef Server, stored as a field in the JSON document described in a later section. Finally, the administrators will simply be the list of users from the Chef Server.
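Concretely, the unencrypted item handed to chef-vault is just a small JSON document combining the content with the search query that determines node access. A sketch with placeholder values (the account name, key values, and query are illustrative):

```ruby
require 'json'

# Placeholder values standing in for a real account and freshly created keys.
account = 'an-aws-account'
user    = 'secret-files'

vault_item = {
  'id'   => "#{account}_#{user}",
  'data' => {
    'aws_access_key_id'     => 'AKIA...EXAMPLE',
    'aws_secret_access_key' => 'EXAMPLEKEY'
  },
  'search_query' => 'role:secret-files-server'
}

puts JSON.pretty_generate(vault_item)
```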


The script reads a JSON file, described here:

  {
    "accounts": [
      "an-aws-account"
    ],
    "user": "secret-files",
    "group": "secret-files",
    "policy": {
      "Statement": [
        {
          "Action": "s3:*",
          "Effect": "Allow",
          "Resource": [
            "arn:aws:s3:::secret-files",
            "arn:aws:s3:::secret-files/*"
          ]
        }
      ]
    },
    "search_query": "role:secret-files-server"
  }

This is an example of the JSON we use. The fields:

  • accounts: an array of AWS account names that have authentication credentials configured in ~/.aws/config – see my post about managing multiple AWS accounts
  • user: the IAM user to create.
  • group: the IAM group for the created user. We use a 1:1 user:group mapping.
  • policy: the IAM policy of permissions, with the action, the effect, and the AWS resources. See the IAM documentation for more information about this.
  • search_query: the Chef search query to perform to get the nodes that should have access to the resources. For example, this one will allow all nodes that have the Chef role secret-files-server in their expanded run list.

These JSON files can go anywhere; the script takes the file path as an argument.
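It can be worth sanity-checking a data file before handing it to the script. This is a sketch, not part of the script below; missing_keys is a hypothetical helper:

```ruby
require 'json'

# The top-level fields the script expects in an account data file.
REQUIRED_KEYS = %w[accounts user group policy search_query].freeze

# Hypothetical helper: returns the required keys absent from a parsed data file.
def missing_keys(data)
  REQUIRED_KEYS.reject { |key| data.key?(key) }
end

data = JSON.parse('{"accounts":["an-aws-account"],"user":"secret-files"}')
puts "missing: #{missing_keys(data).join(', ')}"
# prints "missing: group, policy, search_query"
```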


Note: This script is cleaned up to save space and get to the meat of it. I’m planning to make it into a knife plugin, but haven’t gotten a round tuit yet.

require 'inifile'
require 'aws-sdk'
require 'json'
filename = ARGV[0]
dirname  = File.dirname(filename)
aws_data = JSON.parse(IO.read(filename))
aws_data['accounts'].each do |account|
  aws_creds = {}
  aws_access_keys = {}
  # load the aws config for the specified account, stripping the "aws_" prefix
  # and symbolizing keys so they can be passed to AWS::IAM.new
  IniFile.load("#{ENV['HOME']}/.aws/config")[account].each do |k, v|
    aws_creds[k.sub(/^aws_/, '').to_sym] = v
  end
  iam = AWS::IAM.new(aws_creds)
  # Create the group
  group = iam.groups.create(aws_data['group'])
  # Load policy from the JSON file
  policy = AWS::IAM::Policy.from_json(aws_data['policy'].to_json)
  group.policies[aws_data['group']] = policy
  # Create the user
  user = iam.users.create(aws_data['user'])
  # Add the user to the group
  user.groups.add(group)
  # Create the access keys
  access_keys = user.access_keys.create
  aws_access_keys['aws_access_key_id'] = access_keys.credentials.fetch(:access_key_id)
  aws_access_keys['aws_secret_access_key'] = access_keys.credentials.fetch(:secret_access_key)
  # Create the JSON content to encrypt w/ Chef Vault
  vault_file = File.open("#{File.dirname(__FILE__)}/../data_bags/vault/#{account}_#{aws_data['user']}_unencrypted.json", 'w')
  vault_file.puts JSON.pretty_generate(
    'id' => "#{account}_#{aws_data['user']}",
    'data' => aws_access_keys,
    'search_query' => aws_data['search_query']
  )
  vault_file.close
  # This would be loaded directly with Chef Vault if this were a knife plugin...
  puts <<-EOH
knife encrypt create vault #{account}_#{aws_data['user']} \
  --search '#{aws_data['search_query']}' \
  --mode client \
  --json data_bags/vault/#{account}_#{aws_data['user']}_unencrypted.json \
  --admins "`knife user list | paste -sd ',' -`" # list of humans who should be admins
EOH
end

Joshua Timberman

Joshua Timberman is a Code Cleric at CHEF, where he Cures Technical Debt Wounds for 1d8+5 lines of code, casts Protection from Yaks, and otherwise helps continuously improve internal technical process.