On the level: Testing your infrastructure

At Opscode we have an internal Jenkins cluster that runs our test suites against each project as commits are pushed; pretty standard stuff. We also built a tool around Vagrant, which we plan to release soon, for standing up test VMs to run integration tests, cookbook tests and tests against the APIs common among our codebases. Like many, we are constantly working to improve our testing, and we have recently been moving our build systems into Jenkins as well.

We recently released Ruby Omnibus, which is the build system we use internally for producing Private Chef packages for our customers. Along with the software descriptions, it is now building public Chef client packages as well and will soon be the source of Open Source Chef Server packages.

Everyone knows that testing is essential, but figuring out how to do it can be a confusing journey. Today we’re excited to share a guest post by Andrew Crump (@acrmp). Andrew is the author of Foodcritic, a tool that checks your cookbooks against best practices and for problems in code that is syntactically correct but would fail when the cookbook is converged by Chef.

We spend a lot of time thinking about the cookbook contribution process and how to test patches to ensure they don’t break functionality. Andrew also recently wrote full suites of minitest and cucumber tests for both the apache2 and mysql cookbooks. Take a look and let us know what you think.

I can’t say enough good things about Andrew’s work. He has not only provided great tools for the community, but that work is well documented and very professional. He is another contributor that we’re proud to have as part of our community.

— Bryan McLellan

Don’t miss the screencast Andrew produced for this blog post: Chef and BDD


By Andrew Crump

Testing is a hot topic in the Chef and Ruby communities. Opscode have kindly agreed to share a screencast I’ve prepared on TDD and BDD, and on how they relate to writing examples for your infrastructure, which you can watch below. I hope you find it interesting.

Behaviour Driven Development is essentially an extension of Test Driven Development that expresses examples in language everyone involved can understand, not just developers.

To quote the RSpec Book:

Behaviour-Driven Development is about implementing an application by describing its behavior from the perspective of its stakeholders.

The RSpec Book describes an iterative approach where you first outline an example in domain language, and then ‘drop down’ to write examples at the level of unit tests to further define the behaviour of the code.

So why link infrastructure back to examples?

Obviously the infrastructure services are required to support the user’s scenario. We already know it’s necessary, so why try to link our infrastructure back to examples?

  • By focusing on examples of the user’s use of the application rather than on the fun implementation detail, we can more easily expose assumptions we might have about the implementation. It encourages thinking creatively to meet the user’s needs rather than repeating patterns we’ve used in the past out of habit. It also exposes waste: by focusing on meeting the user’s needs with the simplest implementation possible, we avoid doing unnecessary things.

  • If we have an automated suite of examples we can run against the application and its infrastructure then this opens up the prospect of refactoring our infrastructure. We can experiment and swap out components of our infrastructure, confident that our examples will detect any regression. By running our examples we can validate at any point in time that we are still able to support the example scenarios that our stakeholders care about.

What do examples look like?

When you write examples you start by writing them in domain language – for example these might be English sentences in the Given / When / Then format.

Take the following example:

Given I want to hire a car
When I search for late-model vehicles specifying my hire dates
Then I should be shown the available vehicles

We don’t express these examples in terms of nodes, cookbooks, resources and roles. Neither do we express them in terms of SSHing onto a box and running a command to check a service is running. Those details are specific to the implementation of our infrastructure and too low-level. Here we should be describing the business scenarios that we think we need to support.

In most cases many different application and infrastructure services will be needed to satisfy a user scenario. There often won’t be a clear relationship between the example scenarios and the underlying infrastructure created to support them. Our examples will typically cut across multiple application and infrastructure services.

It sounds like the infrastructure I build with Chef sits below this

It does. But you can also benefit hugely from testing at the service layer. We can test that, after a node converges against a Chef role, it behaves as we would expect nodes of that role to behave.

To make this concrete – imagine we have a MySQL cookbook that ships with service layer examples. When we deploy a new instance of our database role we can run these examples. They will attempt to connect to the server as a MySQL client and verify they can query and modify data. The examples can be run on another node entirely.
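
To sketch the idea in code: a service layer example only needs a client connection, so it can be written against anything that responds to a query method. The class names, the smoke_test table and the in-memory client below are all hypothetical, invented for illustration. On a real run you would pass in a connection from whichever MySQL client library you use, opened from another node, in place of the stand-in.

```ruby
# Hypothetical service layer check. The class names and the smoke_test
# table are assumptions made for this sketch, not part of any cookbook.
class DatabaseServiceCheck
  # client: any object with a #query(sql) method that returns an array
  # of row hashes, which is the only thing this check depends on.
  def initialize(client)
    @client = client
  end

  # Writes a marker row, reads it back and removes it again.
  # Returns true only if the row that was written could be read.
  def can_modify_and_query?
    marker = "check-#{Process.pid}"
    @client.query("INSERT INTO smoke_test (marker) VALUES ('#{marker}')")
    rows = @client.query("SELECT marker FROM smoke_test WHERE marker = '#{marker}'")
    @client.query("DELETE FROM smoke_test WHERE marker = '#{marker}'")
    rows.any? { |row| row['marker'] == marker }
  rescue StandardError
    false # any connection or query failure means the service is not usable
  end
end

# Stand-in for a real database connection so the sketch runs anywhere.
# It understands only the three statements the check issues.
class InMemoryClient
  def initialize
    @markers = []
  end

  def query(sql)
    case sql
    when /\AINSERT INTO smoke_test \(marker\) VALUES \('(.+)'\)\z/
      @markers << Regexp.last_match(1)
      []
    when /\ASELECT marker FROM smoke_test WHERE marker = '(.+)'\z/
      wanted = Regexp.last_match(1)
      @markers.select { |m| m == wanted }.map { |m| { 'marker' => m } }
    when /\ADELETE FROM smoke_test WHERE marker = '(.+)'\z/
      @markers.delete(Regexp.last_match(1))
      []
    else
      raise ArgumentError, "unexpected SQL: #{sql}"
    end
  end
end

check = DatabaseServiceCheck.new(InMemoryClient.new)
puts check.can_modify_and_query?
```

Because the check knows nothing about packages, config files or init scripts, it should keep passing if the way the role is built changes, as long as clients can still read and write.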

You are now getting into implementation detail – detail that not everyone involved will care about or understand. When writing your cookbook you will still be practising TDD, but these examples won’t necessarily be expressed in a tool like Cucumber, as using normal sentences offers little additional benefit. An exception is if you are writing cookbooks to be widely used by people who know just enough Ruby for Chef but who may not be able to grok your tests – in this case examples written in plain English may be a better idea.

Ok, so I’m not checking which resources have been created on the node?

No, that’s the job of your lower level resource tests, which verify that the box has the correct state. These tests can poke around and look at the state – the packages installed and services started. The service layer tests that sit above these don’t care about how you got the service to work, only that it does work: for example, that a client can query MySQL.

The resource level tests will be the most natural to get up and running with, as they sit at the same level as writing Chef recipes. However, I’d really like to see more people writing tests that check a service behaves as expected, and sharing these with the community as part of their cookbooks.

Cucumber isn’t cool

While Cucumber is a popular tool to use when attempting to practise BDD, it has become fashionable to look down on it as introducing an unnecessary step in the development process, one which adds complexity and boilerplate for little benefit. There is some truth to this, but some of it is also down to misuse of the tool:

  • The main benefits are to broaden the set of people that can get involved and decouple the business examples from the implementation. Being forced to write the examples separately can help clarity but if you are writing scenarios that only the development team reads you are probably doing it wrong.

  • Cucumber is often associated with an unmaintainable proliferation of steps. The code that sits behind your features is still code, and you should strive to write clean code and remove duplication the same as you would elsewhere. Often the easiest way to avoid step definition hell is to extract methods to a separate steps module, so that each step definition is merely a call out to this reusable library of methods.

  • Avoid using pre-canned step definitions when writing your features. This can be very tempting at first but as the whole point of writing features is to express the examples in your own domain language, any attempt to rely on pre-canned definitions will invariably lead to writing features that don’t concisely and clearly express the example. For example, use of web steps “When I fill in form field with value” or SSH steps “When I run this command on the node” should be avoided. Your steps may take these actions behind the scenes but to use this language in your features surfaces too much detail and makes your features hard to follow.
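
As an illustration of the second point, here is a minimal sketch of a steps module for the car hire example. The module, its method and the sample fleet are made up for this post and stand in for calls to a real application. The Cucumber wiring is shown only in comments so that the snippet runs with plain Ruby.

```ruby
require 'date'

# Hypothetical helper module: every name here is invented for illustration.
module CarHireSteps
  Vehicle = Struct.new(:model, :year, :booked_dates)

  # Stand-in for the system under test. A real suite would call the
  # application here instead of filtering a hard-coded list.
  FLEET = [
    Vehicle.new('Hatchback', 2012, [Date.new(2012, 6, 2)]),
    Vehicle.new('Saloon', 2011, []),
    Vehicle.new('Estate', 2004, [])
  ].freeze

  # Returns late-model vehicles with no booking inside the hire dates.
  def search_late_model_vehicles(from:, to:, newer_than: 2010)
    FLEET.select do |vehicle|
      vehicle.year > newer_than &&
        vehicle.booked_dates.none? { |date| (from..to).cover?(date) }
    end
  end
end

# In a Cucumber suite the step definitions would then be thin wrappers:
#
#   World(CarHireSteps)
#
#   When(/^I search for late-model vehicles specifying my hire dates$/) do
#     @results = search_late_model_vehicles(from: @hire_start, to: @hire_end)
#   end
#
#   Then(/^I should be shown the available vehicles$/) do
#     raise 'no vehicles shown' if @results.empty?
#   end

# Outside Cucumber, any object extended with the module plays the same role.
world = Object.new.extend(CarHireSteps)
results = world.search_late_model_vehicles(from: Date.new(2012, 6, 1),
                                           to: Date.new(2012, 6, 3))
puts results.map(&:model).inspect
```

The feature file keeps the domain language, the step definitions stay one line long, and the logic lives in ordinary methods that can be reused and refactored like any other code.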

What’s next?

I think there’s a massive public benefit in shipping service layer tests with our cookbooks. Being able to demonstrate that the examples for a service still work following code changes allows us to be much more confident when making changes, and should help drive more feature-rich and solid cookbooks.

