I have been thinking about test-driven sysadmin recently. We wouldn’t want to write code without tests, so we shouldn’t want to create untestable pieces of infrastructure either.
I think Cucumber (or rather its Gherkin syntax) could be a nice way of expressing policy:
Now, the idea is that a scenario is skipped when the clauses in its ‘given’ part are not met. In this case I have defined two scenarios: one for a mail server, and one for a production server that relays via a mail host. When I run this against a host, the given clauses work out which policies to test. I needed to teach lettuce to have skippable scenarios. You can find a rough-and-ready implementation of this on my branch of lettuce. Now all that is needed is to implement the steps to check those features (which are at the bottom of this post).
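To make the skipping rule concrete, here is a minimal sketch in plain Python. It is not lettuce code and not the implementation on my branch: the `Scenario` class, the host dictionary, and the hostname `mail.example.com` are all made up for illustration. The only point is that the givens decide whether a policy applies, and the thens decide whether it holds.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical host description; in practice these facts would be probed remotely.
Host = Dict[str, object]


@dataclass
class Scenario:
    name: str
    givens: List[Callable[[Host], bool]]  # preconditions: does this policy apply?
    thens: List[Callable[[Host], bool]]   # assertions: does the policy hold?


def run_scenario(scenario, host):
    """Return 'skipped', 'passed' or 'failed' for one scenario on one host."""
    if not all(given(host) for given in scenario.givens):
        return "skipped"  # a given is not met, so the policy does not apply
    if all(then(host) for then in scenario.thens):
        return "passed"
    return "failed"


# Made-up policies, loosely following the two scenarios described above.
mail_server = Scenario(
    name="mail server",
    givens=[lambda h: h["role"] == "mail"],
    thens=[lambda h: 25 in h["listening_ports"]],
)
production_relay = Scenario(
    name="production server relays via mail host",
    givens=[
        lambda h: h["environment"] == "production",
        lambda h: h["role"] != "mail",
    ],
    thens=[lambda h: h["relayhost"] == "mail.example.com"],
)


def check_host(host):
    """Run every scenario against one host and collect the outcomes by name."""
    return {s.name: run_scenario(s, host) for s in (mail_server, production_relay)}
```

Run against a production web server, the mail server scenario would be skipped and only the relay policy would actually be tested.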
The nice part about this is that I can write a fair amount of logic in Python to test the state of the system, while the checks themselves call out through Fabric to simple shell commands, i.e. I don’t need to install Python on the system I wish to test. Changes in policy can be tested site-wide by running a feature against all hosts. I can also test the external view of the machine under test from my own box by, say, connecting to a port or issuing an HTTP GET.
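As a rough sketch of that split, the helpers below keep the logic in local Python and only send a shell command to the remote side. The function names are hypothetical, and `run` is assumed to be any callable that takes a command string and returns its output (Fabric's `run` fits that shape). The `postconf -h relayhost` command assumes the host uses Postfix.

```python
import socket


def relayhost(run):
    """Read the configured relay host using a plain shell command.

    `run` is any callable that executes a command on the target host and
    returns its stdout as a string. Nothing but a shell and Postfix's
    `postconf` is assumed on the remote side.
    """
    return run("postconf -h relayhost").strip()


def port_open(host, port, timeout=3.0):
    """Check the external view: can this machine open a TCP connection?"""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Refused, unreachable or timed out: treat the port as closed.
        return False
```

A step definition could then assert, for example, that `relayhost(run)` matches the expected mail host, or that `port_open(hostname, 25)` is true for a mail server.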
This is just the sketch of an idea: I need to write a command line tool to handle the setup of Fabric hosts, and to allow multiple hosts to be tested at once. I would also like to add a fix mode, whereby a failing step could call out to the code that corrects the failing policy. Do you think this is a good way to test a policy?
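For what it's worth, here is one way the fix mode and the multi-host loop might look. None of this exists yet: `enforce`, `run_policy`, `parse_args` and the `--fix` flag are names I am inventing for the sketch, and the checks and fixes are passed in as plain callables rather than real lettuce steps.

```python
import argparse


def enforce(check, fix=None, fix_mode=False):
    """Run a check; in fix mode, try the fix once and then check again."""
    if check():
        return "passed"
    if not fix_mode or fix is None:
        return "failed"
    fix()
    return "fixed" if check() else "failed"


def run_policy(hosts, make_step, fix_mode=False):
    """Apply one policy step to several hosts.

    `make_step(host)` is assumed to return a (check, fix) pair of callables
    bound to that host.
    """
    results = {}
    for host in hosts:
        check, fix = make_step(host)
        results[host] = enforce(check, fix, fix_mode)
    return results


def parse_args(argv=None):
    """Hypothetical command line: one or more hosts, plus an optional --fix."""
    parser = argparse.ArgumentParser(description="Test a policy against hosts")
    parser.add_argument("hosts", nargs="+", help="hosts to test")
    parser.add_argument("--fix", action="store_true",
                        help="try to correct failing policies")
    return parser.parse_args(argv)
```

Re-running the check after the fix keeps the report honest: a host is only marked as fixed if the policy actually holds afterwards.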
