A function is one of the smallest units of behaviour. Given an input, it returns an output.

This means that given the context (input), the function's behaviour can be verified by calling the function and looking at the outcome (output).

Note that the most straightforward outcome to verify is the function's return value, but it's not the only outcome we can look for. Functions can also have side effects, modify state, or interact with collaborators.
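To make the two kinds of outcome concrete, here is a minimal sketch. All the names in it (total_price, confirm_order, RecordingMailer) are hypothetical and made up for illustration; amounts are whole cents to keep the arithmetic exact.

```python
from dataclasses import dataclass, field


def total_price(prices: list[int]) -> int:
    """Pure function: the only outcome to verify is the return value."""
    return sum(prices)


@dataclass
class RecordingMailer:
    """Hypothetical hand-rolled test double that records what it was asked to send."""
    sent: list[tuple[str, str]] = field(default_factory=list)

    def send(self, recipient: str, body: str) -> None:
        self.sent.append((recipient, body))


def confirm_order(mailer: RecordingMailer, recipient: str, prices: list[int]) -> None:
    """Returns nothing: the outcome is the interaction with the collaborator."""
    mailer.send(recipient, f"Your order total is {total_price(prices)} cents")


# Outcome 1: the return value.
assert total_price([1000, 550]) == 1550

# Outcome 2: the message passed to the collaborator.
mailer = RecordingMailer()
confirm_order(mailer, "alice@example.com", [1000, 550])
assert mailer.sent == [("alice@example.com", "Your order total is 1550 cents")]
```

In both cases the test follows the same shape: set up the context, call the function, look at the outcome. Only the place where the outcome shows up differs.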

Note: I'm going to talk about functions as a unit of behaviour for simplicity, but most of it applies to objects as well. After all, objects can be thought of as functions with some context encapsulated.
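As a rough illustration of that last point, the sketch below expresses the same behaviour twice: once as a function that closes over its context and once as an object that holds it. The names (make_discounter, Discounter) are hypothetical.

```python
from typing import Callable


def make_discounter(percent: int) -> Callable[[int], int]:
    """Returns a function with the discount rate captured in a closure."""
    def apply(price: int) -> int:
        return price * (100 - percent) // 100
    return apply


class Discounter:
    """The same behaviour, with the discount rate held as object state."""

    def __init__(self, percent: int) -> None:
        self.percent = percent

    def apply(self, price: int) -> int:
        return price * (100 - self.percent) // 100


# Both are verified the same way: provide the context, call, check the outcome.
assert make_discounter(25)(200) == 150
assert Discounter(25).apply(200) == 150
```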

Functions can be composed to build more complex behaviours out of the simpler ones.

Composition works all the way up (or down).

It's a simplification, but an application can be thought of as a composition of behaviours. At the top, there's an entry point that the end user will call. The entry point will in turn call lower levels, which will call even lower levels, which will call [...], all the way to the bottom.
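A toy version of such a layered composition might look like the sketch below. The three levels and every function name are assumptions made up for this example, not a prescribed structure.

```python
# Bottom level: tiny, independent behaviours.
def parse_amount(text: str) -> int:
    return int(text.strip())


def add_tax(amount: int, percent: int) -> int:
    return amount + amount * percent // 100


def format_amount(amount: int) -> str:
    return f"{amount} cents"


# Middle level: composed out of the bottom-level behaviours.
def price_with_tax(text: str, percent: int) -> int:
    return add_tax(parse_amount(text), percent)


# Top level: the entry point the end user would call.
def main(argv: list[str]) -> str:
    return format_amount(price_with_tax(argv[0], int(argv[1])))


assert main([" 1000 ", "20"]) == "1200 cents"
```

Each level is itself a function with an input and an output, so each one is a place where a test could be attached.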

In my tests, I can choose to verify the behaviour on any level.

By choosing one of the higher levels, I'm able to verify the behaviour closer to the end user's experience. The cost is less pressure put on the design.

By choosing one of the lower levels, I'm able to verify the behaviour of components that make up the system. The gain is better feedback on my design.
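The trade-off can be sketched with one small system tested at two levels. The checkout example, its function names, and the 10% bulk discount rule are all hypothetical; amounts are whole cents.

```python
def line_total(unit_price: int, quantity: int) -> int:
    return unit_price * quantity


def bulk_discount(total: int) -> int:
    """Hypothetical rule: 10% off totals of 10000 cents or more."""
    return total * 90 // 100 if total >= 10_000 else total


def checkout(lines: list[tuple[int, int]]) -> int:
    """Entry point: composes the two lower-level behaviours."""
    subtotal = sum(line_total(price, quantity) for price, quantity in lines)
    return bulk_discount(subtotal)


# Higher level: close to what the user experiences, silent about the design.
def test_large_orders_are_discounted():
    assert checkout([(4000, 2), (2500, 1)]) == 9450


# Lower level: pins down each component, so it reacts to design changes.
def test_line_total_multiplies_price_by_quantity():
    assert line_total(4000, 2) == 8000


def test_bulk_discount_starts_at_the_threshold():
    assert bulk_discount(9_999) == 9_999
    assert bulk_discount(10_000) == 9_000


test_large_orders_are_discounted()
test_line_total_multiplies_price_by_quantity()
test_bulk_discount_starts_at_the_threshold()
```

If line_total and bulk_discount were merged or renamed, the higher-level test would keep passing untouched, while the lower-level tests would have to be rewritten. In exchange, only the lower-level tests tell me exactly where the discount boundary sits.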

The higher the level I choose to test through, the more freedom I get in arranging the lower levels. It's not necessarily a good kind of freedom, since it takes more mental effort to stay disciplined and take steps small enough to remain in control. Furthermore, at a certain altitude, it becomes harder to verify behaviours buried at lower levels.

The lower the level I choose to test through, the more organised I am and the smaller the steps I take. The tests will also give me better feedback on my design, though they may be less business-focused. I will sometimes need to refactor them, or even replace them, if I decide to change some components or the way they interact.

So which way do I prefer?

If I had to choose one, I'd go for the lower-level, fine-grained tests.

I tend to write both kinds though, to make sure both external and internal quality are taken care of.

I like to exercise my acceptance tests through one of the higher levels. I often use them to create a "walking skeleton". I won't write a lot of them, and they won't execute every possible path. Instead, they exercise big chunks of behaviour to confirm and document the business goals, and they make sure everything is correctly wired together. These tests merely supplement the tests that exercise the lower levels.

I like to write a lot more fine-grained tests. They are smaller, more focused, and faster to write and run. Each one exercises a very small chunk of behaviour. These tests help me take small steps, keep me disciplined, and give me feedback on the quality of my design.

On granularity of tests focused on behaviour