Wednesday, 25 April 2007

Quantifying the Cost of Feature-Creep

A naive person might think that the cost of building a software product with 100 features would be roughly 100c, where c is the average cost of a new feature. Unfortunately the real cost is proportional to at least the square of the number of features, and may be much worse. Here's why.

A Holistic View
No feature is an island. It interacts with some of the other features.

An easy way to appreciate the consequence of this is to imagine that the software product is being constructed incrementally, feature by feature. [In fact, this is not a bad approximation of reality.]

Each feature may have an interaction with a pre-existing feature. Therefore not only do we need to design and implement the new feature, we need to determine whether it interacts with each of the pre-existing features, and possibly modify them to co-exist with the new feature.

The cheapest case
In the cheapest case -- I won't call it the best case for reasons which will be described below -- all features are independent, and we simply pay the cost of being careful, i.e. checking that the new feature is independent of all the other features.

So the interaction cost is 0 + 1 + 2 + ... + (n-1) for n features, which anyone who knows the famous story about Gauss can tell you equals n(n-1)/2, and is therefore proportional to n^2.

So the overall cost is n * average_isolated_cost + n^2 * average_consideration_cost, up to a constant factor on the second term.
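To make the cheapest case concrete, here is a minimal Python sketch of the cost model. The function names and the cost values passed in are hypothetical, chosen only for illustration; the sketch uses the exact count of pairwise checks, n(n-1)/2, which is the quantity that grows in proportion to n^2.

```python
def interaction_checks(n: int) -> int:
    """Number of independence checks when n features are added one at a time."""
    # The k-th feature added (counting from 0) is checked against the k
    # features that already exist, giving 0 + 1 + ... + (n-1) = n*(n-1)/2.
    return sum(range(n))


def cheapest_case_cost(n: int, isolated_cost: float, consideration_cost: float) -> float:
    """Cost when every feature turns out to be independent of the others.

    isolated_cost and consideration_cost are assumed averages, in whatever
    unit of effort you prefer; they are illustrative inputs, not measurements.
    """
    return n * isolated_cost + interaction_checks(n) * consideration_cost


if __name__ == "__main__":
    # Show how the checking term comes to dominate as n grows.
    for n in (10, 100, 1000):
        print(n, interaction_checks(n), cheapest_case_cost(n, 10, 1))
```

Even when a single check is far cheaper than building a feature, the number of checks grows so much faster than the number of features that it eventually dominates the total.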

The priciest case
What if the features are not independent? Not only do we incur the cost of modifying code associated with pre-existing features, but this may trigger a cascade:

Adding the nth feature may require a revision of code associated with the (n-1)th feature, but this code may also have supported all the other previous features! For example, quite apart from any design-level interaction between feature n and feature n-2, there may be an additional indirect coupling via feature n-1. And there may be more distant indirect interactions too.

So in the worst case, adding a feature involves adding the new feature, modifying the existing features to account for direct interactions, and modifying existing features to account for indirect interactions.
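One way to picture the cascade is as reachability in a graph of couplings between features. The sketch below is only an illustration under that assumption: the `coupling` map, the feature numbers, and the function name are all hypothetical, and it says nothing about how expensive each revision is, only which existing features might need a second look.

```python
from collections import deque
from typing import Dict, Set, Tuple


def affected_features(coupling: Dict[int, Set[int]], new_feature: int) -> Tuple[Set[int], Set[int]]:
    """Split existing features into directly and indirectly affected sets.

    coupling maps a feature to the features it is coupled with (assumed
    symmetric here, but the search does not depend on that).
    """
    direct = set(coupling.get(new_feature, set()))
    seen = {new_feature} | direct
    queue = deque(direct)
    indirect: Set[int] = set()
    # Breadth-first search: follow couplings outward from the direct set.
    while queue:
        current = queue.popleft()
        for neighbour in coupling.get(current, set()):
            if neighbour not in seen:
                seen.add(neighbour)
                indirect.add(neighbour)
                queue.append(neighbour)
    return direct, indirect


if __name__ == "__main__":
    # Feature 4 is new and only touches feature 3 directly, but feature 3
    # shares code with 2, which in turn shares code with 1.
    coupling = {4: {3}, 3: {4, 2}, 2: {3, 1}, 1: {2}}
    print(affected_features(coupling, 4))
```

In this chain, a new feature with a single direct interaction still reaches every earlier feature indirectly.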

Note to the reader: Please let me know if you figure out what a good upper bound is. My hunch is that it is proportional to the factorial of the number of features.

Why the cheapest case is not the best case
In the cheapest case all features are independent. But in a cohesive software product you expect features to interact; so you would not want to design a product with this characteristic.

On the other hand you probably do not want all features to interact, because the end-result would seem overly dense, and consequently very difficult to learn to use.

Take-home message: The likely degree of coupling between features is a consideration in both usability and cost.

What to do: Software Developers
The thought-experiments above do not reflect how developers actually determine how new features interact with existing features. However, determining these interactions is important. The following techniques may be of help:
  1. Reflection: A developer with a good theory of the product will be able to diagnose some interactions by thought, white-board, and poking around.
  2. Good Test Coverage: A good set of tests (e.g. built using TDD) will help: failures of existing tests show where a new addition interacts deleteriously, giving clues as to the problems.
  3. Design by Contract: DBC-style assertions will give even better locality information than a test by showing where in the code the violations of old assumptions occur.
Of course, good abstraction and modularity of the code-base help too.
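As a rough illustration of the Design by Contract point, here is a small Python sketch using plain `assert` statements. The `Account` class and the overdraft-fee feature are hypothetical examples, not taken from any real product, and a proper DBC tool would offer more than bare assertions (note also that Python skips `assert` when run with `-O`).

```python
class Account:
    """Hypothetical original feature: a balance that is assumed never to go negative."""

    def __init__(self, balance: int = 0) -> None:
        self.balance = balance
        self._check_invariant()

    def _check_invariant(self) -> None:
        # The old assumption, written down as a contract.
        assert self.balance >= 0, "invariant violated: balance must be non-negative"

    def withdraw(self, amount: int) -> None:
        assert amount > 0, "precondition violated: amount must be positive"
        assert amount <= self.balance, "precondition violated: insufficient funds"
        self.balance -= amount
        self._check_invariant()


def apply_overdraft_fee(account: Account, fee: int) -> None:
    """Hypothetical later feature that unknowingly breaks the old assumption."""
    account.balance -= fee
    # The violation surfaces here, next to the code that caused it.
    account._check_invariant()


if __name__ == "__main__":
    account = Account(5)
    try:
        apply_overdraft_fee(account, 10)
    except AssertionError as error:
        print("caught:", error)
```

A failing test would say that something about accounts is broken; the assertion additionally names which old assumption was violated and fires inside the new feature's own code.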

What to do: Software Designers
Of course the big message is to designers:

Choose your features carefully

The cost grows at least quadratically with the number of features, so "just throwing things in" is a policy that will lead to great cost later. You can reduce this cost by aiming for smaller feature-sets that do more with less.

Good luck!
