As with previous issues about modularity and interface design,
Unix programmers react to a set of distinctions they have often
learned from experience without knowing how to articulate. Therefore
we'll need to start by developing some terminology.
We'll begin by defining what software complexity is. Then we'll
make some horizontal distinctions between different flavors of
complexity, which sometimes have to be traded off against each other.
We'll finish with some even more important vertical distinctions,
between the kinds of complexity we must live with and the kinds we
have the option to eliminate.
Questions about simplicity, complexity, and the right size of
software arouse a lot of passion in the Unix world. Unix programmers
have learned a view of the world in which simplicity is beauty is
elegance is good, and in which complexity is ugliness is
grotesquery is evil.
Underlying the Unix programmer's passion for simplicity is a
pragmatic fact: complexity costs. Complex software is harder to think
about, harder to test, harder to debug, and harder to maintain —
and above all, harder to learn and use. The costs of complexity,
rough as they are during development, bite hardest after
deployment. Complexity creates places for bugs to nest, from which
they will emerge to trouble the world through the entire lifetime of
their software.
All kinds of pressures tend to drag programmers into a swamp of
complexity nevertheless. We've examined a rogue's gallery of these in
earlier chapters; feature creep and premature optimization are the two
most notorious. Traditionally, Unix programmers push back against
these tendencies by proclaiming with religious fervor a rhetoric that
condemns all complexity as bad.
So what exactly do we mean by ‘complexity’? This
point is worth pinning down, because it varies by observer.
Unix programmers (like other programmers) tend to focus on
implementation complexity—basically,
the degree of difficulty a programmer will experience in attempting to
understand a program so he or she can mentally model or debug
it.
Customers and users, on the other hand, tend to see complexity
in terms of the program's interface complexity.
In Chapter 11 we discussed
the quality of ease and its inverse, mnemonic load. To a user,
complexity correlates closely with mnemonic load. Poor expressiveness
and concision can matter too, if a weak interface forces the user to
perform lots of error-prone or merely tedious low-level operations
rather than a few high-level ones.
Driven by both of these is a third measure that is much simpler:
the total number of lines of code in the system, its
codebase size. In terms of life-cycle costs,
this is usually the most important measure. The reasons go back to
perhaps the most important empirical result in software engineering,
one we've cited before: the defect density of code, bugs per hundred
lines, tends to be a constant independent of implementation language.
More lines of code means more bugs, and debugging is the most
expensive and time-consuming part of development.
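To see why this measure dominates life-cycle cost, run the
arithmetic with a purely illustrative rate of one defect per hundred
lines: a 50,000-line implementation of some job can be expected to
harbor on the order of 500 latent bugs, while a 10,000-line
implementation of the same job carries closer to 100. Whatever the
true constant is for a given shop, a fivefold reduction in code buys
roughly a fivefold reduction in expected defects, and therefore in
debugging cost.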
Codebase size, interface complexity, and implementation
complexity may all rise together. That is the usual result of feature
creep, and why programmers especially dread it. Premature
optimization doesn't tend to raise interface complexity, but it has
bad effects (often severely bad) on implementation complexity and
codebase size. But those sorts of arguments against complexity are
relatively easy to win; the difficult ones begin when these three
measures have to be traded off against each other.
We've already mentioned one situation in which two measures vary in
opposite directions: a user interface that has been designed primarily
to preserve implementation simplicity, or keep codebase size down, may
simply dump low-level tasks on the user. (A crude example of this,
barely imaginable to a Unix programmer but all too common elsewhere,
might be an editor that lacked a global-replace feature.) Though this
sort of design failure is all too common, it does not traditionally
have a name. We'll call it a manularity
trap.
Pressure to keep the codebase size down by using extremely dense
and complicated implementation techniques can cause a cascade of
implementation complexity in the system, leading to an un-debuggable
mess. This used to happen frequently when fitting programs onto very
small systems demanded assembler programming or tricks like
self-modifying code; nowadays it is uncommon except in embedded systems,
and rapidly becoming rare even there. This kind of design failure doesn't
have a traditional name, but one might call it a blivet
trap, after an old Army term for the results of attempting
to stuff ten pounds of horse manure into a five-pound bag.
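The milder modern form of the same pressure appears whenever lines
are squeezed out at the expense of legibility. Here is a toy sketch
in Python, with made-up data (the historical blivets described above
involved assembler tricks and self-modifying code, not anything this
tame), contrasting a line-count-minimizing tally with its
straightforward equivalent:

    rows = [("widgets", 3), ("gadgets", 5), ("widgets", 2)]  # hypothetical sample data

    # Dense: a single expression, but the reader must unpack an import
    # trick, a reduce, a lambda, and a dict rebuilt on every iteration.
    totals = __import__("functools").reduce(
        lambda acc, r: {**acc, r[0]: acc.get(r[0], 0) + r[1]}, rows, {})

    # Plain: a few more lines, but the control flow and data flow are
    # visible at a glance, and a debugger can step through them.
    totals = {}
    for name, count in rows:
        totals[name] = totals.get(name, 0) + count

The dense version saves lines; the plain one saves the maintainer.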
The blivet trap won't appear in our case studies, but we've
defined it for contrast with its opposite. It can happen that the
designers of a project are so wary of implementation complexity that
they reject a complex but unified way to solve a whole class of
problems in favor of lots of duplicative, ad-hoc code that solves each
individual one in turn. The result is bloat in the size of the
codebase, and maintainability problems more severe than if the unified
method had been accepted. For example, a Web project that really
needs a centralized relational database behind its pages might instead
spawn several different keyed data files containing information that
has to be reintegrated at page generation time. This sort of failure
is all too common. It doesn't have a traditional name; we'll call it
an adhocity trap.
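Here is a minimal sketch of how that trap plays out, with
hypothetical file, table, and function names (the example above
names no particular technology). The ad-hoc version keeps user
names, preferences, and order counts in three separately maintained
keyed files, and every page that needs them must re-join them by
hand; the unified version pays once for a schema and then asks a
single question per page:

    import json
    import sqlite3

    def render(name, theme, order_count):
        # Trivial stand-in for real page generation.
        return "<h1>%s</h1><p>theme: %s, orders: %d</p>" % (
            name, theme, order_count)

    # Adhocity trap: three keyed data files, reintegrated at page
    # generation time -- and again, slightly differently, in every
    # other page that touches the same data.
    def profile_page_adhoc(user_id):
        users = json.load(open("users.json"))
        prefs = json.load(open("prefs.json"))
        orders = json.load(open("order_counts.json"))
        key = str(user_id)
        return render(users[key]["name"],
                      prefs.get(key, {}).get("theme", "default"),
                      orders.get(key, 0))

    # Unified alternative: one schema, one declarative query per page.
    # Here db is assumed to be an open sqlite3.Connection.
    def profile_page_sql(db, user_id):
        name, theme, count = db.execute(
            """SELECT u.name,
                      COALESCE(p.theme, 'default'),
                      (SELECT COUNT(*) FROM orders o WHERE o.user_id = u.id)
                 FROM users u LEFT JOIN prefs p ON p.user_id = u.id
                WHERE u.id = ?""", (user_id,)).fetchone()
        return render(name, theme, count)

The ad-hoc functions multiply: each new page reimplements a slightly
different join, and any change in the data's shape has to be chased
through all of them, which is exactly the maintainability problem
described above.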
These are the three faces of complexity, and some of the traps
designers fall into in attempts to avoid them.[114] We'll see more examples
when we get to the case studies later in the chapter.