Programs can also have run-control or dot directories. These
group together several configuration files that are related to the
program, but that are most conveniently treated separately (perhaps
because they relate to different subsystems of the program, or have
differing syntaxes).
Whether file or directory, convention now dictates that the
location of the run-control information has the same basename as the
executable that reads it. An older convention still common among
system programs uses the executable's name with the suffix
‘rc’ for ‘run control’.[101] Thus, if you write a
program called ‘seekstuff’ that has both site-wide and
user-specific configuration, an experienced Unix user would expect to
find the former at /etc/seekstuff and the latter
at .seekstuff in the user's home directory; but
it would be unsurprising if the locations were
/etc/seekstuffrc and
.seekstuffrc, especially if seekstuff were a
system utility of some sort.
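The naming convention above can be sketched in a few lines of Python. This is only an illustration: the helper names `rc_candidates` and `first_existing` are hypothetical, and the search order (bare basename first, then the older 'rc'-suffixed name) is an assumption rather than a rule stated by the convention.

```python
import os

def rc_candidates(program, home=None, etc="/etc"):
    """Return (site_wide, per_user) lists of candidate run-control paths.

    Each list gives the basename-only location first, then the older
    'rc'-suffixed variant. The ordering is an assumption of this sketch.
    """
    if home is None:
        home = os.path.expanduser("~")
    site = [os.path.join(etc, program),
            os.path.join(etc, program + "rc")]
    user = [os.path.join(home, "." + program),
            os.path.join(home, "." + program + "rc")]
    return site, user

def first_existing(paths):
    """Return the first path that exists (file or directory), else None."""
    for path in paths:
        if os.path.exists(path):
            return path
    return None
```

A program would typically read the site-wide location first and then let the per-user location override it, so that users can adjust settings without touching /etc.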
While the semantics of run-control files are of course
completely program dependent, there are some design rules about
run-control syntax that are widely observed. We'll describe
those next; but first we'll describe an important exception.
If the program is an interpreter for a language, then its run-control
file is expected to be simply a file of commands in the syntax of that
language, to be executed at startup. This is an important rule,
because Unix tradition strongly encourages the design of all kinds of
programs as special-purpose languages and minilanguages. Well-known
examples with dotfiles of this kind include the various Unix command
shells and the Emacs programmable editor.
(One reason for this design rule is the belief that special
cases are bad news — thus, that any switch that changes the
behavior of a language should be settable from within the language.
If as a language designer you find that you cannot express all the
startup settings of a language in the language itself, a Unix
programmer would say you have a design problem, which is what you
should be fixing, rather than devising a special-case run-control syntax.)
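As a minimal sketch of this exception, consider a toy interpreter whose startup file is fed through exactly the same command executor as ordinary input. The `TinyInterpreter` class and its two-command language (`set NAME VALUE` and `echo NAME`) are invented for illustration only.

```python
class TinyInterpreter:
    """A toy command language: 'set NAME VALUE' and 'echo NAME'."""

    def __init__(self):
        self.variables = {"prompt": ">"}
        self.output = []

    def execute(self, line):
        """Execute one command line; blank lines and # comments are ignored."""
        words = line.split()
        if not words or words[0].startswith("#"):
            return
        if words[0] == "set" and len(words) == 3:
            self.variables[words[1]] = words[2]
        elif words[0] == "echo" and len(words) == 2:
            self.output.append(self.variables.get(words[1], ""))
        else:
            raise ValueError("bad command: " + line.strip())

    def run(self, lines):
        """Execute each line in turn."""
        for line in lines:
            self.execute(line)

    def load_rc(self, path):
        """Run the startup file through the same execute() used for input."""
        try:
            with open(path) as rc:
                self.run(rc)
        except FileNotFoundError:
            pass  # a missing run-control file is not an error
```

Because `load_rc` has no syntax of its own, every setting that can be changed interactively can also be changed at startup, and nothing extra has to be documented or learned.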
This exception aside, here are the normal style rules for
run-control syntaxes. Historically, they are patterned on the syntax of
Unix shells:
-
Support explanatory comments, and lead them with #.
The syntax should also ignore whitespace before #, so
that comments on the same line as configuration directives are
supported.
-
Don't make insidious whitespace distinctions.
That is, treat runs of spaces and tabs as syntactically the same as a
single space. If your directive format is line-oriented, it is good
form to ignore trailing spaces and tabs on lines. The metarule is that
the interpretation of the file should not depend on distinctions a
human eye can't see.
-
Treat multiple blank lines and comment lines as a single blank line.
If the input format uses blank lines as separators between records,
you probably want to ensure that a comment line does not end a record.
-
Lexically treat the file as a simple
sequence of whitespace-separated tokens, or lines of tokens.
Complicated lexical rules are hard to learn, hard to remember, and
hard for humans to parse. Avoid them.
-
But, support a string syntax for tokens with
embedded whitespace.
Use single quote or double quote as
balanced delimiters. If you support both, beware of giving them different
semantics as they have in shell; this is a well-known source
of confusion.
-
Support a backslash syntax for embedding unprintable and special
characters in strings. The standard pattern for this is the
backslash-escape syntax supported by C compilers. Thus, for example,
it would be quite surprising if the string "a\tb" were not interpreted
as a character ‘a’, followed by a tab, followed by the character ‘b’.
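The rules above can be sketched as a small line tokenizer. This is one possible reading of them, not a reference implementation: the names `tokenize_line` and `parse` are hypothetical, only a handful of C-style escapes are recognized, escapes are honored only inside quoted strings, and a # anywhere outside a quoted string is taken to start a comment.

```python
# A small subset of C backslash escapes (an assumption of this sketch).
ESCAPES = {"t": "\t", "n": "\n", "r": "\r", "0": "\0",
           "\\": "\\", '"': '"', "'": "'"}

def tokenize_line(line):
    """Split one run-control line into a list of tokens."""
    tokens = []
    current = []       # characters of the token being built
    in_token = False   # True once a token has started (even an empty "")
    quote = None       # the currently open quote character, or None
    i = 0
    while i < len(line):
        ch = line[i]
        if quote:
            if ch == "\\" and i + 1 < len(line):
                i += 1  # consume the escaped character as well
                current.append(ESCAPES.get(line[i], line[i]))
            elif ch == quote:
                quote = None          # closing delimiter
            else:
                current.append(ch)
        elif ch in "'\"":
            quote = ch                # both quote styles behave identically
            in_token = True
        elif ch == "#":
            break                     # comment runs to end of line
        elif ch in " \t\r\n":
            if in_token:              # a run of whitespace ends the token
                tokens.append("".join(current))
                current, in_token = [], False
        else:
            current.append(ch)
            in_token = True
        i += 1
    if quote:
        raise ValueError("unterminated string")
    if in_token:
        tokens.append("".join(current))
    return tokens

def parse(text):
    """Return one token list per directive, skipping blank and comment lines."""
    return [toks for toks in map(tokenize_line, text.splitlines()) if toks]
```

With this sketch, the line `name  "a\tb"   # comment` yields two tokens, `name` and the three-character string consisting of ‘a’, a tab, and ‘b’, and the same result is obtained with single quotes. A format that uses blank lines as record separators would need `parse` to distinguish truly blank lines from comment-only lines instead of dropping both.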
Some aspects of shell syntax, on the other hand, should not be
emulated in run-control syntaxes, at least not without a good and
specific reason. The shell's baroque quoting and bracketing rules, and
its special metacharacters for wildcards and variable substitution,
both fall into this category.
It bears repeating that the point of these conventions is to
reduce the amount of novelty that users have to cope with when they
read and edit the run-control file for a program they have never seen
before. Therefore, if you have to break the conventions, try to do so
in a way that makes it visually obvious that you have done so,
document your syntax with particular care, and (most importantly)
design it so it's easy to pick up by example.
These standard style rules only describe conventions about
tokenizing and comments. The names of run-control files, their
higher-level syntax, and the semantic interpretation of the syntax are
usually application-specific. There are a very few exceptions to
this rule, however; one is dotfiles that have become
‘well-known’ in the sense that they routinely carry
information used by a whole class of applications. Sharing
run-control file formats in this way reduces the amount of novelty
users have to cope with.
Of these, probably the best established is the
.netrc file. Internet client programs that must
track host/password pairs for a user can usually get them from
the .netrc file, if it exists.
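As one illustration of such sharing, Python's standard library ships a `netrc` module that reads this format. The sketch below wraps it in a hypothetical helper, `lookup_credentials`; treating a missing or malformed file as "no credentials" is a design choice of the sketch, not something the format requires.

```python
import netrc

def lookup_credentials(host, path=None):
    """Return (login, password) for host from a .netrc-format file, or None.

    When path is None, the netrc module reads .netrc in the user's home
    directory.
    """
    try:
        entry = netrc.netrc(path).authenticators(host)
    except (FileNotFoundError, netrc.NetrcParseError):
        return None  # treat a missing or malformed file as "no credentials"
    if entry is None:
        return None  # no entry for this host and no default entry
    login, _account, password = entry
    return login, password
```

A client would call `lookup_credentials` with the server's host name before prompting the user, and fall back to prompting when it returns None.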