In B News, expiration had to be performed by a program called
expire, which took a list of newsgroups as arguments,
along with a time specification after which articles had to be expired.
To have different hierarchies expire at different times, you had to write a
script that invoked expire for each of them separately.
C News offers a more convenient solution. In a file called
explist, you may specify newsgroups and expiration
intervals. A command called doexpire is usually run once
a day from cron and processes all groups according to this
list.
Occasionally, you may want to retain articles from certain groups even
after they have been expired; for example, you might want to keep
programs posted to comp.sources.unix.
This is called archiving. The explist file
permits you to mark groups for archiving.
An entry in explist looks like this:
grouplist perm times archive
grouplist is a comma-separated list of
newsgroups to which the entry applies. Hierarchies may be specified by
giving the group name prefix, optionally with all appended.
For example, for an entry applying to
all groups below comp.os,
enter either comp.os or
comp.os.all.
When expiring news from a group, the name is
checked against all entries in explist in the
order given. The first matching entry applies. For example, to throw
away the majority of comp
after four days, except for comp.os.linux.announce, which you want
to keep for a week, you simply list an entry for the latter, which
specifies a seven-day expiration period, followed by an entry
for comp, which
specifies four days.
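The first-match rule can be illustrated with a short Python sketch. This is not C News code: the function names are hypothetical, and the matching is a simplification that assumes a pattern matches a group when it equals the group name or is a dot-delimited prefix of it, with a trailing all stripped first.

```python
def pattern_matches(pattern, group):
    """Simplified matching for patterns such as comp.os, comp.os.all, or all."""
    if pattern == "all":
        return True
    if pattern.endswith(".all"):
        pattern = pattern[:-len(".all")]
    # Match the group itself or anything below it in the hierarchy.
    return group == pattern or group.startswith(pattern + ".")


def first_match(entries, group):
    """Return the first (grouplist, days) entry whose grouplist matches group."""
    for grouplist, days in entries:
        if any(pattern_matches(p, group) for p in grouplist.split(",")):
            return grouplist, days
    return None


# Order matters: the more specific entry has to come before the general one.
entries = [
    ("comp.os.linux.announce", 7),
    ("comp", 4),
]

print(first_match(entries, "comp.os.linux.announce"))
print(first_match(entries, "comp.lang.c"))
```

If the two entries were swapped, the comp entry would match first and comp.os.linux.announce would be expired after four days like the rest of the hierarchy.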
The perm field specifies whether the entry applies to
moderated, unmoderated, or any groups. It may take the values
m,
u, or
x, which denote moderated, unmoderated,
or any type, respectively.
The third field, times, usually contains
only a single number. This is the number of days after which articles
expire if they haven't been assigned an explicit expiration
date in an Expires: field in the article
header. Note that this is the number of days counting from the article's
arrival at your site, not from the date of posting.
The times field may, however, be more
complex than that. It may be a combination of up to three numbers
separated from one another by dashes. The first denotes the number of
days that have to pass before the article is considered a candidate
for expiration, even if the Expires: field would
have it expire already. It is rarely useful to use a value other than
zero. The second number is the previously mentioned default number of days
after which the article will be expired. The third is the number of days after
which an article will be expired unconditionally, regardless of
whether it has an Expires: field or not. If only
the middle number is given, the other two take default values. These
may be specified using the special entry /bounds/, which is described a little
later.
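One way to read the three numbers is as a lower bound, a default, and an upper bound on an article's lifetime. The Python sketch below follows that reading. It is an illustration rather than the actual C News logic: the function names are hypothetical, only the one-number and three-number forms are handled, whole days are assumed, and the bounds argument stands in for the /bounds/ entry.

```python
from datetime import datetime, timedelta


def parse_times(field, bounds):
    """Parse a times field into (retention, default, purge) in days.

    bounds is a (retention, default, purge) triple standing in for the
    /bounds/ entry; it supplies the outer values when only the middle
    number is given.
    """
    parts = [int(p) for p in field.split("-")]
    if len(parts) == 1:
        return (bounds[0], parts[0], bounds[2])
    if len(parts) == 3:
        return tuple(parts)
    raise ValueError("unsupported times field: %r" % field)


def expiry_date(arrived, times, expires_header=None):
    """Compute when an article expires, counting from its arrival."""
    retention, default, purge = times
    earliest = arrived + timedelta(days=retention)  # never expire before this
    latest = arrived + timedelta(days=purge)        # always expire by this
    if expires_header is not None:
        wanted = expires_header                     # date from Expires:
    else:
        wanted = arrived + timedelta(days=default)  # default lifetime
    return min(max(wanted, earliest), latest)


bounds = (0, 1, 90)
arrived = datetime(2024, 1, 1)
times = parse_times("7", bounds)
print(times)
print(expiry_date(arrived, times))
print(expiry_date(arrived, times, expires_header=datetime(2024, 12, 31)))
```

In the last call, the Expires: date lies far beyond the third number, so the result is capped at 90 days after arrival.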
The fourth field, archive, denotes whether the
newsgroup is to be archived and where. If no archiving is intended, a dash
should be used. Otherwise, you either use a full pathname (pointing to a
directory) or an at sign (@). The at sign denotes the default archive
directory, which must then be given to doexpire by using
the -a flag on the command line. An archive directory
should be owned by news. When
doexpire archives an article from, say,
comp.sources.unix, it stores it in
the directory comp/sources/unix below
the archive directory, creating it if necessary. The archive directory itself,
however, will not be created.
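The mapping from a newsgroup name to its archive subdirectory can be sketched in a few lines of Python. The function name and the example path /var/archive/news are hypothetical and serve only as illustration; the sketch computes the path and does not create any directories.

```python
import os.path


def archive_dir_for(group, archive_field, default_archive=None):
    """Return the directory an archived article of group would be stored in.

    archive_field is the fourth explist field: "-" (no archiving),
    "@" (use the default archive directory), or a full pathname.
    """
    if archive_field == "-":
        return None
    if archive_field == "@":
        if default_archive is None:
            raise ValueError("'@' used, but no default archive directory given")
        base = default_archive
    else:
        base = archive_field
    # comp.sources.unix becomes comp/sources/unix below the archive directory.
    return os.path.join(base, *group.split("."))


print(archive_dir_for("comp.sources.unix", "@", "/var/archive/news"))
print(archive_dir_for("comp.sources.unix", "-"))
```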
There are two special entries in your explist file that
doexpire relies on. Instead of a list of newsgroups, they
have the keywords /bounds/ and
/expired/. The
/bounds/ entry contains the default
values for the three numbers of the times field
described previously.
The /expired/ entry determines how
long C News will hold onto lines in the history file.
C News does not remove a line from the history file
as soon as the corresponding article has been expired, but holds onto it
in case a duplicate should arrive after this date. If you are fed by only
one site, you can keep this value small. Otherwise, a couple of weeks is
advisable on UUCP networks, depending on the delays you experience with
articles from these sites.
Here is a sample explist file with rather tight
expiry intervals:
# keep history lines for two weeks. No article gets more than three months
/expired/                       x       14      -
/bounds/                        x       0-1-90  -
# groups we want to keep longer than the rest
comp.os.linux.announce          m       10      -
comp.os.linux                   x       5       -
alt.folklore.computers          u       10      -
rec.humor.oracle                m       10      -
soc.feminism                    m       10      -
# Archive *.sources groups
comp.sources,alt.sources        x       5       @
# defaults for tech groups
comp,sci                        x       7       -
# enough for a long weekend
misc,talk                       x       4       -
# throw away junk quickly
junk                            x       1       -
# control messages are of scant interest, too
control                         x       1       -
# catch-all entry for the rest of it
all                             x       2       -
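To check a file like this one before installing it, a small script can split each line into its four fields and separate the two special entries from the ordinary ones. The following Python sketch is a hypothetical helper, not part of C News. It assumes that fields are separated by whitespace and that comment lines start with a hash sign, as in the sample.

```python
def parse_explist(text):
    """Split explist text into special entries and ordinary entries."""
    special = {}
    entries = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        fields = line.split()
        if len(fields) != 4:
            raise ValueError("expected four fields: %r" % raw)
        grouplist, perm, times, archive = fields
        if perm not in ("m", "u", "x"):
            raise ValueError("bad perm field: %r" % raw)
        if grouplist in ("/bounds/", "/expired/"):
            special[grouplist] = (perm, times, archive)
        else:
            # Keep file order, since the first matching entry applies.
            entries.append((grouplist.split(","), perm, times, archive))
    return special, entries


sample = """\
# keep history lines for two weeks
/expired/ x 14 -
/bounds/ x 0-1-90 -
comp.os.linux.announce m 10 -
comp.sources,alt.sources x 5 @
all x 2 -
"""

special, entries = parse_explist(sample)
print(special)
for entry in entries:
    print(entry)
```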
Expiring presents several potential problems. One is that your newsreader
might rely on the third field of the active file
described earlier, which contains the number
of the lowest article online. When expiring articles, C News does not update
this field. If you need (or want) to have this field represent the real
situation, you need to run a program called updatemin after
each run of doexpire. (In older versions of C News, a
script called upact did this.)
C News does not expire by scanning the newsgroup's directory, but simply
checks the history file to see whether each article is due for
expiration.[1]
If your history file somehow gets out of sync, articles may be around on
your disk forever because C News has literally forgotten
them.[2]
You can repair this by using the addmissing script in
/usr/lib/news/maint, which will add missing articles
to the history file, or mkhistory,
which rebuilds the entire file from scratch. Don't forget to become user
news before invoking either of them, or else you
will wind up with a history file unreadable by
C News.