9. Components of a running system

This chapter reviews the components of a running CNews+NNTPd server. Analogous components will be found in an INN-based system too. We invite readers familiar with INN to contribute their pieces to this chapter.

9.1. /var/lib/news: the CNews control area

This directory is more popularly known as $NEWSCTL. It contains configuration, log and status files. No articles or binaries are kept here. Let's see what some of the files are meant for. Control files are dealt with in slightly greater detail in Section 4.3.

9.2. /var/spool/news: the article repository

This is also known as the $NEWSARTS or $NEWSSPOOL directory. This is where the articles reside on your disk. No binaries or control files belong here. Allocate enough space to this directory, because the number of articles keeps increasing with each batch that is digested. An explanation of the following sub-directories will give you an overview of this directory:

9.3. /usr/lib/newsbin: the executables


9.4. crontab and cron jobs

The heart of the Usenet news server is the set of scripts that run at regular intervals, processing articles, digesting or rejecting them, and transmitting them to NDNs. I shall try to enumerate the ones that are important enough to be cronned. :)

9.5. newsrun and relaynews: digesting received articles

The heart and soul of the Usenet News system, newsrun picks up the batches and articles in the in.coming directory of $NEWSARTS, uncompresses them if required, and calls relaynews. It should run from cron.

relaynews reads the articles one by one from stdin. For each article, it looks up the sys file to determine whether the article belongs to a subscribed group, and looks in the history file to confirm that the article does not already exist locally. It then digests the article, updating the active and history files, and batches it for neighbouring sites. It logs an error if it encounters a problem while processing the article, and takes appropriate action if the article happens to be a control message. More information is available in the relaynews manpage.
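The decision flow described above can be sketched in a few lines of Python. This is only an illustration, not the relaynews code: the NewsState class, the relay function and the dictionary layout of an article are all hypothetical, the sys file is reduced to a plain set of group names, and control messages are not handled.

```python
from dataclasses import dataclass, field


@dataclass
class NewsState:
    """Toy stand-ins for the sys, history and active files (all hypothetical)."""
    subscribed: set                                # plays the role of the sys file
    history: set = field(default_factory=set)      # message IDs already digested
    active: dict = field(default_factory=dict)     # group -> highest article number
    outgoing: list = field(default_factory=list)   # message IDs batched for neighbours
    errors: list = field(default_factory=list)     # plays the role of the error log


def relay(article: dict, state: NewsState) -> str:
    """Process one article and return 'accepted', 'duplicate' or 'rejected'."""
    msgid = article["message_id"]

    # Step 1: does the article belong to at least one subscribed group?
    groups = [g for g in article["newsgroups"] if g in state.subscribed]
    if not groups:
        state.errors.append(f"{msgid}: no subscribed group")
        return "rejected"

    # Step 2: has this message ID been seen before?
    if msgid in state.history:
        return "duplicate"

    # Step 3: digest it, updating history and active, then batch it onward.
    state.history.add(msgid)
    for group in groups:
        state.active[group] = state.active.get(group, 0) + 1
    state.outgoing.append(msgid)
    return "accepted"
```

Feeding the same article twice returns "accepted" the first time and "duplicate" the second, which is the property that keeps a site from digesting an article it already holds.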

9.6. doexpire and expire: removing old articles

A good way to get rid of unwanted or old articles from the $NEWSARTS area is to run doexpire once a day. It reads the explist file from the $NEWSCTL directory to determine which articles expire today. It can archive these articles if so configured. It then updates the active and history files accordingly. If you wish to retain an article's entry in the history file after the article has expired, so that it is not digested again as a new article, add a special /expired/ line to the explist control file. The expire manpage has more on the options and functioning.
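The reason for keeping history entries longer than the articles themselves can be seen in a small Python sketch. It is a simplified model, not the expire program: the function name, the two retention parameters and the dictionary-based stand-ins for the spool and the history file are assumptions made for illustration, and it does not read explist.

```python
import time

DAY = 86400  # seconds in a day


def expire(articles, history, keep_days, remember_days, now=None):
    """Remove old articles, but forget their message IDs only much later.

    articles: dict of message ID -> arrival time, for articles still on disk
    history:  dict of message ID -> arrival time, for every ID we remember
    Both structures and both retention parameters are hypothetical.
    Returns the list of message IDs whose articles were removed.
    """
    now = time.time() if now is None else now

    # Articles older than keep_days are removed from the spool.
    removed = [m for m, t in articles.items() if now - t > keep_days * DAY]
    for msgid in removed:
        del articles[msgid]

    # History entries of already-expired articles survive until they are
    # older than remember_days, so a late duplicate is still recognised.
    stale = [m for m, t in history.items()
             if m not in articles and now - t > remember_days * DAY]
    for msgid in stale:
        del history[msgid]

    return removed
```

With keep_days smaller than remember_days, an article that a neighbour offers again shortly after it expired is still found in the history and is refused as a duplicate.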

9.7. nntpd and msgidd: managing the NNTP interface

As has already been discussed in the chapter on setting up the software, nntpd is a TCP-based server daemon which runs under inetd. It is fired by inetd whenever there's an incoming connection on the NNTP port, and it takes over the dialogue from there. It reads the C-News configuration and data files in $NEWSCTL and article files from $NEWSARTS, and receives incoming posts and transfers. These it dutifully queues in $NEWSARTS/in.coming, either as batch files or single article files.

It is important that inetd be configured to fire nntpd as user news, not as root as it does for other daemons such as telnetd or ftpd. If this is not done correctly, it can cause a lot of problems in the functioning of the C-News system later.

nntpd is fired each time a new NNTP connection is received, and dies once the NNTP client closes its connection. Thus, if one nntpd receives a few articles by an incoming batch feed (not a POST but an XFER), then another nntpd will not know about the receipt of these articles till the batches are digested. This will hamper duplicate newsfeed detection if there are multiple upstream NDNs feeding our server with the same set of articles over NNTP. To fix this, nntpd uses an ally: msgidd, the message ID daemon. This daemon is fired once at server bootup time through newsboot, and keeps running quietly in the background, listening on a named Unix socket in the $NEWSCTL area. It keeps in its memory a list of all message IDs which various incarnations of nntpd have asked it to remember.

Thus, when one copy of nntpd receives an incoming feed of news articles, it updates msgidd with the message IDs of these messages through the Unix socket. When another copy of nntpd is fired later and the NNTP client tries to feed it some more articles, the nntpd checks each message ID against msgidd. Since msgidd stores all these IDs in memory, the lookup is very fast, and duplicate articles are blocked at the NNTP interface itself.
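The idea behind msgidd can be illustrated with a minimal Python sketch. The real daemon is a separate process reached over a Unix socket; here the cache lives in the same process for simplicity, and the class and function names are hypothetical.

```python
class MessageIdCache:
    """In-memory set of message IDs, standing in for msgidd (hypothetical)."""

    def __init__(self):
        self._seen = set()

    def offer(self, msgid: str) -> bool:
        """Return True and remember the ID if it is new, False if already known."""
        if msgid in self._seen:
            return False
        self._seen.add(msgid)
        return True


def receive_feed(cache: MessageIdCache, offered_ids):
    """What one nntpd session might do: keep only IDs nobody has offered before."""
    return [msgid for msgid in offered_ids if cache.offer(msgid)]


# Two separate sessions share one cache, as two nntpd instances share msgidd.
cache = MessageIdCache()
first_session = receive_feed(cache, ["<1@a.example>", "<2@a.example>"])
second_session = receive_feed(cache, ["<2@a.example>", "<3@b.example>"])
```

The second session accepts only "<3@b.example>", because the first session has already registered "<2@a.example>", even though that article has not yet been digested into the history file.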

On a running system, expect to see one instance of nntpd for each active NNTP connection, and just one instance of msgidd running quietly in the background, hardly consuming any CPU resources. Our nntpd is configured to die if the NNTP connection is more than a few minutes idle, thus conserving server resources. This does not inconvenience the user because modern NNTP clients simply re-connect. If an nntpd instance is found to be running for days, it is either hung due to a network error, or is receiving a very long incoming NNTP feed from your upstream server. We used to receive our primary incoming feed from our service provider through NNTP sessions lasting 18 to 20 hours without a break, every day.

9.8. nov, the News Overview system

NOV, the News Overview System, is a recent augmentation to the C-News and NNTP systems and to the NNTP protocol. This subsystem maintains a file for each active newsgroup, with one line per current article. This line of text contains some key meta-data about the article, e.g. the contents of the From, Subject and Date headers, the article size, and the message ID. This speeds up NNTP response enormously, because nntpd does not have to open each article to answer a query about its headers. The nov library has been integrated into the nntpd code, and into key binaries of C-News, thus providing seamless maintenance of the News Overview database when articles are added to or deleted from the repository.

When newsrun adds an article into starcom.test, it also updates $NEWSARTS/starcom/test/.overview and adds a line with the relevant data, tab-separated, into it. When nntpd comes to life with an NNTP client, and it sees the XOVER NNTP command, it reads this .overview file, and returns the relevant lines to the NNTP client. When expire deletes an article, it also removes the corresponding line from the .overview file. Thus, the maintenance of the NOV database is seamless.
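The three operations described above (adding a line, answering XOVER, and removing a line on expiry) can be sketched in Python. This is an illustration rather than the nov library: the function names are hypothetical, the .overview file is modelled as a list of strings, and the field order assumed here is the conventional overview order of article number, Subject, From, Date, Message-ID, References, byte count and line count.

```python
# Assumed field order of one overview line (the conventional overview layout).
OVERVIEW_FIELDS = ("number", "subject", "from", "date",
                   "message_id", "references", "bytes", "lines")


def overview_line(entry: dict) -> str:
    """Build one tab-separated overview line from an article's meta-data."""
    # Tabs and newlines inside a field would break the format, so flatten them.
    cleaned = (str(entry[f]).replace("\t", " ").replace("\n", " ")
               for f in OVERVIEW_FIELDS)
    return "\t".join(cleaned)


def article_number(line: str) -> int:
    """The article number is the first field of an overview line."""
    return int(line.split("\t", 1)[0])


def xover(lines, first: int, last: int):
    """Return the overview lines for articles numbered first..last inclusive."""
    return [line for line in lines if first <= article_number(line) <= last]


def drop_article(lines, number: int):
    """What expiry does to the overview: remove the line for one article."""
    return [line for line in lines if article_number(line) != number]
```

Because each answer comes from a single small text file per group, a range query touches one file instead of one file per article.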

9.9. Batching feeds with UUCP and NNTP

Some information about batching feeds has been provided in earlier sections. More will be added here in a later version of this document.