David G. Wiseman

Weighty Mail Brings System to Knees

	Several computers in the Earth Science department at Stanford were
brought to their knees on Dec 13, 1989 by an interesting combination of bugs.  I
can only assume that similar numbers of machines in other departments around
campus also succumbed. I don't really know, because our network is dead as a
result!

	For some reason, the Stanford Chinese Student association mailing list
started bouncing mail infinitely between two Stanford machines, "Macbeth" and
"Portia". At each iteration the 30K of mail was rebroadcast anew to everyone on
the list, including at least one Chinese student on each machine in our
department. This was the first bug. I don't know why it happened. (Anyone know
that story?) This bug alone would have been bad, but not catastrophic...

	The problem was that after each successive bounce the return address
got longer and longer, until monstrosities like the following became
commonplace:

<@Macbeth.Stanford.EDU,@Portia.stanford.edu,@Macbeth.Stanford.EDU,@Portia.stanford.edu,@Macbeth.Stanford.EDU:xu@spanky.Stanford.EDU>
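
	To make the growth concrete, here is a minimal Python sketch of how a
return address like the one above lengthens with each bounce. It assumes each
relay simply prepends its own "@host" to the route; the function name
bounced_address and the strict alternation between the two hosts are my own
illustration, not anything taken from the actual mailers.

```python
from itertools import cycle, islice

# Hosts and mailbox are copied from the example address in the post.
RELAYS = ("Macbeth.Stanford.EDU", "Portia.stanford.edu")
MAILBOX = "xu@spanky.Stanford.EDU"


def bounced_address(bounces, mailbox=MAILBOX):
    """Return the source-routed return address after `bounces` relays.

    Assumption: every relay prepends "@host" to the route, and the two
    hosts strictly alternate, starting with the first entry of RELAYS.
    """
    route = []
    for host in islice(cycle(RELAYS), bounces):
        route.insert(0, "@" + host)  # newest relay goes to the front
    if not route:
        return "<" + mailbox + ">"
    return "<" + ",".join(route) + ":" + mailbox + ">"


if __name__ == "__main__":
    # Print the address length after each bounce to show the steady growth.
    for n in range(8):
        print(n, len(bounced_address(n)))
```

Under these assumptions, five bounces reproduce the address shown above, and
every further bounce adds another 20 or 21 characters.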

	Once the reply addresses got up to about the length shown in this
example, they started to overflow a fixed-length buffer in all Berkeley-derived
mailers (NO checking for overflow, OF COURSE). This caused the affected mail
processes to go crazy. First of all, they sent an error message back to the
sending machine (causing it to send the viral mail _again_ 30 minutes later).
Worse, the mangled mail processes continued to run until somebody killed
them. (And it took kill -9 as superuser to do it.) One such mail process on a
lightly-used Earth Science machine accumulated _9500_ minutes of CPU before
anyone figured out why the machine had been so slow for the preceding week!
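
	For illustration only, the missing check amounts to something like the
sketch below. The real mailers are C programs whose source I have not
reproduced; this Python model uses a bytearray as a stand-in for the
fixed-length buffer, and BUF_SIZE, AddressTooLong, and copy_address are all
hypothetical names and values chosen for the example.

```python
BUF_SIZE = 128  # hypothetical buffer size; the real limit is not stated


class AddressTooLong(ValueError):
    """Raised when an address does not fit in the fixed-length buffer."""


def copy_address(buf, address):
    """Copy `address` into `buf` with a trailing NUL, refusing to overflow.

    Returns the number of address bytes written (excluding the NUL).
    """
    data = address.encode("ascii")
    # Leave room for the terminating NUL, as a C string would need.
    if len(data) + 1 > len(buf):
        raise AddressTooLong(
            "address needs %d bytes, buffer holds %d" % (len(data) + 1, len(buf))
        )
    buf[:len(data)] = data  # same-length slice assignment: buf keeps its size
    buf[len(data)] = 0
    return len(data)


if __name__ == "__main__":
    buf = bytearray(BUF_SIZE)
    print(copy_address(buf, "<xu@spanky.Stanford.EDU>"))
    try:
        copy_address(buf, "<" + "@Macbeth.Stanford.EDU," * 10 + ">")
    except AddressTooLong as err:
        print("rejected:", err)
```

The point of the sketch is that an over-long address is rejected with an error
the caller can handle, rather than being written past the end of the buffer.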

	Our first reaction was to scream at the people who run the list. They
said they had fixed the problem, but their fix only resulted in a greater
variety of mail loops being generated: various interesting orbits about portia,
polya, hamlet, macbeth, ...

	Needless to say, as the traffic built up, the CPUs and then finally the
network itself in the Earth Science department ground to a halt. It became
impossible even to log in to many machines to kill the offending mail
processes. Killing the processes wasn't very effective, either, since that
would not stop Macbeth from immediately starting the whole mess up again. We
finally pulled the plug on sendmail.

	Fortunately this evening Stanford had a power glitch that seems to have
crashed most of the offending machines, so the network is OK again now. :-)
