MOO-cows Mailing List Archive


Re: processor load problem



> Andreas Dieberger writes:
> > Our sysop told me that my MOO frequently takes up to 90% of the processor
> > time of our server (which is a quite fast machine).
> > 
> > And here is the info I got from our sysop:
> > 
> > >  PID USERNAME PRI NICE   SIZE    RES STATE   TIME   WCPU    CPU COMMAND
> > >  802 root     -25    0  6716K   596K cpu  1159:12 94.71% 93.88% mpl-moo
> > >
> > >As you can tell mpl-moo is taking 90% and up all the time and I'm not sure
> > >why. If you could take another look at it I'd appreciate it. Thanks!
> > 
> > I just noticed that the MOO obviously runs under root - I don't think that
> > should be like that. Could that somehow cause the problem ...? Still how
> > can it cause that much load if it doesn't *do* anything?

> It looks to me like you're using a hacked MOO server, possibly with the MPL
> patch (multi-port listening)?  In any case, something is weird about how you're
> doing things, since the normal MOO server never runs as root 

More accurately, the MOO server doesn't muck with the UID it runs as; 
root is treated the same as any other user.  

This may not be optimal for MPL-patched servers; you'd sorta like the 
server to start as root, lock down port 80 (or whatever) and then 
throw away root privs.
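
For illustration, here is a minimal sketch of that bind-then-drop pattern.
It is written in Python rather than the server's own C, and it is not what
the MPL patch does; bind_then_drop and its uid/gid parameters are
hypothetical names made up for this example.

```python
import os
import socket

def bind_then_drop(port, uid, gid, host="127.0.0.1"):
    """Bind a listening socket first, then give up root if we have it.

    uid and gid are the unprivileged IDs to switch to (hypothetical
    values supplied by the caller).
    """
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind((host, port))   # ports below 1024 normally need root
    sock.listen(5)

    if os.geteuid() == 0:
        # Order matters: supplementary groups and gid must be changed
        # while we are still root; setuid() goes last.
        os.setgroups([])
        os.setgid(gid)
        os.setuid(uid)
    return sock
```

The socket stays bound after the UID change, so the process can keep
accepting connections on the privileged port without keeping root.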

> and never mucks
> with its scheduling priority.  

The kernel is perfectly happy to muck with its scheduling priority 
for you.

From man 1 ps under SunOS 4.1.4:

     PRI         Process  priority  (non-positive  when  in  non-
                 interruptible wait).
     NI          Process scheduling increment (see getpriority(2)
                 and nice(3V)).

and man 1 nice:

     nice executes command with the nice value number.  The  nice
     value  is one of the factors used by the kernel to determine
     a  process's  scheduling  priority.   Scheduling  priorities
     range  from  0  to 127.  The higher the value, the lower the
     command's scheduling priority, and the lower the value,  the
     higher  the  command's  scheduling priority.  In addition to
     the nice value, the kernel also considers recent CPU usage by
     the process, the time the process has been waiting to run,
     and other factors to arrive at scheduling priority.

...so what's probably happening here is that an unrenice'd MOO server 
is chewing all the CPU, and the kernel is giving it a lower priority 
in order to give interactive tasks an edge.
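
To make the PRI/NICE distinction concrete, here is a small sketch that
splits the top(1) line quoted above into named columns.  parse_top_row is a
hypothetical helper, and it assumes no column value contains whitespace
(true for this sample, not for top output in general).

```python
# Header and sample row copied from the sysop's top(1) output above.
HEADER = "PID USERNAME PRI NICE   SIZE    RES STATE   TIME   WCPU    CPU COMMAND"
SAMPLE = "802 root     -25    0  6716K   596K cpu  1159:12 94.71% 93.88% mpl-moo"

def parse_top_row(header, row):
    """Pair each column name with its value (whitespace-separated)."""
    return dict(zip(header.split(), row.split()))

fields = parse_top_row(HEADER, SAMPLE)
print(fields["NICE"], fields["PRI"], fields["WCPU"])
```

NICE comes out as 0, so nobody reniced the process; the PRI value of -25 is
the number the kernel arrived at by itself.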

> What's the rest of the story?

The rest of the story is that something weird *is* happening.  It's 
quite possible that MPL has a bug; I'm sure that Ivan or someone else 
would appreciate a chance to debug this.  JHM, another MPL-patched 
server, has never exhibited this behavior, though.

My best guess is that you have a connection that's locked in 
I-don't-understand-that mode.  We had this problem on one of MITRE's 
servers, and it was eventually tracked down to a hosed client that 
sent a bad line (like "telnet> "), to which the MOO server said "I 
don't understand that.", which then triggered another line from the 
client....
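
The loop is easy to reproduce in a toy model.  The sketch below is an
assumption-laden simulation in Python, not MOO server code: server_reply,
hosed_client, quiet_client and simulate are all made-up names, and the
"parser" understands exactly one command.

```python
def server_reply(line):
    """Toy command parser: only 'look' is understood."""
    if line.strip() == "look":
        return "You see nothing special."
    return "I don't understand that."

def hosed_client(server_line):
    """A wedged client that answers every server line with its prompt."""
    return "telnet> "

def quiet_client(server_line):
    """A well-behaved peer that sends nothing further."""
    return None

def simulate(first_line, client, max_rounds):
    """Count server replies until the client goes quiet or the cap is hit."""
    line, rounds = first_line, 0
    while line is not None and rounds < max_rounds:
        reply = server_reply(line)
        line = client(reply)
        rounds += 1
    return rounds
```

With hosed_client the exchange only stops because of the max_rounds cap;
with quiet_client it stops after a single reply.  A real server has no such
cap, so it keeps parsing and replying for as long as the connection lives.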

Whenever you have non-human network connections in 1.7.8 and earlier, 
you have to be very careful not to let the input stream get out of 
control.  The simplest thing to do is make sure you have a read() 
loop locked on the connection at all times.  Orbitnet gets a bit 
fancier, and puts both connection objects into rooms that have :huh 
verbs that can't escalate the I-don't-understand-that condition.
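
The idea behind both fixes is the same: something other than the default
command parser consumes every line from the machine peer, and unknown input
is dropped without a reply.  Here is a rough Python analogy of that idea; it
is not MOO code and not how Orbitnet does it, and serve_machine_peer and the
handlers table are hypothetical.

```python
import io

def serve_machine_peer(stream, handlers):
    """Consume every line from stream; reply only to known commands.

    handlers maps a command word to a function taking the rest of the
    line and returning the reply text.
    """
    replies = []
    for raw in stream:
        parts = raw.rstrip("\r\n").split(None, 1)
        if not parts:
            continue              # blank line: nothing to do
        handler = handlers.get(parts[0])
        if handler is None:
            continue              # unknown input: drop it, send nothing back
        arg = parts[1] if len(parts) > 1 else ""
        replies.append(handler(arg))
    return replies

# Example input: two junk lines, one blank line, one known command.
demo = io.StringIO("telnet> \nPING 1\n\ngarbage\n")
demo_replies = serve_machine_peer(demo, {"PING": lambda arg: "PONG " + arg})
```

Because junk lines never produce output, a confused peer has nothing to
respond to and the exchange dies out.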

> 	Pavel


Jay Carlson
nop@io.com    nop@ccs.neu.edu    nop@kagoona.mitre.org

Flat text is just *never* what you want.   ---stephen p spackman



