MOO-cows Mailing List Archive


processor load problem - update



Hi all.

Thanks for the replies I got so far. Here is an update on the situation and
some previously missing info:

The server we are running is 1.7.8p4 with the MPL patch 1.4.

It certainly was not a good idea to run the MOO as root and at such a high
priority; that was an accident, it seems. Still, that alone doesn't
explain why an empty MOO needs that many cycles on the server.

So today our sysop (Michael Mealling) checked what the MOO does all the
time. At that point the MOO was basically idling, with only me logged in and
3 queued tasks that did nothing except wait (using suspend).

Here is an extract of what he saw:


>poll(0x00576710, 10, 1000)                      = 0
>time()                                          = 816284036
>time()                                          = 816284036
>time()                                          = 816284036
>time()                                          = 816284036
>poll(0x00576710, 10, 1000)                      = 0
>time()                                          = 816284037
>time()                                          = 816284037
>alarm(0)                                        = 21117
>sigaction(SIGALRM, 0xDFFFF700, 0xDFFFF800)      = 0
>sigaction(SIGALRM, 0xDFFFF700, 0xDFFFF800)      = 0
>setitimer(ITIMER_VIRTUAL, 0xDFFFF8A0, 0xDFFFF8A0) = 0
>sigaction(SIGVTALRM, 0xDFFFF700, 0xDFFFF800)    = 0
>sigaction(SIGVTALRM, 0xDFFFF700, 0xDFFFF800)    = 0
>time()                                          = 816284037
>sigaction(SIGALRM, 0xDFFFF710, 0xDFFFF810)      = 0
>alarm(21117)                                    = 0
>sigaction(SIGVTALRM, 0xDFFFF710, 0xDFFFF810)    = 0
>setitimer(ITIMER_VIRTUAL, 0xDFFFF8B0, 0x00000000) = 0
>getitimer(ITIMER_VIRTUAL, 0xDFFFF4B0)           = 0
>getitimer(ITIMER_VIRTUAL, 0xDFFFF4B0)           = 0
>getitimer(ITIMER_VIRTUAL, 0xDFFFF4B0)           = 0
>getitimer(ITIMER_VIRTUAL, 0xDFFFF4B0)           = 0
>getitimer(ITIMER_VIRTUAL, 0xDFFFF4B0)           = 0
>time()                                          = 816284037
>getitimer(ITIMER_VIRTUAL, 0xDFFFF4B0)           = 0
>time()                                          = 816284037
>time()                                          = 816284037
>getitimer(ITIMER_VIRTUAL, 0xDFFFF4B0)           = 0
>getitimer(ITIMER_VIRTUAL, 0xDFFFF4B0)           = 0
>getitimer(ITIMER_VIRTUAL, 0xDFFFF4B0)           = 0
>getitimer(ITIMER_VIRTUAL, 0xDFFFF4B0)           = 0
>getitimer(ITIMER_VIRTUAL, 0xDFFFF4B0)           = 0
>getitimer(ITIMER_VIRTUAL, 0xDFFFF4B0)           = 0
>getitimer(ITIMER_VIRTUAL, 0xDFFFF4B0)           = 0
(...)

and from then on there are thousands of these getitimer calls with only a
few other calls in between. The MOO took between 1% and 30% of the
processor cycles during that time, switching between high and low load for
no apparent reason.
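
For anyone not familiar with these calls, here is a minimal Python sketch of
the general pattern that can produce a trace like this: a process arms
ITIMER_VIRTUAL and then keeps asking getitimer() how much of its budget is
left while it works. This is only an illustration and not the MOO server's
actual code; the function name and the 0.05 second budget are made up.
ITIMER_VIRTUAL counts down only while the process is executing in user mode,
so a long run of getitimer calls suggests the server is busy running
something rather than sleeping in poll().

```python
import signal


def poll_virtual_timer(budget_seconds):
    """Arm ITIMER_VIRTUAL, then spin and poll getitimer() until it expires.

    Returns the number of getitimer() calls made. Hypothetical helper,
    for illustration only. Must be called from the main thread.
    """
    # Without a handler, SIGVTALRM would terminate the process on expiry.
    old_handler = signal.signal(signal.SIGVTALRM, lambda signum, frame: None)
    signal.setitimer(signal.ITIMER_VIRTUAL, budget_seconds)
    polls = 0
    remaining = budget_seconds
    try:
        # The timer only advances while we burn user-mode CPU time,
        # which this busy loop does.
        while remaining > 0:
            remaining, _interval = signal.getitimer(signal.ITIMER_VIRTUAL)
            polls += 1
    finally:
        # Disarm the timer and restore the previous handler.
        signal.setitimer(signal.ITIMER_VIRTUAL, 0)
        signal.signal(signal.SIGVTALRM, old_handler)
    return polls


if __name__ == "__main__":
    print("getitimer() calls before expiry:", poll_virtual_timer(0.05))
```

Running something like this under a syscall tracer should show the same
kind of getitimer burst, which is why the trace above looks like a server
that is executing tasks even though nothing visible is going on.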

Does that ring a bell with anybody...?

(Maybe I should mention that the server is quite a powerful machine. I
always forget what exactly it is, but most of the time it has 4 CPUs and
runs the GaTech WWW server, FTP server, news server and a few other things
alongside the MOO (I even had two MOOs running on it for some time)
without any problems. So 30% load on _that_ machine really is _a
lot_. If somebody needs more accurate data I can ask ;) )

OK, so maybe that's a problem with the MPL patch? To check, we started a
second MOO we have on that machine that uses the same server but does not
have the MPL patch. That MOO had no queued tasks (and only me logged in).
We observed basically the same behaviour, but haven't looked into the
details yet.

The strange thing is that these problems appeared for the first time about
1-2 weeks ago; in the months before that we never had any problems of this
kind.

Well, so much for an update on the situation...

----

[Jay Carlson]:
>More accurately, the MOO server doesn't muck with the UID it runs as;
>root is treated the same as any other user.
>
>This may not be optimal for MPL-patched servers; you'd sorta like the
>server to start as root, lock down port 80 (or whatever) and then
>throw away root privs.

*nod*

>The rest of the story is that something weird *is* happening.  It's
>quite possible that MPL has a bug; I'm sure that Ivan or someone else
>would appreciate a chance to debug this.  JHM, another MPL-patched
>server has never exhibited this behavior, though.

I have already tried to contact Ivan but have had no luck so far. Does
anybody have an up-to-date email address for him?

>My best guess is that you have a connection that's locked in
>I-don't-understand-that mode.  We had this problem at one of MITRE's
>servers, and it was eventually tracked down to a hosed client that
>sent a bad line (like "telnet> "), to which the MOO server said "I
>don't understand that.", which then triggered another line from the
>client....

Interesting. As far as I know, everybody accessed the WWW gateway using only
Netscape...

>Whenever you have non-human network connections in 1.7.8 and earlier,
>you have to be very careful not to let the input stream get out of
>control.  The simplest thing to do is make sure you have a read()
>loop locked on the connection at all times.  Orbitnet gets a bit
>fancier, and puts both connection objects into rooms that have :huh
>verbs that can't escalate the I-don't-understand-that condition.

Ok. Thanks for that hint.
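
To make sure I understand the failure mode, here is a toy simulation in
Python (not MOO code, and not how the server is implemented; all names are
invented for illustration). A broken client answers every line it receives
with its own prompt, and the server either complains about unknown input or,
as with a read() loop holding the connection, swallows it silently.

```python
KNOWN_COMMANDS = {"look", "who"}  # hypothetical command set


def server_reply(line, quiet):
    """Return the server's answer to one input line, or None for no answer."""
    if line in KNOWN_COMMANDS:
        return "OK"
    # quiet=True models a connection whose input is consumed without
    # ever producing an error message.
    return None if quiet else "I don't understand that."


def hosed_client_reply(line):
    """A broken client that answers anything with its own prompt."""
    return "telnet> "


def count_exchanges(quiet, limit=1000):
    """Count server replies triggered by a single bad line from the client."""
    line = "telnet> "
    exchanges = 0
    while exchanges < limit:
        reply = server_reply(line, quiet)
        if reply is None:
            break  # nothing sent back, so the client has nothing to answer
        exchanges += 1
        line = hosed_client_reply(reply)
    return exchanges


if __name__ == "__main__":
    print("complaining server:", count_exchanges(quiet=False))
    print("quiet server:", count_exchanges(quiet=True))
```

With the complaining server the exchange only stops at the artificial
limit; with the quiet one it never starts.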


[Pieter-Bas]

>I've got a moo running under the same circumstances, though it is just a
>standard moo. What the standard moo-server does is that it checkpoints the
>database every hour, and it dumps it to disk. It uses quite a bit of processor
>time when it does so:
>
>  PID USERNAME PRI NICE   SIZE   RES STATE   TIME   WCPU    CPU COMMAND
> 1395 pbijdens  58    0  2800K 2484K run     0:05 81.83% 24.16% moo
>
>I never thought this was a problem, since it always lasts just for a few
>seconds.

Well the dump sure _does_ kick the server each time. We set the
checkpointing interval to 6 hours because of that... Maybe not such a good
idea, but most of the time TechMOO is quite empty anyway...

Thanks a lot for any help so far.

Andreas



