[BlueOnyx:20258] Load Average High

Richard Sidlin richard at sidlin.co.uk
Thu Nov 17 04:29:36 -05 2016


Hi

 

I need a bit of advice. A few times a day, the load average on one of my
BlueOnyx servers climbs to around 65. Needless to say, nothing much can be
done on the server until it comes back down.

 

I ran this command (found on a different forum) when it started to go wrong
this morning:

  top -b -n 1 | awk '{if (NR <=7) print; else if ($8 == "D") {print; count++} } END {print "Total status D (I/O wait probably): "count}' > topsave.txt

 

This was the output:

 

top - 08:36:18 up 1 day, 22:39,  3 users,  load average: 21.75, 13.19, 5.98

Tasks: 275 total,   2 running, 272 sleeping,   1 stopped,   0 zombie

Cpu(s):  0.3%us,  0.1%sy,  0.0%ni, 95.0%id,  4.6%wa,  0.0%hi,  0.0%si,  0.0%st

Mem:  16325480k total,  6542392k used,  9783088k free,   375556k buffers

Swap:  4194300k total,        0k used,  4194300k free,  5325764k cached

 

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND

56942 enquirie  20   0 19712 2720 1780 D  1.9  0.0   0:04.96 imap

  918 root      20   0     0    0    0 D  0.0  0.0   0:17.69 flush-253:4

  950 root      20   0     0    0    0 D  0.0  0.0   0:24.13 kjournald

  952 root      20   0     0    0    0 D  0.0  0.0   0:06.24 kjournald

56995 root      20   0 91072 6332 2104 D  0.0  0.0   0:00.00 sendmail

57002 root      20   0 91072 6320 2100 D  0.0  0.0   0:00.02 sendmail

57228 root      20   0 91072 6320 2100 D  0.0  0.0   0:00.03 sendmail

57241 root      20   0 90272 5588 2124 D  0.0  0.0   0:00.00 sendmail

57248 root      20   0 90940 5344 1280 D  0.0  0.0   0:00.00 sendmail

57268 root      20   0 88000 3560  456 D  0.0  0.0   0:00.00 sendmail

57271 root      20   0 91072 6248 2044 D  0.0  0.0   0:00.00 sendmail

57286 root      20   0 91072 6244 2044 D  0.0  0.0   0:00.00 sendmail

57298 root      20   0 90480 6376 2816 D  0.0  0.0   0:00.04 sendmail

57300 root      20   0 91072 6248 2044 D  0.0  0.0   0:00.00 sendmail

57418 enquirie  20   0 19000 2092 1728 D  0.0  0.0   0:00.00 imap

57420 enquirie  20   0 19024 2204 1748 D  0.0  0.0   0:00.00 imap

57424 root      20   0 91272 6500 2056 D  0.0  0.0   0:00.00 sendmail

57469 root      20   0 86820 2360  632 D  0.0  0.0   0:00.00 sendmail

57470 root      20   0 86820 2188  452 D  0.0  0.0   0:00.00 sendmail

Total status D (I/O wait probably): 19
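
For context: on Linux the load average counts tasks in uninterruptible sleep (state D) as well as runnable ones, which is how the load can climb while the CPUs sit mostly idle. The one-liner above hard-codes the state column as field 8 and the header as 7 lines. A minimal sketch of a variant that finds the S column from the header row and tallies D-state processes per command follows. The function name count_d_state is made up for illustration, and it assumes command names contain no spaces (so not `top -c`).

```shell
# Tally processes in uninterruptible sleep (state D) per command name.
# Reads `top -b -n 1` output on stdin. The S column is located from the
# header row instead of being hard-coded as field 8.
# count_d_state is a hypothetical helper name, not part of any tool.
count_d_state() {
  awk '
    $1 == "PID" {                      # header row: find the state column
      for (i = 1; i <= NF; i++) if ($i == "S") col = i
      next
    }
    col && $col == "D" { n[$NF]++ }    # last field is COMMAND (assumes no spaces)
    END { for (c in n) print n[c], c }
  ' | sort -rn
}

# Usage: top -b -n 1 | count_d_state
```

Grouping by command makes it easier to see at a glance whether the blocked tasks are mostly mail delivery, authentication, or kernel flush/journal threads.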

 

As I saw loads of sendmail processes, I thought I would stop the service and
see what happened. The load average continued to climb until the server hung
again and I was unable to do anything. Strangely, it seems to happen at the
same time each morning, and the problem has been occurring for about a week.
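
Since the stalls recur at the same time each morning, one thing that could be checked is whether a scheduled job starts around that hour. This is only a guess at a possible cause, not a diagnosis. A rough sketch, assuming the usual system crontab locations (/etc/crontab and /etc/cron.d, which may differ on a given install); cron_at_hour is a hypothetical helper name:

```shell
# List crontab entries whose hour field matches a given hour.
# Only a literal hour or "*" is matched; ranges, lists and step values
# (e.g. 8-10, 6,8 or */2) are NOT expanded by this sketch.
# cron_at_hour is a hypothetical helper name.
cron_at_hour() {
  hour=$1; shift
  awk -v h="$hour" '
    /^[ \t]*(#|$)/       { next }    # skip comments and blank lines
    $1 ~ /^[A-Za-z_]+=/  { next }    # skip VAR=value lines such as MAILTO=root
    $2 == h || $2 == "*" { print FILENAME ": " $0 }
  ' "$@"
}

# Usage (paths are assumptions): cron_at_hour 8 /etc/crontab /etc/cron.d/*
```

Per-user crontabs are not covered; `crontab -l -u USER` prints those and its output can be piped through the same awk filter.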

 

Can anyone advise on how I can get to the bottom of what's causing the
problem? I have just run another top command once access came back and got
this:

 

Here is one more when it escalated even higher:

 

top - 08:55:01 up 1 day, 22:57,  3 users,  load average: 44.97, 30.92, 20.79

Tasks: 318 total,   2 running, 313 sleeping,   2 stopped,   1 zombie

Cpu(s):  0.3%us,  0.1%sy,  0.0%ni, 94.4%id,  5.2%wa,  0.0%hi,  0.0%si,  0.0%st

Mem:  16325480k total,  6617764k used,  9707716k free,   375800k buffers

Swap:  4194300k total,        0k used,  4194300k free,  5322732k cached

 

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND

  918 root      20   0     0    0    0 D  0.0  0.0   0:18.30 flush-253:4

  950 root      20   0     0    0    0 D  0.0  0.0   0:26.66 kjournald

  952 root      20   0     0    0    0 D  0.0  0.0   0:06.26 kjournald

 7981 root      20   0     0    0    0 D  0.0  0.0   0:00.67 flush-253:3

55156 root      20   0     0    0    0 D  0.0  0.0   0:00.01 flush-253:0

58150 root      20   0  105m 4044 3112 D  0.0  0.0   0:00.02 auth

58152 root      20   0  105m 4040 3108 D  0.0  0.0   0:00.03 auth

58168 root      20   0  105m 4040 3112 D  0.0  0.0   0:00.02 auth

58208 enquirie  20   0 20288 2912 2032 D  0.0  0.0   0:02.31 imap

58225 enquirie  20   0 19180 2412 1852 D  0.0  0.0   0:00.06 imap

58261 root      20   0  162m  83m 1328 D  0.0  0.5   0:00.00 cced

58274 root      20   0 91072 6324 2104 D  0.0  0.0   0:00.00 sendmail

58276 root      20   0  102m 3812 2948 D  0.0  0.0   0:00.02 auth

58278 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.02 auth

58280 root      20   0  102m 3812 2948 D  0.0  0.0   0:00.01 auth

58281 root      20   0 91072 6316 2104 D  0.0  0.0   0:00.00 sendmail

58283 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.01 auth

58284 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.01 auth

58287 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.01 auth

58290 root      20   0  102m 3812 2948 D  0.0  0.0   0:00.02 auth

58294 root      20   0  102m 3812 2948 D  0.0  0.0   0:00.03 auth

58297 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.03 auth

58298 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.03 auth

58300 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.03 auth

58303 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.03 auth

58307 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.03 auth

58311 root      20   0  102m 3812 2948 D  0.0  0.0   0:00.03 auth

58312 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.03 auth

58313 root      20   0  102m 3812 2948 D  0.0  0.0   0:00.03 auth

58314 root      20   0 91072 6316 2096 D  0.0  0.0   0:00.00 sendmail

58315 root      20   0 91072 6324 2104 D  0.0  0.0   0:00.15 sendmail

58318 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.03 auth

58320 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.03 auth

58321 root      20   0  102m 3812 2948 D  0.0  0.0   0:00.03 auth

58323 root      20   0 91272 6544 2084 D  0.0  0.0   0:00.00 sendmail

58324 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.04 auth

58327 root      20   0  102m 3812 2948 D  0.0  0.0   0:00.03 auth

58331 root      20   0  102m 3808 2948 D  0.0  0.0   0:00.03 auth

58332 root      20   0  102m 3808 2948 D  0.0  0.0   0:00.04 auth

58333 root      20   0  102m 3812 2948 D  0.0  0.0   0:00.03 auth

58335 root      20   0  102m 3812 2948 D  0.0  0.0   0:00.03 auth

58337 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.03 auth

58339 root      20   0  102m 3816 2948 D  0.0  0.0   0:00.03 auth

58340 despatch  20   0  8828  924  804 D  0.0  0.0   0:00.00 procmail

58341 root      20   0  102m 3812 2948 D  0.0  0.0   0:00.04 auth

58343 root      20   0 91072 6236 2044 D  0.0  0.0   0:00.00 sendmail

58344 root      20   0  102m 3812 2948 D  0.0  0.0   0:00.03 auth

58347 root      20   0  102m 3808 2948 D  0.0  0.0   0:00.03 auth

58348 root      20   0 91272 6488 2048 D  0.0  0.0   0:00.00 sendmail

58350 les       20   0 11548 3752  812 D  0.0  0.0   0:00.02 procmail

58351 root      20   0 91228 5564 1240 D  0.0  0.0   0:00.00 sendmail

 

Looking at it at the moment, top shows 99.8%wa. I have just rebooted the
server, as it had been in this state for almost an hour.
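
For reference, the %wa figure is the share of CPU time spent idle while waiting on outstanding I/O, taken from the iowait counter on the aggregate "cpu" line of /proc/stat. A small sketch of computing it from two samples, which could be logged from a loop when top itself is too slow to respond. The function name iowait_pct is made up for illustration, and only the first eight counters (user through steal) are summed so that guest time is not counted twice.

```shell
# Percentage of CPU time spent in iowait between two samples of the
# aggregate "cpu" line from /proc/stat.
# Counters after the "cpu" label: user nice system idle iowait irq softirq steal
# iowait_pct is a hypothetical helper name.
iowait_pct() {
  printf '%s\n%s\n' "$1" "$2" | awk '
    { total = 0; for (i = 2; i <= 9 && i <= NF; i++) total += $i }
    NR == 1 { t0 = total; w0 = $6 }           # $6 is the iowait counter
    NR == 2 {
      dt = total - t0
      if (dt > 0) printf "%.1f\n", 100 * ($6 - w0) / dt
      else print "0.0"
    }
  '
}

# Usage:
#   a=$(head -n 1 /proc/stat); sleep 5; b=$(head -n 1 /proc/stat)
#   iowait_pct "$a" "$b"
```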

 

BlueOnyx 5208R

Hyper-V 2012 R2

4 CPUs (upgraded from 1 to try to resolve the issue)

16 GB RAM

 

 

Richard

 
