Crazy server load spikes in a short time... please help!
Our server is hosted at theplanet.com: dual Xeon 2.8 GHz, 2 GB RAM, SCSI RAID 5, with cPanel, PHP 4.3.11, Apache 1.33 and MySQL 3.23. This server mainly runs httpd, and we have a separate MySQL server for the databases. It hosts only one huge vbb forum and some small websites.
Recently the server load sometimes goes over 100. A spike lasts less than 1 or 2 minutes, then the load slowly drops back to normal. During a spike, top shows over 600 processes, so I changed MaxClients from 800 to 256 in httpd.conf. That helped a little: the load now peaks at a little over 80 and the process count stays under 400, but the server is just as slow during the spike. I just can't figure out what is causing this... please help!
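Since the spikes only last a minute or two, it may help to catch one as it happens. Below is a minimal sketch, assuming Linux's /proc/loadavg is readable; the threshold of 30 and the 5-second interval are arbitrary illustrative choices, and the function names are made up for this example.

```python
import time


def read_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg text."""
    one, five, fifteen = text.split()[:3]
    return float(one), float(five), float(fifteen)


def is_spike(load1, threshold=30.0):
    """True when the 1-minute load is at or above the (assumed) threshold."""
    return load1 >= threshold


def watch(threshold=30.0, interval=5, path="/proc/loadavg"):
    """Poll the load average forever and print a timestamped line on a spike."""
    while True:
        with open(path) as f:
            load1, load5, load15 = read_loadavg(f.read())
        if is_spike(load1, threshold):
            print(time.strftime("%H:%M:%S"), "load spike:", load1, load5, load15)
        time.sleep(interval)
```

Calling watch() from a root shell and leaving it running would give timestamps to line up against the Apache access log. It could be extended to save a process listing at the same moment.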
Here is some info about my server:
================================
Linux domain.com 2.4.21-32.0.1.ELsmp #1 SMP Tue May 17 17:52:23 EDT 2005 i686 i686 i386 GNU/Linux
================================
-=httpd.conf=-
Timeout 120
KeepAlive Off
MaxKeepAliveRequests 100
KeepAliveTimeout 15
MinSpareServers 20
MaxSpareServers 80
StartServers 32
MaxClients 256
MaxRequestsPerChild 10000
==================================
-=top=-
13:10:04 up 9 days, 2:28, 1 user, load average: 56.73, 20.03, 8.68
337 processes: 336 sleeping, 1 running, 0 zombie, 0 stopped
CPU states: cpu user nice system irq softirq iowait idle
total 9.6% 0.0% 2.0% 0.0% 1.1% 3.8% 83.1%
cpu00 11.1% 0.0% 3.1% 0.0% 1.3% 3.1% 81.1%
cpu01 8.5% 0.0% 1.7% 0.0% 0.3% 2.9% 86.3%
cpu02 8.1% 0.0% 1.3% 0.3% 2.3% 4.5% 83.1%
cpu03 10.7% 0.0% 1.9% 0.0% 0.5% 4.5% 82.1%
Mem: 2055252k av, 2010656k used, 44596k free, 0k shrd, 108096k buff
1389864k actv, 269016k in_d, 31284k in_c
Swap: 2040212k av, 54844k used, 1985368k free 563860k cached
PID USER PRI NI SIZE RSS SHARE STAT %CPU %MEM TIME CPU COMMAND
23303 nobody 16 0 16052 14M 7164 S 1.2 0.7 1:57 0 httpd
23707 nobody 15 0 11624 10M 6480 S 1.1 0.5 2:21 2 httpd
30453 nobody 15 0 6528 5808 2788 S 1.0 0.2 0:01 1 httpd
28026 nobody 15 0 11384 10M 6480 S 0.6 0.5 1:16 1 httpd
30341 nobody 16 0 8080 7336 3372 S 0.6 0.3 0:01 3 httpd
30563 root 24 0 3332 3332 2088 S 0.5 0.1 0:00 0 dcpumon
23288 nobody 16 0 12856 11M 7192 S 0.4 0.5 1:53 0 httpd
28894 nobody 16 0 10304 9484 5544 S 0.4 0.4 0:52 3 httpd
30349 nobody 16 0 8244 7472 3268 S 0.4 0.3 0:03 3 httpd
30406 nobody 15 0 7820 7112 3404 S 0.4 0.3 0:00 2 httpd
23331 nobody 16 0 11968 10M 6840 S 0.3 0.5 2:01 1 httpd
26239 nobody 15 0 11044 10M 5960 S 0.3 0.5 1:39 3 httpd
28895 nobody 16 0 11812 10M 5660 S 0.3 0.5 0:50 0 httpd
28897 nobody 16 0 9900 9140 5060 S 0.3 0.4 0:49 2 httpd
30478 nobody 16 0 8108 7408 3312 S 0.3 0.3 0:01 2 httpd
30574 root 24 0 1244 1244 776 S 0.3 0.0 0:00 1 top
23332 nobody 16 0 15240 14M 6460 S 0.1 0.7 2:11 1 httpd
30545 root 16 0 1368 1368 896 R 0.1 0.0 0:00 0 top
1 root 15 0 116 84 56 S 0.0 0.0 0:12 2 init
2 root RT 0 0 0 0 SW 0.0 0.0 0:00 0 migration/0
3 root RT 0 0 0 0 SW 0.0 0.0 0:00 1 migration/1
4 root RT 0 0 0 0 SW 0.0 0.0 0:00 2 migration/2
5 root RT 0 0 0 0 SW 0.0 0.0 0:00 3 migration/3
6 root 15 0 0 0 0 SW 0.0 0.0 0:00 2 keventd
7 root 34 19 0 0 0 SWN 0.0 0.0 0:00 0 ksoftirqd/0
8 root 34 19 0 0 0 SWN 0.0 0.0 0:00 1 ksoftirqd/1
9 root 34 19 0 0 0 SWN 0.0 0.0 0:00 2 ksoftirqd/2
=================================
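The top output above also allows a rough memory check on MaxClients. This is only a back-of-the-envelope sketch: the 10 MB per child is an assumption read off the RSS column (most httpd rows show 7 to 14 MB), the 512 MB reserved for everything else is a guess, and RSS includes shared pages, so real usage per child is likely lower.

```python
def worst_case_kb(max_clients, per_child_kb):
    """Memory needed if every allowed Apache child is alive at once."""
    return max_clients * per_child_kb


def max_children(total_kb, reserved_kb, per_child_kb):
    """Largest child count that fits in RAM after reserving some for the OS."""
    return (total_kb - reserved_kb) // per_child_kb


TOTAL_KB = 2055252          # "Mem: 2055252k av" from the top output
PER_CHILD_KB = 10 * 1024    # assumed: about 10 MB RSS per httpd child
RESERVED_KB = 512 * 1024    # assumed: RAM kept for the kernel, cache, others

print(worst_case_kb(256, PER_CHILD_KB))                    # 2621440 kB, more than RAM
print(max_children(TOTAL_KB, RESERVED_KB, PER_CHILD_KB))   # 149
```

Under these assumptions, 256 children at full size would not fit in 2 GB, so swapping during a burst is one possible cause of the slowdown. It is worth checking against real per-process numbers before changing anything.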
root@web [~]# netstat -ant|grep 80|wc -l
warning, got duplicate tcp line.
warning, got duplicate tcp line.
warning, got duplicate tcp line.
.
.(so many lines...)
.
warning, got duplicate tcp line.
warning, got duplicate tcp line.
8370
=================================
root@web [~]# netstat -ant|grep ESTAB|wc -l
warning, got duplicate tcp line.
warning, got duplicate tcp line.
warning, got duplicate tcp line.
.
.(so many lines...)
.
warning, got duplicate tcp line.
warning, got duplicate tcp line.
220
================================
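One caveat on the two netstat counts above: `grep 80` matches any line containing "80" anywhere (an IP address ending in .80, a remote port like 1180), so 8370 is probably an overcount, and it mixes all TCP states together. A small sketch that counts only connections whose local port is 80, broken down by state, might look like this. The sample addresses are made-up documentation addresses, not taken from this server.

```python
from collections import Counter


def tally_states(netstat_output, port=80):
    """Count TCP states for lines of `netstat -ant` whose local port matches."""
    counts = Counter()
    for line in netstat_output.splitlines():
        fields = line.split()
        # Skip headers, blank lines and "warning, got duplicate tcp line."
        if len(fields) < 6 or not fields[0].startswith("tcp"):
            continue
        local_port = fields[3].rsplit(":", 1)[-1]
        if local_port != str(port):
            continue
        counts[fields[5]] += 1
    return counts


# Illustrative sample only; every tcp line here would match `grep 80`.
SAMPLE = """\
warning, got duplicate tcp line.
tcp        0      0 192.0.2.10:80      198.51.100.7:51780   ESTABLISHED
tcp        0      0 192.0.2.10:80      198.51.100.8:1180    TIME_WAIT
tcp        0      0 192.0.2.10:80      203.0.113.80:4021    TIME_WAIT
tcp        0      0 192.0.2.10:25      203.0.113.80:8080    ESTABLISHED
tcp        0      0 0.0.0.0:80         0.0.0.0:*            LISTEN
"""

for state, count in sorted(tally_states(SAMPLE).items()):
    print(state, count)
```

If most of the real connections turn out to be in TIME_WAIT, that is normal on a busy site with KeepAlive Off. A large number in SYN_RECV would point in a different direction.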
root@web [~]# ps -ef|grep httpd|wc
258 2065 19591
================================
root@web [~]# ps -ef|grep mysqld|wc
13 163 2180
================================
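A note on reading the two ps counts: `wc` prints lines, words and bytes, so only the first number is a process count, and `grep httpd` also counts the grep command itself. That would make 258 consistent with 256 children plus the parent httpd, i.e. Apache sitting right at MaxClients, though that is an inference rather than something the output proves. A hedged sketch for counting by exact command name, assuming a ps that accepts `-eo comm=` as procps does:

```python
import subprocess


def count_by_name(ps_comm_output, name):
    """Count lines that are exactly the given command name."""
    return sum(1 for line in ps_comm_output.splitlines() if line.strip() == name)


def count_running(name):
    """Run `ps -eo comm=` (no header) and count processes named `name`."""
    out = subprocess.run(["ps", "-eo", "comm="],
                         capture_output=True, text=True, check=True).stdout
    return count_by_name(out, name)


# Illustrative sample only: the grep line is not counted as httpd.
SAMPLE_PS = "httpd\nhttpd\ngrep\nmysqld\nhttpd\n"
print(count_by_name(SAMPLE_PS, "httpd"))  # 3
```

On the live server, count_running("httpd") would give the figure to compare against MaxClients during a spike.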