Tracking down a bottleneck that is killing my business
I would like to ask for suggestions on tracking down a bottleneck that has crippled my system for about a month now. It's the strangest problem I have ever dealt with, and it has cost me a small fortune.
THE PROBLEM
- From 6:30pm EST to about 8am EST the site runs very, very fast, just like it should. Starting around 8am EST and lasting until about 6:30pm EST, it bogs down so badly you can't use it. Page loads take a minute or more on my cable connection. Traffic stays about the same, and if anything increases a little during the fast times.
THE SYSTEM
- Servers are (2) dual Xeon machines, with the MySQL DB on server 2 and web traffic on server 1. Load average on the servers is between 0.01 and 0.18 most of the time. 25mbps line as well as a private network. The system is actually set up as a round robin, but the A entry has been removed until this problem is worked out, so all traffic is going to server 1 and only the DB is on server 2. Servers have 2GB RAM each.
THE SITE
- A phpBB forum that averages about 300 users online. Lots of picture and video downloads. A lot of useless queries have been removed, as well as other little things.
CURRENT GRAPHS
- These graphs are as of right now (slow time)
Apache - http://www.wickedvision.com/post/apache-day.png
Memory - http://www.wickedvision.com/post/mem.png
MYSQL - http://www.wickedvision.com/post/mysql-day.png
Processes - http://www.wickedvision.com/post/processes-day.png
TOP
SERVER 1
Code:
 15:06:28 up 8 days, 21:59, 1 user, load average: 0.06, 0.06, 0.01
252 processes: 251 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system     irq  softirq  iowait     idle
           total    1.6%    2.8%    0.8%    0.0%     0.8%    0.0%   393.2%
           cpu00    0.0%    0.0%    0.0%    0.0%     0.9%    0.0%    99.0%
           cpu01    1.9%    1.9%    0.9%    0.0%     0.0%    0.0%    95.1%
           cpu02    0.0%    0.0%    0.0%    0.0%     0.0%    0.0%   100.0%
           cpu03    0.0%    0.9%    0.0%    0.0%     0.0%    0.0%    99.0%
Mem:  2055364k av, 2032760k used, 22604k free, 0k shrd, 124140k buff
      1430188k actv, 268556k in_d, 33936k in_c
Swap: 2040212k av, 341328k used, 1698884k free, 1216112k cached
SERVER 2
Code:
 15:04:45 up 8 days, 22:05, 2 users, load average: 0.04, 0.05, 0.06
249 processes: 248 sleeping, 1 running, 0 zombie, 0 stopped
CPU states:  cpu    user    nice  system     irq  softirq  iowait     idle
           total    0.8%    0.8%    1.6%    0.0%     0.8%   20.0%   374.8%
           cpu00    0.0%    0.0%    0.0%    0.0%     0.9%    5.7%    93.2%
           cpu01    0.0%    0.9%    0.0%    0.0%     0.0%    3.8%    95.1%
           cpu02    0.0%    0.0%    0.0%    0.0%     0.0%    5.8%    94.1%
           cpu03    0.9%    0.0%    1.9%    0.0%     0.0%    4.8%    92.3%
Mem:  1025308k av, 1002940k used, 22368k free, 0k shrd, 68164k buff
      647336k actv, 119312k in_d, 14824k in_c
Swap: 1052248k av, 230396k used, 821852k free, 442612k cached
APACHE STATUS
Code:
Current Time: Wednesday, 02-Feb-2005 15:09:26 EST
Restart Time: Tuesday, 01-Feb-2005 23:24:58 EST
Parent Server Generation: 0
Server uptime: 15 hours 44 minutes 28 seconds
Total accesses: 490366 - Total Traffic: 18.7 GB
CPU Usage: u3590.36 s419.53 cu23.09 cs1.5 - 7.12% CPU load
8.65 requests/sec - 345.9 kB/second - 40.0 kB/request
108 requests currently being processed, 63 idle servers
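As a rough sanity check on the Apache status figures above, the arithmetic can be sketched in a few lines of Python. The variable names are mine, and the last step is only an estimate: it applies Little's law (busy workers = request rate x average time per request) to a single snapshot, and it assumes the 108 "requests currently being processed" are typical for the slow period, which one snapshot cannot prove. Workers held open by keep-alive connections or slow downloads would also be counted in that number.

```python
# Figures copied from the Apache status output above.
uptime_s = 15 * 3600 + 44 * 60 + 28      # "15 hours 44 minutes 28 seconds"
total_accesses = 490366
total_traffic_kb = 18.7 * 1024 * 1024    # 18.7 GB, assuming 1 GB = 1024 * 1024 kB
busy_workers = 108                       # "requests currently being processed"
idle_workers = 63                        # "idle servers"

# Averages since the last Apache restart.
req_per_sec = total_accesses / uptime_s
kb_per_sec = total_traffic_kb / uptime_s
kb_per_request = total_traffic_kb / total_accesses

# Little's law estimate: how long an average request occupies a worker,
# ASSUMING the snapshot of busy workers is representative.
avg_hold_s = busy_workers / req_per_sec

print(f"requests/sec:        {req_per_sec:.2f}")
print(f"kB/sec:              {kb_per_sec:.1f}")
print(f"kB/request:          {kb_per_request:.1f}")
print(f"workers (busy+idle): {busy_workers + idle_workers}")
print(f"est. seconds a request holds a worker: {avg_hold_s:.1f}")
```

If that estimate is anywhere near right, each request is tying up an Apache worker for about 12 seconds on average, even though CPU on both servers is almost entirely idle. That may point at time spent waiting (on the network, on large downloads, or on the database) rather than time spent computing.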
- Looking for any help on what direction to look in. I have spent over 10,000.00 USD on server hardware and on fees for techs who have done a good job with other things but have not yet been able to track down this problem. I'm asking for ANY advice; this problem is about at the point where I will have to shut down, as I cannot afford to continue. If you actually find the problem, I will work out some kind of payment schedule to pay what you believe is fair. I am at my wits' end. =(