IIS problem?? [expletive deleted] Windows server crashing repeatedly ...
After 17 months of OK service (crashing 2-3 times per month), my Windows 2000 server has recently started crashing 4+ times per day. My server logs around a crash event look as follows:
2004-06-01 18:11:20 205.128.0.98 80 GET /images/goto/gospiral.gif - 200 0 Mozilla/4.0+(compatible;+MSIE+5.23;+Mac_PowerPC) ASPSESSIONIDSARRCAAD=BKALCBFCJODCPBFDIKMGMHLM http://www.worldofboxes.com/animals/animal-prints.htm
2004-06-01 18:11:20 205.128.0.98 80 GET /images/goto/gocelt.gif - 200 585 Mozilla/4.0+(compatible;+MSIE+5.23;+Mac_PowerPC) ASPSESSIONIDSARRCAAD=BKALCBFCJODCPBFDIKMGMHLM http://www.worldofboxes.com/animals/animal-prints.htm
2004-06-01 18:11:2#Software: Microsoft Internet Information Services 5.0
#Version: 1.0
#Date: 2004-06-01 18:25:02
#Fields: date time c-ip s-port cs-method cs-uri-stem cs-uri-query sc-status sc-bytes cs(User-Agent) cs(Cookie) cs(Referer)
2004-06-01 18:25:02 204.77.16.19 80 GET /index.htm - 200 0 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) - -
2004-06-01 18:25:06 204.77.16.19 80 GET /mainstyle.css - 200 3668 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) ASPSESSIONIDQSTCSDBB=CJBPPEGCHLAILAIADBBJPHGF http://www.worldofboxes.com
2004-06-01 18:25:12 204.77.16.19 80 GET /common/ha.asp - 200 0 Mozilla/4.0+(compatible;+MSIE+6.0;+Windows+NT+5.1) ASPSESSIONIDQSTCSDBB=CJBPPEGCHLAILAIADBBJPHGF http://www.worldofboxes.com
In this case, the system event log recorded a crash at 18:19:51, about eight minutes after the last, truncated entry in the server log. This is typical -- the server logs appear to stop mid-line anywhere from two to 45 minutes before an operating system crash, with eight minutes being the most common interval. (The site is busy by day, so those gaps are not due to lack of activity.)
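For anyone who wants to measure these gaps across many crashes rather than by eye, here is a minimal Python sketch. It assumes the crash times are copied by hand from the system event log and are expressed in the same time zone as the log timestamps (W3C extended logs are normally written in GMT). The function names are hypothetical, and the sample lines are shortened copies of the entries above.

```python
import re
from bisect import bisect_right
from datetime import datetime

FMT = "%Y-%m-%d %H:%M:%S"
# A complete entry starts with a full "date time" pair followed by a space.
# The truncated line "2004-06-01 18:11:2#Software..." does not match.
ENTRY = re.compile(r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) ")

def entry_times(lines):
    """Return the sorted timestamps of all complete log entries."""
    times = []
    for line in lines:
        m = ENTRY.match(line)
        if m:
            times.append(datetime.strptime(m.group(1), FMT))
    return sorted(times)

def gap_before_crash(times, crash):
    """Time between the last complete entry at or before `crash` and the crash."""
    i = bisect_right(times, crash)
    if i == 0:
        return None  # nothing was logged before this crash
    return crash - times[i - 1]

sample = [
    "2004-06-01 18:11:20 205.128.0.98 80 GET /images/goto/gocelt.gif - 200 585",
    "2004-06-01 18:11:2#Software: Microsoft Internet Information Services 5.0",
    "#Date: 2004-06-01 18:25:02",
    "2004-06-01 18:25:02 204.77.16.19 80 GET /index.htm - 200 0",
]
crash = datetime(2004, 6, 1, 18, 19, 51)  # from the system event log
print(gap_before_crash(entry_times(sample), crash))  # 0:08:31
```

Running this over every log file and crash time would show whether the eight-minute gap really is the most common one.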
I can think of two explanations:
1) On restarting after the crash, IIS chews off a big chunk of the current log before it starts writing again, creating the appearance of a gap.
2) IIS has stopped logging several minutes before the system crash.
In case (2), I think IIS keeps serving pages even though it is no longer logging them, because the web service is monitored every 20 minutes by Alertra, and I have not received a single alert through dozens of crashes. That silence is plausible if the site only goes down for the reboot itself: the site must be down for 2.5 minutes, counting from Alertra's first failed try, before an alert is sent, and the reboots take two to five minutes, so perhaps by chance no crash has triggered an alert. But if IIS had stopped serving when the logs stopped, many alerts would have been inevitable. (Or else Alertra is a scam after all, but I don't think so.)
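The alert arithmetic can be roughed out in a few lines of Python. This is only a sketch under simplifying assumptions that are not confirmed by Alertra: checks land exactly every 20 minutes, an outage starts at a random point in that cycle, and an alert fires only if a check arrives while at least 2.5 minutes of the outage remain. The figure of 24 crashes is a hypothetical stand-in for "dozens", and the function names are made up for illustration.

```python
CHECK_INTERVAL_MIN = 20.0  # from the post: one check every 20 minutes
CONFIRM_MIN = 2.5          # from the post: must stay down this long after the first try

def p_alert(outage_min, interval=CHECK_INTERVAL_MIN, confirm=CONFIRM_MIN):
    """Chance that one outage of the given length triggers an alert."""
    # A check must arrive early enough that the outage lasts `confirm` more minutes.
    window = max(0.0, outage_min - confirm)
    return min(1.0, window / interval)

def p_no_alerts(outage_min, crashes):
    """Chance that `crashes` independent outages all go unnoticed."""
    return (1.0 - p_alert(outage_min)) ** crashes

CRASHES = 24  # hypothetical count standing in for "dozens"
scenarios = [
    ("site down for a 5 minute reboot only", 5.0),
    ("site down from an 8 minute logging gap plus a 2 minute reboot", 10.0),
]
for label, outage in scenarios:
    print(label, p_alert(outage), p_no_alerts(outage, CRASHES))
```

Under these assumptions a reboot-only outage triggers an alert at most one time in eight, and never for reboots under 2.5 minutes, so complete silence over a couple of dozen crashes is believable. A ten-minute outage would trigger one more than a third of the time, which makes complete silence extremely unlikely.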
Assuming case (2), I can speculate on two further possible conclusions:
2a) IIS is starting to get unstable at the time it stops logging, and it brings the whole system down two to 45 minutes later.
2b) Whatever is crashing the system affects IIS first, causing it to stop logging.
Has anyone ever seen or heard of anything like this? Does IIS look like the problem, or a victim?
Removing and reinstalling IIS, and restoring the metabase from a backup, did not help even slightly.
As for the crashes themselves, the system stop message is fairly generic and unhelpful:
0x000000d1 (0x00000000, 0x00000002, 0x00000000, 0x00000000)
The exception address is always the same and seems to fall in the range of ntoskrnl.exe.
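For reference, stop code 0xD1 is DRIVER_IRQL_NOT_LESS_OR_EQUAL, and its four parameters have documented meanings: the memory address referenced, the IRQL at the time, the access type (0 for read, 1 for write), and the address of the code that made the reference. A small Python sketch that labels the values from the stop message above (the function name is hypothetical):

```python
def decode_stop_d1(p1, p2, p3, p4):
    """Label the four parameters of a 0xD1 (DRIVER_IRQL_NOT_LESS_OR_EQUAL) stop."""
    access = {0: "read", 1: "write"}.get(p3, "other (%#x)" % p3)
    return {
        "memory_referenced": "0x%08x" % p1,    # address the code tried to touch
        "irql": p2,                            # IRQL at the time of the reference
        "access": access,                      # kind of access attempted
        "referencing_address": "0x%08x" % p4,  # address of the code that did it
    }

decoded = decode_stop_d1(0x00000000, 0x00000002, 0x00000000, 0x00000000)
for name, value in decoded.items():
    print(name, value)
```

Read that way, the message describes a read of address zero at IRQL 2, which is at least a little more specific than the raw numbers.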
Also, no new hardware or software has been added to the system in a very long time.