Any idea why the server stops responding like this?
This has happened several times this week, and the only thing that resolves it is calling the DC and having the server rebooted. While it is down, cron is not running, the console is not recording logs, and nothing responds to the outside world. I would say the server hangs or crashes for some specific reason.
I've written a shell script to run every 10 seconds, in the hope that it will give me some clues the next time the server crashes.
Code:
#!/bin/sh

if ping -nc 1 -w 1 localhost > /dev/null 2>&1
then
    # localhost is up
    if ping -nc 1 -w 1 gateway_ip > /dev/null 2>&1
    then
        # gateway is up
        echo "$(date) ok"
        exit 0
    fi
fi

# handle the trouble: show system stats
echo "$(date) failure(s) detected"
ping -c 5 -w 5 localhost
ping -c 5 -w 5 gateway_ip
ifconfig
df -k
vmstat 1 5
netstat
Sat Apr 19 07:17:01 CDT 2003 ok
Sat Apr 19 07:18:01 CDT 2003 ok
Sat Apr 19 07:19:00 CDT 2003 ok
Sat Apr 19 09:14:55 CDT 2003 ok
Sat Apr 19 09:16:00 CDT 2003 ok
Sat Apr 19 09:17:00 CDT 2003 ok
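Since the script only prints a line when it actually gets to run, the quickest way to spot a hang in a log like this is to look for jumps between consecutive timestamps. Below is a minimal sketch of that idea, not something from my setup: find_gaps is a hypothetical helper name, the 120-second default is an arbitrary assumption, and it assumes the entries are in chronological order and never more than 24 hours apart, because it only compares the HH:MM:SS field.

```shell
#!/bin/sh
# find_gaps: hypothetical helper, not part of the original script.
# Reads `date`-style log lines (e.g. "Sat Apr 19 07:17:01 CDT 2003 ok")
# on stdin and reports consecutive entries more than $1 seconds apart.
# Assumption: entries are chronological and less than 24 hours apart,
# so only the time-of-day field (field 4) is compared.
find_gaps() {
    awk -v max="${1:-120}" '
    NF < 4 { next }                          # skip blank or short lines
    {
        split($4, t, ":")
        now = t[1] * 3600 + t[2] * 60 + t[3]
        if (seen) {
            delta = now - prev
            if (delta < 0) delta += 86400    # assume we crossed midnight
            if (delta > max)
                printf "gap of %d s between %s and %s\n", delta, prev_ts, $4
        }
        seen = 1
        prev = now
        prev_ts = $4
    }'
}
```

It could be called as find_gaps 120 < logfile, where logfile is wherever the output of the check script is collected. On the lines above, a 120-second threshold flags the jump from 07:19:00 to 09:14:55, which is presumably the window in which the server was hung.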
Any ideas, other than broken RAM crashing the shared virtual memory?