Possible kernel issue?
Hi, I have an AMD Athlon XP 1900 server from rackshack. I realize this is not a rackshack forum, and I am not starting this thread to complain about rackshack; I would simply like to get my issue resolved. Today I was trying to copy a directory (roughly 300 MB, with many different files and subdirectories) to /home/test. Usually I do it like this:
cp -Rf /path/300mbdir ./&
(because I was already in /home/test)
I run it in the background because I don't really like to sit and wait around. Anyway, usually it finishes and everything is fine, but this time I noticed that it would segfault instead.
I noticed that not all the files had been copied,
so I rm -rf'ed the dir
and re-did it:
cp -Rf /path/300mbdir ./ &
Then it worked.
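For anyone who wants to confirm whether a copy like this really is complete, here is a rough sketch. The function name copy_is_complete is made up for illustration; it only wraps diff -r, which compares two directory trees recursively and exits non-zero if anything is missing or different.

```shell
# Hypothetical helper (not a standard command): succeeds only if the two
# directory trees contain the same files with the same contents.
copy_is_complete() {
    src=$1
    dst=$2
    # diff -r walks both trees; -q only reports whether files differ.
    diff -rq "$src" "$dst" >/dev/null 2>&1
}

# Example with the paths from this post:
#   copy_is_complete /path/300mbdir /home/test/300mbdir && echo "copy looks complete"
```

If the check fails right after a cp that segfaulted, that at least confirms the copy was cut short rather than the source being odd.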
At first I thought: bad RAM! But then I thought, no wait, they swapped the RAM sticks.
So then I went to untar a 117 MB .tar.gz file:
tar xzf filename.tar.gz
SEGFAULT
I had to run it a few times before it worked.
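Since the segfaults come and go, one way to get a feel for how often they hit is to repeat the same command and count the failures. This is only a sketch: count_failures is a hypothetical helper name, and gzip -t (which tests the archive without unpacking it) is just one command that could be run through it. A damaged archive should fail the same way every time, so failures that come and go would point away from the file itself.

```shell
# Hypothetical helper: run a command N times and print how many runs failed.
count_failures() {
    runs=$1
    shift
    failed=0
    i=0
    while [ "$i" -lt "$runs" ]; do
        # Any non-zero exit status counts as a failure
        # (a segfault usually shows up in the shell as status 139).
        "$@" >/dev/null 2>&1 || failed=$((failed + 1))
        i=$((i + 1))
    done
    echo "$failed"
}

# Example: test the archive 10 times without unpacking it.
#   count_failures 10 gzip -t filename.tar.gz
```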
Then I pico'ed /var/log/messages, went to the very bottom, and saw things like this:
May 29 20:18:31 sun kernel: <1>Unable to handle kernel NULL pointer dereference at virtual address 00000134
May 29 20:18:31 sun kernel: printing eip:
May 29 20:18:31 sun kernel: c012768d
May 29 20:18:31 sun kernel: *pde = 00000000
May 29 20:18:31 sun kernel: Oops: 0002
May 29 20:18:31 sun kernel: Kernel 2.4.9-31
May 29 20:18:31 sun kernel: CPU: 0
May 29 20:18:31 sun kernel: EIP: 0010:[__remove_inode_page+93/128] Not tainted
May 29 20:18:31 sun kernel: EIP: 0010:[<c012768d>] Not tainted
May 29 20:18:31 sun kernel: EFLAGS: 00010206
May 29 20:18:31 sun kernel: EIP is at __remove_inode_page [kernel] 0x5d
May 29 20:18:31 sun kernel: eax: 00000100 ebx: cc6350f8 ecx: 000001fd edx: f7e708ac
May 29 20:18:31 sun kernel: esi: c136c404 edi: 00003979 ebp: c02b2ec0 esp: d68a9e88
May 29 20:18:32 sun kernel: ds: 0018 es: 0018 ss: 0018
May 29 20:18:32 sun kernel: Process cp (pid: 9591, stackpage=d68a9000)
May 29 20:18:32 sun kernel: Stack: c136c420 c136c404 c012e72f c136c404 c02b2ec0 c02b30a8 00000002 00000001
May 29 20:18:32 sun kernel: c01300d3 c02b2ec0 00000001 c02b30b0 00000000 000000d2 c01301cf c02b30a0
May 29 20:18:32 sun kernel: 00000000 00000002 00000001 00000001 000000d2 f7f752c4 00000000 00000000
May 29 20:18:32 sun kernel: Call Trace: [reclaim_page+671/944] reclaim_page [kernel] 0x29f
May 29 20:18:32 sun kernel: Call Trace: [<c012e72f>] reclaim_page [kernel] 0x29f
May 29 20:18:32 sun kernel: [__alloc_pages_limit+99/144] __alloc_pages_limit [kernel] 0x63
May 29 20:18:32 sun kernel: [<c01300d3>] __alloc_pages_limit [kernel] 0x63
May 29 20:18:32 sun kernel: [_wrapped_alloc_pages+175/608] _wrapped_alloc_pages [kernel] 0xaf
May 29 20:18:32 sun kernel: [<c01301cf>] _wrapped_alloc_pages [kernel] 0xaf
May 29 20:18:32 sun kernel: [__alloc_pages+15/160] __alloc_pages [kernel] 0xf
May 29 20:18:32 sun kernel: [<c013038f>] __alloc_pages [kernel] 0xf
May 29 20:18:32 sun kernel: [generic_file_write+860/1552] generic_file_write [kernel] 0x35c
May 29 20:18:32 sun kernel: [<c012ab1c>] generic_file_write [kernel] 0x35c
May 29 20:18:32 sun kernel: [do_generic_file_read+1300/1312] do_generic_file_read [kernel] 0x514
May 29 20:18:32 sun kernel: [<c0128c64>] do_generic_file_read [kernel] 0x514
May 29 20:18:32 sun kernel: [sis900:__insmod_sis900_O/lib/modules/2.4.9-31/kernel/drivers/net/s+-1263102/96] __insmod_ext3_S.text_L$
May 29 20:18:32 sun kernel: [<f880da02>] __insmod_ext3_S.text_L43056 [ext3] 0x19a2
May 29 20:18:32 sun kernel: [sys_write+150/256] sys_write [kernel] 0x96
May 29 20:18:32 sun kernel: [<c01368c6>] sys_write [kernel] 0x96
May 29 20:18:32 sun kernel: [system_call+51/56] system_call [kernel] 0x33
May 29 20:18:32 sun kernel: [<c0106f3b>] system_call [kernel] 0x33
May 29 20:18:32 sun kernel:
May 29 20:18:32 sun kernel:
May 29 20:18:32 sun kernel: Code: 89 50 34 89 02 c7 46 34 00 00 00 00 ff 0d 00 2b 2b c0 5b 5e
May 29 20:19:39 sun kernel: <1>Unable to handle kernel NULL pointer dereference at virtual address 00000834
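Rather than scrolling through the log in pico, the reports can be pulled out with grep. A rough sketch, assuming every report starts with an "Unable to handle kernel" line and has a "Process ..." line like the one above; the two function names are made up for illustration.

```shell
# Hypothetical helper: count oops reports in a syslog-style file.
count_oopses() {
    # Assumes each report starts with "Unable to handle kernel", as in the excerpt.
    grep -c 'Unable to handle kernel' "${1:-/var/log/messages}"
}

# Hypothetical helper: show which process each oops happened in.
oops_processes() {
    # Strip the timestamp and host, keeping e.g. "Process cp (pid: 9591, ...)".
    grep 'kernel: Process ' "${1:-/var/log/messages}" | sed 's/.*kernel: //'
}

# Example:
#   count_oopses /var/log/messages
#   oops_processes /var/log/messages
```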
You see, when I first got this server, I was told it had 1 GB of RAM. I ran free -m and it only showed 896.
I ran dmesg | more
and it said that only 896 MB of memory would be used.
So rackshack re-installed the kernel (that's what they told me), and I guess they used the RPM version.
Anyway, after they did that, I ran free -m and it was working:
1000 MB of RAM.
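As far as I understand it, the 896 figure is the usual low-memory limit on 32-bit x86: the kernel gets 1 GB of address space and 128 MB of that is reserved, so a kernel built without high-memory support only uses 896 MB of RAM. The numbers in this sketch are that standard split, assumed rather than read off this server.

```shell
# Assumed standard 32-bit x86 layout (not measured on this machine):
# 1024 MB of kernel address space, 128 MB of it reserved (vmalloc and similar),
# and the remainder mapped directly onto physical RAM.
kernel_space_mb=1024
reserved_mb=128
lowmem_mb=$((kernel_space_mb - reserved_mb))
echo "$lowmem_mb"   # prints 896

# To see what the running kernel actually uses:
#   grep MemTotal /proc/meminfo
```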
So here I am today with this new problem.
Could this be the kernel?
What would I have to do to get rid of this problem?
The kernel is 2.4.9-31
and the server is running RH 7.2.
Would upgrading the kernel
to a newer version (e.g. 2.4.18) fix this issue?