Bug #97

Kernel crash on HP DL380 G4, Broadcom NIC driver related

Added by Pasi Karkkainen over 2 years ago. Updated over 2 years ago.

Status: New
Start: July 26, 2010
Priority: Normal
Due date:
Assigned to: -
% Done: 0%
Category: -
Target version: -

Description

I installed NexentaStor Community Edition 3.0.3 on an HP DL380 G4 server. Basically, 5-30 seconds after the login prompt shows up on the console, the server reboots due to a kernel crash.

The kernel error seems to be related to the Broadcom NIC driver.

See the screenshots for the kernel error message: http://pasik.reaktio.net/nexenta/nexenta303-crash02.jpg http://pasik.reaktio.net/nexenta/nexenta303-crash01.jpg

History

Updated by Roman Strashkin over 2 years ago

  1. Please try to install 3.0.4 alpha1.
  2. The crash dump device is available on your machine as 'syspool/dump' (details: http://www.nexenta.org/issues/191)

Updated by Pasi Karkkainen over 2 years ago

Roman Strashkin wrote:

  1. Please try to install 3.0.4 alpha1.

OK. I installed 3.0.4 alpha1, and it shows the same behaviour: it crashes either on its own (usually within 30 seconds of booting) or when I try to access the web interface. Basically, any network traffic triggers the kernel crash.

  2. The crash dump device is available on your machine as 'syspool/dump' (details: http://www.nexenta.org/issues/191)

Is there a way to access that when I cannot use the installed system? I can't even log in before it crashes.

I guess I could try disconnecting the Broadcom NICs and adding some Intel NICs.

Updated by Roman Strashkin over 2 years ago

Pasi Karkkainen wrote:

Roman Strashkin wrote:

  1. Please try to install 3.0.4 alpha1.

OK. I installed 3.0.4 alpha1, and it shows the same behaviour: it crashes either on its own (usually within 30 seconds of booting) or when I try to access the web interface. Basically, any network traffic triggers the kernel crash.

  2. The crash dump device is available on your machine as 'syspool/dump' (details: http://www.nexenta.org/issues/191)

Is there a way to access that when I cannot use the installed system? I can't even log in before it crashes.

After a crash, all dump info is available on syspool under '/var/crash/dump-dir'. To access the dump info: start the installer, wait for it to boot, press F2, import 'syspool', mount the latest 'rootfs-nmu-xxx' dataset, and copy the dump info. Can you upload the dump info to our FTP? I'll talk with Anil about arranging temporary FTP access. Please send the size of the dump info.

I guess I could try disconnecting the Broadcom NICs and adding some Intel NICs.

Try booting the system that way and check the FMA info (fmdump and fmdump -e).

Thanks.
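For the "size of the dump info" request above, a minimal sketch of how the total could be computed once the dataset is mounted. This is only an illustration: the helper names `dir_size_bytes` and `human_readable` and the `/mnt/var/crash` path are assumptions for the example, not something taken from the Nexenta tooling.

```python
import os


def dir_size_bytes(root):
    """Return the total size in bytes of all regular files under root."""
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks so a link is not counted as the file it points to.
            if not os.path.islink(path):
                total += os.path.getsize(path)
    return total


def human_readable(num_bytes):
    """Format a byte count using binary units (KiB, MiB, GiB, TiB)."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024.0:
            return "{:.1f} {}".format(size, unit)
        size /= 1024.0
    return "{:.1f} TiB".format(size)


if __name__ == "__main__":
    # Hypothetical location: wherever the 'rootfs-nmu-xxx' dataset was
    # mounted, followed by var/crash. Adjust to the real mount point.
    crash_dir = "/mnt/var/crash"
    print(crash_dir, human_readable(dir_size_bytes(crash_dir)))
```

If the directory does not exist, the walk finds nothing and the script reports 0.0 B, so it is worth double-checking the mount point before trusting the number.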

Updated by Pasi Karkkainen over 2 years ago

Roman Strashkin wrote:

Pasi Karkkainen wrote:

Roman Strashkin wrote:

  1. Please try to install 3.0.4 alpha1.

OK. I installed 3.0.4 alpha1, and it shows the same behaviour: it crashes either on its own (usually within 30 seconds of booting) or when I try to access the web interface. Basically, any network traffic triggers the kernel crash.

  2. The crash dump device is available on your machine as 'syspool/dump' (details: http://www.nexenta.org/issues/191)

Is there a way to access that when I cannot use the installed system? I can't even log in before it crashes.

After a crash, all dump info is available on syspool under '/var/crash/dump-dir'. To access the dump info: start the installer, wait for it to boot, press F2, import 'syspool', mount the latest 'rootfs-nmu-xxx' dataset, and copy the dump info. Can you upload the dump info to our FTP? I'll talk with Anil about arranging temporary FTP access. Please send the size of the dump info.

Crash dumps are available from: http://pasik.reaktio.net/nexenta/crashdumps/

I switched to Intel NICs, and the system doesn't crash anymore, so I was then able to grab the crash dumps.

Updated by Mark Holden over 2 years ago

I am seeing the same issue with an HP DL380 G4. With 3.0.3 I would see the same reboot after 5-30 seconds. With 3.0.4b1, reboots are much less frequent, but they still happen. The server can run for an hour or more if it's sitting idle, but under use it's likely to reboot after approximately 5 minutes. I'm going to try the same route that Pasi did and see if it addresses the issue for me as well.
