Tuesday, October 8, 2024

Kernel Panic During Oracle Cluster Testing on RHEL 8

 

Hey everyone,

I wanted to share some "exciting" issues I ran into while testing an Oracle cluster on RHEL 8—you know, just your everyday kernel panic to spice things up!

The Issue

While conducting tests, I decided to bring down a couple of network interfaces (enp43s1f8 and enp43s1f9) using nmcli. Little did I know that right after deactivating them, I would be treated to a lovely kernel panic, logged as follows:


Oct 8 16:52:24 serverA kernel: sysrq: SysRq : Trigger a crash
Oct 8 16:52:24 serverA kernel: Kernel panic - not syncing: sysrq triggered crash

This surprise happened even with the iSCSI service (iscsi.service) disabled. Apparently, the system thought it was a great time to throw a party!
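A quick note on that log: "sysrq triggered crash" is the message the kernel prints when a crash is requested through the SysRq "c" trigger (for example, by something writing c to /proc/sysrq-trigger). So this panic was asked for on purpose rather than being a kernel bug. The log alone does not say who asked. If you want to check your own logs for the same signature, here is a minimal Python sketch. The function name find_sysrq_crash_lines is mine, and it assumes you have already read the log text into a string.

```python
import re

# Matches the two kernel messages shown in the log excerpt above.
SYSRQ_PATTERN = re.compile(
    r"sysrq: SysRq : Trigger a crash"
    r"|Kernel panic - not syncing: sysrq triggered crash"
)


def find_sysrq_crash_lines(log_text):
    """Return every line of log_text that mentions a SysRq-triggered crash."""
    return [line for line in log_text.splitlines() if SYSRQ_PATTERN.search(line)]


if __name__ == "__main__":
    sample = (
        "Oct 8 16:52:24 serverA kernel: sysrq: SysRq : Trigger a crash\n"
        "Oct 8 16:52:24 serverA kernel: Kernel panic - not syncing: sysrq triggered crash\n"
    )
    for hit in find_sysrq_crash_lines(sample):
        print(hit)
```

Feeding it the contents of /var/log/messages (or saved journalctl output) should work the same way, assuming your syslog format matches the excerpt.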

What I Found

  1. Dispatcher Scripts:

    • I found out that a script (04-iscsi) in /usr/lib/NetworkManager/dispatcher.d/ was doing its own thing. NetworkManager runs the scripts in that directory whenever a network interface changes state, so 04-iscsi fired as soon as I took the interfaces down. It was like that overly enthusiastic colleague who jumps in during a meeting and derails the conversation!

  2. Fixing the Issue:

    • To bring back some sanity, I temporarily removed or renamed the 04-iscsi script. After that, I was able to bring down the interfaces without causing the system to have a meltdown. Who knew a little housekeeping could go such a long way?

  3. For Future Tests:

    • Always use nmcli to gracefully deactivate connections; it’s less dramatic than a kernel panic!
    • Review any service dependencies to make sure nothing throws a tantrum when you change network states.
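
If you would rather script the housekeeping step, here is a rough Python sketch of one way to do it: list what is sitting in the dispatcher directory, then move a single script out of it so it can be put back after testing. The helper names and the backup location (/root/dispatcher-backup) are my own assumptions, not anything NetworkManager provides. It needs to run as root against the real directory.

```python
import shutil
from pathlib import Path

# Directory named in the post; the backup location below is hypothetical.
DISPATCHER_DIR = Path("/usr/lib/NetworkManager/dispatcher.d")
BACKUP_DIR = Path("/root/dispatcher-backup")


def list_dispatcher_scripts(directory=DISPATCHER_DIR):
    """Return the sorted names of regular files directly inside directory."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(p.name for p in directory.iterdir() if p.is_file())


def move_script_aside(name, directory=DISPATCHER_DIR, backup_dir=BACKUP_DIR):
    """Move one script out of the dispatcher directory and return its new path.

    The file is moved rather than deleted so it can be restored after testing.
    """
    src = Path(directory) / name
    if not src.is_file():
        raise FileNotFoundError(f"no such dispatcher script: {src}")
    backup = Path(backup_dir)
    backup.mkdir(parents=True, exist_ok=True)
    dst = backup / name
    shutil.move(str(src), str(dst))
    return dst
```

Calling list_dispatcher_scripts() first shows what would run on a state change, and move_script_aside("04-iscsi") parks the script in the backup directory. Moving the same file back restores the original behavior once the tests are done.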

This little adventure reminded me of the importance of understanding how network management scripts and services like iSCSI interact—especially when they seem to have a mind of their own. By being proactive and keeping a sense of humor, we can avoid these surprises in the future.