Monday, March 16, 2015

Clearing Single Bit Error Logs in CSTM

This post shows how to clear the memory error log and the PDT (Page Deallocation Table) on HP-UX zx6000 and rx2600 Itanium servers. The procedure is needed after replacing a DIMM: for example, the server had one or more bad DIMMs and you swapped in new ones. After that, the PDT must be cleared.

To check whether the system has logged any memory errors, run:

echo "selclass qualifier memory;info;wait;infolog" | cstm

You will get output similar to the following (details differ with the memory configuration):

Memory Board Inventory
  
   DIMM Location          Size(MB)     DIMM Location          Size(MB)
   --------------------   --------     --------------------   --------
   DIMM 0A                2048         DIMM 0B                2048
   DIMM 1A                2048         DIMM 1B                2048
   DIMM 2A                ----         DIMM 2B                ----
   DIMM 3A                ----         DIMM 3B                ----
   DIMM 4A                ----         DIMM 4B                ----
   DIMM 5A                ----         DIMM 5B                ----

   Total: 8192 (MB)

   ===========================================================================

Memory Error Log Summary

   DIMM Location           Error Address     Error Type  Page           Count
   ----------------------  ----------------  ----------  -------------  -----
   DIMM 0B                 0x12263d00        Single-Bit  0x12263        1
   RANK 0                  0x40e6042500      Multi-Bit   0x40e6042      N/A
   RANK 0                  0x40e03ebc00      Multi-Bit   0x40e03eb      N/A
   RANK 0                  0x40c60d3580      Multi-Bit   0x40c60d3      N/A
   RANK 0                  0x40b0176480      Multi-Bit   0x40b0176      N/A
   RANK 0                  0x1428e9d80       Multi-Bit   0x1428e9       N/A

The output above shows a problem with DIMM 0B; it needs to be replaced.
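
To pull the affected slots out of a long report, the summary can be filtered with awk. This is only a sketch: the function name list_failing_dimms is made up here, and it assumes the column layout shown above (location, address, type, page, count) with the Page Deallocation Table section following the error log summary. RANK entries are skipped because they do not name a slot.

```shell
# Print "DIMM <slot> <error type> <count>" for every DIMM row found in the
# "Memory Error Log Summary" section of cstm output read from stdin.
# Hypothetical helper; the field positions are an assumption based on the
# sample report above.
list_failing_dimms() {
    awk '
        /Memory Error Log Summary/ { in_log = 1; next }
        /Page Deallocation Table/  { in_log = 0 }
        in_log && $1 == "DIMM" && $2 != "Location" { print $1, $2, $4, $6 }
    '
}
```

It would be used as: echo "selclass qualifier memory;info;wait;infolog" | cstm | list_failing_dimms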

Clearing procedure:
Reboot the server and choose to start the EFI shell. In the shell, type: pdt clear
If you are asked to confirm, answer "yes".
Boot the OS and rerun the cstm command (see above) to verify that the logs are clear. If you see the following output, all is OK:

Memory Error Log Summary
    The memory error log is empty.
 Page Deallocation Table (PDT)
    The Page Deallocation Table is empty.
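
The same check can be scripted instead of read by eye. A minimal sketch, assuming the two messages appear exactly as quoted above; the function name memory_logs_are_empty is hypothetical.

```shell
# Succeed (exit status 0) only if the cstm output on stdin reports both the
# memory error log and the Page Deallocation Table as empty.
# Hypothetical helper; the message texts are taken from the sample above.
memory_logs_are_empty() {
    output=$(cat)
    printf '%s\n' "$output" | grep 'The memory error log is empty' >/dev/null &&
    printf '%s\n' "$output" | grep 'The Page Deallocation Table is empty' >/dev/null
}
```

For example: echo "selclass qualifier memory;info;wait;infolog" | cstm | memory_logs_are_empty && echo "logs are clear"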


If you still see errors, you can simply recreate the memory log file:

mv /var/stm/logs/os/memlog /var/stm/logs/os/memlog.old
touch /var/stm/logs/os/memlog
chmod 644 /var/stm/logs/os/memlog
chown root:root /var/stm/logs/os/memlog
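
The four commands above can be wrapped so that repeated runs do not overwrite an earlier backup. This is a sketch, not a tested HP-UX tool: the function name recreate_memlog and the timestamp suffix are my own additions, and the chown step is skipped when the function is not run as root.

```shell
# Move the existing log aside under a timestamped name, then create an empty
# replacement with mode 644 owned by root:root, as in the manual steps above.
# Hypothetical helper; the default path is the one used in this post.
recreate_memlog() {
    log=${1:-/var/stm/logs/os/memlog}
    stamp=$(date +%Y%m%d%H%M%S)
    if [ -f "$log" ]; then
        mv "$log" "$log.$stamp" || return 1
    fi
    touch "$log" && chmod 644 "$log" || return 1
    # chown needs root; skip it when run unprivileged (for example in a dry run).
    if [ "$(id -u)" -eq 0 ]; then
        chown root:root "$log" || return 1
    fi
}
```

Called without arguments as root, it acts on /var/stm/logs/os/memlog; passing another path lets you try it on a scratch file first.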

I also tried to clear the log through the Logtool utility, with no luck:

cstm
cstm>runutil logtool
Logtool Utility>CL
The Memory->Clear Log operation is not available on IPF systems.

Recreating the memory log file always works.