Showing posts with label HP-UX. Show all posts

Thursday, June 25, 2020

Replacing a Boot Mirrored Disk in HP-UX 11.31 (11i v3)

Initialize boot information on the replacement disk.

Save the hardware paths to the disk.
MyHPUX01:(/root/home/root)(root)#ioscan -m lun /dev/disk/disk7
Class     I  Lun H/W Path  Driver  S/W State   H/W Type     Health  Description
======================================================================
disk      7  64000/0xfa00/0x1   esdisk  CLAIMED     DEVICE       online  HP      DG146BB976 
             0/4/1/0.0x5000c5000c7bc53d.0x0
                      /dev/disk/disk7      /dev/disk/disk7_p3   /dev/rdisk/disk7_p2
                      /dev/disk/disk7_p1   /dev/rdisk/disk7     /dev/rdisk/disk7_p3
                      /dev/disk/disk7_p2   /dev/rdisk/disk7_p1


In my case, the disk to be replaced has the following hardware paths:
LUN hardware path: 64000/0xfa00/0x1
lunpath hardware path: 0/4/1/0.0x5000c5000c7bc53d.0x0

The disk is hot-swappable.
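
If you want to keep the two paths in shell variables instead of copying them by hand, something like the following might work. It is only a sketch: the sample text is the ioscan listing shown above, sample_ioscan is a made-up helper name, and on a live system you would feed awk from the real "ioscan -m lun /dev/disk/disk7" instead. It also assumes the lunpath line directly follows the "disk" line, as it does here.

```shell
#!/bin/sh
# sample_ioscan is a hypothetical stand-in for: ioscan -m lun /dev/disk/disk7
sample_ioscan() {
    cat <<'EOF'
Class     I  Lun H/W Path  Driver  S/W State   H/W Type     Health  Description
======================================================================
disk      7  64000/0xfa00/0x1   esdisk  CLAIMED     DEVICE       online  HP      DG146BB976
             0/4/1/0.0x5000c5000c7bc53d.0x0
                      /dev/disk/disk7      /dev/disk/disk7_p3   /dev/rdisk/disk7_p2
                      /dev/disk/disk7_p1   /dev/rdisk/disk7     /dev/rdisk/disk7_p3
                      /dev/disk/disk7_p2   /dev/rdisk/disk7_p1
EOF
}

# LUN hardware path: third field of the line whose first field is "disk".
lun_hw_path=$(sample_ioscan | awk '$1 == "disk" { print $3; exit }')

# lunpath hardware path: first field of the line right after the "disk" line.
lunpath_hw_path=$(sample_ioscan | awk '$1 == "disk" { getline; print $1; exit }')

echo "LUN hardware path:     $lun_hw_path"
echo "lunpath hardware path: $lunpath_hw_path"
```

A disk with several lunpaths would list more than one path line; this sketch only picks up the first.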

Halt LVM access to the disk.

# pvchange -a N /dev/disk/disk7_p2


Determine the new LUN instance number for the replacement disk.
# ioscan -m lun

- Create a partition description file:
# vi /tmp/partitionfile
3
EFI 500MB
HPUX 100%
HPSP 400MB

# idisk -wf /tmp/partitionfile /dev/rdisk/disk-newdisk-


           -w   Enable write mode.  By default, idisk operates in read-only
                mode.  To create and write partition information to the disk
                you must specify the -w option.
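
Instead of typing the description file in vi, it can also be written from a script. A minimal sketch, assuming the same three-partition layout as above; the PARTFILE variable is my own addition so the path can be overridden, and the script deliberately does not call idisk itself.

```shell
#!/bin/sh
# Path of the description file; /tmp/partitionfile is the name used in this post.
# PARTFILE is a hypothetical override, not something idisk knows about.
partfile=${PARTFILE:-/tmp/partitionfile}

# Line 1 is the number of partitions, followed by one "type size" line each.
cat > "$partfile" <<'EOF'
3
EFI 500MB
HPUX 100%
HPSP 400MB
EOF

# Sanity check: the count on line 1 should equal the number of partition lines.
declared=$(head -n 1 "$partfile")
actual=$(($(wc -l < "$partfile") - 1))

if [ "$declared" -eq "$actual" ]; then
    echo "OK: $declared partitions described in $partfile"
else
    echo "Mismatch: header says $declared but $actual partitions are listed" >&2
fi
```

The file is then passed to idisk with -wf exactly as in the command above.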



- Create the device files for the new partitions (for example disk28_p1, disk28_p2 and disk28_p3):
# insf -e -C disk

You should now see the partition device files in the listing:
# ioscan -m lun


Now assign the old instance number to the replacement disk.
# io_redirect_dsf -d /dev/disk/disk-old- -n /dev/disk/disk-new-

# ioscan -m lun /dev/disk/disk-new-

The LUN representation of the old disk with LUN hardware path 64000/0xfa00/0x0 was
removed. The LUN representation of the new disk with LUN hardware path
64000/0xfa00/0x1c was reassigned from LUN instance disk-new- to LUN instance 14 and its device
special files were renamed as /dev/disk/disk14 and /dev/rdisk/disk14.


Use efi_fsinit(1M) to initialize the FAT filesystem on the EFI partition (_p1) and the service partition (_p3):

# efi_fsinit -d /dev/rdisk/disk7_p1

# efi_fsinit -d /dev/rdisk/disk7_p3

Write the boot utilities to the disk:

# mkboot -e -l /dev/rdisk/disk7

To check the EFI partition:
# efi_ls -d /dev/rdisk/disk7_p1

To check the LIF area:
# lifls -l /dev/rdisk/disk7_p2

- Check the content of the AUTO file on the EFI partition:

# efi_cp -d /dev/rdisk/disk7_p1 -u /EFI/HPUX/AUTO /tmp/x
# cat /tmp/x
boot vmunix

NOTE: Specify the -lq option if you prefer that your system boots without
interruption in case of a disk failure.
On the original boot disk:
# mkboot -a "boot vmunix -lq" /dev/rdisk/disk7


Restore LVM configuration information to the new disk.

For example:

# vgcfgrestore -n /dev/vg00 /dev/rdisk/disk7_p2

Restore LVM access to the disk.
If you detached the disk with pvchange -a N instead of rebooting the system, reattach it as follows:

# vgchange -a y /dev/vg00
# vgdisplay -v vg00

Synchronize the volume group data (only if the sync does not start automatically):

# cd /tmp
# nohup vgsync /dev/vg00 &
(for the output, see /tmp/nohup.out)

Initialize/check boot information on the disk.
- Check that the content of the LABEL file (i.e. the root, boot, swap and dump device definitions) has been
initialized (done by lvextend) on the mirror disk:

# lvlnboot -v
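
Since the same device name shows up in most of the commands above, it can help to print the whole sequence for a given disk before typing anything. This is only a dry-run sketch: print_plan is a helper name I made up, it prints the commands from this post and runs none of them, and the order is simply the order used above.

```shell
#!/bin/sh
# print_plan is a hypothetical helper: it only PRINTS the per-disk commands
# from this post for the given disk and volume group, it does not run them.
print_plan() {
    disk=$1   # agile device name without partition suffix, e.g. disk7
    vg=$2     # volume group name, e.g. vg00
    cat <<EOF
efi_fsinit -d /dev/rdisk/${disk}_p1
efi_fsinit -d /dev/rdisk/${disk}_p3
mkboot -e -l /dev/rdisk/${disk}
vgcfgrestore -n /dev/${vg} /dev/rdisk/${disk}_p2
vgchange -a y /dev/${vg}
lvlnboot -v
EOF
}

plan=$(print_plan disk7 vg00)
echo "$plan"
```

Reviewing the printed list first makes it harder to mix up the _p1, _p2 and _p3 device files.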

Monday, March 16, 2015

Clearing Single Bit Error Logs in CSTM

Here I'll describe how to clear the memory error log and the PDT error log on HP-UX zx6000 and rx2600 Itanium servers. This procedure needs to be done after replacing a DIMM, for example when the server had one or more bad DIMMs and you replaced them with new ones. After that, the PDT (Page Deallocation Table) must be cleared.

To check whether you have any memory errors, run:

echo "selclass qualifier memory;info;wait;infolog" | cstm

You will get something like this (it will differ depending on your memory map):

Memory Board Inventory

   DIMM Location          Size(MB)     DIMM Location          Size(MB)
   -------------          --------     -------------          --------
   DIMM 0A                2048         DIMM 0B                2048
   DIMM 1A                2048         DIMM 1B                2048
   DIMM 2A                ----         DIMM 2B                ----
   DIMM 3A                ----         DIMM 3B                ----
   DIMM 4A                ----         DIMM 4B                ----
   DIMM 5A                ----         DIMM 5B                ----

   Total: 8192 (MB)

   ===========================================================================

Memory Error Log Summary

   DIMM Location           Error Address     Error Type  Page           Count
   -------------           -------------     ----------  ----           -----
   DIMM 0B                 0x12263d00        Single-Bit  0x12263        1
   RANK 0                  0x40e6042500      Multi-Bit   0x40e6042      N/A
   RANK 0                  0x40e03ebc00      Multi-Bit   0x40e03eb      N/A
   RANK 0                  0x40c60d3580      Multi-Bit   0x40c60d3      N/A
   RANK 0                  0x40b0176480      Multi-Bit   0x40b0176      N/A
   RANK 0                  0x1428e9d80       Multi-Bit   0x1428e9       N/A

Above we see a problem with DIMM 0B; it needs to be replaced.
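
On a box with many DIMMs it can be easier to let awk pick the affected locations out of the summary. A rough sketch, assuming the column layout shown above; memlog_summary is a made-up helper holding the sample lines from this post, and on a real system you would feed awk from the saved cstm output instead.

```shell
#!/bin/sh
# memlog_summary is a hypothetical stand-in for the "Memory Error Log Summary"
# part of the cstm output shown above.
memlog_summary() {
    cat <<'EOF'
   DIMM Location           Error Address     Error Type  Page           Count
   DIMM 0B                 0x12263d00        Single-Bit  0x12263        1
   RANK 0                  0x40e6042500      Multi-Bit   0x40e6042      N/A
   RANK 0                  0x40e03ebc00      Multi-Bit   0x40e03eb      N/A
   RANK 0                  0x40c60d3580      Multi-Bit   0x40c60d3      N/A
   RANK 0                  0x40b0176480      Multi-Bit   0x40b0176      N/A
   RANK 0                  0x1428e9d80       Multi-Bit   0x1428e9       N/A
EOF
}

# Field 4 is the error type; fields 1 and 2 together form the location.
single_bit=$(memlog_summary | awk '$4 == "Single-Bit" { print $1 " " $2 }' | sort -u)

# Count the multi-bit entries as well.
multi_bit_count=$(memlog_summary | awk '$4 == "Multi-Bit" { n++ } END { print n + 0 }')

echo "Locations with single-bit errors: $single_bit"
echo "Multi-bit entries: $multi_bit_count"
```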

Clearing procedure:
Reboot the server, choose to start the EFI shell, and in the shell type: pdt clear
If you are asked to confirm, type "yes".
Boot the OS and rerun the cstm command (see above) to make sure the log is clear. If you see the following output, all is OK:

Memory Error Log Summary
    The memory error log is empty.
 Page Deallocation Table (PDT)
    The Page Deallocation Table is empty.


If you still see errors, you can simply recreate the memory log file:

mv /var/stm/logs/os/memlog /var/stm/logs/os/memlog.old
touch /var/stm/logs/os/memlog
chmod 644 /var/stm/logs/os/memlog
chown root:root /var/stm/logs/os/memlog
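
The four commands can be wrapped in a small function so the path is typed only once. A sketch under my own assumptions: recreate_memlog is a hypothetical name, the chown is skipped when not running as root, and the demo below works on a scratch file rather than on /var/stm/logs/os/memlog.

```shell
#!/bin/sh
# recreate_memlog is a hypothetical helper: move the given log aside and
# create a fresh, empty one with mode 644 (and root ownership when run as root).
recreate_memlog() {
    log=$1
    [ -f "$log" ] || { echo "no such file: $log" >&2; return 1; }
    mv "$log" "$log.old" &&
    touch "$log" &&
    chmod 644 "$log" || return 1
    # chown needs root privileges, so only attempt it as root.
    if [ "$(id -u)" -eq 0 ]; then
        chown root:root "$log"
    fi
}

# Demo on a scratch copy; on the server the argument would be
# /var/stm/logs/os/memlog.
demo_dir=${TMPDIR:-/tmp}/memlog_demo.$$
mkdir -p "$demo_dir"
echo "old entries" > "$demo_dir/memlog"
recreate_memlog "$demo_dir/memlog"
ls -l "$demo_dir"
```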

I also tried to clear the log through the Logtool utility, with no luck:

cstm
cstm>runutil logtool
Logtool Utility>CL
The Memory->Clear Log operation is not available on IPF systems.

Recreating the memory log file has always worked for me.



Tuesday, March 10, 2015

Changing kernel parameters in HP-UX

kctune is the HP-UX administrative command for viewing and changing kernel parameters (tunables). The following shows how to view or modify them.

Viewing Kernel Parameters:
$ /usr/sbin/kctune

Modifying Kernel Parameters:
/usr/sbin/kctune <parameter name>=<value>
Sample Output:
mydb:/ #/usr/sbin/kctune hires_timeout_enable=1
     ==> Update the automatic 'backup' configuration first? yes
       * The automatic 'backup' configuration has been updated.
       * Future operations will update the backup without prompting.
        * The requested changes have been applied to the currently
         running configuration.
Tunable                         Value  Expression  Changes
hires_timeout_enable  (before)     0   Default     Immed
                       (now)       1   1
mydb:/ #

Viewing Specific Kernel Parameter:
/usr/sbin/kctune <parameter name>
Use the command below if you have HP-UX B.11.31:
mydb:/ #/usr/sbin/kctune hires_timeout_enable
Tunable               Value  Expression  Changes
hires_timeout_enable      1  1           Immed
mydb:/ #
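
If a script needs just the current value, the second column of that output can be picked out with awk. A small sketch, assuming the column layout shown above; kctune_output is a made-up helper holding the sample lines, and on a live system it would be replaced by /usr/sbin/kctune followed by the tunable name.

```shell
#!/bin/sh
# kctune_output is a hypothetical stand-in for: /usr/sbin/kctune hires_timeout_enable
kctune_output() {
    cat <<'EOF'
Tunable               Value  Expression  Changes
hires_timeout_enable      1  1           Immed
EOF
}

tunable=hires_timeout_enable

# Match the row by tunable name (field 1) and print its Value column (field 2).
value=$(kctune_output | awk -v name="$tunable" '$1 == name { print $2 }')

echo "$tunable is currently set to $value"
```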
If you have HP-UX B.11.23, you can use sysdef and grep for the parameter name instead:
sun2:/home/oracle #sysdef | grep maxuprc
maxuprc                    3686          -          3-                   -
sun2:/home/oracle #


Number of Open LV and Current LV is different for VG

On one of our systems I see that the current number of logical volumes in vg00 does not equal the number of logical volumes currently open. The host does not seem to have any unusual problems because of this.

There are 16 pairs of files in the /dev/vg00 directory, which matches the 16 "open" LVs. However, there are 18 reported as "Cur LV". 

Unfortunately, due to additions/deletions the minor numbers are all over the map ranging from 0x000001 to 0x00001c.

I had hoped to avoid the brute-force approach, but it worked. To get a nice list of all the LVs, sorted in minor number order, I used:

ls -l /dev/vg00 | sort +5

Then I used a manual list of missing minor numbers to build a script to create all the devices:

for minor in 09 0c 10 11 12 14 15 16 17 18 19 1b; do echo mknod /dev/vg00/mia_${minor} b 64 0x0000${minor}; echo mknod /dev/vg00/rmia_${minor} c 64 0x0000${minor}; done > /tmp/make-minors.sh 

Then I just ran "sh /tmp/make-minors.sh" (after verifying that it looked good)
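
The list of missing minor numbers does not have to be built by hand. Here is a sketch of one way to compute the gaps: the "present" list is my reconstruction of the 16 minor numbers that remain in the 0x01 to 0x1c range once the 12 missing ones above are taken out, so treat it as an assumption. On the real system it would come from the ls output.

```shell
#!/bin/sh
# Minor numbers that already have device files (last two hex digits).
# Assumption: reconstructed from the range 0x01..0x1c minus the missing list above.
present="01 02 03 04 05 06 07 08 0a 0b 0d 0e 0f 13 1a 1c"

missing=""
i=1
while [ "$i" -le 28 ]; do            # 28 decimal = 0x1c, the highest minor in use
    hex=$(printf '%02x' "$i")
    case " $present " in
        *" $hex "*) ;;                # a device file exists for this minor
        *) missing="${missing}${missing:+ }${hex}" ;;
    esac
    i=$((i + 1))
done

echo "$missing"
```

The resulting list can be used as the word list of the for loop above.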

Now "pvdisplay -v" shows me which LVs were missing but still allocated. Now for a little judicious use of "lvremove"...

Thanks for the help.

PS: Here's what I saw that was odd: [Note the Cur LV vs. Open LV]

# vgdisplay vg00
--- Volume groups ---
VG Name /dev/vg00
VG Write Access read/write
VG Status available
Max LV 255
Cur LV 18
Open LV 16
Max PV 16
Cur PV 2
Act PV 2
Max PE per PV 2500
VGDA 4
PE Size (Mbytes) 4
Total PE 4338
Alloc PE 3964
Free PE 374
Total PVG 0
Total Spare PVs 0
Total Spare PVs in use 0

PPS: It was "19" and "1b" that had been removed without removing their allocated logical extents.
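
For anyone who wants to check their own volume groups for the same mismatch, the two counters can be compared with a few lines of shell. A sketch only: vgdisplay_output is a made-up helper holding part of the output above, and on a real system you would pipe "vgdisplay vg00" into awk instead.

```shell
#!/bin/sh
# vgdisplay_output is a hypothetical stand-in for: vgdisplay vg00
vgdisplay_output() {
    cat <<'EOF'
--- Volume groups ---
VG Name /dev/vg00
VG Write Access read/write
VG Status available
Max LV 255
Cur LV 18
Open LV 16
Max PV 16
Cur PV 2
Act PV 2
EOF
}

# Pick the number that follows "Cur LV" and "Open LV".
cur_lv=$(vgdisplay_output | awk '$1 == "Cur" && $2 == "LV" { print $3 }')
open_lv=$(vgdisplay_output | awk '$1 == "Open" && $2 == "LV" { print $3 }')
diff_lv=$((cur_lv - open_lv))

if [ "$diff_lv" -ne 0 ]; then
    echo "Cur LV ($cur_lv) and Open LV ($open_lv) differ by $diff_lv"
else
    echo "Cur LV and Open LV match ($cur_lv)"
fi
```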