Wednesday, May 17, 2017

NFSv4 ownership as nobody:nobody

On RHEL 6.3 and newer, NFS clients and servers can disable idmapping for the AUTH_SYS/UNIX authentication flavor by setting the following booleans:

NFS client

# echo 'Y' > /sys/module/nfs/parameters/nfs4_disable_idmapping

NFS server

# echo 'Y' > /sys/module/nfsd/parameters/nfs4_disable_idmapping
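
To confirm what the two modules currently report, a small helper along these lines may be handy. It is only a sketch: the function name show_idmap_state and its optional directory argument are my own additions, and it simply reads the same sysfs files written above.

```shell
# Report the nfs4_disable_idmapping setting of the nfs and nfsd modules.
# The optional argument overrides the sysfs module directory (useful for
# testing); the function name is illustrative, not a standard tool.
show_idmap_state() {
  root=${1:-/sys/module}
  for mod in nfs nfsd; do
    f=$root/$mod/parameters/nfs4_disable_idmapping
    if [ -r "$f" ]; then
      echo "$mod: $(cat "$f")"
    else
      echo "$mod: parameter not available"
    fi
  done
}
```

Calling show_idmap_state without arguments reads /sys/module; a module that is not loaded shows up as "parameter not available".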

Wednesday, March 22, 2017

AIX unique FCS ID & disk ID (UDID)

Recently I discovered a new switch for the lspv command: -u. It seems to be available since AIX7 TL1 and AIX6 TL7. Thanks for this switch, IBM; I like it very much.
A physical volume (hdisk) to an LPAR can be provided in several ways:
  • without VIOS
    • local physical disk connected to a physical disk SCSI/SAS controller owned by the LPAR
    • SAN disk connected to a physical FC controller
  • with VIOS
    • SAN disk connected to a virtual FC controller
    • a physical disk mapped through a virtual scsi
    • SAN disk mapped through a virtual scsi
    • logical volume on VIOS mapped through a virtual scsi
If you need to know where a specific disk comes from without looking into your documentation (which, of course, is up to date and precise) or logging in to the VIOSes etc., lspv -u helps significantly.
The switch displays the unique ID (UDID) of each disk and does the same job as the code I was using before, but in a handier way.
lsdev -Cc disk -F name | while read hdisk; do
  echo ${hdisk},$(odmget -q "name=${hdisk} and attribute=unique_id" CuAt|grep value|cut -d '"' -f2)
done
The UDID is built up using a certain logic. I haven't seen the code, but some of the interesting parts are easy to spot:
  • the unique_id attribute of a SAS-attached hdisk is 2A1135000C500337924AB0BST9146852SS03IBMsas. This ID contains the WWNN of the SAS controller (00C500337924AB), the disk vendor (IBM), the connection type (sas), and also the disk model ST9146852SS, which is a 146.8 GB SAS disk.
  • the unique ID of a logical volume is VGID.sequence_number. For example, the ID of the third logical volume in a specific VG is 00cc572600004c00000001325d0ab9bd.3.
A few examples of lspv -u output with comments on the UDID:
  • Locally attached SAS disk
    hdisk0  00cc57264df9dc6b  rootvg  active  2A1135000C500238633030BST9146852SS03IBMsas
  • virtual disk connected via virtual scsi, backed by SAS disk.
    hdisk28  00cc572646f271f9  none  3F2A2A1135000C500337924AB0BST9146852SS03IBMsas05VDASD03AIXvscsi
  • virtual disk connected via virtual scsi, backed by logical volume. The logical volume ID can be identified by the period.
    hdisk0  00cc57265d4534bd  rootvg  active  372200cc572600004c00000001325d0ab9bd.205VDASD03AIXvscsi
  • SAN disk attached via physical or virtual FC. The disk comes from a DS8100 disk array identified as 75BBXM1. Volume ID of the disk in DS8100 is 180D.
    hdisk39  00cc5726aaa9e19e  testvg  active  200B75BBXM1180D07210790003IBMfcp
  • virtual disk connected via virtual scsi, backed by SAN disk. The disk comes from a DS8300 disk array identified as 75YY981. Volume ID of the disk in DS8300 is 0501.
    hdisk2  00cc57260b6d4ca6  rootvg  active  3520200B75YY981050107210790003IBMfcp05VDASD03AIXvscsi
I trimmed the output slightly to fit on screen better. In fact, lspv -u shows one more column on the right with a long string of unknown purpose.
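
The patterns in the examples above can be turned into a rough classifier. This is only a sketch based on the five sample UDIDs shown here; the function name classify_udid and the labels it prints are my own, and other storage types will simply come back as "unknown".

```shell
# Classify an AIX disk by the markers in its unique_id (UDID).
# The patterns come only from the sample lspv -u lines in this post.
classify_udid() {
  udid=$1
  case $udid in
    *.*VDASD*AIXvscsi) echo "vscsi, backed by a logical volume" ;;
    *IBMsas*AIXvscsi)  echo "vscsi, backed by a SAS disk" ;;
    *IBMfcp*AIXvscsi)  echo "vscsi, backed by a SAN disk" ;;
    *AIXvscsi)         echo "vscsi, unknown backing device" ;;
    *IBMsas)           echo "local SAS disk" ;;
    *IBMfcp)           echo "SAN disk via FC" ;;
    *)                 echo "unknown" ;;
  esac
}
```

It takes a single UDID string as its argument, for example the value printed by the odmget loop shown earlier.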
Since the disk unique ID is stored in the ODM, it can help when you unmap a disk from a system by mistake, on either the VIOS or the storage, and want to put it back as soon as possible: lspv -u still identifies the disk, as long as you have not run rmdev -dl on it.
Didn't find your lspv -u output in the examples above? Leave a comment!

Two ways to create mksysb images in AIX

1) Create the image from the NIM server with this command:

nim -o define -t mksysb -a server=master -a source=<server name> -a mk_image=yes -a location=<location of the store image> <mksysb image name> 

This will create the mksysb image of the client server and define it on the NIM server. 

nim -o define -t mksysb -a server=master -a source=edppbuslvd01 -a mk_image=yes -a location=/nim/mksysb/edppbuslvd01_6100-04-03-05112010 edppbuslvd01_6100-04-03-05112010 

server=master: the server that stores the image, in this case the master
source=edppbuslvd01: the source of the image, which is the client
location: the location of the stored mksysb image

2) Create the image on the client machine, then copy it to the NIM server and define it there. Alternatively, NFS mount a filesystem from the NIM server on the client.

Let's say you have successfully NFS mounted the NIM server filesystem on the client machine as /mnt.

mksysb -ieX /mnt/edppbuslvd01_6100-04-03-05112010 

-e: exclude the files and directories listed in /etc/exclude.rootvg
-i: call the mkszfile command to generate the /image.data file
The /image.data file contains information on volume groups, logical volumes, file systems, paging space, and physical volumes.
This information is included in the backup for future use by the installation process.
-X: automatically expand the /tmp file system if necessary

After the mksysb image has been created, you need to define it on the NIM server.

nim -o define -t mksysb -a server=master -a location=<image location> <image name>

Monday, December 5, 2016

Replacing a failed disk (rootvg)

This post will describe the replacement of a failed rootvg disk.
In short the procedure is the following:
1. unmirrorvg rootvg hdisk0
(savebase -v)
2. reducevg rootvg hdisk0
3. rmdev -Rdl hdisk0
4. diag (safely remove hot swap device/drive), physically remove the old disk
5. insert the new disk (diag – hotplug task)
6. cfgmgr -vl scsi0
7. extendvg -f rootvg hdisk0
8. mirrorvg -m rootvg hdisk0
9. bosboot -ad hdisk0
10. bootlist -m normal hdisk0 hdisk1
11. bootlist -m normal -o
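
The steps above can be written down as a dry-run plan that only prints the commands for review and executes nothing. This is a sketch: the function name print_replace_plan and its argument order are my own, and it assumes the new disk comes back under the same hdisk name as the failed one, which should be checked after cfgmgr.

```shell
# Print (do not run) the rootvg disk replacement commands for review.
# Arguments: failed disk, surviving mirror disk, parent adapter.
# The function name and argument order are illustrative.
print_replace_plan() {
  failed=$1 mirror=$2 adapter=$3
  cat <<EOF
unmirrorvg rootvg $failed
savebase -v
reducevg rootvg $failed
rmdev -Rdl $failed
# diag: hot plug task, remove the old disk, insert the new disk
cfgmgr -vl $adapter
extendvg -f rootvg $failed
mirrorvg -m rootvg $failed
bosboot -ad $failed
bootlist -m normal $failed $mirror
bootlist -m normal -o
EOF
}
```

For the case in this post the call would be print_replace_plan hdisk0 hdisk1 scsi0.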
The procedure is described in more detail below, starting with the errpt entries that indicate a drive problem.
In the errpt output we can see that there is some problem with a disk:

# errpt
E86653C3   0216144412 P H LVDD           I/O ERROR DETECTED BY LVM
8647C4E2   0216144412 P H hdisk0         DISK OPERATION ERROR
41BF2110   0216144412 U H LVDD           MIRROR WRITE CACHE WRITE FAILED
8647C4E2   0216144412 P H hdisk0         DISK OPERATION ERROR
E86653C3   0216144412 P H LVDD           I/O ERROR DETECTED BY LVM
8647C4E2   0216144412 P H hdisk0         DISK OPERATION ERROR
Let's take a detailed look at the disk operation error:

# errpt -aj 8647C4E2
LABEL:          DISK_ERR3
IDENTIFIER:     8647C4E2
Date/Time:       Thu Feb 16 14:44:06 GMT 2012
Sequence Number: 3706
Machine Id:      00CF405E4C00
Node Id:         power1
Class:           H
Type:            PERM
WPAR:            Global
Resource Name:   hdisk0         
Resource Class:  disk
Resource Type:   scsd
Location:        U787F.001.DPM28WG-P1-T10-L5-L0
        Manufacturer................IBM   H0
        Machine Type and Model......HUS103073FL3800
        FRU Number..................03N5262    
        ROS Level and ID............52505152
        Serial Number...............
        EC Level....................H17923D  
        Part Number.................26K5573    
        Device Specific.(Z0)........000004129F00013E
        Device Specific.(Z1)........RPQR       
        Device Specific.(Z2)........0068
        Device Specific.(Z3)........06131
        Device Specific.(Z4)........0001
        Device Specific.(Z5)........22
        Device Specific.(Z6)........H17923D  
Probable Causes
Failure Causes
Recommended Actions
0A05 0000 2E00 0000 0080 0000 0800 0000 0200 0800 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000
0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0000 0117 0002
Diagnostic Analysis
Diagnostic Log sequence number: 928
Resource tested:        hdisk0
Resource Description:   16 Bit LVD SCSI Disk Drive
Location:               U787F.001.DPM28WG-P1-T10-L5-L0
SRN:                    2643-129
Description:            Error log analysis indicates a SCSI bus problem.
Possible FRUs:
    n/a              FRU: n/a                
                     SCSI bus problem: cables, terminators or other SCSI
    hdisk0           FRU: 03N5262            
                     16 Bit LVD SCSI Disk Drive
    sisscsia0        FRU: 42R7379              U787F.001.DPM28WG-P1
                     PCI-X Dual Channel Ultra320 SCSI Adapter
 n/a              FRU: n/a                
Looking at the disk with the lsvg and lspv commands shows that it is in a missing state:
# lsvg -p rootvg
hdisk0            missing           546         41          05..01..00..00..35
hdisk1            active            546         41          05..00..00..00..36
# lspv hdisk0
PHYSICAL VOLUME:    hdisk0                   VOLUME GROUP:     rootvg
PV IDENTIFIER:      00c0e90dce6c290a VG IDENTIFIER     00c0e90d00004c000000012ff4e24eaa
PV STATE:           missing                                   
STALE PARTITIONS:   21                       ALLOCATABLE:      yes
PP SIZE:            128 megabyte(s)          LOGICAL VOLUMES:  24
TOTAL PPs:          546 (69888 megabytes)    VG DESCRIPTORS:   1
FREE PPs:           41 (5248 megabytes)      HOT SPARE:        no
USED PPs:           505 (64640 megabytes)    MAX REQUEST:      256 kilobytes
FREE DISTRIBUTION:  05..01..00..00..35                        
USED DISTRIBUTION:  105..108..109..109..74                    
MIRROR POOL:        None                      
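
When a volume group has many disks, the state column of lsvg -p can be filtered instead of read by eye. A minimal sketch, assuming disk names start with "hdisk" as they do here; the function name non_active_pvs is my own.

```shell
# Print the physical volumes whose state column is not "active".
# Reads `lsvg -p <vg>` output on stdin; header lines are skipped because
# only lines whose first field starts with "hdisk" are considered.
non_active_pvs() {
  awk '$1 ~ /^hdisk/ && $2 != "active" { print $1 ": " $2 }'
}
```

It is meant to be used as a filter, for example lsvg -p rootvg | non_active_pvs.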
An attempt to read from the disk using dd failed, so it really looks like the disk has died. (It could also be a problem with the controller or cable, but in this scenario it is the disk.)
# dd if=/dev/hdisk0 of=/tmp/disk0 bs=100 count=1024
dd: 0511-051 The read failed.
: There is an input or output error.
0+0 records in.
0+0 records out.
In order to replace the disk we have to do the following:
# unmirrorvg rootvg hdisk0
0516-1734 rmlvcopy: Warning, savebase failed.  Please manually run 'savebase' before rebooting.
0516-1734 rmlvcopy: Warning, savebase failed.  Please manually run 'savebase' before rebooting.
0516-1734 rmlvcopy: Warning, savebase failed.  Please manually run 'savebase' before rebooting.
0516-1734 rmlvcopy: Warning, savebase failed.  Please manually run 'savebase' before rebooting.
0516-1734 rmlvcopy: Warning, savebase failed.  Please manually run 'savebase' before rebooting.
0516-1734 rmlvcopy: Warning, savebase failed.  Please manually run 'savebase' before rebooting.
0516-1246 rmlvcopy: If hd5 is the boot logical volume, please run 'chpv -c '
        as root user to clear the boot record and avoid a potential boot
        off an old boot image that may reside on the disk from which this
        logical volume is moved/removed.
0301-108 mkboot: Unable to read file blocks. Return code: -1
0516-1798 lchangevg: Cannot change quorum without losing quorum.
0516-732 chvg: Unable to change volume group rootvg.
0516-1144 unmirrorvg: rootvg successfully unmirrored, user should perform
        bosboot of system to reinitialize boot records.  Then, user must modify
        bootlist to just include:  hdisk1.
# chpv -c hdisk0
# savebase -v
saving to '/dev/hd5'
75 CuDv objects to be saved
174 CuAt objects to be saved
25 CuDep objects to be saved
39 CuVPD objects to be saved
387 CuDvDr objects to be saved
2 CuPath objects to be saved
0 CuPathAt objects to be saved
0 CuData objects to be saved
0 CuAtDef objects to be saved
Number of bytes of data to save = 38834
Compressing data
Compressed data size is = 9840
        bi_start     = 0x3600
        bi_size      = 0x1820000
        bd_size      = 0x1800000
        ram FS start = 0x8d6ca0
        ram FS size  = 0xea2902
        sba_start    = 0x1803600
        sba_size     = 0x20000
        sbd_size     = 0x2674
Checking boot image size:
        new save base byte cnt = 0x2674
Wrote 9844 bytes
Successful completion
Now remove the disk from the VG and then from the system using rmdev. Afterwards, use diag to safely remove the physical disk from the system.
# reducevg rootvg hdisk0
# rmdev -Rdl hdisk0
hdisk0 deleted
Task selection > Hot plug task > SCSI and SCSI RAID Hot Plug Manager > Replace remove device.
Also use the diag command for the safe physical insertion of the new disk.
Now we need the system to detect and identify the drive, add it to the VG, mirror the VG onto it, create the boot image on the BLV, and set the boot order.
# cfgmgr -vl scsi0
# extendvg -f rootvg hdisk0
# mirrorvg -m rootvg hdisk0
0516-1126 mirrorvg: rootvg successfully mirrored, user should perform
        bosboot of system to initialize boot records.  Then, user must modify
        bootlist to include:  hdisk0 hdisk1.
# bosboot -ad hdisk0
bosboot: Boot image is 49180 512 byte blocks.
# bootlist -m normal hdisk0 hdisk1
# bootlist -m normal -o
hdisk0 blv=hd5 pathid=0
hdisk1 blv=hd5 pathid=0
And finally check we are done.

Ratelimit callbacks suppressed

Recently I hardened some RHEL6-based machines.
During this hardening process I did, among other things, the following:
- disabled ipv6 with "options ipv6 disable=1" in /etc/modprobe.d/hardening.conf
- added some more audit rules according to the NSA guide
- stopped auditd, so audit events are redirected to the kernel log
- filtered audit messages into a separate audit.log with the following filter:
filter f_audit { match(' audit\(' value("MESSAGE")); };

After this hardening, "__ratelimit: XX callbacks suppressed" messages started to appear regularly in the kernel log, like the following:

Oct 18 01:00:01 test1 kernel: __ratelimit: 4 callbacks suppressed
Oct 18 01:01:01 test1 kernel: __ratelimit: 192 callbacks suppressed
Oct 18 01:05:07 test1 kernel: __ratelimit: 188 callbacks suppressed

It didn't cause any problems, but after a while I started investigating (thx Cipo) what was causing this strange behaviour.

The root cause of this problem is a combination of several things:
- with the ipv6 module disabled, some programs keep trying to load it
- I had set an audit rule that logs any module insertion attempt: "-w /sbin/modprobe -p x -k modules"
- there are kernel.printk_ratelimit* kernel parameters
# cat /proc/sys/kernel/printk_ratelimit
# cat /proc/sys/kernel/printk_ratelimit_burst
With the default values (5 and 10 respectively), kernel logging is limited to 10 messages per 5 seconds. When this limit is exceeded, messages are dropped and a "__ratelimit: N callbacks suppressed" line, where N is the number of dropped messages, is written to the kernel log.
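
To get a feel for how much is being dropped, the counts from those lines can be added up. A minimal sketch; the function name sum_suppressed is my own, and which file holds the kernel messages depends on your syslog configuration.

```shell
# Sum the counts from "__ratelimit: N callbacks suppressed" lines on stdin.
sum_suppressed() {
  awk '/__ratelimit: [0-9]+ callbacks suppressed/ {
         for (i = 1; i <= NF; i++)
           if ($i == "__ratelimit:") total += $(i + 1)
       }
       END { print total + 0 }'
}
```

Feed it the kernel log on stdin, for example grep __ratelimit <your kernel log file> | sum_suppressed.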

To put the pieces of the picture together:
- my script runs curl 20 times
- each curl run tries to load the ipv6 module via modprobe
- every modprobe attempt is logged by audit (5 lines per modprobe)
- that many audit lines in a short time exceeds the rate limit

My solution:
- re-enable ipv6 by commenting out the "options ipv6 disable=1" line in /etc/modprobe.d/hardening.conf
- prevent the use of ipv6 by appending the following line to /etc/sysctl.conf:
net.ipv6.conf.all.disable_ipv6 = 1 
(Thanks to Daniel Walsh)

As a result:
- the ipv6 module is already loaded, so programs no longer try to load it
- ipv6 remains pseudo-disabled

After this I still got ratelimit messages, but no longer regularly.
I found out that Midnight Commander deletes a few temporary files and changes the permissions of its config files on exit. This produced more than 150 lines in the audit log within one second...
My solution was to append the following lines to the /etc/sysctl.conf file:
kernel.printk_ratelimit = 30
kernel.printk_ratelimit_burst = 200
This means that the kernel accepts 200 messages per 30 seconds.

Red Hat Resolution
  • The messages are suppressed because some warning messages are rate limited. The kernel parameter printk_ratelimit specifies the minimum length of time between these messages (in seconds); by default one is allowed every 5 seconds.
  • A value of 0 disables rate limiting. However, this does not solve the underlying problem; it only helps while you are resolving it, when you may need to see the messages that would otherwise be suppressed.
  1. Add the following setting to the /etc/sysctl.conf file.
    kernel.printk_ratelimit = 0
  2. Reboot the system or execute the following command.
    sysctl -p

Friday, October 21, 2016

Modify limits configuration without reboot

Changes made by ulimit command:

$ ulimit -n 4096
$ ulimit -Hn 16384

apply only to the current user and session. To make them permanent, you have to modify /etc/security/limits.conf by adding your limits:

* soft nofile 4096
* hard nofile 16384

However, these wildcard entries do not apply to the root user. To set limits for root, you have to name it explicitly:

* soft nofile 4096
* hard nofile 16384
root soft nofile 4096
root hard nofile 16384

These limits will be applied after a reboot.
If you want to apply the changes without a reboot, modify /etc/pam.d/common-session by adding this line at the end of the file:
session required
Upon the next login you should see the updated limits. You can check them (soft and hard limits) with:
$ ulimit -a
$ ulimit -Ha
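
To see which entries in limits.conf name a given domain (a user name or the * wildcard) for a given item, a small filter like the following can help. It is a sketch only: the function name limits_for is my own, it does exact matching on the domain column, and it does not look at any other configuration files.

```shell
# Print "type value" for every entry in a limits.conf-style file that
# matches the given domain and item, ignoring comments and blank lines.
limits_for() {
  domain=$1 item=$2 file=${3:-/etc/security/limits.conf}
  awk -v d="$domain" -v i="$item" '
    { sub(/#.*/, "") }              # strip comments
    NF >= 4 && $1 == d && $3 == i { print $2, $4 }
  ' "$file"
}
```

For example, limits_for root nofile lists the explicit root entries, and limits_for '*' nofile lists the wildcard ones.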


[root@XXX~]# cat /etc/security/limits.conf |grep -v "#"
*               soft    nproc           1024
*               hard    nproc           25000

[root@XXX ~]# ulimit -Ha
core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
scheduling priority             (-e) 0
file size               (blocks, -f) unlimited
pending signals                 (-i) 7409
max locked memory       (kbytes, -l) 64
max memory size         (kbytes, -m) unlimited
open files                      (-n) 4096
pipe size            (512 bytes, -p) 8
POSIX message queues     (bytes, -q) 819200
real-time priority              (-r) 0
stack size              (kbytes, -s) unlimited
cpu time               (seconds, -t) unlimited
max user processes              (-u) 25000
virtual memory          (kbytes, -v) unlimited
file locks                      (-x) unlimited

Monday, March 16, 2015

Clearing Single Bit Error Logs in CSTM

Here I'll describe how to clear the memory error log and the PDT (Page Deallocation Table) on HP-UX zx6000 and rx2600 Itanium servers. This needs to be done after replacing a DIMM: for example, the server had one or more bad DIMMs and you replaced them with new ones. After that, the PDT must be cleared.

To check whether you have any memory errors, run:

echo "selclass qualifier memory;info;wait;infolog" | cstm

You will get something like this (details differ with the memory layout):

Memory Board Inventory
   DIMM Location          Size(MB)     DIMM Location          Size(MB)
   --------------------   --------     --------------------   --------
   DIMM 0A                2048         DIMM 0B                2048
   DIMM 1A                2048         DIMM 1B                2048
   DIMM 2A                ----         DIMM 2B                ----
   DIMM 3A                ----         DIMM 3B                ----
   DIMM 4A                ----         DIMM 4B                ----
   DIMM 5A                ----         DIMM 5B                ----

   Total: 8192 (MB)


Memory Error Log Summary

   DIMM Location           Error Address     Error Type  Page           Count
   ----------------------  ----------------  ----------  -------------  -----
   DIMM 0B                 0x12263d00        Single-Bit  0x12263        1
   RANK 0                  0x40e6042500      Multi-Bit   0x40e6042      N/A
   RANK 0                  0x40e03ebc00      Multi-Bit   0x40e03eb      N/A
   RANK 0                  0x40c60d3580      Multi-Bit   0x40c60d3      N/A
   RANK 0                  0x40b0176480      Multi-Bit   0x40b0176      N/A
   RANK 0                  0x1428e9d80       Multi-Bit   0x1428e9       N/A

Above we can see a problem with DIMM 0B; it needs to be replaced.
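
The DIMM entries can also be pulled out of the summary automatically. A rough sketch that assumes the column layout shown above; the function name dimm_errors is my own.

```shell
# List DIMM locations that appear in the "Memory Error Log Summary" part
# of cstm infolog output read on stdin, with their error type and count.
# Inventory lines before the summary heading and RANK lines are ignored.
dimm_errors() {
  awk '/Memory Error Log Summary/ { in_log = 1; next }
       in_log && $1 == "DIMM" && $3 ~ /^0x/ { print $1, $2, $4, $NF }'
}
```

It is meant to be appended to the cstm pipeline shown earlier, for example ... | cstm | dimm_errors.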

Clearing procedure:
Reboot the server and choose to start the EFI shell. In the shell, type: pdt clear
If you are asked for confirmation, type "yes".
Boot the OS and rerun the cstm command (see above) to make sure the log is clear. If you see the following output, all is OK:

Memory Error Log Summary
    The memory error log is empty.
 Page Deallocation Table (PDT)
    The Page Deallocation Table is empty.

If you still see errors, you can simply recreate the memory log file:

mv /var/stm/logs/os/memlog /var/stm/logs/os/memlog.old
touch /var/stm/logs/os/memlog
chmod 644 /var/stm/logs/os/memlog
chown root:root /var/stm/logs/os/memlog

I also tried to clear the log through the Logtool Utility, with no luck:

cstm>runutil logtool
Logtool Utility>CL
The Memory->Clear Log operation is not available on IPF systems.

Recreating the memory log file always works.