Topics: HMC, System Administration

Command line upgrade of HMC

This is how you upgrade your HMC from version 7.9.0 to service pack 3 and all necessary fixes. At the time of writing, service pack 3 is the latest available service pack, and there are two fixes available for V7 R7.9.0 SP3, called MH01587 and MH01605. The following procedure therefore assumes that your HMC is currently at the base level of version 7.9.0, without any additional fixes or service packs installed.

This procedure is completely command line based. For this to work, you need to be able to ssh into the HMC using the hscroot user. For example, if your HMC is called yourhmc, you should be able to do this:

# ssh -l hscroot yourhmc
We also need to make sure we have some backups. Start with saving some output:
# lshmc -v
# lshmc -V
# lshmc -n
# lshmc -r  
The output of these lshmc commands is useful to determine what is currently installed on the HMC.

Next, take a console data backup of the HMC:
# bkconsdata -r nfs -h 10.11.12.13 -l /mksysb/HMC -d backupfile
The bkconsdata command above backs up the console data of the HMC via NFS to host 10.11.12.13 (replace with your own server name or IP address), and stores it in /mksysb/HMC/backupfile (replace /mksysb/HMC and backupfile in the bkconsdata command above with the correct backup location on your NFS server).

Next, make a backup of the profiles for each managed server:
# bkprofdata -m <managed system> -f <backup file> --force
The bkprofdata command above requires the name of each managed system. A good way to find the names of the managed systems configured on the HMC is to run the following command:
# lssysconn -r all
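For example, if lssysconn shows a managed system called Server-8286-42A-SN123456 (a hypothetical name used here for illustration), the profile backup for that system could look like this:
# bkprofdata -m Server-8286-42A-SN123456 -f backupFile --force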
Now that we have all the necessary backups, it's time to perform the actual upgrade.

Let's start with the upgrade to Service Pack 3:
# updhmc -t s -h ftp.software.ibm.com -u anonymous -p ftp -f /software/server/hmc/updates/HMC_Update_V7R790_SP3.iso -r
This will download the service pack from the IBM site to the HMC via FTP, upgrade the HMC, and reboot it. This may take a while. The updhmc command may return a prompt after the download is completed, but that does not mean the update has finished yet. Please allow it to install and reboot. A message will be shown on the screen: "The system is shutting down for reboot now". After the reboot, run the "lshmc -V" command again. It may take some time before the lshmc command responds with proper output. Again, give it some time. As soon as the lshmc command shows that the service pack is installed, you can move forward to the next step.

The next step is installing the fixes:
# updhmc -t s -h ftp.software.ibm.com -u anonymous -p ftp -f /software/server/hmc/fixes/MH01587.iso -r
And...
# updhmc -t s -h ftp.software.ibm.com -u anonymous -p ftp -f /software/server/hmc/fixes/MH01605.iso -r
After each fix is installed, the HMC will reboot, and you'll have to check with "lshmc -V" whether the fix is installed.

And that concludes the upgrade. If any new service packs and/or fixes are released by IBM, you can install them in a similar fashion.

Topics: AIX, System Administration

Running bootp in debug mode to troubleshoot NIM booting

If you have an LPAR that is not booting from your NIM server, and you're certain the IP configuration on the client is correct (for example, because a ping test completes successfully), then you should have a look at the bootp process on the NIM server as a possible cause of the issue.

To accomplish this, you can put bootp into debug mode. Edit file /etc/inetd.conf, and comment out the bootps entry with a hash mark (#). This prevents bootp from being started by inetd in response to a bootp request. Then refresh the inetd daemon, so it picks up the changes to file /etc/inetd.conf:

# refresh -s inetd
Now check if any bootpd processes are running. If necessary, use kill -9 to kill them, and check again to make sure no bootpd processes are left. Now that bootp has stopped, go ahead and bring up another PuTTY window on your NIM master. You'll need another window, because putting bootp into debug mode will lock the window while it is active. Run the following command in that window:
# bootpd -d -d -d -d -s
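Once bootpd is running in debug mode, you can verify from another window that it is listening on the bootps port (UDP port 67); this is just a quick sanity check:
# netstat -an | grep "\.67 "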
Now you can retry booting the LPAR from your NIM master, and you should see information scrolling by, showing what is going on.

Afterwards, once you've identified the issue, make sure to stop the bootpd process (just hit Ctrl-C to stop it), change file /etc/inetd.conf back the way it was, and run refresh -s inetd again.

Topics: Red Hat, System Administration

Increase the size of a tmpfs file system

On Linux systems, a tmpfs file system keeps the entire file system (with all its files) in virtual memory. All data is stored in memory, which means the data is temporary and will be lost after a reboot. If you unmount the file system, all data in it is gone. You will also find a lot of installations using a tmpfs for /tmp, and hence anything written to /tmp is wiped after a reboot.

To increase the size, do the following:

Modify the /etc/fstab entry to look something like this:

none /raw tmpfs defaults,size=2G 0 0
Then, re-mount the file system:
# mount -o remount /raw
# df -h
Note: Be careful not to increase it too much, as the system can then use up that much real memory.
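If you only need the larger size until the next reboot, you can also pass the new size directly on the remount, without editing /etc/fstab (an example using the same mount point; the change does not persist across reboots):
# mount -o remount,size=3G /raw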

Topics: AIX, Storage, System Administration

Allocating shared storage to VIOS clients

The following is a procedure to add shared storage to a clustered, virtualized environment. It assumes the following: you have a PowerHA cluster on two nodes, nodeA and nodeB. Each node is on a separate physical system, and each node is a client of a VIOS. The storage from the VIOS is mapped as vSCSI to the client. Client nodeA is on viosA, and client nodeB is on viosB. Furthermore, this procedure assumes you're using SDDPCM for multi-pathing on the VIOS.

First of all, have your storage administrator allocate and zone shared LUN(s) to the two VIOS. This needs to be one or more LUNs that are zoned to both of the VIOS. This procedure assumes you will be zoning 4 LUNs of 128 GB each.

Once that is completed, then move to work on the VIOS:

SERVER: viosA

First, gather some system information as user root on the VIOS, and save this information to a file for safe-keeping.

# lspv
# lsdev -Cc disk
# /usr/ios/cli/ioscli lsdev -virtual
# lsvpcfg
# datapath query adapter
# datapath query device
# lsmap -all
Discover new SAN LUNs (4 * 128 GB) as user padmin on the VIOS. This can be accomplished by running cfgdev, the alternative to cfgmgr on the VIOS. Once that has run, identify the 4 new hdisk devices on the system, and run the "bootinfo -s" command to determine the size of each of the 4 new disks:
# cfgdev
# lspv
# datapath query device
# bootinfo -s hdiskX
Assign a PVID to the disks (repeat for each of the LUNs):
# chdev -l hdiskX -a pv=yes
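Assuming the four new disks turned out to be hdisk44 through hdisk47 (the names used in the example below), this can be done for all of them in one go as user root:
# for d in hdisk44 hdisk45 hdisk46 hdisk47; do chdev -l $d -a pv=yes; done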
Next, map the new LUNs from viosA to the nodeA LPAR. You'll need to know two things here: [a] which vhost adapter (or "vadapter") to use, and [b] what name to give each new device (or "virtual target device"). Have a look at the output of the "lsmap -all" command that you ran previously. That will show you the current naming scheme for the virtual target devices, as well as which vhost adapters already exist and are in use for the client. In this case, we'll assume the vhost adapter is vhost0, and that there are already some virtual target devices, called nodeA_vtd0001 through nodeA_vtd0019. The four new LUNs will therefore be named nodeA_vtd0020 through nodeA_vtd0023. We'll also assume the new disks are numbered hdisk44 through hdisk47.
# mkvdev -vdev hdisk44 -vadapter vhost0 -dev nodeA_vtd0020
# mkvdev -vdev hdisk45 -vadapter vhost0 -dev nodeA_vtd0021
# mkvdev -vdev hdisk46 -vadapter vhost0 -dev nodeA_vtd0022
# mkvdev -vdev hdisk47 -vadapter vhost0 -dev nodeA_vtd0023
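To verify that the new virtual target devices were created as intended, you can list the mappings on the vhost adapter again (assuming vhost0, as above):
# lsmap -vadapter vhost0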
Now the mapping of the LUNs is complete on viosA. You'll have to repeat the same process on viosB:

SERVER: viosB

First, gather some system information as user root on the VIOS, and save this information to a file for safe-keeping.
# lspv
# lsdev -Cc disk
# /usr/ios/cli/ioscli lsdev -virtual
# lsvpcfg
# datapath query adapter
# datapath query device
# lsmap -all
Discover new SAN LUNs (4 * 128 GB) as user padmin on the VIOS. This can be accomplished by running cfgdev, the alternative to cfgmgr on the VIOS. Once that has run, identify the 4 new hdisk devices on the system, and run the "bootinfo -s" command to determine the size of each of the 4 new disks:
# cfgdev
# lspv
# datapath query device
# bootinfo -s hdiskX
No need to set the PVID this time. It was already configured on viosA, and after running the cfgdev command, the PVID should be visible on viosB and should match the PVIDs on viosA. Make sure this is correct:
# lspv
Map the new LUNs from viosB to the nodeB LPAR. Again, you'll need to know the vadapter and the virtual target device names to use, and you can derive that information from the output of the "lsmap -all" command. If you've done your work correctly in the past, the naming of the vadapter and the virtual target devices will probably be the same on viosB as on viosA:
# mkvdev -vdev hdisk44 -vadapter vhost0 -dev nodeB_vtd0020
# mkvdev -vdev hdisk45 -vadapter vhost0 -dev nodeB_vtd0021
# mkvdev -vdev hdisk46 -vadapter vhost0 -dev nodeB_vtd0022
# mkvdev -vdev hdisk47 -vadapter vhost0 -dev nodeB_vtd0023
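As on viosA, you can verify the new mappings (again assuming vhost0):
# lsmap -vadapter vhost0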
Now that the mapping on both VIOS has been completed, it is time to move to the client side. First, gather some information about the PowerHA cluster, by running the following as root on the nodeA client:
# clstat -o
# clRGinfo
# lsvg | lsvg -pi
Run cfgmgr on nodeA to discover the mapped LUNs, and then on nodeB:
# cfgmgr
# lspv
Ensure that the disk attributes are correctly set on both servers. Repeat the following command for all 4 new disks:
# chdev -l hdiskX -a algorithm=fail_over -a hcheck_interval=60 -a queue_depth=20 -a reserve_policy=no_reserve
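Assuming the new disks on the clients are hdisk55 through hdisk58 (the names used in the example below), this can be done in one go on each node:
# for d in hdisk55 hdisk56 hdisk57 hdisk58; do chdev -l $d -a algorithm=fail_over -a hcheck_interval=60 -a queue_depth=20 -a reserve_policy=no_reserve; done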
Now you can add the 4 newly added physical volumes to a shared volume group. In our example, the shared volume group is called sharedvg, the newly discovered disks are called hdisk55 through hdisk58, and the concurrent resource group is called concurrent_rg.
# /usr/es/sbin/cluster/sbin/cl_extendvg -cspoc -g'concurrent_rg' -R'nodeA' sharedvg hdisk55 hdisk56 hdisk57 hdisk58
Next, you can move forward with creating logical volumes (and file systems, if necessary), for example when creating raw logical volumes for an Oracle database:
# /usr/es/sbin/cluster/sbin/cl_mklv -TO -t raw -R'nodeA' -U oracle -G dba -P 600 -y asm_raw5 sharedvg 1023 hdisk55
# /usr/es/sbin/cluster/sbin/cl_mklv -TO -t raw -R'nodeA' -U oracle -G dba -P 600 -y asm_raw6 sharedvg 1023 hdisk56
# /usr/es/sbin/cluster/sbin/cl_mklv -TO -t raw -R'nodeA' -U oracle -G dba -P 600 -y asm_raw7 sharedvg 1023 hdisk57
# /usr/es/sbin/cluster/sbin/cl_mklv -TO -t raw -R'nodeA' -U oracle -G dba -P 600 -y asm_raw8 sharedvg 1023 hdisk58
Finally, verify the volume group:
# lsvg -p sharedvg
# lsvg sharedvg
# ls -l /dev/asm_raw*
If the addition of the LUNs has to be backed out, these are the steps to complete (a brief example of steps 4 and 5 follows the list):
  1. Remove the raw logical volumes (using the cl_rmlv command)
  2. Remove the added LUNs from the volume group (using the cl_reducevg command)
  3. Remove the disk devices on both client nodes: rmdev -dl hdiskX
  4. Remove LUN mappings from each VIOS (using the rmvdev command)
  5. Remove the LUNs from each VIOS (using the rmdev command)
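For illustration, steps 4 and 5 on one of the VIOS could look like this as user padmin, reusing the device names from the example above (adjust them to your own environment):
# rmvdev -vtd nodeA_vtd0020
# rmdev -dev hdisk44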

Topics: AIX, System Administration

Export and import PuTTY sessions

PuTTY itself does not provide a means to export the list of sessions, nor a way to import the sessions from another computer. However, it is not difficult once you know that PuTTY stores the session information in the Windows Registry.

To export the PuTTY sessions, run:

regedit /e "%userprofile%\desktop\putty-sessions.reg" HKEY_CURRENT_USER\Software\SimonTatham\PuTTY\Sessions
Or, to export all settings (and not only the sessions), run:
regedit /e "%userprofile%\desktop\putty.reg" HKEY_CURRENT_USER\Software\SimonTatham
This will create either a putty-sessions.reg or putty.reg file on your Windows desktop. You can transfer these files to another computer, and after installing PuTTY on the other computer, simply double-click on the .reg file to have the Windows Registry entries added. Then, if you start PuTTY, all the session information should be there.
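If you prefer to import from a command prompt instead of double-clicking, regedit can merge the file silently (run this in the folder that contains the .reg file):
regedit /s putty-sessions.reg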

Topics: AIX, Storage, System Administration

Identifying a Disk Bottleneck Using filemon

This blog describes the steps required to identify an I/O problem in the storage area network and/or disk arrays on AIX.

Note: Do not execute filemon with AIX 6.1 Technology Level 6 Service Pack 1 if WebSphere MQ is running. WebSphere MQ will abnormally terminate with this AIX release.

Running filemon: As a rule of thumb, a write to a cached fiber attached disk array should average less than 2.5 ms and a read from a cached fiber attached disk array should average less than 15 ms. To confirm the responsiveness of the storage area network and disk array, filemon can be utilized. The following example will collect statistics for a 90 second interval.

# filemon -PT 268435184 -O pv,detailed -o /tmp/filemon.rpt;sleep 90;trcstop

Run trcstop command to signal end of trace.
Tue Sep 15 13:42:12 2015
System: AIX 6.1 Node: hostname Machine: 0000868CF300
[filemon: Reporting started]
# [filemon: Reporting completed]

[filemon: 90.027 secs in measured interval]
Then, review the generated report (/tmp/filemon.rpt).
# more /tmp/filemon.rpt
.
.
.
------------------------------------------------------------------------
Detailed Physical Volume Stats   (512 byte blocks)
------------------------------------------------------------------------

VOLUME: /dev/hdisk11  description: XP MPIO Disk P9500   (Fibre)
reads:                  437296  (0 errs)
  read sizes (blks):    avg     8.0 min       8 max       8 sdev     0.0
  read times (msec):    avg   11.111 min   0.122 max  75.429 sdev   0.347
  read sequences:       1
  read seq. lengths:    avg 3498368.0 min 3498368 max 3498368 sdev     0.0
seeks:                  1       (0.0%)
  seek dist (blks):     init 3067240
  seek dist (%tot blks):init 4.87525
time to next req(msec): avg   0.206 min   0.018 max 461.074 sdev   1.736
throughput:             19429.5 KB/sec
utilization:            0.77

VOLUME: /dev/hdisk12  description: XP MPIO Disk P9500   (Fibre)
writes:                 434036  (0 errs)
  write sizes (blks):   avg     8.1 min       8 max      56 sdev     1.4
  write times (msec):   avg   2.222 min   0.159 max  79.639 sdev   0.915
  write sequences:      1
  write seq. lengths:   avg 3498344.0 min 3498344 max 3498344 sdev     0.0
seeks:                  1       (0.0%)
  seek dist (blks):     init 3067216
  seek dist (%tot blks):init 4.87521
time to next req(msec): avg   0.206 min   0.005 max 536.330 sdev   1.875
throughput:             19429.3 KB/sec
utilization:            0.72
.
.
.
In the above report, hdisk11 was the busiest disk on the system during the 90 second sample. The reads from hdisk11 averaged 11.111 ms. Since this is less than 15 ms, the storage area network and disk array were performing within scope for reads.

Also, hdisk12 was the second busiest disk on the system during the 90 second sample. The writes to hdisk12 averaged 2.222 ms. Since this is less than 2.5 ms, the storage area network and disk array were performing within scope for writes.

Other methods to measure similar information:

You can use the topas command with the -D option to get an overview of the busiest disks on the system:
# topas -D
In the output, columns ART and AWT provide similar information. ART stands for the average time to receive a response from the hosting server for the read request sent. And AWT stands for the average time to receive a response from the hosting server for the write request sent.

You can also use the iostat command, using the -D (for drive utilization) and -l (for long listing mode) options:
# iostat -Dl 60
This will provide an overview of your disks over a 60-second period. The "avg serv" column under the read and write sections provides the average service times for reads and writes for each disk.

An occasional peak value recorded on a system doesn't immediately mean there is a disk bottleneck. Longer periods of monitoring are required to determine if a certain disk is indeed a bottleneck for your system.
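For example, to capture an hour's worth of samples for later review, you could run something like this (the output file name is arbitrary):
# iostat -Dl 60 60 > /tmp/iostat_disks.out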

Topics: AIX, System Administration

Commands to create printer queues

Here are some commands to add a printer to an AIX system. Let's assume that the hostname of the printer is "printer", and that you've added an entry for this "printer" in /etc/hosts, or added it to DNS, so it can be resolved to an IP address. Let's also assume that the queue you wish to create will be called "printerq", and that your printer can communicate on port 9100.

In that case, to create a generic printer queue, the command will be:

# /usr/lib/lpd/pio/etc/piomkjetd mkpq_jetdirect -p 'generic' -D asc \
-q 'printerq' -h 'printer' -x '9100'

If you wish to set it up as a PostScript printer, called "printerqps", then the command will be:
# /usr/lib/lpd/pio/etc/piomkjetd mkpq_jetdirect -p 'generic' -D ps \
-q 'printerqps' -h 'printer' -x '9100'
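Afterwards, you can check the status of the new queue and send a quick test print (the file used here is just an example):
# qchk -P printerq
# lp -d printerq /etc/motd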

Topics: AIX, Monitoring, Networking, Red Hat, Security, System Administration

Determining type of system remotely

If you run into a system that you can't access, but that is available on the network, and you have no idea what type of system it is, then there are a few tricks you can use to determine the type of system remotely.

The first one is by looking at the TTL (Time To Live) when doing a ping to the system's IP address. For example, a ping to an AIX system may look like this:

# ping 10.11.12.82
PING 10.11.12.82 (10.11.12.82) 56(84) bytes of data.
64 bytes from 10.11.12.82 (10.11.12.82): icmp_seq=1 ttl=253 time=0.394 ms
...
TTL (Time To Live) is a value in the IP header that is decremented by one by every router a packet passes through; when it reaches zero, the packet is discarded. Operating systems use different initial TTL values, so you can often determine the OS based on the TTL value seen in the ping reply. A detailed list of operating systems and their default TTL values can be found online. Basically, a UNIX/Linux system has a TTL of 64, Windows uses 128, and AIX/Solaris uses 254.

Now, in the example above, you can see "ttl=253". It's still an AIX system, but there's most likely a router in between, decreasing the TTL by one.
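If you want to confirm that there is indeed a hop in between, a quick traceroute to the same IP address will show the path:
# traceroute 10.11.12.82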

Another good method is by using nmap. The nmap utility has a -O option that allows for OS detection:
# nmap -O -v 10.11.12.82 | grep OS
Initiating OS detection (try #1) against 10.11.12.82 (10.11.12.82)
OS details: IBM AIX 5.3
OS detection performed.
Okay, so it isn't a perfect method either. We ran the nmap command above against an AIX 7.1 system, and it came back as AIX 5.3 instead. And sometimes you'll have to run nmap a couple of times before it successfully discovers the OS type. But still, we now know it's an AIX system behind that IP.

Another option you may use is to query SNMP information. If the device is SNMP enabled (it is running an SNMP daemon and it allows you to query SNMP information), then you may be able to run a command like this:
# snmpinfo -h 10.11.12.82 -m get -v sysDescr.0
sysDescr.0 = "IBM PowerPC CHRP Computer
Machine Type: 0x0800004c Processor id: 0000962CG400
Base Operating System Runtime AIX version: 06.01.0008.0015
TCP/IP Client Support  version: 06.01.0008.0015"
By the way, the example for SNMP above is exactly why AIX Health Check generally recommends disabling SNMP, or at least disallowing the disclosure of such system information through SNMP by updating the /etc/snmpdv3.conf file appropriately, because this information can be really useful to hackers. On the other hand, your organization may use monitoring that relies on SNMP, in which case it needs to be enabled. But then you still have the option of changing the SNMP community name to something else (the default is "public"), which also limits the remote information gathering possibilities.

Topics: AIX, System Administration

Resolving IBM.DRM software errors

If you see several SRC_RSTRT errors in the error report regarding IBM.DRM or IBM.AuditRM, using identifiers CB4A951F or BA431EB7, and detecting module "srchevn.c", then you probably have a system that was cloned in the past from another system, and the RSCT software is still using the keys of the original system.

The solution is this:

# /usr/sbin/rsct/bin/rmcctrl -z 
# /usr/sbin/rsct/bin/rmcctrl -d 
# /usr/sbin/rsct/install/bin/recfgct -s 
# /usr/sbin/rsct/bin/rmcctrl -A 
# /usr/sbin/rsct/bin/rmcctrl -p 
This will generate new keys and will resolve the errors in the error report. Just to make sure, reboot your system; the errors should no longer show up in the error report after the reboot.
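After the reboot, you can confirm that no new entries with these identifiers are being logged:
# errpt -j CB4A951F,BA431EB7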

Topics: Red Hat, System Administration

RHSM: Too many content sets for certificate

This is how to fix the subscription-manager error "Too many content sets for certificate Red Hat Enterprise Linux Server" by temporarily using RHN, and then reverting back to Red Hat Subscription Management after updating.

Step 1: Clean up the subscription-manager if needed:

# subscription-manager unsubscribe --all
# subscription-manager unregister
# subscription-manager clean
Step 2: Register to Red Hat Network (RHN) using rhn_register:
# rhn_register
Note: You will need your RH login and password to complete the wizard.

Step 3: Validate RHN registration of the system:
# yum repolist
Note: Look at Loaded plugins in the output and "rhnplugin" should be listed.

Step 4: Update the subscription-manager* and python-rhsm* packages:
# yum list updates subscription-manager* python-rhsm*
Note: The output may vary depending on your system and installed packages.

Example output below:
Updated Packages
python-rhsm.x86_64 1.12.5-2.el6 rhel-x86_64-server-6
subscription-manager.x86_64 1.12.14-9.el6_6 rhel-x86_64-server-6
subscription-manager-firstboot.x86_64 1.12.14-9.el6_6 rhel-x86_64-server-6
subscription-manager-gnome.x86_64 0.99.19.4-1.el6_3 rhel-x86_64-server-6
# yum update subscription-manager* python-rhsm*
Note: Answer the questions when prompted. Validate the updates were applied successfully by examining the output.
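One way to validate this is to query the installed versions afterwards:
# rpm -q subscription-manager python-rhsm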

Step 5: Unregister from RHN in preparation to register with subscription-manager:
  1. In the online Red Hat Portal, login.
  2. Access Subscription Management.
  3. Access RHN Classic Management -> All Registered Systems.
  4. Click on System Entitlements (you need to see check boxes next to systems).
  5. Select the check box next to the system you are working on.
  6. Click the "Unentitle" button at bottom middle of page.
  7. Validate the entitlement has been removed for the system.
  8. Perform the below command on the system's CLI:
    # rm /etc/sysconfig/rhn/systemid
Step 6: Register system with subscription-manager:

Note: Validate that no subscriptions are showing active.
# subscription-manager list --available
Note: A message similar to below should be displayed.
This system is not yet registered. Try 'subscription-manager register --help' for more information.
Register the system using your credentials to RHSM:
# subscription-manager register --username=xxxxxx --password='xxxxxx'
Note: You will need your Red Hat Portal username and password for the account the system will be registered under. Make note of the ID under which the system is registered when this command returns.

Validate that the subscription-manager plugin is loaded:
# yum repolist
Look at Loaded plugins in the output where "subscription-manager" should be listed.

Validate that subscriptions are showing available now:
# subscription-manager list --available
Validate that the Subscription Name, SKU, Contract, Account and Pool ID are showing up correctly. Make note of the "Pool ID", which will be required to subscribe in the next step. Then attach the system to one of the pools above:
# subscription-manager subscribe --pool='[POOL_ID_Number]'
Note: "[POOL_ID_Number]" should be replaced with the Pool ID obtained in the preceding step.

Make sure a message is shown stating that a subscription was "Successfully attached" for the system.

Step 7: Validate that the system is now consuming a subscription:
# subscription-manager list --consumed
Validate the Subscription Name, SKU, Contract, Account and Pool ID are correct.
# subscription-manager list
Note: The Status should show "Subscribed".

Step 8: Validate in Red Hat Portal that the new system shows up as well.

In Red Hat Portal:
  1. In the online Red Hat Portal, login.
  2. Access Subscription Management.
  3. Access Red Hat Subscription Management -> Subscriber Inventory -> Click on Systems.
  4. Examine the Systems inventory to validate the new system is now visible and shows a subscription attached.
