How it works

Installing AIX Health Check

Transfer the AIX Health Check tar image onto the AIX system, and untar the file. Any directory will do; you can transfer the tar image into the directory of your preference and untar it there:
root@(testaix1) /ahc # ls *tar
ahc_latest.tar
root@(testaix1) /ahc # tar -xvf *tar
x checkactivatedrpcservices.ksh, 1625 bytes, 4 media blocks.
x checkadaptersdefined.ksh, 297 bytes, 1 media blocks.
x checkadapters.ksh, 319 bytes, 1 media blocks.
x checkaiooa.ksh, 334 bytes, 1 media blocks.
x checkaiostatus.ksh, 1765 bytes, 4 media blocks.
x checkall.ksh, 26286 bytes, 52 media blocks.
x checkaudit.ksh, 235 bytes, 1 media blocks.
...
...
...
x checkxntpd.ksh, 2557 bytes, 5 media blocks.
x checkzombies.ksh, 407 bytes, 1 media blocks.
x COPYRIGHT, 930 bytes, 2 media blocks.
x DESCRIPTIONS, 124444 bytes, 244 media blocks.
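
If you prefer to script the installation, the steps above can be sketched as a small shell function. This is only a sketch under a few assumptions: unpack_ahc is a hypothetical helper name and not part of AIX Health Check, and the tar image must be given as an absolute path, because the function changes directory before extracting.

```shell
# unpack_ahc TARFILE DESTDIR
# Hypothetical helper: unpack the AIX Health Check tar image into DESTDIR
# and confirm that checkall.ksh was extracted.
# TARFILE must be an absolute path, because tar runs inside DESTDIR.
unpack_ahc() {
    ahc_tarfile=$1
    ahc_dest=$2
    mkdir -p "$ahc_dest" || return 1
    # Extract inside a subshell so the caller's working directory is unchanged
    (cd "$ahc_dest" && tar -xf "$ahc_tarfile") || return 1
    # Succeed only if the main driver script is present after extraction
    [ -f "$ahc_dest/checkall.ksh" ]
}
```

For example, unpack_ahc /tmp/ahc_latest.tar /ahc would unpack the image into /ahc, assuming the tar image was transferred to /tmp.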

Running checks

Once the tar image has been unpacked, you will notice a lot of files within the chosen directory. Most of the files are individual check scripts whose names start with "check". Each individual script checks a certain function or configuration. You can run each script individually if you like. For example, to check if wget is installed, run checkwget.ksh. To determine the model name of the AIX server, run checkmodelname.ksh.
root@(testaix1) /ahc # checkwget.ksh
wget-1.9.1-1
root@(testaix1) /ahc # checkmodelname.ksh
9117-MMB
Please note that AIX Health Check is designed to run as user root only. Many scripts will still run under a different user account, but AIX Health Check is only supported when run from the root account, because it runs several root-level commands. Also note that AIX Health Check never changes anything on the AIX system; it only reports. It is not designed to resolve issues automatically, because the correct configuration can depend on your environment or infrastructure. From the output of the check scripts, you can determine which issue was found, if any, and what action can be taken to remediate it.

Return codes

Each script returns a returncode: zero means the script completed successfully, one means an error was encountered, and two means a warning situation occurred.

For example, script checktmpsize.ksh will check if file system /tmp is at least 1 GB in size:
root@(testaix1) /ahc # grep -i purpose checktmpsize.ksh
# Purpose:     Check if the size of /tmp is at least 1 GB.
root@(testaix1) /ahc # checktmpsize.ksh
root@(testaix1) /ahc # echo $?
0
root@(testaix1) /ahc # df -g /tmp
Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/hd3           2.00      1.99    1%       44     1% /tmp
As you can see, file system /tmp is indeed at least 1 GB; in fact it is 2 GB, which meets the best practice for sizing /tmp. A /tmp file system that is large enough is unlikely to fill up any time soon. The script checktmpsize.ksh therefore returns zero, meaning it completed successfully.
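
A calling script can branch on the returncode in the same way. The following is a minimal sketch, assuming only the three returncodes described above; rc_label and run_check are hypothetical helper names and are not part of AIX Health Check.

```shell
# rc_label RETURNCODE
# Translate a check script returncode into a label:
# 0 = OK, 1 = ERROR, 2 = WARNING; anything else is reported as UNKNOWN.
rc_label() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "ERROR" ;;
        2) echo "WARNING" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# run_check SCRIPT [ARGS...]
# Run one check script, discard its output, and print "SCRIPT: LABEL".
run_check() {
    "$@" >/dev/null 2>&1
    check_rc=$?
    echo "$1: $(rc_label "$check_rc")"
}

# Example, run from the directory that holds the check scripts:
#   run_check ./checktmpsize.ksh
```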

Here's an example of when a check script returns an error code:

Script checkpgspminsize.ksh makes sure that the total defined paging space is at least the same size as the available memory in the system:
root@(testaix1) /ahc # grep -i purpose checkpgspminsize.ksh
# Purpose: Check if paging space is at least the same size as memory.
root@(testaix1) /ahc #
root@(testaix1) /ahc # checkpgspminsize.ksh
Paging space smaller than memory, requires additional 81664MB
root@(testaix1) /ahc # echo $?
1
root@(testaix1) /ahc # lsattr -El mem0 -a goodsize
goodsize 124672 Amount of usable physical memory in Mbytes False
root@(testaix1) /ahc # lsps -s
Total Paging Space   Percent Used
      43008MB               1%
As you can see, the script returns an error, returncode 1, because the system has 124,672 MB of memory and only 43,008 MB is assigned to paging space, meaning it lacks 81,664 MB of paging space.
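
The figure reported by the script follows directly from the two values shown above. Here is a minimal sketch of that arithmetic; it is not the actual code of checkpgspminsize.ksh, and paging_shortfall is a hypothetical function name.

```shell
# paging_shortfall MEMORY_MB PAGING_MB
# Print how many MB of paging space are missing to match memory,
# or 0 if paging space is already at least as large as memory.
paging_shortfall() {
    ps_mem_mb=$1
    ps_pgsp_mb=$2
    if [ "$ps_pgsp_mb" -ge "$ps_mem_mb" ]; then
        echo 0
    else
        echo $((ps_mem_mb - ps_pgsp_mb))
    fi
}

# Values taken from the lsattr and lsps output above
paging_shortfall 124672 43008   # prints 81664
```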

Running several or all checks

Running all the check scripts individually can be cumbersome, because AIX Health Check includes around 400 check scripts. Therefore, a checkall.ksh script is also available, which can be used to run several or all scripts. The checkall.ksh script does not run all scripts at once, but one at a time.

For example, to run all scripts and produce a log file within the same directory, run:
root@(testaix1) /ahc # checkall.ksh
You will not see any output on the screen. A log file, however, is produced:
root@(testaix1) /ahc # ls *log
checkall_testaix1.log
You can review the log file for each script run. For example, script checkexcluderootvg.ksh makes sure that at least the /tmp file system is excluded from a mksysb backup:
Running check 78 of 378: checkexcluderootvg.ksh

^./tmp

Check checkexcluderootvg.ksh completed successfully: returncode 0
20% complete - 300 checks to go.
At the end of the log file, a summary is included:
Finished checking host testaix1.

Run time for all checks              : 319 seconds
Total number of checks               : 378
# Checks with result OK              : 337
# Checks with result WARNING         : 29
# Checks with result ERROR           : 12
Score [Percentage OK]                : 89.15 %

For details see logfile              : checkall_testaix1.log

Scoring the system

As you can see, the AIX system receives a score based on the output of all scripts. In general, an AIX system that has been well taken care of should receive a score of 95% or higher. Any score lower than 95% is an indication that there are issues to be remediated on the system.
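
The score in the summary shown earlier is consistent with the number of checks with result OK divided by the total number of checks. Here is a minimal sketch of that calculation, assuming this is how the percentage is derived; score_pct is a hypothetical function name.

```shell
# score_pct OK_COUNT TOTAL_COUNT
# Print the percentage of OK checks with two decimals.
# Returns a non-zero status if TOTAL_COUNT is not positive.
score_pct() {
    awk -v ok="$1" -v total="$2" 'BEGIN {
        if (total <= 0) exit 1
        printf "%.2f\n", ok * 100 / total
    }'
}

# Values taken from the summary of the full run above
score_pct 337 378   # prints 89.15
```

With 337 OK checks out of 378, the system in the example falls below the 95% mark, which matches the 29 warnings and 12 errors that were reported.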

Other useful functions

AIX Health Check is a great tool for discovering possible performance bottlenecks, because the log file produced by AIX Health Check includes all kinds of performance metrics. For example, it lists the Top 20 memory using processes:
Running check 332 of 378: checktop20memoryusers.ksh

     Pid Command          Inuse      Pin     Pgsp  Virtual 64-bit 
 3997924 java             67395     8771        0    41672      N
 1966558 cimserver        29036     8687        0    29023      N
 2883854 rmcd             24620     8667        0    24337      N
 2556206 clstrmgr         24462     8660        0    22899      N
 5767338 topas_nmon       24042     8660        0    23338      N
  655452 cimlistener      23651     8663        0    23637      N
 7340286 sshd             23472     8660        0    22934      N
 3735800 sshd             23378     8660        0    22842      N
 2359760 sendmail         23333     8660        0    22945      N
 5177352 IBM.CSMAgentR    23321     8675        0    22655      N
 1835476 tier1slp         23178     8660        0    23160      N
 3145776 clcomd           23156     8663        0    22925      N
 2228672 snmpdv3ne        23043     8660        0    22952      N
 3080464 clcomd           22974     8662        0    22783      N
 2687464 cron             22932     8660        0    22898      N
 3342612 xntpd            22872     8660        0    22795      N
 4194544 diagd            22846     8660        0    22834      N
 4063460 nonstop_aix      22728     8662        0    22700      N
 2163140 slp_srvreg       22727     8660        0    22720      N
 4915212 IBM.ServiceRM    22473     8676        0    22386      N

Check checktop20memoryusers.ksh completed successfully: returncode 0
87% complete - 46 checks to go.
AIX Health Check is also a great tool for gathering inventory information, which is very useful during Disaster Recovery (exercises) or when doing a Quick Scan of a new AIX environment. For example, AIX Health Check will generate a list of commands needed to re-create all the logical volumes and file systems, and to set the correct permissions, in case a server needs to be recovered:
root@(testaix1) /ahc # checklvfscreate.ksh
mklv -e x -y exportlv -t jfs2 nimvg 40960M
chlv -U root -G system -P 660 exportlv
mklv -e x -y sysadmlv -t jfs2 nimvg 204800M
chlv -U root -G system -P 660 sysadmlv
crfs -v jfs2 -m /export -d exportlv -a logname=INLINE -A yes 
crfs -v jfs2 -m /sysadm -d sysadmlv -a logname=INLINE -A yes
mkdir -p /export 2>/dev/null
mkdir -p /sysadm 2>/dev/null
mount /export 2>/dev/null
mount /sysadm 2>/dev/null
chmod 755 /export
chown root:system /export
chmod 755 /sysadm
chown root:system /sysadm
If you'd rather run some scripts, and not all of them, you can use the -s option. For example, to run two check scripts, checkmodelname.ksh and checkcpumodel.ksh, run:
root@(testaix1) /ahc # checkall.ksh -v -s checkcpumodel.ksh,checkmodelname.ksh
Note that the -v option will produce verbose output on the screen. The output of the command above will look similar to this:
AIX HEALTH CHECK

Version:         21.04.11
Hostname:        testaix1
Start at:        04/21/2011 22:14:33
Options:         -v -s checkcpumodel.ksh,checkmodelname.ksh
Output file:     checkall_testaix1.log
Width:           130
Display:         All checks
Descriptions:    No
Output type:     Log file
# Checks:        2
Scripts:         checkcpumodel.ksh checkmodelname.ksh


Running check 1 of 2: checkcpumodel.ksh

PowerPC_POWER7

Check checkcpumodel.ksh completed successfully: returncode 0
50% complete - 1 checks to go.


Running check 2 of 2: checkmodelname.ksh

9117-MMB

Check checkmodelname.ksh completed successfully: returncode 0
100% complete - 0 checks to go.


Finished checking host testaix1.

Run time for all checks              : 2 seconds
Total number of checks               : 2
# Checks with result OK              : 2
# Checks with result WARNING         : 0
# Checks with result ERROR           : 0
Score [Percentage OK]                : 100.00 %

For details see logfile              : checkall_testaix1.log

Adding descriptions to checks

If you'd like to include descriptions in the output, add the -d option. Descriptions of each check can be found in file DESCRIPTIONS, which you can view with an editor. The -d option, however, adds the description of each script to its output:
root@(testaix1) /ahc # checkall.ksh -v -s checkwget.ksh -d
The output of the command above will look similar to this:
Running check 1 of 1: checkwget.ksh

Description:
------------

Check if wget is installed, and if so, if the correct version is
installed. The latest available version in the AIX Toolbox for
Linux Applications is version 1.9.1.

Output:
------------

wget-1.9.1-1

Check checkwget.ksh completed successfully: returncode 0
100% complete - 0 checks to go.

Sending output through email

Using the -m option, you can choose to send the output to an email address:
root@(testaix1) /ahc # checkall.ksh -v -s checkwget.ksh -d -m my@email.com

Comma separated output

And using the -c option, you can choose to produce CSV style output instead of the default log file output:
root@(testaix1) /ahc # checkall.ksh -cs checkwget.ksh,checkuptime.ksh
root@(testaix1) /ahc # cat *csv
Hostname,Date-Time,Check,Returncode,Output
testaix1,2011-04-21 22:25:48,checkuptime.ksh,0,
testaix1,2011-04-21 22:25:48,checkwget.ksh,0,wget-1.9.1-1
Selecting CSV style (or Comma Separated) output can be very useful for loading the output of scripts into a database.
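
The CSV output is also easy to filter with standard tools. Here is a minimal sketch that lists the checks that did not return zero. It assumes the column order shown in the header line above and that the first four fields never contain a comma; nonzero_checks is a hypothetical function name.

```shell
# nonzero_checks [CSVFILE...]
# Print "check returncode" for every row whose returncode is not 0.
# Reads standard input when no file is given.
# Assumed columns: Hostname,Date-Time,Check,Returncode,Output
nonzero_checks() {
    # FNR > 1 skips the header line of each file
    awk -F, 'FNR > 1 && $4 != 0 { print $3, $4 }' "$@"
}
```

Pass the CSV file produced by checkall.ksh as an argument. For the sample output above nothing is printed, because both checks returned zero.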

HTML output

The -h option can be used to create an HTML style output:

root@(testaix1) /ahc # checkall.ksh -hs checkwget.ksh,checkuptime.ksh
root@(testaix1) /ahc # ls *html
checkall_testaix1.html
In the HTML file that is created, colors are used to indicate the returncode of the scripts that have been run.

Other options

Other options exist as well. The -l option can be used to specify the location of the output file. The -w option can be used to determine the width of the output, especially useful when creating log file output. And the -g option can be used to suppress the output of all the successful checks, resulting in a report of only those scripts that generated a Warning or an Error.

The different options can all be combined. You can create a CSV style output (-c option) and have it emailed to you (-m option). You can generate an HTML style report (-h option), have it emailed to you (-m option), only showing the non-successful checks (-g option), with added descriptions (-d option).
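
The last combination could be wrapped in a small function, for example to run it on a schedule. This is a sketch only: AHC_DIR (the directory where the tar image was unpacked) and the function name ahc_failures_report are assumptions, not part of AIX Health Check.

```shell
# AHC_DIR is an assumption: the directory that holds checkall.ksh.
AHC_DIR=${AHC_DIR:-/ahc}

# ahc_failures_report RECIPIENT
# Hypothetical wrapper: create an HTML report (-h) of only the
# non-successful checks (-g), with descriptions (-d), and email it (-m).
ahc_failures_report() {
    ahc_recipient=$1
    # Run in a subshell so the caller's working directory is unchanged
    (cd "$AHC_DIR" && ./checkall.ksh -h -g -d -m "$ahc_recipient")
}

# Example (run as root):
#   ahc_failures_report my@email.com
```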

Conclusion

As you can see, AIX Health Check is a very versatile tool that is also very easy to use. It's intuitive and doesn't take long to get used to. Use it in any way you see fit, with the options you prefer.