Transfer the AIX Health Check tar image onto the AIX system, and untar the file. Any directory will do; you can transfer the tar image into the directory of your preference and untar it there:
root@(testaix1) /ahc # ls *tar
ahc_latest.tar
root@(testaix1) /ahc # tar -xvf *tar
x checkactivatedrpcservices.ksh, 1625 bytes, 4 media blocks.
x checkadaptersdefined.ksh, 297 bytes, 1 media blocks.
x checkadapters.ksh, 319 bytes, 1 media blocks.
x checkaiooa.ksh, 334 bytes, 1 media blocks.
x checkaiostatus.ksh, 1765 bytes, 4 media blocks.
x checkall.ksh, 26286 bytes, 52 media blocks.
x checkaudit.ksh, 235 bytes, 1 media blocks.
...
...
...
x checkxntpd.ksh, 2557 bytes, 5 media blocks.
x checkzombies.ksh, 407 bytes, 1 media blocks.
x COPYRIGHT, 930 bytes, 2 media blocks.
x DESCRIPTIONS, 124444 bytes, 244 media blocks.
Once the tar image has been unpacked, you will notice a lot of files within the chosen directory. Most of the files are individual check scripts whose names start with "check". Each script checks a certain function or configuration. You can run each script individually if you like. For example, to check if wget is installed, run checkwget.ksh. To determine the model name of the AIX server, run checkmodelname.ksh.

root@(testaix1) /ahc # checkwget.ksh
wget-1.9.1-1
root@(testaix1) /ahc # checkmodelname.ksh
9117-MMB

Please note that AIX Health Check is designed to run as user root only. Many scripts will still run under a different user account, but AIX Health Check is only supported when run as root. Root access is required because AIX Health Check runs several root-level commands. Also note that AIX Health Check never changes anything on the AIX system; it only reports. AIX Health Check is not designed to automatically resolve any issues found, because the correct configuration can depend on your environment or infrastructure. From the output of the check scripts, you can determine what issue was found (if any), and what action could be taken to remediate it.
Each script returns a return code: zero means the script completed successfully, one means an error was encountered, and two means a warning situation occurred.
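When calling check scripts from your own shell code, the three return codes can be mapped to readable labels. The following is a minimal sketch; the function name describe_rc is made up for illustration and is not part of AIX Health Check, and a subshell stands in for a real check script.

```shell
#!/bin/sh
# describe_rc is a hypothetical helper, not part of AIX Health Check.
# It maps the three documented return codes to a label.
describe_rc() {
    case "$1" in
        0) echo "OK" ;;
        1) echo "ERROR" ;;
        2) echo "WARNING" ;;
        *) echo "UNKNOWN" ;;
    esac
}

# Stand-in for a check script: a subshell that exits with return code 2.
rc=0
( exit 2 ) || rc=$?
describe_rc "$rc"    # prints WARNING
```

On an AIX system, the subshell would be replaced by a call to one of the check scripts, for example checkwget.ksh.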
For example, script checktmpsize.ksh will check if file system /tmp is at least 1 GB in size:

root@(testaix1) /ahc # grep -i purpose checktmpsize.ksh
# Purpose: Check if the size of /tmp is at least 1 GB.
root@(testaix1) /ahc # checktmpsize.ksh
root@(testaix1) /ahc # echo $?
0
root@(testaix1) /ahc # df -g /tmp
Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/hd3           2.00      1.99    1%       44     1% /tmp

As you can see, file system /tmp is indeed at least 1 GB; in fact, it is 2 GB, which meets the best practice for sizing /tmp. A /tmp file system that is large enough is unlikely to fill up any time soon. The script, checktmpsize.ksh, therefore returns a zero, meaning it completed successfully.
Here's an example of when a check script returns an error code:
Script checkpgspminsize.ksh is used to make sure that the total paging space defined is at least the same size as the available memory in the system:

root@(testaix1) /ahc # grep -i purpose checkpgspminsize.ksh
# Purpose: Check if paging space is at least the same size as memory.
root@(testaix1) /ahc #
root@(testaix1) /ahc # checkpgspminsize.ksh
Paging space smaller than memory, requires additional 81664MB
root@(testaix1) /ahc # echo $?
1
root@(testaix1) /ahc # lsattr -El mem0 -a goodsize
goodsize 124672 Amount of usable physical memory in Mbytes False
root@(testaix1) /ahc # lsps -s
Total Paging Space   Percent Used
      43008MB               1%

As you can see, the script returns an error, return code 1, because the system has 124,672 MB of memory, and only 43,008 MB is assigned to the paging space(s), meaning it lacks 81,664 MB of paging space.
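The arithmetic behind that message is a simple subtraction. The sketch below reproduces it; paging_shortfall is a hypothetical helper, and the real check script may well compute the value differently.

```shell
#!/bin/sh
# paging_shortfall is a hypothetical helper; the real check script may differ.
# Arguments: memory in MB, total paging space in MB.
paging_shortfall() {
    mem_mb=$1
    pgsp_mb=$2
    if [ "$pgsp_mb" -lt "$mem_mb" ]; then
        echo $((mem_mb - pgsp_mb))
    else
        echo 0
    fi
}

# Values from the example: 124672 MB of memory, 43008 MB of paging space.
paging_shortfall 124672 43008    # prints 81664
```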
Running several or all checks
Running all the check scripts individually can be really cumbersome. AIX Health Check includes around 400 check scripts. Therefore, there's also a checkall.ksh script available, which can be used to run several or all scripts. The checkall.ksh script will not run all scripts at once, but one at a time.
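To illustrate what running checks one at a time and tallying their return codes could look like, here is a minimal sketch. It is not the actual checkall.ksh. The function name run_checks and the stub scripts are made up for illustration, and the stubs are run with sh so that the sketch works on any system.

```shell
#!/bin/sh
# Hypothetical sketch, NOT the real checkall.ksh: run every check*.ksh in a
# directory one at a time and tally the documented return codes.
run_checks() {
    dir=$1
    ok=0 warn=0 err=0
    for script in "$dir"/check*.ksh; do
        [ -f "$script" ] || continue
        # Skip the wrapper itself so it is not counted as a check.
        [ "${script##*/}" = "checkall.ksh" ] && continue
        rc=0
        sh "$script" >/dev/null 2>&1 || rc=$?
        case "$rc" in
            0) ok=$((ok + 1)) ;;
            2) warn=$((warn + 1)) ;;
            *) err=$((err + 1)) ;;
        esac
    done
    echo "OK=$ok WARNING=$warn ERROR=$err"
}

# Demo with three stub scripts in a temporary directory.
demo_dir=$(mktemp -d)
printf 'exit 0\n' > "$demo_dir/checkone.ksh"
printf 'exit 1\n' > "$demo_dir/checktwo.ksh"
printf 'exit 2\n' > "$demo_dir/checkthree.ksh"
run_checks "$demo_dir"    # prints OK=1 WARNING=1 ERROR=1
rm -rf "$demo_dir"
```

In practice there is no need to write such a loop yourself, since checkall.ksh already provides this along with logging and reporting.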
For example, to run all scripts, and to produce a log file within the same directory, run:

root@(testaix1) /ahc # checkall.ksh

You will not see any output on the screen. A log file, however, is produced:

root@(testaix1) /ahc # ls *log
checkall_testaix1.log

You can review the log file for each script run, for example, for script checkexcluderootvg.ksh, which makes sure that at least /tmp is excluded from a mksysb backup:

Running check 78 of 378: checkexcluderootvg.ksh
^./tmp
Check checkexcluderootvg.ksh completed successfully: returncode 0
20% complete - 300 checks to go.

At the end of the log file, a summary is included:

Finished checking host testaix1.
Run time for all checks      : 319 seconds
Total number of checks       : 378
# Checks with result OK      : 337
# Checks with result WARNING : 29
# Checks with result ERROR   : 12
Score [Percentage OK]        : 89.15 %
For details see logfile      : checkall_testaix1.log
Scoring the system
As you can see, the AIX system receives a score based on the results of all scripts. In general, a well-maintained AIX system should receive a score of 95% or higher. Any score lower than 95% is an indication that there are issues to be remediated on the system.
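The score in the summary is consistent with the number of OK checks divided by the total number of checks (337 of 378 gives 89.15%). The sketch below reproduces that calculation and compares it to the 95% guideline. The formula is inferred from the summary numbers, not taken from the tool itself, and score_pct is a made-up helper name.

```shell
#!/bin/sh
# score_pct is a hypothetical helper. The formula (OK checks / total checks)
# is inferred from the summary numbers, not taken from the tool's source.
score_pct() {
    awk -v ok="$1" -v total="$2" 'BEGIN { printf "%.2f\n", ok * 100 / total }'
}

score=$(score_pct 337 378)
echo "Score: $score %"    # prints Score: 89.15 %

# Compare against the 95% guideline; awk handles the decimal comparison.
if awk -v s="$score" 'BEGIN { exit !(s < 95) }'; then
    echo "Below 95%: issues to remediate"
fi
```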
Other useful functions
AIX Health Check is a great tool for discovering possible performance bottlenecks, because the log file produced by AIX Health Check includes all kinds of performance metrics. For example, it lists the Top 20 memory using processes:

Running check 332 of 378: checktop20memoryusers.ksh
Pid      Command        Inuse   Pin  Pgsp  Virtual 64-bit
3997924  java           67395  8771     0    41672 N
1966558  cimserver      29036  8687     0    29023 N
2883854  rmcd           24620  8667     0    24337 N
2556206  clstrmgr       24462  8660     0    22899 N
5767338  topas_nmon     24042  8660     0    23338 N
655452   cimlistener    23651  8663     0    23637 N
7340286  sshd           23472  8660     0    22934 N
3735800  sshd           23378  8660     0    22842 N
2359760  sendmail       23333  8660     0    22945 N
5177352  IBM.CSMAgentR  23321  8675     0    22655 N
1835476  tier1slp       23178  8660     0    23160 N
3145776  clcomd         23156  8663     0    22925 N
2228672  snmpdv3ne      23043  8660     0    22952 N
3080464  clcomd         22974  8662     0    22783 N
2687464  cron           22932  8660     0    22898 N
3342612  xntpd          22872  8660     0    22795 N
4194544  diagd          22846  8660     0    22834 N
4063460  nonstop_aix    22728  8662     0    22700 N
2163140  slp_srvreg     22727  8660     0    22720 N
4915212  IBM.ServiceRM  22473  8676     0    22386 N
Check checktop20memoryusers.ksh completed successfully: returncode 0
87% complete - 46 checks to go.

AIX Health Check is also a great tool for gathering inventory information. This is very useful during Disaster Recovery (exercises) or when doing a quick scan of a new AIX environment. For example, AIX Health Check will generate a list of the commands needed to re-create all the logical volumes and file systems and to set the correct permissions, in case a server needs to be recovered:

root@(testaix1) /ahc # checklvfscreate.ksh
mklv -e x -y exportlv -t jfs2 nimvg 40960M
chlv -U root -G system -P 660 exportlv
mklv -e x -y sysadmlv -t jfs2 nimvg 204800M
chlv -U root -G system -P 660 sysadmlv
crfs -v jfs2 -m /export -d exportlv -a logname=INLINE -A yes
crfs -v jfs2 -m /sysadm -d sysadmlv -a logname=INLINE -A yes
mkdir -p /export 2>/dev/null
mkdir -p /sysadm 2>/dev/null
mount /export 2>/dev/null
mount /sysadm 2>/dev/null
chmod 755 /export
chown root:system /export
chmod 755 /sysadm
chown root:system /sysadm

If you'd rather run some scripts, and not all of them, you can use the -s option. For example, to run two check scripts, checkmodelname.ksh and checkcpumodel.ksh, run:

root@(testaix1) /ahc # checkall.ksh -v -s checkcpumodel.ksh,checkmodelname.ksh

Note that the -v option will produce verbose output on the screen. The output of the command above will look similar to this:

AIX HEALTH CHECK
Version: 21.04.11
Hostname: testaix1
Start at: 04/21/2011 22:14:33
Options: -v -s checkcpumodel.ksh,checkmodelname.ksh
Output file: checkall_testaix1.log
Width: 130
Display: All checks
Descriptions: No
Output type: Log file
# Checks: 2
Scripts: checkcpumodel.ksh checkmodelname.ksh

Running check 1 of 2: checkcpumodel.ksh
PowerPC_POWER7
Check checkcpumodel.ksh completed successfully: returncode 0
50% complete - 1 checks to go.

Running check 2 of 2: checkmodelname.ksh
9117-MMB
Check checkmodelname.ksh completed successfully: returncode 0
100% complete - 0 checks to go.

Finished checking host testaix1.
Run time for all checks      : 2 seconds
Total number of checks       : 2
# Checks with result OK      : 2
# Checks with result WARNING : 0
# Checks with result ERROR   : 0
Score [Percentage OK]        : 100.00 %
For details see logfile      : checkall_testaix1.log
Adding descriptions to checks
If you'd like to include descriptions in the output, add the -d option. Descriptions of each check can be found in file DESCRIPTIONS, which you can view with an editor. The -d option, however, adds the description to the output of each specific script:

root@(testaix1) /ahc # checkall.ksh -v -s checkwget.ksh -d

The output of the command above will look similar to this:

Running check 1 of 1: checkwget.ksh
Description:
------------
Check if wget is installed, and if so, if the correct version is installed. The latest available version in the AIX Toolbox for Linux Applications is version 1.9.1.
Output:
------------
wget-1.9.1-1
Check checkwget.ksh completed successfully: returncode 0
100% complete - 0 checks to go.
Sending output through email
Using the -m option, you can have the output emailed to an address of your choice:
root@(testaix1) /ahc # checkall.ksh -v -s checkwget.ksh -d -m firstname.lastname@example.org
Comma separated output
Using the -c option, you can choose to produce CSV style output instead of the default log file output:

root@(testaix1) /ahc # checkall.ksh -cs checkwget.ksh,checkuptime.ksh
root@(testaix1) /ahc # cat *csv
Hostname,Date-Time,Check,Returncode,Output
testaix1,2011-04-21 22:25:48,checkuptime.ksh,0,
testaix1,2011-04-21 22:25:48,checkwget.ksh,0,wget-1.9.1-1

Selecting CSV style (or Comma Separated) output can be very useful for loading the output of scripts into a database.
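Before loading the CSV file into a database, it can also be filtered with standard tools. The following is a minimal sketch that lists the checks with a non-zero return code. It assumes the column layout from the example (Returncode is the fourth field) and that the first four fields never contain commas, which the Output field may. The function name failed_checks is made up, and the last sample row is constructed for illustration from the paging space example, not real CSV output.

```shell
#!/bin/sh
# Hypothetical sketch: list checks with a non-zero return code from the CSV.
# Assumes Returncode is field 4 and that fields 1-4 never contain commas.
failed_checks() {
    awk -F, 'NR > 1 && $4 != 0 { print $3 " returncode " $4 }' "$1"
}

# Sample data: the header and first two rows come from the example; the last
# row is constructed for illustration only.
csv=$(mktemp)
cat > "$csv" <<'EOF'
Hostname,Date-Time,Check,Returncode,Output
testaix1,2011-04-21 22:25:48,checkuptime.ksh,0,
testaix1,2011-04-21 22:25:48,checkwget.ksh,0,wget-1.9.1-1
testaix1,2011-04-21 22:25:48,checkpgspminsize.ksh,1,Paging space smaller than memory, requires additional 81664MB
EOF
failed_checks "$csv"    # prints checkpgspminsize.ksh returncode 1
rm -f "$csv"
```

Because the Output field can itself contain commas, as in the constructed row, the sketch only reads the third and fourth fields.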
The -h option can be used to create an HTML style output:

root@(testaix1) /ahc # checkall.ksh -hs checkwget.ksh,checkuptime.ksh
root@(testaix1) /ahc # ls *html
checkall_testaix1.html

The HTML file created will look similar to this:
As you can see, colors are used to indicate the returncode of the scripts that have been run. For a full HTML sample report, click the following link:
Other options exist as well. The -l option can be used to specify the location of the output file. The -w option can be used to determine the width of the output, especially useful when creating log file output. And the -g option can be used to suppress the output of all the successful checks, resulting in a report of only those scripts that generated a Warning or an Error.
The different options can all be combined. You can create a CSV style output (-c option) and have it emailed to you (-m option). You can generate an HTML style report (-h option), have it emailed to you (-m option), only showing the non-successful checks (-g option), with added descriptions (-d option).
As you can see, AIX Health Check is a versatile tool that is also very easy to use. It's intuitive and doesn't take long to get used to. Use it in any way you see fit, with the options you prefer.