Topics: AIX, Backup & restore, Storage, System Admin

Using mkvgdata and restvg in DR situations

It is useful to run the following commands before you create your (at least) weekly mksysb image:

# lsvg -o | xargs -i mkvgdata {}
# tar -cvf /sysadm/vgdata.tar /tmp/vgdata
Add these commands to your mksysb script, just before the mksysb command itself. They run the mkvgdata command for each online volume group, which generates output for each volume group in /tmp/vgdata. The resulting output is then tar'd and stored in the /sysadm folder or file system. This allows information regarding your volume groups, logical volumes, and file systems to be included in your mksysb image.
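For example, a minimal sketch of such a mksysb script could look like this (the /sysadm staging location and the /backup/$(hostname).mksysb target file are example paths only; adjust them to your environment):
#!/bin/ksh
# save VG/LV/file system definitions for all online volume groups
lsvg -o | xargs -i mkvgdata {}
tar -cvf /sysadm/vgdata.tar /tmp/vgdata
# create the mksysb image (example target file)
mksysb -i /backup/$(hostname).mksysb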

To recreate the volume groups, logical volumes and file systems:
  • Run:
    # tar -xvf /sysadm/vgdata.tar
  • Now edit the /tmp/vgdata/{volume group name}/{volume group name}.data file and look for the line containing "VG_SOURCE_DISK_LIST=". Change this line to list the hdisks, vpaths, or hdiskpowers as needed.
  • Run:
    # restvg -r -d /tmp/vgdata/{volume group name}/{volume group name}.data
Make sure to remove existing file systems with the rmfs command before running restvg, or it will not run correctly. Alternatively, you can just run restvg once, run the exportvg command for the same volume group, and then run the restvg command again. There is also a "-s" flag for restvg that lets you shrink the file systems to the minimum size needed, but depending on when the vgdata was created, you could run out of space when restoring the contents of the file systems. Just something to keep in mind.
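For example, to recreate a hypothetical volume group called datavg (an example name), the full sequence, using the exportvg approach described above, could look like this:
# tar -xvf /sysadm/vgdata.tar
# vi /tmp/vgdata/datavg/datavg.data      (adjust VG_SOURCE_DISK_LIST= here)
# restvg -r -d /tmp/vgdata/datavg/datavg.data
# exportvg datavg
# restvg -r -d /tmp/vgdata/datavg/datavg.data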

Topics: AIX, Storage, System Admin, Virtualization

Change default value of hcheck_interval

The default value of hcheck_interval for VSCSI hdisks is set to 0, meaning that health checking is disabled. The hcheck_interval attribute of an hdisk can only be changed online if the volume group to which the hdisk belongs is not active. If the volume group is active, the ODM value of hcheck_interval can be altered in the CuAt class, as shown in the following example for hdisk0:

# chdev -l hdisk0 -a hcheck_interval=60 -P
The change will then be applied once the system is rebooted. However, it is possible to change the default value of the hcheck_interval attribute in the PdAt ODM class. As a result, you won't have to worry about its value anymore and newly discovered hdisks will automatically get the new default value, as illustrated in the example below:
# odmget -q 'attribute = hcheck_interval AND uniquetype = \
PCM/friend/vscsi' PdAt | sed 's/deflt = \"0\"/deflt = \"60\"/' \
| odmchange -o PdAt -q 'attribute = hcheck_interval AND \
uniquetype = PCM/friend/vscsi'
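To verify the new default, and to check which value an hdisk has picked up after it is (re)discovered, you can run, for example for hdisk0:
# odmget -q 'attribute = hcheck_interval AND \
uniquetype = PCM/friend/vscsi' PdAt
# lsattr -El hdisk0 -a hcheck_interval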

Topics: AIX, Storage, System Admin

Mounting USB drive on AIX

To familiarize yourself with using USB drives on AIX, take a look at the following article at IBM developerWorks:

http://www.ibm.com/developerworks/aix/library/au-flashdrive/

Before you start using it, make sure you DLPAR the USB controller to your LPAR, if you have not already done so. You should see the USB devices on your system:

# lsconf | grep usb
+ usbhc0 U78C0.001.DBJX589-P2          USB Host Controller
+ usbhc1 U78C0.001.DBJX589-P2          USB Host Controller
+ usbhc2 U78C0.001.DBJX589-P2          USB Enhanced Host Controller
+ usbms0 U78C0.001.DBJX589-P2-C8-T5-L1 USB Mass Storage
After you plug in the USB drive, run cfgmgr to discover the drive, or, if you don't want to run the whole cfgmgr, run:
# /etc/methods/cfgusb -l usb0
Some devices may not be recognized by AIX, and may require you to run the lquerypv command:
# lquerypv -h /dev/usbms0
To create a 2 TB file system on the drive, run:
# mkfs -olog=INLINE,ea=v2 -s2000G -Vjfs2 /dev/usbms0
To mount the file system, run:
# mount -o log=INLINE /dev/usbms0 /usbmnt
Then enjoy using a 2 TB file system:
# df -g /usbmnt
Filesystem    GB blocks      Free %Used    Iused %Iused Mounted on
/dev/usbms0     2000.00   1986.27    1%     3182     1% /usbmnt
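Before unplugging the drive, it is good practice to unmount the file system and remove the device definition again (usbms0 is the device name from the example above):
# umount /usbmnt
# rmdev -dl usbms0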

Topics: AIX, Hardware, Storage, System Admin

Creating a dummy disk device

At times it may be necessary to create a dummy disk device, for example when you need a disk to be discovered by cfgmgr with a specific name on multiple hosts.

For example, if you need the disk to be called hdisk2, and only hdisk0 exists on the system, then running cfgmgr will discover the disk as hdisk1, not as hdisk2. In order to make sure cfgmgr indeed discovers the new disk as hdisk2, you can fool the system by temporarily creating a dummy disk device.

Here are the steps involved:

First: remove the newly discovered disk (in the example below known as hdisk1 - we will configure this disk as hdisk2):

# rmdev -dl hdisk1
Next, we create a dummy disk device with the name hdisk1:
# mkdev -l hdisk1 -p dummy -c disk -t hdisk -w 0000
Note that running the command above may result in an error. However, if you run the following command afterwards, you will notice that the dummy disk device indeed has been created:
# lsdev -Cc disk | grep hdisk1
hdisk1 Defined    SSA Logical Disk Drive
Also note that the dummy disk device will not show up if you run the lspv command. That is not a concern.

Now run the cfgmgr command to discover the new disk. You'll notice that the new disk will now be discovered as hdisk2, because hdisk0 and hdisk1 are already in use.
# cfgmgr
# lsdev -Cc disk | grep hdisk2
Finally, remove the dummy disk device:
# rmdev -dl hdisk1

Topics: AIX, Storage, System Admin

Erasing disks

During a system decommission process, it is advisable to format or at least erase all drives. There are two ways of accomplishing this:

If you have time:

AIX allows disks to be erased via the Format media service aid in the AIX diagnostic package. To erase a hard disk, run the following command:

# diag -T format
This will start the Format media service aid in a menu-driven interface. If prompted, choose your terminal. You will then be presented with a resource selection list. Choose the hdisk devices you want to erase from this list and commit your changes according to the instructions on the screen.

Once you have committed your selection, choose Erase Disk from the menu. You will then be asked to confirm your selection. Choose Yes. You will be asked if you want to Read data from drive or Write patterns to drive. Choose Write patterns to drive. You will then have the opportunity to modify the disk erasure options. After you specify the options you prefer, choose Commit Your Changes. The disk is now erased. Please note, that it can take a long time for this process to complete.

If you want to do it quick-and-dirty:

For each disk, use the dd command to overwrite the data on the disk. For example:
for disk in $(lspv | awk '{print $1}') ; do
   dd if=/dev/zero of=/dev/r${disk} bs=1024 count=10
   echo $disk wiped
done
This does the trick, as it reads zeroes from /dev/zero and writes 10 blocks of 1,024 zero bytes to the start of each disk. That overwrites anything at the start of the disk, rendering the disk useless.
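If you want a more thorough wipe than just the first 10 KB, but still prefer dd over the diag menus, a variation of the same loop (a sketch only) can overwrite each disk completely by leaving out the count parameter, so dd keeps writing zeroes until it hits the end of the disk; be aware this can take hours per disk:
for disk in $(lspv | awk '{print $1}') ; do
   # no count: keep writing zeroes until the end of the disk is reached
   dd if=/dev/zero of=/dev/r${disk} bs=1024k
   echo $disk wiped completely
done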

Topics: AIX, LVM, Storage, System Admin

VGs (normal, big, and scalable)

The original VG type, commonly known as standard or normal, allows a maximum of 32 physical volumes (PVs). A standard or normal VG allows no more than 1016 physical partitions (PPs) per PV and has an upper limit of 256 logical volumes (LVs) per VG. Subsequently, a new VG type was introduced, referred to as big VG. A big VG allows up to 128 PVs and a maximum of 512 LVs.

AIX 5L Version 5.3 has introduced a new VG type called scalable volume group (scalable VG). A scalable VG allows a maximum of 1024 PVs and 4096 LVs. The maximum number of PPs applies to the entire VG and is no longer defined on a per disk basis. This opens up the prospect of configuring VGs with a relatively small number of disks and fine-grained storage allocation options through a large number of PPs, which are small in size. The scalable VG can hold up to 2,097,152 (2048 K) PPs. As with the older VG types, the size is specified in units of megabytes and the size variable must be equal to a power of 2. The range of PP sizes starts at 1 (1 MB) and goes up to 131,072 (128 GB). This is more than two orders of magnitude above the 1024 (1 GB), which is the maximum for both normal and big VG types in AIX 5L Version 5.2. The new maximum PP size provides architectural support for 256 petabyte disks.

The table below shows the variation of configuration limits with different VG types. Note that the maximum number of user definable LVs is given by the maximum number of LVs per VG minus 1 because one LV is reserved for system use. Consequently, system administrators can configure 255 LVs in normal VGs, 511 in big VGs, and 4095 in scalable VGs.

VG type        Max PVs   Max LVs   Max PPs per VG          Max PP size
Normal VG           32       256   32,512 (1016 * 32)      1 GB
Big VG             128       512   130,048 (1016 * 128)    1 GB
Scalable VG       1024      4096   2,097,152               128 GB

The scalable VG implementation in AIX 5L Version 5.3 provides configuration flexibility with respect to the number of PVs and LVs that can be accommodated by a given instance of the new VG type. The configuration options allow any scalable VG to contain 32, 64, 128, 256, 512, 768, or 1024 disks and 256, 512, 1024, 2048, or 4096 LVs. You do not need to configure the maximum values of 1024 PVs and 4096 LVs at the time of VG creation to account for potential future growth. You can always increase the initial settings at a later date as required.

The System Management Interface Tool (SMIT) and the Web-based System Manager graphical user interface fully support the scalable VG. Existing SMIT panels, which are related to VG management tasks, have been changed and many new panels added to account for the scalable VG type. For example, you can use the new SMIT fast path _mksvg to directly access the Add a Scalable VG SMIT menu.

The user commands mkvg, chvg, and lsvg have been enhanced in support of the scalable VG type.
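For example, the VG type is selected with a flag on the mkvg command at creation time (datavg and hdisk1 are example names, and -s 64 sets a 64 MB PP size):
# mkvg -y datavg -s 64 hdisk1
# mkvg -B -y datavg -s 64 hdisk1
# mkvg -S -y datavg -s 64 hdisk1
The first command creates a normal VG, the second a big VG (-B), and the third a scalable VG (-S). Afterwards, running lsvg datavg shows the MAX LVs, MAX PPs, and MAX PVs values in effect for the volume group.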

For more information:
http://www.ibm.com/developerworks/aix/library/au-aix5l-lvm.html

Topics: AIX, Oracle, SDD, Storage, System Admin

RAC OCR and VOTE LUNs

Consistent naming is not required for Oracle ASM devices, but LUNs used for the OCR and VOTE functions of Oracle RAC environments must have the same device names on all RAC nodes. If the names for the OCR and VOTE devices are different, create a new device for each of these functions, on each of the RAC nodes, as follows:

First, check the PVIDs of each disk that is to be used as an OCR or VOTE device on all the RAC nodes. For example, if you're setting up a RAC cluster consisting of 2 nodes, called node1 and node2, check the disks as follows:

root@node1 # lspv | grep vpath | grep -i none
vpath6          00f69a11a2f620c5                    None
vpath7          00f69a11a2f622c8                    None
vpath8          00f69a11a2f624a7                    None
vpath13         00f69a11a2f62f1f                    None
vpath14         00f69a11a2f63212                    None

root@node2 /root # lspv | grep vpath | grep -i none
vpath4          00f69a11a2f620c5                    None
vpath5          00f69a11a2f622c8                    None
vpath6          00f69a11a2f624a7                    None
vpath9          00f69a11a2f62f1f                    None
vpath10         00f69a11a2f63212                    None
As you can see, vpath6 on node 1 is the same disk as vpath4 on node 2. You can determine this by looking at the PVID.

Check the major and minor numbers of each device:
root@node1 # cd /dev
root@node1 # lspv|grep vpath|grep None|awk '{print $1}'|xargs ls -als
0 brw-------    1 root     system       47,  6 Apr 28 18:56 vpath6
0 brw-------    1 root     system       47,  7 Apr 28 18:56 vpath7
0 brw-------    1 root     system       47,  8 Apr 28 18:56 vpath8
0 brw-------    1 root     system       47, 13 Apr 28 18:56 vpath13
0 brw-------    1 root     system       47, 14 Apr 28 18:56 vpath14

root@node2 # cd /dev
root@node2 # lspv|grep vpath|grep None|awk '{print $1}'|xargs ls -als
0 brw-------    1 root     system       47,  4 Apr 29 13:33 vpath4
0 brw-------    1 root     system       47,  5 Apr 29 13:33 vpath5
0 brw-------    1 root     system       47,  6 Apr 29 13:33 vpath6
0 brw-------    1 root     system       47,  9 Apr 29 13:33 vpath9
0 brw-------    1 root     system       47, 10 Apr 29 13:33 vpath10
Now, on each node, set up a consistent naming convention for the OCR and VOTE devices. For example, if you wish to set up 2 OCR and 3 VOTE devices:

On server node1:
# mknod /dev/ocr_disk01 c 47 6
# mknod /dev/ocr_disk02 c 47 7
# mknod /dev/voting_disk01 c 47 8
# mknod /dev/voting_disk02 c 47 13
# mknod /dev/voting_disk03 c 47 14
On server node2:
# mknod /dev/ocr_disk01 c 47 4
# mknod /dev/ocr_disk02 c 47 5
# mknod /dev/voting_disk01 c 47 6
# mknod /dev/voting_disk02 c 47 9
# mknod /dev/voting_disk03 c 47 10
This will result in a consistent naming convention for the OCR and VOTE devices on both nodes:
root@node1 # ls -als /dev/*_disk*
0 crw-r--r-- 1 root system  47,  6 May 13 07:18 /dev/ocr_disk01
0 crw-r--r-- 1 root system  47,  7 May 13 07:19 /dev/ocr_disk02
0 crw-r--r-- 1 root system  47,  8 May 13 07:19 /dev/voting_disk01
0 crw-r--r-- 1 root system  47, 13 May 13 07:19 /dev/voting_disk02
0 crw-r--r-- 1 root system  47, 14 May 13 07:20 /dev/voting_disk03

root@node2 # ls -als /dev/*_disk*
0 crw-r--r-- 1 root system  47,  4 May 13 07:20 /dev/ocr_disk01
0 crw-r--r-- 1 root system  47,  5 May 13 07:20 /dev/ocr_disk02
0 crw-r--r-- 1 root system  47,  6 May 13 07:21 /dev/voting_disk01
0 crw-r--r-- 1 root system  47,  9 May 13 07:21 /dev/voting_disk02
0 crw-r--r-- 1 root system  47, 10 May 13 07:21 /dev/voting_disk03

Topics: AIX, Backup & restore, LVM, Performance, Storage, System Admin

Using lvmstat

One of the best tools to look at LVM usage is lvmstat. It can report the bytes read and written to logical volumes. Using that information, you can determine which logical volumes are used the most.

Gathering LVM statistics is not enabled by default:

# lvmstat -v data2vg
0516-1309 lvmstat: Statistics collection is not enabled for
        this logical device. Use -e option to enable.
As you can see from the output, statistics collection is not enabled, so you need to enable it for each volume group before running the tool:
# lvmstat -v data2vg -e
The following command takes a snapshot of LVM information every second for 10 intervals:
# lvmstat -v data2vg 1 10
This view shows the most utilized logical volumes on your system since you started the data collection. This is very helpful when drilling down to the logical volume layer when tuning your systems.
# lvmstat -v data2vg

Logical Volume    iocnt   Kb_read  Kb_wrtn   Kbps
  appdatalv      306653  47493022   383822  103.2
  loglv00            34         0     3340    2.8
  data2lv           453    234543   234343   89.3         
What are you looking at here?
  • iocnt: Reports back the number of read and write requests.
  • Kb_read: Reports back the total data (kilobytes) from your measured interval that is read.
  • Kb_wrtn: Reports back the amount of data (kilobytes) from your measured interval that is written.
  • Kbps: Reports back the amount of data transferred in kilobytes per second.
You can use the -d option for lvmstat to disable the collection of LVM statistics.
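For example, to stop collecting statistics for the data2vg volume group again:
# lvmstat -v data2vg -d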

Topics: AIX, Backup & restore, LVM, Performance, Storage, System Admin

Spreading logical volumes over multiple disks

A common issue on AIX servers is that logical volumes are configured on a single disk only, sometimes causing high disk utilization on a small number of disks in the system, and impacting the performance of the application running on the server.

If you suspect that this might be the case, first try to determine which disks are saturated on the server. Any disk that is in use more than 60% of the time should be considered. You can use commands such as iostat, sar -d, nmon, and topas to determine which disks show high utilization. If they do, check which logical volumes are defined on that disk, for example on an IBM SAN disk:

# lspv -l vpath23
It is always a good idea to spread the logical volumes on a saturated disk over multiple disks. That way, the logical volume manager will spread the disk I/O over all the disks that are part of the logical volume, utilizing the queue_depth of all those disks and greatly improving performance where disk I/O is concerned.

Let's say you have a logical volume called prodlv of 128 LPs, which is sitting on one disk, vpath408. To see the allocation of the LPs of logical volume prodlv, run:
# lslv -m prodlv
Let's also assume that you have a large number of disks in the volume group in which prodlv is configured. Disk I/O usually works best if you have a large number of disks in a volume group. For example, if you need to have 500 GB in a volume group, it is usually a far better idea to assign 10 disks of 50 GB to the volume group, instead of only one disk of 512 GB. That gives you the possibility of spreading the I/O over 10 disks instead of only one.

To spread the disk I/O for prodlv over 8 disks instead of just one, you can create an extra logical volume copy on these 8 disks, and later on, once the logical volume is synchronized, remove the original copy (the one on the single disk vpath408). So, divide 128 LPs by 8, which gives you 16 LPs per disk: assigning 16 LPs of logical volume prodlv to each of the 8 disks gives it a total of 128 LPs.

First, check if the upper bound of the logical volume is set to at least 9, by running:
# lslv prodlv
The upper bound determines on how many disks a logical volume can be created. You'll need the one disk, vpath408, on which the logical volume is already located, plus the 8 other disks that you're creating a new copy on. Never create a copy on the same disk: if that single disk fails, both copies of your logical volume fail with it. It is usually a good idea to set the upper bound of the logical volume a lot higher, for example to 32:
# chlv -u 32 prodlv
Next, verify that you actually have 8 disks with at least 16 free LPs in the volume group. You can do this by running:
# lsvg -p prodvg | sort -nk4 | grep -v vpath408 | tail -8
vpath188  active  959   40  00..00..00..00..40
vpath163  active  959   42  00..00..00..00..42
vpath208  active  959   96  00..00..96..00..00
vpath205  active  959  192  102..00..00..90..00
vpath194  active  959  240  00..00..00..48..192
vpath24   active  959  243  00..00..00..51..192
vpath304  active  959  340  00..89..152..99..00
vpath161  active  959  413  14..00..82..125..192
Note how in the command above the original disk, vpath408, was excluded from the list.

Each of the disks listed by the command above should have at least 1/8th of the size of the logical volume free before you can make a logical volume copy of prodlv on it.

Now create the logical volume copy. The magical option you need to use is "-e x" for the logical volume commands. That will spread the logical volume over all available disks. If you want to make sure that the logical volume is spread over only the 8 chosen disks, and not over all the available disks in the volume group, make sure you specify those 8 disks:
# mklvcopy -e x prodlv 2 vpath188 vpath163 vpath208 \
vpath205 vpath194 vpath24 vpath304 vpath161
Now check again with "lslv -m prodlv" whether the new copy has been created correctly:
# lslv -m prodlv | awk '{print $5}' | grep vpath | sort -dfu | \
while read pv ; do
result=`lspv -l $pv | grep prodlv`
echo "$pv $result"
done
The output should look similar to this:
vpath161 prodlv  16  16  00..00..16..00..00  N/A
vpath163 prodlv  16  16  00..00..00..00..16  N/A
vpath188 prodlv  16  16  00..00..00..00..16  N/A
vpath194 prodlv  16  16  00..00..00..16..00  N/A
vpath205 prodlv  16  16  16..00..00..00..00  N/A
vpath208 prodlv  16  16  00..00..16..00..00  N/A
vpath24  prodlv  16  16  00..00..00..16..00  N/A
vpath304 prodlv  16  16  00..16..00..00..00  N/A
Now synchronize the logical volume:
# syncvg -l prodlv
And remove the original logical volume copy:
# rmlvcopy prodlv 1 vpath408
Then check again:
# lslv -m prodlv
Now, what if you have to extend the logical volume prodlv later on with another 128 LPs, and you still want to maintain the spreading of the LPs over the 8 disks? Again, you can use the "-e x" option when running the logical volume commands:
# extendlv -e x prodlv 128 vpath188 vpath163 vpath208 \
vpath205 vpath194 vpath24 vpath304 vpath161
You can also use the "-e x" option with the mklv command to create a new logical volume from the start with the correct spreading over disks.
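For example, to create a new logical volume of 128 LPs that is spread over the same 8 disks from the start (newlv and the jfs2 type are just example values):
# mklv -y newlv -t jfs2 -e x prodvg 128 vpath188 vpath163 vpath208 \
vpath205 vpath194 vpath24 vpath304 vpath161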

Topics: Performance, Red Hat / Linux, Storage, System Admin

Creating a RAM disk on Linux

On Linux, you can use tmpfs to create a RAM disk:

# mkdir -p /mnt/tmp
# mount -t tmpfs -o size=20m tmpfs /mnt/tmp
This will create a 20 MB RAM file system, mounted on /mnt/tmp. If you leave out the "-o size" option, half of the system's memory will be allocated by default. However, that memory is not actually used as long as no data is written to the RAM file system.
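To check the RAM file system and to remove it again once you are done with it:
# df -h /mnt/tmp
# umount /mnt/tmp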
