Automated OpenSSH and OpenSSL Upgrade with IFIX Installation on AIX

Upgrade SSH/SSL and apply IFIXes on AIX remotely:

==========================================================================
#!/bin/ksh
# =========================================================================
# upgrade_ssh_ifix.ksh - Upgrade SSH/SSL and apply IFIXes on AIX remotely
# Usage : ./upgrade_ssh_ifix.ksh <hostname>
# Author : Tasleem A Khan
# =========================================================================
# --- Usage Check ---
if [ $# -ne 1 ]; then
    echo "Usage: $0 <hostname>"
    exit 1
fi

HOST=$1
SSH="ssh -o StrictHostKeyChecking=no -o ConnectTimeout=10 -q"
SCP="scp -o StrictHostKeyChecking=no -o ConnectTimeout=10 -q"
NIM=$(hostname)
SOFTSRC="/software/openssh-ssl"

echo "=============================================================="
echo " Starting OpenSSH/OpenSSL upgrade on host: $HOST"
echo " NIM Server: $NIM"
echo "=============================================================="
echo

# --- Step 1: Check remote connectivity ---
echo "[CHECK] Testing SSH connectivity to $HOST..."
if ! ${SSH} ${HOST} "hostname" >/dev/null 2>&1; then
    echo "[ERROR] Unable to connect to $HOST via SSH. Aborting."
    exit 2
fi
echo "[OK] SSH connectivity verified."
echo

# --- Step 2: Prepare temporary directory for SSH/SSL install ---
echo "[INFO] Preparing /tmp/openssh-ssl directory..."
${SSH} ${HOST} "rm -rf /tmp/openssh-ssl && mkdir -p /tmp/openssh-ssl"
echo "[OK] Created temporary directory."

# --- Step 3: Copy installation files ---
echo "[INFO] Copying installation files from $SOFTSRC to $HOST..."
${SCP} ${SOFTSRC}/* ${HOST}:/tmp/openssh-ssl/ >/dev/null 2>&1
if [ $? -ne 0 ]; then
    echo "[ERROR] File copy failed. Check $SOFTSRC path and permissions."
    exit 3
fi
echo "[OK] Files copied successfully."
echo

# --- Step 4: Install OpenSSH and OpenSSL base packages ---
echo "[INFO] Installing OpenSSH and OpenSSL base packages..."
${SSH} ${HOST} "installp -aXYd /tmp/openssh-ssl \
openssh.base openssh.license openssh.man.en_US \
openssl.base openssl.license openssl.man.en_US \
openssh.msg.EN_US openssh.msg.en_US" >/dev/null 2>&1
if [ $? -ne 0 ]; then
    echo "[WARNING] installp encountered warnings or errors. Verify manually."
fi
echo "[OK] Installation phase complete."
echo

# --- Step 5: Apply IFIX packages (epkg.Z) ---
echo "[INFO] Applying IFIX packages..."
for FIX in 3013ma.240923.epkg.Z 973013sa.250306.epkg.Z; do
    echo "   -> Applying IFIX: $FIX"
    if ! ${SSH} ${HOST} "emgr -e /tmp/openssh-ssl/$FIX" >/dev/null 2>&1; then
        echo "[WARNING] emgr reported a problem applying $FIX. Verify with 'emgr -l'."
    fi
done
echo "[OK] IFIXes applied successfully."
echo

# --- Step 6: Restart SSHD service ---
echo "[INFO] Restarting SSHD service..."
${SSH} ${HOST} "stopsrc -s sshd >/dev/null 2>&1; sleep 3; startsrc -s sshd >/dev/null 2>&1"
if [ $? -eq 0 ]; then
    echo "[OK] SSHD restarted successfully."
else
    echo "[WARNING] SSHD restart failed. Please check manually."
fi
echo

# --- Step 7: Cleanup temporary files ---
echo "[INFO] Cleaning up temporary files..."
${SSH} ${HOST} "rm -f /tmp/openssh-ssl/* /tmp/openssh-ssl/.toc; rmdir /tmp/openssh-ssl" >/dev/null 2>&1
echo "[OK] Cleanup completed."
echo

# --- Step 8: Post-upgrade validation ---
echo "[INFO] Post-upgrade validation:"
echo "--------------------------------------------------------------"
${SSH} ${HOST} "oslevel -s"
${SSH} ${HOST} "lslpp -L | grep -E 'openssh|openssl' | grep -v fileset"
${SSH} ${HOST} "emgr -l | grep -E 'State|Label|Description' | head -n 15"
${SSH} ${HOST} "lppchk -v >/dev/null 2>&1 && echo 'lppchk -v: completed successfully' || echo 'lppchk -v: reported problems - review manually'"
echo "--------------------------------------------------------------"
echo
echo "===== OpenSSH/OpenSSL Upgrade Completed Successfully on $HOST ====="
exit 0

==========================================================================
Example Output:
Here’s what you’ll typically see when you run:
# ./upgrade_ssh_ifix.ksh aixlpar01

==============================================================
 Starting OpenSSH/OpenSSL upgrade on host: aixlpar01
 NIM Server: nim-master01
==============================================================
[CHECK] Testing SSH connectivity to aixlpar01...
[OK] SSH connectivity verified.
[INFO] Preparing /tmp/openssh-ssl directory...
[OK] Created temporary directory.
[INFO] Copying installation files from /software/openssh-ssl to aixlpar01...
[OK] Files copied successfully.
[INFO] Installing OpenSSH and OpenSSL base packages...
[OK] Installation phase complete.
[INFO] Applying IFIX packages...
   -> Applying IFIX: 3013ma.240923.epkg.Z
   -> Applying IFIX: 973013sa.250306.epkg.Z
[OK] IFIXes applied successfully.
[INFO] Restarting SSHD service...
[OK] SSHD restarted successfully.
[INFO] Cleaning up temporary files...
[OK] Cleanup completed.
[INFO] Post-upgrade validation:
--------------------------------------------------------------
7200-05-09-2346
openssh.base              9.0.100.250923  COMMITTED  OpenSSH Secure Shell Server
openssl.base              3.0.10.250923   COMMITTED  OpenSSL Cryptography Library
State = Applied  Label = 3013ma.240923.epkg.Z
State = Applied  Label = 973013sa.250306.epkg.Z
lppchk -v: completed successfully
--------------------------------------------------------------
===== OpenSSH/OpenSSL Upgrade Completed Successfully on aixlpar01 =====
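
The script takes a single hostname, so a fleet-wide rollout is just a loop over a host list. A minimal sketch (the helper name and the /tmp/hostlist path are ours, not part of the script above):

```shell
# run_per_host: run a command once per non-blank line of a host list.
# $1 = host list file (one hostname per line); remaining args = command.
run_per_host() {
    list=$1; shift
    while read -r h; do
        [ -z "$h" ] && continue        # skip blank lines
        "$@" "$h"
    done < "$list"
}

# Usage: run_per_host /tmp/hostlist ./upgrade_ssh_ifix.ksh
```

Capturing each run with tee (e.g. `./upgrade_ssh_ifix.ksh "$h" | tee /tmp/upgrade_$h.log`) keeps a per-LPAR record for later review.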

Adding a New Volume Group (VG) to an Existing Resource Group (RG) in IBM AIX PowerHA (HACMP)

In IBM AIX environments running HACMP (High Availability Cluster Multi-Processing), adding a new Volume Group (VG) to an existing Resource Group (RG) requires careful execution to maintain cluster consistency and ensure filesystem availability.

This guide provides a step-by-step procedure for safely adding a new VG to an active HACMP cluster.

1. Introduction
This procedure explains how to add a new volume group to an existing Resource Group (RG) in an active PowerHA/HACMP cluster. It covers:
  • Disk identification
  • VG creation
  • Filesystem setup
  • VG import on other nodes
  • Cluster configuration and synchronization
2. Identify and Prepare New Disks
Before creating the volume group, identify unused disks on each cluster node and configure them as physical volumes (PVs).

On NodeA
# cfgmgr
Find the unused disk:
# lspv | grep -i none
# chdev -l hdiskX -a pv=yes -a reserve_policy=no_reserve

On NodeB
# cfgmgr 
Find the unused disk:
# lspv | grep -i none
# chdev -l hdiskX -a pv=yes -a reserve_policy=no_reserve

Note:
pv=yes marks the disk as a physical volume (assigns it a PVID).
reserve_policy=no_reserve disables SCSI disk reservation, which is required for shared access in HACMP environments.
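
The "find the unused disk" step can be scripted: in `lspv` output a free disk shows none in the PVID column and None in the VG column. A small sketch (the helper name is ours; the lsattr line is the AIX-side check that the reservation change took effect):

```shell
# free_disks: print disks that have no PVID and belong to no VG,
# given `lspv`-style input (columns: disk  PVID  VG  [state]).
free_disks() {
    awk '$2 == "none" && $3 == "None" {print $1}'
}

# On a live AIX system:
#   lspv | free_disks
# After chdev, confirm the attribute actually changed:
#   lsattr -El hdiskX -a reserve_policy
```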

3. Create a Concurrent-Capable Volume Group
PowerHA requires volume groups to support concurrent access across cluster nodes.
Check Available Major Numbers

Run this command on NodeB to find available major numbers:
# lvlstmajor

Create the VG on NodeA
# mkvg -C -V 203 -S -s 32 -y <vgname> hdiskX
# varyonvg <vgname>
# chvg -Qn <vgname>

Explanation:
-C → Creates a concurrent-capable VG
-V 203 → Assigns a major number (must be available on both nodes)
-S → Creates a scalable-type VG
-s 32 → Sets the physical partition size to 32 MB
-Qn → Disables quorum, so the VG can stay online after a disk failure
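
Because the major number must be identical on both nodes, it is worth verifying after creation. On AIX the VG device file's major number is the field just before the comma in a long listing of /dev/<vgname>; a small parsing helper (the function name is ours, not a system command):

```shell
# major_of: extract the major number from `ls -l /dev/<vgname>` output,
# where it appears just before the comma (e.g. "... 203, 0 ...").
major_of() {
    awk '{sub(",", "", $5); print $5}'
}

# Run on each node and compare:
#   ls -l /dev/datavg | major_of      # expect 203 on both nodes
```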

4. Create Logical Volumes and Filesystems
After the VG is created, define logical volumes (LVs), mirrors, and filesystems.
# mklv -e x -u 4 -s s -t jfs2 -y <lvname> <vgname> 4 hdiskX
# lslv -m <lvname>
# mklvcopy -e x <lvname> 2 hdiskY hdiskZ
# varyonvg <vgname>
# crfs -v jfs2 -m /<mountpoint> -d /dev/<lvname> -A no -p rw -a logname=INLINE
# mount /<mountpoint>
# chown user:group /<mountpoint>
# umount /<mountpoint>
# varyoffvg <vgname>

Explanation:
mklv → Creates the logical volume
mklvcopy → Creates mirrored copies of the LV
crfs → Creates a JFS2 filesystem

Filesystem ownership and permissions are set, then the filesystem is unmounted and the VG varied off for cluster integration.
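
To spot-check that the mklvcopy mirrors landed, the `lslv -m` map can be counted: a mirrored logical partition shows a second PP/PV pair on the same row. A sketch (the helper name and the two-header-line layout are assumptions based on standard lslv -m output):

```shell
# mirrored_lps: count LPs that have at least two physical copies in
# `lslv -m` output (two header lines, then: LP PP1 PV1 [PP2 PV2 ...]).
mirrored_lps() {
    awk 'NR > 2 && NF >= 5 {n++} END {print n + 0}'
}

# Usage on AIX:
#   lslv -m datalv | mirrored_lps   # should equal the LV's LP count
```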

5. Import the VG on the Secondary Node

To make the VG available on the other cluster node:
On NodeA (to get PVID)
# lspv | grep -w <vgname> | head -1

On NodeB
# cfgmgr
# lspv | awk '$3 ~ /^None$/ {print "chdev -l "$1" -a reserve_policy=no_reserve"}' | sh
# importvg -n -V 203 -y <vgname> <PVID>

Explanation:
importvg -n → Imports the VG without automatically mounting filesystems
-V 203 → Ensures the same major number is used as on NodeA

6. Add the VG to the HACMP Resource Group
Once the VG is imported on both nodes, integrate it into the existing Resource Group (RG) configuration.

Verify HACMP Configuration
# smitty hacmp
→ Problem Determination Tools
→ HACMP Verification
→ Verify HACMP Configuration

Discover HACMP Information
# smitty hacmp
→ Extended Configuration
→ Discover HACMP-related Information from Configured Nodes

Add VG to an Existing Resource Group (e.g., RG01)
# smitty hacmp
→ Extended Configuration
→ Extended Resource Configuration
→ HACMP Extended Resource Group Configuration
→ Change/Show Resources and Attributes for a Resource Group
Select the desired Resource Group and add the new VG.

7. Synchronize the Cluster Configuration
After updating the RG, synchronize the cluster configuration across all nodes:
# smitty hacmp
→ Extended Configuration
→ Extended Verification and Synchronization
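
If you prefer the command line to smitty, PowerHA 7.x also ships the clmgr utility, which can drive the same verification and synchronization. A sketch, guarded so it degrades gracefully on systems without PowerHA installed:

```shell
# sync_cluster: verify and synchronize the cluster via clmgr when it is
# available (PowerHA 7.1+); otherwise print a reminder and fail.
sync_cluster() {
    if command -v clmgr >/dev/null 2>&1; then
        clmgr verify cluster && clmgr sync cluster
    else
        echo "clmgr not found - use the smitty path instead"
        return 1
    fi
}

# Usage on a PowerHA node: sync_cluster
```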

Resolving PowerHA (HACMP) ODM Version/Node Mismatches After Migration or Upgrade

After performing a migration or version upgrade in IBM AIX PowerHA (formerly HACMP) environments, administrators may encounter cluster version mismatches.

This typically occurs when the Object Data Manager (ODM) retains old version information, leading to inconsistencies that prevent PowerHA services from starting or functioning correctly.

This guide explains how to safely identify and correct PowerHA ODM version mismatches, ensuring cluster integrity and successful startup.

Step 1: Backup ODM (On Both Nodes)
Before making any changes, always back up the ODM repositories on both cluster nodes to prevent data loss in case of an error.
# tar -cvf hacmp.odm.tar /usr/es/sbin/cluster/etc/objrepos
# tar -cvf system.odm.tar /etc/objrepos
Tip: Keep these backups in a safe location. They can be restored if any configuration corruption occurs.
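
A backup you cannot read is no backup, so it is worth a quick readability check before touching the ODM. A minimal sketch (the helper name is ours):

```shell
# check_backups: warn about any tar archive that is missing or unreadable.
check_backups() {
    for f in "$@"; do
        tar -tf "$f" >/dev/null 2>&1 || echo "WARNING: $f missing or unreadable"
    done
}

# Usage: check_backups hacmp.odm.tar system.odm.tar
```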

Step 2: Correct the HACMP/PowerHA Cluster Version (On Both Nodes)

Export the Current HACMPcluster Object Class
Dump the current cluster object data into a file:
# odmget HACMPcluster > cluster.file

Review the File
Example output:
# cat cluster.file
HACMPcluster:
id = 1106245917
name = "HA72_TestCluster"
nodename = "node1"
sec_level = "Standard"
sec_level_msg = ""
sec_encryption = ""
sec_persistent = ""
last_node_ids = ""
highest_node_id = 0
last_network_ids = ""
highest_network_id = 0
last_site_ids = ""
highest_site_id = 0
handle = 1
cluster_version = 17
reserved1 = 0
reserved2 = 0
wlm_subdir = ""
settling_time = 0
rg_distribution_policy = "node"
noautoverification = 0
clvernodename = ""
clverhour = 0

Step 3: Edit the Cluster Version
Open the file in a text editor:
# vi cluster.file
Locate the line:
cluster_version = 17
Update the version number to match your installed PowerHA version.
For example:
PowerHA/HACMP Version    cluster_version Value
HACMP 7.2                17
PowerHA 7.3              20
Make sure that the cluster name and nodename fields are correct and consistent across nodes.
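
Since the change is a one-line substitution, it can also be made non-interactively with sed instead of vi. A sketch (the function name is ours; the target value 20 assumes PowerHA 7.3):

```shell
# bump_version: rewrite the cluster_version attribute in an odmget dump.
bump_version() {
    sed "s/cluster_version = [0-9][0-9]*/cluster_version = $1/"
}

# Usage: bump_version 20 < cluster.file > cluster.file.new
# Review cluster.file.new, then feed it to the odmadd step below.
```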

Step 4: Remove the Existing ODM Object
Delete the old cluster object from ODM:
# odmdelete -o HACMPcluster

Step 5: Re-add the Corrected ODM Object
Add the updated configuration back into the ODM:
# odmadd cluster.file

Step 6: Verify the Changes
Confirm that the cluster version and attributes were updated successfully:
# odmget HACMPcluster
Ensure the cluster_version field now reflects the correct version number.

Verify and Synchronize Cluster Configuration (Primary Node Only)
Once the ODM has been corrected on both nodes, perform a cluster verification and synchronization from the primary node (NodeA):
# smitty hacmp
Navigate through the SMIT menu:
Cluster Applications and Resources → Verify and Synchronize Cluster Configuration
This process ensures that all nodes have identical and valid cluster configurations.

Start PowerHA Cluster Services (Primary Node Only)
After successful verification and synchronization, start the cluster services from NodeA:
# smitty hacmp
Navigate:
System Management (C-SPOC) → Manage HACMP Services → Start Cluster Services
Select Both Nodes (NodeA and NodeB) when prompted.

Step 7: Verify Cluster and Resource Status
Once the cluster is running, check that all resources are online and functioning correctly.
# clstat
# clshowsrv
# lsvg -o
# df -g
# ifconfig -a

Confirm that:
  • Volume groups are varied on,
  • Filesystems are mounted,
  • Service IPs are active, and
  • Applications are running as expected.
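
The checklist above can be wrapped in one helper that flags anything still offline (a sketch; the VG name, mount point, and service IP in the usage line are examples, not values from your cluster):

```shell
# check_failover: report resources that are not yet online on this node.
# $1 = volume group, $2 = mount point, $3 = service IP.
check_failover() {
    lsvg -o 2>/dev/null | grep -w "$1" >/dev/null \
        || echo "FAIL: VG $1 not varied on"
    df -g "$2" >/dev/null 2>&1 \
        || echo "FAIL: $2 not mounted"
    ifconfig -a 2>/dev/null | grep -w "$3" >/dev/null \
        || echo "FAIL: service IP $3 not active"
}

# Usage: check_failover datavg /data01 192.168.10.11   # no output means OK
```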

Manual Resource Switch in IBM AIX PowerHA (HACMP) Cluster

In certain situations, the PowerHA (High Availability Cluster Multiprocessing) configuration may become unstable or non-functional. In such cases, a UNIX system administrator may need to manually switch resources from one node to another.

This procedure can also be used for planned maintenance, troubleshooting, or failover testing to ensure that all resources (volume groups, filesystems, IPs, and applications) are properly brought online on the target node.

Step 1: Check Cluster Resources
Before initiating a manual switch, verify the current cluster resources and identify the volume groups associated with the resource group.
# clshowres
This command displays the resource groups, service IPs, volume groups, and filesystems managed by PowerHA.

Step 2: Stop Cluster Services
Check if PowerHA cluster services are running. If they are active, stop them on both nodes to avoid conflicts.
# smit cl_admin
Then navigate through the menu:
PowerHA SystemMirror Services → Stop Cluster Services
Press Enter and confirm when prompted.

Step 3: Vary On the Volume Group
Activate the required volume group(s) on the target node (for example, NodeA):
# varyonvg -O <VG_Name> 
# lsvg <VG_Name>
The -O option allows concurrent-capable volume groups to be varied on.
Verify VG status and ensure that logical volumes are available and in an active state.

Step 4: Mount Filesystems
Once the volume group is active, mount the necessary filesystems:
# mount /data01
Note: Ensure that all mount points exist before mounting.
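
When the volume group holds several filesystems, lsvgfs can enumerate them so none is missed. A sketch (the helper name is ours; lsvgfs is AIX-only, so its errors are suppressed):

```shell
# mount_vg_fs: mount every filesystem belonging to a volume group.
mount_vg_fs() {
    for fs in $(lsvgfs "$1" 2>/dev/null); do
        mount "$fs" || echo "WARNING: could not mount $fs"
    done
}

# Usage: mount_vg_fs datavg
```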

Step 5: Verify Cluster Service IPs
Display cluster topology information and verify the service IP addresses assigned to the target node:
# cltopinfo
Ensure that the service IPs required for the resource group are not active on another node.

Step 6: Configure Service IP Aliases
Manually assign the service IP addresses as aliases on the network interface.
Add the IP alias:
# ifconfig en1 inet 192.168.10.11 netmask 255.255.255.0 alias
Verify the configuration:
# ifconfig en1
Confirm that the service IP alias is active on the interface.

Step 7: Start Application Services
After the volume groups, filesystems, and IP addresses are active, start the associated application services manually using their startup scripts or commands.
For example:
# /usr/local/bin/start_app.sh
Verify that the application processes are running and accessible using the configured service IP.

Fixing PV ID Mismatch While Varyon VG in IBM AIX


In IBM AIX environments, administrators sometimes encounter a PV (Physical Volume) ID mismatch issue when trying to varyon a Volume Group (VG).

This usually occurs after disk replacements, SAN migrations, or cloning operations that cause the disk’s metadata to differ from the system’s Object Data Manager (ODM) records.

When this mismatch happens, the Volume Group fails to varyon, preventing access to logical volumes (LVs) and filesystems.

This article explains how to safely fix the issue by cleaning and re-creating the Volume Group while preserving existing data.

Common Error Messages:

You may see one of the following errors when attempting to varyon the VG:
0516-010 : Unable to varyon volume group <vgname>.
PV ID xxxxxxxxxxx mismatch detected for hdiskX.
or
0516-052 varyonvg: Volume group cannot be varied on because PV identifiers do not match.

Step-by-Step Resolution:
Follow these steps carefully to fix the PV ID mismatch and recover the VG:

1. Export the Volume Group
Remove the VG definition from ODM to prevent configuration conflicts:
# exportvg datavg
This does not delete data — it only removes the VG entry from the system database.
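
Since exportvg removes the ODM definition, it is prudent to snapshot the LV and filesystem layout first, so names can be compared after the later import or recreate. A sketch (the helper name and output path are ours):

```shell
# save_vg_layout: record the LV list, PV list, and system PV table for a
# VG before it is exported. Prints the file it wrote.
save_vg_layout() {
    out="/tmp/$1.layout.$$"
    { lsvg -l "$1"; lsvg -p "$1"; lspv; } > "$out" 2>&1
    echo "$out"
}

# Usage: save_vg_layout datavg
```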

2. Clear the Disk Metadata
Reset the PV information on the affected disk to remove the old PV ID:
# chdev -a pv=clear -l hdiskX
This prepares the disk for reinitialization with a new PV header.

3. Remove and Reconfigure the Disk
Remove the disk from the device list, then rescan to reconfigure it:
# rmdev -dl hdiskX
# cfgmgr -v
Running cfgmgr forces AIX to detect the disk again with a fresh PV ID.

4. Re-Import the Volume Group or Recreate the Volume Group

Option A: Import the VG
If the disk’s metadata is still valid and intact, you can import the VG instead of recreating it:
# importvg -y datavg hdiskX
This rebuilds the VG definition in ODM using the VG metadata stored on the disk.

Option B: Recreate the VG
If importvg fails, recreate the VG using the cleaned disk:
# recreatevg -y datavg hdiskX
recreatevg scans the disk and rebuilds the VG structure in ODM.
It can re-detect existing LVs if they are still intact on the disk.

5. Rename Logical Volumes (Optional)
If LV names differ or need alignment with old names:
# chlv -n <New_LV_Name> <Old_LV_Name>

6. Rename Filesystems
Adjust mount points to match your previous setup:
# chfs -m <New_FS_Name> <Old_FS_Name>

7. Mount Filesystems
Finally, mount your filesystems and verify they are online:
# mount <New_FS_Name>
# df -g