
PowerHA NFS Cluster Installation & Configuration

Ensuring high availability (HA) for NFS services in an AIX environment requires a robust clustering solution. IBM PowerHA SystemMirror 7.2 provides a reliable platform for clustering nodes, managing resources, and enabling automatic failover. This guide walks you through installing PowerHA, configuring NFS services, creating an Enhanced Concurrent Volume Group (ECVG), and adding it as a cluster resource.

1. Steps to Configure Passwordless SSH Between Nodes:

Step 1: Generate SSH Key Pair
On Node1 (repeat for Node2 if you want both directions):
# su - root
# ssh-keygen -t rsa -f /root/.ssh/id_rsa
Press Enter for default file location.
Leave passphrase empty (important for passwordless access).

Step 2: Copy Public Key to Other Node
Use ssh-copy-id (if available) or manual copy:
On Node1, view public key:
# cat /root/.ssh/id_rsa.pub
On Node2, append it to /root/.ssh/authorized_keys:
# vi /root/.ssh/authorized_keys
# (paste the key from Node1)
# chmod 600 /root/.ssh/authorized_keys

Step 3: Set Correct Permissions
On all nodes:
# chmod 700 /root/.ssh
# chmod 600 /root/.ssh/authorized_keys
# chmod 600 /root/.ssh/id_rsa
# chmod 644 /root/.ssh/id_rsa.pub
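These modes can be spot-checked with a short loop. A sketch that demonstrates the idea against a throwaway directory (on a real node, point SSH_DIR at /root/.ssh instead):

```shell
# Sketch: verify SSH key-file modes (demo uses a temp dir; on a real node
# set SSH_DIR=/root/.ssh instead)
SSH_DIR=$(mktemp -d)/.ssh
mkdir -p "$SSH_DIR"
touch "$SSH_DIR/id_rsa" "$SSH_DIR/id_rsa.pub" "$SSH_DIR/authorized_keys"
chmod 700 "$SSH_DIR"
chmod 600 "$SSH_DIR/id_rsa" "$SSH_DIR/authorized_keys"
chmod 644 "$SSH_DIR/id_rsa.pub"

# ls -ld behaves the same on AIX and Linux, unlike GNU stat
dir_mode=$(ls -ld "$SSH_DIR" | cut -c1-10)
key_mode=$(ls -l "$SSH_DIR/id_rsa" | cut -c1-10)
echo "$dir_mode $key_mode"
```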

Step 4: Test SSH Connection
From Node1:
# ssh root@node2
You should log in without a password.
Repeat from Node2 → Node1.

2. Install PowerHA Software on Both Nodes:

Step 1: Mount the installation media
# mount nimserver01:/software/hacmp /mnt

Step 2: Install PowerHA filesets
# installp -acgXd /mnt bos.cluster.rte cluster.adt.es cluster.doc.en_US.es cluster.es.client cluster.es.cspoc cluster.es.nfs cluster.es.server cluster.license cluster.man.en_US.es

Step 3: Verify installation
# lslpp -l | grep cluster
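To confirm that every required fileset actually landed, the lslpp output can be checked against the list programmatically. A sketch using canned sample output (fileset names and levels here are illustrative):

```shell
# Sketch: check required PowerHA filesets against (sample) lslpp output
lslpp_out=$(cat <<'SAMPLE'
  cluster.es.server.rte   7.2.5.0  COMMITTED  Base Server Runtime
  cluster.es.client.rte   7.2.5.0  COMMITTED  ES Client Runtime
  cluster.es.nfs.rte      7.2.5.0  COMMITTED  NFS Support
SAMPLE
)
missing=""
for fs in cluster.es.server cluster.es.client cluster.es.nfs; do
    echo "$lslpp_out" | grep -q "$fs" || missing="$missing $fs"
done
if [ -z "$missing" ]; then echo "all filesets present"; else echo "MISSING:$missing"; fi
```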

Step 4: Reboot all nodes
# shutdown -Fr

3. Configure Network and Repository Disk:

Step 1: Configure internal cluster (boot) IPs
On node1:
# chdev -l en0 -a netaddr=192.168.10.101 -a netmask=255.255.255.0 -a state=up
On node2:
# chdev -l en0 -a netaddr=192.168.10.102 -a netmask=255.255.255.0 -a state=up

Step 2: Update /etc/hosts
192.168.10.101 node1_boot
192.168.10.102 node2_boot
192.168.10.201 nfs_service_ip

Step 3: Configure /etc/cluster/rhosts
192.168.10.101
192.168.10.102
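A quick consistency check helps here: every IP listed in /etc/cluster/rhosts should also resolve in /etc/hosts. A sketch against demo copies of the two files (on a real node, use the actual paths):

```shell
# Sketch: every IP in /etc/cluster/rhosts should appear in /etc/hosts
# (demo files; on a real node use the actual paths)
rhosts=$(mktemp); hosts=$(mktemp)
printf '192.168.10.101\n192.168.10.102\n' > "$rhosts"
printf '192.168.10.101 node1_boot\n192.168.10.102 node2_boot\n192.168.10.201 nfs_service_ip\n' > "$hosts"

bad=0
while read -r ip; do
    grep -q "^$ip[[:space:]]" "$hosts" || { echo "NOT IN /etc/hosts: $ip"; bad=1; }
done < "$rhosts"
[ "$bad" -eq 0 ] && echo "rhosts consistent with hosts"
```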

Step 4: Restart cluster communication daemon
# stopsrc -s clcomd
# startsrc -s clcomd

Step 5: Set up repository disk
# lspv
# chdev -l hdisk2 -a pv=yes
# lspv | grep hdisk2

4. Create Cluster Configuration:

Step 1: Create a new cluster
# /usr/es/sbin/cluster/utilities/clmgr add cluster nfs_cluster

Step 2: Add nodes
# /usr/es/sbin/cluster/utilities/clmgr add node node1 -c nfs_cluster
# /usr/es/sbin/cluster/utilities/clmgr add node node2 -c nfs_cluster

Step 3: Define cluster network
# /usr/es/sbin/cluster/utilities/clmgr add network net_ether_01 ether

Step 4: Add network interfaces
# /usr/es/sbin/cluster/utilities/clmgr add interface net_ether_01 node1 en0
# /usr/es/sbin/cluster/utilities/clmgr add interface net_ether_01 node2 en0

Step 5: Add repository disk
# /usr/es/sbin/cluster/utilities/clmgr add repositorydisk hdisk2

Step 6: Verify and synchronize cluster
# /usr/es/sbin/cluster/utilities/clmgr verify cluster nfs_cluster
# /usr/es/sbin/cluster/utilities/clmgr sync cluster nfs_cluster

Step 7: Start the cluster
# /usr/es/sbin/cluster/utilities/clmgr start cluster nfs_cluster
# clstat -o

5. Configure NFS Filesystem and Service:

Step 1: Create shared volume group and logical volume (on one node)
# mkvg -S -y nfs_vg hdisk3
# mklv -t jfs2 -y nfs_lv nfs_vg 1024
# crfs -v jfs2 -d nfs_lv -m /nfs_share -A no -p rw -a logname=INLINE
# mkdir -p /nfs_share
# mount /nfs_share

Step 2: Export NFS share (on both nodes)
Edit /etc/exports (AIX export options follow a single dash and are comma-separated; client lists use colons):
/nfs_share -rw,secure,root=client1:client2,access=client1:client2
Apply export:
# exportfs -a
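To see what is actually being offered after exportfs -a, the exports file can be parsed directly. A sketch with a demo file (on a real node, point it at /etc/exports):

```shell
# Sketch: list the exported directories from an exports file
# (demo file; on a real node parse /etc/exports)
exports=$(mktemp)
echo '/nfs_share -rw,root=client1:client2,access=client1:client2' > "$exports"
exported=$(awk '!/^#/ && NF {print $1}' "$exports")
echo "exported: $exported"
```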

Step 3: Add NFS resources to PowerHA
# /usr/es/sbin/cluster/utilities/clmgr add nfsserver nfs_srv1
# /usr/es/sbin/cluster/utilities/clmgr add nfsexpfs nfs_exp1 -m /nfs_share -e "/nfs_share -rw -secure -root=client1,client2 -access=client1,client2"
# /usr/es/sbin/cluster/utilities/clmgr add serviceip nfs_svc_ip -A 192.168.10.201 -n net_ether_01

Step 4: Create Resource Group for NFS
# /usr/es/sbin/cluster/utilities/clmgr add rg rg_nfs -n node1,node2 -p node1 -f never
# /usr/es/sbin/cluster/utilities/clmgr add rg_resource rg_nfs serviceip nfs_svc_ip
# /usr/es/sbin/cluster/utilities/clmgr add rg_resource rg_nfs nfsserver nfs_srv1
# /usr/es/sbin/cluster/utilities/clmgr add rg_resource rg_nfs nfsexpfs nfs_exp1
# /usr/es/sbin/cluster/utilities/clmgr verify cluster nfs_cluster
# /usr/es/sbin/cluster/utilities/clmgr sync cluster nfs_cluster
# /usr/es/sbin/cluster/utilities/clmgr online rg rg_nfs
# clRGinfo
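The clRGinfo check can also be scripted, for example from a monitoring job. A sketch against sample output (exact clRGinfo formatting varies by PowerHA version, so treat the layout as an assumption):

```shell
# Sketch: check that the resource group is ONLINE from (sample) clRGinfo
# output; exact column layout varies by PowerHA version
rginfo=$(cat <<'SAMPLE'
Group Name     State        Node
rg_nfs         ONLINE       node1
rg_nfs         OFFLINE      node2
SAMPLE
)
state=$(echo "$rginfo" | awk '$1=="rg_nfs" && $3=="node1" {print $2}')
echo "rg_nfs on node1: $state"
```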

6. Creating Enhanced Concurrent Volume Group (ECVG):

Step 1: Verify Shared Disks
# lspv
# lspv | grep hdisk4

Step 2: Create ECVG
# mkvg -S -C -y ecvg_vg hdisk4
-S → scalable volume group format
-C → creates an Enhanced Concurrent Capable volume group
-y ecvg_vg → volume group name
Note: do not enable auto-varyon (-A y); PowerHA activates the VG when the resource group comes online.

Step 3: Create Logical Volumes and Filesystem
# mklv -t jfs2 -y ecvg_lv ecvg_vg 1024
# crfs -v jfs2 -d ecvg_lv -m /ecvg_share -A no -p rw
# mkdir -p /ecvg_share
# mount /ecvg_share

Step 4: Add ECVG as a Cluster Resource
# /usr/es/sbin/cluster/utilities/clmgr add rg rg_ecvg -n node1,node2 -p node1 -f never
# /usr/es/sbin/cluster/utilities/clmgr add rg_resource rg_ecvg concurrentvg ecvg_vg
# /usr/es/sbin/cluster/utilities/clmgr verify cluster nfs_cluster
# /usr/es/sbin/cluster/utilities/clmgr sync cluster nfs_cluster
# /usr/es/sbin/cluster/utilities/clmgr online rg rg_ecvg
# clRGinfo

7. Validate NFS and ECVG:
From an NFS client:
# showmount -e 192.168.10.201
# mount 192.168.10.201:/nfs_share /mnt
# touch /mnt/testfile
# ls -l /mnt

Test failover by shutting down node1. The NFS service and ECVG resource should automatically failover to node2.

PowerHA Cluster Installation & Configuration

Setting up a PowerHA SystemMirror (formerly HACMP) cluster requires careful planning and execution to ensure high availability for critical applications. This guide walks through the steps to configure a two-node AIX cluster using shared storage and CAA (Cluster Aware AIX).

Note: In this guide, the primary (active) node is referred to as node1, and the secondary (passive) node is node2.

Step 1: Generate SSH Key Pair
On Node1 (repeat for Node2 if you want both directions):
# su - root
# ssh-keygen -t rsa -f /root/.ssh/id_rsa
Press Enter for default file location.
Leave passphrase empty (important for passwordless access).

Step 2: Copy Public Key to Other node
Use ssh-copy-id (if available) or manual copy:
On node1, view public key:
# cat /root/.ssh/id_rsa.pub
On node2, append it to /root/.ssh/authorized_keys:
# vi /root/.ssh/authorized_keys
# (paste the key from node1)
# chmod 600 /root/.ssh/authorized_keys

Step 3: Set Correct Permissions
On all nodes:
# chmod 700 /root/.ssh
# chmod 600 /root/.ssh/authorized_keys
# chmod 600 /root/.ssh/id_rsa
# chmod 644 /root/.ssh/id_rsa.pub

Step 4: Test Node-to-Node Communication
Test remote shell communication using clrsh:
# /usr/sbin/clrsh node1 hostname   # Output: node1
# /usr/sbin/clrsh node2 hostname   # Output: node2
If tests fail, the cluster cannot be created until the communication issue is resolved.

Step 5: Verify Shared Storage
Before creating a cluster, ensure all shared disks are visible on both nodes:
# lspv
Check all shared datavg hdisks and the hdisk intended for the CAA repository.
Ensure all shared hdisks have reserve_policy = no_reserve:
# lsattr -El hdiskXX
# chdev -l hdiskXX -a reserve_policy=no_reserve
The hdisk used for the CAA repository should be new and never part of a VG.
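Auditing the reserve policy across many disks is easier with a small helper. A sketch that parses lsattr-style output (the input line is a canned demo; on AIX, pipe `lsattr -El hdiskN` into it):

```shell
# Sketch: extract reserve_policy from lsattr-style output (demo line;
# on AIX feed `lsattr -El hdiskN` into get_policy)
get_policy() { awk '$1=="reserve_policy" {print $2}'; }

p=$(printf 'reserve_policy single_path Reserve Policy\n' | get_policy)
if [ "$p" != "no_reserve" ]; then
    echo "policy is $p -> run: chdev -l hdiskXX -a reserve_policy=no_reserve"
fi
```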

Step 6: Verify Network Interfaces
Run ifconfig -a on both nodes to identify boot IP addresses for all adapters.
At this stage, there should be no alias IPs configured.

Step 7: Configure rhosts for Cluster Communication
Add all boot IPs to /etc/cluster/rhosts on both nodes (one IP per line, no comments).
Ensure /etc/hosts resolves all IPs locally.
Configure /etc/netsvc.conf to prioritize local hostname resolution:
hosts=local4,bind4

Step 8: Restart Cluster Communication Daemon
# stopsrc -s clcomd; sleep 5; startsrc -s clcomd

Step 9: Install PowerHA
Mount the NFS share on both nodes:
# mount nimserver1:/export/powerHA /mnt
Change to the PowerHA installation directory:
# cd /mnt/HA7.x/
Install all packages:
# installp -acgXYd . all

Step 10: Create the Cluster
On node1, add the cluster and nodes:
# /usr/es/sbin/cluster/utilities/clmgr add cluster <cluster_name> NODES="node1 node2"
Add a repository disk (disable validation for a new disk):
# /usr/es/sbin/cluster/utilities/clmgr add repository <Repository_Disk_PVID> DISABLE_VALIDATION=true
Synchronize the cluster configuration:
# /usr/es/sbin/cluster/utilities/clmgr sync cluster
Verify cluster status on both nodes:
# lscluster -m       # Should show each node as UP
# lssrc -a | grep cthags  # Should show cthags active
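The node-state check can be scripted for a two-node cluster. A sketch against sample output (exact `lscluster -m` formatting varies by AIX level, so the layout is an assumption):

```shell
# Sketch: count nodes reported UP in (sample) `lscluster -m` output
lsc_out=$(cat <<'SAMPLE'
Node name: node1
State of node: UP  NODE_LOCAL
Node name: node2
State of node: UP
SAMPLE
)
up=$(echo "$lsc_out" | grep -c 'State of node: UP')
echo "nodes UP: $up"
```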

Step 11: Create Resource Group and Service IP
Create a resource group:
# /usr/es/sbin/cluster/utilities/clmgr add resource_group <RG_Name> NODES=node1,node2
Create a service IP:
# /usr/es/sbin/cluster/utilities/clmgr add service_ip <Service_IP_Label> NETWORK=<Network_Name>

Step 12: Create Shared Volume Group
# /usr/es/sbin/cluster/sbin/cl_mkvg -f -n -cspoc -n 'node1,node2' -r '<RG_Name>' -y '<VG_Name>' -V '<VG_Major_Number>' -E <hdiskx_PVID>

Step 13: Add Resources to Resource Group
# /usr/es/sbin/cluster/utilities/clmgr modify resource_group '<RG_Name>' SERVICE_LABEL='<Service_IP_Label>'

Step 14: Synchronize and Start Cluster
Synchronize the cluster:
# /usr/es/sbin/cluster/utilities/clmgr sync cluster
Start PowerHA and bring the cluster online:
# /usr/es/sbin/cluster/utilities/clmgr online cluster WHEN=now START_CAA=yes

The GPFS Cluster filesystem extension (AIX & Linux)

The GPFS (General Parallel File System) filesystem extension is the core component that integrates IBM Spectrum Scale with the operating system’s Virtual File System (VFS) layer, allowing GPFS to function as a native filesystem similar to ext3 or xfs.

The GPFS kernel extension registers support at the OS vnode and VFS levels, enabling seamless recognition and handling of GPFS operations. It works closely with GPFS daemons that manage cluster-wide I/O operations, including read-ahead and write-behind optimizations for high-performance data access.

GPFS Filesystem Extension Steps for AIX:

1. Scan the LUNs on all the GPFS nodes
# cfgmgr -v
2. Set reserve_policy on each disk on each node
# chdev -l <hdisk#> -a reserve_policy=no_reserve
3. Create the file /tmp/nsdhdiskX.txt
# vi /tmp/nsdhdiskX.txt
%nsd:
         device=/dev/<hdiskX>
         servers=server1,server2,server3,server4
         nsd=<nsd_name>
         usage=dataAndMetadata
         failureGroup=1
         pool=system
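For more than one disk, the stanza file can be generated with a loop rather than edited by hand. A sketch (the device, server, and NSD names are placeholders):

```shell
# Sketch: generate an NSD stanza file for several disks
# (device, server, and NSD names below are placeholders)
nsdfile=$(mktemp)
i=1
for dev in hdisk5 hdisk6; do
    cat >> "$nsdfile" <<EOF
%nsd:
    device=/dev/$dev
    servers=server1,server2
    nsd=nsd$i
    usage=dataAndMetadata
    failureGroup=1
    pool=system
EOF
    i=$((i+1))
done
stanzas=$(grep -c '^%nsd:' "$nsdfile")
echo "stanzas written: $stanzas"
```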

4. Create NSD from the file /tmp/nsdhdiskX.txt
# mmcrnsd -F /tmp/nsdhdiskX.txt

5. Confirm that the NSD names correspond to the disks in the lspv output:
# lspv

6. To verify using the mmlsnsd command.
# mmlsnsd

7. Add disks to an existing filesystem (quote the NSD list so the shell does not treat ";" as a command separator):
# mmadddisk <gpfs_filesystem> "<nsd_name1>;<nsd_name2>"
Example:
# mmadddisk gpfs01 "nsd1;nsd2"

8. To validate the NSD disks attached to the GPFS filesystem:
# mmlsdisk <gpfs_filesystem>
Example:
# mmlsdisk gpfs01

9. To check the new filesystem size:
# df -g | grep <filesystem_name>

GPFS Filesystem Extension Steps for Linux:

1. Scan the LUNs and identify the shared LUNs on all the GPFS nodes
# for host in /sys/class/scsi_host/host*; do
echo "- - -" > "$host/scan"
done
# lsblk

2. Create the file /tmp/disk1.txt
# vi /tmp/disk1.txt
%nsd:
       device=</dev/sdX>
       servers=server1,server2,server3,server4
       nsd=<nsd_name>
       usage=dataAndMetadata
       failureGroup=1
       pool=system

3. Create NSD from the file /tmp/disk1.txt
# mmcrnsd -F /tmp/disk1.txt

4. Confirm that the NSD names correspond to the disks in the lsblk output:
# lsblk

5. To verify using the mmlsnsd command.
# mmlsnsd

6. Add disks to an existing filesystem (quote the NSD list so the shell does not treat ";" as a command separator):
# mmadddisk <gpfs_filesystem> "<nsd_name1>;<nsd_name2>"
Example:
# mmadddisk gpfs01 "nsd1;nsd2"

7. To validate the NSD disks attached to the GPFS filesystem:
# mmlsdisk <gpfs_filesystem>
Example:
# mmlsdisk gpfs01

8. To check the new filesystem size:
# df -h or df -h <filesystem_name>

To update or change the IP address on a GPFS cluster (AIX & Linux)

To update or change the IP address of nodes in a GPFS cluster, the general process involves several key steps:

Prerequisites:
  • Root or equivalent privileges on all the GPFS nodes.
  • Confirm all nodes are reachable via SSH at the new IP addresses.
  • Ensure there’s a maintenance window (since GPFS services will be stopped).
  • Backup critical configuration files.
1. Backup Existing Configuration Files
Run these commands on the primary node:
# cp -rp /var/mmfs/gen/mmfsNodeData /var/mmfs/gen/mmfsNodeData.org
# cp -rp /etc/hosts /etc/hosts.old
# cp -rp /etc/filesystems /etc/filesystems.pciip
This preserves the original configuration in case a rollback is needed.

2. Unmount and Stop GPFS Cluster
Unmount all GPFS filesystems and shut down GPFS services:
# mmumount all -a
# mmshutdown -a
Check that all nodes are properly shut down:
# mmgetstate -aL
Expected output: all nodes should show down.

3. Update Host IP Addresses
Edit the /etc/hosts file to replace the old IPs with the new ones.
# vi /etc/hosts
Before:
192.168.10.101   node1
192.168.10.102   node2
After:
# 192.168.10.101   node1
# 192.168.10.102   node2
192.168.10.151   node1
192.168.10.152   node2
Save and close the file (:wq!).
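The hosts-file edit can also be scripted: comment out the old entries and append the new ones. A sketch against a demo file (on a real node, operate on /etc/hosts only after backing it up):

```shell
# Sketch: comment out old IP entries and append the new ones
# (demo file; on a real node work on a backed-up /etc/hosts)
hosts=$(mktemp)
printf '192.168.10.101   node1\n192.168.10.102   node2\n' > "$hosts"
sed 's/^192\.168\.10\.10[12]/# &/' "$hosts" > "$hosts.new"
printf '192.168.10.151   node1\n192.168.10.152   node2\n' >> "$hosts.new"
cat "$hosts.new"
active=$(grep -cv '^#' "$hosts.new")
```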

4. Update GPFS Node Interfaces
Use the mmchnode command to change the admin and daemon interfaces for each node:
# mmchnode --admin-interface=node1 --daemon-interface=node1 -N node1
# mmchnode --admin-interface=node2 --daemon-interface=node2 -N node2
Verify the change:
# mmlscluster
Confirm the new IPs are correctly associated with the nodes.

5. Restart GPFS Cluster
Start GPFS on all nodes:
# mmstartup -a
Check the state again:
# mmgetstate -aL
Expected output: all nodes should show active.

6. Verify Filesystem Mounts
Remount all GPFS filesystems:
# mmmount all -a
Then verify mount points and disk space for AIX:
# df -g or df -g <filesystem_name>
Then verify mount points and disk space for Linux:
# df -h or df -h <filesystem_name>

AIX Hidden Files in NFS Can Have Permission Issues

NFS client-server permission mapping:
  • NFS relies on UID (user ID) and GID (group ID) matching between client and server.
  • If a file is created on the server by userA (UID 1001) and the client has userB (UID 1002), the permissions may appear wrong.

Root squash:
  • NFS often maps root on the client to nobody on the server for security (root_squash).
  • This prevents root on client from changing server files.
  • Hidden files (. prefix) behave like normal files, so permission issues are the same as regular files, but they might not be visible unless ls -a is used.
Check Existing Permissions
# ls -l /mount/dir # Normal files
# ls -la /mount/dir # Hidden files included
Example:
-rw-r--r-- 1 nobody users 50 Sep 26 .hiddenfile
Here:
Owner is nobody → the client user cannot modify the file
Group is users → the group may not match the client user's groups

Fixing Hidden File Permissions on NFS

A. On the NFS Server
Check UID/GID of the file:
# ls -ln /export/dir
-n shows numeric UID/GID.
Change ownership:
# chown correctuser:correctgroup /export/dir/.hiddenfile
Set proper permissions:
# chmod 600 /export/dir/.hiddenfile # Owner read/write
# chmod 644 /export/dir/.hiddenfile # Owner read/write, others read

B. On the NFS Client
Remount with correct options if UID/GID mismatch:
# mount -o remount,vers=3,rw server:/export/dir /mount/dir
Ensure user on client has same UID/GID as server:
# id username
If mismatch, either create matching UID/GID or adjust server file ownership.
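The UID comparison can be illustrated with two passwd-style entries for the same username. A sketch (demo strings; in practice compare the `id username` output from client and server):

```shell
# Sketch: compare a user's UID between client and server passwd entries
# (demo strings; in practice run `id username` on both sides)
client_entry='userA:x:1001:100:User A:/home/userA:/bin/sh'
server_entry='userA:x:1002:100:User A:/home/userA:/bin/sh'
c_uid=$(echo "$client_entry" | cut -d: -f3)
s_uid=$(echo "$server_entry" | cut -d: -f3)
if [ "$c_uid" = "$s_uid" ]; then
    echo "UIDs match"
else
    echo "UID mismatch: client=$c_uid server=$s_uid (files will map to the wrong owner)"
fi
```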

C. Special Case: Root Cannot Modify
If root_squash is active:
You cannot change ownership or permissions as root from the client.
The fix must be done on the NFS server by a user with the required permissions.

Create multiple Solaris users remotely via jump server

Make sure you can SSH into these hosts passwordlessly (using SSH keys).
Example:
Jump server : [root@inddcpjmp01 solaris]#
./create_users_solaris_full.sh <servername> <users.csv>

1. Input CSV file users.csv example:
[root@inddcpjmp01 solaris]# cat users.csv
username,uid,gid,groups...,fullname,homedir,password
user1,1001,1001,unixadm;staff,John Doe,/export/home/user1,Welcome@123
user2,1002,1002,sysadmin;staff;apps,Jane Smith,/export/home/user2,Welcome@123
[root@inddcpjmp01 solaris]#
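Before feeding the CSV to the script, it is worth validating it: every data line needs all seven fields, and the UIDs should be unique. A sketch using the demo CSV above:

```shell
# Sketch: pre-validate users.csv (field count and duplicate UIDs)
# before running the user-creation script
csv=$(mktemp)
cat > "$csv" <<'EOF'
username,uid,gid,groups...,fullname,homedir,password
user1,1001,1001,unixadm;staff,John Doe,/export/home/user1,Welcome@123
user2,1002,1002,sysadmin;staff;apps,Jane Smith,/export/home/user2,Welcome@123
EOF
badlines=$(sed '1d' "$csv" | awk -F, 'NF < 7 {n++} END {print n+0}')
dupuids=$(sed '1d' "$csv" | cut -d, -f2 | sort | uniq -d | wc -l)
echo "short lines: $badlines, duplicate UIDs: $dupuids"
```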

2. Create multiple Solaris users remotely via jump server (create_users_solaris_full.sh)

#!/bin/bash
# ---------------------------------------------------------------------
# Script: create_users_solaris_full.sh
# Purpose: Create multiple Solaris users remotely via jump server.
# Author: Tasleem A Khan
# Usage: ./create_users_solaris_full.sh <servername> <users.csv>
# ---------------------------------------------------------------------

REMOTE_HOST="$1"
INPUT_FILE="$2"

if [ -z "$REMOTE_HOST" ] || [ -z "$INPUT_FILE" ]; then
    echo "Usage: $0 <servername> <csv_file>"
    exit 1
fi

if [ ! -f "$INPUT_FILE" ]; then
    echo "Error: File '$INPUT_FILE' not found."
    exit 1
fi

if [ "$(id -u)" -ne 0 ]; then
    echo "Error: Must run as root on jump server."
    exit 1
fi

REMOTE_TMP_DIR="/tmp/user_create_$$"
REMOTE_SCRIPT="${REMOTE_TMP_DIR}/remote_create_users.sh"
REMOTE_CSV="${REMOTE_TMP_DIR}/users.csv"
REMOTE_LOG="${REMOTE_TMP_DIR}/create_users.log"

echo "--------------------------------------------------"
echo " Running user creation on remote host: $REMOTE_HOST"
echo "--------------------------------------------------"

# --- Create temporary directory on remote host ---
ssh -o BatchMode=yes -o ConnectTimeout=10 "$REMOTE_HOST" "mkdir -p $REMOTE_TMP_DIR" || {
    echo "Error: Unable to connect to $REMOTE_HOST or create remote directory."
    exit 1
}

# --- Copy CSV to remote host ---
scp -q "$INPUT_FILE" "$REMOTE_HOST:$REMOTE_CSV" || {
    echo "Error: Failed to copy $INPUT_FILE to $REMOTE_HOST:$REMOTE_CSV"
    exit 1
}

# --- Create remote script dynamically ---
cat > /tmp/remote_create_users.sh <<'EOF'
#!/bin/bash
INPUT_FILE="$1"
LOG_FILE="$2"

if [ "$(id -u)" -ne 0 ]; then
    echo "Error: Must run as root." | tee -a "$LOG_FILE"
    exit 1
fi

echo "Starting user creation..." | tee -a "$LOG_FILE"

create_group_if_missing() {
    local groupname="$1"
    local gid="$2"
    if ! grep -q "^${groupname}:" /etc/group; then
        echo "Creating group $groupname (GID=${gid})" | tee -a "$LOG_FILE"
        if [ -n "$gid" ]; then
            /usr/sbin/groupadd -g "$gid" "$groupname"
        else
            /usr/sbin/groupadd "$groupname"
        fi
    fi
}

# --- Process CSV line by line (skip header) ---
sed '1d' "$INPUT_FILE" | while IFS=',' read -r username uid gid rest
do
    username=$(echo "$username" | xargs)
    uid=$(echo "$uid" | xargs)
    gid=$(echo "$gid" | xargs)
    rest=$(echo "$rest" | xargs)

    # Skip invalid lines
    [ -z "$username" ] && continue
    [ -z "$uid" ] && continue
    [ -z "$gid" ] && continue

    # --- Parse remaining fields: secondary groups, fullname, homedir, password ---
    OLDIFS="$IFS"
    IFS=','
    arr=()
    for f in $rest; do
        arr+=("$f")
    done
    IFS="$OLDIFS"

    # Determine fullname as first field with a space
    sec_groups=""
    fullname=""
    homedir=""
    password=""
    for ((i=0;i<${#arr[@]};i++)); do
        if [[ "${arr[$i]}" =~ \  ]]; then
            fullname="${arr[$i]}"
            homedir="${arr[$((i+1))]}"
            password="${arr[$((i+2))]}"
            break
        else
            sec_groups+="${arr[$i]};"
        fi
    done

    # Cleanup trailing semicolon
    sec_groups=$(echo "$sec_groups" | sed 's/;$//')

    echo "--------------------------------------------------" | tee -a "$LOG_FILE"
    echo "Processing user: $username" | tee -a "$LOG_FILE"
    echo " UID: $uid" | tee -a "$LOG_FILE"
    echo " GID: $gid (primary group: $username)" | tee -a "$LOG_FILE"
    echo " Secondary groups: $sec_groups" | tee -a "$LOG_FILE"
    echo " Full name: $fullname" | tee -a "$LOG_FILE"
    echo " Home dir: $homedir" | tee -a "$LOG_FILE"
    echo "--------------------------------------------------" | tee -a "$LOG_FILE"

    # --- Primary group ---
    create_group_if_missing "$username" "$gid"

    # --- Secondary groups ---
    sec_group_option=""
    if [ -n "$sec_groups" ]; then
        # Convert semicolon to comma for Solaris useradd
        sec_group_csv=$(echo "$sec_groups" | tr ';' ',')
        # Ensure each group exists
        OLDIFS="$IFS"
        IFS=','
        for g in $sec_group_csv; do
            g=$(echo "$g" | xargs)
            create_group_if_missing "$g"
        done
        IFS="$OLDIFS"
        sec_group_option="-G $sec_group_csv"
    fi

    # --- Create user ---
    if id "$username" >/dev/null 2>&1; then
        echo "User $username already exists, skipping..." | tee -a "$LOG_FILE"
        continue
    fi

    /usr/sbin/useradd -u "$uid" -g "$username" $sec_group_option -d "$homedir" -m -c "$fullname" "$username"

    if [ $? -eq 0 ]; then
        # --- Set password using expect ---
        if command -v expect >/dev/null 2>&1; then
            /usr/bin/expect <<EOPASS >/dev/null
spawn passwd "$username"
expect "New Password:"
send "$password\r"
expect "Re-enter New Password:"
send "$password\r"
expect eof
EOPASS
            echo "User $username created successfully with password from CSV." | tee -a "$LOG_FILE"
        else
            # If expect not installed, force password change
            passwd -f "$username" 2>/dev/null
            echo "Password for $username must be changed at first login (expect not installed)." | tee -a "$LOG_FILE"
        fi
    else
        echo "Failed to create user $username" | tee -a "$LOG_FILE"
    fi

done

echo "All users processed." | tee -a "$LOG_FILE"
EOF

# --- Copy and run remotely ---
scp -q /tmp/remote_create_users.sh "$REMOTE_HOST:$REMOTE_SCRIPT" || {
    echo "Error: Failed to copy remote script."
    exit 1
}

ssh -tt "$REMOTE_HOST" "bash $REMOTE_SCRIPT $REMOTE_CSV $REMOTE_LOG"

# --- Fetch log ---
scp -q "$REMOTE_HOST:$REMOTE_LOG" "./create_users_${REMOTE_HOST}.log" && \
echo "Log saved as create_users_${REMOTE_HOST}.log"

# --- Cleanup ---
ssh "$REMOTE_HOST" "rm -rf $REMOTE_TMP_DIR"
rm -f /tmp/remote_create_users.sh

echo "--------------------------------------------------"
echo " Completed user creation on $REMOTE_HOST"
echo " Log: create_users_${REMOTE_HOST}.log"
echo "--------------------------------------------------"
[root@inddcpjmp01 solaris]#

Script Output:
[root@inddcpjmp01 solaris]# ./create_users_solaris_full.sh indsuntst01 users.csv
--------------------------------------------------
 Running user creation on remote host: indsuntst01
--------------------------------------------------
Starting user creation...
--------------------------------------------------
Processing user: user1
 UID: 1001
 GID: 1001 (primary group: user1)
 Secondary groups: unixadm;staff
 Full name: John Doe
 Home dir: /export/home/user1
--------------------------------------------------
Creating group user1 (GID=1001)
80 blocks
User user1 created successfully with password from CSV.
--------------------------------------------------
Processing user: user2
 UID: 1002
 GID: 1002 (primary group: user2)
 Secondary groups: sysadmin;staff;apps
 Full name: Jane Smith
 Home dir: /export/home/user2
--------------------------------------------------
Creating group user2 (GID=1002)
80 blocks
User user2 created successfully with password from CSV.
All users processed.
Connection to indsuntst01 closed.
Log saved as create_users_indsuntst01.log
--------------------------------------------------
 Completed user creation on indsuntst01
 Log: create_users_indsuntst01.log
--------------------------------------------------
[root@inddcpjmp01 solaris]#

Solaris Server output:
login as: user1
Keyboard-interactive authentication prompts from server:
| Password:
End of keyboard-interactive prompts from server
Last login: Mon Oct 27 01:21:44 2025 from 192.168.10.252
Oracle Corporation      SunOS 5.11      11.4    Aug 2018
user1@indsuntst01:~$ sudo su -
Oracle Corporation      SunOS 5.11      11.4    Aug 2018
You have new mail.
root@indsuntst01:~#
root@indsuntst01:~# cat /etc/passwd | egrep "user1|user2"
user1:x:1001:1001:John Doe:/export/home/user1:/usr/bin/bash
user2:x:1002:1002:Jane Smith:/export/home/user2:/usr/bin/bash
root@indsuntst01:~# cat /etc/group | egrep "user1|user2"
staff::10:sunadm,unixadm,user1,user2
sysadmin::14:user2
unixadm::100:user1
apps::102:user2
user1::1001:
user2::1002:
root@indsuntst01:~#


Solaris server pre & post validation scripts run from the jump server

Make sure you can SSH into these hosts passwordlessly (using SSH keys).
Example:
Jump server : [root@inddcpjmp01 solaris]#
./solaris_pre_check.sh indsuntst01
./solaris_post_check.sh indsuntst01

1. Remote Pre Check Script (solaris_pre_check.sh)
[root@inddcpjmp01 solaris]# cat solaris_pre_check.sh
#!/bin/bash
# =============================================================
# Solaris Pre-Reboot Validation Script
# Author: Tasleem Ahmed Khan
# Version: 1.1
# Description: Collects pre-reboot system details in structured
#              key=value format for post-reboot comparison.
# =============================================================

if [ $# -lt 1 ]; then
    echo "Usage: $0 <server1> [server2] ..."
    exit 1
fi

timestamp=$(date +'%Y%m%d_%H%M%S')
log_dir="logs"
mkdir -p "$log_dir"

for server in "$@"; do
    outfile="${log_dir}/${server}_pre_${timestamp}.log"
    echo "Collecting pre-reboot data for $server..."

    ssh -o ConnectTimeout=10 "$server" bash <<'EOF' > "$outfile" 2>&1
# ----------------- SYSTEM INFORMATION -----------------
echo "HOSTNAME=$(hostname)"
echo "OS_VERSION=$(cat /etc/release | head -1)"
echo "KERNEL=$(uname -r)"
echo "ARCH=$(uname -p)"
echo "UPTIME=$(uptime | awk -F'up' '{print $2}' | sed 's/ //g')"

# ----------------- CPU -----------------
echo "CPU_CORES=$(psrinfo | wc -l)"
echo "CPU_MODEL=$(psrinfo -pv | head -1)"

# ----------------- MEMORY -----------------
MEM_TOTAL=$(prtconf | grep "Memory size" | awk '{print $3}')
FREE_PAGES=$(vmstat 1 2 | tail -1 | awk '{print $5}')
PAGE_SIZE=$(pagesize)
MEM_FREE=$(echo "$FREE_PAGES * $PAGE_SIZE / 1024 / 1024" | bc)
echo "MEM_TOTAL_MB=$MEM_TOTAL"
echo "MEM_FREE_MB=$MEM_FREE"

# ----------------- DISK -----------------
DISK_TOTAL=$(df -k | awk 'NR>1 {sum+=$2} END {print sum/1024/1024}')
DISK_USED=$(df -k | awk 'NR>1 {sum+=$3} END {print sum/1024/1024}')
DISK_FREE=$(df -k | awk 'NR>1 {sum+=$4} END {print sum/1024/1024}')
echo "DISK_TOTAL_GB=$DISK_TOTAL"
echo "DISK_USED_GB=$DISK_USED"
echo "DISK_FREE_GB=$DISK_FREE"

# ----------------- FILESYSTEMS & MOUNTS -----------------
fs_mounts=$(mount | awk '{print $3":"$5}' | paste -sd,)
echo "FILESYSTEM_MOUNTS=$fs_mounts"

# ----------------- ZFS POOLS -----------------
if command -v zpool >/dev/null 2>&1; then
    zpool_status=$(zpool list -H -o name,health | awk '{print $1":"$2}' | paste -sd,)
    echo "ZFS_POOL_STATUS=$zpool_status"
else
    echo "ZFS_POOL_STATUS=NA"
fi

# ----------------- NFS / NAS MOUNTS -----------------
nfs_mounts=$(mount | grep -Ei 'nfs|nas' | awk '{print $3}' | paste -sd,)
echo "NFS_MOUNTS=$nfs_mounts"

# ----------------- SERVICES -----------------
services=$(svcs -Ho FMRI | paste -sd,)
echo "SERVICES_RUNNING=$services"

# ----------------- OPEN PORTS -----------------
ports=$(netstat -an | grep LISTEN | awk '{print $4}' | awk -F"." '{print $NF}' | paste -sd,)
echo "OPEN_PORTS=$ports"

# ----------------- PROCESSES -----------------
proc_count=$(ps -ef | wc -l)
echo "PROCESS_COUNT=$proc_count"

# ----------------- LOAD AVERAGE -----------------
load=$(uptime | awk -F'load average:' '{print $2}' | sed 's/ //g')
echo "LOAD_AVG=$load"

# ----------------- OS PATCHES -----------------
if command -v showrev >/dev/null 2>&1; then
    echo "OS_PATCHES=$(showrev -p | head -20 | paste -sd,)"
else
    # showrev was removed in Solaris 11; report NA instead of erroring
    echo "OS_PATCHES=NA"
fi
EOF

    echo "Pre-reboot data collected for $server"
    echo "Log saved at: $outfile"
    echo "------------------------------------------------------------"
done

[root@inddcpjmp01 solaris]#

2. Remote Post Check Script (solaris_post_check.sh)
[root@inddcpjmp01 solaris]# cat solaris_post_check.sh
#!/bin/bash
# =============================================================
# Solaris Post-Reboot Validation Script
# Author: Tasleem Ahmed Khan
# Version: 1.1
# Description: Collects post-reboot system details and compares
#              with pre-reboot baseline.
# =============================================================

if [ $# -lt 1 ]; then
    echo "Usage: $0 <server1> [server2] ..."
    exit 1
fi

timestamp=$(date +'%Y%m%d_%H%M%S')
log_dir="logs"
mkdir -p "$log_dir"

for server in "$@"; do
    # Find latest pre-reboot log
    pre_file=$(ls -1t ${log_dir}/${server}_pre_*.log 2>/dev/null | head -1)
    if [ -z "$pre_file" ]; then
        echo "No pre-reboot log found for $server. Run pre-check first!"
        continue
    fi

    post_file="${log_dir}/${server}_post_${timestamp}.log"
    compare_file="${log_dir}/${server}_comparison_${timestamp}.log"

    echo "Collecting post-reboot data for $server..."
    ssh -o ConnectTimeout=10 "$server" bash <<'EOF' > "$post_file" 2>&1
# ----------------- SYSTEM INFORMATION -----------------
echo "HOSTNAME=$(hostname)"
echo "OS_VERSION=$(cat /etc/release | head -1)"
echo "KERNEL=$(uname -r)"
echo "ARCH=$(uname -p)"
echo "UPTIME=$(uptime | awk -F'up' '{print $2}' | sed 's/ //g')"

# ----------------- CPU -----------------
echo "CPU_CORES=$(psrinfo | wc -l)"
echo "CPU_MODEL=$(psrinfo -pv | head -1)"

# ----------------- MEMORY -----------------
MEM_TOTAL=$(prtconf | grep "Memory size" | awk '{print $3}')
FREE_PAGES=$(vmstat 1 2 | tail -1 | awk '{print $5}')
PAGE_SIZE=$(pagesize)
MEM_FREE=$(echo "$FREE_PAGES * $PAGE_SIZE / 1024 / 1024" | bc)
echo "MEM_TOTAL_MB=$MEM_TOTAL"
echo "MEM_FREE_MB=$MEM_FREE"

# ----------------- DISK -----------------
DISK_TOTAL=$(df -k | awk 'NR>1 {sum+=$2} END {print sum/1024/1024}')
DISK_USED=$(df -k | awk 'NR>1 {sum+=$3} END {print sum/1024/1024}')
DISK_FREE=$(df -k | awk 'NR>1 {sum+=$4} END {print sum/1024/1024}')
echo "DISK_TOTAL_GB=$DISK_TOTAL"
echo "DISK_USED_GB=$DISK_USED"
echo "DISK_FREE_GB=$DISK_FREE"

# ----------------- FILESYSTEMS & MOUNTS -----------------
fs_mounts=$(mount | awk '{print $3":"$5}' | paste -sd,)
echo "FILESYSTEM_MOUNTS=$fs_mounts"

# ----------------- ZFS POOLS -----------------
if command -v zpool >/dev/null 2>&1; then
    zpool_status=$(zpool list -H -o name,health | awk '{print $1":"$2}' | paste -sd,)
    echo "ZFS_POOL_STATUS=$zpool_status"
else
    echo "ZFS_POOL_STATUS=NA"
fi

# ----------------- NFS / NAS MOUNTS -----------------
nfs_mounts=$(mount | grep -Ei 'nfs|nas' | awk '{print $3}' | paste -sd,)
echo "NFS_MOUNTS=$nfs_mounts"

# ----------------- SERVICES -----------------
services=$(svcs -Ho FMRI | paste -sd,)
echo "SERVICES_RUNNING=$services"

# ----------------- OPEN PORTS -----------------
ports=$(netstat -an | grep LISTEN | awk '{print $4}' | awk -F"." '{print $NF}' | paste -sd,)
echo "OPEN_PORTS=$ports"

# ----------------- PROCESSES -----------------
proc_count=$(ps -ef | wc -l)
echo "PROCESS_COUNT=$proc_count"

# ----------------- LOAD AVERAGE -----------------
load=$(uptime | awk -F'load average:' '{print $2}' | sed 's/ //g')
echo "LOAD_AVG=$load"

# ----------------- OS PATCHES -----------------
if command -v showrev >/dev/null 2>&1; then
    echo "OS_PATCHES=$(showrev -p | head -20 | paste -sd,)"
else
    # showrev was removed in Solaris 11; report NA instead of erroring
    echo "OS_PATCHES=NA"
fi
EOF

    # ----------------- Comparison -----------------
    echo "==================== PRE vs POST COMPARISON ====================" > "$compare_file"
    while IFS='=' read -r key pre_value; do
        post_value=$(grep "^$key=" "$post_file" | cut -d= -f2-)
        if [ "$pre_value" = "$post_value" ]; then
            status="OK"
        else
            status="MISMATCH"
        fi
        echo "$key : Pre=$pre_value | Post=$post_value | Status=$status" >> "$compare_file"
    done < "$pre_file"

    echo "Post-reboot check completed for $server"
    echo "Post log: $post_file"
    echo "Comparison report: $compare_file"
    echo "------------------------------------------------------------"
done

[root@inddcpjmp01 solaris]#

Example Output:
[root@inddcpjmp01 solaris]# cat logs/indsuntst01_comparison_20251026_135247.log
==================== PRE vs POST COMPARISON ====================
HOSTNAME : Pre=indsuntst01 | Post=indsuntst01 | Status=OK
OS_VERSION : Pre=                             Oracle Solaris 11.4 X86 | Post=                             Oracle Solaris 11.4 X86 | Status=OK
KERNEL : Pre=5.11 | Post=5.11 | Status=OK
ARCH : Pre=i386 | Post=i386 | Status=OK
UPTIME : Pre=p1:14,1 | Post=p1:15,1 | Status=MISMATCH
CPU_CORES : Pre=2 | Post=2 | Status=OK
CPU_MODEL : Pre=The physical processor has 2 virtual processors (0-1) | Post=The physical processor has 2 virtual processors (0-1) | Status=OK
MEM_TOTAL_MB : Pre=4096 | Post=4096 | Status=OK
MEM_FREE_MB : Pre=10325 | Post=10331 | Status=MISMATCH
DISK_TOTAL_GB : Pre=254.52 | Post=254.52 | Status=OK
DISK_USED_GB : Pre=5.51882 | Post=5.51882 | Status=OK
DISK_FREE_GB : Pre=178.836 | Post=178.836 | Status=OK
FILESYSTEM_MOUNTS : Pre= | Post= | Status=OK
ZFS_POOL_STATUS : Pre= | Post= | Status=OK
NFS_MOUNTS : Pre= | Post= | Status=OK
SERVICES_RUNNING : Pre= | Post= | Status=OK
OPEN_PORTS : Pre= | Post= | Status=OK
PROCESS_COUNT : Pre=80 | Post=80 | Status=OK
LOAD_AVG : Pre=oadaverage:0.01,0.01,0.01 | Post=oadaverage:0.02,0.01,0.01 | Status=MISMATCH
bash: line 61: showrev: command not found : Pre= | Post= | Status=OK
OS_PATCHES : Pre= | Post= | Status=OK
[root@inddcpjmp01 solaris]#
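When the report is long, only the mismatches matter. A sketch that filters them from a demo report (on the jump server, point it at the real logs/<server>_comparison_*.log file):

```shell
# Sketch: pull only the MISMATCH lines out of a comparison report
# (demo report; use the real logs/<server>_comparison_*.log file)
report=$(mktemp)
cat > "$report" <<'EOF'
HOSTNAME : Pre=indsuntst01 | Post=indsuntst01 | Status=OK
UPTIME : Pre=p1:14,1 | Post=p1:15,1 | Status=MISMATCH
CPU_CORES : Pre=2 | Post=2 | Status=OK
EOF
grep 'Status=MISMATCH$' "$report"
mismatches=$(grep -c 'Status=MISMATCH$' "$report")
echo "total mismatches: $mismatches"
```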