Setting up a PowerHA SystemMirror (formerly HACMP) cluster requires careful planning and execution to ensure high availability for critical applications. This guide walks through the steps to configure a two-node AIX cluster using shared storage and CAA (Cluster Aware AIX).
Step 1: Install PowerHA
Mount the NFS share on both nodes:
# mount nimserver1:/export/powerHA /mnt
Change to the PowerHA installation directory:
# cd /mnt/HA7.x/
Install all packages:
# installp -acgXYd . all
Add the PowerHA directories to PATH in root's profile, then reload the profile:
# vi ~/.profile
export PATH=$PATH:/usr/es/sbin/cluster/utilities:/usr/es/sbin/cluster/bin:/usr/es/sbin/cluster
# . ~/.profile
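Sourcing the profile repeatedly appends the same directories to PATH each time. A minimal sketch of a guard against that, assuming a POSIX shell; the helper name path_append is illustrative and not part of PowerHA:

```shell
#!/bin/sh
# path_append DIR: append DIR to PATH only if it is not already listed.
# (Hypothetical helper, intended for ~/.profile.)
path_append() {
    case ":$PATH:" in
        *":$1:"*) ;;                # already present, leave PATH unchanged
        *) PATH="$PATH:$1" ;;
    esac
}

# The three directories are the ones used in the export line above.
for d in /usr/es/sbin/cluster/utilities /usr/es/sbin/cluster/bin /usr/es/sbin/cluster
do
    path_append "$d"
done
export PATH
```

With this in ~/.profile, running ". ~/.profile" several times leaves PATH with a single copy of each directory.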
Step 2: Generate SSH Key Pair
On Node1 (repeat for Node2):
# su - root
# ssh-keygen -t rsa -b 4096
Press Enter for default file location.
Leave passphrase empty (important for passwordless access).
Step 3: Copy Public Key to Other Node
On node1, view the public key:
# cat /root/.ssh/id_rsa.pub
On node2, append it to /root/.ssh/authorized_keys (then repeat in the other direction with node2's key on node1):
# vi /root/.ssh/authorized_keys
# (paste the key from node1)
# chmod 600 /root/.ssh/authorized_keys
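Pasting the key by hand in vi can wrap or duplicate it. As an alternative, here is a rough sketch that appends a public key only when it is not already present. It assumes the key file from the other node has already been copied over; the function name add_key and the /tmp/node1.pub path are illustrative only:

```shell
#!/bin/sh
# add_key PUBKEY_FILE AUTH_FILE
# Append the key in PUBKEY_FILE to AUTH_FILE unless that exact line exists.
add_key() {
    pub=$1
    auth=$2
    key=$(cat "$pub") || return 1
    [ -n "$key" ] || return 1                 # refuse to append an empty key
    touch "$auth" && chmod 600 "$auth" || return 1
    # -x: match the whole line, -F: fixed string, -q: no output
    if ! grep -qxF -e "$key" "$auth"; then
        printf '%s\n' "$key" >> "$auth"
    fi
}

# Possible use on node2, assuming node1's id_rsa.pub was copied to /tmp/node1.pub:
#   add_key /tmp/node1.pub /root/.ssh/authorized_keys
```

Running it twice with the same key leaves a single entry in the file.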
Step 4: Set Correct Permissions
On all nodes:
# chmod 700 /root/.ssh
# chmod 600 /root/.ssh/authorized_keys
# chmod 600 /root/.ssh/id_rsa
# chmod 644 /root/.ssh/id_rsa.pub
Step 5: Test Node-to-Node Communication
Test remote shell communication using ssh:
# ssh node1 hostname # Output: node1
# ssh node2 hostname # Output: node2
If tests fail, the cluster cannot be created until the communication issue is resolved.
Step 6: Verify Shared Storage
Before creating a cluster, ensure all shared disks are visible on both nodes:
# lspv
Check all shared datavg hdisks and the hdisk intended for the CAA repository.
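Comparing lspv output by eye gets error-prone with many disks. The sketch below checks that each expected PVID appears in the lspv output of both nodes. It assumes the output of each node was saved to a file first (for example "lspv > /tmp/lspv.node1"); the file names and function names are illustrative:

```shell
#!/bin/sh
# has_pvid PVID LSPV_FILE
# Succeed if PVID appears in column 2 of the saved lspv output.
# Appending "" forces a string comparison, so PVIDs that happen to look
# like numbers (for example 00123e45...) are not compared numerically.
has_pvid() {
    awk -v id="$1" '($2 "") == (id "") { found = 1 } END { exit found ? 0 : 1 }' "$2"
}

# check_shared NODE1_LSPV NODE2_LSPV PVID...
# Print one line per PVID; return non-zero if any PVID is missing on a node.
check_shared() {
    f1=$1
    f2=$2
    shift 2
    rc=0
    for id in "$@"
    do
        if has_pvid "$id" "$f1" && has_pvid "$id" "$f2"; then
            echo "$id visible on both nodes"
        else
            echo "$id MISSING on at least one node"
            rc=1
        fi
    done
    return $rc
}

# Possible use, with the PVIDs from this guide:
#   check_shared /tmp/lspv.node1 /tmp/lspv.node2 00659fa7e0f7be1f 00f33172e0f7ffaa
```

Matching on PVID rather than hdisk name matters because the same LUN can be numbered differently on each node.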
Ensure all shared hdisks have reserve_policy = no_reserve:
# lsattr -El hdiskXX
Example:
for i in hdisk4 hdisk5 hdisk6 hdisk7 hdisk8 hdisk9
do
lsattr -El $i | grep "no_reserve"
done
reserve_policy no_reserve Reserve Policy True
reserve_policy no_reserve Reserve Policy True
reserve_policy no_reserve Reserve Policy True
reserve_policy no_reserve Reserve Policy True
reserve_policy no_reserve Reserve Policy True
reserve_policy no_reserve Reserve Policy True
If a disk reports a different policy, change it:
# chdev -l hdiskXX -a reserve_policy=no_reserve
The hdisk used for the CAA repository should be new and never part of a VG.
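The grep loop in the example prints nothing for a disk with the wrong policy, so a problem disk is easy to overlook. A small sketch that names each disk and reports its status explicitly; it reads lsattr output on standard input, and the function names are illustrative:

```shell
#!/bin/sh
# reserve_policy_of: read `lsattr -El hdiskN` output on stdin and print
# the value in the second column of the reserve_policy row.
reserve_policy_of() {
    awk '$1 == "reserve_policy" { print $2; exit }'
}

# report_reserve DISK: read lsattr output on stdin, print one status line,
# and return non-zero when the policy is anything other than no_reserve.
report_reserve() {
    policy=$(reserve_policy_of)
    if [ "$policy" = "no_reserve" ]; then
        echo "$1 OK"
    else
        echo "$1 NEEDS CHANGE (reserve_policy=${policy:-unknown})"
        return 1
    fi
}

# Possible use on each node:
#   for d in hdisk4 hdisk5 hdisk6 hdisk7 hdisk8 hdisk9
#   do
#       lsattr -El "$d" | report_reserve "$d"
#   done
```

A disk that lacks the attribute entirely is reported as "unknown" instead of being silently skipped.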
Step 7: Verify Network Interfaces
Run ifconfig -a on both nodes to identify boot IP addresses for all adapters.
At this stage, there should be no alias IPs configured.
Step 8: Configure rhosts for Cluster Communication
Add all boot IPs to /etc/cluster/rhosts on both nodes (one IP per line, no comments).
Ensure /etc/hosts resolves all IPs locally.
Configure /etc/netsvc.conf to prioritize local hostname resolution:
hosts=local4,bind4
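A stray comment or trailing blank in /etc/cluster/rhosts is easy to miss. The following sketch flags any line that is not a bare IPv4 address. It assumes IPv4 boot addresses only and does not range-check the octets; the function name check_rhosts is illustrative:

```shell
#!/bin/sh
# check_rhosts FILE
# Print every line that is not exactly one dotted-quad IPv4 address
# (comments, blank lines, and trailing spaces are all reported).
# Returns non-zero if any such line is found.
check_rhosts() {
    awk '
        /^[0-9]+\.[0-9]+\.[0-9]+\.[0-9]+$/ { next }
        { printf "line %d is not a bare IPv4 address: %s\n", NR, $0; bad = 1 }
        END { exit bad }
    ' "$1"
}

# Possible use on each node:
#   check_rhosts /etc/cluster/rhosts && echo "rhosts format looks fine"
```

Running it on both nodes before restarting clcomd catches formatting slips early.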
Addressing the netmon.cf Warning
# vi /usr/es/sbin/cluster/netmon.cf
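!REQD <owner> <target_IP>
Here <owner> is the interface being monitored (or !ALL for every interface) and <target_IP> is an address outside the node that must answer pings, typically the default gateway.
Example:
!REQD !ALL 192.168.10.1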
Step 9: Restart Cluster Communication Daemon
# stopsrc -s clcomd; sleep 5; startsrc -s clcomd
Step 10: Create the Cluster
On node1, add the cluster and nodes:
# /usr/es/sbin/cluster/utilities/clmgr add cluster <cluster_name> NODES="node1 node2"
Example:
# /usr/es/sbin/cluster/utilities/clmgr add cluster dcp_cluster NODES="indcppha01 indcppha02"
Add a repository disk (disable validation for a new disk):
# /usr/es/sbin/cluster/utilities/clmgr add repository <Repository_Disk_PVID> DISABLE_VALIDATION=true
Example:
# chdev -l hdisk4 -a pv=yes     # run on both nodes
Verify the PVID on both nodes:
# lspv | grep 00659fa7e0f7be1f
# /usr/es/sbin/cluster/utilities/clmgr add repository 00659fa7e0f7be1f DISABLE_VALIDATION=true
Successfully added a primary repository disk.
To view the complete configuration of repository disks use:
"clmgr query repository" or "clmgr view report repository"
# clmgr query repository
hdisk4 (00659fa7e0f7be1f)
# clmgr view report repository
dcp_cluster :
00659fa7e0f7be1f hdisk4(indcppha01) active
No backup repository
Synchronize the cluster configuration:
# /usr/es/sbin/cluster/utilities/clmgr sync cluster
or
# smitty hacmp --> Cluster Applications and Resources --> Verify and Synchronize Cluster Configuration
Verify cluster status on both nodes:
# lscluster -m # Should show each node as UP
# lssrc -a | grep cthags # Should show cthags active
Step 11: Create Resource Group and Service IP
Create a resource group:
# /usr/es/sbin/cluster/utilities/clmgr add resource_group <RG_Name> NODES=node1,node2
Example:
# /usr/es/sbin/cluster/utilities/clmgr add resource_group dcp_rg NODES=indcppha01,indcppha02
Create a service IP:
# /usr/es/sbin/cluster/utilities/clmgr add service_ip <Service_IP_Label> NETWORK=<Network_Name>
Example:
Verify the network configuration and confirm that the service label resolves in /etc/hosts:
# /usr/es/sbin/cluster/utilities/clmgr query network
# cat /etc/hosts | grep dcpapps
192.168.10.18 dcpapps.ppc.com dcpapps
# /usr/es/sbin/cluster/utilities/clmgr add service_ip dcpapps NETWORK=net_ether_02
Step 12: Create Shared Volume Group
# /usr/es/sbin/cluster/sbin/cl_mkvg -f -n -cspoc -n 'node1,node2' -r '<RG_Name>' -y '<VG_Name>' -V '<VG_Major_Number>' -E <hdiskx_PVID>
Example:
# ls -l /dev/hdisk5
brw------- 1 root system 14, 7 Mar 11 09:48 /dev/hdisk5
# lspv | grep 00f33172e0f7ffaa
hdisk5 00f33172e0f7ffaa None
# /usr/es/sbin/cluster/sbin/cl_mkvg -f -n -cspoc -n 'indcppha01,indcppha02' -r 'dcp_rg' -y 'dcpvg01' -V '14' -E 00f33172e0f7ffaa
Alternatively, use SMIT:
# smitty hacmp --> System Management (C-SPOC) --> Storage --> Volume Groups --> Create a Volume Group
Select indcppha01 and indcppha02 --> select Big as the volume group type --> select the PVID
Create a Big Volume Group
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[Entry Fields]
Node Names indcppha01,indcppha02
Resource Group Name [dcp_rg] +
PVID 00f33172e0f7ffaa
VOLUME GROUP name [dcpvg01]
Physical partition SIZE in megabytes 4 +
Volume group MAJOR NUMBER [42] #
Enable Fast Disk Takeover or Concurrent Access Fast Disk Takeover +
Volume Group Type Big
CRITICAL volume group? no +
Enable LVM Encryption yes +
Enable PV Encryption no +
Auth Method +
Auth Method Name []
Warning:
Changing the volume group major number may result
in the command being unable to execute
successfully on a node that does not have the
major number currently available. Please check
for a commonly available major number on all nodes
before changing this setting.
F1=Help F2=Refresh F3=Cancel F4=List
Esc+5=Reset Esc+6=Command Esc+7=Edit Esc+8=Image
Esc+9=Shell Esc+0=Exit Enter=Do
Step 13: Add Resources to Resource Group
# /usr/es/sbin/cluster/utilities/clmgr modify resource_group '<RG_Name>' SERVICE_LABEL='<Service_IP_Label>'
Example:
# /usr/es/sbin/cluster/utilities/clmgr modify resource_group 'dcp_rg' SERVICE_LABEL='dcpapps'
Step 14: Synchronize and Start Cluster
Synchronize the cluster:
# /usr/es/sbin/cluster/utilities/clmgr sync cluster
or
# smitty hacmp --> Cluster Applications and Resources --> Verify and Synchronize Cluster Configuration
Start PowerHA and bring the cluster online:
# /usr/es/sbin/cluster/utilities/clmgr online cluster WHEN=now START_CAA=yes
# /usr/es/sbin/cluster/utilities/clmgr online rg dcp_rg
Example Output:
# clRGinfo
-----------------------------------------------------------------------------
Group Name State Node
-----------------------------------------------------------------------------
dcp_rg ONLINE indcppha01
OFFLINE indcppha02
Create the file system:
# mklv -t jfs2 -y dcp_lv dcpvg01 128
# crfs -v jfs2 -d dcp_lv -m /dcpapps -A no -p rw -a logname=INLINE
# mkdir -p /dcpapps
# mount /dcpapps
Verify that the file system is mounted:
# df -g | grep /dcpapps
/dev/dcp_lv 4.00 3.98 1% 4 1% /dcpapps
Fail over the resource group to the other node:
# /usr/es/sbin/cluster/utilities/clmgr move resource_group <RG_NAME> node=<Node_Name>
Example:
# /usr/es/sbin/cluster/utilities/clmgr move resource_group dcp_rg node=indcppha02
Attempting to move resource group dcp_rg to node indcppha02.
Waiting for the cluster to stabilize.............................
# clRGinfo
-----------------------------------------------------------------------------
Group Name State Node
-----------------------------------------------------------------------------
dcp_rg OFFLINE indcppha01
ACQUIRING indcppha02
# clRGinfo
-----------------------------------------------------------------------------
Group Name State Node
-----------------------------------------------------------------------------
dcp_rg OFFLINE indcppha01
ONLINE indcppha02
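Rerunning clRGinfo by hand until the state changes can be scripted. This sketch extracts the node on which a resource group is ONLINE from clRGinfo output read on standard input. It assumes the three-column layout shown above, with single-word states and continuation rows that omit the group name; the function name rg_online_node is illustrative:

```shell
#!/bin/sh
# rg_online_node RG
# Read clRGinfo output on stdin and print the node where RG is ONLINE.
# Prints nothing if the group is not ONLINE anywhere.
rg_online_node() {
    awk -v rg="$1" '
        /^-/ || $1 == "Group" { next }                 # skip rulers and header
        NF == 3 { cur = $1; state = $2; node = $3 }    # first row of a group
        NF == 2 { state = $1; node = $2 }              # continuation row
        NF >= 2 && cur == rg && state == "ONLINE" { print node; exit }
    '
}

# Possible use: wait until dcp_rg is ONLINE on indcppha02.
#   until [ "$(clRGinfo | rg_online_node dcp_rg)" = "indcppha02" ]
#   do
#       sleep 5
#   done
```

While the group is still in ACQUIRING the function prints nothing, so the loop keeps waiting.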
Example Output:
indcppha02 / # df -g
Filesystem GB blocks Free %Used Iused %Iused Mounted on
/dev/hd4 1.00 0.93 8% 3089 2% /
/dev/hd2 3.00 0.55 82% 52947 29% /usr
/dev/hd9var 1.00 0.92 8% 837 1% /var
/dev/hd3 1.00 1.00 1% 27 1% /tmp
/dev/hd1 1.00 1.00 1% 7 1% /home
/dev/hd11admin 0.12 0.12 1% 5 1% /admin
/proc - - - - - /proc
/dev/hd10opt 1.00 0.93 7% 375 1% /opt
/dev/livedump 0.25 0.25 1% 4 1% /var/adm/ras/livedump
/dev/fslv00 1.00 1.00 1% 5 1% /usr/local
/ahafs - - - 51 1% /aha
/dev/dcp_lv 4.00 3.98 1% 4 1% /dcpapps
indcppha02 / # ifconfig -a
en0: flags=e084863,14c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,LARGESEND,CHAIN>
inet 192.168.20.14 netmask 0xffffff00 broadcast 192.168.20.255
tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
en1: flags=e084863,114c0<UP,BROADCAST,NOTRAILERS,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,LARGESEND,CHAIN>
inet 192.168.10.18 netmask 0xffffff00 broadcast 192.168.10.255
inet 192.168.10.14 netmask 0xffffff00 broadcast 192.168.10.255
tcp_sendspace 262144 tcp_recvspace 262144 rfc1323 1
lo0: flags=e08084b,c0<UP,BROADCAST,LOOPBACK,RUNNING,SIMPLEX,MULTICAST,GROUPRT,64BIT,LARGESEND,CHAIN>
inet 127.0.0.1 netmask 0xff000000 broadcast 127.255.255.255
inet6 ::1%1/64
tcp_sendspace 131072 tcp_recvspace 131072 rfc1323 1
indcppha02 / # cltopinfo
Cluster Name: dcp_cluster
Cluster Type: Standard
Heartbeat Type: Unicast
Repository Disk: hdisk4 (00659fa7e0f7be1f)
There are 2 node(s) and 1 network(s) defined
NODE indcppha01:
Network net_ether_02
dcpapps 192.168.10.18
indcppha01 192.168.10.13
NODE indcppha02:
Network net_ether_02
dcpapps 192.168.10.18
indcppha02 192.168.10.14
Resource Group dcp_rg
Startup Policy Online On Home Node Only
Fallover Policy Fallover To Next Priority Node In The List
Fallback Policy Fallback To Higher Priority Node In The List
Participating Nodes indcppha01 indcppha02
Service IP Label dcpapps
indcppha02 / # clshowres
Resource Group Name dcp_rg
Participating Node Name(s) indcppha01 indcppha02
Startup Policy Online On Home Node Only
Fallover Policy Fallover To Next Priority Node In The List
Fallback Policy Fallback To Higher Priority Node In The List
Site Relationship ignore
Dynamic Node Priority
Service IP Label dcpapps
Filesystems ALL
Filesystems Consistency Check fsck
Filesystems Recovery Method sequential
Filesystems/Directories to be exported (NFSv2/NFSv3)
Filesystems/Directories to be exported (NFSv4)
Filesystems to be NFS mounted
Network For NFS Mount
Filesystem/Directory for NFSv4 Stable Storage
Volume Groups dcpvg01
Concurrent Volume Groups
Use forced varyon for volume groups, if necessary false
Disks
Raw Disks
Disk Error Management?
GMVG Replicated Resources
GMD Replicated Resources
PPRC Replicated Resources
SVC PPRC Replicated Resources
SVC PBR Replicated Resources
EMC SRDF(R) Replicated Resources
TRUECOPY Replicated Resources
GENERIC XD Replicated Resources
Connections Services
Fast Connect Services
Shared Tape Resources
Application Servers
Highly Available Communication Links
Primary Workload Manager Class
Secondary Workload Manager Class
Delayed Fallback Timer
Miscellaneous Data
Automatically Import Volume Groups false
Inactive Takeover
SSA Disk Fencing false
Filesystems mounted before IP configured false
WPAR Name
Run Time Parameters:
Node Name indcppha01
Debug Level high
Format for hacmp.out Standard
Node Name indcppha02
Debug Level high
Format for hacmp.out Standard
indcppha02 / #