In IBM PowerHA SystemMirror (formerly HACMP), you may encounter an error message like this during cluster operations, synchronization, or while running clmgr commands:
Cluster IPC error: The cluster manager on node note1 is in ST_INIT or NOT_CONFIGURED state and cannot process the IPC request.
This error prevents cluster commands from completing and means the cluster manager on the named node could not service an inter-process communication (IPC) request.
What This Error Means:
PowerHA nodes communicate through IPC (Inter-Process Communication) between the clstrmgrES daemons on each node.
This message tells you that the target node’s cluster manager process (clstrmgrES) is in one of two states:
- ST_INIT: the cluster manager is initializing and cluster services haven’t fully started yet.
- NOT_CONFIGURED: the node’s local PowerHA configuration (ODM repository) isn’t set up or is corrupted.
So when you try to synchronize, verify, or query the cluster, that node cannot respond.
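To see which of these states a node is actually in, the long listing from `lssrc -ls clstrmgrES` can be inspected. The sketch below assumes that listing contains a line of the form `Current state: <STATE>`; the function names `cm_state` and `cm_ready` are hypothetical helpers, not part of PowerHA.

```shell
# cm_state: read `lssrc -ls clstrmgrES` output on stdin and print the value
# of its "Current state:" line (assumed format), or UNKNOWN if none is found.
cm_state() {
    awk -F': *' '/^Current state:/ { print $2; found = 1; exit }
                 END { if (!found) print "UNKNOWN" }'
}

# cm_ready: succeed only when the state is not one of the two states named
# in the error message (and could be determined at all).
cm_ready() {
    case "$(cm_state)" in
        ST_INIT|NOT_CONFIGURED|UNKNOWN) return 1 ;;
        *) return 0 ;;
    esac
}

# Usage on a cluster node (not executed here):
#   lssrc -ls clstrmgrES | cm_ready && echo "cluster manager can take requests"
```

Reading the state from stdin keeps the parsing separate from the AIX command, so the helper can be tried against saved output first.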
Root Causes:
This error can happen for several reasons:
- Cluster manager service not running (clstrmgrES daemon stopped or failed).
- ODM or repository corruption in /etc/es/objrepos/HACMP* files.
- Network communication issue — PowerHA communication interfaces down or hostname mismatch.
- Cluster configuration mismatch between nodes.
- Partial node configuration (node added but not yet synchronized).
Step-by-Step Fix:
Step 1: Check the Cluster Manager Daemon
On the affected node (note1):
# ps -ef | grep clstrmgrES | grep -v grep
If not running, start it:
# startsrc -s clstrmgrES
Confirm it’s active:
# lssrc -s clstrmgrES
Expected output:
Subsystem         Group            PID          Status
 clstrmgrES       cluster          123456       active
If the status shows “inoperative,” check /tmp/hacmp.out (on recent releases this log is usually found under /var/hacmp/log instead) and /var/hacmp/log/clstrmgr.debug for startup errors.
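For scripted checks, the status column of the `lssrc -s` output can be extracted rather than read by eye. This is a minimal sketch that assumes the column layout shown in the expected output for this step; `src_status` is a hypothetical helper name.

```shell
# src_status: read `lssrc -s <subsystem>` output on stdin and print the last
# column of the line for the named subsystem (e.g. active, inoperative).
# Prints "missing" if the subsystem does not appear in the output at all.
src_status() {
    awk -v name="$1" '$1 == name { print $NF; found = 1 }
                      END { if (!found) print "missing" }'
}

# Usage on a cluster node (not executed here):
#   lssrc -s clstrmgrES | src_status clstrmgrES
```

Taking the last field rather than the fourth matters because the PID column is blank when a subsystem is inoperative.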
Step 2: Verify Network and Hostname Resolution
PowerHA depends on stable communication between cluster nodes.
Check interface status:
# netstat -in
# lsdev -Cc if
Verify that all cluster nodes resolve correctly:
# host note1
# host <other_node>
If any hostnames fail, fix /etc/hosts or DNS before continuing.
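With more than a couple of nodes, the resolution check is easier to run as a loop. The sketch below assumes the lookup command returns a non-zero exit status on failure; `unresolved_nodes` and the `RESOLVER` variable are names introduced here for illustration.

```shell
# RESOLVER is the lookup command; `host` matches the commands in this step.
RESOLVER=${RESOLVER:-host}

# unresolved_nodes: print every node name given as an argument that the
# resolver fails on. The exit status is non-zero if any lookup failed.
unresolved_nodes() {
    rc=0
    for node in "$@"; do
        if ! $RESOLVER "$node" >/dev/null 2>&1; then
            echo "$node"
            rc=1
        fi
    done
    return $rc
}

# Usage (not executed here), with your own node names:
#   unresolved_nodes note1 note2 || echo "fix /etc/hosts or DNS first"
```

Run it on every node in the cluster, since each node resolves names against its own /etc/hosts and resolver settings.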
Step 3: Validate ODM (Object Repository) Integrity
Check that the cluster repository files exist and are consistent:
# ls -l /etc/es/objrepos/HACMP*
If they are missing, empty, or corrupted, back up whatever is currently there and then restore them from a healthy node:
# scp root@healthy_node:/etc/es/objrepos/HACMP* /etc/es/objrepos/
Then verify:
# clmgr verify cluster
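Before overwriting anything under /etc/es/objrepos, it may help to list the obviously broken files and keep an archive of the current state. The following is a sketch only: it detects zero-length files, not every form of corruption, and the helper names `empty_odm_files` and `backup_odm` are hypothetical.

```shell
# empty_odm_files: list zero-length HACMP* files in the given directory
# (defaults to the repository path used in this step).
empty_odm_files() {
    dir=${1:-/etc/es/objrepos}
    find "$dir" -name 'HACMP*' -type f -size 0 -print
}

# backup_odm: archive the HACMP* files from a directory into a tar file.
# The second argument must be an absolute path, because the archive is
# created after changing into the repository directory.
backup_odm() {
    dir=${1:-/etc/es/objrepos}
    out=$2
    ( cd "$dir" && tar -cf "$out" HACMP* )
}

# Usage on the affected node (not executed here; the archive path is an example):
#   empty_odm_files
#   backup_odm /etc/es/objrepos /tmp/odm_before_restore.tar
```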
Step 4: Restart PowerHA Cluster Services
Once network and configuration are healthy, restart PowerHA services on the affected node:
# stopsrc -s clstrmgrES
# sleep 5
# startsrc -s clstrmgrES
Monitor startup logs:
# tail -f /tmp/hacmp.out
# tail -f /var/hacmp/log/clstrmgr.debug
You should see messages showing the node transitioning from ST_INIT → UP.
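Instead of watching the logs by hand, a script can poll until the node reports the state you are waiting for. This is a generic sketch: `wait_for_state` is a hypothetical helper, and the command that prints the current state is supplied by the caller, since the exact state name to wait for depends on your release.

```shell
# wait_for_state WANT TRIES INTERVAL CMD [ARGS...]
# Run CMD every INTERVAL seconds until it prints WANT, giving up after
# TRIES attempts. Returns 0 on success and 1 on timeout.
wait_for_state() {
    want=$1; tries=$2; interval=$3
    shift 3
    while [ "$tries" -gt 0 ]; do
        [ "$("$@")" = "$want" ] && return 0
        tries=$((tries - 1))
        sleep "$interval"
    done
    return 1
}

# Usage (not executed here). current_state stands for a function of your own
# that prints the node's state, and the target state name is an assumption:
#   wait_for_state ST_STABLE 30 10 current_state || echo "still not up"
```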
Step 5: Synchronize Cluster Configuration
Once all nodes are up and communicating:
# clmgr sync cluster <cluster_name>
If successful, you’ll see:
Synchronization completed successfully.
Step 6: Confirm Cluster Health
Finally, verify cluster and node status:
# clmgr query cluster
# clmgr status cluster
Expected healthy output:
Cluster: mycluster
State: UP
Node note1: UP
Node note2: UP
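If this check is part of a script, the per-node lines can be filtered so that only problem nodes are reported. The sketch assumes status text in the `Node <name>: <state>` layout shown in the expected output above; `nodes_not_up` is a hypothetical helper name.

```shell
# nodes_not_up: read status text on stdin and print the name of every
# "Node <name>: <state>" line whose state is not UP.
nodes_not_up() {
    awk '$1 == "Node" { name = $2; sub(/:$/, "", name)
                        if ($3 != "UP") print name }'
}

# Usage (not executed here): save the status output to a file, then
#   nodes_not_up < status.txt
# Empty output means every listed node reported UP.
```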