
RHEL 7, 8, 9, 10 – Storage Issues

Storage issues in Red Hat Enterprise Linux (RHEL) are among the most critical problems administrators face. They can cause boot failures, application downtime, data corruption, or performance degradation.

This guide provides a structured troubleshooting approach that works consistently across RHEL 7 through RHEL 10.

1. Identify the Storage Problem
Start by understanding what type of storage issue you are facing.
Symptom                    Likely Cause
Filesystem full          → Disk usage or log growth
Mount fails at boot      → /etc/fstab error
Disk not detected        → Hardware or driver issue
LVM volumes missing      → VG/LV not activated
Read-only filesystem     → Filesystem corruption
Slow I/O                 → Disk or SAN performance
iSCSI/NFS not mounting   → Network or auth issue

2. Check Disk Detection and Hardware Status
List Block Devices

# lsblk
Check Disk Details
# blkid
# fdisk -l
Check Kernel Disk Messages
# dmesg | grep -i sd
If disks are missing, verify:
  • SAN mapping
  • VM disk attachment
  • Hardware health
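If a disk was mapped or attached while the system was running, a SCSI bus rescan often makes it visible without a reboot. A minimal sketch, assuming the standard sysfs interface: writing the wildcard string "- - -" (channel, target, LUN) to a host's scan file triggers the rescan.

```shell
#!/bin/sh
# Build the sysfs scan path for a given SCSI host number.
scan_path() {
    printf '/sys/class/scsi_host/host%s/scan' "$1"
}

# Rescan every SCSI host on the system (requires root).
rescan_all() {
    for host in /sys/class/scsi_host/host*; do
        [ -e "$host/scan" ] && echo "- - -" > "$host/scan"
    done
}

scan_path 0    # prints the path the wildcard scan string is written to
# rescan_all   # uncomment to trigger the rescan as root
```

After the rescan, re-run lsblk to confirm the new device appears.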
3. Filesystem Full Issues
Check Disk Usage

# df -h
Find Large Files
# du -sh /* 2>/dev/null
Clear Logs Safely
# journalctl --vacuum-time=7d
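The checks above can be combined into a small watchdog that flags any filesystem over a threshold. A sketch, assuming an illustrative 90% limit; wire it into cron or a monitoring agent as needed.

```shell
#!/bin/sh
THRESHOLD=90   # warn at this usage percentage (illustrative value)

# Decide whether a usage percentage breaches the threshold.
check_usage() {   # args: mountpoint used_percent
    if [ "$2" -ge "$THRESHOLD" ]; then
        echo "WARN: $1 at ${2}% (>= ${THRESHOLD}%)"
    else
        echo "OK: $1 at ${2}%"
    fi
}

# Feed it real numbers from df: drop the header and the '%' sign.
df -P | awk 'NR > 1 {gsub(/%/, "", $5); print $6, $5}' |
while read -r mount used; do
    check_usage "$mount" "$used"
done
```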

4. Read-Only Filesystem Issues
This usually indicates filesystem corruption.
Verify Mount Status
# mount | grep ro
Remount (Temporary)
# mount -o remount,rw /
Permanent Fix
Boot into rescue mode
Run:
# fsck -y /dev/mapper/rhel-root
Never run fsck on mounted filesystems.

5. Fix /etc/fstab Mount Failures
Incorrect entries cause boot into emergency mode.
Check fstab
# vi /etc/fstab
Verify UUIDs
# blkid
Test fstab
# mount -a
Comment out invalid entries if necessary.
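Before rebooting, /etc/fstab can also be sanity-checked mechanically. Valid entries have four to six whitespace-separated fields (the dump and pass columns default to 0 when omitted). A small sketch; point it at a copy of the file first if you prefer.

```shell
#!/bin/sh
# Flag fstab lines whose field count is outside the valid 4-6 range
# (device, mountpoint, fstype, options required; dump/pass optional).
check_fstab() {   # arg: path to an fstab file
    awk 'NF > 0 && $1 !~ /^#/ && (NF < 4 || NF > 6) {
             printf "line %d looks malformed (%d fields): %s\n", NR, NF, $0
             bad = 1
         }
         END { exit bad }' "$1"
}

if [ -r /etc/fstab ]; then
    check_fstab /etc/fstab && echo "fstab field counts look sane"
fi
```

This only checks field counts; `mount -a` remains the real test of the entries themselves.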

6. LVM Issues (Most Common in RHEL)
Check LVM Status

# pvs
# vgs
# lvs
Activate Volume Groups
# vgchange -ay
Scan for Missing Volumes
# pvscan
# vgscan
# lvscan

7. Extend LVM Filesystem (Low Space)
Extend Logical Volume

# lvextend -L +10G /dev/rhel/root
Resize Filesystem
# xfs_growfs /
# resize2fs /dev/rhel/root
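Note that xfs_growfs takes the mount point while resize2fs takes the device, and `lvextend -r` (--resizefs) can extend the LV and grow the filesystem in one step. A small helper sketching the tool choice; the paths are the examples used above.

```shell
#!/bin/sh
# Return the filesystem-grow command for a given fstype.
# xfs_growfs operates on the mount point; resize2fs on the device.
grow_cmd() {   # args: fstype mountpoint device
    case "$1" in
        xfs)            echo "xfs_growfs $2" ;;
        ext2|ext3|ext4) echo "resize2fs $3" ;;
        *)              echo "unsupported filesystem: $1" >&2; return 1 ;;
    esac
}

grow_cmd xfs / /dev/rhel/root     # → xfs_growfs /
grow_cmd ext4 / /dev/rhel/root    # → resize2fs /dev/rhel/root
```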

8. Recover Missing or Corrupt LVM
Rebuild LVM Metadata

# vgcfgrestore vg_name
List backups:
# ls /etc/lvm/archive/

9. Boot Fails Due to Storage Issues
Check initramfs

# lsinitrd
Rebuild initramfs
# dracut -f
Verify Root Device
# blkid

10. NFS Storage Issues
Check Mount Status

# mount | grep nfs
Test Connectivity
# showmount -e server_ip
Restart Services
# systemctl restart nfs-client.target

11. iSCSI Storage Issues
Check iSCSI Sessions

# iscsiadm -m session
Discover Targets
# iscsiadm -m discovery -t sendtargets -p target_ip
Login to Target
# iscsiadm -m node -l

12. Multipath Issues (SAN Storage)
Check Multipath Status
# multipath -ll
Restart Multipath
# systemctl restart multipathd

13. Storage Performance Issues
Check Disk I/O

# iostat -xm 5
Identify Slow Processes
# iotop

14. SELinux Storage-Related Issues
SELinux may block access to mounted volumes.
Check Denials
# ausearch -m avc -ts recent
Fix Context
# restorecon -Rv /mount_point

15. Backup and Data Safety (Before Fixes)
Always verify backups before major storage changes.

# rsync -av /data /backup

16. Best Practices to Prevent Storage Issues
  • Monitor disk usage proactively
  • Validate /etc/fstab changes
  • Use LVM snapshots
  • Keep rescue media available
  • Monitor SAN/NAS health
  • Perform regular filesystem checks
Conclusion
Storage troubleshooting in RHEL 7, 8, 9, and 10 follows consistent principles:
  • Verify hardware and detection
  • Fix filesystem and LVM issues
  • Validate mounts and network storage
  • Monitor performance and prevent recurrence
Using this step-by-step approach ensures data integrity, stability, and minimal downtime in enterprise Linux environments.

RHEL 7, 8, 9, 10 – Network Issues

Network issues in Red Hat Enterprise Linux (RHEL) can cause service outages, application failures, storage disconnects, and cluster instability.

This guide provides a systematic, version-aware approach to diagnosing and fixing network problems across RHEL 7 through RHEL 10.

1. Identify the Network Problem
Start by identifying what exactly is failing.
Symptom                      Possible Cause
No network connectivity    → Interface down, cable, driver
Cannot reach gateway       → Routing issue
DNS not resolving          → DNS configuration
Network slow               → Duplex / MTU / congestion
Interface missing          → Driver or udev issue
Network fails after reboot → NetworkManager config
Services unreachable       → Firewall or SELinux

2. Check Network Interface Status
List Interfaces
# ip link show
Check Interface IP
# ip addr show
Bring Interface Up
# ip link set eth0 up

3. Verify Network Services (RHEL Differences)
RHEL Version    Network Service
RHEL 7        → NetworkManager / network
RHEL 8+       → NetworkManager only
Check NetworkManager
# systemctl status NetworkManager
Restart if needed:
# systemctl restart NetworkManager

4. Test Basic Connectivity
Test Loopback
# ping 127.0.0.1
Test Gateway
# ping <gateway-ip>
Test External IP
# ping 8.8.8.8
If IP works but hostname fails → DNS issue.
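The tests above form a ladder: loopback, gateway, external IP, then DNS. The first rung that fails points at the layer to investigate. A sketch that encodes this logic; the gateway is discovered from the routing table, with an unreachable TEST-NET address as fallback.

```shell
#!/bin/sh
# Map the first failing rung of the connectivity ladder to a diagnosis.
diagnose() {   # args: lo gw ext dns  (each "ok" or "fail")
    [ "$1" = ok ] || { echo "loopback failed: TCP/IP stack problem"; return; }
    [ "$2" = ok ] || { echo "gateway unreachable: local link or ARP issue"; return; }
    [ "$3" = ok ] || { echo "external IP unreachable: routing or upstream issue"; return; }
    [ "$4" = ok ] || { echo "name resolution failed: DNS issue"; return; }
    echo "all connectivity tests passed"
}

# Run the real probes (ping/getent must be available).
gw=$(ip route 2>/dev/null | awk '/^default/ {print $3; exit}')
r() { "$@" >/dev/null 2>&1 && echo ok || echo fail; }
diagnose "$(r ping -c1 -W1 127.0.0.1)" \
         "$(r ping -c1 -W1 "${gw:-192.0.2.1}")" \
         "$(r ping -c1 -W1 8.8.8.8)" \
         "$(r getent hosts google.com)"
```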

5. Check Routing Table
# ip route show
Ensure a default route exists:
default via <gateway> dev eth0
Add route (temporary):
# ip route add default via <gateway>

6. DNS Troubleshooting
Check DNS Configuration

# cat /etc/resolv.conf
Test DNS Resolution
# nslookup google.com
# dig google.com
NetworkManager DNS
# nmcli dev show | grep DNS

7. NetworkManager (nmcli) Troubleshooting
Show Connections

# nmcli connection show
Check Active Connection
# nmcli device status
Restart Connection
# nmcli connection down <conn-name>
# nmcli connection up <conn-name>

8. Fix Network Issues After Reboot
Check auto-connect:

# nmcli connection show <conn-name> | grep autoconnect
Enable:
# nmcli connection modify <conn-name> connection.autoconnect yes

9. Firewall Issues (firewalld)
Check Firewall Status

# firewall-cmd --state
List Rules
# firewall-cmd --list-all
Allow Service or Port
# firewall-cmd --add-service=ssh --permanent
# firewall-cmd --add-port=8080/tcp --permanent
# firewall-cmd --reload
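When scripting firewall changes, it helps to check whether a port is already open before adding a duplicate rule. A minimal sketch that parses `firewall-cmd --list-ports` output; the helper name is illustrative.

```shell
#!/bin/sh
# Succeed if PORT/tcp appears in a `firewall-cmd --list-ports` output string.
port_open() {   # args: list_ports_output port
    case " $1 " in
        *" $2/tcp "*) return 0 ;;
        *)            return 1 ;;
    esac
}

# Usage against the live firewall (requires firewalld running):
#   ports=$(firewall-cmd --list-ports)
#   port_open "$ports" 8080 || {
#       firewall-cmd --add-port=8080/tcp --permanent
#       firewall-cmd --reload
#   }
port_open "22/tcp 8080/tcp" 8080 && echo "8080/tcp present in sample output"
```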

10. SELinux Network-Related Issues
SELinux can block network connections.
Check SELinux Status
# getenforce
Identify Denials
# ausearch -m avc -ts recent
Enable Required Boolean
# setsebool -P httpd_can_network_connect on

11. Interface Missing or Renamed
List NICs

# lspci | grep -i ethernet
Check Drivers
# lsmod | grep <driver>
Predictable Interface Names
# ip link
Example: ens192 instead of eth0

12. MTU and Performance Issues
Check MTU

# ip link show eth0
Set MTU (Temporary)
# ip link set dev eth0 mtu 9000
Make Permanent
# nmcli connection modify <conn-name> 802-3-ethernet.mtu 9000
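As a sanity check when tuning MTU: the usable TCP payload per frame is the MTU minus 40 bytes of IPv4 and TCP headers (20 bytes each, without options). A quick illustration of the arithmetic:

```shell
#!/bin/sh
# TCP payload bytes per frame = MTU - 20 (IPv4 header) - 20 (TCP header).
tcp_payload() {
    echo $(( $1 - 40 ))
}

tcp_payload 1500   # → 1460 (standard Ethernet)
tcp_payload 9000   # → 8960 (jumbo frames)
```

Remember that jumbo frames only help if every device on the path supports the larger MTU.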

13. Bonding / Teaming Issues
Check Bond Status

# cat /proc/net/bonding/bond0
Restart Bond
# nmcli connection down bond0
# nmcli connection up bond0

14. Network Logs and Debugging
Kernel Messages
# dmesg | grep -i network
NetworkManager Logs
# journalctl -u NetworkManager

15. Network Storage Impact (NFS / iSCSI)
Network failures may affect storage mounts.
# showmount -e server_ip
# iscsiadm -m session

16. Best Practices to Prevent Network Issues
  • Use NetworkManager consistently
  • Validate firewall rules
  • Document static IP settings
  • Monitor network latency
  • Test changes before reboot
  • Keep NIC drivers updated
Conclusion
Network troubleshooting in RHEL 7, 8, 9, and 10 follows the same fundamentals:
  • Verify interfaces and IPs
  • Check routing and DNS
  • Validate NetworkManager
  • Review firewall and SELinux
Using this step-by-step approach ensures quick resolution and stable connectivity in enterprise Linux environments.

Splunk Server and Forwarder Installation

In any enterprise environment, log collection and analysis are critical for security monitoring, performance troubleshooting, and threat detection. Splunk is a market-leading platform that helps organizations collect, index, and visualize machine data.

However, manually installing Splunk Enterprise Server and configuring forwarders on several client machines can be time-consuming. In this blog post, we will walk through the entire process end to end.

Understanding the Components

Splunk Enterprise Server
This is the main Splunk system that stores, indexes, and searches all logs. It provides:
  • Web UI
  • Indexing database
  • Search head
  • User management
  • Dashboard visualization
Splunk Universal Forwarder
This is a lightweight agent installed on client machines. It:
  • Sends logs to the Splunk server
  • Runs silently as a background service
  • Consumes minimal CPU & memory
Prerequisites:

Server Requirements

  • OS: Linux (Ubuntu/RHEL/CentOS/Amazon Linux)
  • 4+ GB RAM
  • 20+ GB Disk
  • Port 8000 (Web), 8089 (mgmt), 9997 (data input) open
Client Requirements
  • Linux-based client machines
  • sudo access
  • Network reachability to server port: 9997
Download Splunk Installer Links
Component Download URL
Splunk Enterprise https://www.splunk.com/en_us/download/splunk-enterprise.html
Splunk Forwarder https://www.splunk.com/en_us/download/universal-forwarder.html

SPLUNK ENTERPRISE SERVER INSTALLATION 

Step 1: Update system

# dnf update -y

Step 2: Create Splunk OS User

Splunk should never run as root.
# useradd -m splunk
Verify:
# id splunk
uid=1001(splunk) gid=1001(splunk) groups=1001(splunk)

Step 3: Download Splunk Enterprise
Download Splunk from the official Splunk website and copy it to your server (example path used below):
/root/splunk-9.0.2-17e00c557dc1-linux-2.6-x86_64.rpm

Step 4: Install Splunk Enterprise
Install the RPM package:
# rpm -ivh splunk-9.0.2-17e00c557dc1-linux-2.6-x86_64.rpm
By default, Splunk installs to:
# ls -ld /opt/splunk/
drwx------ 12 splunk splunk 4096 Dec 20 03:04 /opt/splunk/
[root@inddcpspn01 ~]#

Step 5: Set Correct Ownership
Give ownership of Splunk files to the splunk user:
# chown -R splunk:splunk /opt/splunk

Step 6: First Start of Splunk (Create Admin User)
This step is critical
The admin user is created only on the first successful start.
Run the following command as the splunk user:
# sudo -u splunk /opt/splunk/bin/splunk start
Press q to exit the license pager, then answer the prompts:
Do you agree with this license? [y/n]: y
Please enter an administrator username: admin
Please enter a new password: Welcome@123
Please confirm new password: Welcome@123

Step 7: Verify Admin User Creation
Check the password file:
# ls -l /opt/splunk/etc/passwd
# cat /opt/splunk/etc/passwd
You should see:
:admin:$6$5SYFmoISyswPtUPt$AXKb2n0RD7mL8UAz1wyZkgTdHkHWFIes/9DMz.4gw3.xnVyLyxpzj1mADGt8HTVJ.ky7f8tay1.bg.7osl7ci1::Administrator:admin:changeme@example.com:::20441
If this file exists, the admin user is created successfully.

Step 8: Enable Splunk at Boot
Install chkconfig first if the system reports it missing (for example, on RHEL/CentOS 9):
# dnf install chkconfig
$ sudo -u splunk /opt/splunk/bin/splunk stop
$ /opt/splunk/bin/splunk enable boot-start -user splunk
Init script installed at /etc/init.d/splunk.
Init script is configured to run at boot.
$ sudo -u splunk /opt/splunk/bin/splunk start

Step 9: Start / Stop Splunk
Start Splunk
$ sudo -u splunk /opt/splunk/bin/splunk start
Stop Splunk
$ sudo -u splunk /opt/splunk/bin/splunk stop
Check Status
$ sudo -u splunk /opt/splunk/bin/splunk status

Step 10: Access Splunk Web UI
Open a browser and go to:
http://<server-ip>:8000 or http://<Server FQDN>:8000

Login with:
Username: admin
Password: Welcome@123

Step 11: (Optional) Firewall Configuration

Allow Splunk Web port:
# firewall-cmd --add-port=8000/tcp --permanent
# firewall-cmd --reload

Common Issues & Fixes
Admin password not working? Typical causes:
  • Splunk was started once before the admin password was seeded
  • /opt/splunk/etc/passwd was never created
  • Splunk was started as the wrong user
Fix: stop Splunk, remove the stale /opt/splunk/etc/passwd, and start Splunk again with --seed-passwd (or re-answer the first-start prompts).

ENABLE SPLUNK DATA INPUT (TCP 9997)
Log into Splunk Web UI:
URL:
http://<server-ip>:8000 or http://<Server FQDN>:8000
Then:
Enable Receiving Port
Go to Settings → Forwarding and Receiving
Click Configure Receiving
Click New Receiving Port
Enter:
Port: 9997

Save
Verify Receiving Port
# netstat -tulnp | grep 9997
tcp        0      0 0.0.0.0:9997            0.0.0.0:*               LISTEN      42402/splunkd
# ss -tulnp | grep 9997
tcp   LISTEN 0      128          0.0.0.0:9997      0.0.0.0:*    users:(("splunkd",pid=42402,fd=197))
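The listening check can be scripted for monitoring. A sketch that parses `ss -tln` output (the local address:port is the fourth column); the helper names are illustrative.

```shell
#!/bin/sh
# Parse `ss -tln`-style output on stdin; succeed if the port is listening.
match_port() {   # arg: port
    awk -v p=":$1" '$4 ~ p"$" { found = 1 } END { exit !found }'
}

# Wrapper against the live socket table.
listening() { ss -tln 2>/dev/null | match_port "$1"; }

if listening 9997; then
    echo "splunkd receiving port 9997 is listening"
else
    echo "nothing on 9997 - recheck Settings > Forwarding and Receiving"
fi
```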

SPLUNK FORWARDER INSTALLATION ON CLIENT 

Step 1: Create Splunk User on Client
Splunk services should not run as root.
# useradd -m splunk
Verify:
# id splunk
uid=1001(splunk) gid=1001(splunk) groups=1001(splunk)

Step 2: Download Splunk Universal Forwarder
Download the Universal Forwarder package from the Splunk website and copy it to the client server.
Example RPM file:
splunkforwarder-9.0.2-17e00c557dc1-linux-2.6-x86_64.rpm

Step 3: Install Splunk Universal Forwarder
Install the RPM:
# rpm -ivh splunkforwarder-9.0.2-17e00c557dc1-linux-2.6-x86_64.rpm
Default installation path:
# ls -ld /opt/splunkforwarder
drwxr-xr-x 9 splunk splunk 4096 Dec 19 22:02 /opt/splunkforwarder

Step 4: Set Correct Ownership
# chown -R splunk:splunk /opt/splunkforwarder

Step 5: First Start of Splunk Forwarder
Start the Universal Forwarder for the first time:
$ sudo -u splunk /opt/splunkforwarder/bin/splunk start 
Press q to exit the license pager, then answer the prompts:
Do you agree with this license? [y/n]: y
Please enter an administrator username: admin
Please enter a new password: Welcome@123
Please confirm new password: Welcome@123

Step 6: Enable Forwarder to Start at Boot
Install chkconfig using dnf or yum if it is missing:
# dnf install chkconfig
$ sudo -u splunk /opt/splunkforwarder/bin/splunk stop
# /opt/splunkforwarder/bin/splunk enable boot-start -user splunk
Systemd unit file installed by user at /etc/systemd/system/SplunkForwarder.service.
Configured as systemd managed service.
$ sudo -u splunk /opt/splunkforwarder/bin/splunk start

Step 7: Configure Forwarder to Send Data to Indexer
Add Splunk Indexer as Receiving Destination
$ sudo -u splunk /opt/splunkforwarder/bin/splunk add forward-server 192.168.10.109:9997
Warning: Attempting to revert the SPLUNK_HOME ownership
Warning: Executing "chown -R splunk /opt/splunkforwarder"
egrep: warning: egrep is obsolescent; using grep -E
egrep: warning: egrep is obsolescent; using grep -E
WARNING: Server Certificate Hostname Validation is disabled. Please see server.conf/[sslConfig]/cliVerifyServerName for details.
Splunk username: admin
Password:
Added forwarding to: 192.168.10.109:9997.

Verify:
$ sudo -u splunk /opt/splunkforwarder/bin/splunk list forward-server
Warning: Attempting to revert the SPLUNK_HOME ownership
Warning: Executing "chown -R splunk /opt/splunkforwarder"
egrep: warning: egrep is obsolescent; using grep -E
egrep: warning: egrep is obsolescent; using grep -E
WARNING: Server Certificate Hostname Validation is disabled. Please see server.conf/[sslConfig]/cliVerifyServerName for details.
Active forwards:
        192.168.10.109:9997
Configured but inactive forwards:
        None

Step 8: Add Log Files to Monitor
Example: Monitor Linux system logs
$ sudo -u splunk /opt/splunkforwarder/bin/splunk add monitor /var/log/messages
Warning: Attempting to revert the SPLUNK_HOME ownership
Warning: Executing "chown -R splunk /opt/splunkforwarder"
egrep: warning: egrep is obsolescent; using grep -E
egrep: warning: egrep is obsolescent; using grep -E
WARNING: Server Certificate Hostname Validation is disabled. Please see server.conf/[sslConfig]/cliVerifyServerName for details.
Added monitor of '/var/log/messages'.

For Ubuntu:
$ sudo -u splunk /opt/splunkforwarder/bin/splunk add monitor /var/log/syslog

Step 9: Restart Splunk Forwarder
$ sudo -u splunk /opt/splunkforwarder/bin/splunk restart

Step 10: Verify Data on Splunk Server
On the Splunk Enterprise server:
Login to Splunk Web : http://<server-ip>:8000 or http://<Server FQDN>:8000
Go to Search & Reporting
Run:
index=_internal | stats count by host
You should see the client hostname.

Firewall Configuration (Optional)
Allow outgoing traffic to indexer:
# firewall-cmd --add-port=9997/tcp --permanent
# firewall-cmd --reload

Common Issues & Troubleshooting
Forwarder not sending data? Check for:
  • Indexer receiving port 9997 not enabled
  • Firewall blocking traffic
  • Incorrect indexer IP in the forward-server configuration
Check forwarder status
$ sudo -u splunk /opt/splunkforwarder/bin/splunk status
Warning: Attempting to revert the SPLUNK_HOME ownership
Warning: Executing "chown -R splunk /opt/splunkforwarder"
egrep: warning: egrep is obsolescent; using grep -E
splunkd is running (PID: 1768).
splunk helpers are running (PIDs: 1794).
egrep: warning: egrep is obsolescent; using grep -E

Check logs
$ tail -f /opt/splunkforwarder/var/log/splunk/splunkd.log

Conclusion
You have successfully installed and configured the Splunk Enterprise Server and the Splunk Universal Forwarder on the Splunk server and Splunk client machine. The Splunk client is now actively forwarding log data to the Splunk Enterprise server, enabling centralized log collection, monitoring, and analysis across the environment.

This setup provides better visibility into system activity, faster troubleshooting, and a scalable foundation for enterprise-level monitoring and observability.

RHEL 7, 8, 9, 10 – Security Issues

Security issues in Red Hat Enterprise Linux (RHEL) can surface as login failures, service denials, SELinux blocks, firewall problems, authentication errors, or compliance violations.

This guide provides a structured troubleshooting methodology applicable to RHEL 7 through RHEL 10.

1. Identify the Type of Security Issue
Before making changes, determine what is being blocked.
User cannot log in → PAM / SSH / SELinux
Service not accessible → Firewall / SELinux
Permission denied → SELinux / file context
SSH connection refused → sshd / firewall
Application fails after reboot → SELinux labeling
Compliance scan failures → OpenSCAP / crypto policy

2. Check System Logs First (Golden Rule)
Authentication and Security Logs

# tail -f /var/log/secure
systemd Journal (All Versions)
# journalctl -xe
# journalctl -u sshd

3. SELinux Troubleshooting (Most Common Issue)
SELinux is enabled by default in all RHEL versions.
Check SELinux Status
# getenforce
# sestatus
Identify SELinux Denials
# ausearch -m avc -ts recent
Or:
# journalctl | grep AVC
Interpret SELinux Alerts
# sealert -a /var/log/audit/audit.log
Fix SELinux Issues (Recommended Approach)
Restore File Contexts
# restorecon -Rv /path
Enable Required Booleans
# getsebool -a | grep httpd
# setsebool -P httpd_can_network_connect on
Temporary Disable (For Testing Only)
# setenforce 0
Permanent disable (NOT recommended):
# vi /etc/selinux/config

4. Firewall Issues (firewalld)
Check Firewall Status

# systemctl status firewalld
# firewall-cmd --state
List Active Rules
# firewall-cmd --list-all
Allow a Service or Port
# firewall-cmd --add-service=http --permanent
# firewall-cmd --add-port=8080/tcp --permanent
# firewall-cmd --reload
Verify Zones
# firewall-cmd --get-active-zones

5. SSH Security Issues
Check SSH Service

# systemctl status sshd
Verify SSH Configuration
# sshd -t
# vi /etc/ssh/sshd_config
Common issues:
  • PermitRootLogin no
  • PasswordAuthentication no
  • Wrong SSH port
Restart SSH Safely
# sshd -t && systemctl restart sshd

6. User Authentication & PAM Issues
Verify User Account

# id username
# passwd -S username
Check Account Lockout
# faillog -u username
# pam_tally2 --user username   # RHEL 7
# faillock --user username     # RHEL 8+
Reset Failed Login Count
# faillock --user username --reset
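On a mixed estate, scripts need to pick the right lockout tool per release: pam_tally2 on RHEL 7, faillock on RHEL 8 and later. A sketch keyed off VERSION_ID in /etc/os-release; the helper name and username are illustrative.

```shell
#!/bin/sh
# Pick the account-lockout reset command for a given RHEL major version.
reset_cmd() {   # args: rhel_major username
    if [ "$1" -le 7 ]; then
        echo "pam_tally2 --user $2 --reset"
    else
        echo "faillock --user $2 --reset"
    fi
}

# Derive the major version from /etc/os-release (defaults to 8 if unknown).
major=$(. /etc/os-release 2>/dev/null; echo "${VERSION_ID%%.*}")
reset_cmd "${major:-8}" username
```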

7. File and Directory Permission Issues
Check Ownership

# ls -ld /path
Fix Permissions
# chmod 755 /path
# chown user:group /path
Permissions alone may not fix SELinux issues.

8. sudo Issues
Check sudo Access

# sudo -l
Validate sudoers File
# visudo
Check:
username ALL=(ALL) ALL

9. Security Updates and Patch Issues
Check Installed Security Updates
# yum updateinfo list security # RHEL 7
# dnf updateinfo list security # RHEL 8+
Apply Security Updates
# yum update --security
# dnf update --security

10. OpenSCAP & Compliance Failures
Scan System

# oscap xccdf eval --profile standard --results scan.xml /usr/share/xml/scap/ssg/content/ssg-rhel*.xml
Common Compliance Failures
  • Password complexity
  • SSH hardening
  • File permissions
  • Crypto policies
11. Crypto Policy Issues (RHEL 8+)
Check Current Policy
# update-crypto-policies --show
Set Default Policy
# update-crypto-policies --set DEFAULT

12. Auditd Issues
Check Audit Service

# systemctl status auditd
Search Audit Logs
# ausearch -k ssh

13. Container Security Issues (RHEL 8+)
SELinux + Containers

# podman inspect container_name | grep SELinux
Fix volume labels by appending :Z (private label) or :z (shared label) to the mount:
# podman run -v /host/data:/data:Z container_image

14. Kernel & Security Module Issues
Check Loaded Modules

# lsmod
Rebuild SELinux Labels
# touch /.autorelabel
# reboot

15. Best Practices to Prevent Security Issues
  • Keep SELinux enabled
  • Monitor /var/log/secure
  • Apply security patches regularly
  • Use firewalld zones properly
  • Test changes in non-production
  • Enable audit logging
Conclusion
Security troubleshooting in RHEL 7, 8, 9, and 10 follows a consistent methodology:
  • Identify blocked access
  • Review logs
  • Check SELinux and firewall
  • Validate authentication and permissions
  • Apply fixes systematically
Following these steps ensures secure, compliant, and stable systems in enterprise environments.

RHEL 7, 8, 9, 10 – Bootloader Issues

GRUB (Grand Unified Bootloader) problems are among the most common causes of Linux systems failing to boot. Across RHEL 7, 8, 9, and 10, the bootloader stack remains GRUB2 + systemd, with differences mainly in BIOS vs UEFI handling.

This guide provides a version-aware, step-by-step approach to diagnosing and fixing GRUB issues in all supported RHEL versions.

1. Common GRUB Issues in RHEL
Symptom                    Likely Cause
No GRUB menu             → Missing or corrupted GRUB
grub> prompt             → GRUB config missing
grub rescue> prompt      → Core GRUB files missing
Boot loops               → Wrong kernel or root device
Kernel not found         → Incorrect grub.cfg
System boots to rescue   → Wrong kernel parameters

2. Understand GRUB Differences by RHEL Version
RHEL Version    Firmware               GRUB Location
RHEL 7          BIOS / UEFI            /boot/grub2/grub.cfg
RHEL 8          UEFI default           /boot/efi/EFI/redhat/grub.cfg
RHEL 9          UEFI only (mostly)     /boot/efi/EFI/redhat/grub.cfg
RHEL 10         UEFI only (expected)   /boot/efi/EFI/redhat/grub.cfg
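The firmware type can be detected at runtime: /sys/firmware/efi exists only when the system booted via UEFI, which also determines which grub.cfg to regenerate. A sketch with the directory parameterized so the logic is testable.

```shell
#!/bin/sh
# Derive the grub.cfg path from firmware type (UEFI if the efi dir exists).
grub_cfg_path() {   # arg: firmware sysfs dir (normally /sys/firmware/efi)
    if [ -d "$1" ]; then
        echo /boot/efi/EFI/redhat/grub.cfg
    else
        echo /boot/grub2/grub.cfg
    fi
}

cfg=$(grub_cfg_path /sys/firmware/efi)
echo "regenerate with: grub2-mkconfig -o $cfg"
```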

3. Access the GRUB Menu

Reboot the system
Press Esc or Shift
If GRUB appears, select Advanced options
Try booting an older kernel
If this works, the issue is likely a broken kernel or config, not GRUB itself.

4. Fix Temporary GRUB Issues (Edit Boot Parameters)
Highlight the kernel
Press e
Find the line starting with linux
Common Debug Parameters
rd.break
systemd.unit=rescue.target
systemd.unit=emergency.target
selinux=0
nomodeset
Press Ctrl + X to boot.

5. Fix “grub>” or “grub rescue>” Prompt
Identify Boot and Root Partitions
ls
ls (hd0,gpt1)/
Set Correct Root
set root=(hd0,gpt1)
set prefix=(hd0,gpt1)/boot/grub2
insmod normal
normal
If GRUB loads, reinstall it permanently (see Section 8).

6. Boot Using RHEL Installation ISO (Rescue Mode)
This is the most reliable recovery method.
Boot from RHEL 7/8/9 ISO
Select:
Troubleshooting → Rescue a Red Hat Enterprise Linux system
Mount the system automatically
Enter shell

7. Chroot into Installed System
# chroot /mnt/sysimage
From here, all fixes apply directly to your installed OS.

8. Reinstall GRUB (Correct Method by Version)
RHEL 7 – BIOS Systems
# grub2-install /dev/sda
# grub2-mkconfig -o /boot/grub2/grub.cfg

RHEL 7 – UEFI Systems
# yum reinstall grub2-efi shim
# grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg

RHEL 8, 9, 10 – UEFI Systems
# dnf reinstall grub2-efi shim
# grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg

Verify EFI entries:
# efibootmgr -v

9. Rebuild GRUB Configuration Only (If GRUB Exists)
Sometimes only grub.cfg is broken.
RHEL 7 (BIOS)

# grub2-mkconfig -o /boot/grub2/grub.cfg
RHEL 8/9/10 (UEFI)
# grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg

10. Fix Wrong Root or UUID in GRUB
Check actual UUIDs:
# blkid
Update GRUB config if root device changed:
# grub2-mkconfig -o /boot/efi/EFI/redhat/grub.cfg

11. Fix GRUB After Disk or Partition Changes
If disk order changed (sda → sdb):
Verify disks:

# lsblk
Reinstall GRUB to correct disk:
# grub2-install /dev/sda

12. Secure Boot Issues (RHEL 8+)
If system fails after enabling Secure Boot:
# dnf reinstall shim grub2-efi kernel
Ensure Secure Boot–signed kernels are installed.

13. SELinux and GRUB Boot Failures
Temporary Fix
Edit GRUB kernel line:

selinux=0
Permanent Fix
# touch /.autorelabel
# reboot

14. Best Practices to Avoid GRUB Issues
  • Keep multiple kernels installed
  • Avoid manual GRUB edits
  • Always regenerate grub.cfg after disk changes
  • Keep rescue ISO available
  • Use snapshots before kernel updates (VMs)
15. Quick GRUB Recovery Checklist
  • Boot older kernel
  • Rescue mode via ISO
  • Chroot into system
  • Reinstall GRUB
  • Regenerate grub.cfg
  • Verify EFI boot entries
Conclusion
GRUB issues across RHEL 7, 8, 9, and 10 follow the same recovery principles:
  • Identify firmware (BIOS vs UEFI)
  • Use rescue mode
  • Reinstall GRUB properly
  • Regenerate configuration files
Mastering these steps ensures fast recovery and minimal downtime in enterprise Linux environments.

Chef Installation

Chef is a powerful configuration management tool that helps automate infrastructure, manage configurations, and ensure consistency across environments. 

Chef Server: Central hub that stores cookbooks, policies, and node metadata
Chef Workstation: Used by admins to develop cookbooks and interact with the server using Knife
Chef Infra Client (Node): Target system managed by Chef
Chef Manage: Web-based UI for managing Chef Server

Download and Upload Chef Packages
Download the required RPM packages from Chef Downloads (https://www.chef.io/downloads) and upload them to the Chef Server using WinSCP or scp.
Packages used in this setup:
  • Chef Infra Server: `chef-server-core-14.9.23-1.el7.x86_64.rpm`
  • Chef Workstation: `chef-workstation-21.10.640-1.el7.x86_64.rpm`
  • Chef Manage: `chef-manage-2.5.4-1.el7.x86_64.rpm`
  • Chef Infra Client: `chef-17.6.18-1.el7.x86_64.rpm`
Install Chef Infra Server
Log in as root on the Chef Server and install the package:
# cd /tmp/chef
# dnf install chef-server-core-14.9.23-1.el7.x86_64.rpm -y
Configure the Chef Server:
# chef-server-ctl reconfigure
Chef License Acceptance
Before you can continue, 3 product licenses
must be accepted. View the license at
https://www.chef.io/end-user-license-agreement/
Licenses that need accepting:
  • Chef Infra Server
  • Chef Infra Client
  • Chef InSpec
Do you accept the 3 product licenses (yes/no)?
> yes
Check the status of Chef services:
# chef-server-ctl status

Create Chef Admin User
Create an administrator user:
# chef-server-ctl user-create admin System Admin sysadm@ppc.com 'Welcome@123' \
--filename /etc/opscode/admin.pem
Create the Organization:
# chef-server-ctl org-create chefmng 'chefmanager' --association_user admin --filename /etc/opscode/org-validator.pem
List existing organizations:
# chef-server-ctl org-list
Verify private keys:
# find /etc/opscode/ -name "*.pem"

Install Chef Manage on the Chef Server:
# cd /tmp/chef
# dnf install chef-manage-2.5.4-1.el7.x86_64.rpm -y
# chef-server-ctl reconfigure
# chef-manage-ctl reconfigure
Type 'yes' to accept the software license agreement, or anything else to cancel.
yes
Access the UI in your browser:
https://<chef-server-ip>

Login with user "admin" & password "Welcome@123"

Install Chef Workstation:
On the Chef Workstation machine:
# cd /tmp/chef
# dnf install chef-workstation-21.10.640-1.el7.x86_64.rpm -y
Verify installation:
# chef --version
# knife --version
Set Command Executable Path:
# vi ~/.bash_profile
export PATH=$PATH:/opt/opscode/bin

Generate a Chef repository:
# chef generate repo chef-repo
+---------------------------------------------+
            Chef License Acceptance
Before you can continue, 1 product license
must be accepted. View the license at
https://www.chef.io/end-user-license-agreement/
License that need accepting:
  * Chef Workstation
Do you accept the 1 product license (yes/no)?
> yes
Create a `.chef` directory for Knife configuration:
# mkdir ~/chef-repo/.chef
# cd ~/chef-repo

Step 7: Configure SSH Access
Generate SSH keys on the Chef Workstation:
# ssh-keygen -b 4096
Copy the public key to the Chef Server:
# ssh-copy-id root@192.168.10.108
Copy the `.pem` files from Chef Server to Workstation:
# scp root@192.168.10.108:/root/*.pem ~/chef-repo/.chef
Verify copied keys:
# ls ~/chef-repo/.chef

Configure Knife:
Create the Knife configuration file:
# vim ~/chef-repo/.chef/config.rb
Add the following content:
current_dir = File.dirname(__FILE__)
log_level                :info
log_location             STDOUT
node_name                "admin"
client_key               "#{current_dir}/admin.pem"
chef_server_url          "https://inddcpchf01.ppc.com/organizations/chefmng"
cookbook_path            ["#{current_dir}/../cookbooks"]

Fetch SSL certificates:
# knife ssl fetch
Verify connectivity:
# knife client list

Install Chef Infra Client
On the client node:
# cd /tmp/chef
# dnf install chef-17.6.18-1.el7.x86_64.rpm -y

Step 10: Bootstrap a Client Node
From the Chef Workstation:
# knife bootstrap <chef client IP Address> --ssh-user <user name> --ssh-password <password> --node-name <chef client node name>
Verify nodes:
# knife node list
# knife node show client-node
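When enrolling many nodes, the bootstrap commands can be generated from a simple host list. A sketch; the helper name and sample hosts are illustrative, and the password flag is omitted so knife can prompt (or SSH keys can be used).

```shell
#!/bin/sh
# Build a knife bootstrap command line for one node (values illustrative).
bootstrap_cmd() {   # args: ip ssh_user node_name
    echo "knife bootstrap $1 --ssh-user $2 --node-name $3"
}

# One line per node: ip ssh_user node_name
while read -r ip user node; do
    bootstrap_cmd "$ip" "$user" "$node"
done <<EOF
192.168.10.110 root web01
192.168.10.111 root web02
EOF
```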

Create Cookbook Directory
# mkdir -p ~/chef-repo/cookbooks/sample_nginx
# cd ~/chef-repo/cookbooks/sample_nginx

Generate Cookbook
# chef generate cookbook .

Edit Default Recipe
Edit `recipes/default.rb`:

package 'nginx' do
  action :install
end

service 'nginx' do
  action [:enable, :start]
end

file '/etc/nginx/sites-available/default' do
  content 'server { listen 80; server_name localhost; location / { root /var/www/html; index index.html; } }'
  notifies :restart, 'service[nginx]'
end

Upload the cookbook to Chef Server:
# knife cookbook upload sample_nginx
Bootstrap the node with the recipe:
# knife bootstrap <chef client IP Address> --ssh-user <user name> --ssh-password <password> --node-name <chef client node name>
Run Chef Client manually on the node:
# chef-client

Chef Resources:

package (Linux/Unix/Windows)
action ---> :install, :upgrade, :remove, :purge
version ---> Specify version
options ---> Extra CLI options for package manager
timeout ---> Wait time for install

Variables:
node['cookbook']['package_name'] ---> Package name (nginx, httpd, etc.)
node['cookbook']['package_version'] ---> Version to install

service (Linux/Unix/Windows)
action ---> :start, :stop, :restart, :reload, :enable, :disable
supports ---> Hash of supported actions (restart, reload, status)
subscribes ---> Trigger action on resource change
timeout ---> Wait time for service command

Variables:
node['cookbook']['service_name'] ---> Service name
node['cookbook']['service_action'] ---> Desired actions

template
source ---> Template file in cookbook (.erb)
path ---> Target path (override resource name)
owner ---> File owner
group ---> File group
mode ---> File permissions (0644)
variables ---> Hash of variables passed to template (@var)
action ---> :create, :create_if_missing, :delete
notifies ---> Trigger another resource on change
backup ---> Number of backups to keep

Variables:
node['cookbook']['doc_root'] ---> Document root (Linux)
node['cookbook']['iis_root'] ---> IIS root (Windows)
node['cookbook']['port'] ---> Port number
node['cookbook']['server_name'] ---> Server hostname

file
content ---> File content
owner ---> File owner
group ---> File group
mode ---> File permissions (0644)
backup ---> Number of backups to keep
action ---> :create, :delete, :touch

Variables:
node['cookbook']['file_path'] ---> File path
node['cookbook']['file_content'] ---> Content

user
comment ---> User description/full name
uid ---> User ID
home ---> Home directory
shell ---> Login shell
password ---> Hashed password
manage_home ---> Create home directory if true
action ---> :create, :remove, :modify, :lock, :unlock

Variables:
node['cookbook']['user_name'] ---> Username
node['cookbook']['user_home'] ---> Home directory
node['cookbook']['user_shell'] ---> Shell
node['cookbook']['user_password'] ---> Password hash

directory
owner ---> Directory owner
group ---> Directory group
mode ---> Directory permissions (0755)
recursive ---> Create parent directories if missing
action ---> :create, :delete, :nothing

Variables:
node['cookbook']['dir_path'] ---> Path
node['cookbook']['dir_owner'] ---> Owner
node['cookbook']['dir_group'] ---> Group

execute
command ---> Command to execute
cwd ---> Working directory
environment ---> Environment variables
creates ---> Skip execution if file exists
action ---> :run, :nothing

Variables:
node['cookbook']['exec_command'] ---> Command
node['cookbook']['exec_cwd'] ---> Working directory

powershell_script (Windows)
code ---> PowerShell commands to execute
cwd ---> Working directory
guard_interpreter ---> Interpreter for guards (:powershell_script)
action ---> :run, :nothing

Variables:
node['cookbook']['ps_script'] ---> Code string
node['cookbook']['ps_cwd'] ---> Working directory

cron (Linux/Unix)
minute ---> Minute field
hour ---> Hour field
day ---> Day of month
month ---> Month field
weekday ---> Day of week
command ---> Command to execute
user ---> Run as this user
action ---> :create, :delete, :run

Variables:
node['cookbook']['cron_minute'] ---> Minute
node['cookbook']['cron_hour'] ---> Hour
node['cookbook']['cron_command'] ---> Command
node['cookbook']['cron_user'] ---> User
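For reference, a cron resource with minute '0' and hour '2' renders the same schedule as this hand-written crontab entry (the script path is hypothetical):

```
# m  h  dom  mon  dow  command
0  2  *  *  *  /usr/local/bin/nightly_backup.sh
```

Letting Chef manage the entry instead of editing crontab by hand keeps the schedule idempotent and version-controlled.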

remote_file
source ---> URL or file path to copy from
path ---> Destination path
owner ---> File owner
group ---> File group
mode ---> Permissions (0644)
action ---> :create, :create_if_missing, :delete
checksum ---> SHA-256 checksum to verify file integrity

Variables:
node['cookbook']['remote_file_source'] ---> URL/path
node['cookbook']['remote_file_path'] ---> Destination
node['cookbook']['remote_file_owner'] ---> Owner
node['cookbook']['remote_file_mode'] ---> Permissions
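The checksum property expects a SHA-256 hex digest. A quick sketch of computing one locally for an artifact you intend to pin (the /tmp file here is a stand-in for the real artifact):

```shell
# Compute the SHA-256 digest to paste into remote_file's checksum property
echo 'hello' > /tmp/artifact.bin        # stand-in for the real artifact
sha256sum /tmp/artifact.bin | awk '{print $1}'
```

Pinning the checksum makes the download idempotent: Chef skips re-fetching when the local copy already matches.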

git
repository ---> Git repo URL
revision ---> Branch, tag, or commit
destination ---> Local clone path
user ---> Run as user
action ---> :checkout, :sync, :export
enable_submodules ---> true/false

Variables:
node['cookbook']['git_repo'] ---> Repo URL
node['cookbook']['git_branch'] ---> Branch or tag
node['cookbook']['git_dest'] ---> Destination path

bash (Linux/Unix)
code ---> Bash commands
cwd ---> Working directory
environment ---> Environment variables
user ---> Run as this user
group ---> Run as this group
action ---> :run, :nothing

Variables:
node['cookbook']['bash_code'] ---> Commands
node['cookbook']['bash_cwd'] ---> Working directory

windows_feature
feature_name ---> Name of Windows feature
action ---> :install, :remove, :nothing
all ---> Install dependent features (true/false)
Variables:
node['cookbook']['feature_name'] ---> Feature to install

ark (Linux/Unix)
url ---> Download URL
path ---> Installation path
owner ---> Owner
group ---> Group
action ---> :put, :install, :cherry_pick
checksum ---> Verify file integrity

Variables:
node['cookbook']['ark_url'] ---> Archive URL
node['cookbook']['ark_path'] ---> Install path

GitLab CE Installation

GitLab CE Installation on RHEL 9 / CentOS 9
GitLab Community Edition (CE) is a powerful, self-hosted DevOps platform that provides Git repository management, CI/CD pipelines, artifact storage, container registry, issue tracking, and more. This guide walks you through installing GitLab CE on RHEL 9 / CentOS 9, configuring a custom external URL, and implementing SSL/TLS using Apache (httpd) as a reverse proxy.

1. Install Required Dependencies

Before installing GitLab, ensure your system has the required packages.
# dnf install -y curl policycoreutils openssh-server openssh-clients

2. Add GitLab CE Repository
Use GitLab’s official repository installation script.
# curl -sS https://packages.gitlab.com/install/repositories/gitlab/gitlab-ce/script.rpm.sh | sudo bash

3. Install GitLab CE
# dnf install -y gitlab-ce
This installs all required GitLab components, including NGINX (bundled), Redis, and PostgreSQL.

4. Configure GitLab URL
Edit the primary GitLab configuration file:
# vim /etc/gitlab/gitlab.rb
Add or modify the external URL:
external_url 'http://www.gitlab.ppc.com'
Save and exit.

5. Reconfigure GitLab
Run the reconfiguration command to generate configurations and start services.
# gitlab-ctl reconfigure
GitLab will now be accessible at the configured external URL (once DNS resolves), or directly via:
http://server-hostname
http://server-IP-address

SSL/TLS Implementation Using Apache (httpd)
GitLab comes with a built-in NGINX server, but many enterprises prefer using Apache for SSL termination and reverse proxying.
Below is how to configure Apache with SSL for GitLab.

6. Install Apache HTTP Server
# dnf install -y httpd mod_ssl
# systemctl enable httpd
# systemctl start httpd

7. Generate or Install SSL Certificates
You can use:
Self-signed Certificates (testing)
Let's Encrypt (production)
CA-signed Certificates (enterprise)

To generate a self-signed certificate:
# openssl req -newkey rsa:2048 -nodes -keyout /etc/pki/tls/private/gitlab.key -x509 -days 365 -out /etc/pki/tls/certs/gitlab.crt
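The interactive prompts can be skipped with -subj. A minimal non-interactive sketch, writing to /tmp for illustration and assuming the CN should match the GitLab external URL host:

```shell
# Non-interactive self-signed certificate; CN matches the GitLab external URL
openssl req -newkey rsa:2048 -nodes \
  -keyout /tmp/gitlab.key -x509 -days 365 \
  -subj "/CN=www.gitlab.ppc.com" \
  -out /tmp/gitlab.crt

# Inspect the subject and validity window before copying into /etc/pki/tls
openssl x509 -in /tmp/gitlab.crt -noout -subject -dates
```

Verifying the subject and dates up front avoids a restart cycle with Apache pointing at a mismatched certificate.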

8. Configure Apache Reverse Proxy for GitLab
Create a new Apache configuration file:
# vim /etc/httpd/conf.d/gitlab.conf
Add the following configuration:
<VirtualHost *:443>
ServerName www.gitlab.ppc.com

SSLEngine on
SSLCertificateFile /etc/pki/tls/certs/gitlab.crt
SSLCertificateKeyFile /etc/pki/tls/private/gitlab.key

ProxyPreserveHost On

# Assumes GitLab's bundled NGINX listens on 127.0.0.1:8080
# (set nginx['listen_port'] = 8080 in /etc/gitlab/gitlab.rb and reconfigure)
<Location />
ProxyPass http://127.0.0.1:8080/
ProxyPassReverse http://127.0.0.1:8080/
</Location>
</VirtualHost>

<VirtualHost *:80>
ServerName www.gitlab.ppc.com
Redirect permanent / https://www.gitlab.ppc.com/
</VirtualHost>

Save and exit.

9. Adjust SELinux Policies (if enabled)
# setsebool -P httpd_can_network_connect 1

10. Restart Apache
# systemctl restart httpd
You can now access GitLab using HTTPS:
https://www.gitlab.ppc.com

Conclusion

You have successfully installed GitLab CE on RHEL 9 / CentOS 9, configured the external URL, and set up SSL/TLS security using Apache as a reverse proxy. With GitLab now running securely, you can begin creating repositories, configuring CI/CD pipelines, managing runners, and integrating GitLab with your DevOps ecosystem.

Jenkins Installation

Jenkins Installation on RHEL 9: A Complete Guide
This guide walks through the steps required to install and configure Jenkins on RHEL 9, including setting up Java, dependencies, Jenkins repository, system service configuration, reverse proxy with Apache, Python tooling, Terraform, and PowerShell.

1. Install Required Dependencies
Begin by installing all essential packages, including Java 17, Git, compiler tools, Node.js, Python pip, Docker-related libraries, and others.
# dnf install -y fontconfig java-17-openjdk git gcc gcc-c++ nodejs gettext device-mapper-persistent-data lvm2 bzip2 python3-pip wget libseccomp
# java --version

2. Configure Jenkins Repository
Download and add the Jenkins repository for RHEL-based systems.
# wget -O /etc/yum.repos.d/jenkins.repo https://pkg.jenkins.io/redhat-stable/jenkins.repo
# rpm --import https://pkg.jenkins.io/redhat-stable/jenkins.io.key

3. Install and Start Jenkins
Install Jenkins and configure it as a service.
# dnf install jenkins -y
# systemctl start jenkins
# systemctl enable jenkins
# systemctl status jenkins

Jenkins will now be available on port 8080, for example: http://192.168.10.106:8080
Retrieve the initial admin password, then enter it and continue:
# cat /var/lib/jenkins/secrets/initialAdminPassword
6bedde9c71eb4d999a5cfdfe43f0d052

Click "Install suggested plugins" and wait for the plugin installation to complete.
Once the plugins are installed, set the admin password and email address, then click Save and Continue.
Set the Jenkins URL (IP address or FQDN hostname with port 8080), then click Save and Finish.

Jenkins is now ready to use.


Install additional plugins as needed: Ansible, Terraform, PowerShell, GitHub, GitLab, AWS, GCP, and Azure.
4. Configure Apache as a Reverse Proxy for Jenkins
Install and enable Apache HTTP Server.
# dnf install httpd -y  
# systemctl start httpd 
# systemctl enable httpd 
# systemctl status httpd

Navigate to the Apache configuration directory:
# cd /etc/httpd/conf.d/ 
# mv welcome.conf welcome.conf.bkp 
# vi jenkins.conf
ProxyRequests Off
ProxyPreserveHost On
AllowEncodedSlashes NoDecode

<Proxy http://localhost:8080/*>
Require all granted
</Proxy>

ProxyPass / http://localhost:8080/ nocanon
ProxyPassReverse / http://localhost:8080/
ProxyPassReverse / http://www.jenkins.ppc.com/

Restart Apache: 
# systemctl restart httpd

Now Jenkins will be accessible using your domain or server IP via port 80.
http://www.jenkins.ppc.com


5. Install and Configure Python Tools
Upgrade pip and install commonly used DevOps/Cloud SDKs. 
# python3 -m pip install --upgrade pip 
# pip3 install ansible
# pip3 install gcloud 
# pip3 install awscli 
# pip3 install azure-cli 
# pip3 install --upgrade pyvmomi 
# pip3 install vmware-vcenter 
# pip3 install --upgrade git+https://github.com/vmware/vsphere-automation-sdk-python.git

Create and Activate Python Virtual Environment
# python3 -m venv venv_name 
# source venv_name/bin/activate 
# pip install --upgrade pip setuptools
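If multiple Jenkins jobs share this server, an isolated environment keeps their Python dependencies from colliding. A quick sketch (the venv name and path are arbitrary):

```shell
# Create and use a throwaway virtual environment
python3 -m venv /tmp/jenkins_venv
. /tmp/jenkins_venv/bin/activate
python -c 'import sys; print(sys.prefix)'   # prefix now points inside the venv
deactivate
```

Packages installed while the venv is active stay out of the system site-packages, so a job can pin its own ansible or awscli version.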

6. Install Terraform on RHEL 9
Add the HashiCorp repo and install Terraform.
# yum install -y yum-utils 
# yum-config-manager --add-repo https://rpm.releases.hashicorp.com/RHEL/hashicorp.repo 
# yum -y install terraform 
# terraform version

7. Install PowerShell on RHEL 9
Install PowerShell from the official RPM package. 
# dnf install https://github.com/PowerShell/PowerShell/releases/download/v7.5.4/powershell-7.5.4-1.rh.x86_64.rpm 
# pwsh --version

Conclusion
You've successfully installed Jenkins, configured Apache reverse proxy, set up Python cloud tooling, installed Terraform, and enabled PowerShell on RHEL 9. This setup prepares your server for end-to-end DevOps automation, CI/CD pipelines, cloud provisioning, and infrastructure management.
Feel free to extend Jenkins further using plugins and pipeline automation.

Ansible AWX

Install Ansible AWX on CentOS/RHEL8/9
If you want to manage automation at scale, Ansible AWX (the open-source version of Ansible Tower) is a powerful solution. This guide walks you through installing AWX 17.1.0 on a CentOS/RHEL-based system using Docker and Docker Compose.

Prerequisites:

Before starting, ensure you have:
  • A fresh CentOS/RHEL system (8/9 preferred)
  • Root or sudo access
  • Internet connectivity
Step 1: Install Required Packages
# dnf -y install git gcc gcc-c++ nodejs gettext device-mapper-persistent-data lvm2 bzip2 python3-pip wget libseccomp

Step 2: Remove Old Docker Installation
# dnf remove docker* -y

Step 3: Configure Docker Repository
# dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo

Step 4: Install Docker CE
# dnf -y install docker-ce
# systemctl enable docker
# systemctl start docker 
# systemctl status docker

Step 5: Install Python Build Dependencies
# python3 -m pip install --upgrade pip
# pip3 install setuptools_rust 
# pip3 install wheel 
# pip3 install ansible
# pip3 install docker-compose

Step 6: Download AWX Installer
# git clone -b 17.1.0 https://github.com/ansible/awx.git
# cd awx/installer

Step 7: Create Required Directories
# mkdir -p /opt/awx/pgdocker /opt/awx/awxcompose /opt/awx/projects

Step 8: Generate a Secret Key
# openssl rand -base64 30
Copy this key for later use.
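The secret key is just random base64; capturing it into a shell variable avoids a clipboard round-trip when editing the inventory:

```shell
# 30 random bytes base64-encode to a 40-character string
SECRET_KEY=$(openssl rand -base64 30)
echo "secret_key=$SECRET_KEY"
```

Paste the printed line directly into the inventory file in the next step.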

Step 9: Edit the AWX Inventory File
Open the inventory file:
# vi inventory
Update the following parameters as needed:
admin_password=Welcome@123
awx_official=true
pg_database=awx
pg_password=Welcome@123
awx_alternate_dns_servers="192.168.10.100,192.168.20.100"
postgres_data_dir="/opt/awx/pgdocker"
docker_compose_dir="/opt/awx/awxcompose"
project_data_dir="/opt/awx/projects"
secret_key=XXXXXXXXXX   # openssl rand -base64 30 command output
Save and exit the file.

Step 10: Run the AWX Installer

Once the inventory file is updated, run:
# ansible-playbook -i inventory install.yml
This may take several minutes.

Step 11: Access AWX Web Interface

After the installation completes, open a browser and go to:
http://<server-ip>
http://<server_hostname or FQDN>

Log in using:
Username: admin
Password: Welcome@123 (or the password you set)

Conclusion
Installing Ansible AWX on CentOS/RHEL 8/9 provides a powerful and centralized way to manage automation across your infrastructure. By following this guide, you’ve prepared your system with all required dependencies, deployed Docker and Docker Compose, configured AWX using the official installer, and successfully launched the AWX web interface.
With AWX now running, you can:
  • Create and manage projects, inventories, and credentials
  • Build and schedule playbook automation workflows
  • Monitor job executions in real time
  • Integrate AWX with Git, cloud providers, and external systems
  • Scale automation across teams and environments
This setup forms the foundation for enterprise-grade automation and can be expanded further with clustering, HTTPS/SSL configuration, LDAP/AD integration, and backup strategies.

Renew GPFS (IBM Spectrum Scale) Certificates

IBM Spectrum Scale (GPFS) uses internal SSL certificates to secure communication among cluster nodes. When these certificates are close to expiration—or have already expired—you must renew them to restore healthy cluster communication.

This article provides step-by-step instructions for renewing GPFS certificates using both the online (normal) and offline (expired certificate) methods.

Renewing GPFS Certificate – Online Method (Recommended)
Use this method when the certificates have NOT yet expired.
This method does not require shutting down the cluster.

1. Check the current certificate expiry date
Run on any cluster node:
# mmcommon run mmgskkm print --cert /var/mmfs/ssl/id_rsa_committed.cert | grep Valid
Or:
# /usr/lpp/mmfs/bin/mmcommon run mmgskkm print --cert /var/mmfs/ssl/id_rsa_committed.cert | grep Valid
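Beyond grepping for "Valid", you can compute how many days remain. A sketch using openssl and GNU date, shown here against a throwaway demo certificate in /tmp; on a real cluster node, point CERT at /var/mmfs/ssl/id_rsa_committed.cert instead:

```shell
# Generate a throwaway 90-day demo certificate (stand-in for the GPFS cert)
CERT=/tmp/demo.crt
openssl req -x509 -newkey rsa:2048 -nodes -keyout /tmp/demo.key \
  -subj "/CN=gpfs-demo" -days 90 -out "$CERT" 2>/dev/null

# Extract the notAfter date and convert it to days remaining
END=$(openssl x509 -in "$CERT" -noout -enddate | cut -d= -f2)
DAYS=$(( ( $(date -d "$END" +%s) - $(date +%s) ) / 86400 ))
echo "Days until expiry: $DAYS"
```

Anything under 30 days is a reasonable trigger to schedule the online renewal while it is still possible.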

2. Generate new authentication keys
# mmauth genkey new

3. Commit the new keys
# mmauth genkey commit

4. Validate the updated certificate on all nodes
# mmcommon run mmgskkm print --cert /var/mmfs/ssl/id_rsa_committed.cert | grep Valid
Or:
# /usr/lpp/mmfs/bin/mmcommon run mmgskkm print --cert /var/mmfs/ssl/id_rsa_committed.cert | grep Valid

Renewing GPFS Certificate – Offline Method (Certificates Already Expired)
If the cluster fails to start or nodes cannot communicate due to an expired certificate, use this offline method.
This requires a temporary cluster shutdown and manual time adjustment.

1. Verify certificate expiration
# mmdsh -N all 'openssl x509 -in /var/mmfs/ssl/id_rsa_committed.pub -dates -noout'

2. Stop NTP service (important for manual time rollback)
# lssrc -s xntpd
# stopsrc -s xntpd

3. Shut down GPFS on all nodes
# mmshutdown -a

4. Stop CCR monitoring on quorum nodes
# mmdsh -N quorumNodes "/usr/lpp/mmfs/bin/mmcommon killCcrMonitor"

5. Roll back the system time on ALL nodes
Set the clock just before the certificate expiry time.
Example:
# date 072019542025
Explanation:
07 = Month (July)
20 = Day
19:54 = Time
2025 = Year
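On a Linux jump host with GNU date you can sanity-check the MMDDhhmmYYYY string before typing it on the nodes (the AIX date command itself does not accept -d):

```shell
# Build/verify the AIX-style MMDDhhmmYYYY timestamp with GNU date
# Target: one minute before a certificate expiring 2025-07-20 19:55
date -d "2025-07-20 19:54" +%m%d%H%M%Y
# → 072019542025
```

Getting this string wrong on even one node leaves that node's clock (and therefore its certificate validation) out of step with the rest of the cluster.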

6. Restart CCR monitor
# mmdsh -N quorumNodes "/usr/lpp/mmfs/bin/mmcommon startCcrMonitor"

7. Generate & commit new keys
# mmauth genkey new
# mmauth genkey commit

8. Restore correct date and restart NTP
# date <current_correct_time>
# startsrc -s xntpd

9. Verify the new certificate
# mmdsh -N all 'openssl x509 -in /var/mmfs/ssl/id_rsa_committed.pub -dates -noout'

10. Restart GPFS on all nodes
# mmstartup -a

Extracting Disk Details (Size, LUN ID, and WWPN) on IBM AIX

Managing storage on IBM AIX systems often requires gathering detailed information about disks — including their size, LUN ID, and WWPN (World Wide Port Name) of the Fibre Channel adapters they connect through.

This information is especially useful for SAN teams and system administrators when verifying storage mappings, troubleshooting, or documenting configurations.

In this post, we’ll look at a simple shell script that automates this task.

The script:
  • Loops through all disks known to AIX (lspv output).
  • Extracts each disk’s LUN ID from lscfg.
  • Gets its size in GB using bootinfo.
  • Finds all FC adapters (fcsX) and displays their WWPNs.
  • Prints a consolidated, easy-to-read summary.
The Script

#!/bin/ksh
for i in $(lspv | awk '{print $1}')
do
    # Get LUN ID
    LUNID=$(lscfg -vpl "$i" | grep -i "LIC" | awk -F. '{print $NF}')

    # Get size in GB (bootinfo reports MB)
    DiskSizeMB=$(bootinfo -s "$i")
    DiskSizeGB=$(echo "scale=2; $DiskSizeMB/1024" | bc)

    # Loop over all FC adapters and print one line per adapter
    for j in $(lsdev -Cc adapter | grep fcs | awk '{print $1}')
    do
        WWPN=$(lscfg -vpl "$j" | grep -i "Network Address" | sed 's/.*Address[ .]*//')
        echo "Disk: $i Size: ${DiskSizeGB}GB LUN ID: $LUNID WWPN: $WWPN"
    done
done


How It Works:
  • lspv lists all disks managed by AIX (e.g., hdisk0, hdisk1).
  • lscfg -vpl hdiskX displays detailed configuration information for each disk, including the LUN ID.
  • bootinfo -s hdiskX returns the disk size in megabytes.
  • lsdev -Cc adapter | grep fcs lists all Fibre Channel adapters (fcs0, fcs1, etc.).
  • lscfg -vpl fcsX | grep "Network Address" shows the adapter’s WWPN.
  • sed 's/.*Address[ .]*//' cleans the output, leaving only the WWPN value.
Example Output:
Disk: hdisk0 Size: 100.00GB LUN ID: 500507680240C567 WWPN: C0507601D8123456
Disk: hdisk0 Size: 100.00GB LUN ID: 500507680240C567 WWPN: C0507601D8123457
Disk: hdisk1 Size: 200.00GB LUN ID: 500507680240C568 WWPN: C0507601D8123456
Disk: hdisk1 Size: 200.00GB LUN ID: 500507680240C568 WWPN: C0507601D8123457


This shows each disk (hdiskX) with its size, LUN ID, and all connected FC adapter WWPNs.
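The sed clean-up step can be checked in isolation against a canned lscfg line:

```shell
# Strip everything up to and including the dotted "Network Address" label
line='        Network Address.............C0507601D8123456'
echo "$line" | sed 's/.*Address[ .]*//'
# → C0507601D8123456
```

The greedy .*Address match consumes the label, and [ .]* eats the dot padding, leaving only the WWPN value.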

Presenting Fibre-Channel Storage to AIX LPARs with Dual VIOS (NPIV / vFC)

Present SAN LUNs to AIX LPARs using NPIV / virtual Fibre Channel (vFC) so each LPAR has redundant SAN paths through two VIOS servers (VIOS1 = primary, VIOS2 = backup) and can use multipathing (native MPIO or PowerPath).

NPIV (N_Port ID Virtualization) lets an LPAR present its own virtual WWPNs to the SAN while physical Fibre Channel hardware is on the VIOS. With two VIOS nodes and dual SAN fabrics, you get end-to-end redundancy:
  • VIOS1 and VIOS2 each present vFC adapters to the LPAR via the HMC.
  • Each VIOS has physical FC ports connected to redundant SAN switches/fabrics.
  • LUNs are zoned and masked to VIOS WWPNs. AIX LPARs discover LUNs, use multipathing, and survive single-path failures.
Prerequisites & Assumptions:
  • HMC admin, VIOS (padmin/root), and AIX root access available.
  • VIOS1 & VIOS2 installed, registered with HMC and reachable.
  • Each VIOS has at least one physical FC port (e.g., fcs0, fcs1).
  • SAN team will perform zoning & LUN masking.
  • Backups of VIOS and HMC configs completed.
  • You know which LPARs should receive which LUNs.
High-Level Flow:
  • Collect physical FC adapter names & WWPNs from VIOS1 and VIOS2.
  • Provide WWPNs to SAN admin for zoning & LUN masking.
  • Create vFC adapters for each AIX LPAR on the HMC and map them across VIOS1/VIOS2.
  • Verify mappings on HMC and VIOS (lsmap).
  • Ensure VIOS physical FC ports are logged into fabric.
  • On AIX LPARs: run cfgmgr, enable multipathing, create PVs/VGs/LVs as required.
  • Test failover by disabling a path and verifying I/O continues.
  • Document and monitor.
Step-by-Step Configuration

Step 1 — Verify VIOS Physical Fibre Channel Adapters
On VIOS1 and VIOS2, log in as padmin and identify FC adapters:
$ lsdev -type adapter
Expected output snippet:
VIOS1:
fcs0 Available 00-00 Fibre Channel Adapter
fcs1 Available 00-01 Fibre Channel Adapter
VIOS2:
fcs0 Available 00-00 Fibre Channel Adapter
fcs1 Available 00-01 Fibre Channel Adapter
Retrieve WWPNs for each adapter:
$ lsattr -El fcs0 | grep -i wwpn
Record results:
VIOS     Adapter   WWPN
VIOS1    fcs0      20:00:00:AA:AA:AA
VIOS2    fcs0      20:00:00:CC:CC:CC

Step 2 — SAN Zoning & LUN Presentation
Provide the recorded VIOS WWPNs to the SAN Administrator.
Request:
  • Zoning between each VIOS WWPN and Storage Controller ports.
  • LUN masking to present LUN-100 to both VIOS WWPNs.
  • Confirmation that both VIOS ports see the LUNs across both fabrics.
Tip: Ensure both fabrics (A & B) are zoned independently for redundancy.

Step 3 — Create Virtual Fibre Channel (vFC) Adapters via HMC

On the HMC:
  • Select AIX-LPAR1 → Configuration → Virtual Adapters.
  • Click Add → Virtual Fibre Channel Adapter.
  • Create two vFC adapters: vfc0 mapped to VIOS1, vfc1 mapped to VIOS2.
  • Save the configuration and activate it (via Dynamic LPAR operation if supported).
Expected vFC mapping:
Adapter     Client LPAR    Server VIOS       Mapping Status
vfc0           AIX-LPAR1     VIOS1                Mapped OK
vfc1           AIX-LPAR1     VIOS2                Mapped OK

Step 4 — Verify vFC Mapping on VIOS
Log in to each VIOS (padmin):
$ lsmap -all -type fcs

Example output:
On VIOS1:
Name Physloc ClntID ClntName ClntOS
------------- ---------------------------------- ------ ----------- --------
vfchost0 U9105.22A.XXXXXX-V1-C5 5 AIX-LPAR1 AIX
Status:LOGGED_IN
FC name:fcs0
Ports logged in: 2
VFC client name: fcs0
VFC client WWPN: 10:00:00:11:22:33:44:55

On VIOS2:
Name Physloc ClntID ClntName ClntOS
------------- ---------------------------------- ------ ----------- --------
vfchost0 U9105.22A.XXXXXX-V2-C6 5 AIX-LPAR1 AIX
Status:LOGGED_IN
FC name:fcs0
Ports logged in: 2
VFC client name: fcs1
VFC client WWPN: 10:00:00:55:66:77:88:99

Confirm each VIOS vFC host maps to the correct AIX vFC client.

Step 5 — Verify VIOS FC Port Fabric Login
On each VIOS:
$ fcstat fcs0
Verify that:
  • The port is online
  • The port is logged into the fabric
  • There are no link errors

Step 6 — Discover Devices on AIX LPAR

If the LPAR is not yet running, activate it from the HMC (use SMS mode only when you need to select a boot device):
  • Open the HMC and launch a vterm/console for AIX-LPAR1.
  • HMC GUI: Tasks → Operations → Activate → Advanced → Boot Mode = SMS → Activate.
  • In the SMS console: 5 (Select Boot Options) → Select Install/Boot Device → List All Devices → pick the device → Normal Boot Mode → Yes to exit and boot from that device.
Once AIX is up, scan for new devices and verify the Fibre Channel adapters:
# cfgmgr
# lsdev -Cc adapter | grep fcs
fcs0 Available Fibre Channel Adapter
fcs1 Available Fibre Channel Adapter
List discovered disks:
# lsdev -Cc disk
# lspv
Expected:
hdisk12 Available 00-08-00-4,0 16 Bit LUNZ Disk Drive

Step 7 — Configure Multipathing
If using native AIX MPIO, verify:
# lspath
Enabled hdisk12 fscsi0
Enabled hdisk12 fscsi1
If using EMC PowerPath:
# powermt display dev=all
Confirm both paths active.

Step 8 — Test Redundancy / Failover
To validate multipathing:
On VIOS1, disable the FC port temporarily:
$ rmdev -l fcs0 -R
On AIX LPAR, verify disk is still accessible:
# lspath -l hdisk12
Expected:
Enabled hdisk12 fscsi1
Failed hdisk12 fscsi0
Re-enable path:
$ cfgdev
Confirm path restoration:
Enabled hdisk12 fscsi0
Enabled hdisk12 fscsi1

Step 9— Post-Deployment Checks
Verify all paths:
# lspath
Check VIOS logs for FC errors:
$ errlog -ls
Save configuration backups:
$ backupios -file /home/padmin/vios1_bkup
$ backupios -file /home/padmin/vios2_bkup