Solaris Performance Management is the practice of monitoring, analyzing, troubleshooting, and tuning Solaris servers to ensure they run efficiently and reliably. The goal is to detect bottlenecks early, understand system behavior, and optimize resources such as CPU, memory, disk, and network.
Performance management in Solaris typically focuses on:
- CPU utilization and load
- Memory and virtual memory behavior
- Disk I/O performance
- Network throughput and errors
- Process and thread activity
- Kernel and resource tuning
1. CPU Monitoring and Analysis
CPU performance problems usually appear as high load averages, slow applications, or system lag.
A. Process-Level CPU Monitoring
prstat
The prstat command is similar to Linux top. It shows real-time CPU and memory usage per process.
# prstat
To sort by CPU usage:
# prstat -s cpu
Important columns:
- PID – Process ID
- USERNAME – Owner of process
- SIZE – Total virtual memory
- RSS – Resident memory (physical RAM used)
- STATE – Process state (sleep, run, cpuN when on a processor, etc.)
- PRI – Priority
- CPU – Percentage CPU usage
- TIME – Total CPU time consumed
- PROCESS/NLWP – Process name / number of threads
If one process constantly shows high CPU, investigate that application.
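To capture this view non-interactively, prstat can print a single report and exit when given an interval and a count. The following is a minimal sketch that filters such a report for processes at or above a CPU threshold. The function name cpu_hogs and the 50% default are illustrative assumptions, and the script assumes a POSIX awk (nawk or /usr/xpg4/bin/awk on older Solaris releases).

```shell
# cpu_hogs [MIN_PCT]: read one prstat report on stdin and print processes
# whose CPU column is at or above MIN_PCT (hypothetical helper, default 50).
cpu_hogs() {
    awk -v min="${1:-50}" '
        # Header line: remember each column position by name.
        $1 == "PID" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Data lines start with a numeric PID; the "Total:" line is skipped.
        $1 ~ /^[0-9]+$/ && ("CPU" in col) {
            pct = $(col["CPU"])
            sub(/%/, "", pct)
            if (pct + 0 >= min + 0)
                print $1, $(col["USERNAME"]), pct "%", $(col["PROCESS/NLWP"])
        }
    '
}

# Example on a Solaris host: one report, top 20 by CPU, threshold 25%.
# prstat -s cpu -n 20 1 1 | cpu_hogs 25
```

Looking up columns by header name rather than by fixed position keeps the filter working if the column layout differs between releases.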
B. CPU Usage Per Processor
mpstat
# mpstat 1 5
This command reports per-CPU statistics every 1 second, 5 times. The first report shows averages since boot, so judge current load from the later ones.
Key columns:
- usr – Time spent in user mode
- sys – Time spent in kernel mode
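- wt – I/O wait; no longer calculated and always 0 on Solaris 10 and later
- idl – Idle time
High usr → Application heavy load
High sys → Kernel or system calls heavy
High wt → Disk bottleneck (only meaningful before Solaris 10; on later releases check iostat instead)
Low idl → CPU saturation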
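The "low idl" rule can be applied mechanically to mpstat output. The sketch below prints the CPUs whose idle time falls below a threshold. The function name busy_cpus and the 10% default are assumptions for illustration. It also assumes mpstat repeats its header line before each report, which is what lets it skip the first (since-boot) report, and it needs a POSIX awk.

```shell
# busy_cpus [MIN_IDLE]: read "mpstat <interval> <count>" output on stdin and
# print CPUs whose idl value is below MIN_IDLE percent (default 10).
busy_cpus() {
    awk -v min_idle="${1:-10}" '
        # Each report starts with a header; map column names to positions.
        $1 == "CPU" { reports++; for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Ignore the first report (averages since boot) and non-data lines.
        reports >= 2 && $1 ~ /^[0-9]+$/ && $(col["idl"]) + 0 < min_idle + 0 {
            printf "cpu %s: usr=%s sys=%s idl=%s\n",
                $1, $(col["usr"]), $(col["sys"]), $(col["idl"])
        }
    '
}

# Example on a Solaris host:
# mpstat 1 5 | busy_cpus 10
```

A single CPU that is pinned while the others sit idle usually points at a single-threaded process or an interrupt-heavy device rather than overall saturation.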
C. Overall CPU + System Summary
vmstat
# vmstat 5 5
Important fields:
- r – Number of kernel threads waiting for CPU (run queue)
- b – Kernel threads blocked waiting for resources (usually I/O or paging)
- us – User CPU
- sy – System CPU
- id – Idle CPU
The first line of output is an average since boot, so ignore it. If r is consistently higher than the number of CPUs, the system is CPU-bound.
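The run-queue rule can be checked with a short filter over vmstat output. This is a sketch under stated assumptions: the function name count_cpu_bound is hypothetical, r is taken to be the first column of each data line, and a POSIX awk is required.

```shell
# count_cpu_bound NCPU: read "vmstat <interval> <count>" output on stdin and
# report how many samples had a run queue (r) larger than NCPU.
count_cpu_bound() {
    awk -v ncpu="$1" '
        $1 !~ /^[0-9]+$/ { next }   # skip header lines
        !seen++ { next }            # skip the first sample (average since boot)
        { total++; if ($1 + 0 > ncpu + 0) over++ }
        END { printf "%d of %d samples had r > %d\n", over, total, ncpu }
    '
}

# Example on a Solaris host; psrinfo prints one line per virtual processor.
# vmstat 5 5 | count_cpu_bound "$(psrinfo | wc -l)"
```

If most samples exceed the CPU count, the system is likely CPU-bound; an occasional spike is normal.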
2. Memory Management and Monitoring
Solaris uses virtual memory with paging and swapping. Memory issues usually cause slowness or high paging.
A. vmstat Memory Columns
# vmstat
Memory-related fields:
- swap – Available swap space (KB)
- free – Free physical memory (KB)
- re – Pages reclaimed
- sr – Pages scanned per second by the page scanner
A sustained non-zero sr indicates memory pressure: the page scanner only runs when free memory is low.
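To judge whether sr is sustained rather than a one-off, several samples can be summarized. The sketch below counts how many samples had the scanner running and averages the free column. The function name memory_pressure is hypothetical, and the script assumes the vmstat header row begins with "r" and contains columns named free and sr, and that a POSIX awk is available.

```shell
# memory_pressure: read "vmstat <interval> <count>" output on stdin and
# summarize page-scanner activity and average free memory.
memory_pressure() {
    awk '
        # Column header row: map names such as "free" and "sr" to positions.
        $1 == "r" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        $1 !~ /^[0-9]+$/ || !("sr" in col) { next }
        !seen++ { next }    # skip the first sample (average since boot)
        {
            n++
            free_kb += $(col["free"])
            if ($(col["sr"]) + 0 > 0) scanning++
        }
        END {
            if (n) printf "samples=%d scanning=%d avg_free_kb=%d\n", n, scanning, free_kb / n
        }
    '
}

# Example on a Solaris host:
# vmstat 5 12 | memory_pressure
```

If scanning equals samples, the scanner ran in every interval, which is a much stronger signal than a single non-zero value.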
B. Detailed Kernel Memory Usage
# echo ::memstat | mdb -k
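This prints a summary of physical memory by category:
- Kernel memory
- ZFS file data (ARC cache), on releases that report it separately
- Anonymous memory, executables and libraries, and page cache
- Free memory
Useful for seeing whether the kernel or the ZFS cache is consuming memory. For per-cache (slab) allocator detail when diagnosing kernel memory leaks, use ::kmastat instead.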
C. Historical Memory Statistics
# sar -r 1 5
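Shows:
- freemem – Average free memory pages
- freeswap – Disk blocks available for swapping
With an interval and count, sar samples live data. Run without them (sar -r), it reports the data recorded so far for the current day, provided system activity data collection is enabled, which makes it good for performance trend analysis.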
3. Disk I/O Monitoring
Disk bottlenecks are common in database and application servers.
A. iostat Extended Output
# iostat -x 5 3
Key columns:
- r/s – Reads per second
- w/s – Writes per second
- kr/s – Read throughput (KB per second)
- kw/s – Write throughput (KB per second)
- svc_t – Average service time (ms)
- %b – Percent of time the disk is busy
If %b is near 100%, the disk is saturated.
High svc_t means disk latency is high.
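Both rules can be combined into one filter over iostat output. The following is a sketch, not a definitive tool: the function name hot_disks and the default thresholds (80% busy, 20 ms service time) are arbitrary assumptions that should be adjusted to the storage in use. It assumes the "device" header row is repeated before each report and that a POSIX awk is available.

```shell
# hot_disks [MIN_BUSY] [MIN_SVC_T]: read "iostat -x <interval> <count>" output
# on stdin; print devices at or above either threshold (defaults: 80, 20).
hot_disks() {
    awk -v busy="${1:-80}" -v svc="${2:-20}" '
        $1 == "extended" { next }   # banner line above each header
        # Header row: map column names such as "svc_t" and "%b" to positions.
        $1 == "device" { reports++; for (i = 1; i <= NF; i++) col[$i] = i; next }
        # Ignore the first report (averages since boot).
        reports >= 2 && NF >= col["%b"] {
            b = $(col["%b"]) + 0
            t = $(col["svc_t"]) + 0
            if (b >= busy + 0 || t >= svc + 0)
                printf "%s: svc_t=%s ms busy=%s%%\n", $1, t, b
        }
    '
}

# Example on a Solaris host:
# iostat -x 5 3 | hot_disks 80 20
```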
B. ZFS Storage Monitoring
# zpool iostat -v 5
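Shows:
- Pool capacity (allocated and free space)
- Read/write operations and bandwidth for the pool
- The same figures for each vdev and disk
Helps identify a disk that handles noticeably less (or more) I/O than its peers inside a ZFS pool. For per-disk latency, cross-check with iostat.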
4. Network Performance Monitoring
Network issues can cause slow application response or connection drops.
A. Interface Statistics
# netstat -i
Shows:
- Input/output packets
- Errors
- Collisions
High errors indicate cabling or NIC issues.
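Because these counters are cumulative since boot, it helps to list only the interfaces that have recorded any problems and then compare two runs taken some minutes apart. A minimal sketch follows; the function name nic_errors is hypothetical, and it assumes the header row starts with "Name" and includes Ierrs, Oerrs, and Collis columns, and that a POSIX awk is available.

```shell
# nic_errors: read "netstat -i" output on stdin and print interfaces that
# report input errors, output errors, or collisions.
nic_errors() {
    awk '
        # Header row (repeated for the IPv6 table): map names to positions.
        $1 == "Name" { for (i = 1; i <= NF; i++) col[$i] = i; next }
        ("Ierrs" in col) && NF >= col["Collis"] {
            ie = $(col["Ierrs"]) + 0
            oe = $(col["Oerrs"]) + 0
            c = $(col["Collis"]) + 0
            if (ie + oe + c > 0)
                printf "%s: ierrs=%d oerrs=%d collis=%d\n", $1, ie, oe, c
        }
    '
}

# Example on a Solaris host:
# netstat -i | nic_errors
```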
Protocol statistics:
# netstat -s
Displays TCP retransmissions, UDP errors, etc.
B. Data Link Monitoring
# dladm show-link
# dladm show-phys
dladm show-link lists each datalink with its class, MTU, and state. dladm show-phys shows:
- Link speed
- Duplex settings
- Physical state
Useful to verify 1G/10G speed settings.
C. Packet Capture
# snoop -d net0
# tcpdump -i net0
Used for deep network troubleshooting and packet-level analysis. snoop is the native Solaris capture tool; tcpdump can be used where it is installed.
5. Process and Thread Monitoring
A. prstat with Threads
# prstat -L
Shows per-thread CPU usage.
Useful for multithreaded applications like Java.
B. ps Command Custom Output
# ps -eo pid,user,ppid,stime,etime,pcpu,pmem,args
Important fields:
- pcpu – CPU percentage
- pmem – Memory usage percentage
- etime – Elapsed time
Helps find long-running or misbehaving processes.
C. Process Memory Mapping
# pmap -x <pid>
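Shows, for each mapping in the process address space:
- Size and resident memory (Kbytes, RSS)
- Private (anonymous) memory (Anon)
- The mapped file or shared segment, or [ heap ] / [ stack ]
Useful for memory leak investigation: a heap whose Anon value keeps growing between runs is the usual sign of a leak.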
6. System Load Monitoring
A. uptime
# uptime
Displays load averages for:
- 1 minute
- 5 minutes
- 15 minutes
If the load average stays above the number of CPUs, the system may be overloaded.
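Dividing each load average by the CPU count makes the comparison explicit: values above 1.00 mean more runnable work than processors. The sketch below does that arithmetic on uptime output. The function name load_per_cpu is an assumption, as is the "load average: a, b, c" output format, and a POSIX awk is required.

```shell
# load_per_cpu NCPU: read uptime output on stdin and print the 1, 5, and
# 15 minute load averages divided by NCPU.
load_per_cpu() {
    awk -v ncpu="$1" -F"load average: " '
        NF > 1 && ncpu > 0 {
            split($2, la, /, */)
            printf "%.2f %.2f %.2f\n", la[1] / ncpu, la[2] / ncpu, la[3] / ncpu
        }
    '
}

# Example on a Solaris host; psrinfo prints one line per virtual processor.
# uptime | load_per_cpu "$(psrinfo | wc -l)"
```

A 1-minute value well above the 15-minute value suggests a recent spike; the reverse suggests the load is subsiding.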
B. w Command
# w
Shows:
- Logged-in users
- Load average
- What users are running
7. Solaris Resource and Performance Tuning
Performance tuning should only be done after identifying bottlenecks.
A. Process Resource Control
# prctl -n process.max-file-descriptor -i process <pid>
This displays the current file descriptor limit for the given process. Resource controls cover:
- File descriptors
- CPU time
- Memory limits
B. Projects and Resource Pools
Solaris allows grouping users/processes into projects.
Edit /etc/project and use:
# projmod
# newtask -p projectname
Used to control:
- CPU shares
- Memory caps
- Process limits
C. Kernel Parameter Tuning
Using ndd:
# ndd -get /dev/tcp tcp_max_buf
# ndd -set /dev/tcp tcp_max_buf 4194304
Common TCP tuning:
- tcp_recv_hiwat
- tcp_xmit_hiwat
- tcp_conn_req_max_q
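These affect network throughput and scalability. Values set with ndd are lost at reboot, and on Solaris 11 the ipadm command (ipadm show-prop, ipadm set-prop) replaces ndd for TCP/IP tunables.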
D. ZFS Performance Tuning
Check properties:
# zfs get all <dataset>
Common tuning:
Disable access time updates:
# zfs set atime=off <dataset>
Other useful properties:
- recordsize (important for databases)
- compression
- logbias
- primarycache
8. Performance Troubleshooting Methodology
A structured approach:
1. Check load average (uptime)
2. Check CPU (mpstat, prstat)
3. Check memory (vmstat, sar -r)
4. Check disk (iostat)
5. Check network (netstat)
6. Identify top resource-consuming processes
7. Tune only after identifying root cause
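The checklist lends itself to a small collection script that runs each check in order and saves the output for later comparison. The following is a minimal sketch: the function names run_step and collect_snapshot, the /var/tmp output location, and the sampling intervals are all assumptions chosen for illustration.

```shell
# run_step OUTDIR LABEL COMMAND [ARGS...]: run one check and save its output
# to OUTDIR/LABEL.txt; record a note instead if the command is not installed.
run_step() {
    dir=$1 label=$2
    shift 2
    if command -v "$1" >/dev/null 2>&1; then
        "$@" > "$dir/$label.txt" 2>&1
    else
        echo "skipped: $1 not found" > "$dir/$label.txt"
    fi
}

# collect_snapshot [OUTDIR]: walk the checklist in order (hypothetical helper)
# and print the directory that holds the results.
collect_snapshot() {
    out=${1:-/var/tmp/perf-$(date +%Y%m%d-%H%M%S)}
    mkdir -p "$out" || return 1
    run_step "$out" 1-load    uptime
    run_step "$out" 2-mpstat  mpstat 1 5
    run_step "$out" 3-prstat  prstat -s cpu -n 10 1 1
    run_step "$out" 4-vmstat  vmstat 5 5
    run_step "$out" 5-sar     sar -r 1 5
    run_step "$out" 6-iostat  iostat -x 5 3
    run_step "$out" 7-netstat netstat -i
    echo "$out"
}

# Example on a Solaris host:
# collect_snapshot /var/tmp/perf-before-tuning
```

Taking one snapshot before a tuning change and one after gives a like-for-like comparison instead of relying on memory of what the numbers looked like.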