Linux Performance Tuning
Linux performance tuning is essential for running efficient production workloads. Understanding how the kernel manages CPU, memory, disk, and network resources allows you to identify bottlenecks and optimize accordingly.
The USE Method
Brendan Gregg's USE (Utilization, Saturation, Errors) method provides a systematic approach to performance analysis:
Utilization: What percentage of the resource is busy?
Saturation: How much extra work is queued?
Errors: How many error events are there?
Apply this to CPU, memory, storage, and network resources to quickly identify the bottleneck.
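As a rough sketch, the commands below map each USE question onto a standard tool; the thresholds you apply to their output are workload-dependent assumptions, not fixed rules.
# Utilization: how busy is each CPU?
mpstat -P ALL 1 3
# Saturation: compare the run queue ("r" column) against the core count
vmstat 1 3
nproc
# Errors: kernel-logged hardware and driver errors
dmesg --level=err | tail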
CPU Performance Tuning
Monitoring Tools
# Real-time CPU monitoring
htop
# Per-process CPU usage
top -o %CPU
# CPU statistics and context switches
vmstat 1 5
# Detailed per-CPU utilization
mpstat -P ALL 1
High context switch rates (above 50,000 per second per core) may indicate inefficient application architecture. Use pidstat -w to identify processes causing excessive context switches.
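As a sketch (column position assumed from sysstat's pidstat output, where cswch/s is the fourth field), you can rank processes by voluntary context switches:
# Sample context switches for 5 seconds, then rank by cswch/s
pidstat -w 5 1 | sort -k4 -nr | head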
Kernel Parameters
# /etc/sysctl.d/99-performance.conf
kernel.sched_min_granularity_ns = 3000000
kernel.sched_wakeup_granularity_ns = 4000000
kernel.sched_migration_cost_ns = 500000
kernel.sched_nr_migrate = 32
These scheduler parameters reduce latency for interactive workloads. Adjust carefully -- aggressive settings can hurt throughput. Note that on kernels 5.13 and later these tunables moved out of sysctl into /sys/kernel/debug/sched/.
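To apply the file without a reboot and confirm a value took effect (on a kernel that still exposes these knobs via sysctl):
# Load the settings and verify one of them
sysctl -p /etc/sysctl.d/99-performance.conf
sysctl kernel.sched_migration_cost_ns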
Memory Tuning
Monitoring Memory
# Memory usage overview
free -h
# Detailed memory breakdown
cat /proc/meminfo
# Page fault statistics
sar -B 1
# Top memory consumers
ps aux --sort=-%mem | head
Check sar -B for page fault rates. A high majflt/s value means pages are being read back in from disk, a sign of memory pressure or active swapping -- add more RAM or reduce memory pressure.
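To attribute major faults to specific processes, ps can report the cumulative counters (totals since each process started, so sample twice to estimate rates):
# Processes with the most major page faults since start
ps -eo pid,maj_flt,min_flt,comm --sort=-maj_flt | head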
Swappiness
# Reduce swapping tendency (default is 60)
vm.swappiness = 10
# Set temporarily
sysctl vm.swappiness=10
For database servers, set swappiness to 1 to avoid swapping. For desktops and general-purpose servers, 10-20 balances responsiveness with memory efficiency.
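A minimal way to persist the setting across reboots (the file name here is an arbitrary choice):
# Write a drop-in file and reload all sysctl configuration
echo 'vm.swappiness = 10' > /etc/sysctl.d/99-swappiness.conf
sysctl --system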
Transparent Huge Pages
Disable THP for database workloads:
echo never > /sys/kernel/mm/transparent_hugepage/enabled
echo never > /sys/kernel/mm/transparent_hugepage/defrag
THP can cause latency spikes in database systems due to memory defragmentation pauses.
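The echo commands above do not survive a reboot. One common approach is a small systemd oneshot unit; the sketch below uses a hypothetical unit name:
# /etc/systemd/system/disable-thp.service (hypothetical name)
[Unit]
Description=Disable Transparent Huge Pages

[Service]
Type=oneshot
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/enabled'
ExecStart=/bin/sh -c 'echo never > /sys/kernel/mm/transparent_hugepage/defrag'

[Install]
WantedBy=multi-user.target
Enable it with systemctl daemon-reload followed by systemctl enable --now disable-thp.service.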
Disk I/O Tuning
I/O Scheduler
Choose the right I/O scheduler for your workload:
# Check current scheduler
cat /sys/block/nvme0n1/queue/scheduler
# Set to none for NVMe, mq-deadline for spinning disks
echo none > /sys/block/nvme0n1/queue/scheduler
Modern NVMe drives perform best with the none scheduler, which skips request reordering the hardware no longer needs. Spinning disks benefit from mq-deadline, which minimizes seek times.
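Like the THP setting, this echo is lost on reboot. A udev rule (a sketch; the file name is arbitrary) reapplies the choice automatically:
# /etc/udev/rules.d/60-ioschedulers.rules
ACTION=="add|change", KERNEL=="nvme[0-9]n[0-9]", ATTR{queue/scheduler}="none"
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="mq-deadline"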
Monitoring Disk Performance
# I/O statistics per device
iostat -x 1
# Process-level I/O
iotop
# Block I/O latency histogram (biolatency from bcc-tools)
biolatency
High await times (above 20ms on spinning disks) indicate saturation. Check %iowait in top and the %util and aqu-sz columns in iostat -x for confirmation; svctm is deprecated and has been removed from recent sysstat releases.
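For a latency distribution rather than a single average, biolatency prints a histogram; the path below assumes the Ubuntu bcc package layout (it also ships as biolatency-bpfcc):
# Block I/O latency histogram in milliseconds, 1-second intervals, 5 samples
sudo /usr/share/bcc/tools/biolatency -m 1 5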
Network Tuning
Kernel Network Parameters
# /etc/sysctl.d/99-network.conf
net.core.somaxconn = 65535
net.core.netdev_max_backlog = 50000
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_fin_timeout = 15
net.ipv4.tcp_keepalive_time = 300
net.ipv4.tcp_keepalive_intvl = 60
net.ipv4.tcp_keepalive_probes = 5
net.core.rmem_max = 134217728
net.core.wmem_max = 134217728
net.ipv4.tcp_rmem = 4096 87380 134217728
net.ipv4.tcp_wmem = 4096 65536 134217728
Increase socket buffer sizes for high-throughput applications. tcp_tw_reuse lets the kernel reuse sockets in TIME_WAIT state for new outgoing connections, which matters for servers that open connections at a high rate.
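As with the other sysctl files, load the settings and spot-check the values:
# Apply the network settings and verify the buffer limits
sysctl -p /etc/sysctl.d/99-network.conf
sysctl net.core.rmem_max net.ipv4.tcp_rmem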
Network Monitoring
# Per-interface statistics
sar -n DEV 1
# Socket statistics
ss -s
# Connection tracking statistics (requires conntrack-tools)
conntrack -S
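To judge whether the TIME_WAIT tuning above matters on a given host, count the sockets currently in that state:
# Sockets currently in TIME_WAIT
ss -tan state time-wait | wc -l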