How to Investigate High Disk I/O in Linux (Deep Dive)

High disk I/O is one of the most common and most misunderstood causes of slow Linux servers.

This guide is written the way real incidents are handled: from symptom to root cause, with clear interpretation at every step.


What "High Disk I/O" Actually Means

It does NOT just mean "disk is busy".

It can mean:

  • Too many reads/writes
  • Slow storage latency
  • Queue buildup (requests waiting)
  • Blocking processes (D-state)

👉 Your goal is to identify which one.


Step 0: Confirm the Symptom (from workflow)

Usually you arrive here after:

  • High load
  • Low CPU usage
  • Slow responses
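
On Linux, the load average counts tasks blocked in uninterruptible sleep as well as runnable ones, which is why load can climb while the CPU sits idle. A minimal sketch for putting the load figure in context follows. The function name load_per_cpu and its optional arguments are illustrative, not part of any standard tool.

```shell
# load_per_cpu [LOADAVG_FILE] [CPU_COUNT]
# Prints the 1-minute load average divided by the CPU count.
# Defaults (both Linux-specific): /proc/loadavg and the output of nproc.
load_per_cpu() {
    loadavg_file="${1:-/proc/loadavg}"
    cpus="${2:-$(nproc)}"
    # The first field of /proc/loadavg is the 1-minute load average.
    awk -v cpus="$cpus" '{ printf "%.2f\n", $1 / cpus }' "$loadavg_file"
}
```

Run load_per_cpu with no arguments on the affected host. A value well above 1 combined with mostly idle CPUs suggests tasks are waiting on something other than CPU, and disk is the usual suspect.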

Step 1: iostat (Primary Tool)

Install if needed:

# Linux (RHEL/CentOS)
yum install sysstat -y
# Linux (Debian/Ubuntu)
apt install sysstat -y

Run:

iostat -x 1

Key Columns (MUST understand)

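Field       Meaning
%util       Percentage of time the device was busy servicing requests
await       Average time per request in ms, including time spent queued (newer sysstat versions show r_await and w_await instead)
r/s, w/s    Read/write requests completed per second
avgqu-sz    Average queue length (named aqu-sz in newer sysstat versions)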

How to Interpret iostat (THIS is where expertise is)

Case A: %util ~100%

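Disk is saturated. On SSDs, NVMe drives and RAID arrays, which serve requests in parallel, %util can read 100% before the device reaches its limit, so confirm with await and queue size.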


Case B: High await (>50ms HDD, >10ms SSD)

Latency problem


Case C: High avgqu-sz

Requests are piling up (queue bottleneck)
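
The three cases can be combined into one quick check. This is a sketch only: classify_io is a hypothetical helper, the latency limits are the ones from Case B, and the 95% utilisation and queue > 1 cut-offs are assumptions to tune for your hardware.

```shell
# classify_io UTIL AWAIT_MS QUEUE [hdd|ssd]
# Applies Cases A, B and C to one device's iostat numbers.
# Assumed cut-offs: util >= 95, queue > 1. Latency limits come from Case B.
classify_io() {
    awk -v util="$1" -v await="$2" -v queue="$3" -v media="${4:-hdd}" 'BEGIN {
        limit = (media == "ssd") ? 10 : 50
        n = 0
        if (util  >= 95)    { print "saturated (Case A)"; n++ }
        if (await >  limit) { print "high latency (Case B)"; n++ }
        if (queue >  1)     { print "queue buildup (Case C)"; n++ }
        if (n == 0)           print "no obvious disk bottleneck"
    }'
}
```

For example, classify_io 99.5 12 0.8 hdd reports only Case A, while a device that matches several cases gets one line per case. More than one line usually means the cases are feeding each other, for example a saturated disk causing the queue to grow.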


Step 2: iotop (Who is Causing I/O)

iotop

Note: iotop requires root privileges on Linux. It is not available natively on macOS.

Look for:

  • High read/write processes
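
If iotop is not installed, the kernel's per-process counters in /proc/PID/io give a rough substitute. The sketch below ranks processes by write_bytes. The name top_writers and its arguments are hypothetical, reading other users' counters generally needs root, and the numbers are cumulative since each process started, not rates.

```shell
# top_writers [PROC_ROOT] [COUNT]
# Lists the processes with the largest cumulative write_bytes counters,
# as "BYTES PID NAME". PROC_ROOT defaults to /proc, COUNT to 5.
top_writers() {
    proc_root="${1:-/proc}"
    count="${2:-5}"
    for io_file in "$proc_root"/[0-9]*/io; do
        [ -r "$io_file" ] || continue
        pid_dir="${io_file%/io}"
        # Match the field name exactly so cancelled_write_bytes is ignored.
        bytes=$(awk '$1 == "write_bytes:" { print $2 }' "$io_file" 2>/dev/null)
        name=$(cat "$pid_dir/comm" 2>/dev/null)
        [ -n "$bytes" ] && echo "$bytes ${pid_dir##*/} $name"
    done | sort -rn | head -n "$count"
}
```

Running top_writers twice a few seconds apart and comparing the counters shows which processes are writing right now, as opposed to those that wrote heavily in the past.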

Step 3: vmstat (System View)

vmstat 1

Look at:

  • wa → I/O wait

Interpretation

  • High wa → CPU waiting on disk
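
To watch wa over time without reading every column, the value can be pulled out of the vmstat output. A minimal sketch, assuming the standard procps vmstat layout with a header row that names the columns. wa_values is a hypothetical helper name.

```shell
# wa_values: reads vmstat output on stdin and prints the wa column,
# locating it by header name instead of a fixed position.
wa_values() {
    awk '
        # Until the column is known, look for a header field named "wa".
        col == 0 { for (i = 1; i <= NF; i++) if ($i == "wa") col = i; next }
        # Data rows start with a number; repeated header rows are skipped.
        $1 ~ /^[0-9]+$/ { print $col }
    '
}
```

Usage: vmstat 1 5 | wa_values. The first sample vmstat prints is an average since boot, so judge the current state from the later lines. The b column in the same output counts processes in uninterruptible sleep and tends to rise together with wa.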

Step 4: Check D-State Processes

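ps -eo pid,stat,cmd | awk 'NR == 1 || $2 ~ /^D/'

Matching on the STAT column avoids false hits from any command line that merely contains a "D". Processes in state D (uninterruptible sleep) are blocked, usually on I/O, which confirms blocking I/O.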


Step 5: Identify Files Causing I/O

lsof -p PID

Look for:

  • Logs
  • Databases
  • Temporary files
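
lsof output for a busy process can be long. As a rough alternative, the symlinks under /proc/PID/fd can be resolved and filtered down to regular files. This is a sketch, not a replacement for lsof: open_regular_files is a hypothetical helper, and it needs to run as the process owner or as root.

```shell
# open_regular_files PID
# Prints the regular files a process holds open, one path per line.
open_regular_files() {
    for fd in /proc/"$1"/fd/*; do
        target=$(readlink "$fd" 2>/dev/null) || continue
        # Sockets, pipes and anonymous inodes do not resolve to absolute
        # paths; devices and deleted files fail the -f test.
        case "$target" in
            /*) [ -f "$target" ] && echo "$target" ;;
        esac
    done | sort -u
}
```

Usage: open_regular_files PID, with the PID taken from the previous steps. Paths under /var/log, a database data directory or /tmp map directly onto the three categories above.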

Real Production Case (Step-by-Step)

Situation

  • Website slow
  • Load = 15
  • CPU idle high

Investigation

  1. iostat → %util = 100%
  2. vmstat → wa high
  3. iotop → MySQL heavy writes

Root Cause

Database writing too frequently to disk


Fix Options

  • Optimize queries
  • Add caching
  • Move to faster storage (SSD/NVMe)

Common Causes of High Disk I/O

  • Database overload
  • Log files growing rapidly
  • Backup processes
  • Malware / abuse scripts
  • Slow or failing disks

Time-Based Thinking

Sudden spike

  • Traffic surge
  • Backup job started

Gradual increase

  • Database growth
  • Log accumulation
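
To tell a file that is merely large from one that is actively growing, measure its size twice. A minimal sketch, where growth_rate is a hypothetical helper and the 10-second default interval is an arbitrary assumption.

```shell
# growth_rate FILE [SECONDS]
# Prints how many bytes FILE grew during the interval (default 10 s).
growth_rate() {
    file="$1"
    interval="${2:-10}"
    before=$(wc -c < "$file") || return 1
    sleep "$interval"
    after=$(wc -c < "$file") || return 1
    echo $(( after - before ))
}
```

Usage: growth_rate /path/to/suspect.log 60. A negative result means the file was truncated or rotated during the interval.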

Common Mistakes

โŒ Looking only at %util

Latency (await) is equally important


โŒ Killing processes blindly

Fix cause, not symptom


When This Matters in Production

Disk I/O issues affect:

  • Databases
  • File servers
  • Streaming workloads

Final Takeaway

High disk I/O is not just a metric.

It is a signal of system pressure.

An expert does not just see 100% usage.

👉 They understand why the disk is struggling.
