My Experience with NCQ Errors
What started as a normal day in my homelab turned into hours of debugging, unexpected freezes, and a long chain of NCQ timeout errors from my Seagate Exos 12TB drive. I swapped cables, changed power supplies, and rebooted the system dozens of times, but nothing fixed it. Eventually, I learned the hard way how sensitive enterprise HDDs can be to power stability and controller behavior.
The Beginning of the Problem
The issue began when I noticed strange freezes while accessing my 12TB Seagate Exos (SATA, model Exos X18/X20, 7200 RPM).
The logs were full of lines like:
ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x6 frozen
ata1.00: failed command: READ FPDMA QUEUED
ata1.00: cmd 60/08
This is the infamous NCQ (Native Command Queuing) error. READ FPDMA QUEUED is the NCQ read command, so these lines mean a queued read never completed and the kernel froze the port to recover.
Sometimes the disk would disappear for a second, sometimes the entire OS froze for minutes.
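For anyone who wants to see how often these errors show up, here is a minimal sketch that counts them in the kernel log. The function name count_ncq_failures is my own, and the pattern is only based on the log lines quoted above (plus the matching WRITE FPDMA QUEUED command), so treat it as a starting point rather than a complete filter.

```sh
# Count libata "failed command" lines for NCQ reads/writes in a kernel log on stdin.
# count_ncq_failures is a hypothetical helper name; the pattern follows the log lines above.
count_ncq_failures() {
    # grep -c prints the number of matching lines; "|| true" keeps a count of 0 from
    # being treated as a failure by callers.
    grep -cE 'ata[0-9]+\.[0-9]+: failed command: (READ|WRITE) FPDMA QUEUED' || true
}

# Typical use (reading the kernel log usually needs root):
#   dmesg | count_ncq_failures
#   journalctl -k -b | count_ncq_failures
```

A count that keeps growing under load is a good hint that the errors are tied to queued I/O rather than to a one-off glitch.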
It looked like either:
- the drive was failing,
- the controller was unstable,
- or power delivery wasn’t sufficient.
At the beginning, I obviously feared the worst: a dying HDD.
My First Attempts (and Failures)
Like anyone who works with Linux storage long enough, I tried all the usual steps:
- Changing the SATA cable
- Changing the SATA port
- Running SMART long tests
- Reseating the drive
- Checking the PSU rails
Nothing helped. The NCQ errors kept coming back.
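For reference, the SMART part of those checks can be scripted roughly like this. It is a sketch, not what I ran at the time: the helper name smart_key_attrs and the choice of attributes are my own assumptions. Reallocated and pending sectors usually point at the media, while a rising UDMA_CRC_Error_Count usually points at the cable or link.

```sh
# Print selected SMART attributes (name=raw value) from `smartctl -A` output on stdin.
# Assumes the usual ATA attribute table layout: name in column 2, raw value in column 10.
# smart_key_attrs is a hypothetical helper name.
smart_key_attrs() {
    awk '$2 ~ /^(Reallocated_Sector_Ct|Current_Pending_Sector|UDMA_CRC_Error_Count)$/ { print $2 "=" $10 }'
}

# Typical use (needs root and smartmontools; sdX is a placeholder):
#   smartctl -t long /dev/sdX       # start the extended self-test
#   smartctl -l selftest /dev/sdX   # read the result once it has finished
#   smartctl -A /dev/sdX | smart_key_attrs
```

If all of these stay at zero while the ATA errors continue, a failing disk or a bad cable becomes a less likely explanation.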
The Real Cause: NCQ on Enterprise Drives
Eventually I learned a painful truth:
Some Seagate Exos drives behave badly with NCQ on certain controllers.
Enterprise disks have aggressive queueing behavior, and cheaper onboard SATA controllers (especially in mini-PCs or consumer motherboards) can't always keep up.
The result:
NCQ commands get stuck → they time out → the kernel freezes the port and resets the link.
So I decided to test the one thing I hadn't tried yet: disabling NCQ.
The Fix: Disable NCQ Completely
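I disabled NCQ for the drive at runtime by setting its queue depth to 1 (as root, with sdX replaced by the actual device name):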
echo 1 > /sys/block/sdX/device/queue_depth
and the effect was immediate:
- no more freezes
- no more ATA errors
- no more I/O stalls
- disk running perfectly stable
After hours of debugging, that one line fixed everything.
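Here is a slightly more defensive version of that one-liner, as a sketch. The function name disable_ncq and its optional second argument (a sysfs root, so the function can be tried on a copy of the directory tree) are my own additions and not part of any tool.

```sh
# Set the queue depth of one disk to 1, which effectively turns NCQ off for it.
# $1: block device name such as sdb (placeholder)
# $2: sysfs root, default /sys (hypothetical override, only for trying this on a copy)
disable_ncq() {
    qd_file="${2:-/sys}/block/$1/device/queue_depth"
    if [ ! -w "$qd_file" ]; then
        echo "cannot write $qd_file (wrong device name, or not root?)" >&2
        return 1
    fi
    echo "before: $(cat "$qd_file")"
    echo 1 > "$qd_file"
    echo "after:  $(cat "$qd_file")"
}

# Typical use, as root:
#   disable_ncq sdb
```

One pitfall worth knowing: `sudo echo 1 > /sys/...` fails because the redirection runs in the unprivileged shell. Use a root shell, or `echo 1 | sudo tee /sys/block/sdX/device/queue_depth`.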
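That sysfs setting is lost on reboot, so to make it permanent I added a kernel parameter to the kernel command line in /etc/default/grub (the GRUB_CMDLINE_LINUX_DEFAULT line on Debian-style systems):
libata.force=noncq
Then I regenerated the GRUB configuration (as root; this is the Debian/Ubuntu command):
update-grub
And rebooted.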
And from that moment on, the drive has been flawless.
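After the reboot it is worth confirming that the parameter actually took effect. A minimal sketch, assuming the parameter is passed on the kernel command line as described above; the function name cmdline_has_noncq is hypothetical.

```sh
# Succeed if the kernel command line contains a libata.force option that includes noncq.
# $1: file to read, default /proc/cmdline
# cmdline_has_noncq is a hypothetical helper name.
cmdline_has_noncq() {
    # One option per line, then match libata.force=[ID:]noncq, alone or inside a comma list.
    tr ' ' '\n' < "${1:-/proc/cmdline}" | grep -qE '^libata\.force=(.*[:,])?noncq(,|$)'
}

# Typical use:
#   cmdline_has_noncq && echo "noncq is on the kernel command line"
#   cat /sys/block/sdX/device/queue_depth   # I would expect this to read 1 with NCQ off
```

Note that plain libata.force=noncq applies to every SATA port. The kernel documentation describes an optional port prefix, so something like libata.force=1.00:noncq should limit it to the device that shows up as ata1.00 in the log. I only used the global form myself.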
Why Disabling NCQ Works
NCQ is designed to improve performance by letting the drive accept multiple commands at once and reorder them internally.
But with certain combinations of:
- enterprise drives (Exos, IronWolf Pro)
- consumer SATA controllers
- high queue depths
- specific firmware versions
NCQ can cause more harm than good.
Disabling it costs some performance, mostly on random and parallel workloads, but in my case it brought:
- stable read/write behavior
- zero freezes
- no timeouts
- consistent operation under load
And on a 12TB drive used for bulk storage, consistency matters more than a small NCQ performance gain.
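To see which drives on a system match that combination, it helps to list model, firmware revision, and current queue depth side by side. This is a sketch that reads the standard sysfs attributes for sd* disks; the function name list_queue_depths and the optional sysfs root argument are my own additions.

```sh
# List every sd* disk with its model, firmware revision and current queue depth,
# separated by tabs.
# $1: sysfs root, default /sys (hypothetical override, only for trying this on a copy)
list_queue_depths() {
    for dev in "${1:-/sys}"/block/sd*; do
        # Skip anything without a readable queue_depth (including an unmatched glob).
        [ -r "$dev/device/queue_depth" ] || continue
        printf '%s\t%s\t%s\t%s\n' \
            "${dev##*/}" \
            "$(sed 's/ *$//' "$dev/device/model")" \
            "$(sed 's/ *$//' "$dev/device/rev")" \
            "$(cat "$dev/device/queue_depth")"
    done
}

# Typical use:
#   list_queue_depths
```

A disk that shows a queue depth of 1 already has queuing turned off. Anything higher is still using NCQ.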
Final Thoughts
In the end, the disk was never faulty.
The problem was how NCQ behaved on this particular drive and controller combination.
If you own a Seagate Exos drive and are facing:
- READ FPDMA QUEUED
- frozen command
- NCQ errors
- ata1.00: exception Emask...
Try disabling NCQ before assuming your HDD is failing.
It saved my drive, and it saved me hours of frustration.
