[Bug 229745] ahcich: CAM status: Command timeout

bugzilla-noreply at freebsd.org
Wed Feb 6 12:36:23 UTC 2019


https://bugs.freebsd.org/bugzilla/show_bug.cgi?id=229745

sec <szczepan at szczepan.net> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |szczepan at szczepan.net

--- Comment #19 from sec <szczepan at szczepan.net> ---
I'm also having the same problem. Before 11.2, my ZFS mirror was working fine.
After the upgrade, these CAM errors started to show up.
I replaced the PSU, one of the drives, and the cables - still the same.
Checked SMART for the drives and ran short/long tests - the drives are fine.
Even did memtest :)

My observations:
- when the HDDs are connected directly to the motherboard - there are timeout
errors
- when the HDDs are connected to a PCIe SATA controller - there are
unrecoverable CRC errors

I tried disabling NCQ and the cache - nothing helps.

Strange thing is, when the mirror is broken (only one drive is connected) -
everything is fine. The errors only start to show up when both drives are
connected in the same mirror.

My drives are WD Gold 1TB:
1. WDC WD1005FBYZ-01YCBB2 RR07
2. WDC WD1005FBYZ-01YCBB1 RR04

Before, I had two RR07 drives working fine. After the upgrade, the errors
showed up, so I RMA'd one of them and got an RR04 - which didn't fix the error.
Also, the problem shows up on only one of the drives in the mirror.
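
For anyone wanting to check the "only one drive" observation on their own
box, here is a minimal sketch that counts "CAM status:" lines per device in
kernel log text. The device prefix layout is assumed to look like
"(ada1:ahcich1:0:0:0):"; the SAMPLE lines are purely illustrative and are not
taken from my logs, and the helper name count_cam_errors is made up for this
example.

```python
import re
from collections import Counter

# Assumed line layout: "(ada1:ahcich1:0:0:0): CAM status: Command timeout".
CAM_STATUS = re.compile(
    r"\((?P<dev>[a-z]+\d+):(?P<bus>[a-z]+\d+):[\d:]+\): "
    r"CAM status: (?P<status>.+)"
)


def count_cam_errors(log_text):
    """Return a Counter keyed by (device, status) for each CAM status line."""
    counts = Counter()
    for line in log_text.splitlines():
        match = CAM_STATUS.search(line)
        if match:
            counts[(match.group("dev"), match.group("status").strip())] += 1
    return counts


# Illustrative input only, not real output from the affected machine.
SAMPLE = """\
(ada1:ahcich1:0:0:0): CAM status: Command timeout
(ada1:ahcich1:0:0:0): Retrying command
(ada1:ahcich1:0:0:0): CAM status: Command timeout
(ada0:ahcich0:0:0:0): CAM status: Uncorrectable parity/CRC error
"""

if __name__ == "__main__":
    for (dev, status), n in sorted(count_cam_errors(SAMPLE).items()):
        print(f"{dev}: {status} x{n}")
```

In practice the text would come from the saved dmesg output or
/var/log/messages rather than SAMPLE. A count that stays at zero for one
mirror member while the other keeps growing would match what I'm seeing.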

Right now I'm in the process of migrating data to an 11.1 ZFS pool, then I will
downgrade my server back to releng/11.1 and check if the pool works fine.

I also have two SAMSUNG HD642JJ 1AA01113 drives connected in another mirror -
no issues with those.

Tried swapping cables, ports, etc. - the problem follows those drives, together.

Hoping for some solution to this one, because it will block any upgrade to 12 :)


