8.0-RELEASE: disk IO temporarily hangs up (ZFS or ATA related problem)

Fri Dec 18 09:11:55 UTC 2009

On 18/12/2009, at 9:39 PM, Phil Murray wrote:

> 
> 
> On 18/12/2009, at 9:15 PM, "Alexander Zagrebin" <alexz at visp.ru> wrote:
> 
>> Big thanks for your reply!
>> 
>>>> I use onboard ICH7 SATA controller with two disks attached:
>>>> 
>>>> atapci1:<Intel ICH7 SATA300 controller>  port 0x30c8-0x30cf,0x30ec-0x30ef,0x30c0-0x30c7,0x30e8-0x30eb,0x30a0-0x30af irq 19 at device 31.2 on pci0
>>>> atapci1: [ITHREAD]
>>>> ata2:<ATA channel 0>  on atapci1
>>>> ata2: [ITHREAD]
>>>> ata3:<ATA channel 1>  on atapci1
>>>> ata3: [ITHREAD]
>>>> ad4: 1430799MB<Seagate ST31500541AS CC34>  at ata2-master SATA150
>>>> ad6: 1430799MB<WDC WD15EADS-00P8B0 01.00A01>  at ata3-master SATA150
>>>> 
>>>> The disks are used for a mirrored ZFS pool.
>>>> I have noticed that the system periodically locks up on disk operations.
>>>> After approx. 10 min of very slow disk i/o (several KB/s) the speed of disk
>>>> operations returns to normal.
>>>> gstat has shown that the problem is in ad6.
>>>> For example, here is the filtered output of iostat -x 1:
>>>> 
>>>>                        extended device statistics
>>>> device     r/s   w/s    kr/s    kw/s wait svc_t  %b
>>>> ad6      985.1   0.0  5093.9     0.0    0   0.2  23
>>>> ad6      761.8   0.0  9801.3     0.0    1   0.4  31
>>>> ad6      698.7   0.0  9215.1     0.0    0   0.4  30
>>>> ad6      434.2 513.9  5903.1 13658.3   48  10.2  55
>>>> ad6        3.0 762.8   191.2 28732.3    0  57.6  99
>>>> ad6       10.0   4.0   163.9     4.0    1   1.6   2
>>>> 
>>>> Before this line we have normal operation.
>>>> Then the behaviour of ad6 changes (pay attention to the high average access time
>>>> and the percentage of "busy" significantly greater than 100):
>>>> 
>>>> ad6        0.0   0.0     0.0     0.0    1   0.0   0
>>>> ad6        1.0   0.0     0.5     0.0    1 1798.3 179
>>>> ad6        1.0   0.0     1.5     0.0    1 1775.4 177
>>>> ad6        0.0   0.0     0.0     0.0    1   0.0   0
>>>> ad6       10.0   0.0    75.2     0.0    1 180.3 180
>>>> ad6        0.0   0.0     0.0     0.0    1   0.0   0
>>>> ad6        1.0   0.0     2.0     0.0    1 1786.7 178
>>>> ad6        0.0   0.0     0.0     0.0    1   0.0   0
>>>> 
>>>> And so on for about 10 minutes.
>>>> Then the disk i/o returns to normal:
>>>> 
>>>> ad6      139.4   0.0  8860.5     0.0    1   4.4  61
>>>> ad6      167.3   0.0 10528.7     0.0    1   3.3  55
>>>> ad6       60.8 411.5  3707.6  8574.8    1  19.6  87
>>>> ad6      163.4   0.0 10334.9     0.0    1   4.4  72
>>>> ad6      157.4   0.0  9770.7     0.0    1   5.0  78
>>>> ad6      108.5   0.0  6886.8     0.0    0   3.9  43
>>>> 
>>>> There are no ata error messages either in the system log or on the
>>>> console.
>>>> The manufacturer's diagnostic test passes on ad6 without any errors.
>>>> ad6 also contains a swap partition.
>>>> I have tried to run several (10..20) instances of dd, which read and write
>>>> data from and to the swap partition simultaneously, but this has not caused
>>>> the lockup.
>>>> So it is probable that this problem is ZFS related.
>>>> 
>>>> I have been forced to switch ad6 to the offline state... :(
>>>> 
>>>> Any suggestions on this problem?
>>>> 
>>> I have also been experiencing the same problem with a different
>>> disk/controller (via mpt on a vmware machine). During the same period I
>>> notice that system cpu usage hits 80+% and top shows the zfskern process
>>> being the main culprit. At the same time I've discovered the
>>> kstat.zfs.misc.arcstats.memory_throttle_count sysctl rising. ARC is also
>>> normally close to the arc_max limit.
>> 
>> My case is different.
>> 1. CPU usage is near 0%
>> 2. zfs's sysctls don't change significantly during the
>>  "normal operation" -> "lockup" -> "normal" transition
>> 3. ARC size is far from its limits,
>> kstat.zfs.misc.arcstats.memory_throttle_count: 0
>> Here are my actions, observations and conclusions:
>> 1. I have tried to change the placement of the disks on the sata channels.
>>  Nothing has changed - the problem is still on the WD15EADS, although it
>>  became ad4.
>>  So the issue isn't in the south bridge, sata cables and so on.
>> 2. I have tried to detach ad6 from the pool, zero its system area, and
>>  reattach it.
>>  Of course, resilvering started. During resilvering 250 GB was copied
>>  without lockups or delays. While resilvering, I periodically tried to load
>>  the drive with read operations (dd if=/dev/ad6 of=/dev/null ...).
>>  But after resilvering and several minutes of normal mirror operation,
>>  lockups appeared again.
>>  So the drive seems to be ok and we have a software problem?
>> 3. I have noticed that lockups often happen during postgresql activity.
>>  postgresql often uses sync. So I have tried to disable ZIL.
>>  No success.
>> 4. "IDE LED" is constantly on during lockups.
>>  So these really are read/write delays.
>> 5. I see two variants of zfskern's state:
>>  a) it is constantly in vgeom:io
>>  b) it is in either the zio->io_ state (when active), or in tx->tx_s (when
>>     idle).
>>     During lockups it is mostly in zio->io_.
>>  What is the difference between vgeom:io and zio->io_/tx->tx_s?
>> 
>> Maybe the problem is in ata? The WD15EADS is from the "green" series of drives.
> 
> The WD green drives have a feature called Time Limited Error Recovery where the disk can spend several minutes trying to read a bad block etc.
> 
> It plays havoc with RAID arrays which is why WD recommend you don't use the green drives in arrays. They have more info about the "feature" in the WD FAQ/knowledgebase
> 

Sorry, TLER is the feature that 'fixes' the problem, see:

http://wdc.custhelp.com/cgi-bin/wdc.cfg/php/enduser/std_adp.php?p_faqid=1397&p_created=1131638613&p_sid=vfyE1KPj&p_accessibility=0&p_redirect=&p_srch=1&p_lva=&p_sp=cF9zcmNoPTEmcF9zb3J0X2J5PSZwX2dyaWRzb3J0PSZwX3Jvd19jbnQ9MTcsMTcmcF9wcm9kcz0yMjcsMjk0JnBfY2F0cz0mcF9wdj0yLjI5NCZwX2N2PSZwX3BhZ2U9MSZwX3NlYXJjaF90ZXh0PXJhaWQ!&p_li=&p_topview=1

Sounds like your drive is going into the recovery procedure...
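
If you want to catch these episodes without watching iostat by hand, something like the rough sketch below might help. It assumes the 8-column "iostat -x" layout shown in your mail (device, r/s, w/s, kr/s, kw/s, wait, svc_t, %b). The names parse_iostat_line, is_stalled and find_stalls, and both thresholds (%b at or above 100 with less than 100 KB/s moved), are my own guesses for illustration and not anything iostat or ZFS defines.

```python
from typing import List, NamedTuple, Optional


class Sample(NamedTuple):
    device: str
    reads_per_s: float
    writes_per_s: float
    kb_read_per_s: float
    kb_written_per_s: float
    wait: int
    svc_t_ms: float
    busy_pct: float


def parse_iostat_line(line: str) -> Optional[Sample]:
    """Parse one data row of 'iostat -x'; return None for headers and noise."""
    # Drop mail quote markers and indentation before splitting into columns.
    fields = line.lstrip("> \t").split()
    if len(fields) != 8:
        return None
    try:
        nums = [float(f) for f in fields[1:]]
    except ValueError:
        return None  # column header row such as "device r/s w/s ..."
    return Sample(fields[0], nums[0], nums[1], nums[2], nums[3],
                  int(nums[4]), nums[5], nums[6])


def is_stalled(s: Sample, busy_pct: float = 100.0,
               max_kb_per_s: float = 100.0) -> bool:
    """Hypothetical heuristic: device reports busy but moves almost no data."""
    throughput = s.kb_read_per_s + s.kb_written_per_s
    return s.busy_pct >= busy_pct and throughput < max_kb_per_s


def find_stalls(text: str, device: str = "ad6") -> List[Sample]:
    """Return the samples for `device` that match the stall heuristic."""
    samples = (parse_iostat_line(line) for line in text.splitlines())
    return [s for s in samples
            if s is not None and s.device == device and is_stalled(s)]


# A few rows copied from the output quoted above.
SAMPLE = """\
device     r/s   w/s    kr/s    kw/s wait svc_t  %b
ad6      434.2 513.9  5903.1 13658.3   48  10.2  55
ad6        3.0 762.8   191.2 28732.3    0  57.6  99
ad6        0.0   0.0     0.0     0.0    1   0.0   0
ad6        1.0   0.0     0.5     0.0    1 1798.3 179
ad6       10.0   0.0    75.2     0.0    1 180.3 180
ad6      139.4   0.0  8860.5     0.0    1   4.4  61
"""

if __name__ == "__main__":
    for s in find_stalls(SAMPLE):
        print(s.device, s.svc_t_ms, s.busy_pct)
```

Feeding it the saved output of "iostat -x 1" and logging a timestamp per flagged sample should show whether the stalls line up with anything else on the box.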

> 
>> Maybe I have a problem with its power management?
>> Is there a method to completely reset the sata channel and drive?
>> Will atacontrol reinit do it?
> 
> 
> 
>> 
>> Any help is welcome.
>> 
>> -- 
>> Alexander Zagrebin
>> 
>> _______________________________________________
>> freebsd-current at freebsd.org mailing list
>> http://lists.freebsd.org/mailman/listinfo/freebsd-current
>> To unsubscribe, send any mail to "freebsd-current-unsubscribe at freebsd.org"