strange i/o and tty lockups/delays on 6.1, strange sync() issue

Marek Blaszkowski mb at kis.p.lodz.pl
Fri Nov 2 06:06:16 PDT 2007


Hi,
hangs described in this thread are very similar to my problem. I'll try describe it
more clear:

1. hangs are periodical, min. time beetween 'hangs' I observed was 15 secs. max was
about 20 secs. Hang time is not const, min hang time is about 2 secs max 30 secs.
2. hangs affect only access file(s) (read or write) or loop (exec) commands, interactive
apps already looped like top or iostat durning 'hang' update their stats smoothly.
Beetween hangs file access is fast and smooth.
3. durning hang there is no ANY message from the kernel or dev drivers, kernel dmesg is silent.
Smartctl doesn't reports anything significant.
4. there is no high (abnormal) CPU or I/O load durning hang (maybe just top or iostat can't
'detect' it), always durnig hang HDD LED lights up (so, it's 100% i/o disk related).

I done more investigations, I wrote simple app which do an infinite loop with open file,
write 1024 bytes buffer do some time calculations beetwen fopen, fwrite and fclose calls
(it's simple time measurement - how many usec these calls takes), sleeps for one sec
and do loop again, it just print on stdout time measurement. I ran test for one minute:
---loop #1---
fopen time: 0.000072 secs
fwrite time: 0.000029 secs
fclose time: 0.000160 secs
---loop #2---
fopen time: 0.000068 secs
fwrite time: 0.000009 secs
fclose time: 0.000073 secs
---loop #3---
fopen time: 0.000063 secs
fwrite time: 0.000015 secs
fclose time: 0.000063 secs
---loop #4---
fopen time: 0.000065 secs
fwrite time: 0.000010 secs
fclose time: 0.000054 secs
---loop #5---
fopen time: 0.000066 secs
fwrite time: 0.000010 secs
fclose time: 0.000053 secs
---loop #6---
fopen time: 0.000086 secs
fwrite time: 0.000012 secs
fclose time: 0.000057 secs
---loop #7---
fopen time: 0.000058 secs
fwrite time: 0.000011 secs
fclose time: 0.000057 secs
---loop #8---
fopen time: 0.000079 secs
fwrite time: 0.000013 secs
fclose time: 0.000053 secs
[... skip next 52 loops ...]

As you see nothing special, hangs are not 'detectable' by this measurement,
when I added fsync() after fwrite, the same test gives me:
---loop #1---
fopen time: 0.000080 secs
fwrite time: 0.009770 secs <- notice longer time of fwrite with fsync but it's normal, system just don't cache or buffer writes
fclose time: 0.000074 secs
---loop #2---
fopen time: 0.000057 secs
fwrite time: 0.009043 secs
fclose time: 0.000054 secs
---loop #3---
fopen time: 0.000070 secs
fwrite time: 0.025974 secs
fclose time: 0.000062 secs
---loop #4---
fopen time: 0.000108 secs
fwrite time: 0.017486 secs
fclose time: 0.000066 secs
---loop #5---
fopen time: 0.000070 secs
fwrite time: 0.009742 secs
fclose time: 0.000057 secs
---loop #6---
fopen time: 0.000070 secs
fwrite time: 0.001636 secs
fclose time: 0.000097 secs
---loop #7---
fopen time: 0.000099 secs
fwrite time: 4.918546 secs <-- here was 'hang'
fclose time: 0.000072 secs
---loop #8---
fopen time: 0.000074 secs
fwrite time: 2.346643 secs
fclose time: 0.000076 secs
---loop #9---
fopen time: 0.000081 secs
fwrite time: 1.294689 secs
fclose time: 0.000069 secs
---loop #10---
fopen time: 0.000093 secs
fwrite time: 0.017332 secs
fclose time: 0.000076 secs
---loop #11---
fopen time: 0.000096 secs
fwrite time: 0.009575 secs
fclose time: 0.000071 secs


Take a look at loop #7, flushing 1024 bytes buffer takes about 5 secs, in the same time on other screen
consoles I noticed hang durning starting (exec) top, top exec tooks about 7 secs).

More thoughts:
Hangs appeared suddenly in lifetime of this machine, there was no any hardware or kernel changes. Just
some day I noticed these strange 'hangs'. Rebooting doesn't help. This machine runs  now 5 java/tomcat instances,
it seems like hangs appeared suddenly (in the past) after adding more javas/tomcats instances
(machine don't swapping at all, there is enough memory). Now I try find out which JVM (if it's java related)
couse the problems by eliminating proccess one by one and check if hangs still appear.


regards,
-- 
Marek B.

Rainer Hurling napisał(a):
> Hi Marek,
> 
> a thread on stable@ is dealing with this kind of hangings:
> 
> http://www.nabble.com/forum/ViewPost.jtp?post=13296788
> 
> Perhaps there is some relationship between them?
> 
> Regards,
> Rainer
> 
> 
> Marek Blaszkowski schrieb:
> 
>> Hi,
>> my config:
>> thunder k8sr, dual opteron 244, 8GB RAM, FreeBSD 6.1, sata disk
>>
>>
>> for time to time (every 10 or 20 secs) any process which try write or
>> read any, even small file,
>> is locked for a couple seconds (about 7 or 9 secs), very iritating
>> when I try save
>> small files edited in vim (size about 2-3 kbytes), it's looks like this:
>> #vim /path/to/file
>> ...editing..
>> :w
>> and now (afetr w command) vim freeze for 7-10 secs until it displays
>> "file written".
>>
>> Delays accours on tty, too. In vim I can't move around edited file
>> (up/down in rows),
>> vim just got hang, strange is that I can switch (durning vim lockup)
>> beetwen virtual screen
>> consoles.
>>
>> System is not swapping (4GB ram free), iostat says:
>>
>> #iostat 1
>>       tty             ad4              ad6             cpu
>>  tin tout  KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
>>    1  353 17.94  45  0.78   0.64   0  0.00  11  1  6  1 82
>>    0  360 16.00  72  1.12   0.00   0  0.00  17  0  2  1 81
>>    0  120 16.00  77  1.20   0.00   0  0.00   6  0  3  1 90
>>    0  120 14.66  73  1.04   0.00   0  0.00  41  0 11  1 47
>>    0  120 16.00  78  1.22   0.00   0  0.00  27  0  1  0 72
>>    0  120 16.22  74  1.17   0.00   0  0.00   0  0  0  0 100
>>    0  122 17.22  79  1.33   0.00   0  0.00   8  1  5  0 85
>>    0  120 16.46  79  1.27   0.00   0  0.00  14  0  3  2 81
>>    0  120 13.85  48  0.65   0.00   0  0.00   7  0  4  0 88
>>    0  120 15.13  79  1.17   0.00   0  0.00  12  0  4  1 83
>>    0  120 11.11  83  0.90   0.00   0  0.00   8  0  3  1 89
>>
>> as You see, there is no heavy i/o load.
>>
>> Strange things is I have two identical setup of the same system
>> configuration with the same
>> apps running (web server mirror) but other motherboard (K8W), on the
>> second system there no such delay/lockup
>> issue.
>> It's looks like some other process calls sync() on every 7 or 10
>> seconds and it couse such i/o delay.
>> Other question is why sync takes so much time, when i/o disk
>> performance its not bad (tested with dd is about 50MB/sec)
>> and iostat (durning sync calls) doesn't show any high disk transfers.
>> Example:
>> #time sync
>> real    2m0.853s
>> user    0m0.000s
>> sys     0m0.145s
>>
>> 2 mins was the longest time amount I saw durning tests, in most cases
>> sync takies about 10 secs, when sync was called on the
>> other tty iostat doesn't print any high transfer, never it exceed 1MB/s.
>> Qestion is: if disks transfers (using cp or dd) are high, why sync
>> waits so long ?
>>
>> I played with vfs.write_behind=0 but it doesn't change anything.
>>
>> PS.
>> I checkced motherboards manuals: K8W and K8SR have the same sata chip
>> (Silicon Image SIL3114), so I think
>> above problems are not ata driver issues.
>>
>> regards,
> 
> 



More information about the freebsd-amd64 mailing list