amrd disk performance drop after running under high load

Alexey Popov lol at chistydom.ru
Wed Oct 17 01:07:46 PDT 2007


Hi.

Kris Kennaway wrote:
>>>>>> After some time of running under high load, disk performance becomes
>>>>>> extremely poor. During those periods 'systat -vm 1' shows something like
>>>>>> this:
>>>> This web service is similar to YouTube. This server is a video store. I
>>>> have around 200G of *.flv (flash video) files on the server.
>>>> I run lighttpd as a web server. Disk load is usually around 50%,
>>>> network output 100Mbit/s, 100 simultaneous connections. CPU is mostly
>>>> idle.
>> This is very unlikely, because I have 5 other video storage servers
>> with the same hardware and software configuration and they work fine.
> Clearly something is different about them, though.  If you can 
> characterize exactly what that is then it will help.
I can't see any difference except the date of installation. I really did
compare all the parameters and found nothing interesting.

>> At first glance one can say that problem is in Dell's x850 series or 
>> amr(4), but we run this hardware on many other projects and they work 
>> well. Also Linux on them works.
> 
> OK but there is no evidence in what you posted so far that amr is 
> involved in any way.  There is convincing evidence that it is the mbuf 
> issue.
Why are you sure this is the mbuf issue? For example, if there is a real
problem with amr or VM causing the disk slowdown, then when it occurs the
network subsystem will see a different load pattern. Instead of just
quickly sending large amounts of data, the system will have to hold a
large number of simultaneous connections waiting for data. Can this cause
high mbuf contention?

> 
>> And a few hours ago I received feedback from Andrzej Tobola; he has the
>> same problem on FreeBSD 7 with a Promise ATA software mirror:
> Well, he didn't provide any evidence yet that it is the same problem, so
> let's not become confused by feelings :)
I think he is talking about the disk being 100% busy while processing only
~5 transfers/sec.
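
To put that figure in perspective, here is a rough back-of-the-envelope sketch in Python. It assumes the simple utilization law (busy fraction = transfer rate x mean service time) applies to the numbers systat reports; the function name is made up for illustration and is not part of any tool.

```python
def mean_service_time(utilization, transfers_per_sec):
    """Estimate mean time per transfer from the utilization law U = X * S.

    utilization: busy fraction of the disk (1.0 means 100% busy).
    transfers_per_sec: completed transfers per second.
    This is an approximation, not a measurement.
    """
    if transfers_per_sec <= 0:
        raise ValueError("transfers_per_sec must be positive")
    return utilization / transfers_per_sec


# Figures from the report above: 100% busy at about 5 transfers/sec.
seconds = mean_service_time(1.0, 5.0)
print(f"about {seconds * 1000:.0f} ms per transfer")
```

Under that assumption each transfer takes on the order of 200 ms, which is why ~5 transfers/sec at 100% busy looks like a stalled disk rather than a merely loaded one.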

>> So I can conclude that FreeBSD has a long-standing bug in VM that
>> can be triggered when serving a large amount of static data (much
>> bigger than memory size) at high rates. Possibly this only applies to
>> large files like mp3 or video.
> It is possible, we have further work to do to conclude this though.
I forgot to mention that I have pmc and kgmon profiling data for both good
and bad periods. But I don't have enough knowledge to interpret it
correctly, and I am not sure whether it can help.

Also, I now run nginx instead of lighttpd on one of the problematic
servers. It seems to work much better: sometimes there are peaks in
disk load, but the disk does not become very slow and network output does
not change. The difference with nginx is that it runs as multiple
processes, while lighttpd by default has only one process. I have now
configured lighttpd on another server to run with multiple workers. I'll
see if it helps.

What else can I try?

With best regards,
Alexey Popov


More information about the freebsd-stable mailing list