processes not getting fair share of available disk I/O (was: Re: TCP parameters and interpreting tcpdump output )

Kris Kennaway kris at obsecurity.org
Thu Dec 7 16:09:46 PST 2006


On Thu, Dec 07, 2006 at 03:21:46PM +0000, Dieter wrote:
> > > > > > > 	hw.ata.wc=0
> > > > > >         ^^^^^^^^^^^
> > > > > > "Make my hard drive go reeeeally slow please (just in case I crash)" :)
> > > > >
> > > > > Slower, yes, but not *that* slow.
> > > > >
> > > > > Normal ls : 0.032 second.  Two processes using same disk, multiply by two,
> > > > > so 0.064 second.  Maybe the multiplier is more than 2, call it 10x, so
> > > > > 0.32 second.  But I'm seeing a factor of over 9100x.
> > > >
> > > > Humour me and turn it back on, then see what happens.
> > >
> > > Where is the knob to turn the write cache on/off on a per-drive basis
> > > in FreeBSD?  I can do this in NetBSD, but the only knob I can find in
> > > FreeBSD affects all drives, and requires a reboot.
> > 
> > Yes, I think you need to do it globally at boot time.
> > 
> > > Humour me and read the Subject line.  The ls does not get its fair share
> > > of disk I/O.
> > >
> > > Both times are with the disk's write cache in write-through mode.
> > > I'm not comparing times with the write cache in different modes.
> > > I'm comparing ls by itself against ls competing with cp.
> > 
> > Your cp is going to be running synchronously, i.e. spend a lot of time
> > waiting on the disk to perform the writes.  This may well be the cause
> > of your problem.  Once we have established whether or not it is the
> > cause, we can proceed to whether this behaviour can be improved.
> 
> I submitted PR 106340 asking for a way to control the disk write cache on
> a per disk basis like NetBSD can.  Meanwhile, I added a PATA via USB disk,
> which, judging from the write speed, appears to be immune to hw.ata.wc=0.
> 
> So I now have a disk which has the write cache on, is connected via a different
> controller, and thus uses a different device driver.
> 
> I still see the same problems.  Writing to one disk *significantly* slows down
> writing to another disk.  Even if one process is at normal default priority
> and the other is running at rtprio 5.  Regardless of which process uses the
> USB disk and which uses the direct-to-chipset disk.  Even if the rtprio 5
> process only needs a very small fraction of the disk bandwidth, it still gets
> slowed down to the point that data is lost.
> 
> My current SWAG is that writing to a disk requires some spl/mutex/lock that
> is global across all disks on the system.  And this spl/mutex/lock is a
> bottleneck.

In the case of USB devices, yes: all USB accesses require Giant, so
all USB I/O is serialized.  This isn't true in general, though, unless
you have debug.mpsafevfs=0 set (or forced because of something else,
e.g. quotas).  If it is set to 0, then all filesystem I/O is serialized
(and it may be even worse if there are device drivers in the I/O path
that also require Giant, like USB).
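
As a rough illustration of the serialization described above, here is a
toy Python model.  It is not FreeBSD kernel code.  It assumes every
request is submitted at t=0 and served strictly first-come-first-served,
and that a single global lock (standing in for Giant) is held for the
whole duration of each request.  The function name, the disk names
"ad4" and "da0", and the timings are all made up for the example.

```python
def completion_times(requests, global_lock):
    """Return the finish time of each request, in submission order.

    requests:    list of (disk, service_time) tuples, all submitted at
                 t=0 and served first-come-first-served.
    global_lock: if True, one lock (a stand-in for Giant) is held for
                 the whole duration of every request, so requests to
                 different disks cannot overlap.  If False, each disk
                 has its own independent queue.
    """
    busy_until = {}   # queue name -> time at which that queue drains
    finished = []
    for disk, service_time in requests:
        # With a global lock every request lands in the same queue.
        queue = "all" if global_lock else disk
        start = busy_until.get(queue, 0.0)
        busy_until[queue] = start + service_time
        finished.append(busy_until[queue])
    return finished


# Hypothetical workload: ten 0.5 s bulk writes to one disk, followed by
# a single 0.01 s write to a different disk.
workload = [("ad4", 0.5)] * 10 + [("da0", 0.01)]

serialized = completion_times(workload, global_lock=True)
independent = completion_times(workload, global_lock=False)

print("small write finishes at %.2f s with a global lock" % serialized[-1])
print("small write finishes at %.2f s with per-disk queues" % independent[-1])
```

In this model the small write to the second disk waits behind all of
the bulk writes only when the lock is shared.  Real kernel scheduling
is more involved, so treat the numbers as illustrative only.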

However, I don't know what you mean by "data is lost".  Data should
never be lost from the filesystem regardless of how slowly the I/O is
happening, unless there's something else going wrong (e.g. a driver
bug).

Also, rtprio should not be used in general - see the manpage.  Were
you using rtprio in your original scenario?  It can easily cause
resource starvation.

Kris


More information about the freebsd-questions mailing list