Pluggable Disk Schedulers in GEOM

Robert Watson rwatson at FreeBSD.org
Fri Jan 5 14:38:43 UTC 2007


On Fri, 5 Jan 2007, lulf at stud.ntnu.no wrote:

> Anyway, I'd like to research a bit on this topic to just see how much it 
> does matter with different I/O scheduling for different purposes.

I think working on this is interesting, but the one caution I'd have is that 
it's possible to introduce serious priority inversions through any complex 
scheduling scheme for I/O.  In our VFS, I/O is frequently performed while 
holding locks or things that act like locks -- for example, during a directory 
lookup, while pulling an inode off the disk, etc.  The I/O will be initiated 
by one thread, but then other threads will end up waiting for it also.  If 
there is a naive mapping of initiating thread priority to I/O request 
priority, then you can end up with high priority threads being blocked on 
low priority tasks, leading to nasty starvation effects, especially if the 
scheduler allows indefinite waiting for I/O at a low priority.  This, at a 
rough approximation, is the problem that Kirk ran into when trying to rate 
limit bgfsck I/O in the kernel: key vnode locks, such as directory vnode 
locks, would be held across de-prioritized I/O, and high priority processes 
would then block on the vnode locks.  There are various ways to address this, 
not least priority propagation (in which I/O priority is increased to match 
the priority of the highest priority thread waiting on the I/O request), but I 
wanted to make sure you had it on the list of design concerns.
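
To make the priority propagation idea concrete, here is a minimal sketch in C. 
It is not GEOM or VFS code: the io_request structure and the functions 
io_request_init, io_request_add_waiter and io_pick_next are hypothetical names 
invented for illustration, and it assumes a numerically lower value means a 
more urgent priority.  Each request starts at the initiating thread's priority 
and is boosted whenever a more urgent thread begins waiting on it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of a queued I/O request; lower value = more urgent. */
struct io_request {
	int base_prio;	/* priority of the thread that initiated the I/O */
	int eff_prio;	/* effective priority the scheduler sorts on */
	int nwaiters;	/* threads currently blocked on this request */
};

static void
io_request_init(struct io_request *req, int initiator_prio)
{
	req->base_prio = initiator_prio;
	req->eff_prio = initiator_prio;
	req->nwaiters = 0;
}

/*
 * Called when another thread starts waiting on the request, for example
 * because it blocked on a vnode lock held across the I/O.  The request is
 * raised to the waiter's priority if the waiter is more urgent.
 */
static void
io_request_add_waiter(struct io_request *req, int waiter_prio)
{
	req->nwaiters++;
	if (waiter_prio < req->eff_prio)
		req->eff_prio = waiter_prio;
}

/* Return the index of the most urgent request; ties go to the earliest. */
static size_t
io_pick_next(const struct io_request *reqs, size_t n)
{
	size_t best = 0;

	assert(n > 0);
	for (size_t i = 1; i < n; i++)
		if (reqs[i].eff_prio < reqs[best].eff_prio)
			best = i;
	return (best);
}
```

With a naive mapping the scheduler would sort on base_prio alone, so a request 
issued by a background thread would stay at the back of the queue no matter 
who was waiting on it.  Sorting on eff_prio avoids that case.  The sketch only 
ever raises a request's priority; a real design would also have to decide what 
happens when a waiter gives up, and how to learn which threads are waiting in 
the first place.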

Robert N M Watson
Computer Laboratory
University of Cambridge

