Writing a plain text file to disk from kernel space

John Baldwin jhb at freebsd.org
Tue Jun 5 19:51:58 UTC 2007


On Thursday 24 May 2007 11:26:58 pm Lawrence Stewart wrote:
> Comments inline...
> 
> Dag-Erling Smørgrav wrote:
> > Lawrence Stewart <lstewart at room52.net> writes:
> >   
> >> Dag-Erling Smørgrav <des at des.no> writes:
> >>     
> >>> Since you are writing kernel code, I assume you have KDB/DDB in your
> >>> kernel and know how to use it.
> >>>       
> >> I don't know how to use them really. Thus far I haven't had a need for
> >> really low level debugging tools... seems that may have changed
> >> though! Any good tutorials/pointers on how to get started with kernel
> >> debugging?
> >>     
> >
> > The handbook and FAQ have information on debugging panics.  Greg Lehey
> > (grog@) does a tutorial on kernel debugging, you can probably find
> > slides online (or just ask him).
> >   
> 
> 
> For reference, I found what looks to be a very comprehensive kernel 
> debugging reference here: 
> http://www.lemis.com/grog/Papers/Debug-tutorial/tutorial.pdf
> 
> Greg certainly knows the ins and outs of kernel debugging!
> 
> >   
> >>> kio_write probably blocks waiting for the write to complete.  You can't
> >>> do that while holding a non-sleepable lock.
> >>>       
> >> So this is where my knowledge/understanding gets very hazy...
> >>
> >> When a thread blocks waiting for some operation to complete or event
> >> to happen, the thread effectively goes to sleep, correct?
> >>     
> >
> > It depends on the type of lock used, but mostly, yes.
> >
> >   
> >> Looking at the kio_write code in subr_kernio.c, I'm guessing the lock
> >> that is causing the trouble is related to the "vn_lock" function call?
> >>     
> >
> > What matters is that kio_write() may sleep and therefore can't be called
> > while holding a non-sleepable lock.
> >
> >   
> >> I don't understand though why the vnode lock would be set up in such a
> >> way that when the write blocks whilst waiting for the underlying
> >> filesystem to signal everything is ok, it causes the kernel to panic!
> >>     
> >
> > You cannot sleep while holding a non-sleepable lock.  You need to find
> > out which locks are held at the point where you call kio_write(), and
> > figure out a way to delay the kio_write() call until those locks are
> > released.
> >
> >   
> >> How do I make the lock "sleepable" or make sure the thread doesn't try
> >> to go to sleep whilst holding the lock?
> >>     
> >
> > You can't make an unsleepable lock sleepable.  You might be able to
> > replace it with a sleepable lock, but you would have to go through every
> > part of the kernel that uses the lock and make sure that it works
> > correctly with a sleepable lock.  Most likely, it won't.
> >
> >   
> 
> 
> Thanks for the explanations. I'm starting to get a better picture of 
> what's actually going on.
> 
> So it seems that there is no way I can call kio_write from within the 
> function that is acting as a pfil output hook, because it blocks at some 
> point whilst doing the disk write, which makes the kernel unhappy 
> because pfil code is holding a non-sleepable mutex somewhere.
> 
> If you read my other message from yesterday, I still can't figure out 
> why this only happens with outbound TCP traffic, but anyways...
> 
> I'll have a bit more of a think about it and get back to the list shortly...

Use a task to defer the kio_write() to a taskqueue.  To do this properly you 
have to malloc() the per-call state using M_NOWAIT, which can fail, so you 
must handle a NULL return.  If you are doing this for every packet, you are 
probably better off using malloc() to put items on a queue and having a 
single global task that drains the queue each time it runs, calling 
kio_write() for each item.
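
The queue-and-drain pattern described above can be modelled roughly as
follows.  This is only a userland sketch under stated assumptions, not kernel
code: a pthread mutex stands in for a kernel mutex, plain malloc() stands in
for malloc(9) with M_NOWAIT, and sink_write() stands in for kio_write().  The
names log_item, log_enqueue, log_drain and sink_write are hypothetical.  In a
real module the drain function would presumably be the handler of a task
scheduled through taskqueue(9).

```c
#include <pthread.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical per-packet record; real contents are up to the hook. */
struct log_item {
	struct log_item *next;
	char text[64];
};

/* Pending records, appended at the tail so order is preserved. */
static struct log_item *pending_head;
static struct log_item **pending_tailp = &pending_head;
static pthread_mutex_t pending_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Called from the context that must not sleep (the pfil hook in this
 * thread).  Returns 0 on success, -1 if the allocation failed and the
 * record was dropped.
 */
static int
log_enqueue(const char *text)
{
	struct log_item *item;

	item = malloc(sizeof(*item));	/* kernel analogue: M_NOWAIT */
	if (item == NULL)
		return (-1);
	item->next = NULL;
	strncpy(item->text, text, sizeof(item->text) - 1);
	item->text[sizeof(item->text) - 1] = '\0';

	pthread_mutex_lock(&pending_lock);
	*pending_tailp = item;
	pending_tailp = &item->next;
	pthread_mutex_unlock(&pending_lock);
	/* A kernel version would schedule the global task here. */
	return (0);
}

/*
 * Run later from a context that is allowed to sleep.  The whole list is
 * detached under the lock, then written out with no lock held.
 * Returns the number of records written.
 */
static int
log_drain(void (*write_fn)(const char *))
{
	struct log_item *batch, *item;
	int n = 0;

	pthread_mutex_lock(&pending_lock);
	batch = pending_head;
	pending_head = NULL;
	pending_tailp = &pending_head;
	pthread_mutex_unlock(&pending_lock);

	while ((item = batch) != NULL) {
		batch = item->next;
		write_fn(item->text);	/* may block; no lock is held */
		free(item);
		n++;
	}
	return (n);
}

/* Stand-in for the file on disk, so the sketch can be exercised. */
static char sink[256];

static void
sink_write(const char *text)
{
	strncat(sink, text, sizeof(sink) - strlen(sink) - 1);
}
```

The hook would call log_enqueue() for each packet and drop the record when it
returns -1; the deferred handler would call log_drain(sink_write).  Detaching
the list before writing keeps the time spent under the lock short and means
the blocking write never happens with the lock held.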

Regarding sleepable vs. non-sleepable locks: getting preempted by an 
interrupt is not considered "sleeping".  Sleeping means voluntarily yielding 
the CPU to wait for an event, for example via msleep() or a condition 
variable.  Note that interrupt handlers can acquire non-sleepable locks.  If 
you sleep while holding a non-sleepable lock, an interrupt handler that needs 
that lock may be unable to run until some async event (like disk I/O) 
completes.

-- 
John Baldwin


More information about the freebsd-hackers mailing list