Query about bhyve's blockif_cancel and the signalling mechanisms

Tue Dec 13 06:32:19 UTC 2016

Hi Ian,

> To recap my understanding of the mechanisms at work (glossing over the
> queue handling and condvars involved etc), the bhyve block_if
> infrastructure registers a callback for SIGCONT with the mevent
> subsystem, which is a kevent/kqueue thing which delivers events to the
> main thread (mevent_dispatch is the last thing in main()) it also sets
> SIGCONT to SIG_IGN.

  That's correct. The intent was to have the signal delivered via the 
kevent callback rather than standard signal delivery.

> When a disk controller device model wants to
> cancel a block request (e.g. in ahci_port_stop) it calls
> blockif_cancel which sends a SIGCONT to the blkio thread which has
> claimed the request, notionally to kick it out of whatever blocking
> system call it is in and cause it to return an error to the device
> model.

  Yep, that's correct.

> The main thing I do not follow is whether or not the blkio thread is
> actually interrupted at all when the signal has been configured to be
> delivered via the kevent/kqueue mechanisms to a 3rd unrelated thread.

  It is interrupted on FreeBSD.

> I've dug around in the FreeBSD kevent and signal man pages but I
> cannot find any part which describes anything of the semantics which
> bhyve seems to be relying on (which seems to be that the system call
> in the target thread will return EINTR at some point before the thread
> which is "handling" the signal via kevent/kqueue sees that event).
>
> Have I missed something here or is bhyve relying on some subtle
> underlying semantics?

  I didn't think it was too FreeBSD-specific - if a thread is blocked in a
system call, sending it a signal should force the call to return early
(typically with EINTR) on most Unices.

> I have a secondary concern which is what happens if the IO thread is
> on its way to making a blocking system call in blockif_proc but has
> not actually done so when the signal is delivered. It seems like it
> would simply carry on and make the blocking call with perhaps
> unexpected consequences (i/o getting wedged, perhaps only until a
> second reset attempt). I've not actually seen this happening though
> and there's a chance I'm simply over thinking things after staring at
> them for so long!

  I believe this case is handled - I discussed this at length with Tycho 
when the code was committed a while back.

  Tycho - any thoughts?

later,

Peter.