svn commit: r276626 - head/sys/kern

Sat Jan 3 20:17:51 UTC 2015

On Sat, 3 Jan 2015, Hans Petter Selasky wrote:

> Log:
>  Rework r276532 a bit. Always avoid recursing into the console drivers
>  clients, hence they might not handle it very well. This change allows
>  debugging mutex problems with kernel console drivers when
>  "debug.witness.skipspin=0" is set in the boot environment.

This keeps getting worse.  Console drivers are required to be reentrant
so that console i/o works in any context, including reentering the
console driver a couple of times.
   (One level of reentrancy always occurs when the console driver is
   stepped through using ddb -- then the debugger traps occur while
   the console driver is in any state, and the debugger proceeds to
   reenter the console driver to do its i/o.  ddb doesn't support
   debugging itself, so the reentering is normally limited to 1 level.
   However, once the console driver supports one level of reentrancy
   it is easy to support any number, and another level or two is useful
   for reporting bugs in ddb.  E.g., buggy locking in ddb or the console
   driver
     (almost any locking there is buggy, like the locking bugs moved in
     this commit)
   then witness might detect the bug and try to print a message about it.)
No console driver is reentrant enough to work right
   (only sio is close; syscons is very broken, and I suspect vt and usb
   (keyboards) are worse.  Full reentrancy is not possible; all that
   can be asked for is that new i/o is done, and the driver and device
   states are not corrupted; old and interrupted i/o must not be waited
   for without bound and it may be corrupted)
but that is another bug suite.  It cannot be fixed here.

> Modified: head/sys/kern/kern_cons.c
> ==============================================================================
> --- head/sys/kern/kern_cons.c	Sat Jan  3 16:48:08 2015	(r276625)
> +++ head/sys/kern/kern_cons.c	Sat Jan  3 17:21:19 2015	(r276626)
> @@ -512,6 +512,13 @@ cnputs(char *p)
> 	int unlock_reqd = 0;
>
> 	if (use_cnputs_mtx) {
> +	  	/*
> +		 * NOTE: Debug prints and/or witness printouts in
> +		 * console driver clients can cause the "cnputs_mtx"
> +		 * mutex to recurse. Simply return if that happens.
> +		 */

This comment is too verbose, but not verbose enough to mention as many
as 1% of the bugs.

It is somewhat misleading in saying "Debug prints" (sic).  Printing by
the debugger (ddb) doesn't go through here (except for the bug that
debugger entry uses printf()).  Debugging printf()s in console driver
code (including code called by the console driver) might recurse here,
but there shouldn't be any such printf()s unless you are debugging this
deadlock bug.  The deadlock bug for witness is a special case of this.
Console code shouldn't have any LORs that would cause a witness warning
(unless you put the LORs to debug its handling...).  However, it is
reasonable to recurse here for witness warnings from console non-clients:
the console driver may be trapped or interrupted, and then the trap
handler may try to acquire a lock, and if this lock is witnessed it
may trigger a witness warning.
   (The main case of a trap is a debugger trap.  Only non-maskable
   interrupts can occur, since spinlocks disable interrupts.  The main
   case of an interrupt is a START or STOP IPI.  When I debugged console
   i/o in syscons, I generated races and deadlocks mainly by sprinkling
   printfs() in trap() and IPI handlers.  Production code shouldn't have
   such printfs, but they should work.  In theory, a fatal trap can occur
   while in console code.  Then panic() will call here.

> +		if (mtx_owned(&cnputs_mtx))
> +			return;

The largest obvious new bug here is losing all output from panic()
(and possibly other critical output) for panics that occur while
the mutex is held by the same CPU.  The previous change made this
case work (for reentrant console drivers) instead of deadlocking.

> 		mtx_lock_spin(&cnputs_mtx);
> 		unlock_reqd = 1;
> 	}

One reason the console driver bugs cannot be fixed here is that this
lock only covers one entry point to the console driver.  Only line-buffered
i/o goes through here.  ddb output never goes through here.  Line
buffering is only done if PRINTF_BUFR_SIZE is configured.  PRINTF_BUFR_SIZE
gives a very bad implementation of serialization of output.  Old message
buffer code used atomic ops to avoid deadlock.  It was replaced by broken
locking.  The locking gives deadlock on recursion in the same way as here.
However, the race window for it (mainly in msgbuf_addchar()) is much
shorter than here and I wasn't able to demonstrate it deadlocking.

> @@ -601,13 +608,7 @@ static void
> cn_drvinit(void *unused)
> {
>
> -	/*
> -	 * NOTE: Debug prints and/or witness printouts in console
> -	 * driver clients can cause the "cnputs_mtx" mutex to
> -	 * recurse. Make sure the "MTX_RECURSE" flags is set!
> -	 */
> -	mtx_init(&cnputs_mtx, "cnputs_mtx", NULL, MTX_SPIN |
> -	    MTX_NOWITNESS | MTX_RECURSE);
> +	mtx_init(&cnputs_mtx, "cnputs_mtx", NULL, MTX_SPIN | MTX_NOWITNESS);
> 	use_cnputs_mtx = 1;
> }

cnputs() and its lock are used mainly to support the serialization of
output done under PRINTF_BUFR_SIZE.  It is obviously broken since it gives
deadlock.  It should at least have used MTX_RECURSE.  Your previous
change to add this seems to be correct in the sense of not adding new
bugs.  Any locking here cannot fix driver bugs, but it can act as a
serialization hint.  The lock here prevents interference from other
CPUs and thus improves serialization.  Making it recursive changes
deadlock to non-serialization in the unusual case of recursion.

Note that the locking here (like any normal mutex locking) is still
fundamentally broken.  The most obviously broken case is when a panic
occurs while another CPU holds the lock.  Then that CPU will be stopped
very early in panic() and never started again.  If it is stopped before
it can release the lock, then the CPU executing the panic() will
deadlock on the lock.  Returning would be little better since it would
lose all the panic() output, as happens now when the lock is held by
the same CPU.

Bruce