kern/171865: [geom] g_wither_washer() keeping a core busy
Fabian Keil
fk at fabiankeil.de
Sat Sep 22 08:40:08 UTC 2012
>Number: 171865
>Category: kern
>Synopsis: [geom] g_wither_washer() keeping a core busy
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sat Sep 22 08:40:07 UTC 2012
>Closed-Date:
>Last-Modified:
>Originator: Fabian Keil
>Release: HEAD
>Organization:
>Environment:
FreeBSD r500.local 10.0-CURRENT FreeBSD 10.0-CURRENT #484 r+345840c: Fri Sep 21 20:20:56 CEST 2012 fk@r500.local:/usr/obj/usr/src/sys/ZOEY amd64
>Description:
In http://lists.freebsd.org/pipermail/freebsd-fs/2011-June/011855.html
I reported a problem with g_wither_washer() being called more than
400000 times per second after a device got lost, keeping a CPU busy:
fk@r500 ~ $ sudo dtrace -n 'fbt:kernel:g_*:entry { @[probefunc, stack()] = count(); } tick-1sec { trunc(@, 3); printa(@); trunc(@)}'
dtrace: description 'fbt:kernel:g_*:entry ' matched 359 probes
CPU ID FUNCTION:NAME
0 32988 :tick-1sec
g_wither_washer
kernel`g_run_events+0x3b5
kernel`0xffffffff8084967e
446626
0 32988 :tick-1sec
g_trace
kernel`g_io_request+0x4d
kernel`g_io_schedule_down+0x25f
kernel`g_down_procbody+0x6d
kernel`fork_exit+0x9a
kernel`0xffffffff8084967e
230
g_trace
kernel`g_io_deliver+0x7a
kernel`g_up_procbody+0x6d
kernel`fork_exit+0x9a
kernel`0xffffffff8084967e
230
[...]
I recently found a way to reproduce the problem without using
ZFS or writing to the device.
>How-To-Repeat:
geli onetime /dev/md0
geom sched insert -a rr /dev/md0.eli
geli detach /dev/md0.eli.sched.
>Fix:
I don't have a fix, but the attached patch can be used as a workaround.
With the patch applied, setting kern.geom.debugflags to 256 prevents
g_run_events() from calling g_wither_washer(). The flag can be set back
to 0 afterwards, but the problem returns after the next geom "event".
Patch attached with submission follows:
>From 8680caf9ab5322377736f62cd4eb674a938bb445 Mon Sep 17 00:00:00 2001
From: Fabian Keil <fk at fabiankeil.de>
Date: Thu, 12 Jul 2012 12:38:00 +0200
Subject: [PATCH] Allow to use kern.geom.debugflags to prevent g_run_events()
from calling g_wither_washer()
Workaround for geom keeping a whole core busy failing
to remove a lost device.
---
sys/geom/geom_event.c | 3 +++
sys/geom/geom_int.h | 1 +
2 files changed, 4 insertions(+)
diff --git a/sys/geom/geom_event.c b/sys/geom/geom_event.c
index 3805dcd..b9bfc25 100644
--- a/sys/geom/geom_event.c
+++ b/sys/geom/geom_event.c
@@ -47,6 +47,7 @@ __FBSDID("$FreeBSD: src/sys/geom/geom_event.c,v 1.62 2012/07/29 11:51:48 mav Exp
#include <sys/kernel.h>
#include <sys/lock.h>
#include <sys/mutex.h>
+#include <sys/sysctl.h>
#include <sys/proc.h>
#include <sys/errno.h>
#include <sys/time.h>
@@ -286,6 +287,8 @@ g_run_events()
;
mtx_assert(&g_eventlock, MA_OWNED);
*i = g_wither_work;
+ if (g_debugflags & G_F_STOP_WITHERING)
+ *i = 0;
if (*i) {
mtx_unlock(&g_eventlock);
while (*i) {
diff --git a/sys/geom/geom_int.h b/sys/geom/geom_int.h
index 50f3a2a..0c11be8 100644
--- a/sys/geom/geom_int.h
+++ b/sys/geom/geom_int.h
@@ -50,6 +50,7 @@ extern int g_debugflags;
*/
#define G_F_DISKIOCTL 64
#define G_F_CTLDUMP 128
+#define G_F_STOP_WITHERING 256
/* geom_dump.c */
void g_confxml(void *, int flag);
--
1.7.11.5
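The effect of the patched check in g_run_events() can be modelled in
userland. The following is only a sketch, not kernel code: the function
name wither_work_to_run() and its parameters are hypothetical stand-ins
for g_debugflags, g_wither_work and the value stored in *i. The flag
values are the ones from geom_int.h in the patch above.

```c
#include <assert.h>

/* Flag values as they appear in sys/geom/geom_int.h after the patch. */
#define G_F_DISKIOCTL		64
#define G_F_CTLDUMP		128
#define G_F_STOP_WITHERING	256

/* The new bit must not overlap the existing debug flags shown above. */
static_assert((G_F_STOP_WITHERING & (G_F_DISKIOCTL | G_F_CTLDUMP)) == 0,
    "G_F_STOP_WITHERING overlaps an existing flag");

/*
 * Hypothetical userland model of the patched check: returns the value
 * that would end up in *i, which decides whether the withering loop
 * (and with it g_wither_washer()) runs for this event.
 */
int
wither_work_to_run(int debugflags, int wither_work)
{
	if (debugflags & G_F_STOP_WITHERING)
		return (0);	/* workaround active: skip withering */
	return (wither_work);	/* unpatched behaviour */
}
```

In this model, wither_work_to_run(256, 1) yields 0, while
wither_work_to_run(0, 1) yields 1 again. This matches the observation
that clearing the flag lets the busy loop come back on the next event.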
>Release-Note:
>Audit-Trail:
>Unformatted: