svn commit: r227530 - head/sys/vm
Konstantin Belousov
kostikbel at gmail.com
Tue Jan 8 22:49:38 UTC 2013
On Tue, Jan 08, 2013 at 03:11:34PM -0700, Kenneth D. Merry wrote:
> On Tue, Nov 15, 2011 at 14:40:00 +0000, Konstantin Belousov wrote:
> > Author: kib
> > Date: Tue Nov 15 14:40:00 2011
> > New Revision: 227530
> > URL: http://svn.freebsd.org/changeset/base/227530
> >
> > Log:
> > Update the device pager interface, while keeping the compatibility
> > layer for old KPI and KBI. New interface should be used together with
> > d_mmap_single cdevsw method.
> >
> > Device pager can be allocated with the cdev_pager_allocate(9)
> > function, which takes struct cdev_pager_ops, containing
> > constructor/destructor and page fault handler methods supplied by
> > driver.
> >
> > Constructor and destructor, called at the pager allocation and
> > deallocation time, allow the driver to handle per-object private data.
> >
> > The pager handler is called to handle page fault on the vm map entry
> > backed by the driver pager. Driver shall return either the vm_page_t
> > which should be mapped, or error code (which does not cause kernel
> > panic anymore). The page handler interface has a placeholder to
> > specify the access mode causing the fault, but currently PROT_READ is
> > always passed there.
> >
> > Sponsored by: The FreeBSD Foundation
> > Reviewed by: alc
> > MFC after: 1 month
>
> This is an old commit, but I believe there is a bug introduced by this
> commit that triggered the following panic:
>
> panic: dev_rel((null)) gave negative count
> cpuid = 5
> KDB: stack backtrace:
> db_trace_self_wrapper() at 0xffffffff803639da = db_trace_self_wrapper+0x2a
> kdb_backtrace() at 0xffffffff8090e677 = kdb_backtrace+0x37
> panic() at 0xffffffff808d6bd8 = panic+0x1d8
> dev_rel() at 0xffffffff8088f929 = dev_rel+0xc9
> dev_pager_dealloc() at 0xffffffff80b2be30 = dev_pager_dealloc+0x30
> vm_object_terminate() at 0xffffffff80b44d7d = vm_object_terminate+0x13d
> vm_object_deallocate() at 0xffffffff80b465ca = vm_object_deallocate+0x17a
> cdev_pager_allocate() at 0xffffffff80b2c5e6 = cdev_pager_allocate+0x236
> dev_pager_alloc() at 0xffffffff80b2c697 = dev_pager_alloc+0x27
> vm_mmap_cdev() at 0xffffffff80b41b86 = vm_mmap_cdev+0x116
> vm_mmap() at 0xffffffff80b42221 = vm_mmap+0x671
> sys_mmap() at 0xffffffff80b428c6 = sys_mmap+0x1d6
> amd64_syscall() at 0xffffffff80bd1ed8 = amd64_syscall+0x318
> Xfast_syscall() at 0xffffffff80bbd387 = Xfast_syscall+0xf7
> --- syscall (477, FreeBSD ELF64, sys_mmap), rip = 0x8008ae25c, rsp =
> 0x7fffffffd968, rbp = 0 ---
> KDB: enter: panic
> [ thread pid 4109 tid 102348 ]
> Stopped at 0xffffffff8090e33b = kdb_enter+0x3b: movq $0,0xb433e2(%rip)
>
> This is a stable/9 system, but the code is the same in head.
>
> I triggered it by running dmidecode (perhaps a few times) on a machine with
> INVARIANTS enabled. You can probably trigger it before too long with
> something like this:
>
> ((i=0)); while [ $i -lt 50 ]; do dmidecode > /dev/null & ((i++)); done
>
> Run that loop until the system panics.
>
> The problem is that dev_pager_dealloc() passes in the VM object handle to
> the cdev_pg_dtor() method (in this case old_dev_pager_dtor()), which
> assumes that it is a struct cdev.
>
> But there is a case in cdev_pager_allocate() where if a race condition is
> hit, the object handle is set to the object itself. In that case, we wind
> up triggering an assertion in dev_rel(), because it is not operating on a
> struct cdev, but rather a VM object.
>
> I've attached my local commit and diffs. This does work, but you may
> have a better solution.
>
> Thanks,
>
> Ken
> --
> Kenneth Merry
> ken at FreeBSD.ORG
> Change 649772 by kenm at ken.spectrabsd9 on 2013/01/08 14:54:45
>
> Fix a bug in the device pager code that can trigger an assertion
> in devfs if a particular race condition is hit in the device pager
> code.
>
> This fixes BUG25532.
>
> This was a side effect of FreeBSD SVN change 227530 (our Perforce
> change 513754), which changed the device pager interface to call
> a new destructor routine for the cdev. That destructor routine,
> old_dev_pager_dtor(), takes a VM object handle.
>
> The object handle is cast to a struct cdev *, and passed into
> dev_rel().
>
> That works in most cases, except the case in cdev_pager_allocate()
> where there is a race condition between two threads allocating an
> object backed by the same device. The loser of the race
> deallocates its object at the end of the function.
>
> The problem is that before inserting the object into the
> dev_pager_object_list, the object's handle is changed from the
> struct cdev pointer to the object's own address. This is to avoid
> conflicts with the winner of the race, which already inserted an
> object in the list with a handle that is a pointer to the same cdev
> structure.
>
> The object is then passed to vm_object_deallocate(), and eventually
> makes its way down to old_dev_pager_dtor(). That function passes
> the handle pointer (which is actually a VM object, not a struct
> cdev as usual) into dev_rel(). dev_rel() decrements the reference
> count in the assumed struct cdev (which happens to be 0), and
> that triggers the assertion in dev_rel() that the reference count
> is greater than or equal to 0.
>
> The fix is to add a cdev pointer to the VM object, and use that
> pointer when calling the cdev_pg_dtor() routine. There may be
> other, better, ways to fix this, so I'll ask for a review by the
> appropriate FreeBSD VM maintainers.
>
> vm_object.h: Add a struct cdev pointer to the VM object
> structure.
>
> device_pager.c: In cdev_pager_allocate(), populate the new cdev
> pointer.
>
> In dev_pager_dealloc(), use the new cdev pointer
> when calling the object's cdev_pg_dtor() routine.
>
> TeamTrack: BUG25532
>
> Affected files ...
>
> ... //SpectraBSD/stable/sys/vm/device_pager.c#3 edit
> ... //SpectraBSD/stable/sys/vm/vm_object.h#3 edit
>
> Differences ...
>
> ==== //SpectraBSD/stable/sys/vm/device_pager.c#3 (text) ====
>
> @@ -156,6 +156,7 @@
> object1->pg_color = color;
> object1->handle = handle;
> object1->un_pager.devp.ops = ops;
> + object1->un_pager.devp.dev = handle;
> TAILQ_INIT(&object1->un_pager.devp.devp_pglist);
> mtx_lock(&dev_pager_mtx);
> object = vm_pager_object_lookup(&dev_pager_object_list, handle);
> @@ -233,7 +234,7 @@
> vm_page_t m;
>
> VM_OBJECT_UNLOCK(object);
> - object->un_pager.devp.ops->cdev_pg_dtor(object->handle);
> + object->un_pager.devp.ops->cdev_pg_dtor(object->un_pager.devp.dev);
>
> mtx_lock(&dev_pager_mtx);
> TAILQ_REMOVE(&dev_pager_object_list, object, pager_object_list);
>
> ==== //SpectraBSD/stable/sys/vm/vm_object.h#3 (text) ====
>
> @@ -123,6 +123,7 @@
> struct {
> TAILQ_HEAD(, vm_page) devp_pglist;
> struct cdev_pager_ops *ops;
> + struct cdev *dev;
> } devp;
>
> /*
Hm, yes, I remembered this. I had the following patch sitting in my
local repository since the autumn, I think.
Might be, your solution is more clean, but the main idea is the same,
to put the cdev pointer into the additional place.
diff --git a/sys/vm/device_pager.c b/sys/vm/device_pager.c
index ba70571..538c887 100644
--- a/sys/vm/device_pager.c
+++ b/sys/vm/device_pager.c
@@ -183,6 +183,7 @@ cdev_pager_allocate(void *handle, enum obj_type tp, struct cdev_pager_ops *ops,
mtx_unlock(&dev_pager_mtx);
if (object1 != NULL) {
object1->handle = object1;
+ object1->un_pager.devp.bhandle = handle;
mtx_lock(&dev_pager_mtx);
TAILQ_INSERT_TAIL(&dev_pager_object_list, object1,
pager_object_list);
@@ -233,13 +234,15 @@ dev_pager_dealloc(object)
vm_object_t object;
{
vm_page_t m;
+ void *h;
VM_OBJECT_UNLOCK(object);
- object->un_pager.devp.ops->cdev_pg_dtor(object->handle);
-
mtx_lock(&dev_pager_mtx);
TAILQ_REMOVE(&dev_pager_object_list, object, pager_object_list);
mtx_unlock(&dev_pager_mtx);
+ h = object->handle == object ? object->un_pager.devp.bhandle :
+ object->handle;
+ object->un_pager.devp.ops->cdev_pg_dtor(h);
VM_OBJECT_LOCK(object);
if (object->type == OBJT_DEVICE) {
diff --git a/sys/vm/vm_object.h b/sys/vm/vm_object.h
index b584239..6a39e18 100644
--- a/sys/vm/vm_object.h
+++ b/sys/vm/vm_object.h
@@ -136,6 +136,7 @@ struct vm_object {
struct {
TAILQ_HEAD(, vm_page) devp_pglist;
struct cdev_pager_ops *ops;
+ void *bhandle;
} devp;
/*
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 834 bytes
Desc: not available
URL: <http://lists.freebsd.org/pipermail/svn-src-head/attachments/20130109/9b4b693d/attachment.sig>
More information about the svn-src-head
mailing list