newfs locks entire machine for 20 seconds

Steven Hartland killing at multiplay.co.uk
Fri Feb 1 04:26:40 PST 2008


----- Original Message ----- 
From: "Ivan Voras" <ivoras at freebsd.org>
...
>> geom debugging I get:-
>> Feb  1 06:04:45 geomtest kernel: g_post_event_x(0xffffffff802394c0,
>> 0xffffff00010e6100, 2, 0)
>> Feb  1 06:04:45 geomtest kernel: ref 0xffffff00010e6100
>> Feb  1 06:04:45 geomtest kernel: g_post_event_x(0xffffffff802394c0,
>> 0xffffff00014e6700, 2, 0)
>> Feb  1 06:04:45 geomtest kernel: ref 0xffffff00014e6700
>> Feb  1 06:04:49 geomtest kernel: g_post_event_x(0xffffffff80239270,
>> 0xffffff00010e6100, 2, 0)
>> Feb  1 06:04:49 geomtest kernel: ref 0xffffff00010e6100
>> Feb  1 06:04:49 geomtest kernel: g_post_event_x(0xffffffff80239270,
>> 0xffffff00014e6700, 2, 0)
>> Feb  1 06:04:49 geomtest kernel: ref 0xffffff00014e6700
>> Feb  1 06:04:49 geomtest kernel: mbr_taste(MBR,da0s3)
>> Feb  1 06:04:49 geomtest kernel: g_mbrext_taste(MBREXT,da0s3)
>> Feb  1 06:04:49 geomtest kernel: g_slice_spoiled(0xffffff0001b27180/da0s3)
>> Feb  1 06:04:49 geomtest kernel: g_wither_geom(0xffffff0001a33c00(da0s3))
>> Feb  1 06:04:49 geomtest kernel: g_part_taste(PART,da0s3)
>> Feb  1 06:04:56 geomtest kernel: g_post_event_x(0xffffffff80235b10,
>> 0xffffff000144a9c0, 2, 262144)
>> Feb  1 06:05:00 geomtest kernel: g_wither_geom(0xffffff000158b300(da0s3))
>> Feb  1 06:05:00 geomtest kernel:
>> Feb  1 06:05:00 geomtest kernel: g_label_taste(LABEL, da0s3)
>> Feb  1 06:05:00 geomtest kernel:
>> Feb  1 06:05:16 geomtest kernel: GEOM_LABEL[1]: MSDOSFS: da0s3: FAT32
>> volume not valid.
>> Feb  1 06:05:16 geomtest kernel: g_detach(0xffffff0001b23980)
>> Feb  1 06:05:16 geomtest kernel: g_destroy_consumer(0xffffff0001b23980)
>> Feb  1 06:05:16 geomtest kernel:
>
>> So after all that I can see why the sysctl call is taking
>> so long to complete but the burning question is why does
>
> Can you explain - I don't see it :) Do you mean to say there's a
> contention for sysctl lock between geom_confxml and g_waitfor_event or
> that geom_label tasting has something to do with it?

Nope, what I believe to be happening is this: sysctl_kern_geom_confxml,
which is called from userland_sysctl while SYSCTL_LOCK is held, returns
EAGAIN for a considerable period while newfs runs.
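
To make the suspected pattern concrete, here is a rough user-space sketch in Python. It is not kernel code. The names sysctl_lock, handler and slow_caller are made up for illustration, and the retry loop that keeps the lock held while the handler reports EAGAIN is an assumption about the behaviour described above, not something confirmed from the source. It only shows how a second, unrelated caller ends up waiting for the whole busy window.

```python
import errno
import threading
import time


def run_simulation(busy_seconds=0.3, retry_interval=0.01):
    """Return how long a second caller waits for the shared lock (seconds)."""
    sysctl_lock = threading.Lock()  # stand-in for SYSCTL_LOCK (hypothetical)
    holder_has_lock = threading.Event()
    # Stand-in for the period during which newfs is running.
    busy_until = time.monotonic() + busy_seconds

    def handler():
        # Hypothetical handler: reports EAGAIN while the busy window is open.
        return errno.EAGAIN if time.monotonic() < busy_until else 0

    def slow_caller():
        # Assumed pattern: retry on EAGAIN without ever dropping the lock.
        with sysctl_lock:
            holder_has_lock.set()
            while handler() == errno.EAGAIN:
                time.sleep(retry_interval)

    worker = threading.Thread(target=slow_caller)
    worker.start()
    holder_has_lock.wait()  # make sure the slow caller owns the lock first

    start = time.monotonic()
    with sysctl_lock:  # an unrelated request needing the same lock
        waited = time.monotonic() - start
    worker.join()
    return waited


if __name__ == "__main__":
    print("second caller waited %.3f s" % run_simulation())
```

If the real code path behaves like slow_caller, anything else that needs the same lock would stall for roughly as long as the handler keeps returning EAGAIN.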

>> the entire system lock because of this? What else is
>> waiting on the sysctl lock which is so critical?
> 
> What I do know is that sysctl is GIANT-locked, which is also used by
> some parts of device handling infrastructure (dead_cdevsw), the USB
> stack, and can creep itself in the timer via swi_sched in
> subr_taskqueue.c:303. I cannot say for sure that's what happening here,
> but they are possibilities.
> 
> If you can provoke this reliably, I think there is a (old!) patch for
> removing sysctl from under the Giant lock that you could try.

Yes, this is 100% reproducible, and I'm beginning to suspect this issue may
be the cause of the regular pauses on our hosting machines, so it might have
further-reaching effects than just this one example.

If SYSCTL_LOCK is indeed GIANT locked then it might explain why its effect
is so pronounced.

I tried ADAPTIVE_SX as a test to see if that helped, as the sysctl lock is
an sx lock, but it didn't help.

    Regards
    Steve

More information about the freebsd-performance mailing list