svn commit: r364946 - head/sys/kern
Warner Losh
imp at bsdimp.com
Sat Aug 29 11:34:45 UTC 2020
On Sat, Aug 29, 2020 at 5:25 AM Michal Meloun <meloun.michal at gmail.com>
wrote:
>
>
> On 29.08.2020 13:02, Warner Losh wrote:
> > On Sat, Aug 29, 2020 at 4:38 AM Michal Meloun <meloun.michal at gmail.com>
> > wrote:
> >
> >>
> >>
> >> On 29.08.2020 12:04, Warner Losh wrote:
> >>> On Sat, Aug 29, 2020 at 1:09 AM Mateusz Guzik <mjguzik at gmail.com>
> wrote:
> >>>
> >>>> This crashes on boot for me:
> >>>>
> >>>
> >>> I wasn't able to get it to crash on boot for me, but I was able to
> >> recreate
> >>> it.
> >> It crashed on ofw-based systems where some enumerated devices do not have a
> >> suitable driver, see:
> >> ---------------------------------------
> >> sysctl_devices: nameunit: root0, descs: System root bus, driver: root
> >> sysctl_devices: nameunit: nexus0, descs: (null), driver: nexus
> >> sysctl_devices: nameunit: ofwbus0, descs: Open Firmware Device Tree,
> >> driver: ofwbus
> >> sysctl_devices: nameunit: pcib0, descs: Nvidia Integrated PCI/PCI-E
> >> Controller, driver: pcib
> >> sysctl_devices: nameunit: simplebus0, descs: Flattened device tree
> >> simple bus, driver: simplebus
> >> sysctl_devices: nameunit: gic0, descs: ARM Generic Interrupt Controller,
> >> driver: gic
> >> sysctl_devices: nameunit: (null), descs: (null), driver:
> >> sysctl_devices: nameunit: lic0, descs: (null), driver: lic
> >> sysctl_devices: nameunit: (null), descs: (null), driver:
> >> sysctl_devices: nameunit: car0, descs: Tegra Clock Driver, driver: car
> >> ....
> >> ----------------------------------------------------------------------
> >>> Fixed in r364949.
> >> Confirmed.
> >>> I think it didn't crash on boot for me because
> >>> kldxref failed due to the segment thing so devmatch didn't run which
> >> would
> >>> have triggered this bug. devinfo did trigger a very similar crash, and
> >>> r364949 fixes that crash. Even a new kldxref failed due to the too many
> >>> segments thing, so I can't confirm that's what you hit, but I'm pretty
> >> sure
> >>> it is...
> >>>
> >> But there is another issue in device_sysctl_handler() (not analyzed
> yet):
> >> root at tegra210:~ # sysctl dev.cpu.
> >> dev.cpu.3.temperature: 50.5C
> >> dev.cpu.3panic: sbuf_clear makes no sense on sbuf 0xffff00006f21a528
> >> with drain
> >> cpuid = 2
> >> time = 1598696937
> >> KDB: stack backtrace:
> >> db_trace_self() at db_fetch_ksymtab+0x164
> >> pc = 0xffff0000006787f4 lr = 0xffff000000153400
> >> sp = 0xffff00006f21a1b0 fp = 0xffff00006f21a3b0
> >>
> >> db_fetch_ksymtab() at vpanic+0x198
> >> pc = 0xffff000000153400 lr = 0xffff00000036b274
> >> sp = 0xffff00006f21a3c0 fp = 0xffff00006f21a420
> >>
> >> vpanic() at panic+0x44
> >> pc = 0xffff00000036b274 lr = 0xffff00000036b018
> >> sp = 0xffff00006f21a430 fp = 0xffff00006f21a4e0
> >>
> >> panic() at sbuf_clear+0xa0
> >> pc = 0xffff00000036b018 lr = 0xffff0000003c17c8
> >> sp = 0xffff00006f21a4f0 fp = 0xffff00006f21a4f0
> >>
> >> sbuf_clear() at sbuf_cpy+0x58
> >> pc = 0xffff0000003c17c8 lr = 0xffff0000003c1ff0
> >> sp = 0xffff00006f21a500 fp = 0xffff00006f21a500
> >>
> >> sbuf_cpy() at _gone_in_dev+0x560
> >> pc = 0xffff0000003c1ff0 lr = 0xffff0000003a9078
> >> sp = 0xffff00006f21a510 fp = 0xffff00006f21a570
> >>
> >> _gone_in_dev() at sbuf_new_for_sysctl+0x170
> >> pc = 0xffff0000003a9078 lr = 0xffff00000037c1a8
> >> sp = 0xffff00006f21a580 fp = 0xffff00006f21a5a0
> >>
> >> sbuf_new_for_sysctl() at kernel_sysctl+0x36c
> >> pc = 0xffff00000037c1a8 lr = 0xffff00000037b63c
> >> sp = 0xffff00006f21a5b0 fp = 0xffff00006f21a630
> >>
> >
> > This traceback is all kinds of crazy. sbuf_new_for_sysctl doesn't call
> > _gone_in_dev(), which doesn't do sbuf stuff at all. And neither does it
> > call sbuf_cpy(). Though I get a crash that looks like:
> > Tracing pid 66442 tid 101464 td 0xfffffe02f47b7c00
> > kdb_enter() at kdb_enter+0x37/frame 0xfffffe02f4ae3740
> > vpanic() at vpanic+0x19e/frame 0xfffffe02f4ae3790
> > panic() at panic+0x43/frame 0xfffffe02f4ae37f0
> > sbuf_clear() at sbuf_clear+0xac/frame 0xfffffe02f4ae3800
> > sbuf_cpy() at sbuf_cpy+0x5a/frame 0xfffffe02f4ae3820
> > device_sysctl_handler() at device_sysctl_handler+0x133/frame
> > 0xfffffe02f4ae38a0
> > sysctl_root_handler_locked() at sysctl_root_handler_locked+0x9c/frame
> > 0xfffffe02f4ae38f0
> > sysctl_root() at sysctl_root+0x20a/frame 0xfffffe02f4ae3970
> > userland_sysctl() at userland_sysctl+0x17d/frame 0xfffffe02f4ae3a20
> > sys___sysctl() at sys___sysctl+0x5f/frame 0xfffffe02f4ae3ad0
> > amd64_syscall() at amd64_syscall+0x140/frame 0xfffffe02f4ae3bf0
> > fast_syscall_common() at fast_syscall_common+0xf8/frame
> 0xfffffe02f4ae3bf0
> > --- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x80042d50a, rsp =
> > 0x7fffffffd458, rbp = 0x7fffffffd490 ---
> >
> > on a sysctl -a which I think makes more sense... I'll see if I can track
> > it down... I think it's because sbuf_cpy does an unconditional clear,
> which
> > triggers this assert, which is likely bogus for this case. sbuf_cat
> doesn't
> > seem to have this issue... I'll confirm and commit.
> >
> > Warner
>
> Yeah, sorry. Local symbols are not available for a netbooted kernel :(.
> And I can confirm that the problem is caused by using sbuf_cpy() on an sbuf
> allocated by sbuf_new_for_sysctl() (thus with a drain handler) in
> device_sysctl_handler(). But simply replacing sbuf_cpy() with sbuf_cat()
> gives me another panic:
> panic: Assertion (sb->s_flags & SBUF_INCLUDENUL) == 0 failed at
> /usr2/Meloun/git/pmap/sys/kern/subr_bus.c:4936
> (still in response to sysctl dev.cpu)
>
OK. My bouncer system here has something wrong with /, but I changed the
sbuf_cpy to sbuf_cat. Can you confirm that it works for you?
Warner
>
>
> >
> >> kernel_sysctl() at userland_sysctl+0xf4
> >> pc = 0xffff00000037b63c lr = 0xffff00000037bc5c
> >> sp = 0xffff00006f21a640 fp = 0xffff00006f21a6d0
> >>
> >> userland_sysctl() at sys___sysctl+0x68
> >> pc = 0xffff00000037bc5c lr = 0xffff00000037bb28
> >> sp = 0xffff00006f21a6e0 fp = 0xffff00006f21a790
> >>
> >> sys___sysctl() at do_el0_sync+0x4e0
> >> pc = 0xffff00000037bb28 lr = 0xffff000000697918
> >> sp = 0xffff00006f21a7a0 fp = 0xffff00006f21a830
> >>
> >> do_el0_sync() at handle_el0_sync+0x90
> >> pc = 0xffff000000697918 lr = 0xffff00000067aa24
> >> sp = 0xffff00006f21a840 fp = 0xffff00006f21a980
> >>
> >> handle_el0_sync() at 0x4047764c
> >> pc = 0xffff00000067aa24 lr = 0x000000004047764c
> >> sp = 0xffff00006f21a990 fp = 0x0000ffffffffc250
> >>
> >> KDB: enter: panic
> >> [ thread pid 1263 tid 100092 ]
> >> Stopped at 0x40477fb4: undefined 54000042
> >>
> >>> Warner
> >>>
> >>
> >>>
> >>>> Fatal trap 12: page fault while in kernel mode
> >>>> cpuid = 0; apic id = 00
> >>>> fault virtual address = 0x0
> >>>> fault code = supervisor read data, page not present
> >>>> instruction pointer = 0x20:0xffffffff805b0a7f
> >>>> stack pointer = 0x28:0xfffffe002366a7f0
> >>>> frame pointer = 0x28:0xfffffe002366a7f0
> >>>> code segment = base 0x0, limit 0xfffff, type 0x1b
> >>>> = DPL 0, pres 1, long 1, def32 0, gran 1
> >>>> processor eflags = interrupt enabled, resume, IOPL = 0
> >>>> current process = 89 (devmatch)
> >>>> trap number = 12
> >>>> panic: page fault
> >>>> cpuid = 0
> >>>> time = 1598692135
> >>>> KDB: stack backtrace:
> >>>> db_trace_self_wrapper() at db_trace_self_wrapper+0x2b/frame
> >>>> 0xfffffe002366a4a0
> >>>> vpanic() at vpanic+0x182/frame 0xfffffe002366a4f0
> >>>> panic() at panic+0x43/frame 0xfffffe002366a550
> >>>> trap_fatal() at trap_fatal+0x387/frame 0xfffffe002366a5b0
> >>>> trap_pfault() at trap_pfault+0x4f/frame 0xfffffe002366a610
> >>>> trap() at trap+0x27d/frame 0xfffffe002366a720
> >>>> calltrap() at calltrap+0x8/frame 0xfffffe002366a720
> >>>> --- trap 0xc, rip = 0xffffffff805b0a7f, rsp = 0xfffffe002366a7f0, rbp
> >>>> = 0xfffffe002366a7f0 ---
> >>>> strlen() at strlen+0x1f/frame 0xfffffe002366a7f0
> >>>> sbuf_cat() at sbuf_cat+0x15/frame 0xfffffe002366a810
> >>>> sysctl_devices() at sysctl_devices+0x104/frame 0xfffffe002366a8a0
> >>>> sysctl_root_handler_locked() at sysctl_root_handler_locked+0x91/frame
> >>>> 0xfffffe002366a8f0
> >>>> sysctl_root() at sysctl_root+0x249/frame 0xfffffe002366a970
> >>>> userland_sysctl() at userland_sysctl+0x170/frame 0xfffffe002366aa20
> >>>> sys___sysctl() at sys___sysctl+0x5f/frame 0xfffffe002366aad0
> >>>> amd64_syscall() at amd64_syscall+0x10c/frame 0xfffffe002366abf0
> >>>> fast_syscall_common() at fast_syscall_common+0xf8/frame
> >> 0xfffffe002366abf0
> >>>> --- syscall (202, FreeBSD ELF64, sys___sysctl), rip = 0x80041c0ea, rsp
> >>>> = 0x7fffffffda78, rbp = 0x7fffffffdab0 ---
> >>>> KDB: enter: panic
> >>>> [ thread pid 89 tid 100067 ]
> >>>> Stopped at kdb_enter+0x37: movq $0,0x7e2616(%rip)
> >>>>
> >>>>
> >>>> On 8/29/20, Warner Losh <imp at freebsd.org> wrote:
> >>>>> Author: imp
> >>>>> Date: Sat Aug 29 04:30:12 2020
> >>>>> New Revision: 364946
> >>>>> URL: https://svnweb.freebsd.org/changeset/base/364946
> >>>>>
> >>>>> Log:
> >>>>> Move to using sbuf for some sysctl in newbus
> >>>>>
> >>>>> Convert two different sysctl to using sbuf. First, for all the
> >> default
> >>>>> sysctls we implement for each device driver that's attached. This
> is
> >> a
> >>>>> pure sbuf conversion.
> >>>>>
> >>>>> Second, convert sysctl_devices to fill its buffer with sbuf rather
> >>>>> than a hand-rolled crappy thing I wrote years ago.
> >>>>>
> >>>>> Reviewed by: cem, markj
> >>>>> Differential Revision: https://reviews.freebsd.org/D26206
> >>>>>
> >>>>> Modified:
> >>>>> head/sys/kern/subr_bus.c
> >>>>>
> >>>>> Modified: head/sys/kern/subr_bus.c
> >>>>>
> >>>>
> >>
> ==============================================================================
> >>>>> --- head/sys/kern/subr_bus.c Sat Aug 29 04:30:06 2020
> (r364945)
> >>>>> +++ head/sys/kern/subr_bus.c Sat Aug 29 04:30:12 2020
> (r364946)
> >>>>> @@ -260,36 +260,33 @@ enum {
> >>>>> static int
> >>>>> device_sysctl_handler(SYSCTL_HANDLER_ARGS)
> >>>>> {
> >>>>> + struct sbuf sb;
> >>>>> device_t dev = (device_t)arg1;
> >>>>> - const char *value;
> >>>>> - char *buf;
> >>>>> int error;
> >>>>>
> >>>>> - buf = NULL;
> >>>>> + sbuf_new_for_sysctl(&sb, NULL, 1024, req);
> >>>>> switch (arg2) {
> >>>>> case DEVICE_SYSCTL_DESC:
> >>>>> - value = dev->desc ? dev->desc : "";
> >>>>> + sbuf_cpy(&sb, dev->desc ? dev->desc : "");
> >>>>> break;
> >>>>> case DEVICE_SYSCTL_DRIVER:
> >>>>> - value = dev->driver ? dev->driver->name : "";
> >>>>> + sbuf_cpy(&sb, dev->driver ? dev->driver->name : "");
> >>>>> break;
> >>>>> case DEVICE_SYSCTL_LOCATION:
> >>>>> - value = buf = malloc(1024, M_BUS, M_WAITOK | M_ZERO);
> >>>>> - bus_child_location_str(dev, buf, 1024);
> >>>>> + bus_child_location_sb(dev, &sb);
> >>>>> break;
> >>>>> case DEVICE_SYSCTL_PNPINFO:
> >>>>> - value = buf = malloc(1024, M_BUS, M_WAITOK | M_ZERO);
> >>>>> - bus_child_pnpinfo_str(dev, buf, 1024);
> >>>>> + bus_child_pnpinfo_sb(dev, &sb);
> >>>>> break;
> >>>>> case DEVICE_SYSCTL_PARENT:
> >>>>> - value = dev->parent ? dev->parent->nameunit : "";
> >>>>> + sbuf_cpy(&sb, dev->parent ? dev->parent->nameunit :
> "");
> >>>>> break;
> >>>>> default:
> >>>>> + sbuf_delete(&sb);
> >>>>> return (EINVAL);
> >>>>> }
> >>>>> - error = SYSCTL_OUT_STR(req, value);
> >>>>> - if (buf != NULL)
> >>>>> - free(buf, M_BUS);
> >>>>> + error = sbuf_finish(&sb);
> >>>>> + sbuf_delete(&sb);
> >>>>> return (error);
> >>>>> }
> >>>>>
> >>>>> @@ -5464,13 +5461,13 @@ SYSCTL_PROC(_hw_bus, OID_AUTO, info,
> >>>> CTLTYPE_STRUCT
> >>>>> |
> >>>>> static int
> >>>>> sysctl_devices(SYSCTL_HANDLER_ARGS)
> >>>>> {
> >>>>> + struct sbuf sb;
> >>>>> int *name = (int *)arg1;
> >>>>> u_int namelen = arg2;
> >>>>> int index;
> >>>>> device_t dev;
> >>>>> struct u_device *udev;
> >>>>> int error;
> >>>>> - char *walker, *ep;
> >>>>>
> >>>>> if (namelen != 2)
> >>>>> return (EINVAL);
> >>>>> @@ -5501,34 +5498,21 @@ sysctl_devices(SYSCTL_HANDLER_ARGS)
> >>>>> udev->dv_devflags = dev->devflags;
> >>>>> udev->dv_flags = dev->flags;
> >>>>> udev->dv_state = dev->state;
> >>>>> - walker = udev->dv_fields;
> >>>>> - ep = walker + sizeof(udev->dv_fields);
> >>>>> -#define CP(src) \
> >>>>> - if ((src) == NULL) \
> >>>>> - *walker++ = '\0'; \
> >>>>> - else { \
> >>>>> - strlcpy(walker, (src), ep - walker); \
> >>>>> - walker += strlen(walker) + 1; \
> >>>>> - } \
> >>>>> - if (walker >= ep) \
> >>>>> - break;
> >>>>> -
> >>>>> - do {
> >>>>> - CP(dev->nameunit);
> >>>>> - CP(dev->desc);
> >>>>> - CP(dev->driver != NULL ? dev->driver->name : NULL);
> >>>>> - bus_child_pnpinfo_str(dev, walker, ep - walker);
> >>>>> - walker += strlen(walker) + 1;
> >>>>> - if (walker >= ep)
> >>>>> - break;
> >>>>> - bus_child_location_str(dev, walker, ep - walker);
> >>>>> - walker += strlen(walker) + 1;
> >>>>> - if (walker >= ep)
> >>>>> - break;
> >>>>> - *walker++ = '\0';
> >>>>> - } while (0);
> >>>>> -#undef CP
> >>>>> - error = SYSCTL_OUT(req, udev, sizeof(*udev));
> >>>>> + sbuf_new(&sb, udev->dv_fields, sizeof(udev->dv_fields),
> >>>> SBUF_FIXEDLEN);
> >>>>> + sbuf_cat(&sb, dev->nameunit);
> >>>>> + sbuf_putc(&sb, '\0');
> >>>>> + sbuf_cat(&sb, dev->desc);
> >>>>> + sbuf_putc(&sb, '\0');
> >>>>> + sbuf_cat(&sb, dev->driver != NULL ? dev->driver->name : '\0');
> >>>>> + sbuf_putc(&sb, '\0');
> >>>>> + bus_child_pnpinfo_sb(dev, &sb);
> >>>>> + sbuf_putc(&sb, '\0');
> >>>>> + bus_child_location_sb(dev, &sb);
> >>>>> + sbuf_putc(&sb, '\0');
> >>>>> + error = sbuf_finish(&sb);
> >>>>> + if (error == 0)
> >>>>> + error = SYSCTL_OUT(req, udev, sizeof(*udev));
> >>>>> + sbuf_delete(&sb);
> >>>>> free(udev, M_BUS);
> >>>>> return (error);
> >>>>> }
> >>>>>
> >>>>
> >>>>
> >>>> --
> >>>> Mateusz Guzik <mjguzik gmail.com>
> >>>>
> >>>
> >>
> >
>
More information about the svn-src-head
mailing list