atomic changes break drm-next-kmod?
Niclas Zeising
zeising+freebsd at daemonic.se
Fri Jul 6 07:52:29 UTC 2018
On 07/06/18 00:02, Warner Losh wrote:
>
>
> On Thu, Jul 5, 2018 at 1:44 PM, John Baldwin <jhb at freebsd.org> wrote:
>
> On 7/5/18 12:36 PM, Konstantin Belousov wrote:
> > On Thu, Jul 05, 2018 at 09:12:24PM +0200, Hans Petter Selasky wrote:
> >> On 07/05/18 20:59, Hans Petter Selasky wrote:
> >>> On 07/05/18 19:48, Pete Wright wrote:
> >>>>
> >>>>
> >>>> On 07/05/2018 10:10, John Baldwin wrote:
> >>>>> On 7/3/18 5:10 PM, Pete Wright wrote:
> >>>>>>
> >>>>>> On 07/03/2018 15:56, John Baldwin wrote:
> >>>>>>> On 7/3/18 3:34 PM, Pete Wright wrote:
> >>>>>>>> On 07/03/2018 15:29, John Baldwin wrote:
> >>>>>>>>> That seems like kgdb is looking at the wrong CPU. Can you use
> >>>>>>>>> 'info threads' and look for threads not stopped in 'sched_switch'
> >>>>>>>>> and get their backtraces? You could also just do 'thread apply
> >>>>>>>>> all bt' and put that file at a URL if that is easiest.
> >>>>>>>>>
> >>>>>>>> sure thing John - here's a gist of "thread apply all bt"
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> https://gist.github.com/gem-pete/d8d7ab220dc8781f0827f965f09d43ed
> >>>>>>> That doesn't look right at all. Are you sure the kernel matches the
> >>>>>>> vmcore? Also, which kgdb version are you using?
> >>>>>>>
> >>>>>> yea i agree that doesn't look right at all. here is my setup:
> >>>>>>
> >>>>>> $ which kgdb
> >>>>>> /usr/bin/kgdb
> >>>>>> $ kgdb
> >>>>>> GNU gdb 6.1.1 [FreeBSD]
> >>>>>> $ ls -lh /var/crash/vmcore.1
> >>>>>> -rw------- 1 root wheel 1.6G Jul 3 15:03 /var/crash/vmcore.1
> >>>>>> $ ls -l /usr/lib/debug/boot/kernel/kernel.debug
> >>>>>> -r-xr-xr-x 1 root wheel 87840496 Jul 3 13:54 /usr/lib/debug/boot/kernel/kernel.debug
> >>>>>>
> >>>>>> and i invoke kgdb like so:
> >>>>>> $ sudo kgdb /usr/lib/debug/boot/kernel/kernel.debug /var/crash/vmcore.1
> >>>>>>
> >>>>>> here's a gist of my full gdb session:
> >>>>>> http://termbin.com/krsn
> >>>>>>
> >>>>>> dunno - maybe i have a bad core dump? regardless, more than happy to
> >>>>>> help so let me know if i should try anything else or patches etc..
> >>>>> Can you try installing gdb from ports and using /usr/local/bin/kgdb?
> >>>>>
> >>>>
> >>>> that seems to have done the trick, at least the output looks more
> >>>> encouraging.
> >>>>
> >>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> >>>> KDB: enter: panic
> >>>>
> >>>> __curthread () at ./machine/pcpu.h:231
> >>>> 231 __asm("movq %%gs:%1,%0" : "=r" (td)
> >>>>
> >>>>
> >>>> here's my full kgdb session:
> >>>> http://termbin.com/qa4f
> >>>>
> >>>> i don't see any threads not in "sched_switch" though :(
> >>>
> >>> Hi,
> >>>
> >>> The problem may be that the patch to enable atomic inlining of all
> >>> macros forgot to set the SMP keyword which means SMP is not defined at
> >>> all for KLD's so all non-kernel atomic usage is with MPLOCKED empty!
> > Problem is that out-of-tree modules build does not have opt*.h files
> > from the kernel. UP config is a valid one, flipping some option's
> > default value does not solve the problem.
>
> Yes, but using the lock prefix in a generic module is ok (it will still
> work, just not quite as fast) whereas the lack of lock is fatal on
> SMP. I would amend Hans' patch slightly to honor the opt_* setting
> for KLD_TIED (but that is only true if KLD_TIED means "built as part of
> a kernel build, so has valid opt_foo.h headers" and not
> 'a standalone module where someone put MODULES_TIED=1 on the command line
> to make').
>
>
> I agree with this default. It's sensible to default to (a) the most
> popular thing and (b) thing that always works, especially when (a) and
> (b) are identical.
>
> Don't make me start the "Do we really need an SMP option, why not make
> it always on" thread :) The number of relevant uniprocessor x86 boxes
> that benefit from omitting SMP is so small as to be irrelevant, IMHO. A
> MP kernel runs just fine on them...
>
> Warner
Where are we on this?
It is important to get this fixed. It has already been 4 days, which means
4 days of all modern FreeBSD desktop systems being broken, and possibly
other systems with kernel modules from ports as well.
Another question: how hard would it be to expose how the kernel was
built to modules built from ports, so that they can detect settings
such as SMP that might affect the module build?
Regards
--
Niclas
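
For readers following the thread, the default John Baldwin describes
above (always emit the lock prefix unless the module is tied to a kernel
build with valid opt_*.h headers) can be sketched roughly as below. This
is an illustrative sketch only, not the committed fix and not the actual
contents of atomic.h. SMP, KLD_TIED and MPLOCKED are names taken from the
thread; KLD_MODULE is assumed here to mark a module build, and the two
helper functions are hypothetical. The prefix is kept as a string so the
selection logic can be compiled and inspected on any machine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Assumption for this sketch: pretend we are an out-of-tree module build. */
#define KLD_MODULE 1

#if defined(KLD_MODULE) && !defined(KLD_TIED)
/* Standalone module: no opt_*.h, SMP state unknown, so always lock. */
#define MPLOCKED "lock ; "
#elif defined(SMP)
/* Kernel or tied module built with SMP: lock is required. */
#define MPLOCKED "lock ; "
#else
/* Kernel or tied module built for uniprocessor: prefix can be dropped. */
#define MPLOCKED ""
#endif

/* Hypothetical helper: the asm template an atomic add would expand to. */
static const char *
atomic_add_template(void)
{
	return (MPLOCKED "addl %1,%0");
}

/* Hypothetical helper: reports whether the lock prefix was selected. */
static int
uses_lock_prefix(void)
{
	const char *tmpl = atomic_add_template();

	assert(tmpl != NULL);
	return (strncmp(tmpl, "lock", 4) == 0);
}
```

With only KLD_MODULE defined, as a port build without kernel headers
would see it, uses_lock_prefix() reports that the prefix is present.
Defining KLD_TIED without SMP selects the empty prefix instead, which is
only safe when the opt_*.h headers really describe the running kernel.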