atomic changes break drm-next-kmod?
Hans Petter Selasky
hps at selasky.org
Fri Jul 6 10:16:18 UTC 2018
On 07/06/18 11:14, Johannes Lundberg wrote:
> On Fri, Jul 6, 2018 at 9:49 AM Konstantin Belousov <kostikbel at gmail.com>
> wrote:
>
>> On Fri, Jul 06, 2018 at 09:52:24AM +0200, Niclas Zeising wrote:
>>> On 07/06/18 00:02, Warner Losh wrote:
>>>>
>>>>
>>>> On Thu, Jul 5, 2018 at 1:44 PM, John Baldwin <jhb at freebsd.org> wrote:
>>>>
>>>> On 7/5/18 12:36 PM, Konstantin Belousov wrote:
>>>> > On Thu, Jul 05, 2018 at 09:12:24PM +0200, Hans Petter Selasky wrote:
>>>> >> On 07/05/18 20:59, Hans Petter Selasky wrote:
>>>> >>> On 07/05/18 19:48, Pete Wright wrote:
>>>> >>>>
>>>> >>>>
>>>> >>>> On 07/05/2018 10:10, John Baldwin wrote:
>>>> >>>>> On 7/3/18 5:10 PM, Pete Wright wrote:
>>>> >>>>>>
>>>> >>>>>> On 07/03/2018 15:56, John Baldwin wrote:
>>>> >>>>>>> On 7/3/18 3:34 PM, Pete Wright wrote:
>>>> >>>>>>>> On 07/03/2018 15:29, John Baldwin wrote:
>>>> >>>>>>>>> That seems like kgdb is looking at the wrong CPU. Can you use
>>>> >>>>>>>>> 'info threads' and look for threads not stopped in 'sched_switch'
>>>> >>>>>>>>> and get their backtraces? You could also just do 'thread apply
>>>> >>>>>>>>> all bt' and put that file at a URL if that is easiest.
>>>> >>>>>>>>>
>>>> >>>>>>>> sure thing John - here's a gist of "thread apply all bt"
>>>> >>>>>>>>
>>>> >>>>>>>>
>>>> >>>>>>>> https://gist.github.com/gem-pete/d8d7ab220dc8781f0827f965f09d43ed
>>>> >>>>>>> That doesn't look right at all. Are you sure the kernel matches the
>>>> >>>>>>> vmcore? Also, which kgdb version are you using?
>>>> >>>>>>>
>>>> >>>>>> yea i agree that doesn't look right at all. here is my setup:
>>>> >>>>>>
>>>> >>>>>> $ which kgdb
>>>> >>>>>> /usr/bin/kgdb
>>>> >>>>>> $ kgdb
>>>> >>>>>> GNU gdb 6.1.1 [FreeBSD]
>>>> >>>>>> $ ls -lh /var/crash/vmcore.1
>>>> >>>>>> -rw------- 1 root wheel 1.6G Jul 3 15:03 /var/crash/vmcore.1
>>>> >>>>>> $ ls -l /usr/lib/debug/boot/kernel/kernel.debug
>>>> >>>>>> -r-xr-xr-x 1 root wheel 87840496 Jul 3 13:54 /usr/lib/debug/boot/kernel/kernel.debug
>>>> >>>>>>
>>>> >>>>>> and i invoke kgdb like so:
>>>> >>>>>> $ sudo kgdb /usr/lib/debug/boot/kernel/kernel.debug /var/crash/vmcore.1
>>>> >>>>>>
>>>> >>>>>> here's a gist of my full gdb session:
>>>> >>>>>> http://termbin.com/krsn
>>>> >>>>>>
>>>> >>>>>> dunno - maybe i have a bad core dump? regardless, more than happy to
>>>> >>>>>> help so let me know if i should try anything else or patches etc..
>>>> >>>>> Can you try installing gdb from ports and using /usr/local/bin/kgdb?
>>>> >>>>>
>>>> >>>>
>>>> >>>> that seems to have done the trick, at least the output looks more
>>>> >>>> encouraging.
>>>> >>>>
>>>> >>>> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
>>>> >>>> KDB: enter: panic
>>>> >>>>
>>>> >>>> __curthread () at ./machine/pcpu.h:231
>>>> >>>> 231 __asm("movq %%gs:%1,%0" : "=r" (td)
>>>> >>>>
>>>> >>>>
>>>> >>>> here's my full kgdb session:
>>>> >>>> http://termbin.com/qa4f
>>>> >>>>
>>>> >>>> i don't see any threads not in "sched_switch" though :(
>>>> >>>
>>>> >>> Hi,
>>>> >>>
>>>> >>> The problem may be that the patch to enable atomic inlining of all
>>>> >>> macros forgot to set the SMP keyword which means SMP is not defined at
>>>> >>> all for KLD's so all non-kernel atomic usage is with MPLOCKED empty!
>>>> > Problem is that out-of-tree modules build does not have opt*.h files
>>>> > from the kernel. UP config is a valid one, flipping some option's
>>>> > default value does not solve the problem.
>>>>
>>>> Yes, but using the lock prefix in a generic module is ok (it will still
>>>> work, just not quite as fast) whereas the lack of lock is fatal on
>>>> SMP. I would amend Hans' patch slightly to honor the opt_* setting
>>>> for KLD_TIED (but that is only true if KLD_TIED means "built as part of
>>>> a kernel build, so has valid opt_foo.h headers" and not
>>>> 'a standalone module where someone put MODULES_TIED=1 on the command line
>>>> to make').
>>>>
>>>>
>>>> I agree with this default. It's sensible to default to (a) the most
>>>> popular thing and (b) thing that always works, especially when (a) and
>>>> (b) are identical.
>>>>
>>>> Don't make me start the "Do we really need an SMP option, why not make
>>>> it always on" thread :) The number of relevant uniprocessor x86 boxes
>>>> that benefit from omitting SMP is so small as to be irrelevant, IMHO. A
>>>> MP kernel runs just fine on them...
>>>>
>>>> Warner
>>>
>>> Where are we on this?
>>> It is important to get it fixed, it's already been 4 days, which means 4
>>> days of all modern FreeBSD desktop systems being broken, and possibly
>>> other systems with kernel modules from ports as well.
>>>
>>>
>>> Another question, how hard would it be to expose how the kernel was
>>> built to modules built from ports, so that they can figure out stuff
>>> like SMP and others, that might affect the module build?
>> Point the KERNBUILDDIR variable to the directory of the kernel build.
>> This is the directory where *.o and opt*.h are located. Then everything
>> would just work.
>>
>
> Is the solution that we require everyone to build a kernel before they can
> build the standalone modules or am I missing something here?
>
Hi,
Here is a temporary fix:
https://svnweb.freebsd.org/changeset/base/336025
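To make the idea discussed above concrete, here is a minimal sketch. It is
not the committed diff. It assumes the lock prefix is selected at compile
time from SMP and KLD_MODULE; both are real option names, but the exact
condition used in the changeset is an assumption here, and the sketch_*
names are hypothetical. The non-x86 branch only exists so the sketch also
builds on other machines.

```c
/*
 * Sketch only: illustrates choosing the lock prefix at compile time.
 * The condition below is an assumption, not a copy of atomic.h.
 */
#if defined(SMP) || defined(KLD_MODULE)
#define	MPLOCKED	"lock ; "	/* safe on SMP, slightly slower on UP */
#else
#define	MPLOCKED			/* UP kernel: no lock prefix needed */
#endif

/* Hypothetical helper: add v to *p, locked only if MPLOCKED is set. */
static inline void
sketch_atomic_add_int(volatile unsigned int *p, unsigned int v)
{
#if defined(__x86_64__) || defined(__i386__)
	__asm__ __volatile__(MPLOCKED "addl %1,%0"
	    : "+m" (*p)
	    : "ir" (v)
	    : "cc");
#else
	/* Fallback so the sketch compiles elsewhere (compiler builtin). */
	__atomic_fetch_add(p, v, __ATOMIC_SEQ_CST);
#endif
}

/* Hypothetical helper: 1 if the lock prefix was compiled in, else 0. */
static inline int
sketch_lock_prefix_enabled(void)
{
	/* An empty MPLOCKED leaves only the "" literal, whose size is 1. */
	return (sizeof(MPLOCKED "") > 1);
}
```

Building this with -DKLD_MODULE or -DSMP turns the prefix on. That matches
John's point: a locked instruction in a generic module is merely slower on
UP, while an unlocked one is fatal on SMP.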
Like Konstantin says, this issue needs to be revisited.
--HPS