drm-current-kmod-4.16.g20190424 hangs

Jakob Alvermark jakob at alvermark.net
Sat Apr 27 15:58:00 UTC 2019


Hi,


Tried both of them, unsuccessfully.


Jakob

On 2019-04-26 17:23, Johannes Lundberg wrote:
>
> From https://reviews.freebsd.org/D19845: can you try the sysctls
> suggested there?
>
>     In D19845#428854 <https://reviews.freebsd.org/D19845#428854>,
>     @tychon <https://reviews.freebsd.org/p/tychon/> wrote:
>
>         In D19845#428768 <https://reviews.freebsd.org/D19845#428768>,
>         @greg_unrelenting.technology
>         <https://reviews.freebsd.org/p/greg_unrelenting.technology/>
>         wrote:
>
>         Some more i915 GPU testing (w/o the latest update here): after
>         using Firefox (opengl layers, xwayland) for some time, GPU
>         resets start happening
>
>         drmn0: Resetting chip for stuck wait on rcs0
>         drmn0: Resetting chip for stuck wait on rcs0
>         drmn0: Resetting chip for stuck wait on rcs0
>         DMAR0: Fault Overflow
>         DMAR0: vgapci0: pci:0:2:0 sid 10 fault acc 0 adt 0x0 reason 0x5 addr 2e09000
>         DMAR0: Fault Overflow
>         DMAR0: vgapci0: pci:0:2:0 sid 10 fault acc 0 adt 0x0 reason 0x5 addr 2e09000
>
>         and eventually the whole system freezes if I don't quit the
>         compositor / switch to vt console.
>
>     Looks like a symptom of non-translatable physical address. I've
>     encountered drivers which need additional work outside of the
>     scope of this effort. Perhaps this is the case there as I can't
>     find any more cases in the Linux KPI where a physical address is
>     substituted for a DMA one.
>     Also, I assume this is in remap mode. Does it work in identity map
>     mode hw.busdma.default="bounce"? Unless there is an API which
>     escaped, if it works in hw.dmar.enable="0" it's not a regression
>     from before :-/
>
>
>
> On 4/26/19 8:13 AM, Jakob Alvermark wrote:
>> Sure:
>>
>> ---<<BOOT>>---
>> Copyright (c) 1992-2019 The FreeBSD Project.
>> Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
>>     The Regents of the University of California. All rights reserved.
>> FreeBSD is a registered trademark of The FreeBSD Foundation.
>> FreeBSD 13.0-CURRENT #194 r346736M: Fri Apr 26 12:26:20 CEST 2019
>> root at flyer:/usr/obj/usr/src/amd64.amd64/sys/FLYER amd64
>> FreeBSD clang version 8.0.0 (tags/RELEASE_800/final 356365) (based on 
>> LLVM 8.0.0)
>> VT(efifb): resolution 1366x768
>> CPU: Intel(R) Pentium(R) CPU  N3540  @ 2.16GHz (2166.72-MHz K8-class 
>> CPU)
>>   Origin="GenuineIntel"  Id=0x30678  Family=0x6  Model=0x37 Stepping=8
>> Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE> 
>>
>> Features2=0x41d8e3bf<SSE3,PCLMULQDQ,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM,SSE4.1,SSE4.2,MOVBE,POPCNT,TSCDLT,RDRAND> 
>>
>>   AMD Features=0x28100800<SYSCALL,NX,RDTSCP,LM>
>>   AMD Features2=0x101<LAHF,Prefetch>
>>   Structured Extended Features=0x2282<TSCADJ,SMEP,ERMS,NFPUSG>
>>   VT-x: PAT,HLT,MTF,PAUSE,EPT,UG,VPID
>>   TSC: P-state invariant, performance statistics
>> real memory  = 8589934592 (8192 MB)
>> avail memory = 8120422400 (7744 MB)
>> Event timer "LAPIC" quality 600
>> ACPI APIC Table: <ACRSYS ACRPRDCT>
>> WARNING: L1 data cache covers fewer APIC IDs than a core (0 < 1)
>> FreeBSD/SMP: Multiprocessor System Detected: 4 CPUs
>> FreeBSD/SMP: 1 package(s) x 4 core(s)
>> __stack_chk_init: WARNING: Initializing stack protection with 
>> non-random cookies!
>> __stack_chk_init: WARNING: This severely limits the benefit of 
>> -fstack-protector!
>> ioapic0: Changing APIC ID to 2
>> ioapic0 <Version 2.0> irqs 0-86 on motherboard
>> Launching APs: 2 3 1
>> Timecounter "TSC-low" frequency 1083359641 Hz quality 1000
>> Cuse v0.1.36 @ /dev/cuse
>> random: entropy device external interface
>> kbd1 at kbdmux0
>> module_register_init: MOD_LOAD (vesa, 0xffffffff81150570, 0) error 19
>> random: registering fast source Intel Secure Key RNG
>> random: fast provider: "Intel Secure Key RNG"
>> 000.000049 [4254] netmap_init               netmap: loaded module
>> [ath_hal] loaded
>> nexus0
>> efirtc0: <EFI Realtime Clock> on motherboard
>> efirtc0: registered as a time-of-day clock, resolution 1.000000s
>> cryptosoft0: <software crypto> on motherboard
>> acpi0: <ACRSYS ACRPRDCT> on motherboard
>> acpi0: Power Button (fixed)
>> unknown: I/O range not supported
>> cpu0: <ACPI CPU> on acpi0
>> atrtc0: <AT realtime clock> port 0x70-0x77 on acpi0
>> atrtc0: Warning: Couldn't map I/O.
>> atrtc0: registered as a time-of-day clock, resolution 1.000000s
>> Event timer "RTC" frequency 32768 Hz quality 0
>> hpet0: <High Precision Event Timer> iomem 0xfed00000-0xfed003ff irq 8 
>> on acpi0
>> Timecounter "HPET" frequency 14318180 Hz quality 950
>> Event timer "HPET" frequency 14318180 Hz quality 450
>> Event timer "HPET1" frequency 14318180 Hz quality 440
>> Event timer "HPET2" frequency 14318180 Hz quality 440
>> attimer0: <AT timer> port 0x40-0x43,0x50-0x53 irq 0 on acpi0
>> Timecounter "i8254" frequency 1193182 Hz quality 0
>> Event timer "i8254" frequency 1193182 Hz quality 100
>> Timecounter "ACPI-safe" frequency 3579545 Hz quality 850
>> acpi_timer0: <24-bit timer at 3.579545MHz> port 0x408-0x40b on acpi0
>> acpi_ec0: <Embedded Controller: GPE 0x18> port 0x62,0x66 on acpi0
>> pcib0: <ACPI Host-PCI bridge> port 0xcf8-0xcff on acpi0
>> pcib0: Length mismatch for 3 range: 109fffff vs 10a00000
>> pci0: <ACPI PCI bus> on pcib0
>> vgapci0: <VGA-compatible display> port 0x2050-0x2057 mem 
>> 0x90000000-0x903fffff,0x80000000-0x8fffffff at device 2.0 on pci0
>> vgapci0: Boot video device
>> ahci0: <AHCI SATA controller> port 
>> 0x2048-0x204f,0x205c-0x205f,0x2040-0x2047,0x2058-0x205b,0x2020-0x203f 
>> mem 0x9091e000-0x9091e7ff at device 19.0 on pci0
>> ahci0: AHCI v1.30 with 2 3Gbps ports, Port Multiplier not supported
>> ahcich0: <AHCI channel> at channel 0 on ahci0
>> xhci0: <Intel BayTrail USB 3.0 controller> mem 0x90900000-0x9090ffff 
>> at device 20.0 on pci0
>> xhci0: 32 bytes context size, 64-bit DMA
>> xhci0: Port routing mask set to 0xffffffff
>> usbus0 on xhci0
>> usbus0: 5.0Gbps Super Speed USB v3.0
>> pci0: <encrypt/decrypt> at device 26.0 (no driver attached)
>> hdac0: <Intel BayTrail HDA Controller> mem 0x90910000-0x90913fff at 
>> device 27.0 on pci0
>> pcib1: <ACPI PCI-PCI bridge> at device 28.0 on pci0
>> pcib1: [GIANT-LOCKED]
>> pcib2: <ACPI PCI-PCI bridge> at device 28.1 on pci0
>> pcib2: [GIANT-LOCKED]
>> pci1: <ACPI PCI bus> on pcib2
>> iwn0: <Intel Centrino Advanced 6235> mem 0x90600000-0x90601fff at 
>> device 0.0 on pci1
>> arc4random: WARNING: initial seeding bypassed the cryptographic 
>> random device because it was not yet seeded and the knob 
>> 'bypass_before_seeding' was enabled.
>> pcib3: <ACPI PCI-PCI bridge> at device 28.2 on pci0
>> pcib3: [GIANT-LOCKED]
>> pci2: <ACPI PCI bus> on pcib3
>> re0: <RealTek 8168/8111 B/C/CP/D/DP/E/F/G PCIe Gigabit Ethernet> port 
>> 0x1000-0x10ff mem 0x90500000-0x90500fff,0x90400000-0x90403fff at 
>> device 0.0 on pci2
>> re0: Using 1 MSI-X message
>> re0: ASPM disabled
>> re0: Chip rev. 0x4c000000
>> re0: MAC rev. 0x00000000
>> miibus0: <MII bus> on re0
>> rgephy0: <RTL8251/8153 1000BASE-T media interface> PHY 1 on miibus0
>> rgephy0:  none, 10baseT, 10baseT-FDX, 10baseT-FDX-flow, 100baseTX, 
>> 100baseTX-FDX, 100baseTX-FDX-flow, 1000baseT-FDX, 
>> 1000baseT-FDX-master, 1000baseT-FDX-flow, 1000baseT-FDX-flow-master, 
>> auto, auto-flow
>> re0: Using defaults for TSO: 65518/35/2048
>> re0: Ethernet address: c4:54:44:d6:95:39
>> re0: netmap queues/slots: TX 1/256, RX 1/256
>> isab0: <PCI-ISA bridge> at device 31.0 on pci0
>> isa0: <ISA bus> on isab0
>> acpi_button0: <Power Button> on acpi0
>> acpi_button1: <Sleep Button> on acpi0
>> gpio0: <Intel Baytrail GPIO Controller> iomem 0xfed0c000-0xfed0cfff 
>> irq 49 on acpi0
>> gpiobus0: <GPIO bus> on gpio0
>> gpioc0: <GPIO controller> on gpio0
>> gpio1: <Intel Baytrail GPIO Controller> iomem 0xfed0d000-0xfed0dfff 
>> irq 48 on acpi0
>> gpiobus1: <GPIO bus> on gpio1
>> gpioc1: <GPIO controller> on gpio1
>> gpio2: <Intel Baytrail GPIO Controller> iomem 0xfed0e000-0xfed0efff 
>> irq 50 on acpi0
>> gpiobus2: <GPIO bus> on gpio2
>> gpioc2: <GPIO controller> on gpio2
>> sdhci_acpi0: <Intel Bay Trail/Braswell eMMC 4.5/4.5.1 Controller> 
>> iomem 0x90a02000-0x90a02fff irq 44 on acpi0
>> mmc0: <MMC/SD bus> on sdhci_acpi0
>> sdhci_acpi1: <Intel Bay Trail/Braswell SDXC Controller> iomem 
>> 0x90a00000-0x90a00fff irq 47 on acpi0
>> ig4iic_acpi0: <Designware I2C Controller> iomem 0x90a07000-0x90a07fff 
>> irq 32 on acpi0
>> acpi_acad0: <AC Adapter> on acpi0
>> battery0: <ACPI Control Method Battery> on acpi0
>> acpi_lid0: <Control Method Lid Switch> on acpi0
>> acpi_tz0: <Thermal Zone> on acpi0
>> atkbdc0: <Keyboard controller (i8042)> port 0x60,0x64 irq 1 on acpi0
>> atkbd0: <AT Keyboard> irq 1 on atkbdc0
>> kbd0 at atkbd0
>> atkbd0: [GIANT-LOCKED]
>> psm0: <PS/2 Mouse> irq 12 on atkbdc0
>> psm0: [GIANT-LOCKED]
>> psm0: model Synaptics Touchpad, device ID 0
>> uart0: <16550 or compatible> at port 0x3f8 irq 4 flags 0x10 on isa0
>> coretemp0: <CPU On-Die Thermal Sensors> on cpu0
>> est0: <Enhanced SpeedStep Frequency Control> on cpu0
>> ZFS filesystem version: 5
>> ZFS storage pool version: features support (5000)
>> Timecounters tick every 1.000 msec
>> hdacc0: <Realtek ALC283 HDA CODEC> at cad 0 on hdac0
>> hdaa0: <Realtek ALC283 Audio Function Group> at nid 1 on hdacc0
>> hdaa0: Coef 0x06 val 0x2104 -> 0x2100
>> hdaa0: Coef 0x45 val 0xc429 -> 0xd429
>> hdaa0: Coef 0x1b val 0x080b -> 0x0c2b
>> hdaa0: Coef 0x32 val 0x4ea3 -> 0x4ea3
>> pcm0: <Realtek ALC283 (Analog 2.0+HP/2.0)> at nid 20,33 and 18 on hdaa0
>> hdacc1: <Intel (0x2882) HDA CODEC> at cad 2 on hdac0
>> hdaa1: <Intel (0x2882) Audio Function Group> at nid 1 on hdacc1
>> pcm1: <Intel (0x2882) (HDMI/DP 8ch)> at nid 4 on hdaa1
>> ugen0.1: <0x8086 XHCI root HUB> at usbus0
>> uhub0: <0x8086 XHCI root HUB, class 9/0, rev 3.00/1.00, addr 1> on 
>> usbus0
>> ada0 at ahcich0 bus 0 scbus0 target 0 lun 0
>> ada0: <SAMSUNG MZ7PD128HCFV-000H1 DXM01H0Q> ACS-2 ATA SATA 3.x device
>> ada0: Serial Number S1MBNSAFA22012
>> ada0: 300.000MB/s transfers (SATA 2.x, UDMA6, PIO 8192bytes)
>> ada0: Command Queueing enabled
>> ada0: 122104MB (250069680 512 byte sectors)
>> ada0: quirks=0x3<4K,NCQ_TRIM_BROKEN>
>> mmc0: No compatible cards found on bus
>> iicbus0: <Philips I2C bus> on ig4iic_acpi0
>> iicsmb0: <SMBus over I2C bridge> on iicbus0
>> smbus0: <System Management Bus> on iicsmb0
>> Trying to mount root from zfs:flyer2/ROOT/default []...
>> Root mount waiting for: usbus0
>> uhub0: 7 ports with 7 removable, self powered
>> Root mount waiting for: usbus0
>> ugen0.2: <vendor 0x05e3 USB2.0 Hub> at usbus0
>> uhub1 on uhub0
>> uhub1: <vendor 0x05e3 USB2.0 Hub, class 9/0, rev 2.00/85.37, addr 1> 
>> on usbus0
>> uhub1: 4 ports with 3 removable, self powered
>> Root mount waiting for: usbus0
>> ugen0.3: <vendor 0x8087 product 0x07da> at usbus0
>> Root mount waiting for: usbus0
>> ugen0.4: <Cisco-Linksys Compact Wireless-G USB Adapter> at usbus0
>> ugen0.5: <SunplusIT INC. HD WebCam> at usbus0
>> random: unblocking device.
>> drmn0: <drmn> on vgapci0
>> vgapci0: child drmn0 requested pci_enable_io
>> [drm] Unable to create a private tmpfs mount, hugepage support will 
>> be disabled(-19).
>> Successfully added WC MTRR for [0x80000000-0x8fffffff]: 0;
>> [drm] Got stolen memory base 0x7b000000, size 0x4000000
>> [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
>> [drm] Driver supports precise vblank timestamp query.
>> [drm] Connector VGA-1: get mode from tunables:
>> [drm]   - kern.vt.fb.modes.VGA-1
>> [drm]   - kern.vt.fb.default_mode
>> [drm] Connector DP-1: get mode from tunables:
>> [drm]   - kern.vt.fb.modes.DP-1
>> [drm]   - kern.vt.fb.default_mode
>> [drm] Connector HDMI-A-1: get mode from tunables:
>> [drm]   - kern.vt.fb.modes.HDMI-A-1
>> [drm]   - kern.vt.fb.default_mode
>> [drm] Connector eDP-1: get mode from tunables:
>> [drm]   - kern.vt.fb.modes.eDP-1
>> [drm]   - kern.vt.fb.default_mode
>> [drm] Initialized i915 1.6.0 20171222 for drmn0 on minor 0
>> ichwd0: <Intel Bay Trail SoC watchdog timer> on isa0
>> VT: Replacing driver "efifb" with new "fb".
>> start FB_INFO:
>> type=11 height=768 width=1366 depth=32
>> cmsize=16 size=4227072
>> pbase=0x80000000 vbase=0xfffff80080000000
>> name=drmn0 flags=0x0 stride=5504 bpp=32
>> cmap[0]=0 cmap[1]=7f0000 cmap[2]=7f00 cmap[3]=c4a000
>> end FB_INFO
>> drmn0: fb0: inteldrmfb frame buffer device
>> wlan0: Ethernet address: 80:00:0b:5a:cd:23
>> lo0: link state changed to UP
>> iwn0: iwn_read_firmware: ucode rev=0x12a80601
>> re0: link state changed to DOWN
>> wlan0: link state changed to UP
>> ubt0 on uhub1
>> ubt0: <vendor 0x8087 product 0x07da, class 224/1, rev 2.00/78.69, 
>> addr 2> on usbus0
>> rum0 on uhub1
>> rum0: <Cisco-Linksys Compact Wireless-G USB Adapter, class 0/0, rev 
>> 2.00/0.01, addr 3> on usbus0
>> rum0: MAC/BBP RT2573 (rev 0x2573a), RF RT2528
>> WARNING: attempt to domain_add(bluetooth) after domainfinalize()
>> WARNING: attempt to domain_add(netgraph) after domainfinalize()
>> ubt0: ubt_bulk_read_callback:979: bulk-in transfer failed: 
>> USB_ERR_STALLED
>> wlan1: Ethernet address: 00:18:f8:34:d2:8d
>> wlan1: link state changed to UP
>> .
>> Security policy loaded: MAC/ntpd (mac_ntpd)
>> .
>> [drm] GPU HANG: ecode 7:0:0x86f2fffd, in Xorg [100491], reason: Hang 
>> on rcs0, action: reset
>> drmn0: Resetting chip after gpu hang
>> drmn0: i915_reset_device timed out, cancelling all in-flight rendering.
>>
>> On 2019-04-26 16:48, Johannes Lundberg wrote:
>>> Hi
>>>
>>> Hmm, this is not good. The only thing I can think of is the dma changes
>>> to base linuxkpi...
>>>
>>> Can you share a dmesg output from boot to crash, or at least to after
>>> driver is loaded?
>>>
>>> Tycho, any ideas?
>>>
>>>
>>> On 4/26/19 5:00 AM, Jakob Alvermark wrote:
>>>> Hi,
>>>>
>>>>
>>>> When I upgraded -current to r346730 drm-current-kmod-4.16.g20190323
>>>> wouldn't load, "device_attach: drmn0 attach returned 19"
>>>>
>>>> So I upgraded drm-current-kmod to 4.16.g20190424.
>>>>
>>>> It loads fine, but shortly after starting Xorg the screen freezes.
>>>>
>>>> The only way out is pressing the power button, it shuts down cleanly.
>>>>
>>>> /var/log/messages shows this:
>>>>
>>>> kernel: [drm] GPU HANG: ecode 7:0:0x60ac6ee9, in Xorg [100385],
>>>> reason: Hang on rcs0, action: reset
>>>> kernel: drmn0: Resetting chip after gpu hang
>>>> syslogd: last message repeated 1 times
>>>> kernel: drmn0: i915_reset_device timed out, cancelling all in-flight
>>>> rendering.
>>>> kernel: .
>>>>
>>>> Tried once more, same thing happened:
>>>>
>>>> kernel: [drm] GPU HANG: ecode 7:0:0x86f2fffd, in Xorg [100491],
>>>> reason: Hang on rcs0, action: reset
>>>> kernel: drmn0: Resetting chip after gpu hang
>>>> syslogd: last message repeated 1 times
>>>> kernel: drmn0: i915_reset_device timed out, cancelling all in-flight
>>>> rendering.
>>>> kernel: .
>>>>
>>>> Reverting back to drm-current-kmod-4.16.g20190323 and -current to
>>>> r346593 (yay boot environments!) it is stable.
>>>>
>>>> This is on a laptop with CPU: Intel(R) Pentium(R) CPU N3540  @
>>>> 2.16GHz (2166.72-MHz K8-class CPU)
>>>> Baytrail graphics.
>>>>
>>>>
>>>> Jakob
>>>>
>>>> _______________________________________________
>>>> freebsd-x11 at freebsd.org mailing list
>>>> https://lists.freebsd.org/mailman/listinfo/freebsd-x11
>>>> To unsubscribe, send any mail to "freebsd-x11-unsubscribe at freebsd.org"

