r343567 aka PAE vs non-PAE merge breaks i386 freebsd
Rodney W. Grimes
freebsd-rwg at pdx.rh.CN85.dnsmgr.net
Sat Feb 23 18:04:12 UTC 2019
> On Sat, Feb 23, 2019 at 11:19:31AM +0200, Konstantin Belousov wrote:
> > On Fri, Feb 22, 2019 at 07:26:44PM -0800, Steve Kargl wrote:
> > > On Thu, Feb 21, 2019 at 10:04:10PM -0800, Steve Kargl wrote:
> > > > On Thu, Feb 21, 2019 at 07:39:25PM -0800, Steve Kargl wrote:
> > > > > r343567 merges the PAE vs non-PAE pmap headers for i386
> > > > > freebsd. After bisection and dealing with the drm-legacy-kmod
> > > > > fallout, I bisected /usr/src to r343567. Building world and
> > > > > a GENERIC kernel and the minimum set of ports to start Xorg
> > > > > on my Dell Latitude D530 laptop, results in a black screen
> > > > > of death and a locked up laptop (no keyboard, mouse, or video).
> > > > >
> > > > > > A comparison of /var/log/Xorg.0.log for r343566 (Xorg loads
> > > > > > and functions) and r343567 (Xorg black screen of death) shows
> > > > > that /boot/modules/i915kms.ko loads correctly as the log
> > > > > files are identical.
> > > > >
> > > > > Comparing dmesg for r343566 to r343567 shows the following
> > > > >
> > > > > --- dmesg.343566 2019-02-20 08:13:07.727202000 -0800
> > > > > +++ dmesg.343567 2019-02-21 19:02:24.469562000 -0800
> > > > > @@ -3,11 +3,11 @@
> > > > > Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
> > > > > The Regents of the University of California. All rights reserved.
> > > > > FreeBSD is a registered trademark of The FreeBSD Foundation.
> > > > > -FreeBSD 13.0-CURRENT r343566 GENERIC i386
> > > > > +FreeBSD 13.0-CURRENT r343567 GENERIC i386
> > > > > FreeBSD clang version 7.0.1 (tags/RELEASE_701/final 349250) (based on LLVM 7.0.1)
> > > > > WARNING: WITNESS option enabled, expect reduced performance.
> > > > > VT(vga): resolution 640x480
> > > > > -CPU: Intel(R) Core(TM)2 Duo CPU T7250 @ 2.00GHz (1995.05-MHz 686-class CPU)
> > > > > +CPU: Intel(R) Core(TM)2 Duo CPU T7250 @ 2.00GHz (1995.04-MHz 686-class CPU)
> > > > > Origin="GenuineIntel" Id=0x6fd Family=0x6 Model=0xf Stepping=13
> > > > > Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
> > > > > Features2=0xe3bd<SSE3,DTES64,MON,DS_CPL,VMX,EST,TM2,SSSE3,CX16,xTPR,PDCM>
> > > > > @@ -16,7 +16,7 @@
> > > > > VT-x: (disabled in BIOS) HLT,PAUSE
> > > > > TSC: P-state invariant, performance statistics
> > > > > real memory = 4294967296 (4096 MB)
> > > > > -avail memory = 3639914496 (3471 MB)
> > > > > +avail memory = 4154175488 (3961 MB)
> > > > >
> > > > > > Somehow the r343567 kernel found an additional 490 MB of memory,
> > > > > > which leads me to believe that after loading i915kms.ko there
> > > > > > are some serious memory stomping issues.
> > > > >
> > > > > > I am willing to do whatever is necessary to fix this issue (short
> > > > > > of mailing the laptop to someone). Is it possible to revert
> > > > > r343567 and move forward?
> > > > >
> > > >
> > > > More info from sysctl. With the "good" r343566, I see
> > > >
> > > > vm.kmem_map_free: 1187033088
> > > > vm.kmem_map_size: 27234304
> > > > vm.kmem_size_scale: 3
> > > > vm.kmem_size_max: 1715470336
> > > > vm.kmem_size_min: 12582912
> > > > vm.kmem_zmax: 65536
> > > > vm.kmem_size: 1214267392
> > > > hw.physmem: 3714269184
> > > > hw.usermem: 3650867200
> > > > hw.realmem: 4294963200
> > > >
> > > > With the problematic r343567, I see
> > > >
> > > > vm.kmem_map_free: 1683152896
> > > > vm.kmem_map_size: 28123136
> > > > vm.kmem_size_scale: 1
> > > > vm.kmem_size_max: 1711276032
> > > > vm.kmem_size_min: 12582912
> > > > vm.kmem_zmax: 65536
> > > > vm.kmem_size: 1711276032
> > > > hw.physmem: 4252360704
> > > > hw.usermem: 4146999296
> > > > hw.realmem: 4294963200
> > > >
> > > > Ideas?
> > > >
> > >
> > > Here's the 'diff -uw' between a verbose dmesg boot of r343566
> > > and dmesg boot of r343567. The memory size looks rather puzzling.
> > > Can the people responsible for the i386 pmap.h merging take a
> > > look?
> > What is puzzling ?
>
> Supposedly, the laptop only has 4 GB of memory. Not sure how
> it finds memory above 4 GB.
It probably has what is called UMA (unified memory
architecture): the graphics framebuffer is mapped into
memory below 4G and the original memory is remapped
above 4G, giving you this little bit of >4G memory that
is triggering PAE now. This may not be desired. Is there
any performance advantage to not turning on PAE in this
situation?
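
As a sanity check on the sizes being discussed, here is a small
Python sketch that only redoes the arithmetic on the dmesg and
sysctl values already posted in this thread. The helper name mib()
is made up for illustration, and nothing here comes from the commit
itself.

```python
# Figures copied from the dmesg and sysctl output quoted in this thread.
MIB = 1024 * 1024

avail_old = 3639914496    # "avail memory" line, r343566
avail_new = 4154175488    # "avail memory" line, r343567
physmem_old = 3714269184  # hw.physmem, r343566
physmem_new = 4252360704  # hw.physmem, r343567


def mib(nbytes):
    """Convert bytes to whole MB (2**20 bytes), truncating as dmesg does."""
    return nbytes // MIB


# How much more memory the r343567 kernel reports than r343566.
extra_avail = avail_new - avail_old
extra_phys = physmem_new - physmem_old

print("avail memory:", mib(avail_old), "MB ->", mib(avail_new), "MB")
print("extra avail memory:", extra_avail, "bytes, about", mib(extra_avail), "MB")
print("extra hw.physmem:", extra_phys, "bytes, about", mib(extra_phys), "MB")
```

The difference in "avail memory" comes out at roughly the 490 MB
mentioned in the original report, which is at least consistent with
a block of RAM that the r343567 kernel can now reach.
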
> I build 343566 and minimum ports needed for Xorg including
> drm-legacy-kmod. I can load xorg, and in fact, I am typing
> this email now on the laptop with vi in xterm.
>
> I build 343567 and minimum ports needed for Xorg including
> drm-legacy-kmod. I try to start Xorg. Black screen of death.
> No mouse. No keyboard. Just a hard reset.
That would be a regression caused by PAE coming into play.
> I build 343567 and minimum ports needed for Xorg including
> drm-legacy-kmod. I load i915kms.ko, do not start Xorg. There
> are surprising streaks/blotches of color on screen. Building any
> port with the system's cc results in occasional segfaults.
>
> > When the kernel boots in PAE mode, it can (and will) make use of physical
> > memory mapped above 4G. I highlighted the SMAP entry which represents
> > such memory, below.
> >
> > kmem_scale was changed in the PAE commit, see the commit message for
> > explanation.
> >
>
> I read it multiple times. It does not explain how to get the
> old pre-343567 behavior where the laptop is usable. It mentions
> two new sysctl entities. One is irrelevant as I don't have 24+ GB
> of memory. The other has this in the commit message:
>
> There are two tunables added: hw.above4g_allow and ...,
I think setting that tunable to 0 on a post-r343567 system
is worth a try.
> the first one is kept enabled for now to evaluate the status
> on HEAD, ...
>
> Well, here's a report that indicates the status is "not okay". The
> commit message also has the afterthought:
>
> Also, VM_KMEM_SIZE_SCALE changed from 3 to 1.
>
> Okay, so what does that mean? Will setting vm.kmem_size_scale to 3
> fix what appears to be some memory corruption or mismanagement?
>
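
On what the scale change means: a rough Python model, under the
assumption that kmem_size is simply memory divided by
vm.kmem_size_scale and then clamped between vm.kmem_size_min and
vm.kmem_size_max. The function kmem_size() below is a hypothetical
illustration, not the kernel code, and it uses the "avail memory"
figure as a stand-in for whatever page count the kernel really uses.

```python
def kmem_size(mem_bytes, scale, size_min, size_max):
    """Rough model: mem/scale, clamped to [size_min, size_max]."""
    size = mem_bytes // max(scale, 1)
    if size_min > 0 and size < size_min:
        size = size_min
    if size_max > 0 and size > size_max:
        size = size_max
    return size


# Values taken from the r343567 sysctl and dmesg output in this thread.
KMEM_MIN = 12582912      # vm.kmem_size_min
KMEM_MAX = 1711276032    # vm.kmem_size_max
AVAIL_NEW = 4154175488   # "avail memory" (assumed stand-in for memory size)

with_scale_1 = kmem_size(AVAIL_NEW, 1, KMEM_MIN, KMEM_MAX)
with_scale_3 = kmem_size(AVAIL_NEW, 3, KMEM_MIN, KMEM_MAX)

print("scale 1:", with_scale_1)  # hits the vm.kmem_size_max ceiling
print("scale 3:", with_scale_3)  # stays below the ceiling
```

Under this model a scale of 1 lands on the vm.kmem_size_max ceiling,
which matches the vm.kmem_size reported on r343567, while a scale of
3 would give a smaller kmem. It says nothing about whether a smaller
kmem would make the corruption go away.
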
--
Rod Grimes rgrimes at freebsd.org
More information about the freebsd-current
mailing list