7.0 broken on e4500

Marius Strobl marius at alchemy.franken.de
Wed Nov 7 13:33:25 PST 2007


On Wed, Nov 07, 2007 at 10:26:48PM +0100, Kris Kennaway wrote:
> Marius Strobl wrote:
> >On Tue, Nov 06, 2007 at 01:38:16AM -0600, Alan Cox wrote:
> >>Kris Kennaway wrote:
> >>
> >>>Marius Strobl wrote:
> >>>
> >>>>On Sun, Nov 04, 2007 at 11:19:31PM +0100, Kris Kennaway wrote:
> >>>>
> >>>>>Kris Kennaway wrote:
> >>>>>
> >>>>>>Marius Strobl wrote:
> >>>>>>
> >>>>>>>On Sat, Oct 06, 2007 at 02:22:30AM -0400, John Baldwin wrote:
> >>>>>>>
> >>>>>>>>On Wednesday 03 October 2007 09:29:44 am Marius Strobl wrote:
> >>>>>>>>
> >>>>>>>>>On Sat, Sep 29, 2007 at 09:56:45PM +0200, Kris Kennaway wrote:
> >>>>>>>>>
> >>>>>>>>>>I get this early during boot with a CVS kernel (updated from last
> >>>>>>>>>>December):
> >>>>>>>>
> >>>>>>>>>>>FreeBSD/SMP: Multiprocessor System Detected: 10 CPUs
> >>>>>>>>>>>panic: tsb_tte_enter: replacing valid kernel mapping
> >>>>>>>>>>>cpuid = 0
> >>>>>>>>>>>KDB: enter: panic
> >>>>>>>>>>>[thread pid 0 tid 0 ]
> >>>>>>>>>>>Stopped at      kdb_enter+0x68: ta              %xcc, 1
> >>>>>>>>>>>db> wh
> >>>>>>>>>>>Tracing pid 0 tid 0 td 0xc0744f80
> >>>>>>>>>>>panic() at panic+0x204
> >>>>>>>>>>>tsb_tte_enter() at tsb_tte_enter+0xdc
> >>>>>>>>>>>pmap_enter_locked() at pmap_enter_locked+0x2d0
> >>>>>>>>>>>pmap_enter() at pmap_enter+0x64
> >>>>>>>>>>>kmem_malloc() at kmem_malloc+0x6e0
> >>>>>>>>>>>page_alloc() at page_alloc+0x28
> >>>>>>>>>>>uma_large_malloc() at uma_large_malloc+0x44
> >>>>>>>>>>>malloc() at malloc+0x1b0
> >>>>>>>>>>>sf_buf_init() at sf_buf_init+0xf8
> >>>>>>>>>>>mi_startup() at mi_startup+0x18c
> >>>>>>>>>>>btext() at btext+0x34
> >>>>>>>>>Do you by chance load the new kernel manually via the loader
> >>>>>>>>>prompt, with the old kernel being <= 8MB in size and the new
> >>>>>>>>>one > 8MB?
> >>>>>>>>I get this panic on an E220R at work, but my "new" kernel is 
> >>>>>>>>smaller.
> >>>>>>>>
> >>>>>>>If the actual panic string is "vm_phys_paddr_to_vm_page: paddr <foo>
> >>>>>>>is not in any segment" then that's the problem I had in mind when
> >>>>>>>replying to Kris but unfortunately failed to describe the right
> >>>>>>>way around.
> >>>>>>>
> >>>>>>>>>ll /boot/kernel/kernel* /boot/test/kernel*
> >>>>>>>>-r-xr-xr-x  1 root  wheel   7821094 Feb  6  2007 /boot/kernel/kernel
> >>>>>>>>-r-xr-xr-x  1 root  wheel  13902501 Feb  6  2007 /boot/kernel/kernel.symbols
> >>>>>>>>-r-xr-xr-x  1 root  wheel   4534968 Oct  6 00:20 /boot/test/kernel
> >>>>>>>>-r-xr-xr-x  1 root  wheel  10101980 Oct  6 00:20 /boot/test/kernel.symbols
> >>>>>>>>
> >>>>>>>>The working kernel (~7MB) is the GENERIC kernel, and the "test"
> >>>>>>>>kernel is the stripped down kernel for this machine.  In my case
> >>>>>>>>I'm panicking in pmap_remove_tte() called from pmap_enter_locked().
> >>>>>>>>I added some KTR traces to the pmap code to try and investigate,
> >>>>>>>>but I'm guessing the root problem is that the loader doesn't
> >>>>>>>>properly handle telling OFW about needing to change the mappings
> >>>>>>>>when unloading and then loading a new kernel?
> >>>>>>>>
> >>>>>>>>Hmm, it looks like currently the loader doesn't do any sort of MD
> >>>>>>>>callback when unloading a file, so the loader isn't going to free
> >>>>>>>>up the RAM it asked for from OFW for the old kernel.
> >>>>>>>>
> >>>>>>>Correct, the immediate problem (which I had a patch for somewhere)
> >>>>>>>is that in case the "old" kernel required more TLB slots to be used
> >>>>>>>than the "new" one, one can't use the kernel end in order to determine
> >>>>>>>how many slots are used for the kernel map. As you describe, the real
> >>>>>>>problem lies within the loader though. The funny thing is that no
> >>>>>>>arch except sparc64 and sun4v seems to rely on the kernel end
> >>>>>>>provided by the loader.
> >>>>>>>I've no idea what's the cause of the problem Kris is seeing though.
> >>>>>>>
> >>>>>>>Marius
> >>>>>>>
> >>>>>>>
> >>>>>>FYI one of the e4500's is now booting again but another is still 
> >>>>>>failing with the same panic:
> >>>>>>
> >>>>>>FreeBSD 8.0-CURRENT #44: Mon Nov  5 01:52:42 JST 2007
> >>>>>>  root@e4500-2.allbsd.org:/usr/src/sys/sparc64/compile/E4500_2
> >>>>>>real memory  = 9663676416 (9216 MB)
> >>>>>>avail memory = 9433554944 (8996 MB)
> >>>>>>cpu0: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
> >>>>>>cpu1: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
> >>>>>>cpu2: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
> >>>>>>cpu3: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
> >>>>>>cpu4: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
> >>>>>>cpu5: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
> >>>>>>cpu6: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
> >>>>>>cpu7: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
> >>>>>>cpu8: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
> >>>>>>cpu9: Sun Microsystems UltraSparc-II Processor (400.00 MHz CPU)
> >>>>>>FreeBSD/SMP: Multiprocessor System Detected: 10 CPUs
> >>>>>>panic: tsb_tte_enter: replacing valid kernel mapping
> >>>>>>db> wh
> >>>>>>Tracing pid 0 tid 0 td 0xc056ad30
> >>>>>>panic() at panic+0x248
> >>>>>>tsb_tte_enter() at tsb_tte_enter+0xdc
> >>>>>>pmap_enter_locked() at pmap_enter_locked+0x318
> >>>>>>pmap_enter() at pmap_enter+0x64
> >>>>>>kmem_malloc() at kmem_malloc+0x644
> >>>>>>page_alloc() at page_alloc+0x28
> >>>>>>uma_large_malloc() at uma_large_malloc+0x44
> >>>>>>malloc() at malloc+0x1a0
> >>>>>>sf_buf_init() at sf_buf_init+0xe8
> >>>>>>mi_startup() at mi_startup+0x1e8
> >>>>>>btext() at btext+0x34
> >>>>>>
> >>Can anyone tell me more about the "vm_phys_paddr_to_vm_page: paddr <foo> 
> >>is not in any segment" panic?
> >>
> >
> >The relevant info should also be above; if one unloads a kernel
> >in the loader and loads another one which occupies fewer TLB
> >slots than the previous one, the excess slots aren't flushed.
> >The kernel in turn, however, relies on the MODINFOMD_KERNEND
> >provided by the loader (i.e. the ekva supplied to pmap_bootstrap())
> >for calculating the start of KVA, which doesn't include the excess
> >slots with locked entries entered by the loader.
> >Typical panics look like:
> >cpu0: Sun Microsystems UltraSparc-IIi Processor (440.16 MHz CPU)
> >panic: vm_phys_paddr_to_vm_page: paddr 0x1e01a000 is not in any segment
> >cpuid = 0
> >KDB: enter: panic
> >[thread pid 0 tid 0 ]
> >Stopped at      kdb_enter+0x68: ta              %xcc, 1
> >db> bt
> >Tracing pid 0 tid 0 td 0xc06a2780
> >panic() at panic+0x204
> >vm_phys_paddr_to_vm_page() at vm_phys_paddr_to_vm_page+0x84
> >pmap_remove_tte() at pmap_remove_tte+0x44
> >pmap_enter_locked() at pmap_enter_locked+0x1b4
> >pmap_enter() at pmap_enter+0x94
> >kmem_malloc() at kmem_malloc+0x69c
> >page_alloc() at page_alloc+0x28
> >uma_large_malloc() at uma_large_malloc+0x44
> >malloc() at malloc+0xc4
> >sf_buf_init() at sf_buf_init+0xf8
> >mi_startup() at mi_startup+0x18c
> >btext() at btext+0x34
> >db>
> >
> >Marius
> >
> >
> 
> Well, except I'm not unloading the kernel, just letting it boot the 
> default /boot/kernel/kernel.
> 

Yup, as also written above, you're obviously facing a different
problem. I just initially thought it might be the same.
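
For reference, the slot mismatch from the quoted explanation (the
unload/reload case, not the panic you're seeing) can be sketched
roughly as below. This is only an illustration under stated
assumptions and not the actual loader or pmap code: the 4MB slot
size and the names SLOT_SIZE, slots_for_kernel() and stale_slots()
are hypothetical and chosen for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: each locked kernel TLB slot maps 4MB. */
#define SLOT_SIZE ((uint64_t)4 * 1024 * 1024)

/* Number of slots needed to cover [kernbase, kernend), rounded up. */
static unsigned
slots_for_kernel(uint64_t kernbase, uint64_t kernend)
{
	assert(kernend >= kernbase);
	return ((unsigned)((kernend - kernbase + SLOT_SIZE - 1) / SLOT_SIZE));
}

/*
 * Slots entered for the old kernel that a slot count derived from the
 * new kernel end does not account for.  Zero if the new kernel needs
 * at least as many slots as the old one.
 */
static unsigned
stale_slots(uint64_t kernbase, uint64_t old_end, uint64_t new_end)
{
	unsigned old_slots = slots_for_kernel(kernbase, old_end);
	unsigned new_slots = slots_for_kernel(kernbase, new_end);

	return (old_slots > new_slots ? old_slots - new_slots : 0);
}
```

With a made-up kernel base of 0xc0000000, an old kernel ending 9MB
above it and a new one ending 5MB above it, stale_slots() yields 1,
i.e. one locked entry remains that a KVA start computed from the new
kernel end doesn't know about.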

Marius