SMP options and core dump failure

Yong Rao yrao at force10networks.com
Tue Jul 3 01:11:07 UTC 2007


Hello,

We have a problem with an SMP kernel: it fails to write a core dump when
a crash happens.

I have isolated the problem to kernel configurations that have SMP
enabled and are running on 2 CPUs. With one CPU the core dump works fine.

I built a GENERIC kernel and deliberately crashed it (for testing
purposes). The core dump worked fine.

I then added only "options SMP", rebuilt, and crashed the kernel again.
This time it hung before any pages were dumped out.
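
A kernel configuration of this kind can be as small as the sketch below.
It assumes the stock GENERIC file is pulled in unchanged with "include";
the ident name SMPTEST is a placeholder for illustration, not a name taken
from the configuration used in these tests.

    include GENERIC
    ident   SMPTEST
    options SMP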

 

Has anyone successfully taken a core dump on a multi-CPU system running
an SMP kernel?

I tried two different boxes (different motherboards, CPUs and hard
disks). Both failed.

I tried enabling DDB, but I don't know what to look for once the system
drops into ddb. I would appreciate any pointers.

a) The CPU information is:

CPU: Dual Core AMD Opteron(tm) Processor 280 (2405.47-MHz 686-class CPU)
  Origin = "AuthenticAMD"  Id = 0x20f12  Stepping = 2
  Features=0x178bfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,MMX,FXSR,SSE,SSE2,HTT>
  Features2=0x1<SSE3>
  AMD Features=0xe2500800<SYSCALL,NX,MMX+,FFXSR,LM,3DNow+,3DNow>
  AMD Features2=0x3<LAHF,CMP>
  Cores per package: 2

b) We also tried another motherboard, which has 2 CPUs. Its CPU
information is below.

CPU: Intel(R) Xeon(TM) CPU 2.80GHz (2800.11-MHz 686-class CPU)
  Origin = "GenuineIntel"  Id = 0xf29  Stepping = 9
  Features=0xbfebfbff<FPU,VME,DE,PSE,TSC,MSR,PAE,MCE,CX8,APIC,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS,HTT,TM,PBE>
  Features2=0x4400<CNTX-ID,<b14>>
real memory  = 2147418112 (2047 MB)
avail memory = 2096300032 (1999 MB)
ACPI APIC Table: <A M I  OEMAPIC >
FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
 cpu0 (BSP): APIC ID:  0
 cpu1 (AP): APIC ID:  6

c) The following is the console output from when the dump hung.

mem dump: start address = 0x4352, len=0x30

Fatal trap 12: page fault while in kernel mode
cpuid = 1; apic id = 01
fault virtual address   = 0x4352
fault code              = supervisor read, page not present
instruction pointer     = 0x20:0xc9e9fc92
stack pointer           = 0x28:0xebdbdbdc
frame pointer           = 0x28:0xebdbdbf8
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, def32 1, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 74231 (pnicdbg)
trap number             = 12
panic: page fault
cpuid = 1
Uptime: 1d18h27m42s
Dumping 4030 MB (2 chunks)
  chunk 0: 1MB (154 pages) ... ok
  chunk 1: 4031MB (1031776 pages)   (stopped and hung here)

Thanks,

 

Yong Rao
Force10 Networks Inc.
350 Holger Way
San Jose, CA 95132
408 571 6317



More information about the freebsd-questions mailing list