Kernel panics in ahc during load with stable built 18th Feb
Tony Frank
tfrank at optushome.com.au
Thu Feb 19 05:04:14 PST 2004
Hi all again,
Some more updates on my problems.
On Thu, Feb 19, 2004 at 04:23:36PM +1100, Tony Frank wrote:
> As per the subject I seem to be getting kernel panics in ahc
> driver since upgrading my kernel & world to -stable.
> This occurs specifically when writing high volume of files to
> vinum raid5 volume spanning 4 scsi drives connected to Adaptec 2940 PCI
> controller.
>
> The fault appears to be reproducable - every time I try to extract
> a tar file containing a copy of /usr/obj from another system.
>
> vinum init and a lot of benchmarking (rawio and bonnie) work fine on the
> volume.
>
> dmesg lines:
> ahc0: <Adaptec 2940 Ultra SCSI adapter> port 0xb400-0xb4ff mem 0xe0800000-0xe080
> 0fff irq 10 at device 11.0 on pci0
>
> Custom kernel is configured with:
> options AHC_ALLOW_MEMIO
>
> I setup a serial console and rebuilt kernel to include debugging bits.
>
> Fatal trap 12: page fault while in kernel mode
> fault virtual address = 0x5c
> fault code = supervisor read, page not present
> instruction pointer = 0x8:0xc015cab2
> stack pointer = 0x10:0xc02e2b58
> frame pointer = 0x10:0xc02e2b68
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, def32 1, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = Idle
> interrupt mask = cam
> kernel: type 12 trap, code=0
> Stopped at ahc_done+0xc2: pushl 0x5c(%ebx)
>
> db> trace
> ahc_done(c0f9a200,c0fb53c0) at ahc_done+0xc2
> ahc_run_qoutfifo(c0f9a200) at ahc_run_qoutfifo+0xf1
> ahc_platform_intr(c0f9a200,0,c02e2bf8,c027ab82,c0322458) at ahc_platform_intr+0x
> 174
> add_interrupt_randomness(c0322458,0,400010,c0300010,c0300010) at add_interrupt_r
> andomness+0xe
> Xresume10() at Xresume10+0x2b
> --- interrupt, eip = 0xc027fa46, esp = 0xc02e2bf0, ebp = 0xc02e2bf8 ---
> cpu_idle(e,633,2,80f9ff,0) at cpu_idle+0xe
> idle_loop() at idle_loop+0x1d
This problem easily occured after ~20 mins of extracting the mentioned tar file.
I rebuilt my kernel without options AHC_ALLOW_MEMIO.
With the new kernel the tar file extracted without any panics.
When I then tried to 'stress' the system a bit more I did get a bunch of messages
from ahc0 driver plus a panic.
My "stress test" consisted of:
FTP download (~500meg file over 100Mbps LAN using fxp0 from server)
tar xf objtest.tar (~500meg tar file containing /usr/obj /usr/src copy)
second tar xf objtest.tar (second time in different filesystem on ata disks)
cvsup (stable from local cvsup mirror server)
This was busy making noise & heat for about 1hr 40 mins before it died as mentioned.
As such without the AHC_ALLOW_MEMIO option it worked a bit longer but still failed.
Details of the new failure are included below:
tar: Skipping to next header
tar: Skipping to next header
tar: Skipping to next header
tar: Archive contains obsolescent base-64 headers
tar: Skipping to next header
tar: Skipping to next header
tar: Error exit delayed from previous errors
> ahc0:A:1: no active SCB for reconnecting target - issuing BUS DEVICE RESET
SAVED_SCSIID == 0x17, SAVED_LUN == 0x0, ARG_1 == 0x27 ACCUM = 0x27
SEQ_FLAGS == 0xc0, SCBPTR == 0xa, BTT == 0xff, SINDEX == 0x31
SCSIID == 0x27, SCB_SCSIID == 0x27, SCB_LUN == 0x0, SCB_TAG == 0x27, SCB_CONTROL
== 0x64
SCSIBUSL == 0x27, SCSISIGI == 0xe6
SXFRCTL0 == 0x88
SEQCTL == 0x10
>>>>>>>>>>>>>>>>>> Dump Card State Begins <<<<<<<<<<<<<<<<<
ahc0: Dumping Card State in Message-in phase, at SEQADDR 0x1b8
Card was paused
ACCUM = 0x27, SINDEX = 0x31, DINDEX = 0x52, ARG_2 = 0xff
HCNT = 0x0 SCBPTR = 0xa
SCSISIGI[0xe6]:(REQI|BSYI|MSGI|IOI|CDI) ERROR[0x0]
SCSIBUSL[0x27] LASTPHASE[0xe0]:(MSGI|IOI|CDI) SCSISEQ[0x12]:(ENAUTOATNP|ENRSELI)
SBLKCTL[0x2]:(SELWIDE) SCSIRATE[0x0] SEQCTL[0x10]:(FASTMODE)
SEQ_FLAGS[0xc0]:(NO_CDB_SENT|NOT_IDENTIFIED) SSTAT0[0x7]:(DMADONE|SPIORDY|SDONE)
SSTAT1[0x3]:(REQINIT|PHASECHG) SSTAT2[0x0] SSTAT3[0x0]
SIMODE0[0x0] SIMODE1[0xac]:(ENSCSIPERR|ENBUSFREE|ENSCSIRST|ENSELTIMO)
SXFRCTL0[0x88]:(SPIOEN|DFON) DFCNTRL[0x0] DFSTATUS[0x2d]:(FIFOEMP|DFTHRESH|HDONE
|FIFOQWDEMP)
STACK: 0x12c 0x0 0x151 0x192
SCB count = 50
Kernel NEXTQSCB = 38
Card NEXTQSCB = 38
QINFIFO entries:
Waiting Queue entries:
Disconnected Queue entries: 10:39 11:46 14:39
QOUTFIFO entries:
Sequencer Free SCB List: 12 9 6 2 3 7 8 15 1 4 0 5 13
Sequencer SCB Info:
0 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x37]
SCB_LUN[0x0] SCB_TAG[0xff]
1 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0xff]
2 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
3 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x37]
SCB_LUN[0x0] SCB_TAG[0xff]
4 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
5 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x37]
SCB_LUN[0x0] SCB_TAG[0xff]
6 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x37]
SCB_LUN[0x0] SCB_TAG[0xff]
7 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0xff]
8 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
9 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0xff]
10 SCB_CONTROL[0x64]:(DISCONNECTED|TAG_ENB|DISCENB) SCB_SCSIID[0x27]
SCB_LUN[0x0] SCB_TAG[0x27]
11 SCB_CONTROL[0x64]:(DISCONNECTED|TAG_ENB|DISCENB) SCB_SCSIID[0x17]
SCB_LUN[0x0] SCB_TAG[0x2e]
12 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x37]
SCB_LUN[0x0] SCB_TAG[0xff]
13 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x7]
SCB_LUN[0x0] SCB_TAG[0xff]
14 SCB_CONTROL[0x64]:(DISCONNECTED|TAG_ENB|DISCENB) SCB_SCSIID[0x17]
SCB_LUN[0x0] SCB_TAG[0x27]
15 SCB_CONTROL[0xe0]:(TAG_ENB|DISCENB|TARGET_SCB) SCB_SCSIID[0x17]
SCB_LUN[0x0] SCB_TAG[0xff]
Pending list:
39 SCB_CONTROL[0x60]:(TAG_ENB|DISCENB) SCB_SCSIID[0x27] SCB_LUN[0x0]
46 SCB_CONTROL[0x60]:(TAG_ENB|DISCENB) SCB_SCSIID[0x17] SCB_LUN[0x0]
46 SCB_CONTROL[0x60]:(TAG_ENB|DISCENB) SCB_SCSIID[0x17] SCB_LUN[0x0]
[ ... this line repeats ~256 times ... ]
Kernel Free SCB list: 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 3
9 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39
39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39
39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 3
9 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39
39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39
39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 3
9 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39
39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39
39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39 39
<<<<<<<<<<<<<<<<< Dump Card State Ends >>>>>>>>>>>>>>>>>>
Fatal trap 12: page fault while in kernel mode
fault virtual address = 0x2c
fault code = supervisor read, page not present
instruction pointer = 0x8:0xc015755f
stack pointer = 0x10:0xccd72c40
frame pointer = 0x10:0xccd72c50
code segment = base 0x0, limit 0xfffff, type 0x1b
= DPL 0, pres 1, def32 1, gran 1
processor eflags = interrupt enabled, resume, IOPL = 0
current process = 165 (top)
interrupt mask = cam
kernel: type 12 trap, code=0
Stopped at ahc_match_scb+0xa3: movl 0x2c(%eax),%eax
db> trace
ahc_match_scb(c0f9a200,c0fb54c0,1,41,ffffffff,ff,1) at ahc_match_scb+0xa3
ahc_abort_scbs(c0f9a200,1,41,ffffffff,ff) at ahc_abort_scbs+0x2d9
ahc_handle_devreset(c0f9a200,ccd72d44,17,c02acb84,0) at ahc_handle_devreset+0x2c
ahc_handle_scsiint(c0f9a200,64,c0322458,2,ccd72dc0) at ahc_handle_scsiint+0x95f
ahc_platform_intr(c0f9a200,ccd72e48,ccd72e1c,c027ab82,c0322458) at ahc_platform_
intr+0x1f7
add_interrupt_randomness(c0322458,0,1030010,c7f80010,ccd70010) at add_interrupt_
randomness+0xe
Xresume10() at Xresume10+0x2b
--- interrupt, eip = 0xc0188a2f, esp = 0xccd72e08, ebp = 0xccd72e1c ---
sysctl_find_oid(ccd72ef8,2,ccd72e44,ccd72e48,ccd72e70) at sysctl_find_oid+0x1b
sysctl_root(0,ccd72ef8,2,ccd72e70,0) at sysctl_root+0x22
userland_sysctl(c7f825a0,ccd72ef8,2,bfbff9cc,bfbff9d4) at userland_sysctl+0x111
__sysctl(c7f825a0,ccd72f80,2815578c,bfbff9d8,2) at __sysctl+0x5c
syscall2(2f,c107002f,bfbf002f,2,bfbff9d8) at syscall2+0x1f5
Xint0x80_syscall() at Xint0x80_syscall+0x25
db> ps
pid proc addr uid ppid pgrp flag stat wmesg wchan cmd
354 c7f801e0 cd00f000 1001 249 354 004006 2 cvsup
288 c7f80860 ccff5000 1001 240 288 004086 3 ttyin c1115830 ftp
249 c7f80380 cd00c000 1001 248 249 2004086 3 pause cd00c260 tcsh
248 c7f80520 cd007000 1001 246 118 000184 3 select c0324168 sshd
246 c7f806c0 cd002000 0 118 118 000184 3 sbwait cc5bfec8 sshd
240 c7f80a00 ccfed000 1001 239 240 2004086 3 pause ccfed260 tcsh
239 c7f80d40 ccf5e000 1001 237 118 000184 2 sshd
237 c7f80ee0 ccf7a000 0 118 118 000184 3 sbwait cc5be308 sshd
223 c7f80ba0 ccf6e000 1001 222 223 004086 3 ttyin c0f9aa30 tcsh
222 c7f81080 ccf57000 1001 220 118 000184 3 select c0324168 sshd
220 c7f813c0 cce0b000 0 118 118 000184 3 sbwait cc5be548 sshd
171 c7f81560 ccded000 1001 153 171 004186 2 systat
165 c7f825a0 ccd70000 1001 152 165 004106 2 top
159 c7f818a0 ccda1000 1001 158 159 004086 3 ttyin c105b810 tcsh
158 c7f81a40 ccd9c000 0 1 158 004186 3 wait c7f81a40 login
153 c7f82dc0 ccd2f000 0 1 153 004186 3 wait c7f82dc0 login
152 c7f82f60 ccd27000 0 1 152 004186 3 wait c7f82f60 login
118 c7f81be0 ccd91000 0 1 118 000184 3 select c0324168 sshd
116 c7f81d80 ccd84000 0 1 116 000484 2 cron
109 c7f81f20 ccd80000 0 1 104 000084 3 nfsidl c032a6ac nfsiod
--More-- 108 c7f820c0 ccd7c000 0 1 104 000084 3 nfsidl c032a6a8 nfsiod
107 c7f82260 ccd78000 0 1 104 000084 3 nfsidl c032a6a4 nfsiod
106 c7f82400 ccd74000 0 1 104 000084 3 nfsidl c032a6a0 nfsiod
101 c7f82c20 ccd37000 0 1 101 000084 2 ntpd
96 c7f82a80 ccd3b000 0 1 96 000004 2 syslogd
69 c7f82740 ccd44000 0 1 69 000084 3 select c0324168 dhclient
29 c7f828e0 ccd3f000 0 1 29 2000084 3 pause ccd3f260 adjkerntz
9 c7f83100 ccabf000 0 0 0 000204 3 vlruwt c7f83100 vnlru
8 c7f832a0 ccabc000 0 0 0 000204 2 syncer
7 c7f83440 ccab9000 0 0 0 000204 3 vrlock c104c000 bufdaemon
6 c7f835e0 ccab6000 0 0 0 000204 3 psleep c031b260 vmdaemon
5 c7f83780 ccab3000 0 0 0 000204 3 psleep c02ffef8 pagedaemon
4 c7f83920 cc5b8000 0 0 0 000204 3 idle c0f9a200 aic_recovery
0
3 c7f83ac0 cc5b5000 0 0 0 000204 3 idle c0f9a200 aic_recovery
0
2 c7f83c60 c856b000 0 0 0 000204 3 tqthr c0324164 taskqueue
1 c7f83e00 c7f88000 0 0 1 004284 3 wait c7f83e00 init
0 c0323460 c0473000 0 0 0 000204 3 sched c0323460 swapper
Any assistance is appreciated,
Tony
More information about the freebsd-stable
mailing list