cvs commit: src/sys/dev/pci pci.c pci_if.m pci_private.h pcivar.h src/sys/dev/sk if_sk.c if_skreg.h

John-Mark Gurney gurney_j at resnet.uoregon.edu
Tue Oct 17 10:57:54 PDT 2006


Oleg Bulyzhin wrote this message on Tue, Oct 17, 2006 at 14:53 +0400:
> On Mon, Oct 16, 2006 at 10:05:17AM -0700, John-Mark Gurney wrote:
> > Oleg Bulyzhin wrote this message on Mon, Oct 16, 2006 at 12:14 +0400:
> > > On Mon, Oct 09, 2006 at 04:15:56PM +0000, John-Mark Gurney wrote:
> > > > jmg         2006-10-09 16:15:56 UTC
> > > > 
> > > >   FreeBSD src repository
> > > > 
> > > >   Modified files:
> > > >     sys/dev/pci          pci.c pci_if.m pci_private.h pcivar.h 
> > > >     sys/dev/sk           if_sk.c if_skreg.h 
> > > >   Log:
> > > >   provide routines to access VPD data at the PCI layer...
> > > >   
> > > >   remove sk's own implementation, and use the new calls to get the data...
> > > >   
> > > >   Reviewed by:    -arch
> > > >   
> > > >   Revision  Changes    Path
> > > >   1.315     +339 -3    src/sys/dev/pci/pci.c
> > > >   1.9       +13 -0     src/sys/dev/pci/pci_if.m
> > > >   1.18      +4 -0      src/sys/dev/pci/pci_private.h
> > > >   1.71      +34 -0     src/sys/dev/pci/pcivar.h
> > > >   1.131     +7 -148    src/sys/dev/sk/if_sk.c
> > > >   1.39      +0 -31     src/sys/dev/sk/if_skreg.h
> > > 
> > > I have problem with my test machine since this commit:
> > > kernel is panicing on boot if i have my pci bge(4) NIC plugged in.
> > > 
> > > Last kernel messages are:
> > > pci1: physical bus=1
> > > pci1:2:0: bad VPD cksum, remain 244
> > > 
> > > Invoking ddb after panic gives this backtrace:
> > > [skipped]
> > > pci_read_vpd
> > > pci_read_extcap
> > > pci_read_device
> > > pci_add_children
> > > acpi_pci_attach
> > > device_attach
> > > [skipped]
> > > 
> > > (i'm unable to get crashdump)
> > > 
> > > If i unplug bge, kernel boots just fine.
> > > 
> > > P.S. i can provide any additional info needed and can test patches.
> > 
> > Can you get a line number from pci_read_vpd?  Even if you can't get a
> > crash dump, you can use addr2line (or kgdb) w/ the ip of the panic...
> > That would help..
> > 
> > Looks like some manufacturers aren't following the PCI standard.. :(
> > 
> > -- 
> >   John-Mark Gurney				Voice: +1 415 225 5579
> > 
> >      "All that I will do, has been done, All that I have, has not."
> 
> Okay, there is some details:
> 
> When kernel boots with bge plugged in:
> 1) message about bad VPD cksum
> 2) then kernel seems to hang for about 30 seconds (i'm not able to break into
> DDB)
> 3) then i get panic.
> 
> Panic message is about trap inside 'swapper' process (backtrace shows 2 traps
> first inside pci_read_vpd, next one in swapper).
> 
> 1st trap is at pci_read_vpd+0x2c6 it is /usr/src/sys/dev/pci/pci.c:665
> I've added debug printf there:
> 		case 3: /* VPD-R Keyword Value */
> 			printf("off: %d i: %d\n", off, i);
> 			cfg->vpd.vpd_ros[off].value[i++] = byte;
> 
> It seems that 30 seconds delay is memory rewriting (off is constant, while i
> is incrementing).  I get panic message after:
> off:5 i:15407

I just realized that if the card puts a length of zero there, we will end
up wrapping dflen..  This should not happened, definately not for RV
which is required to be at least 1 byte in length...  This will allow
non-RV's to be zero bytes in length, but if an RV has zero bytes in
length, we'll treat it as a checksum failure and remove all read-only
VPD data, and not continue on processing read-write data...

Here is an updated patch that should handle your card properly...

-- 
  John-Mark Gurney				Voice: +1 415 225 5579

     "All that I will do, has been done, All that I have, has not."
-------------- next part --------------
Index: pci.c
===================================================================
RCS file: /home/ncvs/src/sys/dev/pci/pci.c,v
retrieving revision 1.315
diff -u -r1.315 pci.c
--- pci.c	9 Oct 2006 16:15:55 -0000	1.315
+++ pci.c	17 Oct 2006 17:56:15 -0000
@@ -653,12 +653,36 @@
 			cfg->vpd.vpd_ros[off].keyword[0] = byte;
 			cfg->vpd.vpd_ros[off].keyword[1] = vpd_nextbyte(&vrs);
 			dflen = vpd_nextbyte(&vrs);
-			cfg->vpd.vpd_ros[off].value = malloc((dflen + 1) *
-			    sizeof *cfg->vpd.vpd_ros[off].value,
-			    M_DEVBUF, M_WAITOK);
+			if (dflen == 0 &&
+			    strncmp(vpd.vpd_ros[off], "RV", 2) == 0) {
+				/*
+				 * if this happens, we can't trust the rest
+				 * of the VPD.
+				 */
+				printf("pci%d:%d:%d: bad keyword length: %d\n",
+				    cfg->bus, cfg->slot, cfg->func, dflen);
+				cksumvalid = 0;
+				end = 1;
+				break;
+			} else if (dflen == 0) {
+				cfg->vpd.vpd_ros[off].value = malloc(1 *
+				    sizeof *cfg->vpd.vpd_ros[off].value,
+				    M_DEVBUF, M_WAITOK)
+				cfg->vpd.vpd_ros[off].value[0] = '\x00';
+			} else
+				cfg->vpd.vpd_ros[off].value = malloc(
+				    (dflen + 1) *
+				    sizeof *cfg->vpd.vpd_ros[off].value,
+				    M_DEVBUF, M_WAITOK);
 			remain -= 3;
 			i = 0;
-			state = 3;
+			/* keep in sync w/ state 3's transistions */
+			if (dflen == 0 && remain == 0)
+				state = 0;
+			else if (dflen == 0)
+				state = 2;
+			else
+				state = 3;
 			break;
 
 		case 3:	/* VPD-R Keyword Value */
@@ -673,10 +697,13 @@
 					    cfg->bus, cfg->slot, cfg->func,
 					    vrs.cksum);
 					cksumvalid = 0;
+					end = 1;
+					break;
 				}
 			}
 			dflen--;
 			remain--;
+			/* keep in sync w/ state 2's transistions */
 			if (dflen == 0)
 				cfg->vpd.vpd_ros[off++].value[i++] = '\0';
 			if (dflen == 0 && remain == 0) {
@@ -736,6 +763,15 @@
 			break;
 		}
 	}
+
+	if (cksumvalid == 0) {
+		/* read-only data bad, clean up */
+		for (; off; off--)
+			free(cfg->vpd.vpd_ros[off].value, M_DEVBUF);
+
+		free(cfg->vpd.vpd_ros, M_DEVBUF);
+		cfg->vpd.vpd_ros = NULL;
+	}
 #undef REG
 }
 


More information about the cvs-all mailing list