kern/59211: System crashes when moving files from NWFS mounted system

Navoyenok Sergei nasa at les.veco.ru
Wed Nov 12 04:30:20 PST 2003


>Number:         59211
>Category:       kern
>Synopsis:       System crashes when moving files from NWFS mounted system
>Confidential:   no
>Severity:       critical
>Priority:       high
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Wed Nov 12 04:30:17 PST 2003
>Closed-Date:
>Last-Modified:
>Originator:     Navoyenok Sergei
>Release:        FreeBSD 4.9-STABLE i386
>Organization:
Liski Powered Networks
>Environment:
System: FreeBSD nasa.les.veco.ru 4.9-STABLE FreeBSD 4.9-STABLE #0: Wed Nov  5 10:54:33 MSK 2003 nasa at nasa.les.veco.ru:/usr/src/sys/compile/NASAPC i386
cvsup date: 4 Nov 2003

Kernel's configuration file is:
machine		i386
cpu		I686_CPU
ident		NASAPC
maxusers	0

options 	INET			#InterNETworking
options 	FFS			#Berkeley Fast Filesystem
options 	FFS_ROOT		#FFS usable as root device [keep this!]
options 	SOFTUPDATES		#Enable FFS soft updates support
options 	UFS_DIRHASH		#Improve performance on big directories
options 	MSDOSFS			#MSDOS Filesystem
options 	CD9660			#ISO 9660 Filesystem
options 	PROCFS			#Process filesystem
options 	COMPAT_43		#Compatible with BSD 4.3 [KEEP THIS!]
options 	UCONSOLE		#Allow users to grab the console
options 	SYSVSHM			#SYSV-style shared memory
options 	SYSVMSG			#SYSV-style message queues
options 	SYSVSEM			#SYSV-style semaphores
options 	P1003_1B		#Posix P1003_1B real-time extensions
options 	_KPOSIX_PRIORITY_SCHEDULING
options 	ICMP_BANDLIM		#Rate limit bad replies
options 	KBD_INSTALL_CDEV	# install a CDEV entry in /dev
options 	AHC_REG_PRETTY_PRINT	# Print register bitfields in debug
					# output.  Adds ~128k to driver.
options 	AHD_REG_PRETTY_PRINT	# Print register bitfields in debug 
					# output.  Adds ~215k to driver.

device		isa
device		eisa
device		pci

# Floppy drives
device		fdc0	at isa? port IO_FD1 irq 6 drq 2
device		fd0	at fdc0 drive 0

# ATA and ATAPI devices
device		ata0	at isa? port IO_WD1 irq 14
device		ata1	at isa? port IO_WD2 irq 15
device		ata
device		atadisk			# ATA disk drives
device		atapicd			# ATAPI CDROM drives
options 	ATA_STATIC_ID		#Static device numbering

# atkbdc0 controls both the keyboard and the PS/2 mouse
device		atkbdc0	at isa? port IO_KBD
device		atkbd0	at atkbdc? irq 1 flags 0x1

device		vga0	at isa?

# splash screen/screen saver
pseudo-device	splash

# syscons is the default console driver, resembling an SCO console
device		sc0	at isa? flags 0x100

device		agp		# support several AGP chipsets

# Floating point support - do not disable.
device		npx0	at nexus? port IO_NPX irq 13

# Power management support (see LINT for more options)
device		apm0	at nexus? disable flags 0x20 # Advanced Power Management

# Serial (COM) ports
device		sio0	at isa? port IO_COM1 flags 0x10 irq 4
device		sio1	at isa? port IO_COM2 irq 3
device		sio2	at isa? disable port IO_COM3 irq 5
device		sio3	at isa? disable port IO_COM4 irq 9

# Parallel port
device		ppc0	at isa? irq 7
device		ppbus		# Parallel port bus (required)
device		lpt		# Printer


# PCI Ethernet NICs that use the common MII bus controller code.
# NOTE: Be sure to keep the 'device miibus' line in order to use these NICs!
device		miibus		# MII bus support
device		rl		# RealTek 8129/8139

# Pseudo devices - the number indicates how many units to allocate.
pseudo-device	loop		# Network loopback
pseudo-device	ether		# Ethernet support
pseudo-device	ppp	1	# Kernel PPP
pseudo-device	tun		# Packet tunnel.
pseudo-device	pty		# Pseudo-ttys (telnet etc)

# For NASAPC only
options		NFS_NOSERVER
options		USER_LDT
options		SC_DISABLE_REBOOT
options		IPX
options		NCP
options		NWFS
options		NETSMB
options		NETSMBCRYPTO
options		LIBMCHAIN
options		LIBICONV
options		SMBFS
device		pcm
device		sbc

# for Debugging
options	DDB
makeoptions	DEBUG=-g


>Description:
  The system crashes when files are moved from a Novell NetWare server, with the diagnostic:
  panic: vrele: negative ref cnt

  Beware of using rw nwfs mounts!

>How-To-Repeat:
  The behaviour is the same on the -stable and -current branches:
  the system crashes under either one.
  The problem was reproduced at different sites with different versions
  of FreeBSD and NetWare.

  I use a Novell NetWare 4.11 server (long filename support, OS/2 namespace).
  Command to mount:
  mount_nwfs -S ServerName -U UserName -V VolumeName -l ru_RU.KOI8-R /MountPoint
  
  The system crashes with a kernel panic when I move files from /MountPoint with mv(1).
  The error shows no obvious pattern; it occurs at random while moving
  a number of files.

>Fix:
This is the debugging session:

nasa# gdb -k kernel.debug.0 vmcore.0
GNU gdb 4.18 (FreeBSD)
Copyright 1998 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB.  Type "show warranty" for details.
This GDB was configured as "i386-unknown-freebsd"...Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 2627 in elfstab_build_psymtabs
Deprecated bfd_read called at /usr/src/gnu/usr.bin/binutils/gdb/../../../../contrib/gdb/gdb/dbxread.c line 933 in fill_symbuf

IdlePTD at phsyical address 0x00383000
initial pcb at physical address 0x002cefc0
panicstr: from debugger
panic messages:
---
panic: vrele: negative ref cnt
panic: from debugger
Uptime: 10m16s

dumping to dev #ad/0x30001, offset 524312
dump ata0: resetting devices .. done
255 254 253 252 251 250 249 248 247 246 245 244 243 242 241 240 239 238 237 236 235 234 233 232 231 230 229 228 227 226 225 224 223 222 221 220 219 218 217 216 215 214 213 212 211 210 209 208 207 206 205 204 203 202 201 200 199 198 197 196 195 194 193 192 191 190 189 188 187 186 185 184 183 182 181 180 179 178 177 176 175 174 173 172 171 170 169 168 167 166 165 164 163 162 161 160 159 158 157 156 155 154 153 152 151 150 149 148 147 146 145 144 143 142 141 140 139 138 137 136 135 134 133 132 131 130 129 128 127 126 125 124 123 122 121 120 119 118 117 116 115 114 113 112 111 110 109 108 107 106 105 104 103 102 101 100 99 98 97 96 95 94 93 92 91 90 89 88 87 86 85 84 83 82 81 80 79 78 77 76 75 74 73 72 71 70 69 68 67 66 65 64 63 62 61 60 59 58 57 56 55 54 53 52 51 50 49 48 47 46 45 44 43 42 41 40 39 38 37 36 35 34 33 32 31 30 29 28 27 26 25 24 23 22 21 20 19 18 17 16 15 14 13 12 11 10 9 8 7 6 5 4 3 2 1 0 
---
#0  dumpsys () at ../../kern/kern_shutdown.c:487
487		if (dumping++) {

(kgdb) where

#0  dumpsys () at ../../kern/kern_shutdown.c:487
#1  0xc0158514 in boot (howto=260) at ../../kern/kern_shutdown.c:316
#2  0xc0158961 in panic (fmt=0xc0260204 "from debugger")
    at ../../kern/kern_shutdown.c:595
#3  0xc0122e29 in db_panic (addr=-1071392883, have_addr=0, count=-1, 
    modif=0xd1685c64 "") at ../../ddb/db_command.c:435
#4  0xc0122dc7 in db_command (last_cmdp=0xc0295f34, cmd_table=0xc0295d74, 
    aux_cmd_tablep=0xc02ca178) at ../../ddb/db_command.c:333
#5  0xc0122e8e in db_command_loop () at ../../ddb/db_command.c:457
#6  0xc012505f in db_trap (type=3, code=0) at ../../ddb/db_trap.c:71
#7  0xc023d53c in kdb_trap (type=3, code=0, regs=0xd1685d6c)
    at ../../i386/i386/db_interface.c:158
#8  0xc024a7c8 in trap (frame={tf_fs = -1070792688, tf_es = 16, 
      tf_ds = -781713392, tf_edi = -781408448, tf_esi = 256, 
      tf_ebp = -781689420, tf_isp = -781689448, tf_ebx = -1071191614, 
      tf_edx = -1071073617, tf_ecx = 32, tf_eax = 18, tf_trapno = 3, 
      tf_err = 0, tf_eip = -1071392883, tf_cs = 8, tf_eflags = 582, 
      tf_esp = -1071073633, tf_ss = -1071211365}) at ../../i386/i386/trap.c:592
#9  0xc023d78d in Debugger (msg=0xc0269c9b "panic") at machine/cpufunc.h:67
#10 0xc0158958 in panic (fmt=0xc026e9c2 "vrele: negative ref cnt")
    at ../../kern/kern_shutdown.c:593
#11 0xc0188492 in vrele (vp=0xd16c9cc0) at ../../kern/vfs_subr.c:1621
#12 0xc01c2b51 in nwfs_reclaim (ap=0xd1685e2c) at ../../nwfs/nwfs_node.c:244
#13 0xc01888b0 in vclean (vp=0xd16ca740, flags=8, p=0xcbfcca00)
    at vnode_if.h:836
#14 0xc0188a18 in vgonel (vp=0xd16ca740, p=0xcbfcca00)
    at ../../kern/vfs_subr.c:2058
#15 0xc01889de in vgone (vp=0xd16ca740) at ../../kern/vfs_subr.c:2031
#16 0xc01c2c01 in nwfs_inactive (ap=0xd1685eb8) at ../../nwfs/nwfs_node.c:271
#17 0xc018850c in vput (vp=0xd16ca740) at vnode_if.h:815
#18 0xc018d64e in rmdir (p=0xcbfcca00, uap=0xd1685f80)
    at ../../kern/vfs_syscalls.c:2851
#19 0xc024b082 in syscall2 (frame={tf_fs = 47, tf_es = 47, tf_ds = 47, 
      tf_edi = 135054443, tf_esi = 135176416, tf_ebp = -1077940464, 
      tf_isp = -781688876, tf_ebx = 135459776, tf_edx = 135176416, 
      tf_ecx = 35, tf_eax = 137, tf_trapno = 7, tf_err = 2, 
      tf_eip = 674555048, tf_cs = 31, tf_eflags = 647, tf_esp = -1077940492, 
      tf_ss = 47}) at ../../i386/i386/trap.c:1175
#20 0xc023e435 in Xint0x80_syscall ()
#21 0x80a7253 in ?? ()
#22 0x805f640 in ?? ()
#23 0x805c40f in ?? ()
#24 0x805bf1a in ?? ()
#25 0x805d00b in ?? ()
#26 0x80517d4 in ?? ()
#27 0x808bb95 in ?? ()
#28 0x805859c in ?? ()
#29 0x8058687 in ?? ()
#30 0x80589c5 in ?? ()
#31 0x8058a90 in ?? ()
#32 0x806b848 in ?? ()
#33 0x806ba1b in ?? ()
#34 0x806c278 in ?? ()
#35 0x804bfee in ?? ()

The panic is raised in frame #10, called from vrele() (frame #11).

(kgdb) up 11
#11 0xc0188492 in vrele (vp=0xd16c9cc0) at ../../kern/vfs_subr.c:1621
1621			panic("vrele: negative ref cnt");

I opened /usr/src/sys/kern/vfs_subr.c and went to line 1621. There the system
panics if v_usecount < 1.
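
The rule behind that panic can be modeled in userland. This is only a simplified sketch, not the kernel source: struct model_vnode, enum model_result and model_vrele() are hypothetical names, and locking and the real inactivation path are left out. It shows the three cases a release can hit, the last of which corresponds to "vrele: negative ref cnt".

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct vnode; only the counter is modeled. */
struct model_vnode {
	int v_usecount;		/* number of active references */
};

enum model_result {
	MODEL_RELEASED,		/* one reference dropped, others remain */
	MODEL_LAST_REF,		/* last reference dropped */
	MODEL_PANIC		/* count was already < 1 */
};

/*
 * Model of the decision a release makes on v_usecount.
 * Assumption: the real vrele() does more work (locking, inactivation);
 * only the counting rule is reproduced here.
 */
static enum model_result
model_vrele(struct model_vnode *vp)
{
	assert(vp != NULL);

	if (vp->v_usecount > 1) {
		vp->v_usecount--;
		return (MODEL_RELEASED);
	}
	if (vp->v_usecount == 1) {
		vp->v_usecount--;
		return (MODEL_LAST_REF);
	}
	/* v_usecount < 1: releasing a reference that nobody holds. */
	return (MODEL_PANIC);
}
```

In this model, a vnode that starts with v_usecount = 0, like the dvp printed in the session below, goes straight to the MODEL_PANIC case.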

Next I go to frame #12, in nwfs_node.c:

(kgdb) up 12
#12 0xc01c2b51 in nwfs_reclaim (ap=0xd1685e2c) at ../../nwfs/nwfs_node.c:244
244			vrele(dvp);

I look at dvp (a pointer to struct vnode):

(kgdb) p dvp
$1 = (struct vnode *) 0xd16c9cc0
(kgdb) p *dvp
$2 = {v_flag = 532480, v_usecount = 0, v_writecount = 0, v_holdcnt = 0, 
  v_id = 1661, v_mount = 0xc1360a00, v_op = 0xc11f2b00, v_freelist = {
    tqe_next = 0x0, tqe_prev = 0xd16d10dc}, v_nmntvnodes = {
    tqe_next = 0xd16d28c0, tqe_prev = 0xd16ca464}, v_cleanblkhd = {
    tqh_first = 0x0, tqh_last = 0xd16c9cec}, v_dirtyblkhd = {tqh_first = 0x0, 
    tqh_last = 0xd16c9cf4}, v_synclist = {le_next = 0x0, le_prev = 0x0}, 
  v_numoutput = 0, v_type = VDIR, v_un = {vu_mountedhere = 0x0, 
    vu_socket = 0x0, vu_spec = {vu_specinfo = 0x0, vu_specnext = {
        sle_next = 0x0}}, vu_fifoinfo = 0x0}, v_lease = 0x0, v_lastw = 0, 
  v_cstart = 0, v_lasta = 0, v_clen = 0, v_object = 0xd166c284, v_interlock = {
    lock_data = 0}, v_vnlock = 0x0, v_tag = VT_NWFS, v_data = 0xc1380200, 
  v_cache_src = {lh_first = 0x0}, v_cache_dst = {tqh_first = 0x0, 
    tqh_last = 0xd16c9d40}, v_dd = 0xd16c9cc0, v_ddid = 0, v_pollinfo = {
    vpi_lock = {lock_data = 0}, vpi_selinfo = {si_pid = 0, si_note = {
        slh_first = 0x0}, si_flags = 0}, vpi_events = 0, vpi_revents = 0}, 
  v_vxproc = 0x0}
(kgdb)

Indeed: the value of the field "v_usecount" is 0!
Since nwfs_reclaim() (whose execution raises the panic) sets dvp itself,
I can eliminate the problem with the following code in nwfs_node.c:

/*
 * Free nwnode, and give vnode back to system
 */
int
nwfs_reclaim(ap)
	struct vop_reclaim_args /* {
		struct vnode *a_vp;
		struct proc *a_p;
	} */ *ap;
{
	struct vnode *dvp = NULL, *vp = ap->a_vp;
..
	if (dvp && dvp->v_usecount) {
		/* corrected code: the dvp->v_usecount test is added */
		vrele(dvp);
	}
	return (0);
}
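
The effect of the added test can be modeled in userland as well. Again this is only a sketch under assumptions: struct sketch_vnode and sketch_guarded_release() are hypothetical names and are not part of nwfs. It shows that the guard skips the release when the count is already zero, so the panic is avoided, but it does not show why the parent's count reached zero before nwfs_reclaim() ran.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the parent directory vnode. */
struct sketch_vnode {
	int v_usecount;		/* number of active references */
};

/*
 * Guarded release, mirroring "if (dvp && dvp->v_usecount) vrele(dvp);".
 * Returns 1 if a reference was dropped, 0 if the release was skipped.
 */
static int
sketch_guarded_release(struct sketch_vnode *dvp)
{
	/* Invariant of the model: a use count is never negative. */
	assert(dvp == NULL || dvp->v_usecount >= 0);

	if (dvp != NULL && dvp->v_usecount > 0) {
		dvp->v_usecount--;
		return (1);
	}
	/* No parent, or nothing left to release: do not touch the count. */
	return (0);
}
```

A return value of 0 for a non-NULL dvp is the case seen in the crash dump: some earlier path has already dropped the reference that nwfs_reclaim() expected to own, which is why this change is probably a workaround rather than a complete fix.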


Please help me fix this problem properly.

>Release-Note:
>Audit-Trail:
>Unformatted:

