From borjam at sarenet.es Mon Nov 2 08:28:50 2009
From: borjam at sarenet.es (Borja Marcos)
Date: Mon Nov 2 08:28:57 2009
Subject: 8.0-RC2: ZFS deadlock with zfs receive
In-Reply-To: <20091030232805.GA2996@garage.freebsd.pl>
References: <804B79F6-27CE-40D2-8AB8-6FC378F448FA@sarenet.es>
    <4AEA0EAD.1050302@memberwebs.com>
    <20091030232805.GA2996@garage.freebsd.pl>
Message-ID:

On Oct 31, 2009, at 12:28 AM, Pawel Jakub Dawidek wrote:

> On Thu, Oct 29, 2009 at 03:52:45PM -0600, Stef Walter wrote:
>> Borja Marcos wrote:
>>> I've been sending some alltraces to pjd about this easy to reproduce
>>> problem. I am using zfs send/zfs receive to replicate a dataset from one
>>> server to another. At 1 minute intervals, an incremental snapshot is
>>> sent to update the dataset copy. If there is reading activity on the
>>> dataset copy, a deadlock can happen rendering ZFS and all the FS
>>> subsystem unusable. I've tried with 8.0RC2 and it still happens.
>
> I was able to reproduce it, but I don't have fix yet.
>
>> FWIW, another (or the same) zfs recv deadlock I've been trying to get to
>> the bottom of:
>>
>> http://lists.freebsd.org/pipermail/freebsd-fs/2009-October/006999.html
>
> Could you guys recompile your kernel after uncommenting line:
>
>     #CFLAGS+=-DDEBUG=1
>
> in sys/modules/zfs/Makefile?
>
> You should see panic on assertion instead of deadlock, I believe.

Aye aye, Sir!

What do you want me to collect? Just an alltrace? Anything else?

If you want the VMWare images to examine the panic by yourself just let me know.

Borja.

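A minimal sketch of the Makefile edit requested in the message above, assuming the file contains the commented line exactly as quoted (`#CFLAGS+=-DDEBUG=1`). The helper name `enable_zfs_debug` is hypothetical and not part of FreeBSD; it only rewrites Makefile text held in memory and leaves every other line untouched.

```python
import re

# Hypothetical helper (not part of FreeBSD): uncomment the DEBUG flag line
# quoted in the thread. Only spaces and tabs are allowed around the flag so
# that the pattern never swallows a newline.
DEBUG_LINE = re.compile(
    r"^#[ \t]*(CFLAGS\+=[ \t]*-DDEBUG=1)[ \t]*$",
    re.MULTILINE,
)


def enable_zfs_debug(makefile_text):
    """Return (new_text, count) where count is the number of lines uncommented."""
    return DEBUG_LINE.subn(r"\1", makefile_text)


if __name__ == "__main__":
    # Illustrative Makefile fragment; the real file has more content.
    sample = "KMOD=\tzfs\n#CFLAGS+=-DDEBUG=1\nSRCS=\tfoo.c\n"
    new_text, count = enable_zfs_debug(sample)
    print(count)
    print(new_text, end="")
```

The function returns a count so a caller can tell whether the line was found at all. After the real file is changed, the kernel and modules still have to be rebuilt and installed before the assertions take effect.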
From borjam at sarenet.es Mon Nov 2 10:35:22 2009
From: borjam at sarenet.es (Borja Marcos)
Date: Mon Nov 2 10:35:29 2009
Subject: 8.0-RC2: ZFS deadlock with zfs receive
In-Reply-To: <20091030232805.GA2996@garage.freebsd.pl>
References: <804B79F6-27CE-40D2-8AB8-6FC378F448FA@sarenet.es>
    <4AEA0EAD.1050302@memberwebs.com>
    <20091030232805.GA2996@garage.freebsd.pl>
Message-ID: <435C3337-40EA-4A79-8774-D394D71342D3@sarenet.es>

On Oct 31, 2009, at 12:28 AM, Pawel Jakub Dawidek wrote:

> Could you guys recompile your kernel after uncommenting line:
>
>     #CFLAGS+=-DDEBUG=1
>
> in sys/modules/zfs/Makefile?
>
> You should see panic on assertion instead of deadlock, I believe.

I've started the test. For now I see a bunch of LOR complaints.

# dmesg
Copyright (c) 1992-2009 The FreeBSD Project.
Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994
    The Regents of the University of California. All rights reserved.
FreeBSD is a registered trademark of The FreeBSD Foundation.
FreeBSD 8.0-RC2 #0: Mon Oct 26 14:40:09 CET 2009
    root@:/pool/newsrc/obj/pool/newsrc/src/sys/DEBUG
WARNING: WITNESS option enabled, expect reduced performance.
WARNING: DIAGNOSTIC option enabled, expect reduced performance.
Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel(R) Core(TM)2 Duo CPU T8100 @ 2.10GHz (2094.43-MHz K8- class CPU) Origin = "GenuineIntel" Id = 0x10676 Stepping = 6 Features = 0xfebfbff < FPU ,VME ,DE ,PSE ,TSC ,MSR ,PAE ,MCE ,CX8 ,APIC ,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS> Features2=0x80082201> AMD Features=0x20100800 AMD Features2=0x1 TSC: P-state invariant real memory = 780140544 (744 MB) avail memory = 730873856 (697 MB) ACPI APIC Table: MADT: Forcing active-low polarity and level trigger for SCI ioapic0 irqs 0-23 on motherboard kbd1 at kbdmux0 acpi0: on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) Timecounter "ACPI-safe" frequency 3579545 Hz quality 850 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 pcib1: at device 1.0 on pci0 pci1: on pcib1 isab0: at device 7.0 on pci0 isa0: on isab0 atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x10c0-0x10cf at device 7.1 on pci0 ata0: on atapci0 ata0: [ITHREAD] ata1: on atapci0 ata1: [ITHREAD] pci0: at device 7.3 (no driver attached) pci0: at device 7.7 (no driver attached) vgapci0: port 0x10d0-0x10df mem 0xd0000000-0xd7ffffff,0xd8000000-0xd87fffff irq 16 at device 15.0 on pci0 mpt0: port 0x1400-0x14ff mem 0xd8820000-0xd883ffff,0xd8800000-0xd881ffff irq 17 at device 16.0 on pci0 mpt0: [ITHREAD] mpt0: MPI Version=1.2.0.0 pcib2: at device 17.0 on pci0 pci2: on pcib2 em0: port 0x2000-0x203f mem 0xd8940000-0xd895ffff,0xd8900000-0xd890ffff irq 18 at device 0.0 on pci2 em0: Memory Access and/or Bus Master bits were not set! em0: [FILTER] em0: Ethernet address: 00:0c:29:fd:ab:03 em1: port 0x2040-0x207f mem 0xd8960000-0xd897ffff,0xd8910000-0xd891ffff irq 19 at device 1.0 on pci2 em1: Memory Access and/or Bus Master bits were not set! 
em1: [FILTER] em1: Ethernet address: 00:0c:29:fd:ab:f9 pci2: at device 2.0 (no driver attached) em2: port 0x20c0-0x20ff mem 0xd8980000-0xd899ffff,0xd8920000-0xd892ffff irq 17 at device 3.0 on pci2 em2: Memory Access and/or Bus Master bits were not set! em2: [FILTER] em2: Ethernet address: 00:0c:29:fd:ab:0d em3: port 0x2400-0x243f mem 0xd89a0000-0xd89bffff,0xd8930000-0xd893ffff irq 19 at device 5.0 on pci2 em3: Memory Access and/or Bus Master bits were not set! em3: [FILTER] em3: Ethernet address: 00:0c:29:fd:ab:17 pcib3: at device 21.0 on pci0 pci3: on pcib3 pcib4: at device 21.1 on pci0 pci4: on pcib4 pcib5: at device 21.2 on pci0 pci5: on pcib5 pcib6: at device 21.3 on pci0 pci6: on pcib6 pcib7: at device 21.4 on pci0 pci7: on pcib7 pcib8: at device 21.5 on pci0 pci8: on pcib8 pcib9: at device 21.6 on pci0 pci9: on pcib9 pcib10: at device 21.7 on pci0 pci10: on pcib10 pcib11: at device 22.0 on pci0 pci11: on pcib11 pcib12: at device 22.1 on pci0 pci12: on pcib12 pcib13: at device 22.2 on pci0 pci13: on pcib13 pcib14: at device 22.3 on pci0 pci14: on pcib14 pcib15: at device 22.4 on pci0 pci15: on pcib15 pcib16: at device 22.5 on pci0 pci16: on pcib16 pcib17: at device 22.6 on pci0 pci17: on pcib17 pcib18: at device 22.7 on pci0 pci18: on pcib18 pcib19: at device 23.0 on pci0 pci19: on pcib19 pcib20: at device 23.1 on pci0 pci20: on pcib20 pcib21: at device 23.2 on pci0 pci21: on pcib21 pcib22: at device 23.3 on pci0 pci22: on pcib22 pcib23: at device 23.4 on pci0 pci23: on pcib23 pcib24: at device 23.5 on pci0 pci24: on pcib24 pcib25: at device 23.6 on pci0 pci25: on pcib25 pcib26: at device 23.7 on pci0 pci26: on pcib26 pcib27: at device 24.0 on pci0 pci27: on pcib27 pcib28: at device 24.1 on pci0 pci28: on pcib28 pcib29: at device 24.2 on pci0 pci29: on pcib29 pcib30: at device 24.3 on pci0 pci30: on pcib30 pcib31: at device 24.4 on pci0 pci31: on pcib31 pcib32: at device 24.5 on pci0 pci32: on pcib32 pcib33: at device 24.6 on pci0 pci33: on pcib33 pcib34: at 
device 24.7 on pci0 pci34: on pcib34 acpi_acad0: on acpi0 acpi_button0: on acpi0 atrtc0: port 0x70-0x71 irq 8 on acpi0 atkbdc0: port 0x60,0x64 irq 1 on acpi0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] psm0: irq 12 on atkbdc0 psm0: [GIANT-LOCKED] psm0: [ITHREAD] psm0: model IntelliMouse, device ID 3 ppc0: port 0x378-0x37f irq 7 on acpi0 ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode ppc0: [ITHREAD] ppbus0: on ppc0 plip0: on ppbus0 plip0: [ITHREAD] lpt0: on ppbus0 lpt0: [ITHREAD] lpt0: Interrupt-driven port ppi0: on ppbus0 uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 uart0: [FILTER] uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0 uart1: [FILTER] fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0 fdc0: [FILTER] fd0: <1440-KB 3.5" drive> on fdc0 drive 0 cpu0: on acpi0 acpi_throttle0: on cpu0 orm0: at iomem 0xc0000-0xc7fff,0xca000-0xcafff, 0xcb000-0xcbfff,0xcc000-0xccfff,0xcd000-0xcdfff,0xdc000-0xdffff, 0xe4000-0xe7fff on isa0 sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present; to enable, add "vfs.zfs.prefetch_disable=0" to /boot/ loader.conf. ZFS WARNING: Recommended minimum kmem_size is 512MB; expect unstable behavior. Consider tuning vm.kmem_size and vm.kmem_size_max in /boot/loader.conf. 
ZFS filesystem version 13 ZFS storage pool version 13 Timecounter "TSC" frequency 2094428941 Hz quality 800 Timecounters tick every 10.000 msec Waiting 5 seconds for SCSI devices to settle acd0: CDROM at ata1-master UDMA33 da0 at mpt0 bus 0 target 0 lun 0 da0: Fixed Direct Access SCSI-2 device da0: 320.000MB/s transfers (160.000MHz, offset 127, 16bit) da0: Command Queueing enabled da0: 8192MB (16777216 512 byte sectors: 255H 63S/T 1044C) da1 at mpt0 bus 0 target 1 lun 0 da1: Fixed Direct Access SCSI-2 device da1: 320.000MB/s transfers (160.000MHz, offset 127, 16bit) da1: Command Queueing enabled da1: 4096MB (8388608 512 byte sectors: 255H 63S/T 522C) da2 at mpt0 bus 0 target 2 lun 0 da2: Fixed Direct Access SCSI-2 device da2: 320.000MB/s transfers (160.000MHz, offset 127, 16bit) da2: Command Queueing enabled da2: 4096MB (8388608 512 byte sectors: 255H 63S/T 522C) WARNING: WITNESS option enabled, expect reduced performance. WARNING: DIAGNOSTIC option enabled, expect reduced performance. Trying to mount root from ufs:/dev/da0s1a Expensive timeout(9) function: 0xffffffff8046e5b0(0xffffffff80e4f5e0) 0.002432990 s Expensive timeout(9) function: 0xffffffff8031ffc0(0xffffff8000277000) 0.008108572 s Expensive timeout(9) function: 0xffffffff8031ffc0(0xffffff8000265000) 0.035305324 s lock order reversal: 1st 0xffffff800f0c3cc8 bufwait (bufwait) @ /pool/newsrc/src/sys/kern/ vfs_bio.c:2559 2nd 0xffffff000c721a00 dirhash (dirhash) @ /pool/newsrc/src/sys/ufs/ ufs/ufs_dirhash.c:285 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x2e witness_checkorder() at witness_checkorder+0x81e _sx_xlock() at _sx_xlock+0x55 ufsdirhash_acquire() at ufsdirhash_acquire+0x44 ufsdirhash_remove() at ufsdirhash_remove+0x16 ufs_dirremove() at ufs_dirremove+0x181 ufs_remove() at ufs_remove+0x92 VOP_REMOVE_APV() at VOP_REMOVE_APV+0xd7 kern_unlinkat() at kern_unlinkat+0x252 syscall() at syscall+0x1d0 Xfast_syscall() at 
Xfast_syscall+0xe1 --- syscall (10, FreeBSD ELF64, unlink), rip = 0x80072be5c, rsp = 0x7fffffffe8b8, rbp = 0x800902300 --- Waiting (max 60 seconds) for system process `vnlru' to stop...done Waiting (max 60 seconds) for system process `bufdaemon' to stop...done Waiting (max 60 seconds) for system process `syncer' to stop... Syncing disks, vnodes remaining...2 2 1 0 0 done All buffers synced. lock order reversal: 1st 0xffffff0002558308 ufs (ufs) @ /pool/newsrc/src/sys/kern/ vfs_mount.c:1200 2nd 0xffffff0002558cc8 devfs (devfs) @ /pool/newsrc/src/sys/kern/ vfs_subr.c:2083 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x2e witness_checkorder() at witness_checkorder+0x81e __lockmgr_args() at __lockmgr_args+0xd03 vop_stdlock() at vop_stdlock+0x39 VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b _vn_lock() at _vn_lock+0x57 vget() at vget+0x7b devfs_allocv() at devfs_allocv+0x100 devfs_root() at devfs_root+0x48 dounmount() at dounmount+0x474 vfs_unmountall() at vfs_unmountall+0x54 boot() at boot+0x7d3 reboot() at reboot+0x68 syscall() at syscall+0x1d0 Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (55, FreeBSD ELF64, reboot), rip = 0x40892c, rsp = 0x7fffffffe738, rbp = 0x4023d0 --- Uptime: 2h1m35s Rebooting... Copyright (c) 1992-2009 The FreeBSD Project. Copyright (c) 1979, 1980, 1983, 1986, 1988, 1989, 1991, 1992, 1993, 1994 The Regents of the University of California. All rights reserved. FreeBSD is a registered trademark of The FreeBSD Foundation. FreeBSD 8.0-RC2 #2: Mon Nov 2 12:08:37 CET 2009 root@:/pool/newsrc/obj/pool/newsrc/src/sys/DEBUG WARNING: WITNESS option enabled, expect reduced performance. WARNING: DIAGNOSTIC option enabled, expect reduced performance. 
Timecounter "i8254" frequency 1193182 Hz quality 0 CPU: Intel(R) Core(TM)2 Duo CPU T8100 @ 2.10GHz (2094.42-MHz K8- class CPU) Origin = "GenuineIntel" Id = 0x10676 Stepping = 6 Features = 0xfebfbff < FPU ,VME ,DE ,PSE ,TSC ,MSR ,PAE ,MCE ,CX8 ,APIC ,SEP,MTRR,PGE,MCA,CMOV,PAT,PSE36,CLFLUSH,DTS,ACPI,MMX,FXSR,SSE,SSE2,SS> Features2=0x80082201> AMD Features=0x20100800 AMD Features2=0x1 TSC: P-state invariant real memory = 780140544 (744 MB) avail memory = 730472448 (696 MB) ACPI APIC Table: MADT: Forcing active-low polarity and level trigger for SCI ioapic0 irqs 0-23 on motherboard kbd1 at kbdmux0 acpi0: on motherboard acpi0: [ITHREAD] acpi0: Power Button (fixed) Timecounter "ACPI-safe" frequency 3579545 Hz quality 850 acpi_timer0: <24-bit timer at 3.579545MHz> port 0x1008-0x100b on acpi0 pcib0: port 0xcf8-0xcff on acpi0 pci0: on pcib0 pcib1: at device 1.0 on pci0 pci1: on pcib1 isab0: at device 7.0 on pci0 isa0: on isab0 atapci0: port 0x1f0-0x1f7,0x3f6,0x170-0x177,0x376,0x10c0-0x10cf at device 7.1 on pci0 ata0: on atapci0 ata0: [ITHREAD] ata1: on atapci0 ata1: [ITHREAD] pci0: at device 7.3 (no driver attached) pci0: at device 7.7 (no driver attached) vgapci0: port 0x10d0-0x10df mem 0xd0000000-0xd7ffffff,0xd8000000-0xd87fffff irq 16 at device 15.0 on pci0 mpt0: port 0x1400-0x14ff mem 0xd8820000-0xd883ffff,0xd8800000-0xd881ffff irq 17 at device 16.0 on pci0 mpt0: [ITHREAD] mpt0: MPI Version=1.2.0.0 pcib2: at device 17.0 on pci0 pci2: on pcib2 em0: port 0x2000-0x203f mem 0xd8940000-0xd895ffff,0xd8900000-0xd890ffff irq 18 at device 0.0 on pci2 em0: Memory Access and/or Bus Master bits were not set! em0: [FILTER] em0: Ethernet address: 00:0c:29:fd:ab:03 em1: port 0x2040-0x207f mem 0xd8960000-0xd897ffff,0xd8910000-0xd891ffff irq 19 at device 1.0 on pci2 em1: Memory Access and/or Bus Master bits were not set! 
em1: [FILTER] em1: Ethernet address: 00:0c:29:fd:ab:f9 pci2: at device 2.0 (no driver attached) em2: port 0x20c0-0x20ff mem 0xd8980000-0xd899ffff,0xd8920000-0xd892ffff irq 17 at device 3.0 on pci2 em2: Memory Access and/or Bus Master bits were not set! em2: [FILTER] em2: Ethernet address: 00:0c:29:fd:ab:0d em3: port 0x2400-0x243f mem 0xd89a0000-0xd89bffff,0xd8930000-0xd893ffff irq 19 at device 5.0 on pci2 em3: Memory Access and/or Bus Master bits were not set! em3: [FILTER] em3: Ethernet address: 00:0c:29:fd:ab:17 pcib3: at device 21.0 on pci0 pci3: on pcib3 pcib4: at device 21.1 on pci0 pci4: on pcib4 pcib5: at device 21.2 on pci0 pci5: on pcib5 pcib6: at device 21.3 on pci0 pci6: on pcib6 pcib7: at device 21.4 on pci0 pci7: on pcib7 pcib8: at device 21.5 on pci0 pci8: on pcib8 pcib9: at device 21.6 on pci0 pci9: on pcib9 pcib10: at device 21.7 on pci0 pci10: on pcib10 pcib11: at device 22.0 on pci0 pci11: on pcib11 pcib12: at device 22.1 on pci0 pci12: on pcib12 pcib13: at device 22.2 on pci0 pci13: on pcib13 pcib14: at device 22.3 on pci0 pci14: on pcib14 pcib15: at device 22.4 on pci0 pci15: on pcib15 pcib16: at device 22.5 on pci0 pci16: on pcib16 pcib17: at device 22.6 on pci0 pci17: on pcib17 pcib18: at device 22.7 on pci0 pci18: on pcib18 pcib19: at device 23.0 on pci0 pci19: on pcib19 pcib20: at device 23.1 on pci0 pci20: on pcib20 pcib21: at device 23.2 on pci0 pci21: on pcib21 pcib22: at device 23.3 on pci0 pci22: on pcib22 pcib23: at device 23.4 on pci0 pci23: on pcib23 pcib24: at device 23.5 on pci0 pci24: on pcib24 pcib25: at device 23.6 on pci0 pci25: on pcib25 pcib26: at device 23.7 on pci0 pci26: on pcib26 pcib27: at device 24.0 on pci0 pci27: on pcib27 pcib28: at device 24.1 on pci0 pci28: on pcib28 pcib29: at device 24.2 on pci0 pci29: on pcib29 pcib30: at device 24.3 on pci0 pci30: on pcib30 pcib31: at device 24.4 on pci0 pci31: on pcib31 pcib32: at device 24.5 on pci0 pci32: on pcib32 pcib33: at device 24.6 on pci0 pci33: on pcib33 pcib34: at 
device 24.7 on pci0 pci34: on pcib34 acpi_acad0: on acpi0 acpi_button0: on acpi0 atrtc0: port 0x70-0x71 irq 8 on acpi0 atkbdc0: port 0x60,0x64 irq 1 on acpi0 atkbd0: irq 1 on atkbdc0 kbd0 at atkbd0 atkbd0: [GIANT-LOCKED] atkbd0: [ITHREAD] psm0: irq 12 on atkbdc0 psm0: [GIANT-LOCKED] psm0: [ITHREAD] psm0: model IntelliMouse, device ID 3 ppc0: port 0x378-0x37f irq 7 on acpi0 ppc0: Generic chipset (NIBBLE-only) in COMPATIBLE mode ppc0: [ITHREAD] ppbus0: on ppc0 plip0: on ppbus0 plip0: [ITHREAD] lpt0: on ppbus0 lpt0: [ITHREAD] lpt0: Interrupt-driven port ppi0: on ppbus0 uart0: <16550 or compatible> port 0x3f8-0x3ff irq 4 flags 0x10 on acpi0 uart0: [FILTER] uart1: <16550 or compatible> port 0x2f8-0x2ff irq 3 on acpi0 uart1: [FILTER] fdc0: port 0x3f0-0x3f5,0x3f7 irq 6 drq 2 on acpi0 fdc0: [FILTER] fd0: <1440-KB 3.5" drive> on fdc0 drive 0 cpu0: on acpi0 acpi_throttle0: on cpu0 orm0: at iomem 0xc0000-0xc7fff,0xca000-0xcafff, 0xcb000-0xcbfff,0xcc000-0xccfff,0xcd000-0xcdfff,0xdc000-0xdffff, 0xe4000-0xe7fff on isa0 sc0: at flags 0x100 on isa0 sc0: VGA <16 virtual consoles, flags=0x300> vga0: at port 0x3c0-0x3df iomem 0xa0000-0xbffff on isa0 ZFS NOTICE: Prefetch is disabled by default if less than 4GB of RAM is present; to enable, add "vfs.zfs.prefetch_disable=0" to /boot/ loader.conf. ZFS WARNING: Recommended minimum kmem_size is 512MB; expect unstable behavior. Consider tuning vm.kmem_size and vm.kmem_size_max in /boot/loader.conf. 
ZFS filesystem version 13 ZFS storage pool version 13 Timecounter "TSC" frequency 2094424190 Hz quality 800 Timecounters tick every 10.000 msec Waiting 5 seconds for SCSI devices to settle acd0: CDROM at ata1-master UDMA33 da0 at mpt0 bus 0 target 0 lun 0 da0: Fixed Direct Access SCSI-2 device da0: 320.000MB/s transfers (160.000MHz, offset 127, 16bit) da0: Command Queueing enabled da0: 8192MB (16777216 512 byte sectors: 255H 63S/T 1044C) da1 at mpt0 bus 0 target 1 lun 0 da1: Fixed Direct Access SCSI-2 device da1: 320.000MB/s transfers (160.000MHz, offset 127, 16bit) da1: Command Queueing enabled da1: 4096MB (8388608 512 byte sectors: 255H 63S/T 522C) Expensive timeout(9) function: 0xffffffff806fd940(0) 0.002228495 s da2 at mpt0 bus 0 target 2 lun 0 da2: Fixed Direct Access SCSI-2 device da2: 320.000MB/s transfers (160.000MHz, offset 127, 16bit) da2: Command Queueing enabled da2: 4096MB (8388608 512 byte sectors: 255H 63S/T 522C) WARNING: WITNESS option enabled, expect reduced performance. WARNING: DIAGNOSTIC option enabled, expect reduced performance. 
Trying to mount root from ufs:/dev/da0s1a lock order reversal: 1st 0xffffff000284d010 buf->b_lock (buf->b_lock) @ /pool/newsrc/src/ sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c: 2506 2nd 0xffffff0002851058 db->db_mtx (db->db_mtx) @ /pool/newsrc/src/ sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/ dbuf.c:421 3rd 0xffffff000284cdd8 buf->b_lock (buf->b_lock) @ /pool/newsrc/src/ sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c: 3014 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x2e witness_checkorder() at witness_checkorder+0x81e _sx_slock() at _sx_slock+0x55 arc_released() at arc_released+0x34 dbuf_set_data() at dbuf_set_data+0x62 dbuf_read_done() at dbuf_read_done+0x10b arc_read_done() at arc_read_done+0x1d2 zio_done() at zio_done+0x308 zio_execute() at zio_execute+0xb1 arc_read_nolock() at arc_read_nolock+0x3d0 arc_read() at arc_read+0xaf dbuf_read() at dbuf_read+0x62b dbuf_findbp() at dbuf_findbp+0x19b dbuf_hold_impl() at dbuf_hold_impl+0x164 dbuf_hold() at dbuf_hold+0x1b dnode_hold_impl() at dnode_hold_impl+0xc5 dmu_buf_hold() at dmu_buf_hold+0x34 zap_lockdir() at zap_lockdir+0x6e zap_lookup_norm() at zap_lookup_norm+0x45 zap_lookup() at zap_lookup+0x2e dsl_pool_open() at dsl_pool_open+0xe3 spa_load() at spa_load+0x3a9 spa_open_common() at spa_open_common+0x12d spa_get_stats() at spa_get_stats+0x42 zfs_ioc_pool_stats() at zfs_ioc_pool_stats+0x2c zfsdev_ioctl() at zfsdev_ioctl+0x8d devfs_ioctl_f() at devfs_ioctl_f+0x76 kern_ioctl() at kern_ioctl+0xc5 ioctl() at ioctl+0xfd syscall() at syscall+0x1d0 Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800fe874c, rsp = 0x7fffffffd808, rbp = 0x801222140 --- lock order reversal: 1st 0xffffff0002850058 db->db_mtx (db->db_mtx) @ /pool/newsrc/src/ sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/ dbuf.c:549 2nd 0xffffff00028580d8 
dn->dn_mtx (dn->dn_mtx) @ /pool/newsrc/src/ sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/ dnode.c:1196 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a Expensive timeout(9) function: 0xffffffff80875e50(0xffffffff80e3ff80) 0.002947581 s _witness_debugger() at _witness_debugger+0x2e witness_checkorder() at witness_checkorder+0x81e _sx_xlock() at _sx_xlock+0x55 dnode_block_freed() at dnode_block_freed+0x8e dbuf_read() at dbuf_read+0x155 dmu_buf_hold_array_by_dnode() at dmu_buf_hold_array_by_dnode+0x12a dmu_read() at dmu_read+0x80 load_nvlist() at load_nvlist+0x85 spa_load() at spa_load+0x49a spa_open_common() at spa_open_common+0x12d spa_get_stats() at spa_get_stats+0x42 zfs_ioc_pool_stats() at zfs_ioc_pool_stats+0x2c zfsdev_ioctl() at zfsdev_ioctl+0x8d devfs_ioctl_f() at devfs_ioctl_f+0x76 kern_ioctl() at kern_ioctl+0xc5 ioctl() at ioctl+0xfd syscall() at syscall+0x1d0 Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800fe874c, rsp = 0x7fffffffd808, rbp = 0x801222140 --- lock order reversal: 1st 0xffffff0002850e70 db->db_mtx (db->db_mtx) @ /pool/newsrc/src/ sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/ dnode_sync.c:381 2nd 0xffffff000279e140 osi->os_lock (osi->os_lock) @ /pool/newsrc/ src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/ dnode.c:323 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x2e witness_checkorder() at witness_checkorder+0x81e _sx_xlock() at _sx_xlock+0x55 dnode_destroy() at dnode_destroy+0xa6 dnode_buf_pageout() at dnode_buf_pageout+0xb2 dbuf_evict_user() at dbuf_evict_user+0x55 dbuf_clear() at dbuf_clear+0x5e dnode_evict_dbufs() at dnode_evict_dbufs+0x98 dmu_objset_evict_dbufs() at dmu_objset_evict_dbufs+0x11c dmu_objset_evict() at dmu_objset_evict+0xbf dsl_pool_close() at dsl_pool_close+0x52 spa_unload() at spa_unload+0xb2 spa_load() at spa_load+0x4da 
spa_open_common() at spa_open_common+0x12d spa_get_stats() at spa_get_stats+0x42 zfs_ioc_pool_stats() at zfs_ioc_pool_stats+0x2c zfsdev_ioctl() at zfsdev_ioctl+0x8d devfs_ioctl_f() at devfs_ioctl_f+0x76 kern_ioctl() at kern_ioctl+0xc5 ioctl() at ioctl+0xfd syscall() at syscall+0x1d0 Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800fe874c, rsp = 0x7fffffffd808, rbp = 0x801222140 --- lock order reversal: 1st 0xffffff0002850e70 db->db_mtx (db->db_mtx) @ /pool/newsrc/src/ sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/ dbuf.c:1116 2nd 0xffffff0002881738 dr->dt.di.dr_mtx (dr->dt.di.dr_mtx) @ /pool/ newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/ fs/zfs/dbuf.c:1120 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x2e witness_checkorder() at witness_checkorder+0x81e _sx_xlock() at _sx_xlock+0x55 dbuf_dirty() at dbuf_dirty+0x892 dnode_setdirty() at dnode_setdirty+0x1a9 dbuf_dirty() at dbuf_dirty+0xa53 bplist_vacate() at bplist_vacate+0x4d spa_sync() at spa_sync+0x297 txg_sync_thread() at txg_sync_thread+0x2d7 fork_exit() at fork_exit+0x12a fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffff80001ffd30, rbp = 0 --- lock order reversal: 1st 0xffffff00028f4438 dr->dt.di.dr_mtx (dr->dt.di.dr_mtx) @ /pool/ newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/ fs/zfs/dbuf.c:1905 2nd 0xffffff000255e2f0 spa->spa_sync_bplist.bpl_lock (spa- >spa_sync_bplist.bpl_lock) @ /pool/newsrc/src/sys/modules/zfs/../../ cddl/contrib/opensolaris/uts/common/fs/zfs/bplist.c:235 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x2e witness_checkorder() at witness_checkorder+0x81e _sx_xlock() at _sx_xlock+0x55 bplist_enqueue_deferred() at bplist_enqueue_deferred+0x47 zio_free() at zio_free+0x105 arc_free() at arc_free+0x11c 
dsl_dataset_block_kill() at dsl_dataset_block_kill+0x483 dbuf_write() at dbuf_write+0x24c dbuf_sync_list() at dbuf_sync_list+0x3eb dbuf_sync_list() at dbuf_sync_list+0x17f dnode_sync() at dnode_sync+0xc12 dmu_objset_sync() at dmu_objset_sync+0x134 dsl_pool_sync() at dsl_pool_sync+0x200 spa_sync() at spa_sync+0x35e txg_sync_thread() at txg_sync_thread+0x2d7 fork_exit() at fork_exit+0x12a fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffff80001ffd30, rbp = 0 --- lock order reversal: 1st 0xffffff0002935098 zfs (zfs) @ /pool/newsrc/src/sys/modules/ zfs/../../cddl/contrib/opensolaris/uts/common/fs/gfs.c:437 2nd 0xffffff000266d310 zfsvfs->z_hold_mtx[i] (zfsvfs->z_hold_mtx[i]) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/ common/fs/zfs/zfs_znode.c:863 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x2e witness_checkorder() at witness_checkorder+0x81e _sx_xlock() at _sx_xlock+0x55 zfs_zget() at zfs_zget+0x23a zfs_root() at zfs_root+0x50 zfsctl_create() at zfsctl_create+0x82 zfs_mount() at zfs_mount+0x7ef vfs_donmount() at vfs_donmount+0xcde nmount() at nmount+0x63 syscall() at syscall+0x1d0 Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (378, FreeBSD ELF64, nmount), rip = 0x800f4a4dc, rsp = 0x7fffffffced8, rbp = 0x7fffffffcef8 --- lock order reversal: 1st 0xffffff000292c078 zp->z_name_lock (zp->z_name_lock) @ /pool/ newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/ fs/zfs/zfs_dir.c:212 2nd 0xffffff000266d330 zfsvfs->z_hold_mtx[i] (zfsvfs->z_hold_mtx[i]) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/ common/fs/zfs/zfs_znode.c:863 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x2e witness_checkorder() at witness_checkorder+0x81e _sx_xlock() at _sx_xlock+0x55 zfs_zget() at zfs_zget+0x23a zfs_dirent_lock() at zfs_dirent_lock+0x4a0 
zfs_dirlook() at zfs_dirlook+0x90 zfs_lookup() at zfs_lookup+0x257 zfs_freebsd_lookup() at zfs_freebsd_lookup+0x8d VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0xaf vfs_cache_lookup() at vfs_cache_lookup+0xf0 VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0xb7 lookup() at lookup+0x2eb namei() at namei+0x4a9 kern_statat_vnhook() at kern_statat_vnhook+0x8f kern_statat() at kern_statat+0x15 lstat() at lstat+0x2a syscall() at syscall+0x1d0 Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (190, FreeBSD ELF64, lstat), rip = 0x800fd94fc, rsp = 0x7fffffffcf38, rbp = 0x7fffffffd3d0 --- lock order reversal: 1st 0xffffff000266d210 zfsvfs->z_teardown_inactive_lock (zfsvfs- >z_teardown_inactive_lock) @ /pool/newsrc/src/sys/modules/zfs/../../ cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:3724 2nd 0xffffff000266d330 zfsvfs->z_hold_mtx[i] (zfsvfs->z_hold_mtx[i]) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/ common/fs/zfs/zfs_znode.c:1023 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x2e witness_checkorder() at witness_checkorder+0x81e _sx_xlock() at _sx_xlock+0x55 zfs_zinactive() at zfs_zinactive+0x8d zfs_inactive() at zfs_inactive+0x7e zfs_freebsd_inactive() at zfs_freebsd_inactive+0x1a VOP_INACTIVE_APV() at VOP_INACTIVE_APV+0xb5 vinactive() at vinactive+0x90 vput() at vput+0x250 kern_statat_vnhook() at kern_statat_vnhook+0xfa kern_statat() at kern_statat+0x15 lstat() at lstat+0x2a syscall() at syscall+0x1d0 Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (190, FreeBSD ELF64, lstat), rip = 0x800fd94fc, rsp = 0x7fffffffcf38, rbp = 0x7fffffffd3d0 --- lock order reversal: 1st 0xffffff0002914210 zfsvfs->z_teardown_inactive_lock (zfsvfs- >z_teardown_inactive_lock) @ /pool/newsrc/src/sys/modules/zfs/../../ cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:915 2nd 0xffffff000278f4f8 ds->ds_rwlock (ds->ds_rwlock) @ /pool/newsrc/ 
src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/ dsl_dataset.c:2864 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x2e witness_checkorder() at witness_checkorder+0x81e _sx_xlock() at _sx_xlock+0x55 dsl_dataset_clone_swap() at dsl_dataset_clone_swap+0x5a dmu_recv_end() at dmu_recv_end+0x94 zfs_ioc_recv() at zfs_ioc_recv+0x29d zfsdev_ioctl() at zfsdev_ioctl+0x8d devfs_ioctl_f() at devfs_ioctl_f+0x76 kern_ioctl() at kern_ioctl+0xc5 ioctl() at ioctl+0xfd syscall() at syscall+0x1d0 Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800fe874c, rsp = 0x7fffffffbbf8, rbp = 0x7fffffffc930 --- lock order reversal: 1st 0xffffff000278f438 ds->ds_deadlist.bpl_lock (ds- >ds_deadlist.bpl_lock) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/ contrib/opensolaris/uts/common/fs/zfs/bplist.c:331 2nd 0xffffff0002857b88 dn->dn_struct_rwlock (dn->dn_struct_rwlock) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/ common/fs/zfs/dnode.c:130 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x2e witness_checkorder() at witness_checkorder+0x81e _sx_slock() at _sx_slock+0x55 dnode_verify() at dnode_verify+0x70 dnode_hold_impl() at dnode_hold_impl+0x73 dmu_bonus_hold() at dmu_bonus_hold+0x31 bplist_hold() at bplist_hold+0x48 bplist_space_birthrange() at bplist_space_birthrange+0xb1 dsl_dataset_clone_swap_sync() at dsl_dataset_clone_swap_sync+0xee dsl_sync_task_group_sync() at dsl_sync_task_group_sync+0x173 dsl_pool_sync() at dsl_pool_sync+0x122 spa_sync() at spa_sync+0x35e txg_sync_thread() at txg_sync_thread+0x2d7 fork_exit() at fork_exit+0x12a fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffff80001ffd30, rbp = 0 --- lock order reversal: 1st 0xffffff000278f438 ds->ds_deadlist.bpl_lock (ds- >ds_deadlist.bpl_lock) @ 
/pool/newsrc/src/sys/modules/zfs/../../cddl/ contrib/opensolaris/uts/common/fs/zfs/bplist.c:331 2nd 0xffffff0002ee9888 dn->dn_mtx (dn->dn_mtx) @ /pool/newsrc/src/ sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/ dnode.c:624 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x2e witness_checkorder() at witness_checkorder+0x81e _sx_xlock() at _sx_xlock+0x55 dnode_hold_impl() at dnode_hold_impl+0x184 dmu_bonus_hold() at dmu_bonus_hold+0x31 bplist_hold() at bplist_hold+0x48 bplist_space_birthrange() at bplist_space_birthrange+0xb1 dsl_dataset_clone_swap_sync() at dsl_dataset_clone_swap_sync+0xee dsl_sync_task_group_sync() at dsl_sync_task_group_sync+0x173 dsl_pool_sync() at dsl_pool_sync+0x122 spa_sync() at spa_sync+0x35e txg_sync_thread() at txg_sync_thread+0x2d7 fork_exit() at fork_exit+0x12a fork_trampoline() at fork_trampoline+0xe --- trap 0, rip = 0, rsp = 0xffffff80001ffd30, rbp = 0 --- lock order reversal: 1st 0xffffff000278f438 ds->ds_deadlist.bpl_lock (ds- >ds_deadlist.bpl_lock) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/ contrib/opensolaris/uts/common/fs/zfs/bplist.c:331 2nd 0xffffff0002851d28 db->db_mtx (db->db_mtx) @ /pool/newsrc/src/ sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/ dbuf.c:1724 KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a _witness_debugger() at _witness_debugger+0x2e witness_checkorder() at witness_checkorder+0x81e _sx_xlock() at _sx_xlock+0x55 dbuf_rele() at dbuf_rele+0x2d dnode_hold_impl() at dnode_hold_impl+0x20f dmu_bonus_hold() at dmu_bonus_hold+0x31 bplist_hold() at bplist_hold+0x48 bplist_space_birthrange() at bplist_space_birthrange+0xb1 dsl_dataset_clone_swap_sync() at dsl_dataset_clone_swap_sync+0xee dsl_sync_task_group_sync() at dsl_sync_task_group_sync+0x173 dsl_pool_sync() at dsl_pool_sync+0x122 spa_sync() at spa_sync+0x35e txg_sync_thread() at txg_sync_thread+0x2d7 fork_exit() at 
fork_exit+0x12a
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff80001ffd30, rbp = 0 ---
lock order reversal:
 1st 0xffffff0002914250 zfsvfs->z_znodes_lock (zfsvfs->z_znodes_lock) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1314
 2nd 0xffffff0002914310 zfsvfs->z_hold_mtx[i] (zfsvfs->z_hold_mtx[i]) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:963
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
_witness_debugger() at _witness_debugger+0x2e
witness_checkorder() at witness_checkorder+0x81e
_sx_xlock() at _sx_xlock+0x55
zfs_rezget() at zfs_rezget+0x4a
zfs_resume_fs() at zfs_resume_fs+0x158
zfs_ioc_recv() at zfs_ioc_recv+0x2b4
zfsdev_ioctl() at zfsdev_ioctl+0x8d
devfs_ioctl_f() at devfs_ioctl_f+0x76
kern_ioctl() at kern_ioctl+0xc5
ioctl() at ioctl+0xfd
syscall() at syscall+0x1d0
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800fe874c, rsp = 0x7fffffffbbf8, rbp = 0x7fffffffc930 ---
#

From borjam at sarenet.es Mon Nov 2 11:02:04 2009
From: borjam at sarenet.es (Borja Marcos)
Date: Mon Nov 2 11:02:12 2009
Subject: 8.0-RC2: ZFS deadlock with zfs receive
In-Reply-To: <20091030232805.GA2996@garage.freebsd.pl>
References: <804B79F6-27CE-40D2-8AB8-6FC378F448FA@sarenet.es>
    <4AEA0EAD.1050302@memberwebs.com>
    <20091030232805.GA2996@garage.freebsd.pl>
Message-ID:

(Resending to the list, I replied only to Pawel)

On Oct 31, 2009, at 12:28 AM, Pawel Jakub Dawidek wrote:

> On Thu, Oct 29, 2009 at 03:52:45PM -0600, Stef Walter wrote:
>> Borja Marcos wrote:
>>> I've been sending some alltraces to pjd about this easy to reproduce
>>> problem. I am using zfs send/zfs receive to replicate a dataset from one
>>> server to another. At 1 minute intervals, an incremental snapshot is
>>> sent to update the dataset copy. If there is reading activity on the
>>> dataset copy, a deadlock can happen rendering ZFS and all the FS
>>> subsystem unusable. I've tried with 8.0RC2 and it still happens.
>
> I was able to reproduce it, but I don't have fix yet.
>
>> FWIW, another (or the same) zfs recv deadlock I've been trying to get to
>> the bottom of:
>>
>> http://lists.freebsd.org/pipermail/freebsd-fs/2009-October/006999.html
>
> Could you guys recompile your kernel after uncommenting line:
>
>     #CFLAGS+=-DDEBUG=1
>
> in sys/modules/zfs/Makefile?
>
> You should see panic on assertion instead of deadlock, I believe.

No panic, it just deadlocked.

Last trace of LORs

Nov 2 13:45:44 kernel: lock order reversal:
Nov 2 13:45:44 kernel: 1st 0xffffff00025d4c38 ds->ds_deadlist.bpl_lock (ds->ds_deadlist.bpl_lock) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/bplist.c:189
Nov 2 13:45:44 kernel: 2nd 0xffffff000279bd40 osi->os_lock (osi->os_lock) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c:705
Nov 2 13:45:44 kernel: KDB: stack backtrace:
Nov 2 13:45:44 kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
Nov 2 13:45:44 kernel: _witness_debugger() at _witness_debugger+0x2e
Nov 2 13:45:44 kernel: witness_checkorder() at witness_checkorder+0x81e
Nov 2 13:45:44 kernel: _sx_xlock() at _sx_xlock+0x55
Nov 2 13:45:44 kernel: dnode_setdirty() at dnode_setdirty+0xbc
Nov 2 13:45:44 kernel: dbuf_dirty() at dbuf_dirty+0x516
Nov 2 13:45:44 kernel: bplist_enqueue() at bplist_enqueue+0xbd
Nov 2 13:45:44 kernel: dsl_dataset_block_kill() at dsl_dataset_block_kill+0x119
Nov 2 13:45:44 kernel: dmu_objset_sync() at dmu_objset_sync+0x1fe
Nov 2 13:45:44 kernel: dsl_pool_sync() at dsl_pool_sync+0x88
Nov 2 13:45:44 kernel: spa_sync() at spa_sync+0x35e
Nov 2 13:45:44 kernel: txg_sync_thread() at txg_sync_thread+0x2d7
Nov 2 13:45:44 kernel: fork_exit() at fork_exit+0x12a
Nov 2 13:45:44 kernel: fork_trampoline() at fork_trampoline+0xe
Nov 2 13:45:44 kernel: --- trap 0, rip = 0, rsp = 0xffffff80001ffd30, rbp = 0 ---
Nov 2 13:45:44 kernel: lock order reversal:
Nov 2 13:45:44 kernel: 1st 0xffffff0002eb9d38 dr->dt.di.dr_mtx (dr->dt.di.dr_mtx) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1905
Nov 2 13:45:44 kernel: 2nd 0xffffff0002ed3000 dn->dn_struct_rwlock (dn->dn_struct_rwlock) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:543
Nov 2 13:45:44 kernel: KDB: stack backtrace:
Nov 2 13:45:44 kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
Nov 2 13:45:44 kernel: _witness_debugger() at _witness_debugger+0x2e
Nov 2 13:45:44 kernel: witness_checkorder() at witness_checkorder+0x81e
Nov 2 13:45:44 kernel: _sx_slock() at _sx_slock+0x55
Nov 2 13:45:44 kernel: dbuf_read() at dbuf_read+0x2ad
Nov 2 13:45:44 kernel: dbuf_will_dirty() at dbuf_will_dirty+0x53
Nov 2 13:45:44 kernel: dsl_dataset_block_kill() at dsl_dataset_block_kill+0xe9
Nov 2 13:45:44 kernel: dbuf_write() at dbuf_write+0x24c
Nov 2 13:45:44 kernel: dbuf_sync_list() at dbuf_sync_list+0x159
Nov 2 13:45:44 kernel: dbuf_sync_list() at dbuf_sync_list+0x17f
Nov 2 13:45:44 kernel: dnode_sync() at dnode_sync+0xc12
Nov 2 13:45:44 kernel: dmu_objset_sync() at dmu_objset_sync+0x134
Nov 2 13:45:44 kernel: dsl_pool_sync() at dsl_pool_sync+0x88
Nov 2 13:45:44 kernel: spa_sync() at spa_sync+0x35e
Nov 2 13:45:44 kernel: txg_sync_thread() at txg_sync_thread+0x2d7
Nov 2 13:45:44 kernel: fork_exit() at fork_exit+0x12a
Nov 2 13:45:44 kernel: fork_trampoline() at fork_trampoline+0xe
Nov 2 13:45:44 kernel: --- trap 0, rip = 0, rsp = 0xffffff80001ffd30, rbp = 0 ---
Nov 2 13:46:03 kernel: lock order reversal:
Nov 2 13:46:03 kernel: 1st 0xffffff00025d4c38 ds->ds_deadlist.bpl_lock (ds->ds_deadlist.bpl_lock) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/bplist.c:152
Nov 2 13:46:03 kernel: 2nd 0xffffff0002f01ae0
dn->dn_dbufs_mtx (dn- >dn_dbufs_mtx) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/ opensolaris/uts/common/fs/zfs/dbuf.c:1518 Nov 2 13:46:03 kernel: KDB: stack backtrace: Nov 2 13:46:03 kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a Nov 2 13:46:03 kernel: _witness_debugger() at _witness_debugger+0x2e Nov 2 13:46:03 kernel: witness_checkorder() at witness_checkorder +0x81e Nov 2 13:46:03 kernel: _sx_xlock() at _sx_xlock+0x55 Nov 2 13:46:03 kernel: dbuf_destroy() at dbuf_destroy+0x58 Nov 2 13:46:03 kernel: bplist_cache() at bplist_cache+0x2e Nov 2 13:46:03 kernel: bplist_iterate() at bplist_iterate+0xb3 Nov 2 13:46:03 kernel: bplist_space_birthrange() at bplist_space_birthrange+0x60 Nov 2 13:46:03 kernel: dsl_dataset_clone_swap_sync() at dsl_dataset_clone_swap_sync+0xee Nov 2 13:46:03 kernel: dsl_sync_task_group_sync() at dsl_sync_task_group_sync+0x173 Nov 2 13:46:03 kernel: dsl_pool_sync() at dsl_pool_sync+0x122 Nov 2 13:46:03 kernel: spa_sync() at spa_sync+0x35e Nov 2 13:46:03 kernel: txg_sync_thread() at txg_sync_thread+0x2d7 Nov 2 13:46:03 kernel: fork_exit() at fork_exit+0x12a Nov 2 13:46:03 kernel: fork_trampoline() at fork_trampoline+0xe Nov 2 13:46:03 kernel: --- trap 0, rip = 0, rsp = 0xffffff80001ffd30, rbp = 0 --- Nov 2 13:46:03 kernel: lock order reversal: Nov 2 13:46:03 kernel: 1st 0xffffff00025d4c38 ds- >ds_deadlist.bpl_lock (ds->ds_deadlist.bpl_lock) @ /pool/newsrc/src/ sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/ bplist.c:152 Nov 2 13:46:03 kernel: 2nd 0xffffffff81121d50 h->hash_mutexes[i] (h- >hash_mutexes[i]) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/ contrib/opensolaris/uts/common/fs/zfs/dbuf.c:191 Nov 2 13:46:03 kernel: KDB: stack backtrace: Nov 2 13:46:03 kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a Nov 2 13:46:03 kernel: _witness_debugger() at _witness_debugger+0x2e Nov 2 13:46:03 kernel: witness_checkorder() at witness_checkorder +0x81e Nov 2 13:46:03 kernel: _sx_xlock() at 
_sx_xlock+0x55 Nov 2 13:46:03 kernel: dbuf_destroy() at dbuf_destroy+0x111 Nov 2 13:46:03 kernel: bplist_cache() at bplist_cache+0x2e Nov 2 13:46:03 kernel: bplist_iterate() at bplist_iterate+0xb3 Nov 2 13:46:03 kernel: bplist_space_birthrange() at bplist_space_birthrange+0x60 Nov 2 13:46:03 kernel: dsl_dataset_clone_swap_sync() at dsl_dataset_clone_swap_sync+0xee Nov 2 13:46:03 kernel: dsl_sync_task_group_sync() at dsl_sync_task_group_sync+0x173 Nov 2 13:46:03 kernel: dsl_pool_sync() at dsl_pool_sync+0x122 Nov 2 13:46:03 kernel: spa_sync() at spa_sync+0x35e Nov 2 13:46:03 kernel: txg_sync_thread() at txg_sync_thread+0x2d7 Nov 2 13:46:03 kernel: fork_exit() at fork_exit+0x12a Nov 2 13:46:03 kernel: fork_trampoline() at fork_trampoline+0xe Nov 2 13:46:03 kernel: --- trap 0, rip = 0, rsp = 0xffffff80001ffd30, rbp = 0 --- Nov 2 13:46:35 kernel: lock order reversal: Nov 2 13:46:35 kernel: 1st 0xffffff000279c398 dp->dp_config_rwlock (dp->dp_config_rwlock) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/ contrib/opensolaris/uts/common/fs/zfs/dsl_synctask.c:171 Nov 2 13:46:35 kernel: 2nd 0xffffff0002784938 ds->ds_opening_lock (ds->ds_opening_lock) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/ contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:483 Nov 2 13:46:35 kernel: KDB: stack backtrace: Nov 2 13:46:35 kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a Nov 2 13:46:35 kernel: _witness_debugger() at _witness_debugger+0x2e Nov 2 13:46:35 kernel: witness_checkorder() at witness_checkorder +0x81e Nov 2 13:46:35 kernel: _sx_xlock() at _sx_xlock+0x55 Nov 2 13:46:35 kernel: dmu_objset_create_impl() at dmu_objset_create_impl+0x50 Nov 2 13:46:35 kernel: dmu_objset_create_sync() at dmu_objset_create_sync+0xfc Nov 2 13:46:35 kernel: dsl_sync_task_group_sync() at dsl_sync_task_group_sync+0x173 Nov 2 13:46:35 kernel: dsl_pool_sync() at dsl_pool_sync+0x122 Nov 2 13:46:35 kernel: spa_sync() at spa_sync+0x35e Nov 2 13:46:35 kernel: txg_sync_thread() at txg_sync_thread+0x2d7 
Nov 2 13:46:35 kernel: fork_exit() at fork_exit+0x12a Nov 2 13:46:35 kernel: fork_trampoline() at fork_trampoline+0xe Nov 2 13:46:35 kernel: --- trap 0, rip = 0, rsp = 0xffffff80001ffd30, rbp = 0 --- Nov 2 13:46:35 kernel: lock order reversal: Nov 2 13:46:35 kernel: 1st 0xffffff000279c398 dp->dp_config_rwlock (dp->dp_config_rwlock) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/ contrib/opensolaris/uts/common/fs/zfs/dsl_synctask.c:171 Nov 2 13:46:35 kernel: 2nd 0xffffff0002fb7578 zfs (zfs) @ /pool/ newsrc/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/ fs/zfs/zfs_znode.c:152 Nov 2 13:46:35 kernel: KDB: stack backtrace: Nov 2 13:46:35 kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a Nov 2 13:46:35 kernel: _witness_debugger() at _witness_debugger+0x2e Nov 2 13:46:35 kernel: witness_checkorder() at witness_checkorder +0x81e Nov 2 13:46:35 kernel: __lockmgr_args() at __lockmgr_args+0xd03 Nov 2 13:46:35 kernel: vop_stdlock() at vop_stdlock+0x39 Nov 2 13:46:35 kernel: VOP_LOCK1_APV() at VOP_LOCK1_APV+0x9b Nov 2 13:46:35 kernel: _vn_lock() at _vn_lock+0x57 Nov 2 13:46:35 kernel: zfs_znode_cache_constructor() at zfs_znode_cache_constructor+0x65 Nov 2 13:46:35 kernel: zfs_create_fs() at zfs_create_fs+0x2ab Nov 2 13:46:35 kernel: dmu_objset_create_sync() at dmu_objset_create_sync+0x116 Nov 2 13:46:35 kernel: dsl_sync_task_group_sync() at dsl_sync_task_group_sync+0x173 Nov 2 13:46:35 kernel: dsl_pool_sync() at dsl_pool_sync+0x122 Nov 2 13:46:35 kernel: spa_sync() at spa_sync+0x35e Nov 2 13:46:35 kernel: txg_sync_thread() at txg_sync_thread+0x2d7 Nov 2 13:46:35 kernel: fork_exit() at fork_exit+0x12a Nov 2 13:46:35 kernel: fork_trampoline() at fork_trampoline+0xe Nov 2 13:46:35 kernel: --- trap 0, rip = 0, rsp = 0xffffff80001ffd30, rbp = 0 --- Nov 2 13:46:36 kernel: lock order reversal: Nov 2 13:46:36 kernel: 1st 0xffffff001a0a7430 db->db_mtx (db- >db_mtx) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/ opensolaris/uts/common/fs/zfs/dbuf.c:1724 
Nov 2 13:46:36 kernel: 2nd 0xffffff001a4c0708 dn->dn_dbufs_mtx (dn- >dn_dbufs_mtx) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/ opensolaris/uts/common/fs/zfs/dnode_sync.c:373 Nov 2 13:46:36 kernel: KDB: stack backtrace: Nov 2 13:46:36 kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a Nov 2 13:46:36 kernel: _witness_debugger() at _witness_debugger+0x2e Nov 2 13:46:36 kernel: witness_checkorder() at witness_checkorder +0x81e Nov 2 13:46:36 kernel: _sx_xlock() at _sx_xlock+0x55 Nov 2 13:46:36 kernel: dnode_evict_dbufs() at dnode_evict_dbufs+0x57 Nov 2 13:46:36 kernel: dmu_objset_evict_dbufs() at dmu_objset_evict_dbufs+0xd4 Nov 2 13:46:36 kernel: dmu_objset_evict() at dmu_objset_evict+0xbf Nov 2 13:46:36 kernel: dsl_dataset_evict() at dsl_dataset_evict+0x54 Nov 2 13:46:36 kernel: dbuf_evict_user() at dbuf_evict_user+0x55 Nov 2 13:46:36 kernel: dbuf_rele() at dbuf_rele+0x173 Nov 2 13:46:36 kernel: dsl_pool_zil_clean() at dsl_pool_zil_clean+0x3c Nov 2 13:46:36 kernel: spa_sync() at spa_sync+0x618 Nov 2 13:46:36 kernel: txg_sync_thread() at txg_sync_thread+0x2d7 Nov 2 13:46:36 kernel: fork_exit() at fork_exit+0x12a Nov 2 13:46:36 kernel: fork_trampoline() at fork_trampoline+0xe Nov 2 13:46:36 kernel: --- trap 0, rip = 0, rsp = 0xffffff80001ffd30, rbp = 0 --- Nov 2 13:46:36 kernel: lock order reversal: Nov 2 13:46:36 kernel: 1st 0xffffff001a0a7430 db->db_mtx (db- >db_mtx) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/ opensolaris/uts/common/fs/zfs/dbuf.c:1724 Nov 2 13:46:36 kernel: 2nd 0xffffff001a4c03d8 dn->dn_struct_rwlock (dn->dn_struct_rwlock) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/ contrib/opensolaris/uts/common/fs/zfs/dnode_sync.c:409 Nov 2 13:46:36 kernel: KDB: stack backtrace: Nov 2 13:46:36 kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a Nov 2 13:46:36 kernel: _witness_debugger() at _witness_debugger+0x2e Nov 2 13:46:36 kernel: witness_checkorder() at witness_checkorder +0x81e Nov 2 13:46:36 kernel: _sx_xlock() at 
_sx_xlock+0x55 Nov 2 13:46:36 kernel: dnode_evict_dbufs() at dnode_evict_dbufs+0x1b8 Nov 2 13:46:36 kernel: dmu_objset_evict_dbufs() at dmu_objset_evict_dbufs+0xd4 Nov 2 13:46:36 kernel: dmu_objset_evict() at dmu_objset_evict+0xbf Nov 2 13:46:36 kernel: dsl_dataset_evict() at dsl_dataset_evict+0x54 Nov 2 13:46:36 kernel: dbuf_evict_user() at dbuf_evict_user+0x55 Nov 2 13:46:36 kernel: dbuf_rele() at dbuf_rele+0x173 Nov 2 13:46:36 kernel: dsl_pool_zil_clean() at dsl_pool_zil_clean+0x3c Nov 2 13:46:36 kernel: spa_sync() at spa_sync+0x618 Nov 2 13:46:36 kernel: txg_sync_thread() at txg_sync_thread+0x2d7 Nov 2 13:46:36 kernel: fork_exit() at fork_exit+0x12a Nov 2 13:46:36 kernel: fork_trampoline() at fork_trampoline+0xe Nov 2 13:46:36 kernel: --- trap 0, rip = 0, rsp = 0xffffff80001ffd30, rbp = 0 --- Nov 2 13:46:57 kernel: lock order reversal: Nov 2 13:46:57 kernel: 1st 0xffffff001a81ec38 dr->dt.di.dr_mtx (dr- >dt.di.dr_mtx) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/ opensolaris/uts/common/fs/zfs/dbuf.c:1905 Nov 2 13:46:57 kernel: 2nd 0xffffff0002fdb4b0 dn->dn_mtx (dn- >dn_mtx) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/ opensolaris/uts/common/fs/zfs/dbuf.c:1066 Nov 2 13:46:57 kernel: KDB: stack backtrace: Nov 2 13:46:57 kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a Nov 2 13:46:57 kernel: _witness_debugger() at _witness_debugger+0x2e Nov 2 13:46:57 kernel: witness_checkorder() at witness_checkorder +0x81e Nov 2 13:46:57 kernel: _sx_xlock() at _sx_xlock+0x55 Nov 2 13:46:57 kernel: dbuf_dirty() at dbuf_dirty+0xa03 Nov 2 13:46:57 kernel: dsl_dataset_block_kill() at dsl_dataset_block_kill+0x38b Nov 2 13:46:57 kernel: dbuf_write() at dbuf_write+0x24c Nov 2 13:46:57 kernel: dbuf_sync_list() at dbuf_sync_list+0x3eb Nov 2 13:46:57 kernel: dbuf_sync_list() at dbuf_sync_list+0x17f Nov 2 13:46:57 last message repeated 5 times Nov 2 13:46:57 kernel: dnode_sync() at dnode_sync+0xc12 Nov 2 13:46:57 kernel: dmu_objset_sync() at 
dmu_objset_sync+0x134 Nov 2 13:46:57 kernel: dsl_pool_sync() at dsl_pool_sync+0x88 Nov 2 13:46:57 kernel: spa_sync() at spa_sync+0x35e Nov 2 13:46:57 kernel: txg_sync_thread() at txg_sync_thread+0x2d7 Nov 2 13:46:57 kernel: fork_exit() at fork_exit+0x12a Nov 2 13:46:57 kernel: fork_trampoline() at fork_trampoline+0xe Nov 2 13:46:57 kernel: --- trap 0, rip = 0, rsp = 0xffffff80001ffd30, rbp = 0 --- Nov 2 13:46:57 kernel: lock order reversal: Nov 2 13:46:57 kernel: 1st 0xffffff001a81ec38 dr->dt.di.dr_mtx (dr- >dt.di.dr_mtx) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/ opensolaris/uts/common/fs/zfs/dbuf.c:1905 Nov 2 13:46:57 kernel: 2nd 0xffffff000279bd40 osi->os_lock (osi- >os_lock) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/ opensolaris/uts/common/fs/zfs/dnode.c:705 Nov 2 13:46:57 kernel: KDB: stack backtrace: Nov 2 13:46:57 kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a Nov 2 13:46:57 kernel: _witness_debugger() at _witness_debugger+0x2e Nov 2 13:46:57 kernel: witness_checkorder() at witness_checkorder +0x81e Nov 2 13:46:57 kernel: _sx_xlock() at _sx_xlock+0x55 Nov 2 13:46:57 kernel: dnode_setdirty() at dnode_setdirty+0xbc Nov 2 13:46:57 kernel: dbuf_dirty() at dbuf_dirty+0xa53 Nov 2 13:46:57 kernel: dsl_dataset_block_kill() at dsl_dataset_block_kill+0x38b Nov 2 13:46:57 kernel: dbuf_write() at dbuf_write+0x24c Nov 2 13:46:57 kernel: dbuf_sync_list() at dbuf_sync_list+0x3eb Nov 2 13:46:57 kernel: dbuf_sync_list() at dbuf_sync_list+0x17f Nov 2 13:46:57 last message repeated 5 times Nov 2 13:46:57 kernel: dnode_sync() at dnode_sync+0xc12 Nov 2 13:46:57 kernel: dmu_objset_sync() at dmu_objset_sync+0x134 Nov 2 13:46:57 kernel: dsl_pool_sync() at dsl_pool_sync+0x88 Nov 2 13:46:57 kernel: spa_sync() at spa_sync+0x35e Nov 2 13:46:57 kernel: txg_sync_thread() at txg_sync_thread+0x2d7 Nov 2 13:46:57 kernel: fork_exit() at fork_exit+0x12a Nov 2 13:46:57 kernel: fork_trampoline() at fork_trampoline+0xe Nov 2 13:46:57 kernel: --- trap 0, rip 
= 0, rsp = 0xffffff80001ffd30, rbp = 0 --- Nov 2 13:47:28 kernel: lock order reversal: Nov 2 13:47:28 kernel: 1st 0xffffff0002fb6a58 syncer (syncer) @ / pool/newsrc/src/sys/kern/vfs_subr.c:1693 Nov 2 13:47:28 kernel: 2nd 0xffffff001a868350 zfsvfs->z_hold_mtx[i] (zfsvfs->z_hold_mtx[i]) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/ contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:863 Nov 2 13:47:28 kernel: KDB: stack backtrace: Nov 2 13:47:28 kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a Nov 2 13:47:28 kernel: _witness_debugger() at _witness_debugger+0x2e Nov 2 13:47:28 kernel: witness_checkorder() at witness_checkorder +0x81e Nov 2 13:47:28 kernel: _sx_xlock() at _sx_xlock+0x55 Nov 2 13:47:28 kernel: zfs_zget() at zfs_zget+0x23a Nov 2 13:47:28 kernel: zfs_get_data() at zfs_get_data+0x5e Nov 2 13:47:28 kernel: zil_commit() at zil_commit+0x5aa Nov 2 13:47:28 kernel: zfs_sync() at zfs_sync+0xa6 Nov 2 13:47:28 kernel: sync_fsync() at sync_fsync+0x13a Nov 2 13:47:28 kernel: VOP_F Nov 2 13:47:28 kernel: SYNC_APV() at VOP_FSYNC_APV+0xb5 Nov 2 13:47:28 kernel: sync_vnode() at sync_vnode+0x157 Nov 2 13:47:28 kernel: sched_sync() at sched_sync+0x1d2 Nov 2 13:47:28 kernel: fork_exit() at fork_exit+0x12a Nov 2 13:47:28 kernel: fork_trampoline() at fork_trampoline+0xe Nov 2 13:47:28 kernel: --- trap 0, rip = 0, rsp = 0xffffff8017aebd30, rbp = 0 --- Nov 2 13:47:44 kernel: lock order reversal: Nov 2 13:47:44 kernel: 1st 0xffffff002325d4c0 zp->z_parent_lock (zp- >z_parent_lock) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/contrib/ opensolaris/uts/common/fs/zfs/zfs_dir.c:379 Nov 2 13:47:44 kernel: 2nd 0xffffff0002914370 zfsvfs->z_hold_mtx[i] (zfsvfs->z_hold_mtx[i]) @ /pool/newsrc/src/sys/modules/zfs/../../cddl/ contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:863 Nov 2 13:47:44 kernel: KDB: stack backtrace: Nov 2 13:47:44 kernel: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a Nov 2 13:47:44 kernel: _witness_debugger() at _witness_debugger+0x2e Nov 2 
13:47:44 kernel: witness_checkorder() at witness_checkorder+0x81e
Nov 2 13:47:44 kernel: _sx_xlock() at _sx_xlock+0x55
Nov 2 13:47:44 kernel: zfs_zget() at zfs_zget+0x23a
Nov 2 13:47:44 kernel: zfs_dirlook() at zfs_dirlook+0x1fc
Nov 2 13:47:44 kernel: zfs_lookup() at zfs_lookup+0x257
Nov 2 13:47:44 kernel: zfs_freebsd_lookup() at zfs_freebsd_lookup+0x8d
Nov 2 13:47:44 kernel: VOP_CACHEDLOOKUP_APV() at VOP_CACHEDLOOKUP_APV+0xaf
Nov 2 13:47:44 kernel: vfs_cache_lookup() at vfs_cache_lookup+0xf0
Nov 2 13:47:44 kernel: VOP_LOOKUP_APV() at VOP_LOOKUP_APV+0xb7
Nov 2 13:47:44 kernel: lookup() at lookup+0x2eb
Nov 2 13:47:44 kernel: namei() at namei+0x4a9
Nov 2 13:47:44 kernel: kern_chdir() at kern_chdir+0x78
Nov 2 13:47:44 kernel: syscall() at syscall+0x1d0
Nov 2 13:47:44 kernel: Xfast_syscall() at Xfast_syscall+0xe1
Nov 2 13:47:44 kernel: --- syscall (12, FreeBSD ELF64, chdir), rip = 0x800da8a9c, rsp = 0x7fffffffe8b8, rbp = 0x8010135e0 ---

From bugmaster at FreeBSD.org Mon Nov 2 11:06:53 2009
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Nov 2 11:07:59 2009
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org
Message-ID: <200911021106.nA2B6qsk033573@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/140134  fs         [msdosfs] write and fsck destroy filesystem integrity
o kern/140068  fs         [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725  fs         [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715  fs         [zfs] vfs.numvnodes leak on busy zfs
o bin/139651   fs         [nfs] mount(8): read-only remount of NFS volume does n
o kern/139597  fs         [patch] [tmpfs] tmpfs initializes va_gen but doesn't u
o kern/139564  fs         [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo
o kern/139407  fs         [smbfs] [panic] smb mount causes system crash if remot
o kern/139363  fs         [nfs] diskless root nfs mount from non FreeBSD server
o kern/138790  fs         [zfs] ZFS ceases caching when mem demand is high
o kern/138524  fs         [msdosfs] disks and usb flashes/cards with Russian lab
o kern/138421  fs         [ufs] [patch] remove UFS label limitations
o kern/138367  fs         [tmpfs] [panic] 'panic: Assertion pages > 0 failed' wh
o kern/138202  fs         mount_msdosfs(1) see only 2Gb
o kern/138109  fs         [extfs] [patch] Minor cleanups to the sys/gnu/fs/ext2f
f kern/137037  fs         [zfs] [hang] zfs rollback on root causes FreeBSD to fr
o kern/136968  fs         [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945  fs         [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944  fs         [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873  fs         [ntfs] Missing directories/files on NTFS volume
o kern/136865  fs         [nfs] [patch] NFS exports atomic and on-the-fly atomic
o kern/136470  fs         [nfs] Cannot mount / in read-only, over NFS
o kern/135594  fs         [zfs] Single dataset unresponsive with Samba
o kern/135546  fs         [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469  fs         [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050  fs         [zfs] ZFS clears/hides disk errors on reboot
o kern/134491  fs         [zfs] Hot spares are rather cold...
o kern/133980  fs         [panic] [ffs] panic: ffs_valloc: dup alloc
o kern/133676  fs         [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133614  fs         [panic] panic: ffs_truncate: read-only filesystem
o kern/133174  fs         [msdosfs] [patch] msdosfs must support utf-encoded int
f kern/133150  fs         [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w
o kern/132960  fs         [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132597  fs         [tmpfs] [panic] tmpfs-related panic while interrupting
o kern/132397  fs         reboot causes filesystem corruption (failure to sync b
o kern/132331  fs         [ufs] [lor] LOR ufs and syncer
o kern/132237  fs         [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145  fs         [panic] File System Hard Crashes
o kern/131995  fs         [nfs] Failure to mount NFSv4 server
o kern/131441  fs         [unionfs] [nullfs] unionfs and/or nullfs not combineab
o kern/131360  fs         [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs         [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs         makefs: error "Bad file descriptor" on the mount poin
o kern/130979  fs         [smbfs] [panic] boot/kernel/smbfs.ko
o kern/130920  fs         [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130229  fs         [iconv] usermount fails on fs that need iconv
o kern/130210  fs         [nullfs] Error by check nullfs
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488  fs         [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c:
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
o kern/129059  fs         [zfs] [patch] ZFS bootloader whitelistable via WITHOUT
f kern/128829  fs         smbd(8) causes periodic panic on 7-RELEASE
o kern/127659  fs         [tmpfs] tmpfs memory leak
o kern/127420  fs         [gjournal] [panic] Journal overflow on gmirrored gjour
o kern/127029  fs         [panic] mount(8): trying to mount a write protected zi
o kern/126287  fs         [ufs] [panic] Kernel panics while mounting an UFS file
s kern/125738  fs         [zfs] [request] SHA256 acceleration in ZFS
p kern/124621  fs         [ext3] [patch] Cannot mount ext2fs partition
f bin/124424   fs         [zfs] zfs(8): zfs list -r shows strange snapshots' siz
o kern/123939  fs         [msdosfs] corrupts new files
o kern/122380  fs         [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash
o bin/122172   fs         [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o kern/122047  fs         [ext2fs] [patch] incorrect handling of UF_IMMUTABLE /
o bin/121898   fs         [nullfs] pwd(1)/getcwd(2) fails with Permission denied
o bin/121779   fs         [ufs] snapinfo(8) (and related tools?) only work for t
o bin/121366   fs         [zfs] [patch] Automatic disk scrubbing from periodic(8
o bin/121072   fs         [smbfs] mount_smbfs(8) cannot normally convert the cha
f kern/120991  fs         [panic] [fs] [snapshot] System crashes when manipulati
o kern/120483  fs         [ntfs] [patch] NTFS filesystem locking changes
o kern/120482  fs         [ntfs] [patch] Sync style changes between NetBSD and F
f kern/119735  fs         [zfs] geli + ZFS + samba starting on boot panics 7.0-B
o kern/118912  fs         [2tb] disk sizing/geometry problem with large array
o kern/118713  fs         [minidump] [patch] Display media size required for a k
o bin/118249   fs         mv(1): moving a directory changes its mtime
o kern/118107  fs         [ntfs] [panic] Kernel panic when accessing a file at N
o bin/117315   fs         [smbfs] mount_smbfs(8) and related options can't mount
o kern/117314  fs         [ntfs] Long-filename only NTFS fs'es cause kernel pani
o kern/117158  fs         [zfs] zpool scrub causes panic if geli vdevs detach on
o bin/116980   fs         [msdosfs] [patch] mount_msdosfs(8) resets some flags f
o kern/116913  fs         [ffs] [panic] ffs_blkfree: freeing free block
p kern/116608  fs         [msdosfs] [patch] msdosfs fails to check mount options
o kern/116583  fs         [ffs] [hang] System freezes for short time when using
o kern/116170  fs         [panic] Kernel panic when mounting /tmp
o kern/115645  fs         [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex
o bin/115361   fs         [zfs] mount(8) gets into a state where it won't set/un
o kern/114955  fs         [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847  fs         [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114676  fs         [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468   fs         [patch] [request] add -d option to umount(8) to detach
o kern/113852  fs         [smbfs] smbfs does not properly implement DFS referral
o bin/113838   fs         [patch] [request] mount(8): add support for relative p
o bin/113049   fs         [patch] [request] make quot(8) use getopt(3) and show
o kern/112658  fs         [smbfs] [patch] smbfs and caching problems (resolves b
o kern/111843  fs         [msdosfs] Long Names of files are incorrectly created
o kern/111782  fs         [ufs] dump(8) fails horribly for large filesystems
s bin/111146   fs         [2tb] fsck(8) fails on 6T filesystem
o kern/109024  fs         [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not
o kern/109010  fs         [msdosfs] can't mv directory within fat32 file system
o bin/107829   fs         [2TB] fdisk(8): invalid boundary checking in fdisk / w
o kern/106030  fs         [ufs] [panic] panic in ufs from geom when a dead disk
o kern/105093  fs         [ext2fs] [patch] ext2fs on read-only media cannot be m
o kern/104406  fs         [ufs] Processes get stuck in "ufs" state under persist
o kern/104133  fs         [ext2fs] EXT2FS module corrupts EXT2/3 filesystems
o kern/103035  fs         [ntfs] Directories in NTFS mounted disc images appear
o kern/101324  fs         [smbfs] smbfs sometimes not case sensitive when it's s
o kern/99290   fs         [ntfs] mount_ntfs ignorant of cluster sizes
o kern/97377   fs         [ntfs] [patch] syntax cleanup for ntfs_ihash.c
o kern/95222   fs         [iso9660] File sections on ISO9660 level 3 CDs ignored
o kern/94849   fs         [ufs] rename on UFS filesystem is not atomic
o kern/94769   fs         [ufs] Multiple file deletions on multi-snapshotted fil
o kern/94733   fs         [smbfs] smbfs may cause double unlock
o kern/93942   fs         [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272   fs         [ffs] [hang] Filling a filesystem while creating a sna
f kern/91568   fs         [ufs] [panic] writing to UFS/softupdates DVD media in
o kern/91134   fs         [smbfs] [patch] Preserve access and modification time
a kern/90815   fs         [smbfs] [patch] SMBFS with character conversions somet
o kern/89991   fs         [ufs] softupdates with mount -ur causes fs UNREFS
o kern/88657   fs         [smbfs] windows client hang when browsing a samba shar
o kern/88266   fs         [smbfs] smbfs does not implement UIO_NOCOPY and sendfi
o kern/87859   fs         [smbfs] System reboot while umount smbfs.
o kern/86587   fs         [msdosfs] rm -r /PATH fails with lots of small files
o kern/85326   fs         [smbfs] [panic] saving a file via samba to an overquot
o kern/84589   fs         [2TB] 5.4-STABLE unresponsive during background fsck 2
o kern/80088   fs         [smbfs] Incorrect file time setting on NTFS mounted vi
o kern/77826   fs         [ext2fs] ext2fs usb filesystem will not mount RW
o kern/73484   fs         [ntfs] Kernel panic when doing `ls` from the client si
o bin/73019    fs         [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino
o kern/71774   fs         [ntfs] NTFS cannot "see" files on a WinXP filesystem
o kern/68978   fs         [panic] [ufs] crashes with failing hard disk, loose po
o kern/65920   fs         [nwfs] Mounted Netware filesystem behaves strange
o kern/65901   fs         [smbfs] [patch] smbfs fails fsx write/truncate-down/tr
o kern/61503   fs         [smbfs] mount_smbfs does not work as non-root
o kern/55617   fs         [smbfs] Accessing an nsmb-mounted drive via a smb expo
o kern/51685   fs         [hang] Unbounded inode allocation causes kernel to loc
o kern/51583   fs         [nullfs] [patch] allow to work with devices and socket
o kern/36566   fs         [smbfs] System reboot with dead smb mount and umount
o kern/18874   fs         [2TB] 32bit NFS servers export wrong negative values t

138 problems total.

From jh at FreeBSD.org Mon Nov 2 15:17:44 2009
From: jh at FreeBSD.org (jh@FreeBSD.org)
Date: Mon Nov 2 15:17:51 2009
Subject: kern/122047: [ext2fs] [patch] incorrect handling of UF_IMMUTABLE / UF_APPEND flag on EXT2FS (maybe others)
Message-ID: <200911021517.nA2FHhQV059156@freefall.freebsd.org>

Synopsis: [ext2fs] [patch] incorrect handling of UF_IMMUTABLE / UF_APPEND flag on EXT2FS (maybe others)

Responsible-Changed-From-To: freebsd-fs->jh
Responsible-Changed-By: jh
Responsible-Changed-When: Mon Nov 2 15:17:43 UTC 2009
Responsible-Changed-Why:
Take.

http://www.freebsd.org/cgi/query-pr.cgi?pr=122047

From jh at FreeBSD.org Mon Nov 2 17:15:59 2009
From: jh at FreeBSD.org (jh@FreeBSD.org)
Date: Mon Nov 2 17:16:05 2009
Subject: kern/105093: [ext2fs] [patch] ext2fs on read-only media cannot be mounted
Message-ID: <200911021715.nA2HFw5i062257@freefall.freebsd.org>

Synopsis: [ext2fs] [patch] ext2fs on read-only media cannot be mounted

State-Changed-From-To: open->feedback
State-Changed-By: jh
State-Changed-When: Mon Nov 2 17:13:28 UTC 2009
State-Changed-Why:
Apparently this has been fixed. Can you confirm?

Responsible-Changed-From-To: freebsd-fs->jh
Responsible-Changed-By: jh
Responsible-Changed-When: Mon Nov 2 17:13:28 UTC 2009
Responsible-Changed-Why:
Track.

http://www.freebsd.org/cgi/query-pr.cgi?pr=105093

From jh at FreeBSD.org Mon Nov 2 17:26:20 2009
From: jh at FreeBSD.org (jh@FreeBSD.org)
Date: Mon Nov 2 17:26:26 2009
Subject: kern/77826: [ext2fs] ext2fs usb filesystem will not mount RW
Message-ID: <200911021726.nA2HQJIX070408@freefall.freebsd.org>

Synopsis: [ext2fs] ext2fs usb filesystem will not mount RW

State-Changed-From-To: open->feedback
State-Changed-By: jh
State-Changed-When: Mon Nov 2 17:22:10 UTC 2009
State-Changed-Why:
Note that submitter has been asked for feedback.

Responsible-Changed-From-To: freebsd-fs->jh
Responsible-Changed-By: jh
Responsible-Changed-When: Mon Nov 2 17:22:10 UTC 2009
Responsible-Changed-Why:
Track.

http://www.freebsd.org/cgi/query-pr.cgi?pr=77826

From borjam at sarenet.es Tue Nov 3 11:09:04 2009
From: borjam at sarenet.es (Borja Marcos)
Date: Tue Nov 3 11:09:10 2009
Subject: zfs receive gives: internal error: Argument list too long
In-Reply-To: <20091029205121.GB3418@garage.freebsd.pl>
References: <20091029205121.GB3418@garage.freebsd.pl>
Message-ID: <9AA2C968-F09D-473D-BD13-F13B3F94ED60@sarenet.es>

On Oct 29, 2009, at 9:51 PM, Pawel Jakub Dawidek wrote:

> On Wed, Oct 28, 2009 at 09:51:46PM +0100, Ronald Klop wrote:
>> Hi,
>>
>> I'm forwarding this, because there was no answer on freebsd-stable.
>>
>> Does anybody know about this and have some tips on how to solve it?
>
> Could you try this patch:
>
> http://people.freebsd.org/~pjd/patches/zfs_recv_E2BIG.patch

It has caused a panic for me on 8.0-RC2/amd64. It seems to be a new
problem; I never saw a panic in this situation before.

How to reproduce: with /usr/src and /usr/obj in a dataset, just run

cd /usr/src
make clean

Instant panic, in less than 20 seconds.

I am trying to get panic information; unfortunately I'm running on
VMware Fusion and the silly thing doesn't offer the equivalent of a
serial console.

Borja.

From csaba.henk at creo.hu Tue Nov 3 20:26:51 2009
From: csaba.henk at creo.hu (Csaba Henk)
Date: Tue Nov 3 20:26:57 2009
Subject: [creo] Re: kern/105093: [ext2fs] [patch] ext2fs on read-only media cannot be mounted
In-Reply-To: <200911021715.nA2HFw5i062257@freefall.freebsd.org>
References: <200911021715.nA2HFw5i062257@freefall.freebsd.org>
Message-ID: <100d90a90911031201y2c36ee70ida572ed2e63f3ba1@mail.gmail.com>

On Mon, Nov 2, 2009 at 6:15 PM, wrote:
> Synopsis: [ext2fs] [patch] ext2fs on read-only media cannot be mounted
>
> Apparently this has been fixed. Can you confirm?
>
> http://www.freebsd.org/cgi/query-pr.cgi?pr=105093

I don't have a system at hand where I could test it, but there were
other reports that this problem has gone away, so I think it's indeed
fixed.
Csaba

From ai at kliksys.ru Thu Nov 5 06:25:38 2009 From: ai at kliksys.ru (Artemiev Igor) Date: Thu Nov 5 06:25:45 2009 Subject: Low zfs prefetch hits - why? Message-ID: <20091105061430.GA92808@one.kliksys.ru>

#sysctl vfs.zfs.zfetch
vfs.zfs.zfetch.array_rd_sz: 524288
vfs.zfs.zfetch.block_cap: 8
vfs.zfs.zfetch.min_sec_reap: 2
vfs.zfs.zfetch.max_streams: 8

I am trying to sequentially read a 1G file in 128K chunks.

#./arcstat.pl -f Time,pmis,pm%
Time      pmis  pm%
05:48:40   749  100
05:48:41    1K  100
05:48:42    1K  100
05:48:43    1K  100
05:48:44    1K  100
05:48:45    1K  100
05:48:46    1K  100
05:48:47   257   99

I thought the blocks were not being prefetched, so I wrote a small dtrace script:

#!/usr/sbin/dtrace -qs
fbt:zfs:dmu_zfetch:entry
{
        printf("dmu_zfetch: offset=%ld size=%ld prefetched=%ld\n", arg1, arg2, arg3);
}
fbt:zfs:dbuf_prefetch:entry
{
        printf(" zfetching block %d\n", arg1);
}

Here is the output:

dmu_zfetch: offset=325058560 size=131072 prefetched=32
dmu_zfetch: offset=325189632 size=131072 prefetched=32
dmu_zfetch: offset=325320704 size=131072 prefetched=32
dmu_zfetch: offset=325451776 size=131072 prefetched=32
dmu_zfetch: offset=325582848 size=131072 prefetched=32
dmu_zfetch: offset=326369280 size=131072 prefetched=32
dmu_zfetch: offset=326500352 size=131072 prefetched=32
dmu_zfetch: offset=326631424 size=131072 prefetched=32
dmu_zfetch: offset=326762496 size=131072 prefetched=32
dmu_zfetch: offset=326893568 size=131072 prefetched=32
dmu_zfetch: offset=327024640 size=131072 prefetched=32
 zfetching block 2503
 zfetching block 2504
 zfetching block 2505
 zfetching block 2506
 zfetching block 2507
 zfetching block 2508
 zfetching block 2509
 zfetching block 2510
dmu_zfetch: offset=327155712 size=131072 prefetched=32
dmu_zfetch: offset=327286784 size=131072 prefetched=32
dmu_zfetch: offset=327417856 size=131072 prefetched=32
dmu_zfetch: offset=327548928 size=131072 prefetched=32
dmu_zfetch: offset=327680000 size=131072 prefetched=32

Why is this happening? Are the statistics irrelevant?
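The offsets and block numbers in the trace above can be cross-checked with a little arithmetic. The following is only a minimal sketch in Python, assuming the file is stored in 128K blocks (the report gives the read size, not the dataset's recordsize, so this is an assumption); the function names are made up for the illustration.

```python
# Cross-check of the dtrace output, assuming 128K blocks (an assumption:
# the report gives the read size, not the dataset's recordsize).
BLOCK_SIZE = 128 * 1024


def block_of(offset, block_size=BLOCK_SIZE):
    """Index of the block that contains the given byte offset."""
    return offset // block_size


def prefetch_distance(read_offset, first_prefetched_block, block_size=BLOCK_SIZE):
    """How many blocks ahead of the current read the prefetch starts."""
    return first_prefetched_block - block_of(read_offset, block_size)


if __name__ == "__main__":
    last_read = 327024640             # last dmu_zfetch offset before the burst
    burst = list(range(2503, 2511))   # blocks reported by dbuf_prefetch
    print("read block:", block_of(last_read))
    print("prefetch starts", prefetch_distance(last_read, burst[0]), "blocks ahead")
    print("blocks in burst:", len(burst))
```

Under that assumption the read is at block 2495, so the burst starts 8 blocks ahead of the read position and covers 8 blocks, the same value as vfs.zfs.zfetch.block_cap above.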
With sendfile(2), prefetching did not work at all: there was no forward read-ahead. From gerrit at pmp.uni-hannover.de Fri Nov 6 08:47:38 2009 From: gerrit at pmp.uni-hannover.de (Gerrit =?ISO-8859-1?Q?K=FChn?=) Date: Fri Nov 6 08:47:44 2009 Subject: zfs panic mounting fs after crash with RC2 Message-ID: <20091106094734.4b056899.gerrit@pmp.uni-hannover.de> Hi, unfortunately I got no answer concerning this problem so far on -stable and -current (apart from the suggestion to try it again here :-). I can reproduce the panic, and if someone can guide me what to do with kdb, gdb, zdb or whatever tool might be needed to get the information needed to fix this, I'm all ears... cu Gerrit Begin forwarded message: Date: Wed, 4 Nov 2009 09:29:00 +0100 From: Gerrit Kühn To: freebsd-stable@FreeBSD.ORG Cc: Subject: zfs panic mounting fs after crash with RC2 Hi, Yesterday I had the opportunity to play around with my yet-to-become new fileserver a bit more. Originally I had installed 7.2-R, which I upgraded to 8.0-RC2 yesterday. After that I upgraded my zpool consisting of 4 disks in a raidz1 configuration to v13. Some time later I tried to use powerd, which was obviously a bad idea: it crashed the machine immediately. I will give a separate report on that later as it is probably related to the hardware, which is a bit exotic (VIA VB8001 board with 64bit Via Nano processor). However, the worst thing for me is that after rebooting from that crash, one of my zfs fs cannot be mounted anymore. As soon as I try to mount it I get a kernel panic. I can still access the properties (I made use of "canmount=noauto" for the first time :-), but I cannot do a snapshot of the fs (funny enough, zfs complains that the fs is busy, while in reality it is not even mounted - so how could it be busy?).
I took a picture of the kernel panic and put it here (don't know if there is any useful information in it): The pool as such seems to be fine, all other fs in it can be mounted and used, only trying to mount tank/sys/var triggers this panic. Are there any suggestions what I could do to get my fs back? Please let me know if (and how) I can provide more debugging information. cu Gerrit From dimitry at andric.com Fri Nov 6 12:10:34 2009 From: dimitry at andric.com (Dimitry Andric) Date: Fri Nov 6 12:10:42 2009 Subject: zfs panic mounting fs after crash with RC2 In-Reply-To: <20091106094734.4b056899.gerrit@pmp.uni-hannover.de> References: <20091106094734.4b056899.gerrit@pmp.uni-hannover.de> Message-ID: <4AF4123A.4080301@andric.com> On 2009-11-06 09:47, Gerrit Kühn wrote: > unfortunately I got no answer concerning this problem so far on -stable > and -current (apart from the suggestion to try it again here :-). > I can reproduce the panic, and if someone can guide me what to do with kdb, > gdb, zdb or whatever tool might be needed to get the information needed to > fix this, I'm all ears... At least a backtrace would be nice. :) From gerrit at pmp.uni-hannover.de Fri Nov 6 12:40:01 2009 From: gerrit at pmp.uni-hannover.de (Gerrit =?ISO-8859-1?Q?K=FChn?=) Date: Fri Nov 6 12:40:06 2009 Subject: zfs panic mounting fs after crash with RC2 In-Reply-To: <4AF4123A.4080301@andric.com> References: <20091106094734.4b056899.gerrit@pmp.uni-hannover.de> <4AF4123A.4080301@andric.com> Message-ID: <20091106133956.1091ed8c.gerrit@pmp.uni-hannover.de> On Fri, 06 Nov 2009 13:10:34 +0100 Dimitry Andric wrote about Re: zfs panic mounting fs after crash with RC2: DA> > unfortunately I got no answer concerning this problem so far on DA> > -stable and -current (apart from the suggestion to try it again DA> > here :-).
I can reproduce the panic, and if someone can guide me DA> > what to do with kdb, gdb, zdb or whatever tool might be needed to DA> > get the information needed to fix this, I'm all ears... DA> At least a backtrace would be nice. :) I know. Unfortunately I know not much about debugging the kernel. I read , but I do not get a kernel core file, because I run the system from a CF card and use the hds completely for zfs. I have no swap partition I could dump to. Is it possible to dump onto a zfs fs? Or is there any other way for debugging? cu Gerrit From gerrit at pmp.uni-hannover.de Fri Nov 6 12:50:24 2009 From: gerrit at pmp.uni-hannover.de (Gerrit =?ISO-8859-1?Q?K=FChn?=) Date: Fri Nov 6 12:50:30 2009 Subject: zfs panic mounting fs after crash with RC2 In-Reply-To: <4AF4123A.4080301@andric.com> References: <20091106094734.4b056899.gerrit@pmp.uni-hannover.de> <4AF4123A.4080301@andric.com> Message-ID: <20091106135020.7c837bc7.gerrit@pmp.uni-hannover.de> On Fri, 06 Nov 2009 13:10:34 +0100 Dimitry Andric wrote about Re: zfs panic mounting fs after crash with RC2: DA> > unfortunately I got no answer concerning this problem so far on DA> > -stable and -current (apart from the suggestion to try it again DA> > here :-). I can reproduce the panic, and if someone can guide me DA> > what to do with kdb, gdb, zdb or whatever tool might be needed to DA> > get the information needed to fix this, I'm all ears... DA> At least a backtrace would be nice. :) Thinking about my situation and assuming that I cannot dump directly onto a zfs fs, I could probably either plug in an usb stick and try to dump onto that or recompile the kernel with ddb to try online debugging. Any suggestions? 
cu Gerrit From attilio at freebsd.org Fri Nov 6 15:08:57 2009 From: attilio at freebsd.org (Attilio Rao) Date: Fri Nov 6 15:09:03 2009 Subject: [PATCH] Transform vfs.root.mountfrom into a list of fs:device Message-ID: <3bbf2fe10911060708t50f5ead8t76329b379c56a5eb@mail.gmail.com> This patch adds the possibility to specify multiple space-separated fs:device pairs in the vfs.root.mountfrom environment variable rather than just one item: http://www.freebsd.org/~attilio/Sandvine/STABLE_8/vfsrootmountfrom/vfsrootmountfrom.diff The one used to boot is the first valid one. While there, this patch also fixes a nit in a comment. This patch has been contributed back by Sandvine Incorporated. Please review. Thanks, Attilio -- Peace can only be achieved by understanding - A. Einstein From ivoras at freebsd.org Fri Nov 6 19:34:06 2009 From: ivoras at freebsd.org (Ivan Voras) Date: Fri Nov 6 19:34:12 2009 Subject: Performance issues with 8.0 ZFS and sendfile/lighttpd In-Reply-To: <4AF46CA9.1040904@quip.cz> References: <772532900-1257123963-cardhu_decombobulator_blackberry.rim.net-1402739480-@bda715.bisx.prod.on.blackberry> <4AEEBD4B.1050407@quip.cz> <4AEEDB3B.5020600@quip.cz> <4AF46CA9.1040904@quip.cz> Message-ID: <9bbcef730911061101h5356d2acob2ac8791afe112@mail.gmail.com> 2009/11/6 Miroslav Lachman <000.fbsd@quip.cz>:
> I do not understand why there are 10MB/s read from disks when network
> traffic dropped to around 1MB/s (8Mbps)
>
> root@cage ~/# iostat -w 20
>       tty             ad4              ad6             cpu
>  tin tout  KB/t tps  MB/s   KB/t tps  MB/s  us ni sy in id
>    0   14 41.66  53  2.17  41.82  53  2.18   0  0  2  0 97
>    0   18 50.92  96  4.77  54.82 114  6.12   0  0  3  1 96
>    0    6 53.52 101  5.29  54.98 108  5.81   1  0  4  1 94
>    0    6 54.82  98  5.26  55.89 108  5.89   0  0  3  1 96
Yes, this could limit your IO if the requests are random enough. Unfortunately I don't know how you would track down what is really going on. Maybe some tracing with DTrace?
I'd tell you to use "top -m io" to see if there is a process responsible, but apparently these statistics are not updated for ZFS, which in itself may be a bug (which is why I'm crossposting to freebsd-fs). From gerrit at pmp.uni-hannover.de Fri Nov 6 22:14:46 2009 From: gerrit at pmp.uni-hannover.de (Gerrit =?ISO-8859-1?Q?K=FChn?=) Date: Fri Nov 6 22:14:52 2009 Subject: trace for zfs panic mounting fs after crash with RC2 In-Reply-To: <4AF4123A.4080301@andric.com> References: <20091106094734.4b056899.gerrit@pmp.uni-hannover.de> <4AF4123A.4080301@andric.com> Message-ID: <20091106231440.4f0f2cbb.gerrit@pmp.uni-hannover.de> On Fri, 06 Nov 2009 13:10:34 +0100 Dimitry Andric wrote about Re: zfs panic mounting fs after crash with RC2: DA> On 2009-11-06 09:47, Gerrit Kühn wrote: DA> > unfortunately I got no answer concerning this problem so far on DA> > -stable and -current (apart from the suggestion to try it again DA> > here :-). I can reproduce the panic, and if someone can guide me DA> > what to do with kdb, gdb, zdb or whatever tool might be needed to DA> > get the information needed to fix this, I'm all ears... DA> At least a backtrace would be nice. :) I recompiled the kernel with ddb support and got the following trace (using mount -t zfs instead of zfs mount this time, but getting the same panic): I have the system still sitting at this point and can also 100% reproduce the panic. Please let me know if (and how) any further information can get pulled out of the debugger. cu Gerrit From james-freebsd-fs2 at jrv.org Fri Nov 6 23:02:27 2009 From: james-freebsd-fs2 at jrv.org (James R.
Van Artsdalen) Date: Fri Nov 6 23:02:34 2009 Subject: trace for zfs panic mounting fs after crash with RC2 In-Reply-To: <20091106231440.4f0f2cbb.gerrit@pmp.uni-hannover.de> References: <20091106094734.4b056899.gerrit@pmp.uni-hannover.de> <4AF4123A.4080301@andric.com> <20091106231440.4f0f2cbb.gerrit@pmp.uni-hannover.de> Message-ID: <4AF4AAFF.2080104@jrv.org> Gerrit Kühn wrote: > I recompiled the kernel with ddb support and got the following trace > (using mount -t zfs instead of zfs mount this time, but getting the same > panic): You may be able to recover your pool by changing the line below, but I have never tried it: it may clobber the pool. You definitely don't want this change normally! It may be necessary to avoid calling zil_destroy here too. How the ZIL got corrupted - if it did - is a harder question. What kind of hard disk is this, and how is it connected to the system? Was there any redundancy (mirror, raidz)?

void
zil_replay(objset_t *os, void *arg, uint64_t *txgp,
    zil_replay_func_t *replay_func[TX_MAX_TYPE],
    zil_replay_cleaner_t *replay_cleaner)
{
        zilog_t *zilog = dmu_objset_zil(os);
        const zil_header_t *zh = zilog->zl_header;
        zil_replay_arg_t zr;

==>     if (1 || zil_empty(zilog)) {
                zil_destroy(zilog, B_TRUE);
                return;
        }

From gerrit at pmp.uni-hannover.de Fri Nov 6 23:53:24 2009 From: gerrit at pmp.uni-hannover.de (Gerrit =?ISO-8859-1?Q?K=FChn?=) Date: Fri Nov 6 23:53:31 2009 Subject: trace for zfs panic mounting fs after crash with RC2 In-Reply-To: <4AF4AAFF.2080104@jrv.org> References: <20091106094734.4b056899.gerrit@pmp.uni-hannover.de> <4AF4123A.4080301@andric.com> <20091106231440.4f0f2cbb.gerrit@pmp.uni-hannover.de> <4AF4AAFF.2080104@jrv.org> Message-ID: <20091107005320.cd6a9fad.gerrit@pmp.uni-hannover.de> On Fri, 06 Nov 2009 17:02:23 -0600 "James R.
Van Artsdalen" wrote about Re: trace for zfs panic mounting fs after crash with RC2: JRVA> > I recompiled the kernel with ddb support and got the following JRVA> > trace (using mount -t zfs instead of zfs mount this time, but JRVA> > getting the same panic): JRVA> You may be able to recover your pool by changing the line below, but JRVA> I have never tried it: it may clobber the pool. You definitely JRVA> don't want this change normally! It may be necessary to avoid JRVA> calling zil_destroy here too. Well, as I said before, the pool itself and all other filesystems in it are fine. The pool can be imported and all other filesystems can be mounted and used. Just one of them panics the system when I try to mount it. JRVA> How the ZIL got corrupted - if it did - is a harder question. What JRVA> kind of hard disk is this, and how is it connected to the system? JRVA> Was there any redundancy (mirror, raidz)? These are 4x2.5" 400GB drives (WD4000BEVT) in a RAID-Z1 setup on a Supermicro AOC-USAS-L8i controller (LSI chip, mpt driver) in a VIA VB8001 board (powered by a Via Nano 1.6GHz) with 4GB of memory. The system panicked when I tried to run powerd; after reboot the pool came back fine, but the system panicked again when trying to mount this particular fs (tank/sys/var). Before this happened I had one similar issue when the system crashed (probably because I was mechanically pushing the controller card a bit too hard during operation when trying to fix some SATA cables). However, after this crash the whole pool did not come back and the system panicked when trying to import the pool - but also with ZIL-replay problems. As this happened right after installing the base system, I simply re-did the pool and re-installed the system. However, with this similar problem after a quite "normal" crash (hey, I only started powerd) my confidence is a bit low and I would like to have this fixed before I really put data on the machine.
cu Gerrit From linimon at FreeBSD.org Sat Nov 7 03:12:19 2009 From: linimon at FreeBSD.org (linimon@FreeBSD.org) Date: Sat Nov 7 03:12:31 2009 Subject: kern/140338: [vm] [zfs] [panic] FreeBSD 8.0 RC2 with vm.pmap.pg_ps_enabled=1 kernel panic with makeworld Message-ID: <200911070312.nA73CJ7m075818@freefall.freebsd.org> Old Synopsis: FreeBSD 8.0 RC2 with vm.pmap.pg_ps_enabled=1 kernel panic with makeworld New Synopsis: [vm] [zfs] [panic] FreeBSD 8.0 RC2 with vm.pmap.pg_ps_enabled=1 kernel panic with makeworld Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Sat Nov 7 03:10:45 UTC 2009 Responsible-Changed-Why: Seems to happen with a combination of vm and zfs settings. Since I have to pick an assignee, use the fs@ one. http://www.freebsd.org/cgi/query-pr.cgi?pr=140338

From xclin at cs.nctu.edu.tw Sat Nov 7 08:04:49 2009 From: xclin at cs.nctu.edu.tw (Chen-Chuan Lin) Date: Sat Nov 7 08:04:57 2009 Subject: ZFSBoot with zpool inside bsdlabel Message-ID: <20091107074823.GA78260@cs.nctu.edu.tw>

Hi all,

I want to build a ZFS-only system that multiboots with Microsoft Windows, so I can only use MBR and zfsboot. I found that zfsboot doesn't work in my environment. I created 2 slices, 1 for Windows and 2 for FreeBSD. In slice 2, I created 2 partitions, A for zfs and B for swap. If partition A's offset is 0, zfsboot works fine, but if partition A is behind B, zfsboot1 can't locate zfsboot2. Also, zfsboot2 can't find my zpool.

0      1                 n        n+512    n+1024       (sector relative to
+------+------+----------+--------+--------+-------+---- current slice)
| boot | disk | ... swap | vdev   | vdev   | Boot  |....
| code | label|          | label0 | label1 | Block |
+------+------+----------+--------+--------+-------+---
|      Partition B       |      Partition A
+------------------------+------------------------------

The original zfsboot only supports

0               512     1024
+------+--------+-------+-------+----
| boot | vdev   | vdev  | Boot  |
+------+ label  | label | Block | .....
|        0      | 1     |       |
+-------------+-+-------+-------+----

where boot code is zfsboot1 and boot block is zfsboot2.

I found that zfsboot does not read any disklabel and assumes that the zpool occupies the whole slice instead of a single partition. So I modified zfsboot.c (part of zfsboot2) and zfsldr.S (zfsboot1). probe_drive() in zfsboot.c now scans for zfs metadata not only in the slice but also in partition A if there is a disklabel in the slice.

In zfsldr.S there is not enough space for additional code to scan the disklabel, so I made a change. Everything before main.5 is unchanged. After that, the code scans the disklabel, checks the magic number and partition type, and calculates the offset of partition A. Finally, it reads BTX and the BTX client from there instead of from sector 1024 of the slice. The rest (BTX relocation and enabling A20) has been moved into zfsboot1.5, which is stored in sector 2 of the boot slice and is loaded at 0xA00 (I don't know whether this memory location is OK). zfsboot1 loads boot1.5, loads BTX and then jumps to boot1.5. Boot1.5 does the rest and jumps to BTX to continue the boot procedure.
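The offset calculation described above can be sketched in a few lines. This is only an illustration in Python, not the boot code itself: the field offsets, the disklabel magic and the ZFS partition type (27) are taken from the constants in the zfsldr.S part of the patch further down, while the function names and the sample label are hypothetical and exist only for this example.

```python
import struct

# Constants as they appear in the zfsldr.S part of the patch.
DISKMAGIC = 0x82564557
DL_MAGIC1 = 0x00
DL_MAGIC2 = 0x84
DL_PARTA_SIZE = 0x94    # called DL_SECT in the patch
DL_PARTA_OFFSET = 0x98
DL_PARTA_TYPE = 0xA0
DL_RAW_OFFSET = 0xB8
FS_ZFS = 27


def _u32(buf, off):
    """Read a little-endian 32-bit value from the label sector."""
    return struct.unpack_from("<I", buf, off)[0]


def partition_a_start(slice_start, label):
    """Return the absolute start sector of partition A.

    Falls back to the slice start when the sector holds no valid
    disklabel, or partition A is empty or not of type ZFS.
    """
    if _u32(label, DL_MAGIC1) != DISKMAGIC or _u32(label, DL_MAGIC2) != DISKMAGIC:
        return slice_start
    if _u32(label, DL_PARTA_SIZE) == 0 or label[DL_PARTA_TYPE] != FS_ZFS:
        return slice_start
    # Subtracting the raw partition's offset makes the result independent
    # of whether the label stores absolute or slice-relative offsets.
    return slice_start + _u32(label, DL_PARTA_OFFSET) - _u32(label, DL_RAW_OFFSET)


def make_label(a_size, a_offset, raw_offset, a_type=FS_ZFS):
    """Build a made-up 512-byte label sector for experimenting (hypothetical helper)."""
    label = bytearray(512)
    struct.pack_into("<I", label, DL_MAGIC1, DISKMAGIC)
    struct.pack_into("<I", label, DL_MAGIC2, DISKMAGIC)
    struct.pack_into("<I", label, DL_PARTA_SIZE, a_size)
    struct.pack_into("<I", label, DL_PARTA_OFFSET, a_offset)
    label[DL_PARTA_TYPE] = a_type
    struct.pack_into("<I", label, DL_RAW_OFFSET, raw_offset)
    return bytes(label)
```

For example, with a slice starting at sector 1000 and a label whose partition A begins 4096 sectors after the raw partition, partition_a_start(1000, make_label(5000, 5096, 1000)) gives 5096; a sector without the magic numbers leaves the slice start unchanged.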
The steps to install the boot code have become (assuming the zpool is in ad0s2a):

dd if=zfsboot1 of=/dev/ad0s2
dd if=zfsboot15 of=/dev/ad0s2 seek=2
dd if=zfsboot2 of=/dev/ad0s2a seek=1024

There is a sample zfsboot at http://140.113.17.225/~xclin/zfsboot (with some debug messages and only for i386/amd64). The first 512 bytes are zfsboot1, the second 512 bytes are zfsboot1.5, and the remaining 32K are zfsboot2:

dd if=zfsboot of=/dev/ad0s2 count=1
dd if=zfsboot of=/dev/ad0s2 count=1 skip=1 seek=2
dd if=zfsboot of=/dev/ad0s2a skip=2 seek=1024

I don't know whether this feature is necessary or not... The following are the patches and zfsboot1.5's code (or http://140.113.17.225/~xclin/zfsboot.patch)

--- sys/boot/i386/zfsboot/Makefile.orig 2009-10-30 23:48:15.356651674 +0800 +++ sys/boot/i386/zfsboot/Makefile 2009-10-30 23:49:59.515556507 +0800 @@ -15,6 +15,7 @@ REL1= 0x700 ORG1= 0x7c00 +ORG15= 0xa00 ORG2= 0x2000 CFLAGS= -Os -g \ @@ -45,8 +46,8 @@ CLEANFILES= zfsboot -zfsboot: zfsboot1 zfsboot2 - cat zfsboot1 zfsboot2 > zfsboot +zfsboot: zfsboot1 zfsboot1.5 zfsboot2 + cat zfsboot1 zfsboot1.5 zfsboot2 > zfsboot CLEANFILES+= zfsboot1 zfsldr.out zfsldr.o @@ -56,6 +57,14 @@ zfsldr.out: zfsldr.o ${LD} ${LDFLAGS} -e start -Ttext ${ORG1} -o ${.TARGET} zfsldr.o +CLEANFILES+= zfsboot1.5 boot15.out boot15.o + +zfsboot1.5: boot15.out + objcopy -S -O binary boot15.out ${.TARGET} + +boot15.out: boot15.o + ${LD} ${LDFLAGS} -e start -Ttext ${ORG15} -o ${.TARGET} boot15.o + CLEANFILES+= zfsboot2 zfsboot.ld zfsboot.ldr zfsboot.bin zfsboot.out \ zfsboot.o zfsboot.s zfsboot.s.tmp zfsboot.h sio.o --- sys/boot/i386/zfsboot/boot15.S.orig 1970-01-01 08:00:00.000000000 +0800 +++ sys/boot/i386/zfsboot/boot15.S 2009-10-30 23:37:11.310256951 +0800 @@ -0,0 +1,93 @@ + +/* Memory Locations */ + .set MEM_ORG_15,0xa00 # Origin of boot15 + .set MEM_BUF,0x8000 # Load area + .set MEM_BTX,0x9000 # BTX start + .set MEM_JMP,0x9010 # BTX entry point + .set MEM_USR,0xa000 # Client start + .set BDA_BOOT,0x472 # Boot howto flag + +/*
Misc. Constants */ + .set SIZ_PAG,0x1000 # Page size + .set SIZ_SEC,0x200 # Sector size + .set NSECT,0x40 + + .global start + .code16 + + .set NREADP, 0x7c27 + +start: jmp main + +main: + mov $msg_hello, %si + callw putstr + +/* + * We have already loaded BTX and client into $MEM_BUF in boot1. The + * only thing boot1.5 need to do is just relocate BTX and client and + * then jump to entry point + */ + +main.7: mov $MEM_BUF,%si # BTX (before reloc) + mov 0xa(%si),%bx # Get BTX length and set + mov $NSECT*SIZ_SEC-1,%di # Size of load area (less one) + mov %di,%si # End of load + add $MEM_BUF,%si # area + sub %bx,%di # End of client, 0xc000 rel + mov %di,%cx # Size of + inc %cx # client + mov $(MEM_USR+2*SIZ_PAG)>>4,%dx # Segment + mov %dx,%es # addressing 0xc000 + std # Move with decrement + rep # Relocate + movsb # client + mov %ds,%dx # Back to + mov %dx,%es # zero segment + mov $MEM_BUF,%si # BTX (before reloc) + mov $MEM_BTX,%di # BTX + mov %bx,%cx # Get BTX length + cld # Increment this time + rep # Relocate + movsb # BTX + +/* + * Enable A20 so we can access memory above 1 meg. + * Use the zero-valued %cx as a timeout for embedded hardware which do not + * have a keyboard controller. + */ +seta20: cli # Disable interrupts +seta20.1: dec %cx # Timeout? + jz seta20.3 # Yes + inb $0x64,%al # Get status + testb $0x2,%al # Busy? + jnz seta20.1 # Yes + movb $0xd1,%al # Command: Write + outb %al,$0x64 # output port +seta20.2: inb $0x64,%al # Get status + testb $0x2,%al # Busy? 
+ jnz seta20.2 # Yes + movb $0xdf,%al # Enable + outb %al,$0x60 # A20 +seta20.3: sti # Enable interrupts + + + jmp start+MEM_JMP-MEM_ORG_15 # Start BTX + +putstr.0: mov $0x7,%bx + movb $0xe,%ah + int $0x10 +putstr: lodsb + testb %al,%al + jne putstr.0 + +ereturn: movb $0x1,%ah + stc +return: retw + + +msg_hello: .asciz "welcom to boot1.5\r\n" + + .org 0x1FE, 0x90 +/* useless but check if code is more then 512byte */ +part_magic: .word 0xaa55 --- sys/boot/i386/zfsboot/zfsboot.c.orig 2009-10-30 23:48:15.355651268 +0800 +++ sys/boot/i386/zfsboot/zfsboot.c 2009-10-30 23:50:44.761515700 +0800 @@ -22,6 +22,7 @@ #ifdef GPT #include #endif +#include #include #include @@ -154,6 +155,7 @@ struct dmadat { char rdbuf[READ_BUF_SIZE]; /* for reading large things */ char secbuf[READ_BUF_SIZE]; /* for MBR/disklabel */ + char labbuf[READ_BUF_SIZE]; /* for MBR/disklabel */ }; static struct dmadat *dmadat; @@ -524,6 +526,32 @@ */ dsk = copy_dsk(dsk); } + else if(dp[i].dp_typ == DOSPTYP_386BSD) + { + struct disklabel *label; + char *lhdr = dmadat->labbuf; + /* save slice offset if we want to scan all partitions */ + u_int32_t dp_start = dp[i].dp_start; + + if(drvread(dsk, lhdr, 1 ,1)) + continue; + + label = lhdr; + + if(label->d_magic !=DISKMAGIC || label->d_magic2 != DISKMAGIC) + continue; + + if(!label->d_partitions[0].p_size) + continue; + + dsk->start += label->d_partitions[0].p_offset; + dsk->start -= label->d_partitions[RAW_PART].p_offset; + + if(vdev_probe(vdev_read, dsk, spap) == 0) { + spap =0; + dsk = copy_dsk(dsk); + } + } } } --- sys/boot/i386/zfsboot/zfsldr.S.orig 2009-10-30 23:48:15.353651293 +0800 +++ sys/boot/i386/zfsboot/zfsldr.S 2009-10-30 23:51:13.431489398 +0800 @@ -18,6 +18,7 @@ /* Memory Locations */ .set MEM_REL,0x700 # Relocation address .set MEM_ARG,0x900 # Arguments + .set MEM_BOOT15,0xa00 # Boot1.5 Location .set MEM_ORG,0x7c00 # Origin .set MEM_BUF,0x8000 # Load area .set MEM_BTX,0x9000 # BTX start @@ -38,6 +39,16 @@ .set SIZ_SEC,0x200 # Sector size .set 
NSECT,0x40 + +/* Disklabel Constants */ + .set DL_DISKMAGIC,0x82564557 # Disklabel Magic + .set DL_MAGIC1,0x0 # Magic1 + .set DL_MAGIC2,0x84 # Magic2 + .set DL_SECT,0x94 # Number of sector + .set DL_PARTA_OFFSET,0x98 # PartA offset + .set DL_PARTA_TYPE,0xA0 # PartA Type + .set DL_RAW_OFFSET,0xb8 # Raw Part offset + .globl start .globl xread .code16 @@ -194,9 +205,64 @@ * area and target area do not overlap. */ main.5: mov %dx,MEM_ARG # Save args + movw 0x8(%si),%ax # Backup + movw %ax,MEM_BUF+SIZ_SEC+0x8 # original + movw 0xa(%si),%ax # slice + movw %ax,MEM_BUF+SIZ_SEC+0xa # start +/* + * Because we dont's have enough space, relocateing BTX will take + * place in boot 1.5. Load it from sector 2 and relocate into 0xa00 + */ + movb $1,%dh # Load + movw $2,%ax # Boot 1.5 from + callw nread.1 # Read disk + mov $MEM_BUF,%si # Boot 1.5 (before) + mov $MEM_BOOT15,%di # Boot 1.5 + mov $SIZ_SEC,%cx # Size + rep # Relocate + movsb # Boot 1.5 +/* + * Read sector 1 from slice to check out whethre there is disklabel + * or not. 
If so, check magic, partition type and size of partition A + * And then caculate partition A's offset and put it in + * MEM_BUF+SIZ_SEC+0x8, otherwise, it remain unchanged + */ + movw $MEM_BUF+SIZ_SEC,%si # offset + movb $1,%dh # Sector count + movw $1,%ax # Offset to disklabel + callw nread.1 # Read disk + mov $MEM_BUF,%si # Disklabel + cmpl $DL_DISKMAGIC,0x0(%si) # Check + jne main.6 # msgic1 + cmpl $DL_DISKMAGIC,0x84(%si) # Check + jne main.6 # magic2 + cmpl $0,DL_SECT(%si) # Check label a + je main.6 # size + cmpb $27,DL_PARTA_TYPE(%si) # Check label a + jne main.6 # type ZFS=27 + + movw MEM_BUF+SIZ_SEC+0x8,%ax # Partition A + movw MEM_BUF+SIZ_SEC+0xa,%cx # start at + addw DL_PARTA_OFFSET(%si),%ax # slice + adcw DL_PARTA_OFFSET+2(%si),%cx # offset + + subw DL_RAW_OFFSET(%si),%ax # part a + sbbw DL_RAW_OFFSET+2(%si),%cx # offset - + movw %ax,MEM_BUF+SIZ_SEC+0x8 # raw part + movw %cx,MEM_BUF+SIZ_SEC+0xa # offset +/* + * We have boot partitions's offset in $MEM_BUS+SIZ_SEC+8, put it in + * %si and load. And then read boot2 from disk. The rest of things + * will be done in boot 1.5 + */ +main.6: + movw $MEM_BUF+SIZ_SEC,%si # offset movb $NSECT,%dh # Sector count movw $1024,%ax # Offset to boot2 callw nread.1 # Read disk + + jmp start+MEM_BOOT15-MEM_ORG # Goto Boot1.5 + +#if 0 /* below is the original boot code */ main.6: mov $MEM_BUF,%si # BTX (before reloc) mov 0xa(%si),%bx # Get BTX length and set mov $NSECT*SIZ_SEC-1,%di # Size of load area (less one) @@ -241,14 +307,16 @@ jmp start+MEM_JMP-MEM_ORG # Start BTX +#endif /* * Trampoline used to call read from within boot1. 
*/ nread: xor %ax,%ax # Sector offset in partition nread.1: mov $MEM_BUF,%bx # Transfer buffer + xor %cx,%cx # Clear add 0x8(%si),%ax # Get - mov 0xa(%si),%cx # LBA + adc 0xa(%si),%cx # LBA push %cs # Read from callw xread.1 # disk jnc return # If success, return From gallasch at free.de Sat Nov 7 08:40:03 2009 From: gallasch at free.de (Kai Gallasch) Date: Sat Nov 7 08:40:09 2009 Subject: kern/140338: FreeBSD 8.0 RC2 with vm.pmap.pg_ps_enabled=1 kernel panic with makeworld Message-ID: <200911070840.nA78e2qc088426@freefall.freebsd.org> The following reply was made to PR kern/140338; it has been noted by GNATS. From: Kai Gallasch To: bug-followup@FreeBSD.org Cc: Patrick Lamaiziere , linimon@FreeBSD.org Subject: Re: kern/140338: FreeBSD 8.0 RC2 with vm.pmap.pg_ps_enabled=1 kernel panic with makeworld Date: Sat, 7 Nov 2009 09:09:22 +0100 On Fri, 6 Nov 2009 19:36:54 +0100 Patrick Lamaiziere wrote: > On Fri, 6 Nov 2009 17:28:40 GMT, > Kai Gallasch wrote: > > Hello, > > > ZFS filesystem version 13 > > ZFS storage pool version 13 > > It seems you are using ZFS on this box? No. The server is not in production and never was with FreeBSD 8. Only the kernel module is loaded, due to zfs_enable="YES" in rc.conf as a preparation for using ZFS with 8.0-RELEASE. The problem described in the PR occurred when zfs_enable="YES" was not set in rc.conf, so I see no direct connection with ZFS. > > On the same box, I've used super-pages for a long time on FreeBSD 7.2 > and with 8.0/BETA without any problem (but without ZFS too). Since > I've turned off super-pages, ZFS is stable. I tested superpages on 7.2-STABLE with ZFS and had to deactivate them after the server became unstable.
--Kai From gavin at FreeBSD.org Sat Nov 7 14:28:13 2009 From: gavin at FreeBSD.org (gavin@FreeBSD.org) Date: Sat Nov 7 14:28:20 2009 Subject: kern/140338: [vm][panic] FreeBSD 8.0 RC2 with vm.pmap.pg_ps_enabled=1 kernel panic with makeworld Message-ID: <200911071428.nA7ESDhU097466@freefall.freebsd.org> Old Synopsis: [vm] [zfs] [panic] FreeBSD 8.0 RC2 with vm.pmap.pg_ps_enabled=1 kernel panic with makeworld New Synopsis: [vm][panic] FreeBSD 8.0 RC2 with vm.pmap.pg_ps_enabled=1 kernel panic with makeworld Responsible-Changed-From-To: freebsd-fs->freebsd-bugs Responsible-Changed-By: gavin Responsible-Changed-When: Sat Nov 7 14:25:06 UTC 2009 Responsible-Changed-Why: Not ZFS related after all http://www.freebsd.org/cgi/query-pr.cgi?pr=140338 From 000.fbsd at quip.cz Sat Nov 7 20:18:31 2009 From: 000.fbsd at quip.cz (Miroslav Lachman) Date: Sat Nov 7 20:18:37 2009 Subject: Performance issues with 8.0 ZFS and sendfile/lighttpd In-Reply-To: <9bbcef730911061101h5356d2acob2ac8791afe112@mail.gmail.com> References: <772532900-1257123963-cardhu_decombobulator_blackberry.rim.net-1402739480-@bda715.bisx.prod.on.blackberry> <4AEEBD4B.1050407@quip.cz> <4AEEDB3B.5020600@quip.cz> <4AF46CA9.1040904@quip.cz> <9bbcef730911061101h5356d2acob2ac8791afe112@mail.gmail.com> Message-ID: <4AF5D611.7060408@quip.cz> Ivan Voras wrote: > 2009/11/6 Miroslav Lachman<000.fbsd@quip.cz>: > >> I do not understand why there are 10MB/s read from disks when network >> traffic dropped to around 1MB/s (8Mbps) >> >> root@cage ~/# iostat -w 20 >> tty ad4 ad6 cpu >> tin tout KB/t tps MB/s KB/t tps MB/s us ni sy in id >> 0 14 41.66 53 2.17 41.82 53 2.18 0 0 2 0 97 >> 0 18 50.92 96 4.77 54.82 114 6.12 0 0 3 1 96 >> 0 6 53.52 101 5.29 54.98 108 5.81 1 0 4 1 94 >> 0 6 54.82 98 5.26 55.89 108 5.89 0 0 3 1 96 > > Yes, this could limit your IO if the requests are random enough. > Unfortunately I don't know how would you track down what is really > going on. Maybe some tracing with DTrace? 
> > I'd tell you to use "top -m io" to see if there is a process > responsible, but apparently these statistics are not updated for ZFS, > which in itself may be a bug (which is why I'm crossposting to > freebsd-fs). DTrace is totally beyond my skills ;( Here is the output of top -m io sorted by VCSW, displaying JID:

last pid: 17724;  load averages:  0.01,  0.07,  0.08   up 74+20:49:49  21:03:40
195 processes: 1 running, 193 sleeping, 1 zombie
CPU:  0.0% user,  0.0% nice,  3.6% system,  0.4% interrupt, 96.1% idle
Mem: 462M Active, 2385M Inact, 977M Wired, 21M Cache, 399M Buf, 100M Free
Swap: 6144M Total, 2024K Used, 6142M Free

  PID JID USERNAME VCSW IVCSW READ WRITE FAULT TOTAL PERCENT COMMAND
17681   8 www       657    64    0     0     0     0   0.00% lighttpd
17683   8 www       379    41    0     0     0     0   0.00% lighttpd
17680   8 www       136     5    0     0     0     0   0.00% lighttpd
17682   8 www        85     0    0     0     0     0   0.00% lighttpd
 4689   1 90         10     0    0     0     0     0   0.00% fb_inet_server
 3403   1 90         10     0    0     0     0     0   0.00% fb_inet_server
 2632   1 90         10     0    0     0     0     0   0.00% fb_inet_server

All four top consumers are Lighttpd workers. And as you noted, read, write, fault, total and percent are not updated on a machine with ZFS, so I can't compare it with a UFS2-based machine. Is this bug in top fixed in 8.x? Will you file a PR?
(you know more about FS related things than me :]) Miroslav Lachman From ivoras at freebsd.org Sat Nov 7 20:42:48 2009 From: ivoras at freebsd.org (Ivan Voras) Date: Sat Nov 7 20:42:55 2009 Subject: Performance issues with 8.0 ZFS and sendfile/lighttpd In-Reply-To: <4AF5D611.7060408@quip.cz> References: <772532900-1257123963-cardhu_decombobulator_blackberry.rim.net-1402739480-@bda715.bisx.prod.on.blackberry> <4AEEBD4B.1050407@quip.cz> <4AEEDB3B.5020600@quip.cz> <4AF46CA9.1040904@quip.cz> <9bbcef730911061101h5356d2acob2ac8791afe112@mail.gmail.com> <4AF5D611.7060408@quip.cz> Message-ID: <9bbcef730911071242m5ad91720xcccb7586c6848ffd@mail.gmail.com> 2009/11/7 Miroslav Lachman <000.fbsd@quip.cz>: > > And as you noted, read, write, fault, total and percent are not updated on > machine with ZFS, so I can't compare it with UFS2 based machine. > Is this bug in top fixed in 8.x? Will you file a PR? (you know more about FS > related things than me :]) Not much... it depends on where the stats are collected: there is a fair bit of file system infrastructure that ZFS bypasses, and if these stats come from it, they cannot be collected. The stat is apparently updated around sys/kern/vfs_cluster.c:233. I'm not very familiar with this layer, but since it uses struct buf and ZFS doesn't use the bufcache, this is probably one of the things that is bypassed, though it would be nice if it weren't, since this code also defines and uses the vfs.write_behind and vfs.read_max sysctls. Also, since ZFS uses its own threads for IO and the stats are for curthread, it looks like it would need careful work to actually assign the IO stats to the correct thread; otherwise it may be sufficient to add it to vdev_disk.c in vdev_disk_physio(). I don't really know this code and this is mostly mechanical analysis - it might be wrong. At least I'd like to read someone's comment about what curthread is in this code path.
From kamikaze at bsdforen.de Sun Nov 8 11:20:09 2009 From: kamikaze at bsdforen.de (Dominic Fandrey) Date: Sun Nov 8 11:20:23 2009 Subject: kern/140134: [msdosfs] write and fsck destroy filesystem integrity Message-ID: <200911081120.nA8BK9qf010747@freefall.freebsd.org> The following reply was made to PR kern/140134; it has been noted by GNATS. From: Dominic Fandrey To: bug-followup@FreeBSD.org, kamikaze@bsdforen.de Cc: Subject: Re: kern/140134: [msdosfs] write and fsck destroy filesystem integrity Date: Sun, 08 Nov 2009 12:18:53 +0100 I just rebuilt world and kernel with RELENG_8 from ~3 hours ago. Afterwards I repeated the How-To-Repeat procedure and the problems shown therein didn't appear. In the next step I put my Cowon S9 (audio/video player with 16GB fat32 partition) on the line. I mounted the device and created a file with touch. fsck_msdosfs said the drive was fine. Windows chkdsk reported the drive was fine. So the next attempt was to remove the empty file and copy a JPG ~2.5MB onto the drive. fsck_msdosfs reported the following: > fsck_msdosfs -n /dev/msdosfs/VALHALLA ** /dev/msdosfs/VALHALLA ** Phase 1 - Read and Compare FATs ** Phase 2 - Check Cluster Chains ** Phase 3 - Checking Directories ** Phase 4 - Checking for Lost Files Lost cluster chain at cluster 4 49 Cluster(s) lost Reconnect? no Clear? no Lost cluster chain at cluster 53 37 Cluster(s) lost Reconnect? no Clear? no Lost cluster chain at cluster 90 9 Cluster(s) lost Reconnect? no Clear? no Lost cluster chain at cluster 99 9 Cluster(s) lost Reconnect? no Clear? no Lost cluster chain at cluster 108 1126 Cluster(s) lost Reconnect? no Clear? no Lost cluster chain at cluster 1234 9 Cluster(s) lost Reconnect? no Clear? no Lost cluster chain at cluster 1244 5 Cluster(s) lost Reconnect? no Clear? no Lost cluster chain at cluster 1248 49 Cluster(s) lost Reconnect? no Clear? no Lost cluster chain at cluster 1302 282 Cluster(s) lost Reconnect? no Clear? 
no
Lost cluster chain at cluster 1600
5 Cluster(s) lost
Reconnect? no
Clear? no
Lost cluster chain at cluster 1605
59 Cluster(s) lost
Reconnect? no
Clear? no
Lost cluster chain at cluster 449483
1 Cluster(s) lost
Reconnect? no
Clear? no
Free space in FSInfo block (-1) not correct (577592)
Fix? no
Next free cluster in FSInfo block (2) not free
Fix? no
2130 files, 426432 free (577592 clusters)

Kinda panicked, I ran Windows chkdsk and it reported the drive was just fine. fsck_msdosfs still reports the same errors.

I conclude that writing to msdosfs seems to be fine now; however, fsck_msdosfs is a dangerous threat to fat32 file systems.

From gerrit at pmp.uni-hannover.de Mon Nov 9 09:13:01 2009
From: gerrit at pmp.uni-hannover.de (Gerrit Kühn)
Date: Mon Nov 9 09:13:09 2009
Subject: trace for zfs panic mounting fs after crash with RC2
In-Reply-To: <4AF4AAFF.2080104@jrv.org>
References: <20091106094734.4b056899.gerrit@pmp.uni-hannover.de>
 <4AF4123A.4080301@andric.com>
 <20091106231440.4f0f2cbb.gerrit@pmp.uni-hannover.de>
 <4AF4AAFF.2080104@jrv.org>
Message-ID: <20091109101255.e81774e4.gerrit@pmp.uni-hannover.de>

On Fri, 06 Nov 2009 17:02:23 -0600 "James R. Van Artsdalen" wrote about Re: trace for zfs panic mounting fs after crash with RC2:

JRVA> How the ZIL got corrupted - if it did - is a harder question.

I think it is corrupted. Otherwise zfs would not crash while trying to replay the ZIL, would it?
It seems that this happens rather easily with the system I have at hand (it has happened to me twice so far - and I crashed the system only twice, which makes 100%, although I doubt that it is that reproducible). Searching around I found some reports of the same or similar issues (but no solution). So apart from recovering my fs (I did not try your suggested patch yet), there are two things I regard as very important:

1. Find out why the ZIL gets corrupted under some circumstances.
2. Find a safe way to recover a fs with a corrupted ZIL.
I guess I could live with a corrupted ZIL after a crash, if there was some kind of --ignore-zil switch to get my data back online. In any case, zfs should not panic on corrupted ZIL data, should it? As I do not dare to use the system for storing data until this is sorted out, I can try out almost anything to get more information about the problem. Please let me know what I should do to support debugging. cu Gerrit From bugmaster at FreeBSD.org Mon Nov 9 11:06:51 2009 From: bugmaster at FreeBSD.org (FreeBSD bugmaster) Date: Mon Nov 9 11:07:58 2009 Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org Message-ID: <200911091106.nA9B6oeh078976@freefall.freebsd.org> Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o kern/140134 fs [msdosfs] write and fsck destroy filesystem integrity o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs o bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n o kern/139597 fs [patch] [tmpfs] tmpfs initializes va_gen but doesn't u o kern/139564 fs [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot o kern/139363 fs [nfs] diskless root nfs mount from non FreeBSD server o kern/138790 fs [zfs] ZFS ceases caching when mem demand is high o kern/138524 fs [msdosfs] disks and usb flashes/cards with Russian lab o kern/138421 fs [ufs] [patch] remove UFS label limitations o kern/138367 fs [tmpfs] [panic] 'panic: Assertion pages > 0 failed' wh o kern/138202 fs 
mount_msdosfs(1) see only 2Gb o kern/138109 fs [extfs] [patch] Minor cleanups to the sys/gnu/fs/ext2f f kern/137037 fs [zfs] [hang] zfs rollback on root causes FreeBSD to fr o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic o kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/135594 fs [zfs] Single dataset unresponsive with Samba o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot o kern/134491 fs [zfs] Hot spares are rather cold... o kern/133980 fs [panic] [ffs] panic: ffs_valloc: dup alloc o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis o kern/133614 fs [panic] panic: ffs_truncate: read-only filesystem o kern/133174 fs [msdosfs] [patch] msdosfs must support utf-encoded int f kern/133150 fs [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132597 fs [tmpfs] [panic] tmpfs-related panic while interrupting o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131995 fs [nfs] Failure to mount NFSv4 server o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130979 fs [smbfs] [panic] boot/kernel/smbfs.ko o kern/130920 fs [msdosfs] 
cp(1) takes 100% CPU time while copying file o kern/130229 fs [iconv] usermount fails on fs that need iconv o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/129059 fs [zfs] [patch] ZFS bootloader whitelistable via WITHOUT f kern/128829 fs smbd(8) causes periodic panic on 7-RELEASE o kern/127659 fs [tmpfs] tmpfs memory leak o kern/127420 fs [gjournal] [panic] Journal overflow on gmirrored gjour o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS p kern/124621 fs [ext3] [patch] Cannot mount ext2fs partition f bin/124424 fs [zfs] zfs(8): zfs list -r shows strange snapshots' siz o kern/123939 fs [msdosfs] corrupts new files o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121779 fs [ufs] snapinfo(8) (and related tools?) 
only work for t o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8 o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha f kern/120991 fs [panic] [fs] [snapshot] System crashes when manipulati o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F f kern/119735 fs [zfs] geli + ZFS + samba starting on boot panics 7.0-B o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o bin/118249 fs mv(1): moving a directory changes its mtime o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o kern/116913 fs [ffs] [panic] ffs_blkfree: freeing free block p kern/116608 fs [msdosfs] [patch] msdosfs fails to check mount options o kern/116583 fs [ffs] [hang] System freezes for short time when using o kern/116170 fs [panic] Kernel panic when mounting /tmp o kern/115645 fs [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs 
[msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [iso9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna f kern/91568 fs [ufs] [panic] writing to UFS/softupdates DVD media in o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/89991 fs [ufs] softupdates with mount -ur causes fs UNREFS o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o kern/87859 fs [smbfs] System reboot while umount smbfs. 
o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o kern/85326 fs [smbfs] [panic] saving a file via samba to an overquot o kern/84589 fs [2TB] 5.4-STABLE unresponsive during background fsck 2 o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 135 problems total. From gavin at FreeBSD.org Mon Nov 9 21:45:58 2009 From: gavin at FreeBSD.org (gavin@FreeBSD.org) Date: Mon Nov 9 21:46:11 2009 Subject: kern/140433: ZFS panics kernel while replaying ZIL after crash Message-ID: <200911092145.nA9LjwAC037224@freefall.freebsd.org> Synopsis: ZFS panics kernel while replaying ZIL after crash Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: gavin Responsible-Changed-When: Mon Nov 9 21:30:36 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). To submitter: can you also please give the output of "zdb -C"? 
http://www.freebsd.org/cgi/query-pr.cgi?pr=140433

From gleb.kurtsou at gmail.com Mon Nov 9 23:40:52 2009
From: gleb.kurtsou at gmail.com (Gleb Kurtsou)
Date: Mon Nov 9 23:40:59 2009
Subject: Deleting files/dirs from partially damaged ZFS filesystem
Message-ID: <20091109234045.GA3679@tops.skynet.lt>

Hello,

I have a ZFS filesystem with some inconsistencies I'd like to fix. It's the root filesystem on my laptop, so backing up and restoring everything is rather troublesome. ZFS scrub doesn't show or fix any errors. These inconsistencies occurred several months ago and I have had no other issues with the filesystem since then.

These issues may be interesting for those working on ZFS stability. I wasn't able to find any useful information on ZFS debugging tools, so I'm posting it here.

1. ~/.mozilla-bug2/firefox/5iyxnqmf.default-1

The name used to be different but I was able to rename it. Any process accessing this directory stalls and can't be killed. The system used to panic on accessing it (NULL pointer dereference; I have no stack trace any longer), but that was fixed just before 9-CURRENT.
It seems that the kernel buffer containing the structures related to this directory was overwritten with random data and then written to disk (so that the checksums are fine but the data is incorrect).

2. /test

The directory looks empty, but can't be deleted.

/ # ls -Al /test
total 0
/ # rm -rf /test
rm: /test: Directory not empty

Most likely this is because I managed to create a file/directory with a name longer than MAXNAMLEN (255 bytes) in this directory. I was running the tools/regression/fstest suite to test a stacked filesystem that performed manipulations on file names. Files can be added/deleted in this directory.

It seems both these issues could be fixed by marking the appropriate device blocks bad, running scrub, and marking the blocks good again, or simply by trashing the appropriate checksums so that scrub can fix them. But I can't find the block offsets nor a way to change the checksums.
/ # uname -a FreeBSD tops 9.0-CURRENT FreeBSD 9.0-CURRENT #16 r198029+762c399-dirty: Wed Oct 28 16:11:49 EET 2009 root@tops:/usr/obj/usr/freebsd-src/local/sys/TOPS amd64 / # zpool status pool: tank state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM tank ONLINE 0 0 0 ada0s3d ONLINE 0 0 0 errors: No known data errors / # zfs list NAME USED AVAIL REFER MOUNTPOINT tank 116G 8,39G 18,2G legacy tank/home 87,5G 8,39G 62,7G /home tank/local 4,45G 8,39G 4,45G /usr/local tank/ports 5,48G 8,39G 5,48G /usr/ports / # zdb -C tank version=13 name='tank' state=0 txg=298580 pool_guid=12986731317200074631 hostid=1869410071 hostname='tops' vdev_tree type='root' id=0 guid=12986731317200074631 children[0] type='disk' id=0 guid=11828906155092156003 path='/dev/ad0s3d' whole_disk=0 metaslab_array=23 metaslab_shift=30 ashift=9 asize=135652442112 is_log=0 DTL=153 From gtodd at bellanet.org Tue Nov 10 15:02:22 2009 From: gtodd at bellanet.org (Graham Todd) Date: Tue Nov 10 15:03:01 2009 Subject: trace for zfs panic mounting fs after crash with RC2 In-Reply-To: <20091109101255.e81774e4.gerrit@pmp.uni-hannover.de> References: <20091106094734.4b056899.gerrit@pmp.uni-hannover.de> <4AF4123A.4080301@andric.com> <20091106231440.4f0f2cbb.gerrit@pmp.uni-hannover.de> <4AF4AAFF.2080104@jrv.org> <20091109101255.e81774e4.gerrit@pmp.uni-hannover.de> Message-ID: <4AF98032.9050808@bellanet.org> Gerrit K?hn wrote: > On Fri, 06 Nov 2009 17:02:23 -0600 "James R. Van Artsdalen" > wrote about Re: trace for zfs panic mounting > fs after crash with RC2: > > JRVA> How the ZIL got corrupted - if it did - is a harder question. > > I think it is. Otherwise zfs would not crash while trying to replay the > ZIL, wouldn't it? > It seems that this happens rather easily with the system I have at hand > (it happend twice to me so far - and I crashed the system only twice, > that makes 100%, although I doubt that it is that reproducible). 
Searching
> around I found some reports of the same or similar issues (but no
> solution). So apart from recovering my fs (I did not try your suggested
> patch yet), there are two things I regard as very important:
>
> 1. Find out why the ZIL gets corrupted under some circumstances.
> 2. Find a safe way to recover a fs with a corrupted ZIL.
>
> I guess I could live with a corrupted ZIL after a crash, if there was some
> kind of --ignore-zil switch to get my data back online. In any case, zfs
> should not panic on corrupted ZIL data, should it?

Is there a way to "manually" use zdb to mimic the "zpool clear" command introduced in OpenSolaris's ZFS with PSARC-2009479?

http://www.c0t0d0s0.org/archives/6067-PSARC-2009479-zpool-recovery-a.html

I have no idea if this would help: in fact it might very well be dangerous for the pool that Gerrit is trying to recover. Are you able to copy the pool somehow before trying experiments?

I think the current state of "disaster recovery" tools and methods for ZFS makes some folks nervous. With so much error checking "built in" there are fewer tried and true "old school" sysadmin approaches to recovering lost data after the fact. So thanks for debugging your problem in public. I hope you can resolve things and document how you did it for everyone.

Good luck.
From gerrit at pmp.uni-hannover.de Tue Nov 10 15:46:27 2009 From: gerrit at pmp.uni-hannover.de (Gerrit =?ISO-8859-1?Q?K=FChn?=) Date: Tue Nov 10 15:46:34 2009 Subject: trace for zfs panic mounting fs after crash with RC2 In-Reply-To: <4AF98032.9050808@bellanet.org> References: <20091106094734.4b056899.gerrit@pmp.uni-hannover.de> <4AF4123A.4080301@andric.com> <20091106231440.4f0f2cbb.gerrit@pmp.uni-hannover.de> <4AF4AAFF.2080104@jrv.org> <20091109101255.e81774e4.gerrit@pmp.uni-hannover.de> <4AF98032.9050808@bellanet.org> Message-ID: <20091110164622.6bc7aca1.gerrit@pmp.uni-hannover.de> On Tue, 10 Nov 2009 10:01:06 -0500 Graham Todd wrote about Re: trace for zfs panic mounting fs after crash with RC2: GT> > I guess I could live with a corrupted ZIL after a crash, if there GT> > was some kind of --ignore-zil switch to get my data back online. In GT> > any case, zfs should not panic on corrupted ZIL data, should it? GT> GT> Is there is a way to "manually" use zdb to mimic the "zpool clear" GT> command introduced in OpenSolaris's ZFS with PSARC-2009479? GT> GT> http://www.c0t0d0s0.org/archives/6067-PSARC-2009479-zpool-recovery-a.html FYI: Meanwhile I opened a PR for the issue (kern/140433) and got some request for additional zdb input (that I will hopefully be able to provide later this evening). The page above looks interesting, though. There it is mentioned (in the comments) that you can achieve the same thing zpool clear does... but it is not mentioned how. Does anyone here know? GT> I have no idea if this would help: in fact it might very well be GT> dangerous for the pool that Gerrit is trying to recover. Are you able GT> to copy the pool somehow before trying experiments? I do not care that much about this specific pool, since I only installed the system and some software. But I want to know I can handle this situation before I put data on the disks. :-) GT> I think the current state of "disaster recovery" tools and methods for GT> ZFS makes some folks nervous. 
With so much error checking "built in"
GT> there's fewer tried and true "old school" sysadmin approaches to
GT> recovering lost data after the fact.

As long as these situations do not happen, it's ok for me to have no way to recover. :-)
I have been using zfs since Pawel made the first patchset available in autumn 2006 and have never had to face a situation like this so far. As it has happened twice in a row now on this new machine, I guess it must have something to do with the hardware. OTOH, everything seems to run fine, unless the machine crashes and corrupts the zil in some strange way.

GT> So thanks for debugging your
GT> problem in public. I hope you can resolve things and document how you
GT> did it for everyone.

I hope we get this resolved, too. As long as I do not have to fear losing important data, I can do almost anything with the machine for debugging.

cu
Gerrit

From fbsdlist at src.cx Tue Nov 10 16:17:56 2009
From: fbsdlist at src.cx (Artem Belevich)
Date: Tue Nov 10 16:18:03 2009
Subject: trace for zfs panic mounting fs after crash with RC2
In-Reply-To: <20091110164622.6bc7aca1.gerrit@pmp.uni-hannover.de>
References: <20091106094734.4b056899.gerrit@pmp.uni-hannover.de>
 <4AF4123A.4080301@andric.com>
 <20091106231440.4f0f2cbb.gerrit@pmp.uni-hannover.de>
 <4AF4AAFF.2080104@jrv.org>
 <20091109101255.e81774e4.gerrit@pmp.uni-hannover.de>
 <4AF98032.9050808@bellanet.org>
 <20091110164622.6bc7aca1.gerrit@pmp.uni-hannover.de>
Message-ID:

> GT> http://www.c0t0d0s0.org/archives/6067-PSARC-2009479-zpool-recovery-a.html
>
> The page above looks interesting, though. There it is mentioned (in the
> comments) that you can achieve the same thing zpool clear does... but it
> is not mentioned how. Does anyone here know?
Perhaps some of the links on the following post on zfs-discuss may help: http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg26704.html Another option would be to boot from OpenSolaris LiveCD that contains latest zfs changes, import your pool there, fix, export and then re-import it on FreeBSD. Make sure you don't upgrade your pool while running OpenSolaris. --Artem From gerrit at weinberg2.de Tue Nov 10 19:00:11 2009 From: gerrit at weinberg2.de (Gerrit =?ISO-8859-1?Q?K=FChn?=) Date: Tue Nov 10 19:00:18 2009 Subject: kern/140433: [zfs] [panic] panic while replaying ZIL after crash Message-ID: <200911101900.nAAJ0BT8068841@freefall.freebsd.org> The following reply was made to PR kern/140433; it has been noted by GNATS. From: Gerrit =?ISO-8859-1?Q?K=FChn?= To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/140433: [zfs] [panic] panic while replaying ZIL after crash Date: Tue, 10 Nov 2009 19:40:42 +0100 Output of "zdb -C" as requested: tank version=13 name='tank' state=0 txg=32618 pool_guid=17523106262699816181 hostname='' vdev_tree type='root' id=0 guid=17523106262699816181 children[0] type='raidz' id=0 guid=2668789775933362751 nparity=1 metaslab_array=14 metaslab_shift=33 ashift=9 asize=1600334594048 is_log=0 children[0] type='disk' id=0 guid=4872680480919708890 path='/dev/label/disk0' whole_disk=0 DTL=63 children[1] type='disk' id=1 guid=14727435584907659484 path='/dev/label/disk1' whole_disk=0 DTL=60 children[2] type='disk' id=2 guid=1501397252321623055 path='/dev/label/disk2' whole_disk=0 DTL=62 children[3] type='disk' id=3 guid=15105917771654568537 path='/dev/label/disk3' whole_disk=0 DTL=61 From jh at FreeBSD.org Tue Nov 10 19:19:16 2009 From: jh at FreeBSD.org (jh@FreeBSD.org) Date: Tue Nov 10 19:19:28 2009 Subject: kern/89991: [ufs] softupdates with mount -ur causes fs UNREFS Message-ID: <200911101919.nAAJJGRv086185@freefall.freebsd.org> Synopsis: [ufs] softupdates with mount -ur causes fs UNREFS State-Changed-From-To: open->closed State-Changed-By: jh 
State-Changed-When: Tue Nov 10 19:16:50 UTC 2009
State-Changed-Why:
Probably fixed by r183074.

http://www.freebsd.org/cgi/query-pr.cgi?pr=89991

From pjd at FreeBSD.org Tue Nov 10 22:45:32 2009
From: pjd at FreeBSD.org (Pawel Jakub Dawidek)
Date: Tue Nov 10 22:45:45 2009
Subject: HEADS UP: Important bug fix in ZFS replay code!
In-Reply-To: <200911102227.nAAMRXTf073603@svn.freebsd.org>
References: <200911102227.nAAMRXTf073603@svn.freebsd.org>
Message-ID: <20091110224524.GC3194@garage.freebsd.pl>

Hi.

There was an important bug in the ZFS replay code. If there were setattr logs (not related to permission changes) in the ZIL during an unclean shutdown, one can end up with files that have their mode set to 07777.

This is very dangerous, especially if you have untrusted local users, as this will set the setuid bit on such files. Note that FreeBSD will remove setuid bits when someone tries to modify the file, but it is still dangerous.

You can locate such files with the following command:

# find / -perm -7777 -print0 | xargs -0 ls -ld

You can locate and fix such files with the following command:

# find / -perm -7777 -print0 | xargs -0 chmod a-s,o-w,-t

On Tue, Nov 10, 2009 at 10:27:33PM +0000, Pawel Jakub Dawidek wrote:
> Author: pjd
> Date: Tue Nov 10 22:27:33 2009
> New Revision: 199157
> URL: http://svn.freebsd.org/changeset/base/199157
>
> Log:
> Be careful which vattr fields are set during setattr replay.
> Without this fix strange things can appear after unclean shutdown like > files with mode set to 07777. > > Reported by: des > MFC after: 3 days > > Modified: > head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_replay.c > > Modified: head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_replay.c > ============================================================================== > --- head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_replay.c Tue Nov 10 22:25:46 2009 (r199156) > +++ head/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_replay.c Tue Nov 10 22:27:33 2009 (r199157) > @@ -60,10 +60,14 @@ zfs_init_vattr(vattr_t *vap, uint64_t ma > { > VATTR_NULL(vap); > vap->va_mask = (uint_t)mask; > - vap->va_type = IFTOVT(mode); > - vap->va_mode = mode & MODEMASK; > - vap->va_uid = (uid_t)(IS_EPHEMERAL(uid)) ? -1 : uid; > - vap->va_gid = (gid_t)(IS_EPHEMERAL(gid)) ? -1 : gid; > + if (mask & AT_TYPE) > + vap->va_type = IFTOVT(mode); > + if (mask & AT_MODE) > + vap->va_mode = mode & MODEMASK; > + if (mask & AT_UID) > + vap->va_uid = (uid_t)(IS_EPHEMERAL(uid)) ? -1 : uid; > + if (mask & AT_GID) > + vap->va_gid = (gid_t)(IS_EPHEMERAL(gid)) ? -1 : gid; > vap->va_rdev = zfs_cmpldev(rdev); > vap->va_nodeid = nodeid; > } -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20091110/9ffc89b7/attachment.pgp From des at des.no Wed Nov 11 20:32:49 2009 From: des at des.no (=?utf-8?Q?Dag-Erling_Sm=C3=B8rgrav?=) Date: Wed Nov 11 20:32:55 2009 Subject: HEADS UP: Important bug fix in ZFS replay code! 
In-Reply-To: <20091110224524.GC3194@garage.freebsd.pl> (Pawel Jakub Dawidek's message of "Tue, 10 Nov 2009 23:45:24 +0100")
References: <200911102227.nAAMRXTf073603@svn.freebsd.org>
 <20091110224524.GC3194@garage.freebsd.pl>
Message-ID: <86k4xwom2v.fsf@ds4.des.no>

Pawel Jakub Dawidek writes:
> You can locate such files with the following command:
>
> # find / -perm -7777 -print0 | xargs -0 ls -ld

or 'grep rws /var/run/setuid.today' :)

DES
--
Dag-Erling Smørgrav - des@des.no

From kickbsd at ya.ru Wed Nov 11 21:30:09 2009
From: kickbsd at ya.ru (kickbsd kickbsd)
Date: Wed Nov 11 21:30:17 2009
Subject: kern/139715: [zfs] vfs.numvnodes leak on busy zfs
Message-ID: <200911112130.nABLU9VR074548@freefall.freebsd.org>

The following reply was made to PR kern/139715; it has been noted by GNATS.

From: kickbsd kickbsd
To: bug-followup@freebsd.org
Cc:
Subject: Re: kern/139715: [zfs] vfs.numvnodes leak on busy zfs
Date: Thu, 12 Nov 2009 00:16:04 +0300

Same issue observed on RC3

[root@testzfs /tmp]# sysctl vfs.numvnodes ; i=1 ; while [ $i -le 10000 ] ; do echo "sdfsdfsdf" > `mktemp -t ABC` ; i=$(($i+1)) ; done ; sysctl vfs.numvnodes
vfs.numvnodes: 860
vfs.numvnodes: 10861
[root@testzfs /tmp]# sysctl vfs.numvnodes ; i=1 ; while [ $i -le 10000 ] ; do echo "sdfsdfsdf" > `mktemp -t ABC` ; i=$(($i+1)) ; done ; sysctl vfs.numvnodes
vfs.numvnodes: 10863
vfs.numvnodes: 20863

From tamgya at gmail.com Thu Nov 12 02:44:48 2009
From: tamgya at gmail.com (Denise H. G.)
Date: Thu Nov 12 02:44:55 2009
Subject: Is 'zfs prefetch' dangerous?CC:
Message-ID: <87vdhglcer.fsf@mecuria.xbsd.name>

Hi list.

I've been wondering if I should turn on prefetch for ZFS, but after a search on Google, I found there seem to be some issues with ZFS prefetch. Since I have 4GB of RAM, maybe it is a good idea to turn it on. My machine is mainly for home use and has no important data on it. If I turn it on, will my machine run smoothly?

Thanks, regards.
-- tamgya |aT| GmAiL |DoT| cOm From gerrit at pmp.uni-hannover.de Thu Nov 12 08:26:36 2009 From: gerrit at pmp.uni-hannover.de (Gerrit =?ISO-8859-1?Q?K=FChn?=) Date: Thu Nov 12 08:26:43 2009 Subject: trace for zfs panic mounting fs after crash with RC2 In-Reply-To: References: <20091106094734.4b056899.gerrit@pmp.uni-hannover.de> <4AF4123A.4080301@andric.com> <20091106231440.4f0f2cbb.gerrit@pmp.uni-hannover.de> <4AF4AAFF.2080104@jrv.org> <20091109101255.e81774e4.gerrit@pmp.uni-hannover.de> <4AF98032.9050808@bellanet.org> <20091110164622.6bc7aca1.gerrit@pmp.uni-hannover.de> Message-ID: <20091112092630.e7cd6836.gerrit@pmp.uni-hannover.de> On Tue, 10 Nov 2009 08:17:55 -0800 Artem Belevich wrote about Re: trace for zfs panic mounting fs after crash with RC2: AB> Perhaps some of the links on the following post on zfs-discuss may AB> help: AB> http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg26704.html Interesting stuff, thanks. At a first glance I do not see an easy way to roll back my pool to a slightly previous (consistent) state, but all the posts state that it is possible. I guess I have to dive into this a bit deeper. "zpool clear -F" definitely would be the easier-to-use solution. AB> Another option would be to boot from OpenSolaris LiveCD that contains AB> latest zfs changes, import your pool there, fix, export and then AB> re-import it on FreeBSD. Make sure you don't upgrade your pool while AB> running OpenSolaris. Uh, yes, not really an option in this case, I guess. Unless I buy an additional external CD drive and stuff. But thanks for the hint, anyway. I will have a look around how difficult it is to get recent OpenSolaris on a USB stick... 

cu
  Gerrit

From gerrit at pmp.uni-hannover.de Thu Nov 12 12:06:18 2009
From: gerrit at pmp.uni-hannover.de (Gerrit Kühn)
Date: Thu Nov 12 12:06:25 2009
Subject: trace for zfs panic mounting fs after crash with RC2
In-Reply-To: <4AF4AAFF.2080104@jrv.org>
References: <20091106094734.4b056899.gerrit@pmp.uni-hannover.de> <4AF4123A.4080301@andric.com> <20091106231440.4f0f2cbb.gerrit@pmp.uni-hannover.de> <4AF4AAFF.2080104@jrv.org>
Message-ID: <20091112130615.64b44914.gerrit@pmp.uni-hannover.de>

On Fri, 06 Nov 2009 17:02:23 -0600 "James R. Van Artsdalen" wrote about
Re: trace for zfs panic mounting fs after crash with RC2:

JRVA> How the ZIL got corrupted - if it did - is a harder question. What
JRVA> kind of hard disk is this, and how is it connected to the system?
JRVA> Was there any redundancy (mirror, raidz)?

I have been thinking about this for some time now. I have almost the
same controller (low-profile version, different bios, but otherwise
identical) in use without these problems. Can the 2.5" disks cause any
problems? The problematic system is the only one I have with the small
drives. Maybe they somehow "lie" to the system about the data actually
being written?

I remember that a long time ago (about 10 years?) FreeBSD people
suggested turning off the write cache of disk drives to prevent data
loss. I see that the sysctl hw.ata.wc is still there. Do people here
think that this is worth a try? Are there any recent experiences
concerning the performance impact on zfs when turning off wc?

cu
  Gerrit

From matthias.andree at gmx.de Thu Nov 12 14:42:50 2009
From: matthias.andree at gmx.de (Matthias Andree)
Date: Thu Nov 12 14:42:56 2009
Subject: HEADS UP: Important bug fix in ZFS replay code!
In-Reply-To: <20091110224524.GC3194@garage.freebsd.pl>
References: <200911102227.nAAMRXTf073603@svn.freebsd.org> <20091110224524.GC3194@garage.freebsd.pl>
Message-ID:

On 10.11.2009 at 23:45, Pawel Jakub Dawidek wrote:

> You can locate such files with the following command:
>
> # find / -perm -7777 -print0 | xargs -0 ls -ld

Use ls -ldb to be on the safe side (control characters!).

So how about these refinements:

find / -perm -7777 -exec ls -ldb '{}' +
find / -perm -7777 -ls     (not sure what that does with escapes)

> You can locate and fix such files with the following command:
>
> # find / -perm -7777 -print0 | xargs -0 chmod a-s,o-w,-t

find / -perm -7777 -exec chmod a-s,o-w,-t '{}' +

-- 
Matthias Andree

From tevans.uk at googlemail.com Thu Nov 12 14:44:34 2009
From: tevans.uk at googlemail.com (Tom Evans)
Date: Thu Nov 12 14:44:41 2009
Subject: HEADS UP: Important bug fix in ZFS replay code!
In-Reply-To:
References: <200911102227.nAAMRXTf073603@svn.freebsd.org> <20091110224524.GC3194@garage.freebsd.pl>
Message-ID: <2e027be00911120623v2019be2euc48a6f0ec9a049a6@mail.gmail.com>

On Thu, Nov 12, 2009 at 2:16 PM, Matthias Andree wrote:

> On 10.11.2009 at 23:45, Pawel Jakub Dawidek wrote:
>
>> You can locate such files with the following command:
>>
>> # find / -perm -7777 -print0 | xargs -0 ls -ld
>>
>
> Use ls -ldb to be on the safe side (control characters!).
>
> So how about these refinements:
>
> find / -perm -7777 -exec ls -ldb '{}' +
> find / -perm -7777 -ls     (not sure what that does with escapes)
>
>> You can locate and fix such files with the following command:
>>
>> # find / -perm -7777 -print0 | xargs -0 chmod a-s,o-w,-t
>>
>
> find / -perm -7777 -exec chmod a-s,o-w,-t '{}' +
>
> --
> Matthias Andree
>

-exec causes a fork()/exec() for each file found doesn't it?
xargs would be more efficient (since we're bikeshedding :)

Cheers

Tom

From gerrit at pmp.uni-hannover.de Thu Nov 12 17:24:18 2009
From: gerrit at pmp.uni-hannover.de (Gerrit Kühn)
Date: Thu Nov 12 17:24:24 2009
Subject: nfsv4 FreeBSD server vs. Linux client I/O error
Message-ID: <20091112182414.cebec1df.gerrit@pmp.uni-hannover.de>

Hi all,

is this the right place to talk about nfsv4 issues, or should this go
to -net (or even somewhere else)?

Anyways, I'll start here and now:
I have a FreeBSD8-RC2 server which I set up for nfsv4 serving according
to nfsv4(4). I have a Linux client (Kernel 2.6.25) trying to do
"mount.nfs4 / /mnt -v". This takes *exactly* 30s (looks like running
into some kind of timeout). After that, the mount is there. It is
displayed by "mount" and gives the right sizes with "du -h".
However, as soon as I try to access /mnt with "cd /mnt" or "ls /mnt" I
get an Input/Output Error on the client side.

Googling around, I found very little information about nfsv4 and
FreeBSD (ok, it's a new feature after all :-). Does anyone here already
have some experiences to share? Anyone already using FreeBSD servers
against Linux clients? Any suggestions on how to debug and solve the
problem above?

Any hints are, as always, greatly appreciated.

cu
  Gerrit

From matthias.andree at gmx.de Thu Nov 12 17:53:28 2009
From: matthias.andree at gmx.de (Matthias Andree)
Date: Thu Nov 12 17:53:36 2009
Subject: HEADS UP: Important bug fix in ZFS replay code!
In-Reply-To: <2e027be00911120623v2019be2euc48a6f0ec9a049a6@mail.gmail.com>
References: <200911102227.nAAMRXTf073603@svn.freebsd.org> <20091110224524.GC3194@garage.freebsd.pl> <2e027be00911120623v2019be2euc48a6f0ec9a049a6@mail.gmail.com>
Message-ID:

On 12.11.2009 at 15:23, Tom Evans wrote:

>> So how about these refinements:
>>
>> find / -perm -7777 -exec ls -ldb '{}' +
>> find / -perm -7777 -ls     (not sure what that does with escapes)
>>
>>> You can locate and fix such files with the following command:
>>>
>>> # find / -perm -7777 -print0 | xargs -0 chmod a-s,o-w,-t
>>>
>>
>> find / -perm -7777 -exec chmod a-s,o-w,-t '{}' +
>>
>> --
>> Matthias Andree
>>
>
> -exec causes a fork()/exec() for each file found doesn't it? xargs would be
> more efficient (since we're bikeshedding :)

That's the subtle difference between "+" and "\;" at the end if using
'{}' :)

-- 
Matthias Andree

From rmacklem at uoguelph.ca Thu Nov 12 19:37:25 2009
From: rmacklem at uoguelph.ca (Rick Macklem)
Date: Thu Nov 12 19:37:32 2009
Subject: nfsv4 FreeBSD server vs. Linux client I/O error
In-Reply-To: <20091112182414.cebec1df.gerrit@pmp.uni-hannover.de>
References: <20091112182414.cebec1df.gerrit@pmp.uni-hannover.de>
Message-ID:

On Thu, 12 Nov 2009, Gerrit Kühn wrote:

> Hi all,
>
> is this the right place to talk about nfsv4 issues, or does this better
> go to -net (or even somewhere else)?
>
> Anyways, I'll start here and now:
> I have a FreeBSD8-RC2 server which I set up for nfsv4 serving according
> to nfsv4(4). I have a Linux client (Kernel 2.6.25) trying to do
> "mount.nfs4 / /mnt -v". This takes *exactly* 30s. (looks like running into
> some kind of timeout). After that, the mount is there. It is displayed by
> "mount" and gives the right sizes with "du -h".
> However, as soon as I try to access /mnt with "cd /mnt" or "ls /mnt" I get
> an Input/Output Error on the client side.
>
A few things to check on the server:
- Did you add a "V4:" line to your /etc/exports and what did you set as
  the root path in it? If you used "V4: /" then the root file system
  would need to be exported by another line in /etc/exports for it to
  work.

- If you are only exporting another filesystem, let's say "/exports",
  then your mount command would have to look like:
    mount -t nfs4 /exports /mnt
  (assuming "V4: /" was used)
- If you used "V4: /exports", then "mount -t nfs4 :/ /mnt"
  would work and you would see /exports at /mnt.

Beyond something like the above, if you capture packets using
"tcpdump -s 0 -w host " on the server and then
email me "", I can take a look at it. (tcpdump doesn't know
diddly about NFSv4, but wireshark does and can handle tcpdump
captures.)

> Googling around I found only very little information about nfsv4 and FreeBSD
> (ok, it's a new feature after all :-). Does anyone here already have some
> experiences to share? Anyone already using FreeBSD servers against Linux
> clients? Any suggestions how to debug and solve the problem above?
>
I don't usually test against Linux, but I'll try a quick mount here,
in case something is obviously broken.

rick

From rmacklem at uoguelph.ca Thu Nov 12 22:10:16 2009
From: rmacklem at uoguelph.ca (Rick Macklem)
Date: Thu Nov 12 22:10:23 2009
Subject: nfsv4 FreeBSD server vs. Linux client I/O error
In-Reply-To: <20091112182414.cebec1df.gerrit@pmp.uni-hannover.de>
References: <20091112182414.cebec1df.gerrit@pmp.uni-hannover.de>
Message-ID:

On Thu, 12 Nov 2009, Gerrit Kühn wrote:

> Hi all,
>
> is this the right place to talk about nfsv4 issues, or does this better
> go to -net (or even somewhere else)?
>
> Anyways, I'll start here and now:
> I have a FreeBSD8-RC2 server which I set up for nfsv4 serving according
> to nfsv4(4). I have a Linux client (Kernel 2.6.25) trying to do
> "mount.nfs4 / /mnt -v". This takes *exactly* 30s. (looks like running into
> some kind of timeout). After that, the mount is there.
It is displayed by
> "mount" and gives the right sizes with "du -h".
> However, as soon as I try to access /mnt with "cd /mnt" or "ls /mnt" I get
> an Input/Output Error on the client side.
>
One more thing that came to mind. If your root fs is NFS mounted, it
can't be exported, so you have to use the version (assuming /exports
is a local on-disk file system that is exported):
V4: /export
and not
V4: /

and then the mount needs to look like:
mount -t nfs4 :/export /mnt

I tried a relatively recent Ubuntu client here and it seemed to mount
ok. (There are many variants of the mount utilities and patch versions
of the nfs4 client for Linux, so your mileage definitely may vary.)

I'll be happy to look at a tcpdump capture, if you get one, rick

From mattjreimer at gmail.com Fri Nov 13 01:12:09 2009
From: mattjreimer at gmail.com (Matt Reimer)
Date: Fri Nov 13 01:12:21 2009
Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"
In-Reply-To:
References: <4AD710D6.70404@buchlovice.org>
Message-ID:

2009/11/12 Matt Reimer :
>
> Radek,
>
> Try the attached patch (sponsored by VPOP Technologies). I found an
> overflow in /sys/cddl/boot/zfs/zfssubr.c:vdev_raidz_read() that was
> causing my 6x1TB raidz2 array to fail to boot.
>
> Apply the patch, build everything in /sys/boot, and then make sure you
> update both gptzfsboot and /boot/loader.

Oops, here's the patch.

Matt
-------------- next part --------------
A non-text attachment was scrubbed...
Name: zfssubr.c.patch
Type: application/octet-stream
Size: 415 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20091113/0d022807/zfssubr.c.obj

From mattjreimer at gmail.com Fri Nov 13 01:27:04 2009
From: mattjreimer at gmail.com (Matt Reimer)
Date: Fri Nov 13 01:27:10 2009
Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"
In-Reply-To: <4AD710D6.70404@buchlovice.org>
References: <4AD710D6.70404@buchlovice.org>
Message-ID:

2009/10/15 Radek Valášek :
> Hi,
>
> I want to ask if there is something new in adding support to
> gptzfsboot/zfsboot for reading gang-blocks?
>
> From Sun's docs:
>
> Gang blocks
>
> When there is not enough contiguous space to write a complete block, the ZIO
> pipeline will break the I/O up into smaller 'gang blocks' which can later be
> assembled transparently to appear as complete blocks.
>
> Everything works fine for me, until I rewrite kernel/world after system
> upgrade to latest one (releng_8). After this am I no longer able to boot
> from zfs raidz1 pool with following messages:
>
> > ZFS: i/o error - all block copies unavailable
> > ZFS: can't read MOS
> > ZFS: unexpected object set type lld
> > ZFS: unexpected object set type lld
> >
> > FreeBSD/i386 boot
> > Default: z:/boot/kernel/kernel
> > boot:
> > ZFS: unexpected object set type lld
> >
> > FreeBSD/i386 boot
> > Default: tank:/boot/kernel/kernel
> > boot:

Radek,

Try the attached patch (sponsored by VPOP Technologies). I found an
overflow in /sys/cddl/boot/zfs/zfssubr.c:vdev_raidz_read() that was
causing my 6x1TB raidz2 array to fail to boot.

Apply the patch, build everything in /sys/boot, and then make sure you
update both gptzfsboot and /boot/loader.

Robert, I'm guessing you couldn't replicate this because your array
was small enough not to result in block numbers overflowing an int.

The kernel source for the corresponding functionality is in
/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c:vdev_raidz_map_alloc().
There all these variables are uint64_t, but I think unnecessarily. I
tried changing the boot loader's vdev_raidz_read() variables to all
uint64_t but then gptzfsboot would reboot itself, likely due to a
stack overflow. The attached patch just changes a few variables that,
after a quick analysis, seemed likely to overflow.

If this looks good, would someone commit it?

Matt

From rnoland at FreeBSD.org Fri Nov 13 04:53:17 2009
From: rnoland at FreeBSD.org (Robert Noland)
Date: Fri Nov 13 04:54:02 2009
Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"
In-Reply-To:
References: <4AD710D6.70404@buchlovice.org>
Message-ID: <1258087983.2303.23.camel@balrog.2hip.net>

On Thu, 2009-11-12 at 16:54 -0800, Matt Reimer wrote:
> 2009/10/15 Radek Valášek :
> > Hi,
> >
> > I want to ask if there is something new in adding support to
> > gptzfsboot/zfsboot for reading gang-blocks?
> >
> > From Sun's docs:
> >
> > Gang blocks
> >
> > When there is not enough contiguous space to write a complete block, the ZIO
> > pipeline will break the I/O up into smaller 'gang blocks' which can later be
> > assembled transparently to appear as complete blocks.
> >
> > Everything works fine for me, until I rewrite kernel/world after system
> > upgrade to latest one (releng_8). After this am I no longer able to boot
> > from zfs raidz1 pool with following messages:
> >
> > > ZFS: i/o error - all block copies unavailable
> > > ZFS: can't read MOS
> > > ZFS: unexpected object set type lld
> > > ZFS: unexpected object set type lld
> > >
> > > FreeBSD/i386 boot
> > > Default: z:/boot/kernel/kernel
> > > boot:
> > > ZFS: unexpected object set type lld
> > >
> > > FreeBSD/i386 boot
> > > Default: tank:/boot/kernel/kernel
> > > boot:
>
> Radek,
>
> Try the attached patch (sponsored by VPOP Technologies).
I found an
> overflow in /sys/cddl/boot/zfs/zfssubr.c:vdev_raidz_read() that was
> causing my 6x1TB raidz2 array to fail to boot.
>
> Apply the patch, build everything in /sys/boot, and then make sure you
> update both gptzfsboot and /boot/loader.
>
> Robert, I'm guessing you couldn't replicate this because your array
> was small enough not to result in block numbers overflowing an int.

This is likely, all of my raidz tests were with vnode backed 1GB memory
disks. So my largest configuration was a 6 x 1GB raidz2.

> The kernel source for the corresponding functionality is in
> /sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c:vdev_raidz_map_alloc().
> There all these variables are uint64_t, but I think unnecessarily. I
> tried changing the boot loader's vdev_raidz_read() variables to all
> uint64_t but then gptzfsboot would reboot itself, likely due to a
> stack overflow. The attached patch just changes a few variables that,
> after a quick analysis, seemed likely to overflow.
>
> If this looks good, would someone commit it?

ps@ grabbed it up already, but I may handle the MFC for him. I have
some other minor fixups in my tree right now... like teaching printf to
handle %llx. Thanks for finding this... It's been really frustrating
that I couldn't produce a failing system.

robert.

> Matt
> _______________________________________________
> freebsd-current@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-current
> To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org"
-- 
Robert Noland
FreeBSD

From mattjreimer at gmail.com Fri Nov 13 06:15:08 2009
From: mattjreimer at gmail.com (Matt Reimer)
Date: Fri Nov 13 06:15:14 2009
Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"
In-Reply-To: <1258087983.2303.23.camel@balrog.2hip.net>
References: <4AD710D6.70404@buchlovice.org> <1258087983.2303.23.camel@balrog.2hip.net>
Message-ID:

2009/11/12 Robert Noland :
> On Thu, 2009-11-12 at 16:54 -0800, Matt Reimer wrote:
>> 2009/10/15 Radek Valášek :
>> > Everything works fine for me, until I rewrite kernel/world after system
>> > upgrade to latest one (releng_8). After this am I no longer able to boot
>> > from zfs raidz1 pool with following messages:
>> >
>> > > ZFS: i/o error - all block copies unavailable
>> > > ZFS: can't read MOS
>> > > ZFS: unexpected object set type lld
>> > > ZFS: unexpected object set type lld
>> > >
>> > > FreeBSD/i386 boot
>> > > Default: z:/boot/kernel/kernel
>> > > boot:
>> > > ZFS: unexpected object set type lld
>> > >
>> > > FreeBSD/i386 boot
>> > > Default: tank:/boot/kernel/kernel
>> > > boot:
>>
>> Radek,
>>
>> Try the attached patch (sponsored by VPOP Technologies). I found an
>> overflow in /sys/cddl/boot/zfs/zfssubr.c:vdev_raidz_read() that was
>> causing my 6x1TB raidz2 array to fail to boot.
...
>> The kernel source for the corresponding functionality is in
>> /sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c:vdev_raidz_map_alloc().
>> There all these variables are uint64_t, but I think unnecessarily. I
>> tried changing the boot loader's vdev_raidz_read() variables to all
>> uint64_t but then gptzfsboot would reboot itself, likely due to a
>> stack overflow.
The attached patch just changes a few variables that,
>> after a quick analysis, seemed likely to overflow.
>>
>> If this looks good, would someone commit it?
>
> ps@ grabbed it up already, but I may handle the MFC for him. I have
> some other minor fixups in my tree right now... like teaching printf to
> handle %llx. Thanks for finding this... It's been really frustrating
> that I couldn't produce a failing system.

Is it possible for this patch to get into 8.0-RELEASE, or is it too
late? I suppose it doesn't matter that much since the loader isn't
built with LOADER_ZFS_SUPPORT by default anyway, so folks are going to
have to compile it themselves.

Matt

From gerrit at pmp.uni-hannover.de Fri Nov 13 09:28:52 2009
From: gerrit at pmp.uni-hannover.de (Gerrit Kühn)
Date: Fri Nov 13 09:28:59 2009
Subject: nfsv4 FreeBSD server vs. Linux client I/O error
In-Reply-To:
References: <20091112182414.cebec1df.gerrit@pmp.uni-hannover.de>
Message-ID: <20091113102848.4a8298e8.gerrit@pmp.uni-hannover.de>

On Thu, 12 Nov 2009 14:45:04 -0500 (EST) Rick Macklem wrote about Re:
nfsv4 FreeBSD server vs. Linux client I/O error:

RM> A few things to check on the server:
RM> - Did you add a "V4:" line to your /etc/exports and what did you set as
RM>    the root path in it? If you used "V4: /" then the root file system
RM>    would need to be exported by another line in /etc/exports for it to
RM>    work.
RM>
RM> - If you are only exporting another filesystem, let's say "/exports",
RM>    then your mount command would have to look like:
RM>      mount -t nfs4 /exports /mnt
RM>    (assuming "V4: /" was used)

I think I do not yet understand the last part. How do I restrict the
export of... oh, I guess I see. If I put / in the exports list, this
will merely allow the full path to still be used on the client side,
but I still have to add the file systems that are actually to be
exported. I was wondering about the notes in the manpages about this.

RM> - If you used "V4: /exports", then "mount -t nfs4 :/ /mnt"
RM>    would work and you would see /exports at /mnt.

However, this is exactly the way I went. My exports line on my server
(cliff) looks like this:

V4: /tank -network 192.168.0.0 -mask 255.255.0.0

On the client I try to mount like this:

---
pt-ws1 ~ # time mount -t nfs4 cliff:/ /mnt

real    0m30.005s
user    0m0.000s
sys     0m0.002s
---

As you see, there is a timeout of about 30s involved. After that the nfs
appears to be there:

pt-ws1 ~ # mount
[...]
nfsd (rw)
cliff:/ on /mnt type nfs4 (rw,addr=192.168.33.96,clientaddr=192.168.32.3)

But it cannot be accessed:

pt-ws1 ~ # ls /mnt
ls: cannot open directory /mnt: Input/output error

Unmounting gives the same timeout, but afterwards the nfs mount is
indeed unmounted:

pt-ws1 ~ # time umount /mnt

real    0m30.002s
user    0m0.001s
sys     0m0.001s

RM> Beyond something like the above, if you capture packets using
RM> "tcpdump -s 0 -w host " on the server and then
RM> email me "", I can take a look at it. (tcpdump doesn't know
RM> diddly about NFSv4, but wireshark does and can handle tcpdump
RM> captures.)

I will do that, thanks.

cu
  Gerrit

From gerrit at pmp.uni-hannover.de Fri Nov 13 09:44:29 2009
From: gerrit at pmp.uni-hannover.de (Gerrit Kühn)
Date: Fri Nov 13 09:44:36 2009
Subject: nfsv4 FreeBSD server vs. Linux client I/O error
In-Reply-To:
References: <20091112182414.cebec1df.gerrit@pmp.uni-hannover.de>
Message-ID: <20091113104426.e8e85871.gerrit@pmp.uni-hannover.de>

On Thu, 12 Nov 2009 17:17:55 -0500 (EST) Rick Macklem wrote about Re:
nfsv4 FreeBSD server vs. Linux client I/O error:

RM> One more thing that came to mind. If your root fs is NFS mounted, it
RM> can't be exported, so you have to use the version (assuming /exports
RM> is a local on-disk file system that is exported):
RM> V4: /export
RM> and not
RM> V4: /

I guess this does not apply to my situation; both client and server have
their root fs on hard disks.

RM> and then the mount needs to look like:
RM> mount -t nfs4 :/export /mnt

I tried that, just to make sure:

pt-ws1 ~ # time mount -t nfs4 cliff:/tank /mnt
mount.nfs4: mounting cliff:/tank failed, reason given by server:
  No such file or directory

real    1m0.006s
user    0m0.002s
sys     0m0.002s

I guess this is consistent, although I do not know why I see the 30s
timeout interval twice here.

RM> I tried a relatively recent Ubuntu client here and it seemed to mount
RM> ok. (There are many variants of the mount utilities and patch versions
RM> of the nfs4 client for Linux, so your mileage definitely may vary.)

I'm using Sabayon 3.5 with kernel 2.6.25 here. I guess I could try and
update to the new Sabayon 5 for testing (if that might help).
The installed nfs packages are:

net-libs/libnfsidmap-0.20
net-fs/nfs4-acl-tools-0.3.2
net-fs/nfs-utils-1.1.2

RM> I'll be happy to look at a tcpdump capture, if you get one, rick

You should have received it by now. I'm looking forward to your answer.
Thanks again.

cu
  Gerrit

From gerrit at pmp.uni-hannover.de Fri Nov 13 10:49:11 2009
From: gerrit at pmp.uni-hannover.de (Gerrit Kühn)
Date: Fri Nov 13 10:49:17 2009
Subject: nfsv4 FreeBSD server vs. Linux client I/O error
In-Reply-To: <20091113104426.e8e85871.gerrit@pmp.uni-hannover.de>
References: <20091112182414.cebec1df.gerrit@pmp.uni-hannover.de> <20091113104426.e8e85871.gerrit@pmp.uni-hannover.de>
Message-ID: <20091113114907.cb04885e.gerrit@pmp.uni-hannover.de>

On Fri, 13 Nov 2009 10:44:26 +0100 Gerrit Kühn wrote about Re: nfsv4
FreeBSD server vs. Linux client I/O error:

GK> RM> I'll be happy to look at a tcpdump capture, if you get one, rick
GK>
GK> You should have received it by now. I'm looking forward to your answer.
GK> Thanks again.

One more thing from my side: I guess I fixed the timeout problem by
restarting all concerned services on the client side (namely rpc.idmapd
and remounting rpc_pipefs).
Now I get quick responses, but I still see the i/o error when trying
to access the mounted volume.

cu
  Gerrit

From rnoland at FreeBSD.org Fri Nov 13 14:26:05 2009
From: rnoland at FreeBSD.org (Robert Noland)
Date: Fri Nov 13 14:26:17 2009
Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"
In-Reply-To:
References: <4AD710D6.70404@buchlovice.org> <1258087983.2303.23.camel@balrog.2hip.net>
Message-ID: <1258122354.2303.24.camel@balrog.2hip.net>

On Thu, 2009-11-12 at 22:15 -0800, Matt Reimer wrote:
> 2009/11/12 Robert Noland :
> > On Thu, 2009-11-12 at 16:54 -0800, Matt Reimer wrote:
> >> 2009/10/15 Radek Valášek :
> >> > Everything works fine for me, until I rewrite kernel/world after system
> >> > upgrade to latest one (releng_8). After this am I no longer able to boot
> >> > from zfs raidz1 pool with following messages:
> >> >
> >> > > ZFS: i/o error - all block copies unavailable
> >> > > ZFS: can't read MOS
> >> > > ZFS: unexpected object set type lld
> >> > > ZFS: unexpected object set type lld
> >> > >
> >> > > FreeBSD/i386 boot
> >> > > Default: z:/boot/kernel/kernel
> >> > > boot:
> >> > > ZFS: unexpected object set type lld
> >> > >
> >> > > FreeBSD/i386 boot
> >> > > Default: tank:/boot/kernel/kernel
> >> > > boot:
> >>
> >> Radek,
> >>
> >> Try the attached patch (sponsored by VPOP Technologies). I found an
> >> overflow in /sys/cddl/boot/zfs/zfssubr.c:vdev_raidz_read() that was
> >> causing my 6x1TB raidz2 array to fail to boot.
> ...
> >> The kernel source for the corresponding functionality is in
> >> /sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c:vdev_raidz_map_alloc().
> >> There all these variables are uint64_t, but I think unnecessarily. I
> >> tried changing the boot loader's vdev_raidz_read() variables to all
> >> uint64_t but then gptzfsboot would reboot itself, likely due to a
> >> stack overflow.
The attached patch just changes a few variables that,
> >> after a quick analysis, seemed likely to overflow.
> >>
> >> If this looks good, would someone commit it?
> >
> > ps@ grabbed it up already, but I may handle the MFC for him. I have
> > some other minor fixups in my tree right now... like teaching printf to
> > handle %llx. Thanks for finding this... It's been really frustrating
> > that I couldn't produce a failing system.
>
> Is it possible for this patch to get into 8.0-RELEASE, or is it too
> late? I suppose it doesn't matter that much since the loader isn't
> built with LOADER_ZFS_SUPPORT by default anyway, so folks are going to
> have to compile it themselves.

I think we have missed the boat, but I'll talk to re@ and see if we can
get it in.

robert.

> Matt
-- 
Robert Noland
FreeBSD

From se at freebsd.org Fri Nov 13 20:49:01 2009
From: se at freebsd.org (Stefan Esser)
Date: Fri Nov 13 20:49:34 2009
Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"
In-Reply-To: <1258122354.2303.24.camel@balrog.2hip.net>
References: <4AD710D6.70404@buchlovice.org> <1258087983.2303.23.camel@balrog.2hip.net> <1258122354.2303.24.camel@balrog.2hip.net>
Message-ID: <4AFDBFFB.3070009@freebsd.org>

On 13.11.2009 15:25, Robert Noland wrote:
> I think we have missed the boat, but I'll talk to re@ and see if we
> can get it in.

The patch fixed GPT/ZFS booting for me, too. It would be good to have
it in 8.0 (since it is definitely required to boot from ZFS pools with
non-trivial sizes), and it does not affect anybody not trying to boot
this way. OTOH, since you cannot just install FreeBSD on pure ZFS
from sysinstall, it might be sufficient to prominently warn about this
problem and point at the required patch, to prevent foot-shooting.

But having this patch, which has been successfully tested by a number
of people who suffered from the GPT/ZFS boot problem, looks highly
preferable to me ...

Regards, STefan

From valin at buchlovice.org Sat Nov 14 09:05:24 2009
From: valin at buchlovice.org (Radek Valášek)
Date: Sat Nov 14 09:05:38 2009
Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"
In-Reply-To: <4AFDBFFB.3070009@freebsd.org>
References: <4AD710D6.70404@buchlovice.org> <1258087983.2303.23.camel@balrog.2hip.net> <1258122354.2303.24.camel@balrog.2hip.net> <4AFDBFFB.3070009@freebsd.org>
Message-ID: <4AFE72D1.5020902@buchlovice.org>

Stefan Esser wrote:
> On 13.11.2009 15:25, Robert Noland wrote:
>
>> I think we have missed the boat, but I'll talk to re@ and see if we
>> can get it in.
>>
>
> The patch fixed GPT/ZFS booting for me, too. It would be good to have
> it in 8.0 (since it is definitely required to boot from ZFS pools with
> non-trivial sizes), and does not affect anybody not trying to boot
> this way. OTOH, since you cannot just install FreeBSD on pure ZFS
> from sysinstall, it might be sufficient to prominently warn about this
> problem and point at the required patch, to prevent foot-shooting.
>
> But having this patch that has been successfully tested by a number
> of people that suffered from the GPT/ZFS boot problem looks highly
> preferable to me ...
>
> Regards, STefan
>
I can confirm that the patch is working for me too, and I'm now able to
boot from a raidz/raidz2 pool after rewriting loader.conf/loader/kernel.

I agree with Stefan, having it in 8.0-RELEASE would be good, catch the
boat :)

Big thanks to Matt, great work from all.
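
A rough arithmetic sketch of the size argument made in this thread: Matt's 6 x 1TB raidz2 array hit the overflow, while Robert's 6 x 1GB memory-disk array did not. This is only an illustration under stated assumptions, not the patch itself: it assumes 512-byte sectors, decimal terabytes, a 32-bit signed int in the boot code, and a block number that counts sectors across the whole array. None of those details are spelled out in the thread, and the helper names array_blocks and fits_int are made up for this example.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; not taken from zfssubr.c.
INT_MAX=2147483647   # largest value of a 32-bit signed int
SECTOR=512           # assumed sector size in bytes

# array_blocks DISKS BYTES_PER_DISK
# Prints the number of sectors needed to address every byte of the array.
array_blocks() {
    echo $(( $1 * $2 / SECTOR ))
}

# fits_int N
# Prints "yes" if N fits in a 32-bit signed int, otherwise "no".
fits_int() {
    if [ "$1" -le "$INT_MAX" ]; then
        echo yes
    else
        echo no
    fi
}

small=$(array_blocks 6 1073741824)      # 6 x 1GB memory disks
large=$(array_blocks 6 1000000000000)   # 6 x 1TB disks

echo "6 x 1GB: $small blocks, fits in int: $(fits_int "$small")"
echo "6 x 1TB: $large blocks, fits in int: $(fits_int "$large")"
```

Under these assumptions the small array needs about 12.6 million block numbers, far below the int limit, while the large one needs about 11.7 billion, which is why a test setup built from 1GB disks would not be expected to reproduce the failure.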

From alteriks at gmail.com Sat Nov 14 09:46:42 2009
From: alteriks at gmail.com (Krzysztof Dajka)
Date: Sat Nov 14 09:46:48 2009
Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"
In-Reply-To: <4AFDBFFB.3070009@freebsd.org>
References: <4AD710D6.70404@buchlovice.org> <1258087983.2303.23.camel@balrog.2hip.net> <1258122354.2303.24.camel@balrog.2hip.net> <4AFDBFFB.3070009@freebsd.org>
Message-ID: <684e57ec0911140115oa5d3c63xec7b2913847ce2c6@mail.gmail.com>

Thanks Matt for the patch. I used it with the 8.0RC3 release. I
installed FreeBSD under Linux (KVM); my 3x500GB drives were attached as
scsi drives. Installation went smoothly, but when I rebooted the FreeBSD
guest it hung as usual ;) with "ZFS: i/o error - all block copies
unavailable", and it also spat out some LBA errors for the first
time. I was a little disappointed, because I've been trying for three
weeks to replace my Debian system with its broken ext3 fs with FreeBSD
on raidz. But I thought to myself I'll give it a try, and run FreeBSD
natively. To my surprise it welcomed me with a login prompt. Once again
thanks for the patch. It would be a good idea to merge it into the
final release.

On 11/13/09, Stefan Esser wrote:
> On 13.11.2009 15:25, Robert Noland wrote:
>> I think we have missed the boat, but I'll talk to re@ and see if we
>> can get it in.
>
> The patch fixed GPT/ZFS booting for me, too. It would be good to have
> it in 8.0 (since it is definitely required to boot from ZFS pools with
> non-trivial sizes), and does not affect anybody not trying to boot
> this way. OTOH, since you cannot just install FreeBSD on pure ZFS
> from sysinstall, it might be sufficient to prominently warn about this
> problem and point at the required patch, to prevent foot-shooting.
>
> But having this patch that has been successfully tested by a number
> of people that suffered from the GPT/ZFS boot problem looks highly
> preferable to me ...
>
> Regards, STefan
>

From lopez.on.the.lists at yellowspace.net Sat Nov 14 16:40:27 2009
From: lopez.on.the.lists at yellowspace.net (Lorenzo Perone)
Date: Sat Nov 14 16:40:34 2009
Subject: gmirroring slices
Message-ID: <1A8F306A-8749-471B-94EA-FC8435A30C34@yellowspace.net>

Hello,

I was wondering if anyone could give me advice on how viable and
reliable it is to use gmirror on a slice of an MBR-style partitioned
disk, and use the second slice(s) within a zpool.

I remember a discussion here on where metadata is kept (always at the
end of the disk as opposed to the end of the given consumer?), so I
wasn't sure how good an idea this might be. The reason I'd like to have
it like this is that I had mixed-to-bad experiences in the effort of
using ZFS as a boot and root volume, so I'd rather keep a traditional
slice for booting/rooting, and a zpool for the production jails on that
machine.

The example would be

provider: mirror/gm0
consumers: ad6s1 and ad8s1

zpool mirror made out of
ad6s2 and ad8s2

While experimenting, I ran into the problem that gmirror label -v -b
round-robin gm0 ad6s1 got a permission denied (even with sysctl
kern.geom.debugflags=16/17). Any hints on what can cause this? (I might
have screwed up something with fdisk/bsdlabel, but after doublechecking
I wonder what it could be..)

Is a GPT partition table better for this (I got further with another
machine by using GPT partitions)?

Thanks for any advice.
Regards, Lorenzo From hugo at barafranca.com Sat Nov 14 17:21:11 2009 From: hugo at barafranca.com (Hugo Silva) Date: Sat Nov 14 17:21:18 2009 Subject: gmirroring slices In-Reply-To: <1A8F306A-8749-471B-94EA-FC8435A30C34@yellowspace.net> References: <1A8F306A-8749-471B-94EA-FC8435A30C34@yellowspace.net> Message-ID: <4AFEE2F2.6000609@barafranca.com> Lorenzo Perone wrote: > > Hello, > > I was wondering if anyone could give me an advice on how viable and > reliable it is, to use gmirror on a slice of an MBR-style partitioned > disk, and use the second slice(s) within a zpool. > > I remember a discussion here on where metadata is kept (always at the > end of the disk as opposed to the end of the given consumer?), so I > wasn't sure about how much of a good idea this might be. The reason > I'd like to have it like this is, that I had mixed bad experiences in > the effort of using ZFS as a boot and root volume, so I'd rather keep > a traditional slice for booting/rooting, and a zpool for the > production jails on that machine. > > The example would be > > provider: mirror/gm0 > consumers: ad6s1 and ad8s1 > > zpool mirror made out of > ad6s2 and ad8s2 > > while experimenting, I got into the problem that gmirror label -v -b > round-robin gm0 ad6s1 got a permission denied (even with sysctl > kern.geom.debugflags=16/17). Any hints on what can cause this (I might > have screwed up something with fdisk/bsdlabel, but after > doublechecking I wonder what it could be..) > > Is a GPT partition table better for this (I got further with another > machine by using GPT partitions)? > > Thanx for any advice. 
> > Regards, > Lorenzo > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" I setup an 8.0-RC3 server the other day just like this: Name Status Components mirror/gm0s1 COMPLETE ad4s1 ad6s1 pool: storage state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM storage ONLINE 0 0 0 mirror ONLINE 0 0 0 ad4s2 ONLINE 0 0 0 ad6s2 ONLINE 0 0 0 errors: No known data errors I would like to try setting this up with gpart too, but I had to get this server running asap and I knew I could make it work like this. From rnoland at FreeBSD.org Sat Nov 14 18:44:35 2009 From: rnoland at FreeBSD.org (Robert Noland) Date: Sat Nov 14 18:44:47 2009 Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable" In-Reply-To: <684e57ec0911140115oa5d3c63xec7b2913847ce2c6@mail.gmail.com> References: <4AD710D6.70404@buchlovice.org> <1258087983.2303.23.camel@balrog.2hip.net> <1258122354.2303.24.camel@balrog.2hip.net> <4AFDBFFB.3070009@freebsd.org> <684e57ec0911140115oa5d3c63xec7b2913847ce2c6@mail.gmail.com> Message-ID: <1258224261.2303.31.camel@balrog.2hip.net> On Sat, 2009-11-14 at 10:15 +0100, Krzysztof Dajka wrote: > Thanks Matt for the patch. I used it with 8.0RC3 release. I installed > FreeBSD under Linux (KVM) my 3x500GB drives were mounted as a scsi > drives. Installation went smoothly but when I rebooted FreeBSD guest > it hang as usual ;) with "ZFS: i/o error - all block copies > unavailable", well it also spit out some LBA errors for the first > time. I was a little disappointed, because I've been trying for three > weeks to replace my Debian system with broken ext3 fs with FreeBSD on > raidz. But I thought to myself I'll give it a try, and run FreeBSD > native. To my suprise it welcomed me with login prompt. Once again > thanks for the patch. 
It would be good idea to merge it with final > release. This was approved by re@ and has been merged to the release branch. It should be included in 8.0-RELEASE. robert. > On 11/13/09, Stefan Esser wrote: > > On 13.11.2009 15:25, Robert Noland wrote: > >> I think we have missed the boat, but I'll talk to re@ and see if we > >> can get it in. > > > > The patch fixed GPT/ZFS booting for me, too. It would be good to have > > it in 8.0 (since it is definitely required to boot from ZFS pools with > > non-trivial sizes), and does not affect anybody not trying to boot > > this way. OTOH, it since you cannot just install FreeBSD on pure ZFS > > from sysinstall, it might be sufficient to prominently warn about this > > problem and point at the required patch, to prevent foot-shooting. > > > > But having this patch that has been successfully tested by a number > > of people that suffered from the GPT/ZFS boot problem looks highly > > preferable to me ... > > > > Regards, STefan > > _______________________________________________ > > freebsd-current@freebsd.org mailing list > > http://lists.freebsd.org/mailman/listinfo/freebsd-current > > To unsubscribe, send any mail to "freebsd-current-unsubscribe@freebsd.org" > > -- Robert Noland FreeBSD From stb at lassitu.de Sat Nov 14 20:03:38 2009 From: stb at lassitu.de (Stefan Bethke) Date: Sat Nov 14 20:03:44 2009 Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable" In-Reply-To: References: <4AD710D6.70404@buchlovice.org> Message-ID: <97D7A06D-98EC-4702-9E3D-A7B85DB39A20@lassitu.de> Am 13.11.2009 um 01:54 schrieb Matt Reimer: > Try the attached patch (sponsored by VPOP Technologies). I found an > overflow in /sys/cddl/boot/zfs/zfssubr.c:vdev_raidz_read() that was > causing my 6x1TB raidz2 array to fail to boot. I can confirm as well that the patch (as committed to -current as r199241) makes my loader happy. Now I just need to figure out why the kernel won't mount root... 
Thanks, Stefan -- Stefan Bethke Fon +49 151 14070811 From stb at lassitu.de Sat Nov 14 20:36:22 2009 From: stb at lassitu.de (Stefan Bethke) Date: Sat Nov 14 20:36:28 2009 Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable" In-Reply-To: <97D7A06D-98EC-4702-9E3D-A7B85DB39A20@lassitu.de> References: <4AD710D6.70404@buchlovice.org> <97D7A06D-98EC-4702-9E3D-A7B85DB39A20@lassitu.de> Message-ID: <45EBBE61-7D43-4853-AC86-2FD42334808D@lassitu.de> Am 14.11.2009 um 21:03 schrieb Stefan Bethke: > Am 13.11.2009 um 01:54 schrieb Matt Reimer: > >> Try the attached patch (sponsored by VPOP Technologies). I found an >> overflow in /sys/cddl/boot/zfs/zfssubr.c:vdev_raidz_read() that was >> causing my 6x1TB raidz2 array to fail to boot. > > I can confirm as well that the patch (as committed to -current as r199241) makes my loader happy. Now I just need to figure out why the kernel won't mount root... I was trying to boot off a raw ZFS pool. When using GPT partitions, it works just fine. Stefan -- Stefan Bethke Fon +49 151 14070811 From 000.fbsd at quip.cz Sat Nov 14 20:40:18 2009 From: 000.fbsd at quip.cz (Miroslav Lachman) Date: Sat Nov 14 20:40:25 2009 Subject: gmirroring slices In-Reply-To: <1A8F306A-8749-471B-94EA-FC8435A30C34@yellowspace.net> References: <1A8F306A-8749-471B-94EA-FC8435A30C34@yellowspace.net> Message-ID: <4AFF15AE.4070902@quip.cz> Lorenzo Perone wrote: > > Hello, > > I was wondering if anyone could give me an advice on how viable and > reliable it is, to use gmirror on a slice of an MBR-style partitioned > disk, and use the second slice(s) within a zpool. > > I remember a discussion here on where metadata is kept (always at the > end of the disk as opposed to the end of the given consumer?), so I > wasn't sure about how much of a good idea this might be. I think metadata is stored at the end of the provider (slice in this case), but I am not a GEOM expert. 
> The reason I'd > like to have it like this is, that I had mixed bad experiences in the > effort of using ZFS as a boot and root volume, so I'd rather keep a > traditional slice for booting/rooting, and a zpool for the production > jails on that machine. > > The example would be > > provider: mirror/gm0 > consumers: ad6s1 and ad8s1 > > zpool mirror made out of > ad6s2 and ad8s2 I am running following setup for year without any configuration problems # gmirror status Name Status Components mirror/gms1 COMPLETE ad4s1 ad6s1 # zpool status pool: tank state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM tank ONLINE 0 0 0 mirror ONLINE 0 0 0 ad4s2 ONLINE 0 0 0 ad6s2 ONLINE 0 0 0 The first slice is 20GB partitioned as usual: # mount -t ufs /dev/mirror/gms1a on / (ufs, local) /dev/mirror/gms1e on /usr (ufs, local, soft-updates) /dev/mirror/gms1d on /var (ufs, local, nosuid, soft-updates) /dev/mirror/gms1f on /tmp (ufs, local, noexec, nosuid, soft-updates) The rest (450GB) is used in ZFS mirrored zpool for jails (each jail has its own filesystem) > while experimenting, I got into the problem that gmirror label -v -b > round-robin gm0 ad6s1 got a permission denied (even with sysctl > kern.geom.debugflags=16/17). Any hints on what can cause this (I might > have screwed up something with fdisk/bsdlabel, but after doublechecking > I wonder what it could be..) I did it in non-standard way - converting already installed system on one disk to mirrored. So when I was in system running off ad6 I created two slices on ad4, setup gmirror gms1 from first slice of ad4, create partitions, newfs, mount it and transfer files from running system by dump & restore, edit fstab. Then I rebooted system from gms1, destroy content of ad6, create slices on ad6 and insert first slice in to gms1. After this I had ad4s2 and ad6s2 ready for zpool. All was done remotely through ssh. 
Miroslav Lachman From bugmaster at FreeBSD.org Mon Nov 16 11:06:52 2009 From: bugmaster at FreeBSD.org (FreeBSD bugmaster) Date: Mon Nov 16 11:08:01 2009 Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org Message-ID: <200911161106.nAGB6pMW011148@freefall.freebsd.org> Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o kern/140433 fs [zfs] [panic] panic while replaying ZIL after crash o kern/140134 fs [msdosfs] write and fsck destroy filesystem integrity o kern/140068 fs [smbfs] [patch] smbfs does not allow semicolon in file o kern/139725 fs [zfs] zdb(1) dumps core on i386 when examining zpool c o kern/139715 fs [zfs] vfs.numvnodes leak on busy zfs o bin/139651 fs [nfs] mount(8): read-only remount of NFS volume does n o kern/139597 fs [patch] [tmpfs] tmpfs initializes va_gen but doesn't u o kern/139564 fs [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo o kern/139407 fs [smbfs] [panic] smb mount causes system crash if remot o kern/139363 fs [nfs] diskless root nfs mount from non FreeBSD server o kern/138790 fs [zfs] ZFS ceases caching when mem demand is high o kern/138524 fs [msdosfs] disks and usb flashes/cards with Russian lab o kern/138421 fs [ufs] [patch] remove UFS label limitations o kern/138367 fs [tmpfs] [panic] 'panic: Assertion pages > 0 failed' wh o kern/138202 fs mount_msdosfs(1) see only 2Gb o kern/138109 fs [extfs] [patch] Minor cleanups to the sys/gnu/fs/ext2f f kern/137037 fs [zfs] [hang] zfs rollback on root causes FreeBSD to fr o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o 
kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic o kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/135594 fs [zfs] Single dataset unresponsive with Samba o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot o kern/134491 fs [zfs] Hot spares are rather cold... o kern/133980 fs [panic] [ffs] panic: ffs_valloc: dup alloc o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis o kern/133614 fs [panic] panic: ffs_truncate: read-only filesystem o kern/133174 fs [msdosfs] [patch] msdosfs must support utf-encoded int f kern/133150 fs [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132597 fs [tmpfs] [panic] tmpfs-related panic while interrupting o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes o kern/131995 fs [nfs] Failure to mount NFSv4 server o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/130979 fs [smbfs] [panic] boot/kernel/smbfs.ko o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130229 fs [iconv] usermount fails on fs that need iconv o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS 
mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/129059 fs [zfs] [patch] ZFS bootloader whitelistable via WITHOUT f kern/128829 fs smbd(8) causes periodic panic on 7-RELEASE o kern/127659 fs [tmpfs] tmpfs memory leak o kern/127420 fs [gjournal] [panic] Journal overflow on gmirrored gjour o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS p kern/124621 fs [ext3] [patch] Cannot mount ext2fs partition f bin/124424 fs [zfs] zfs(8): zfs list -r shows strange snapshots' siz o kern/123939 fs [msdosfs] corrupts new files o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121779 fs [ufs] snapinfo(8) (and related tools?) 
only work for t o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8 o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha f kern/120991 fs [panic] [fs] [snapshot] System crashes when manipulati o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F f kern/119735 fs [zfs] geli + ZFS + samba starting on boot panics 7.0-B o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o bin/118249 fs mv(1): moving a directory changes its mtime o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o kern/116913 fs [ffs] [panic] ffs_blkfree: freeing free block p kern/116608 fs [msdosfs] [patch] msdosfs fails to check mount options o kern/116583 fs [ffs] [hang] System freezes for short time when using o kern/116170 fs [panic] Kernel panic when mounting /tmp o kern/115645 fs [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs 
[msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [iso9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna f kern/91568 fs [ufs] [panic] writing to UFS/softupdates DVD media in o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o kern/87859 fs [smbfs] System reboot while umount smbfs. 
o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o kern/85326 fs [smbfs] [panic] saving a file via samba to an overquot o kern/84589 fs [2TB] 5.4-STABLE unresponsive during background fsck 2 o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 135 problems total. From aaron at goflexitllc.com Mon Nov 16 13:35:37 2009 From: aaron at goflexitllc.com (Aaron Hurt) Date: Mon Nov 16 13:35:44 2009 Subject: gmirroring slices In-Reply-To: <4AFF15AE.4070902@quip.cz> References: <1A8F306A-8749-471B-94EA-FC8435A30C34@yellowspace.net> <4AFF15AE.4070902@quip.cz> Message-ID: <4B015520.7080109@goflexitllc.com> Miroslav Lachman wrote: > Lorenzo Perone wrote: >> >> Hello, >> >> I was wondering if anyone could give me an advice on how viable and >> reliable it is, to use gmirror on a slice of an MBR-style partitioned >> disk, and use the second slice(s) within a zpool. >> >> I remember a discussion here on where metadata is kept (always at the >> end of the disk as opposed to the end of the given consumer?), so I >> wasn't sure about how much of a good idea this might be. 
> > I think metadata is stored at the end of the provider (slice in this > case), but I am not a GEOM expert. > >> The reason I'd >> like to have it like this is, that I had mixed bad experiences in the >> effort of using ZFS as a boot and root volume, so I'd rather keep a >> traditional slice for booting/rooting, and a zpool for the production >> jails on that machine. >> >> The example would be >> >> provider: mirror/gm0 >> consumers: ad6s1 and ad8s1 >> >> zpool mirror made out of >> ad6s2 and ad8s2 > > I am running following setup for year without any configuration problems > > # gmirror status > Name Status Components > mirror/gms1 COMPLETE ad4s1 > ad6s1 > > # zpool status > pool: tank > state: ONLINE > scrub: none requested > config: > > NAME STATE READ WRITE CKSUM > tank ONLINE 0 0 0 > mirror ONLINE 0 0 0 > ad4s2 ONLINE 0 0 0 > ad6s2 ONLINE 0 0 0 > > The first slice is 20GB partitioned as usual: > # mount -t ufs > /dev/mirror/gms1a on / (ufs, local) > /dev/mirror/gms1e on /usr (ufs, local, soft-updates) > /dev/mirror/gms1d on /var (ufs, local, nosuid, soft-updates) > /dev/mirror/gms1f on /tmp (ufs, local, noexec, nosuid, soft-updates) > > The rest (450GB) is used in ZFS mirrored zpool for jails (each jail > has its own filesystem) > >> while experimenting, I got into the problem that gmirror label -v -b >> round-robin gm0 ad6s1 got a permission denied (even with sysctl >> kern.geom.debugflags=16/17). Any hints on what can cause this (I might >> have screwed up something with fdisk/bsdlabel, but after doublechecking >> I wonder what it could be..) > > I did it in non-standard way - converting already installed system on > one disk to mirrored. So when I was in system running off ad6 I > created two slices on ad4, setup gmirror gms1 from first slice of ad4, > create partitions, newfs, mount it and transfer files from running > system by dump & restore, edit fstab. 
Then I rebooted system from
> gms1, destroy content of ad6, create slices on ad6 and insert first
> slice in to gms1.
> After this I had ad4s2 and ad6s2 ready for zpool.
> All was done remotely through ssh.
>
> Miroslav Lachman

An example with gpart: this is how I now have all of my production
dedicated servers set up, running 8.0-RCx.

net1# gpart show
=>       34  312581741  ad6  GPT  (149G)
         34        128    1  freebsd-boot  (64K)
        162    8388608    2  freebsd-swap  (4.0G)
    8388770   10485760    3  freebsd-ufs  (5.0G)
   18874530  293707245    4  freebsd-zfs  (140G)

=>       34  312581741  ad16  GPT  (149G)
         34        128     1  freebsd-boot  (64K)
        162    8388608     2  freebsd-swap  (4.0G)
    8388770   10485760     3  freebsd-ufs  (5.0G)
   18874530  293707245     4  freebsd-zfs  (140G)

net1# gmirror status
       Name    Status  Components
mirror/boot  COMPLETE  ad6p1
                       ad16p1
mirror/swap  COMPLETE  ad6p2
                       ad16p2
mirror/root  COMPLETE  ad6p3
                       ad16p3

net1# zpool status
  pool: pool0
 state: ONLINE
 scrub: scrub completed after 0h5m with 0 errors on Wed Oct 14 12:45:40 2009
config:

        NAME        STATE     READ WRITE CKSUM
        pool0       ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            ad6p4   ONLINE       0     0     0
            ad16p4  ONLINE       0     0     0

errors: No known data errors

net1# mount -t ufs
/dev/mirror/root on / (ufs, local, soft-updates)

net1# mount -t zfs
pool0 on /pool0 (zfs, local)
pool0/tmp on /tmp (zfs, local, nosuid)
pool0/usr on /usr (zfs, local)
pool0/usr/home on /usr/home (zfs, local)
pool0/usr/hosting on /usr/hosting (zfs, local, noexec, nosuid)
pool0/usr/ports on /usr/ports (zfs, local, nosuid)
pool0/usr/ports/distfiles on /usr/ports/distfiles (zfs, local, noexec, nosuid)
pool0/usr/ports/packages on /usr/ports/packages (zfs, local, noexec, nosuid)
pool0/usr/src on /usr/src (zfs, local, noexec, nosuid)
pool0/var on /var (zfs, local)
pool0/var/crash on /var/crash (zfs, local, noexec, nosuid)
pool0/var/db on /var/db (zfs, local, noexec, nosuid)
pool0/var/db/pkg on /var/db/pkg (zfs, local, nosuid)
pool0/var/empty on /var/empty (zfs, local, noexec, nosuid, read-only)
pool0/var/log on /var/log (zfs, local, noexec, nosuid)
pool0/var/mail on /var/mail (zfs, local, noexec, nosuid)
pool0/var/qmail on /var/qmail (zfs, local)
pool0/var/run on /var/run (zfs, local, noexec, nosuid)
pool0/var/tmp on /var/tmp (zfs, local, nosuid)

It runs great and I haven't experienced any issues related to sharing
disks between UFS and ZFS using GPT partitioning.

--
Aaron Hurt
Managing Partner
Flex I.T., LLC
611 Commerce Street
Suite 3117
Nashville, TN 37203
Phone: 615.438.7101
E-mail: aaron@goflexitllc.com

From ambsd at raisa.eu.org Mon Nov 16 16:43:59 2009
From: ambsd at raisa.eu.org (Emil Smolenski)
Date: Mon Nov 16 16:44:11 2009
Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"
Message-ID:

After installkernel/installworld my machine stops booting with the
following error message:

ZFS: i/o error - all block copies unavailable
ZFS: can't read MOS
ZFS: unexpected object set type lld

FreeBSD/i386 boot
Default: pgpool:/boot/kernel/kernel
boot:
ZFS: unexpected object set type lld

This is 7.2-STABLE, amd64, zpool on a single logical device (ciss(4),
hardware RAID5), root on ZFS (using zfsboot). After the failure I booted
the server from an external device with UFS and then rolled back the
/usr and / datasets. The machine was still not bootable. A scrub
completed without errors.
Then I read this thread and applied Robert Noland's and Matt Reimer's
patches -- and they didn't help. Then I grabbed the following files from
-CURRENT (svn rev.
198420):

/sys/boot/i386/zfsboot/zfsboot.c
/sys/boot/zfs/zfs.c
/sys/boot/zfs/zfsimpl.c
/sys/cddl/boot/zfs/zfsimpl.h

and I did:

# cd /usr/src/sys/boot/
# make obj ; make depend ; make
# cd i386/loader
# make install
# cd /usr/src/sys/boot/i386/zfsboot
# make install
# sysctl kern.geom.debugflags=16
# dd if=/boot/zfsboot of=/dev/da0 count=1
# dd if=/boot/zfsboot of=/dev/da0 skip=1 seek=1024
# reboot

(is this procedure for updating zfsboot correct?)

After that, the error was slightly different (the printf was fixed):

ZFS: i/o error - all block copies unavailable
ZFS: can't read MOS
ZFS: unexpected object set type 0

FreeBSD/i386 boot
Default: pgpool:/boot/kernel/kernel
boot:
ZFS: unexpected object set type 0

Additional information:

# zpool list
NAME     SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
pgpool  4.06T  2.17T  1.89T  53%  ONLINE  -

# zpool status
  pool: pgpool
 state: ONLINE
 scrub: none requested
config:

        NAME    STATE     READ WRITE CKSUM
        pgpool  ONLINE       0     0     0
          da0   ONLINE       0     0     0

errors: No known data errors

# zfs list pgpool/ROOTFS
NAME           USED  AVAIL  REFER  MOUNTPOINT
pgpool/ROOTFS  568M  1.80T  55.3M  legacy

# zpool get all pgpool
NAME    PROPERTY       VALUE                SOURCE
pgpool  size           4.06T                -
pgpool  used           2.17T                -
pgpool  available      1.89T                -
pgpool  capacity       53%                  -
pgpool  altroot        -                    default
pgpool  health         ONLINE               -
pgpool  guid           3920915583055727184  -
pgpool  version        13                   default
pgpool  bootfs         pgpool/ROOTFS        local
pgpool  delegation     on                   default
pgpool  autoreplace    off                  default
pgpool  cachefile      -                    default
pgpool  failmode       wait                 default
pgpool  listsnapshots  off                  default

loader.conf:
usb_load="YES"
uplcom_load="YES"
umass_load="YES"
ugen_load="YES"
ukbd_load="YES"
random_load="YES"
loader_color="YES"
vfs.root.mountfrom="zfs:pgpool/ROOTFS"
zfs_load="YES"
autoboot_delay="2"

FreeBSD 7.2-STABLE #0: Fri Jun 19 13:27:29 CEST 2009
(as I mentioned above, there was the rollback)

ciss0: port 0xe800-0xe8ff mem 0xdef00000-0xdeffffff,0xdeeff000-0xdeefffff irq 35 at device 0.0 on pci4
ciss0: [ITHREAD]
da0 at ciss0 bus 0 target 0 lun 0

I would rather not upgrade the whole system to -CURRENT. What should I
do in this situation? Is there any other patch that I could apply, or any
workaround for this issue? Is it possible to switch from zfsboot to
gptzfsboot without losing data? Or maybe I did something wrong?

--
am

From rnoland at FreeBSD.org Mon Nov 16 16:59:54 2009
From: rnoland at FreeBSD.org (Robert Noland)
Date: Mon Nov 16 17:00:26 2009
Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"
In-Reply-To:
References:
Message-ID: <1258390784.2303.42.camel@balrog.2hip.net>

On Mon, 2009-11-16 at 17:26 +0100, Emil Smolenski wrote:
> After installkernel/installworld my machine stops booting with the
> following error message:
>
> ZFS: i/o error - all block copies unavailable
> ZFS: can't read MOS
> ZFS: unexpected object set type lld
>
> FreeBSD/i386 boot
> Default: pgpool:/boot/kernel/kernel
> boot:
> ZFS: unexpected object set type lld
>
> This is 7.2-STABLE, amd64, zpool on single logical device (ciss(4),
> hardware RAID5), root on ZFS (using zfsboot). After the failure I booted
> the server from an external device with UFS and then I did rollback of
> /usr and / datasets. The machine was still not bootable. Scrub went
> without errors.
> Then I read this thread and applied Robert Noland's and Matt Reimer's
> patches -- and they didn't help. Then I grabbed following files from
> -CURRENT (svn rev. 198420):

Matt's patch only affects raidz volumes.

> /sys/boot/i386/zfsboot/zfsboot.c
> /sys/boot/zfs/zfs.c
> /sys/boot/zfs/zfsimpl.c
> /sys/cddl/boot/zfs/zfsimpl.h
>
> and I did:
>
> # cd /usr/src/sys/boot/
> # make obj ; make depend ; make
> # cd i386/loader
> # make install
> # cd /usr/src/sys/boot/i386/zfsboot
> # make install
> # sysctl kern.geom.debugflags=16
> # dd if=/boot/zfsboot of=/dev/da0 count=1
> # dd if=/boot/zfsboot of=/dev/da0 skip=1 seek=1024
> # reboot
>
> (is this procedure of updating zfsboot correct?)

This should be correct for updating the first stage bootstrap code.
The loader (boot/loader) is actually updated during installworld.

> After that, an error was slightly different (printf was fixed):
>
> ZFS: i/o error - all block copies unavailable
> ZFS: can't read MOS
> ZFS: unexpected object set type 0

This has my patch applied, which fixes the printfs so that they work
correctly, among other things.

> FreeBSD/i386 boot
> Default: pgpool:/boot/kernel/kernel
> boot:
> ZFS: unexpected object set type 0
>
> Additional information:
>
> # zpool list
> NAME     SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
> pgpool  4.06T  2.17T  1.89T  53%  ONLINE  -
>
> # zpool status
>   pool: pgpool
>  state: ONLINE
>  scrub: none requested
> config:
>
>         NAME    STATE     READ WRITE CKSUM
>         pgpool  ONLINE       0     0     0
>           da0   ONLINE       0     0     0
>
> errors: No known data errors
>
> # zfs list pgpool/ROOTFS
> NAME           USED  AVAIL  REFER  MOUNTPOINT
> pgpool/ROOTFS  568M  1.80T  55.3M  legacy
>
> # zpool get all pgpool
> NAME    PROPERTY       VALUE                SOURCE
> pgpool  size           4.06T                -
> pgpool  used           2.17T                -
> pgpool  available      1.89T                -
> pgpool  capacity       53%                  -
> pgpool  altroot        -                    default
> pgpool  health         ONLINE               -
> pgpool  guid           3920915583055727184  -
> pgpool  version        13                   default
> pgpool  bootfs         pgpool/ROOTFS        local
> pgpool  delegation     on                   default
> pgpool  autoreplace    off                  default
> pgpool  cachefile      -                    default
> pgpool  failmode       wait                 default
> pgpool  listsnapshots  off                  default
>
> loader.conf:
> usb_load="YES"
> uplcom_load="YES"
> umass_load="YES"
> ugen_load="YES"
> ukbd_load="YES"
> random_load="YES"
> loader_color="YES"
> vfs.root.mountfrom="zfs:pgpool/ROOTFS"
> zfs_load="YES"
> autoboot_delay="2"
>
> FreeBSD 7.2-STABLE #0: Fri Jun 19 13:27:29 CEST 2009
> (as I mentioned above, there was the rollback)
>
> ciss0: port 0xe800-0xe8ff mem
> 0xdef00000-0xdeffffff,0xdeeff000-0xdeefffff irq 35 at device 0.0 on pci4
> ciss0: [ITHREAD]
> da0 at ciss0 bus 0 target 0 lun 0
>
> I would rather not to upgrade the whole system to -CURRENT. What should I
> do in this situation? Is there any other patch that I could apply or any
> workaround for this issue? Is there possibility to switch from zfsboot to
> gptzfsboot without losing data? Or maybe I did something wrong?

I don't think that you can switch to gptzfsboot, as that would require
repartitioning the device.

A little more context, though: was this working before? Or is this a new
install?

robert.

--
Robert Noland
FreeBSD

From ambsd at raisa.eu.org Mon Nov 16 18:33:33 2009
From: ambsd at raisa.eu.org (Emil Smolenski)
Date: Mon Nov 16 18:33:45 2009
Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"
In-Reply-To: <1258390784.2303.42.camel@balrog.2hip.net>
References: <1258390784.2303.42.camel@balrog.2hip.net>
Message-ID:

On Mon, 16 Nov 2009 17:59:44 +0100, Robert Noland wrote:

>> [...]
>> Then I read this thread and applied Robert Noland's and Matt Reimer's
>> patches -- and they didn't help. Then I grabbed following files from
>> -CURRENT (svn rev. 198420):

> Matt's patch only affects raidz volumes.

Oh, I see. Maybe there is a similar bug in ZFS on single-disk volumes as
well?

>> [cut: update procedure]
>> (is this procedure of updating zfsboot correct?)

> This should be correct for updating the first stage bootstrap code. The
> loader (boot/loader) is actually updated during installworld.

I'll try a full build/installworld tomorrow.

>> [...]
>> I would rather not to upgrade the whole system to -CURRENT. What should I
>> do in this situation? Is there any other patch that I could apply or any
>> workaround for this issue? Is there possibility to switch from zfsboot to
>> gptzfsboot without losing data? Or maybe I did something wrong?

> I don't think that you can switch to gptzfsboot as that would require
> repartitioning the device.

I thought so.

> A little more context though, was this working before? Or is this a new
> install?

This system has been running stably since June, but I have never done a
full installworld after the initial installation.
Now, after the installworld, the machine no longer boots. (Rollback did not
help.)

--
am
From ndenev at gmail.com Tue Nov 17 04:10:54 2009
From: ndenev at gmail.com (Nikolay Denev)
Date: Tue Nov 17 04:11:00 2009
Subject: ZFS resilver/replace changed vdev names from da(4) to gptid
Message-ID: <982FC0F3-0071-41E7-94A5-A49720B1771B@gmail.com>

Hello,

Something strange happened while resilvering a 6-disk raidz1 array with one
failed drive.

I initially put in the new disk and issued:

zpool replace tank da1p2

But the resilver process found unrecoverable errors in one snapshot, and
after resilvering for 7 hours it still showed both da1p2/old and the new
da1p2. Shortly after that, when I issued another zpool scrub command, the
machine livelocked.

The strange thing happened after I rebooted the machine and restarted the
scrub. This time ZFS picked up the new device not by its da(4) name but by
gptid. This pass also failed, and I was forced to destroy the snapshot
containing the unrecoverable errors and restart the scrub again.
This time it completed normally and the pool is now ONLINE, but even more
strangely, this time it replaced another vdev with its gptid, and this is
not the vdev that was being resilvered. The pool now looks like this:

  pool: tank
 state: ONLINE
 scrub: resilver completed after 7h18m with 0 errors on Tue Nov 17 00:16:20 2009
config:

        NAME                                            STATE     READ WRITE CKSUM
        tank                                            ONLINE       0     0     0
          raidz1                                        ONLINE       0     0     0
            da0p2                                       ONLINE       0     0     0  4.55G resilvered
            gptid/b8baba94-d068-11de-a6d5-003048c1b5fa  ONLINE       0     0     0  63.2G resilvered
            gptid/c00174b1-d068-11de-a6d5-003048c1b5fa  ONLINE       0     0     0  4.55G resilvered
            da3p2                                       ONLINE       0     0     0  4.21G resilvered
            da4p2                                       ONLINE       0     0     0  4.55G resilvered
            da5p2                                       ONLINE       0     0     0  4.21G resilvered

errors: No known data errors

P.S.: This also makes me wonder how I can safely make all of the other vdevs
use gptid, as I plan to replace the SATA controller with a new one that will
probably export the devices as ad(4) or ada(4).
--
Regards,
Nikolay Denev
From ndenev at gmail.com Tue Nov 17 09:00:54 2009
From: ndenev at gmail.com (Niki Denev)
Date: Tue Nov 17 09:01:01 2009
Subject: ZFS resilver/replace changed vdev names from da(4) to gptid
In-Reply-To: <982FC0F3-0071-41E7-94A5-A49720B1771B@gmail.com>
References: <982FC0F3-0071-41E7-94A5-A49720B1771B@gmail.com>
Message-ID: <2e77fc10911170100i7a867706mf6887759a963b248@mail.gmail.com>

On Tue, Nov 17, 2009 at 6:10 AM, Nikolay Denev wrote:
[snip]
> P.S.: This also makes me wonder how I can safely make all of the other vdevs use gptid, as I plan to replace
> the SATA controller with a new one that probably is going to export the devices as ad(4) or ada(4).

Reading the manual always helps :)
In this case a "zpool export" and then "zpool import -d /dev/gptid" should
do the trick.

> --
> Regards,
> Nikolay Denev
From josh at multipart-mixed.com Tue Nov 17 15:38:09 2009
From: josh at multipart-mixed.com (Josh Carter)
Date: Tue Nov 17 15:38:16 2009
Subject: ZFS resilver/replace changed vdev names from da(4) to gptid
In-Reply-To: <982FC0F3-0071-41E7-94A5-A49720B1771B@gmail.com>
References: <982FC0F3-0071-41E7-94A5-A49720B1771B@gmail.com>
Message-ID:

Nikolay,

> P.S.: This also makes me wonder how I can safely make all of the other vdevs use gptid, as I plan to replace
> the SATA controller with a new one that probably is going to export the devices as ad(4) or ada(4).

As you mentioned in your follow-up email, I've always found a "zpool export"
and "zpool import" to cure all ills when drive logical assignments change.
It forces ZFS to forget everything it thinks it knows about the pool and
re-create that knowledge from what's on the disk(s). I've had to do this
after reconfiguring systems both on OpenSolaris and FreeBSD.
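The export/import cycle suggested in the two replies above can be sketched as
a small dry-run helper. This is only an illustration, assuming a pool named
"tank" as in the original report. The function name gptid_reimport_commands
is hypothetical, and the helper only builds the argument lists; it does not
run zpool. The two subcommands it emits, "zpool export" and
"zpool import -d /dev/gptid", are the ones named in the replies.

```python
def gptid_reimport_commands(pool, search_dir="/dev/gptid"):
    """Return the zpool commands that re-import `pool` by gptid labels.

    Hypothetical helper for illustration: it builds argument lists only
    and executes nothing.
    """
    if not pool or "/" in pool:
        raise ValueError("expected a bare pool name such as 'tank'")
    return [
        # Step 1: export, so ZFS drops its cached device paths for the pool.
        ["zpool", "export", pool],
        # Step 2: import, searching only `search_dir` for member devices.
        ["zpool", "import", "-d", search_dir, pool],
    ]


if __name__ == "__main__":
    # Print the commands for review instead of running them.
    for argv in gptid_reimport_commands("tank"):
        print(" ".join(argv))
```

Printing the commands first and running them by hand keeps the operator in
control of when the pool goes offline.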
Best regards, Josh From linimon at FreeBSD.org Tue Nov 17 15:47:56 2009 From: linimon at FreeBSD.org (linimon@FreeBSD.org) Date: Tue Nov 17 15:48:09 2009 Subject: kern/140640: [zfs] snapshot crash Message-ID: <200911171547.nAHFluiY038374@freefall.freebsd.org> Old Synopsis: ZFS - Snapshot Crash New Synopsis: [zfs] snapshot crash Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Tue Nov 17 15:47:29 UTC 2009 Responsible-Changed-Why: make this a 'kern' PR and assign. http://www.freebsd.org/cgi/query-pr.cgi?pr=140640 From ambsd at raisa.eu.org Tue Nov 17 21:43:51 2009 From: ambsd at raisa.eu.org (Emil Smolenski) Date: Tue Nov 17 21:44:10 2009 Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable" In-Reply-To: References: <1258390784.2303.42.camel@balrog.2hip.net> Message-ID: On Mon, 16 Nov 2009 19:33:34 +0100, Emil Smolenski wrote: >> Matt's patch only effects raidz volumes. > Oh, I see. Maybe there is similar bug in ZFS on single disk volumes > also? >>> (is this procedure of updating zfsboot correct?) >> This should be correct for updating the first stage bootstrap code. The >> loader (boot/loader) is actually updated during installworld. > > I'll try full build/installworld tomorrow. It, unfortunately, didn't solve this issue. Should I file a PR? I would like to help in debugging it (however my skills in low-level C aren't strong enough to do it on my own). -- am From rnoland at FreeBSD.org Tue Nov 17 22:33:51 2009 From: rnoland at FreeBSD.org (Robert Noland) Date: Tue Nov 17 22:34:04 2009 Subject: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable" In-Reply-To: References: <1258390784.2303.42.camel@balrog.2hip.net> Message-ID: <1258497221.2303.66.camel@balrog.2hip.net> On Tue, 2009-11-17 at 22:43 +0100, Emil Smolenski wrote: > On Mon, 16 Nov 2009 19:33:34 +0100, Emil Smolenski > wrote: > > >> Matt's patch only effects raidz volumes. > > Oh, I see. 
Maybe there is similar bug in ZFS on single disk volumes > > also? > > >>> (is this procedure of updating zfsboot correct?) > >> This should be correct for updating the first stage bootstrap code. The > >> loader (boot/loader) is actually updated during installworld. > > > > I'll try full build/installworld tomorrow. > > It, unfortunately, didn't solve this issue. Should I file a PR? I would > like to help in debugging it (however my skills in low-level C aren't > strong enough to do it on my own). Ok, the first thing I would like to see is "zdb -uuu". I don't see an obvious issue with single disk reads. My own setup uses 2 x 1TB currently. Failing to read the MOS is basically the first read attempt from the pool, in fact it is the read that attempts to mount the pool. robert. -- Robert Noland FreeBSD From ambsd at raisa.eu.org Wed Nov 18 00:17:22 2009 From: ambsd at raisa.eu.org (Emil Smolenski) Date: Wed Nov 18 00:17:29 2009 Subject: Boot with ZFS on single disk: "ZFS: i/o error - all block copies unavailable" [was: Re: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"] In-Reply-To: <1258497221.2303.66.camel@balrog.2hip.net> References: <1258390784.2303.42.camel@balrog.2hip.net> <1258497221.2303.66.camel@balrog.2hip.net> Message-ID: On Tue, 17 Nov 2009 23:33:41 +0100, Robert Noland wrote: >> Should I file a PR? I would >> like to help in debugging it (however my skills in low-level C aren't >> strong enough to do it on my own). > Ok, the first thing I would like to see is "zdb -uuu". # zdb -uuu pgpool Segmentation fault: 11 (core dumped) # zdb pgpool version=13 name='pgpool' state=0 txg=439808 pool_guid=3920915583055727184 hostid=1642959122 hostname='unset' vdev_tree type='root' id=0 guid=3920915583055727184 children[0] type='disk' id=0 guid=5859773264564918193 path='/dev/da0' whole_disk=0 metaslab_array=23 metaslab_shift=35 ashift=9 asize=4500799356928 is_log=0 DTL=260 > I don't see an > obvious issue with single disk reads. 
My own setup uses 2 x 1TB > currently. Failing to read the MOS is basically the first read attempt > from the pool, in fact it is the read that attempts to mount the pool. -- am From rnoland at FreeBSD.org Wed Nov 18 13:51:00 2009 From: rnoland at FreeBSD.org (Robert Noland) Date: Wed Nov 18 13:51:06 2009 Subject: Boot with ZFS on single disk: "ZFS: i/o error - all block copies unavailable" [was: Re: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"] In-Reply-To: References: <1258390784.2303.42.camel@balrog.2hip.net> <1258497221.2303.66.camel@balrog.2hip.net> Message-ID: <1258552247.2303.75.camel@balrog.2hip.net> On Wed, 2009-11-18 at 01:17 +0100, Emil Smolenski wrote: > On Tue, 17 Nov 2009 23:33:41 +0100, Robert Noland > wrote: > > >> Should I file a PR? I would > >> like to help in debugging it (however my skills in low-level C aren't > >> strong enough to do it on my own). > > > Ok, the first thing I would like to see is "zdb -uuu". > > # zdb -uuu pgpool > Segmentation fault: 11 (core dumped) Ok, this is disturbing... It works fine for me on -CURRENT / amd64 and reports the root block pointer, which is what we need to locate the MOS. robert. > # zdb > pgpool > version=13 > name='pgpool' > state=0 > txg=439808 > pool_guid=3920915583055727184 > hostid=1642959122 > hostname='unset' > vdev_tree > type='root' > id=0 > guid=3920915583055727184 > children[0] > type='disk' > id=0 > guid=5859773264564918193 > path='/dev/da0' > whole_disk=0 > metaslab_array=23 > metaslab_shift=35 > ashift=9 > asize=4500799356928 > is_log=0 > DTL=260 > > > I don't see an > > obvious issue with single disk reads. My own setup uses 2 x 1TB > > currently. Failing to read the MOS is basically the first read attempt > > from the pool, in fact it is the read that attempts to mount the pool. 
> -- Robert Noland FreeBSD From ambsd at raisa.eu.org Wed Nov 18 16:11:21 2009 From: ambsd at raisa.eu.org (Emil Smolenski) Date: Wed Nov 18 16:11:33 2009 Subject: Boot with ZFS on single disk: "ZFS: i/o error - all block copies unavailable" [was: Re: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"] In-Reply-To: <1258552247.2303.75.camel@balrog.2hip.net> References: <1258390784.2303.42.camel@balrog.2hip.net> <1258497221.2303.66.camel@balrog.2hip.net> <1258552247.2303.75.camel@balrog.2hip.net> Message-ID: On Wed, 18 Nov 2009 14:50:47 +0100, Robert Noland wrote: >> >> Should I file a PR? I would >> >> like to help in debugging it (however my skills in low-level C aren't >> >> strong enough to do it on my own). >> > Ok, the first thing I would like to see is "zdb -uuu". >> # zdb -uuu pgpool >> Segmentation fault: 11 (core dumped) > Ok, this is disturbing... It works fine for me on -CURRENT / amd64 and > reports the root block pointer, which is what we need to locate the MOS. 
Booting from 8.0-*-amd64-memstick.img (Fixit# console) makes "zdb -uuu" happy: Fixit# zdb -uuu pgpool Uberblock magic = 0000000000bab10c version = 13 txg = 443448 guid_sum = 9780688847620645377 timestamp = 1258560175 UTC = Wed Nov 18 16:02:55 2009 rootbp = [L0 DMU objset] 400L/200P DVA[0]=<0:220000de400:200> DVA[1]=<0:2a80008ee00:200> DVA[2]=<0:330000b9000:200> fletcher4 lzjb LE contiguous birth=443448 fill=298 cksum=8a9775385:3935d6d58c7:c028430c00a8:1b58ac4ebf42ac -- am From rnoland at FreeBSD.org Wed Nov 18 16:43:57 2009 From: rnoland at FreeBSD.org (Robert Noland) Date: Wed Nov 18 16:44:04 2009 Subject: Boot with ZFS on single disk: "ZFS: i/o error - all block copies unavailable" [was: Re: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"] In-Reply-To: References: <1258390784.2303.42.camel@balrog.2hip.net> <1258497221.2303.66.camel@balrog.2hip.net> <1258552247.2303.75.camel@balrog.2hip.net> Message-ID: <1258562628.2303.83.camel@balrog.2hip.net> On Wed, 2009-11-18 at 17:11 +0100, Emil Smolenski wrote: > On Wed, 18 Nov 2009 14:50:47 +0100, Robert Noland > wrote: > > >> >> Should I file a PR? I would > >> >> like to help in debugging it (however my skills in low-level C aren't > >> >> strong enough to do it on my own). > >> > Ok, the first thing I would like to see is "zdb -uuu". > >> # zdb -uuu pgpool > >> Segmentation fault: 11 (core dumped) > > > Ok, this is disturbing... It works fine for me on -CURRENT / amd64 and > > reports the root block pointer, which is what we need to locate the MOS. 
> > Booting from 8.0-*-amd64-memstick.img (Fixit# console) makes "zdb -uuu" > happy: > > Fixit# zdb -uuu pgpool > Uberblock > > magic = 0000000000bab10c > version = 13 > txg = 443448 > guid_sum = 9780688847620645377 > timestamp = 1258560175 UTC = Wed Nov 18 16:02:55 2009 > rootbp = [L0 DMU objset] 400L/200P DVA[0]=<0:220000de400:200> > DVA[1]=<0:2a80008ee00:200> DVA[2]=<0:330000b9000:200> fletcher4 lzjb LE > contiguous birth=443448 fill=298 > cksum=8a9775385:3935d6d58c7:c028430c00a8:1b58ac4ebf42ac Ok, the offsets are definately up there... What is your normal installation? 8.0 i386? robert. -- Robert Noland FreeBSD From ambsd at raisa.eu.org Wed Nov 18 16:46:12 2009 From: ambsd at raisa.eu.org (Emil Smolenski) Date: Wed Nov 18 16:46:25 2009 Subject: Boot with ZFS on single disk: "ZFS: i/o error - all block copies unavailable" [was: Re: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"] In-Reply-To: <1258562628.2303.83.camel@balrog.2hip.net> References: <1258390784.2303.42.camel@balrog.2hip.net> <1258497221.2303.66.camel@balrog.2hip.net> <1258552247.2303.75.camel@balrog.2hip.net> <1258562628.2303.83.camel@balrog.2hip.net> Message-ID: On Wed, 18 Nov 2009 17:43:48 +0100, Robert Noland wrote: > On Wed, 2009-11-18 at 17:11 +0100, Emil Smolenski wrote: >> On Wed, 18 Nov 2009 14:50:47 +0100, Robert Noland >> wrote: >> >> >> >> Should I file a PR? I would >> >> >> like to help in debugging it (however my skills in low-level C >> aren't >> >> >> strong enough to do it on my own). >> >> > Ok, the first thing I would like to see is "zdb -uuu". >> >> # zdb -uuu pgpool >> >> Segmentation fault: 11 (core dumped) >> >> > Ok, this is disturbing... It works fine for me on -CURRENT / amd64 >> and >> > reports the root block pointer, which is what we need to locate the >> MOS. 
>> >> Booting from 8.0-*-amd64-memstick.img (Fixit# console) makes "zdb >> -uuu" >> happy: >> >> Fixit# zdb -uuu pgpool >> Uberblock >> >> magic = 0000000000bab10c >> version = 13 >> txg = 443448 >> guid_sum = 9780688847620645377 >> timestamp = 1258560175 UTC = Wed Nov 18 16:02:55 2009 >> rootbp = [L0 DMU objset] 400L/200P DVA[0]=<0:220000de400:200> >> DVA[1]=<0:2a80008ee00:200> DVA[2]=<0:330000b9000:200> fletcher4 lzjb LE >> contiguous birth=443448 fill=298 >> cksum=8a9775385:3935d6d58c7:c028430c00a8:1b58ac4ebf42ac > > Ok, the offsets are definately up there... What is your normal > installation? 8.0 i386? 7.2-STABLE, amd64. -- am From linimon at FreeBSD.org Wed Nov 18 18:54:09 2009 From: linimon at FreeBSD.org (linimon@FreeBSD.org) Date: Wed Nov 18 18:54:15 2009 Subject: kern/140661: [zfs] /boot/loader fails to work on a GPT/ZFS-only system on both 8.0-RC2 and RC3 Message-ID: <200911181854.nAIIs9Rq079911@freefall.freebsd.org> Old Synopsis: /boot/loader fails to work on a GPT/ZFS-only system on both 8.0-RC2 and RC3 New Synopsis: [zfs] /boot/loader fails to work on a GPT/ZFS-only system on both 8.0-RC2 and RC3 Responsible-Changed-From-To: freebsd-amd64->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Wed Nov 18 18:53:38 UTC 2009 Responsible-Changed-Why: This may not be amd64-specific. http://www.freebsd.org/cgi/query-pr.cgi?pr=140661 From swhetzel at gmail.com Wed Nov 18 21:00:08 2009 From: swhetzel at gmail.com (Scot Hetzel) Date: Wed Nov 18 21:00:15 2009 Subject: amd64/140661: /boot/loader fails to work on a GPT/ZFS-only system on both 8.0-RC2 and RC3 Message-ID: <200911182100.nAIL089D082773@freefall.freebsd.org> The following reply was made to PR kern/140661; it has been noted by GNATS. 
From: Scot Hetzel To: Kenneth Vestergaard Schmidt Cc: freebsd-gnats-submit@freebsd.org Subject: Re: amd64/140661: /boot/loader fails to work on a GPT/ZFS-only system on both 8.0-RC2 and RC3 Date: Wed, 18 Nov 2009 14:57:19 -0600 On 11/18/09, Kenneth Vestergaard Schmidt wrote: > Two machines tested, and both fail. Both installed according to > http://wiki.freebsd.org/RootOnZFS/GPTZFSBoot but one of them with an added > disk in a mirror. > > Both installed and working as 8.0-RC1. Both fail after upgrading to 8.0-RC2, > and ditto when trying 8.0-RC3. > > Upon booting, the following messages are visible just prior to an automatic > reboot: > > Can't work out which disk we are booting from. > Guessed BIOS device 0xffffffff not found by probes, defaulting to disk0: > ficl-s not found > Assertion failed: (FALSE), function ficlCompileSoftCore, file softcore.c, > line 428. > > /boot/loader.conf contains: > zfs_load="YES" > vfs.root.mountfrom="zfs:pil" > > mckusick# zpool get bootfs pil > NAME PROPERTY VALUE SOURCE > pil bootfs pil local > I recently installed FreeBSD 8.0-RC3 on a new system using the same steps as mentioned in the above guide, and I didn't have any problem booting FreeBSD 8.0-RC3 with the /boot/loader that was created in step 2.6 Install ZFS aware /boot/loader. 
dv8t01# uname -a
FreeBSD dv8t01 8.0-RC3 FreeBSD 8.0-RC3 #0: Tue Nov 10 06:35:19 UTC 2009 root@mason.cse.buffalo.edu:/usr/obj/usr/src/sys/GENERIC amd64

dv8t01# grep zfs /boot/loader.conf
vfs.root.mountfrom="zfs:zroot"
zfs_load="YES"

dv8t01# zpool get bootfs zroot
NAME   PROPERTY  VALUE  SOURCE
zroot  bootfs    zroot  local

Make sure you have LOADER_ZFS_SUPPORT in your /etc/src.conf:

dv8t01# cat /etc/src.conf
LOADER_ZFS_SUPPORT=YES

Scot
From mattjreimer at gmail.com Wed Nov 18 23:48:05 2009
From: mattjreimer at gmail.com (Matt Reimer)
Date: Wed Nov 18 23:48:52 2009
Subject: Boot with ZFS on single disk: "ZFS: i/o error - all block copies unavailable" [was: Re: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"]
In-Reply-To: <1258562628.2303.83.camel@balrog.2hip.net>
References: <1258390784.2303.42.camel@balrog.2hip.net> <1258497221.2303.66.camel@balrog.2hip.net> <1258552247.2303.75.camel@balrog.2hip.net> <1258562628.2303.83.camel@balrog.2hip.net>
Message-ID:

On Wed, Nov 18, 2009 at 8:43 AM, Robert Noland wrote:
> On Wed, 2009-11-18 at 17:11 +0100, Emil Smolenski wrote:
>> On Wed, 18 Nov 2009 14:50:47 +0100, Robert Noland
>> wrote:
>>
>> >> >> Should I file a PR? I would
>> >> >> like to help in debugging it (however my skills in low-level C aren't
>> >> >> strong enough to do it on my own).
>>
>> >> > Ok, the first thing I would like to see is "zdb -uuu".
>>
>> >> # zdb -uuu pgpool
>> >> Segmentation fault: 11 (core dumped)
>>
>> > Ok, this is disturbing... It works fine for me on -CURRENT / amd64 and
>> > reports the root block pointer, which is what we need to locate the MOS.
>>
>> Booting from 8.0-*-amd64-memstick.img (Fixit# console) makes "zdb -uuu"
>> happy:
>>
>> Fixit# zdb -uuu pgpool
>> Uberblock
>>
>>         magic = 0000000000bab10c
>>         version = 13
>>         txg = 443448
>>         guid_sum = 9780688847620645377
>>         timestamp = 1258560175 UTC = Wed Nov 18 16:02:55 2009
>>         rootbp = [L0 DMU objset] 400L/200P DVA[0]=<0:220000de400:200>
>> DVA[1]=<0:2a80008ee00:200> DVA[2]=<0:330000b9000:200> fletcher4 lzjb LE
>> contiguous birth=443448 fill=298
>> cksum=8a9775385:3935d6d58c7:c028430c00a8:1b58ac4ebf42ac
>
> Ok, the offsets are definitely up there... What is your normal
> installation? 8.0 i386?

Robert's on to something. It looks like your LBAs are probably
overflowing 32 bits. This would affect all vdevs regardless of type.
Try the attached patch.

Matt
-------------- next part --------------
A non-text attachment was scrubbed...
Name: zfsboot.c.patch3
Type: application/octet-stream
Size: 947 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20091118/a59ccbb8/zfsboot.c.obj
From linimon at FreeBSD.org Thu Nov 19 02:26:59 2009
From: linimon at FreeBSD.org (linimon@FreeBSD.org)
Date: Thu Nov 19 02:27:10 2009
Subject: kern/140682: [netgraph] [panic] random panic in netgraph
Message-ID: <200911190226.nAJ2QvBN066417@freefall.freebsd.org>

Old Synopsis: randomly panic
New Synopsis: [netgraph] [panic] random panic in netgraph

Responsible-Changed-From-To: freebsd-bugs->freebsd-fs
Responsible-Changed-By: linimon
Responsible-Changed-When: Thu Nov 19 02:25:02 UTC 2009
Responsible-Changed-Why:
looks like this might be netgraph-related.

http://www.freebsd.org/cgi/query-pr.cgi?pr=140682
From nwfilardo at gmail.com Thu Nov 19 02:40:03 2009
From: nwfilardo at gmail.com (Nathaniel Filardo)
Date: Thu Nov 19 02:40:13 2009
Subject: kern/139725: [zfs] zdb(1) dumps core on i386 when examining zpool contents: Assertion failed: (rwlp->rw_count == 0)
Message-ID: <200911190240.nAJ2e3sk076223@freefall.freebsd.org>

The following reply was made to PR kern/139725; it has been noted by GNATS.
From: Nathaniel Filardo To: bug-followup@FreeBSD.org, henno@schooljan.nl Cc: Subject: Re: kern/139725: [zfs] zdb(1) dumps core on i386 when examining zpool contents: Assertion failed: (rwlp->rw_count == 0) Date: Wed, 18 Nov 2009 21:14:48 -0500 This is not an i386 specific thing; I am able to reproduce it readily on my SPARC64 machine, and can make core files available if they'd help. Thanks! --nwf; From rnoland at FreeBSD.org Thu Nov 19 03:57:55 2009 From: rnoland at FreeBSD.org (Robert Noland) Date: Thu Nov 19 03:58:01 2009 Subject: Boot with ZFS on single disk: "ZFS: i/o error - all block copies unavailable" [was: Re: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"] In-Reply-To: References: <1258390784.2303.42.camel@balrog.2hip.net> <1258497221.2303.66.camel@balrog.2hip.net> <1258552247.2303.75.camel@balrog.2hip.net> <1258562628.2303.83.camel@balrog.2hip.net> Message-ID: <1258603057.2303.92.camel@balrog.2hip.net> On Wed, 2009-11-18 at 15:48 -0800, Matt Reimer wrote: > 220000de400 This divided by 512 byte block size is 33 bits... At a glance, the patch looks ok to me. I'll do a more thorough review of this tomorrow. robert. -- Robert Noland FreeBSD From ambsd at raisa.eu.org Thu Nov 19 10:21:31 2009 From: ambsd at raisa.eu.org (Emil Smolenski) Date: Thu Nov 19 10:21:43 2009 Subject: Boot with ZFS on single disk: "ZFS: i/o error - all block copies unavailable" [was: Re: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"] In-Reply-To: <1258603057.2303.92.camel@balrog.2hip.net> References: <1258390784.2303.42.camel@balrog.2hip.net> <1258497221.2303.66.camel@balrog.2hip.net> <1258552247.2303.75.camel@balrog.2hip.net> <1258562628.2303.83.camel@balrog.2hip.net> <1258603057.2303.92.camel@balrog.2hip.net> Message-ID: Matt Reimer wrote: > Robert's on to something. It looks like your LBAs are probably > overflowing 32 bits. This would affect all vdev regardless of type. > Try the attached patch. 
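Robert's 33-bit figure can be checked with a few lines of arithmetic. The
sketch below is an illustration, not the zfsboot code: it parses the DVA
triples from the "zdb -uuu" output quoted earlier in the thread, divides each
byte offset by the 512-byte sector size, and shows what remains if the sector
number is kept in a 32-bit variable. The function names are made up for this
example, and any fixed offset the boot code adds for vdev labels is ignored,
as it is in Robert's calculation.

```python
import re

SECTOR_SIZE = 512  # bytes per sector, consistent with ashift=9 in the zdb output

# Root block pointer DVAs as printed by "zdb -uuu pgpool" in this thread.
ROOTBP = (
    "DVA[0]=<0:220000de400:200> "
    "DVA[1]=<0:2a80008ee00:200> "
    "DVA[2]=<0:330000b9000:200>"
)


def dva_offsets(text):
    """Return the byte offsets (hex fields) of all DVA[n]=<vdev:offset:size> triples."""
    pattern = r"DVA\[\d+\]=<\d+:([0-9a-f]+):[0-9a-f]+>"
    return [int(m.group(1), 16) for m in re.finditer(pattern, text)]


def lba(offset):
    """Convert a byte offset into a sector number."""
    return offset // SECTOR_SIZE


def truncate32(value):
    """Keep only the low 32 bits, as a 32-bit LBA variable would."""
    return value & 0xFFFFFFFF


if __name__ == "__main__":
    for off in dva_offsets(ROOTBP):
        sector = lba(off)
        print("offset %#x -> LBA %#x (%d bits), truncated to %#x"
              % (off, sector, sector.bit_length(), truncate32(sector)))
```

With 512-byte sectors a 32-bit LBA reaches at most 2^32 * 512 bytes = 2 TiB,
so any block stored beyond that point on this 4.06T pool would be read from
the wrong sector if the high bit is dropped.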
Robert Noland wrote:
>> 220000de400
> This divided by 512 byte block size is 33 bits... At a glance, the patch
> looks ok to me. I'll do a more thorough review of this tomorrow.

Unfortunately it doesn't work. The error is the same as before:

ZFS: i/o error - all block copies unavailable
ZFS: can't read MOS
ZFS: unexpected object set type 0
ZFS: unexpected object set type 0

FreeBSD/i386 boot
Default: pgpool:/boot/kernel/kernel
boot:
ZFS: unexpected object set type 0


This is 7.2-STABLE, amd64. My test procedure:

1. I fully synchronized these zfsboot-related directories with -CURRENT:

src/sys/boot/i386/zfsboot
src/sys/boot/zfs
src/sys/cddl/boot/zfs

2. I applied Matt Reimer's zfsboot.c.patch3 patch:

# cd /usr/src/sys/boot/
# patch < /path/to/zfsboot.c.patch3

3. Then I did:

# make clean; make cleandir
# make obj ; make depend ; make
# cd i386/loader
# make install
# cd /usr/src/sys/boot/i386/zfsboot
# make install
# sysctl kern.geom.debugflags=16
# dd if=/boot/zfsboot of=/dev/da0 count=1
# dd if=/boot/zfsboot of=/dev/da0 skip=1 seek=1024
# reboot

4. Result: error shown above.

Thanks!

--
am
From rnoland at FreeBSD.org Thu Nov 19 16:24:04 2009
From: rnoland at FreeBSD.org (Robert Noland)
Date: Thu Nov 19 16:24:16 2009
Subject: Boot with ZFS on single disk: "ZFS: i/o error - all block copies unavailable" [was: Re: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"]
In-Reply-To:
References: <1258390784.2303.42.camel@balrog.2hip.net> <1258497221.2303.66.camel@balrog.2hip.net> <1258552247.2303.75.camel@balrog.2hip.net> <1258562628.2303.83.camel@balrog.2hip.net> <1258603057.2303.92.camel@balrog.2hip.net>
Message-ID: <1258647835.2303.105.camel@balrog.2hip.net>

On Thu, 2009-11-19 at 11:21 +0100, Emil Smolenski wrote:
> Matt Reimer wrote:
> > Robert's on to something. It looks like your LBAs are probably
> > overflowing 32 bits. This would affect all vdev regardless of type.
> > Try the attached patch.
> > Robert Noland wrote: > >> 220000de400 > > This divided by 512 byte block size is 33 bits... At a glance, the patch > > looks ok to me. I'll do a more thorough review of this tomorrow. > > Unfortunately it don't work. Error is the same as before: Ok, I was concerned about the assembly code... So, I've been chatting with jhb@ this morning. Please try this patch that jhb@ came up with instead of Matt's latest patch. robert. > ZFS: i/o error - all block copies unavailable > ZFS: can't read MOS > ZFS: unexpected object set type 0 > ZFS: unexpected object set type 0 > > FreeBSD/i386 boot > Default: pgpool:/boot/kernel/kernel > boot: > ZFS: unexpected object set type 0 > > > This is 7.2-STABLE, amd64. My test procedure: > > 1. I fully synchronized these zfsboot-related directories with -CURRENT: > > src/sys/boot/i386/zfsboot > src/sys/boot/zfs > src/sys/cddl/boot/zfs > > 2. I applied Matt Reimer's zfsboot.c.patch3 patch: > > # cd /usr/src/sys/boot/ > # patch < /path/to/zfsboot.c.patch3 > > 3. Then I did: > > # make clean; make cleandir > # make obj ; make depend ; make > # cd i386/loader > # make install > # cd /usr/src/sys/boot/i386/zfsboot > # make install > # sysctl kern.geom.debugflags=16 > # dd if=/boot/zfsboot of=/dev/da0 count=1 > # dd if=/boot/zfsboot of=/dev/da0 skip=1 seek=1024 > # reboot > > 4. Result: error shown above. > > Thanks! > -- Robert Noland FreeBSD -------------- next part -------------- A non-text attachment was scrubbed... 
Name: zfsboot_64.patch Type: text/x-patch Size: 2698 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20091119/b0169276/zfsboot_64.bin From jhb at freebsd.org Thu Nov 19 17:02:39 2009 From: jhb at freebsd.org (John Baldwin) Date: Thu Nov 19 17:02:46 2009 Subject: Boot with ZFS on single disk: "ZFS: i/o error - all block copies unavailable" [was: Re: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"] In-Reply-To: <1258647835.2303.105.camel@balrog.2hip.net> References: <1258647835.2303.105.camel@balrog.2hip.net> Message-ID: <200911191155.10490.jhb@freebsd.org> On Thursday 19 November 2009 11:23:55 am Robert Noland wrote: > On Thu, 2009-11-19 at 11:21 +0100, Emil Smolenski wrote: > > Matt Reimer wrote: > > > Robert's on to something. It looks like your LBAs are probably > > > overflowing 32 bits. This would affect all vdev regardless of type. > > > Try the attached patch. > > > > Robert Noland wrote: > > >> 220000de400 > > > This divided by 512 byte block size is 33 bits... At a glance, the patch > > > looks ok to me. I'll do a more thorough review of this tomorrow. > > > > Unfortunately it don't work. Error is the same as before: > > Ok, I was concerned about the assembly code... So, I've been chatting > with jhb@ this morning. Please try this patch that jhb@ came up with > instead of Matt's latest patch. Actually, I had missed updating one place, please use this instead. Also, I think that this will fix using > 2TB volumes even in the GPT case as zfsboot.c was always using 32-bit LBAs even for the GPT case. -- John Baldwin -------------- next part -------------- A non-text attachment was scrubbed... 
Name: zfsboot_64.patch
Type: text/x-diff
Size: 2945 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20091119/5854f276/zfsboot_64.bin
From ambsd at raisa.eu.org Thu Nov 19 23:04:14 2009
From: ambsd at raisa.eu.org (Emil Smolenski)
Date: Thu Nov 19 23:04:26 2009
Subject: Boot with ZFS on single disk: "ZFS: i/o error - all block copies unavailable" [was: Re: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"]
In-Reply-To: <1258647835.2303.105.camel@balrog.2hip.net>
References: <1258390784.2303.42.camel@balrog.2hip.net> <1258497221.2303.66.camel@balrog.2hip.net> <1258552247.2303.75.camel@balrog.2hip.net> <1258562628.2303.83.camel@balrog.2hip.net> <1258603057.2303.92.camel@balrog.2hip.net> <1258647835.2303.105.camel@balrog.2hip.net>
Message-ID:

On Thu, 19 Nov 2009 17:23:55 +0100, Robert Noland wrote:

> Ok, I was concerned about the assembly code... So, I've been chatting
> with jhb@ this morning. Please try this patch that jhb@ came up with
> instead of Matt's latest patch.

On Thu, 19 Nov 2009 17:55:10 +0100, John Baldwin wrote:

> Actually, I had missed updating one place, please use this instead.
> Also, I think that this will fix using > 2TB volumes even in the GPT
> case as zfsboot.c was always using 32-bit LBAs even for the GPT case.

Thanks a million! Both patches work for me. Great work!

I know that we have missed the boat, but maybe there is an opportunity to
catch up with it by swimming and to commit these patches to 8-STABLE before
8.0-RELEASE?

Thanks!
-- am From randy at psg.com Fri Nov 20 02:48:07 2009 From: randy at psg.com (Randy Bush) Date: Fri Nov 20 02:48:14 2009 Subject: 7.2 dies in zfs Message-ID: i think the issue is how to tune for zfs i386 with 4G of RAM FreeBSD psg.com 7.2-STABLE FreeBSD 7.2-STABLE #2: Wed Nov 18 03:04:55 GMT 2009 root@psg.com:/usr/obj/usr/src/sys/PSG i386 RELENG_7 cvsupped Nov 18 02:42 GMT panic: kmem_malloc(65536): kmem_map too small: 535019520 total allocated cpuid = 0 Uptime: 13h15m1s Physical memory: 3958 MB Dumping 637 MB: 622 606 590 574 558 542 526 510 494 478 462 446 430 414 398 382 366 350 334 318 302 286 270 254 238 222 206 190 174 158 142 126 110 94 78 62 46 30 14 Dump complete Automatic reboot in 15 seconds - press a key on the console to abort and it did not auto reboot # cat /boot/loader.conf.local ipfw_load=YES umass_load=YES zfs_load=YES vm.kmem_size=536870912 vm.kmem_size_max=1073741824 vfs.zfs.prefetch_disable=1 it has zfs # zpool status pool: tank state: ONLINE scrub: none requested config: NAME STATE READ WRITE CKSUM tank ONLINE 0 0 0 mirror ONLINE 0 0 0 twed1 ONLINE 0 0 0 twed2 ONLINE 0 0 0 errors: No known data errors but boots and has root on ufs # df -H Filesystem Size Used Avail Capacity Mounted on /dev/twed0s1a 260M 199M 40M 83% / devfs 1.0k 1.0k 0B 100% /dev /dev/twed0s1h 65M 2.3M 57M 4% /root procfs 4.1k 4.1k 0B 100% /proc tank 147G 17M 147G 0% /tank tank/usr 167G 20G 147G 12% /usr tank/usr/home 216G 68G 147G 32% /usr/home tank/var 149G 2.3G 147G 2% /var tank/var/spool 148G 531M 147G 0% /var/spool /dev/md0 130M 12k 119M 0% /tmp # kgdb kernel.debug /usr/home/crash/vmcore.20 GNU gdb 6.1.1 [FreeBSD] Copyright 2004 Free Software Foundation, Inc. GDB is free software, covered by the GNU General Public License, and you are welcome to change it and/or distribute copies of it under certain conditions. Type "show copying" to see the conditions. There is absolutely no warranty for GDB. Type "show warranty" for details. 
This GDB was configured as "i386-marcel-freebsd"...

Unread portion of the kernel message buffer:
panic: kmem_malloc(65536): kmem_map too small: 535019520 total allocated
cpuid = 0
Uptime: 13h15m1s
Physical memory: 3958 MB
Dumping 637 MB: 622 606 590 574 558 542 526 510 494 478 462 446 430 414 398 382 366 350 334 318 302 286 270 254 238 222 206 190 174 158 142 126 110 94 78 62 46 30 14

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/ipfw.ko...Reading symbols from /boot/kernel/ipfw.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ipfw.ko
Reading symbols from /boot/kernel/umass.ko...Reading symbols from /boot/kernel/umass.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/umass.ko
Reading symbols from /boot/kernel/cam.ko...Reading symbols from /boot/kernel/cam.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/cam.ko
Reading symbols from /boot/kernel/usb.ko...Reading symbols from /boot/kernel/usb.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/usb.ko
#0 doadump () at pcpu.h:196
196 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) back
#0 doadump () at pcpu.h:196
#1 0xc052b0b6 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2 0xc052b39e in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3 0xc06dfb54 in kmem_malloc (map=0xc107108c, size=65536, flags=2) at /usr/src/sys/vm/vm_kern.c:305
#4 0xc06d6317 in page_alloc (zone=0x0, bytes=65536, pflag=0xf67ee4a7 "\002", wait=2) at /usr/src/sys/vm/uma_core.c:952
#5 0xc06d8e20 in uma_large_malloc (size=65536, wait=2) at /usr/src/sys/vm/uma_core.c:2706
#6 0xc05189e8 in malloc (size=65536, mtp=0xc0989060, flags=2) at /usr/src/sys/kern/kern_malloc.c:393
#7 0xc0897a61 in zfs_kmem_alloc (size=65536, kmflags=2) at /usr/src/sys/modules/zfs/../../cddl/compat/opensolaris/kern/opensolaris_kmem.c:74
#8 0xc090bf4a in zio_buf_alloc (size=65536) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:207
#9 0xc08f3472 in vdev_cache_read (zio=0xd39d0708) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_cache.c:188
#10 0xc090c145 in zio_vdev_io_start (zio=0xd39d0708) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:1816
#11 0xc090c7f0 in zio_execute (zio=0xd39d0708) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:998
#12 0xc08f6bda in vdev_mirror_io_start (zio=0xd77e7708) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:303
#13 0xc090c7f0 in zio_execute (zio=0xd77e7708) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:998
#14 0xc08f6bda in vdev_mirror_io_start (zio=0xdbd22960) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_mirror.c:303
#15 0xc090c7f0 in zio_execute (zio=0xdbd22960) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:998
#16 0xc08ad39d in arc_read_nolock (pio=0xd1844960, spa=0xc7746000, bp=0xcc548640, done=0xc08b0600, private=0xd9c89e38, priority=0, zio_flags=1, arc_flags=0xf67ee854, zb=0xf67ee834) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2762
#17 0xc08ad878 in arc_read (pio=0xd1844960, spa=0xc7746000, bp=0xcc548640, pbuf=0xc9b29134, done=0xc08b0600, private=0xd9c89e38, priority=0, zio_flags=1, arc_flags=0xf67ee854, zb=0xf67ee834) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2507
#18 0xc08b0ada in dbuf_read (db=0xd9c89e38, zio=0xd1844960, flags=14) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:521
#19 0xc08b1142 in dbuf_findbp (dn=0xcba92000, level=Variable "level" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1381
#20 0xc08b1269 in dbuf_hold_impl (dn=0xcba92000, level=0 '\0', blkid=0, fail_sparse=0, tag=0x0, dbp=0xf67ee8f0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1617
#21 0xc08b2529 in dbuf_hold (dn=0xcba92000, blkid=0, tag=0x0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1689
#22 0xc08b48cc in dmu_buf_hold (os=0xc774b3d0, object=167123, offset=0, tag=0x0, dbp=0xf67ee95c) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:101
#23 0xc0900044 in zap_lockdir (os=0xc774b3d0, obj=167123, tx=0x0, lti=RW_READER, fatreader=1, adding=0, zapp=0xf67eeba0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:388
#24 0xc09009fd in zap_cursor_retrieve (zc=0xf67eeb9c, za=0xf67eea84) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zap_micro.c:1004
#25 0xc0925a2b in zfs_freebsd_readdir (ap=0xf67eec00) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:2156
#26 0xc07382f2 in VOP_READDIR_APV (vop=0xc098b560, a=0xf67eec00) at vnode_if.c:1407
#27 0xc05becae in kern_getdirentries (td=0xd5a01000, fd=19, buf=0x29061000, count=4096, basep=0xf67eec74) at vnode_if.h:747
#28 0xc05beec1 in getdirentries (td=0xd5a01000, uap=0xf67eecfc) at /usr/src/sys/kern/vfs_syscalls.c:3776
#29 0xc072cfc5 in syscall (frame=0xf67eed38) at /usr/src/sys/i386/i386/trap.c:1101
#30 0xc0711380 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:262
#31 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)

randy

From randy at psg.com Fri Nov 20 10:01:58 2009
From: randy at psg.com (Randy Bush)
Date: Fri Nov 20 10:02:05 2009
Subject: 7.2 dies in zfs
In-Reply-To:
References:
Message-ID:

it was pointed out that i did not include my kernel config. so here
you go.

# egrep -v '^(#|$)' /sys/i386/conf/PSG
cpu I686_CPU
ident PSG
makeoptions DEBUG=-g # Build kernel with gdb(1) debug symbols
options KVA_PAGES=512
options SCHED_ULE # ULE scheduler
options PREEMPTION # Enable kernel thread preemption
options INET # InterNETworking
options INET6 # IPv6 communications protocols
options SCTP # Stream Control Transmission Protocol
options FFS # Berkeley Fast Filesystem
options SOFTUPDATES # Enable FFS soft updates support
options UFS_ACL # Support for access control lists
options UFS_DIRHASH # Improve performance on big directories
options CD9660 # ISO 9660 Filesystem
options PROCFS # Process filesystem (requires PSEUDOFS)
options PSEUDOFS # Pseudo-filesystem framework
options GEOM_PART_GPT # GUID Partition Tables.
options GEOM_LABEL # Provides labelization
options COMPAT_43TTY # BSD 4.3 TTY compat [KEEP THIS!]
options COMPAT_FREEBSD4 # Compatible with FreeBSD4
options COMPAT_FREEBSD5 # Compatible with FreeBSD5
options COMPAT_FREEBSD6 # Compatible with FreeBSD6
options SCSI_DELAY=5000 # Delay (in ms) before probing SCSI
options KTRACE # ktrace(1) support
options STACK # stack(9) support
options SYSVSHM # SYSV-style shared memory
options SYSVMSG # SYSV-style message queues
options SYSVSEM # SYSV-style semaphores
options P1003_1B_SEMAPHORES # POSIX-style semaphores
options _KPOSIX_PRIORITY_SCHEDULING # POSIX P1003_1B real-time extensions
options KBD_INSTALL_CDEV # install a CDEV entry in /dev
options ADAPTIVE_GIANT # Giant mutex is adaptive.
options STOP_NMI # Stop CPUS using NMI instead of IPI
options AUDIT # Security event auditing
options SMP # Symmetric MultiProcessor Kernel
device apic # I/O APIC
device cpufreq
device pci
device fdc
device ata
device atadisk # ATA disk drives
device ataraid # ATA RAID drives
device atapicd # ATAPI CDROM drives
device atapifd # ATAPI floppy drives
device atapist # ATAPI tape drives
options ATA_STATIC_ID # Static device numbering
device twe # 3ware ATA RAID
device atkbdc # AT keyboard controller
device atkbd # AT keyboard
device vga # VGA video card driver
device splash # Splash screen and screen saver support
device sc
device sio # 8250, 16[45]50 based serial ports
device uart # Generic UART driver
device ppc
device ppbus # Parallel port bus (required)
device em # Intel PRO/1000 Gigabit Ethernet Family
device loop # Network loopback
device random # Entropy device
device ether # Ethernet support
device tun # Packet tunnel.
device pty # Pseudo-ttys (telnet etc)
device md # Memory "disks"
device firmware # firmware assist module
device bpf # Berkeley packet filter

randy

From jhb at freebsd.org Fri Nov 20 14:41:38 2009
From: jhb at freebsd.org (John Baldwin)
Date: Fri Nov 20 14:41:54 2009
Subject: Boot with ZFS on single disk: "ZFS: i/o error - all block copies unavailable" [was: Re: GPT boot with ZFS RAIDZ "ZFS: i/o error - all block copies unavailable"]
In-Reply-To:
References: <1258647835.2303.105.camel@balrog.2hip.net>
Message-ID: <200911200743.52233.jhb@freebsd.org>

On Thursday 19 November 2009 6:04:16 pm Emil Smolenski wrote:
> On Thu, 19 Nov 2009 17:23:55 +0100, Robert Noland wrote:
> > Ok, I was concerned about the assembly code... So, I've been chatting
> > with jhb@ this morning. Please try this patch that jhb@ came up with
> > instead of Matt's latest patch.
>
> On Thu, 19 Nov 2009 17:55:10 +0100, John Baldwin wrote:
> > Actually, I had missed updating one place, please use this instead.
> > Also, I think that this will fix using > 2TB volumes even in the GPT
> > case as zfsboot.c was always using 32-bit LBAs even for the GPT case.
>
> Thanks a million! Both patches work for me. Great work!
> I know that we have missed the boat but maybe there is opportunity to
> catch it up by swimming and commit these patches to 8-STABLE before
> 8.0-RELEASE? Thanks!

It is too late for 8.0 I'm afraid.

--
John Baldwin

From mattjreimer at gmail.com Sat Nov 21 00:46:55 2009
From: mattjreimer at gmail.com (Matt Reimer)
Date: Sat Nov 21 00:47:01 2009
Subject: Current gptzfsboot limitations
Message-ID:

I've been analyzing gptzfsboot to see what its limitations are. I
think it should now work fine for a healthy pool with any number of
disks, with any type of vdev, whether single disk, stripe, mirror,
raidz or raidz2.

But there are currently several limitations (likely in loader.zfs
too), mostly due to the limited amount of memory available (< 640KB)
and the simple memory allocators used (a simple malloc() and
zfs_alloc_temp()).

1. gptzfsboot might fail to read compressed files on raidz/raidz2
pools. The reason is that the temporary buffer used for I/O
(zfs_temp_buf in zfsimpl.c) is 128KB by default, but a 128KB
compressed block will require a 128KB buffer to be allocated before
the I/O is done, leaving nothing for the raidz code further on. The
fix would be to make the temporary buffer larger, but for some
reason it's not as simple as just changing the TEMP_SIZE define
(possibly a stack overflow results; more debugging needed).
Workaround: don't enable compression on your root filesystem (aka
bootfs).

2. gptzfsboot might fail to reconstruct a file that is read from a
degraded raidz/raidz2 pool, or if the file is corrupt somehow (i.e.
the pool is healthy but the checksums don't match). The reason again
is that the temporary buffer gets exhausted. I think this will only
happen in the case where more than one physical block is corrupt or
unreadable. The fix has several aspects: 1) make the temporary buffer
much larger, perhaps larger than 640KB; 2) change
zfssubr.c:vdev_raidz_read() to reuse the temp buffers it allocates
when possible; and 3) either restructure
zfssubr.c:vdev_raidz_reconstruct_pq() to only allocate its temporary
buffers once per I/O, or use a malloc that has free() implemented.
Workaround: repair your pool somehow (e.g. pxeboot) so one or no disks
are bad.

3. gptzfsboot might fail to boot from a degraded pool that has one or
more drives marked offline, removed, or faulted. The reason is that
vdev_probe() assumes that all vdevs are healthy, regardless of their
true state.
gptzfsboot then will read from an offline/removed/faulted
vdev as if it were healthy, likely resulting in failed checksums,
resulting in the recovery code path being run in vdev_raidz_read(),
possibly leading to zfs_temp_buf exhaustion as in #2 above.

A partial patch for #3 is attached, but it is inadequate because it
only reads a vdev's status from the first device's (in BIOS order)
vdev_label, with the result that if the first device is marked
offline, gptzfsboot won't see this because only the other devices'
vdev_labels will indicate that the first device is offline. (Since
after a device is offlined no further writes will be made to the
device, its vdev_label is not updated to reflect that it's offline.)
To complete the patch it would be necessary to set each leaf vdev's
status from the newest vdev_label rather than from the first
vdev_label seen.

I think I've also hit a stack overflow a couple of times while debugging.

I don't know enough about the gptzfsboot/loader.zfs environment to
know whether the heap size could be easily enlarged, or whether there
is room for a real malloc() with free(). loader(8) seems to use the
malloc() in libstand. Can anyone shed some light on the memory
limitations and possible solutions?

I won't be able to spend much more time on this, but I wanted to pass
on what I've learned in case someone else has the time and boot fu to
take it the next step.

Matt
-------------- next part --------------
A non-text attachment was scrubbed...
Name: zfsboot-status.patch
Type: application/octet-stream
Size: 3687 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20091121/59e95c29/zfsboot-status.obj

From fb-fs at psconsult.nl Sun Nov 22 14:40:25 2009
From: fb-fs at psconsult.nl (Paul Schenkeveld)
Date: Sun Nov 22 14:40:33 2009
Subject: ZFS whole_disk=0
Message-ID: <20091122142552.GA39554@psconsult.nl>

Hi,

I noticed (on 8.0-RC2 amd64) that zdb always reports whole_disk=0 even
when using /dev/ad6 as vdev.

How do I tell zpool that my vdev is really a whole disk?

Is there a big performance penalty (or other disadvantage) when
whole_disk == 0?

Regards,

Paul Schenkeveld

From roberto at keltia.freenix.fr Sun Nov 22 15:37:23 2009
From: roberto at keltia.freenix.fr (Ollivier Robert)
Date: Sun Nov 22 15:37:30 2009
Subject: ZFS whole_disk=0
In-Reply-To: <20091122142552.GA39554@psconsult.nl>
References: <20091122142552.GA39554@psconsult.nl>
Message-ID: <20091122153650.GA55532@rron.freenix.org>

According to Paul Schenkeveld:
>Is there a big performance penalty (or other disadvantage) when
>whole_disk == 0?

None that I know of. Solaris disables the write cache if it is not using
the whole disk, but FreeBSD does not have this issue.

--
Ollivier ROBERT -=- FreeBSD: The Power to Serve! -=- roberto@keltia.freenix.fr
In memoriam to Ondine : http://ondine.keltia.net/

From roberto at keltia.freenix.fr Sun Nov 22 15:42:57 2009
From: roberto at keltia.freenix.fr (Ollivier Robert)
Date: Sun Nov 22 15:43:04 2009
Subject: 7.2 dies in zfs
In-Reply-To:
References:
Message-ID: <20091122154228.GB55532@rron.freenix.org>

According to Randy Bush:
>i think the issue is how to tune for zfs
>
>i386 with 4G of RAM

I've given up on ZFS on i386. Whatever tuning you could do is only
delaying the inevitable. Even with lots of RAM, it will panic. I'd
love being proven wrong as I also have a 4 GB i386 with ZFS and it
panics regularly.

--
Ollivier ROBERT -=- FreeBSD: The Power to Serve!
-=- roberto@keltia.freenix.fr
In memoriam to Ondine : http://ondine.keltia.net/

From roberto at keltia.freenix.fr Sun Nov 22 16:15:45 2009
From: roberto at keltia.freenix.fr (Ollivier Robert)
Date: Sun Nov 22 16:15:51 2009
Subject: Performance issues with 8.0 ZFS and sendfile/lighttpd
In-Reply-To: <9bbcef730911071242m5ad91720xcccb7586c6848ffd@mail.gmail.com>
References: <772532900-1257123963-cardhu_decombobulator_blackberry.rim.net-1402739480-@bda715.bisx.prod.on.blackberry>
 <4AEEBD4B.1050407@quip.cz> <4AEEDB3B.5020600@quip.cz>
 <4AF46CA9.1040904@quip.cz>
 <9bbcef730911061101h5356d2acob2ac8791afe112@mail.gmail.com>
 <4AF5D611.7060408@quip.cz>
 <9bbcef730911071242m5ad91720xcccb7586c6848ffd@mail.gmail.com>
Message-ID: <20091122161516.GC55532@rron.freenix.org>

According to Ivan Voras:
>I'm not very familiar with this layer but since it uses struct buf and
>the ZFS doesn't use bufcache, this is probably one of the things that
>is bypassed, though it would be nice if it weren't since this code

From what I understand from IRC, Kip is working on fixing ZFS with respect
to struct buf (and also helping with the ARC vs VM issues).

--
Ollivier ROBERT -=- FreeBSD: The Power to Serve! -=- roberto@keltia.freenix.fr
In memoriam to Ondine : http://ondine.keltia.net/

From randy at psg.com Sun Nov 22 19:47:07 2009
From: randy at psg.com (Randy Bush)
Date: Sun Nov 22 19:47:16 2009
Subject: 7.2 dies in zfs
In-Reply-To: <20091122154228.GB55532@rron.freenix.org>
References: <20091122154228.GB55532@rron.freenix.org>
Message-ID:

> I've given up on ZFS on i386. Whatever tuning you could do is only
> delaying the inevitable. Even with lots of RAM, it will panic. I'd
> love being proven wrong as I also have a 4 GB i386 with ZFS and it
> panics regularly.

i am not sure one can not tune it to stay up. it's just not clear how.
and that is the critical point.

on a 4g i386 with moderate+ load, my current parms are

vm.kmem_size=1500M
vm.kmem_size_max=2G
vfs.zfs.arc_min=120M
vfs.zfs.arc_max=900M
vfs.zfs.prefetch_disable=1

and it's up over two days, wooo wooo!

i suspect that i will next lower arc max.

randy

From linimon at FreeBSD.org Sun Nov 22 23:33:13 2009
From: linimon at FreeBSD.org (linimon@FreeBSD.org)
Date: Sun Nov 22 23:33:25 2009
Subject: sparc64/140797: [nfs] [panic] panic on 8.0-RC3/sparc64 as an NFS server
Message-ID: <200911222333.nAMNXD4A037447@freefall.freebsd.org>

Old Synopsis: Panic on 8.0-RC3/sparc64 as an NFS server
New Synopsis: [nfs] [panic] panic on 8.0-RC3/sparc64 as an NFS server

Responsible-Changed-From-To: freebsd-sparc64->freebsd-fs
Responsible-Changed-By: linimon
Responsible-Changed-When: Sun Nov 22 23:32:50 UTC 2009
Responsible-Changed-Why:
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=140797

From lists at mschuette.name Mon Nov 23 02:05:08 2009
From: lists at mschuette.name (Martin Schütte)
Date: Mon Nov 23 02:05:15 2009
Subject: [nullfs] [panic] null with unref'ed lowervp
Message-ID: <4B09EDB2.7020002@mschuette.name>

Hello,

my server recently had the kernel panic "null with unref'ed lowervp"
in null_subr.c:null_checkvp(). I am not sure whether I should open a
PR for it.

The system is an SMP machine with SCSI RAID (asr). It runs several
jails and uses multiple null mounts between partitions, because there
is not enough disk space (~90% usage).

uname -a:
FreeBSD trinity.asta.uni-potsdam.de 7.2-RELEASE FreeBSD 7.2-RELEASE #3: Tue May 12 18:53:06 CEST 2009 root@trinity.asta.uni-potsdam.de:/usr/obj/usr/src/sys/ASTA i386

Kernel is compiled with:
options INVARIANT_SUPPORT
options INVARIANTS
options DIAGNOSTIC

Among the few occurrences I was not able to observe a pattern or
reproduce the error (two crashes happened at 8am which is a cronjob
time, but not one with particularly high load).
I am also going to add new disks any time now in order to reduce the
number of null mounts, so I do not expect to see this error again.

I append three gdb backtraces, in case anyone finds them useful.

--
Martin Schütte
-------------- next part --------------
[root@trinity] /usr/obj/usr/src/sys/ASTA# cat /archiv/crash/info.2 && kgdb kernel.debug /archiv/crash/vmcore.2
Dump header from device /dev/da0s1b
  Architecture: i386
  Architecture Version: 2
  Dump Length: 354189312B (337 MB)
  Blocksize: 512
  Dumptime: Fri Oct 16 08:00:05 2009
  Hostname: trinity.asta.uni-potsdam.de
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 7.2-RELEASE #3: Tue May 12 18:53:06 CEST 2009
    root@trinity.asta.uni-potsdam.de:/usr/obj/usr/src/sys/ASTA
  Panic String: null with unref'ed lowervp
  Dump Parity: 3870003723
  Bounds: 2
  Dump Status: good
GNU gdb 6.1.1 [FreeBSD]
Copyright 2004 Free Software Foundation, Inc.
GDB is free software, covered by the GNU General Public License, and you are
welcome to change it and/or distribute copies of it under certain conditions.
Type "show copying" to see the conditions.
There is absolutely no warranty for GDB. Type "show warranty" for details.
This GDB was configured as "i386-marcel-freebsd"...

Unread portion of the kernel message buffer:
vp = 0xcabd0000, unref'ed lowervp
deadc0de deadc0de deadc0de c70fb2a0 deadc0de deadc0de deadc0de c70fb2a0
panic: null with unref'ed lowervp
cpuid = 2
Uptime: 31d4h28m45s
Physical memory: 3447 MB
Dumping 337 MB: 322 306 290 274 258 242 226 210 194 178 162 146 130 114 98 82 66 50 34 18 2

Reading symbols from /boot/kernel/accf_http.ko...Reading symbols from /boot/kernel/accf_http.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/accf_http.ko
Reading symbols from /boot/kernel/acpi.ko...Reading symbols from /boot/kernel/acpi.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/acpi.ko
Reading symbols from /boot/kernel/nullfs.ko...Reading symbols from /boot/kernel/nullfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/nullfs.ko
#0 doadump () at pcpu.h:196
196 __asm __volatile("movl %%fs:0,%0" : "=r" (td));
(kgdb) bt
#0 doadump () at pcpu.h:196
#1 0xc057ba0c in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2 0xc057bcac in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3 0xc70f86f3 in null_checkvp (vp=Variable "vp" is not available.
) at /usr/src/sys/modules/nullfs/../../fs/nullfs/null_subr.c:337
#4 0xc70f9557 in null_lock (ap=0xeb12c994) at /usr/src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:531
#5 0xc07d0a35 in VOP_LOCK1_APV (vop=0xc70fb4c0, a=0xeb12c994) at vnode_if.c:1618
#6 0xc0605bde in _vn_lock (vp=0xcabd0000, flags=8194, td=0xc6f52d20, file=0xc080b893 "/usr/src/sys/kern/vfs_subr.c", line=2159) at vnode_if.h:851
#7 0xc05fa0ee in vrele (vp=0xcabd0000) at /usr/src/sys/kern/vfs_subr.c:2159
#8 0xc05f05fe in namei (ndp=0xeb12cb7c) at /usr/src/sys/kern/vfs_lookup.c:202
#9 0xc0605572 in vn_open_cred (ndp=0xeb12cb7c, flagp=0xeb12cc78, cmode=0, cred=0xc717f600, fp=0xc70744c0) at /usr/src/sys/kern/vfs_vnops.c:188
#10 0xc06057f3 in vn_open (ndp=0xeb12cb7c, flagp=0xeb12cc78, cmode=0, fp=0xc70744c0) at /usr/src/sys/kern/vfs_vnops.c:94
#11 0xc06046b3 in kern_open (td=0xc6f52d20, path=0xbfbfa750, pathseg=UIO_USERSPACE, flags=3, mode=0) at /usr/src/sys/kern/vfs_syscalls.c:1042
#12 0xc0604ba0 in open (td=0xc6f52d20, uap=0xeb12ccfc) at /usr/src/sys/kern/vfs_syscalls.c:1009
#13 0xc07ba703 in syscall (frame=0xeb12cd38) at /usr/src/sys/i386/i386/trap.c:1090
#14 0xc079fd80 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
#15 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
-------------- next part --------------
= info =
Dump header from device /dev/da0s1b
  Architecture: i386
  Architecture Version: 2
  Dump Length: 362143744B (345 MB)
  Blocksize: 512
  Dumptime: Sun Oct 25 08:00:16 2009
  Hostname: trinity.asta.uni-potsdam.de
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 7.2-RELEASE #3: Tue May 12 18:53:06 CEST 2009
    root@trinity.asta.uni-potsdam.de:/usr/obj/usr/src/sys/ASTA
  Panic String: null with unref'ed lowervp
  Dump Parity: 2201427979
  Bounds: 3
  Dump Status: good

= kgdb =
Unread portion of the kernel message buffer:
vp = 0xc7f17000, unref'ed lowervp
deadc0de deadc0de deadc0de c70e62a0 1fffff 0 0 0
panic: null with unref'ed lowervp
cpuid = 1
Uptime: 9d0h58m13s
Physical memory: 3447 MB
Dumping 345 MB: 330 314 298 282 266 250 234 218 202 186 170 154 138 122 106 90 74 58 42 26 10

(kgdb) bt
#0 doadump () at pcpu.h:196
#1 0xc057ba0c in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2 0xc057bcac in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3 0xc70e36f3 in null_checkvp (vp=Variable "vp" is not available.
) at /usr/src/sys/modules/nullfs/../../fs/nullfs/null_subr.c:337
#4 0xc70e4557 in null_lock (ap=0xebc0fb20) at /usr/src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:531
#5 0xc07d0a35 in VOP_LOCK1_APV (vop=0xc70e64c0, a=0xebc0fb20) at vnode_if.c:1618
#6 0xc0605bde in _vn_lock (vp=0xc7f17000, flags=8194, td=0xca5ffaf0, file=0xc080b893 "/usr/src/sys/kern/vfs_subr.c", line=2159) at vnode_if.h:851
#7 0xc05fa0ee in vrele (vp=0xc7f17000) at /usr/src/sys/kern/vfs_subr.c:2159
#8 0xc06004ca in kern_rename (td=0xca5ffaf0, from=0x287019b0, to=0x28701a20, pathseg=UIO_USERSPACE) at /usr/src/sys/kern/vfs_syscalls.c:3428
#9 0xc0600589 in rename (td=0xca5ffaf0, uap=0xebc0fcfc) at /usr/src/sys/kern/vfs_syscalls.c:3319
#10 0xc07ba703 in syscall (frame=0xebc0fd38) at /usr/src/sys/i386/i386/trap.c:1090
#11 0xc079fd80 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
#12 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)
-------------- next part --------------
= info =
Dump header from device /dev/da0s1b
  Architecture: i386
  Architecture Version: 2
  Dump Length: 365514752B (348 MB)
  Blocksize: 512
  Dumptime: Tue Nov 17 22:48:12 2009
  Hostname: trinity.asta.uni-potsdam.de
  Magic: FreeBSD Kernel Dump
  Version String: FreeBSD 7.2-RELEASE #3: Tue May 12 18:53:06 CEST 2009
    root@trinity.asta.uni-potsdam.de:/usr/obj/usr/src/sys/ASTA
  Panic String: null with unref'ed lowervp
  Dump Parity: 527080458
  Bounds: 4
  Dump Status: good

= kgdb =
Unread portion of the kernel message buffer:
vp = 0xc8e60228, unref'ed lowervp
deadc0de deadc0de deadc0de c713b2a0 deadc0de deadc0de deadc0de c084e1a0
panic: null with unref'ed lowervp
cpuid = 0
Uptime: 23d14h45m58s
Physical memory: 3447 MB
Dumping 348 MB: 333 317 301 285 269 253 237 221 205 189 173 157 141 125 109 93 77 61 45 29 13

(kgdb) bt
#0 doadump () at pcpu.h:196
#1 0xc057ba0c in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:418
#2 0xc057bcac in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:574
#3 0xc71386f3 in null_checkvp (vp=Variable "vp" is not available.
) at /usr/src/sys/modules/nullfs/../../fs/nullfs/null_subr.c:337
#4 0xc7139557 in null_lock (ap=0xeb5af994) at /usr/src/sys/modules/nullfs/../../fs/nullfs/null_vnops.c:531
#5 0xc07d0a35 in VOP_LOCK1_APV (vop=0xc713b4c0, a=0xeb5af994) at vnode_if.c:1618
#6 0xc0605bde in _vn_lock (vp=0xc8e60228, flags=8194, td=0xc8e87460, file=0xc080b893 "/usr/src/sys/kern/vfs_subr.c", line=2159) at vnode_if.h:851
#7 0xc05fa0ee in vrele (vp=0xc8e60228) at /usr/src/sys/kern/vfs_subr.c:2159
#8 0xc05f05fe in namei (ndp=0xeb5afb7c) at /usr/src/sys/kern/vfs_lookup.c:202
#9 0xc0605572 in vn_open_cred (ndp=0xeb5afb7c, flagp=0xeb5afc78, cmode=0, cred=0xc76c2100, fp=0xcd28a6d4) at /usr/src/sys/kern/vfs_vnops.c:188
#10 0xc06057f3 in vn_open (ndp=0xeb5afb7c, flagp=0xeb5afc78, cmode=0, fp=0xcd28a6d4) at /usr/src/sys/kern/vfs_vnops.c:94
#11 0xc06046b3 in kern_open (td=0xc8e87460, path=0xbfbfa8f0, pathseg=UIO_USERSPACE, flags=3, mode=0) at /usr/src/sys/kern/vfs_syscalls.c:1042
#12 0xc0604ba0 in open (td=0xc8e87460, uap=0xeb5afcfc) at /usr/src/sys/kern/vfs_syscalls.c:1009
#13 0xc07ba703 in syscall (frame=0xeb5afd38) at /usr/src/sys/i386/i386/trap.c:1090
#14 0xc079fd80 in Xint0x80_syscall () at /usr/src/sys/i386/i386/exception.s:255
#15 0x00000033 in ?? ()
Previous frame inner to this frame (corrupt stack?)

From jhell at DataIX.net Mon Nov 23 02:36:22 2009
From: jhell at DataIX.net (jhell)
Date: Mon Nov 23 02:36:28 2009
Subject: 7.2 dies in zfs
In-Reply-To:
References: <20091122154228.GB55532@rron.freenix.org>
Message-ID: <917839072.20091122213618@DataIX.net>

Sunday, November 22, 2009, 2:47:03 PM, Randy wrote:

>> I've given up on ZFS on i386. Whatever tuning you could do is only
>> delaying the inevitable. Even with lots of RAM, it will panic. I'd
>> love being proven wrong as I also have a 4 GB i386 with ZFS and it
>> panics regularly.

> i am not sure one can not tune it to stay up. it's just not clear how.
> and that is the critical point.

> on a 4g i386 with moderate+ load, my current parms are

> vm.kmem_size=1500M
> vm.kmem_size_max=2G
> vfs.zfs.arc_min=120M
> vfs.zfs.arc_max=900M
> vfs.zfs.prefetch_disable=1

> and it's up over two days, wooo wooo!

> i suspect that i will next lower arc max.

7.2-STABLE --- arc_min & arc_max AFAIR are auto-tuned on boot to values
that don't come to mind right now. Have you tried just leaving these for
the system to tune?

On my i386 running 7.2-STABLE with 1G RAM, 1800MHz, with two 7200RPM
drives, I dual boot Win7 & FreeBSD, not that it makes a difference. My
large drive carries FreeBSD, ZFS only, with two slices of my second
drive: one for the ZFS cache and one for the ZIL. kmem_size is 512M and
max is 768M & I don't disable the prefetch... Kernel config includes
KVA_PAGES=512 along with no other kernel tuning. Seems no matter what I
do with this machine I cannot get it to crash whatsoever.
I have stress tested it with unixbench, iozone & dd(1). This machine's
uptime during long periods of not having to boot into (Windows for
Work) purposes is around 13 days or so. ;-)

I could not be more pleased, other than I would like some faster write
speeds when doing inbound xfers from a personal fileserver, but this I
just blame on lack of better hardware.

Best of Luck.

PS: If you go too low on arc_* you are going to inhibit your write
speeds dramatically while trying to save your memory for other purposes.

--
Sunday, November 22, 2009 9:18:42 PM
jhell

From roberto at keltia.freenix.fr Mon Nov 23 09:57:33 2009
From: roberto at keltia.freenix.fr (Ollivier Robert)
Date: Mon Nov 23 09:57:39 2009
Subject: 7.2 dies in zfs
In-Reply-To:
References: <20091122154228.GB55532@rron.freenix.org>
Message-ID: <20091123095645.GA61769@roberto-al.eurocontrol.fr>

According to Randy Bush:
>i suspect that i will next lower arc max.

The machine generally survives a buildworld and will panic later on a simple
"svn update". There is no real pattern...

--
Ollivier ROBERT -=- FreeBSD: The Power to Serve! -=- roberto@keltia.freenix.fr
In memoriam to Ondine : http://ondine.keltia.net/

From bugmaster at FreeBSD.org Mon Nov 23 11:06:53 2009
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Nov 23 11:07:59 2009
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org
Message-ID: <200911231106.nANB6qIn070109@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o sparc/140797 fs         [nfs] [panic] panic on 8.0-RC3/sparc64 as an NFS serve
o kern/140682  fs         [netgraph] [panic] random panic in netgraph
o kern/140661  fs         [zfs] /boot/loader fails to work on a GPT/ZFS-only sys
o kern/140640  fs         [zfs] snapshot crash
o kern/140433  fs         [zfs] [panic] panic while replaying ZIL after crash
o kern/140134  fs         [msdosfs] write and fsck destroy filesystem integrity
o kern/140068  fs         [smbfs] [patch] smbfs does not allow semicolon in file
o kern/139725  fs         [zfs] zdb(1) dumps core on i386 when examining zpool c
o kern/139715  fs         [zfs] vfs.numvnodes leak on busy zfs
o bin/139651   fs         [nfs] mount(8): read-only remount of NFS volume does n
o kern/139597  fs         [patch] [tmpfs] tmpfs initializes va_gen but doesn't u
o kern/139564  fs         [zfs] [panic] 8.0-RC1 - Fatal trap 12 at end of shutdo
o kern/139407  fs         [smbfs] [panic] smb mount causes system crash if remot
o kern/139363  fs         [nfs] diskless root nfs mount from non FreeBSD server
o kern/138790  fs         [zfs] ZFS ceases caching when mem demand is high
o kern/138524  fs         [msdosfs] disks and usb flashes/cards with Russian lab
o kern/138421  fs         [ufs] [patch] remove UFS label limitations
o kern/138367  fs         [tmpfs] [panic] 'panic: Assertion pages > 0 failed' wh
o kern/138202  fs         mount_msdosfs(1) see only 2Gb
o kern/138109  fs         [extfs] [patch] Minor cleanups to the sys/gnu/fs/ext2f
f kern/137037  fs         [zfs] [hang] zfs rollback on root causes FreeBSD to fr
o kern/136968  fs         [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945  fs         [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944  fs         [ffs] [lor] bufwait/snaplk (fsync)
o kern/136873  fs         [ntfs] Missing directories/files on NTFS volume
o kern/136865  fs         [nfs] [patch] NFS exports atomic and on-the-fly atomic
o kern/136470  fs         [nfs] Cannot mount / in read-only, over NFS
o kern/135594  fs         [zfs] Single dataset unresponsive with Samba
o kern/135546  fs         [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135469  fs         [ufs] [panic] kernel crash on md operation in ufs_dirb
o kern/135050  fs         [zfs] ZFS clears/hides disk errors on reboot
o kern/134491  fs         [zfs] Hot spares are rather cold...
o kern/133980  fs         [panic] [ffs] panic: ffs_valloc: dup alloc
o kern/133676  fs         [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133614  fs         [panic] panic: ffs_truncate: read-only filesystem
o kern/133174  fs         [msdosfs] [patch] msdosfs must support utf-encoded int
f kern/133150  fs         [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w
o kern/132960  fs         [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132597  fs         [tmpfs] [panic] tmpfs-related panic while interrupting
o kern/132397  fs         reboot causes filesystem corruption (failure to sync b
o kern/132331  fs         [ufs] [lor] LOR ufs and syncer
o kern/132237  fs         [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145  fs         [panic] File System Hard Crashes
o kern/131995  fs         [nfs] Failure to mount NFSv4 server
o kern/131441  fs         [unionfs] [nullfs] unionfs and/or nullfs not combineab
o kern/131360  fs         [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs         [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs         makefs: error "Bad file descriptor" on the mount poin
o kern/130979  fs         [smbfs] [panic] boot/kernel/smbfs.ko
o kern/130920  fs         [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130229  fs         [iconv] usermount fails on fs that need iconv
o kern/130210  fs         [nullfs] Error by check nullfs
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488  fs         [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c:
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
o kern/129059  fs         [zfs] [patch] ZFS bootloader whitelistable via WITHOUT
f kern/128829  fs         smbd(8) causes periodic panic on 7-RELEASE
o kern/127659  fs         [tmpfs] tmpfs memory leak
o kern/127420  fs         [gjournal] [panic] Journal overflow on
gmirrored gjour o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS p kern/124621 fs [ext3] [patch] Cannot mount ext2fs partition f bin/124424 fs [zfs] zfs(8): zfs list -r shows strange snapshots' siz o kern/123939 fs [msdosfs] corrupts new files o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121779 fs [ufs] snapinfo(8) (and related tools?) only work for t o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8 o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha f kern/120991 fs [panic] [fs] [snapshot] System crashes when manipulati o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F f kern/119735 fs [zfs] geli + ZFS + samba starting on boot panics 7.0-B o kern/118912 fs [2tb] disk sizing/geometry problem with large array o kern/118713 fs [minidump] [patch] Display media size required for a k o bin/118249 fs mv(1): moving a directory changes its mtime o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o kern/116913 fs [ffs] [panic] ffs_blkfree: freeing free block p kern/116608 fs [msdosfs] [patch] msdosfs fails to check mount options o kern/116583 fs [ffs] [hang] System freezes for short time when using o kern/116170 fs [panic] Kernel panic when mounting /tmp o kern/115645 fs [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex 
o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [iso9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna f kern/91568 fs [ufs] [panic] writing to UFS/softupdates DVD media in o 
kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/88657 fs [smbfs] windows client hang when browsing a samba shar o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o kern/85326 fs [smbfs] [panic] saving a file via samba to an overquot o kern/84589 fs [2TB] 5.4-STABLE unresponsive during background fsck 2 o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 139 problems total. From jhb at freebsd.org Mon Nov 23 15:40:56 2009 From: jhb at freebsd.org (John Baldwin) Date: Mon Nov 23 15:41:46 2009 Subject: Current gptzfsboot limitations In-Reply-To: References: Message-ID: <200911231018.40815.jhb@freebsd.org> On Friday 20 November 2009 7:46:54 pm Matt Reimer wrote: > I've been analyzing gptzfsboot to see what its limitations are. I > think it should now work fine for a healthy pool with any number of > disks, with any type of vdev, whether single disk, stripe, mirror, > raidz or raidz2. 
> > But there are currently several limitations (likely in loader.zfs > too), mostly due to the limited amount of memory available (< 640KB) > and the simple memory allocators used (a simple malloc() and > zfs_alloc_temp()). > > 1. gptzfsboot might fail to read compressed files on raidz/raidz2 > pools. The reason is that the temporary buffer used for I/O > (zfs_temp_buf in zfsimpl.c) is 128KB by default, but a 128KB > compressed block will require a 128KB buffer to be allocated before > the I/O is done, leaving nothing for the raidz code further on. The > fix would be to make the temporary buffer larger, but for some > reason it's not as simple as just changing the TEMP_SIZE define > (possibly a stack overflow results; more debugging needed). > Workaround: don't enable compression on your root filesystem (aka > bootfs). > > 2. gptzfsboot might fail to reconstruct a file that is read from a > degraded raidz/raidz2 pool, or if the file is corrupt somehow (i.e. > the pool is healthy but the checksums don't match). The reason again > is that the temporary buffer gets exhausted. I think this will only > happen in the case where more than one physical block is corrupt or > unreadable. The fix has several aspects: 1) make the temporary buffer > much larger, perhaps larger than 640KB; 2) change > zfssubr.c:vdev_raidz_read() to reuse the temp buffers it allocates > when possible; and 3) either restructure > zfssubr.c:vdev_raidz_reconstruct_pq() to only allocate its temporary > buffers once per I/O, or use a malloc that has free() implemented. > Workaround: repair your pool somehow (e.g. pxeboot) so one or no disks > are bad. > > 3. gptzfsboot might fail to boot from a degraded pool that has one or > more drives marked offline, removed, or faulted. The reason is that > vdev_probe() assumes that all vdevs are healthy, regardless of their > true state. 
gptzfsboot then will read from an offline/removed/faulted > vdev as if it were healthy, likely resulting in failed checksums, > resulting in the recovery code path being run in vdev_raidz_read(), > possibly leading to zfs_temp_buf exhaustion as in #2 above. > > A partial patch for #3 is attached, but it is inadequate because it > only reads a vdev's status from the first device's (in BIOS order) > vdev_label, with the result that if the first device is marked > offline, gptzfsboot won't see this because only the other devices' > vdev_labels will indicate that the first device is offline. (Since > after a device is offlined no further writes will be made to the > device, its vdev_label is not updated to reflect that it's offline.) > To complete the patch it would be necessary to set each leaf vdev's > status from the newest vdev_label rather than from the first > vdev_label seen. > > I think I've also hit a stack overflow a couple of times while debugging. > > I don't know enough about the gptzfsboot/loader.zfs environment to > know whether the heap size could be easily enlarged, or whether there > is room for a real malloc() with free(). loader(8) seems to use the > malloc() in libstand. Can anyone shed some light on the memory > limitations and possible solutions? > > I won't be able to spend much more time on this, but I wanted to pass > on what I've learned in case someone else has the time and boot fu to > take it the next step. One issue is that disk transfers need to happen in the lower 1MB due to BIOS limitations. The loader uses a bounce buffer (in biosdisk.c in libi386) to make this work ok. The loader uses memory > 1MB for malloc(). You could probably change zfsboot to do that as well if not already. Just note that drvread() has to bounce buffer requests in that case. The text + data + bss + stack is all in the lower 640k and there's not much you can do about that. 
The stack grows down from 640k, and the boot program text + data starts at 64k with the bss following. Hmm, drvread() might already be bounce buffering since boot2 has to do so since it copies the loader up to memory > 1MB as well. You might need to use memory > 2MB for zfsboot's malloc() so that the loader can be copied up to 1MB. It looks like you could patch malloc() in zfsboot.c to use 4*1024*1024 as heap_next and maybe 64*1024*1024 as heap_end (this assumes all machines that boot ZFS have at least 64MB of RAM, which is probably safe). -- John Baldwin From marius at alchemy.franken.de Mon Nov 23 20:10:09 2009 From: marius at alchemy.franken.de (Marius Strobl) Date: Mon Nov 23 20:10:14 2009 Subject: sparc64/140797: Panic on 8.0-RC3/sparc64 as an NFS server Message-ID: <200911232010.nANKA8Ib089404@freefall.freebsd.org> The following reply was made to PR sparc64/140797; it has been noted by GNATS. From: Marius Strobl To: bug-followup@FreeBSD.org, Greg Lewis Cc: Subject: Re: sparc64/140797: Panic on 8.0-RC3/sparc64 as an NFS server Date: Mon, 23 Nov 2009 20:49:13 +0100 Could you please test whether r199274/r199284 fix this problem? http://svn.freebsd.org/viewvc/base/head/sys/nfsserver/nfs_fha.c?r1=195202&r2=199284&diff_format=u --- head/sys/nfsserver/nfs_fha.c 2009/06/30 19:03:27 195202 +++ head/sys/nfsserver/nfs_fha.c 2009/11/15 03:09:50 199284 @@ -206,7 +206,7 @@ if (error) goto out; - i->fh = *(const u_int64_t *)(fh.fh_generic.fh_fid.fid_data); + bcopy(fh.fh_generic.fh_fid.fid_data, &i->fh, sizeof(i->fh)); /* Content ourselves with zero offset for all but reads. 
*/ if (procnum != NFSPROC_READ) From mattjreimer at gmail.com Mon Nov 23 22:04:32 2009 From: mattjreimer at gmail.com (Matt Reimer) Date: Mon Nov 23 22:04:39 2009 Subject: Current gptzfsboot limitations In-Reply-To: <200911231018.40815.jhb@freebsd.org> References: <200911231018.40815.jhb@freebsd.org> Message-ID: On Mon, Nov 23, 2009 at 7:18 AM, John Baldwin wrote: > On Friday 20 November 2009 7:46:54 pm Matt Reimer wrote: >> I've been analyzing gptzfsboot to see what its limitations are. I >> think it should now work fine for a healthy pool with any number of >> disks, with any type of vdev, whether single disk, stripe, mirror, >> raidz or raidz2. >> >> But there are currently several limitations (likely in loader.zfs >> too), mostly due to the limited amount of memory available (< 640KB) >> and the simple memory allocators used (a simple malloc() and >> zfs_alloc_temp()). ... >> >> I think I've also hit a stack overflow a couple of times while debugging. >> >> I don't know enough about the gptzfsboot/loader.zfs environment to >> know whether the heap size could be easily enlarged, or whether there >> is room for a real malloc() with free(). loader(8) seems to use the >> malloc() in libstand. Can anyone shed some light on the memory >> limitations and possible solutions? >> >> I won't be able to spend much more time on this, but I wanted to pass >> on what I've learned in case someone else has the time and boot fu to >> take it the next step. > > One issue is that disk transfers need to happen in the lower 1MB due to BIOS > limitations. The loader uses a bounce buffer (in biosdisk.c in libi386) to > make this work ok. The loader uses memory > 1MB for malloc(). You could > probably change zfsboot to do that as well if not already. Just note that > drvread() has to bounce buffer requests in that case. The text + data + bss > + stack is all in the lower 640k and there's not much you can do about that. 
> The stack grows down from 640k, and the boot program text + data starts at > 64k with the bss following. Ah, the stack growing down from 640k explains a problem I was seeing where a memcpy() to a temp buf would restart gptzfsboot--it must have been overwriting the stack. > Hmm, drvread() might already be bounce buffering > since boot2 has to do so since it copies the loader up to memory > 1MB as > well. Looks like it's already bounce buffering. All the I/O drvread does is to statically allocated char arrays, and the data is copied when necessary, e.g. in vdev_read(): if (drvread(dsk, dmadat->rdbuf, lba, nb)) return -1; memcpy(p, dmadat->rdbuf, nb * DEV_BSIZE); > You might need to use memory > 2MB for zfsboot's malloc() so that the > loader can be copied up to 1MB. It looks like you could patch malloc() in > zfsboot.c to use 4*1024*1024 as heap_next and maybe 64*1024*1024 as heap_end > (this assumes all machines that boot ZFS have at least 64MB of RAM, which is > probably safe). So are the page tables etc. already configured such that RAM above 1MB is ready to use in gptzfsboot? (I'm not familiar with the details of how virtual memory is handled on i386.) Thanks for your help John. Matt From glewis at eyesbeyond.com Tue Nov 24 05:10:03 2009 From: glewis at eyesbeyond.com (Greg Lewis) Date: Tue Nov 24 05:10:09 2009 Subject: sparc64/140797: Panic on 8.0-RC3/sparc64 as an NFS server Message-ID: <200911240510.nAO5A2h7046201@freefall.freebsd.org> The following reply was made to PR sparc64/140797; it has been noted by GNATS. From: Greg Lewis To: Marius Strobl Cc: bug-followup@FreeBSD.org, Greg Lewis Subject: Re: sparc64/140797: Panic on 8.0-RC3/sparc64 as an NFS server Date: Mon, 23 Nov 2009 21:05:33 -0800 On Mon, Nov 23, 2009 at 08:49:13PM +0100, Marius Strobl wrote: > Could you please test whether r199274/r199284 fix this problem? 
> > http://svn.freebsd.org/viewvc/base/head/sys/nfsserver/nfs_fha.c?r1=195202&r2=199284&diff_format=u > > --- head/sys/nfsserver/nfs_fha.c 2009/06/30 19:03:27 195202 > +++ head/sys/nfsserver/nfs_fha.c 2009/11/15 03:09:50 199284 > @@ -206,7 +206,7 @@ > if (error) > goto out; > > - i->fh = *(const u_int64_t *)(fh.fh_generic.fh_fid.fid_data); > + bcopy(fh.fh_generic.fh_fid.fid_data, &i->fh, sizeof(i->fh)); > > /* Content ourselves with zero offset for all but reads. */ > if (procnum != NFSPROC_READ) Thanks Marius! I'll give it a try as soon as I can and let you know. -- Greg Lewis Email : glewis@eyesbeyond.com Eyes Beyond Web : http://www.eyesbeyond.com Information Technology FreeBSD : glewis@FreeBSD.org From kvs at binarysolutions.dk Tue Nov 24 11:30:05 2009 From: kvs at binarysolutions.dk (Kenneth Schmidt) Date: Tue Nov 24 11:30:12 2009 Subject: amd64/140661: /boot/loader fails to work on a GPT/ZFS-only system on both 8.0-RC2 and RC3 Message-ID: <200911241130.nAOBU4LB005443@freefall.freebsd.org> The following reply was made to PR kern/140661; it has been noted by GNATS. From: Kenneth Schmidt To: Scot Hetzel Cc: freebsd-gnats-submit@freebsd.org Subject: Re: amd64/140661: /boot/loader fails to work on a GPT/ZFS-only system on both 8.0-RC2 and RC3 Date: Tue, 24 Nov 2009 11:51:06 +0100 --Apple-Mail-4--567691600 Content-Transfer-Encoding: 7bit Content-Type: text/plain; charset=us-ascii On Nov 18, 2009, at 21:57 , Scot Hetzel wrote: > Make sure you have LOADER_ZFS_SUPPORT in your /etc/src.conf: > > dv8t01# cat /etc/src.conf > LOADER_ZFS_SUPPORT=YES Ah! I also have LOADER_TFTP_SUPPORT=YES. Removing that, and everything works. I don't know why I didn't think of that in the first place, but maybe this is either a bug, or something that should be warned about when building loader(8)? 
/Kenneth From james-freebsd-fs2 at jrv.org Tue Nov 24 11:50:49 2009 From: james-freebsd-fs2 at jrv.org (James R. Van Artsdalen) Date: Tue Nov 24 11:51:11 2009 Subject: Current gptzfsboot limitations In-Reply-To: References: Message-ID: <4B0BC896.8030808@jrv.org> I assume that *zfsboot requires that /boot and /boot/kernel be in the boot filesystem and not filesystems of their own. A man page probably ought to say this or someone will be tempted to "zfs create pool/boot/kernel" so they can roll back undesirable kernel installs. From se at freebsd.org Tue Nov 24 12:47:35 2009 From: se at freebsd.org (Stefan Esser) Date: Tue Nov 24 12:47:42 2009 Subject: 7.2 dies in zfs In-Reply-To: <20091122154228.GB55532@rron.freenix.org> References: <20091122154228.GB55532@rron.freenix.org> Message-ID: <4B0BD5E7.3050604@freebsd.org> On 22.11.2009 16:42, Ollivier Robert wrote: > According to Randy Bush: >> i think the issue is how to tune for zfs >> >> i386 with 4G of RAM > > I've given up on ZFS on i386. Whatever tuning you could do is only > delaying the inevitable. Even with lots of RAM, it will panic. 
I'd > love being proven wrong as I also have a 4 GB i386 with ZFS and it panics > regularly. If your i386 based system has much RAM (2GB or more), then you should definitely increase KVA_PAGES. Not doing so will lead to panics, not in spite of but exactly because of the large RAM. I have been using ZFS on i386 since it became available, first for testing and soon as only file-system (with UFS boot, initially, now switching over to gptzfsboot). Systems range from Pentium-3 to AMD64x2 and I see no problems even under significant load. The following is the complete contents of /boot/loader.conf on my home server with 512MB RAM (the maximum supported on this P3/733 based SFF box) and a single 320MB IDE drive: zfs_load="YES" vfs.root.mountfrom="zfs:gk" vfs.zfs.arc_max="80000000" vm.kmem_size="350000000" The box is a mail gateway with spam-filter, IMAP server, web-server, SMB and NFS server for media applications (incl. storage backend for a networked digital TV receiver). It has only 280 days of uptime, since it took a reboot to upgrade kernel and world when ZFS version 13 had been committed to 8-current (previous uptime was at least as long). With 4GB of RAM you need to raise KVA_PAGES or you'll run into a panic. Perhaps the default of 256 should be raised to 512? The cost of KVA_PAGES=512 is 1MB of RAM allocated to the kernel page table and 1GB less maximum user process size ... Sun specifically mentions that ZFS makes assumptions that are easily valid on 64bit architectures, but not so easy to meet on 32bit systems. But for moderate load, ZFS can run on a 512MB P3 with good reliability and the known advantages from an admin POV. 
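The KVA_PAGES cost quoted above can be checked with a little arithmetic. The following is only a minimal sketch, assuming a non-PAE i386 kernel in which each KVA_PAGES unit maps 4MB of kernel address space through one 4KB page table; the function and macro names are hypothetical and are not taken from the FreeBSD source.

```c
/* Assumption: non-PAE i386, one KVA_PAGES unit = 4MB of kernel address space. */
#define KVA_UNIT_MB     4u
/* Assumption: each unit is backed by one 4KB page table. */
#define PT_KB_PER_UNIT  4u
/* A 32-bit address space is 4GB in total. */
#define ADDR_SPACE_MB   4096u

/* Kernel virtual address space, in MB, for a given KVA_PAGES value. */
unsigned kva_mb(unsigned kva_pages)
{
	return kva_pages * KVA_UNIT_MB;
}

/* Address space left for a user process, in MB. */
unsigned user_mb(unsigned kva_pages)
{
	return ADDR_SPACE_MB - kva_mb(kva_pages);
}

/* Memory spent on kernel page tables, in KB. */
unsigned page_table_kb(unsigned kva_pages)
{
	return kva_pages * PT_KB_PER_UNIT;
}
```

Under these assumptions, going from 256 to 512 costs 1MB more for page tables and takes 1GB away from user processes, which matches the figures above; 384 gives 1.5GB of KVA.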
Regards, STefan From gleb.kurtsou at gmail.com Tue Nov 24 13:24:05 2009 From: gleb.kurtsou at gmail.com (Gleb Kurtsou) Date: Tue Nov 24 13:24:11 2009 Subject: [nullfs] [panic] null with unref'ed lowervp In-Reply-To: <4B09EDB2.7020002@mschuette.name> References: <4B09EDB2.7020002@mschuette.name> Message-ID: <20091124132357.GA1941@tops.skynet.lt> On (23/11/2009 03:04), Martin Schütte wrote: > Hello, > my server recently had the kernel panic "null with unref'ed lowervp" in > null_subr.c:null_checkvp(). > I am not sure whether I should open a PR for it. > > The system is a SMP machine with SCSI RAID (asr), it runs several jails > and uses multiple null mounts between partitions, because there is not > enough disk space (~90% usage). > > uname -a: > FreeBSD trinity.asta.uni-potsdam.de 7.2-RELEASE > FreeBSD 7.2-RELEASE #3: Tue May 12 18:53:06 CEST 2009 > root@trinity.asta.uni-potsdam.de:/usr/obj/usr/src/sys/ASTA i386 > > Kernel is compiled with: > options INVARIANT_SUPPORT > options INVARIANTS > options DIAGNOSTIC As I understand it, the null_checkvp assumptions don't hold in null_lock and null_unlock. So I'd suggest running without DIAGNOSTIC or trying the attached patch instead. > Among the few occurrences I was not able to observe a pattern or > reproduce the error (two crashes happened at 8am which is a cronjob > time, but not one with particularly high load). > I am also going to add new disks any time now in order to reduce the > number of null mounts, so I do not expect to see this error again. > > I append three gdb backtraces, in case anyone finds them useful. > > -- > Martin Schütte -------------- next part -------------- diff --git a/sys/fs/nullfs/null_vnops.c b/sys/fs/nullfs/null_vnops.c index a028b63..4c0679f 100644 --- a/sys/fs/nullfs/null_vnops.c +++ b/sys/fs/nullfs/null_vnops.c @@ -553,7 +553,7 @@ null_lock(struct vop_lock1_args *ap) * lock as ffs has special lock considerations in it's * vop lock. 
*/ - if (nn != NULL && (lvp = NULLVPTOLOWERVP(vp)) != NULL) { + if (nn != NULL && (lvp = nn->null_lowervp) != NULL) { VI_LOCK_FLAGS(lvp, MTX_DUPOK); VI_UNLOCK(vp); /* @@ -622,7 +622,7 @@ null_unlock(struct vop_unlock_args *ap) mtxlkflag = 2; } nn = VTONULL(vp); - if (nn != NULL && (lvp = NULLVPTOLOWERVP(vp)) != NULL) { + if (nn != NULL && (lvp = nn->null_lowervp) != NULL) { VI_LOCK_FLAGS(lvp, MTX_DUPOK); flags |= LK_INTERLOCK; vholdl(lvp); From roberto at keltia.freenix.fr Tue Nov 24 14:24:37 2009 From: roberto at keltia.freenix.fr (Ollivier Robert) Date: Tue Nov 24 14:24:43 2009 Subject: 7.2 dies in zfs In-Reply-To: <4B0BD5E7.3050604@freebsd.org> References: <20091122154228.GB55532@rron.freenix.org> <4B0BD5E7.3050604@freebsd.org> Message-ID: <20091124142407.GD81894@roberto-al.eurocontrol.fr> According to Stefan Esser: >If your i386 based system has much RAM (2GB or more), then you >should definitely increase KVA_PAGES. Not doing so will lead to >panics, not in spite of but exactly because of the large RAM. I have upped KVA_PAGES of course, but this reduces the amount of memory available to processes. If you set KVA_PAGES to the equivalent of 2GB, for example, every process will be able to use only the remaining 2 GB for its own memory, so there is a trade-off there. >I have been using ZFS on i386 since it became available, first for >testing and soon as only file-system (with UFS boot, initially, now >switching over to gptzfsboot). Systems range from Pentium-3 to >AMD64x2 and I see no problems even under significant load. I've found that load is not a factor (if one defines load as many concurrent processes). The machine is mostly idle and I've seen panics coming from a "cvs update" or a "svn up". These are I/O intensive, but not that much, whereas the same machine can survive a buildworld just fine. The machine I have is a dual Xeon @2.8 GHz with 4 GB of RAM and 200 GB of disk. 
/boot/loader.conf ----- #-- limits kern.maxdsiz="1024M" kern.maxssiz="256M" kern.dfldsiz="1024M" kern.dflssiz="128M" #-- vm tuning vm.kmem_size="1024M" vm.kmem_size_max="1224M" vfs.zfs.arc_max="128M" vfs.zfs.prefetch_disable=1 ----- options KVA_PAGES=384 # 1.5GB of KVA -- Ollivier ROBERT -=- FreeBSD: The Power to Serve! -=- roberto@keltia.freenix.fr In memoriam to Ondine : http://ondine.keltia.net/ From jhb at freebsd.org Tue Nov 24 18:18:25 2009 From: jhb at freebsd.org (John Baldwin) Date: Tue Nov 24 18:18:46 2009 Subject: Current gptzfsboot limitations In-Reply-To: References: <200911231018.40815.jhb@freebsd.org> Message-ID: <200911241143.24034.jhb@freebsd.org> On Monday 23 November 2009 5:04:30 pm Matt Reimer wrote: > On Mon, Nov 23, 2009 at 7:18 AM, John Baldwin wrote: > > On Friday 20 November 2009 7:46:54 pm Matt Reimer wrote: > >> I've been analyzing gptzfsboot to see what its limitations are. I > >> think it should now work fine for a healthy pool with any number of > >> disks, with any type of vdev, whether single disk, stripe, mirror, > >> raidz or raidz2. > >> > >> But there are currently several limitations (likely in loader.zfs > >> too), mostly due to the limited amount of memory available (< 640KB) > >> and the simple memory allocators used (a simple malloc() and > >> zfs_alloc_temp()). > ... > >> > >> I think I've also hit a stack overflow a couple of times while debugging. > >> > >> I don't know enough about the gptzfsboot/loader.zfs environment to > >> know whether the heap size could be easily enlarged, or whether there > >> is room for a real malloc() with free(). loader(8) seems to use the > >> malloc() in libstand. Can anyone shed some light on the memory > >> limitations and possible solutions? > >> > >> I won't be able to spend much more time on this, but I wanted to pass > >> on what I've learned in case someone else has the time and boot fu to > >> take it the next step. 
> > > > One issue is that disk transfers need to happen in the lower 1MB due to BIOS > > limitations. The loader uses a bounce buffer (in biosdisk.c in libi386) to > > make this work ok. The loader uses memory > 1MB for malloc(). You could > > probably change zfsboot to do that as well if not already. Just note that > > drvread() has to bounce buffer requests in that case. The text + data + bss > > + stack is all in the lower 640k and there's not much you can do about that. > > The stack grows down from 640k, and the boot program text + data starts at > > 64k with the bss following. > > Ah, the stack growing down from 640k explains a problem I was seeing > where a memcpy() to a temp buf would restart gptzfsboot--it must have > been overwriting the stack. > > > Hmm, drvread() might already be bounce buffering > > since boot2 has to do so since it copies the loader up to memory > 1MB as > > well. > > Looks like it's already bounce buffering. All the I/O drvread does is > to statically allocated char arrays, and the data is copied when > necessary, e.g. in vdev_read(): > > if (drvread(dsk, dmadat->rdbuf, lba, nb)) > return -1; > memcpy(p, dmadat->rdbuf, nb * DEV_BSIZE); > > > > You might need to use memory > 2MB for zfsboot's malloc() so that the > > loader can be copied up to 1MB. It looks like you could patch malloc() in > > zfsboot.c to use 4*1024*1024 as heap_next and maybe 64*1024*1024 as heap_end > > (this assumes all machines that boot ZFS have at least 64MB of RAM, which is > > probably safe). > > So are the page tables etc. already configured such that RAM above 1MB > is ready to use in gptzfsboot? (I'm not familiar with the details of > how virtual memory is handled on i386.) > > Thanks for your help John. Paging is not enabled in the boot loader. Instead, the loader runs in a 32-bit flat mode (but with an offset of 0xa000). Simply changing the constants for heap_start and heap_end should be sufficient. 
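To illustrate what changing those heap constants amounts to, here is a minimal sketch of a bump allocator with no free(), of the kind described in this thread. It is not the zfsboot.c code: the struct and function names are hypothetical, and the region is passed in as a parameter so the sketch can be tried in an ordinary program instead of against fixed physical addresses such as 4MB and 64MB.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical allocator state: everything in [next, end) is still free. */
struct bump_heap {
	uintptr_t next;	/* first free byte */
	uintptr_t end;	/* one past the last usable byte */
};

/* Point the heap at a region; a boot program would use fixed addresses here. */
void bump_init(struct bump_heap *h, void *base, size_t size)
{
	h->next = (uintptr_t)base;
	h->end = h->next + size;
}

/*
 * Hand out n bytes, 16-byte aligned, by advancing the next pointer.
 * Returns NULL once the region is exhausted. Nothing is ever freed,
 * which is why a larger region helps when temporary buffers pile up.
 */
void *bump_alloc(struct bump_heap *h, size_t n)
{
	uintptr_t p = (h->next + 15u) & ~(uintptr_t)15u;

	if (p < h->next || p > h->end || n > h->end - p)
		return NULL;
	h->next = p + n;
	return (void *)p;
}
```

With an allocator of this shape, enlarging the heap is only a matter of where the start and end of the region point, which is consistent with the suggestion above that changing the two constants should be enough.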

--
John Baldwin

From kickbsd at ya.ru Tue Nov 24 18:50:03 2009
From: kickbsd at ya.ru (Baginski Darren)
Date: Tue Nov 24 18:50:10 2009
Subject: kern/139715: [zfs] vfs.numvnodes leak on busy zfs
Message-ID: <200911241850.nAOIo3El088504@freefall.freebsd.org>

The following reply was made to PR kern/139715; it has been noted by GNATS.

From: Baginski Darren
To: bug-followup@freebsd.org
Cc:
Subject: Re: kern/139715: [zfs] vfs.numvnodes leak on busy zfs
Date: Tue, 24 Nov 2009 21:48:06 +0300

Same issue on
FreeBSD zfs-tsts073 8.0-PRERELEASE FreeBSD 8.0-PRERELEASE #8: Mon Nov 23 16:04:14 UTC 2009 root@zfs-tsts073:/usr/obj/usr/src/sys/GENERIC amd64

From mattjreimer at gmail.com Wed Nov 25 00:28:06 2009
From: mattjreimer at gmail.com (Matt Reimer)
Date: Wed Nov 25 00:28:12 2009
Subject: Current gptzfsboot limitations
In-Reply-To: <4B0BC896.8030808@jrv.org>
References: <4B0BC896.8030808@jrv.org>
Message-ID:

On Tue, Nov 24, 2009 at 3:50 AM, James R. Van Artsdalen wrote:
> I assume that *zfsboot requires that /boot and /boot/kernel be in the
> boot filesystem and not filesystems of their own.
>
> A man page probably ought to say this or someone will be tempted to "zfs
> create pool/boot/kernel" so they can roll back undesirable kernel installs.

gptzfsboot (and I'm pretty sure zfsboot too) uses the first pool it
finds. It opens the pool, gets the 'bootfs' property (i.e. the one set
with "zpool set bootfs=tank/ROOT tank") and retrieves loader(8) from
that filesystem. You can boot from any filesystem in the pool.

Matt

From linimon at FreeBSD.org Wed Nov 25 03:40:05 2009
From: linimon at FreeBSD.org (linimon@FreeBSD.org)
Date: Wed Nov 25 03:40:11 2009
Subject: kern/140853: [nfs] [patch] NFSv2 remove calls fail to send error replies (memory leak!)
Message-ID: <200911250340.nAP3e5ud052278@freefall.freebsd.org>

Old Synopsis: NFSv2 remove calls fail to send error replies (memory leak!)
New Synopsis: [nfs] [patch] NFSv2 remove calls fail to send error replies (memory leak!)

Responsible-Changed-From-To: freebsd-bugs->freebsd-fs
Responsible-Changed-By: linimon
Responsible-Changed-When: Wed Nov 25 03:39:42 UTC 2009
Responsible-Changed-Why:
Over to maintainer(s).

http://www.freebsd.org/cgi/query-pr.cgi?pr=140853

From glewis at eyesbeyond.com Wed Nov 25 06:00:15 2009
From: glewis at eyesbeyond.com (Greg Lewis)
Date: Wed Nov 25 06:00:31 2009
Subject: sparc64/140797: Panic on 8.0-RC3/sparc64 as an NFS server
Message-ID: <200911250600.nAP60EgI077102@freefall.freebsd.org>

The following reply was made to PR sparc64/140797; it has been noted by GNATS.

From: Greg Lewis
To: Marius Strobl
Cc: bug-followup@FreeBSD.org, Greg Lewis
Subject: Re: sparc64/140797: Panic on 8.0-RC3/sparc64 as an NFS server
Date: Tue, 24 Nov 2009 21:53:56 -0800

G'day Marius,

On Mon, Nov 23, 2009 at 08:49:13PM +0100, Marius Strobl wrote:
> Could you please test whether r199274/r199284 fix this problem?
>
> http://svn.freebsd.org/viewvc/base/head/sys/nfsserver/nfs_fha.c?r1=195202&r2=199284&diff_format=u
>
> --- head/sys/nfsserver/nfs_fha.c 2009/06/30 19:03:27 195202
> +++ head/sys/nfsserver/nfs_fha.c 2009/11/15 03:09:50 199284
> @@ -206,7 +206,7 @@
>          if (error)
>                  goto out;
>
> -        i->fh = *(const u_int64_t *)(fh.fh_generic.fh_fid.fid_data);
> +        bcopy(fh.fh_generic.fh_fid.fid_data, &i->fh, sizeof(i->fh));
>
>          /* Content ourselves with zero offset for all but reads. */
>          if (procnum != NFSPROC_READ)

Yes, this fixes it! Thanks. Sorry for not finding that myself.

I realise it may be too late to get this into the release. It might be
worth an ERRATA notice if so.
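
As an illustration of the pattern in the patch quoted above: the old
line dereferences a byte buffer through a u_int64_t pointer, and the new
line copies the bytes instead. The thread does not spell out the reason;
presumably fid_data is not guaranteed to be 8-byte aligned, which
matters on strict-alignment machines such as sparc64. The sketch below
is a user-space approximation, not the kernel code: load_u64() is a
hypothetical helper, and memcpy() stands in for bcopy() (note the
swapped source and destination arguments).

```c
#include <stdint.h>
#include <string.h>

/*
 * Read 8 bytes from a buffer of arbitrary alignment.
 *
 * Writing *(const uint64_t *)data instead would be an unaligned load
 * whenever data is not 8-byte aligned; copying byte-wise into a
 * properly aligned local avoids that on every architecture.
 */
static uint64_t
load_u64(const unsigned char *data)
{
	uint64_t value;

	memcpy(&value, data, sizeof(value));
	return (value);
}
```

Calling load_u64(buf + 1) on a char array is therefore fine, whereas the
cast-and-dereference form at the same address is exactly the kind of
access that can trap.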

--
Greg Lewis                          Email   : glewis@eyesbeyond.com
Eyes Beyond                         Web     : http://www.eyesbeyond.com
Information Technology              FreeBSD : glewis@FreeBSD.org

From se at freebsd.org Wed Nov 25 09:52:27 2009
From: se at freebsd.org (Stefan Esser)
Date: Wed Nov 25 09:52:33 2009
Subject: 7.2 dies in zfs
In-Reply-To: <20091124142407.GD81894@roberto-al.eurocontrol.fr>
References: <20091122154228.GB55532@rron.freenix.org> <4B0BD5E7.3050604@freebsd.org> <20091124142407.GD81894@roberto-al.eurocontrol.fr>
Message-ID: <4B0CFE5B.8050504@freebsd.org>

On 24.11.2009 15:24, Ollivier Robert wrote:
> According to Stefan Esser:
>> If your i386 based system has much RAM (2GB or more), then you
>> should definitely increase KVA_PAGES. Not doing so will lead to
>> panics, not in spite of but exactly because of the large RAM.
>
> I have upped KVA_PAGES of course but this is reducing the amount of
> memory available to processes. If you define KVA_PAGES to 2GB for
> example, every process will be able to use only the remaining 2 GB for
> their own memory so there is a trade off there.

Yes, I had mentioned that (256 -> 512 means 2MB RAM instead of 1MB
spent on the kernel page table and 2GB rather than 3GB user process
size; and AFAIK, the limit on the user process size has been the reason
for not raising KVA_PAGES to 512 by default).

Maybe you can estimate the amount of kernel memory required by
measurement of kmem statistics without ZFS and adding about twice the
ARC cache limit you want to impose. (The ARC can grow beyond arc_max,
IIRC because this is just the high water mark where the cache is
aggressively flushed and also because meta data is not taken into
account by this limit.)

E.g., if your system reports a vm.kvm_free of 100MB, you may be able
to fit in an ARC of 50MB.

>> I have been using ZFS on i386 since it became available, first for
>> testing and soon as only file-system (with UFS boot, initially, now
>> switching over to gptzfsboot).
Systems range from Pentium-3 to
>> AMD64x2 and I see no problems even under significant load.
>
> I've found that load is not a factor (if one defines load as many
> concurrent processes). The machine is mostly idle and I've seen panics
> coming from a "cvs update" or a "svn up". These are I/O intensive but
> not that much whereas the same machine can survive a buildworld just fine.
>
> The machine I have is a dual Xeon @2.8 GHz with 4 GB of RAM and 200 GB
> of disk.
>
> /boot/loader.conf
> -----
> #-- limits
> kern.maxdsiz="1024M"
> kern.maxssiz="256M"
> kern.dfldsiz="1024M"
> kern.dflssiz="128M"
>
> #-- vm tuning
> vm.kmem_size="1024M"
> vm.kmem_size_max="1224M"
> vfs.zfs.arc_max="128M"
> vfs.zfs.prefetch_disable=1
> -----
>
> options KVA_PAGES=384 # 1.5GB of KVA

Well, my example was for the 512MB P3, which I use because of its
power efficiency (less than 30W idle power drawn).

My home workstation is an AMD x2 with 2GB RAM, 3*1TB disk (RAIDZ1),
and with KVA_PAGES=512 and the following tunables set:

kern.maxssiz="128M"
vfs.root.mountfrom="zfs:raid1" # Specify root partition
vm.kmem_size="1500M" # Sets the size of kernel memory (bytes)
vm.kmem_size_max="2G" # Sets the size of kernel memory (bytes)
zfs_load="YES"

The ARC size is not limited, currently, and auto-sizes to some 950MB.
But I have tried arc_max limits down to 200MB to study the impact.

The system is absolutely reliable (with regard to ZFS, but haunted by
LORs). I'm using a kernel with INVARIANTS and full WITNESS, since I
want to understand lock-ups apparently caused by the combination of
Atheros WLAN and SMP (sometimes accompanied by LORs).

It survives not only CVS and SVN updates, but also other operations
that made ZFS panic before I raised KVA_PAGES to its current value.

Maybe the defaults for kmem_size and kmem_size_max would suffice; I
have not tried them for a while. But KVA_PAGES=512 is essential for my
system with 2GB RAM, and I guess this is even more true for your box
with 4GB.
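
To make the KVA_PAGES trade-off above concrete, here is a small sketch
of the arithmetic. It assumes a non-PAE i386 kernel, where each
KVA_PAGES unit covers 4MB of kernel virtual address space; that matches
the "KVA_PAGES=384 # 1.5GB of KVA" comment quoted above. The helper
names are hypothetical, and the user-space figure is only approximate
because it ignores the small regions the kernel reserves at the
boundary.

```c
#include <stdint.h>

/* Assumption: non-PAE i386, one KVA_PAGES unit maps 4MB. */
#define KVA_UNIT_MB   4u
/* Total 32-bit virtual address space, in MB. */
#define ADDR_SPACE_MB 4096u

/* Kernel virtual address space implied by a KVA_PAGES value, in MB. */
static uint32_t
kva_size_mb(uint32_t kva_pages)
{
	return (kva_pages * KVA_UNIT_MB);
}

/* Rough upper bound on per-process address space, in MB. */
static uint32_t
user_size_mb(uint32_t kva_pages)
{
	/* A kernel claiming the whole address space leaves nothing. */
	if (kva_pages >= ADDR_SPACE_MB / KVA_UNIT_MB)
		return (0);
	return (ADDR_SPACE_MB - kva_size_mb(kva_pages));
}
```

Under these assumptions KVA_PAGES=256 gives 1GB of KVA and about 3GB
per process, 384 gives 1.5GB and about 2.5GB, and 512 gives 2GB and
about 2GB, in line with the figures mentioned in this thread.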

Regards, Stefan

From sarawgi.aditya at gmail.com Wed Nov 25 13:42:05 2009
From: sarawgi.aditya at gmail.com (Aditya Sarawgi)
Date: Wed Nov 25 13:42:12 2009
Subject: ext2fs locks help
Message-ID: <4b0d342b.161bf30a.56fa.fffff5ed@mx.google.com>

Hi,

I am experiencing a strange problem with some locks I have applied to
ext2fs. Here's what is happening:

636 static daddr_t
637 ext2_alloccg(struct inode *ip, int cg, daddr_t bpref, int size)
638 {
639         struct m_ext2fs *fs;
640         struct buf *bp;
641         struct ext2mount *ump;
642         int error, bno, start, end, loc;
643         char *bbp;
644         /* XXX ondisk32 */
645         mtx_assert(EXT2_MTX(ip->i_ump), MA_OWNED);
646         fs = ip->i_e2fs;
647         ump = ip->i_ump;
648         if (fs->e2fs_gd[cg].ext2bgd_nbfree == 0)
649                 return (0);
650         EXT2_UNLOCK(ump);
/* snip */
712         EXT2_LOCK(ump);

I have added a mutex to ext2mount for protecting fs, similar to what
ffs does. Now the problem is that the system always panics at line 650
saying that

panic: lock (sleep mutex) EXT2FS not locked @
/usr/src/sys/modules/ext2fs/../../fs/ext2fs/ext2_alloc.c:650

The assertion at 645 never fails and the system always panics at 650
only. I also tried commenting out line 650; the system then panics
saying that it is trying to recurse a non-recursive lock @ line 712.
So the lock is getting lost in between. Is this due to some other
process unlocking the system?

--
Aditya Sarawgi

From marius at alchemy.franken.de Wed Nov 25 19:30:04 2009
From: marius at alchemy.franken.de (Marius Strobl)
Date: Wed Nov 25 19:30:13 2009
Subject: sparc64/140797: Panic on 8.0-RC3/sparc64 as an NFS server
Message-ID: <200911251930.nAPJU3sc009905@freefall.freebsd.org>

The following reply was made to PR sparc64/140797; it has been noted by GNATS.

From: Marius Strobl
To: marcel@FreeBSD.org
Cc: bug-followup@FreeBSD.org, Greg Lewis, Greg Lewis
Subject: Re: sparc64/140797: Panic on 8.0-RC3/sparc64 as an NFS server
Date: Wed, 25 Nov 2009 20:28:14 +0100

On Tue, Nov 24, 2009 at 09:53:56PM -0800, Greg Lewis wrote:
> G'day Marius,
>
> On Mon, Nov 23, 2009 at 08:49:13PM +0100, Marius Strobl wrote:
> > Could you please test whether r199274/r199284 fix this problem?
> >
> > http://svn.freebsd.org/viewvc/base/head/sys/nfsserver/nfs_fha.c?r1=195202&r2=199284&diff_format=u
> >
> > --- head/sys/nfsserver/nfs_fha.c 2009/06/30 19:03:27 195202
> > +++ head/sys/nfsserver/nfs_fha.c 2009/11/15 03:09:50 199284
> > @@ -206,7 +206,7 @@
> >          if (error)
> >                  goto out;
> >
> > -        i->fh = *(const u_int64_t *)(fh.fh_generic.fh_fid.fid_data);
> > +        bcopy(fh.fh_generic.fh_fid.fid_data, &i->fh, sizeof(i->fh));
> >
> >          /* Content ourselves with zero offset for all but reads. */
> >          if (procnum != NFSPROC_READ)
>
> Yes, this fixes it! Thanks. Sorry for not finding that myself.
>
> I realise it may be too late to get this into the release. It might be
> worth an ERRATA notice if so.
>

Marcel, do you have any such plans?

Marius

From xcllnt at mac.com Wed Nov 25 22:20:03 2009
From: xcllnt at mac.com (Marcel Moolenaar)
Date: Wed Nov 25 22:20:10 2009
Subject: sparc64/140797: Panic on 8.0-RC3/sparc64 as an NFS server
Message-ID: <200911252220.nAPMK3ER057766@freefall.freebsd.org>

The following reply was made to PR sparc64/140797; it has been noted by GNATS.

From: Marcel Moolenaar
To: Marius Strobl
Cc: "marcel@FreeBSD.org", "bug-followup@FreeBSD.org", Greg Lewis, Greg Lewis
Subject: Re: sparc64/140797: Panic on 8.0-RC3/sparc64 as an NFS server
Date: Wed, 25 Nov 2009 13:15:35 -0800

No, no plans. I only wanted to go as far as putting it in -stable --
which I did.

FYI

--
Marcel (Mobile)

On Nov 25, 2009, at 11:28 AM, Marius Strobl wrote:

> On Tue, Nov 24, 2009 at 09:53:56PM -0800, Greg Lewis wrote:
>> G'day Marius,
>>
>> On Mon, Nov 23, 2009 at 08:49:13PM +0100, Marius Strobl wrote:
>>> Could you please test whether r199274/r199284 fix this problem?
>>>
>>> http://svn.freebsd.org/viewvc/base/head/sys/nfsserver/nfs_fha.c?r1=195202&r2=199284&diff_format=u
>>>
>>> --- head/sys/nfsserver/nfs_fha.c 2009/06/30 19:03:27 195202
>>> +++ head/sys/nfsserver/nfs_fha.c 2009/11/15 03:09:50 199284
>>> @@ -206,7 +206,7 @@
>>>          if (error)
>>>                  goto out;
>>>
>>> -        i->fh = *(const u_int64_t *)(fh.fh_generic.fh_fid.fid_data);
>>> +        bcopy(fh.fh_generic.fh_fid.fid_data, &i->fh, sizeof(i->fh));
>>>
>>>          /* Content ourselves with zero offset for all but reads. */
>>>          if (procnum != NFSPROC_READ)
>>
>> Yes, this fixes it! Thanks. Sorry for not finding that myself.
>>
>> I realise it may be too late to get this into the release. It might be
>> worth an ERRATA notice if so.
>>
>
> Marcel, do you have any such plans?
>
> Marius
>