From ludwigp at chip-web.com Sat Aug 1 03:18:30 2009 From: ludwigp at chip-web.com (Ludwig Pummer) Date: Sat Aug 1 03:18:37 2009 Subject: ZFS raidz1 pool unavailable from losing 1 device In-Reply-To: <4A714B03.6050704@chip-web.com> References: <4A712290.9030308@chip-web.com> <46899.11156.qm@web37301.mail.mud.yahoo.com> <4A714B03.6050704@chip-web.com> Message-ID: <4A73A096.5050106@chip-web.com> Ludwig Pummer wrote: > Simun Mikecin wrote: >> Ludwin Pummer wrote: >> >> >>> My system is 7.2-STABLE Jul 27, amd64, 4GB memory, just upgraded >>> from 6.4-STABLE from last year. I just set up a ZFS raidz volume to >>> replace a graid5 volume I had been using. I had it successfully set >>> up using partitions across 4 disks, ad{6,8,10,12}s1e. Then I wanted >>> to expand the raidz volume by merging the space from the adjacent >>> disk partition. I thought I could just fail out the partition device >>> in ZFS, edit the bsdlabel, and re-add the larger partition, ZFS >>> would resilver, repeat until done. That's when I found out that ZFS >>> doesn't let you fail out a device in a raidz volume. No big deal, I >>> thought, I'll just go to single user mode and mess with the >>> partition when ZFS isn't looking. When it comes back up it should >>> notice that one of the device is gone, I can do a 'zfs replace' and >>> continue my plan. >>> >>> Well, after rebooting to single user mode, combining partitions >>> ad12s1d and ad12s1e (removed the d partiton), "zfs volinit", then >>> "zpool status" just hung (Ctrl-C didn't kill it, so I rebooted). I >>> thought this was a bit odd so I thought perhaps ZFS is confused by >>> the ZFS metadata left on ad12s1e, so I blanked it out with "dd". >>> That didn't help. I changed the name of the partition to ad12s1d >>> thinking perhaps that would help. 
After that, "zfs volinit; zfs
>>> mount -a; zpool status" showed my raidz pool UNAVAIL with the
>>> message "insufficient replicas", ad{6,8,10}s1e ONLINE, and ad12s1e
>>> UNAVAIL "cannot open", and a more detailed message pointing me to
>>> http://www.sun.com/msg/ZFS-8000-3C. I tried doing a "zpool replace
>>> storage ad12s1e ad12s1d" but it refused, saying my zpool ("storage")
>>> was unavailable. Ditto for pretty much every zpool command I tried.
>>> "zpool clear" gave me a "permission denied" error.
>>>
>>
>> Was your pool imported while you were repartitioning in single user
>> mode?
>>
> Yes, I guess you could say it was. ZFS wasn't loaded while I was doing
> the repartitioning, though.
>
> --Ludwig
>

Well, I figured out my problem. I didn't actually have a raidz1 volume.
I missed the magic word "raidz" when I performed the "zpool create" so I
created a JBOD. Removing one disk legitimately destroyed my zpool :(

--Ludwig

From davidn04 at gmail.com Sat Aug 1 08:39:47 2009
From: davidn04 at gmail.com (David N)
Date: Sat Aug 1 08:39:54 2009
Subject: iSTGT error messages
Message-ID: <4d7dd86f0908010117o77757798p6585148ab829e088@mail.gmail.com>

Jul 31 01:40:30 netserv1 istgt[13674]: Login from iqn.example.net (10.1.20.15) on iqn.example.net:mail2disk1 LU1 (10.1.10.1:3260,1), ISID=23d010000, TSIH=40, CID=0, HeaderDigest=off, DataDigest=off
Jul 31 03:10:23 netserv1 istgt[13674]: istgt_iscsi.c:3338:istgt_iscsi_op_nopout: ***ERROR*** StatSN(460107/460117) error
Jul 31 03:10:23 netserv1 istgt[13674]: istgt_iscsi.c:3762:istgt_iscsi_execute: ***ERROR*** iscsi_op_nopout() failed
Jul 31 03:10:23 netserv1 istgt[13674]: istgt_iscsi.c:4088:worker: ***ERROR*** iscsi_execute() failed

iSTGT istgt-20090428
FreeBSD 7.2-R
iSCSI 10GB disk on FreeBSD
Open-iscsi 2.0.865-1ubuntu3.3 client

Does anyone have any idea what the errors mean?
There are a lot of repeated messages in the log file. The client connects,
then the error occurs, then it reconnects, and so forth.
Regards David N From marius at nuenneri.ch Sat Aug 1 09:11:40 2009 From: marius at nuenneri.ch (=?ISO-8859-1?Q?Marius_N=FCnnerich?=) Date: Sat Aug 1 09:11:47 2009 Subject: ZFS raidz1 pool unavailable from losing 1 device In-Reply-To: <4A73A096.5050106@chip-web.com> References: <4A712290.9030308@chip-web.com> <46899.11156.qm@web37301.mail.mud.yahoo.com> <4A714B03.6050704@chip-web.com> <4A73A096.5050106@chip-web.com> Message-ID: On Sat, Aug 1, 2009 at 03:55, Ludwig Pummer wrote: > Ludwig Pummer wrote: >> >> Simun Mikecin wrote: >>> >>> Ludwin Pummer wrote: >>> >>> >>>> >>>> My system is 7.2-STABLE Jul 27, amd64, 4GB memory, just upgraded from >>>> 6.4-STABLE from last year. I just set up a ZFS raidz volume to replace a >>>> graid5 volume I had been using. I had it successfully set up using >>>> partitions across 4 disks, ad{6,8,10,12}s1e. Then I wanted to expand the >>>> raidz volume by merging the space from the adjacent disk partition. I >>>> thought I could just fail out the partition device in ZFS, edit the >>>> bsdlabel, and re-add the larger partition, ZFS would resilver, repeat until >>>> done. That's when I found out that ZFS doesn't let you fail out a device in >>>> a raidz volume. No big deal, I thought, I'll just go to single user mode and >>>> mess with the partition when ZFS isn't looking. When it comes back up it >>>> should notice that one of the device is gone, I can do a 'zfs replace' and >>>> continue my plan. >>>> >>>> Well, after rebooting to single user mode, combining partitions ad12s1d >>>> and ad12s1e (removed the d partiton), "zfs volinit", then "zpool status" >>>> just hung (Ctrl-C didn't kill it, so I rebooted). I thought this was a bit >>>> odd so I thought perhaps ZFS is confused by the ZFS metadata left on >>>> ad12s1e, so I blanked it out with "dd". That didn't help. I changed the name >>>> of the partition to ad12s1d thinking perhaps that would help. 
After that,
>>>> "zfs volinit; zfs mount -a; zpool status" showed my raidz pool UNAVAIL with
>>>> the message "insufficient replicas", ad{6,8,10}s1e ONLINE, and ad12s1e
>>>> UNAVAIL "cannot open", and a more detailed message pointing me to
>>>> http://www.sun.com/msg/ZFS-8000-3C. I tried doing a "zpool replace storage
>>>> ad12s1e ad12s1d" but it refused, saying my zpool ("storage") was
>>>> unavailable. Ditto for pretty much every zpool command I tried. "zpool
>>>> clear" gave me a "permission denied" error.
>>>>
>>>
>>> Was your pool imported while you were repartitioning in single user mode?
>>>
>>
>> Yes, I guess you could say it was. ZFS wasn't loaded while I was doing the
>> repartitioning, though.
>>
>> --Ludwig
>>
>
> Well, I figured out my problem. I didn't actually have a raidz1 volume. I
> missed the magic word "raidz" when I performed the "zpool create" so I
> created a JBOD. Removing one disk legitmately destroyed my zpool :(
>
> --Ludwig

That's bad. But it doesn't explain why the disk names changed. I guess
there is a race in tasting either the original ad* providers or the
one-sector-smaller label/foo providers.
May I suggest that you, and other people reading this, try to use gpt
labels in the future, as they are definitely there _after_ gpt has
tasted. Sadly they are only available in 8-current right now.

From serenity at exscape.org Sat Aug 1 10:57:39 2009
From: serenity at exscape.org (Thomas Backman)
Date: Sat Aug 1 10:57:45 2009
Subject: Samba + ZFS panic w/ DEBUG_VFS_LOCKS
Message-ID: <4B49A2A0-2437-48A4-9047-80267BD4148F@exscape.org>

I just installed samba (ports/net/samba3) on my test machine to see if
some simple media streaming from ZFS would work. It did not; smbd
didn't even start before it panicked... At "Starting smdb" I got the
following panic:

(Note: I haven't tried without DEBUG_VFS_LOCKS yet. I do suppose that
it's not supposed to panic even with rigorous debugging enabled,
though!)
Unread portion of the kernel message buffer: KDB: stack backtrace: db_trace_self_wrapper() at db_trace_self_wrapper+0x2a vfs_badlock() at vfs_badlock+0x95 assert_vop_elocked() at assert_vop_elocked+0x64 VOP_PUTPAGES_APV() at VOP_PUTPAGES_APV+0x5b vnode_pager_putpages() at vnode_pager_putpages+0xa9 vm_pageout_flush() at vm_pageout_flush+0xd1 vm_object_page_collect_flush() at vm_object_page_collect_flush+0x2f0 vm_object_page_clean() at vm_object_page_clean+0x143 fsync() at fsync+0x121 syscall() at syscall+0x28f Xfast_syscall() at Xfast_syscall+0xe1 --- syscall (95, FreeBSD ELF64, fsync), rip = 0x801064dac, rsp = 0x7fffffffe5d8, rbp = 0x801336480 --- VOP_PUTPAGES: 0xffffff0007649588 is not exclusive locked but should be KDB: enter: lock violation 0xffffff0007649588: tag zfs, type VREG usecount 2, writecount 1, refcount 3 mountedhere 0 flags (VI_OBJDIRTY) v_object 0xffffff000ee6c000 ref 1 pages 2 lock type zfs: SHARED (count 1) panic: from debugger cpuid = 0 KDB: stack backtrace: Uptime: 17h10m52s Physical memory: 2034 MB Dumping 1723 MB: ... at /usr/src/sys/amd64/amd64/trap.c:613 #9 0xffffffff8057eda7 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224 #10 0xffffffff8036c8ad in kdb_enter (why=0xffffffff80613fd5 "vfslock", msg=0xa
) at cpufunc.h:63 #11 0xffffffff803cb3a4 in assert_vop_elocked (vp=0xffffff0007649588, str=0xffffffff80642728 "VOP_PUTPAGES") at /usr/src/sys/kern/vfs_subr.c:3722 #12 0xffffffff805c80eb in VOP_PUTPAGES_APV (vop=0xffffffff807a07c0, a=0xffffff803eb72730) at vnode_if.c:2664 #13 0xffffffff80572cd9 in vnode_pager_putpages (object=0xffffff000ee6c000, m=0xffffff803eb72830, count=8192, sync=12, rtvals=0xffffff803eb727a0) at vnode_if.h:1169 #14 0xffffffff8056d601 in vm_pageout_flush (mc=0xffffff803eb72830, count=2, flags=12) at vm_pager.h:148 #15 0xffffffff80568e30 in vm_object_page_collect_flush ( object=0xffffff000ee6c000, p=Variable "p" is not available. ) at /usr/src/sys/vm/vm_object.c:1032 #16 0xffffffff80569023 in vm_object_page_clean (object=0xffffff000ee6c000, start=0, end=Variable "end" is not available. ) at /usr/src/sys/vm/vm_object.c:844 #17 0xffffffff803d3bd1 in fsync (td=0xffffff0027f45000, uap=Variable "uap" is not available. ) at /usr/src/sys/kern/vfs_syscalls.c:3519#18 0xffffffff80598e7f in syscall (frame=0xffffff803eb72c80) at /usr/src/sys/amd64/amd64/ trap.c:984#19 0xffffffff8057f081 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:373 #20 0x0000000801064dac in ?? () Previous frame inner to this frame (corrupt stack?) 
(kgdb) fr 11 #11 0xffffffff803cb3a4 in assert_vop_elocked (vp=0xffffff0007649588, str=0xffffffff80642728 "VOP_PUTPAGES") at /usr/src/sys/kern/vfs_subr.c:3722 3722 vfs_badlock("is not exclusive locked but should be", str, vp); (kgdb) p *vp $1 = {v_type = VREG, v_tag = 0xffffffff80b59327 "zfs", v_op = 0xffffffff80b5dee0, v_data = 0xffffff00052cb758, v_mount = 0xffffff00018392f0, v_nmntvnodes = {tqe_next = 0x0, tqe_prev = 0xffffff006895b028}, v_un = {vu_mount = 0x0, vu_socket = 0x0, vu_cdev = 0x0, vu_fifoinfo = 0x0, vu_yield = 0}, v_hashlist = {le_next = 0x0, le_prev = 0x0}, v_hash = 0, v_cache_src = { lh_first = 0x0}, v_cache_dst = {tqh_first = 0x0, tqh_last = 0xffffff00076495e8}, v_cache_dd = 0x0, v_cstart = 0, v_lasta = 0, v_lastw = 0, v_clen = 0, v_lock = {lock_object = {lo_name = 0xffffffff80b59327 "zfs", lo_flags = 91947008, lo_data = 0, lo_witness = 0x0}, lk_lock = 17, lk_timo = 51, lk_pri = 80}, v_interlock = {lock_object = { lo_name = 0xffffffff80614670 "vnode interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, v_vnlock = 0xffffff0007649620, v_holdcnt = 3, v_usecount = 2, v_iflag = 1024, v_vflag = 0, v_writecount = 1, v_freelist = { tqe_next = 0x0, tqe_prev = 0x0}, v_bufobj = {bo_mtx = {lock_object = {lo_name = 0xffffffff80614680 "bufobj interlock", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, bo_clean = {bv_hd = {tqh_first = 0x0, tqh_last = 0xffffff00076496c0}, bv_root = 0x0, bv_cnt = 0}, bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xffffff00076496e0}, bv_root = 0x0, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, bo_ops = 0xffffffff8079d620, bo_bsize = 131072, bo_object = 0xffffff000ee6c000, bo_synclist = {le_next = 0x0, le_prev = 0x0}, bo_private = 0xffffff0007649588, __bo_vnode = 0xffffff0007649588}, v_pollinfo = 0x0, v_label = 0x0, v_lockf = 0xffffff000d402600} (kgdb) fr 17 #17 0xffffffff803d3bd1 in fsync (td=0xffffff0027f45000, uap=Variable "uap" is not available. 
) at /usr/src/sys/kern/vfs_syscalls.c:3519 3519 vn_finished_write(mp); (kgdb) p *mp $2 = {mnt_mtx = {lock_object = {lo_name = 0xffffffff80613905 "struct mount mtx", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, mnt_gen = 1, mnt_list = {tqe_next = 0xffffff0001bf75e0, tqe_prev = 0xffffff0001839608}, mnt_op = 0xffffffff80b5de40, mnt_vfc = 0xffffffff80b5dde0, mnt_vnodecovered = 0xffffff0001ae6000, mnt_syncer = 0xffffff0001be2760, mnt_ref = 14897, mnt_nvnodelist = { tqh_first = 0xffffff0001be2b10, tqh_last = 0xffffff00076495b0}, mnt_nvnodelistsize = 7449, mnt_writeopcount = 1, mnt_kern_flag = 1610612864, mnt_flag = 268439552, mnt_xflag = 0, mnt_noasync = 0, mnt_opt = 0xffffff00017f1830, mnt_optnew = 0x0, mnt_maxsymlinklen = 0, mnt_stat = {f_version = 537068824, f_type = 4, f_flags = 268439552, f_bsize = 131072, f_iosize = 131072, f_blocks = 485196, f_bfree = 475793, f_bavail = 475793, f_files = 529171, f_ffree = 475793, f_syncwrites = 0, f_asyncwrites = 0, f_syncreads = 0, f_asyncreads = 0, f_spare = {0, 0, 0, 0, 0, 0, 0, 0, 0, 0}, f_namemax = 255, f_owner = 0, f_fsid = {val = {591198578, -1876274428}}, f_charspare = '\0' , f_fstypename = "zfs", '\0' , f_mntfromname = "tank/usr", '\0' , f_mntonname = "/usr", '\0' }, mnt_cred = 0xffffff0001be0e00, mnt_data = 0xffffff0001a89000, mnt_time = 0, mnt_iosize_max = 65536, mnt_export = 0x0, mnt_label = 0x0, mnt_hashseed = 2610436692, mnt_lockref = 0, mnt_secondary_writes = 0, mnt_secondary_accwrites = 0, mnt_susp_owner = 0x0, mnt_gjprovider = 0x0, mnt_explock = { lock_object = {lo_name = 0xffffffff80613916 "explock", lo_flags = 91422720, lo_data = 0, lo_witness = 0x0}, lk_lock = 1, lk_timo = 0, lk_pri = 80}} # uname -a FreeBSD chaos.exscape.org 8.0-BETA2 FreeBSD 8.0-BETA2 #7 r195910M: Thu Jul 30 19:03:33 CEST 2009 root@chaos.exscape.org:/usr/obj/usr/src/ sys/DTRACE amd64 As I said, DEBUG_VFS_LOCKS in enabled. 
Should I disable DEBUG_VFS_LOCKS and consider this "normal" (if it
doesn't still panic, that is), or is this a real issue?
(Note that while *mp points to /usr, FWIW, /usr is not shared by
samba, nor is any FS below it. Also note that my debugging skills are
at an early stage... so the info provided may be useless.)

Regards,
Thomas

From kostikbel at gmail.com Sat Aug 1 14:53:11 2009
From: kostikbel at gmail.com (Kostik Belousov)
Date: Sat Aug 1 14:53:18 2009
Subject: Samba + ZFS panic w/ DEBUG_VFS_LOCKS
In-Reply-To: <4B49A2A0-2437-48A4-9047-80267BD4148F@exscape.org>
References: <4B49A2A0-2437-48A4-9047-80267BD4148F@exscape.org>
Message-ID: <20090801145301.GE1884@deviant.kiev.zoral.com.ua>

On Sat, Aug 01, 2009 at 12:57:29PM +0200, Thomas Backman wrote:
> I just installed samba (ports/net/samba3) on my test machine to see if
> some simple media streaming from ZFS would work. It did not; smbd
> didn't even start before it panicked... At "Starting smdb" I got the
> following panic:
>
> (Note: I haven't tried without DEBUG_VFS_LOCKS yet. I do suppose that
> it's not supposed to panic even with rigorous debugging enabled,
> though!)
> > Unread portion of the kernel message buffer: > KDB: stack backtrace: > db_trace_self_wrapper() at db_trace_self_wrapper+0x2a > vfs_badlock() at vfs_badlock+0x95 > assert_vop_elocked() at assert_vop_elocked+0x64 > VOP_PUTPAGES_APV() at VOP_PUTPAGES_APV+0x5b > vnode_pager_putpages() at vnode_pager_putpages+0xa9 > vm_pageout_flush() at vm_pageout_flush+0xd1 > vm_object_page_collect_flush() at vm_object_page_collect_flush+0x2f0 > vm_object_page_clean() at vm_object_page_clean+0x143 > fsync() at fsync+0x121 > syscall() at syscall+0x28f > Xfast_syscall() at Xfast_syscall+0xe1 > --- syscall (95, FreeBSD ELF64, fsync), rip = 0x801064dac, rsp = > 0x7fffffffe5d8, rbp = 0x801336480 --- > VOP_PUTPAGES: 0xffffff0007649588 is not exclusive locked but should be > KDB: enter: lock violation > > 0xffffff0007649588: tag zfs, type VREG > usecount 2, writecount 1, refcount 3 mountedhere 0 > flags (VI_OBJDIRTY) > v_object 0xffffff000ee6c000 ref 1 pages 2 > lock type zfs: SHARED (count 1) > panic: from debugger > cpuid = 0 > KDB: stack backtrace: > Uptime: 17h10m52s > Physical memory: 2034 MB > Dumping 1723 MB: ... > > at /usr/src/sys/amd64/amd64/trap.c:613 > #9 0xffffffff8057eda7 in calltrap () > at /usr/src/sys/amd64/amd64/exception.S:224 > #10 0xffffffff8036c8ad in kdb_enter (why=0xffffffff80613fd5 "vfslock", > msg=0xa
) at cpufunc.h:63 > #11 0xffffffff803cb3a4 in assert_vop_elocked (vp=0xffffff0007649588, > str=0xffffffff80642728 "VOP_PUTPAGES") > at /usr/src/sys/kern/vfs_subr.c:3722 > #12 0xffffffff805c80eb in VOP_PUTPAGES_APV (vop=0xffffffff807a07c0, > a=0xffffff803eb72730) at vnode_if.c:2664 > #13 0xffffffff80572cd9 in vnode_pager_putpages > (object=0xffffff000ee6c000, > m=0xffffff803eb72830, count=8192, sync=12, > rtvals=0xffffff803eb727a0) > at vnode_if.h:1169 > #14 0xffffffff8056d601 in vm_pageout_flush (mc=0xffffff803eb72830, > count=2, > flags=12) at vm_pager.h:148 > #15 0xffffffff80568e30 in vm_object_page_collect_flush ( > object=0xffffff000ee6c000, p=Variable "p" is not available. > ) at /usr/src/sys/vm/vm_object.c:1032 > #16 0xffffffff80569023 in vm_object_page_clean > (object=0xffffff000ee6c000, > start=0, end=Variable "end" is not available. > ) at /usr/src/sys/vm/vm_object.c:844 > #17 0xffffffff803d3bd1 in fsync (td=0xffffff0027f45000, uap=Variable > "uap" is not available. > ) > at /usr/src/sys/kern/vfs_syscalls.c:3519#18 0xffffffff80598e7f in > syscall (frame=0xffffff803eb72c80) at /usr/src/sys/amd64/amd64/ > trap.c:984#19 0xffffffff8057f081 in Xfast_syscall () > at /usr/src/sys/amd64/amd64/exception.S:373 > #20 0x0000000801064dac in ?? () > Previous frame inner to this frame (corrupt stack?) 
> > (kgdb) fr 11 > #11 0xffffffff803cb3a4 in assert_vop_elocked (vp=0xffffff0007649588, > str=0xffffffff80642728 "VOP_PUTPAGES") > at /usr/src/sys/kern/vfs_subr.c:3722 > 3722 vfs_badlock("is not exclusive locked but > should be", str, vp); > (kgdb) p *vp > $1 = {v_type = VREG, v_tag = 0xffffffff80b59327 "zfs", v_op = > 0xffffffff80b5dee0, v_data = 0xffffff00052cb758, > v_mount = 0xffffff00018392f0, v_nmntvnodes = {tqe_next = 0x0, > tqe_prev = 0xffffff006895b028}, v_un = {vu_mount = 0x0, vu_socket = 0x0, > vu_cdev = 0x0, vu_fifoinfo = 0x0, vu_yield = 0}, v_hashlist = > {le_next = 0x0, le_prev = 0x0}, v_hash = 0, v_cache_src = { > lh_first = 0x0}, v_cache_dst = {tqh_first = 0x0, tqh_last = > 0xffffff00076495e8}, v_cache_dd = 0x0, v_cstart = 0, v_lasta = 0, > v_lastw = 0, v_clen = 0, v_lock = {lock_object = {lo_name = > 0xffffffff80b59327 "zfs", lo_flags = 91947008, lo_data = 0, > lo_witness = 0x0}, lk_lock = 17, lk_timo = 51, lk_pri = 80}, > v_interlock = {lock_object = { > lo_name = 0xffffffff80614670 "vnode interlock", lo_flags = > 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock = 4}, > v_vnlock = 0xffffff0007649620, v_holdcnt = 3, v_usecount = 2, > v_iflag = 1024, v_vflag = 0, v_writecount = 1, v_freelist = { > tqe_next = 0x0, tqe_prev = 0x0}, v_bufobj = {bo_mtx = > {lock_object = {lo_name = 0xffffffff80614680 "bufobj interlock", > lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, mtx_lock > = 4}, bo_clean = {bv_hd = {tqh_first = 0x0, > tqh_last = 0xffffff00076496c0}, bv_root = 0x0, bv_cnt = 0}, > bo_dirty = {bv_hd = {tqh_first = 0x0, tqh_last = 0xffffff00076496e0}, > bv_root = 0x0, bv_cnt = 0}, bo_numoutput = 0, bo_flag = 0, > bo_ops = 0xffffffff8079d620, bo_bsize = 131072, > bo_object = 0xffffff000ee6c000, bo_synclist = {le_next = 0x0, > le_prev = 0x0}, bo_private = 0xffffff0007649588, > __bo_vnode = 0xffffff0007649588}, v_pollinfo = 0x0, v_label = > 0x0, v_lockf = 0xffffff000d402600} > > (kgdb) fr 17 > #17 0xffffffff803d3bd1 in fsync 
(td=0xffffff0027f45000, uap=Variable > "uap" is not available. > ) at /usr/src/sys/kern/vfs_syscalls.c:3519 > 3519 vn_finished_write(mp); > (kgdb) p *mp > $2 = {mnt_mtx = {lock_object = {lo_name = 0xffffffff80613905 "struct > mount mtx", lo_flags = 16973824, lo_data = 0, lo_witness = 0x0}, > mtx_lock = 4}, mnt_gen = 1, mnt_list = {tqe_next = > 0xffffff0001bf75e0, tqe_prev = 0xffffff0001839608}, mnt_op = > 0xffffffff80b5de40, > mnt_vfc = 0xffffffff80b5dde0, mnt_vnodecovered = > 0xffffff0001ae6000, mnt_syncer = 0xffffff0001be2760, mnt_ref = 14897, > mnt_nvnodelist = { > tqh_first = 0xffffff0001be2b10, tqh_last = 0xffffff00076495b0}, > mnt_nvnodelistsize = 7449, mnt_writeopcount = 1, > mnt_kern_flag = 1610612864, mnt_flag = 268439552, mnt_xflag = 0, > mnt_noasync = 0, mnt_opt = 0xffffff00017f1830, mnt_optnew = 0x0, > mnt_maxsymlinklen = 0, mnt_stat = {f_version = 537068824, f_type = > 4, f_flags = 268439552, f_bsize = 131072, f_iosize = 131072, > f_blocks = 485196, f_bfree = 475793, f_bavail = 475793, f_files = > 529171, f_ffree = 475793, f_syncwrites = 0, f_asyncwrites = 0, > f_syncreads = 0, f_asyncreads = 0, f_spare = {0, 0, 0, 0, 0, 0, > 0, 0, 0, 0}, f_namemax = 255, f_owner = 0, f_fsid = {val = {591198578, > -1876274428}}, f_charspare = '\0' , > f_fstypename = "zfs", '\0' , > f_mntfromname = "tank/usr", '\0' , f_mntonname > = "/usr", '\0' }, mnt_cred = 0xffffff0001be0e00, > mnt_data = 0xffffff0001a89000, mnt_time = 0, mnt_iosize_max = > 65536, mnt_export = 0x0, mnt_label = 0x0, mnt_hashseed = 2610436692, > mnt_lockref = 0, mnt_secondary_writes = 0, mnt_secondary_accwrites > = 0, mnt_susp_owner = 0x0, mnt_gjprovider = 0x0, mnt_explock = { > lock_object = {lo_name = 0xffffffff80613916 "explock", lo_flags = > 91422720, lo_data = 0, lo_witness = 0x0}, lk_lock = 1, lk_timo = 0, > lk_pri = 80}} > > # uname -a > FreeBSD chaos.exscape.org 8.0-BETA2 FreeBSD 8.0-BETA2 #7 r195910M: Thu > Jul 30 19:03:33 CEST 2009 root@chaos.exscape.org:/usr/obj/usr/src/ > sys/DTRACE amd64 > 
>
> As I said, DEBUG_VFS_LOCKS in enabled.
> Should I disabled DEBUG_VFS_LOCKS and consider this "normal" (if it
> doesn't still panic, that is), or is this a real issue?
> (Note that while *mp points to /usr, FWIW, /usr is not shared by
> samba, nor is any FS below it. Also note that my debugging skills are
> at an early stage... so the info provided may be useless.)

It does not matter whether the ZFS filesystem is accessed by samba. The
panic happens when you do fsync(2) on a vnode that has its vm object
marked as dirty, and DEBUG_VFS_LOCKS is configured.

The workaround is to disable DEBUG_VFS_LOCKS. Since
vnode_pager_generic_putpages seems to work with a shared vnode lock, as
long as VOP_WRITE works right with a shared lock, change
sys/kern/vnode_if.src, line 475 from
%% putpages vp E E E
to
%% putpages vp L L L
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 195 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090801/9d38d972/attachment.pgp

From kientzle at freebsd.org Sat Aug 1 17:41:11 2009
From: kientzle at freebsd.org (Tim Kientzle)
Date: Sat Aug 1 17:41:16 2009
Subject: zfs: Fatal trap 12: page fault while in kernel mode
In-Reply-To: <4A72D946.4090401@jrv.org>
References: <20090727072503.GA52309@jpru.ffm.jpru.de>
	<20090729084723.GD1586@garage.freebsd.pl>
	<4A7030B6.8010205@icyb.net.ua>
	<97D5950F-4E4D-4446-AC22-92679135868D@exscape.org>
	<4A7048A9.4020507@icyb.net.ua>
	<52AA86CB-6C06-4370-BA73-CE19175467D0@exscape.org>
	<4A705299.8060504@icyb.net.ua>
	<4A7054E1.5060402@icyb.net.ua>
	<5918824D-A67C-43E6-8685-7B72A52B9CAE@exscape.org>
	<4A705E50.8070307@icyb.net.ua>
	<4A70728C.7020004@freebsd.org>
	<6D47A34B-0753-4CED-BF3D-C505B37748FC@exscape.org>
	<4A708455.5070304@freebsd.org>
	<86983A55-E5C4-4C04-A4C7-0AE9A9EE37A3@exscape.org>
	<4A718E03.6030909@freebsd.org>
	<71A038EC-02B1-4606-96C2-5E84BE80F005@exscape.org>
	<4A719CA4.4060400@freebsd.org>
	<19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org>
	<4A71AD29.10705@freebsd.org>
	<7544AED1-1216-4A24-B287-F54117641F76@exscape.org>
	<4A71B2DA.9060902@freebsd.org>
	<4A72D946.4090401@jrv.org>
Message-ID: <4A747766.10901@freebsd.org>

James R. Van Artsdalen wrote:
> Andriy Gapon wrote:
>>
>> One comment on the patch - I personally don't like bit-wise xor in a logical
>> expression. But if otherwise the expression would be huge and ugly, then OK.
>
> If you're going to code an XOR, use an XOR.

Or use !=, which produces the same result for logical values
and is sometimes easier to understand.

Tim

From killasmurf86 at gmail.com Sun Aug 2 08:50:05 2009
From: killasmurf86 at gmail.com (Aldis Berjoza)
Date: Sun Aug 2 08:50:11 2009
Subject: kern/137037: [zfs] [hang] zfs rollback on root causes FreeBSD to freeze in few seconds
Message-ID: <200908020850.n728o4dC095430@freefall.freebsd.org>

The following reply was made to PR kern/137037; it has been noted by GNATS.

From: Aldis Berjoza
To: bug-followup@FreeBSD.org, killasmurf86@gmail.com
Cc:
Subject: Re: kern/137037: [zfs] [hang] zfs rollback on root causes FreeBSD to freeze in few seconds
Date: Sun, 02 Aug 2009 11:44:53 +0300

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

I don't know if this helps, but:
Today I did zfs rollback on root again...
But this time I logged out from X, and then logged in....
For the first time, after root rollback my system didn't hang....
uname -a: FreeBSD 192.168.128.100 8.0-BETA2 FreeBSD 8.0-BETA2 #0: Mon Jul 20 21:43:13 EEST 2009 root@192.168.128.100:/usr/obj/usr/src/sys/ANTIGENERIC i386 - -- Aldis Berjoza My public PGP key: http://keyserver1.pgp.com/vkd/DownloadKey.event?keyid=0xA81349A77ED573D3 -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.12 (FreeBSD) Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org iEYEARECAAYFAkp1UgUACgkQqBNJp37Vc9NjbwCffjydqiAgAiUSICQtLttHe/F/ D5gAoJ3XEZCmwZH4BAQZcCjf8YTqoutd =9OET -----END PGP SIGNATURE----- From lists at jpru.de Sun Aug 2 09:27:18 2009 From: lists at jpru.de (Juergen Unger) Date: Sun Aug 2 09:27:25 2009 Subject: zfs: Fatal trap 12: page fault while in kernel mode In-Reply-To: <20090729084723.GD1586@garage.freebsd.pl> References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> Message-ID: <20090802092714.GA5813@jpru.ffm.jpru.de> Hi Pawel, On Wed, Jul 29, 2009 at 10:47:23AM +0200, Pawel Jakub Dawidek wrote: > On Tue, Jul 28, 2009 at 12:50:26PM +0300, Andriy Gapon wrote: > > on 27/07/2009 22:58 O. Hartmann said the following: > > > Juergen Unger wrote: > > [snip] > > >>> _sx_xlock(3c,0,874aa28d,70f,8ae9a9f8,...) at _sx_xlock+0x43 > > >>> dmu_buf_update_user(0,8ae9a9f8,0,0,0,...) at dmu_buf_update_user+0x35 > > >>> zfs_znode_dmu_fini(8ae9a9f8,874b312d,1114,110b,879ab000,...) at zfs_znode_dmu_f3 > > >>> zfs_freebsd_reclaim(fcd29c3c,1,0,8ec63754,fcd29c60,...) at zfs_freebsd_reclaim+0 > > >>> VOP_RECLAIM_APV(874b65a0,fcd29c3c,0,0,8ec637c8,...) at VOP_RECLAIM_APV+0xa5 > > >>> vgonel(8ec637c8,0,80c77037,386,0,...) at vgonel+0x1a4 > > >>> vnlru_free(80f2a0f0,0,80c77037,300,3e8,...) at vnlru_free+0x2d5 > > >>> vnlru_proc(0,fcd29d38,80c652bc,33e,871932a8,...) at vnlru_proc+0x80 > > >>> fork_exit(8090d960,0,fcd29d38) at fork_exit+0xb8 > > >>> fork_trampoline() at fork_trampoline+0x8 >[snip] > > P.S. 
I see that zfs_inactive checks for z_dbuf being NULL and there is the
> > following comment:
> > /*
> >  * The fs has been unmounted, or we did a
> >  * suspend/resume and this file no longer exists.
> >  */
> > Maybe zfs_freebsd_reclaim should do the same?
>
> Yes, you might be right.
>
> Could you guys, who can reproduce it, try this patch:
>
> 	http://people.freebsd.org/~pjd/patches/zfs_vnops.c.2.patch

I tried the patch and restarted the whole thing yesterday morning.
After less than 24 hours and approximately 3215 zfs-receive jobs it
does not crash anymore, but the last started zfs-receive job is
hanging and cannot be killed, not even with -9. Other zfs commands
are also hanging and cannot be killed, while zpool commands seem to
be unaffected.

root 86397  0.0  0.0  3920  1308  ??  D     3:18AM   0:00.29 zfs receive -Fv zzzz/203
root  5001  0.0  0.0  3920  1208   0  D+   10:45AM   0:00.00 zfs list -t snapshot
root  5477  0.0  0.0  3920  1240   3  D+   11:08AM   0:00.00 zfs list

Also, the sync command I tried to execute hangs forever:

root  5457  0.0  0.0  1528   492   2- D+   11:05AM   0:00.04 sync

Other parts of the system that have nothing to do with zfs are still
working well. I will leave the machine running in this state. Is
there something I can do to retrieve other useful information for
you?
thnx in advance, Juergen -- ENOSIG From pjd at FreeBSD.org Sun Aug 2 09:30:01 2009 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Sun Aug 2 09:30:43 2009 Subject: zfs: Fatal trap 12: page fault while in kernel mode In-Reply-To: <20090802092714.GA5813@jpru.ffm.jpru.de> References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <20090802092714.GA5813@jpru.ffm.jpru.de> Message-ID: <20090802093016.GB3071@garage.freebsd.pl> On Sun, Aug 02, 2009 at 11:27:14AM +0200, Juergen Unger wrote: > Hi Pawel, > > On Wed, Jul 29, 2009 at 10:47:23AM +0200, Pawel Jakub Dawidek wrote: > > On Tue, Jul 28, 2009 at 12:50:26PM +0300, Andriy Gapon wrote: > > > on 27/07/2009 22:58 O. Hartmann said the following: > > > > Juergen Unger wrote: > > > [snip] > > > >>> _sx_xlock(3c,0,874aa28d,70f,8ae9a9f8,...) at _sx_xlock+0x43 > > > >>> dmu_buf_update_user(0,8ae9a9f8,0,0,0,...) at dmu_buf_update_user+0x35 > > > >>> zfs_znode_dmu_fini(8ae9a9f8,874b312d,1114,110b,879ab000,...) at zfs_znode_dmu_f3 > > > >>> zfs_freebsd_reclaim(fcd29c3c,1,0,8ec63754,fcd29c60,...) at zfs_freebsd_reclaim+0 > > > >>> VOP_RECLAIM_APV(874b65a0,fcd29c3c,0,0,8ec637c8,...) at VOP_RECLAIM_APV+0xa5 > > > >>> vgonel(8ec637c8,0,80c77037,386,0,...) at vgonel+0x1a4 > > > >>> vnlru_free(80f2a0f0,0,80c77037,300,3e8,...) at vnlru_free+0x2d5 > > > >>> vnlru_proc(0,fcd29d38,80c652bc,33e,871932a8,...) at vnlru_proc+0x80 > > > >>> fork_exit(8090d960,0,fcd29d38) at fork_exit+0xb8 > > > >>> fork_trampoline() at fork_trampoline+0x8 > >[snip] > > > P.S. I see that zfs_inactive checks for z_dbuf being NULL and there is the > > > following comment: > > > /* > > > * The fs has been unmounted, or we did a > > > * suspend/resume and this file no longer exists. > > > */ > > > Maybe zfs_freebsd_reclaim should do the same? > > > > Yes, you might be right. 
> >
> > Could you guys, who can reproduce it, try this patch:
> >
> > 	http://people.freebsd.org/~pjd/patches/zfs_vnops.c.2.patch
>
> I tried the patch, restarted the whole thing yesterday morning
> and after less then 24 hours and approximately 3215 zfs-receive
> jobs it do not crashes anymore, but the last started zfs-receive
> jobs is hanging, cannot be killed, even not with -9. Even other
> zfs commands are hanging and cannot be killed, while zpool commands
> seems to be not affected.
>
> root 86397 0.0 0.0 3920 1308 ?? D 3:18AM 0:00.29 zfs receive -Fv zzzz/203
> root 5001 0.0 0.0 3920 1208 0 D+ 10:45AM 0:00.00 zfs list -t snapshot
> root 5477 0.0 0.0 3920 1240 3 D+ 11:08AM 0:00.00 zfs list
>
> also the sync command I tried to execute hangs forever:
>
> root 5457 0.0 0.0 1528 492 2- D+ 11:05AM 0:00.04 sync
>
> Other parts of the system which do not have something todo with zfs
> are still working well. I will leave the machine running in this
> state, is there something I can do to retrieve other usefull information
> for you?

If you can, break into the debugger and send me the 'show alltrace'
output, for starters.

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090802/6f523d0e/attachment.pgp

From stark at mapper.nl Mon Aug 3 10:08:56 2009
From: stark at mapper.nl (Mark Stapper)
Date: Mon Aug 3 10:09:04 2009
Subject: zfs built-in kernel
Message-ID: <4A76B450.8010206@mapper.nl>

Hello,

Would it be possible to build zfs support into the kernel?
If so, what would I have to add to my kernel config file?
Greetz,
Mark

-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc Type: application/pgp-signature Size: 259 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090803/7f496f7b/signature.pgp From bugmaster at FreeBSD.org Mon Aug 3 11:06:57 2009 From: bugmaster at FreeBSD.org (FreeBSD bugmaster) Date: Mon Aug 3 11:08:21 2009 Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org Message-ID: <200908031106.n73B6tXT088580@freefall.freebsd.org> Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o kern/137037 fs [zfs] [hang] zfs rollback on root causes FreeBSD to fr o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136942 fs [zfs] zvol resize not reflected until reboot o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic o kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/136218 fs [zfs] Exported ZFS pools can't be imported into (Open) o kern/135594 fs [zfs] Single dataset unresponsive with Samba o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135480 fs [zfs] panic: lock &arg.lock already initialized o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o bin/135314 fs [zfs] assertion failed for zdb(8) usage o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot f kern/134496 fs [zfs] [panic] ZFS pool export occasionally causes a ke o kern/134491 fs [zfs] Hot spares are rather cold... 
o kern/133980 fs [panic] [ffs] panic: ffs_valloc: dup alloc o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis o kern/133614 fs [smbfs] [panic] panic: ffs_truncate: read-only filesys o kern/133373 fs [zfs] umass attachment causes ZFS checksum errors, dat o kern/133174 fs [msdosfs] [patch] msdosfs must support utf-encoded int f kern/133150 fs [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w o kern/133134 fs [zfs] Missing ZFS zpool labels o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132597 fs [tmpfs] [panic] tmpfs-related panic while interrupting o kern/132551 fs [zfs] ZFS locks up on extattr_list_link syscall o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes f kern/132068 fs [zfs] page fault when using ZFS over NFS on 7.1-RELEAS o kern/131995 fs [nfs] Failure to mount NFSv4 server o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/131086 fs [ext2fs] [patch] mkfs.ext2 creates rotten partition o kern/130979 fs [smbfs] [panic] boot/kernel/smbfs.ko o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130229 fs [iconv] usermount fails on fs that need iconv o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/129148 fs [zfs] [panic] panic on concurrent writing & rollback o kern/129059 fs [zfs] [patch] ZFS bootloader whitelistable via WITHOUT f kern/128829 
fs smbd(8) causes periodic panic on 7-RELEASE o kern/128633 fs [zfs] [lor] lock order reversal in zfs o kern/128514 fs [zfs] [mpt] problems with ZFS and LSILogic SAS/SATA Ad f kern/128173 fs [ext2fs] ls gives "Input/output error" on mounted ext3 o kern/127659 fs [tmpfs] tmpfs memory leak o kern/127492 fs [zfs] System hang on ZFS input-output o kern/127420 fs [gjournal] [panic] Journal overflow on gmirrored gjour o kern/127213 fs [tmpfs] sendfile on tmpfs data corruption o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/125644 fs [zfs] [panic] zfs unfixable fs errors caused panic whe f kern/125536 fs [ext2fs] ext 2 mounts cleanly but fails on commands li o kern/125149 fs [nfs] [panic] changing into .zfs dir from nfs client c f kern/124621 fs [ext3] [patch] Cannot mount ext2fs partition f bin/124424 fs [zfs] zfs(8): zfs list -r shows strange snapshots' siz o kern/123939 fs [msdosfs] corrupts new files o kern/122888 fs [zfs] zfs hang w/ prefetch on, zil off while running t o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o kern/122173 fs [zfs] [panic] Kernel Panic if attempting to replace a o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o kern/122047 fs [ext2fs] [patch] incorrect handling of UF_IMMUTABLE / o kern/122038 fs [tmpfs] [panic] tmpfs: panic: tmpfs_alloc_vp: type 0xc o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121779 fs [ufs] snapinfo(8) (and related tools?) 
only work for t o kern/121770 fs [zfs] ZFS on i386, large file or heavy I/O leads to ke o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8 o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha f kern/120991 fs [panic] [fs] [snapshot] System crashes when manipulati o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F o bin/120288 fs zfs(8): "zfs share -a" does not send SIGHUP to mountd f kern/119735 fs [zfs] geli + ZFS + samba starting on boot panics 7.0-B o kern/118912 fs [2tb] disk sizing/geometry problem with large array o misc/118855 fs [zfs] ZFS-related commands are nonfunctional in fixit o kern/118713 fs [minidump] [patch] Display media size required for a k o kern/118320 fs [zfs] [patch] NFS SETATTR sometimes fails to set file o bin/118249 fs mv(1): moving a directory changes its mtime o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o kern/116913 fs [ffs] [panic] ffs_blkfree: freeing free block p kern/116608 fs [msdosfs] [patch] msdosfs fails to check mount options o kern/116583 fs [ffs] [hang] System freezes for short time when using o kern/116170 fs [panic] Kernel panic when mounting /tmp o kern/115645 fs [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] 
smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o kern/113180 fs [zfs] Setting ZFS nfsshare property does not cause inh o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk o kern/105093 fs [ext2fs] [patch] ext2fs on read-only media cannot be m o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [iso9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna f kern/91568 fs [ufs] [panic] writing to UFS/softupdates DVD media in o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/89991 fs [ufs] softupdates with mount -ur causes fs UNREFS o kern/88657 fs [smbfs] windows 
client hang when browsing a samba shar o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o kern/85326 fs [smbfs] [panic] saving a file via samba to an overquot o kern/84589 fs [2TB] 5.4-STABLE unresponsive during background fsck 2 o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o kern/77826 fs [ext2fs] ext2fs usb filesystem will not mount RW o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 149 problems total. From gary.jennejohn at freenet.de Mon Aug 3 14:26:36 2009 From: gary.jennejohn at freenet.de (Gary Jennejohn) Date: Mon Aug 3 14:26:43 2009 Subject: zfs built-in kernel In-Reply-To: <4A76B450.8010206@mapper.nl> References: <4A76B450.8010206@mapper.nl> Message-ID: <20090803162632.3da3eca4@ernst.jennejohn.org> On Mon, 03 Aug 2009 11:56:32 +0200 Mark Stapper wrote: > Hello, > > Would it be possible to built zfs support into the kernel? > If so, what would I have to add to my kernel config file? > Greetz, > Mark > Nope, looks like it's only available as a module. 
--- Gary Jennejohn From lists at jpru.de Mon Aug 3 20:32:32 2009 From: lists at jpru.de (Juergen Unger) Date: Mon Aug 3 20:32:40 2009 Subject: zfs: Fatal trap 12: page fault while in kernel mode In-Reply-To: <20090802093016.GB3071@garage.freebsd.pl> References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <20090802092714.GA5813@jpru.ffm.jpru.de> <20090802093016.GB3071@garage.freebsd.pl> Message-ID: <20090803203226.GE5813@jpru.ffm.jpru.de> Hi, On Sun, Aug 02, 2009 at 11:30:16AM +0200, Pawel Jakub Dawidek wrote: [...] > > > Could you guys, who can reproduce it, try this patch: > > > > > > http://people.freebsd.org/~pjd/patches/zfs_vnops.c.2.patch > > > > I tried the patch, restarted the whole thing yesterday morning > > and after less then 24 hours and approximately 3215 zfs-receive > > jobs it do not crashes anymore, but the last started zfs-receive > > jobs is hanging, cannot be killed, even not with -9. Even other > > zfs commands are hanging and cannot be killed, while zpool commands > > seems to be not affected. > > > > root 86397 0.0 0.0 3920 1308 ?? D 3:18AM 0:00.29 zfs receive -Fv zzzz/203 > > root 5001 0.0 0.0 3920 1208 0 D+ 10:45AM 0:00.00 zfs list -t snapshot > > root 5477 0.0 0.0 3920 1240 3 D+ 11:08AM 0:00.00 zfs list > > > > also the sync command I tried to execute hangs forever: > > > > root 5457 0.0 0.0 1528 492 2- D+ 11:05AM 0:00.04 sync > > > > Other parts of the system which do not have something todo with zfs > > are still working well. I will leave the machine running in this > > state, is there something I can do to retrieve other usefull information > > for you? > > If you can break into debugger and send me 'show alltrace' for starters. hmm, maybe you did not get my last mail. 
I put the log of this on -Juergen- -- ENOSIG From pjd at FreeBSD.org Tue Aug 4 07:34:00 2009 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Tue Aug 4 07:34:12 2009 Subject: zfs: Fatal trap 12: page fault while in kernel mode In-Reply-To: <20090803203226.GE5813@jpru.ffm.jpru.de> References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <20090802092714.GA5813@jpru.ffm.jpru.de> <20090802093016.GB3071@garage.freebsd.pl> <20090803203226.GE5813@jpru.ffm.jpru.de> Message-ID: <20090804073416.GA4479@garage.freebsd.pl> On Mon, Aug 03, 2009 at 10:32:26PM +0200, Juergen Unger wrote: > Hi, > > On Sun, Aug 02, 2009 at 11:30:16AM +0200, Pawel Jakub Dawidek wrote: > [...] > > > > Could you guys, who can reproduce it, try this patch: > > > > > > > > http://people.freebsd.org/~pjd/patches/zfs_vnops.c.2.patch > > > > > > I tried the patch, restarted the whole thing yesterday morning > > > and after less then 24 hours and approximately 3215 zfs-receive > > > jobs it do not crashes anymore, but the last started zfs-receive > > > jobs is hanging, cannot be killed, even not with -9. Even other > > > zfs commands are hanging and cannot be killed, while zpool commands > > > seems to be not affected. > > > > > > root 86397 0.0 0.0 3920 1308 ?? D 3:18AM 0:00.29 zfs receive -Fv zzzz/203 > > > root 5001 0.0 0.0 3920 1208 0 D+ 10:45AM 0:00.00 zfs list -t snapshot > > > root 5477 0.0 0.0 3920 1240 3 D+ 11:08AM 0:00.00 zfs list > > > > > > also the sync command I tried to execute hangs forever: > > > > > > root 5457 0.0 0.0 1528 492 2- D+ 11:05AM 0:00.04 sync > > > > > > Other parts of the system which do not have something todo with zfs > > > are still working well. I will leave the machine running in this > > > state, is there something I can do to retrieve other usefull information > > > for you? > > > > If you can break into debugger and send me 'show alltrace' for starters. 
> > hmm, maybe you did not get my last mail. > I put the log of this on I did get it, sorry for the delay, I'm quite busy with other stuff. I need to setup machine for HEAD testing, as my current test box is running perforce version. I'd also need 'show lock 0x87aac290' from this machine if its not too late. -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090804/c2e17ae9/attachment.pgp From lists at jpru.de Tue Aug 4 07:53:31 2009 From: lists at jpru.de (Juergen Unger) Date: Tue Aug 4 07:53:38 2009 Subject: zfs: Fatal trap 12: page fault while in kernel mode In-Reply-To: <20090804073416.GA4479@garage.freebsd.pl> References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <20090802092714.GA5813@jpru.ffm.jpru.de> <20090802093016.GB3071@garage.freebsd.pl> <20090803203226.GE5813@jpru.ffm.jpru.de> <20090804073416.GA4479@garage.freebsd.pl> Message-ID: <20090804075329.GI5813@jpru.ffm.jpru.de> Hi Pawel, On Tue, Aug 04, 2009 at 09:34:16AM +0200, Pawel Jakub Dawidek wrote: [...] > > > If you can break into debugger and send me 'show alltrace' for starters. > > > > hmm, maybe you did not get my last mail. > > I put the log of this on > > I did get it, sorry for the delay, I'm quite busy with other stuff. I > need to setup machine for HEAD testing, as my current test box is > running perforce version. > > I'd also need 'show lock 0x87aac290' from this machine if its not too > late. 
testbox# sysctl debug.kdb.enter=1
KDB: enter: sysctl debug.kdb.enter
[thread pid 11635 tid 100472 ]
Stopped at kdb_enter+0x3a: movl $0,kdb_why
db> show lock 0x87aac290
 class: sx
 name: dp->dp_config_rwlock
 state: XLOCK: 0x879e8480 (tid 100130, pid 172, "txg_thread_enter")
 waiters: shared
db>

bye,
-Juergen-

--
ENOSIG

From pjd at FreeBSD.org Tue Aug 4 09:49:32 2009
From: pjd at FreeBSD.org (Pawel Jakub Dawidek)
Date: Tue Aug 4 09:49:45 2009
Subject: zfs: Fatal trap 12: page fault while in kernel mode
In-Reply-To: <20090804075329.GI5813@jpru.ffm.jpru.de>
References: <20090727072503.GA52309@jpru.ffm.jpru.de>
 <4A6E06E6.9030300@mail.zedat.fu-berlin.de>
 <4A6EC9E2.5070200@icyb.net.ua>
 <20090729084723.GD1586@garage.freebsd.pl>
 <20090802092714.GA5813@jpru.ffm.jpru.de>
 <20090802093016.GB3071@garage.freebsd.pl>
 <20090803203226.GE5813@jpru.ffm.jpru.de>
 <20090804073416.GA4479@garage.freebsd.pl>
 <20090804075329.GI5813@jpru.ffm.jpru.de>
Message-ID: <20090804094950.GD4479@garage.freebsd.pl>

On Tue, Aug 04, 2009 at 09:53:29AM +0200, Juergen Unger wrote:
> Hi Pawel,
>
> On Tue, Aug 04, 2009 at 09:34:16AM +0200, Pawel Jakub Dawidek wrote:
> [...]
> > > > If you can break into debugger and send me 'show alltrace' for starters.
> > >
> > > hmm, maybe you did not get my last mail.
> > > I put the log of this on
> >
> > I did get it, sorry for the delay, I'm quite busy with other stuff. I
> > need to setup machine for HEAD testing, as my current test box is
> > running perforce version.
> >
> > I'd also need 'show lock 0x87aac290' from this machine if its not too
> > late.
>
> testbox# sysctl debug.kdb.enter=1
> KDB: enter: sysctl debug.kdb.enter
> [thread pid 11635 tid 100472 ]
> Stopped at kdb_enter+0x3a: movl $0,kdb_why
> db> show lock 0x87aac290
>  class: sx
>  name: dp->dp_config_rwlock
>  state: XLOCK: 0x879e8480 (tid 100130, pid 172, "txg_thread_enter")
>  waiters: shared
> db>

Could you also try something like the following from DDB:

 x/bx 0x879ad8a0,52

--
Pawel Jakub Dawidek http://www.wheel.pl
pjd@FreeBSD.org http://www.FreeBSD.org
FreeBSD committer Am I Evil? Yes, I Am!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090804/8451cd5b/attachment.pgp

From lists at jpru.de Tue Aug 4 09:56:50 2009
From: lists at jpru.de (Juergen Unger)
Date: Tue Aug 4 09:57:02 2009
Subject: zfs: Fatal trap 12: page fault while in kernel mode
In-Reply-To: <20090804094950.GD4479@garage.freebsd.pl>
References: <20090727072503.GA52309@jpru.ffm.jpru.de>
 <4A6E06E6.9030300@mail.zedat.fu-berlin.de>
 <4A6EC9E2.5070200@icyb.net.ua>
 <20090729084723.GD1586@garage.freebsd.pl>
 <20090802092714.GA5813@jpru.ffm.jpru.de>
 <20090802093016.GB3071@garage.freebsd.pl>
 <20090803203226.GE5813@jpru.ffm.jpru.de>
 <20090804073416.GA4479@garage.freebsd.pl>
 <20090804075329.GI5813@jpru.ffm.jpru.de>
 <20090804094950.GD4479@garage.freebsd.pl>
Message-ID: <20090804095648.GL5813@jpru.ffm.jpru.de>

On Tue, Aug 04, 2009 at 11:49:50AM +0200, Pawel Jakub Dawidek wrote:
> > testbox# sysctl debug.kdb.enter=1
> > KDB: enter: sysctl debug.kdb.enter
> > [thread pid 11635 tid 100472 ]
> > Stopped at kdb_enter+0x3a: movl $0,kdb_why
> > db> show lock 0x87aac290
> >  class: sx
> >  name: dp->dp_config_rwlock
> >  state: XLOCK: 0x879e8480 (tid 100130, pid 172, "txg_thread_enter")
> >  waiters: shared
> > db>
>
> Could you also try something like the following from DDB:
>
>  x/bx 0x879ad8a0,52

db> x/bx 0x879ad8a0,52
0x879ad8a0: 82 1a 4b 87  0  0 71  2  0  0  0  0  0  0  0  0
0x879ad8b0:  1  0  0  0 91 1a 4b 87  4  0  0  0 40 92 cb 87
0x879ad8c0:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
0x879ad8d0:  0  0  0  0  9 1f 4b 87  0  0 71  2  0  0  0  0
0x879ad8e0:  0  0  0  0 40 92 cb 87 e8  0  0  0 c8  0  0  0
0x879ad8f0: a8 bb
db>

-Juergen-

--
ENOSIG

From pjd at FreeBSD.org Tue Aug 4 19:50:55 2009
From: pjd at FreeBSD.org (Pawel Jakub Dawidek)
Date: Tue Aug 4 19:51:02 2009
Subject: zfs: Fatal trap 12: page fault while in kernel mode
In-Reply-To: <20090804095648.GL5813@jpru.ffm.jpru.de>
References: <4A6E06E6.9030300@mail.zedat.fu-berlin.de>
 <4A6EC9E2.5070200@icyb.net.ua>
 <20090729084723.GD1586@garage.freebsd.pl>
 <20090802092714.GA5813@jpru.ffm.jpru.de>
 <20090802093016.GB3071@garage.freebsd.pl>
 <20090803203226.GE5813@jpru.ffm.jpru.de>
 <20090804073416.GA4479@garage.freebsd.pl>
 <20090804075329.GI5813@jpru.ffm.jpru.de>
 <20090804094950.GD4479@garage.freebsd.pl>
 <20090804095648.GL5813@jpru.ffm.jpru.de>
Message-ID: <20090804195112.GB2181@garage.freebsd.pl>

On Tue, Aug 04, 2009 at 11:56:48AM +0200, Juergen Unger wrote:
> On Tue, Aug 04, 2009 at 11:49:50AM +0200, Pawel Jakub Dawidek wrote:
> > > testbox# sysctl debug.kdb.enter=1
> > > KDB: enter: sysctl debug.kdb.enter
> > > [thread pid 11635 tid 100472 ]
> > > Stopped at kdb_enter+0x3a: movl $0,kdb_why
> > > db> show lock 0x87aac290
> > >  class: sx
> > >  name: dp->dp_config_rwlock
> > >  state: XLOCK: 0x879e8480 (tid 100130, pid 172, "txg_thread_enter")
> > >  waiters: shared
> > > db>
> >
> > Could you also try something like the following from DDB:
> >
> >  x/bx 0x879ad8a0,52
>
> db> x/bx 0x879ad8a0,52
> 0x879ad8a0: 82 1a 4b 87  0  0 71  2  0  0  0  0  0  0  0  0
> 0x879ad8b0:  1  0  0  0 91 1a 4b 87  4  0  0  0 40 92 cb 87
> 0x879ad8c0:  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0
> 0x879ad8d0:  0  0  0  0  9 1f 4b 87  0  0 71  2  0  0  0  0
> 0x879ad8e0:  0  0  0  0 40 92 cb 87 e8  0  0  0 c8  0  0  0
> 0x879ad8f0: a8 bb
> db>

This is dump of ZFS-specific rrwlock that some threads are waiting for.
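A minimal sketch of how a byte-wise dump like the one above can be read, assuming a little-endian machine with 4-byte pointers (the 32-bit addresses in this thread suggest i386). The helper name le32 is made up for illustration and is not part of DDB or ZFS, and the sketch makes no claim about which structure field each word belongs to. It only shows that the byte sequence "40 92 cb 87", which appears in rows 0x879ad8b0 and 0x879ad8e0, reads as the 32-bit value 0x87cb9240.

```python
def le32(byte_tokens):
    """Combine four hex byte tokens (lowest address first) into one 32-bit value."""
    if len(byte_tokens) != 4:
        raise ValueError("expected exactly four bytes")
    value = 0
    for index, token in enumerate(byte_tokens):
        # The byte at the lowest address is the least significant one.
        value |= int(token, 16) << (8 * index)
    return value


# Row 0x879ad8b0 of the dump, one token per byte as printed by `x/bx`.
row = "1 0 0 0 91 1a 4b 87 4 0 0 0 40 92 cb 87".split()

# Split the 16-byte row into four 32-bit words.
words = [le32(row[i:i + 4]) for i in range(0, len(row), 4)]
print([hex(word) for word in words])
```

The last word of that row is the value to compare against the thread addresses listed in the 'show alltrace' output.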
We can see here that thread owning the lock is 0x87cb9240, which is pid 86397 (zfs recv process). I don't think we will be able to gather more info from here. I'm builing HEAD at the moment and hopefully will be able to reproduce it. Thanks for all the info. -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090804/441fced3/attachment.pgp From serenity at exscape.org Tue Aug 4 20:11:44 2009 From: serenity at exscape.org (Thomas Backman) Date: Tue Aug 4 20:12:06 2009 Subject: zfs: Fatal trap 12: page fault while in kernel mode In-Reply-To: <20090804195112.GB2181@garage.freebsd.pl> References: <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <20090802092714.GA5813@jpru.ffm.jpru.de> <20090802093016.GB3071@garage.freebsd.pl> <20090803203226.GE5813@jpru.ffm.jpru.de> <20090804073416.GA4479@garage.freebsd.pl> <20090804075329.GI5813@jpru.ffm.jpru.de> <20090804094950.GD4479@garage.freebsd.pl> <20090804095648.GL5813@jpru.ffm.jpru.de> <20090804195112.GB2181@garage.freebsd.pl> Message-ID: On Aug 4, 2009, at 21:51, Pawel Jakub Dawidek wrote: > I'm builing HEAD at the moment and hopefully will be > able to reproduce it. Thanks for all the info. Hey Pawel, Sorry to bother (again!), but... If you're building HEAD, could you please look in to the send -R / zfs recv segfault? http://lists.freebsd.org/pipermail/freebsd-current/2009-July/ 010156.html for the patch by its creator, and I hosted it at http://exscape.org/temp/libzfs_sendrecv.new.patch since spacing (I guess) made the patch not work for me by copying/ pasting. 
It's a simple patch, fixing a big bug (not being able to replicate whole pools properly!), and it works great. For each day that passes, it feels as if this, (and your equally important zfs_vnops work, which is also pretty vital), won't make it into 8.0... Which is why I keep either reminding or bugging people who both know this stuff and can actually make it happen. ;) Regards (and apologies), Thomas From pjd at FreeBSD.org Tue Aug 4 20:25:12 2009 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Tue Aug 4 20:25:19 2009 Subject: zfs: Fatal trap 12: page fault while in kernel mode In-Reply-To: References: <20090729084723.GD1586@garage.freebsd.pl> <20090802092714.GA5813@jpru.ffm.jpru.de> <20090802093016.GB3071@garage.freebsd.pl> <20090803203226.GE5813@jpru.ffm.jpru.de> <20090804073416.GA4479@garage.freebsd.pl> <20090804075329.GI5813@jpru.ffm.jpru.de> <20090804094950.GD4479@garage.freebsd.pl> <20090804095648.GL5813@jpru.ffm.jpru.de> <20090804195112.GB2181@garage.freebsd.pl> Message-ID: <20090804202528.GE2181@garage.freebsd.pl> On Tue, Aug 04, 2009 at 10:10:51PM +0200, Thomas Backman wrote: > > On Aug 4, 2009, at 21:51, Pawel Jakub Dawidek wrote: > >I'm builing HEAD at the moment and hopefully will be > >able to reproduce it. Thanks for all the info. > Hey Pawel, > Sorry to bother (again!), but... If you're building HEAD, could you > please look in to the send -R / zfs recv segfault? > http://lists.freebsd.org/pipermail/freebsd-current/2009-July/ > 010156.html for the patch by its creator, and I hosted it at > http://exscape.org/temp/libzfs_sendrecv.new.patch since spacing (I guess) > made the patch not work for me by copying/ pasting. > > It's a simple patch, fixing a big bug (not being able to replicate > whole pools properly!), and it works great. For each day that passes, > it feels as if this, (and your equally important zfs_vnops work, which > is also pretty vital), won't make it into 8.0... 
Which is why I keep > either reminding or bugging people who both know this stuff and can > actually make it happen. ;) This patch is one of the reasons I'm building HEAD, stay tuned:) -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090804/829cc82a/attachment.pgp From spawk at acm.poly.edu Tue Aug 4 21:39:28 2009 From: spawk at acm.poly.edu (Boris Kochergin) Date: Tue Aug 4 21:39:35 2009 Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs Message-ID: <4A78AA71.9050107@acm.poly.edu> Ahoy. I have a seven-disk RAID-Z pool in a 8-BETA2/amd64 machine. One of the disks (ad13) failed to write something today, and the system proceeded to panic. I couldn't get a dump or any otherwise useful information, but the panic made reference to "vdev_is_dead". 
Upon reboot, it panics again, probably when "zfs mount" is called by its
rc.d script:

Fatal trap 9: general protection fault while in kernel mode
instruction pointer     = 0x20:0xffffffff807cbdbb
stack pointer           = 0x28:0xffffff8077bf54c0
frame pointer           = 0x28:0xffffff8077bf54d0
code segment            = base 0x0, limit 0xfffff, type 0x1b
                        = DPL 0, pres 1, long 1, def32 0, gran 1
processor eflags        = interrupt enabled, resume, IOPL = 0
current process         = 82 (zfs)
panic: from debugger
Uptime: 13s
Physical memory: 4081 MB
Dumping 1245 MB: 1230 1214 1198 1182 1166 1150 1134 1118 1102 1086
1070 1054 1038 1022 1006 990 974 958 942 926 910 894 878 862 846 830
814 798 782 766 750 734 718 702 686 670 654 638 622 606 590 574 558
542 526 510 494 478 462 446 430 414 398 382 366 350 334 318 302 286
270 254 238 222 206 190 174 158 142 126 110 94 78 62 46 30 14

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
#0  doadump () at pcpu.h:223
223     pcpu.h: No such file or directory.
        in pcpu.h
(kgdb) where
#0  doadump () at pcpu.h:223
#1  0xffffffff8058ff11 in boot (howto=260)
    at /usr/src/sys/kern/kern_shutdown.c:419
#2  0xffffffff805902eb in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:575
#3  0xffffffff801d9997 in db_panic (addr=Variable "addr" is not available.
) at /usr/src/sys/ddb/db_command.c:478
#4  0xffffffff801d9da1 in db_command (last_cmdp=0xffffffff80bd5120,
    cmd_table=Variable "cmd_table" is not available.
) at /usr/src/sys/ddb/db_command.c:445
#5  0xffffffff801d9ff0 in db_command_loop ()
    at /usr/src/sys/ddb/db_command.c:498
#6  0xffffffff801dbf79 in db_trap (type=Variable "type" is not available.
) at /usr/src/sys/ddb/db_main.c:229
#7  0xffffffff805bbd94 in kdb_trap (type=9, code=0, tf=Variable "tf" is not available.
) at /usr/src/sys/kern/subr_kdb.c:534
#8  0xffffffff8086dc5d in trap_fatal (frame=0xffffff8077bf5410, eva=0)
    at /usr/src/sys/amd64/amd64/trap.c:847
#9  0xffffffff8086e74d in trap (frame=0xffffff8077bf5410)
    at /usr/src/sys/amd64/amd64/trap.c:639
#10 0xffffffff80857403 in calltrap ()
    at /usr/src/sys/amd64/amd64/exception.S:224
#11 0xffffffff807cbdbb in slab_alloc_item (zone=Variable "zone" is not available.
) at /usr/src/sys/vm/uma_core.c:2300
#12 0xffffffff807ce80e in zone_alloc_item (zone=0xffffff00dffae000,
    udata=0x0, flags=259) at /usr/src/sys/vm/uma_core.c:2475
#13 0xffffffff807cee03 in keg_alloc_slab (keg=0xffffff00dffad460,
    zone=0xffffff00dffac380, wait=259) at /usr/src/sys/vm/uma_core.c:826
#14 0xffffffff807cf177 in keg_fetch_slab (keg=0xffffff00dffad460,
    zone=0xffffff00dffac380, flags=259) at /usr/src/sys/vm/uma_core.c:2152
#15 0xffffffff807cf21e in zone_fetch_slab (zone=0xffffff00dffac380,
    keg=0xffffff00dffad460, flags=259) at /usr/src/sys/vm/uma_core.c:2212
#16 0xffffffff807d05eb in uma_zalloc_arg (zone=0xffffff00dffac380,
    udata=0x0, flags=259) at /usr/src/sys/vm/uma_core.c:2381
#17 0xffffffff8057e727 in malloc (size=Variable "size" is not available.
) at uma.h:305
#18 0xffffffff81060365 in metaslab_init (mg=0xffffff0004472980,
    smo=0xffffff8077bf5730, start=530428461056, size=2147483648, txg=0)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:294
#19 0xffffffff81071b3e in vdev_metaslab_init (vd=0xffffff0001ecf800, txg=0)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:796
#20 0xffffffff81071da5 in vdev_load (vd=0xffffff0001ecf800)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:1531
#21 0xffffffff81071c75 in vdev_load (vd=0xffffff0001ed1800)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:1526
#22 0xffffffff8106539c in spa_load (spa=0xffffff0001ff0000, config=Variable "config" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1361
#23 0xffffffff81064ee1 in spa_load (spa=0xffffff0001ff0000, config=Variable "config" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1189
#24 0xffffffff810658fd in spa_open_common (pool=Variable "pool" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1474
#25 0xffffffff81065a52 in spa_get_stats (name=0xffffff0001ff5000 "home",
    config=0xffffff8077bf59e0, altroot=0xffffff0001ff5400 "", buflen=1024)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1671
#26 0xffffffff81093e7c in zfs_ioc_pool_stats (zc=0xffffff0001ff5000)
    at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:914
#27 0xffffffff810941c4 in zfsdev_ioctl (dev=Variable "dev" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:3022
#28 0xffffffff80511c76 in devfs_ioctl_f (fp=0xffffff0001f4bc80,
    com=3425196549, data=0xffffff0001ff5000, cred=Variable "cred" is not available.
) at /usr/src/sys/fs/devfs/devfs_vnops.c:659
#29 0xffffffff805cb166 in kern_ioctl (td=0xffffff0001f0c390, fd=3,
    com=3425196549, data=0xffffff0001ff5000 "home") at file.h:262
#30 0xffffffff805cb38e in ioctl (td=0xffffff0001f0c390,
    uap=0xffffff8077bf5bf0) at /usr/src/sys/kern/sys_generic.c:678
#31 0xffffffff8086e28f in syscall (frame=0xffffff8077bf5c80)
    at /usr/src/sys/amd64/amd64/trap.c:984
#32 0xffffffff808576e1 in Xfast_syscall ()
    at /usr/src/sys/amd64/amd64/exception.S:373
#33 0x0000000800fe1d0c in ?? ()

Booting the system without the disk causes any "zfs" or "zpool" commands
to hang the system after a while. Breaking to DDB doesn't work using a
keyboard and VGA (I don't have any other kind of gear here).

In case it is relevant, the pool started life as version 6 and was
upgraded using 7.2-STABLE shortly after the version 13 MFC.

The output of "zdb" with all disks connected:

home
    version=13
    name='home'
    state=0
    txg=16061492
    pool_guid=14089219607492705674
    hostid=413956888
    hostname='unset'
    vdev_tree
        type='root'
        id=0
        guid=14089219607492705674
        children[0]
            type='raidz'
            id=0
            guid=17899218839424019335
            nparity=1
            metaslab_array=14
            metaslab_shift=31
            ashift=9
            asize=2800585539584
            is_log=0
            children[0]
                type='disk'
                id=0
                guid=15839907043443901501
                path='/dev/ad4'
                devid='ad:3QK08728'
                whole_disk=0
                DTL=389
            children[1]
                type='disk'
                id=1
                guid=13623369126078337737
                path='/dev/ad16'
                devid='ad:9QH04HJN'
                whole_disk=0
                DTL=391
            children[2]
                type='disk'
                id=2
                guid=15619490422714555908
                path='/dev/ad14'
                devid='ad:5NF1DDXR'
                whole_disk=0
                DTL=390
            children[3]
                type='disk'
                id=3
                guid=6995275135550350664
                path='/dev/ad15'
                devid='ad:9QG93JHX'
                whole_disk=0
                DTL=386
            children[4]
                type='disk'
                id=4
                guid=10651992494569677081
                path='/dev/ad13'
                devid='ad:9QH04GTY'
                whole_disk=0
                DTL=388
            children[5]
                type='disk'
                id=5
                guid=10503557489947490214
                path='/dev/ad18'
                devid='ad:5NF1DDVB'
                whole_disk=0
                DTL=387
            children[6]
                type='disk'
                id=6
                guid=17574056058658811312
                path='/dev/ad12'
                devid='ad:9QG90QA2'
                whole_disk=0
                DTL=392

Can anyone
help? I would be content to at least have access to the filesystem in degraded mode. -Boris From spawk at acm.poly.edu Tue Aug 4 22:01:51 2009 From: spawk at acm.poly.edu (Boris Kochergin) Date: Tue Aug 4 22:01:58 2009 Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs In-Reply-To: <4A78AA71.9050107@acm.poly.edu> References: <4A78AA71.9050107@acm.poly.edu> Message-ID: <4A78AFB2.10103@acm.poly.edu> Boris Kochergin wrote: > Ahoy. I have a seven-disk RAID-Z pool in a 8-BETA2/amd64 machine. One > of the disks (ad13) failed to write something today, and the system > proceeded to panic. I couldn't get a dump or any otherwise useful > information, but the panic made reference to "vdev_is_dead". Upon > reboot, it panics again, probably when "zfs mount" is called by its > rc.d script: > > Fatal trap 9: general protection fault while in kernel mode > instruction pointer = 0x20:0xffffffff807cbdbb > stack pointer = 0x28:0xffffff8077bf54c0 > frame pointer = 0x28:0xffffff8077bf54d0 > code segment = base 0x0, limit 0xfffff, type 0x1b > = DPL 0, pres 1, long 1, def32 0, gran 1 > processor eflags = interrupt enabled, resume, IOPL = 0 > current process = 82 (zfs) > panic: from debugger > Uptime: 13s > Physical memory: 4081 MB > Dumping 1245 MB: 1230 1214 1198 1182 1166 1150 1134 1118 1102 1086 > 1070 1054 1038 1022 1006 990 974 958 942 926 910 894 878 862 846 830 > 814 798 782 766 750 734 718 702 686 670 654 638 622 606 590 574 558 > 542 526 510 494 478 462 446 430 414 398 382 366 350 334 318 302 286 > 270 254 238 222 206 190 174 158 142 126 110 94 78 62 46 30 14 > > Reading symbols from /boot/kernel/zfs.ko...Reading symbols from > /boot/kernel/zfs.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/zfs.ko > Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols > from /boot/kernel/opensolaris.ko.symbols...done. > done. > Loaded symbols for /boot/kernel/opensolaris.ko > #0 doadump () at pcpu.h:223 > 223 pcpu.h: No such file or directory. 
> in pcpu.h > (kgdb) where > #0 doadump () at pcpu.h:223 > #1 0xffffffff8058ff11 in boot (howto=260) at > /usr/src/sys/kern/kern_shutdown.c:419 > #2 0xffffffff805902eb in panic (fmt=Variable "fmt" is not available. > ) at /usr/src/sys/kern/kern_shutdown.c:575 > #3 0xffffffff801d9997 in db_panic (addr=Variable "addr" is not > available. > ) at /usr/src/sys/ddb/db_command.c:478 > #4 0xffffffff801d9da1 in db_command (last_cmdp=0xffffffff80bd5120, > cmd_table=Variable "cmd_table" is not available. > ) at /usr/src/sys/ddb/db_command.c:445 > #5 0xffffffff801d9ff0 in db_command_loop () at > /usr/src/sys/ddb/db_command.c:498 > #6 0xffffffff801dbf79 in db_trap (type=Variable "type" is not available. > ) at /usr/src/sys/ddb/db_main.c:229 > #7 0xffffffff805bbd94 in kdb_trap (type=9, code=0, tf=Variable "tf" > is not available. > ) at /usr/src/sys/kern/subr_kdb.c:534 > #8 0xffffffff8086dc5d in trap_fatal (frame=0xffffff8077bf5410, eva=0) > at /usr/src/sys/amd64/amd64/trap.c:847 > #9 0xffffffff8086e74d in trap (frame=0xffffff8077bf5410) at > /usr/src/sys/amd64/amd64/trap.c:639 > #10 0xffffffff80857403 in calltrap () at > /usr/src/sys/amd64/amd64/exception.S:224 > #11 0xffffffff807cbdbb in slab_alloc_item (zone=Variable "zone" is not > available. 
> ) at /usr/src/sys/vm/uma_core.c:2300 > #12 0xffffffff807ce80e in zone_alloc_item (zone=0xffffff00dffae000, > udata=0x0, flags=259) at /usr/src/sys/vm/uma_core.c:2475 > #13 0xffffffff807cee03 in keg_alloc_slab (keg=0xffffff00dffad460, > zone=0xffffff00dffac380, wait=259) at /usr/src/sys/vm/uma_core.c:826 > #14 0xffffffff807cf177 in keg_fetch_slab (keg=0xffffff00dffad460, > zone=0xffffff00dffac380, flags=259) at /usr/src/sys/vm/uma_core.c:2152 > #15 0xffffffff807cf21e in zone_fetch_slab (zone=0xffffff00dffac380, > keg=0xffffff00dffad460, flags=259) at /usr/src/sys/vm/uma_core.c:2212 > #16 0xffffffff807d05eb in uma_zalloc_arg (zone=0xffffff00dffac380, > udata=0x0, flags=259) at /usr/src/sys/vm/uma_core.c:2381 > #17 0xffffffff8057e727 in malloc (size=Variable "size" is not available. > ) at uma.h:305 > #18 0xffffffff81060365 in metaslab_init (mg=0xffffff0004472980, > smo=0xffffff8077bf5730, start=530428461056, size=2147483648, txg=0) at > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/metaslab.c:294 > > #19 0xffffffff81071b3e in vdev_metaslab_init (vd=0xffffff0001ecf800, > txg=0) at > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:796 > > #20 0xffffffff81071da5 in vdev_load (vd=0xffffff0001ecf800) at > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:1531 > > #21 0xffffffff81071c75 in vdev_load (vd=0xffffff0001ed1800) at > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:1526 > > #22 0xffffffff8106539c in spa_load (spa=0xffffff0001ff0000, > config=Variable "config" is not available. > ) at > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1361 > > #23 0xffffffff81064ee1 in spa_load (spa=0xffffff0001ff0000, > config=Variable "config" is not available. 
> ) at > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1189 > > #24 0xffffffff810658fd in spa_open_common (pool=Variable "pool" is not > available. > ) at > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1474 > > #25 0xffffffff81065a52 in spa_get_stats (name=0xffffff0001ff5000 > "home", config=0xffffff8077bf59e0, altroot=0xffffff0001ff5400 "", > buflen=1024) > at > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1671 > > #26 0xffffffff81093e7c in zfs_ioc_pool_stats (zc=0xffffff0001ff5000) > at > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:914 > > #27 0xffffffff810941c4 in zfsdev_ioctl (dev=Variable "dev" is not > available. > ) at > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:3022 > > #28 0xffffffff80511c76 in devfs_ioctl_f (fp=0xffffff0001f4bc80, > com=3425196549, data=0xffffff0001ff5000, cred=Variable "cred" is not > available. > ) at /usr/src/sys/fs/devfs/devfs_vnops.c:659 > #29 0xffffffff805cb166 in kern_ioctl (td=0xffffff0001f0c390, fd=3, > com=3425196549, data=0xffffff0001ff5000 "home") at file.h:262 > #30 0xffffffff805cb38e in ioctl (td=0xffffff0001f0c390, > uap=0xffffff8077bf5bf0) at /usr/src/sys/kern/sys_generic.c:678 > #31 0xffffffff8086e28f in syscall (frame=0xffffff8077bf5c80) at > /usr/src/sys/amd64/amd64/trap.c:984 > #32 0xffffffff808576e1 in Xfast_syscall () at > /usr/src/sys/amd64/amd64/exception.S:373 > #33 0x0000000800fe1d0c in ?? () > > Booting the system without the disk causes any "zfs" or "zpool" > commands to hang the system after a while. Breaking to DDB doesn't > work using a keyboard and VGA (I don't have any other kind of gear > here). In case it is relevant, the pool started life as version 6 and > was upgraded using 7.2-STABLE shortly after the version 13 MFC. 
The > output of "zdb" with all disks connected: > > home > version=13 > name='home' > state=0 > txg=16061492 > pool_guid=14089219607492705674 > hostid=413956888 > hostname='unset' > vdev_tree > type='root' > id=0 > guid=14089219607492705674 > children[0] > type='raidz' > id=0 > guid=17899218839424019335 > nparity=1 > metaslab_array=14 > metaslab_shift=31 > ashift=9 > asize=2800585539584 > is_log=0 > children[0] > type='disk' > id=0 > guid=15839907043443901501 > path='/dev/ad4' > devid='ad:3QK08728' > whole_disk=0 > DTL=389 > children[1] > type='disk' > id=1 > guid=13623369126078337737 > path='/dev/ad16' > devid='ad:9QH04HJN' > whole_disk=0 > DTL=391 > children[2] > type='disk' > id=2 > guid=15619490422714555908 > path='/dev/ad14' > devid='ad:5NF1DDXR' > whole_disk=0 > DTL=390 > children[3] > type='disk' > id=3 > guid=6995275135550350664 > path='/dev/ad15' > devid='ad:9QG93JHX' > whole_disk=0 > DTL=386 > children[4] > type='disk' > id=4 > guid=10651992494569677081 > path='/dev/ad13' > devid='ad:9QH04GTY' > whole_disk=0 > DTL=388 > children[5] > type='disk' > id=5 > guid=10503557489947490214 > path='/dev/ad18' > devid='ad:5NF1DDVB' > whole_disk=0 > DTL=387 > children[6] > type='disk' > id=6 > guid=17574056058658811312 > path='/dev/ad12' > devid='ad:9QG90QA2' > whole_disk=0 > DTL=392 > > Can anyone help? I would be content to at least have access to the > filesystem in degraded mode. 
> > -Boris In a subsequent attempt at "zfs mount -a", the following panic happened: Fatal trap 12: page fault while in kernel mode fault virtual address = 0xffffffff813dadb5 fault code = supervisor write data, page not present instruction pointer = 0x20:0xffffffff805951a5 stack pointer = 0x28:0xffffff8077eb3360 frame pointer = 0x28:0xffffff8077eb3370 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 832 (zfs) panic: from debugger Uptime: 2m32s Physical memory: 4082 MB Dumping 1282 MB: 1267 1251 1235 1219 1203 1187 1171 1155 1139 1123 1107 1091 1075 1059 1043 1027 1011 995 979 963 947 931 915 899 883 867 851 835 819 803 787 771 755 739 723 707 691 675 659 643 627 611 595 579 563 547 531 515 499 483 467 451 435 419 403 387 371 355 339 323 307 291 275 259 243 227 211 195 179 163 147 131 115 99 83 67 51 35 19 3 Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done. done. Loaded symbols for /boot/kernel/zfs.ko Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done. done. Loaded symbols for /boot/kernel/opensolaris.ko #0 doadump () at pcpu.h:223 223 pcpu.h: No such file or directory. in pcpu.h (kgdb) where #0 doadump () at pcpu.h:223 #1 0xffffffff8058d881 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:419 #2 0xffffffff8058dc5b in panic (fmt=Variable "fmt" is not available. ) at /usr/src/sys/kern/kern_shutdown.c:575 #3 0xffffffff801d9767 in db_panic (addr=Variable "addr" is not available.
) at /usr/src/sys/ddb/db_command.c:478 #4 0xffffffff801d9b71 in db_command (last_cmdp=0xffffffff80bd2120, cmd_table=Variable "cmd_table" is not available. ) at /usr/src/sys/ddb/db_command.c:445 #5 0xffffffff801d9dc0 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498 #6 0xffffffff801dbd49 in db_trap (type=Variable "type" is not available. ) at /usr/src/sys/ddb/db_main.c:229 #7 0xffffffff805b9704 in kdb_trap (type=12, code=0, tf=Variable "tf" is not available. ) at /usr/src/sys/kern/subr_kdb.c:534 #8 0xffffffff8086b5cd in trap_fatal (frame=0xffffff8077eb32b0, eva=18446744071582887349) at /usr/src/sys/amd64/amd64/trap.c:847 #9 0xffffffff8086b994 in trap_pfault (frame=0xffffff8077eb32b0, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:768 #10 0xffffffff8086c16b in trap (frame=0xffffff8077eb32b0) at /usr/src/sys/amd64/amd64/trap.c:494 #11 0xffffffff80854d73 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224 #12 0xffffffff805951a5 in _sx_xlock (sx=0xffffffff813dad9d, opts=0, file=0xffffffff810f57f0 "/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c", line=967) at atomic.h:147 #13 0xffffffff810392e5 in add_reference (ab=0xffffff002c03b340, hash_lock=Variable "hash_lock" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:967 #14 0xffffffff8103d377 in arc_buf_add_ref (buf=0xffffff0003ee87e0, tag=0xffffff002c046c40) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1208 #15 0xffffffff8103fe0d in dbuf_hold_impl (dn=0xffffff0003eec300, level=Variable "level" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1633 #16 0xffffffff81040ddb in dbuf_hold (dn=Variable "dn" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1689 #17 0xffffffff8104d5bc in dnode_hold_impl (os=0xffffff0003a01400, object=754, flag=1, tag=Variable "tag" is not available. 
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dnode.c:584 #18 0xffffffff81042c5a in dmu_bonus_hold (os=Variable "os" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:147 #19 0xffffffff81071bb7 in vdev_metaslab_init (vd=0xffffff00036dc800, txg=0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:787 #20 0xffffffff81071da5 in vdev_load (vd=0xffffff00036dc800) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:1531 #21 0xffffffff81071c75 in vdev_load (vd=0xffffff00036db800) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/vdev.c:1526 #22 0xffffffff8106539c in spa_load (spa=0xffffff00034f7000, config=Variable "config" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1361 #23 0xffffffff81064ee1 in spa_load (spa=0xffffff00034f7000, config=Variable "config" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1189 #24 0xffffffff810658fd in spa_open_common (pool=Variable "pool" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/spa.c:1474 #25 0xffffffff810512af in dsl_dir_open_spa (spa=0x0, name=Variable "name" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dir.c:314 #26 0xffffffff8105627b in dsl_dataset_hold (name=Variable "name" is not available. 
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dsl_dataset.c:571 #27 0xffffffff8104867f in dmu_objset_open (name=0xffffff0003013000 "home", type=DMU_OST_ANY, mode=9, osp=0xffffff8077eb39e0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_objset.c:349 #28 0xffffffff810936e2 in zfs_ioc_objset_stats (zc=0xffffff0003013000) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:1231 #29 0xffffffff810941c4 in zfsdev_ioctl (dev=Variable "dev" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_ioctl.c:3022 #30 0xffffffff80511a46 in devfs_ioctl_f (fp=0xffffff00034eb230, com=3425196561, data=0xffffff0003013000, cred=Variable "cred" is not available. ) at /usr/src/sys/fs/devfs/devfs_vnops.c:659 #31 0xffffffff805c8ad6 in kern_ioctl (td=0xffffff00037ac720, fd=3, com=3425196561, data=0xffffff0003013000 "home") at file.h:262 #32 0xffffffff805c8cfe in ioctl (td=0xffffff00037ac720, uap=0xffffff8077eb3bf0) at /usr/src/sys/kern/sys_generic.c:678 #33 0xffffffff8086bbff in syscall (frame=0xffffff8077eb3c80) at /usr/src/sys/amd64/amd64/trap.c:984 #34 0xffffffff80855051 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:373 #35 0x0000000800fe1d0c in ?? () This isn't related to the problems described in the "zfs: Fatal trap 12: page fault while in kernel mode" thread, is it? I've poked around on it and the panics look different. -Boris From andrew at modulus.org Wed Aug 5 01:00:00 2009 From: andrew at modulus.org (Andrew Snow) Date: Wed Aug 5 01:00:06 2009 Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs In-Reply-To: <4A78AFB2.10103@acm.poly.edu> References: <4A78AA71.9050107@acm.poly.edu> <4A78AFB2.10103@acm.poly.edu> Message-ID: <4A78D597.8030907@modulus.org> Have you tried setting the "failmode = continue" property on the zpool? The default failmode is "panic". 
- Andrew

From spawk at acm.poly.edu Wed Aug 5 01:38:59 2009
From: spawk at acm.poly.edu (Boris Kochergin)
Date: Wed Aug 5 01:39:05 2009
Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs
In-Reply-To: <4A78D597.8030907@modulus.org>
References: <4A78AA71.9050107@acm.poly.edu> <4A78AFB2.10103@acm.poly.edu>
    <4A78D597.8030907@modulus.org>
Message-ID: <4A78E294.2000206@acm.poly.edu>

Andrew Snow wrote:
>
> Have you tried setting the "failmode = continue" property on the
> zpool? The default failmode is "panic".
>
>
> - Andrew
>
>
The default failmode appears to be "wait," as that is what it is on all my other machines with which I have not fiddled. I wouldn't be able to get far enough to set it, either way. Thanks, though.

-Boris

From pjd at FreeBSD.org Wed Aug 5 06:50:07 2009
From: pjd at FreeBSD.org (Pawel Jakub Dawidek)
Date: Wed Aug 5 06:50:20 2009
Subject: zfs: Fatal trap 12: page fault while in kernel mode
In-Reply-To:
References: <4A719CA4.4060400@freebsd.org>
    <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org>
    <4A71AD29.10705@freebsd.org>
    <7544AED1-1216-4A24-B287-F54117641F76@exscape.org>
    <4A71B239.8060007@freebsd.org>
    <3AA3C1CB-CEF7-46CC-A9C7-1648093D679E@exscape.org>
    <4A71BED8.7050300@freebsd.org>
Message-ID: <20090805065022.GI2181@garage.freebsd.pl>

On Fri, Jul 31, 2009 at 11:05:01AM +0200, Thomas Backman wrote:
> I'm able to reliably reproduce this panic, by having zfs recv destroy
> a filesystem on the receiving end.
>
> 1) Use DDEBUG=1, I guess
> 2) Create a FS on the source pool you don't care about: zfs create -o
> mountpoint=/testfs source/testfs
> 3) Clone a pool to another: zfs snapshot -r source@snap && zfs send -R
> source@snap | zfs recv -Fvd target
> 4) zfs destroy -r source/testfs
> 5) zfs snapshot -r source@snap2 && zfs send -R -I snap source@snap2 |
> zfs recv -Fvd target
> 6) ^ Panic while receiving the FS the destroyed one is mounted under.
> In my case, this was tank/root three times out of three; I then tried
> creating testfs under /tmp (tank/tmp/testfs), *mounting* it under
> /usr/testfs, and it panics on receiving tank/usr:
[...]

I repeated precisely those steps and it doesn't panic for me.
Could you confirm that you use this patch?

	http://people.freebsd.org/~pjd/patches/zfs_vnops.c.2.patch

If so, could you give me exact steps and all of them how to reproduce it?
Starting with pool creation.

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090805/293dfb77/attachment.pgp

From serenity at exscape.org Wed Aug 5 07:09:35 2009
From: serenity at exscape.org (Thomas Backman)
Date: Wed Aug 5 07:09:48 2009
Subject: zfs: Fatal trap 12: page fault while in kernel mode
In-Reply-To: <20090805065022.GI2181@garage.freebsd.pl>
References: <4A719CA4.4060400@freebsd.org>
    <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org>
    <4A71AD29.10705@freebsd.org>
    <7544AED1-1216-4A24-B287-F54117641F76@exscape.org>
    <4A71B239.8060007@freebsd.org>
    <3AA3C1CB-CEF7-46CC-A9C7-1648093D679E@exscape.org>
    <4A71BED8.7050300@freebsd.org>
    <20090805065022.GI2181@garage.freebsd.pl>
Message-ID: <7C3499A8-A389-4F28-A800-B6C31B9E09C4@exscape.org>

On Aug 5, 2009, at 08:50, Pawel Jakub Dawidek wrote:

> On Fri, Jul 31, 2009 at 11:05:01AM +0200, Thomas Backman wrote:
>> I'm able to reliably reproduce this panic, by having zfs recv destroy
>> a filesystem on the receiving end.
>>
>> 1) Use DDEBUG=1, I guess
>> 2) Create a FS on the source pool you don't care about: zfs create -o
>> mountpoint=/testfs source/testfs
>> 3) Clone a pool to another: zfs snapshot -r source@snap && zfs send -R
>> source@snap | zfs recv -Fvd target
>> 4) zfs destroy -r source/testfs
>> 5) zfs snapshot -r source@snap2 && zfs send -R -I snap source@snap2 |
>> zfs recv -Fvd target
>> 6) ^ Panic while receiving the FS the destroyed one is mounted under.
>> In my case, this was tank/root three times out of three; I then tried
>> creating testfs under /tmp (tank/tmp/testfs), *mounting* it under
>> /usr/testfs, and it panics on receiving tank/usr:
> [...]
>
> I repeated precisely those steps and it doesn't panic for me.
> Could you confirm that you use this patch?
>
> http://people.freebsd.org/~pjd/patches/zfs_vnops.c.2.patch
>
> If so, could you give me exact steps and all of them how to
> reproduce it?
> Starting with pool creation.

Yup, I'm using that patch (I diffed the diffs, heh). I'll try to write a script to recreate the panic; I hope it's as easy as in real-world conditions though.

Regards,
Thomas

From serenity at exscape.org Wed Aug 5 07:21:27 2009
From: serenity at exscape.org (Thomas Backman)
Date: Wed Aug 5 07:21:34 2009
Subject: zfs: Fatal trap 12: page fault while in kernel mode
In-Reply-To: <7C3499A8-A389-4F28-A800-B6C31B9E09C4@exscape.org>
References: <4A719CA4.4060400@freebsd.org>
    <19347561-3CE6-40B3-930A-EB9925D3AFD1@exscape.org>
    <4A71AD29.10705@freebsd.org>
    <7544AED1-1216-4A24-B287-F54117641F76@exscape.org>
    <4A71B239.8060007@freebsd.org>
    <3AA3C1CB-CEF7-46CC-A9C7-1648093D679E@exscape.org>
    <4A71BED8.7050300@freebsd.org>
    <20090805065022.GI2181@garage.freebsd.pl>
    <7C3499A8-A389-4F28-A800-B6C31B9E09C4@exscape.org>
Message-ID: <3ECC4BA0-F1EF-4039-9F39-68532851B572@exscape.org>

On Aug 5, 2009, at 09:09, Thomas Backman wrote:

>
> On Aug 5, 2009, at 08:50, Pawel Jakub Dawidek wrote:
>
>> On Fri, Jul 31, 2009 at 11:05:01AM +0200, Thomas Backman wrote:
>>> I'm able to reliably reproduce this panic, by having zfs recv destroy
>>> a filesystem on the receiving end.
>>>
>>> 1) Use DDEBUG=1, I guess
>>> 2) Create a FS on the source pool you don't care about: zfs create -o
>>> mountpoint=/testfs source/testfs
>>> 3) Clone a pool to another: zfs snapshot -r source@snap && zfs send -R
>>> source@snap | zfs recv -Fvd target
>>> 4) zfs destroy -r source/testfs
>>> 5) zfs snapshot -r source@snap2 && zfs send -R -I snap source@snap2 |
>>> zfs recv -Fvd target
>>> 6) ^ Panic while receiving the FS the destroyed one is mounted under.
>>> In my case, this was tank/root three times out of three; I then tried
>>> creating testfs under /tmp (tank/tmp/testfs), *mounting* it under
>>> /usr/testfs, and it panics on receiving tank/usr:
>> [...]
>>
>> I repeated precisely those steps and it doesn't panic for me.
>> Could you confirm that you use this patch?
>>
>> http://people.freebsd.org/~pjd/patches/zfs_vnops.c.2.patch
>>
>> If so, could you give me exact steps and all of them how to
>> reproduce it?
>> Starting with pool creation.
> Yup, I'm using that patch (I diffed the diffs, heh). I'll try to
> write a script to recreate the panic; I hope it's as easy as in
> real-world conditions though.

Oh! I noticed that I actually finished my test case for this panic; I
thought I stopped midway, but that was something else.
Here are all the details:
http://lists.freebsd.org/pipermail/freebsd-fs/2009-July/006585.html
(If you have the libzfs_sendrecv patch, your own vnops patch and
DDEBUG=1, there's no need to patch anything at all.)

Regards,
Thomas

From pjd at FreeBSD.org Wed Aug 5 09:38:07 2009
From: pjd at FreeBSD.org (Pawel Jakub Dawidek)
Date: Wed Aug 5 09:38:20 2009
Subject: zfs: Fatal trap 12: page fault while in kernel mode
In-Reply-To: <3ECC4BA0-F1EF-4039-9F39-68532851B572@exscape.org>
References: <7544AED1-1216-4A24-B287-F54117641F76@exscape.org>
    <4A71B239.8060007@freebsd.org>
    <3AA3C1CB-CEF7-46CC-A9C7-1648093D679E@exscape.org>
    <4A71BED8.7050300@freebsd.org>
    <20090805065022.GI2181@garage.freebsd.pl>
    <7C3499A8-A389-4F28-A800-B6C31B9E09C4@exscape.org>
    <3ECC4BA0-F1EF-4039-9F39-68532851B572@exscape.org>
Message-ID: <20090805093825.GC1784@garage.freebsd.pl>

On Wed, Aug 05, 2009 at 09:20:58AM +0200, Thomas Backman wrote:
> Oh! I noticed that I actually finished my test case for this panic; I
> thought I stopped midway, but that was something else.
> Here are all the details:
> http://lists.freebsd.org/pipermail/freebsd-fs/2009-July/006585.html
> (If you have the libzfs_sendrecv patch, your own vnops patch and
> DDEBUG=1, there's no need to patch anything at all.)

I believe it is safe to do the following:

	http://people.freebsd.org/~pjd/patches/zfs_vfsops.c.2.patch

Could you give it a try?

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

From serenity at exscape.org Wed Aug 5 10:37:09 2009
From: serenity at exscape.org (Thomas Backman)
Date: Wed Aug 5 10:37:15 2009
Subject: zfs: Fatal trap 12: page fault while in kernel mode
In-Reply-To: <20090805093825.GC1784@garage.freebsd.pl>
References: <7544AED1-1216-4A24-B287-F54117641F76@exscape.org>
    <4A71B239.8060007@freebsd.org>
    <3AA3C1CB-CEF7-46CC-A9C7-1648093D679E@exscape.org>
    <4A71BED8.7050300@freebsd.org>
    <20090805065022.GI2181@garage.freebsd.pl>
    <7C3499A8-A389-4F28-A800-B6C31B9E09C4@exscape.org>
    <3ECC4BA0-F1EF-4039-9F39-68532851B572@exscape.org>
    <20090805093825.GC1784@garage.freebsd.pl>
Message-ID: <1DA8C406-D35E-4AF3-990F-58753F305444@exscape.org>

On Aug 5, 2009, at 11:38, Pawel Jakub Dawidek wrote:

> On Wed, Aug 05, 2009 at 09:20:58AM +0200, Thomas Backman wrote:
>> Oh! I noticed that I actually finished my test case for this panic; I
>> thought I stopped midway, but that was something else.
>> Here are all the details:
>> http://lists.freebsd.org/pipermail/freebsd-fs/2009-July/006585.html
>> (If you have the libzfs_sendrecv patch, your own vnops patch and
>> DDEBUG=1, there's no need to patch anything at all.)
>
> I believe it is safe to do the following:
>
> http://people.freebsd.org/~pjd/patches/zfs_vfsops.c.2.patch
>
> Could you give it a try?

Not right now (today), I'm afraid. I'll look into it as soon as I can (likely tomorrow), though.

Regards,
Thomas

From pjd at FreeBSD.org Wed Aug 5 11:56:03 2009
From: pjd at FreeBSD.org (Pawel Jakub Dawidek)
Date: Wed Aug 5 11:56:10 2009
Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs
In-Reply-To: <4A78AFB2.10103@acm.poly.edu>
References: <4A78AA71.9050107@acm.poly.edu> <4A78AFB2.10103@acm.poly.edu>
Message-ID: <20090805115621.GG1784@garage.freebsd.pl>

On Tue, Aug 04, 2009 at 06:01:22PM -0400, Boris Kochergin wrote:
> In a subsequent attempt at "zfs mount -a", the following panic happened:
>
> Fatal trap 12: page fault while in kernel mode
[...]

Could you try to mount file systems one by one? For example you have:

	tank
	tank/foo
	tank/foo/bar
	tank/baz

And you do:

	# mount -t zfs tank /tank
	# mount -t zfs tank/foo /tank/foo
	# mount -t zfs tank/foo/bar /tank/foo/bar
	# mount -t zfs tank/baz /tank/baz

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

From serenity at exscape.org Wed Aug 5 12:07:17 2009
From: serenity at exscape.org (Thomas Backman)
Date: Wed Aug 5 12:07:34 2009
Subject: zfs: Fatal trap 12: page fault while in kernel mode
In-Reply-To: <20090805093825.GC1784@garage.freebsd.pl>
References: <7544AED1-1216-4A24-B287-F54117641F76@exscape.org>
    <4A71B239.8060007@freebsd.org>
    <3AA3C1CB-CEF7-46CC-A9C7-1648093D679E@exscape.org>
    <4A71BED8.7050300@freebsd.org>
    <20090805065022.GI2181@garage.freebsd.pl>
    <7C3499A8-A389-4F28-A800-B6C31B9E09C4@exscape.org>
    <3ECC4BA0-F1EF-4039-9F39-68532851B572@exscape.org>
    <20090805093825.GC1784@garage.freebsd.pl>
Message-ID: <4F8F7C4A-A760-427A-A97B-92C548DA7BEE@exscape.org>

On Aug 5, 2009, at 11:38, Pawel Jakub Dawidek wrote:

> On Wed, Aug 05, 2009 at 09:20:58AM +0200, Thomas Backman wrote:
>> Oh! I noticed that I actually finished my test case for this panic; I
>> thought I stopped midway, but that was something else.
>> Here are all the details:
>> http://lists.freebsd.org/pipermail/freebsd-fs/2009-July/006585.html
>> (If you have the libzfs_sendrecv patch, your own vnops patch and
>> DDEBUG=1, there's no need to patch anything at all.)
>
> I believe it is safe to do the following:
>
> http://people.freebsd.org/~pjd/patches/zfs_vfsops.c.2.patch
>
> Could you give it a try?

OK, I was wrong, I could give it a try now. :)
It seems to work! -DDEBUG=1 and no panic. Just to be sure I tried to revert the patch, and sure enough, solaris assert panic at that line.
The backup script (a.k.a. the real-world test) also did not panic, which it did before.

Regards,
Thomas

From spawk at acm.poly.edu Wed Aug 5 13:33:38 2009
From: spawk at acm.poly.edu (Boris Kochergin)
Date: Wed Aug 5 13:33:44 2009
Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs
In-Reply-To: <20090805115621.GG1784@garage.freebsd.pl>
References: <4A78AA71.9050107@acm.poly.edu> <4A78AFB2.10103@acm.poly.edu>
    <20090805115621.GG1784@garage.freebsd.pl>
Message-ID: <4A798A12.4070408@acm.poly.edu>

Pawel Jakub Dawidek wrote:
> On Tue, Aug 04, 2009 at 06:01:22PM -0400, Boris Kochergin wrote:
>
>> In a subsequent attempt at "zfs mount -a", the following panic happened:
>>
>> Fatal trap 12: page fault while in kernel mode
>>
> [...]
>
> Could you try to mount file systems one by one? For example you have:
>
> tank
> tank/foo
> tank/foo/bar
> tank/baz
>
> And you do:
>
> # mount -t zfs tank /tank
> # mount -t zfs tank/foo /tank/foo
> # mount -t zfs tank/foo/bar /tank/foo/bar
> # mount -t zfs tank/baz /tank/baz
>
>
There is only one filesystem (home), but "mount -t zfs home /usr/home" did work while the problem disk (ad13) was disconnected from the system.
I started moving its data off to a new geom_raid3 array, and there was a panic shortly after: Unread portion of the kernel message buffer: Fatal trap 12: page fault while in kernel mode fault virtual address = 0xffffffffffffffe9 fault code = supervisor read data, page not present instruction pointer = 0x20:0xffffffff8103a9e7 stack pointer = 0x28:0xffffff8077f26430 frame pointer = 0x28:0xffffff8077f26500 code segment = base 0x0, limit 0xfffff, type 0x1b = DPL 0, pres 1, long 1, def32 0, gran 1 processor eflags = interrupt enabled, resume, IOPL = 0 current process = 972 (cp) panic: from debugger Uptime: 4m28s Physical memory: 4082 MB Dumping 2532 MB: 2517 2501 2485 2469 2453 2437 2421 2405 2389 2373 2357 2341 2325 2309 2293 2277 2261 2245 2229 2213 2197 2181 2165 2149 2133 2117 2101 2085 2069 2053 2037 2021 2005 1989 1973 1957 1941 1925 1909 1893 1877 1861 1845 1829 1813 1797 1781 1765 1749 1733 1717 1701 1685 1669 1653 1637 1621 1605 1589 1573 1557 1541 1525 1509 1493 1477 1461 1445 1429 1413 1397 1381 1365 1349 1333 1317 1301 1285 1269 1253 1237 1221 1205 1189 1173 1157 1141 1125 1109 1093 1077 1061 1045 1029 1013 997 981 965 949 933 917 901 885 869 853 837 821 805 789 773 757 741 725 709 693 677 661 645 629 613 597 581 565 549 533 517 501 485 469 453 437 421 405 389 373 357 341 325 309 293 277 261 245 229 213 197 181 165 149 133 117 101 85 69 53 37 21 5 Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done. done. Loaded symbols for /boot/kernel/zfs.ko Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done. done. Loaded symbols for /boot/kernel/opensolaris.ko Reading symbols from /usr/src/sys/modules/geom/geom_raid3/geom_raid3.ko...done. Loaded symbols for /usr/src/sys/modules/geom/geom_raid3/geom_raid3.ko #0 doadump () at pcpu.h:223 223 pcpu.h: No such file or directory. 
in pcpu.h (kgdb) where #0 doadump () at pcpu.h:223 #1 0xffffffff8058d881 in boot (howto=260) at /usr/src/sys/kern/kern_shutdown.c:419 #2 0xffffffff8058dc5b in panic (fmt=Variable "fmt" is not available. ) at /usr/src/sys/kern/kern_shutdown.c:575 #3 0xffffffff801d9767 in db_panic (addr=Variable "addr" is not available. ) at /usr/src/sys/ddb/db_command.c:478 #4 0xffffffff801d9b71 in db_command (last_cmdp=0xffffffff80bd2120, cmd_table=Variable "cmd_table" is not available. ) at /usr/src/sys/ddb/db_command.c:445 #5 0xffffffff801d9dc0 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498 #6 0xffffffff801dbd49 in db_trap (type=Variable "type" is not available. ) at /usr/src/sys/ddb/db_main.c:229 #7 0xffffffff805b9704 in kdb_trap (type=12, code=0, tf=Variable "tf" is not available. ) at /usr/src/sys/kern/subr_kdb.c:534 #8 0xffffffff8086b5cd in trap_fatal (frame=0xffffff8077f26380, eva=18446744073709551593) at /usr/src/sys/amd64/amd64/trap.c:847 #9 0xffffffff8086b994 in trap_pfault (frame=0xffffff8077f26380, usermode=0) at /usr/src/sys/amd64/amd64/trap.c:768 #10 0xffffffff8086c16b in trap (frame=0xffffff8077f26380) at /usr/src/sys/amd64/amd64/trap.c:494 #11 0xffffffff80854d73 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224 #12 0xffffffff8103a9e7 in arc_evict (state=Variable "state" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1489 #13 0xffffffff8103b049 in arc_get_data_buf (buf=0xffffff00873d23f0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2170 #14 0xffffffff8103b46e in arc_buf_alloc (spa=0xffffff0003536000, size=16384, tag=Variable "tag" is not available. 
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1156 #15 0xffffffff8103c6a0 in arc_read_nolock (pio=0xffffff00039a92d0, spa=0xffffff0003536000, bp=0xffffff800947a380, done=0xffffffff8103f360 , private=0xffffff008740dc40, priority=0, zio_flags=1, arc_flags=0xffffff8077f266ec, zb=0xffffff8077f266c0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2607 #16 0xffffffff8103cd6c in arc_read (pio=0xffffff00039a92d0, spa=0xffffff0003536000, bp=0xffffff800947a380, pbuf=0xffffff002d89f5a0, done=0xffffffff8103f360 , private=0xffffff008740dc40, priority=0, zio_flags=1, arc_flags=0xffffff8077f266ec, zb=0xffffff8077f266c0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:2508 #17 0xffffffff8103f7e9 in dbuf_read (db=0xffffff008740dc40, zio=0xffffff00039a92d0, flags=14) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:521 #18 0xffffffff8103fd56 in dbuf_findbp (dn=Variable "dn" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1381 #19 0xffffffff8103fe62 in dbuf_hold_impl (dn=0xffffff002d526300, level=Variable "level" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1617 #20 0xffffffff81040ddb in dbuf_hold (dn=Variable "dn" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1689 #21 0xffffffff81042f4d in dmu_buf_hold_array_by_dnode (dn=0xffffff002d526300, offset=Variable "offset" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:223 #22 0xffffffff810433e2 in dmu_buf_hold_array (os=Variable "os" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:284 #23 0xffffffff8104357f in dmu_read_uio (os=Variable "os" is not available. 
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dmu.c:649 #24 0xffffffff810a21b1 in zfs_freebsd_read (ap=Variable "ap" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:591 #25 0xffffffff806244c0 in vn_read (fp=0xffffff0003435460, uio=0xffffff8077f26b00, active_cred=0xffffff002d376900, flags=Variable "flags" is not available. ) at vnode_if.h:384 #26 0xffffffff805c93a1 in dofileread (td=0xffffff0003780ab0, fd=3, fp=0xffffff0003435460, auio=0xffffff8077f26b00, offset=Variable "offset" is not available. ) at file.h:227 #27 0xffffffff805c9720 in kern_readv (td=0xffffff0003780ab0, fd=3, auio=0xffffff8077f26b00) at /usr/src/sys/kern/sys_generic.c:237 #28 0xffffffff805c9815 in read (td=Variable "td" is not available. ) at /usr/src/sys/kern/sys_generic.c:153 #29 0xffffffff8086bbff in syscall (frame=0xffffff8077f26c80) at /usr/src/sys/amd64/amd64/trap.c:984 #30 0xffffffff80855051 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:373 #31 0x0000000800737d6c in ?? () Previous frame inner to this frame (corrupt stack?) I reconnected the bad disk and tried "mount -t zfs home /usr/home" but the command does not return (it's been running for a few minutes at the time of this writing). However, the machine does not panic or lock up. Thank you for your help. -Boris From pjd at FreeBSD.org Wed Aug 5 14:23:17 2009 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Wed Aug 5 14:23:25 2009 Subject: adding drive to raidz1 In-Reply-To: <4A3B1020.2010305@jrv.org> References: <4A3B1020.2010305@jrv.org> Message-ID: <20090805142336.GJ1784@garage.freebsd.pl> On Thu, Jun 18, 2009 at 11:12:16PM -0500, James R. Van Artsdalen wrote: > As a feature suggestion why not reject an "zpool add" of a non-redundant > vdev to a pool of redundant vdev's unless -f is given? A command of > that sort is almost always a mistake so requiring -f would seem no > hardship for anyone... 
This is rejected:

	# zpool create foobar raidz md0 md1 md2
	# zpool add foobar md3
	invalid vdev specification
	use '-f' to override the following errors:
	mismatched replication level: pool uses raidz and new vdev is disk

-- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!

From trevor.hearn at Vanderbilt.Edu Thu Aug 6 14:21:12 2009
From: trevor.hearn at Vanderbilt.Edu (Hearn, Trevor)
Date: Thu Aug 6 14:21:21 2009
Subject: UFS Filesystem issues, and the loss of my hair...
Message-ID: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34C3@ITS-HCWNEM03.ds.Vanderbilt.edu>

First off, let me state that I love FreeBSD. I've used it for years, and have not had any major problems with it... Until now.

As you can tell, I work for a major university. I set up a large storage array to hold data for a project they have here. No great shakes, just some standard files and such. The fun started when I started loading users onto the system, and they started using it... Isn't that always the case? Now, I get ufs_dirbad errors, and the system hard locks. This isn't the worst thing that could happen, but when you're talking about file partitions the size that I am using, the fsck takes FOREVER. Somewhere on the order of 1.5 hours. During that time, I am bringing the individual shares/partitions online, but the users suffer. I've asked about this before, in a different forum, but got no usable information that I could see. So, here goes...

The system is as such. A Dell 2950 1U server, with a Qlogic Fibre Channel card. It is connected to two Promise Array chassis, 610 series, each with 16 drives.
Each chassis is running RAID 6, which gives me about 12.73 TB of storage per chassis. From there, the logical drives are sliced up into smaller partitions. At most, I have a 3.6 TB partition. The smallest is a 100gig partition.

Filesystem       Size    Used   Avail Capacity  Mounted on
/dev/mfid0s1a    197G     10G    170G     6%    /
devfs            1.0K    1.0K      0B   100%    /dev
/dev/da0p1       1.8T    1.5T    130G    92%    /slice1
/dev/da0p5       2.7T    1.8T    661G    74%    /slice2
/dev/da0p9       250G     21G    209G     9%    /slice3
/dev/da1p3       103G     12G     83G    12%    /slice4
/dev/da1p4       205G     54G    135G    29%    /slice5
/dev/da1p5       103G    7.3G     87G     8%    /slice6
/dev/da1p6       103G     22G     72G    23%    /slice7
etc...

I had to use GPT to set up the partitions, and they are using UFS2 for the filesystem. Now... If that's not fun enough... I have TWO of these creatures, which RSYNC every 4 hours. The secondary system is across campus, and sits idle 99% of the time. Every 4 hours, in a stepped schedule, the primary array syncs to the secondary array. If the primary goes down, I FSCK, and any files that are fried, I bring back across from the secondary and replace them. This has worked OK for a while, but now I am getting Kernel Panics on a regular basis. I've been told to migrate to a different filesystem, but my options are ZFS and using GJOURNAL with UFS, from what I can tell. I need something repeatable, simple, and I need something robust. I have NO idea why I keep getting errors like this, but I imagine it's a cascading effect of other hangs that have caused more corruption.

I'd buy a fella, or gal, a cup of coffee and a pop-tart if they could help a brother out. I have checked out this link:
http://phaq.phunsites.net/2007/07/01/ufs_dirbad-panic-with-mangled-entries-in-ufs/
and decided that I need to give this a shot after hours, but being the kinda guy I am, I need to make sure I am covering all of my bases.

Anyone got any ideas?

Thanks!
-T From jamie.ostrowski at gmail.com Thu Aug 6 16:05:58 2009 From: jamie.ostrowski at gmail.com (Jamie Ostrowski) Date: Thu Aug 6 16:06:07 2009 Subject: Extracting block pointer list -- ffsinfo? Message-ID: <29ae62fc0908060839u430fb073hf5b9f7837f9bc8b6@mail.gmail.com> I'm a student studying filesystems, and I'd like to find a way to list the block pointers in an inode. Are there any tools in FreeBSD that can do that? For example, I've tried the following command, but I'm not seeing a list of the block pointers: ffsinfo -i 2 -l 256 /dev/da0s1f ===== START UFS2 INODE DUMP ===== # 0@28202200: Inode 0x00000002 mode u_int16_t 040755 nlink int16_t 0x0012 uid u_int32_t 0x00000000 gid u_int32_t 0x00000000 blksize u_int32_t 0x00000000 size u_int64_t 0x0000000000000200 blocks u_int64_t 0x0000000000000004 atime ufs_time_t 1249545661 mtime ufs_time_t 1243012475 ctime ufs_time_t 1243012475 birthtime ufs_time_t 1230822454 mtimensec int32_t 0x00000000 atimensec int32_t 0x00000000 ctimensec int32_t 0x00000000 birthnsec int32_t 0x00000000 gen int32_t 0x50291104 kernflags u_int32_t 0x00000000 flags u_int32_t 0x00000000 extsize int32_t 0x00000000 db ufs2_daddr_t[0] 0x bc8 ===== END UFS2 INODE DUMP ===== From pjd at FreeBSD.org Fri Aug 7 05:44:14 2009 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Fri Aug 7 05:44:21 2009 Subject: zfs: Fatal trap 12: page fault while in kernel mode In-Reply-To: <20090802092714.GA5813@jpru.ffm.jpru.de> References: <20090727072503.GA52309@jpru.ffm.jpru.de> <4A6E06E6.9030300@mail.zedat.fu-berlin.de> <4A6EC9E2.5070200@icyb.net.ua> <20090729084723.GD1586@garage.freebsd.pl> <20090802092714.GA5813@jpru.ffm.jpru.de> Message-ID: <20090807054431.GA2500@garage.freebsd.pl> On Sun, Aug 02, 2009 at 11:27:14AM +0200, Juergen Unger wrote: > I tried the patch, restarted the whole thing yesterday morning > and after less then 24 hours and approximately 3215 zfs-receive > jobs it do not crashes anymore, but the last started zfs-receive > jobs is hanging, cannot be 
killed, even not with -9. Even other
> zfs commands are hanging and cannot be killed, while zpool commands
> seems to be not affected.

Unfortunately, I wasn't able to reproduce it. The good news is that something was just committed to OpenSolaris which might fix it (see bug 6868108 on http://bugs.opensolaris.org). The bad news is that the fix is too complex to backport to our ZFS version...

-- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!

From pjd at FreeBSD.org Fri Aug 7 07:37:19 2009
From: pjd at FreeBSD.org (Pawel Jakub Dawidek)
Date: Fri Aug 7 07:37:26 2009
Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs
In-Reply-To: <4A798A12.4070408@acm.poly.edu>
References: <4A78AA71.9050107@acm.poly.edu> <4A78AFB2.10103@acm.poly.edu> <20090805115621.GG1784@garage.freebsd.pl> <4A798A12.4070408@acm.poly.edu>
Message-ID: <20090807073738.GA1607@garage.freebsd.pl>

On Wed, Aug 05, 2009 at 09:33:06AM -0400, Boris Kochergin wrote:
> Fatal trap 12: page fault while in kernel mode
> fault virtual address = 0xffffffffffffffe9
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff8103a9e7
> stack pointer = 0x28:0xffffff8077f26430
> frame pointer = 0x28:0xffffff8077f26500
> code segment = base 0x0, limit 0xfffff, type 0x1b
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 972 (cp)
[...]
> /usr/src/sys/amd64/amd64/trap.c:494
> #11 0xffffffff80854d73 in calltrap () at
> /usr/src/sys/amd64/amd64/exception.S:224
> #12 0xffffffff8103a9e7 in arc_evict (state=Variable "state" is not
> available.
> ) at > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1489 Could you tell me what do you have at this line in your source? I don't think you use HEAD... What exact FreeBSD version are you using? -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090807/4fb11cf9/attachment.pgp From pjd at FreeBSD.org Fri Aug 7 07:43:41 2009 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Fri Aug 7 07:43:48 2009 Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs In-Reply-To: <20090807073738.GA1607@garage.freebsd.pl> References: <4A78AA71.9050107@acm.poly.edu> <4A78AFB2.10103@acm.poly.edu> <20090805115621.GG1784@garage.freebsd.pl> <4A798A12.4070408@acm.poly.edu> <20090807073738.GA1607@garage.freebsd.pl> Message-ID: <20090807074400.GB1607@garage.freebsd.pl> On Fri, Aug 07, 2009 at 09:37:38AM +0200, Pawel Jakub Dawidek wrote: > On Wed, Aug 05, 2009 at 09:33:06AM -0400, Boris Kochergin wrote: > > Fatal trap 12: page fault while in kernel mode > > fault virtual address = 0xffffffffffffffe9 > > fault code = supervisor read data, page not present > > instruction pointer = 0x20:0xffffffff8103a9e7 > > stack pointer = 0x28:0xffffff8077f26430 > > frame pointer = 0x28:0xffffff8077f26500 > > code segment = base 0x0, limit 0xfffff, type 0x1b > > = DPL 0, pres 1, long 1, def32 0, gran 1 > > processor eflags = interrupt enabled, resume, IOPL = 0 > > current process = 972 (cp) > [...] > > /usr/src/sys/amd64/amd64/trap.c:494 > > #11 0xffffffff80854d73 in calltrap () at > > /usr/src/sys/amd64/amd64/exception.S:224 > > #12 0xffffffff8103a9e7 in arc_evict (state=Variable "state" is not > > available. 
> > ) at > > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1489 > > Could you tell me what do you have at this line in your source? I don't > think you use HEAD... What exact FreeBSD version are you using? You already gave version number in your first mail, sorry about that. 8.0-BETA2 should be very close to HEAD (or it actually was HEAD), so I guess this is the code we are looking at: 1488: /* "lookahead" for better eviction candidate */ 1489: if (recycle && ab->b_size != bytes && 1490: ab_prev && ab_prev->b_size == bytes) 1491: continue; If 'ab' is corrupted it should panic earlier, so it seems ab_prev is corrupted, can you see what it points to in gdb? -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090807/d59961a9/attachment.pgp From jhb at freebsd.org Fri Aug 7 12:44:44 2009 From: jhb at freebsd.org (John Baldwin) Date: Fri Aug 7 12:44:52 2009 Subject: UFS Filesystem issues, and the loss of my hair... In-Reply-To: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34C3@ITS-HCWNEM03.ds.Vanderbilt.edu> References: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34C3@ITS-HCWNEM03.ds.Vanderbilt.edu> Message-ID: <200908070829.54571.jhb@freebsd.org> On Thursday 06 August 2009 9:51:04 am Hearn, Trevor wrote: > First off, let me state that I love FreeBSD. I've used it for years, and have not had any major problems with it... Until now. > > As you can tell, I work for a major university. I setup a large storage array to hold data for a project they have here. No great shakes, just some standard files and such. The fun started when I started loading users onto the system, and they started using it... Isn't that always the case? 
Now, I get ufs_dirbad errors, and the system hard locks. This isn't the worst thing that could happen, but when you're talking about file partitions the size that I am using, the fsck takes FOREVER. Somewhere on the order of 1.5 hours. During that time, I am bringing the individual shares/partitions online, but the users suffer. I've asked about this before, in a different forum, but got no usable information that I could see. So, here goes... > > The system is as such. A dell 2950 1U server, with a Qlogic Fibre Channel card. It is connected to two Promise Array chassis, 610 series, each with 16 drives. Each chassis is running RAID 6, which gives me about 12.73tb of storage per chassis. From there, the logical drives are sliced up into smaller partitions. At most, I have a 3.6tb partition. The smallest is a 100gig partition. > > Filesystem Size Used Avail Capacity Mounted on > /dev/mfid0s1a 197G 10G 170G 6% / > devfs 1.0K 1.0K 0B 100% /dev > /dev/da0p1 1.8T 1.5T 130G 92% /slice1 > /dev/da0p5 2.7T 1.8T 661G 74% /slice2 > /dev/da0p9 250G 21G 209G 9% /slice3 > /dev/da1p3 103G 12G 83G 12% /slice4 > /dev/da1p4 205G 54G 135G 29% /slice5 > /dev/da1p5 103G 7.3G 87G 8% /slice6 > /dev/da1p6 103G 22G 72G 23% /slice7 > etc... > > I had to use GPT to setup the partitions, and they are using UFS2 for the filesystem. Now... If that's not fun enough... I have TWO of these creatures, which RSYNC every 4 hours. The secondary system is across campus, and sits idle 99% of the time. Every 4 hours, in a stepped schedule, the primary array syncs to the secondary array. If the primary goes down, I FSCK, and any files that are fried, I bring back across from the secondary and replace them. This has worked OK for a while, but now I am getting Kernel Panics on a regular basis. I've been told to migrate to a different filesystem, but my options are ZFS and using GJOURNAL with UFS, from what I can tell. I need something repeatable, simple, and I need something robust. 
I have NO idea why I keep getting errors like this, but I imagine it's a cascading effect of other hangs that have caused more corruption. > > I'd buy a fella, or gal, a cup of coffee and a pop-tart if they could help a brother out. I have checked out this link: > http://phaq.phunsites.net/2007/07/01/ufs_dirbad-panic-with-mangled-entries-in-ufs/ > and decided that I need to give this a shot after hours, but being the kinda guy I am, I need to make sure I am covering all of my bases. Are you seeing ufs_dirbad panics? Specifically, can you capture the messages on the console when the machine panics? -- John Baldwin From spawk at acm.poly.edu Fri Aug 7 13:46:15 2009 From: spawk at acm.poly.edu (Boris Kochergin) Date: Fri Aug 7 13:46:21 2009 Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs In-Reply-To: <20090807074400.GB1607@garage.freebsd.pl> References: <4A78AA71.9050107@acm.poly.edu> <4A78AFB2.10103@acm.poly.edu> <20090805115621.GG1784@garage.freebsd.pl> <4A798A12.4070408@acm.poly.edu> <20090807073738.GA1607@garage.freebsd.pl> <20090807074400.GB1607@garage.freebsd.pl> Message-ID: <4A7C3002.8000003@acm.poly.edu> Pawel Jakub Dawidek wrote: > On Fri, Aug 07, 2009 at 09:37:38AM +0200, Pawel Jakub Dawidek wrote: > >> On Wed, Aug 05, 2009 at 09:33:06AM -0400, Boris Kochergin wrote: >> >>> Fatal trap 12: page fault while in kernel mode >>> fault virtual address = 0xffffffffffffffe9 >>> fault code = supervisor read data, page not present >>> instruction pointer = 0x20:0xffffffff8103a9e7 >>> stack pointer = 0x28:0xffffff8077f26430 >>> frame pointer = 0x28:0xffffff8077f26500 >>> code segment = base 0x0, limit 0xfffff, type 0x1b >>> = DPL 0, pres 1, long 1, def32 0, gran 1 >>> processor eflags = interrupt enabled, resume, IOPL = 0 >>> current process = 972 (cp) >>> >> [...] 
>> >>> /usr/src/sys/amd64/amd64/trap.c:494 >>> #11 0xffffffff80854d73 in calltrap () at >>> /usr/src/sys/amd64/amd64/exception.S:224 >>> #12 0xffffffff8103a9e7 in arc_evict (state=Variable "state" is not >>> available. >>> ) at >>> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1489 >>> >> Could you tell me what do you have at this line in your source? I don't >> think you use HEAD... What exact FreeBSD version are you using? >> > > You already gave version number in your first mail, sorry about that. > 8.0-BETA2 should be very close to HEAD (or it actually was HEAD), so I > guess this is the code we are looking at: > > 1488: /* "lookahead" for better eviction candidate */ > 1489: if (recycle && ab->b_size != bytes && > 1490: ab_prev && ab_prev->b_size == bytes) > 1491: continue; > > If 'ab' is corrupted it should panic earlier, so it seems ab_prev is > corrupted, can you see what it points to in gdb? > > Yeah, that's what the code looks like. For convenience, I've put the source tree the system was built using up at: http://acm.poly.edu/~spawk/src/ Maybe my kgdb chops aren't up to par, but I can't seem to see what ab_prev points to: (kgdb) up #12 0xffffffff8103a9e7 in arc_evict (state=Variable "state" is not available. ) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1489 1489 if (recycle && ab->b_size != bytes && Current language: auto; currently c (kgdb) list 1484 LBOLT - ab->b_arc_access < arc_min_prefetch_lifespan)) { 1485 skipped++; 1486 continue; 1487 } 1488 /* "lookahead" for better eviction candidate */ 1489 if (recycle && ab->b_size != bytes && 1490 ab_prev && ab_prev->b_size == bytes) 1491 continue; 1492 hash_lock = HDR_LOCK(ab); 1493 have_lock = MUTEX_HELD(hash_lock); (kgdb) print ab $13 = (arc_buf_hdr_t *) 0xffffff0003ebc410 (kgdb) print ab->b_size $14 = 1 (kgdb) print bytes $15 = 16384 (kgdb) print ab_prev No symbol "ab_prev" in current context. 
-Boris From pjd at FreeBSD.org Fri Aug 7 19:13:17 2009 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Fri Aug 7 19:13:24 2009 Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs In-Reply-To: <4A7C3002.8000003@acm.poly.edu> References: <4A78AA71.9050107@acm.poly.edu> <4A78AFB2.10103@acm.poly.edu> <20090805115621.GG1784@garage.freebsd.pl> <4A798A12.4070408@acm.poly.edu> <20090807073738.GA1607@garage.freebsd.pl> <20090807074400.GB1607@garage.freebsd.pl> <4A7C3002.8000003@acm.poly.edu> Message-ID: <20090807191334.GA1814@garage.freebsd.pl> On Fri, Aug 07, 2009 at 09:45:38AM -0400, Boris Kochergin wrote: > Maybe my kgdb chops aren't up to par, but I can't seem to see what > ab_prev points to: > > (kgdb) up > #12 0xffffffff8103a9e7 in arc_evict (state=Variable "state" is not > available. > ) at > /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1489 > 1489 if (recycle && ab->b_size != bytes && > Current language: auto; currently c > (kgdb) list > 1484 LBOLT - ab->b_arc_access < > arc_min_prefetch_lifespan)) { > 1485 skipped++; > 1486 continue; > 1487 } > 1488 /* "lookahead" for better eviction candidate */ > 1489 if (recycle && ab->b_size != bytes && > 1490 ab_prev && ab_prev->b_size == bytes) > 1491 continue; > 1492 hash_lock = HDR_LOCK(ab); > 1493 have_lock = MUTEX_HELD(hash_lock); > (kgdb) print ab > $13 = (arc_buf_hdr_t *) 0xffffff0003ebc410 > (kgdb) print ab->b_size > $14 = 1 > (kgdb) print bytes > $15 = 16384 > (kgdb) print ab_prev > No symbol "ab_prev" in current context. Yeah, that's strange indeed. Could you try: print ab->b_arc_node.list_prev print ab->b_arc_node.list_next -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090807/8d744932/attachment.pgp From spawk at acm.poly.edu Fri Aug 7 19:35:11 2009 From: spawk at acm.poly.edu (Boris Kochergin) Date: Fri Aug 7 19:35:17 2009 Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs In-Reply-To: <20090807191334.GA1814@garage.freebsd.pl> References: <4A78AA71.9050107@acm.poly.edu> <4A78AFB2.10103@acm.poly.edu> <20090805115621.GG1784@garage.freebsd.pl> <4A798A12.4070408@acm.poly.edu> <20090807073738.GA1607@garage.freebsd.pl> <20090807074400.GB1607@garage.freebsd.pl> <4A7C3002.8000003@acm.poly.edu> <20090807191334.GA1814@garage.freebsd.pl> Message-ID: <4A7C81CA.2040303@acm.poly.edu> Pawel Jakub Dawidek wrote: > On Fri, Aug 07, 2009 at 09:45:38AM -0400, Boris Kochergin wrote: > >> Maybe my kgdb chops aren't up to par, but I can't seem to see what >> ab_prev points to: >> >> (kgdb) up >> #12 0xffffffff8103a9e7 in arc_evict (state=Variable "state" is not >> available. >> ) at >> /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c:1489 >> 1489 if (recycle && ab->b_size != bytes && >> Current language: auto; currently c >> (kgdb) list >> 1484 LBOLT - ab->b_arc_access < >> arc_min_prefetch_lifespan)) { >> 1485 skipped++; >> 1486 continue; >> 1487 } >> 1488 /* "lookahead" for better eviction candidate */ >> 1489 if (recycle && ab->b_size != bytes && >> 1490 ab_prev && ab_prev->b_size == bytes) >> 1491 continue; >> 1492 hash_lock = HDR_LOCK(ab); >> 1493 have_lock = MUTEX_HELD(hash_lock); >> (kgdb) print ab >> $13 = (arc_buf_hdr_t *) 0xffffff0003ebc410 >> (kgdb) print ab->b_size >> $14 = 1 >> (kgdb) print bytes >> $15 = 16384 >> (kgdb) print ab_prev >> No symbol "ab_prev" in current context. >> > > Yeah, that's strange indeed. 
Could you try:
>
> print ab->b_arc_node.list_prev
> print ab->b_arc_node.list_next
>
>
(kgdb) print ab->b_arc_node.list_prev
$1 = (struct list_node *) 0x1
(kgdb) print ab->b_arc_node.list_next
$2 = (struct list_node *) 0xffffffff811064f0

-Boris

From pjd at FreeBSD.org Fri Aug 7 19:38:28 2009
From: pjd at FreeBSD.org (Pawel Jakub Dawidek)
Date: Fri Aug 7 19:38:37 2009
Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs
In-Reply-To: <4A7C81CA.2040303@acm.poly.edu>
References: <4A78AA71.9050107@acm.poly.edu> <4A78AFB2.10103@acm.poly.edu> <20090805115621.GG1784@garage.freebsd.pl> <4A798A12.4070408@acm.poly.edu> <20090807073738.GA1607@garage.freebsd.pl> <20090807074400.GB1607@garage.freebsd.pl> <4A7C3002.8000003@acm.poly.edu> <20090807191334.GA1814@garage.freebsd.pl> <4A7C81CA.2040303@acm.poly.edu>
Message-ID: <20090807193842.GA2487@garage.freebsd.pl>

On Fri, Aug 07, 2009 at 03:34:34PM -0400, Boris Kochergin wrote:
> Pawel Jakub Dawidek wrote:
> >Yeah, that's strange indeed. Could you try:
> >
> > print ab->b_arc_node.list_prev
> > print ab->b_arc_node.list_next
> >
> >
> (kgdb) print ab->b_arc_node.list_prev
> $1 = (struct list_node *) 0x1

Yeah, list_prev is corrupted. If it panics on you every time, I could send you a patch which will try to catch where the corruption occurs.

-- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!

From spawk at acm.poly.edu Fri Aug 7 20:00:43 2009
From: spawk at acm.poly.edu (Boris Kochergin)
Date: Fri Aug 7 20:00:50 2009
Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs
In-Reply-To: <20090807193842.GA2487@garage.freebsd.pl>
References: <4A78AA71.9050107@acm.poly.edu> <4A78AFB2.10103@acm.poly.edu> <20090805115621.GG1784@garage.freebsd.pl> <4A798A12.4070408@acm.poly.edu> <20090807073738.GA1607@garage.freebsd.pl> <20090807074400.GB1607@garage.freebsd.pl> <4A7C3002.8000003@acm.poly.edu> <20090807191334.GA1814@garage.freebsd.pl> <4A7C81CA.2040303@acm.poly.edu> <20090807193842.GA2487@garage.freebsd.pl>
Message-ID: <4A7C87C5.1070608@acm.poly.edu>

Pawel Jakub Dawidek wrote:
> On Fri, Aug 07, 2009 at 03:34:34PM -0400, Boris Kochergin wrote:
>
>> Pawel Jakub Dawidek wrote:
>>
>>> Yeah, that's strange indeed. Could you try:
>>>
>>> print ab->b_arc_node.list_prev
>>> print ab->b_arc_node.list_next
>>>
>>>
>>>
>> (kgdb) print ab->b_arc_node.list_prev
>> $1 = (struct list_node *) 0x1
>>
>
> Yeah, list_prev is corrupted. If it panics on you every time, I could
> send you a patch which will try to catch where the corruption occurs.
>
>
I eventually get the arc_evict panic every time I successfully manage to mount the filesystem, but it usually panics (with the other backtrace) as soon as I try to mount it, or mount just hangs. I'll gladly try the patch, though--the data on the array is important to me. Thanks.
-Boris

From pjd at FreeBSD.org Fri Aug 7 20:27:39 2009
From: pjd at FreeBSD.org (Pawel Jakub Dawidek)
Date: Fri Aug 7 20:27:52 2009
Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs
In-Reply-To: <4A7C87C5.1070608@acm.poly.edu>
References: <4A78AFB2.10103@acm.poly.edu>
    <20090805115621.GG1784@garage.freebsd.pl>
    <4A798A12.4070408@acm.poly.edu>
    <20090807073738.GA1607@garage.freebsd.pl>
    <20090807074400.GB1607@garage.freebsd.pl>
    <4A7C3002.8000003@acm.poly.edu>
    <20090807191334.GA1814@garage.freebsd.pl>
    <4A7C81CA.2040303@acm.poly.edu>
    <20090807193842.GA2487@garage.freebsd.pl>
    <4A7C87C5.1070608@acm.poly.edu>
Message-ID: <20090807202756.GB2487@garage.freebsd.pl>

On Fri, Aug 07, 2009 at 04:00:05PM -0400, Boris Kochergin wrote:
> Pawel Jakub Dawidek wrote:
> >On Fri, Aug 07, 2009 at 03:34:34PM -0400, Boris Kochergin wrote:
> >
> >>Pawel Jakub Dawidek wrote:
> >>
> >>>Yeah, that's strange indeed. Could you try:
> >>>
> >>> print ab->b_arc_node.list_prev
> >>> print ab->b_arc_node.list_next
> >>>
> >>>
> >>>
> >>(kgdb) print ab->b_arc_node.list_prev
> >>$1 = (struct list_node *) 0x1
> >>
> >
> >Yeah, list_prev is corrupted. If it panics on you everytime, I could
> >send you a patch which will try to catch where the corruption occurs.
> >
> >
> I eventually get the arc_evict panic every time I successfully manage to
> mount the filesystem, but it usually panics (with the other backtrace)
> as soon as I try to mount it, or mount just hangs. I'll gladly try the
> patch, though--the data on the array is important to me. Thanks.

To get the data from there you could also try to 'zfs send' it without
mounting the dataset at all (just in case).

--
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!

From matt at corp.spry.com Fri Aug 7 20:46:07 2009
From: matt at corp.spry.com (Matt Simerson)
Date: Fri Aug 7 20:46:15 2009
Subject: UFS Filesystem issues, and the loss of my hair...
In-Reply-To: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34C3@ITS-HCWNEM03.ds.Vanderbilt.edu>
References: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34C3@ITS-HCWNEM03.ds.Vanderbilt.edu>
Message-ID: <04A4A2CB-828B-46BF-A2B6-50B64F06E96E@spry.com>

On Aug 6, 2009, at 6:51 AM, Hearn, Trevor wrote:

> First off, let me state that I love FreeBSD. I've used it for years,
> and have not had any major problems with it... Until now.
>
> As you can tell, I work for a major university. I setup a large
> storage array to hold data for a project they have here. No great
> shakes, just some standard files and such.
>
> I'd buy a fella, or gal, a cup of coffee and a pop-tart if they
> could help a brother out. I have checked out this link:
> http://phaq.phunsites.net/2007/07/01/ufs_dirbad-panic-with-mangled-entries-in-ufs/
> and decided that I need to give this a shot after hours, but being
> the kinda guy I am, I need to make sure I am covering all of my bases.
>
> Anyone got any ideas?
>
> Thanks!

Have you given any consideration to ZFS?

With ZFS there's no reason to have all those slices. Just stripe the
two RAID 6 arrays together and have a single 26TB zpool. No GPT or UFS
to mess with. Just point ZFS at the raw disks and off you go. I'm doing
that with Areca 1231ML controllers in boxes with 24 disks each. The two
12-channel RAID cards each present a RAID volume to the OS and zpool
stripes them together.

One of the more useful features of ZFS is file system compression. You
may find that with file system compression, you can get by with 13TB of
storage. Then you have one RAID 6 array as the data store and the 2nd
array for backups on each machine. With ZFS, you can send snapshots of
the data partition to the backup every hour, or even every minute,
without any appreciable impact.

back01# zfs get compression back01/var
NAME        PROPERTY     VALUE  SOURCE
back01/var  compression  gzip   local

back01# zfs get compressratio back01/var
NAME        PROPERTY       VALUE  SOURCE
back01/var  compressratio  2.16x  -

I'm using gzip compression and I fit over twice as much data on the
filesystem as I'd otherwise be getting. You can get more aggressive
with gzip-9 if you need to.

You could use your backup server as a proof of concept. Install FreeBSD
8-BETA2 amd64 on it. Unmount the existing GPT partitions, wipe the MBR
clean using dd, and create a zpool on just one of the RAID 6 volumes.
Set ZFS compression=gzip on your filesystem and use rsync to copy all
the files from your 'primary' server. I suspect you'll find that you
have ample storage. Then you can create another zpool on that same box
using the other RAID 6 volume for backups. You can experiment there
with zfs send/receive, or rsnapshot, or whatever you use.

Then get a subset of your users to start testing on it and see how it
fares. I suspect you'll be quite pleased. If it works out wonderfully,
you can rebuild the other GPT/UFS system on ZFS as well. Set it up with
both RAID 6 volumes in one ZFS pool and start pushing your backups from
the primary server to it. Once successfully backed up, you can add the
2nd RAID 6 volume on the primary server into the storage pool to double
its disk space.

Matt

From peterjeremy at optushome.com.au Sat Aug 8 02:37:36 2009
From: peterjeremy at optushome.com.au (Peter Jeremy)
Date: Sat Aug 8 02:37:42 2009
Subject: Extracting block pointer list -- ffsinfo?
In-Reply-To: <29ae62fc0908060839u430fb073hf5b9f7837f9bc8b6@mail.gmail.com>
References: <29ae62fc0908060839u430fb073hf5b9f7837f9bc8b6@mail.gmail.com>
Message-ID: <20090808003218.GA56430@server.vk2pj.dyndns.org>

On 2009-Aug-06 10:39:57 -0500, Jamie Ostrowski wrote:
> I'm a student studying filesystems, and I'd like to find a way to list
>the block pointers in an inode. Are there any tools in FreeBSD that can do
>that?

ffsinfo(8) or fsdb(8)

>db ufs2_daddr_t[0] 0x bc8

It might not be obvious without looking in the source code (this
particular output comes from /usr/src/sbin/growfs/debug.c) but that
actually _is_ the list of blocks. It is more obvious if you run
ffsinfo on a larger file.

--
Peter Jeremy

From sub.mesa at gmail.com Sun Aug 9 11:47:11 2009
From: sub.mesa at gmail.com (Jason Edwards)
Date: Sun Aug 9 11:47:18 2009
Subject: ZFS corruption on 8-CURRENT
Message-ID: <883b2dc50908090414o71bc5fc2q5aef64c2b5da653e@mail.gmail.com>

Hi guys,

I'm investigating a weird corruption issue. After filling up my 8-disk
RAID-Z pool with data and using it for a few weeks, it started to show
me this:

# zpool status sub
  pool: sub
 state: UNAVAIL
status: One or more devices could not be used because the label is missing
        or invalid. There are insufficient replicas for the pool to continue
        functioning.
action: Destroy and re-create the pool from a backup source.
   see: http://www.sun.com/msg/ZFS-8000-5E
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        sub         UNAVAIL      0     0     0  insufficient replicas
          raidz1    UNAVAIL      0     0     0  insufficient replicas
            ad14a   FAULTED      0     0     0  corrupted data
            ad8a    ONLINE       0     0     0
            ad10a   ONLINE       0     0     0
            ad10a   FAULTED      0     0     0  corrupted data
            ad18a   FAULTED      0     0     0  corrupted data
            ad12a   FAULTED      0     0     0  corrupted data
            ad16a   FAULTED      0     0     0  corrupted data
            ad8a    FAULTED      0     0     0  corrupted data

Oops? What happened here? Besides the "corrupted data", ad10a is also
displayed twice, once online and once failed.

After rebooting, the output looks a little cleaner, but it reports a
problem with the ZIL:

# zpool status sub
  pool: sub
 state: FAULTED
status: An intent log record could not be read.
        Waiting for adminstrator intervention to fix the faulted pool.
action: Either restore the affected device(s) and run 'zpool online',
        or ignore the intent log records by running 'zpool clear'.
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        sub         FAULTED      0     0     0  bad intent log
          raidz1    ONLINE       0     0     0
            ad14a   ONLINE       0     0     0
            ad4a    ONLINE       0     0     0
            ad6a    ONLINE       0     0     0
            ad10a   ONLINE       0     0     0
            ad18a   ONLINE       6     0     0
            ad12a   ONLINE       0     0     0
            ad16a   ONLINE       0     0     0
            ad8a    ONLINE       0     0     0

Additionally, I got some read errors on ad18. But since this is a
RAID-Z, I guess one disk alone cannot corrupt/fail the entire array.

Before I take any actions that might be destructive, does anybody have a
clue what happened here and how I can prevent this in the future?

The box is a quad-core X4 9350e with 6GB RAM and it's running 8-CURRENT
as of July 21st 2009 (after 8.0-BETA2). It did work correctly before
upgrading CURRENT to a newer date. Maybe some bug slipped in?

Kind regards,
sub

From michael at fuckner.net Sun Aug 9 18:20:11 2009
From: michael at fuckner.net (Michael Fuckner)
Date: Sun Aug 9 18:20:18 2009
Subject: Using Intel Iscsi remote boot with istgt
Message-ID: <4A7F0F0E.207@fuckner.net>

Hi,

I am trying to boot from an Intel iSCSI NIC, but it does not work as
planned.

I am using the following setup:
192.168.2.1 :iscsi-initiator, gateway
192.168.2.65: iscsi-target with intel iscsi rom installed

istgt is 20090428, iscsi-initiator is 2.2.3

I can connect from the OS:

c64# iscontrol -d -t 192.168.2.1 -c /etc/iscsi.conf -n c64iscsi
TargetName=iqn.2007-09.jp.ne.peach.istgt:target0
TargetAddress=192.168.2.1:3260,1
c64# iscontrol -t 192.168.2.1 -c /etc/iscsi.conf -n c64iscsi
c64# iscontrol[1576]: running
iscontrol[1576]: (pass1:iscsi0:0:0:0): tagged openings now 0
iscontrol: supervise starting main loop

c64# dmesg |grep da0
da0 at iscsi0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-5 device
da0 at iscsi0 bus 0 target 0 lun 0
da0: Fixed Direct Access SCSI-5 device

c64# cat /etc/iscsi.conf
c64iscsi {
        AuthMethod = None
        chapIName = c64iscsi
        chapSecret = 1234567890123456
        Initiatorname = iqn.1991-05.com.microsoft:c64iscsi
        TargetName = iqn.2007-09.jp.ne.peach.istgt:target0
        TargetAddress = 192.168.2.1 # your iscsi server IP
}

When trying to connect using the Intel card I get:

-------------------------------------------
Intel(R) iSCSI Boot version 2.3.52
Copyright (c) 2003-2009 Intel Corporation. All rights reserved.
Press ESC key to skip iscsi boot initialization
Initializing adapter configuration - MAC address(0015177CC363).
Using STATIC configuration for primary port. Please wait.
iSCSI Target Name : iqn.2007-09.jp.ne.peach.istgt:target0
iSCSI Target IP Address : 192.168.2.1
LUN ID: 1 Port 3260
VLAN ID : 3
iSCSI Initiator IP: 192.168.2.65
iSCSI Gateway IP: 192.168.2.1
iSCSI Initiator Name: iqn.1991-05.com.microsoft:c64iscsi
Attempting to connect to target disk using MAC address(0015177CC363)
ERROR: Could not establish TCP/IP connection with iSCSI target.
No disk found!
-------------------------------------------

In tcpdump I don't even see the card trying to do anything useful.
tcpdump -nvi em0 port 3260
18:05:18.602286 IP (tos 0x0, ttl 64, id 24099, offset 0, flags [DF], proto TCP (6), length 100) 192.168.2.1.3260 > 192.168.2.65.51302: P 144:192(48) ack 1 win 8326

Any idea how to tell the Intel card to boot via network?

Regards,
Michael!

From stb at lassitu.de Mon Aug 10 09:35:47 2009
From: stb at lassitu.de (Stefan Bethke)
Date: Mon Aug 10 09:35:54 2009
Subject: Using Intel Iscsi remote boot with istgt
In-Reply-To: <4A7F0F0E.207@fuckner.net>
References: <4A7F0F0E.207@fuckner.net>
Message-ID: <459140BF-273E-4587-93CC-085BD446E8F6@lassitu.de>

On 09.08.2009 at 20:01, Michael Fuckner wrote:

> Initializing adapter configuration - MAC address(0015177CC363).
> Using STATIC configuration for primary port. Please wait.
> iSCSI Target Name : iqn.2007-09.jp.ne.peach.istgt:target0
> iSCSI Target IP Address : 192.168.2.1
> LUN ID: 1 Port 3260
> VLAN ID : 3
            ^^
> iSCSI Initiator IP: 192.168.2.65
> iSCSI Gateway IP: 192.168.2.1
> iSCSI Initiator Name: iqn.1991-05.com.microsoft:c64iscsi

Does that match your network configuration?

Can you try capturing all traffic the card sends towards the server, not
just tcp port 3260?

Stefan

--
Stefan Bethke Fon +49 151 14070811

From bugmaster at FreeBSD.org Mon Aug 10 11:06:55 2009
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Aug 10 11:07:59 2009
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org
Message-ID: <200908101106.n7AB6sxJ025130@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.

S Tracker      Resp.      Description
--------------------------------------------------------------------------------
o kern/137037  fs         [zfs] [hang] zfs rollback on root causes FreeBSD to fr
o kern/136968  fs         [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945  fs         [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944  fs         [ffs] [lor] bufwait/snaplk (fsync)
o kern/136942  fs         [zfs] zvol resize not reflected until reboot
o kern/136873  fs         [ntfs] Missing directories/files on NTFS volume
o kern/136865  fs         [nfs] [patch] NFS exports atomic and on-the-fly atomic
o kern/136470  fs         [nfs] Cannot mount / in read-only, over NFS
o kern/136218  fs         [zfs] Exported ZFS pools can't be imported into (Open)
o kern/135594  fs         [zfs] Single dataset unresponsive with Samba
o kern/135546  fs         [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135480  fs         [zfs] panic: lock &arg.lock already initialized
o kern/135469  fs         [ufs] [panic] kernel crash on md operation in ufs_dirb
o bin/135314   fs         [zfs] assertion failed for zdb(8) usage
o kern/135050  fs         [zfs] ZFS clears/hides disk errors on reboot
f kern/134496  fs         [zfs] [panic] ZFS pool export occasionally causes a ke
o kern/134491  fs         [zfs] Hot spares are rather cold...
o kern/133980  fs         [panic] [ffs] panic: ffs_valloc: dup alloc
o kern/133676  fs         [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133614  fs         [smbfs] [panic] panic: ffs_truncate: read-only filesys
o kern/133373  fs         [zfs] umass attachment causes ZFS checksum errors, dat
o kern/133174  fs         [msdosfs] [patch] msdosfs must support utf-encoded int
f kern/133150  fs         [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w
o kern/133134  fs         [zfs] Missing ZFS zpool labels
o kern/132960  fs         [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132597  fs         [tmpfs] [panic] tmpfs-related panic while interrupting
o kern/132551  fs         [zfs] ZFS locks up on extattr_list_link syscall
o kern/132397  fs         reboot causes filesystem corruption (failure to sync b
o kern/132331  fs         [ufs] [lor] LOR ufs and syncer
o kern/132237  fs         [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145  fs         [panic] File System Hard Crashes
f kern/132068  fs         [zfs] page fault when using ZFS over NFS on 7.1-RELEAS
o kern/131995  fs         [nfs] Failure to mount NFSv4 server
o kern/131360  fs         [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs         [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs         makefs: error "Bad file descriptor" on the mount poin
o kern/131086  fs         [ext2fs] [patch] mkfs.ext2 creates rotten partition
o kern/130979  fs         [smbfs] [panic] boot/kernel/smbfs.ko
o kern/130920  fs         [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130229  fs         [iconv] usermount fails on fs that need iconv
o kern/130210  fs         [nullfs] Error by check nullfs
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488  fs         [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c:
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
o kern/129148  fs         [zfs] [panic] panic on concurrent writing & rollback
o kern/129059  fs         [zfs] [patch] ZFS bootloader whitelistable via WITHOUT
f kern/128829  fs         smbd(8) causes periodic panic on 7-RELEASE
o kern/128633  fs         [zfs] [lor] lock order reversal in zfs
o kern/128514  fs         [zfs] [mpt] problems with ZFS and LSILogic SAS/SATA Ad
f kern/128173  fs         [ext2fs] ls gives "Input/output error" on mounted ext3
o kern/127659  fs         [tmpfs] tmpfs memory leak
o kern/127492  fs         [zfs] System hang on ZFS input-output
o kern/127420  fs         [gjournal] [panic] Journal overflow on gmirrored gjour
o kern/127213  fs         [tmpfs] sendfile on tmpfs data corruption
o kern/127029  fs         [panic] mount(8): trying to mount a write protected zi
o kern/126287  fs         [ufs] [panic] Kernel panics while mounting an UFS file
s kern/125738  fs         [zfs] [request] SHA256 acceleration in ZFS
o kern/125644  fs         [zfs] [panic] zfs unfixable fs errors caused panic whe
f kern/125536  fs         [ext2fs] ext 2 mounts cleanly but fails on commands li
o kern/125149  fs         [nfs] [panic] changing into .zfs dir from nfs client c
f kern/124621  fs         [ext3] [patch] Cannot mount ext2fs partition
f bin/124424   fs         [zfs] zfs(8): zfs list -r shows strange snapshots' siz
o kern/123939  fs         [msdosfs] corrupts new files
o kern/122888  fs         [zfs] zfs hang w/ prefetch on, zil off while running t
o kern/122380  fs         [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash
o kern/122173  fs         [zfs] [panic] Kernel Panic if attempting to replace a
o bin/122172   fs         [fs]: amd(8) automount daemon dies on 6.3-STABLE i386,
o kern/122047  fs         [ext2fs] [patch] incorrect handling of UF_IMMUTABLE /
o kern/122038  fs         [tmpfs] [panic] tmpfs: panic: tmpfs_alloc_vp: type 0xc
o bin/121898   fs         [nullfs] pwd(1)/getcwd(2) fails with Permission denied
o bin/121779   fs         [ufs] snapinfo(8) (and related tools?) only work for t
o kern/121770  fs         [zfs] ZFS on i386, large file or heavy I/O leads to ke
o bin/121366   fs         [zfs] [patch] Automatic disk scrubbing from periodic(8
o bin/121072   fs         [smbfs] mount_smbfs(8) cannot normally convert the cha
f kern/120991  fs         [panic] [fs] [snapshot] System crashes when manipulati
o kern/120483  fs         [ntfs] [patch] NTFS filesystem locking changes
o kern/120482  fs         [ntfs] [patch] Sync style changes between NetBSD and F
o bin/120288   fs         zfs(8): "zfs share -a" does not send SIGHUP to mountd
f kern/119735  fs         [zfs] geli + ZFS + samba starting on boot panics 7.0-B
o kern/118912  fs         [2tb] disk sizing/geometry problem with large array
o misc/118855  fs         [zfs] ZFS-related commands are nonfunctional in fixit
o kern/118713  fs         [minidump] [patch] Display media size required for a k
o kern/118320  fs         [zfs] [patch] NFS SETATTR sometimes fails to set file
o bin/118249   fs         mv(1): moving a directory changes its mtime
o kern/118107  fs         [ntfs] [panic] Kernel panic when accessing a file at N
o bin/117315   fs         [smbfs] mount_smbfs(8) and related options can't mount
o kern/117314  fs         [ntfs] Long-filename only NTFS fs'es cause kernel pani
o kern/117158  fs         [zfs] zpool scrub causes panic if geli vdevs detach on
o bin/116980   fs         [msdosfs] [patch] mount_msdosfs(8) resets some flags f
o kern/116913  fs         [ffs] [panic] ffs_blkfree: freeing free block
p kern/116608  fs         [msdosfs] [patch] msdosfs fails to check mount options
o kern/116583  fs         [ffs] [hang] System freezes for short time when using
o kern/116170  fs         [panic] Kernel panic when mounting /tmp
o kern/115645  fs         [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex
o bin/115361   fs         [zfs] mount(8) gets into a state where it won't set/un
o kern/114955  fs         [cd9660] [patch] [request] support for mask,dirmask,ui
o kern/114847  fs         [ntfs] [patch] [request] dirmask support for NTFS ala
o kern/114676  fs         [ufs] snapshot creation panics: snapacct_ufs2: bad blo
o bin/114468   fs         [patch] [request] add -d option to umount(8) to detach
o kern/113852  fs         [smbfs] smbfs does not properly implement DFS referral
o bin/113838   fs         [patch] [request] mount(8): add support for relative p
o kern/113180  fs         [zfs] Setting ZFS nfsshare property does not cause inh
o bin/113049   fs         [patch] [request] make quot(8) use getopt(3) and show
o kern/112658  fs         [smbfs] [patch] smbfs and caching problems (resolves b
o kern/111843  fs         [msdosfs] Long Names of files are incorrectly created
o kern/111782  fs         [ufs] dump(8) fails horribly for large filesystems
s bin/111146   fs         [2tb] fsck(8) fails on 6T filesystem
o kern/109024  fs         [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not
o kern/109010  fs         [msdosfs] can't mv directory within fat32 file system
o bin/107829   fs         [2TB] fdisk(8): invalid boundary checking in fdisk / w
o kern/106030  fs         [ufs] [panic] panic in ufs from geom when a dead disk
o kern/105093  fs         [ext2fs] [patch] ext2fs on read-only media cannot be m
o kern/104406  fs         [ufs] Processes get stuck in "ufs" state under persist
o kern/104133  fs         [ext2fs] EXT2FS module corrupts EXT2/3 filesystems
o kern/103035  fs         [ntfs] Directories in NTFS mounted disc images appear
o kern/101324  fs         [smbfs] smbfs sometimes not case sensitive when it's s
o kern/99290   fs         [ntfs] mount_ntfs ignorant of cluster sizes
o kern/97377   fs         [ntfs] [patch] syntax cleanup for ntfs_ihash.c
o kern/95222   fs         [iso9660] File sections on ISO9660 level 3 CDs ignored
o kern/94849   fs         [ufs] rename on UFS filesystem is not atomic
o kern/94769   fs         [ufs] Multiple file deletions on multi-snapshotted fil
o kern/94733   fs         [smbfs] smbfs may cause double unlock
o kern/93942   fs         [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D
o kern/92272   fs         [ffs] [hang] Filling a filesystem while creating a sna
f kern/91568   fs         [ufs] [panic] writing to UFS/softupdates DVD media in
o kern/91134   fs         [smbfs] [patch] Preserve access and modification time
a kern/90815   fs         [smbfs] [patch] SMBFS with character conversions somet
o kern/89991   fs         [ufs] softupdates with mount -ur causes fs UNREFS
o kern/88657   fs         [smbfs] windows client hang when browsing a samba shar
o kern/88266   fs         [smbfs] smbfs does not implement UIO_NOCOPY and sendfi
o kern/87859   fs         [smbfs] System reboot while umount smbfs.
o kern/86587   fs         [msdosfs] rm -r /PATH fails with lots of small files
o kern/85326   fs         [smbfs] [panic] saving a file via samba to an overquot
o kern/84589   fs         [2TB] 5.4-STABLE unresponsive during background fsck 2
o kern/80088   fs         [smbfs] Incorrect file time setting on NTFS mounted vi
o kern/77826   fs         [ext2fs] ext2fs usb filesystem will not mount RW
o kern/73484   fs         [ntfs] Kernel panic when doing `ls` from the client si
o bin/73019    fs         [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino
o kern/71774   fs         [ntfs] NTFS cannot "see" files on a WinXP filesystem
o kern/68978   fs         [panic] [ufs] crashes with failing hard disk, loose po
o kern/65920   fs         [nwfs] Mounted Netware filesystem behaves strange
o kern/65901   fs         [smbfs] [patch] smbfs fails fsx write/truncate-down/tr
o kern/61503   fs         [smbfs] mount_smbfs does not work as non-root
o kern/55617   fs         [smbfs] Accessing an nsmb-mounted drive via a smb expo
o kern/51685   fs         [hang] Unbounded inode allocation causes kernel to loc
o kern/51583   fs         [nullfs] [patch] allow to work with devices and socket
o kern/36566   fs         [smbfs] System reboot with dead smb mount and umount
o kern/18874   fs         [2TB] 32bit NFS servers export wrong negative values t

149 problems total.

From trevor.hearn at Vanderbilt.Edu Mon Aug 10 19:31:26 2009
From: trevor.hearn at Vanderbilt.Edu (Hearn, Trevor)
Date: Mon Aug 10 19:31:34 2009
Subject: UFS Filesystem issues, and the loss of my hair...
In-Reply-To: <200908070829.54571.jhb@freebsd.org>
References: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34C3@ITS-HCWNEM03.ds.Vanderbilt.edu>,
    <200908070829.54571.jhb@freebsd.org>
Message-ID: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34CA@ITS-HCWNEM03.ds.Vanderbilt.edu>

To the FreeBSD-FS group at large...

Well, I've spent a lot of time looking this one over...
I set up a share on a webserver to put up redacted images of the errors I am getting. They are here:

http://www.trevorhearn.com/Array/IMG_0056.jpg
http://www.trevorhearn.com/Array/IMG_0061.jpg
http://www.trevorhearn.com/Array/IMG_0063.jpg
http://www.trevorhearn.com/Array/IMG_0065.jpg
http://www.trevorhearn.com/Array/IMG_0067.jpg
http://www.trevorhearn.com/Array/IMG_0069.jpg

So, while I am in a meeting about the array, oddly, I have this come rolling across the screen of the terminal session I am in...

Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5
Aug 10 10:53:43 XXXX last message repeated 20 times
Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=1638d)]error = 5
Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5
Aug 10 10:53:43 XXXX last message repeated 18 times

When I say it was rolling across the screen, I mean it did it for about 5 minutes... I was waiting for the hard-lock to happen, but the process that was touching the file(s) went to 99.02%, and has stayed there the remainder of the day...

  PID USERNAME  THR PRI NICE   SIZE    RES STATE  C   TIME   WCPU COMMAND
 1351 xxxxxxxx    1  -8    0 10928K  4656K CPU1   0   2:10 99.02% smbd

This happened earlier in the morning, when we were only seeing moderate usage:

Aug 10 09:54:18 PRSA kernel: pid 1776 (smbd), uid 1194 inumber 107797529 on /xxxxxxxxxx: bad block
Aug 10 09:54:18 PRSA kernel: bad block 165436921330628865, ino 107797529

The bad block number is WAAAY outside of what is used on the machine.

So.... Everything that I have found relating to these problems is everyone asking, 'How do I fix this', and NONE of them so far have been a fix. 'Error = 5' relates to EIO, or an error in the input/output to a device.
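
For anyone following along, the 'error = 5' to EIO mapping mentioned above can be checked from a script. This is only a minimal sketch in Python, not something posted in the thread; the helper name describe_errno is made up for illustration, and it assumes a platform where errno 5 is EIO (which holds on FreeBSD and Linux):

```python
import errno
import os


def describe_errno(code):
    """Return (symbolic name, human-readable message) for an errno value."""
    # errno.errorcode maps numeric values to names such as 'EIO'.
    name = errno.errorcode.get(code, "UNKNOWN")
    # os.strerror gives the platform's message text for the same value.
    return name, os.strerror(code)


name, message = describe_errno(5)
print(name, "-", message)
```

The same lookup works for any other number that shows up in a g_vfs_done() message.
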
Now, that being said, I either have a problem with the controller in my Promise Array, which I am learning is possible, or I have an issue with a driver in FreeBSD, and just happen to have a circumstance where it will appear. There does not seem to be a rhyme or reason to what is taking place. How does a set of array controllers throw a bad block error? I mean, with a standard drive, I can see it... but an array controller?

Some other things that I have found...

The link below tells about using 'find / -type d -exec stat {} \;' to run through the filesystem and find the corrupted files. I did this earlier this morning, and found none. I went back through several of the inodes that are showing in the pictures above, and only found one in existence. I battened down the hatches, and hit that directory. I was able to cp all of the info in that directory to another directory without a single problem. With all that I have been reading, this should have caused all manner of hell. I ran fsck on all directories, and got the server back online...

Back online? Yes. It hard-locked at 3:09 AM Sunday morning. Odd, since it has done that MANY times at 3:09 AM. I have Nagios watching the server, and it always seems to do so at the same time. I looked at cron jobs, and found that it runs PERIODIC DAILY at 3:01 AM. My Nagios box checks every 5 minutes, with three intervals of one minute afterwards if a service is not available. SO, somewhere in the list of things that the server does in the PERIODIC DAILY job, there is something that makes the server fault. Tonight, I will be going through the jobs, running them one by one, seeing exactly which one causes the fault. I have seen others speak of it going down at 3:00 AM-ish, so I think this might be a bit of a clue.

At this point, I am purchasing another 2-port fibre channel card, with hopes of installing it in a spare 1U server I have, to migrate to Ubuntu, or similar. I'd like to test it out with Ubuntu, but I do not know at this point if it will see the array partitions correctly, nor if it will allow me to access the UFS partitions that are there. Worst case, I will back up, and re-format the chassis themselves. I would hope that this would not be necessary, but I am almost at my wit's end.

Has ANYONE got any ideas, other than the ones presented? I'm keen to see if there is a fix, because I love FreeBSD, but I can't be an evangelist for it when it is giving me so much grief.

Thanks for listening, I'll be here all week. :)

-Trevor

________________________________________
From: John Baldwin [jhb@freebsd.org]
Sent: Friday, August 07, 2009 7:29 AM
To: freebsd-fs@freebsd.org
Cc: Hearn, Trevor
Subject: Re: UFS Filesystem issues, and the loss of my hair...

On Thursday 06 August 2009 9:51:04 am Hearn, Trevor wrote:
> First off, let me state that I love FreeBSD. I've used it for years, and have not had any major problems with it... Until now.
>
> As you can tell, I work for a major university. I setup a large storage array to hold data for a project they have here. No great shakes, just some standard files and such. The fun started when I started loading users onto the system, and they started using it... Isn't that always the case? Now, I get ufs_dirbad errors, and the system hard locks. This isn't the worst thing that could happen, but when you're talking about file partitions the size that I am using, the fsck takes FOREVER. Somewhere on the order of 1.5 hours. During that time, I am bringing the individual shares/partitions online, but the users suffer. I've asked about this before, in a different forum, but got no usable information that I could see. So, here goes...
>
> The system is as such. A dell 2950 1U server, with a Qlogic Fibre Channel card. It is connected to two Promise Array chassis, 610 series, each with 16 drives. Each chassis is running RAID 6, which gives me about 12.73tb of storage per chassis. From there, the logical drives are sliced up into smaller partitions. At most, I have a 3.6tb partition. The smallest is a 100gig partition.
>
> Filesystem       Size    Used   Avail Capacity  Mounted on
> /dev/mfid0s1a    197G     10G    170G     6%    /
> devfs            1.0K    1.0K      0B   100%    /dev
> /dev/da0p1       1.8T    1.5T    130G    92%    /slice1
> /dev/da0p5       2.7T    1.8T    661G    74%    /slice2
> /dev/da0p9       250G     21G    209G     9%    /slice3
> /dev/da1p3       103G     12G     83G    12%    /slice4
> /dev/da1p4       205G     54G    135G    29%    /slice5
> /dev/da1p5       103G    7.3G     87G     8%    /slice6
> /dev/da1p6       103G     22G     72G    23%    /slice7
> etc...
>
> I had to use GPT to setup the partitions, and they are using UFS2 for the filesystem. Now... If that's not fun enough... I have TWO of these creatures, which RSYNC every 4 hours. The secondary system is across campus, and sits idle 99% of the time. Every 4 hours, in a stepped schedule, the primary array syncs to the secondary array. If the primary goes down, I FSCK, and any files that are fried, I bring back across from the secondary and replace them. This has worked OK for a while, but now I am getting Kernel Panics on a regular basis. I've been told to migrate to a different filesystem, but my options are ZFS and using GJOURNAL with UFS, from what I can tell. I need something repeatable, simple, and I need something robust. I have NO idea why I keep getting errors like this, but I imagine it's a cascading effect of other hangs that have caused more corruption.
>
> I'd buy a fella, or gal, a cup of coffee and a pop-tart if they could help a brother out. I have checked out this link:
> http://phaq.phunsites.net/2007/07/01/ufs_dirbad-panic-with-mangled-entries-in-ufs/
> and decided that I need to give this a shot after hours, but being the kinda guy I am, I need to make sure I am covering all of my bases.

Are you seeing ufs_dirbad panics? Specifically, can you capture the
messages on the console when the machine panics?

--
John Baldwin

From jhb at freebsd.org Mon Aug 10 20:06:47 2009
From: jhb at freebsd.org (John Baldwin)
Date: Mon Aug 10 20:06:53 2009
Subject: UFS Filesystem issues, and the loss of my hair...
In-Reply-To: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34CA@ITS-HCWNEM03.ds.Vanderbilt.edu>
References: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34C3@ITS-HCWNEM03.ds.Vanderbilt.edu>
    <200908070829.54571.jhb@freebsd.org>
    <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34CA@ITS-HCWNEM03.ds.Vanderbilt.edu>
Message-ID: <200908101605.12332.jhb@freebsd.org>

On Monday 10 August 2009 3:31:23 pm Hearn, Trevor wrote:
> To the FreeBSD-FS group at large...
>
> Well, I've spent alot of time looking this one over... I setup a share on a webserver to put up redacted images of the errors I am getting. They are here:
>
> http://www.trevorhearn.com/Array/IMG_0056.jpg
> http://www.trevorhearn.com/Array/IMG_0061.jpg
> http://www.trevorhearn.com/Array/IMG_0063.jpg
> http://www.trevorhearn.com/Array/IMG_0065.jpg
> http://www.trevorhearn.com/Array/IMG_0067.jpg
> http://www.trevorhearn.com/Array/IMG_0069.jpg
>
> So, while I am in a meeting about the array, oddly, I have this come rolling across the screen of the terminal session I am in...
>
> Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5
> Aug 10 10:53:43 XXXX last message repeated 20 times
> Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=1638d)]error = 5
> Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5
> Aug 10 10:53:43 XXXX last message repeated 18 times

Is the '1638d' above a typo in the cut and paste or does it actually have a
'd' instead of a '4' in the log? '4' is 0x34 and 'd' is 0x64 so that could
be indicative of a two-bit memory error perhaps?
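
The arithmetic behind the two-bit guess can be checked directly. A minimal sketch in Python, not part of the original thread (the helper name differing_bits is made up for illustration): XOR the ASCII codes of '4' and 'd' and count the set bits.

```python
def differing_bits(a, b):
    """Return the XOR of two characters' code points and how many bits differ."""
    diff = ord(a) ^ ord(b)
    return diff, bin(diff).count("1")


diff, count = differing_bits("4", "d")
# 0x34 ^ 0x64 == 0x50, which has exactly two bits set (bits 4 and 6).
print(hex(ord("4")), hex(ord("d")), hex(diff), count)
```

A result of two differing bits is consistent with the memory-error idea, though on its own it does not rule out interleaved console output.
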

--
John Baldwin

From trevor.hearn at Vanderbilt.Edu Mon Aug 10 20:19:17 2009
From: trevor.hearn at Vanderbilt.Edu (Hearn, Trevor)
Date: Mon Aug 10 20:19:23 2009
Subject: UFS Filesystem issues, and the loss of my hair...
In-Reply-To: <200908101605.12332.jhb@freebsd.org>
References: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34C3@ITS-HCWNEM03.ds.Vanderbilt.edu>
    <200908070829.54571.jhb@freebsd.org>
    <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34CA@ITS-HCWNEM03.ds.Vanderbilt.edu>,
    <200908101605.12332.jhb@freebsd.org>
Message-ID: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34CC@ITS-HCWNEM03.ds.Vanderbilt.edu>

Here is the chunk that I grabbed from the screen to save as a memento. :)

Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5
Aug 10 10:53:43 XXXX last message repeated 20 times
Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=1638d)]error = 5
Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5
Aug 10 10:53:43 XXXX last message repeated 18 times
Aug 10 10:53:43 XXXX kernel: g_vfs_done()-da1p7[READ(offset=
Aug 10 10:53:43 XXXX kernel: -6419569950008350720, length=16384)]error = 5
Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5
Aug 10 10:53:43 XXXX last message repeated 19 times
Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, l=ngth=16384)]error 1 5
Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5
Aug 10 10:53:43 XXXX last message repeated 6 times
Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384o]error = 5
Aug 10 10:53:43 XXXX kernel:
Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5
Aug 10 10:53:43 XXXX last message repeated 22 times
Aug 10 10:53:44 XXXX kernel: READ(offset=-6419569950008350720, length=16384)]error = 5
Aug 10 10:53:44 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5
Aug 10 10:53:44 XXXX last message repeated 849 times

As you can see, it's a little disjointed. I would assume that the screwed-up length information is from the speed at which it was coming out, and possibly another message got thrown in at the same time?

-Trevor

________________________________________
From: John Baldwin [jhb@freebsd.org]
Sent: Monday, August 10, 2009 3:05 PM
To: Hearn, Trevor
Cc: freebsd-fs@freebsd.org
Subject: Re: UFS Filesystem issues, and the loss of my hair...

On Monday 10 August 2009 3:31:23 pm Hearn, Trevor wrote:
> To the FreeBSD-FS group at large...
>
> Well, I've spent alot of time looking this one over... I setup a share on a webserver to put up redacted images of the errors I am getting. They are here:
>
> http://www.trevorhearn.com/Array/IMG_0056.jpg
> http://www.trevorhearn.com/Array/IMG_0061.jpg
> http://www.trevorhearn.com/Array/IMG_0063.jpg
> http://www.trevorhearn.com/Array/IMG_0065.jpg
> http://www.trevorhearn.com/Array/IMG_0067.jpg
> http://www.trevorhearn.com/Array/IMG_0069.jpg
>
> So, while I am in a meeting about the array, oddly, I have this come rolling across the screen of the terminal session I am in...
>
> Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5
> Aug 10 10:53:43 XXXX last message repeated 20 times
> Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=1638d)]error = 5
> Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5
> Aug 10 10:53:43 XXXX last message repeated 18 times

Is the '1638d' above a typo in the cut and paste or does it actually have a
'd' instead of a '4' in the log? '4' is 0x34 and 'd' is 0x64 so that could
be indicative of a two-bit memory error perhaps?
-- John Baldwin From jhb at freebsd.org Mon Aug 10 21:08:04 2009 From: jhb at freebsd.org (John Baldwin) Date: Mon Aug 10 21:08:11 2009 Subject: UFS Filesystem issues, and the loss of my hair... In-Reply-To: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34CC@ITS-HCWNEM03.ds.Vanderbilt.edu> References: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34C3@ITS-HCWNEM03.ds.Vanderbilt.edu> <200908101605.12332.jhb@freebsd.org> <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34CC@ITS-HCWNEM03.ds.Vanderbilt.edu> Message-ID: <200908101707.49526.jhb@freebsd.org> On Monday 10 August 2009 4:15:44 pm Hearn, Trevor wrote: > Here is the chunk that I grabbed from the screen to save as a memento. :) > > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:43 XXXX last message repeated 20 times > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=1638d)]error = 5 > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:43 XXXX last message repeated 18 times > Aug 10 10:53:43 XXXX kernel: g_vfs_done()-da1p7[READ(offset= > Aug 10 10:53:43 XXXX kernel: -6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:43 XXXX last message repeated 19 times > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, l=ngth=16384)]error 1 5 > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:43 XXXX last message repeated 6 times > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384o]error = 5 > Aug 10 10:53:43 XXXX kernel: > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:43 XXXX last message repeated 22 times > Aug 10 10:53:44 XXXX kernel: 
READ(offset=-6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:44 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:44 XXXX last message repeated 849 times > > As you can see, it's a little disjointed. I would assume that the screwed up length information is from the speed at which it was coming out, and possibly another message got throw in at the same time? Yes, it does seem like it was part of one of the other messages. The isp(4) driver was just recently updated in HEAD by mjacob@ who has maintained that driver in the past. He may have some insight if there is an isp(4)-specific problem. -- John Baldwin From trevor.hearn at Vanderbilt.Edu Mon Aug 10 22:24:42 2009 From: trevor.hearn at Vanderbilt.Edu (Hearn, Trevor) Date: Mon Aug 10 22:24:48 2009 Subject: UFS Filesystem issues, and the loss of my hair... In-Reply-To: <200908101707.49526.jhb@freebsd.org> References: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34C3@ITS-HCWNEM03.ds.Vanderbilt.edu> <200908101605.12332.jhb@freebsd.org> <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34CC@ITS-HCWNEM03.ds.Vanderbilt.edu>, <200908101707.49526.jhb@freebsd.org> Message-ID: <8E9591D8BCB72D4C8DE0884D9A2932DC6D2EDF21@ITS-HCWNEM03.ds.Vanderbilt.edu> ________________________________________ From: John Baldwin [jhb@freebsd.org] Sent: Monday, August 10, 2009 4:07 PM To: Hearn, Trevor Cc: freebsd-fs@freebsd.org Subject: Re: UFS Filesystem issues, and the loss of my hair... On Monday 10 August 2009 4:15:44 pm Hearn, Trevor wrote: > Here is the chunk that I grabbed from the screen to save as a memento. 
:) > > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:43 XXXX last message repeated 20 times > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=1638d)]error = 5 > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:43 XXXX last message repeated 18 times > Aug 10 10:53:43 XXXX kernel: g_vfs_done()-da1p7[READ(offset= > Aug 10 10:53:43 XXXX kernel: -6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:43 XXXX last message repeated 19 times > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, l=ngth=16384)]error 1 5 > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:43 XXXX last message repeated 6 times > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384o]error = 5 > Aug 10 10:53:43 XXXX kernel: > Aug 10 10:53:43 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:43 XXXX last message repeated 22 times > Aug 10 10:53:44 XXXX kernel: READ(offset=-6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:44 XXXX kernel: g_vfs_done():da1p7[READ(offset=-6419569950008350720, length=16384)]error = 5 > Aug 10 10:53:44 XXXX last message repeated 849 times > > As you can see, it's a little disjointed. I would assume that the screwed up length information is from the speed at which it was coming out, and possibly another message got throw in at the same time? Yes, it does seem like it was part of one of the other messages. The isp(4) driver was just recently updated in HEAD by mjacob@ who has maintained that driver in the past. 
He may have some insight if there is an isp(4)-specific problem.

--
John Baldwin

Heh. Ok, I just watched the same error message scroll across the screen for about 5 minutes now, with a different offset, same length. The fun part is that it is not touching the device, /dev/da1p7, at all. From the systat -vmstat display, I see all of the traffic coming from the /dev/mfid0 drives. It ran for a while, then stopped. So, no access to the drive in question, da1p7, but on the root drive, mfid0. Odd. The partition is mapped to the root drive. I wonder if the driver lost itself, and it tried to access the file on the empty folder on the root drive. Sigh.

Anyone?

-Trevor

From nafzal at hotmail.com Tue Aug 11 00:51:10 2009
From: nafzal at hotmail.com (Naeem Afzal)
Date: Tue Aug 11 00:51:16 2009
Subject: filesystem size after newfs
Message-ID:

Resending to FS mailing list: I created this small partition of 512K bytes on disk, and I am noticing that about 24% is used up before the filesystem can be mounted and used. My assumption was that about 4% is supposed to be used if minfree is set to 0.

#newfs -U -l -m 0 -n -o space /dev/ad1d
/dev/ad1d: 0.5MB (1024 sectors) block size 16384, fragment size 2048 using 1 cylinder groups of 0.50MB, 32 blks, 64 inodes with soft updates
super-block backups (for fsck -b #) at:
 160
#mount /dev/ad1d /test
#df -H /test
Filesystem    Size    Used   Avail Capacity  Mounted on
/dev/ad1d     391k    2.0k    389k     1%    /test

Could someone explain where the 512-391=121K of disk space went? What is the relation between this used space and the total partition size, or is it some fixed ratio?

Thanks & Regards
Naeem

From michael at fuckner.net Tue Aug 11 05:22:46 2009
From: michael at fuckner.net (Michael Fuckner)
Date: Tue Aug 11 05:22:53 2009
Subject: Using Intel Iscsi remote boot with istgt
In-Reply-To: <459140BF-273E-4587-93CC-085BD446E8F6@lassitu.de>
References: <4A7F0F0E.207@fuckner.net> <459140BF-273E-4587-93CC-085BD446E8F6@lassitu.de>
Message-ID: <4A81000A.3020602@fuckner.net>

Stefan Bethke wrote:
> On 09.08.2009 at 20:01, Michael Fuckner wrote:
>
>> Initializing adapter configuration - MAC address(0015177CC363).
>> Using STATIC configuration for primary port. Please wait.
>> iSCSI Target Name : iqn.2007-09.jp.ne.peach.istgt:target0
>> iSCSI Target IP Address : 192.168.2.1
>> LUN ID: 1 Port 3260
>> VLAN ID : 3
>            ^^
>> iSCSI Initiator IP: 192.168.2.65
>> iSCSI Gateway IP: 192.168.2.1
>> iSCSI Initiator Name: iqn.1991-05.com.microsoft:c64iscsi
>
>
> Does that match your network configuration?
>
> Can you try capturing all traffic the card sends towards the server, not
> just tcp port 3260?
>

Hi Stefan, hi all,

I am not using VLANs at home. I don't know where this VLAN ID comes from.

This is what my tcpdump looks like. 10.1.2.254 is my router box, connected to the internet; my iSCSI target box is doing packet forwarding between the workstation net (192.168.2.x) and WLAN/internet (10.1.2.x).
Regards, Michael g33# tcpdump -nvi em0 port 3260 tcpdump: listening on em0, link-type EN10MB (Ethernet), capture size 96 bytes 18:40:39.773645 IP (tos 0x0, ttl 64, id 31710, offset 0, flags [DF], proto TCP (6), length 100) 192.168.2.1.3260 > 192.168.2.65.53739: P 2159158359:2159158407(48) ack 820164098 win 8326 18:40:40.246018 IP (tos 0x0, ttl 64, id 31712, offset 0, flags [DF], proto TCP (6), length 100) 192.168.2.1.3260 > 192.168.2.65.53739: P 0:48(48) ack 1 win 8326 18:40:40.986477 IP (tos 0x0, ttl 64, id 31724, offset 0, flags [DF], proto TCP (6), length 100) 192.168.2.1.3260 > 192.168.2.65.53739: P 0:48(48) ack 1 win 8326 18:40:42.264067 IP (tos 0x0, ttl 64, id 31726, offset 0, flags [DF], proto TCP (6), length 100) 192.168.2.1.3260 > 192.168.2.65.53739: P 0:48(48) ack 1 win 8326 18:40:44.615007 IP (tos 0x0, ttl 64, id 31729, offset 0, flags [DF], proto TCP (6), length 100) 192.168.2.1.3260 > 192.168.2.65.53739: P 0:48(48) ack 1 win 8326 18:40:49.112412 IP (tos 0x0, ttl 64, id 31735, offset 0, flags [DF], proto TCP (6), length 100) 192.168.2.1.3260 > 192.168.2.65.53739: P 0:48(48) ack 1 win 8326 18:40:56.727573 IP (tos 0x0, ttl 64, id 31751, offset 0, flags [DF], proto TCP (6), length 100) 192.168.2.1.3260 > 192.168.2.65.53739: P 0:48(48) ack 1 win 8326 18:41:01.204986 IP (tos 0x0, ttl 64, id 31760, offset 0, flags [DF], proto TCP (6), length 100) 192.168.2.1.3260 > 192.168.2.65.53739: P 48:96(48) ack 1 win 8326 18:41:11.747747 IP (tos 0x0, ttl 64, id 31786, offset 0, flags [DF], proto TCP (6), length 148) 192.168.2.1.3260 > 192.168.2.65.53739: P 0:96(96) ack 1 win 8326 18:41:22.633882 IP (tos 0x0, ttl 64, id 31798, offset 0, flags [DF], proto TCP (6), length 100) 192.168.2.1.3260 > 192.168.2.65.53739: P 96:144(48) ack 1 win 8326 18:41:41.592982 IP (tos 0x0, ttl 64, id 31842, offset 0, flags [DF], proto TCP (6), length 196) 192.168.2.1.3260 > 192.168.2.65.53739: P 0:144(144) ack 1 win 8326 18:41:44.062446 IP (tos 0x0, ttl 64, id 31848, offset 0, flags [DF], 
proto TCP (6), length 100) 192.168.2.1.3260 > 192.168.2.65.53739: P 144:192(48) ack 1 win 8326 18:42:05.489296 IP (tos 0x0, ttl 64, id 31899, offset 0, flags [DF], proto TCP (6), length 100) 192.168.2.1.3260 > 192.168.2.65.53739: P 192:240(48) ack 1 win 8326 18:42:26.922956 IP (tos 0x0, ttl 64, id 31934, offset 0, flags [DF], proto TCP (6), length 100) 192.168.2.1.3260 > 192.168.2.65.53739: P 240:288(48) ack 1 win 8326 18:42:41.077411 IP (tos 0x0, ttl 64, id 31963, offset 0, flags [DF], proto TCP (6), length 340) 192.168.2.1.3260 > 192.168.2.65.53739: P 0:288(288) ack 1 win 8326 18:42:48.352074 IP (tos 0x0, ttl 64, id 31997, offset 0, flags [DF], proto TCP (6), length 100) 192.168.2.1.3260 > 192.168.2.65.53739: P 288:336(48) ack 1 win 8326 18:42:48.352289 IP (tos 0x0, ttl 64, id 20, offset 0, flags [DF], proto TCP (6), length 40) 192.168.2.65.53739 > 192.168.2.1.3260: R, cksum 0x64c1 (correct), 820164098:820164098(0) win 0 19:22:19.459706 IP (tos 0x0, ttl 63, id 61907, offset 0, flags [DF], proto TCP (6), length 60) 10.1.2.254.3260 > 192.168.2.65.14013: S, cksum 0x91e6 (correct), 2095746219:2095746219(0) win 5840 19:22:19.460056 IP (tos 0x0, ttl 64, id 2519, offset 0, flags [DF], proto TCP (6), length 40) 192.168.2.65.14013 > 10.1.2.254.3260: R, cksum 0x92d8 (correct), 0:0(0) ack 2095746220 win 0 ^C 19 packets captured 1312938 packets received by filter 0 packets dropped by kernel g33# From pmc at citylink.dinoex.sub.org Tue Aug 11 08:20:07 2009 From: pmc at citylink.dinoex.sub.org (Peter Much) Date: Tue Aug 11 08:20:13 2009 Subject: kern/137037: [zfs] [hang] zfs rollback on root causes FreeBSD to freeze in few seconds Message-ID: <200908110820.n7B8K53Y051011@freefall.freebsd.org> The following reply was made to PR kern/137037; it has been noted by GNATS. 
From: Peter Much
To: bug-followup@FreeBSD.org, killasmurf86@gmail.com
Cc:
Subject: Re: kern/137037: [zfs] [hang] zfs rollback on root causes FreeBSD to freeze in few seconds
Date: Tue, 11 Aug 2009 09:34:10 +0200

I considered doing more investigation before reporting my issue, but after seeing this bug report I think an interim report from my side should not harm.

I also experience system failures after rollback, and the significant similarity is that in my case also the rollbacks succeed, and the system continues to work for some seconds (or sometimes even longer) before it fails.

The failure is either (seldom) a system freeze or (much more often) an instantaneous reboot without dumping. I am currently investigating methods to capture some useful data. Maybe, if it freezes, running "watchdog" can trick it into doing a dump...

I am running 7.2-STABLE as of mid-July (that is ZFS V13). I admit I am somewhat low on memory to run ZFS (memory is ordered ;) ), but I use it only for a very limited number of filesystems and specific tasks, and I am watching my memory usage carefully. Nevertheless, if the system were to run out of memory, I would expect an orderly panic and not some hard reset or freeze.

I am not using geli or anything like that, and I am not working with the root; what I am doing is mainly an extensive use of the rollback feature, from a script, in a way like this:

while do
  zfs mount jb/x
  mount -t zfs jb/p /jb/x/p
  ... do some work ...
  umount /jb/x/p
  umount /jb/x
  zfs rollback jb/x@base
  zfs rollback jb/p@base
done

At first I tried this without the unmounting, but the crashes were so reproducible that I considered that non-functional. With the unmounting it looked functional at first, but now I also experience crashes about every 12 hours.

Beware: this is an interim report, I have not yet extensively verified against possibilities of my own mistakes. Take it with the appropriate grains of salt. ;)

------------------------------
Update: I was able to obtain a dump.
After running the above loop in a tough way and staying on the console, it suddenly started to do havoc, reported that it were not able to unmount the filesystems or could not detect them (something I also had seen occasionally before) and then dropped me into the debugger at _sx_xlock+0x16 lock cmpxchgl %edx,0x10(%ecx) The backtrace see attached below - but beware, since the havoc had already started before, this will very likely NOT point to the root cause of the problem. But maybe it gives some first impression. I suppose this should be reproducible, but in any case I would be glad to provide further data if requested (or do further tests). And as said before - if this is a result of low memory, then I am just sorry. ;) Ah, btw, its a dual Pentium3 SMP machine. (gdb) add-symbol-file /usr/src/sys/i386/compile/D1R72V1/modules/usr/src/sys/modules/zfs/zfs.ko 0xc0a59860 add symbol table from file "/usr/src/sys/i386/compile/D1R72V1/modules/usr/src/sys/modules/zfs/zfs.ko" at .text_addr = 0xc0a59860 (gdb) bt #0 doadump () at pcpu.h:196 #1 0xc05e8be6 in boot (howto=260) at ../../../kern/kern_shutdown.c:418 #2 0xc05e8f07 in panic (fmt=Variable "fmt" is not available. ) at ../../../kern/kern_shutdown.c:574 #3 0xc046ed77 in db_panic (addr=Could not find the frame base for "db_panic". 
) at ../../../ddb/db_command.c:446 #4 0xc046f52a in db_command (last_cmdp=0xc0932a54, cmd_table=0x0, dopager=1) at ../../../ddb/db_command.c:413 #5 0xc046f645 in db_command_loop () at ../../../ddb/db_command.c:466 #6 0xc047117c in db_trap (type=12, code=0) at ../../../ddb/db_main.c:228 #7 0xc0617581 in kdb_trap (type=12, code=0, tf=0xdb76b9fc) at ../../../kern/subr_kdb.c:524 #8 0xc0855adf in trap_fatal (frame=0xdb76b9fc, eva=76) at ../../../i386/i386/trap.c:929 #9 0xc0855d8b in trap_pfault (frame=0xdb76b9fc, usermode=0, eva=76) at ../../../i386/i386/trap.c:851 #10 0xc0856786 in trap (frame=0xdb76b9fc) at ../../../i386/i386/trap.c:529 #11 0xc083b70b in calltrap () at ../../../i386/i386/exception.s:166 #12 0xc05f0a56 in _sx_xlock (sx=0x3c, opts=0, file=0xc0b4953d "/usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c", line=1807) at atomic.h:149 #13 0xc0a79185 in dmu_buf_update_user (db_fake=0x0, old_user_ptr=0xc2de3000, user_ptr=0x0, user_data_ptr_ptr=0x0, evict_func=0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c:1807 #14 0xc0ad0cab in zfs_znode_dmu_fini (zp=0xc2de3000) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_znode.c:557 #15 0xc0aef214 in zfs_freebsd_reclaim (ap=0xdb76baf0) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4385 #16 0xc0871602 in VOP_RECLAIM_APV (vop=0xc0b55560, a=0xdb76baf0) at vnode_if.c:1566 #17 0xc066d28f in vgonel (vp=0xc355ce04) at vnode_if.h:819 #18 0xc0670f26 in vflush (mp=0xc3db25a0, rootrefs=0, flags=Variable "flags" is not available. 
) at ../../../kern/vfs_subr.c:2408 #19 0xc0aee0c8 in zfs_umount (vfsp=0xc3db25a0, fflag=134217728, td=0xc312bd80) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:1005 #20 0xc066a201 in dounmount (mp=0xc3db25a0, flags=134217728, td=0xc312bd80) at ../../../kern/vfs_mount.c:1290 #21 0xc066a957 in unmount (td=0xc312bd80, uap=0xdb76bcfc) at ../../../kern/vfs_mount.c:1186 #22 0xc08560f5 in syscall (frame=0xdb76bd38) at ../../../i386/i386/trap.c:1089 #23 0xc083b770 in Xint0x80_syscall () at ../../../i386/i386/exception.s:262 #24 0x00000033 in ?? () Previous frame inner to this frame (corrupt stack?) (gdb) From rick-freebsd2008 at kiwi-computer.com Tue Aug 11 19:18:38 2009 From: rick-freebsd2008 at kiwi-computer.com (Rick C. Petty) Date: Tue Aug 11 19:18:46 2009 Subject: filesystem size after newfs In-Reply-To: References: Message-ID: <20090811191837.GB66530@keira.kiwi-computer.com> On Tue, Aug 11, 2009 at 12:41:06AM +0000, Naeem Afzal wrote: > > resending to FS mailing list:I created this small partition of 512K bytes on disk, I am noticing about 24% is used up before system can be mounted and used. My assumption was about 4% is supposed to be used if minfree is set to 0. > #newfs -U -l -m 0 -n -o space /dev/ad1d > /dev/ad1d: 0.5MB (1024 sectors) block size 16384, fragment size 2048 using 1 cylinder groups of 0.50MB, 32 blks, 64 inodes with soft updates > super-block backups (for fsck -b #) at: > 160 > #mount /dev/ad1d /test > #df -H /test > Filesystem Size Used Avail Capacity Mounted on > /dev/ad1d 391k 2.0k 389k 1% /test > Could someone explain where the 512-391=121K of disk space went to? What is the relation between this used of space and total paritition size or is it some fixed ratio? When you use newfs(8), it leaves 64k at the front for bootstrap code. This is followed by at least one "block" for the superblock, one block for the superblock backup, one block for the cylinder group, and at least one block for inodes. 
Since your block size is 16k (the default), this means that your filesystem uses 64k for filesystem metadata. This isn't a problem with larger filesystems, but yours is 512k, so 128k (64k of bootstrap area plus 64k of metadata) is "wasted", meaning you cannot even use the space. I'm not sure how you are seeing a filesystem of 391k. I performed these same steps and I have a 382k filesystem: 512 - 128 - 2 = 382, so I'm not surprised by my numbers. That extra 2k is one fragment allocated to the root directory.

If you want to better conserve space on your small partition, you should probably use UFS1 (which only reserves 8k for bootstrap) instead of UFS2 and specify smaller block and fragment sizes. I would also specify inode density. I tried the following:

% newfs -O 1 -U -l -m 0 -n -o space -f 512 -b 4096 -i 1048576 /dev/md0
/dev/md0: 0.5MB (1024 sectors) block size 4096, fragment size 512
        using 1 cylinder groups of 0.50MB, 128 blks, 32 inodes.
        with soft updates
super-block backups (for fsck -b #) at:
 32

After mounting, it shows:

Filesystem    Size    Used   Avail Capacity  Mounted on
/dev/md0      480K    512B    479K     0%    /mnt

There are a number of things of which you should be careful. Using UFS1, you won't be able to use bootstrap code larger than 8k and you won't be able to use large files (not a problem because your filesystem is only 512k). You also won't get snapshots, which you apparently don't want.

Specifying inode density can put you in a bind if you need a lot of inodes. In my example there are exactly 16 inodes, which is somewhat limited. The first three inodes are reserved (2 is the root inode) which leaves you with a maximum of 13 files and/or directories. I'm assuming this isn't a problem since you're using such a small filesystem. The smaller block and fragment sizes help reduce the "wasted space" taken up by filesystem metadata, but will require some tuning if you want more inodes. Be sure that only one cylinder group is created, or you'll be wasting 16k or more for each cylinder group.
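The arithmetic for the default 16k-block case earlier in this message (512 - 128 - 2 = 382) can be written out as a small Python sketch. The function and parameter names are hypothetical, and the figures are simply the ones quoted above (64k bootstrap area, four 16k metadata blocks, one 2k fragment for the root directory); this is only bookkeeping, not how newfs itself computes the layout.

```python
def usable_kib(partition_kib: int, bootstrap_kib: int, block_kib: int,
               metadata_blocks: int, rootdir_kib: int) -> int:
    """Space left for file data after the fixed overhead is subtracted."""
    # Bootstrap area plus whole blocks for superblock, superblock backup,
    # cylinder group, and inodes.
    overhead_kib = bootstrap_kib + block_kib * metadata_blocks
    return partition_kib - overhead_kib - rootdir_kib


# 512k partition, 64k bootstrap, 16k blocks, 4 metadata blocks,
# 2k fragment for the root directory.
print(usable_kib(512, 64, 16, 4, 2))  # 382
```

Plugging in smaller block sizes shows why the overhead shrinks so quickly on a tiny partition: the metadata is always allocated in whole blocks.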
I also recommend keeping the 8:1 ratio of blocks to fragments. If you do wish to tweak that, here are a few things to note. Minimum blocksize is 4096 and at least 4 blocks are allocated for each cylinder group (in addition to the leading 64k). More blocks are allocated if the inode density is higher (specifying a lower number to "newfs -i"). UFS1 can fit twice as many inodes in the same space as UFS2, which is why I recommend using it with very small filesystems. Since filesystem metadata is always allocated in blocks, it doesn't really help to tweak the fragment size. At one time I was thinking of writing up a patch to newfs to allow you specify the superblock offset, so you could save 16-64k per cylinder group. But there are limitations, since the FFS code searches for superblocks at specific offsets, namely (in order): 64k, 8k, 0, 256k. I also had thoughts about patching it to remove the superblock backup, so that fs_sblkno could be 0 instead of 144 or 32. Because of its structure, at least 16k (8k bootstrap plus 8k initial superblock) is unused for every cylinder group in UFS1 (at least 72k for UFS2). There isn't much to be gained in such a patch except for very small filesystems such as in your case. When you're dealing with 512k, that extra 16k (or more) is starting to look significant (3%). HTH, -- Rick C. 
Petty From spawk at acm.poly.edu Tue Aug 11 20:06:39 2009 From: spawk at acm.poly.edu (Boris Kochergin) Date: Tue Aug 11 20:06:46 2009 Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs In-Reply-To: <20090807202756.GB2487@garage.freebsd.pl> References: <4A78AFB2.10103@acm.poly.edu> <20090805115621.GG1784@garage.freebsd.pl> <4A798A12.4070408@acm.poly.edu> <20090807073738.GA1607@garage.freebsd.pl> <20090807074400.GB1607@garage.freebsd.pl> <4A7C3002.8000003@acm.poly.edu> <20090807191334.GA1814@garage.freebsd.pl> <4A7C81CA.2040303@acm.poly.edu> <20090807193842.GA2487@garage.freebsd.pl> <4A7C87C5.1070608@acm.poly.edu> <20090807202756.GB2487@garage.freebsd.pl> Message-ID: <4A81CF20.7010108@acm.poly.edu> Pawel Jakub Dawidek wrote: > On Fri, Aug 07, 2009 at 04:00:05PM -0400, Boris Kochergin wrote: > >> Pawel Jakub Dawidek wrote: >> >>> On Fri, Aug 07, 2009 at 03:34:34PM -0400, Boris Kochergin wrote: >>> >>> >>>> Pawel Jakub Dawidek wrote: >>>> >>>> >>>>> Yeah, that's strange indeed. Could you try: >>>>> >>>>> print ab->b_arc_node.list_prev >>>>> print ab->b_arc_node.list_next >>>>> >>>>> >>>>> >>>>> >>>> (kgdb) print ab->b_arc_node.list_prev >>>> $1 = (struct list_node *) 0x1 >>>> >>>> >>> Yeah, list_prev is corrupted. If it panics on you everytime, I could >>> send you a patch which will try to catch where the corruption occurs. >>> >>> >>> >> I eventually get the arc_evict panic every time I successfully manage to >> mount the filesystem, but it usually panics (with the other backtrace) >> as soon as I try to mount it, or mount just hangs. I'll gladly try the >> patch, though--the data on the array is important to me. Thanks. >> > > To get the data from there you could also try to 'zfs send' it without > mounting the dataset at all (just in case). > > Sorry for the delay. I had to find another machine to move the disks into so that I could continue experimenting. 
Anyway, the filesystem didn't have any snapshots I could send, so I tried creating one with "zfs snapshot home@1" and the machine hung. FYI, In the new machine, all disks (including the one with the / filesystem) retain their device names. -Boris From pmc at citylink.dinoex.sub.org Tue Aug 11 21:43:25 2009 From: pmc at citylink.dinoex.sub.org (Peter Much) Date: Tue Aug 11 21:43:31 2009 Subject: zfs/panic: short after rollback References: <4A325E9F.2080802@icyb.net.ua> <3c1674c90906121354s6d6ae7ben5082708b1586e94f@mail.gmail.com> Message-ID: <200908112056.n7BKulp2029129@gate.oper.dinoex.org> aka Kip Macy schrieb mit Datum Fri, 12 Jun 2009 13:54:40 -0700 in m2n.fbsd.stable: |show sleepchain |show thread 100263 | |On Fri, Jun 12, 2009 at 6:56 AM, Andriy Gapon wrote: |> |> I did zfs rollback xxx@yyy |> And then did ls on a directory in the rolled-back fs. |> panic: sleeping thread This is quite likely the same problem as I experience. And it is maybe also the same problem as in kern/137037 and kern/129148. It seems to show up in some different flavours, while the bottomline is this: do a rollback, and soon after (usually at the next filesystem-related action) the kernel has gone fishing. I experienced it first when doing a rollback of a mounted filesystem. It crashed right after the first try, and it did so reproducible. (Well, more or less reproducible - another day under similar circumstances it did not crash.) Then I started thinking, and came to the conclusion that a rollback of a mounted filesystem (with possibly open files) could easily bring a lot of things into an undefined state, and should not be something one wants to do normally. So maybe it is not supposed to work at all. Anyway, when trying this, I do either get the "sleeping thread" message (as above), or a panic from _sx_xlock() (as shown in my addendum to kern/137037, and in the addendum to kern/129148). 
So I started to do rollbacks on unmounted filesystems (quite an excessive amount of them), and while this seemed to work at first, later on the system failures reappeared. These system failures took various shapes - I experienced immediate resets without dump, and system hangs. When deliberately trying to reproduce that (after installing a kernel with debugging info and watching the console), I also captured a panic coming from _sx_xlock() - so it seems to be the same problem as without unmounting, only that it takes a couple of rollbacks (a dozen or more) to hit. Over all, there was never any data loss or persistent damage. So, I consider rollback still functional and safe to use, but I consider a system no longer production stable after doing a rollback. rgds, PMc From nafzal at hotmail.com Wed Aug 12 00:16:10 2009 From: nafzal at hotmail.com (Naeem Afzal) Date: Wed Aug 12 00:16:17 2009 Subject: filesystem size after newfs In-Reply-To: <20090811191837.GB66530@keira.kiwi-computer.com> References: <20090811191837.GB66530@keira.kiwi-computer.com> Message-ID: Thanks you so much that was good explanation. For some reason using UFS1 or UFS2 did not make any size difference after I changed the block size and fragment size to minimum (4K/512). One more thing, I tried to make the same partition as geli with HMAC/SHA256 authentication and it eats up even more space. Without this authentication, usage is pretty close to without geli. #geli init -a HMAC/SHA256 -P -K da2-64bytes.key /dev/ad1d # newfs -O 1 -U -l -m 0 -n -o space -f 512 -b 4096 -i 1048576 /dev/ad1d.eli/dev/ad1d.eli: 0.2MB (511 sectors) block size 4096, fragment size 512 using 1 cylinder groups of 0.25MB, 63 blks, 32 inodes. with soft updatessuper-block backups (for fsck -b #) at: 32 #df -H /testFilesystem 1K-blocks Used Avail Capacity Mounted on/dev/ad1d.eli 228k 512B 228k 0% /test ada1.eli should have been 0.5MB, but seems like it is reserving some area for geli? How much is needed for HMAC/SHA256? 
What is allocation scheme if we go for this SHA256 authentication? regardsnaeem > > There are a number of things of which you should be careful. Using UFS1, > you won't be able to use bootstrap code larger than 8k and you won't be > able to use large files (not a problem because your filesystem is only > 512k). You also won't get snapshots, which you apparently don't want. > > Specifying inode density can put you in a bind if you need a lot of > inodes. In my example there are exactly 16 inodes, which is somewhat > limited. The first three inodes are reserved (2 is the root inode) which > leaves you with a maximum of 13 files and/or directories. I'm assuming > this isn't a problem since you're using such a small filesystem. The > smaller block and fragment sizes help reduce the "wasted space" taken up > by filesystem metadata, but will require some tuning if you want more > inodes. Be sure that only one cylinder group is created, or you'll be > wasting 16k or more for each cylinder group. > > I also recommend keeping the 8:1 ratio of blocks to fragments. If you do > wish to tweak that, here are a few things to note. Minimum blocksize is > 4096 and at least 4 blocks are allocated for each cylinder group (in > addition to the leading 64k). More blocks are allocated if the inode > density is higher (specifying a lower number to "newfs -i"). UFS1 can fit > twice as many inodes in the same space as UFS2, which is why I recommend > using it with very small filesystems. Since filesystem metadata is always > allocated in blocks, it doesn't really help to tweak the fragment size. > > At one time I was thinking of writing up a patch to newfs to allow you > specify the superblock offset, so you could save 16-64k per cylinder > group. But there are limitations, since the FFS code searches for > superblocks at specific offsets, namely (in order): 64k, 8k, 0, 256k. 
> I also had thoughts about patching it to remove the superblock backup, so
> that fs_sblkno could be 0 instead of 144 or 32. Because of its structure,
> at least 16k (8k bootstrap plus 8k initial superblock) is unused for every
> cylinder group in UFS1 (at least 72k for UFS2).
>
> There isn't much to be gained in such a patch except for very small
> filesystems such as in your case. When you're dealing with 512k, that
> extra 16k (or more) is starting to look significant (3%).
>
> HTH,
>
> -- Rick C. Petty

From pjd at FreeBSD.org Wed Aug 12 05:28:27 2009
From: pjd at FreeBSD.org (Pawel Jakub Dawidek)
Date: Wed Aug 12 05:28:33 2009
Subject: filesystem size after newfs
In-Reply-To:
References: <20090811191837.GB66530@keira.kiwi-computer.com>
Message-ID: <20090812052814.GD1600@garage.freebsd.pl>

On Wed, Aug 12, 2009 at 12:16:09AM +0000, Naeem Afzal wrote:
>
> Thanks you so much that was good explanation.
> For some reason using UFS1 or UFS2 did not make any size difference after I changed the block size and fragment size to minimum (4K/512).
> One more thing, I tried to make the same partition as geli with HMAC/SHA256 authentication and it eats up even more space. Without this authentication, usage is pretty close to without geli.
> #geli init -a HMAC/SHA256 -P -K da2-64bytes.key /dev/ad1d
> # newfs -O 1 -U -l -m 0 -n -o space -f 512 -b 4096 -i 1048576 /dev/ad1d.eli
> /dev/ad1d.eli: 0.2MB (511 sectors) block size 4096, fragment size 512 using 1 cylinder groups of 0.25MB, 63 blks, 32 inodes. with soft updates
> super-block backups (for fsck -b #) at: 32
> #df -H /test
> Filesystem 1K-blocks Used Avail Capacity Mounted on
> /dev/ad1d.eli 228k 512B 228k 0% /test
>
> ada1.eli should have been 0.5MB, but seems like it is reserving some area for geli?
How much is needed for HMAC/SHA256? What is allocation scheme if we go for this SHA256 authentication?
> regards
> naeem

To ensure atomicity of operations, geli stores hashes in the same
sector as the data. Creating geli providers with block size of 512 bytes
is very inefficient. It will consume two sectors for each sector, which
looks like this:

	1	512b of data  ----->	480b for data + 32b for hash
	2				32b for data + 32b for hash + 448b unused

The most optimal block size for geli provider is 4kB, it consumes one
extra sector for every 8 sectors:

	1	512b of data  ----->	480b for data + 32b for hash
	2	512b of data		480b for data + 32b for hash
	3	512b of data		480b for data + 32b for hash
	4	512b of data		480b for data + 32b for hash
	5	512b of data		480b for data + 32b for hash
	6	512b of data		480b for data + 32b for hash
	7	512b of data		480b for data + 32b for hash
	8	512b of data		480b for data + 32b for hash
	9				256b for data + 32b for hash + 224b unused

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil? Yes, I Am!
-------------- next part --------------
A non-text attachment was scrubbed...
Name: not available
Type: application/pgp-signature
Size: 187 bytes
Desc: not available
Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090812/0628d135/attachment.pgp

From pjd at FreeBSD.org Wed Aug 12 05:38:10 2009
From: pjd at FreeBSD.org (pjd@FreeBSD.org)
Date: Wed Aug 12 05:38:16 2009
Subject: kern/137037: [zfs] [hang] zfs rollback on root causes FreeBSD to freeze in few seconds
Message-ID: <200908120538.n7C5cAUF046203@freefall.freebsd.org>

Synopsis: [zfs] [hang] zfs rollback on root causes FreeBSD to freeze in few seconds

State-Changed-From-To: open->feedback
State-Changed-By: pjd
State-Changed-When: Wed 12 Aug 2009 05:33:06 UTC
State-Changed-Why:
Note that rollback of root file system is impossible to do on-line anyway.
Rollback will modify data on your root file system and because of that file system being rolled back has to be unmounted and mounted with new data. You shouldn't be able to unmount root file system anyway. I'd like to have offline rollback, eg. you mark root file system for rollback and reboot, but I'm not sure it's possible right now. What I'd suggest instead is the following. Let's say your root file system is tank/root and you have a snapshot to which you're willing to rollback tank/root@snap. You could clone the snapshot and create tank/root2, edit your /etc/fstab so that tank/root2 will be your root file system and reboot. Once you reboot and verify everything is fine you could promote your clone, remove tank/root and rename tank/root2 to tank/root. http://www.freebsd.org/cgi/query-pr.cgi?pr=137037 From p.christias at noc.ntua.gr Wed Aug 12 12:47:26 2009 From: p.christias at noc.ntua.gr (Panagiotis Christias) Date: Wed Aug 12 12:47:33 2009 Subject: UFS Filesystem issues, and the loss of my hair... In-Reply-To: <8E9591D8BCB72D4C8DE0884D9A2932DC6D2EDF21@ITS-HCWNEM03.ds.Vanderbilt.edu> References: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34C3@ITS-HCWNEM03.ds.Vanderbilt.edu> <200908101605.12332.jhb@freebsd.org> <200908101707.49526.jhb@freebsd.org> <8E9591D8BCB72D4C8DE0884D9A2932DC6D2EDF21@ITS-HCWNEM03.ds.Vanderbilt.edu> Message-ID: <20090812124721.GA71441@noc.ntua.gr> On Mon, Aug 10, 2009 at 05:20:44PM -0500, Hearn, Trevor wrote: > Yes, it does seem like it was part of one of the other messages. The isp(4) > driver was just recently updated in HEAD by mjacob@ who has maintained that > driver in the past. He may have some insight if there is an isp(4)-specific > problem. > > -- > John Baldwin > > Heh. Ok, I just watched the same error message scroll across the screen > for about 5 minutes now, with a different offset, same length. The fun > part is that it is not touching the device, /dev/da1p7 at all. 
From the > systat -vmstat display, I see all of the traffic coming from the > /dev/mfid0 drives. It ran for a while, then stopped. So, no access to > the drive in question, da1p7, but on the root drive, mfid0. Odd. The > partition is mapped to the root drive. I wonder if the driver lost > itself, and it tried to access the file on the empty folder on the root > drive. Sigh. Anyone? Hello, we faced a similar problem here (major greek university) about a year ago [1]. Our setup consists of Dell 2950 servers, QLogic 2462 HBAs (PCI-E) and an EMC CLARiiON CX3-40. As soon as we tried to do a simple "tar zxf ports.tgz" on a SAN volume the system would freeze or/and panic (same error messages as yours). Oleg Sharoiko suggested that we could decrease the number of tag openings (tag queue depth). Decreasing it would make the system a bit more stable but did not eliminate the problem. Then, I contacted Matthew Jacob and tested his latest isp code [2] along with alternative solutions like zfs and gjournal. Matthew was kind enough to offer his support but eventually I ran out time and patience, so I moved a couple of servers to centos in order to put the storage into production. That was around December last year. About a month ago Kenneth Merry announced that a new version of isp was available [3] which corrected bugs and added new functionality. I thought it was worth trying so I set up FreeBSD 7-stable in two Dell boxes, added the isp patches, recompiled the kernel and started the stress tests. I also looked around for more info and hints regarding qlogic hbas. The Linux driver (ql2xxx) has a 32 max queue depth by default (see ql2xmaxqdepth) which is also the recommended value by EMC. There are also similar references for Solaris (see sd:sd_max_throttle). Some mention even smaller values depending the storage. Currently, I am running stress tests, using fsx, ffsb, postmark, iozone, bonnie++, blogbench and other home-made scripts (any other suggestion?) 
on two 7-stable-amd64 + isp_diffs.releng7.20090629 boxes. So far, at 32 maximum tag openings, everything looks good, I have not seen any panics and the following fsck run cleanly. I will keep running more tests for a week or two hoping that they will help draw a conclusion. Regards, Panagiotis ps. cc'ed to Kenneth Merry, I think he would be interested. [1] http://lists.freebsd.org/pipermail/freebsd-scsi/2008-October/003686.html [2] http://feral.com/isp.html [3] http://lists.freebsd.org/pipermail/freebsd-scsi/2009-June/003916.html -- Panagiotis J. Christias Network Management Center P.Christias@noc.ntua.gr National Technical Univ. of Athens, GREECE From james-freebsd-fs2 at jrv.org Wed Aug 12 15:13:51 2009 From: james-freebsd-fs2 at jrv.org (James R. Van Artsdalen) Date: Wed Aug 12 15:13:57 2009 Subject: ZFS sx_xlock panic w/zfs_vnops.c.2.patch Message-ID: <4A82DC2D.6070400@jrv.org> FreeBSD bigback.housenet.jrv 8.0-BETA2 FreeBSD 8.0-BETA2 #1 r195757M: Wed Jul 29 13:44:06 CDT 2009 james@bigback.housenet.jrv:/usr/obj/usr/src/sys/BIGTEX amd64 with mav's siis.20090718.patch driver, with my libzfs_sendrecv.c path, with zfs_vnops.c.2.patch panic during reboot(8) ... zfs_umount:971[0]: Force unmount is experimental - report any problems. zfs_umount:971[0]: Force unmount is experimental - report any problems. 
panic: sx_xlock() of destroyed sx @ /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4361
cpuid = 0
KDB: enter: panic
Physical memory: 9422 MB
Dumping 1862 MB: 1847 1831 1815 1799 1783 1767 1751 1735 1719 1703 1687 1671 1655 1639 1623 1607 1591 1575 1559 1543 1527 1511 1495 1479 1463 1447 1431 1415 1399 1383 1367 1351 1335 1319 1303 1287 1271 1255 1239 1223 1207 1191 1175 1159 1143 1127 1111 1095 1079 1063 1047 1031 1015 999 983 967 951 935 919 903 887 871 855 839 823 807 791 775 759 743 727 711 695 679 663 647 631 615 599 583 567 551 535 519 503 487 471 455 439 423 407 391 375 359 343 327 311 295 279 263 247 231 215 199 183 167 151 135 119 103 87 71 55 39 23 7

Reading symbols from /boot/kernel/zfs.ko...Reading symbols from /boot/kernel/zfs.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/zfs.ko
Reading symbols from /boot/kernel/opensolaris.ko...Reading symbols from /boot/kernel/opensolaris.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/opensolaris.ko
Reading symbols from /boot/kernel/geom_mirror.ko...Reading symbols from /boot/kernel/geom_mirror.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/geom_mirror.ko
Reading symbols from /boot/kernel/siis.ko...Reading symbols from /boot/kernel/siis.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/siis.ko
Reading symbols from /boot/kernel/ahci.ko...Reading symbols from /boot/kernel/ahci.ko.symbols...done.
done.
Loaded symbols for /boot/kernel/ahci.ko
#0  doadump () at pcpu.h:223
223	pcpu.h: No such file or directory.
	in pcpu.h
(kgdb) bt
#0  doadump () at pcpu.h:223
#1  0xffffffff801dfdec in db_fncall (dummy1=Variable "dummy1" is not available.
) at /usr/src/sys/ddb/db_command.c:548
#2  0xffffffff801e0121 in db_command (last_cmdp=0xffffffff80bbd9e0, cmd_table=Variable "cmd_table" is not available.
) at /usr/src/sys/ddb/db_command.c:445
#3  0xffffffff801e0370 in db_command_loop () at /usr/src/sys/ddb/db_command.c:498
#4  0xffffffff801e2349 in db_trap (type=Variable "type" is not available.
) at /usr/src/sys/ddb/db_main.c:229
#5  0xffffffff805bab85 in kdb_trap (type=3, code=0, tf=0xffffff810f0cf5e0) at /usr/src/sys/kern/subr_kdb.c:534
#6  0xffffffff8083daf1 in trap (frame=0xffffff810f0cf5e0) at /usr/src/sys/amd64/amd64/trap.c:613
#7  0xffffffff80823883 in calltrap () at /usr/src/sys/amd64/amd64/exception.S:224
#8  0xffffffff805bad5d in kdb_enter (why=0xffffffff80936f79 "panic", msg=0xa
) at cpufunc.h:63
#9  0xffffffff8058b74b in panic (fmt=Variable "fmt" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:558
#10 0xffffffff80592f0c in _sx_xlock (sx=dwarf2_read_address: Corrupted DWARF expression.
) at /usr/src/sys/kern/kern_sx.c:285
#11 0xffffffff8105fc56 in zfs_freebsd_reclaim (ap=Variable "ap" is not available.
) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vnops.c:4361
#12 0xffffffff8061ae05 in vgonel (vp=0xffffff00503fa1d8) at vnode_if.h:830
#13 0xffffffff8061e975 in vflush (mp=0xffffff00503ebbc0, rootrefs=0, flags=0, td=0xffffff0050248000) at /usr/src/sys/kern/vfs_subr.c:2449
#14 0xffffffff8105a598 in zfs_umount (vfsp=0xffffff00503ebbc0, fflag=524288) at /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/zfs_vfsops.c:996
#15 0xffffffff80616336 in dounmount (mp=0xffffff00503ebbc0, flags=524288, td=Variable "td" is not available.
) at /usr/src/sys/kern/vfs_mount.c:1289
#16 0xffffffff8061be54 in vfs_unmountall () at /usr/src/sys/kern/vfs_subr.c:3141
#17 0xffffffff8058b58f in boot (howto=0) at /usr/src/sys/kern/kern_shutdown.c:401
#18 0xffffffff8058b8b8 in reboot (td=Variable "td" is not available.
) at /usr/src/sys/kern/kern_shutdown.c:173
#19 0xffffffff8083d4af in syscall (frame=0xffffff810f0cfc80) at /usr/src/sys/amd64/amd64/trap.c:984
#20 0xffffffff80823b61 in Xfast_syscall () at /usr/src/sys/amd64/amd64/exception.S:373
#21 0x000000080078f96c in ?? ()
Previous frame inner to this frame (corrupt stack?)
(kgdb)

From gprspb at mail.ru Wed Aug 12 18:10:09 2009
From: gprspb at mail.ru (gprspb@mail.ru)
Date: Wed Aug 12 18:10:16 2009
Subject: kern/122038: [tmpfs] [panic] tmpfs: panic: tmpfs_alloc_vp: type 0xc7d2fab0 0
Message-ID: <200908121810.n7CIA8Qv058688@freefall.freebsd.org>

The following reply was made to PR kern/122038; it has been noted by GNATS.
From: gprspb@mail.ru To: bug-followup@FreeBSD.org, delphij@FreeBSD.org Cc: Subject: Re: kern/122038: [tmpfs] [panic] tmpfs: panic: tmpfs_alloc_vp: type 0xc7d2fab0 0 Date: Wed, 12 Aug 2009 21:47:01 +0400 This is 100% reproducible on my system with the following commands: # mkdir -p /tmp/1/2 # cd /tmp/1/2 # rm -rf /tmp/1 ; cd .. Panic String: tmpfs_alloc_vp: type 0xffffff00268b80e0 0 tmpfs on /tmp (tmpfs, local) FreeBSD gpr.nnz-home.ru 8.0-BETA2 FreeBSD 8.0-BETA2 #0 r196086M: Sat Aug 8 23:53:43 MSD 2009 gpr@gpr.nnz-home.ru:/usr/obj/usr/src/freebsd-head/sys/GPR amd64 I can make dump or submit additional info if it is necessary. From nafzal at hotmail.com Wed Aug 12 21:19:57 2009 From: nafzal at hotmail.com (Naeem Afzal) Date: Wed Aug 12 21:20:05 2009 Subject: filesystem size after newfs In-Reply-To: <20090812052814.GD1600@garage.freebsd.pl> References: <20090811191837.GB66530@keira.kiwi-computer.com> <20090812052814.GD1600@garage.freebsd.pl> Message-ID: Block size is set to 4K, did you mean fragment size should be 4K too? Are you saying that for each block of 512b there is 32b for space consumed for hash? Is this only when using authentication algorithm? regardsnaeem > > To ensure atomicity of operations, geli stores hashes in the same > sector as the data. Creating geli providers with block size of 512 bytes > is very inefficient. 
It will consume two sectors for each sector, which
> looks like this:
>
> 	1	512b of data  ----->	480b for data + 32b for hash
> 	2				32b for data + 32b for hash + 448b unused
>
> The most optimal block size for geli provider is 4kB, it consumes one
> extra sector for every 8 sectors:
>
> 	1	512b of data  ----->	480b for data + 32b for hash
> 	2	512b of data		480b for data + 32b for hash
> 	3	512b of data		480b for data + 32b for hash
> 	4	512b of data		480b for data + 32b for hash
> 	5	512b of data		480b for data + 32b for hash
> 	6	512b of data		480b for data + 32b for hash
> 	7	512b of data		480b for data + 32b for hash
> 	8	512b of data		480b for data + 32b for hash
> 	9				256b for data + 32b for hash + 224b unused
>
> --
> Pawel Jakub Dawidek                       http://www.wheel.pl
> pjd@FreeBSD.org                           http://www.FreeBSD.org
> FreeBSD committer                         Am I Evil? Yes, I Am!

From pjd at FreeBSD.org Wed Aug 12 22:18:29 2009
From: pjd at FreeBSD.org (Pawel Jakub Dawidek)
Date: Wed Aug 12 22:18:36 2009
Subject: filesystem size after newfs
In-Reply-To:
References: <20090811191837.GB66530@keira.kiwi-computer.com> <20090812052814.GD1600@garage.freebsd.pl>
Message-ID: <20090812221820.GA1463@garage.freebsd.pl>

On Wed, Aug 12, 2009 at 09:15:05PM +0000, Naeem Afzal wrote:
>
>
> Block size is set to 4K, did you mean fragment size should be 4K too? Are you saying that for each block of 512b there is 32b for space consumed for hash? Is this only when using authentication algorithm?

The block size I was talking about is the one given by -s option to
'geli init'. If authentication is turned off there is no space overhead
except for 512 bytes for metadata.

-- 
Pawel Jakub Dawidek                       http://www.wheel.pl
pjd@FreeBSD.org                           http://www.FreeBSD.org
FreeBSD committer                         Am I Evil?
Yes, I Am! -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090812/874c22bd/attachment.pgp From ken at kdm.org Thu Aug 13 15:23:10 2009 From: ken at kdm.org (Kenneth D. Merry) Date: Thu Aug 13 15:23:17 2009 Subject: UFS Filesystem issues, and the loss of my hair... In-Reply-To: <20090812124721.GA71441@noc.ntua.gr> References: <8E9591D8BCB72D4C8DE0884D9A2932DC35BD34C3@ITS-HCWNEM03.ds.Vanderbilt.edu> <200908101605.12332.jhb@freebsd.org> <200908101707.49526.jhb@freebsd.org> <8E9591D8BCB72D4C8DE0884D9A2932DC6D2EDF21@ITS-HCWNEM03.ds.Vanderbilt.edu> <20090812124721.GA71441@noc.ntua.gr> Message-ID: <20090813145008.GA39384@nargothrond.kdm.org> On Wed, Aug 12, 2009 at 15:47:21 +0300, Panagiotis Christias wrote: > On Mon, Aug 10, 2009 at 05:20:44PM -0500, Hearn, Trevor wrote: > > Yes, it does seem like it was part of one of the other messages. The isp(4) > > driver was just recently updated in HEAD by mjacob@ who has maintained that > > driver in the past. He may have some insight if there is an isp(4)-specific > > problem. > > > > -- > > John Baldwin > > > > Heh. Ok, I just watched the same error message scroll across the screen > > for about 5 minutes now, with a different offset, same length. The fun > > part is that it is not touching the device, /dev/da1p7 at all. From the > > systat -vmstat display, I see all of the traffic coming from the > > /dev/mfid0 drives. It ran for a while, then stopped. So, no access to > > the drive in question, da1p7, but on the root drive, mfid0. Odd. The > > partition is mapped to the root drive. I wonder if the driver lost > > itself, and it tried to access the file on the empty folder on the root > > drive. Sigh. Anyone? > > Hello, > > we faced a similar problem here (major greek university) about a year ago > [1]. 
Our setup consists of Dell 2950 servers, QLogic 2462 HBAs (PCI-E) > and an EMC CLARiiON CX3-40. As soon as we tried to do a simple "tar zxf > ports.tgz" on a SAN volume the system would freeze or/and panic (same error > messages as yours). Oleg Sharoiko suggested that we could decrease the > number of tag openings (tag queue depth). Decreasing it would make the > system a bit more stable but did not eliminate the problem. > > Then, I contacted Matthew Jacob and tested his latest isp code [2] along > with alternative solutions like zfs and gjournal. Matthew was kind enough > to offer his support but eventually I ran out time and patience, so I moved > a couple of servers to centos in order to put the storage into production. > That was around December last year. > > About a month ago Kenneth Merry announced that a new version of isp was > available [3] which corrected bugs and added new functionality. I thought > it was worth trying so I set up FreeBSD 7-stable in two Dell boxes, added > the isp patches, recompiled the kernel and started the stress tests. I > also looked around for more info and hints regarding qlogic hbas. The > Linux driver (ql2xxx) has a 32 max queue depth by default (see > ql2xmaxqdepth) which is also the recommended value by EMC. There are also > similar references for Solaris (see sd:sd_max_throttle). Some mention > even smaller values depending the storage. > > Currently, I am running stress tests, using fsx, ffsb, postmark, iozone, > bonnie++, blogbench and other home-made scripts (any other suggestion?) on > two 7-stable-amd64 + isp_diffs.releng7.20090629 boxes. So far, at 32 maximum > tag openings, everything looks good, I have not seen any panics and the > following fsck run cleanly. I will keep running more tests for a week or two > hoping that they will help draw a conclusion. Thanks for the report! I'm glad to hear it is working for you. The driver has gone into -current, and will be in 8.0-RELEASE. 
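For anyone wanting to repeat the 32-opening test from the quoted report, the per-device tag limit can be set with camcontrol(8). The lines below are a minimal sketch, assuming the SAN volume appears as da1 (the device name used earlier in this thread); the script only prints the command, and running it for real needs root on the FreeBSD host.

```shell
#!/bin/sh
# Sketch only: cap the number of tagged openings for one CAM device at 32,
# the value reported as stable in the quoted message.
# "da1" is an assumption taken from earlier in the thread; adjust as needed.
dev=da1
tags=32

# camcontrol tags <device> -N <count> sets the limit; -v prints the result.
cmd="camcontrol tags $dev -N $tags -v"

echo "$cmd"      # dry run: show what would be executed
# eval "$cmd"    # uncomment on the FreeBSD host (requires root)
```

The setting does not survive a reboot, so it would have to be reapplied from a startup script if it turns out to help.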
Hopefully it'll get propagated back into RELENG_7 before too long. Ken -- Kenneth Merry ken@kdm.org From stb at lassitu.de Fri Aug 14 15:13:00 2009 From: stb at lassitu.de (Stefan Bethke) Date: Fri Aug 14 15:13:07 2009 Subject: XtreemFS: new distributed FS Message-ID: <5BE7DB15-41CB-4F68-B1B0-72DA077453B7@lassitu.de> http://www.xtreemfs.org/ They have some bold claims, and no port to FreeBSD yet, but they're using Fuse, so it shouldn't be too hard... Stefan -- Stefan Bethke Fon +49 151 14070811 From gosand1982 at yahoo.com Sat Aug 15 19:31:42 2009 From: gosand1982 at yahoo.com (George Sanders) Date: Sat Aug 15 19:31:48 2009 Subject: cannot use 2TB external USB drive ... Message-ID: <453373.52906.qm@web111616.mail.gq1.yahoo.com> I bought a western digital 2TB USB external drive - shows up in dmesg as: da2 at umass-sim0 bus 0 target 0 lun 0 da2: Fixed Direct Access SCSI-4 device da2: 40.000MB/s transfers da2: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) Originally I tried to simply use it right off as a FAT32 device. However, this crashed my system after generating thousands of: g_vfs_done():da0s4[READ(offset=0, length=2048)]error = 5 g_vfs_done():da0s4[READ(offset=32768, length=2048)]error = 5 So I wiped the disk and recreated it in sysinstall using sysid 6 (msdos) and newfs_msdos ... my thought was that maybe western digital had some weird layouts or boot partitions, etc., and maybe I just needed to start with a clean slate. I got the same result. 
So finally, I gave up and since the end user of this system CANNOT use ufs2 (which is what I would prefer anyway) I installed the ext2 tools and made an ext2 volume: # mke2fs /dev/da2s1 mke2fs 1.41.8 (11-Jul-2009) Filesystem label= OS type: FreeBSD Block size=4096 (log=2) Fragment size=4096 (log=2) 122101760 inodes, 488378000 blocks 24418900 blocks (5.00%) reserved for the super user First data block=0 Maximum filesystem blocks=0 14905 block groups 32768 blocks per group, 32768 fragments per group 8192 inodes per group Superblock backups stored on blocks: 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, 4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, 102400000, 214990848 Writing inode tables: This mke2fs operation completed successfully and I mounted the drive and began using it. I filled it up to just about the 1 TB level, and the system crashes. No errors, no output, nothing - just freezes up requiring a reset button. Note that my 4k block size in the mke2fs above does NOT imply a 1 TB filesystem size limit for ext2. So what am I doing wrong ? Again, I would love to just newfs this to ufs2, but the end user cannot use that - fat32 and ext2 are my only options... Is this drive just too big for freebsd to handle over USB ? From zbeeble at gmail.com Sat Aug 15 22:51:04 2009 From: zbeeble at gmail.com (Zaphod Beeblebrox) Date: Sat Aug 15 22:51:10 2009 Subject: cannot use 2TB external USB drive ... In-Reply-To: <453373.52906.qm@web111616.mail.gq1.yahoo.com> References: <453373.52906.qm@web111616.mail.gq1.yahoo.com> Message-ID: <5f67a8c40908151521g2f152e56ge9928573dd784a88@mail.gmail.com> On Sat, Aug 15, 2009 at 3:19 PM, George Sanders wrote: > > > I bought a western digital 2TB USB external drive - shows up in dmesg as: > > da2: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) > > Originally I tried to simply use it right off as a FAT32 device. 
However, > this crashed my system after generating thousands of: > [...] > So finally, I gave up and since the end user of this system CANNOT use ufs2 > (which is what I would prefer anyway) I installed the ext2 tools and made an > ext2 volume: Urm... I'm pretty sure that neither ext2 nor FAT32 can make a 2T filesystem. It's a limitation of the format. NTFS might work fine (use the fuse version) or ext3 might work (I'm fuzzy on that point). The other obvious choice would be to make partitions less than 1T. From jensrasmus at gmail.com Sun Aug 16 17:32:19 2009 From: jensrasmus at gmail.com (Jens Rasmus Liland) Date: Sun Aug 16 17:32:25 2009 Subject: Fwd: How do I mount an external ntfs formatted harddisk manually and through /etc/fstab? In-Reply-To: References: <63e02e980907310725t2b38d1d3iff66aca3948ac8dd@mail.gmail.com> <63e02e980908020954r65b6b4b5n8288f0f5e3b14568@mail.gmail.com> Message-ID: <63e02e980908161032y60c4c966v2918b34c83397fee@mail.gmail.com> Hi, Sorry for the late reply - I went on vacation for a while. I think 'mount_ntfs-3g' did the trick in terms of mounting /dev/da0s1 manually. But I tried to add /dev/da0s1 /homewd ntfs-3g ro 0 0 ... but then the computer panicked, and went into single user mode. I think it happened because the ntfs-3g module is loaded later with the fusefs-stuff. How to get around this one? On Mon, Aug 3, 2009 at 3:19 PM, CmdLnKid wrote: > On Sun, 2 Aug 2009 12:54 -0000, jensrasmus wrote: > > I'm forwarding this to -stable list, since i appears to get no response on >> -fs. >> >> ---------- Forwarded message ---------- >> From: Jens Rasmus Liland >> Date: Fri, Jul 31, 2009 at 4:25 PM >> Subject: How do I mount an external ntfs formatted harddisk manually and >> through /etc/fstab? >> To: freebsd-fs@freebsd.org >> >> >> Hi, >> >> How do I mount an NTFS formatted external harddisk plugged into the >> computer >> using a usb cable? And what do i write in the /etc/fstab after being able >> to >> successfully mount it manually? 
>> >> I have some blurry understanding after reading a bit in handbook that the >> harddisk's NTFS partition is at /dev/da0s1 by default. I have installed >> ntfs-3g from ports. >> >> /Rasmus >> _______________________________________________ >> freebsd-stable@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-stable >> To unsubscribe, send any mail to "freebsd-stable-unsubscribe@freebsd.org" >> >> >> > Try mount_ntfs-3g /dev/da0s1 /path/to/mountpoint > > Manuals and other such documentation serve as a pretty good medium. > > -- > > - (2^(N-1)) > From cliftonr at lava.net Sun Aug 16 19:25:23 2009 From: cliftonr at lava.net (Clifton Royston) Date: Sun Aug 16 19:25:41 2009 Subject: Fwd: How do I mount an external ntfs formatted harddisk manually and through /etc/fstab? In-Reply-To: <63e02e980908161032y60c4c966v2918b34c83397fee@mail.gmail.com> References: <63e02e980907310725t2b38d1d3iff66aca3948ac8dd@mail.gmail.com> <63e02e980908020954r65b6b4b5n8288f0f5e3b14568@mail.gmail.com> <63e02e980908161032y60c4c966v2918b34c83397fee@mail.gmail.com> Message-ID: <20090816192521.GA4926@lava.net> On Sun, Aug 16, 2009 at 07:32:17PM +0200, Jens Rasmus Liland wrote: > Hi, > > Sorry for the late reply - I went on vacation for a while. > > I think 'mount_ntfs-3g' did the trick in terms of mounting /dev/da0s1 > manually. But I tried to add > > /dev/da0s1 /homewd ntfs-3g ro 0 0 > > ... but then the computer panicked, and went into single user mode. I think > it happened because the ntfs-3g module is loaded later with the > fusefs-stuff. How to get around this one? Solution 1: Try changing "ro" to "ro,noauto" and mount it later. Solution 2: Instead, try adding "late" to the options, so that it will not be mounted until later in the boot process, after /usr and other normal filesystems are mounted. 
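A minimal sketch of how the two fstab variants above could be written, assuming the device and mount point from the original post (/dev/da0s1 on /homewd). Whether mount(8) accepts "ntfs-3g" as a filesystem type is not confirmed in this thread, so treat the type field as an assumption. The script only prints candidate lines and does not touch /etc/fstab.

```shell
#!/bin/sh
# Sketch only: print candidate /etc/fstab lines for the two suggestions.
# Device, mount point and the "ntfs-3g" type string come from the original
# post; the type string is an assumption and may need to be changed.
dev=/dev/da0s1
mnt=/homewd
fstype=ntfs-3g

fstab_line() {
    # $1 = option list; prints one tab-separated fstab entry
    printf '%s\t%s\t%s\t%s\t0\t0\n' "$dev" "$mnt" "$fstype" "$1"
}

# Solution 1: skip the entry at boot, then run "mount /homewd" by hand later.
solution1=$(fstab_line "ro,noauto")
# Solution 2: let the boot scripts mount it late, after the usual filesystems.
solution2=$(fstab_line "ro,late")

printf '%s\n%s\n' "$solution1" "$solution2"
```

Only one of the two lines should end up in /etc/fstab; with the noauto variant the disk stays unmounted until someone mounts it explicitly.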
As you say the filesystem type was added from ports, your problem might
be that the OS is trying to mount it before /usr is mounted, and the fs
module is not available, although that shouldn't generate a panic.

My 2 cents,
  -- Clifton

-- 
Clifton Royston -- cliftonr@iandicomputing.com / cliftonr@lava.net
President - I and I Computing * http://www.iandicomputing.com/
Custom programming, network design, systems and network consulting services

From pluknet at gmail.com Sun Aug 16 20:04:52 2009
From: pluknet at gmail.com (pluknet)
Date: Sun Aug 16 20:04:58 2009
Subject: Fwd: How do I mount an external ntfs formatted harddisk manually and through /etc/fstab?
In-Reply-To: <63e02e980908161032y60c4c966v2918b34c83397fee@mail.gmail.com>
References: <63e02e980907310725t2b38d1d3iff66aca3948ac8dd@mail.gmail.com> <63e02e980908020954r65b6b4b5n8288f0f5e3b14568@mail.gmail.com> <63e02e980908161032y60c4c966v2918b34c83397fee@mail.gmail.com>
Message-ID:

2009/8/16 Jens Rasmus Liland :
> Hi,
>
> Sorry for the late reply - I went on vacation for a while.
>
> I think 'mount_ntfs-3g' did the trick in terms of mounting /dev/da0s1
> manually. But I tried to add
>
> /dev/da0s1              /homewd         ntfs-3g    ro              0       0
>

Since 7.2 a new parameter, -o mountprog, has been available, so you should
be able to have fstab do the mount with a third-party program like this:

/dev/acd0 /mnt ntfs ro,noauto,mountprog=/usr/local/bin/ntfs-3g 0 0

> ... but then the computer panicked, and went into single user mode. I think
> it happened because the ntfs-3g module is loaded later with the
> fusefs-stuff.

Or due to the wrong/unsupported syntax.

-- 
wbr,
pluknet

From sarawgi.aditya at gmail.com Sun Aug 16 20:28:22 2009
From: sarawgi.aditya at gmail.com (Aditya Sarawgi)
Date: Sun Aug 16 20:28:28 2009
Subject: cannot use 2TB external USB drive ...
In-Reply-To: <453373.52906.qm@web111616.mail.gq1.yahoo.com> References: <453373.52906.qm@web111616.mail.gq1.yahoo.com> Message-ID: <20090816142546.GA1350@aditya> On Sat, Aug 15, 2009 at 12:19:11PM -0700, George Sanders wrote: > > > I bought a western digital 2TB USB external drive - shows up in dmesg as: > > da2 at umass-sim0 bus 0 target 0 lun 0 > da2: Fixed Direct Access SCSI-4 device > da2: 40.000MB/s transfers > da2: 1907729MB (3907029168 512 byte sectors: 255H 63S/T 243201C) > > Originally I tried to simply use it right off as a FAT32 device. However, this crashed my system after generating thousands of: > > g_vfs_done():da0s4[READ(offset=0, length=2048)]error = 5 > g_vfs_done():da0s4[READ(offset=32768, length=2048)]error = 5 > > So I wiped the disk and recreated it in sysinstall using sysid 6 (msdos) and newfs_msdos ... my thought was that maybe western digital had some weird layouts or boot partitions, etc., and maybe I just needed to start with a clean slate. > > I got the same result. > > So finally, I gave up and since the end user of this system CANNOT use ufs2 (which is what I would prefer anyway) I installed the ext2 tools and made an ext2 volume: > > # mke2fs /dev/da2s1 > mke2fs 1.41.8 (11-Jul-2009) > Filesystem label= > OS type: FreeBSD > Block size=4096 (log=2) > Fragment size=4096 (log=2) > 122101760 inodes, 488378000 blocks > 24418900 blocks (5.00%) reserved for the super user > First data block=0 > Maximum filesystem blocks=0 > 14905 block groups > 32768 blocks per group, 32768 fragments per group > 8192 inodes per group > Superblock backups stored on blocks: > 32768, 98304, 163840, 229376, 294912, 819200, 884736, 1605632, 2654208, > 4096000, 7962624, 11239424, 20480000, 23887872, 71663616, 78675968, > 102400000, 214990848 > > Writing inode tables: > > This mke2fs operation completed successfully and I mounted the drive and began using it. > > I filled it up to just about the 1 TB level, and the system crashes. 
No errors, no output, nothing - just freezes up requiring a reset button. Note that my 4k block size in the mke2fs above does NOT imply a 1 TB filesystem size limit for ext2. > > > > So what am I doing wrong ? > > > Again, I would love to just newfs this to ufs2, but the end user cannot use that - fat32 and ext2 are my only options... > > > Is this drive just too big for freebsd to handle over USB ? > With that block size you can create a partition of about 16 TiB though I have never tried creating one. Can you provide a crash dump ? This might help http://www.freebsd.org/doc/en/books/developers-handbook/kerneldebug.html Also feel free to file a pr about this. I don't have the necessary resources to confirm this. Can anybody reproduce this and provide a dump, please ? -- Aditya Sarawgi From gavin.atkinson at ury.york.ac.uk Mon Aug 17 09:21:10 2009 From: gavin.atkinson at ury.york.ac.uk (Gavin Atkinson) Date: Mon Aug 17 09:21:17 2009 Subject: Fwd: How do I mount an external ntfs formatted harddisk manually and through /etc/fstab? In-Reply-To: <63e02e980908161032y60c4c966v2918b34c83397fee@mail.gmail.com> References: <63e02e980907310725t2b38d1d3iff66aca3948ac8dd@mail.gmail.com> <63e02e980908020954r65b6b4b5n8288f0f5e3b14568@mail.gmail.com> <63e02e980908161032y60c4c966v2918b34c83397fee@mail.gmail.com> Message-ID: <1250499010.32945.1.camel@buffy.york.ac.uk> On Sun, 2009-08-16 at 19:32 +0200, Jens Rasmus Liland wrote: > Hi, > > Sorry for the late reply - I went on vacation for a while. > > I think 'mount_ntfs-3g' did the trick in terms of mounting /dev/da0s1 > manually. But I tried to add > > /dev/da0s1 /homewd ntfs-3g ro 0 0 > > ... but then the computer panicked, and went into single user mode. I think > it happened because the ntfs-3g module is loaded later with the > fusefs-stuff. How to get around this one? Make sure you have recompiled fusefs, ntfs-3g and any other ports that they rely on since you last upgraded your kernel/world. 
The fusefs kernel module seems to be quite sensitive to changes in the
kernel itself.

Gavin

From bugmaster at FreeBSD.org Mon Aug 17 11:06:54 2009
From: bugmaster at FreeBSD.org (FreeBSD bugmaster)
Date: Mon Aug 17 11:07:57 2009
Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org
Message-ID: <200908171106.n7HB6r1C075770@freefall.freebsd.org>

Note: to view an individual PR, use:
  http://www.freebsd.org/cgi/query-pr.cgi?pr=(number).

The following is a listing of current problems submitted by FreeBSD users.
These represent problem reports covering all versions including
experimental development code and obsolete releases.


S Tracker      Resp.      Description
--------------------------------------------------------------------------------
f kern/137037  fs         [zfs] [hang] zfs rollback on root causes FreeBSD to fr
o kern/136968  fs         [ufs] [lor] ufs/bufwait/ufs (open)
o kern/136945  fs         [ufs] [lor] filedesc structure/ufs (poll)
o kern/136944  fs         [ffs] [lor] bufwait/snaplk (fsync)
o kern/136942  fs         [zfs] zvol resize not reflected until reboot
o kern/136873  fs         [ntfs] Missing directories/files on NTFS volume
o kern/136865  fs         [nfs] [patch] NFS exports atomic and on-the-fly atomic
o kern/136470  fs         [nfs] Cannot mount / in read-only, over NFS
o kern/136218  fs         [zfs] Exported ZFS pools can't be imported into (Open)
o kern/135594  fs         [zfs] Single dataset unresponsive with Samba
o kern/135546  fs         [zfs] zfs.ko module doesn't ignore zpool.cache filenam
o kern/135480  fs         [zfs] panic: lock &arg.lock already initialized
o kern/135469  fs         [ufs] [panic] kernel crash on md operation in ufs_dirb
o bin/135314   fs         [zfs] assertion failed for zdb(8) usage
o kern/135050  fs         [zfs] ZFS clears/hides disk errors on reboot
f kern/134496  fs         [zfs] [panic] ZFS pool export occasionally causes a ke
o kern/134491  fs         [zfs] Hot spares are rather cold...
o kern/133980  fs         [panic] [ffs] panic: ffs_valloc: dup alloc
o kern/133676  fs         [smbfs] [panic] umount -f'ing a vnode-based memory dis
o kern/133614  fs         [smbfs] [panic] panic: ffs_truncate: read-only filesys
o kern/133373  fs         [zfs] umass attachment causes ZFS checksum errors, dat
o kern/133174  fs         [msdosfs] [patch] msdosfs must support utf-encoded int
f kern/133150  fs         [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w
o kern/133134  fs         [zfs] Missing ZFS zpool labels
o kern/132960  fs         [ufs] [panic] panic:ffs_blkfree: freeing free frag
o kern/132597  fs         [tmpfs] [panic] tmpfs-related panic while interrupting
o kern/132551  fs         [zfs] ZFS locks up on extattr_list_link syscall
o kern/132397  fs         reboot causes filesystem corruption (failure to sync b
o kern/132331  fs         [ufs] [lor] LOR ufs and syncer
o kern/132237  fs         [msdosfs] msdosfs has problems to read MSDOS Floppy
o kern/132145  fs         [panic] File System Hard Crashes
f kern/132068  fs         [zfs] page fault when using ZFS over NFS on 7.1-RELEAS
o kern/131995  fs         [nfs] Failure to mount NFSv4 server
o kern/131360  fs         [nfs] poor scaling behavior of the NFS server under lo
o kern/131342  fs         [nfs] mounting/unmounting of disks causes NFS to fail
o bin/131341   fs         makefs: error "Bad file descriptor" on the mount poin
o kern/131086  fs         [ext2fs] [patch] mkfs.ext2 creates rotten partition
o kern/130979  fs         [smbfs] [panic] boot/kernel/smbfs.ko
o kern/130920  fs         [msdosfs] cp(1) takes 100% CPU time while copying file
o kern/130229  fs         [iconv] usermount fails on fs that need iconv
o kern/130210  fs         [nullfs] Error by check nullfs
o kern/129760  fs         [nfs] after 'umount -f' of a stale NFS share FreeBSD l
o kern/129488  fs         [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c:
o kern/129231  fs         [ufs] [patch] New UFS mount (norandom) option - mostly
o kern/129152  fs         [panic] non-userfriendly panic when trying to mount(8)
o kern/129148  fs         [zfs] [panic] panic on concurrent writing & rollback
o kern/129059  fs         [zfs] [patch] ZFS bootloader whitelistable via WITHOUT
f kern/128829
fs smbd(8) causes periodic panic on 7-RELEASE o kern/128633 fs [zfs] [lor] lock order reversal in zfs o kern/128514 fs [zfs] [mpt] problems with ZFS and LSILogic SAS/SATA Ad f kern/128173 fs [ext2fs] ls gives "Input/output error" on mounted ext3 o kern/127659 fs [tmpfs] tmpfs memory leak o kern/127492 fs [zfs] System hang on ZFS input-output o kern/127420 fs [gjournal] [panic] Journal overflow on gmirrored gjour o kern/127213 fs [tmpfs] sendfile on tmpfs data corruption o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/125644 fs [zfs] [panic] zfs unfixable fs errors caused panic whe f kern/125536 fs [ext2fs] ext 2 mounts cleanly but fails on commands li o kern/125149 fs [nfs] [panic] changing into .zfs dir from nfs client c f kern/124621 fs [ext3] [patch] Cannot mount ext2fs partition f bin/124424 fs [zfs] zfs(8): zfs list -r shows strange snapshots' siz o kern/123939 fs [msdosfs] corrupts new files o kern/122888 fs [zfs] zfs hang w/ prefetch on, zil off while running t o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o kern/122173 fs [zfs] [panic] Kernel Panic if attempting to replace a o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o kern/122047 fs [ext2fs] [patch] incorrect handling of UF_IMMUTABLE / o kern/122038 fs [tmpfs] [panic] tmpfs: panic: tmpfs_alloc_vp: type 0xc o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121779 fs [ufs] snapinfo(8) (and related tools?) 
only work for t o kern/121770 fs [zfs] ZFS on i386, large file or heavy I/O leads to ke o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8 o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha f kern/120991 fs [panic] [fs] [snapshot] System crashes when manipulati o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F o bin/120288 fs zfs(8): "zfs share -a" does not send SIGHUP to mountd f kern/119735 fs [zfs] geli + ZFS + samba starting on boot panics 7.0-B o kern/118912 fs [2tb] disk sizing/geometry problem with large array o misc/118855 fs [zfs] ZFS-related commands are nonfunctional in fixit o kern/118713 fs [minidump] [patch] Display media size required for a k o kern/118320 fs [zfs] [patch] NFS SETATTR sometimes fails to set file o bin/118249 fs mv(1): moving a directory changes its mtime o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o kern/116913 fs [ffs] [panic] ffs_blkfree: freeing free block p kern/116608 fs [msdosfs] [patch] msdosfs fails to check mount options o kern/116583 fs [ffs] [hang] System freezes for short time when using o kern/116170 fs [panic] Kernel panic when mounting /tmp o kern/115645 fs [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] 
smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o kern/113180 fs [zfs] Setting ZFS nfsshare property does not cause inh o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk o kern/105093 fs [ext2fs] [patch] ext2fs on read-only media cannot be m o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [iso9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna f kern/91568 fs [ufs] [panic] writing to UFS/softupdates DVD media in o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/89991 fs [ufs] softupdates with mount -ur causes fs UNREFS o kern/88657 fs [smbfs] windows 
client hang when browsing a samba shar o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o kern/85326 fs [smbfs] [panic] saving a file via samba to an overquot o kern/84589 fs [2TB] 5.4-STABLE unresponsive during background fsck 2 o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o kern/77826 fs [ext2fs] ext2fs usb filesystem will not mount RW o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 149 problems total. 
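
The bugmaster note at the top of the listing gives the URL pattern for viewing a single PR. A minimal sketch of turning an entry from the Tracker column into that URL follows; pr_url is a hypothetical helper name, and it assumes that "(number)" in the pattern means only the numeric part after the category.

```shell
#!/bin/sh
# pr_url is a hypothetical helper, not a FreeBSD tool.
# Assumption: "(number)" in the bugmaster note is the numeric PR id only.
pr_url() {
    # ${1##*/} strips everything up to the last "/",
    # so kern/137037 becomes 137037; a bare number is left unchanged.
    echo "http://www.freebsd.org/cgi/query-pr.cgi?pr=${1##*/}"
}

pr_url kern/137037
```

The function only builds the string; it does not fetch anything.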
From spawk at acm.poly.edu Tue Aug 18 12:58:26 2009 From: spawk at acm.poly.edu (Boris Kochergin) Date: Tue Aug 18 12:58:32 2009 Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs In-Reply-To: <4A81CF20.7010108@acm.poly.edu> References: <4A78AFB2.10103@acm.poly.edu> <20090805115621.GG1784@garage.freebsd.pl> <4A798A12.4070408@acm.poly.edu> <20090807073738.GA1607@garage.freebsd.pl> <20090807074400.GB1607@garage.freebsd.pl> <4A7C3002.8000003@acm.poly.edu> <20090807191334.GA1814@garage.freebsd.pl> <4A7C81CA.2040303@acm.poly.edu> <20090807193842.GA2487@garage.freebsd.pl> <4A7C87C5.1070608@acm.poly.edu> <20090807202756.GB2487@garage.freebsd.pl> <4A81CF20.7010108@acm.poly.edu> Message-ID: <4A8AA531.2000004@acm.poly.edu> Boris Kochergin wrote: > Pawel Jakub Dawidek wrote: >> On Fri, Aug 07, 2009 at 04:00:05PM -0400, Boris Kochergin wrote: >> >>> Pawel Jakub Dawidek wrote: >>> >>>> On Fri, Aug 07, 2009 at 03:34:34PM -0400, Boris Kochergin wrote: >>>> >>>> >>>>> Pawel Jakub Dawidek wrote: >>>>> >>>>>> Yeah, that's strange indeed. Could you try: >>>>>> >>>>>> print ab->b_arc_node.list_prev >>>>>> print ab->b_arc_node.list_next >>>>>> >>>>>> >>>>>> >>>>> (kgdb) print ab->b_arc_node.list_prev >>>>> $1 = (struct list_node *) 0x1 >>>>> >>>> Yeah, list_prev is corrupted. If it panics on you everytime, I could >>>> send you a patch which will try to catch where the corruption occurs. >>>> >>>> >>>> >>> I eventually get the arc_evict panic every time I successfully >>> manage to mount the filesystem, but it usually panics (with the >>> other backtrace) as soon as I try to mount it, or mount just hangs. >>> I'll gladly try the patch, though--the data on the array is >>> important to me. Thanks. >>> >> >> To get the data from there you could also try to 'zfs send' it without >> mounting the dataset at all (just in case). >> >> > Sorry for the delay. I had to find another machine to move the disks > into so that I could continue experimenting. 
Anyway, the filesystem
> didn't have any snapshots I could send, so I tried creating one with
> "zfs snapshot home@1" and the machine hung.
>
> FYI, In the new machine, all disks (including the one with the /
> filesystem) retain their device names.
>
> -Boris
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

Some more panics using RELENG_8 sources from yesterday:
http://acm.poly.edu/~spawk/zfs/. The one in panic3.txt happens much more
often than the other ones. If any brave soul wants to look into it, I
can provide NFS/geom_gate/whatever access to the disk images (or actual
disks, if there's a difference) so that they can recreate the problem on
a local machine.

-Boris

From simon at comsys.ntu-kpi.kiev.ua Thu Aug 20 14:10:06 2009
From: simon at comsys.ntu-kpi.kiev.ua (Andrey Simonenko)
Date: Thu Aug 20 14:10:13 2009
Subject: kern/136865: NFS exports atomic and on-the-fly atomic updates
Message-ID: <200908201410.n7KEA5QW094936@freefall.freebsd.org>

The following reply was made to PR kern/136865; it has been noted by GNATS.
From: Andrey Simonenko
To: bug-followup@FreeBSD.org
Cc:
Subject: Re: kern/136865: NFS exports atomic and on-the-fly atomic updates
Date: Thu, 20 Aug 2009 16:26:25 +0300

Updated version nfse-20090820 does not use , available on
http://comsys.ntu-kpi.kiev.ua/~simon/nfse/

From randy at psg.com Fri Aug 21 02:47:45 2009
From: randy at psg.com (Randy Bush)
Date: Fri Aug 21 02:47:52 2009
Subject: re-adding a replacement into a pool of mirrors
Message-ID:

a drive in a 12x2tb array died and i have replaced it

# zpool status
  pool: tank
 state: DEGRADED
 scrub: scrub completed after 0h14m with 0 errors on Wed Aug 19 12:03:14 2009
config:

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          mirror    DEGRADED     0     0     0
            da0s3   ONLINE       0     0     0
            da1s3   REMOVED      0     0     0
          mirror    ONLINE       0     0     0
            da2s1   ONLINE       0     0     0
            da3s1   ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da4s1   ONLINE       0     0     0
            da5s1   ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da6s1   ONLINE       0     0     0
            da7s1   ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da8s1c  ONLINE       0     0     0
            da9s1   ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da10s1  ONLINE       0     0     0
            da11s1  ONLINE       0     0     0

errors: No known data errors

i want to place da1s3 back in the pool, but

# zpool add -fn tank da1s3
would update 'tank' to the following configuration:
        tank
          mirror
            da0s3
            da1s3
          mirror
            da2s1
            da3s1
          mirror
            da4s1
            da5s1
          mirror
            da6s1
            da7s1
          mirror
            da8s1c
            da9s1
          mirror
            da10s1
            da11s1
          da1s3

so zpool add would put it in the wrong place

clue please? thanks.

randy

From andrew at modulus.org Fri Aug 21 02:56:12 2009
From: andrew at modulus.org (Andrew Snow)
Date: Fri Aug 21 02:56:19 2009
Subject: re-adding a replacement into a pool of mirrors
In-Reply-To:
References:
Message-ID: <4A8E0C80.9030908@modulus.org>

Randy Bush wrote:
> a drive in a 12x2tb array died and i have replaced it
> # zpool add -fn tank da1s3
>
>
> so zpool add would put it in the wrong place
>

Two things to try:

1. zpool replace -f tank da1s3
2. zpool offline tank da1s3 ; zpool online tank da1s3

From randy at psg.com Fri Aug 21 03:00:43 2009
From: randy at psg.com (Randy Bush)
Date: Fri Aug 21 03:00:50 2009
Subject: re-adding a replacement into a pool of mirrors
In-Reply-To: <4A8E0C80.9030908@modulus.org>
References: <4A8E0C80.9030908@modulus.org>
Message-ID:

> 1. zpool replace -f tank da1s3

# zpool replace -f tank da1s3
cannot replace da1s3 with da1s3: permission denied

> 2. zpool offline tank da1s3 ; zpool online tank da1s3

# zpool offline tank da1s3
cannot offline da1s3: no valid replicas

but tried anyway

# zpool online tank da1s3
warning: device 'da1s3' onlined, but remains in faulted state
use 'zpool replace' to replace devices that are no longer present

        NAME        STATE     READ WRITE CKSUM
        tank        DEGRADED     0     0     0
          mirror    DEGRADED     0     0     0
            da0s3   ONLINE       0     0     0
            da1s3   REMOVED      0     0     0

randy

From andrew at modulus.org Fri Aug 21 03:07:43 2009
From: andrew at modulus.org (Andrew Snow)
Date: Fri Aug 21 03:07:50 2009
Subject: re-adding a replacement into a pool of mirrors
In-Reply-To:
References: <4A8E0C80.9030908@modulus.org>
Message-ID: <4A8E0F39.6070407@modulus.org>

OK, with mirrored vdevs you can detach and re-attach disks.

So:

zpool detach tank da1s3         <-- remove it from the mirror
zpool attach tank da0s3 da1s3   <-- add a disk into the mirror vdev

From randy at psg.com Fri Aug 21 03:10:42 2009
From: randy at psg.com (Randy Bush)
Date: Fri Aug 21 03:10:47 2009
Subject: re-adding a replacement into a pool of mirrors
In-Reply-To: <4A8E0F39.6070407@modulus.org>
References: <4A8E0C80.9030908@modulus.org> <4A8E0F39.6070407@modulus.org>
Message-ID:

Andrew Snow wrote:
>
>
> OK, with mirrored vdevs you can detach and re-attach disks.
>
> So:
>
>
> zpool detach tank da1s3         <-- remove it from the mirror
> zpool attach tank da0s3 da1s3   <-- add a disk into the mirror vdev

# zpool detach tank da1s3
# zpool status
  pool: tank
 state: ONLINE
 scrub: scrub completed after 0h14m with 0 errors on Wed Aug 19 12:03:14 2009
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          da0s3     ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da2s1   ONLINE       0     0     0
            da3s1   ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da4s1   ONLINE       0     0     0
            da5s1   ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da6s1   ONLINE       0     0     0
            da7s1   ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da8s1c  ONLINE       0     0     0
            da9s1   ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da10s1  ONLINE       0     0     0
            da11s1  ONLINE       0     0     0

errors: No known data errors

# zpool attach tank da0s3 da1s3
cannot attach da1s3 to da0s3: permission denied
# zpool status
  pool: tank
 state: ONLINE
 scrub: scrub completed after 0h14m with 0 errors on Wed Aug 19 12:03:14 2009
config:

        NAME        STATE     READ WRITE CKSUM
        tank        ONLINE       0     0     0
          da0s3     ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da2s1   ONLINE       0     0     0
            da3s1   ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da4s1   ONLINE       0     0     0
            da5s1   ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da6s1   ONLINE       0     0     0
            da7s1   ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da8s1c  ONLINE       0     0     0
            da9s1   ONLINE       0     0     0
          mirror    ONLINE       0     0     0
            da10s1  ONLINE       0     0     0
            da11s1  ONLINE       0     0     0

errors: No known data errors

uh oh!

randy

From andrew at modulus.org Fri Aug 21 03:25:01 2009
From: andrew at modulus.org (Andrew Snow)
Date: Fri Aug 21 03:25:06 2009
Subject: re-adding a replacement into a pool of mirrors
In-Reply-To:
References: <4A8E0C80.9030908@modulus.org> <4A8E0F39.6070407@modulus.org>
Message-ID: <4A8E1347.3020301@modulus.org>

Randy Bush wrote:
>
> # zpool attach tank da0s3 da1s3
> cannot attach da1s3 to da0s3: permission denied

What about attach -f ?
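
The detach/attach sequence suggested in this thread can be sketched as a dry-run helper that only prints the two commands. This is a sketch, not something posted to the list: readd_mirror_member is a hypothetical name, and the pool and device names are the ones from the messages above. It does not deal with the "permission denied" error reported at this point, and between the two steps the surviving disk sits in the pool unmirrored, as the status output after the detach shows.

```shell
#!/bin/sh
# Dry run only: prints the commands instead of executing them.
# readd_mirror_member is a hypothetical helper, not part of zpool(8).
readd_mirror_member() {
    pool=$1       # pool name, e.g. tank
    healthy=$2    # surviving member of the mirror, e.g. da0s3
    returning=$3  # device to put back, e.g. da1s3

    # Step 1: drop the stale entry from the mirror.
    echo "zpool detach $pool $returning"

    # Step 2: attach names an existing device, so the returning device
    # becomes its mirror partner. "zpool add" would instead create a new
    # top-level vdev, which is what the -n output earlier in the thread showed.
    echo "zpool attach $pool $healthy $returning"
}

readd_mirror_member tank da0s3 da1s3
```

Reviewing the printed commands before running them by hand gives the same kind of preview that "zpool add -n" gave earlier in the thread.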
From randy at psg.com Fri Aug 21 03:30:11 2009
From: randy at psg.com (Randy Bush)
Date: Fri Aug 21 03:30:18 2009
Subject: re-adding a replacement into a pool of mirrors
In-Reply-To: <4A8E1347.3020301@modulus.org>
References: <4A8E0C80.9030908@modulus.org> <4A8E0F39.6070407@modulus.org> <4A8E1347.3020301@modulus.org>
Message-ID:

>> # zpool attach tank da0s3 da1s3
>> cannot attach da1s3 to da0s3: permission denied
> What about attach -f ?

# zpool attach -f tank da0s3 da1s3
cannot attach da1s3 to da0s3: permission denied

From andrew at modulus.org Fri Aug 21 03:36:35 2009
From: andrew at modulus.org (Andrew Snow)
Date: Fri Aug 21 03:36:42 2009
Subject: re-adding a replacement into a pool of mirrors
In-Reply-To:
References: <4A8E0C80.9030908@modulus.org> <4A8E0F39.6070407@modulus.org> <4A8E1347.3020301@modulus.org>
Message-ID: <4A8E15F9.6000303@modulus.org>

Randy Bush wrote:
>>> # zpool attach tank da0s3 da1s3
>>> cannot attach da1s3 to da0s3: permission denied
>> What about attach -f ?
>
> # zpool attach -f tank da0s3 da1s3
> cannot attach da1s3 to da0s3: permission denied

I think this stage should be working, unless something else in the
system is already using da1s3 or has it opened. Can you check through
your system to see if any other parts of eg. geom have decided to take
da1s3 for their purposes?

From randy at psg.com Fri Aug 21 03:50:09 2009
From: randy at psg.com (Randy Bush)
Date: Fri Aug 21 03:50:15 2009
Subject: re-adding a replacement into a pool of mirrors
In-Reply-To: <4A8E15F9.6000303@modulus.org>
References: <4A8E0C80.9030908@modulus.org> <4A8E0F39.6070407@modulus.org> <4A8E1347.3020301@modulus.org> <4A8E15F9.6000303@modulus.org>
Message-ID:

>> # zpool attach -f tank da0s3 da1s3
>> cannot attach da1s3 to da0s3: permission denied
>
> I think this stage should be working, unless something else in the
> system is already using da1s3 or has it opened. Can you check through
> your system to see if any other parts of eg.
geom have decided to take
> da1s3 for their purposes?

not that i can see.

randy

From andrew at modulus.org Fri Aug 21 04:30:33 2009
From: andrew at modulus.org (Andrew Snow)
Date: Fri Aug 21 04:30:40 2009
Subject: re-adding a replacement into a pool of mirrors
In-Reply-To:
References: <4A8E0C80.9030908@modulus.org> <4A8E0F39.6070407@modulus.org> <4A8E1347.3020301@modulus.org> <4A8E15F9.6000303@modulus.org>
Message-ID: <4A8E22A1.4080903@modulus.org>

Randy Bush wrote:
>>> # zpool attach -f tank da0s3 da1s3
>>> cannot attach da1s3 to da0s3: permission denied
>> I think this stage should be working, unless something else in the
>> system is already using da1s3 or has it opened. Can you check through
>> your system to see if any other parts of eg. geom have decided to take
>> da1s3 for their purposes?
>
> not that i can see.

It might be worth zeroing the whole disk with dd if=/dev/zero
of=/dev/da1s3 bs=64k, and then see if you can re-attach.

From serenity at exscape.org Fri Aug 21 06:51:28 2009
From: serenity at exscape.org (Thomas Backman)
Date: Fri Aug 21 06:51:35 2009
Subject: Yet another ZFS recv panic; old but rarely seen
Message-ID: <7F161876-8DA7-4617-98B6-7CD54C691BC6@exscape.org>

Ugh. Bad news again: another zfs send/recv panic during an incremental
backup.

Unread portion of the kernel message buffer:
panic: dirtying dbuf obj=b213 lvl=1 blkid=2 but not tx_held

cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
panic() at panic+0x182
dmu_tx_dirty_buf() at dmu_tx_dirty_buf+0x28f
dbuf_dirty() at dbuf_dirty+0x69
dnode_free_range() at dnode_free_range+0x80d
dnode_reallocate() at dnode_reallocate+0x131
dmu_object_reclaim() at dmu_object_reclaim+0x99
dmu_recv_stream() at dmu_recv_stream+0x1446
zfs_ioc_recv() at zfs_ioc_recv+0x25a
zfsdev_ioctl() at zfsdev_ioctl+0x8a
devfs_ioctl_f() at devfs_ioctl_f+0x77
kern_ioctl() at kern_ioctl+0xf6
ioctl() at ioctl+0xfd
syscall() at syscall+0x28f
Xfast_syscall() at Xfast_syscall+0xe1
--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800fe5f7c, rsp = 0x7fffffff8fb8, rbp = 0x7fffffff9cf0 ---
KDB: enter: panic
panic: from debugger
cpuid = 0
Uptime: 4h52m26s

Looks *eerily* similar to this panic from OpenSolaris:
http://mail.opensolaris.org/pipermail/zfs-code/2008-September/000694.html

GDB backtrace isn't of that much more use, I guess:
#11 0xffffffff8036d02b in panic (fmt=Variable "fmt" is not available.
)
    at /usr/src/sys/kern/kern_shutdown.c:562
#12 0xffffffff80b4765f in dmu_tx_dirty_buf () from /boot/kernel/zfs.ko
#13 0xffffffff80b3a519 in dbuf_dirty () from /boot/kernel/zfs.ko
#14 0xffffffff80b4b68d in dnode_free_range () from /boot/kernel/zfs.ko
#15 0xffffffff80b4c461 in dnode_reallocate () from /boot/kernel/zfs.ko
#16 0xffffffff80b42569 in dmu_object_reclaim () from /boot/kernel/zfs.ko
#17 0xffffffff80b421b6 in dmu_recv_stream () from /boot/kernel/zfs.ko
#18 0xffffffff80ba430a in zfs_ioc_recv () from /boot/kernel/zfs.ko
#19 0xffffff002ac13d68 in ?? ()
#20 0xffffff002aa6c320 in ?? ()
#21 0xffffff002ae15000 in ?? ()
#22 0xffffff0002891400 in ?? ()
#23 0xffffff00028f2800 in ?? ()
#24 0xffffff00744a1ab8 in ?? ()
...
#34 0xffffff803e7fc860 in ?? ()
#35 0xffffffff805b699f in uma_zalloc_arg (zone=0xffffff00183c6600,
    udata=0xffffff00744a1000, flags=-128) at /usr/src/sys/vm/uma_core.c:1990
Previous frame inner to this frame (corrupt stack?)
(kgdb)

Apparently, I've gotten this once before, at r195910 (+ patches, not
sure which ones at that time), on July 30th. Same DDB backtrace, same
broken GDB backtrace.

Regards,
Thomas

From serenity at exscape.org Fri Aug 21 09:47:46 2009
From: serenity at exscape.org (Thomas Backman)
Date: Fri Aug 21 09:47:59 2009
Subject: Yet another ZFS recv panic; old but rarely seen
In-Reply-To: <7F161876-8DA7-4617-98B6-7CD54C691BC6@exscape.org>
References: <7F161876-8DA7-4617-98B6-7CD54C691BC6@exscape.org>
Message-ID: <306284EA-C89C-433C-9D33-E6CF44305800@exscape.org>

On Aug 21, 2009, at 08:51, Thomas Backman wrote:

> Ugh. Bad news again: another zfs send/recv panic during an
> incremental backup.
>
> Unread portion of the kernel message buffer:
> panic: dirtying dbuf obj=b213 lvl=1 blkid=2 but not tx_held
>
> cpuid = 0
> KDB: stack backtrace:
> db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
> panic() at panic+0x182
> dmu_tx_dirty_buf() at dmu_tx_dirty_buf+0x28f
> dbuf_dirty() at dbuf_dirty+0x69
> dnode_free_range() at dnode_free_range+0x80d
> dnode_reallocate() at dnode_reallocate+0x131
> dmu_object_reclaim() at dmu_object_reclaim+0x99
> dmu_recv_stream() at dmu_recv_stream+0x1446
> zfs_ioc_recv() at zfs_ioc_recv+0x25a
> zfsdev_ioctl() at zfsdev_ioctl+0x8a
> devfs_ioctl_f() at devfs_ioctl_f+0x77
> kern_ioctl() at kern_ioctl+0xf6
> ioctl() at ioctl+0xfd
> syscall() at syscall+0x28f
> Xfast_syscall() at Xfast_syscall+0xe1
> --- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800fe5f7c, rsp =
> 0x7fffffff8fb8, rbp = 0x7fffffff9cf0 ---
> KDB: enter: panic
> panic: from debugger
> cpuid = 0
> Uptime: 4h52m26s
>
> Looks *eerily* similar to this panic from OpenSolaris: http://mail.opensolaris.org/pipermail/zfs-code/2008-September/000694.html
>
> GDB backtrace isn't of that much more use, I guess:
> #11 0xffffffff8036d02b in panic (fmt=Variable "fmt" is not available.
> )
>    at /usr/src/sys/kern/kern_shutdown.c:562
> #12 0xffffffff80b4765f in dmu_tx_dirty_buf () from /boot/kernel/zfs.ko
> #13 0xffffffff80b3a519 in dbuf_dirty () from /boot/kernel/zfs.ko
> #14 0xffffffff80b4b68d in dnode_free_range () from /boot/kernel/zfs.ko
> #15 0xffffffff80b4c461 in dnode_reallocate () from /boot/kernel/zfs.ko
> #16 0xffffffff80b42569 in dmu_object_reclaim () from /boot/kernel/
> zfs.ko
> #17 0xffffffff80b421b6 in dmu_recv_stream () from /boot/kernel/zfs.ko
> #18 0xffffffff80ba430a in zfs_ioc_recv () from /boot/kernel/zfs.ko
> #19 0xffffff002ac13d68 in ?? ()
> #20 0xffffff002aa6c320 in ?? ()
> #21 0xffffff002ae15000 in ?? ()
> #22 0xffffff0002891400 in ?? ()
> #23 0xffffff00028f2800 in ?? ()
> #24 0xffffff00744a1ab8 in ?? ()
> ...
> #34 0xffffff803e7fc860 in ?? ()
> #35 0xffffffff805b699f in uma_zalloc_arg (zone=0xffffff00183c6600,
>    udata=0xffffff00744a1000, flags=-128) at /usr/src/sys/vm/
> uma_core.c:1990
> Previous frame inner to this frame (corrupt stack?)
> (kgdb)
>
> Apparently, I've gotten this once before, at r195910 (+ patches, not
> sure which ones at that time), on July 30th. Same DDB backtrace,
> same broken GDB backtrace.
>
> Regards,
> Thomas

I found some more info mere minutes after posting this (figures;
that's why I prefer media where you can edit your posts!), but had
other things to do. So, here's some more:

OpenSolaris bug ID: 6754448 (
http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6754448 )
Fixed in build 108:
http://dlc.sun.com/osol/on/downloads/b108/on-changelog-b108.html
Changelogs are to be found on that page (just search for "6754448"),
with a history/diff link on each source file's page. Unfortunately
(unless FreeBSD suffers from both, that is), they apparently fixed two
bugs in the same batch, making it harder - at least for *me* - to see
what changes relate to *this* panic.
Still, I'm guessing this will help, unless the code is too much out of sync with OpenSolaris. I'm also guessing Pawel already knows waaaaaaay more about their system than I do (... which is about nothing), so I'll probably shut up now... ;) Regards, Thomas From randy at psg.com Fri Aug 21 10:33:59 2009 From: randy at psg.com (Randy Bush) Date: Fri Aug 21 10:34:06 2009 Subject: re-adding a replacement into a pool of mirrors In-Reply-To: <4A8E22A1.4080903@modulus.org> References: <4A8E0C80.9030908@modulus.org> <4A8E0F39.6070407@modulus.org> <4A8E1347.3020301@modulus.org> <4A8E15F9.6000303@modulus.org> <4A8E22A1.4080903@modulus.org> Message-ID: > It might be worth zeroing the whole disk with dd if=/dev/zero > of=/dev/da1s3 bs=64k, and then see if you can re-attach. went single luser, zapped the drive, took a nap, then zpool attach tank da0s3 da1s3 worked. so geom did have it! thanks! randy From pjd at FreeBSD.org Fri Aug 21 11:00:38 2009 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Fri Aug 21 11:00:51 2009 Subject: Yet another ZFS recv panic; old but rarely seen In-Reply-To: <306284EA-C89C-433C-9D33-E6CF44305800@exscape.org> References: <7F161876-8DA7-4617-98B6-7CD54C691BC6@exscape.org> <306284EA-C89C-433C-9D33-E6CF44305800@exscape.org> Message-ID: <20090821110031.GB1962@garage.freebsd.pl> On Fri, Aug 21, 2009 at 11:47:35AM +0200, Thomas Backman wrote: > On Aug 21, 2009, at 08:51, Thomas Backman wrote: > > >Ugh. Bad news again: another zfs send/recv panic during an > >incremental backup. 
> > > >Unread portion of the kernel message buffer: > >panic: dirtying dbuf obj=b213 lvl=1 blkid=2 but not tx_held > > > >cpuid = 0 > >KDB: stack backtrace: > >db_trace_self_wrapper() at db_trace_self_wrapper+0x2a > >panic() at panic+0x182 > >dmu_tx_dirty_buf() at dmu_tx_dirty_buf+0x28f > >dbuf_dirty() at dbuf_dirty+0x69 > >dnode_free_range() at dnode_free_range+0x80d > >dnode_reallocate() at dnode_reallocate+0x131 > >dmu_object_reclaim() at dmu_object_reclaim+0x99 > >dmu_recv_stream() at dmu_recv_stream+0x1446 > >zfs_ioc_recv() at zfs_ioc_recv+0x25a > >zfsdev_ioctl() at zfsdev_ioctl+0x8a > >devfs_ioctl_f() at devfs_ioctl_f+0x77 > >kern_ioctl() at kern_ioctl+0xf6ioctl() at ioctl+0xfd > >syscall() at syscall+0x28f > >Xfast_syscall() at Xfast_syscall+0xe1 > >--- syscall (54, FreeBSD ELF64, ioctl), rip = 0x800fe5f7c, rsp = > >0x7fffffff8fb8, rbp = 0x7fffffff9cf0 --- > >KDB: enter: panic > >panic: from debugger > >cpuid = 0 > >Uptime: 4h52m26s > > > >Looks *eerily* similar to this panic fron OpenSolaris: > >http://mail.opensolaris.org/pipermail/zfs-code/2008-September/000694.html > > > >GDB backtrace isn't of that much more use, I guess: > >#11 0xffffffff8036d02b in panic (fmt=Variable "fmt" is not available. > >) > > at /usr/src/sys/kern/kern_shutdown.c:562 > >#12 0xffffffff80b4765f in dmu_tx_dirty_buf () from /boot/kernel/zfs.ko > >#13 0xffffffff80b3a519 in dbuf_dirty () from /boot/kernel/zfs.ko > >#14 0xffffffff80b4b68d in dnode_free_range () from /boot/kernel/zfs.ko > >#15 0xffffffff80b4c461 in dnode_reallocate () from /boot/kernel/zfs.ko > >#16 0xffffffff80b42569 in dmu_object_reclaim () from /boot/kernel/ > >zfs.ko > >#17 0xffffffff80b421b6 in dmu_recv_stream () from /boot/kernel/zfs.ko > >#18 0xffffffff80ba430a in zfs_ioc_recv () from /boot/kernel/zfs.ko > >#19 0xffffff002ac13d68 in ?? () > >#20 0xffffff002aa6c320 in ?? () > >#21 0xffffff002ae15000 in ?? () > >#22 0xffffff0002891400 in ?? () > >#23 0xffffff00028f2800 in ?? () > >#24 0xffffff00744a1ab8 in ?? 
() > >... > >#34 0xffffff803e7fc860 in ?? () > >#35 0xffffffff805b699f in uma_zalloc_arg (zone=0xffffff00183c6600, > > udata=0xffffff00744a1000, flags=-128) at /usr/src/sys/vm/ > >uma_core.c:1990 > >Previous frame inner to this frame (corrupt stack?) > >(kgdb) > > > >Apparently, I've gotten this once before, at r195910 (+ patches, not > >such which ones at that time), on July 30th. Same DDB backtrace, > >same broken GDB backtrace. > > > >Regards, > >Thomas > > I found some more info mere minutes after posting this (figures; > that's why I prefer media where you can edit your posts!), but had > other things to do. So, here's some more: > > OpenSolaris bug ID: 6754448 ( > http://bugs.opensolaris.org/bugdatabase/view_bug.do?bug_id=6754448 ) > Fixed in build 108: > http://dlc.sun.com/osol/on/downloads/b108/on-changelog-b108.html > Changelogs are to be found on that page (just search for "6754448", > with a history/diff link on each source file's page. Unfortunately > (unless FreeBSD suffers from both, that is), they apparently fixed two > bugs in the same batch, making it harder - at least for *me* - to see > what changes relate to *this* panic. > Still, I'm guessing this will help, unless the code is too much out of > sync with OpenSolaris. > I'm also guessing Pawel already knows waaaaaaay more about their > system than I do (... which is about nothing), so I'll probably shut > up now... ;) Right, the bug is already fixed in OpenSolaris. If you can reproduce the problem, you might try this patch: http://people.freebsd.org/~pjd/patches/dirtying_dbuf.patch -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! -------------- next part -------------- A non-text attachment was scrubbed... 
Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090821/a0c6c24a/attachment.pgp From morganw at chemikals.org Fri Aug 21 17:04:37 2009 From: morganw at chemikals.org (Wes Morgan) Date: Fri Aug 21 17:04:42 2009 Subject: re-adding a replacement into a pool of mirrors In-Reply-To: References: <4A8E0C80.9030908@modulus.org> <4A8E0F39.6070407@modulus.org> <4A8E1347.3020301@modulus.org> <4A8E15F9.6000303@modulus.org> <4A8E22A1.4080903@modulus.org> Message-ID: On Fri, 21 Aug 2009, Randy Bush wrote: >> It might be worth zeroing the whole disk with dd if=/dev/zero >> of=/dev/da1s3 bs=64k, and then see if you can re-attach. > > went single luser, zapped the drive, took a nap, then > > zpool attach tank da0s3 da1s3 > > worked. so geom did have it! I'm not sure geom had it more than the slice had a zfs label on it and as a foot-shooting precaution it did not want to let you add a device from another pool to the existing one. Just a guess, though. Was the replacement drive brand new? From randy at psg.com Fri Aug 21 22:33:00 2009 From: randy at psg.com (Randy Bush) Date: Fri Aug 21 22:33:06 2009 Subject: re-adding a replacement into a pool of mirrors In-Reply-To: References: <4A8E0C80.9030908@modulus.org> <4A8E0F39.6070407@modulus.org> <4A8E1347.3020301@modulus.org> <4A8E15F9.6000303@modulus.org> <4A8E22A1.4080903@modulus.org> Message-ID: >>> It might be worth zeroing the whole disk with dd if=/dev/zero >>> of=/dev/da1s3 bs=64k, and then see if you can re-attach. >> went single luser, zapped the drive, took a nap, then >> zpool attach tank da0s3 da1s3 >> worked. so geom did have it! > I'm not sure geom had it more than the slice had a zfs label on it and as > a foot-shooting precaution it did not want to let you add a device > from another pool to the existing one. Just a guess, though. Was the > replacement drive brand new? yep, brand new. 
randy From morganw at chemikals.org Fri Aug 21 23:39:30 2009 From: morganw at chemikals.org (Wes Morgan) Date: Fri Aug 21 23:39:36 2009 Subject: re-adding a replacement into a pool of mirrors In-Reply-To: References: <4A8E0C80.9030908@modulus.org> <4A8E0F39.6070407@modulus.org> <4A8E1347.3020301@modulus.org> <4A8E15F9.6000303@modulus.org> <4A8E22A1.4080903@modulus.org> Message-ID: On Sat, 22 Aug 2009, Randy Bush wrote: >>>> It might be worth zeroing the whole disk with dd if=/dev/zero >>>> of=/dev/da1s3 bs=64k, and then see if you can re-attach. >>> went single luser, zapped the drive, took a nap, then >>> zpool attach tank da0s3 da1s3 >>> worked. so geom did have it! >> I'm not sure geom had it more than the slice had a zfs label on it and as >> a foot-shooting precaution it did not want to let you add a device >> from another pool to the existing one. Just a guess, though. Was the >> replacement drive brand new? > > yep, brand new. Perhaps it was pre-formatted as fat32? From randy at psg.com Fri Aug 21 23:52:00 2009 From: randy at psg.com (Randy Bush) Date: Fri Aug 21 23:52:06 2009 Subject: re-adding a replacement into a pool of mirrors In-Reply-To: References: <4A8E0C80.9030908@modulus.org> <4A8E0F39.6070407@modulus.org> <4A8E1347.3020301@modulus.org> <4A8E15F9.6000303@modulus.org> <4A8E22A1.4080903@modulus.org> Message-ID: >>>>> It might be worth zeroing the whole disk with dd if=/dev/zero >>>>> of=/dev/da1s3 bs=64k, and then see if you can re-attach. >>>> went single luser, zapped the drive, took a nap, then >>>> zpool attach tank da0s3 da1s3 >>>> worked. so geom did have it! >>> I'm not sure geom had it more than the slice had a zfs label on it and as >>> a foot-shooting precaution it did not want to let you add a device >>> from another pool to the existing one. Just a guess, though. Was the >>> replacement drive brand new? >> yep, brand new. > Perhaps it was pre-formatted as fat32? well, at the same time i replaced the bad spindle, i put a spare in. 
the spare came from the same source. so how would i look to see who's claws are gripping it? randy From morganw at chemikals.org Fri Aug 21 23:58:41 2009 From: morganw at chemikals.org (Wes Morgan) Date: Fri Aug 21 23:58:48 2009 Subject: re-adding a replacement into a pool of mirrors In-Reply-To: References: <4A8E0C80.9030908@modulus.org> <4A8E0F39.6070407@modulus.org> <4A8E1347.3020301@modulus.org> <4A8E15F9.6000303@modulus.org> <4A8E22A1.4080903@modulus.org> Message-ID: On Sat, 22 Aug 2009, Randy Bush wrote: >>>>>> It might be worth zeroing the whole disk with dd if=/dev/zero >>>>>> of=/dev/da1s3 bs=64k, and then see if you can re-attach. >>>>> went single luser, zapped the drive, took a nap, then >>>>> zpool attach tank da0s3 da1s3 >>>>> worked. so geom did have it! >>>> I'm not sure geom had it more than the slice had a zfs label on it and as >>>> a foot-shooting precaution it did not want to let you add a device >>>> from another pool to the existing one. Just a guess, though. Was the >>>> replacement drive brand new? >>> yep, brand new. >> Perhaps it was pre-formatted as fat32? > > well, at the same time i replaced the bad spindle, i put a spare in. > the spare came from the same source. so how would i look to see who's > claws are gripping it? "geom list" might show you something... Try the part and label classes. You're using a partition for the vdev rather than the entire disk. What partition type did you use for creating it? From randy at psg.com Sat Aug 22 00:17:59 2009 From: randy at psg.com (Randy Bush) Date: Sat Aug 22 00:18:06 2009 Subject: re-adding a replacement into a pool of mirrors In-Reply-To: References: <4A8E0C80.9030908@modulus.org> <4A8E0F39.6070407@modulus.org> <4A8E1347.3020301@modulus.org> <4A8E15F9.6000303@modulus.org> <4A8E22A1.4080903@modulus.org> Message-ID: > "geom list" might show you something... Try the part and label > classes. new unused drive is da12 # geom part list # geom label list Geom name: da1s1 Providers: 1. 
Name: ufsid/4a16c4ae7a243ed1 Mediasize: 17174352384 (16G) Sectorsize: 512 Mode: r0w0e0 secoffset: 0 offset: 0 seclength: 33543657 length: 17174352384 index: 0 Consumers: 1. Name: da1s1 Mediasize: 17174352384 (16G) Sectorsize: 512 Mode: r0w0e0 > You're using a partition for the vdev rather than the entire disk. What > partition type did you use for creating it? 165 randy From andrew at modulus.org Sat Aug 22 02:42:59 2009 From: andrew at modulus.org (Andrew Snow) Date: Sat Aug 22 02:43:05 2009 Subject: re-adding a replacement into a pool of mirrors In-Reply-To: References: <4A8E0C80.9030908@modulus.org> <4A8E0F39.6070407@modulus.org> <4A8E1347.3020301@modulus.org> <4A8E15F9.6000303@modulus.org> <4A8E22A1.4080903@modulus.org> Message-ID: <4A8F5AEA.4010205@modulus.org> Wes Morgan wrote: > I'm not sure geom had it more than the slice had a zfs label on it and > as a foot-shooting precaution it did not want to let you add a device > from another pool to the existing one. FWIW, I have tried that and the error message is quite specific about it, and you can override it with -f. Randy's error message was simply "permission denied" and -f didn't help. - Andrew From serenity at exscape.org Sat Aug 22 11:01:54 2009 From: serenity at exscape.org (Thomas Backman) Date: Sat Aug 22 11:02:06 2009 Subject: Yet another ZFS recv panic; old but rarely seen In-Reply-To: <20090821110031.GB1962@garage.freebsd.pl> References: <7F161876-8DA7-4617-98B6-7CD54C691BC6@exscape.org> <306284EA-C89C-433C-9D33-E6CF44305800@exscape.org> <20090821110031.GB1962@garage.freebsd.pl> Message-ID: On Aug 21, 2009, at 13:00, Pawel Jakub Dawidek wrote: > > Right, the bug is already fixed in OpenSolaris. If you can reproduce > the > problem, you might try this patch: > > http://people.freebsd.org/~pjd/patches/dirtying_dbuf.patch I tried to reproduce it, a lot (~750 incremental send/recvs) but no "luck". I've only gotten it twice AFAIK, and that's since May. 
However, during the stress, I got a solaris assert panic (I've still
got -DDEBUG=1), after a couple hours:

Unread portion of the kernel message buffer:
panic: solaris assert: (int64_t)(arc_stats.arcstat_p.value.ui64) >= 0, file: /usr/src/sys/modules/zfs/../../cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c, line: 2044
cpuid = 0
KDB: stack backtrace:
db_trace_self_wrapper() at db_trace_self_wrapper+0x2a
panic() at panic+0x182
arc_get_data_buf() at arc_get_data_buf+0x2a0
arc_buf_alloc() at arc_buf_alloc+0xe6
arc_read_nolock() at arc_read_nolock+0xf7
arc_read() at arc_read+0xaf
dbuf_read() at dbuf_read+0x62b
dmu_buf_hold() at dmu_buf_hold+0xcc
zap_lockdir() at zap_lockdir+0x68
zap_lookup_norm() at zap_lookup_norm+0x45
zap_lookup() at zap_lookup+0x2e
dsl_prop_changed_notify() at dsl_prop_changed_notify+0x1c9
dsl_prop_changed_notify() at dsl_prop_changed_notify+0x157
dsl_prop_set_sync() at dsl_prop_set_sync+0x2ab
dsl_sync_task_group_sync() at dsl_sync_task_group_sync+0x173
dsl_pool_sync() at dsl_pool_sync+0x122
spa_sync() at spa_sync+0x35e
txg_sync_thread() at txg_sync_thread+0x2d7
fork_exit() at fork_exit+0x118
fork_trampoline() at fork_trampoline+0xe
--- trap 0, rip = 0, rsp = 0xffffff803e8cdd30, rbp = 0 ---
KDB: enter: panic
panic: from debugger
cpuid = 0
Uptime: 1d3h17m21s
Physical memory: 2029 MB

GDB backtrace is the same until spa_sync(), at which point (#26) it
turns into ??'s until

#61 0xffffffff80b75447 in txg_sync_thread () from /boot/kernel/zfs.ko
Previous frame inner to this frame (corrupt stack?)

core.txt vmstat -s:
    29040 pages active
    28905 pages inactive
      143 pages in VM cache
   231106 pages wired down (903MiB out of ~2048)
   214771 pages free

2GB RAM, amd64.
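
For anyone wanting to check the "903MiB" figure attached to the wired
page count above, the conversion is just pages times page size. The
sketch below assumes the 4 KiB base page size used on amd64; the names
PAGE_SIZE and pages_to_mib are made up for this illustration and do not
come from vmstat or the kernel.

```python
# Assumption: 4096-byte base pages, as on FreeBSD/amd64.
PAGE_SIZE = 4096


def pages_to_mib(pages, page_size=PAGE_SIZE):
    """Convert a page count (as printed by vmstat -s) to MiB."""
    return pages * page_size / (1024 * 1024)


if __name__ == "__main__":
    wired = 231106  # "pages wired down" from the report above
    print("wired: %.1f MiB" % pages_to_mib(wired))
```

That works out to roughly 902.8 MiB, which is consistent with the
rounded 903MiB in the report.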
Regards, Thomas From remko at FreeBSD.org Sun Aug 23 18:38:56 2009 From: remko at FreeBSD.org (remko@FreeBSD.org) Date: Sun Aug 23 18:39:02 2009 Subject: kern/138109: extfs: Minor cleanups to the sys/gnu/fs/ext2fs based on BSD Lite2 Message-ID: <200908231838.n7NIcudt018286@freefall.freebsd.org> Old Synopsis: Minor cleanups to the sys/gnu/fs/ext2fs based on BSD Lite2 New Synopsis: extfs: Minor cleanups to the sys/gnu/fs/ext2fs based on BSD Lite2 Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: remko Responsible-Changed-When: Sun Aug 23 18:38:42 UTC 2009 Responsible-Changed-Why: reassign to -fs team. http://www.freebsd.org/cgi/query-pr.cgi?pr=138109 From sarawgi.aditya at gmail.com Mon Aug 24 04:20:02 2009 From: sarawgi.aditya at gmail.com (Aditya Sarawgi) Date: Mon Aug 24 04:20:09 2009 Subject: kern/138109: extfs: Minor cleanups to the sys/gnu/fs/ext2fs based on BSD Lite2 Message-ID: <200908240420.n7O4K25S093421@freefall.freebsd.org> The following reply was made to PR kern/138109; it has been noted by GNATS. From: Aditya Sarawgi To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/138109: extfs: Minor cleanups to the sys/gnu/fs/ext2fs based on BSD Lite2 Date: Mon, 24 Aug 2009 03:45:46 +0530 I have merged these changes in my perforce branch. Please refer http://p4db.freebsd.org/branchView.cgi?BRANCH=truncs_ext2fs Cheers, Aditya Sarawgi From bugmaster at FreeBSD.org Mon Aug 24 11:06:54 2009 From: bugmaster at FreeBSD.org (FreeBSD bugmaster) Date: Mon Aug 24 11:08:02 2009 Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org Message-ID: <200908241106.n7OB6rNd048554@freefall.freebsd.org> Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. 
Description -------------------------------------------------------------------------------- o kern/138109 fs [extfs] [patch] Minor cleanups to the sys/gnu/fs/ext2f f kern/137037 fs [zfs] [hang] zfs rollback on root causes FreeBSD to fr o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136942 fs [zfs] zvol resize not reflected until reboot o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic o kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/136218 fs [zfs] Exported ZFS pools can't be imported into (Open) o kern/135594 fs [zfs] Single dataset unresponsive with Samba o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135480 fs [zfs] panic: lock &arg.lock already initialized o kern/135469 fs [ufs] [panic] kernel crash on md operation in ufs_dirb o bin/135314 fs [zfs] assertion failed for zdb(8) usage o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot f kern/134496 fs [zfs] [panic] ZFS pool export occasionally causes a ke o kern/134491 fs [zfs] Hot spares are rather cold... 
o kern/133980 fs [panic] [ffs] panic: ffs_valloc: dup alloc o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis o kern/133614 fs [smbfs] [panic] panic: ffs_truncate: read-only filesys o kern/133373 fs [zfs] umass attachment causes ZFS checksum errors, dat o kern/133174 fs [msdosfs] [patch] msdosfs must support utf-encoded int f kern/133150 fs [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w o kern/133134 fs [zfs] Missing ZFS zpool labels o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132597 fs [tmpfs] [panic] tmpfs-related panic while interrupting o kern/132551 fs [zfs] ZFS locks up on extattr_list_link syscall o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes f kern/132068 fs [zfs] page fault when using ZFS over NFS on 7.1-RELEAS o kern/131995 fs [nfs] Failure to mount NFSv4 server o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/131086 fs [ext2fs] [patch] mkfs.ext2 creates rotten partition o kern/130979 fs [smbfs] [panic] boot/kernel/smbfs.ko o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130229 fs [iconv] usermount fails on fs that need iconv o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel "bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/129148 fs [zfs] [panic] panic on concurrent writing & rollback o kern/129059 fs [zfs] [patch] ZFS bootloader whitelistable via WITHOUT f kern/128829 
fs smbd(8) causes periodic panic on 7-RELEASE o kern/128633 fs [zfs] [lor] lock order reversal in zfs o kern/128514 fs [zfs] [mpt] problems with ZFS and LSILogic SAS/SATA Ad f kern/128173 fs [ext2fs] ls gives "Input/output error" on mounted ext3 o kern/127659 fs [tmpfs] tmpfs memory leak o kern/127492 fs [zfs] System hang on ZFS input-output o kern/127420 fs [gjournal] [panic] Journal overflow on gmirrored gjour o kern/127213 fs [tmpfs] sendfile on tmpfs data corruption o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/125644 fs [zfs] [panic] zfs unfixable fs errors caused panic whe f kern/125536 fs [ext2fs] ext 2 mounts cleanly but fails on commands li o kern/125149 fs [nfs] [panic] changing into .zfs dir from nfs client c f kern/124621 fs [ext3] [patch] Cannot mount ext2fs partition f bin/124424 fs [zfs] zfs(8): zfs list -r shows strange snapshots' siz o kern/123939 fs [msdosfs] corrupts new files o kern/122888 fs [zfs] zfs hang w/ prefetch on, zil off while running t o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o kern/122173 fs [zfs] [panic] Kernel Panic if attempting to replace a o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o kern/122047 fs [ext2fs] [patch] incorrect handling of UF_IMMUTABLE / o kern/122038 fs [tmpfs] [panic] tmpfs: panic: tmpfs_alloc_vp: type 0xc o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121779 fs [ufs] snapinfo(8) (and related tools?) 
only work for t o kern/121770 fs [zfs] ZFS on i386, large file or heavy I/O leads to ke o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8 o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha f kern/120991 fs [panic] [fs] [snapshot] System crashes when manipulati o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F o bin/120288 fs zfs(8): "zfs share -a" does not send SIGHUP to mountd f kern/119735 fs [zfs] geli + ZFS + samba starting on boot panics 7.0-B o kern/118912 fs [2tb] disk sizing/geometry problem with large array o misc/118855 fs [zfs] ZFS-related commands are nonfunctional in fixit o kern/118713 fs [minidump] [patch] Display media size required for a k o kern/118320 fs [zfs] [patch] NFS SETATTR sometimes fails to set file o bin/118249 fs mv(1): moving a directory changes its mtime o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o kern/116913 fs [ffs] [panic] ffs_blkfree: freeing free block p kern/116608 fs [msdosfs] [patch] msdosfs fails to check mount options o kern/116583 fs [ffs] [hang] System freezes for short time when using o kern/116170 fs [panic] Kernel panic when mounting /tmp o kern/115645 fs [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] 
smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o kern/113180 fs [zfs] Setting ZFS nfsshare property does not cause inh o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk o kern/105093 fs [ext2fs] [patch] ext2fs on read-only media cannot be m o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [iso9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna f kern/91568 fs [ufs] [panic] writing to UFS/softupdates DVD media in o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/89991 fs [ufs] softupdates with mount -ur causes fs UNREFS o kern/88657 fs [smbfs] windows 
client hang when browsing a samba shar o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o kern/85326 fs [smbfs] [panic] saving a file via samba to an overquot o kern/84589 fs [2TB] 5.4-STABLE unresponsive during background fsck 2 o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o kern/77826 fs [ext2fs] ext2fs usb filesystem will not mount RW o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 150 problems total. From linimon at FreeBSD.org Mon Aug 24 11:52:51 2009 From: linimon at FreeBSD.org (linimon@FreeBSD.org) Date: Mon Aug 24 11:52:57 2009 Subject: kern/131441: [unionfs] [nullfs] unionfs and/or nullfs not combineable Message-ID: <200908241152.n7OBqp0O004642@freefall.freebsd.org> Old Synopsis: unionfs and/or nullfs not combineable New Synopsis: [unionfs] [nullfs] unionfs and/or nullfs not combineable Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Aug 24 11:52:30 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=131441

From m at plus-plus.su Mon Aug 24 12:28:43 2009
From: m at plus-plus.su (Mikhail (Plus Plus))
Date: Mon Aug 24 12:28:50 2009
Subject: need help with ZFS
Message-ID: <4A927CB3.3040402@plus-plus.su>

Greetings gentlemen,

I need your help with ZFS -- how I can diagnose, debug ZFS crashes, and
if possible make it more stable. We're running raidz pool and we're
having hard time to get it running smooth -- ZFS simply crashes as soon
as we put some load on it.

Couple months back we've built custom server to replace our old storage
system. New server has 7 x 1GB SATA drives, Intel Q6600 Quad core CPU,
8GB RAM. OS: FreeBSD 7.2-RELEASE-p2 amd64, stock GENERIC kernel.

So I decided to try ZFS, and I was amazed by it's features. I've read
ZFS wiki page and my loader.conf follows:

zen# cat loader.conf
vm.kmem_size="1536M"
vm.kmem_size_max="3072M"
vm.pmap.shpgperproc="1024"
vfs.zfs.arc_min="256M"
vfs.zfs.arc_max="384M"
vfs.zfs.vdev.cache.size="50M"
vfs.zfs.prefetch_disable="1"
kern.maxproc="20000"
zen#

and

zen# zpool status
  pool: datapool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        datapool    ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad24    ONLINE       0     0     0
            ad8     ONLINE       0     0     0
            ad18    ONLINE       0     0     0
            ad20    ONLINE       0     0     0
            ad22    ONLINE       0     0     0
            ad10    ONLINE       0     0     0
        spares
          ad26      AVAIL

errors: No known data errors
zen# zpool list
NAME       SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
datapool  5.44T  3.54T  1.90T    65%  ONLINE  -
zen#


Problem starts as soon as we put some load on FS - e.g. last night I
tried to rsync 1TB of different files from above ZFS pool to single hard
drive (partitioned UFS2), but server crashed and restarted after
copying ~700GB of data. I also tried to stress-test it by running 100
torrent downloads (using rtorrent), and server also crashes and restarts
after running for about 30-40 minutes.
My guess these crashes happen due to big load on filesystem - ZFS eats
all available memory and then server simply crashes.

Right now I'm completely lost - I can't even copy 1TB from ZFS to
another partition..
How can I diagnose the issue? Is there anything available to make ZFS
more stable?

Thanks,
Mikhail.

From p.christias at noc.ntua.gr Mon Aug 24 12:57:40 2009
From: p.christias at noc.ntua.gr (Panagiotis Christias)
Date: Mon Aug 24 12:57:48 2009
Subject: need help with ZFS
In-Reply-To: <4A927CB3.3040402@plus-plus.su>
References: <4A927CB3.3040402@plus-plus.su>
Message-ID: <20090824125737.GA92643@noc.ntua.gr>

On Mon, Aug 24, 2009 at 03:42:43PM +0400, Mikhail (Plus Plus) wrote:
> Greetings gentlemen,
>
> I need your help with ZFS -- how I can diagnose, debug ZFS crashes, and
> if possible make it more stable. We're running raidz pool and we're
> having hard time to get it running smooth -- ZFS simply crashes as soon
> as we put some load on it.
>
> Couple months back we've built custom server to replace our old storage
> system. New server has 7 x 1GB SATA drives, Intel Q6600 Quad core CPU,
> 8GB RAM. OS: FreeBSD 7.2-RELEASE-p2 amd64, stock GENERIC kernel.
>
> So I decided to try ZFS, and I was amazed by it's features. I've read
> ZFS wiki page and my loader.conf follows:
>
> zen# cat loader.conf
> vm.kmem_size="1536M"
> vm.kmem_size_max="3072M"
> vm.pmap.shpgperproc="1024"
> vfs.zfs.arc_min="256M"
> vfs.zfs.arc_max="384M"
> vfs.zfs.vdev.cache.size="50M"
> vfs.zfs.prefetch_disable="1"
> kern.maxproc="20000"
> zen#
>
> and
>
> zen# zpool status
>   pool: datapool
>  state: ONLINE
>  scrub: none requested
> config:
>
>         NAME        STATE     READ WRITE CKSUM
>         datapool    ONLINE       0     0     0
>           raidz1    ONLINE       0     0     0
>             ad24    ONLINE       0     0     0
>             ad8     ONLINE       0     0     0
>             ad18    ONLINE       0     0     0
>             ad20    ONLINE       0     0     0
>             ad22    ONLINE       0     0     0
>             ad10    ONLINE       0     0     0
>         spares
>           ad26      AVAIL
>
> errors: No known data errors
> zen# zpool list
> NAME       SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
> datapool  5.44T  3.54T  1.90T    65%  ONLINE  -
> zen#
>
>
> Problem starts as soon as we put some load on FS - e.g. last night I
> tried to rsync 1TB of different files from above ZFS pool to single hard
> drive (partitioned UFS2), but server crashed and restarted after
> copying ~700GB of data. I also tried to stress-test it by running 100
> torrent downloads (using rtorrent), and server also crashes and restarts
> after running for about 30-40 minutes.
> My guess these crashes happen due to big load on filesystem - ZFS eats
> all available memory and then server simply crashes.
>
> Right now I'm completely lost - I can't even copy 1TB from ZFS to
> another partition..
> How can I diagnose the issue? Is there anything available to make ZFS
> more stable?

I would suggest you try FreeBSD 8.0 that includes the latest version of
ZFS (version 13), which fixed several problems present in 7.x. Then,
check for crash dumps (see dumpon(8)), collect any available info and
sent it to the list.

Regards,
Panagiotis

--
Panagiotis J. Christias    Network Management Center
P.Christias@noc.ntua.gr    National Technical Univ. of Athens, GREECE

From m at plus-plus.su Mon Aug 24 13:04:44 2009
From: m at plus-plus.su (Mikhail (Plus Plus))
Date: Mon Aug 24 13:04:50 2009
Subject: need help with ZFS
In-Reply-To: <20090824125737.GA92643@noc.ntua.gr>
References: <4A927CB3.3040402@plus-plus.su> <20090824125737.GA92643@noc.ntua.gr>
Message-ID: <4A92902F.70606@plus-plus.su>

Panagiotis Christias wrote:
> I would suggest you try FreeBSD 8.0 that includes the latest version of
> ZFS (version 13), which fixed several problems present in 7.x. Then,
> check for crash dumps (see dumpon(8)), collect any available info and
> sent it to the list.
>

Does that mean I will have to format my current zpool in order to
"upgrade" to ZFS v13? And if it is true then another problem is that I
have to backup current data somewhere before I'll be able to format
zpool to new version, but I cannot do this because server simply
crashes every time I try to copy that much of data to another hdd.

Mikhail.

From spawk at acm.poly.edu Mon Aug 24 14:17:58 2009
From: spawk at acm.poly.edu (Boris Kochergin)
Date: Mon Aug 24 14:18:04 2009
Subject: need help with ZFS
In-Reply-To: <4A92902F.70606@plus-plus.su>
References: <4A927CB3.3040402@plus-plus.su> <20090824125737.GA92643@noc.ntua.gr> <4A92902F.70606@plus-plus.su>
Message-ID: <4A92A0E9.1060508@acm.poly.edu>

Mikhail (Plus Plus) wrote:
> Panagiotis Christias wrote:
>> I would suggest you try FreeBSD 8.0 that includes the latest version of
>> ZFS (version 13), which fixed several problems present in 7.x. Then,
>> check for crash dumps (see dumpon(8)), collect any available info and
>> sent it to the list.
>>
>
> Does that mean I will have to format my current zpool in order to
> "upgrade" to ZFS v13? And if it is true then another problem is that I
> have to backup current data somewhere before I'll be able to format
> zpool to new version, but I cannot do this because server simply
> crashes every time I try to copy that much of data to another hdd.
>
> Mikhail.
> _______________________________________________
> freebsd-fs@freebsd.org mailing list
> http://lists.freebsd.org/mailman/listinfo/freebsd-fs
> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org"

ZFS version 13 refers to two things: the ZFS code, and the on-disk
format. Most, if not all of your problems will probably be resolved by
upgrading to the new code, as found in 7.2-STABLE and 8.0. New versions
of the code are backwards compatible with old on-disk formats. Should
you decide to upgrade to the new on-disk format with "zfs set version=3"
and "zpool upgrade," the process is documented as keeping your data
intact. While it is always a good idea to back up your data before
potentially-dangerous filesystem operations, I, along with many other
people, have upgraded ZFS on-disk formats without issue.
-Boris From brde at optusnet.com.au Mon Aug 24 14:40:04 2009 From: brde at optusnet.com.au (Bruce Evans) Date: Mon Aug 24 14:40:10 2009 Subject: kern/138109: Minor cleanups to the sys/gnu/fs/ext2fs based on BSD Lite2 Message-ID: <200908241440.n7OEe3EZ072161@freefall.freebsd.org> The following reply was made to PR kern/138109; it has been noted by GNATS. From: Bruce Evans To: "Pedro F. Giffuni" Cc: freebsd-gnats-submit@FreeBSD.org Subject: Re: kern/138109: Minor cleanups to the sys/gnu/fs/ext2fs based on BSD Lite2 Date: Mon, 24 Aug 2009 23:14:47 +1000 (EST) On Sun, 23 Aug 2009, Pedro F. Giffuni wrote: >> Description: > I have been looking at some of the FFS BSD-lite2 fixes to apply them to our ext2fs (based on an older FFS1 from BSD lites). This is helping getting some of the code more in sync with the NetBSD implementation. Please don't format mail for 200+ column terminals. > I am still missing some bigger changes but for now here are pretty simple cleanups, based on these FFS changes: > > ffs_inode.c > ------------ > Use the correct flags (IO_SYNC -> B_SYNC) when deciding to do a sync or > async write in the section that changes the filesize. The bug resulted > in the updates always being async. I tested this a bit after you told me about it a few months ago. > ffs_vfsops.c > ------------- > Speed up for vfs_bio -- addition of a routine bqrelse to greatly diminish > overhead for merged cache. Interesting. I've used this for many years, but didn't notice it reducing overheads. 
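
The IO_SYNC -> B_SYNC item quoted above is easy to model outside the
kernel. The truncate path builds aflags from B_CLRBUF and, when the
caller passed IO_SYNC, adds B_SYNC; the old code then tested aflags
against IO_SYNC, a bit that was never set in aflags. The sketch below
is only a model of that logic, not kernel code: the numeric flag values
are hypothetical (picked so the two flag sets do not overlap) and
choose_write is a made-up helper name.

```python
# Hypothetical bit values, chosen only so that the caller-side flag and
# the balloc-side flags do not overlap; the real kernel constants may differ.
IO_SYNC = 0x0080   # flag passed in by the caller of the truncate routine
B_CLRBUF = 0x0001  # balloc flag: zero the newly allocated buffer
B_SYNC = 0x0002    # balloc flag: write synchronously


def choose_write(flags, fixed):
    """Return 'bwrite' (sync) or 'bawrite' (async) as the truncate path would."""
    aflags = B_CLRBUF
    if flags & IO_SYNC:
        aflags |= B_SYNC
    # The old code tested the caller-side constant against the balloc flags;
    # the patched code tests B_SYNC, the bit that was actually set.
    test_bit = B_SYNC if fixed else IO_SYNC
    return "bwrite" if aflags & test_bit else "bawrite"


if __name__ == "__main__":
    print("old code, sync requested:", choose_write(IO_SYNC, fixed=False))
    print("new code, sync requested:", choose_write(IO_SYNC, fixed=True))
```

Under these assumed values the old test can never succeed, which
matches the description that the updates always ended up async.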
The changes involving vnode_pager_setsize() seem to be bugs: >> How-To-Repeat: > >> Fix: > diff -ruN ext2fs.orig/ext2_inode.c ext2fs/ext2_inode.c > --- ext2fs.orig/ext2_inode.c 2009-08-18 20:32:13.000000000 -0500 > +++ ext2fs/ext2_inode.c 2009-08-23 12:37:18.000000000 -0500 > @@ -126,16 +126,16 @@ > long count, nblocks, blocksreleased = 0; > int aflags, error, i, allerror; > off_t osize; > -/* -printf("ext2_truncate called %d to %d\n", VTOI(ovp)->i_number, length); > -*/ /* > + > + /* > * negative file sizes will totally break the code below and > * are not meaningful anyways. > + * XXX: We should check for max file size here too. > */ > + oip = VTOI(ovp); > if (length < 0) > - return EFBIG; > + return EINVAL; Should also fix the style bugs (indentation, and missing parentheses). These style bugs are missing in the ffs version. > > - oip = VTOI(ovp); > if (ovp->v_type == VLNK && > oip->i_size < ovp->v_mount->mnt_maxsymlinklen) { > #ifdef DIAGNOSTIC This code should be almost identical with that in ffs, and now almost is. Did you get if from FreeBSD or NetBSD? The comments in it don't seem necessary. ffs doesn't have them. Both check for maxfilesize, but only ffs does it at this point, before the VLNK code, but after the extended attributes code). It may be technically correct to not check maxfilesize for either symlinks or extended attributes, since maxfilesize only applies to regular files, but maxfilesize should be larger so checking it first is harmless. ffs is now inconsisent about this for symlinks versus extended attributes. FreeBSD history shows that it is ffs that has the misplaced check for maxfilesize. Lite1 was just missing the check. Lite2 added it where ext2fs has it now. FreeBSD already had it when Lite2 was merged in ffs_inode.c 1.24. FreeBSD had it up-front, and FreeBSD had the wrong error code value for the (length < 0) case (like ext2fs has now). Rev.1.24 merged Lite2 incompletely by fixing the error code but not moving the maxfilesize check. 
> ... > @@ -167,12 +167,13 @@ > aflags = B_CLRBUF; > if (flags & IO_SYNC) > aflags |= B_SYNC; > - vnode_pager_setsize(ovp, length); Moving this is dangerous. It is inconsistent with ffs and seems wrong. Don't copy NetBSD for this. dyson!@ had to fix many misplaced calls to vnode_pager_setsize(). > - if ((error = ext2_balloc(oip, lbn, offset + 1, cred, &bp, > - aflags)) != 0) > + error = ext2_balloc(oip, lbn, offset + 1, cred, > + &bp, aflags); This line no longer needs splitting. > + if (error) > return (error); Another old bug is not restoring the size on error. This was fixed in ffs (but not here :-() just this year (ffs_inode.c 1.115). Presumably the old code delayed the setting until the space was allocated to avoid having to clean up and/or to avoid having an inconsistent setting while allocating, but in FreeBSD it is necessary to increase the vm size while allocating. > oip->i_size = length; > - if (aflags & IO_SYNC) > + vnode_pager_setsize(ovp, length); Don't move this -- see above. > + if (aflags & B_SYNC) > bwrite(bp); > else > bawrite(bp); > @@ -195,18 +196,20 @@ > aflags = B_CLRBUF; > if (flags & IO_SYNC) > aflags |= B_SYNC; > - if ((error = ext2_balloc(oip, lbn, offset, cred, &bp, > - aflags)) != 0) > + ext2_balloc(oip, lbn, offset, cred, &bp, > + aflags) This line shouldn't be split, much more so than the one above. > + if (error) > return (error); Another place that is missing restoring the vm size on error. > oip->i_size = length; > size = blksize(fs, oip, lbn); > bzero((char *)bp->b_data + offset, (u_int)(size - offset)); > allocbuf(bp, size); > - if (aflags & IO_SYNC) > + if (aflags & B_SYNC) > bwrite(bp); > else > bawrite(bp); > } > + vnode_pager_setsize(ovp, length); Don't move this... 
> /*
> * Calculate index into inode's block list of
> * last direct and indirect blocks (if any)
> diff -ruN ext2fs.orig/ext2_vfsops.c ext2fs/ext2_vfsops.c
> --- ext2fs.orig/ext2_vfsops.c 2009-08-18 20:32:13.000000000 -0500
> +++ ext2fs/ext2_vfsops.c 2009-08-23 12:40:27.000000000 -0500
> @@ -171,10 +171,7 @@
> flags = WRITECLOSE;
> if (mp->mnt_flag & MNT_FORCE)
> flags |= FORCECLOSE;
> - if (vfs_busy(mp, LK_NOWAIT, 0, td))
> - return (EBUSY);
> error = ext2_flushfiles(mp, flags, td);
> - vfs_unbusy(mp, td);
> if (!error && fs->s_wasvalid) {
> fs->s_es->s_state |= EXT2_VALID_FS;
> ext2_sbupdate(ump, MNT_WAIT);

Consistent with ffs, but I don't understand it.

> @@ -496,10 +493,10 @@
> * Things to do to update the mount:
> * 1) invalidate all cached meta-data.
> * 2) re-read superblock from disk.
> - * 3) re-read summary information from disk.
> - * 4) invalidate all inactive vnodes.
> - * 5) invalidate all cached file data.
> - * 6) re-read inode data for all active vnodes.
> + * 3) (re-read summary information from disk.)
> + * - (invalidate all inactive vnodes.)
> + * 4) invalidate all cached file data.
> + * 5) re-read inode data for all active vnodes.
> */
> static int
> ext2_reload(struct mount *mp, struct thread *td)

Don't renumber this. The meaning of the parentheses is unclear. IIRC, ext2fs doesn't do all the steps here, but it should. Use descriptive comments instead of parentheses to say which ones.

> @@ -1007,8 +1004,8 @@
> * still zero, it will be unlinked and returned to the free
> * list by vput().
> */
> - vput(vp);
> brelse(bp);
> + vput(vp);
> *vpp = NULL;
> return (error);
> }

Consistent with ffs, and seems to be needed (not just a style fix).

> @@ -1032,7 +1029,7 @@
> /*
> ext2_print_inode(ip);
> */
> - brelse(bp);
> + bqrelse(bp);
>
> /*
> * Initialize the vnode from the inode, check for aliases.

Though I use it, I'm not sure about the safety of this.
Bruce From matt at corp.spry.com Mon Aug 24 17:01:59 2009 From: matt at corp.spry.com (Matt Simerson) Date: Mon Aug 24 17:02:06 2009 Subject: need help with ZFS In-Reply-To: <4A92902F.70606@plus-plus.su> References: <4A927CB3.3040402@plus-plus.su> <20090824125737.GA92643@noc.ntua.gr> <4A92902F.70606@plus-plus.su> Message-ID: <7A60E3BB-04DB-4272-9052-7477FC10DBBA@spry.com> On Aug 24, 2009, at 6:05 AM, Mikhail (Plus Plus) wrote: > Panagiotis Christias wrote: >> I would suggest you try FreeBSD 8.0 that includes the latest >> version of >> ZFS (version 13), which fixed several problems present in 7.x. >> Then, check for crash dumps (see dumpon(8)), collect any available >> info and >> sent it to the list. I second this recommendation. The ZFS code in 8.0 is considerably more stable than what's currently in 7.2-STABLE. > Does that mean I will have to format my current zpool in order to > "upgrade" to ZFS v13? And if it is true then another problem is that > I have to backup current data somewhere before I'll be able to > format zpool to new version, but I cannot do this because server > simply crashes every time I try to copy that much of data to another > hdd. Not at all. Simply install 8.0 and do a zpool upgrade. Matt From olivier at gid0.org Mon Aug 24 17:44:04 2009 From: olivier at gid0.org (Olivier Smedts) Date: Mon Aug 24 17:44:11 2009 Subject: need help with ZFS In-Reply-To: <7A60E3BB-04DB-4272-9052-7477FC10DBBA@spry.com> References: <4A927CB3.3040402@plus-plus.su> <20090824125737.GA92643@noc.ntua.gr> <4A92902F.70606@plus-plus.su> <7A60E3BB-04DB-4272-9052-7477FC10DBBA@spry.com> Message-ID: <367b2c980908241014l425b48e3uea765a51f0447ea3@mail.gmail.com> 2009/8/24 Matt Simerson : > > On Aug 24, 2009, at 6:05 AM, Mikhail (Plus Plus) wrote: > >> Panagiotis Christias wrote: >>> >>> I would suggest you try FreeBSD 8.0 that includes the latest version of >>> ZFS (version 13), which fixed several problems present in 7.x. 
Then, >>> check for crash dumps (see dumpon(8)), collect any available info and >>> sent it to the list. > > I second this recommendation. The ZFS code in 8.0 is considerably more > stable than what's currently in 7.2-STABLE. > >> Does that mean I will have to format my current zpool in order to >> "upgrade" to ZFS v13? And if it is true then another problem is that I have >> to backup current data somewhere before I'll be able to format zpool to new >> version, but I cannot do this because server simply crashes every time I try >> to copy that much of data to another hdd. > > Not at all. Simply install 8.0 and do a zpool upgrade. And don't forget to zfs upgrade after the zpool upgrade ! > > Matt > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > -- Olivier Smedts _ ASCII ribbon campaign ( ) e-mail: olivier@gid0.org - against HTML email & vCards X www: http://www.gid0.org - against proprietary attachments / \ "Il y a seulement 10 sortes de gens dans le monde : ceux qui comprennent le binaire, et ceux qui ne le comprennent pas." From grarpamp at gmail.com Mon Aug 24 18:55:11 2009 From: grarpamp at gmail.com (grarpamp) Date: Mon Aug 24 18:55:23 2009 Subject: ZFS versions [was: need help] Message-ID: FYI There are both zpool and zfs versions. They are independant. Though to fully enable some features both need upgraded. See zpool(1M), zfs(1M) and the ticket/psarc cases listed in the urls for more. The canonical references and utilities [read-only] are: http://opensolaris.org/os/community/zfs/version/n/ zpool upgrade -v zpool upgrade http://opensolaris.org/os/community/zfs/version/zpl/n/ zfs upgrade -v zfs upgrade The parameter setting method is not preferred. 
FreeBSD RELENG_7 and above are at: zpool 13 zfs 3 OpenSolaris is at: zpool 18 zfs 4 From p.christias at noc.ntua.gr Mon Aug 24 20:09:42 2009 From: p.christias at noc.ntua.gr (Panagiotis Christias) Date: Mon Aug 24 20:10:12 2009 Subject: ZFS versions [was: need help] In-Reply-To: References: Message-ID: <20090824200938.GB13590@noc.ntua.gr> On Mon, Aug 24, 2009 at 02:33:54PM -0400, grarpamp wrote: > FYI > There are both zpool and zfs versions. > They are independant. Though to fully enable some > features both need upgraded. See zpool(1M), zfs(1M) > and the ticket/psarc cases listed in the urls for more. > > The canonical references and utilities [read-only] are: > > http://opensolaris.org/os/community/zfs/version/n/ > zpool upgrade -v > zpool upgrade > http://opensolaris.org/os/community/zfs/version/zpl/n/ > zfs upgrade -v > zfs upgrade > > The parameter setting method is not preferred. > > FreeBSD RELENG_7 and above are at: > zpool 13 > zfs 3 > > OpenSolaris is at: > zpool 18 > zfs 4 Very useful information. http://wiki.freebsd.org/ZFS would be a good place for it. What I had in mind was the latest MFC regarding ZFS (pool) version 13 on May 20th (two weeks after the 7.2 release): http://svn.freebsd.org/viewvc/base?view=revision&revision=192498 And the corresponding entry in /usr/src/UPDATING: "20090520: Update ZFS to version 13. ZFS users will need to re-build and install both kernel and world simultaneously in order for the ZFS tools to work. Existing pools will continue to work without upgrade. If a pool is upgraded it will no longer be usable by older kernel revs. ZFS send / recv between pool version 6 and pool version 13 is not supported." Regargs, Panagiotis -- Panagiotis J. Christias Network Management Center P.Christias@noc.ntua.gr National Technical Univ. 
of Athens, GREECE From serenity at exscape.org Tue Aug 25 12:25:09 2009 From: serenity at exscape.org (Thomas Backman) Date: Tue Aug 25 12:25:21 2009 Subject: Yet another ZFS recv panic; old but rarely seen In-Reply-To: <20090821110031.GB1962@garage.freebsd.pl> References: <7F161876-8DA7-4617-98B6-7CD54C691BC6@exscape.org> <306284EA-C89C-433C-9D33-E6CF44305800@exscape.org> <20090821110031.GB1962@garage.freebsd.pl> Message-ID: <917520A8-94F5-41CD-8274-EC0228EE70A2@exscape.org> On Aug 21, 2009, at 13:00, Pawel Jakub Dawidek wrote: > > Right, the bug is already fixed in OpenSolaris. If you can reproduce > the > problem, you might try this patch: > > http://people.freebsd.org/~pjd/patches/dirtying_dbuf.patch Well, since I wasn't able to reliably reproduce the panic, I can't say if the patch is working or not... but after the stress testing, I applied it anyway to check for regressions, and so far it seems to work. Regards, Thomas From yuri at rawbw.com Tue Aug 25 20:30:15 2009 From: yuri at rawbw.com (Yuri) Date: Tue Aug 25 20:30:21 2009 Subject: Is it possible to add this patch to FreeBSD-8.0? (kern/133174: [msdosfs] [patch] msdosfs must support utf-encoded international characters in file names) Message-ID: <4A944498.8030203@rawbw.com> This PR is pending for a while: http://www.freebsd.org/cgi/query-pr.cgi?pr=133174 I tested the patch many times and never found any problems. Mandarin file names all look good. Is there a way to check this patch into 8.0? I mount disks with the option -L=zh_TW.UTF-8. Not sure if the first part (zh_TW) really matters. ru_RU.UTF-8 works with Chinese file names the same way. I guess it should be just -L=UTF-8 but this makes mount_msdosfs to fail with some strange error message. It's very important to have FreeBSD understand international file names in msdosfs since this will make it's use easier for non-English users. Please let me know what you think. 
Thank you, Yuri From spawk at acm.poly.edu Wed Aug 26 00:00:04 2009 From: spawk at acm.poly.edu (Boris Kochergin) Date: Wed Aug 26 00:00:12 2009 Subject: geom_mirror/UFS weirdness with 7.2-STABLE In-Reply-To: References: <4A646DA8.2050201@acm.poly.edu> Message-ID: <4A947AE6.7070401@acm.poly.edu> Ivan Voras wrote: > Boris Kochergin wrote: > >> Ahoy. I noticed some very odd things in my file server's kernel buffer >> this morning (there were actually a ton of these--this is a snippet): >> >> Jul 20 05:54:10 exodus smartd[763]: Device: /dev/ad1, FAILED SMART >> self-check. BACK UP DATA NOW! >> Jul 20 05:57:57 exodus kernel: >> g_vfs_done():mirror/boots1[READ(offset=-4569735194538825728, >> length=16384)]error = 5 >> Jul 20 05:57:57 exodus kernel: bad block 8806809555123731765, ino 4430620 >> Jul 20 05:57:57 exodus kernel: pid 35 (softdepflush), uid 0 inumber >> 4430620 on /: bad block >> > > >> # df / >> Filesystem 1K-blocks Used Avail >> Capacity Mounted on >> /dev/mirror/boots1 37846636 -4058799239201906816 4058799239236725722 >> -11656883301279% / >> >> The system is a: >> >> # uname -a >> FreeBSD exodus.poly.edu 7.2-STABLE FreeBSD 7.2-STABLE #3: Sat Jul 11 >> 16:22:02 EDT 2009 root@exodus.poly.edu:/usr/obj/usr/src/sys/EXODUS >> amd64 >> >> Regarding smartd yelling at me about /dev/ad1, it's been doing that for >> long while before this. There is one sector on the drive that cannot be >> read, but the disk has otherwise been fine for months. My experience >> with geom_mirror has been that it disconnects members from an array if >> they experience I/O errors, so this seems to be something different. Any >> clues? >> > > It looks like the drive returned corrupted data without returning an > error - which is strange, but not impossible. You are probably seeing > numbers like -4058799239201906816 because some metadata is corrupted. 
If > so, you should immediately disconnect the problematic drive so that the > errorneous data isn't picked up and written to the good drive. > > > In retrospect, it appears to have been bad RAM. The symptoms were just subtler back then. -Boris From spawk at acm.poly.edu Wed Aug 26 00:14:41 2009 From: spawk at acm.poly.edu (Boris Kochergin) Date: Wed Aug 26 00:14:48 2009 Subject: ZFS RAID-Z panic on vdev failure + subsequent panics and hangs In-Reply-To: <4A8AA531.2000004@acm.poly.edu> References: <4A78AFB2.10103@acm.poly.edu> <20090805115621.GG1784@garage.freebsd.pl> <4A798A12.4070408@acm.poly.edu> <20090807073738.GA1607@garage.freebsd.pl> <20090807074400.GB1607@garage.freebsd.pl> <4A7C3002.8000003@acm.poly.edu> <20090807191334.GA1814@garage.freebsd.pl> <4A7C81CA.2040303@acm.poly.edu> <20090807193842.GA2487@garage.freebsd.pl> <4A7C87C5.1070608@acm.poly.edu> <20090807202756.GB2487@garage.freebsd.pl> <4A81CF20.7010108@acm.poly.edu> <4A8AA531.2000004@acm.poly.edu> Message-ID: <4A947E57.6050700@acm.poly.edu> Boris Kochergin wrote: > Boris Kochergin wrote: >> Pawel Jakub Dawidek wrote: >>> On Fri, Aug 07, 2009 at 04:00:05PM -0400, Boris Kochergin wrote: >>> >>>> Pawel Jakub Dawidek wrote: >>>> >>>>> On Fri, Aug 07, 2009 at 03:34:34PM -0400, Boris Kochergin wrote: >>>>> >>>>> >>>>>> Pawel Jakub Dawidek wrote: >>>>>> >>>>>>> Yeah, that's strange indeed. Could you try: >>>>>>> >>>>>>> print ab->b_arc_node.list_prev >>>>>>> print ab->b_arc_node.list_next >>>>>>> >>>>>>> >>>>>>> >>>>>> (kgdb) print ab->b_arc_node.list_prev >>>>>> $1 = (struct list_node *) 0x1 >>>>>> >>>>> Yeah, list_prev is corrupted. If it panics on you everytime, I could >>>>> send you a patch which will try to catch where the corruption occurs. >>>>> >>>>> >>>>> >>>> I eventually get the arc_evict panic every time I successfully >>>> manage to mount the filesystem, but it usually panics (with the >>>> other backtrace) as soon as I try to mount it, or mount just hangs. 
>>>> I'll gladly try the patch, though--the data on the array is >>>> important to me. Thanks. >>>> >>> >>> To get the data from there you could also try to 'zfs send' it without >>> mounting the dataset at all (just in case). >>> >>> >> Sorry for the delay. I had to find another machine to move the disks >> into so that I could continue experimenting. Anyway, the filesystem >> didn't have any snapshots I could send, so I tried creating one with >> "zfs snapshot home@1" and the machine hung. >> >> FYI, In the new machine, all disks (including the one with the / >> filesystem) retain their device names. >> >> -Boris >> _______________________________________________ >> freebsd-fs@freebsd.org mailing list >> http://lists.freebsd.org/mailman/listinfo/freebsd-fs >> To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > Some more panics using RELENG_8 sources from yesterday: > http://acm.poly.edu/~spawk/zfs/. The one in panic3.txt happens much > more often than the other ones. If any brave soul wants to look into > it, I can provide NFS/geom_gate/whatever access to the disk images (or > actual disks, if there's a difference) so that they can recreate the > problem on a local machine. > > -Boris > _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" For the archives: pjd@ took some time to examine the disk images I made of the RAID-Z pool, but found heavy corruption in the metadata. As it turns out, the machine had bad RAM during the incident, and that is probably what caused it. Unfortunately, I had only started to suspect it recently as random userland application and kernel panics became frequent. This is good news for ZFS users, as it indicates that ZFS did not corrupt my pool on its own. I do, however, advise you to be mindful of the problems bad memory can cause for ZFS. 
Personally, I will start shelling out a few more bucks for the ECC stuff from now on. (Eagerly awaiting the read-only offline recovery functionality described at http://www.mail-archive.com/zfs-discuss@opensolaris.org/msg20092.html). -Boris From mi+thun at aldan.algebra.com Wed Aug 26 04:47:30 2009 From: mi+thun at aldan.algebra.com (Mikhail T.) Date: Wed Aug 26 04:47:36 2009 Subject: semantics of fcntl() with l_len being 0 Message-ID: <4A94B829.3090202@aldan.algebra.com> Hello! I'm curious, whether a file, that's locked (via fcntl) with l_start and l_len being 0 is supposed to be appendable... I would think so, but I notice, that when spamprobe (see mail/spamprobe) chews on my spam mailbox, I can not append a new piece of spam to the file -- my imap-server is waiting for spamprobe to finish. The fcntl(2) says: ``len = 0 means until end of file''. Is that ``until the end of file AT THE TIME OF LOCKING'' or simply ``no other lock until we are done''? Thanks. Yours, -mi From kostikbel at gmail.com Wed Aug 26 09:09:57 2009 From: kostikbel at gmail.com (Kostik Belousov) Date: Wed Aug 26 09:10:03 2009 Subject: semantics of fcntl() with l_len being 0 In-Reply-To: <4A94B829.3090202@aldan.algebra.com> References: <4A94B829.3090202@aldan.algebra.com> Message-ID: <20090826090947.GS9623@deviant.kiev.zoral.com.ua> On Wed, Aug 26, 2009 at 12:20:57AM -0400, Mikhail T. wrote: > Hello! > > I'm curious, whether a file, that's locked (via fcntl) with l_start and > l_len being 0 is supposed to be appendable... > > I would think so, but I notice, that when spamprobe (see mail/spamprobe) > chews on my spam mailbox, I can not append a new piece of spam to the > file -- my imap-server is waiting for spamprobe to finish. > > The fcntl(2) says: ``len = 0 means until end of file''. Is that ``until > the end of file AT THE TIME OF LOCKING'' or simply ``no other lock until > we are done''? Thanks. 
Yours, SUSv3 is definitive on the subject: A lock shall be set to extend to the largest possible value of the file offset for that file by setting l_len to 0. If such a lock also has l_start set to 0 and l_whence is set to SEEK_SET, the whole file shall be locked. From sys/kern/kern_lockf.c, line 464: } else if (fl->l_len == 0) { end = OFF_MAX; -------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090826/387a7064/attachment.pgp From linimon at FreeBSD.org Wed Aug 26 09:45:55 2009 From: linimon at FreeBSD.org (linimon@FreeBSD.org) Date: Wed Aug 26 09:46:01 2009 Subject: kern/138202: mount_msdosfs(1) see only 2Gb Message-ID: <200908260945.n7Q9js1Y050637@freefall.freebsd.org> Old Synopsis: mount_msdosfs see only 2Gb New Synopsis: mount_msdosfs(1) see only 2Gb Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Wed Aug 26 09:45:27 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=138202 From mi+thun at aldan.algebra.com Wed Aug 26 13:57:05 2009 From: mi+thun at aldan.algebra.com (Mikhail T.) Date: Wed Aug 26 13:57:12 2009 Subject: semantics of fcntl() with l_len being 0 In-Reply-To: <20090826090947.GS9623@deviant.kiev.zoral.com.ua> References: <4A94B829.3090202@aldan.algebra.com> <20090826090947.GS9623@deviant.kiev.zoral.com.ua> Message-ID: <4A953F30.8030202@aldan.algebra.com> Kostik Belousov ???????(??): > On Wed, Aug 26, 2009 at 12:20:57AM -0400, Mikhail T. wrote: > >> Hello! >> >> I'm curious, whether a file, that's locked (via fcntl) with l_start and >> l_len being 0 is supposed to be appendable... 
>> >> I would think so, but I notice, that when spamprobe (see mail/spamprobe) >> chews on my spam mailbox, I can not append a new piece of spam to the >> file -- my imap-server is waiting for spamprobe to finish. >> >> The fcntl(2) says: ``len = 0 means until end of file''. Is that ``until >> the end of file AT THE TIME OF LOCKING'' or simply ``no other lock until >> we are done''? Thanks. Yours, >> > > SUSv3 is definitive on the subject: > A lock shall be set to extend to the largest possible value of the file > offset for that file by setting l_len to 0. If such a lock also has > l_start set to 0 and l_whence is set to SEEK_SET, the whole file shall > be locked. > I would not say, this is definitive -- if one attempts to grow a locked file, "the whole file" (as existed at the lock-time) will remain non-violated... The code is more explicit, of course: > From sys/kern/kern_lockf.c, line 464: > } else if (fl->l_len == 0) { > end = OFF_MAX; > So, how can one properly lock only the area, that exists at the time of locking? Perform a stat() first? Thanks! -mi From kostikbel at gmail.com Wed Aug 26 14:06:50 2009 From: kostikbel at gmail.com (Kostik Belousov) Date: Wed Aug 26 14:06:57 2009 Subject: semantics of fcntl() with l_len being 0 In-Reply-To: <4A953F30.8030202@aldan.algebra.com> References: <4A94B829.3090202@aldan.algebra.com> <20090826090947.GS9623@deviant.kiev.zoral.com.ua> <4A953F30.8030202@aldan.algebra.com> Message-ID: <20090826140640.GA9623@deviant.kiev.zoral.com.ua> On Wed, Aug 26, 2009 at 09:57:04AM -0400, Mikhail T. wrote: > Kostik Belousov ???????(??): > > On Wed, Aug 26, 2009 at 12:20:57AM -0400, Mikhail T. wrote: > > > >> Hello! > >> > >> I'm curious, whether a file, that's locked (via fcntl) with l_start and > >> l_len being 0 is supposed to be appendable... 
> >> > >> I would think so, but I notice, that when spamprobe (see mail/spamprobe) > >> chews on my spam mailbox, I can not append a new piece of spam to the > >> file -- my imap-server is waiting for spamprobe to finish. > >> > >> The fcntl(2) says: ``len = 0 means until end of file''. Is that ``until > >> the end of file AT THE TIME OF LOCKING'' or simply ``no other lock until > >> we are done''? Thanks. Yours, > >> > > > > SUSv3 is definitive on the subject: > > A lock shall be set to extend to the largest possible value of the file > > offset for that file by setting l_len to 0. If such a lock also has > > l_start set to 0 and l_whence is set to SEEK_SET, the whole file shall > > be locked. > > > I would not say, this is definitive -- if one attempts to grow a locked > file, "the whole file" (as existed at the lock-time) will remain > non-violated... The code is more explicit, of course: > > From sys/kern/kern_lockf.c, line 464: > > } else if (fl->l_len == 0) { > > end = OFF_MAX; > > > So, how can one properly lock only the area, that exists at the time of > locking? Perform a stat() first? Thanks! You might lock the maximal range, to prevent the modifications from other accessors that honour the protocol, do stat call, and then unlock the range from EOF to 0 (AKA max offset). -------------- next part -------------- A non-text attachment was scrubbed... 
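Editor's note: the lock, stat, unlock sequence suggested above can be sketched roughly as follows. This is a minimal sketch, assuming POSIX advisory record locks through fcntl(2) and a descriptor opened for writing; the helper name lock_existing_range is hypothetical and does not come from the thread. It only coordinates with other processes that take fcntl locks themselves.

```c
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical helper (not from the thread): write-lock only the bytes
 * that exist at the time of the call, leaving everything past the
 * current end of file unlocked so cooperating writers can still append.
 * Returns the locked length in bytes, or -1 on error.
 */
off_t
lock_existing_range(int fd)
{
	struct flock fl;
	struct stat sb;

	/* Step 1: l_start = 0 with l_len = 0 locks up to the maximum offset. */
	memset(&fl, 0, sizeof(fl));
	fl.l_type = F_WRLCK;
	fl.l_whence = SEEK_SET;
	fl.l_start = 0;
	fl.l_len = 0;
	if (fcntl(fd, F_SETLKW, &fl) == -1)
		return (-1);

	/* Step 2: with the whole range held, read the current size. */
	if (fstat(fd, &sb) == -1) {
		fl.l_type = F_UNLCK;	/* drop the whole-file lock again */
		(void)fcntl(fd, F_SETLK, &fl);
		return (-1);
	}

	/* Step 3: release [st_size, max offset); l_len = 0 means "to the end". */
	fl.l_type = F_UNLCK;
	fl.l_start = sb.st_size;
	fl.l_len = 0;
	if (fcntl(fd, F_SETLK, &fl) == -1)
		return (-1);	/* note: the whole-file lock is still held here */

	return (sb.st_size);
}
```

For an empty file the third step releases the entire range, so nothing stays locked. A writer that wants to append would then take its own lock starting at the old end of file.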
Name: not available Type: application/pgp-signature Size: 195 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090826/63c10ba0/attachment.pgp From m at plus-plus.su Wed Aug 26 14:11:19 2009 From: m at plus-plus.su (Mikhail (Plus Plus)) Date: Wed Aug 26 14:11:25 2009 Subject: need help with ZFS In-Reply-To: <367b2c980908241014l425b48e3uea765a51f0447ea3@mail.gmail.com> References: <4A927CB3.3040402@plus-plus.su> <20090824125737.GA92643@noc.ntua.gr> <4A92902F.70606@plus-plus.su> <7A60E3BB-04DB-4272-9052-7477FC10DBBA@spry.com> <367b2c980908241014l425b48e3uea765a51f0447ea3@mail.gmail.com> Message-ID: <4A9542D4.8090404@plus-plus.su> Olivier Smedts wrote: > 2009/8/24 Matt Simerson : >> On Aug 24, 2009, at 6:05 AM, Mikhail (Plus Plus) wrote: >> >>> Panagiotis Christias wrote: >>>> I would suggest you try FreeBSD 8.0 that includes the latest version of >>>> ZFS (version 13), which fixed several problems present in 7.x. Then, >>>> check for crash dumps (see dumpon(8)), collect any available info and >>>> sent it to the list. >> I second this recommendation. The ZFS code in 8.0 is considerably more >> stable than what's currently in 7.2-STABLE. >> >>> Does that mean I will have to format my current zpool in order to >>> "upgrade" to ZFS v13? And if it is true then another problem is that I have >>> to backup current data somewhere before I'll be able to format zpool to new >>> version, but I cannot do this because server simply crashes every time I try >>> to copy that much of data to another hdd. >> Not at all. Simply install 8.0 and do a zpool upgrade. > > And don't forget to zfs upgrade after the zpool upgrade ! Thanks everyone who responded. I've followed the advice and installed FreeBSD-8.0BETA3-amd64 on that server. OS resides on dedicated SATA HD, so all other hard drives left untouched. Now I cannot see zfs pool online. I've enabled ZFS in /etc/rc.conf, reboot the server, but I cannot see my zpool. 
zen# zpool list no pools available zen# I remember ZFS relies on hostid, and I've checked my current hostid is identical to the one that I had when I ran 7.2. What do I do now to see my zpool again? Thanks, Mikhail. From m at plus-plus.su Wed Aug 26 14:22:04 2009 From: m at plus-plus.su (Mikhail (Plus Plus)) Date: Wed Aug 26 14:22:11 2009 Subject: need help with ZFS In-Reply-To: <4A9542D4.8090404@plus-plus.su> References: <4A927CB3.3040402@plus-plus.su> <20090824125737.GA92643@noc.ntua.gr> <4A92902F.70606@plus-plus.su> <7A60E3BB-04DB-4272-9052-7477FC10DBBA@spry.com> <367b2c980908241014l425b48e3uea765a51f0447ea3@mail.gmail.com> <4A9542D4.8090404@plus-plus.su> Message-ID: <4A95455B.9050800@plus-plus.su> Mikhail (Plus Plus) wrote: > What do I do now to see my zpool again? > Replying to myself - Someone contacted me off-list with suggestion to do "zpool import {pool}" and it worked. Mikhail. From m at plus-plus.su Thu Aug 27 09:16:43 2009 From: m at plus-plus.su (Mikhail (Plus Plus)) Date: Thu Aug 27 09:16:50 2009 Subject: need help with ZFS In-Reply-To: <4A927CB3.3040402@plus-plus.su> References: <4A927CB3.3040402@plus-plus.su> Message-ID: <4A964F4E.4080009@plus-plus.su> Last night I installed FreeBSD-8.0BETA3 AMD64, upgraded zpool and zfs, and this morning I did some load on FS to see if it runs more stable now. To stress-load the system I ran rsync from ZFS to mounted UFS volume and at the same time I started bonie++ benchmark on ZFS volume: bonnie++ -s17000 -d. -n64 Everything worked fine for about ~90 minutes, but then system paniced. I got escaped to "db> " prompt on local console, but I have zero debugging experience, so I just rebooted the server. The only thing I noticed was that last mentioned PID on the panic screen was bonnie++'s process. 
Also, I'm very confused after re-reading ZFS tuning wiki from here: http://wiki.freebsd.org/ZFSTuningGuide

"amd64
FreeBSD 7.2+ has improved kernel memory allocation strategy and no tuning may be necessary on systems with more than 2 GB of RAM."

Does that mean I no longer have to tune ZFS via loader.conf? Just better leave it empty on FreeBSD-8.0 installation?

Right now I'm going to continue with rsync and will start bonnie++ in parallel keeping my loader.conf with the following values:

vm.kmem_size="1536M"
vm.kmem_size_max="3072M"
vm.pmap.shpgperproc="1024"
vfs.zfs.arc_min="256M"
vfs.zfs.arc_max="384M"
vfs.zfs.vdev.cache.size="50M"
vfs.zfs.prefetch_disable="1"
kern.maxproc="20000"

I would appreciate any tips on showing me how to debug system panic from "db> " prompt in case system panics again..

Thank you.

From m at plus-plus.su Thu Aug 27 09:48:41 2009
From: m at plus-plus.su (Mikhail (Plus Plus))
Date: Thu Aug 27 09:48:47 2009
Subject: need help with ZFS
In-Reply-To: <4A964F4E.4080009@plus-plus.su>
References: <4A927CB3.3040402@plus-plus.su> <4A964F4E.4080009@plus-plus.su>
Message-ID: <4A9656CE.8020107@plus-plus.su>

Mikhail (Plus Plus) wrote:
> Everything worked fine for about ~90 minutes, but then system paniced. I
> got escaped to "db> " prompt on local console, but I have zero debugging
> experience, so I just rebooted the server. The only thing I noticed was
> that last mentioned PID on the panic screen was bonnie++'s process.

Ok, I just ran bonnie++ test on ZFS (no rsync this time, only bonnie++) and system paniced again after ~23 minutes. Right now I see the following on local console:

"Memory modified after free 0xffff... val=... @ 0xff..
panic: Most recently used by solaris
cpuid = 1
KDB: enter: panic
[thread pid 883 tid 10.. ]
Stopped at kdb_enter 0x3d: movq ...(%rip)
db> "

and PID 883 was bonnie++'s PID before system panic.

Mikhail.
From numisemis at yahoo.com Thu Aug 27 09:52:26 2009 From: numisemis at yahoo.com (Simun Mikecin) Date: Thu Aug 27 09:52:32 2009 Subject: need help with ZFS In-Reply-To: <4A964F4E.4080009@plus-plus.su> References: <4A927CB3.3040402@plus-plus.su> <4A964F4E.4080009@plus-plus.su> Message-ID: <941145.26591.qm@web37302.mail.mud.yahoo.com> ----- Original Message ---- > From: Mikhail (Plus Plus) > To: freebsd-fs@freebsd.org > Sent: Thursday, August 27, 2009 11:18:06 AM > Subject: Re: need help with ZFS > > Last night I installed FreeBSD-8.0BETA3 AMD64, upgraded zpool and zfs, and this > morning I did some load on FS to see if it runs more stable now. > > To stress-load the system I ran rsync from ZFS to mounted UFS volume and at the > same time I started bonie++ benchmark on ZFS volume: > > bonnie++ -s17000 -d. -n64 > > Everything worked fine for about ~90 minutes, but then system paniced. I got > escaped to "db> " prompt on local console, but I have zero debugging experience, > so I just rebooted the server. The only thing I noticed was that last mentioned > PID on the panic screen was bonnie++'s process. > > Also, I'm very confused after re-reading ZFS tuning wiki from here: > http://wiki.freebsd.org/ZFSTuningGuide ZFSTuningGuide is sometimes misleading. Better look at /usr/src/UPDATING. > > "amd64 > FreeBSD 7.2+ has improved kernel memory allocation strategy and no tuning may be > necessary on systems with more than 2 GB of RAM." > > Does that mean I no longer have to tune ZFS via loader.conf? Just better leave > it empty on FreeBSD-8.0 installation? 
>
> Right now I'm going to continue with rsync and will start bonnie++ in parallel
> keeping my loader.conf with the following values:
>
> vm.kmem_size="1536M"
> vm.kmem_size_max="3072M"
> vm.pmap.shpgperproc="1024"
> vfs.zfs.arc_min="256M"
> vfs.zfs.arc_max="384M"
> vfs.zfs.vdev.cache.size="50M"
> vfs.zfs.prefetch_disable="1"
> kern.maxproc="20000"

Since FreeBSD 7.2 no additional ZFS tuning in loader.conf is needed (on amd64). You should remove all those settings from loader.conf, reboot, and re-run the tests. Maybe, just maybe you get a panic just because of those settings.

From olivier at gid0.org Thu Aug 27 10:16:22 2009
From: olivier at gid0.org (Olivier Smedts)
Date: Thu Aug 27 10:16:28 2009
Subject: need help with ZFS
In-Reply-To: <4A9656CE.8020107@plus-plus.su>
References: <4A927CB3.3040402@plus-plus.su> <4A964F4E.4080009@plus-plus.su> <4A9656CE.8020107@plus-plus.su>
Message-ID: <367b2c980908270316n7a21673ek3a997573f2fadbb0@mail.gmail.com>

2009/8/27 Mikhail (Plus Plus) :
> Mikhail (Plus Plus) wrote:
>>
>> Everything worked fine for about ~90 minutes, but then system paniced. I
>> got escaped to "db> " prompt on local console, but I have zero debugging
>> experience, so I just rebooted the server. The only thing I noticed was that
>> last mentioned PID on the panic screen was bonnie++'s process.
>
> Ok, I just ran bonnie++ test on ZFS (no rsync this time, only bonnie++) and
> system paniced again after ~23 minutes. Right now I see the following on
> local console:
>
> "Memory modified after free 0xffff... val=... @ 0xff..
> panic: Most recently used by solaris
> cpuid = 1
> KDB: enter: panic
> [thread pid 883 tid 10.. ]
> Stopped at kdb_enter 0x3d: movq ...(%rip)
> db> "

You can output the backtrace by entering "bt" at the "db> " prompt.

> and PID 883 was bonnie++'s PID before system panic.
>
> Mikhail.
> _______________________________________________ > freebsd-fs@freebsd.org mailing list > http://lists.freebsd.org/mailman/listinfo/freebsd-fs > To unsubscribe, send any mail to "freebsd-fs-unsubscribe@freebsd.org" > -- Olivier Smedts _ ASCII ribbon campaign ( ) e-mail: olivier@gid0.org - against HTML email & vCards X www: http://www.gid0.org - against proprietary attachments / \ "Il y a seulement 10 sortes de gens dans le monde : ceux qui comprennent le binaire, et ceux qui ne le comprennent pas." From m at plus-plus.su Thu Aug 27 11:49:31 2009 From: m at plus-plus.su (Mikhail (Plus Plus)) Date: Thu Aug 27 11:49:37 2009 Subject: need help with ZFS In-Reply-To: <367b2c980908270316n7a21673ek3a997573f2fadbb0@mail.gmail.com> References: <4A927CB3.3040402@plus-plus.su> <4A964F4E.4080009@plus-plus.su> <4A9656CE.8020107@plus-plus.su> <367b2c980908270316n7a21673ek3a997573f2fadbb0@mail.gmail.com> Message-ID: <4A96731D.20406@plus-plus.su> Olivier Smedts wrote: > You can output the backtrace by entering "bt" at the "db> "prompt. Thanks. I've booted FreeBSD with empty loader.conf and I tried to run bonnie++ -s17000 -d. -n128 on ZFS. I get 100% panic after 7-10 minutes, I tried 2 times already. I'm going to redirect console to another server so that I can copy "bt" output, it is one screen long and it is really hard to type it by hand. Mikhail. From m at plus-plus.su Thu Aug 27 12:03:55 2009 From: m at plus-plus.su (Mikhail (Plus Plus)) Date: Thu Aug 27 12:04:02 2009 Subject: need help with ZFS In-Reply-To: <4A96731D.20406@plus-plus.su> References: <4A927CB3.3040402@plus-plus.su> <4A964F4E.4080009@plus-plus.su> <4A9656CE.8020107@plus-plus.su> <367b2c980908270316n7a21673ek3a997573f2fadbb0@mail.gmail.com> <4A96731D.20406@plus-plus.su> Message-ID: <4A967680.2030205@plus-plus.su> Mikhail (Plus Plus) wrote: > I'm going to redirect console to another server so that I can copy "bt" > output, it is one screen long and it is really hard to type it by hand. 
Hm, and just as I thought there's no COM port on that server, so I can't redirect console output to another server. Is there any other way to copy "bt" output? I can only think of making a phone camera picture of the screen. Mikhail. From m at plus-plus.su Thu Aug 27 14:50:20 2009 From: m at plus-plus.su (Mikhail (Plus Plus)) Date: Thu Aug 27 17:03:57 2009 Subject: need help with ZFS In-Reply-To: <4A967680.2030205@plus-plus.su> References: <4A927CB3.3040402@plus-plus.su> <4A964F4E.4080009@plus-plus.su> <4A9656CE.8020107@plus-plus.su> <367b2c980908270316n7a21673ek3a997573f2fadbb0@mail.gmail.com> <4A96731D.20406@plus-plus.su> <4A967680.2030205@plus-plus.su> Message-ID: <4A969D20.40809@plus-plus.su> Mikhail (Plus Plus) wrote: > Hm, and just as I thought there's no COM port on that server, so I can't > redirect console output to another server. Is there any other way to > copy "bt" output? I can only think of making a phone camera picture of > the screen. Here's "bt" output I get after system panics on running bonnie++ -s17000 -d. -n128 for about 10 minutes: http://omploader.org/vMjg5aQ/IMG_0057.jpg regards, Mikhail. From linimon at FreeBSD.org Thu Aug 27 19:46:53 2009 From: linimon at FreeBSD.org (linimon@FreeBSD.org) Date: Thu Aug 27 19:47:00 2009 Subject: kern/138244: [zfs] dd(1) attempts bitwise transfer onto ZFS pool Message-ID: <200908271946.n7RJkr43045559@freefall.freebsd.org> Old Synopsis: dd attempts bitwise transfer onto ZFS pool New Synopsis: [zfs] dd(1) attempts bitwise transfer onto ZFS pool Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Thu Aug 27 19:46:17 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). 
http://www.freebsd.org/cgi/query-pr.cgi?pr=138244 From aragon at phat.za.net Fri Aug 28 18:51:28 2009 From: aragon at phat.za.net (Aragon Gouveia) Date: Fri Aug 28 18:51:34 2009 Subject: some of my files have an incorrect block count Message-ID: <4A9822FC.2030606@phat.za.net> Hi, I'm copying data across to a larger file system and in so doing, I've noticed that some of the files in my old file system have an incorrect block count. After copying all data, df(1) reports that the new file system has more data on it than the old one. I've narrowed most of the difference to one file in particular: %ls -l /data/qemu/winxp.qem /mnt/data/qemu/winxp.qem -rw-r--r-- 1 aragon staff 10737418240 Mar 29 21:57 /data/qemu/winxp.qem -rw-r--r-- 1 aragon staff 10737418240 Mar 29 21:57 /mnt/data/qemu/winxp.qem %du -k /data/qemu/winxp.qem /mnt/data/qemu/winxp.qem 10490896 /data/qemu/winxp.qem 2001728 /mnt/data/qemu/winxp.qem %du -Ak /data/qemu/winxp.qem /mnt/data/qemu/winxp.qem 10485760 /data/qemu/winxp.qem 10485760 /mnt/data/qemu/winxp.qem %stat -f '%N: %z %b' /data/qemu/winxp.qem /mnt/data/qemu/winxp.qem /data/qemu/winxp.qem: 10737418240 20981792 /mnt/data/qemu/winxp.qem: 10737418240 4003456 In the above the new file system is /data, the old /mnt/data. Running fsck(8) on the old file system doesn't show any errors and makes no difference. If dd(1) reads both files in, it counts the correct size, and running md5(1) on both copies of the files produces the same hash, so at least all the data is presumably present. Surely fsck(8) should detect this? Is this inconsistency cause for concern? Thanks, Aragon From rick-freebsd2008 at kiwi-computer.com Fri Aug 28 19:12:27 2009 From: rick-freebsd2008 at kiwi-computer.com (Rick C. 
Petty) Date: Fri Aug 28 19:12:35 2009 Subject: some of my files have an incorrect block count In-Reply-To: <4A9822FC.2030606@phat.za.net> References: <4A9822FC.2030606@phat.za.net> Message-ID: <20090828191225.GA77462@keira.kiwi-computer.com> On Fri, Aug 28, 2009 at 08:33:32PM +0200, Aragon Gouveia wrote: > > I'm copying data across to a larger file system and in so doing, I've > noticed that some of the files in my old file system have an incorrect > block count. After copying all data, df(1) reports that the new file > system has more data on it than the old one. I've narrowed most of the > difference to one file in particular: Yes, these are called sparse files. Disk images are one example where sparse files come in handy (although some argue that you should fully zero an image initially to prevent fragmentation). Many tools can handle sparse files efficiently. You should read their individual man pages. > If dd(1) reads both files in, it counts the correct size, and running > md5(1) on both copies of the files produces the same hash, so at least > all the data is presumably present. If you want dd to copy sparse files correctly, you need to specify "conv=sparse" in the target dd command. I personally prefer rsync with "-S". > Surely fsck(8) should detect this? No. > Is this inconsistency cause for concern? No. This is just how sparse files work. -- Rick C. 
Petty From pjd at FreeBSD.org Sat Aug 29 16:00:43 2009 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Sat Aug 29 16:00:51 2009 Subject: need help with ZFS In-Reply-To: <4A969D20.40809@plus-plus.su> References: <4A927CB3.3040402@plus-plus.su> <4A964F4E.4080009@plus-plus.su> <4A9656CE.8020107@plus-plus.su> <367b2c980908270316n7a21673ek3a997573f2fadbb0@mail.gmail.com> <4A96731D.20406@plus-plus.su> <4A967680.2030205@plus-plus.su> <4A969D20.40809@plus-plus.su> Message-ID: <20090829160037.GA1848@garage.freebsd.pl> On Thu, Aug 27, 2009 at 06:50:08PM +0400, Mikhail (Plus Plus) wrote: > Mikhail (Plus Plus) wrote: > >Hm, and just as I thought there's no COM port on that server, so I can't > >redirect console output to another server. Is there any other way to > >copy "bt" output? I can only think of making a phone camera picture of > >the screen. > > Here's "bt" output I get after system panics on running > > bonnie++ -s17000 -d. -n128 > > for about 10 minutes: > > http://omploader.org/vMjg5aQ/IMG_0057.jpg [...] I have been running your test on fairly low-end hardware (i386, 1GB of RAM, two cores) for a few hours now and cannot reproduce the problem. The only tuning I did was to set vm.kmem_size to 1GB. You still need this tuning even on amd64. Could you post the output of: # sysctl vm.kmem_size # sysctl vm.kmem_size_max # sysctl vfs.zfs # zpool status # zpool list # zfs get all And place /var/run/dmesg.boot somewhere? -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am! -------------- next part -------------- A non-text attachment was scrubbed...
Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090829/23b393b9/attachment.pgp From linimon at FreeBSD.org Sun Aug 30 16:47:01 2009 From: linimon at FreeBSD.org (linimon@FreeBSD.org) Date: Sun Aug 30 16:47:07 2009 Subject: kern/138350: [ufs] [patch] UFS_EXTATTR static int prototyping error ufs_extattr_autostart_locked Message-ID: <200908301647.n7UGl0ag001196@freefall.freebsd.org> Old Synopsis: UFS_EXTATTR static int prototyping error ufs_extattr_autostart_locked New Synopsis: [ufs] [patch] UFS_EXTATTR static int prototyping error ufs_extattr_autostart_locked Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Sun Aug 30 16:44:02 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=138350 From gallasch at free.de Sun Aug 30 17:40:22 2009 From: gallasch at free.de (Kai Gallasch) Date: Sun Aug 30 17:40:29 2009 Subject: need help with ZFS In-Reply-To: <20090829160037.GA1848@garage.freebsd.pl> References: <4A927CB3.3040402@plus-plus.su> <4A964F4E.4080009@plus-plus.su> <4A9656CE.8020107@plus-plus.su> <367b2c980908270316n7a21673ek3a997573f2fadbb0@mail.gmail.com> <4A96731D.20406@plus-plus.su> <4A967680.2030205@plus-plus.su> <4A969D20.40809@plus-plus.su> <20090829160037.GA1848@garage.freebsd.pl> Message-ID: <20090830191336.5f1cdec0@orwell.free.de> On Sat, 29 Aug 2009 18:00:37 +0200 wrote Pawel Jakub Dawidek : > > bonnie++ -s17000 -d. -n128 > > > > for about 10 minutes: > > > > http://omploader.org/vMjg5aQ/IMG_0057.jpg > [...] > > I'm running your test on pretty low-end h/w (i386, 1GB of RAM, two > cores) and cannot reproduce the problem for few hours now. The only > tuning I did was to set vm.kmem_size to 1GB. You still need to do > this very tuning even on amd64. Hi. Could you give a short explanation why on amd64 it is still necessary to set vm.kmem_size to 1GB ? 
I thought that with ZFS on FreeBSD-7.2-STABLE tuning kmem was not necessary any more. The reason I ask is that I have some servers on FreeBSD-7.2-STABLE AMD64 that are quite stable, but sometimes suffer from ZFS performance degradation when ZFS is competing with applications using huge amounts of RAM. Those servers are not swapping at all when this takes place. --Kai. From linimon at FreeBSD.org Mon Aug 31 03:59:35 2009 From: linimon at FreeBSD.org (linimon@FreeBSD.org) Date: Mon Aug 31 03:59:42 2009 Subject: kern/138367: [tmpfs] [panic] 'panic: Assertion pages > 0 failed' when running regression/tmpfs Message-ID: <200908310359.n7V3xYXv095664@freefall.freebsd.org> Old Synopsis: [tmpfs] 'panic: Assertion pages > 0 failed' when running regression/tmpfs New Synopsis: [tmpfs] [panic] 'panic: Assertion pages > 0 failed' when running regression/tmpfs Responsible-Changed-From-To: freebsd-bugs->freebsd-fs Responsible-Changed-By: linimon Responsible-Changed-When: Mon Aug 31 03:59:14 UTC 2009 Responsible-Changed-Why: Over to maintainer(s). http://www.freebsd.org/cgi/query-pr.cgi?pr=138367 From stark at mapper.nl Mon Aug 31 06:52:49 2009 From: stark at mapper.nl (Mark Stapper) Date: Mon Aug 31 06:52:55 2009 Subject: ZFS and DMA read error Message-ID: <4A9B733C.8060803@mapper.nl> Good day to you, I'm having a bit of trouble with one of the disks in my ZFS raidz1 pool. It's giving me DMA read errors, and zpool is reporting READ failures. However, data integrity is OK :-) Unfortunately I was in the middle of rearranging my backup media, so I'm backing up everything as we speak. I will be testing the failing drive in another computer soon; however, before I return it I'd like to know if this could be caused by something other than a hardware failure. Below is the output of "zpool status" and a snippet of /var/log/messages showing the DMA errors. Thanks for the input. Greetz, Mark pool: data state: ONLINE status: One or more devices has experienced an unrecoverable error.
An attempt was made to correct the error. Applications are unaffected. action: Determine if the device needs to be replaced, and clear the errors using 'zpool clear' or replace the device with 'zpool replace'. see: http://www.sun.com/msg/ZFS-8000-9P scrub: none requested config: NAME STATE READ WRITE CKSUM data ONLINE 0 0 0 raidz1 ONLINE 0 0 0 ad4 ONLINE 0 0 0 ad6 ONLINE 21 0 0 ad8 ONLINE 0 0 0 ad10 ONLINE 0 0 0 errors: No known data errors Aug 31 03:04:35 yoshi kernel: ad6: FAILURE - READ_DMA48 status=51 error=40 LBA=932040832 Aug 31 03:04:35 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204905984 size=65536 error=5 Aug 31 03:04:35 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204925440 size=2560 error=5 Aug 31 03:04:53 yoshi kernel: ad6: FAILURE - READ_DMA48 status=51 error=40 LBA=932040832 Aug 31 03:04:53 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204905984 size=65536 error=5 Aug 31 03:05:17 yoshi kernel: ad6: FAILURE - READ_DMA48 status=51 error=40 LBA=932040832 Aug 31 03:05:17 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204905984 size=65536 error=5 Aug 31 03:05:17 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204918272 size=512 error=5 Aug 31 06:12:01 yoshi login: ROOT LOGIN (root) ON ttyv2 Aug 31 06:35:34 yoshi kernel: ad6: FAILURE - READ_DMA48 status=51 error=40 LBA=932040832 Aug 31 06:35:34 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204905984 size=65536 error=5 Aug 31 06:35:34 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204925440 size=2560 error=5 Aug 31 06:36:33 yoshi kernel: ad6: FAILURE - READ_DMA48 status=51 error=40 LBA=932040832 Aug 31 06:36:34 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204905984 size=65536 error=5 Aug 31 06:36:34 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204923392 size=2048 error=5 Aug 31 06:36:38 yoshi 
kernel: ad6: FAILURE - READ_DMA48 status=51 error=40 LBA=932040832 Aug 31 06:36:38 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204905984 size=65536 error=5 Aug 31 06:36:38 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204918272 size=512 error=5 Aug 31 06:36:42 yoshi kernel: ad6: FAILURE - READ_DMA48 status=51 error=40 LBA=932040832 Aug 31 06:36:42 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204905984 size=65536 error=5 Aug 31 06:36:42 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204918272 size=512 error=5 Aug 31 06:37:52 yoshi kernel: ad6: FAILURE - READ_DMA48 status=51 error=40 LBA=932040832 Aug 31 06:37:52 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204905984 size=65536 error=5 Aug 31 06:37:52 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204918272 size=512 error=5 Aug 31 06:38:31 yoshi kernel: ad6: FAILURE - READ_DMA48 status=51 error=40 LBA=932040832 Aug 31 06:38:31 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204905984 size=65536 error=5 Aug 31 06:38:31 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204918272 size=512 error=5 Aug 31 06:38:45 yoshi kernel: ad6: FAILURE - READ_DMA48 status=51 error=40 LBA=932040832 Aug 31 06:38:45 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204905984 size=65536 error=5 Aug 31 06:38:45 yoshi root: ZFS: vdev I/O failure, zpool=data path=/dev/ad6 offset=477204947968 size=512 error=5 -------------- next part -------------- A non-text attachment was scrubbed... 
Name: signature.asc Type: application/pgp-signature Size: 259 bytes Desc: OpenPGP digital signature Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090831/ab4aa827/signature.pgp From bugmaster at FreeBSD.org Mon Aug 31 11:07:06 2009 From: bugmaster at FreeBSD.org (FreeBSD bugmaster) Date: Mon Aug 31 11:08:02 2009 Subject: Current problem reports assigned to freebsd-fs@FreeBSD.org Message-ID: <200908311107.n7VB74We070537@freefall.freebsd.org> Note: to view an individual PR, use: http://www.freebsd.org/cgi/query-pr.cgi?pr=(number). The following is a listing of current problems submitted by FreeBSD users. These represent problem reports covering all versions including experimental development code and obsolete releases. S Tracker Resp. Description -------------------------------------------------------------------------------- o kern/138367 fs [tmpfs] [panic] 'panic: Assertion pages > 0 failed' wh o kern/138350 fs [ufs] [patch] UFS_EXTATTR static int prototyping error o kern/138244 fs [zfs] dd(1) attempts bitwise transfer onto ZFS pool o kern/138202 fs mount_msdosfs(1) see only 2Gb o kern/138109 fs [extfs] [patch] Minor cleanups to the sys/gnu/fs/ext2f f kern/137037 fs [zfs] [hang] zfs rollback on root causes FreeBSD to fr o kern/136968 fs [ufs] [lor] ufs/bufwait/ufs (open) o kern/136945 fs [ufs] [lor] filedesc structure/ufs (poll) o kern/136944 fs [ffs] [lor] bufwait/snaplk (fsync) o kern/136942 fs [zfs] zvol resize not reflected until reboot o kern/136873 fs [ntfs] Missing directories/files on NTFS volume o kern/136865 fs [nfs] [patch] NFS exports atomic and on-the-fly atomic o kern/136470 fs [nfs] Cannot mount / in read-only, over NFS o kern/136218 fs [zfs] Exported ZFS pools can't be imported into (Open) o kern/135594 fs [zfs] Single dataset unresponsive with Samba o kern/135546 fs [zfs] zfs.ko module doesn't ignore zpool.cache filenam o kern/135480 fs [zfs] panic: lock &arg.lock already initialized o kern/135469 fs [ufs] [panic] kernel crash on 
md operation in ufs_dirb o bin/135314 fs [zfs] assertion failed for zdb(8) usage o kern/135050 fs [zfs] ZFS clears/hides disk errors on reboot f kern/134496 fs [zfs] [panic] ZFS pool export occasionally causes a ke o kern/134491 fs [zfs] Hot spares are rather cold... o kern/133980 fs [panic] [ffs] panic: ffs_valloc: dup alloc o kern/133676 fs [smbfs] [panic] umount -f'ing a vnode-based memory dis o kern/133614 fs [smbfs] [panic] panic: ffs_truncate: read-only filesys o kern/133373 fs [zfs] umass attachment causes ZFS checksum errors, dat o kern/133174 fs [msdosfs] [patch] msdosfs must support utf-encoded int f kern/133150 fs [zfs] Page fault with ZFS on 7.1-RELEASE/amd64 while w o kern/133134 fs [zfs] Missing ZFS zpool labels o kern/132960 fs [ufs] [panic] panic:ffs_blkfree: freeing free frag o kern/132597 fs [tmpfs] [panic] tmpfs-related panic while interrupting o kern/132551 fs [zfs] ZFS locks up on extattr_list_link syscall o kern/132397 fs reboot causes filesystem corruption (failure to sync b o kern/132331 fs [ufs] [lor] LOR ufs and syncer o kern/132237 fs [msdosfs] msdosfs has problems to read MSDOS Floppy o kern/132145 fs [panic] File System Hard Crashes f kern/132068 fs [zfs] page fault when using ZFS over NFS on 7.1-RELEAS o kern/131995 fs [nfs] Failure to mount NFSv4 server o kern/131441 fs [unionfs] [nullfs] unionfs and/or nullfs not combineab o kern/131360 fs [nfs] poor scaling behavior of the NFS server under lo o kern/131342 fs [nfs] mounting/unmounting of disks causes NFS to fail o bin/131341 fs makefs: error "Bad file descriptor" on the mount poin o kern/131086 fs [ext2fs] [patch] mkfs.ext2 creates rotten partition o kern/130979 fs [smbfs] [panic] boot/kernel/smbfs.ko o kern/130920 fs [msdosfs] cp(1) takes 100% CPU time while copying file o kern/130229 fs [iconv] usermount fails on fs that need iconv o kern/130210 fs [nullfs] Error by check nullfs o kern/129760 fs [nfs] after 'umount -f' of a stale NFS share FreeBSD l o kern/129488 fs [smbfs] Kernel 
"bug" when using smbfs in smbfs_smb.c: o kern/129231 fs [ufs] [patch] New UFS mount (norandom) option - mostly o kern/129152 fs [panic] non-userfriendly panic when trying to mount(8) o kern/129148 fs [zfs] [panic] panic on concurrent writing & rollback o kern/129059 fs [zfs] [patch] ZFS bootloader whitelistable via WITHOUT f kern/128829 fs smbd(8) causes periodic panic on 7-RELEASE o kern/128633 fs [zfs] [lor] lock order reversal in zfs o kern/128514 fs [zfs] [mpt] problems with ZFS and LSILogic SAS/SATA Ad f kern/128173 fs [ext2fs] ls gives "Input/output error" on mounted ext3 o kern/127659 fs [tmpfs] tmpfs memory leak o kern/127492 fs [zfs] System hang on ZFS input-output o kern/127420 fs [gjournal] [panic] Journal overflow on gmirrored gjour o kern/127213 fs [tmpfs] sendfile on tmpfs data corruption o kern/127029 fs [panic] mount(8): trying to mount a write protected zi o kern/126287 fs [ufs] [panic] Kernel panics while mounting an UFS file s kern/125738 fs [zfs] [request] SHA256 acceleration in ZFS o kern/125644 fs [zfs] [panic] zfs unfixable fs errors caused panic whe f kern/125536 fs [ext2fs] ext 2 mounts cleanly but fails on commands li o kern/125149 fs [nfs] [panic] changing into .zfs dir from nfs client c f kern/124621 fs [ext3] [patch] Cannot mount ext2fs partition f bin/124424 fs [zfs] zfs(8): zfs list -r shows strange snapshots' siz o kern/123939 fs [msdosfs] corrupts new files o kern/122888 fs [zfs] zfs hang w/ prefetch on, zil off while running t o kern/122380 fs [ffs] ffs_valloc:dup alloc (Soekris 4801/7.0/USB Flash o kern/122173 fs [zfs] [panic] Kernel Panic if attempting to replace a o bin/122172 fs [fs]: amd(8) automount daemon dies on 6.3-STABLE i386, o kern/122047 fs [ext2fs] [patch] incorrect handling of UF_IMMUTABLE / o kern/122038 fs [tmpfs] [panic] tmpfs: panic: tmpfs_alloc_vp: type 0xc o bin/121898 fs [nullfs] pwd(1)/getcwd(2) fails with Permission denied o bin/121779 fs [ufs] snapinfo(8) (and related tools?) 
only work for t o kern/121770 fs [zfs] ZFS on i386, large file or heavy I/O leads to ke o bin/121366 fs [zfs] [patch] Automatic disk scrubbing from periodic(8 o bin/121072 fs [smbfs] mount_smbfs(8) cannot normally convert the cha f kern/120991 fs [panic] [fs] [snapshot] System crashes when manipulati o kern/120483 fs [ntfs] [patch] NTFS filesystem locking changes o kern/120482 fs [ntfs] [patch] Sync style changes between NetBSD and F o bin/120288 fs zfs(8): "zfs share -a" does not send SIGHUP to mountd f kern/119735 fs [zfs] geli + ZFS + samba starting on boot panics 7.0-B o kern/118912 fs [2tb] disk sizing/geometry problem with large array o misc/118855 fs [zfs] ZFS-related commands are nonfunctional in fixit o kern/118713 fs [minidump] [patch] Display media size required for a k o kern/118320 fs [zfs] [patch] NFS SETATTR sometimes fails to set file o bin/118249 fs mv(1): moving a directory changes its mtime o kern/118107 fs [ntfs] [panic] Kernel panic when accessing a file at N o bin/117315 fs [smbfs] mount_smbfs(8) and related options can't mount o kern/117314 fs [ntfs] Long-filename only NTFS fs'es cause kernel pani o kern/117158 fs [zfs] zpool scrub causes panic if geli vdevs detach on o bin/116980 fs [msdosfs] [patch] mount_msdosfs(8) resets some flags f o kern/116913 fs [ffs] [panic] ffs_blkfree: freeing free block p kern/116608 fs [msdosfs] [patch] msdosfs fails to check mount options o kern/116583 fs [ffs] [hang] System freezes for short time when using o kern/116170 fs [panic] Kernel panic when mounting /tmp o kern/115645 fs [snapshots] [panic] lockmgr: thread 0xc4c00d80, not ex o bin/115361 fs [zfs] mount(8) gets into a state where it won't set/un o kern/114955 fs [cd9660] [patch] [request] support for mask,dirmask,ui o kern/114847 fs [ntfs] [patch] [request] dirmask support for NTFS ala o kern/114676 fs [ufs] snapshot creation panics: snapacct_ufs2: bad blo o bin/114468 fs [patch] [request] add -d option to umount(8) to detach o kern/113852 fs [smbfs] 
smbfs does not properly implement DFS referral o bin/113838 fs [patch] [request] mount(8): add support for relative p o kern/113180 fs [zfs] Setting ZFS nfsshare property does not cause inh o bin/113049 fs [patch] [request] make quot(8) use getopt(3) and show o kern/112658 fs [smbfs] [patch] smbfs and caching problems (resolves b o kern/111843 fs [msdosfs] Long Names of files are incorrectly created o kern/111782 fs [ufs] dump(8) fails horribly for large filesystems s bin/111146 fs [2tb] fsck(8) fails on 6T filesystem o kern/109024 fs [msdosfs] mount_msdosfs: msdosfs_iconv: Operation not o kern/109010 fs [msdosfs] can't mv directory within fat32 file system o bin/107829 fs [2TB] fdisk(8): invalid boundary checking in fdisk / w o kern/106030 fs [ufs] [panic] panic in ufs from geom when a dead disk o kern/105093 fs [ext2fs] [patch] ext2fs on read-only media cannot be m o kern/104406 fs [ufs] Processes get stuck in "ufs" state under persist o kern/104133 fs [ext2fs] EXT2FS module corrupts EXT2/3 filesystems o kern/103035 fs [ntfs] Directories in NTFS mounted disc images appear o kern/101324 fs [smbfs] smbfs sometimes not case sensitive when it's s o kern/99290 fs [ntfs] mount_ntfs ignorant of cluster sizes o kern/97377 fs [ntfs] [patch] syntax cleanup for ntfs_ihash.c o kern/95222 fs [iso9660] File sections on ISO9660 level 3 CDs ignored o kern/94849 fs [ufs] rename on UFS filesystem is not atomic o kern/94769 fs [ufs] Multiple file deletions on multi-snapshotted fil o kern/94733 fs [smbfs] smbfs may cause double unlock o kern/93942 fs [vfs] [patch] panic: ufs_dirbad: bad dir (patch from D o kern/92272 fs [ffs] [hang] Filling a filesystem while creating a sna f kern/91568 fs [ufs] [panic] writing to UFS/softupdates DVD media in o kern/91134 fs [smbfs] [patch] Preserve access and modification time a kern/90815 fs [smbfs] [patch] SMBFS with character conversions somet o kern/89991 fs [ufs] softupdates with mount -ur causes fs UNREFS o kern/88657 fs [smbfs] windows 
client hang when browsing a samba shar o kern/88266 fs [smbfs] smbfs does not implement UIO_NOCOPY and sendfi o kern/87859 fs [smbfs] System reboot while umount smbfs. o kern/86587 fs [msdosfs] rm -r /PATH fails with lots of small files o kern/85326 fs [smbfs] [panic] saving a file via samba to an overquot o kern/84589 fs [2TB] 5.4-STABLE unresponsive during background fsck 2 o kern/80088 fs [smbfs] Incorrect file time setting on NTFS mounted vi o kern/77826 fs [ext2fs] ext2fs usb filesystem will not mount RW o kern/73484 fs [ntfs] Kernel panic when doing `ls` from the client si o bin/73019 fs [ufs] fsck_ufs(8) cannot alloc 607016868 bytes for ino o kern/71774 fs [ntfs] NTFS cannot "see" files on a WinXP filesystem o kern/68978 fs [panic] [ufs] crashes with failing hard disk, loose po o kern/65920 fs [nwfs] Mounted Netware filesystem behaves strange o kern/65901 fs [smbfs] [patch] smbfs fails fsx write/truncate-down/tr o kern/61503 fs [smbfs] mount_smbfs does not work as non-root o kern/55617 fs [smbfs] Accessing an nsmb-mounted drive via a smb expo o kern/51685 fs [hang] Unbounded inode allocation causes kernel to loc o kern/51583 fs [nullfs] [patch] allow to work with devices and socket o kern/36566 fs [smbfs] System reboot with dead smb mount and umount o kern/18874 fs [2TB] 32bit NFS servers export wrong negative values t 155 problems total. 
From pjd at FreeBSD.org Mon Aug 31 11:08:14 2009 From: pjd at FreeBSD.org (Pawel Jakub Dawidek) Date: Mon Aug 31 11:28:16 2009 Subject: need help with ZFS In-Reply-To: <20090830191336.5f1cdec0@orwell.free.de> References: <4A927CB3.3040402@plus-plus.su> <4A964F4E.4080009@plus-plus.su> <4A9656CE.8020107@plus-plus.su> <367b2c980908270316n7a21673ek3a997573f2fadbb0@mail.gmail.com> <4A96731D.20406@plus-plus.su> <4A967680.2030205@plus-plus.su> <4A969D20.40809@plus-plus.su> <20090829160037.GA1848@garage.freebsd.pl> <20090830191336.5f1cdec0@orwell.free.de> Message-ID: <20090831110808.GF1671@garage.freebsd.pl> On Sun, Aug 30, 2009 at 07:13:36PM +0200, Kai Gallasch wrote: > On Sat, 29 Aug 2009 18:00:37 +0200 > wrote Pawel Jakub Dawidek : > > > > bonnie++ -s17000 -d. -n128 > > > > > > for about 10 minutes: > > > > > > http://omploader.org/vMjg5aQ/IMG_0057.jpg > > [...] > > > > I'm running your test on pretty low-end h/w (i386, 1GB of RAM, two > > cores) and cannot reproduce the problem for few hours now. The only > > tuning I did was to set vm.kmem_size to 1GB. You still need to do > > this very tuning even on amd64. > > Hi. > Could you give a short explanation why on amd64 it is still necessary > to set vm.kmem_size to 1GB ? I thought with zfs on FreeBSD-7.2-STABLE > tuning kmem was not necessary any more. Actually I might be wrong. I was sure I had seen an amd64 machine with a low vm.kmem_size, but I checked on an 8GB RAM machine and it is auto-tuned to more than 2.5GB. I still prefer to set it to 8GB just in case and possibly limit the ARC size. > The reason I ask is, I have some servers on FreeBSD-7.2-STABLE > AMD64 that are quite stable, but sometimes suffer from zfs > performance degradation when zfs is competing with applications using > huge amounts of RAM. Those servers are not swapping at all when this > takes place. -- Pawel Jakub Dawidek http://www.wheel.pl pjd@FreeBSD.org http://www.FreeBSD.org FreeBSD committer Am I Evil? Yes, I Am!
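For readers following the thread, the manual tuning mentioned in the message above could look roughly like the loader.conf fragment below. This is only a sketch, assuming an amd64 machine like the 8GB one described: the 8GB kmem figure comes from the message, while the arc_max value is a made-up placeholder (the thread gives no number) and is therefore left commented out.

```
# /boot/loader.conf -- illustrative sketch, not a recommendation
vm.kmem_size="8192M"       # 8GB, the figure mentioned in the message above
#vfs.zfs.arc_max="2048M"   # hypothetical ARC cap; no value is given in the thread
```

After a reboot, "sysctl vm.kmem_size vfs.zfs.arc_max" should show what the kernel actually picked up.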
-------------- next part -------------- A non-text attachment was scrubbed... Name: not available Type: application/pgp-signature Size: 187 bytes Desc: not available Url : http://lists.freebsd.org/pipermail/freebsd-fs/attachments/20090831/13899df8/attachment.pgp From jhb at FreeBSD.org Mon Aug 31 11:55:23 2009 From: jhb at FreeBSD.org (jhb@FreeBSD.org) Date: Mon Aug 31 11:55:29 2009 Subject: kern/138350: [ufs] [patch] UFS_EXTATTR static int prototyping error ufs_extattr_autostart_locked Message-ID: <200908311155.n7VBtNFG026753@freefall.freebsd.org> Synopsis: [ufs] [patch] UFS_EXTATTR static int prototyping error ufs_extattr_autostart_locked State-Changed-From-To: open->closed State-Changed-By: jhb State-Changed-When: Mon Aug 31 11:54:27 UTC 2009 State-Changed-Why: Fix committed. It was fixed earlier in 8.0 as a small part of another change. http://www.freebsd.org/cgi/query-pr.cgi?pr=138350 From dfilter at FreeBSD.ORG Mon Aug 31 12:00:17 2009 From: dfilter at FreeBSD.ORG (dfilter service) Date: Mon Aug 31 12:00:24 2009 Subject: kern/138350: commit references a PR Message-ID: <200908311200.n7VC0HCs027088@freefall.freebsd.org> The following reply was made to PR kern/138350; it has been noted by GNATS. From: dfilter@FreeBSD.ORG (dfilter service) To: bug-followup@FreeBSD.org Cc: Subject: Re: kern/138350: commit references a PR Date: Mon, 31 Aug 2009 11:54:25 +0000 (UTC) Author: jhb Date: Mon Aug 31 11:54:13 2009 New Revision: 196693 URL: http://svn.freebsd.org/changeset/base/196693 Log: MFC a part of 191990: Fix compile of UFS_EXTATTR without UFS_EXTATTR_AUTOSTART. 
PR: kern/138350 Modified: stable/7/sys/ufs/ufs/ufs_extattr.c Modified: stable/7/sys/ufs/ufs/ufs_extattr.c ============================================================================== --- stable/7/sys/ufs/ufs/ufs_extattr.c Mon Aug 31 10:20:52 2009 (r196692) +++ stable/7/sys/ufs/ufs/ufs_extattr.c Mon Aug 31 11:54:13 2009 (r196693) @@ -93,8 +93,10 @@ static int ufs_extattr_set(struct vnode struct thread *td); static int ufs_extattr_rm(struct vnode *vp, int attrnamespace, const char *name, struct ucred *cred, struct thread *td); +#ifdef UFS_EXTATTR_AUTOSTART static int ufs_extattr_autostart_locked(struct mount *mp, struct thread *td); +#endif static int ufs_extattr_start_locked(struct ufsmount *ump, struct thread *td); _______________________________________________ svn-src-all@freebsd.org mailing list http://lists.freebsd.org/mailman/listinfo/svn-src-all To unsubscribe, send any mail to "svn-src-all-unsubscribe@freebsd.org" From m at plus-plus.su Mon Aug 31 12:09:31 2009 From: m at plus-plus.su (Mikhail (Plus Plus)) Date: Mon Aug 31 12:09:38 2009 Subject: need help with ZFS In-Reply-To: <20090829160037.GA1848@garage.freebsd.pl> References: <4A927CB3.3040402@plus-plus.su> <4A964F4E.4080009@plus-plus.su> <4A9656CE.8020107@plus-plus.su> <367b2c980908270316n7a21673ek3a997573f2fadbb0@mail.gmail.com> <4A96731D.20406@plus-plus.su> <4A967680.2030205@plus-plus.su> <4A969D20.40809@plus-plus.su> <20090829160037.GA1848@garage.freebsd.pl> Message-ID: <4A9BBDDF.4030005@plus-plus.su> Pawel Jakub Dawidek wrote: > I'm running your test on pretty low-end h/w (i386, 1GB of RAM, two cores) > and cannot reproduce the problem for few hours now. The only tuning I did > was to set vm.kmem_size to 1GB. You still need to do this very tuning even > on amd64. Thanks for your response. I just opened the server case, and faulty hardware could be one of the reasons why the system panics. At least right now I see one SATA controller not sitting properly in its slot.
This could be due to rough transport from the colo DC. I'm going to fix all
these small hardware-related issues and re-run the tests. Below is the list
of settings you requested:

> # sysctl vm.kmem_size
vm.kmem_size: 2753769472

> # sysctl vm.kmem_size_max
vm.kmem_size_max: 329853485875

> # sysctl vfs.zfs
vfs.zfs.arc_meta_limit: 430276480
vfs.zfs.arc_meta_used: 1534208
vfs.zfs.mdcomp_disable: 0
vfs.zfs.arc_min: 215138240
vfs.zfs.arc_max: 1721105920
vfs.zfs.zfetch.array_rd_sz: 1048576
vfs.zfs.zfetch.block_cap: 256
vfs.zfs.zfetch.min_sec_reap: 2
vfs.zfs.zfetch.max_streams: 8
vfs.zfs.prefetch_disable: 0
vfs.zfs.recover: 0
vfs.zfs.txg.synctime: 5
vfs.zfs.txg.timeout: 30
vfs.zfs.scrub_limit: 10
vfs.zfs.vdev.cache.bshift: 16
vfs.zfs.vdev.cache.size: 10485760
vfs.zfs.vdev.cache.max: 16384
vfs.zfs.vdev.aggregation_limit: 131072
vfs.zfs.vdev.ramp_rate: 2
vfs.zfs.vdev.time_shift: 6
vfs.zfs.vdev.min_pending: 4
vfs.zfs.vdev.max_pending: 35
vfs.zfs.cache_flush_disable: 0
vfs.zfs.zil_disable: 0
vfs.zfs.version.zpl: 3
vfs.zfs.version.vdev_boot: 1
vfs.zfs.version.spa: 13
vfs.zfs.version.dmu_backup_stream: 1
vfs.zfs.version.dmu_backup_header: 2
vfs.zfs.version.acl: 1
vfs.zfs.debug: 0
vfs.zfs.super_owner: 0

> # zpool status
  pool: mp3pool
 state: ONLINE
 scrub: none requested
config:

        NAME        STATE     READ WRITE CKSUM
        mp3pool     ONLINE       0     0     0
          raidz1    ONLINE       0     0     0
            ad24    ONLINE       0     0     0
            ad8     ONLINE       0     0     0
            ad18    ONLINE       0     0     0
            ad20    ONLINE       0     0     0
            ad22    ONLINE       0     0     0
            ad10    ONLINE       0     0     0
        spares
          ad26      AVAIL

errors: No known data errors

> # zpool list
NAME      SIZE   USED  AVAIL  CAP  HEALTH  ALTROOT
mp3pool  5.44T  3.54T  1.90T  65%  ONLINE  -

> # zfs get all
NAME     PROPERTY         VALUE                  SOURCE
mp3pool  type             filesystem             -
mp3pool  creation         Thu Feb 12 23:02 2009  -
mp3pool  used             2.94T                  -
mp3pool  available        1.51T                  -
mp3pool  referenced       2.94T                  -
mp3pool  compressratio    1.00x                  -
mp3pool  mounted          yes                    -
mp3pool  quota            none                   default
mp3pool  reservation      none                   default
mp3pool  recordsize       128K                   default
mp3pool  mountpoint       /mp3pool               default
mp3pool  sharenfs         off                    default
mp3pool  checksum         on                     default
mp3pool  compression      off                    default
mp3pool  atime            on                     default
mp3pool  devices          on                     default
mp3pool  exec             on                     default
mp3pool  setuid           on                     default
mp3pool  readonly         off                    default
mp3pool  jailed           off                    default
mp3pool  snapdir          hidden                 default
mp3pool  aclmode          groupmask              default
mp3pool  aclinherit       restricted             default
mp3pool  canmount         on                     default
mp3pool  shareiscsi       off                    default
mp3pool  xattr            off                    temporary
mp3pool  copies           1                      default
mp3pool  version          3                      -
mp3pool  utf8only         off                    -
mp3pool  normalization    none                   -
mp3pool  casesensitivity  sensitive              -
mp3pool  vscan            off                    default
mp3pool  nbmand           off                    default
mp3pool  sharesmb         off                    default
mp3pool  refquota         none                   default
mp3pool  refreservation   none                   default
mp3pool  primarycache     all                    default
mp3pool  secondarycache   all                    default

>
> And place /var/run/dmesg.boot somewhere?

http://91.206.231.132/~miha/zfs.dmesg.boot

Thanks,
Mikhail.

From bossic at ngs.ru Mon Aug 31 13:10:03 2009
From: bossic at ngs.ru (bossic@ngs.ru)
Date: Mon Aug 31 13:11:22 2009
Subject: 64-bit inodes
Message-ID: <1287213639.20090831194947@ngs.ru>

Good day!
I have a question about 64-bit inodes. Some software (for example
glusterfs, http://www.gluster.org/) doesn't work on operating systems
without support for this type of inode. Will FreeBSD support 64-bit
inodes? And when can we expect it?

Best regards,
Andrey.

From rick-freebsd2008 at kiwi-computer.com Mon Aug 31 18:40:47 2009
From: rick-freebsd2008 at kiwi-computer.com (Rick C. Petty)
Date: Mon Aug 31 18:40:53 2009
Subject: 64-bit inodes
In-Reply-To: <1287213639.20090831194947@ngs.ru>
References: <1287213639.20090831194947@ngs.ru>
Message-ID: <20090831184045.GA5846@keira.kiwi-computer.com>

On Mon, Aug 31, 2009 at 07:49:47PM +0700, bossic@ngs.ru wrote:
> Good day!
> I have a question about 64-bit inodes. Some software (for example
> glusterfs, http://www.gluster.org/) doesn't work on operating systems
> without support for this type of inode. Will FreeBSD support 64-bit
> inodes? And when can we expect it?

UFS2 uses 64-bit block numbers and 64-bit times. Not sure about ZFS, but I
know it supports at least 64-bit block pointers. I'm not exactly sure what
you mean by "support 64-bit inodes". Certainly FreeBSD supports
filesystems that use 64-bit pointers and timestamps.

-- Rick C. Petty

From rick-freebsd2008 at kiwi-computer.com Mon Aug 31 20:45:08 2009
From: rick-freebsd2008 at kiwi-computer.com (Rick C. Petty)
Date: Mon Aug 31 20:45:14 2009
Subject: 64-bit inodes
In-Reply-To: <448086815.20090901031952@ngs.ru>
References: <1287213639.20090831194947@ngs.ru>
 <20090831184045.GA5846@keira.kiwi-computer.com>
 <448086815.20090901031952@ngs.ru>
Message-ID: <20090831204506.GA6464@keira.kiwi-computer.com>

On Tue, Sep 01, 2009 at 03:19:52AM +0700, bossic@ngs.ru wrote:
> > On Mon, Aug 31, 2009 at 07:49:47PM +0700, bossic@ngs.ru wrote:
> >> Good day!
> >> I have a question about 64-bit inodes. Some software (for example
> >> glusterfs, http://www.gluster.org/) doesn't work on operating systems
> >> without support for this type of inode. Will FreeBSD support 64-bit
> >> inodes? And when can we expect it?
>
> > UFS2 uses 64-bit block numbers and 64-bit times. Not sure about ZFS, but I
> > know it supports at least 64-bit block pointers. I'm not exactly sure what
> > you mean by "support 64-bit inodes". Certainly FreeBSD supports
> > filesystems that use 64-bit pointers and timestamps.
>
> Hi! I need to get the glusterfs package working, but I ran into the
> following: "Distribute translator: uses 64bit inode numbers, as FreeBSD
> doesn't support 64bit inodes. Distribute is seen to not work on FreeBSD"
> on http://www.gluster.org/docs/index.php/Whats_New_v2.0 and
> http://www.mavetju.org/weblog/html/00262.html . What can I do? The
> package works on Linux, but dumps core on FreeBSD. Debugging it would
> not be a big problem for me, but the description of the bug is
> discouraging.

[CC'd to the list for additional comments]

Okay, you're not talking about the block pointers or timestamps.
You're talking about the inode number:

% grep ino_t /usr/include/sys/_types.h
typedef __uint32_t      __ino_t;        /* inode number */

Apparently FreeBSD assumes that you will not have more than around 4.3
billion (2^32) inodes on any given filesystem. That seems reasonable.
Changing from uint32 to uint64 would require a complete rebuild and would
introduce considerable ABI breakage. If you're building FreeBSD (and all
of the ports you need) yourself, and you don't need binary compatibility,
you could probably change this yourself and things would just work.

As for convincing the FreeBSD core team to make this change (e.g. for 9.0),
I suspect you may see some pushback. I'm not even sure how you would make
everything ABI compatible. This is yet another reason why embedding ABI
version numbers everywhere would help. I think there would have to be a
number of duplicate system calls to handle the different inode sizes in any
structures passed between the kernel and userland, like stat(2).

-- Rick C. Petty

From kmacy at freebsd.org Mon Aug 31 21:51:10 2009
From: kmacy at freebsd.org (Kip Macy)
Date: Mon Aug 31 21:51:17 2009
Subject: need help with ZFS
In-Reply-To: <941145.26591.qm@web37302.mail.mud.yahoo.com>
References: <4A927CB3.3040402@plus-plus.su> <4A964F4E.4080009@plus-plus.su>
 <941145.26591.qm@web37302.mail.mud.yahoo.com>
Message-ID: <3c1674c90908311451j5650bcdfl119309368d852a49@mail.gmail.com>

>> vm.kmem_size="1536M"
>> vm.kmem_size_max="3072M"
>> vm.pmap.shpgperproc="1024"
>> vfs.zfs.arc_min="256M"
>> vfs.zfs.arc_max="384M"
>> vfs.zfs.vdev.cache.size="50M"
>> vfs.zfs.prefetch_disable="1"
>> kern.maxproc="20000"
>
> Since FreeBSD 7.2 no additional ZFS tuning in loader.conf is needed (on amd64).
> You should remove all those settings from loader.conf, reboot, and re-run the tests.
> Maybe, just maybe you get a panic just because of those settings.

Actually you probably still want to set kmem_size, kmem_size_max, and arc_min.
kmem_size_max just overrides auto-tuning; it doesn't actually raise
kmem_size. I have found that the ARC can get starved out by user processes
with many pages in the page cache, so a large arc_min can be useful too.
Bear in mind that kmem_size can safely be much larger than physical memory;
kmem_malloc can often fail due to KVA fragmentation even when physical
memory is plentiful.

-Kip

From kmacy at freebsd.org Mon Aug 31 22:45:35 2009
From: kmacy at freebsd.org (Kip Macy)
Date: Mon Aug 31 22:45:42 2009
Subject: RFT ZFS updates
Message-ID: <3c1674c90908311545h6d4d48fcj14da6af6df0f13f0@mail.gmail.com>

I've created a branch where I've merged in some changes from my private
branch to recent HEAD - it incorporates a switch to UMA, some amount of
lock pushdown, and it adds the ability to link in ZFS to facilitate
profiling. Any testing would be appreciated.

svn://svn.freebsd.org/base/user/kmacy/head_zfs_merge/
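
For anyone who wants to try the branch, fetching it might look roughly like
the sketch below. This is only an illustration: it assumes a Subversion
client is installed, and the destination directory (DEST) and the
dry-run switch (RUN) are hypothetical names that are not part of the
announcement. Only the branch URL is taken from the message.

```shell
#!/bin/sh
# Branch URL as given in the announcement.
BRANCH_URL="svn://svn.freebsd.org/base/user/kmacy/head_zfs_merge/"
# DEST is a hypothetical local directory; override it in the environment.
DEST="${DEST:-$HOME/head_zfs_merge}"

# Print the checkout command so it can be reviewed before running it.
checkout_cmd() {
    echo "svn checkout $BRANCH_URL $DEST"
}

# Dry run by default; set RUN=1 to actually fetch the sources.
if [ "${RUN:-0}" = "1" ]; then
    svn checkout "$BRANCH_URL" "$DEST"
else
    checkout_cmd
fi
```

Run it once without RUN set to check the command it prints, then again
with RUN=1 to perform the checkout. Building and installing a kernel from
the tree is left out here, since the announcement does not describe it.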