svn commit: r320156 - in head: cddl/contrib/opensolaris/cmd/zdb cddl/contrib/opensolaris/cmd/ztest cddl/contrib/opensolaris/lib/libzfs/common sys/cddl/contrib/opensolaris/common/zfs sys/cddl/contri...
O. Hartmann
ohartmann at walstatt.org
Wed Jun 21 07:59:24 UTC 2017
On Tue, 20 Jun 2017 17:25:53 -0400
"Kenneth D. Merry" <ken at FreeBSD.ORG> wrote:
> On Tue, Jun 20, 2017 at 23:37:10 +0300, Andriy Gapon wrote:
> > On 20/06/2017 23:29, Ken Merry wrote:
> > > I don't know for sure that this commit is the cause, but it (and r320153) are the
> > > only ZFS commits between a version of head from June 14th that boots off a ZFS
> > > mirror, and one that panics.
r320153 runs well and is stable here, but with r320156 the kernels on all of my ZFS
machines panic immediately (they have ZFS built into the kernel, not loaded as a module).
For now I have gone back to r320153. I'm sorry that I have no debugging information; the
boxes are currently built without debugging options.
> > >
> > > Here's the stack trace:
> > >
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 22;
> > >
> > > Fatal trap 12: page fault while in kernel mode
> > > cpuid = 9; apic id = 09
> > > fault virtual address = 0x0
> > > fault code = supervisor read data, page not present
> > > instruction pointer = 0x20:0xffffffff81e47f21
> > > stack pointer = 0x28:0xfffffe08b37f8810
> > > frame pointer = 0x28:0xfffffe08b37f8860
> > > code segment = base 0x0, limit 0xfffff, type 0x1b
> > > = DPL 0, pres 1, long 1, def32 0, gran 1
> > > processor eflags = interrupt enabled, resume, IOPL = 0
> > > current process = 0 (zio_free_issue_0_3)
> > > [ thread pid 0 tid 100478 ]
> > > Stopped at 0xffffffff81e47f21 = zio_vdev_io_start+0x1f1: testb $0x1,(%rax)
> > > db> bt
> > > Tracing pid 0 tid 100478 td 0xfffff80193156000
> > > zio_vdev_io_start() at 0xffffffff81e47f21 = zio_vdev_io_start+0x1f1/frame 0xfffffe08b37f8860
> > > zio_execute() at 0xffffffff81e4312c = zio_execute+0x36c/frame 0xfffffe08b37f88b0
> > > zio_nowait() at 0xffffffff81e422b8 = zio_nowait+0xb8/frame 0xfffffe08b37f88e0
> > > vdev_mirror_io_start() at 0xffffffff81e224fc = vdev_mirror_io_start+0x38c/frame 0xfffffe08b37f8930
> > > zio_vdev_io_start() at 0xffffffff81e48030 = zio_vdev_io_start+0x300/frame 0xfffffe08b37f8990
> > > zio_execute() at 0xffffffff81e4312c = zio_execute+0x36c/frame 0xfffffe08b37f89e0
> > > taskqueue_run_locked() at 0xffffffff809a9d6d = taskqueue_run_locked+0x13d/frame 0xfffffe08b37f8a40
> > > taskqueue_thread_loop() at 0xffffffff809aab28 = taskqueue_thread_loop+0x88/frame 0xfffffe08b37f8a70
> > > fork_exit() at 0xffffffff8091e3e4 = fork_exit+0x84/frame 0xfffffe08b37f8ab0
> > > fork_trampoline() at 0xffffffff80d930fe = fork_trampoline+0xe/frame 0xfffffe08b37f8ab0
> > > --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> > > db>
> > >
> > > (kgdb) list *(zio_vdev_io_start+0x1f1)
> > > 0xd9f21 is in zio_vdev_io_start
> > > (/usr/home/kenm/perforce4/kenm/FreeBSD-test/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:350).
> > > 345
> > > 346 /*
> > > 347 * Ensure that anyone expecting this zio to contain a linear ABD isn't
> > > 348 * going to get a nasty surprise when they try to access the data.
> > > 349 */
> > > 350 IMPLY(abd_is_linear(zio->io_abd), abd_is_linear(data));
> > > 351
> > > 352 zt->zt_orig_abd = zio->io_abd;
> > > 353 zt->zt_orig_size = zio->io_size;
> > > 354 zt->zt_bufsize = bufsize;
> > >
> > > I'll try rebooting and see if the problem goes away. If not, I'll roll back
> > > the ABD change and see if the problem goes away.
> >
> > Judging from the thread that panicked, the problem may have to do with our TRIM
> > support. Unfortunately, I didn't have a chance to test the change on a system
> > with working TRIM, so I missed it.
> > I will look into this further, but it's almost obvious that the problem is
> > caused by zio->io_abd being NULL for a zio of type ZIO_TYPE_FREE.
>
> FWIW, avg sent me a patch for this particular problem (by checking for NULL
> before dereferencing the pointer), and although it got me past the above
> problem, I hit another related panic:
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 6;
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 14; apic id = 22
> fault virtual address = 0x4
> fault code = supervisor read data, page not present
> instruction pointer = 0x20:0xffffffff81d92a2d
> stack pointer = 0x0:0xfffffe08b36e0710
> frame pointer = 0x0:0xfffffe08b36e0730
> code segment = base 0x0, limit 0xfffff, type 0x1b
>
>
> Fatal trap 12: page fault while in kernel mode
> cpuid = 11; apic id = 0b
> fault virtual address = 0x4
> Fatal trap 12: page fault while in kernel mode
> cpuid = 8; apic id = 08
> = DPL 0, pres 1, long 1, def32 0, gran 1
> processor eflags = interrupt enabled, resume, IOPL = 0
> current process = 0 (zio_free_issue_4_1)
> [ thread pid 0 tid 100799 ]
> Stopped at 0xffffffff81d92a2d = abd_verify+0xd: movl 0x4(%r14),%eax
> db> bt
> Tracing pid 0 tid 100799 td 0xfffff801931b8560
> abd_verify() at 0xffffffff81d92a2d = abd_verify+0xd/frame 0xfffffe08b36e0730
> abd_put() at 0xffffffff81d92eff = abd_put+0xf/frame 0xfffffe08b36e0750
> vdev_raidz_map_free() at 0xffffffff81e26312 = vdev_raidz_map_free+0x82/frame 0xfffffe08b36e0780
> zio_vdev_io_assess() at 0xffffffff81e48646 = zio_vdev_io_assess+0x116/frame 0xfffffe08b36e07b0
> zio_execute() at 0xffffffff81e4312c = zio_execute+0x36c/frame 0xfffffe08b36e0800
> zio_vdev_io_start() at 0xffffffff81e48184 = zio_vdev_io_start+0x454/frame 0xfffffe08b36e0860
> zio_execute() at 0xffffffff81e4312c = zio_execute+0x36c/frame 0xfffffe08b36e08b0
> zio_nowait() at 0xffffffff81e422b8 = zio_nowait+0xb8/frame 0xfffffe08b36e08e0
> vdev_mirror_io_start() at 0xffffffff81e224fc = vdev_mirror_io_start+0x38c/frame 0xfffffe08b36e0930
> zio_vdev_io_start() at 0xffffffff81e48030 = zio_vdev_io_start+0x300/frame 0xfffffe08b36e0990
> zio_execute() at 0xffffffff81e4312c = zio_execute+0x36c/frame 0xfffffe08b36e09e0
> taskqueue_run_locked() at 0xffffffff809a9d6d = taskqueue_run_locked+0x13d/frame 0xfffffe08b36e0a40
> taskqueue_thread_loop() at 0xffffffff809aab28 = taskqueue_thread_loop+0x88/frame 0xfffffe08b36e0a70
> fork_exit() at 0xffffffff8091e3e4 = fork_exit+0x84/frame 0xfffffe08b36e0ab0
> fork_trampoline() at 0xffffffff80d930fe = fork_trampoline+0xe/frame 0xfffffe08b36e0ab0
> --- trap 0, rip = 0, rsp = 0, rbp = 0 ---
> db>
>
> (kgdb) list *(abd_verify+0xd)
>
> 0x24a2d is in abd_verify
> (/usr/home/kenm/perforce4/kenm/FreeBSD-test/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c:231).
> 226 }
> 227
> 228 static inline void
> 229 abd_verify(abd_t *abd)
> 230 {
> 231 ASSERT3U(abd->abd_size, >, 0);
> 232 ASSERT3U(abd->abd_size, <=, SPA_MAXBLOCKSIZE);
> 233 ASSERT3U(abd->abd_flags, ==, abd->abd_flags & (ABD_FLAG_LINEAR |
> 234 ABD_FLAG_OWNER | ABD_FLAG_META));
> 235 IMPLY(abd->abd_parent != NULL, !(abd->abd_flags & ABD_FLAG_OWNER));
> (kgdb) list *(abd_put+0xf)
> 0x24eff is in abd_put
> (/usr/home/kenm/perforce4/kenm/FreeBSD-test/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/abd.c:514).
> 509 */
> 510 void
> 511 abd_put(abd_t *abd)
> 512 {
> 513 abd_verify(abd);
> 514 ASSERT(!(abd->abd_flags & ABD_FLAG_OWNER));
> 515
> 516 if (abd->abd_parent != NULL) {
> 517 (void) refcount_remove_many(&abd->abd_parent->abd_children,
> 518 abd->abd_size, abd);
> (kgdb) list *(vdev_raidz_map_free+0x82)
> 0xb8312 is in vdev_raidz_map_free
> (/usr/home/kenm/perforce4/kenm/FreeBSD-test/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/vdev_raidz.c:281).
> 276 zio_buf_free(rm->rm_col[c].rc_gdata,
> 277 rm->rm_col[c].rc_size);
> 278 }
> 279
> 280 size = 0;
> 281 for (c = rm->rm_firstdatacol; c < rm->rm_cols; c++) {
> 282 abd_put(rm->rm_col[c].rc_abd);
> 283 size += rm->rm_col[c].rc_size;
> 284 }
> 285
> (kgdb) list *(zio_vdev_io_assess+0x116)
> 0xda646 is in zio_vdev_io_assess
> (/usr/home/kenm/perforce4/kenm/FreeBSD-test/sys/cddl/contrib/opensolaris/uts/common/fs/zfs/zio.c:3315).
> 3310 if (vd == NULL && !(zio->io_flags & ZIO_FLAG_CONFIG_WRITER))
> 3311 spa_config_exit(zio->io_spa, SCL_ZIO, zio);
> 3312
> 3313 if (zio->io_vsd != NULL) {
> 3314 zio->io_vsd_ops->vsd_free(zio);
> 3315 zio->io_vsd = NULL;
> 3316 }
> 3317
> 3318 if (zio_injection_enabled && zio->io_error == 0)
> 3319 zio->io_error = zio_handle_fault_injection(zio, EIO);
> (kgdb)
>
> So, I disabled trim by setting vfs.zfs.trim.enabled=0 in the loader, and I
> can boot now.
>
> Ken
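
To illustrate the failure mode discussed above, here is a minimal userland sketch in C. It is not avg's patch and not the real ZFS code: the struct layouts and the helpers zio_abd_is_linear_safe() and put_columns() are hypothetical stand-ins that I made up, and only the identifiers abd_is_linear, io_abd, abd_flags, abd_size and ZIO_TYPE_FREE come from the traces. It models a zio of type ZIO_TYPE_FREE whose io_abd is NULL, and shows the "check for NULL before dereferencing" idea both for the linearity check and for a release loop like the one in vdev_raidz_map_free(). The second panic (fault address 0x4 in abd_verify) looks consistent with a NULL abd pointer as well, but that is my reading of the trace, not a confirmed diagnosis.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; NOT the real ZFS definitions. */
enum zio_type { ZIO_TYPE_READ, ZIO_TYPE_WRITE, ZIO_TYPE_FREE };

#define	ABD_FLAG_LINEAR	0x1u

typedef struct abd {
	unsigned abd_flags;
	unsigned abd_size;
} abd_t;

typedef struct zio {
	enum zio_type io_type;
	abd_t *io_abd;		/* NULL for a free (TRIM) zio in this model */
} zio_t;

/* Same shape as the real check: dereferences its argument unconditionally. */
static bool
abd_is_linear(const abd_t *abd)
{
	return ((abd->abd_flags & ABD_FLAG_LINEAR) != 0);
}

/*
 * Guarded variant (illustrative name): a zio without a data buffer is
 * reported as "not linear" instead of faulting on a NULL dereference.
 */
static bool
zio_abd_is_linear_safe(const zio_t *zio)
{
	return (zio->io_abd != NULL && abd_is_linear(zio->io_abd));
}

/*
 * Guarded release loop (illustrative name): skip columns that never had
 * a buffer attached. Returns the number of columns that were visited
 * with a non-NULL buffer.
 */
static int
put_columns(abd_t **cols, int ncols)
{
	int released = 0;

	for (int c = 0; c < ncols; c++) {
		if (cols[c] == NULL)
			continue;	/* nothing to verify or release */
		assert(cols[c]->abd_size > 0);
		released++;
	}
	return (released);
}
```

Whether skipping NULL buffers is the right fix, or whether free zios should never reach these code paths in the first place, is a design question for the ZFS maintainers; the sketch only shows the guard itself.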
--
O. Hartmann
I object to the use or transfer of my data for
advertising purposes or for market or opinion research (§ 28 Abs. 4 BDSG).