ZFS L2ARC checksum errors after compression

Andriy Gapon avg at FreeBSD.org
Sat Oct 29 13:33:44 UTC 2016


On 29/10/2016 14:36, Lev Serebryakov wrote:
> Hello freebsd-fs,
> 
>  System is FreeBSD 10.3-STABLE #0 r307523: Mon Oct 17 22:36:27 MSK 2016.
> 
>  I have a small L2ARC (185G) on SSD for my RAIDZ1 pool.
> 
>  When "ALLOC" on this L2ARC becomes greater than "SIZE" (that is compression
>  at work, am I right?), zfs-stats shows that the number of checksum errors
>  starts to rise. For example, I have this "zfs-stats -L" output now:
> 
>  L2 ARC Summary: (DEGRADED)
>         Passed Headroom:                        153.46k
>         Tried Lock Failures:                    9.65k
>         IO In Progress:                         4.33k
>         Low Memory Aborts:                      9
>         Free on Write:                          1.77k
>         Writes While Full:                      15.20k
>         R/W Clashes:                            0
>         Bad Checksums:                          104.95k
>         IO Errors:                              0
>         SPA Mismatch:                           4.10m
> 
> 
>  And "Bad Checksums" goes up rather fast: it reached 105.31k while I was
>  composing this message!
> 
>   It looks like there is some problem with L2ARC compression.
> 

I think that a recent upstream change, compressed ARC support, reintroduced an
old problem that was fixed a while ago.

It would be great if you could test this patch:
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c
===================================================================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	(revision 308050)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/arc.c	(working copy)
@@ -7028,7 +7028,22 @@ l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev,
 				continue;
 			}

-			if ((write_asize + HDR_GET_LSIZE(hdr)) > target_sz) {
+			/*
+			 * We rely on the L1 portion of the header below, so
+			 * it's invalid for this header to have been evicted out
+			 * of the ghost cache, prior to being written out. The
+			 * ARC_FLAG_L2_WRITING bit ensures this won't happen.
+			 */
+			ASSERT(HDR_HAS_L1HDR(hdr));
+
+			ASSERT3U(HDR_GET_PSIZE(hdr), >, 0);
+			ASSERT3P(hdr->b_l1hdr.b_pdata, !=, NULL);
+			ASSERT3U(arc_hdr_size(hdr), >, 0);
+			uint64_t size = arc_hdr_size(hdr);
+			uint64_t asize = vdev_psize_to_asize(dev->l2ad_vdev,
+			    size);
+
+			if ((write_asize + asize) > target_sz) {
 				full = B_TRUE;
 				mutex_exit(hash_lock);
 				ARCSTAT_BUMP(arcstat_l2_write_full);
@@ -7063,21 +7078,6 @@ l2arc_write_buffers(spa_t *spa, l2arc_dev_t *dev,
 			list_insert_head(&dev->l2ad_buflist, hdr);
 			mutex_exit(&dev->l2ad_mtx);

-			/*
-			 * We rely on the L1 portion of the header below, so
-			 * it's invalid for this header to have been evicted out
-			 * of the ghost cache, prior to being written out. The
-			 * ARC_FLAG_L2_WRITING bit ensures this won't happen.
-			 */
-			ASSERT(HDR_HAS_L1HDR(hdr));
-
-			ASSERT3U(HDR_GET_PSIZE(hdr), >, 0);
-			ASSERT3P(hdr->b_l1hdr.b_pdata, !=, NULL);
-			ASSERT3U(arc_hdr_size(hdr), >, 0);
-			uint64_t size = arc_hdr_size(hdr);
-			uint64_t asize = vdev_psize_to_asize(dev->l2ad_vdev,
-			    size);
-
 			(void) refcount_add_many(&dev->l2ad_alloc, size, hdr);

 			/*

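For illustration only, the size check that the patch introduces can be
sketched in user space as below. The sketch_* names are hypothetical, and
rounding up to a power-of-two block size given by ashift is a simplifying
assumption about what vdev_psize_to_asize() does for a plain disk vdev; this
is not the kernel code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for vdev_psize_to_asize(): round the on-disk
 * (physical) size up to a multiple of 2^ashift.  Assumption, not the
 * real kernel function.
 */
static uint64_t
sketch_psize_to_asize(uint64_t psize, unsigned ashift)
{
	assert(ashift < 64);
	uint64_t align = UINT64_C(1) << ashift;

	return ((psize + align - 1) & ~(align - 1));
}

/*
 * Mirror of the patched condition: a buffer fits in the current write
 * pass only if the bytes already queued plus its *allocated* size do
 * not exceed the target.  The pre-patch code compared against the
 * logical size, HDR_GET_LSIZE(hdr), instead.
 */
static bool
sketch_fits(uint64_t write_asize, uint64_t psize, unsigned ashift,
    uint64_t target_sz)
{
	uint64_t asize = sketch_psize_to_asize(psize, ashift);

	return (write_asize + asize <= target_sz);
}
```

With ashift = 12, a 5000-byte buffer is accounted as 8192 bytes, so the
check is made against the same quantity that would be charged to the device.
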
-- 
Andriy Gapon

