kern/146296: [zfs] [patch] fix deadlock during zfs receive (onnv 9299)

Martin Matuska mm at FreeBSD.org
Tue May 4 10:50:02 UTC 2010


>Number:         146296
>Category:       kern
>Synopsis:       [zfs] [patch] fix deadlock during zfs receive (onnv 9299)
>Confidential:   no
>Severity:       serious
>Priority:       medium
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Tue May 04 10:50:01 UTC 2010
>Closed-Date:
>Last-Modified:
>Originator:     Martin Matuska
>Release:        FreeBSD 8.0-STABLE amd64
>Organization:
>Environment:
>Description:
I have encountered a zfs receive that hangs while receiving many
incremental streams.

This problem has been described on the OpenSolaris mailing lists.
It matches my symptoms and affects our ZFS port.

OpenSolaris Bug IDs:
	6783818 Incremental stream receive panics system
	6826836 Deadlock possible in dmu_object_reclaim()

Mailing list discussion:
http://mail.opensolaris.org/pipermail/storage-discuss/2009-June/006171.html

Fixed in onnv revision: 9299:8809e849f63e
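
The dmu_object.c part of the patch moves the dmu_free_long_range() call
ahead of the transaction setup and routes every failure through a single
exit label. A minimal sketch of that control flow follows. It is not the
kernel code: the stub functions (free_long_range, tx_assign, reallocate,
tx_commit, dnode_release), the trace buffer and reclaim_sketch itself are
hypothetical stand-ins that only record the order of calls, and the
dmu_tx_abort() step of the real patch is left out.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-ins for the DMU calls; they only record call order. */
static char trace[16];
static size_t tracelen;
static int free_err;	/* error injected into the "free" step */

static void
record(char c)
{
	assert(tracelen + 1 < sizeof(trace));
	trace[tracelen++] = c;
	trace[tracelen] = '\0';
}

static void
reset_trace(int inject)
{
	memset(trace, 0, sizeof(trace));
	tracelen = 0;
	free_err = inject;
}

static int free_long_range(void) { record('F'); return (free_err); }
static int tx_assign(void) { record('A'); return (0); }
static void reallocate(void) { record('R'); }
static void tx_commit(void) { record('C'); }
static void dnode_release(void) { record('D'); }

/*
 * Same ordering as the patched function: free the old contents first,
 * open the transaction afterwards, release the dnode on every path.
 */
static int
reclaim_sketch(int must_free)
{
	int err = 0;

	if (must_free) {
		err = free_long_range();
		if (err)
			goto out;
	}

	err = tx_assign();
	if (err)
		goto out;

	reallocate();
	tx_commit();
out:
	dnode_release();

	return (err);
}
```

Calling reset_trace(0) and then reclaim_sketch(1) leaves "FARCD" in
trace: the free step runs before the transaction is assigned. With
reset_trace(5) the call returns 5 and the trace is "FD", so the error
is propagated and the dnode is still released.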
>How-To-Repeat:
>Fix:
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c
===================================================================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c	(revision 207608)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dmu_object.c	(working copy)
@@ -128,15 +128,6 @@
 		return (0);
 	}
 
-	tx = dmu_tx_create(os);
-	dmu_tx_hold_bonus(tx, object);
-	err = dmu_tx_assign(tx, TXG_WAIT);
-	if (err) {
-		dmu_tx_abort(tx);
-		dnode_rele(dn, FTAG);
-		return (err);
-	}
-
 	nblkptr = 1 + ((DN_MAX_BONUSLEN - bonuslen) >> SPA_BLKPTRSHIFT);
 
 	/*
@@ -144,16 +135,27 @@
 	 * be a new file instance.   We must clear out the previous file
 	 * contents before we can change this type of metadata in the dnode.
 	 */
-	if (dn->dn_nblkptr > nblkptr || dn->dn_datablksz != blocksize)
-		dmu_free_long_range(os, object, 0, DMU_OBJECT_END);
+	if (dn->dn_nblkptr > nblkptr || dn->dn_datablksz != blocksize) {
+		err = dmu_free_long_range(os, object, 0, DMU_OBJECT_END);
+		if (err)
+			goto out;
+	}
 
+	tx = dmu_tx_create(os);
+	dmu_tx_hold_bonus(tx, object);
+	err = dmu_tx_assign(tx, TXG_WAIT);
+	if (err) {
+		dmu_tx_abort(tx);
+		goto out;
+	}
+
 	dnode_reallocate(dn, ot, blocksize, bonustype, bonuslen, tx);
 
 	dmu_tx_commit(tx);
-
+out:
 	dnode_rele(dn, FTAG);
 
-	return (0);
+	return (err);
 }
 
 int
Index: sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c
===================================================================
--- sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c	(revision 207608)
+++ sys/cddl/contrib/opensolaris/uts/common/fs/zfs/dbuf.c	(working copy)
@@ -464,15 +464,15 @@
 	ASSERT(db->db_buf == NULL);
 
 	if (db->db_blkid == DB_BONUS_BLKID) {
-		int bonuslen = dn->dn_bonuslen;
+		int bonuslen = MIN(dn->dn_bonuslen, dn->dn_phys->dn_bonuslen);
 
 		ASSERT3U(bonuslen, <=, db->db.db_size);
 		db->db.db_data = zio_buf_alloc(DN_MAX_BONUSLEN);
 		arc_space_consume(DN_MAX_BONUSLEN);
 		if (bonuslen < DN_MAX_BONUSLEN)
 			bzero(db->db.db_data, DN_MAX_BONUSLEN);
-		bcopy(DN_BONUS(dn->dn_phys), db->db.db_data,
-		    bonuslen);
+		if (bonuslen)
+			bcopy(DN_BONUS(dn->dn_phys), db->db.db_data, bonuslen);
 		dbuf_update_data(db);
 		db->db_state = DB_CACHED;
 		mutex_exit(&db->db_mtx);
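
The dbuf.c hunk above clamps the copy length to the smaller of the
in-core and on-disk bonus lengths and skips the copy when that length
is zero. A user-space sketch of the same pattern follows. The helper
fill_bonus, its parameters and the SKETCH_MAX_BONUSLEN value are
assumptions made for illustration, and memset/memcpy stand in for the
kernel's bzero/bcopy.

```c
#include <assert.h>
#include <string.h>

#define	MIN(a, b)	((a) < (b) ? (a) : (b))
#define	SKETCH_MAX_BONUSLEN	320	/* value chosen for this sketch only */

/*
 * Hypothetical helper: fill a SKETCH_MAX_BONUSLEN-byte buffer from the
 * on-disk bonus area, copying no more than both lengths allow.
 * Returns the number of bytes copied.
 */
static size_t
fill_bonus(unsigned char *dst, const unsigned char *phys,
    size_t in_core_len, size_t on_disk_len)
{
	size_t bonuslen = MIN(in_core_len, on_disk_len);

	assert(bonuslen <= SKETCH_MAX_BONUSLEN);
	if (bonuslen < SKETCH_MAX_BONUSLEN)
		memset(dst, 0, SKETCH_MAX_BONUSLEN);	/* zero the buffer first */
	if (bonuslen)
		memcpy(dst, phys, bonuslen);		/* skip zero-length copies */

	return (bonuslen);
}
```

With an in-core length of 100 and an on-disk length of 16, fill_bonus
copies 16 bytes and leaves the remainder of the buffer zeroed.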
>Release-Note:
>Audit-Trail:
>Unformatted:

