misc/186574: zpool history hangs (infinite loop)

Andrew Childs lorne at cons.org.nz
Sun Feb 9 05:20:00 UTC 2014


>Number:         186574
>Category:       misc
>Synopsis:       zpool history hangs (infinite loop)
>Confidential:   no
>Severity:       non-critical
>Priority:       low
>Responsible:    freebsd-bugs
>State:          open
>Quarter:        
>Keywords:       
>Date-Required:
>Class:          sw-bug
>Submitter-Id:   current-users
>Arrival-Date:   Sun Feb 09 05:20:00 UTC 2014
>Closed-Date:
>Last-Modified:
>Originator:     Andrew Childs
>Release:        FreeBSD 10.0-RELEASE amd64
>Organization:
>Environment:
System: FreeBSD northind.cons.org.nz 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu Jan 16 22:34:59 UTC 2014 root at snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64


	
>Description:

Running "zpool history zroot" on my FreeBSD 10-RELEASE machine results
in zpool entering an infinite loop.

>How-To-Repeat:

Run "zpool history" on an affected pool.

>Fix:

This appears to be caused by having a history record larger than
libzfs's HIS_BUF_LEN.

zpool_get_history reads HIS_BUF_LEN sized chunks and parses entire
records at a time. When an entire record isn't contained within a
single chunk, zpool_get_history correctly concludes the result was
truncated. The next read is aligned such that any truncated record
will be at the beginning of the buffer. zpool_get_history then loops
with the assumption this will read the entire record. When a record
larger than HIS_BUF_LEN is encountered, it will always be truncated,
even if it's at the start of the buffer, and zpool_get_history will
loop indefinitely.

With a warning patched in:

  History for 'zroot':
  HIS_BUF_LEN (0x20000) too small to fit record of length 0x239c8

With a dynamically increasing buffer the history is successfully
printed.

I don't believe the particular record is malformed. The slightly
redacted output of running strings over the data is here:
https://gist.github.com/thefloweringash/1fd434541c05fc091b10

Patch attached to introduce a dynamically sized buffer. This solves
the problem for me. However I don't know how HIS_BUF_LEN was chosen,
and could be patching a symptom.

--- libzfs_pool-dynamic-history-buffer begins here ---
--- libzfs_pool.c.orig	2014-02-09 16:25:17.210736940 +1300
+++ libzfs_pool.c	2014-02-09 17:50:52.585383154 +1300
@@ -3745,14 +3745,19 @@
 int
 zpool_get_history(zpool_handle_t *zhp, nvlist_t **nvhisp)
 {
-	char buf[HIS_BUF_LEN];
+	char *buf;
+	uint64_t buflen = HIS_BUF_LEN;
 	uint64_t off = 0;
 	nvlist_t **records = NULL;
 	uint_t numrecords = 0;
 	int err, i;
 
+	buf = malloc(buflen);
+	if (!buf)
+		return (1);
+
 	do {
-		uint64_t bytes_read = sizeof (buf);
+		uint64_t bytes_read = buflen;
 		uint64_t leftover;
 
 		if ((err = get_history(zhp, buf, &off, &bytes_read)) != 0)
@@ -3765,6 +3770,26 @@
 		if ((err = zpool_history_unpack(buf, bytes_read,
 		    &leftover, &records, &numrecords)) != 0)
 			break;
+
+		if (leftover == bytes_read) {
+			// No progress was made. Check if we're
+			// attempting to read a record longer than our
+			// buffer.
+
+			uint64_t reclen;
+			for (i = 0, reclen = 0; i < sizeof (reclen); i++)
+				reclen += (uint64_t)(((uchar_t *)buf)[i]) << (8*i);
+
+			while (buflen < sizeof(reclen) + reclen)
+				buflen <<= 1;
+
+			buf = realloc(buf, buflen);
+			if (!buf) {
+				err = 1;
+				break;
+			}
+		}
+
 		off -= leftover;
 
 		/* CONSTCOND */
@@ -3779,6 +3804,8 @@
 		nvlist_free(records[i]);
 	free(records);
 
+	free(buf);
+
 	return (err);
 }
 
--- libzfs_pool-dynamic-history-buffer ends here ---


>Release-Note:
>Audit-Trail:
>Unformatted:


More information about the freebsd-bugs mailing list