misc/186574: zpool history hangs (infinite loop)
Andrew Childs
lorne at cons.org.nz
Sun Feb 9 05:20:00 UTC 2014
>Number: 186574
>Category: misc
>Synopsis: zpool history hangs (infinite loop)
>Confidential: no
>Severity: non-critical
>Priority: low
>Responsible: freebsd-bugs
>State: open
>Quarter:
>Keywords:
>Date-Required:
>Class: sw-bug
>Submitter-Id: current-users
>Arrival-Date: Sun Feb 09 05:20:00 UTC 2014
>Closed-Date:
>Last-Modified:
>Originator: Andrew Childs
>Release: FreeBSD 10.0-RELEASE amd64
>Organization:
>Environment:
System: FreeBSD northind.cons.org.nz 10.0-RELEASE FreeBSD 10.0-RELEASE #0 r260789: Thu Jan 16 22:34:59 UTC 2014 root at snap.freebsd.org:/usr/obj/usr/src/sys/GENERIC amd64
>Description:
Running "zpool history zroot" on my FreeBSD 10-RELEASE machine results
in zpool entering an infinite loop.
>How-To-Repeat:
Run "zpool history" on an affected pool.
>Fix:
This appears to be caused by having a history record larger than
libzfs's HIS_BUF_LEN.
zpool_get_history reads HIS_BUF_LEN sized chunks and parses entire
records at a time. When an entire record isn't contained within a
single chunk, zpool_get_history correctly concludes the result was
truncated. The next read is aligned such that any truncated record
will be at the beginning of the buffer. zpool_get_history then loops
with the assumption this will read the entire record. When a record
larger than HIS_BUF_LEN is encountered, it will always be truncated,
even if it's at the start of the buffer, and zpool_get_history will
loop indefinitely.
With a warning patched in:
History for 'zroot':
HIS_BUF_LEN (0x20000) too small to fit record of length 0x239c8
With a dynamically increasing buffer the history is successfully
printed.
I don't believe the particular record is malformed. The slightly
redacted output of running strings over the data is here:
https://gist.github.com/thefloweringash/1fd434541c05fc091b10
Patch attached to introduce a dynamically sized buffer. This solves
the problem for me. However I don't know how HIS_BUF_LEN was chosen,
and could be patching a symptom.
--- libzfs_pool-dynamic-history-buffer begins here ---
--- libzfs_pool.c.orig 2014-02-09 16:25:17.210736940 +1300
+++ libzfs_pool.c 2014-02-09 17:50:52.585383154 +1300
@@ -3745,14 +3745,19 @@
int
zpool_get_history(zpool_handle_t *zhp, nvlist_t **nvhisp)
{
- char buf[HIS_BUF_LEN];
+ char *buf;
+ uint64_t buflen = HIS_BUF_LEN;
uint64_t off = 0;
nvlist_t **records = NULL;
uint_t numrecords = 0;
int err, i;
+ buf = malloc(buflen);
+ if (!buf)
+ return (1);
+
do {
- uint64_t bytes_read = sizeof (buf);
+ uint64_t bytes_read = buflen;
uint64_t leftover;
if ((err = get_history(zhp, buf, &off, &bytes_read)) != 0)
@@ -3765,6 +3770,26 @@
if ((err = zpool_history_unpack(buf, bytes_read,
&leftover, &records, &numrecords)) != 0)
break;
+
+ if (leftover == bytes_read) {
+ // No progress was made. Check if we're
+ // attempting to read a record longer than our
+ // buffer.
+
+ uint64_t reclen;
+ for (i = 0, reclen = 0; i < sizeof (reclen); i++)
+ reclen += (uint64_t)(((uchar_t *)buf)[i]) << (8*i);
+
+ while (buflen < sizeof(reclen) + reclen)
+ buflen <<= 1;
+
+ buf = realloc(buf, buflen);
+ if (!buf) {
+ err = 1;
+ break;
+ }
+ }
+
off -= leftover;
/* CONSTCOND */
@@ -3779,6 +3804,8 @@
nvlist_free(records[i]);
free(records);
+ free(buf);
+
return (err);
}
--- libzfs_pool-dynamic-history-buffer ends here ---
>Release-Note:
>Audit-Trail:
>Unformatted:
More information about the freebsd-bugs
mailing list