Hang when importing pool

Karli Sjöberg Karli.Sjoberg at slu.se
Wed Aug 15 06:46:52 UTC 2012


On 31 Jul 2012, at 17:31, Freddie Cash wrote:

On Mon, Jul 30, 2012 at 11:31 PM, Karli Sjöberg <Karli.Sjoberg at slu.se> wrote:
I'm really struggling with this. I had a pool with filesystems imported from a Solaris system that had dedup activated. When the time came to erase them, the operation just stalled. After a reboot, it stalled again at mounting filesystems. Since then, I've installed two USB drives to act as a root pool with FreeBSD-9.0-RELEASE so that I could import the original pool in recovery, but the import always stalls after a couple of hours. Looking at top, I could see that the 16GB of RAM was maxed out, so I have heavily tuned down kmem, arc, etc:

You're running out of RAM during the import, as it loads the DDT.
Stuff a bunch more RAM into the machine (32 GB, 48 GB, even 64 GB).
Then you will be able to load the full DDT into RAM, finish the
aborted destroy process, and import the pool.

We've run into this three or four times now on systems with dedupe
enabled and only 16 GB of RAM.  We've since upgraded all our boxes to
a minimum of 32 GB, with one having 64 GB.

ZFS dataset destruction with dedupe enabled takes *a lot* of time and
RAM, as the DDT needs to be updated for every block freed.  And
rebooting in the middle of a "zfs destroy" operation means that the
operation needs to finish at pool import time.
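
To get a feel for why 16 GB can fall short, the in-core size of the DDT can be roughed out as entries times bytes per entry. The sketch below is only that arithmetic: the helper name ddt_core_mib is hypothetical, the entry count is a made-up illustration (not taken from this pool), and the 320 bytes per entry is a commonly quoted rule of thumb treated here as an assumption. A real entry count would have to come from something like zdb's dedup statistics for the pool in question.

```shell
#!/bin/sh
# Hypothetical helper: estimate the in-core DDT size in MiB.
#   $1 = number of DDT entries (illustrative; obtain the real figure from the pool)
#   $2 = assumed bytes of RAM per entry (rule of thumb, not measured)
ddt_core_mib() {
    echo $(( $1 * $2 / 1048576 ))
}

# Illustration only: 100 million entries at an assumed 320 bytes each
ddt_core_mib 100000000 320
```

With those assumed inputs the estimate lands at roughly 30 GiB, which is the kind of figure that makes a 16 GB machine struggle; the real number depends entirely on the pool's block count and dedup ratio.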

--
Freddie Cash
fjwcash at gmail.com

I took your advice. I replaced my Core i5 with a Xeon X3470 and ramped up the RAM to 32GB, maxing out the HW. Sadly enough, it still stalls in the exact same manner :( This has to be the most frustrating thing ever, since there's tons of data there that I really need, and if it weren't for that stupid destroy operation, it would still be accessible.

I feel that FreeBSD is partly to blame, since the originating Sun machine running Solaris, which only has 16GB RAM, was able to do the same destroy on the same dataset without any problem. Sure, it took forever and then some (about two weeks), but it stayed afloat the whole time.

The FreeBSD machine accumulates RAM quite steadily until it reaches 9-9.5GB Wired, and then it just SHOOTS off to swallow it all and causes a stall. That is the same behavior as when there was only 16GB RAM. During the last attempt I had

# Print ARC and L2ARC figures every 10 seconds while the import runs
while true; do
    zfs-stats -A | grep "ARC Size:"
    zfs-stats -L | egrep '(L2 ARC Size|Bytes Scanned)'
    sleep 10
done

running, so I could monitor the usage just before the crash. The ARC stayed at 6GB, while the last top sample shows 28GB Wired. See for yourselves:

top:
http://i45.tinypic.com/21do5ra.png

gstat:
http://i49.tinypic.com/e197ax.png

zfs-stats:
http://i46.tinypic.com/250uxhz.png

Import CTRL+T after it stalled:
http://i46.tinypic.com/2uxvb4h.png
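
For anyone wanting to log the gap between Wired and ARC as numbers rather than screenshots, a minimal sketch follows. It assumes the kernel exposes the sysctl names hw.pagesize, vm.stats.vm.v_wire_count and kstat.zfs.misc.arcstats.size (check with sysctl -a first); the function names and the log path are hypothetical.

```shell
#!/bin/sh
# Sketch: log how much Wired memory is NOT accounted for by the ARC.
# The sysctl names used in sample() are assumptions about what the
# running FreeBSD kernel exposes; verify them before relying on this.

# $1 = byte count; prints whole MiB
bytes_to_mib() {
    echo $(( $1 / 1048576 ))
}

# $1 = wired page count, $2 = page size in bytes, $3 = ARC size in bytes
non_arc_wired_mib() {
    echo $(( ($1 * $2 - $3) / 1048576 ))
}

# Take one sample and print wired, ARC, and the remainder in MiB
sample() {
    pagesize=$(sysctl -n hw.pagesize)
    wired_pages=$(sysctl -n vm.stats.vm.v_wire_count)
    arc_bytes=$(sysctl -n kstat.zfs.misc.arcstats.size)
    echo "$(date +%T) wired=$(bytes_to_mib $((wired_pages * pagesize)))MiB" \
         "arc=$(bytes_to_mib "$arc_bytes")MiB" \
         "other=$(non_arc_wired_mib "$wired_pages" "$pagesize" "$arc_bytes")MiB"
}

# To run alongside the import (hypothetical log path):
#   while true; do sample; sleep 10; done >> /var/tmp/wired.log
```

With the figures reported above (28GB Wired, 6GB ARC), the "other" column would show about 22GB of wired memory held outside the ARC, which is the part worth tracking over time.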


I'm willing to try anything at this point. Any long shots you have are most welcome, since it couldn't get any worse :(



Best regards
-------------------------------------------------------------------------------
Karli Sjöberg
Swedish University of Agricultural Sciences
Box 7079 (Visiting Address Kronåsvägen 8)
S-750 07 Uppsala, Sweden
Phone:  +46-(0)18-67 15 66
karli.sjoberg at slu.se


