FS hang when creating snapshots on a UFS SU+J setup

I've run some tests to verify that the problem only occurs when SU+J
is used, and not with SU without J. I ran the following two loops in
parallel on different TTYs:

while (1)
 cp -r /usr/src /root
 rm -Rf /root/src
end

while (1)
 mksnap_ffs / /.snap/snap
 rm -f /.snap/snap
end

With SU without J the system survives this for at least 1 hour. But as
soon as SU+J is used it most likely deadlocks or even panics within the
first 1 or 2 minutes. What exactly happens seems to vary. In most
cases the system just deadlocks, sometimes as alain at bsdgate.org
describes, and sometimes it is completely unresponsive to any input.
I've seen kernel messages like "fsync: giving up on dirty".

Several times the system panicked. In most cases it printed the generic
"panic: page fault while in kernel mode", and once it printed
"panic: snapacct_ufs2: bad block". I've never seen the same
backtrace twice. Once the system suddenly rebooted, as if a triple
fault or something similar had happened.

Since the problems described above are much more likely to arise when
the filesystem is under load (for example from the first loop) while
the snapshot is being taken, this looks like some kind of race
condition.
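
For anyone who wants to reproduce this from a single terminal, the two
loops above could be combined roughly as in the following sh sketch. It
is not the exact procedure used for the tests above: the variable names
(SRC, DST, FS, SNAP, DURATION, DRY_RUN) and the function names are made
up for illustration, and the time limit is an addition. With DRY_RUN=1
the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch only: bounded version of the two stress loops from this report.
# All variable and function names here are illustrative assumptions.

: "${SRC:=/usr/src}"        # tree that is copied to generate load
: "${DST:=/root}"           # destination directory for the copy
: "${FS:=/}"                # filesystem to snapshot
: "${SNAP:=/.snap/snap}"    # snapshot file name
: "${DURATION:=3600}"       # seconds to keep both loops running
: "${DRY_RUN:=0}"           # 1 = print the commands instead of running them

# Run a command, or just print it when DRY_RUN=1.
run() {
    if [ "$DRY_RUN" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# One iteration of the load loop: copy the tree, then remove the copy.
load_once() {
    run cp -r "$SRC" "$DST"
    run rm -Rf "$DST/$(basename "$SRC")"
}

# One iteration of the snapshot loop: take a snapshot, then remove it.
snap_once() {
    run mksnap_ffs "$FS" "$SNAP"
    run rm -f "$SNAP"
}

# Run both loops in the background for DURATION seconds, then stop them.
# A cp or mksnap_ffs that is already running is not interrupted; only
# the looping subshells are killed.
stress() {
    ( while :; do load_once; done ) &
    load_pid=$!
    ( while :; do snap_once; done ) &
    snap_pid=$!
    sleep "$DURATION"
    kill "$load_pid" "$snap_pid"
    wait "$load_pid" "$snap_pid" 2>/dev/null
}
```

After sourcing the file, something like `DURATION=600 stress` would run
both loops for ten minutes. On an affected SU+J system the machine may
of course hang or panic before the time limit is reached.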

Some more information from an older debug session can be found at:

On Tue, 10 Jan 2012 10:30:13 -0800

> First step in debugging is to find out if the problem is SU+J
> specific. To find out, turn off SU+J but leave SU. This change
> is done by running:
> 	umount <filesystem>
> 	tunefs -j disable <filesystem>
> 	mount <filesystem>
> 	cd <filesystem>
> 	rm .sujournal
> You may want to run `fsck -f' on the filesystem while you have
> it unmounted just to be sure that it is clean. Then run your
> snapshot request to see if it still fails. If it works, then
> we have narrowed the problem down to something related to SU+J.
> If it fails then we have a broader issue to deal with.
> If you wish to go back to using SU+J after the test, you can
> reenable SU+J by running:
> 	umount <filesystem>
> 	tunefs -j enable <filesystem>
> 	mount <filesystem>
> When responding to me, it is best to use my <mckusick at mckusick.com>
> email as I tend to read it more regularly.
> 	Kirk McKusick

