ufs snapshot is sometimes corrupt on gjourneled partition
Konstantin Belousov
kostikbel at gmail.com
Tue Jul 18 10:22:06 UTC 2017
On Mon, Jul 17, 2017 at 05:44:20PM -0700, Kirk McKusick wrote:
> The sequence of calls when using bread is:
>
> Function           Line  File
> --------           ----  ----
> bread               491  sys/buf.h
> breadn_flags       1814  kern/vfs_bio.c
> bstrategy           397  sys/buf.h
> BO_STRATEGY          86  sys/bufobj.h
> bufstrategy        4535  kern/vfs_bio.c
> ufs_strategy       2290  ufs/ufs/ufs_vnops.c
>   BO_STRATEGY on filesystem device -> ffs_geom_strategy
> ffs_geom_strategy  2141  ufs/ffs/ffs_vfsops.c
> g_vfs_strategy      161  geom/geom_vfs.c
> g_io_request        470  geom/geom_io.c
>
> Whereas readblock skips all these steps and calls g_io_request
> directly. In my looking at the gjournal code, I believe that we
> will still enter it with the g_io_request() call as I believe that
> it does not hook itself into any of the VOP_ call structure, but I
> have not done a backtrace to confirm this fact. Assuming that we
> are still getting into g_journal_start(), then it should be possible
> to catch reads that are only in the log and pull out the data as
> needed.
>
> Another alternative is to send gjournal a request to flush its log
> before starting the removal of a snapshot.

I do not think that the UFS call sequence is relevant here. It is clearly
a malfunction of the underlying I/O device (gjournal) if it returns a data
block that differs from the most recent successfully written block. As such,
whether UFS passes the read request from the buffer cache through the
BO_STRATEGY layers, or creates a bio directly and reads the block, is not
important.

OTOH, I do not think the issue is that gjournal always reads from the
data area and misses the journal. The failure would be much more
spectacular in that case. I see some gjournal code which tries to find the
data in a 'cache' on read, whatever that means. Clearly, it sometimes
does not find the data. The failure is probably further hidden by the
buffer cache eliminating most reads of recently written data.
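
To make the expected behaviour concrete, here is a toy model in C of a
journaled block device. It is not gjournal code and makes no claim about
how gjournal is implemented: every name in it (toydev, toy_write,
toy_read, toy_read_area_only, toy_flush) is hypothetical, and the sizes
are arbitrary. It only illustrates the invariant that a read must return
the latest successfully written block, and how a read that skips the
pending journal entries returns stale data until the journal is flushed.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NBLOCKS 8   /* blocks in the toy data area (arbitrary) */
#define BLKSIZE 16  /* toy block size in bytes (arbitrary) */
#define JSLOTS  4   /* pending journal entries before a forced flush */

/* One write that is in the journal but not yet in the data area. */
struct jentry {
    uint32_t blkno;
    uint8_t  data[BLKSIZE];
};

/* Hypothetical journaled device: a data area plus a small journal. */
struct toydev {
    uint8_t       area[NBLOCKS][BLKSIZE]; /* final location of each block */
    struct jentry journal[JSLOTS];        /* pending writes, oldest first */
    int           njournal;
};

/* Copy pending entries to the data area, oldest first, then empty the journal. */
static void toy_flush(struct toydev *d)
{
    for (int i = 0; i < d->njournal; i++)
        memcpy(d->area[d->journal[i].blkno], d->journal[i].data, BLKSIZE);
    d->njournal = 0;
}

/* A write goes to the journal only; the data area is updated at flush time. */
static void toy_write(struct toydev *d, uint32_t blkno, const uint8_t *buf)
{
    assert(blkno < NBLOCKS);
    if (d->njournal == JSLOTS)
        toy_flush(d);
    struct jentry *e = &d->journal[d->njournal++];
    e->blkno = blkno;
    memcpy(e->data, buf, BLKSIZE);
}

/* Correct read: the newest matching journal entry wins, else the data area. */
static void toy_read(const struct toydev *d, uint32_t blkno, uint8_t *buf)
{
    for (int i = d->njournal - 1; i >= 0; i--) {
        if (d->journal[i].blkno == blkno) {
            memcpy(buf, d->journal[i].data, BLKSIZE);
            return;
        }
    }
    memcpy(buf, d->area[blkno], BLKSIZE);
}

/* Deliberately wrong read: ignores the journal, so it can return stale data. */
static void toy_read_area_only(const struct toydev *d, uint32_t blkno,
    uint8_t *buf)
{
    memcpy(buf, d->area[blkno], BLKSIZE);
}

/* Returns 1 if the model behaves as described, 0 otherwise. */
static int toy_read_after_write_ok(void)
{
    struct toydev d;
    uint8_t newdata[BLKSIZE], got[BLKSIZE], stale[BLKSIZE];

    memset(&d, 0, sizeof(d));
    memset(newdata, 0xAB, sizeof(newdata));
    toy_write(&d, 3, newdata);

    /* The journal-aware read sees the new data immediately. */
    toy_read(&d, 3, got);
    if (memcmp(got, newdata, BLKSIZE) != 0)
        return 0;

    /* The journal-blind read still sees the old (zeroed) block. */
    toy_read_area_only(&d, 3, stale);
    if (memcmp(stale, newdata, BLKSIZE) == 0)
        return 0;

    /* After a flush, even the journal-blind read is correct. */
    toy_flush(&d);
    toy_read_area_only(&d, 3, stale);
    return memcmp(stale, newdata, BLKSIZE) == 0;
}
```

Calling toy_read_after_write_ok() from a small main() exercises all three
cases. In this model a lookup that misses the journal only some of the
time would give exactly the intermittent stale reads described above.
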

So the way to fix the bug is to read the gjournal code and understand why
it sometimes returns wrong data. For instance, there were relatively
recent changes to the geom infrastructure allowing for direct completion of
bios. Anyway, I have no interest in gjournal.