bin/157244: dump/restore: unknown tape header type -230747966

Gene Stark gene at home.starkeffect.com
Sun May 22 13:20:10 UTC 2011


The following reply was made to PR bin/157244; it has been noted by GNATS.

From: Gene Stark <gene at home.starkeffect.com>
To: FreeBSD-gnats-submit at FreeBSD.org, freebsd-bugs at FreeBSD.org
Cc:  
Subject: Re: bin/157244: dump/restore: unknown tape header type  -230747966
Date: Sun, 22 May 2011 08:33:23 -0400

 I wrote a program to compare the blocks in another copy of one of the
 large files in the dump with the version extracted by restore after
 applying my header reordering program.  The program reads each of the
 files in blocks of TP_BSIZE bytes, computes the SHA1 hash of each
 block, and stores the resulting <hash, offset> pairs in a hash map,
 one per file.  It then takes the union of the key sets of the two
 hash maps to obtain a single master list of block hashes, traverses
 that list to construct a map <offset, <offset0, offset1>> giving the
 correspondence between the blocks in the two files, and prints the
 contents of that map in increasing order of offset, showing the
 differences between the two files.  Here is the initial part of the
 result:
 
 Lectures.zip.bad: 52469795 bytes
 Lectures.zip.good: 52469795 bytes
 11612   11622   10
 11613   11623   10
 11614   11624   10
 11615   11625   10
 11616   11626   10
 11617   11627   10
 11618   11628   10
 11619   11629   10
 11620   11630   10
 11621   11631   10
 11622   11632   10
 11623   11633   10
 11624   11634   10
 11625   11635   10
 11626   11636   10
 11627   11637   10
 11628   11638   10
 11629   11639   10
 11630   11640   10
 11631   11641   10
 11632   11612   -20
 11633   11613   -20
 11634   11614   -20
 11635   11615   -20
 11636   11616   -20
 11637   11617   -20
 11638   11618   -20
 11639   11619   -20
 11640   11620   -20
 11641   11621   -20
 11642   11652   10
 11643   11653   10
 11644   11654   10
 11645   11655   10
 11646   11656   10
 11647   11657   10
 11648   11658   10
 11649   11659   10
 11650   11660   10
 11651   11661   10
 11652   11662   10
 11653   11663   10
 11654   11664   10
 11655   11665   10
 11656   11666   10
 11657   11667   10
 11658   11668   10
 11659   11669   10
 11660   11670   10
 11661   11671   10
 11662   11642   -20
 11663   11643   -20
 11664   11644   -20
 11665   11645   -20
 11666   11646   -20
 11667   11647   -20
 11668   11648   -20
 11669   11649   -20
 11670   11650   -20
 11671   11651   -20
 11672   11682   10
 11673   11683   10
 
 The pattern repeats this way for *almost* the entire file.
 There are sets of 20 blocks that occur 10 blocks ahead of the
 corresponding blocks in the other file, and then a set of 10
 blocks that occur 20 blocks behind the corresponding blocks
 in the other file.  There are occasional values of 9 and 19
 for the differences, which I don't have a ready explanation for,
 except that my header reordering relied on the magic number
 to identify the header blocks and it is possible there were
 a few blocks that were misidentified as headers that were actually
 data blocks.  At the end of the files there are a few blocks
 that do not correspond; this is probably because misalignment
 at the end caused some of the last data blocks to be used
 as the first blocks for the next file in the dump.
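 
 The block comparison described above can be sketched roughly as
 follows.  This is not the program I ran, only a minimal Python
 illustration of the same idea.  The names block_index and
 moved_blocks are made up for the sketch, the value 1024 for TP_BSIZE
 is assumed, and for brevity it intersects the two key sets (reporting
 only blocks present in both files at different positions) instead of
 taking their union.

```python
import hashlib

TP_BSIZE = 1024  # assumed tape block size used by dump/restore, in bytes


def block_index(path, bsize=TP_BSIZE):
    """Return {sha1 digest: block number} for each bsize-byte block of path."""
    index = {}
    with open(path, "rb") as f:
        blockno = 0
        while True:
            block = f.read(bsize)
            if not block:
                break
            # Simplification: if identical blocks repeat, keep the first one.
            index.setdefault(hashlib.sha1(block).digest(), blockno)
            blockno += 1
    return index


def moved_blocks(path0, path1, bsize=TP_BSIZE):
    """List (block0, block1, block1 - block0) for every block that occurs in
    both files at different positions, in increasing order of block0."""
    idx0 = block_index(path0, bsize)
    idx1 = block_index(path1, bsize)
    rows = []
    for digest in idx0.keys() & idx1.keys():
        b0, b1 = idx0[digest], idx1[digest]
        if b0 != b1:
            rows.append((b0, b1, b1 - b0))
    return sorted(rows)
```

 Calling moved_blocks on the two copies of a file and printing each
 row yields three columns in the same layout as the table above:
 block number in the first file, block number in the second file, and
 the difference.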
 
 To test my suspicion that it is a concurrency issue in dump,
 I recompiled dump after setting #define SLAVES 1 in tape.c
 (rather than the value 3 it had before).  I was then able to
 complete two rounds of "dump 0f - /mail | restore rfN -"
 without any errors, whereas if I use /sbin/dump it fails
 very quickly as indicated in the original PR.
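 
 One possible reading of the +10/-20 pattern, offered only as a guess:
 if three writers each emit one 10-block record in turn and the first
 record of every group of three lands last, the offsets come out
 exactly as in the table.  The sketch below models just that
 arithmetic.  The function name rotated_offsets, the record size of 10
 and the window alignment are assumptions chosen to match the table
 and the original SLAVES value of 3, not something taken from the dump
 sources.

```python
def rotated_offsets(nblocks, record=10, slaves=3):
    """Model a stream in which, within each window of record * slaves blocks,
    the first record is written last.

    Returns rows (position, source, source - position), where source is the
    block number the data at that position should have come from.
    """
    window = record * slaves
    rows = []
    for pos in range(nblocks):
        base = pos - pos % window  # first block of the window containing pos
        src = base + (pos % window + record) % window
        rows.append((pos, src, src - pos))
    return rows
```

 With the default arguments, the first 20 rows of each 30-block window
 have a difference of 10 and the last 10 rows have a difference of
 -20, which is the shape seen in the comparison output.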
 
 I am not familiar with the locking features, etc. being used in
 dump, so I don't know if I will be able to go farther than this
 with a reasonable expenditure of time.  However, I strongly
 suggest that the "concurrency modifications" in dump be turned
 off (perhaps by setting SLAVES to 1 as I did) until somebody
 can get to the bottom of this.  If this is happening to me,
 then I suspect there are *massive* numbers of bad dumps out there
 that people think are actually good.  It will really be a rude
 awakening when people try to read them back.  Since the data
 blocks carry no tape address information, the original block
 order cannot be recovered.
 

