Re: What is the best way to look for a lost file in the disk blocks
Date: Wed, 10 Aug 2022 18:05:10 UTC
On 10 August 2022 5:26:27 pm AEST, Matthias Apitz <guru@unixarea.de> wrote:
> On Wednesday, August 10, 2022 at 07:18:03 a.m. +0200, Michael
> Schuster wrote:
>
> > On Wed, Aug 10, 2022 at 3:55 AM David Christensen
> > <dpchrist@holgerdanske.com> wrote:
> > >
> > > On 8/9/22 05:23, Matthias Apitz wrote:
> > > >
> > > > Hello,
> > > >
> > > > Last night I damaged a plain UTF-8 HTML file (I copied a JPEG
> > > > file over it by accident) and it turned out that the backup was
> > > > done a month ago. I learned my lesson from this about doing
> > > > backups more often of files I'm working on...
>
> Thanks for the hints.
>
> The file in question is my diary, written in Spanish and every day
> is headed by a line like
>
> <dt><b>Viernes, 29 de julio de 2022 </b>
>
> So I wrote a 35-line C program that reads every 1024-byte block from
> the device and terminates it with '\0' so that a
>
> char *p = strstr(block, " de 2022 </b>");
>
> cannot run past the end of the buffer. When p != NULL, it prints the
> diary entry with printf(p-16) and the current block number, to be
> used with dd(1) later.
> It finds all the lines of this year, but not the missing ones between
> July 10 and August 1 :-(
> So the blocks have been lost. I was hoping that UFS would put them
> back on the free block chains for later use, but it seems that
> the 'cp picture.jpg diary.html' directly overwrote the used blocks.
>
> Lesson learned. I'm attaching the C program; maybe someone can use
> it, or at least its idea.
>
> matthias
"Necessity is the mother of Invention" alright. A neat solution.
Could any other files written since then have reused those blocks? I'd be a little surprised if the cp did that ...
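For anyone wanting to try the same idea, here is a minimal sketch of the block scan as described in the quoted message. It is not the attached program, just my reading of the description; the function name scan_blocks and its hits/max_hits parameters are made up for illustration. One caveat of the strstr() approach: the search stops at the first NUL byte inside a block, so a match after an embedded NUL would be missed, which should not matter for blocks holding plain text.

```c
#include <stdio.h>
#include <string.h>

#define BLOCK_SIZE 1024   /* block size mentioned in the quoted message */

/* Hypothetical helper: scan fp in BLOCK_SIZE-byte blocks and record the
 * numbers of the blocks that contain needle. Returns the number of
 * matching blocks; at most max_hits block numbers are stored in hits[]. */
size_t scan_blocks(FILE *fp, const char *needle, long *hits, size_t max_hits)
{
    char block[BLOCK_SIZE + 1];   /* one extra byte for the terminator */
    size_t n, found = 0;
    long blockno = 0;

    while ((n = fread(block, 1, BLOCK_SIZE, fp)) > 0) {
        block[n] = '\0';          /* keep strstr() inside the buffer */
        if (strstr(block, needle) != NULL) {
            if (found < max_hits)
                hits[found] = blockno;
            found++;
        }
        blockno++;
    }
    return found;
}
```

A block number recorded this way can then be handed to dd(1) with bs=1024 and skip set to that number to pull out the block itself.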
FWIW, I was about to offer a different method that came from my own need: finding a small but rare string in the 12.3-RELEASE dvd1.iso and replacing it, so that the 2+ GiB of included packages can be installed (after 3 patches to bsdconfig, but that's another story). I'll share it, as it could equally be used on each (say) 10 MiB chunk dd'd from a disk or partition. play.iso is a copy of the 4.1 GiB dvd1.iso:
<code>
smithi@t430s:/home/dvds % strings -an7 -td play.iso | grep -i2 'pkg.txz'
2442269512 sod.J{++I
2442271727 %R:*lAS
2442277052 PKG.TXZ;1PX,
2442277146 pkg-1.17.2.txzNM
2442277165 pkg.txz
2442278912 version = 2;
2442278925 packing_format = "txz";
--
4377882256 Signature type %s is not supported for bootstrapping.
4377882310 %s/%s.pubkeysig.XXXXXX
4377882333 pkg.txz
4377882341 Invalid configuration format, ignoring the configuration fi
4377882420 Consider changing PACKAGESITE or installing it from ports:
4377882498 REPOS_DIR
4377882508 asprintf
4377882517 Path to pkg.txz required
4377882543 %s/trusted
4377882556 A pre-built version of pkg could not be found for your syst
--
4466242378 pistrings
4466242388 pkg.conf
4466242397 pkg.txz
4466242410 plasma_saver
4466242423 plasma_saver.ko
</code>
The numbers are decimal byte offsets into the .iso file (-td). -n7 sets the minimum string length to 7, the length of the string I was after (pkg.txz); increase it if hunting a longer string.
Something to consider (in the general case, probably not yours) is that the desired string(s) might be split across adjacent blocks, so the chunks being searched need some overlap, perhaps a few KiB.
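The overlap idea can be sketched in C roughly as follows. This is only an illustration under my own assumptions: the name find_with_overlap, the 4096-byte CHUNK and the 64-byte cap on the search string are all made up, and a long is used for the offset where a real tool for large disks would want off_t. Only the last (length - 1) bytes of each chunk need to be carried over, since that is the most that a split match can leave behind. The search uses memcmp() rather than strstr(), so NUL bytes in the data do not cut it short.

```c
#include <stdio.h>
#include <string.h>

#define CHUNK 4096        /* hypothetical chunk size */
#define MAX_NEEDLE 64     /* hypothetical cap on the search string length */

/* Hypothetical helper: return the byte offset of the first occurrence of
 * needle in fp, or -1 if it is not found. The last (len - 1) bytes of each
 * chunk are carried to the front of the next one, so a match straddling a
 * chunk boundary is still seen. */
long find_with_overlap(FILE *fp, const char *needle)
{
    size_t len = strlen(needle);
    char buf[CHUNK + MAX_NEEDLE];  /* carried tail plus one fresh chunk */
    size_t keep = 0;               /* bytes carried over from the last chunk */
    long base = 0;                 /* file offset corresponding to buf[0] */
    size_t n;

    if (len == 0 || len > MAX_NEEDLE)
        return -1;

    while ((n = fread(buf + keep, 1, CHUNK, fp)) > 0) {
        size_t total = keep + n;

        for (size_t i = 0; i + len <= total; i++)
            if (memcmp(buf + i, needle, len) == 0)
                return base + (long)i;

        /* Carry the tail that could hold the start of a split match. */
        keep = total < len - 1 ? total : len - 1;
        memmove(buf, buf + total - keep, keep);
        base += (long)(total - keep);
    }
    return -1;
}
```

The same carry-over works with the strings | grep approach by letting consecutive dd'd chunks overlap by at least the length of the string minus one byte.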
cheers, Ian