Grepping through a disk

Polytropon freebsd at edvax.de
Mon Mar 4 00:36:06 UTC 2013


Due to a fsck file system repair I lost the content of a file
I consider important and that hadn't been backed up yet. The
file name is still present, but no blocks are associated
(file size is zero). I hope the data blocks (which are now
probably marked "unused") are still intact, so I thought
I'd search for them because I can remember specific text
that should have been in that file.

As I don't need any fancy stuff like a progress bar, I
decided to write a simple command, and I quickly got
something up and running which I _assume_ will do what
I need.

This is the command I've been running interactively in bash:

	$ N=0; while true; do echo "${N}"; dd if=/dev/ad6 of=/dev/stdout bs=10240 count=1 skip=${N} 2>/dev/null | grep "<PATTERN>"; if [ $? -eq 0 ]; then break; fi; N=`expr ${N} + 1`; done

To make it look a bit better and illustrate the simple
logic behind my idea:

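	N=0				# index of the current 10 kB unit
	while true; do
		echo "${N}"		# show progress
		# read exactly one unit and search it
		dd if=/dev/ad6 of=/dev/stdout bs=10240 count=1 skip=${N} \
			2>/dev/null | grep "<PATTERN>"
		# $? is grep's exit status: 0 means the pattern was found
		if [ $? -eq 0 ]; then
			break
		fi
		N=`expr ${N} + 1`
	done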

Here <PATTERN> refers to the text. It's only a small but
very distinctive portion. I'm searching in blocks of 10 kB
so it's easier to continue in case something has been found.
I plan to output the resulting "block" (it's not a real disk
block, I know, it's simply a unit of 10 kB disk space) and
maybe the previous and next one (in case the file, the _real_
block containing the data, has been split across more than
one of those units). I will then clean out the "garbage"
(maybe from other files) because I can easily determine the
beginning and the end of the file.
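
A minimal sketch of that extraction step, assuming the same
10240-byte unit size as the search loop. The function name
extract_units and its three arguments are hypothetical and
only serve as illustration. It copies the unit that matched
together with the previous and the next one, and it clamps
the start so that unit 0 does not produce a negative skip.

```shell
#!/bin/sh
# Sketch: dump the matching 10 kB unit plus its neighbours.
# extract_units is a hypothetical helper, not an existing tool.
#   $1 = disk or image to read from
#   $2 = number of the unit that matched
#   $3 = output file
extract_units() {
	disk=$1
	n=$2
	out=$3
	bs=10240
	if [ "$n" -gt 0 ]; then
		# previous unit, matching unit, next unit
		first=$((n - 1))
		count=3
	else
		# unit 0 has no predecessor
		first=0
		count=2
	fi
	dd if="$disk" of="$out" bs="$bs" skip="$first" count="$count" 2>/dev/null
}
```

It could be called as `extract_units /dev/ad6 "${N}" /some/other/disk/recovered.bin`.
The output file should live on a different file system than
the damaged one, because writing to the damaged file system
may overwrite the very blocks that are being searched for.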

Needless to say, it's a _text_ file.

I understand that grep operates on text files, but it will
also happily return 0 if the text to search for appears
in a binary file, and possibly return the whole file as a
search result (in case there are no newlines in it).
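
The exit-status behaviour described above can be used on its
own, without printing anything. The following is only a
sketch: unit_matches is a hypothetical helper name, -q makes
grep report the result through its exit status alone (so a
newline-free unit is never dumped to the terminal), -F treats
the pattern as a fixed string, and LC_ALL=C makes grep handle
the data as plain bytes.

```shell
#!/bin/sh
# Sketch: test a single 10 kB unit for a fixed string.
# unit_matches is a hypothetical helper, not an existing tool.
#   $1 = disk or image to read from
#   $2 = unit number
#   $3 = fixed-string pattern
# The function's exit status is grep's: 0 if the pattern was found.
unit_matches() {
	dd if="$1" bs=10240 skip="$2" count=1 2>/dev/null |
		LC_ALL=C grep -q -F -- "$3"
}
```

A call such as `unit_matches /dev/ad6 "${N}" 'some text' && echo "${N}"`
would then print only the unit number. A pattern that happens
to straddle the boundary between two units is not found by a
unit-by-unit test like this one.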

My questions:

1. Is this the proper way of stupidly searching a disk?

2. Is the block size (bs= parameter to dd) good, or should
   I use a different value for better performance?

3. Is there a known program that already implements the
   functionality I need in terms of data recovery?

Results so far:

The disk in question is a 1 TB SATA disk. The command has
been running for more than 12 hours now and returned one
false-positive result, so basically it seems to work, but
maybe I can do better? I can always continue the search by
adding 1 to ${N}, setting it as the start value, and
re-running the command.
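
Resuming after a false positive can be sketched as a small
function that takes the start value as an argument. The name
search_from and its arguments are hypothetical. The end value
is an addition that is not part of the loop shown earlier: as
far as I can tell, that loop has no end-of-disk check, so once
skip= passes the end of the device dd delivers no data, grep
returns 1, and the counter keeps running. The unit count
assumes a disk of exactly 10^12 bytes, which is an assumption
and not a measured value.

```shell
#!/bin/sh
# Sketch: scan units START..END-1 and print the first matching unit.
# search_from is a hypothetical helper, not an existing tool.
#   $1 = disk or image, $2 = fixed-string pattern
#   $3 = first unit to test, $4 = number of units on the disk
search_from() {
	disk=$1
	pattern=$2
	n=$3
	end=$4
	while [ "$n" -lt "$end" ]; do
		if dd if="$disk" bs=10240 skip="$n" count=1 2>/dev/null |
			LC_ALL=C grep -q -F -- "$pattern"; then
			echo "$n"	# unit number of the match
			return 0
		fi
		n=$((n + 1))
	done
	return 1		# reached the end without a match
}

# Number of 10240-byte units on a disk of 10^12 bytes (assumed size).
units=$((1000000000000 / 10240))
```

If a run reported unit H as a false positive, the next run
would be `search_from /dev/ad6 'some text' $((H + 1)) "$units"`.
With these numbers the disk holds 97656250 units, and every
unit costs at least one dd and one grep process, which is
probably a large part of the running time.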

Any suggestion is welcome!



-- 
Polytropon
Magdeburg, Germany
Happy FreeBSD user since 4.0
Andra moi ennepe, Mousa, ...

