Re: Tool to compare directories and delete duplicate files from one directory
Date: Sat, 06 May 2023 20:33:14 UTC
I thought I sent this, but it never hit the list (?) -- David

On 5/4/23 21:06, Kaya Saman wrote:
> To start with this is the directory structure:
>
>
> ls -lhR /tmp/test1
> total 1
> drwxr-xr-x 2 root wheel 3B May 5 04:57 dupdir1
> drwxr-xr-x 2 root wheel 3B May 5 04:57 dupdir2
>
> /tmp/test1/dupdir1:
> total 1
> -rw-r--r-- 1 root wheel 8B Apr 30 03:17 dup
>
> /tmp/test1/dupdir2:
> total 1
> -rw-r--r-- 1 root wheel 7B May 5 03:23 dup1
>
>
> ls -lhR /tmp/test2
> total 1
> drwxr-xr-x 2 root wheel 3B May 5 04:56 dupdir1
> drwxr-xr-x 2 root wheel 3B May 5 04:56 dupdir2
>
> /tmp/test2/dupdir1:
> total 1
> -rw-r--r-- 1 root wheel 4B Apr 30 02:53 dup
>
> /tmp/test2/dupdir2:
> total 1
> -rw-r--r-- 1 root wheel 7B Apr 30 02:47 dup1
>
>
> So what I want to happen is the script to recurse from the top level
> directories test1 and test2 then expected behavior should be to remove
> file dup1 as dup is different between directories.

My previous post missed the mark, but I have been watching this thread
with interest (trepidation?).

I think Tim already identified a tool that will safely get you close to
your goal, if not all the way:

On 5/4/23 09:28, Tim Daneliuk wrote:
> I've never used it, but there is a port of fdupes in the ports tree.
> Not sure if it does exactly what you want though.

fdupes(1) is also available as a package:

2023-05-04 21:25:31 toor@vf1 ~
# freebsd-version; uname -a
12.4-RELEASE-p2
FreeBSD vf1.tracy.holgerdanske.com 12.4-RELEASE-p1 FreeBSD 12.4-RELEASE-p1 GENERIC amd64

2023-05-04 21:25:40 toor@vf1 ~
# pkg search fdupes
fdupes-2.2.1,1 Program for identifying or deleting duplicate files

Looking at the man page:

https://man.freebsd.org/cgi/man.cgi?query=fdupes&sektion=1&manpath=FreeBSD+13.2-RELEASE+and+Ports

I am fairly certain that you will want to give the destination directory
as the first argument and the source directories after that:

$ fdupes --recurse /dir /dir_1 /dir_2 /dir_3

The above will provide you with information, but not delete anything.

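A listing-only run like this can be rehearsed on a throwaway copy of the
layout from the original post. The following is only a sketch: the
scratch directory name and the file contents are made up (only the sizes
follow the ls output), and it assumes the two dup1 files are
byte-identical, as the original post implies. It does not pass --delete,
so nothing is removed.

```shell
#!/bin/sh
# Rehearsal sketch: rebuild the thread's sample layout in a scratch directory.
# The directory name and file contents are hypothetical; only the sizes
# (8B, 4B, 7B, 7B) follow the ls output quoted in the thread.
work=$(mktemp -d "${TMPDIR:-/tmp}/fdupes-practice.XXXXXX")

mkdir -p "$work/test1/dupdir1" "$work/test1/dupdir2" \
         "$work/test2/dupdir1" "$work/test2/dupdir2"

printf 'aaaaaaa\n' > "$work/test1/dupdir1/dup"    # 8 bytes
printf 'bbb\n'     > "$work/test2/dupdir1/dup"    # 4 bytes, differs from test1
printf 'same ok'   > "$work/test1/dupdir2/dup1"   # 7 bytes
printf 'same ok'   > "$work/test2/dupdir2/dup1"   # 7 bytes, identical copy

if command -v fdupes >/dev/null 2>&1; then
    # Listing only: without --delete, fdupes reports duplicate sets
    # and leaves every file in place.
    fdupes --recurse "$work/test2" "$work/test1"
else
    echo "fdupes not installed; compare by hand with cmp(1)" >&2
fi

echo "scratch tree left in $work"
```

Remove the scratch tree by hand when finished with it.
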
Practicing under /tmp to gain familiarity with fdupes(1) is a good idea.

As you are using ZFS, I assume you know how to take snapshots and do
rollbacks (?). These could serve as backup and restore operations if
things go badly.

Given 12+ TB of data, you may want the --noprompt option when you do
give the --delete option and actual arguments.

David