Re: Tool to compare directories and delete duplicate files from one directory

From: Kaya Saman <kayasaman_at_optiplex-networks.com>
Date: Fri, 05 May 2023 01:32:28 UTC
On 5/5/23 01:13, Paul Procacci wrote:
> #!/bin/sh
>
> #
> # dir_1, dir_2, and dir_3 are the directories I want to search through.
> for i in dir_1 dir_2 dir_3;
> do
>   # Retrieve the filenames within each of those directories
>   ls $i/ | while read file;
>   do
>     # If the file doesn't exist in the base dir, copy it and continue
>     # with the top of the loop.
>     [ ! -f dir_base/$file ] && cp $i/$file dir_base/ && continue
>
>     #
>     # Getting to this point means the file exists in both locations.
>     #
>
>     # Get the file size as it is in the dir_base
>     ref=`stat -f '%z' dir_base/$file`
>
>     # Get the file size as it is in $i
>     src=`stat -f '%z' $i/$file`
>
>     # If the sizes are the same, remove the file from the source directory
>     [ $ref -eq $src ] && rm -f $i/$file
>
>   done
> done


Thanks so much!


Just a quick question: you have dir_base written in the script. Do I 
need to define this, or is it part of the shell language itself?


Right now I have modified the script to make it non-destructive, so that 
it doesn't do any copying or removing yet... call it a test instance if 
you like. I personally prefer doing things this way so I don't have any 
accidents and lose things in the meantime...


So my initial modification is this:


> #!/bin/sh
>
> #
> # dir_1, dir_2, and dir_3 are the directories I want to search through.
> for i in /dir_1 /dir_2 /dir_3;
> do
>   # Retrieve the filenames within each of those directories
>   ls $i/ | while read file;
>   do
>     # If the file doesn't exist in the base dir, just list it (no copying
>     # yet) and continue with the top of the loop.
>     [ ! -f dir_base/$file ] && ls $i/$file && continue
>
>     #
>     # Getting to this point means the file exists in both locations.
>     #
>
>     # Get the file size as it is in the dir_base
>     ref=`stat -f '%z' dir_base/$file`
>
>     # Get the file size as it is in $i
>     src=`stat -f '%z' $i/$file`
>
>     # If the sizes differ, append the file's name to /tmp/file
>     [ $ref -ne $src ] && ls $i/$file >> /tmp/file
>
>   done
> done


If this works it should just output the different files into a file 
called "file" under /tmp


Ok, this didn't work at all.... it just listed a whole bunch of top 
level folders and didn't recurse through them :-(


I ran it on the assumption that I needed to run the script under /dir 
and that dir_base was a shell function which would essentially be /dir/.
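For comparison, here is a minimal sketch of the other reading, assuming dir_base is not anything built into sh but simply a placeholder for a real directory. A relative name like dir_base/$file is resolved against the current working directory, so it only works if such a directory exists there. The DIR_BASE variable, the /path/to/base default and the check_base helper are hypothetical names made up for illustration:

```shell
#!/bin/sh
# Sketch: treat the base directory as an ordinary variable rather than a
# literal relative path. The default path below is hypothetical.
dir_base=${DIR_BASE:-/path/to/base}

# Report whether a directory exists before the main loop relies on it.
check_base() {
    if [ -d "$1" ]; then
        echo "using base directory: $1"
    else
        echo "base directory not found: $1" >&2
        return 1
    fi
}

# Typical use at the top of the script:
#   check_base "$dir_base" || exit 1
```

With something like this at the top, every dir_base/$file in the loop would become "$dir_base/$file".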


[EDIT]


Currently, I managed to get it partly running by modifying ls to use ls 
-R *but* I think that the 'stat' statements don't allow for recursion?


The script is running as I type this but it's most likely just 
outputting a whole bunch of ls information... as I see many 'stat' 
errors in the shell output.
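

My guess at the cause of the 'stat' errors: ls -R prints a "directory:" header line for each subdirectory followed by bare file names, so $i/$file no longer points at a real path once the file sits below the top level. Below is a rough sketch of a recursive dry run that uses find instead, since find prints the full path of every file. This is not the script from the quoted mail, only an illustration under some assumptions: the function name dry_run and the MISSING/SAME/DIFFERENT labels are made up, sizes are read with wc -c as a portable stand-in for stat -f '%z', directories are passed without a trailing slash, and file names containing newlines are not handled.

```shell
#!/bin/sh
# Dry-run sketch: walk each source tree recursively and report what the
# real script would do. Nothing is copied or removed.
# Usage (hypothetical paths): dry_run /dir_base /dir_1 /dir_2 /dir_3

dry_run() {
    base=$1
    shift
    for src in "$@"; do
        # find prints full paths, so nested files keep their directory part
        find "$src" -type f | while IFS= read -r path; do
            # Path relative to the source tree, e.g. sub/a.txt
            rel=${path#"$src"/}

            # Not in the base tree at all: the real script would copy it
            if [ ! -f "$base/$rel" ]; then
                echo "MISSING $path"
                continue
            fi

            # Sizes in bytes; $(( )) strips the padding some wc versions add
            ref=$(( $(wc -c < "$base/$rel") ))
            cur=$(( $(wc -c < "$path") ))

            if [ "$ref" -eq "$cur" ]; then
                echo "SAME $path"
            else
                echo "DIFFERENT $path"
            fi
        done
    done
}
```

Redirecting the output, for example dry_run /dir_base /dir_1 > /tmp/report, would give a list to review before swapping the echo lines for cp and rm. Equal size does not prove equal content, so a comparison with cmp -s might be safer before anything is actually deleted.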