We often find duplicate images sitting in our albums, image directories, and so on. There are various reasons: downloading the same file from different sources, automatic cloud backups, or simply forgetting that we had already downloaded it. Selecting them manually is a hassle, so why do such a boring task when automation can do the trick? This sweet and simple script compares the files (not only images) in a directory, finds the duplicates, lists them, and then lets you delete them.
Sweet!!!
- Set up a python 3.x virtual environment.
- Activate the environment.
- Install the dependencies using
pip3 install -r requirements.txt
- You are all set and the script is ready to run.
- Follow the instructions provided in the comments carefully.
In the command line interface, run the script using:
python image_finder.py <path of folder1, path of folder2, .....>
1. folder1 - Parent Folder
- This acts as the reference for duplicate files, i.e. it contains the original copies, hence no file is deleted from this folder.
2. folder2, folder3 .... - Subsequent Folders
- Comparisons are done within each folder, and from the Parent to the Subsequent Folders.
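The folder roles described above can be sketched roughly as follows. This is only one reading of those rules, not the script's actual code: the names `checksums` and `plan_deletions` are hypothetical, and the sketch assumes that a file in a subsequent folder counts as a duplicate when it matches a file in the Parent folder or an earlier file in its own folder.

```python
import hashlib
from pathlib import Path


def checksums(folder):
    """Yield (path, MD5 hex digest) for each regular file directly inside folder."""
    for entry in sorted(Path(folder).iterdir()):
        if entry.is_file():
            yield entry, hashlib.md5(entry.read_bytes()).hexdigest()


def plan_deletions(parent, *others):
    """Return paths outside parent that repeat a parent file or an earlier
    file in the same folder. Nothing inside parent is ever returned."""
    # Checksums of the reference copies; these files are never flagged.
    reference = {digest for _, digest in checksums(parent)}
    to_delete = []
    for folder in others:
        seen_here = set()  # checksums already kept in this folder
        for path, digest in checksums(folder):
            if digest in reference or digest in seen_here:
                to_delete.append(path)
            else:
                seen_here.add(digest)
    return to_delete
```

The function only returns a list of candidates. Keeping the actual deletion as a separate, confirmed step mirrors the [y]/[n] prompt the script shows before removing anything.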
- python3
- keyboard
The script works on a simple principle: two files with the same MD5 checksum will, for all practical purposes, have identical contents. So all the script aims to do is compute the checksums, compare them, and find the duplicates.
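The checksum idea above can be illustrated with a minimal sketch. It is not the script's actual implementation: the function names `file_md5` and `find_duplicates` and the `chunk_size` parameter are assumptions made for illustration, and the sketch only looks at files directly inside one folder.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(folder):
    """Group the regular files directly inside folder by checksum and
    keep only the groups that contain two or more files."""
    groups = defaultdict(list)
    for entry in sorted(Path(folder).iterdir()):
        if entry.is_file():
            groups[file_md5(entry)].append(entry)
    return {checksum: paths for checksum, paths in groups.items() if len(paths) > 1}
```

Calling `find_duplicates("Stand_Alone")` would then return a dictionary mapping each repeated checksum to the list of files that share it. Because the checksum is computed over the raw bytes, the file name plays no part in the comparison.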
- Stand_Alone folder has 6 images; 2 of them are duplicates of other images within the same folder.
- Parent folder contains standard images used for image processing, in png format.
- Duplicate folder contains 5 images that are duplicates of images in Stand_Alone (named Random Name (n)). There are similar images with the tiff extension as well; they are not duplicates, as the file type is different.
- Duplicate_1 folder contains another 5 images that are duplicates of images in Stand_Alone (named Another Random Name (n)). There are similar images with the jpeg extension as well; they are not duplicates, as the file type is different.
- Running the script on a single folder, Stand_Alone. In this example I pressed [n] so that nothing was deleted.
- Stand_Alone folder before deleting the files.
- After deleting the files, i.e. pressing [y] at the prompt.
- Parent, Duplicate, and Duplicate_1 folders before running the script.
- Running the script on the folders and deleting the duplicate files.
- Final result. Notice that all the files in the Parent folder remain as they were.
Also notice that similar files with different extensions are not deleted, because technically they are not the same: their contents differ byte for byte, so their checksums differ too.
Made by Vybhav Chaturvedi
Check Rotten Scripts for more explanation and the implementation.