A simple Python script to automatically find and remove duplicate files in a specified directory. The program uses SHA-256 hashing to detect duplicates based on file content, not just file names.
- Scans a directory (and its subdirectories) for duplicate files.
- Compares files using SHA-256 hash for reliable detection.
- Automatically deletes duplicate files, keeping only one copy.
- Prints the path of each removed duplicate file.
- Lightweight and runs entirely in the terminal.
- The user specifies a directory path as a command-line argument.
- The script walks through all files in the directory and subdirectories.
- For each file, it calculates a SHA-256 hash of its contents.
- If a file's hash matches a previously seen hash, it is considered a duplicate and is deleted.
- The script prints the path of each removed duplicate file.
- Clone or download this repository.
- Open a terminal in the project directory.
- Run the program with the directory you want to scan:
Replace
python file_duplicates_remover.py <directory_path>
<directory_path>with the path to the folder you want to scan for duplicates.