Over the years, photo collections accumulate in many places: phones, cameras, messaging apps, backups, cloud exports, USB drives, βmiscellaneousβ folders on a NAS, and even old SD cards. The result is usually a chaotic archive of:
- inconsistent naming conventions
- missing or broken EXIF dates
- photos in wrong folders
- duplicates scattered everywhere
- folders with meaningless names
- unsorted collections imported from past devices
ChronoClean is a practical, conservative tool whose goal is simple: Reorganize decades of photos into a clean, consistent chronological structure, with safe renaming rules and optional folder-name tagging.
It is designed for real-life usage on large libraries (50kβ300k files), including preparation for long-term archival on systems like a Synology NAS.
ChronoClean reflects the toolβs purpose:
- Chrono β time-based ordering at the heart of the system
- Clean β removing clutter, inconsistencies, duplicates, and chaotic structures
It focuses on rebuilding a clean chronological library without aggressively deduplicating or altering media unless needed.
ChronoClean restores order to photo collections by:
- Reading EXIF timestamps when available
- Using fallback logic when EXIF is missing
- Organizing files into a time-based hierarchy
- Optionally renaming files using a standard pattern
- Optionally tagging filenames with relevant folder names
- Providing dry-run and export analysis before modifying anything
- Ensuring safe handling of duplicates or filename conflicts
The philosophy is:
- Predictable output
- Minimal risk
- Clear visibility before acting
- User-controlled decisions
ChronoClean does not attempt complex AI deduplication or modification of image contents.
- EXIF-based sorting into a chronological folder hierarchy (year/month) (day optional)
- Video metadata extraction for MP4, MOV, MKV, etc. (v0.3)
- Fallback to creation/modification timestamp when EXIF is missing
- Optionally infer date from directory name if metadata is absent
- Consistent, predictable tree structure suitable for NAS archives and long-term storage
- Disabled by default; original filenames are preserved unless renaming is enabled
- Configurable renaming rules for standard, readable patterns
- Configurable folder-tag insertion for context-rich filenames
- Tag-only mode: add tags without full renaming (v0.3)
- Prevents double-tagging and name inflation
Two levels:
- Dry-run console output: shows exactly what would be done, without making changes
- Export report (current
exportcommand): JSON or CSV including:- source path and filename
- detected date and date source (EXIF, filesystem, filename, video_metadata)
- date mismatch detection (when filename date β EXIF/file date)
- error categories for troubleshooting (v0.3)
This export can be reviewed before applying changes.
- Only checked when needed (filename clash after sorting/renaming)
- Uses SHA256 for reliable comparison (MD5 optional via config)
- If same hash β skip second copy
- If different β rename second file safely to avoid collision
- Identify meaningful folder names for possible filename enrichment
- Detect if folder name already appears in filenames (distance-based logic)
- Tag enrichment is optional and user-controlled
- Possibility to maintain allow/ignore lists for folder tags
- Designed with Synology usage in mind
- Works on Python 3 installed on Synology
- Can be executed manually or via Task Scheduler
- Works well with shared folders and large volumes
- Instructions provided for installing dependencies and running the tool on Synology DSM
- Clean chronological sorting
Photo libraries often contain a mix of sources: phone backups, WhatsApp exports, SD card dumps, and more. ChronoClean reorganizes them into a clear, time-based structure:
2024/
01/
02/
2025/
01/
02/
Files are moved based on EXIF or fallback dates, ensuring chronological order regardless of original folder or filename chaos.
- Optional file renaming
Many users prefer to keep original names, so ChronoClean provides renaming as optional, not default. When enabled, files are renamed using a standard, readable pattern:
20240214_153200.jpg
Or, with a folder tag for context:
20240214_153200_ParisTrip.jpg
- Folder-name influence
Some folders contain meaningful names (e.g., βParis 2022β, βSkiTripβ, βMariage Julieβ, βNaomie Anniversaryβ), while others are generic or unhelpful (e.g., βtosortβ, βunsortedβ, βmiscβ, βdownloadedβ).
ChronoClean allows you to:
- Tag folder names for filename enrichment
- Ignore folder names that are not informative
- Detect if the folder name is already present in filenames (using a similarity/distance check)
Folder tag detection is automatic during scan and apply. Configuration allows setting ignore/force lists to control which folder names are used as tags.
- Duplicate handling
Duplicate detection is not a feature goal but a safety mechanism:
If two files collide on filename (after sorting), ChronoClean checks their hashes.
- If identical: only one copy is kept (depending on options).
- If different: filename is adjusted to avoid collision.
ChronoClean does not attempt full-library deduplication unless explicitly requested.
ChronoClean supports two workflows (the second is planned for v0.4).
- Scan - analyze directories, read EXIF, classify folder names.
- Dry run apply - simulate actions (
applydefaults to dry-run). - Apply - perform sorting, moves, renames, and conflict resolution.
This workflow introduces an explicit, reviewable plan file for large/messy imports.
- Scan/report - generate analysis output for review (JSON/CSV).
- Plan - generate an executable plan file (JSON) that includes destinations + final names + decisions.
- Dry run from plan - simulate the plan without touching disk.
- Apply from plan - execute exactly what the reviewed plan specifies.
Planned command split for clarity:
report ...= analysis output (what was detected)plan ...= executable plan (what will be done)
Use when you just want to organize a folder, with a safety dry-run first:
python -m chronoclean apply D:\photos\incoming D:\photos\archive --dry-run
python -m chronoclean apply D:\photos\incoming D:\photos\archive --no-dry-runOptional (info-only):
python -m chronoclean scan D:\photos\incomingUse when you want cryptographic confirmation before deleting sources:
python -m chronoclean apply D:\photos\incoming D:\photos\archive --no-dry-run
python -m chronoclean verify --last
python -m chronoclean cleanup --only ok --dry-run
python -m chronoclean cleanup --only ok --no-dry-runUse when you copied files but did not keep a run report:
python -m chronoclean verify --source D:\photos\incoming --destination D:\photos\archive --reconstruct
python -m chronoclean cleanup --only ok --no-dry-runchronoclean/
cli/
core/
exif_reader.py
date_inference.py
sorter.py
renamer.py
folder_tagging.py
duplicate_checker.py
export_plan.py
utils/
tests/
README.md
ChronoClean is implemented in Python 3 for its readability, ecosystem, and easy deployment on Synology NAS.
The following libraries are recommended but not mandatory:
- Pillow (image metadata): Used for general image processing and reading/writing image metadata, including basic EXIF extraction and manipulation. Pillow is versatile and works with many image formats, but for advanced or lossless EXIF editing, see below.
- piexif or exifread (EXIF handling):
- piexif is ideal for lossless EXIF extraction, insertion, and modification, especially when you need to preserve all metadata exactly or write EXIF back to files. It is more specialized for EXIF than Pillow.
- exifread is focused on reading EXIF data only (no writing), and is robust for extracting metadata from a wide range of JPEGs and TIFFs.
- hachoir or ffprobe (video metadata - v0.3):
- ffprobe (part of FFmpeg) is the preferred provider for extracting creation dates from video files. It's fast and handles most formats including MP4, MOV, MKV, AVI.
- hachoir is a pure Python fallback that requires no external installation but supports fewer formats and is slower.
- ChronoClean automatically uses ffprobe if available, falling back to hachoir otherwise.
- hashlib (hashing):
- Provides cryptographic hashes (SHA256, MD5) for duplicate detection.
- Included in Python's standard library; reliable and portable.
- Typer or Argparse (CLI):
- Typer is a modern, user-friendly CLI framework based on Python type hints, making it easy to build intuitive command-line tools.
- Argparse is Pythonβs built-in CLI parser, stable and widely supported, but with a more traditional interface.
- pytest (testing): Used for writing and running automated tests, ensuring code reliability and correctness.
Alternatives may be chosen depending on portability and Synology constraints.
This section is for local development (tests, linting). On a NAS, follow docs/SYNOLOGY_INSTALLATION.md.
python -m venv .venv
source .venv/bin/activate
python -m pip install --upgrade pip
python -m pip install -e ".[dev]"ChronoClean requires Python 3.10+. On many Synology DSM setups, Package Center only provides Python up to 3.9, so the recommended approach is to install a newer Python via Entware.
See the dedicated guide: docs/SYNOLOGY_INSTALLATION.md.
Quick usage reminder (after install):
# Analyze files
chronoclean scan /volume1/photos
chronoclean scan /volume1/photos --report # Detailed per-file report
# Review folder tag classifications (v0.3.4)
chronoclean tags list /volume1/photos
chronoclean tags classify "Paris 2024" use --tag "ParisTrip"
chronoclean tags classify "temp" ignore
# Export scan results (v0.2)
chronoclean export json /volume1/photos -o results.json
chronoclean export csv /volume1/photos -o results.csv
# Export with destination preview (v0.3.4)
chronoclean export json /volume1/photos --destination /volume1/archive -o preview.json
# Organize files (dry-run by default)
chronoclean apply /volume1/unsorted /volume1/photos
# Apply changes with copy (safe)
chronoclean apply /volume1/unsorted /volume1/photos --no-dry-run
# Apply changes with move
chronoclean apply /volume1/unsorted /volume1/photos --no-dry-run --moveOptionally schedule via DSM Task Scheduler.
- Run on a copy of your library the first time: Always test ChronoClean on a duplicate or backup of your photo collection to ensure the results match your expectations and to prevent accidental data loss.
- Use the export report to tune tagging rules: Review and edit the export (JSON/CSV) to refine folder-tagging, renaming, and conflict resolution before applying changes.
- Ensure you have enough space for temporary moves: Large reorganizations may require extra disk space for file operations, especially when working with tens of thousands of files.
ChronoClean can be configured via YAML files. Create a chronoclean.yaml in your project root:
# Minimal config example
sorting:
folder_structure: "YYYY/MM/DD"
folder_tags:
enabled: true
ignore_list: ["tosort", "misc", "temp"]
renaming:
enabled: true
pattern: "{date}_{time}"Config file search order:
--config <path>argumentchronoclean.yamlin current directory.chronoclean/config.yaml- Built-in defaults
CLI arguments always override config file values.
π See docs/CONFIGURATION.md for complete configuration reference.
ChronoClean development follows a phased approach from prototype to production-ready tool.
π See docs/IMPLEMENTATION_SPEC_v0.1.md for v0.1 implementation details.
π See docs/IMPLEMENTATION_SPEC_v0.2.md for v0.2 implementation details.
π See docs/IMPLEMENTATION_SPEC_v0.3.md for v0.3 planning.
π See docs/IMPLEMENTATION_SPEC_v0.3.1.md for v0.3.1 planning.
π See docs/IMPLEMENTATION_SPEC_v0.3.2.md for v0.3.2 details.
π See docs/IMPLEMENTATION_SPEC_v0.4.md for v0.4 planning.
π See docs/IMPLEMENTATION_SPEC_v0.5.md for v0.5 planning.
π See docs/IMPLEMENTATION_SPEC_v0.6.md for v0.6 planning.
- β Project structure and configuration system (YAML-based)
- β
Data models (
FileRecord,ScanResult,OperationPlan,MoveOperation) - β
EXIF extraction and date parsing (images via
exifread) - β Date inference engine with fallback priority (EXIF β filesystem β folder name)
- β Chronological sorting into YYYY/MM folders (YYYY/MM/DD optional)
- β
Basic file renaming (optional, pattern-based with
{date}_{time}) - β Folder tag detection (heuristics, ignore/force lists, fuzzy matching)
- β Safe file operations (copy/move with dry-run, rollback support)
- β
CLI with Typer (
scan,apply,version) - β Unit test suite (352 tests with pytest + pytest-mock)
- β
Dry-run mode (default) with
--no-dry-runto apply - β
Copy mode (default) with
--movefor moves - β
Detailed scan report (
--reportflag)
- β
Filename date parsing (detect dates in filenames like
IMG_090831.jpg) - β Date mismatch detection (when filename date β EXIF/file date)
- β Export: JSON/CSV with detected dates, tags, target paths, mismatch info
- β
exportcommand withjsonandcsvsubcommands - β Hash-based duplicate detection on filename collision (SHA256)
- β Collision resolution strategies: check_hash, rename, skip, fail
- β
config showcommand to display current configuration - Planned:
reportcommand for detailed analysis output
- β
Video "taken date" extraction (choose backend:
ffprobevshachoir) mapped intoDateSource.VIDEO_METADATA - β Unified "best date" resolution across image+video metadata + filesystem + filename + folder-name (same priority system)
- β Improved metadata error handling/logging (clear per-file reason + summary counts by category)
- β
Decouple tagging from date renaming; allow tag-only filenames (append tags even when
--renameis off) - Deferred to future: Optional heuristics for "no metadata" cases (config placeholder added)
- β
Apply run record (captures
source -> destinationmapping for later verification) - β
verifycommand (hash-based SHA-256;--algorithm quickfor size-only) - β
cleanupcommand to delete only sources verified OK (supports--only ok) - β
Recovery path when the apply report was forgotten (
verify --reconstruct) - β Auto-discovery of recent runs and verification reports with interactive prompts
- β
--no-run-recordflag for apply when record not needed
- β
doctorcommand to check system dependencies (ffprobe, hachoir, exiftool) - β
doctor --fixto interactively update config with correct paths - β
Searches common paths including
/opt/bin/ffprobe(Synology DSM) - β CLI helpers refactoring to reduce code duplication
- β Centralized component factory for scan/apply/verify commands
- Atomic copy (temp file + rename) to avoid partial destination files
- Persistent apply state on disk (plan + journal) for resume/retry on interruption
- Resume/detect incomplete runs and continue safely
- Detailed per-file error recording (not only a summary counter)
- Streaming execution to keep memory bounded on large libraries
- Unambiguous command split:
report= scan/analysis outputs (JSON/CSV)plan= executable plan generation (JSON)
apply --from-plan <plan.json>(deterministic execution of reviewed plans)dryrun --from-plan <plan.json>(optional;apply --dry-runremains for quick runs)reportcommand(s) for scan/plan/apply summaries (mismatches, collisions, duplicates)- Interactive review (Rich): confirm risky ops, inspect collisions, accept/reject tags/renames
- Conditional rename for garbage filenames (optional)
- Persistent state for tag decisions/overrides (separate from main YAML), plus optional
config set - Safety gates: disk-space check, live-mode warnings, backup reminders, clearer "what will change"
- Implement performance knobs: parallel scan/inference (configurable workers) and memory-efficient iteration
- Caching (SQLite) for metadata/hash results with invalidation strategy (mtime/size), to speed re-runs
- Synology DSM notes + headless/Task Scheduler friendly mode and recommended configs
- Optimize collision/duplicate handling for large batches (streaming plan generation, bounded caches)
- Full conflict resolution and rollback support
- Comprehensive NAS documentation
- Extended CLI options (filtering, simulation, advanced tagging)
- Test suite and CI integration
- User documentation and troubleshooting guide
- Web-based UI for plan review and approval
- Advanced duplicate analysis (optional, opt-in)
- Customizable renaming/tagging templates
- Multi-language support
- Plugin system for custom workflows
ChronoClean β Order your memories.

