Metadata extraction and web scraping for OSINT and pentesting.
MetaDetective is a single-file Python 3 tool for metadata extraction and web scraping, built for OSINT and pentesting workflows.
It has no Python package dependencies; the only external requirement is exiftool. One curl and you're operational.
What it extracts: authors, software versions, GPS coordinates, creation/modification dates, internal hostnames, serial numbers, hyperlinks, camera models - across documents, images, and email files.
What it does beyond extraction:
- Direct web scraping of target sites (no search engine dependency, no IP blocks)
- GPS reverse geocoding with OpenStreetMap, map link generation
- Export to HTML, TXT, or JSON
- Selective field extraction with --parse-only
- Deduplication across multiple files
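
To illustrate the GPS handling listed above, the sketch below converts a coordinate string in exiftool's default degrees/minutes/seconds print format into decimal degrees and builds an OpenStreetMap marker link. It is not MetaDetective's own code: the function names are hypothetical, and the link format the tool actually emits is an assumption here.

```python
import re

# Matches one exiftool-style coordinate, e.g.: 48 deg 51' 29.60" N
_DMS = re.compile(r"""(\d+)\s*deg\s*(\d+)'\s*([\d.]+)"\s*([NSEW])""")


def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds to signed decimal degrees."""
    value = degrees + minutes / 60 + seconds / 3600
    # South and West are negative by convention.
    return -value if hemisphere in ("S", "W") else value


def parse_gps_position(text):
    """Parse a 'lat, lon' string (hypothetical helper) into two floats."""
    parts = _DMS.findall(text)
    if len(parts) != 2:
        raise ValueError(f"unrecognized GPS position: {text!r}")
    lat, lon = (dms_to_decimal(int(d), int(m), float(s), h) for d, m, s, h in parts)
    return lat, lon


def osm_link(lat, lon, zoom=16):
    """Build an OpenStreetMap URL with a marker at the given point."""
    return (
        "https://www.openstreetmap.org/"
        f"?mlat={lat:.6f}&mlon={lon:.6f}#map={zoom}/{lat:.6f}/{lon:.6f}"
    )


lat, lon = parse_gps_position("""48 deg 51' 29.60" N, 2 deg 17' 40.20" E""")
print(osm_link(lat, lon))
```

Keeping the parsing separate from the link building makes it easy to swap in a different map provider or to feed the decimal pair to a reverse geocoder.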
MetaDetective was built as a replacement for Metagoofil, which dropped native metadata analysis and relied on Google search, with the attendant rate limiting, CAPTCHAs, and proxy overhead.
Requirements: Python 3, exiftool.
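
A quick preflight check can confirm both requirements before running the tool. This is a minimal sketch, not part of MetaDetective: the function name `missing_requirements` is hypothetical, and it only tests that an `exiftool` executable is on the PATH.

```python
import shutil
import sys


def missing_requirements(which=shutil.which, version=sys.version_info):
    """Return the names of unmet requirements (empty list if all are met).

    `which` and `version` are injectable so the check can be tested
    without touching the real environment.
    """
    missing = []
    if version[0] < 3:
        missing.append("python3")
    if which("exiftool") is None:
        missing.append("exiftool")
    return missing


if __name__ == "__main__":
    problems = missing_requirements()
    print("OK" if not problems else "Missing: " + ", ".join(problems))
```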
# Debian / Ubuntu / Kali
sudo apt install libimage-exiftool-perl
# macOS
brew install exiftool

# Single-file download
curl -O https://raw.githubusercontent.com/franckferman/MetaDetective/stable/src/MetaDetective/MetaDetective.py
python3 MetaDetective.py -h

# pip
pip install MetaDetective
metadetective -h

# From source
git clone https://github.com/franckferman/MetaDetective.git
cd MetaDetective
python3 src/MetaDetective/MetaDetective.py -h

# Docker
docker pull franckferman/metadetective
docker run --rm franckferman/metadetective -h
# Mount a local directory
docker run --rm -v $(pwd)/loot:/data franckferman/metadetective -d /data

# Analyze a directory (deduplicated singular view by default)
python3 MetaDetective.py -d ./loot/
# Specific file types, filter noise
python3 MetaDetective.py -d ./loot/ -t pdf docx -i admin anonymous
# Per-file display with formatted output
python3 MetaDetective.py -d ./loot/ --display all --format formatted
# Single file
python3 MetaDetective.py -f report.pdf
# Multiple files
python3 MetaDetective.py -f report.pdf photo.heic

--parse-only limits extraction to specific fields. Use it to cut noise or target a single data point.
# Extract only Author and Creator fields
python3 MetaDetective.py -d ./loot/ --parse-only Author Creator
# Extract GPS data only from iPhone photos
python3 MetaDetective.py -d ./photos/ -t heic heif --parse-only 'GPS Position' 'Map Link'

# HTML report (default)
python3 MetaDetective.py -d ./loot/ -e
# TXT
python3 MetaDetective.py -d ./loot/ -e txt
# JSON - singular (deduplicated values per field)
python3 MetaDetective.py -d ./loot/ -e json
# JSON - per file
python3 MetaDetective.py -d ./loot/ --display all -e json
# Custom filename suffix and output directory
python3 MetaDetective.py -d ./loot/ -e json -c pentest-corp -o ~/results/

JSON singular output structure:
{
  "tool": "MetaDetective",
  "generated": "2026-03-21T...",
  "unique": {
    "Author": ["Alice Martin", "Bob Dupont"],
    "Creator Tool": ["Microsoft Word 16.0"]
  }
}

Pivot with jq:
jq '.unique.Author' MetaDetective_Export-*.json

# Scan target site, list files found
python3 MetaDetective.py --scraping --scan --url https://target.com/
# Filter by extension
python3 MetaDetective.py --scraping --scan --url https://target.com/ --extensions pdf docx xlsx
# Download files (depth 2, 8 threads)
python3 MetaDetective.py --scraping --url https://target.com/ \
--download-dir ~/loot/ --extensions pdf docx --depth 2 --threads 8
# Control request rate (requests/sec)
python3 MetaDetective.py --scraping --url https://target.com/ \
--download-dir ~/loot/ --rate 5
# Follow external links
python3 MetaDetective.py --scraping --url https://target.com/ \
--download-dir ~/loot/ --follow-extern

| Flag | Description |
|---|---|
| -t pdf docx | Restrict to file types |
| -i admin anonymous | Ignore values matching pattern (regex supported) |
| --parse-only Author Creator | Extract only specified fields |
| --display all | Show metadata per file |
| --display singular | Deduplicated view across all files (default) |
| --format formatted | Decorated output |
| --format concise | Compact output |
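
The same kind of noise filtering that -i performs can also be applied after the fact to a JSON singular export. The sketch below assumes the structure shown earlier (a top-level "unique" object mapping field names to lists of values). The function name `filter_unique` is hypothetical, the "admin" sample value is made up for illustration, and case-insensitive matching is an assumption rather than documented MetaDetective behavior.

```python
import json
import re


def filter_unique(export, ignore_patterns):
    """Drop values matching any ignore pattern from a singular export.

    Fields left with no values are omitted from the result.
    """
    patterns = [re.compile(p, re.IGNORECASE) for p in ignore_patterns]
    filtered = {}
    for field, values in export.get("unique", {}).items():
        kept = [v for v in values if not any(p.search(str(v)) for p in patterns)]
        if kept:
            filtered[field] = kept
    return filtered


# Sample data mirroring the export structure; in practice, load a real
# export file with json.load() instead.
sample = json.loads("""
{
  "tool": "MetaDetective",
  "unique": {
    "Author": ["Alice Martin", "admin", "Bob Dupont"],
    "Creator Tool": ["Microsoft Word 16.0"]
  }
}
""")

result = filter_unique(sample, ["admin", "anonymous"])
print(json.dumps(result, indent=2))
```

Filtering the export rather than re-running the tool keeps the raw data intact, so different ignore lists can be tried without rescanning the files.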
- Documents: PDF, DOCX, ODT, XLS, XLSX, PPTX, ODP, RTF, CSV, XML
- Images: JPEG, PNG, TIFF, BMP, GIF, SVG, PSD, HEIC, HEIF
- Email: EML, MSG, PST, OST
- Video: MP4, MOV
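
Before choosing values for -t, it can help to see which supported categories a set of collected files falls into. The following is a rough sketch, not part of MetaDetective: the mapping is transcribed from the list above, the alternate spellings `jpg` and `tif` are assumptions, and the function names are hypothetical.

```python
from collections import Counter
from pathlib import Path

# Extension sets transcribed from the supported-formats list.
# "jpg" and "tif" are assumed aliases for JPEG and TIFF.
CATEGORIES = {
    "Documents": {"pdf", "docx", "odt", "xls", "xlsx", "pptx", "odp", "rtf", "csv", "xml"},
    "Images": {"jpeg", "jpg", "png", "tiff", "tif", "bmp", "gif", "svg", "psd", "heic", "heif"},
    "Email": {"eml", "msg", "pst", "ost"},
    "Video": {"mp4", "mov"},
}


def categorize(filename):
    """Return the category for a filename, or None if unsupported."""
    ext = Path(filename).suffix.lower().lstrip(".")
    for category, extensions in CATEGORIES.items():
        if ext in extensions:
            return category
    return None


def summarize(filenames):
    """Count files per supported category, skipping unsupported ones."""
    return Counter(c for c in map(categorize, filenames) if c is not None)


print(summarize(["report.pdf", "photo.heic", "notes.txt", "mail.eml", "scan.PDF"]))
```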
Open an issue or submit a pull request on GitHub.
Licensed under AGPL-3.0. See LICENSE.
MetaDetective is provided for educational and authorized security testing purposes. You are responsible for ensuring compliance with applicable laws.

