This Python script scans the memory of a target Windows process (by PID), detects embedded PDF file data in memory, and dumps complete PDF files to disk. It uses low-level Windows APIs via ctypes.
- Scans committed, readable memory regions of a running process.
- Detects PDF headers (`%PDF-x.y`) and trailers (`%%EOF`) to extract full PDF content.
- Keeps only PDFs that look valid (they contain a `/Pages` marker and meet a size threshold).
- Saves results to a specified output directory.
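
The header/trailer detection and the validity filter described above can be sketched on an in-memory buffer. This is a minimal illustration, not the script's actual code: the function name `carve_pdfs`, the regular expression, and the choice to stop each candidate at the next header are assumptions made for the example.

```python
import re

# Assumed signature patterns: "%PDF-" followed by a one-digit major.minor version.
PDF_HEADER = re.compile(rb"%PDF-\d\.\d")
PDF_TRAILER = b"%%EOF"


def carve_pdfs(buf: bytes, min_size: int = 100 * 1024) -> list[bytes]:
    """Return PDF candidates in buf that pass the size and /Pages filters."""
    found = []
    pos = 0
    while True:
        match = PDF_HEADER.search(buf, pos)
        if match is None:
            break
        start = match.start()
        # Bound the candidate at the next header so it cannot swallow a neighbour.
        nxt = PDF_HEADER.search(buf, match.end())
        limit = nxt.start() if nxt else len(buf)
        # Use the last %%EOF in range: incrementally saved PDFs contain several.
        end = buf.rfind(PDF_TRAILER, start, limit)
        if end != -1:
            candidate = buf[start:end + len(PDF_TRAILER)]
            if len(candidate) >= min_size and b"/Pages" in candidate:
                found.append(candidate)
        pos = match.end()
    return found
```

Headers without a matching trailer (truncated fragments) and candidates lacking `/Pages` are dropped, which mirrors the filtering listed above.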
- Windows OS
- Python 3.12+
- Admin privileges (or access rights to target process)
- Optional: `psutil` (for process listing)
- Opens the target process using Win32 API with full access rights.
- Iterates through all memory regions using `VirtualQueryEx`.
- Reads readable memory chunks using `ReadProcessMemory`.
- Searches for PDF signatures.
- If a valid PDF structure is found (including `/Pages`), writes it to a `.pdf` file.
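
The region walk described above might look roughly like the following `ctypes` sketch. It is an illustration under assumptions, not the script's implementation: `is_readable`, `read_regions`, and the `max_bytes` cap are hypothetical names, and the example requests only query and read access, whereas the script is described as opening the process with full access rights.

```python
import ctypes
import sys

MEM_COMMIT = 0x1000
PAGE_GUARD = 0x100
# PAGE_READONLY | READWRITE | WRITECOPY | EXECUTE_READ | EXECUTE_READWRITE | EXECUTE_WRITECOPY
READABLE_MASK = 0x02 | 0x04 | 0x08 | 0x20 | 0x40 | 0x80
PROCESS_QUERY_INFORMATION = 0x0400
PROCESS_VM_READ = 0x0010


class MEMORY_BASIC_INFORMATION(ctypes.Structure):
    # Natural alignment pads after AllocationProtect, covering PartitionId on 64-bit.
    _fields_ = [
        ("BaseAddress", ctypes.c_void_p),
        ("AllocationBase", ctypes.c_void_p),
        ("AllocationProtect", ctypes.c_uint32),
        ("RegionSize", ctypes.c_size_t),
        ("State", ctypes.c_uint32),
        ("Protect", ctypes.c_uint32),
        ("Type", ctypes.c_uint32),
    ]


def is_readable(state: int, protect: int) -> bool:
    """True for committed pages with a readable protection and no guard flag."""
    return state == MEM_COMMIT and bool(protect & READABLE_MASK) and not protect & PAGE_GUARD


def read_regions(pid: int, max_bytes: int = 100 * 1024 * 1024):
    """Yield (base_address, data) for each readable region (Windows only)."""
    if sys.platform != "win32":
        raise OSError("read_regions requires Windows")
    k32 = ctypes.WinDLL("kernel32", use_last_error=True)
    k32.OpenProcess.argtypes = [ctypes.c_uint32, ctypes.c_int, ctypes.c_uint32]
    k32.OpenProcess.restype = ctypes.c_void_p
    k32.VirtualQueryEx.argtypes = [ctypes.c_void_p, ctypes.c_void_p,
                                   ctypes.POINTER(MEMORY_BASIC_INFORMATION), ctypes.c_size_t]
    k32.VirtualQueryEx.restype = ctypes.c_size_t
    k32.ReadProcessMemory.argtypes = [ctypes.c_void_p, ctypes.c_void_p, ctypes.c_void_p,
                                      ctypes.c_size_t, ctypes.POINTER(ctypes.c_size_t)]
    k32.ReadProcessMemory.restype = ctypes.c_int
    k32.CloseHandle.argtypes = [ctypes.c_void_p]

    handle = k32.OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_VM_READ, False, pid)
    if not handle:
        raise ctypes.WinError(ctypes.get_last_error())
    try:
        mbi = MEMORY_BASIC_INFORMATION()
        addr = 0
        # VirtualQueryEx returns 0 once addr passes the end of the address space.
        while k32.VirtualQueryEx(handle, addr, ctypes.byref(mbi), ctypes.sizeof(mbi)):
            if is_readable(mbi.State, mbi.Protect):
                # Simplification: read at most max_bytes from the start of the region.
                size = min(mbi.RegionSize, max_bytes)
                buf = ctypes.create_string_buffer(size)
                got = ctypes.c_size_t(0)
                if k32.ReadProcessMemory(handle, mbi.BaseAddress, buf, size, ctypes.byref(got)):
                    yield mbi.BaseAddress, buf.raw[:got.value]
            addr = (mbi.BaseAddress or 0) + mbi.RegionSize
    finally:
        k32.CloseHandle(handle)
```

Each yielded buffer can then be searched for PDF signatures. A PDF that spans two adjacent regions would need the buffers joined before searching, which this sketch does not attempt.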
Edit the top of the script to match your use case:

    PID = 2016             # Target process ID
    OUT_DIR = 'pdf_dumps'  # Output directory for dumped PDFs
    MIN_SIZE_KB = 100      # Minimum size to consider a valid PDF
    MAX_BUF_MB = 100       # Max buffer size for region streaming

You can use the `psutil` snippet at the bottom of the script to help find a process PID.
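
The script's own `psutil` snippet is not reproduced here. As a rough equivalent, a PID lookup by process name could be written as below; `match_pids` and `list_processes` are hypothetical helper names, and `psutil` is imported lazily because it is an optional dependency.

```python
def match_pids(processes, name_fragment):
    """Return PIDs whose process name contains name_fragment (case-insensitive).

    processes is an iterable of (pid, name) pairs; name may be None.
    """
    needle = name_fragment.lower()
    return [pid for pid, name in processes if name and needle in name.lower()]


def list_processes():
    """Collect (pid, name) pairs for running processes using psutil."""
    import psutil  # optional dependency, imported only when needed

    return [(p.info["pid"], p.info["name"]) for p in psutil.process_iter(["pid", "name"])]
```

For example, `match_pids(list_processes(), "acro")` would list candidate PIDs for a PDF reader whose executable name contains "acro"; one of those values can then be assigned to `PID`.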
- Make sure the script is run as an administrator (if needed).
- Replace the `PID` in the script with your target process's PID.
- Run the script: `python pdf_dumper.py`
- Extracted PDFs will be saved in the `OUT_DIR` directory.

Example output:

    Saved pdf_dumps/dump_0.pdf (139584 bytes)
    Saved pdf_dumps/dump_1.pdf (87299 bytes)
    Skipped fragment (95123 bytes)
    Done: 2 PDF(s) dumped into 'pdf_dumps/'

- This script is for educational and forensic purposes only.
- Do not use on processes without permission.
- Improper use may violate software terms or laws.
MIT License