PDF Image Extractor

Introduction

This Python script extracts the images embedded in a PDF file and saves them to an output directory. It uses the PyMuPDF library to read the PDF and the Pillow library to handle the image data.
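The general approach can be sketched as follows. This is a minimal illustration under stated assumptions, not the script's actual code: the function name load_pdf_images is hypothetical, and page and image numbers are assumed to start at 1.

```python
import io
import os


def load_pdf_images(pdf_path):
    """Return a list of (page_number, image_number, extension, image) tuples."""
    if not os.path.isfile(pdf_path):
        raise FileNotFoundError(f"PDF not found: {pdf_path}")

    # Third-party imports are deferred so the check above works without them.
    import fitz  # import name of the PyMuPDF package
    from PIL import Image  # import name of the Pillow package

    results = []
    with fitz.open(pdf_path) as doc:
        for page_number, page in enumerate(doc, start=1):
            # Each entry describes one image on the page; item 0 is its xref.
            for image_number, info in enumerate(page.get_images(full=True), start=1):
                extracted = doc.extract_image(info[0])
                image = Image.open(io.BytesIO(extracted["image"]))
                image.load()  # read the pixel data now, not lazily
                results.append((page_number, image_number, extracted["ext"], image))
    return results
```

Each returned image is a Pillow Image object, so it can be inspected or converted before being saved.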

Usage

Prerequisites

Before using this script, ensure you have the following:

Python installed on your system.
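Required modules: os and io (part of the Python standard library), fitz (provided by the PyMuPDF package), and PIL (provided by the Pillow package).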
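A quick check that the required modules are importable can be sketched as follows. The helper name missing_modules is illustrative and not part of the script.

```python
import importlib.util

REQUIRED_MODULES = ["os", "io", "fitz", "PIL"]


def missing_modules(names):
    """Return the module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing modules:", ", ".join(missing))
    else:
        print("All required modules are available.")
```

If fitz or PIL is reported missing, installing the PyMuPDF or Pillow package respectively should resolve it.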

Running the Script

Place the PDF file you want to extract images from in the same directory as this script.
Set the input_file variable in the script to the name of your PDF file:

input_file = 'sample-pdf-with-images.pdf'

Then run the script:

python extract_images_from_pdf.py
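Because the PDF is expected to sit next to the script, a missing or misnamed file is the most likely cause of a failed run. A small check along these lines could catch that early; the helper resolve_input and its script_dir parameter are hypothetical, not taken from the script.

```python
import os


def resolve_input(input_file, script_dir):
    """Return the full path of input_file inside script_dir, or raise if absent."""
    path = os.path.join(script_dir, input_file)
    if not os.path.isfile(path):
        raise FileNotFoundError(f"Expected {input_file!r} in {script_dir!r}")
    return path
```

Inside a script, script_dir could be obtained with os.path.dirname(os.path.abspath(__file__)).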

Output

The script will create a directory named IMAGES_FROM_PDF in the same location as the script. Within this directory, a subdirectory will be created for each PDF file processed, named after the PDF file (excluding the extension).
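The directory layout described above could be created along these lines. This is a sketch only: the function name make_output_dir and the base_dir parameter are assumptions, with base_dir standing in for the script's location.

```python
import os

OUTPUT_ROOT = "IMAGES_FROM_PDF"


def make_output_dir(pdf_path, base_dir):
    """Create base_dir/IMAGES_FROM_PDF/<PDF name without extension> and return it."""
    stem = os.path.splitext(os.path.basename(pdf_path))[0]
    target = os.path.join(base_dir, OUTPUT_ROOT, stem)
    # exist_ok=True lets the script be re-run without failing on existing folders.
    os.makedirs(target, exist_ok=True)
    return target
```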

Inside each subdirectory, the script saves the extracted images. Each image is named using the convention <pdf_file_name>_<page_number>_<image_number>.<extension>, where the page and image numbers appear zero-padded to three digits in the example below.
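The naming convention can be expressed as a small helper. The three-digit zero padding is an assumption inferred from the example file names in this README, and image_filename is a hypothetical name.

```python
def image_filename(pdf_stem, page_number, image_number, extension):
    """Build a name like sample-pdf-with-images_001_001.jpg."""
    # :03d pads each number with leading zeros to at least three digits.
    return f"{pdf_stem}_{page_number:03d}_{image_number:03d}.{extension}"


print(image_filename("sample-pdf-with-images", 2, 1, "jpg"))
# prints sample-pdf-with-images_002_001.jpg
```

Zero padding keeps the files in page order when a directory listing is sorted by name.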

Example:

If you run the script with a PDF file named sample-pdf-with-images.pdf, it creates the following structure:

IMAGES_FROM_PDF/
    sample-pdf-with-images/
        sample-pdf-with-images_001_001.jpg
        sample-pdf-with-images_002_001.jpg
        ...
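To confirm what a run produced, the output tree can be listed with a short helper such as the one below. The name list_extracted is illustrative and not part of the script.

```python
import os


def list_extracted(output_root):
    """Return the paths of all files under output_root, relative to it, sorted."""
    found = []
    for dirpath, _dirnames, filenames in os.walk(output_root):
        for filename in filenames:
            full_path = os.path.join(dirpath, filename)
            found.append(os.path.relpath(full_path, output_root))
    return sorted(found)
```

Calling list_extracted("IMAGES_FROM_PDF") after a run would return one entry per saved image.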

License

This script is released under the MIT License. Feel free to use, modify, and distribute it as needed.
