phingorani/image-vectorizer

Image Vectorizer

This project is a web application that allows you to upload images, convert them into vector embeddings using a pre-trained ResNet-50 model, and store these embeddings in a PostgreSQL database with the PGVector extension. The application also provides an interactive 3D visualization of the vectorized images.

Architecture

The project consists of the following components:

  1. Frontend (Web UI): A web interface built with HTML, CSS, and JavaScript. Users can upload one or more images and view a 3D visualization of the image vectors.
  2. Backend (Flask API): A Python-based backend built with Flask. It exposes RESTful API endpoints to handle image uploads, vectorization, and to provide data for the 3D visualization.
  3. Image Vectorization: A Python module (vectorize.py) that uses a pre-trained ResNet-50 model from torchvision to convert images into 2048-dimensional vector embeddings.
  4. Database (PostgreSQL + PGVector): A PostgreSQL database with the PGVector extension. The database stores the image data directly, as well as the corresponding vector embeddings. The project includes a docker-compose.yml file to easily set up the database in a Docker container.
  5. 3D Visualization: The frontend uses Plotly.js to render an interactive 3D scatter plot of the image vectors. The dimensionality of the vectors is reduced to 3 using Principal Component Analysis (PCA) on the backend.
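
The PCA step in component 5 can be illustrated with a short sketch. This is not the project's code: the function name pca_to_3d and the random stand-in data are assumptions, and the backend may well use a library such as scikit-learn instead of the plain NumPy SVD shown here.

```python
import numpy as np


def pca_to_3d(vectors: np.ndarray) -> np.ndarray:
    """Project row vectors onto their top three principal components."""
    # Center each feature so the components describe variance, not the mean.
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:3].T


# Stand-in data (assumption): 10 random 2048-dimensional "embeddings".
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10, 2048))
points_3d = pca_to_3d(embeddings)  # one (x, y, z) row per image
```

A projection like this needs at least three stored vectors to produce three meaningful axes, so a plot built from fewer images would likely need special handling.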

Workflow

  1. A user selects one or more images through the web UI and clicks "Upload and Vectorize".
  2. The frontend sends each image to the /upload_and_vectorize endpoint of the Flask API.
  3. The API checks if an image with the same filename already exists in the database.
    • If it exists, the existing image data is updated, and the vector is re-calculated and updated.
    • If it does not exist, a new image record is created.
  4. The image is vectorized, and the resulting embedding is stored in the image_vectors table, linked to the corresponding image.
  5. Refreshing the 3D visualization shows the new and updated vectors.
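
The update-or-create logic in steps 3 and 4 can be sketched as follows. To keep the example self-contained it uses an in-memory SQLite database as a stand-in for PostgreSQL, and the column names (id, filename, data, image_id, embedding) and the function name upsert_image are hypothetical; only the table names images and image_vectors come from this README.

```python
import sqlite3


def upsert_image(conn, filename, data, vector):
    """Create the image record, or update it if the filename already exists."""
    cur = conn.cursor()
    cur.execute("SELECT id FROM images WHERE filename = ?", (filename,))
    row = cur.fetchone()
    if row is None:
        # New filename: create the image record and its vector.
        cur.execute(
            "INSERT INTO images (filename, data) VALUES (?, ?)", (filename, data)
        )
        image_id = cur.lastrowid
        cur.execute(
            "INSERT INTO image_vectors (image_id, embedding) VALUES (?, ?)",
            (image_id, vector),
        )
    else:
        # Known filename: overwrite the image data and the stored vector.
        image_id = row[0]
        cur.execute("UPDATE images SET data = ? WHERE id = ?", (data, image_id))
        cur.execute(
            "UPDATE image_vectors SET embedding = ? WHERE image_id = ?",
            (vector, image_id),
        )
    conn.commit()
    return image_id


# Hypothetical schema, with SQLite standing in for PostgreSQL + PGVector.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE images (id INTEGER PRIMARY KEY, filename TEXT UNIQUE, data BLOB);
    CREATE TABLE image_vectors (image_id INTEGER REFERENCES images(id), embedding TEXT);
    """
)
first_id = upsert_image(conn, "cat.png", b"v1", "[0.1, 0.2]")
second_id = upsert_image(conn, "cat.png", b"v2", "[0.3, 0.4]")  # same filename
```

Uploading the same filename twice leaves a single image row whose data and vector reflect the second upload.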

Setup and Installation

Follow these instructions to set up and run the application on your local machine.

Prerequisites

  • Python 3.x: Make sure you have Python 3 installed. You can download it from python.org.
  • Docker: The easiest way to set up the database is by using Docker. You can install Docker from the official website.

1. Set Up the Database

You have two options for setting up the database:

Option A: Use the Provided Docker Compose File (Recommended)

  1. Start the Docker Container: In the root directory of the project, run the following command to start the PostgreSQL database in a Docker container:

    docker-compose up -d

    This starts a PostgreSQL database with the PGVector extension on localhost:5432. With Docker Compose V2, the equivalent command is docker compose up -d.

Option B: Use an Existing PostgreSQL Database

If you have an existing PostgreSQL database, you can use it by following these steps:

  1. Install the PGVector Extension: Follow the instructions on the PGVector GitHub page to install the extension for your PostgreSQL version.
  2. Create a Database: Create a new database for this project.
  3. Update Connection Settings: In database_setup.py and api.py, update the DB_HOST, DB_PORT, DB_NAME, DB_USER, and DB_PASSWORD variables with your database connection details.
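
As an alternative to editing the variables in both files, the connection details could be read from environment variables in one place. The sketch below is an assumption, not part of the project: the helper name load_db_settings and the default values are hypothetical (only localhost:5432 comes from this README), and the dictionary keys are chosen to match the keyword arguments of psycopg2.connect, in case that is the driver in use.

```python
import os


def load_db_settings(env=os.environ):
    """Read database settings from the environment, with fallback defaults."""
    return {
        "host": env.get("DB_HOST", "localhost"),
        "port": int(env.get("DB_PORT", "5432")),
        "dbname": env.get("DB_NAME", "postgres"),    # assumed default
        "user": env.get("DB_USER", "postgres"),      # assumed default
        "password": env.get("DB_PASSWORD", ""),      # assumed default
    }


settings = load_db_settings()
```

Passing a plain dictionary instead of os.environ, for example load_db_settings({"DB_HOST": "db"}), makes the helper easy to exercise without touching the real environment.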

2. Install Python Dependencies

  1. Create a Virtual Environment: It's recommended to use a virtual environment to manage the project's dependencies.

    python3 -m venv venv
    source venv/bin/activate

  2. Install Dependencies: Install the required Python packages using pip.

    pip install -r requirements.txt

3. Set Up the Database Schema

Run the database_setup.py script to create the necessary tables in the database. Note: This will delete any existing data in the images and image_vectors tables.

    python database_setup.py

4. Run the Application

  1. Start the Flask API Server:

    python api.py

  2. Access the Web UI: Open your web browser and navigate to http://127.0.0.1:5000. You can now upload images and view the 3D visualization.

API Endpoints

The Flask API provides the following endpoints:

  • POST /upload_and_vectorize: Uploads an image. If the filename already exists, it overwrites the image and re-vectorizes it. Otherwise, it creates a new entry.

    • Request: multipart/form-data with a file field named file.
    • Response: A JSON object with a success or error message.

  • POST /search: Uploads a query image, generates its vector embedding, and finds the most similar images in the database.

    • Request: multipart/form-data with a file field named file.
    • Response: A JSON object containing a list of similar images with their filenames and distances.

  • GET /get_vectors_3d: Fetches all vectors, reduces their dimensionality to 3D using PCA, and returns them with their corresponding filenames.

    • Response: A JSON object containing a list of 3D data points, each with a filename, x, y, and z coordinate.
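
To make the /search ranking concrete, here is a minimal pure-Python sketch of a nearest-neighbour lookup. It is illustrative only: in the application the comparison is presumably performed inside PostgreSQL by PGVector, this README does not state which distance metric is used (Euclidean distance is assumed here), and the names euclidean and most_similar and the result keys are hypothetical.

```python
import math


def euclidean(a, b):
    """Euclidean (L2) distance between two equal-length vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def most_similar(query, stored, k=5):
    """Return the k stored vectors closest to the query, nearest first."""
    ranked = sorted(
        (
            {"filename": name, "distance": euclidean(query, vec)}
            for name, vec in stored.items()
        ),
        key=lambda item: item["distance"],
    )
    return ranked[:k]


# Toy 2-dimensional vectors standing in for 2048-dimensional embeddings.
stored = {"a.png": [0.0, 0.0], "b.png": [3.0, 4.0], "c.png": [1.0, 0.0]}
results = most_similar([0.0, 0.0], stored, k=2)
```

Here results lists a.png first (distance 0.0) and c.png second (distance 1.0), while b.png (distance 5.0) falls outside the top two.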

About

This project demonstrates vectorizing images for AI models to use and train on.
