This project is a web application that allows you to upload images, convert them into vector embeddings using a pre-trained ResNet-50 model, and store these embeddings in a PostgreSQL database with the PGVector extension. The application also provides an interactive 3D visualization of the vectorized images.
The project consists of the following components:
- Frontend (Web UI): A simple, user-friendly web interface built with HTML, CSS, and JavaScript. It allows users to upload one or more images and view a 3D visualization of the image vectors.
- Backend (Flask API): A Python-based backend built with Flask. It exposes RESTful API endpoints to handle image uploads, vectorization, and to provide data for the 3D visualization.
- Image Vectorization: A Python module (vectorize.py) that uses a pre-trained ResNet-50 model from torchvision to convert images into 2048-dimensional vector embeddings.
- Database (PostgreSQL + PGVector): A PostgreSQL database with the PGVector extension. The database stores the image data directly, as well as the corresponding vector embeddings. The project includes a docker-compose.yml file to easily set up the database in a Docker container.
- 3D Visualization: The frontend uses Plotly.js to render an interactive 3D scatter plot of the image vectors. The dimensionality of the vectors is reduced to 3 using Principal Component Analysis (PCA) on the backend.
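The PCA step used for the 3D visualization can be sketched as follows. This is a minimal illustration built on NumPy's SVD, not the project's actual backend code: the function name reduce_to_3d and the random stand-in data are assumptions, and the real backend may well use a library such as scikit-learn instead.

```python
import numpy as np


def reduce_to_3d(vectors: np.ndarray) -> np.ndarray:
    """Project row vectors onto their first three principal components."""
    # Center each feature so the components describe variance, not the mean.
    centered = vectors - vectors.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:3].T


# Stand-in data: 10 random vectors with the 2048 dimensions ResNet-50 produces.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(10, 2048))
points = reduce_to_3d(embeddings)
print(points.shape)  # (10, 3)
```

Each row of the result supplies the x, y, and z coordinates for one image. With fewer than three stored vectors the result has fewer than three columns, so a caller would need to handle that case separately.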
- A user selects one or more images through the web UI and clicks "Upload and Vectorize".
- The frontend sends each image to the /upload_and_vectorize endpoint of the Flask API.
- The API checks whether an image with the same filename already exists in the database.
  - If it exists, the stored image data is overwritten and its vector is recalculated.
  - If it does not exist, a new image record is created.
- The image is vectorized, and the resulting embedding is stored in the image_vectors table, linked to the corresponding image.
- The 3D visualization can be refreshed to show the new and updated vectors.
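The upload flow above amounts to an upsert keyed on the filename. The sketch below illustrates that logic with Python's built-in sqlite3 module standing in for PostgreSQL, so it runs without a database server. The column names, the store_image helper, and the JSON-encoded embedding column are all assumptions made for illustration; the real project stores embeddings in a PGVector column and its schema may differ.

```python
import json
import sqlite3

# In-memory SQLite database as a stand-in for PostgreSQL (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE images (
        id INTEGER PRIMARY KEY,
        filename TEXT UNIQUE NOT NULL,
        data BLOB NOT NULL
    );
    CREATE TABLE image_vectors (
        image_id INTEGER PRIMARY KEY REFERENCES images(id),
        embedding TEXT NOT NULL  -- JSON here; a vector column with PGVector
    );
    """
)


def store_image(filename: str, data: bytes, embedding: list) -> int:
    """Insert or update an image and its vector; return the image id."""
    row = conn.execute(
        "SELECT id FROM images WHERE filename = ?", (filename,)
    ).fetchone()
    if row is None:
        # New filename: create the image record.
        image_id = conn.execute(
            "INSERT INTO images (filename, data) VALUES (?, ?)", (filename, data)
        ).lastrowid
    else:
        # Existing filename: overwrite the stored image data.
        image_id = row[0]
        conn.execute("UPDATE images SET data = ? WHERE id = ?", (data, image_id))
    # Either way, the vector for this image is written or replaced.
    conn.execute(
        "INSERT OR REPLACE INTO image_vectors (image_id, embedding) VALUES (?, ?)",
        (image_id, json.dumps(embedding)),
    )
    conn.commit()
    return image_id


first = store_image("cat.png", b"old bytes", [0.1, 0.2])
second = store_image("cat.png", b"new bytes", [0.3, 0.4])
print(first == second)  # True: the second upload reused the existing record
```

Uploading the same filename twice leaves a single row in each table, holding the most recent image data and vector.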
Follow these instructions to set up and run the application on your local machine.
- Python 3.x: Make sure you have Python 3 installed. You can download it from python.org.
- Docker: The easiest way to set up the database is by using Docker. You can install Docker from the official website.
You have two options for setting up the database:
- Start the Docker Container: In the root directory of the project, run the following command to start the PostgreSQL database in a Docker container:
    docker-compose up -d
  This will start a PostgreSQL database with the PGVector extension on localhost:5432.
If you have an existing PostgreSQL database, you can use it by following these steps:
- Install the PGVector Extension: Follow the instructions on the PGVector GitHub page to install the extension for your PostgreSQL version.
- Create a Database: Create a new database for this project.
- Update Connection Settings: In database_setup.py and api.py, update the DB_HOST, DB_PORT, DB_NAME, DB_USER, and DB_PASSWORD variables with your database connection details.
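As a rough illustration of how those five settings fit together, the sketch below assembles them into a libpq-style keyword/value connection string. The values shown are placeholders rather than the project's defaults, and build_dsn is a hypothetical helper; database_setup.py and api.py may pass the settings to their database driver in a different way.

```python
# Placeholder values (assumptions); replace them with your own details.
DB_HOST = "localhost"
DB_PORT = 5432
DB_NAME = "your_database"
DB_USER = "your_user"
DB_PASSWORD = "your_password"


def _quote(value) -> str:
    """Single-quote a value, escaping backslashes and quotes as libpq expects."""
    text = str(value).replace("\\", "\\\\").replace("'", "\\'")
    return f"'{text}'"


def build_dsn(host, port, name, user, password) -> str:
    """Return a keyword/value connection string built from the settings."""
    parts = {
        "host": host,
        "port": port,
        "dbname": name,
        "user": user,
        "password": password,
    }
    return " ".join(f"{key}={_quote(value)}" for key, value in parts.items())


dsn = build_dsn(DB_HOST, DB_PORT, DB_NAME, DB_USER, DB_PASSWORD)
print(dsn)
```

Quoting every value keeps the string valid even when a password contains spaces or quote characters.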
- Create a Virtual Environment: It's recommended to use a virtual environment to manage the project's dependencies.
    python3 -m venv venv
    source venv/bin/activate
- Install Dependencies: Install the required Python packages using pip.
    pip install -r requirements.txt
Run the database_setup.py script to create the necessary tables in the database. Note: This will delete any existing data in the images and image_vectors tables.
    python database_setup.py
- Start the Flask API Server:
    python api.py
- Access the Web UI: Open your web browser and navigate to http://127.0.0.1:5000. You can now upload images and view the 3D visualization.
The Flask API provides the following endpoints:
- POST /upload_and_vectorize: Uploads an image. If the filename already exists, it overwrites the image and re-vectorizes it. Otherwise, it creates a new entry.
  - Request: multipart/form-data with a file field named file.
  - Response: A JSON object with a success or error message.
- POST /search: Uploads a query image, generates its vector embedding, and finds the most similar images in the database.
  - Request: multipart/form-data with a file field named file.
  - Response: A JSON object containing a list of similar images with their filenames and distances.
- GET /get_vectors_3d: Fetches all vectors, reduces their dimensionality to 3D using PCA, and returns them with their corresponding filenames.
  - Response: A JSON object containing a list of 3D data points, each with a filename, x, y, and z coordinate.
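To show what a request to the upload endpoint might look like from a script rather than the browser, here is a sketch that builds the multipart/form-data body by hand with the standard library. The base URL is the local address the server is reached at; the helper name, the sample filename, and the image/png content type are assumptions, and the request is only constructed here, not sent.

```python
import urllib.request
import uuid

BASE_URL = "http://127.0.0.1:5000"


def build_upload_request(endpoint: str, filename: str, data: bytes) -> urllib.request.Request:
    """Build a POST request carrying one file in a form field named 'file'."""
    boundary = uuid.uuid4().hex
    head = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{filename}"\r\n'
        "Content-Type: image/png\r\n\r\n"  # assumed type for the sample file
    ).encode()
    tail = f"\r\n--{boundary}--\r\n".encode()
    return urllib.request.Request(
        BASE_URL + endpoint,
        data=head + data + tail,
        headers={"Content-Type": f"multipart/form-data; boundary={boundary}"},
        method="POST",
    )


request = build_upload_request("/upload_and_vectorize", "cat.png", b"<image bytes>")
print(request.get_method(), request.full_url)

# With the server running, the request could be sent like this:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode())
```

Because /search accepts the same file field, the same helper could build a search request by passing "/search" as the endpoint.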