An image-editing playground combining a Next.js client with a Flask server. Users can create a conversation from a base image, upload a boxed/annotated variant plus a global prompt, and the server will generate up to four edited outputs using Google Gemini.
- Create conversations from an uploaded base image
- Upload original and modified (boxed) images to request edits
- Generate up to 4 variants via Gemini and browse or select among them
- Static serving of stored images and outputs
- macOS/Linux/Windows
- Python 3.10+ (virtualenv recommended)
- Node.js 18+ and pnpm (or npm)
- A Google Gemini API key
- Clone and enter the project directory

      git clone <this-repo-url>
      cd Commstem-hack

- Server setup (Python virtual environment)

      python3 -m venv venv
      source venv/bin/activate  # Windows: venv\Scripts\activate
      pip install -r requirements.txt

- Configure environment

      cp example.env .env
      # Edit .env and set GEMINI_API_KEY=<your_key>

- Initialize and run the Flask server

      # Optional: clean storage (outputs, modified, originals) and reset DB
      make clean
      # Start the server (default http://127.0.0.1:5000)
      python3 server/app.py

- Client setup and run (in a new terminal)

      cd client
      pnpm install  # or npm install / yarn
      pnpm dev      # or npm run dev / yarn dev
      # Next dev runs at http://localhost:3000

Create a .env in the project root with:
GEMINI_API_KEY=your_api_key_here
The server reads this in server/services/model.py via python-dotenv.
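As a minimal sketch of a fail-fast check, the helper below looks up GEMINI_API_KEY in an environment mapping and raises a clear error when it is missing or blank. The name `require_gemini_key` is hypothetical and is not taken from server/services/model.py. The sketch assumes the environment has already been populated, for example by python-dotenv or by the shell.

```python
import os
from typing import Mapping, Optional


def require_gemini_key(env: Optional[Mapping[str, str]] = None) -> str:
    """Return GEMINI_API_KEY from env, or raise if it is missing or blank.

    Hypothetical helper: assumes .env has already been loaded into the
    process environment (for example by python-dotenv's load_dotenv()).
    """
    # Default to the real process environment; tests can pass a plain dict.
    source = os.environ if env is None else env
    key = source.get("GEMINI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "GEMINI_API_KEY is not set; copy example.env to .env in the "
            "project root and fill in your key."
        )
    return key
```

Calling a check like this once at startup would surface a missing key immediately rather than as an HTTP 500 on the first edit request.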
client/             # Next.js 14 app
server/             # Flask app
  routes/           # API blueprints
  services/         # model + storage helpers
  storage/          # sqlite DB + image files
    originals/
    modified/
    outputs/
  app.py            # Flask app factory + runner
requirements.txt    # Python deps
example.env         # Environment template
- The SQLite database lives at server/storage/app.db and is created automatically on first run.
- All stored images are under server/storage/*. The make clean target wipes these folders and the DB file.
- The server enables CORS by default.
Base URL: http://127.0.0.1:5000
- GET /images/<image_id>
  - Returns the image bytes for a stored image id (PNG).
- GET /server/storage/<path>
  - Serves files from server/storage/* directories.
- POST /conversations
  - multipart/form-data
    - image (file, required): base image without boxes
    - title (string, optional)
  - Response: { id, title, current_image: { id, url } }
- POST /conversations/<cid>/edits
  - multipart/form-data
    - original (file, required): original/base image
    - modified (file, required): boxed image (edit target)
    - prompt (string, required): global directive
  - Behavior: saves inputs, calls Gemini to produce up to 4 outputs, stores them, logs messages
  - Response: { outputs: [{ image_id, url }, ...] } with up to 4 entries
- POST /conversations/<cid>/select
  - application/json: { "selected_image_id": number | null }
  - Sets current image when a valid id is provided; null records a deselection.
  - Response: { current_image: { id, url }, selected }
- GET /conversations
  - Returns list of conversations: [{ id, title }]
- GET /conversations/<cid>
  - Returns conversation details with current image and message history.
- PUT /conversations/<cid>
  - application/json: { "title": string }
  - Updates conversation title.
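The create, edit, and select endpoints above can be exercised from a script. The following is a rough sketch using only the Python standard library. It builds the requests without sending them. All function names are hypothetical, files are assumed to be PNGs, and conversation ids are assumed to be integers; none of this is taken from the project's own client code.

```python
import json
import uuid
from typing import Dict, Optional, Tuple
from urllib.request import Request

BASE_URL = "http://127.0.0.1:5000"  # base URL documented above

FilePart = Tuple[str, bytes]  # (filename, raw bytes)


def encode_multipart(fields: Dict[str, str],
                     files: Dict[str, FilePart]) -> Tuple[bytes, str]:
    """Encode text fields and files as multipart/form-data.

    Returns (body, content_type). Every file is labeled image/png,
    which is an assumption of this sketch.
    """
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        chunks.append(
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f'{value}\r\n'.encode("utf-8")
        )
    for name, (filename, data) in files.items():
        header = (
            f'--{boundary}\r\n'
            f'Content-Disposition: form-data; name="{name}"; '
            f'filename="{filename}"\r\n'
            f'Content-Type: image/png\r\n\r\n'
        ).encode("utf-8")
        chunks.append(header + data + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def create_conversation_request(image: FilePart,
                                title: Optional[str] = None) -> Request:
    """Build POST /conversations with the base image and optional title."""
    fields = {"title": title} if title is not None else {}
    body, ctype = encode_multipart(fields, {"image": image})
    return Request(f"{BASE_URL}/conversations", data=body,
                   headers={"Content-Type": ctype}, method="POST")


def edit_request(cid: int, original: FilePart, modified: FilePart,
                 prompt: str) -> Request:
    """Build POST /conversations/<cid>/edits with both images and the prompt."""
    body, ctype = encode_multipart(
        {"prompt": prompt}, {"original": original, "modified": modified})
    return Request(f"{BASE_URL}/conversations/{cid}/edits", data=body,
                   headers={"Content-Type": ctype}, method="POST")


def select_request(cid: int, selected_image_id: Optional[int]) -> Request:
    """Build POST /conversations/<cid>/select; None is sent as JSON null."""
    body = json.dumps({"selected_image_id": selected_image_id}).encode("utf-8")
    return Request(f"{BASE_URL}/conversations/{cid}/select", data=body,
                   headers={"Content-Type": "application/json"}, method="POST")
```

With the server running, each request can be sent with `urllib.request.urlopen(req)` and the response body decoded with `json.load`. A third-party HTTP client would shorten the multipart handling; the standard library is used here only to keep the sketch dependency-free.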
- Start the Next.js dev server: pnpm dev in client/
- Open http://localhost:3000
- Use the UI to upload images, enter a directive, and iterate on edits.
- Missing API key: ensure .env is present at repo root and GEMINI_API_KEY is set.
- HTTP 500 from model calls: check your key, network access, or Gemini quota.
- CORS/404 on images: confirm server is running and images exist in server/storage.
- Clean slate: run make clean and restart the server.
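Some of the checks above can be automated. The sketch below is a hypothetical on-disk preflight: it looks for .env at the repo root, for a non-empty GEMINI_API_KEY line inside it, and for the three storage folders. The function name `preflight` and the simple line-based parsing are assumptions of this example, and it never contacts Gemini, so it cannot detect an invalid key or an exhausted quota.

```python
from pathlib import Path
from typing import List


def preflight(repo_root: str) -> List[str]:
    """Return human-readable problems found on disk (empty list if none).

    Hypothetical helper; it only inspects files under repo_root.
    """
    root = Path(repo_root)
    problems: List[str] = []

    env_file = root / ".env"
    if not env_file.is_file():
        problems.append(".env is missing at the repo root")
    else:
        has_key = False
        for line in env_file.read_text(encoding="utf-8").splitlines():
            line = line.strip()
            # Skip comments and anything that is not a NAME=value pair.
            if line.startswith("#") or "=" not in line:
                continue
            name, _, value = line.partition("=")
            if name.strip() == "GEMINI_API_KEY" and value.strip():
                has_key = True
        if not has_key:
            problems.append("GEMINI_API_KEY is not set in .env")

    storage = root / "server" / "storage"
    for sub in ("originals", "modified", "outputs"):
        if not (storage / sub).is_dir():
            problems.append(f"server/storage/{sub} does not exist")
    return problems
```

Running `preflight(".")` from the repo root and printing the result gives a quick first pass before digging into server logs. Note that the template value from example.env would pass this check, since only emptiness is tested.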
MIT or project’s chosen license.