A local-first visual memory assistant that captures screen activity, builds a searchable memory index, and lets you recall past work through natural language and voice.
- Add your Gemini API key to a root `.env` file: `cp .env.example .env`
- Start the full web workflow: `python3 start.py`
- Open the URL printed by Vite, usually `http://127.0.0.1:5173`.

`start.py` creates a local Python environment in `.memorium/`, installs the backend requirements, starts the screenshot search API on `127.0.0.1:8765`, and launches the React client with the correct environment variables.
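
A launcher of this kind typically has to wait for the backend to answer before it starts the client. The following is a minimal sketch of such a wait loop, assuming the readiness check is a plain GET against `/health` that returns HTTP 200. The function names `http_ok` and `wait_until_ready`, the attempt count, and the delay are illustrative and are not taken from `start.py`.

```python
import time
import urllib.error
import urllib.request
from typing import Callable


def http_ok(url: str, timeout: float = 1.0) -> bool:
    """Return True if a GET to `url` answers with HTTP 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: not ready yet.
        return False


def wait_until_ready(probe: Callable[[], bool], attempts: int = 30, delay: float = 0.5) -> bool:
    """Call `probe` until it returns True or `attempts` is exhausted."""
    for _ in range(attempts):
        if probe():
            return True
        time.sleep(delay)
    return False
```

With the default address, a call could look like `wait_until_ready(lambda: http_ok("http://127.0.0.1:8765/health"))`. Passing the probe as a callable keeps the retry logic testable without a running server.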

| Area | Purpose |
|---|---|
| `client/` | Browser-based Memorium interface for text search, voice mode, and result browsing. |
| `memorium_tauri/gui/` | Desktop shell built with Tauri and the same React UI foundation. |
| `memorium_tauri/backend/space_clip.py` | Local indexing engine that stores frame metadata, OCR, embeddings, and searchable memory records. |
| `assets/` | Project branding and icon assets used in the app and documentation. |
| `start.py` | One-command local launcher for the web development path. |
| `debug_agent.py`, `test_agent_*.py` | Prompt and tool-calling experiments used while shaping the memory retrieval loop. |

Memorium is split into two layers:
- A local backend watches for captured frames, extracts OCR with Apple Vision, generates semantic descriptions, and stores embeddings plus metadata in SQLite.
- A React interface sends natural-language prompts, calls the local `/api/search` endpoint for relevant memories, and uses Gemini to turn those results into answers or voice interactions.

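
To make the backend layer concrete, here is a minimal sketch of storing frame metadata and embeddings in SQLite and ranking them by cosine similarity. Everything in it is an assumption for illustration: the `memories` table, its columns, the JSON-encoded embeddings, and the brute-force scan are hypothetical and do not describe the actual schema or search code in `space_clip.py`. Real embeddings would come from the captioning and embedding models; the sketch accepts any list of floats.

```python
import json
import math
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Create the hypothetical `memories` table if it does not exist."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS memories ("
        "id INTEGER PRIMARY KEY, captured_at TEXT, ocr_text TEXT, "
        "description TEXT, embedding TEXT)"
    )
    return conn


def add_memory(conn, captured_at, ocr_text, description, embedding):
    """Store one frame's metadata with its embedding serialized as JSON."""
    conn.execute(
        "INSERT INTO memories (captured_at, ocr_text, description, embedding) "
        "VALUES (?, ?, ?, ?)",
        (captured_at, ocr_text, description, json.dumps(embedding)),
    )
    conn.commit()


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def search(conn, query_embedding, limit=5):
    """Rank every stored memory by similarity to the query (brute-force scan)."""
    rows = conn.execute("SELECT id, description, embedding FROM memories").fetchall()
    scored = [
        (cosine(query_embedding, json.loads(emb)), row_id, desc)
        for row_id, desc, emb in rows
    ]
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:limit]
```

A full scan is only reasonable for small collections; a larger index would likely need a vector extension or an approximate nearest-neighbor structure.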
The backend keeps image payloads on the local machine and exposes a lightweight HTTP API:

| Endpoint | Purpose |
|---|---|
| `/health` | Readiness check used by the launcher and desktop shell. |
| `/status` | Runtime status used by the debug dashboard and startup screen. |
| `/api/search` | Semantic search over stored memories. |
| `/api/frame` | Fetch a frame preview for a search result. |
| `/api/push_frame` | Submit a new frame to the indexing pipeline. |

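
The endpoints above can be exercised from a small script. The sketch below uses only the Python standard library and rests on several assumptions that this README does not confirm: that `/api/search` accepts a GET request, that the query is passed in a parameter named `q`, and that responses are JSON. The helper names are hypothetical; check the backend source for the real request shape.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "http://127.0.0.1:8765"  # default backend address from this README


def endpoint_url(path: str, base: str = API_BASE, **params: str) -> str:
    """Join the base URL and an endpoint path, appending URL-encoded query parameters."""
    url = base.rstrip("/") + "/" + path.lstrip("/")
    if params:
        url += "?" + urllib.parse.urlencode(params)
    return url


def get_json(url: str, timeout: float = 5.0):
    """GET `url` and decode the body as JSON (assumes the endpoint returns JSON)."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))


def search_memories(query: str, base: str = API_BASE):
    """Hypothetical call: assumes /api/search accepts GET with a `q` parameter."""
    return get_json(endpoint_url("/api/search", base, q=query))
```

For example, `endpoint_url("/health")` yields the URL that a readiness probe would request, and `search_memories("budget spreadsheet")` would query a running backend under the assumptions stated above.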

```text
.
├── assets/
├── client/
├── memorium_tauri/
│   ├── backend/
│   └── gui/
├── start.py
├── debug_agent.py
└── test_agent_*.py
```

```bash
# Run each command from the repository root.

# Build the web client
cd client && npm run build

# Build the desktop UI bundle
cd memorium_tauri/gui && npm run build

# Launch the desktop development shell
python3 start.py --desktop

# Verify both UI builds without starting servers
python3 start.py --build-only
```

| Variable | Purpose |
|---|---|
| `GEMINI_API_KEY` | Enables Gemini-backed search narration and live voice mode. |
| `VITE_GEMINI_API_KEY` | Optional override passed directly to the Vite clients. If omitted, `start.py` mirrors `GEMINI_API_KEY`. |
| `VITE_API_BASE` | Optional API base URL for the web and desktop clients. Defaults to `http://127.0.0.1:8765`. |
| `MEMORIUM_PORT` | Backend port used by the launcher. Defaults to `8765`. |

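
The defaults and the mirroring rule for these variables can be expressed in a few lines. This is a sketch of the documented behavior, not the code in `start.py`; the function name `resolve_config` is hypothetical, and deriving the API base from `MEMORIUM_PORT` when `VITE_API_BASE` is unset is an assumption (with the default port it matches the documented default).

```python
import os
from typing import Mapping, Optional


def resolve_config(env: Optional[Mapping[str, str]] = None) -> dict:
    """Apply the documented defaults and mirroring rule to an environment mapping."""
    env = os.environ if env is None else env
    port = int(env.get("MEMORIUM_PORT", "8765"))
    gemini_key = env.get("GEMINI_API_KEY", "")
    return {
        "MEMORIUM_PORT": port,
        "GEMINI_API_KEY": gemini_key,
        # Mirror GEMINI_API_KEY unless an explicit override is set.
        "VITE_GEMINI_API_KEY": env.get("VITE_GEMINI_API_KEY") or gemini_key,
        # Assumption: when unset, the base URL follows the chosen backend port.
        "VITE_API_BASE": env.get("VITE_API_BASE") or f"http://127.0.0.1:{port}",
    }
```

Calling `resolve_config()` with no argument reads the current process environment; passing a plain dictionary makes the rules easy to check in isolation.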
- React 19 + Vite for both interfaces
- Tauri 2 for the desktop shell
- Python + SQLite for local indexing and search
- Apple Vision for OCR
- Sentence Transformers and FastVLM-based captioning for semantic retrieval

