Yet Another Topic Modeling Browser...
Originally based on https://github.com/AdrienGuille/TOM, with many changes and enhancements. Each trained model is stored as a DuckDB file served by the bundled API.
While you can build and install TopoLogic directly, it is highly encouraged to use a Docker container instead.
- Run `docker build -t topologic .` to build the image (CUDA-enabled by default). For a CPU-only image, pass `--build-arg TOPOLOGIC_BACKEND=cpu`.
- Run `docker run -td --name topologic -p 8080:80 topologic` to start the container. The image listens on port 80 internally (gunicorn serves the SPA and the API on a single port); map it to whatever host port you like.
- Enter the container with `docker exec -it topologic bash`.
- If you need to install SpaCy models, enter the topologic virtual environment first: `source /var/lib/topologic/topologic_env/bin/activate`.
- If you are exposing the container at a hostname other than `localhost`, edit `/etc/topologic/global_settings.ini` inside the container (or mount a replacement in) and set `server_name` to that hostname before training. The value is baked into each model's `appConfig.json` at training time so the frontend knows where to reach the API.
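As a sketch of that hostname change (the `[webserver]` section name below is an assumption; keep whatever section `server_name` actually lives under in your `global_settings.ini`):

```ini
; Excerpt of /etc/topologic/global_settings.ini -- section name is illustrative
[webserver]
server_name = topics.example.org
```

Since the value is baked in at training time, models trained before the edit will still point at the old hostname.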
- Edit `/etc/topologic/global_settings.ini` with the web configuration. There is no separate database server to configure; each trained model is stored as a DuckDB file under its web-app directory.
- Run the `install.sh` script (it will install uv automatically if it isn't already on your system; uv then manages Python 3.12 and the project virtual environment). Pass `--cpu` or `--cuda` to pick the torch backend; without a flag, the script auto-detects via `nvidia-smi`.
- If your OS uses systemd, use the `topologic.service` template in `api_server/topologic.service` to start the API server.
- The install includes Gunicorn, which serves the API. Start it from the shell script installed in `/var/lib/topologic/api_server/`, adjusting paths and ports for your setup.
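The shipped `topologic.service` template already covers the systemd case, but as a rough sketch of what such a unit looks like (the user name, start-script name, and unit contents here are assumptions; adapt the real template rather than this sketch):

```ini
# Hypothetical sketch only; prefer the shipped api_server/topologic.service template.
[Unit]
Description=TopoLogic API server (Gunicorn)
After=network.target

[Service]
# User and script name are assumptions; the install places a start script
# under /var/lib/topologic/api_server/ (adjust its paths/ports first).
User=topologic
WorkingDirectory=/var/lib/topologic/api_server
ExecStart=/bin/bash /var/lib/topologic/api_server/start_gunicorn.sh
Restart=on-failure

[Install]
WantedBy=multi-user.target
```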
- Copy `topologic_config.ini` from `/var/lib/topologic/config` to your working directory and edit it.
- Run the `topologic` executable, passing the config and the number of workers, e.g. `topologic --config=topologic_config.ini --workers=32`.
Each trained model emits two config files next to the web app:
- `appConfig.build.json`: values Vite bakes into the bundle (just the deployment path). Editing it requires a rebuild.
- `appConfig.json`: runtime config fetched by the browser on page load, including the API server URL, display name, metadata fields to show, per-DB citation styling, and time-series bounds. Edits take effect on page reload; no rebuild is needed.
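For orientation, a hypothetical `appConfig.json` covering those runtime fields might look like the following (every key name and value here is an illustrative assumption; check a generated file for the real schema):

```json
{
  "apiServerUrl": "http://localhost:8080",
  "displayName": "My Corpus",
  "metadataFields": ["author", "year", "journal"],
  "citationStyle": { "italicizeTitle": true },
  "timeSeriesBounds": { "min": 1990, "max": 2024 }
}
```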
If you run out of memory when processing the text files, lower the worker count (`--workers`). This reduces the chance of data accumulating in RAM while waiting to be written out to disk.