A knowledge graph built on Neo4j for call-transcript intelligence at a digital marketing agency. The build was first attempted with the create-context-graph scaffolding tool, which crashed, and was then completed manually.
transcript-kg/
    data/ontology.yaml     # Custom domain ontology (9 entity types, 13 relationship types)
    cypher/schema.cypher   # Neo4j constraints and indexes
    ingest_zts_data.py     # Main ingestion script - loads all ZTS data into Neo4j
    query_graph.py         # 12 exploration/analytics queries
    chat_interface.py      # NL-to-Cypher chat via Gemini 2.5 Flash
    .env.example           # Environment variable template
    backend/               # Scaffolded FastAPI backend (from create-context-graph, partial)
    frontend/              # Scaffolded Next.js frontend (from create-context-graph, partial)
    Makefile               # Build targets from scaffold
    docker-compose.yml     # Docker compose from scaffold
    EVAL-NOTES.md          # Detailed evaluation notes
- 29,225 nodes: 13,759 ActionItems, 8,378 Topics, 2,179 Analyses, 1,477 Meetings, 1,336 Summaries, 1,041 Persons, 738 SlackChannels, 189 Domains, 128 Clients
- ~45,240 relationships across 13 types
- All 1,477 meetings loaded (deterministic, no LLM extraction)
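
As a quick consistency check, the per-label counts above do sum to the stated total of 29,225 nodes. The sketch below encodes them so they can be compared against whatever the database reports after ingestion. The singular label spellings (for example `ActionItem`) and the `diff_counts` helper are assumptions made for illustration; they are not taken from the repository.

```python
# Expected node counts per label, copied from the statistics above.
# The singular label spellings are an assumption; adjust them to match ontology.yaml.
EXPECTED_COUNTS = {
    "ActionItem": 13_759,
    "Topic": 8_378,
    "Analysis": 2_179,
    "Meeting": 1_477,
    "Summary": 1_336,
    "Person": 1_041,
    "SlackChannel": 738,
    "Domain": 189,
    "Client": 128,
}


def diff_counts(expected, actual):
    """Return {label: (expected, actual)} for every label whose counts differ."""
    labels = set(expected) | set(actual)
    return {
        label: (expected.get(label, 0), actual.get(label, 0))
        for label in sorted(labels)
        if expected.get(label, 0) != actual.get(label, 0)
    }


total = sum(EXPECTED_COUNTS.values())
print(f"expected total nodes: {total:,}")  # prints "expected total nodes: 29,225"
```

Passing the label/count pairs returned by a count query as `actual` would yield an empty dict when the load matches these figures.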
- Python 3.11+
- Neo4j instance running (tested with Neo4j 5.x)
- Data export at /workspace/kg_export/ (or update paths in scripts)
- Gemini API key (for chat interface only)
cd /workspace/kg-1/transcript-kg
# 1. Create and activate virtual environment
python3 -m venv .venv
source .venv/bin/activate
# 2. Install dependencies
pip install neo4j google-generativeai pyyaml python-dotenv
# 3. Create .env from template
cp .env.example .env
# Edit .env with your values:
# NEO4J_URI=bolt://localhost:7687
# NEO4J_USERNAME=neo4j
# NEO4J_PASSWORD=your-password
# GEMINI_API_KEY=your-gemini-key
# 4. Run schema setup (creates constraints and indexes)
# You can run this manually in Neo4j Browser, or:
python3 -c "
from neo4j import GraphDatabase
driver = GraphDatabase.driver('bolt://localhost:7687', auth=('neo4j', 'your-password'))
with driver.session() as session, open('cypher/schema.cypher') as f:
    for stmt in f.read().split(';'):
        stmt = stmt.strip()
        if stmt:
            session.run(stmt)
driver.close()
print('Schema created')
"
# 5. Run data ingestion (reads from /workspace/kg_export/)
python3 ingest_zts_data.py
# 6. Run exploration queries
python3 query_graph.py
# 7. Start chat interface
python3 chat_interface.py
- Ingestion: Ran ingest_zts_data.py, which loaded all 1,477 meetings, 1,041 persons, and 128 clients with full relationship structure. Uses MERGE for deduplication.
- Query verification: Ran query_graph.py, which executes 12 Cypher queries covering client analytics, employee analytics, content analytics, and cross-entity traversals.
- Chat interface: Tested NL-to-Cypher via Gemini 2.5 Flash with questions like "Which clients have the most meetings?" and "Show me action items from meetings with Kelly Langley."
- Neo4j Browser: Verified graph visually at http://localhost:7474.
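
The MERGE-based deduplication mentioned for ingestion can be sketched roughly as follows. This is not the code in ingest_zts_data.py: the property names (`id`, `title`, `name`), the `FOR_CLIENT` relationship type, and the `merge_meeting` helper are all hypothetical and chosen only to illustrate the pattern. The `session` argument is expected to be a Neo4j driver session, whose `run` method accepts query parameters as keyword arguments.

```python
# Hypothetical Cypher: the property names and the FOR_CLIENT relationship type
# are illustrative and not taken from cypher/schema.cypher.
MERGE_MEETING = """
MERGE (m:Meeting {id: $meeting_id})
SET m.title = $title
MERGE (c:Client {name: $client_name})
MERGE (m)-[:FOR_CLIENT]->(c)
"""


def merge_meeting(session, meeting):
    """Upsert one meeting, its client, and the link between them."""
    session.run(
        MERGE_MEETING,
        meeting_id=meeting["id"],
        title=meeting["title"],
        client_name=meeting["client"],
    )
```

Because MERGE matches an existing node or relationship before creating one, calling this twice with the same record should leave a single Meeting, a single Client, and a single relationship, which is what makes re-running the ingestion safe.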
# After setup, verify node counts:
python3 -c "
from neo4j import GraphDatabase
d = GraphDatabase.driver('bolt://localhost:7687', auth=('neo4j','your-password'))
with d.session() as s:
    r = s.run('MATCH (n) RETURN labels(n)[0] as label, count(*) as cnt ORDER BY cnt DESC')
    for rec in r: print(f'{rec[\"label\"]}: {rec[\"cnt\"]}')
d.close()
"
# Run queries:
python3 query_graph.py
# Interactive chat:
python3 chat_interface.py
# Then type: "What are the top 5 clients by meeting count?"
- .venv/ - Python virtual environment (python3 -m venv .venv && pip install neo4j google-generativeai pyyaml python-dotenv)
- .env - Copy from .env.example and fill in credentials
- node_modules/ - Frontend deps (cd frontend && npm install)
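
For readers curious how an NL-to-Cypher chat step of this kind is commonly structured, here is a minimal sketch of the two pure-Python pieces around the model call: composing the prompt and cleaning the reply. It is an assumption about the general approach, not a description of chat_interface.py. The `SCHEMA_HINT` string, the `FOR_CLIENT` relationship name, and both helper functions are hypothetical, and the call to Gemini itself is deliberately left out.

```python
import re

# Hypothetical schema summary; a real one would be derived from data/ontology.yaml.
SCHEMA_HINT = "(:Meeting)-[:FOR_CLIENT]->(:Client)"  # relationship name is illustrative

# Three backticks, built programmatically so this example nests cleanly in docs.
FENCE = "`" * 3
_FENCED = re.compile(FENCE + r"(?:cypher)?\s*(.*?)" + FENCE, re.DOTALL)


def build_prompt(question, schema_hint=SCHEMA_HINT):
    """Compose the text sent to the model for one user question."""
    return (
        "You translate questions into a single read-only Cypher query.\n"
        f"Graph schema: {schema_hint}\n"
        f"Question: {question}\n"
        "Cypher:"
    )


def extract_cypher(reply):
    """Strip an optional Markdown code fence from the model's reply."""
    match = _FENCED.search(reply)
    return (match.group(1) if match else reply).strip()
```

The text returned by `extract_cypher` would then be passed to the Neo4j session, and the resulting rows shown to the user.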