This repo scaffolds data preparation + SFT + DPO training for a multi-turn Motivational Interviewing (MI) style coach model using Hugging Face Transformers + TRL.
- `data/`: schema + a tiny synthetic example dataset (JSONL) to verify the pipeline.
- `scripts/data_prep.py`: normalize MI datasets (MI-TAGS / AnnoMI / MI-Dataset) into a common JSONL format.
- `scripts/sft_train.py`: supervised fine-tuning with TRL `SFTTrainer`. Supports LoRA + 4-bit.
- `scripts/dpo_train.py`: preference optimization with TRL `DPOTrainer`.
- `scripts/infer_demo.py`: run inference with memory + MI persona prompt.
- `scripts/metrics_mi.py`: simple MI-style behavioral metrics (coverage of open questions, reflections, affirmations, etc.).
- `configs/*.yaml`: example hyperparameters.
⚠️ You must provide your own dataset paths for MI-TAGS / AnnoMI etc. The included `example_mi_dialogs.jsonl` is just for smoke tests (not for real training).
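To illustrate the kind of behavioral metric `scripts/metrics_mi.py` computes, here is a tag-coverage sketch over unified records. The `mi_tag_coverage` helper below is hypothetical, not the repo's actual implementation:

```python
from collections import Counter

def mi_tag_coverage(records, tags=("open_question", "reflection_simple", "affirm")):
    """Fraction of coach turns carrying each MI tag (illustrative only)."""
    counts, total = Counter(), 0
    for rec in records:  # e.g. dicts parsed from the unified JSONL
        total += 1
        for t in set(rec.get("mi_tags", [])) & set(tags):
            counts[t] += 1
    return {t: (counts[t] / total if total else 0.0) for t in tags}
```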
cd /mnt
wget https://repo.anaconda.com/miniconda/Miniconda3-latest-Linux-x86_64.sh
bash Miniconda3-latest-Linux-x86_64.sh -b -p /mnt/miniconda3
echo 'export PATH="/mnt/miniconda3/bin:$PATH"' >> ~/.bashrc
source ~/.bashrc
conda create -n gptcoach python=3.10
conda activate gptcoach
pip install -r /mnt/gptcoaching_mi_training/requirements.txt
pip install -U transformers datasets accelerate trl peft bitsandbytes torch torchvision torchaudio
# If CUDA is not available, install CPU wheels for torch.

Each line of the unified dataset is a dict:
{
"dialog_id": "string",
"turn_id": 7,
"user_utt": "string",
"coach_utt": "string",
"mi_tags": ["open_question","reflection_simple","affirm"],
"state_before": {}, # optional structured state (goal, barriers, wearable stats, etc.)
"state_after": {} # optional updated state
}

You can include additional fields; unknown keys are ignored by the loader.
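A minimal parsing sketch for this schema, following the description above (the `parse_mi_record` helper is hypothetical, not the repo's loader):

```python
import json

REQUIRED = {"dialog_id", "turn_id", "user_utt", "coach_utt"}

def parse_mi_record(line):
    """Parse one JSONL line, keeping known keys and ignoring the rest."""
    rec = json.loads(line)
    missing = REQUIRED - rec.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    return {
        "dialog_id": rec["dialog_id"],
        "turn_id": int(rec["turn_id"]),
        "user_utt": rec["user_utt"],
        "coach_utt": rec["coach_utt"],
        "mi_tags": rec.get("mi_tags", []),        # optional
        "state_before": rec.get("state_before", {}),  # optional
        "state_after": rec.get("state_after", {}),    # optional
    }
```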
Set your dataset file paths and run:
python scripts/data_prep.py --annomi_csv ./data/AnnoMI-full.csv --out_jsonl data/mi_unified_from_annomi_full.jsonl
python scripts/data_prep.py --annomi_csv ./data/AnnoMI-simple.csv --out_jsonl data/mi_unified_from_annomi_simple.jsonl

python scripts/sft_train.py \
--model_name_or_path Qwen/Qwen2.5-3B-Instruct \
--train_file data/mi_unified_from_annomi_full.jsonl \
--eval_file data/mi_unified_from_annomi_simple.jsonl \
--output_dir outputs/qwen2p5-3b-mi-sft \
--num_train_epochs 3 \
--per_device_train_batch_size 20 \
--lr 2e-5 \
--eval_steps 200 \
--save_steps 200 \
--wandb --wandb_project mi-coach-sft --wandb_run_name qwen2p5_3b_sft \
--bnb_4bit --lora

The next step generates both positive and negative samples (MI-style coaching responses):
python scripts/make_dpo_prefs_v2.py \
--sft_file data/mi_unified_from_annomi_full.jsonl \
--out_file data/mi_prefs.jsonl \
--seed 123 \
--max_samples 5000
Prepare a JSONL with pairs of responses (chosen vs rejected) per context/turn.
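TRL's `DPOTrainer` conventionally consumes `prompt` / `chosen` / `rejected` fields; a sketch of assembling one preference record (the field names are an assumption here, check `make_dpo_prefs_v2.py` for the actual schema):

```python
import json

def make_pref_record(context, chosen, rejected):
    """Assemble one preference pair in the prompt/chosen/rejected layout."""
    return {
        "prompt": context,
        "chosen": chosen,      # MI-consistent coach response
        "rejected": rejected,  # non-MI response (e.g. unsolicited advice)
    }

line = json.dumps(make_pref_record(
    "Client: I want to be more active but I'm too busy.",
    "What makes being active feel important to you right now?",
    "You just need to wake up earlier and go to the gym.",
))
```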
python scripts/dpo_train.py \
--model_name_or_path outputs/qwen2p5-3b-mi-sft/checkpoint-510 \
--pref_file data/mi_prefs.jsonl \
--output_dir runs/qwen2p5-3b-mi-dpo \
--num_train_epochs 3 \
--per_device_train_batch_size 1 \
--lr 2e-5 \
--logging_steps 10 \
--save_steps 200 \
--lora --bnb_4bit \
--wandb --wandb_project mi-coach-dpo --wandb_run_name qwen2p5_3b_dpo

export HF_HOME=/mnt/.cache/huggingface
export TRANSFORMERS_CACHE=/mnt/.cache/huggingface/transformers
# merged model dir from your DPO (or SFT) run
export MODEL_PATH=/mnt/gptcoaching_mi_training/runs/qwen2p5-3b-mi-dpo-merged
uvicorn scripts.app_demo:app --host 0.0.0.0 --port 8000 --reload
# POST http://localhost:8000/chat
# body:
# {
# "history": [{"user":"I want to be more active.","coach":"What matters most about being active for you?"}],
# "user_msg":"I'm too busy this week."
# }

Kerrio.AI implements a Mayo Clinic-inspired 7-stage clinical journey for cognitive optimization. This is NOT a chatbot; it is a diagnostic-first digital cognitive clinic.
- Accurate diagnosis is the foundation of effective treatment
- Understanding is a prerequisite for permanent change
- Client History and Clinician's Notes are maintained separately
- Registration - Client validated as invited guest
- History Collection - Three Pillars (History, Psychology/Philosophy, Physiology)
- Consultation - Clarify ambiguities, uncover blind spots
- Diagnosis - Build Cognitive Wiring Map, explain WHY the problem exists
- Proposal - Personalized treatment plan based on diagnosis
- Treatment - Cognitive Rewiring Maps (Patent Pending)
- Monitoring - Longitudinal progress assessment
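The linear progression through these stages can be sketched as follows (stage names are taken from the list above; the `advance` helper is hypothetical, not the repo's `KerriJourneyManager` API):

```python
from enum import Enum

class Stage(Enum):
    REGISTRATION = 1
    HISTORY_COLLECTION = 2
    CONSULTATION = 3
    DIAGNOSIS = 4
    PROPOSAL = 5
    TREATMENT = 6
    MONITORING = 7

def advance(stage: Stage) -> Stage:
    """Move to the next stage; Monitoring is terminal."""
    if stage is Stage.MONITORING:
        return stage
    return Stage(stage.value + 1)
```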
# 1. Set environment variables
export HF_HOME=/mnt/.cache/huggingface
export TRANSFORMERS_CACHE=/mnt/.cache/huggingface/transformers
export MODEL_PATH=/mnt/gptcoaching_mi_training/runs/qwen2p5-3b-mi-dpo-merged
# 2. Start the server
uvicorn scripts.app_demo:app --host 0.0.0.0 --port 8000 --reload
# 3. Open browser
# http://localhost:8000/

- `POST /api/chat`: send a message and get an AI response. Body: `{"user_id": "demo_user", "user_msg": "I feel stuck in my career"}`
- `GET /api/journey/{user_id}`: get current journey status and stage
- `POST /api/journey/advance`: advance to the next stage (if requirements are met)
- `GET /api/journey/prompts/{user_id}`: get suggested prompts for the current stage
- `GET /api/journey/history/{user_id}`: get the client's collected history across the three pillars
- `GET /api/journey/notes/{user_id}`: get the clinician's notes (AI observations)
- `GET /api/journey/diagnosis/{user_id}`: generate/retrieve the diagnosis
- `POST /api/journey/diagnosis/confirm/{user_id}`: confirm understanding of the diagnosis
- `GET /api/journey/treatment/{user_id}`: get the treatment proposal with Cognitive Rewiring Map
- `POST /api/journey/treatment/accept/{user_id}`: accept the treatment plan
- `POST /api/journey/treatment/progress/{user_id}`: update treatment progress
- `GET /api/journey/videos`: get the full educational video library
- `GET /api/journey/videos/{video_id}`: get details for a specific video
- `GET /api/journey/full-profile/{user_id}`: complete client profile with all data
- `POST /api/cogmap`: build a cognitive map from a session
- `GET /api/map/{user_id}`: get the cognitive wiring map for a user
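A minimal client sketch for `POST /api/chat` (the payload fields follow the example above; the base URL and the use of `urllib` are assumptions):

```python
import json
import urllib.request

def chat_request(base_url, user_id, user_msg):
    """Build a POST request for the /api/chat endpoint."""
    payload = json.dumps({"user_id": user_id, "user_msg": user_msg}).encode("utf-8")
    return urllib.request.Request(
        f"{base_url}/api/chat",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request("http://localhost:8000", "demo_user", "I feel stuck in my career")
# urllib.request.urlopen(req) would send it once the server is running
```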
The web UI (web/index.html) includes:
- Journey Progress Bar - Shows current stage (Registration → Monitoring)
- Chat Interface - Conversational interaction with Kerrio
- Three Pillars Panel - View collected history across:
- History Pillar (life events, patterns)
- Psychology/Philosophy Pillar (beliefs, values)
- Physiology Pillar (sleep, stress, health)
- Diagnosis Panel - View:
- Core Constraints
- Bottlenecks
- Root Causes
- Explanation
- Recommended Educational Videos
- "I Understand My Diagnosis" confirmation button
- Treatment Panel - View:
- Current Wiring patterns
- Target Wiring (desired state)
- Rewiring Steps
- Progress bar
- "Accept Treatment Plan" button
- Cognitive Map Visualization - Interactive graph with Cytoscape.js
# Run the kerrio_journey.py module directly for testing
python scripts/kerrio_journey.py
# This will:
# - Create a test profile
# - Show the stage-specific system prompt
# - Test the Diagnostic Engine
# - Display sample diagnosis output

| File | Description |
|---|---|
| `scripts/kerrio_journey.py` | Core journey management, data structures, diagnostic & rewiring engines |
| `scripts/app_demo.py` | FastAPI server with all endpoints |
| `scripts/cogmap_utils.py` | Heuristic cognitive map builder |
| `web/index.html` | Full-featured web interface |
| `runs/kerrio_profiles/` | Persistent client profile storage (JSON) |
┌─────────────────────────────────────────────────────────────────┐
│ Web Interface │
│ (Chat, Journey Bar, Three Pillars, Diagnosis, Treatment) │
└─────────────────────────────────────────────────────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ FastAPI (app_demo.py) │
│ - /api/chat, /api/journey/*, /api/cogmap, /api/map/* │
└─────────────────────────────────────────────────────────────────┘
│
┌───────────────────┼───────────────────┐
▼ ▼ ▼
┌─────────────────┐ ┌─────────────────┐ ┌─────────────────┐
│ KerriJourney │ │ DiagnosticEngine│ │ CognitiveRewiring│
│ Manager │ │ │ │ Engine │
│ - 7 stages │ │ - Root causes │ │ - Rewiring maps │
│ - 3 pillars │ │ - Bottlenecks │ │ - Treatment steps│
│ - Profile I/O │ │ - Video recs │ │ │
└─────────────────┘ └─────────────────┘ └─────────────────┘
│ │ │
└───────────────────┴───────────────────┘
│
▼
┌─────────────────────────────────────────────────────────────────┐
│ LLM (Qwen/fine-tuned model) │
│ Stage-specific system prompts │
└─────────────────────────────────────────────────────────────────┘
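The "stage-specific system prompts" at the bottom of the diagram could be assembled along these lines (the persona text and `STAGE_GUIDANCE` entries are illustrative assumptions, not the repo's actual prompts):

```python
# Hypothetical stage-to-guidance mapping; real guidance lives in kerrio_journey.py
STAGE_GUIDANCE = {
    "Registration": "Welcome the client and confirm their invitation.",
    "Diagnosis": "Explain WHY the problem exists using the Cognitive Wiring Map.",
}

def build_system_prompt(stage: str) -> str:
    """Compose a persona prompt plus stage-specific guidance for the LLM."""
    base = "You are Kerrio, a diagnostic-first cognitive clinician."
    guidance = STAGE_GUIDANCE.get(stage, "")
    return f"{base}\nCurrent stage: {stage}.\n{guidance}".strip()
```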