- Max compute: <15 compute units (Google Colab A100)
- Expected: ~3-5 CU for 3 epochs on NL2Bash (~10k examples)
- Primary: eval_loss (lower is better)
- Qualitative: 7 NL->shell test prompts in `prepare.py:EVAL_PROMPTS`
- Success: model produces syntactically valid shell commands for >=5/7 prompts (see the validity-check sketch below)
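
One way to score the qualitative check is `bash -n`, which asks bash to parse a command without executing it. A minimal sketch, assuming the 7 completions have already been generated as plain strings (the `outputs` list and `passes_eval` helper are hypothetical, not part of `prepare.py`):

```python
import subprocess

def is_valid_shell(cmd: str) -> bool:
    # bash -n parses the command but does not execute it;
    # exit code 0 means the syntax is valid.
    result = subprocess.run(["bash", "-n", "-c", cmd], capture_output=True)
    return result.returncode == 0

def passes_eval(outputs: list[str], threshold: int = 5) -> bool:
    # outputs: one generated command per prompt in prepare.py:EVAL_PROMPTS
    valid = sum(is_valid_shell(cmd) for cmd in outputs)
    print(f"{valid}/{len(outputs)} syntactically valid")
    return valid >= threshold
```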
- HuggingFace: AryaYT/nl2shell-0.8b
- Artifacts: merged model + GGUF (q4_k_m, q8_0); see the export sketch below
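
A minimal export sketch for the merged-model artifact, assuming `train.py` saves a PEFT/LoRA adapter (`ADAPTER_DIR` is hypothetical; substitute the actual output directory). GGUF conversion is then a separate step via llama.cpp's `convert_hf_to_gguf.py` and `llama-quantize` for the q4_k_m and q8_0 variants:

```python
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

ADAPTER_DIR = "outputs/final"  # hypothetical; use train.py's real output dir

# Load the adapter on top of its base model, then fold the LoRA weights
# in so the result is a plain standalone checkpoint.
model = AutoPeftModelForCausalLM.from_pretrained(ADAPTER_DIR)
model = model.merge_and_unload()
model.save_pretrained("nl2shell-0.8b-merged")

tokenizer = AutoTokenizer.from_pretrained(ADAPTER_DIR)
tokenizer.save_pretrained("nl2shell-0.8b-merged")

# Upload the merged weights to the target repo.
model.push_to_hub("AryaYT/nl2shell-0.8b")
tokenizer.push_to_hub("AryaYT/nl2shell-0.8b")
```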
- Do NOT modify `prepare.py`; it is immutable
- Edit only `train.py` for hyperparameter tuning or bug fixes
- Close the Colab session when training completes
- If training loss plateaus, reduce the learning rate or increase epochs
- If OOM, reduce batch_size from 8 to 4 (keep grad_accum=4); both knobs are sketched below
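
Both fixes touch adjacent fields; a sketch of where they live, assuming `train.py` uses Hugging Face `TrainingArguments` (the surrounding values are illustrative, not the project's actual config):

```python
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="outputs",            # illustrative
    per_device_train_batch_size=4,   # OOM fix: dropped from 8 to 4
    gradient_accumulation_steps=4,   # unchanged; effective batch goes 32 -> 16
    learning_rate=5e-5,              # lower this if eval_loss plateaus
    num_train_epochs=3,              # or raise epochs instead
)
```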