Phase 1 — Boot & Baselines (1–5)

1. autotrain-Llama2chat
   Why: gentle on-ramp to chat fine-tunes.
   Do: run inference, inspect tokenizer, export to GGUF.
2. NousResearch-Llama2-chat
   Why: compare a known chat baseline.
   Do: side-by-side eval vs #1 (accuracy, toxicity, latency).
3. NousResearch-Llama2-7bhf
   Why: plain 7B base; learn prompting vs. instruction.
   Do: simple domain prompts; log failure…
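
The side-by-side eval in step 2 could be organized along the lines of the sketch below. It is only an illustration under stated assumptions: `evaluate`, `compare`, the two stub models, and the tiny `EXAMPLES` set are hypothetical names and data, not part of the plan. The stubs stand in for the real checkpoints so the harness runs without downloading anything, accuracy is a crude substring match, and toxicity is left out because it would need a classifier that the plan does not name.

```python
import time
from statistics import mean
from typing import Callable, Dict, List, Tuple

# A "model" here is any callable that maps a prompt string to an output string.
GenerateFn = Callable[[str], str]
Example = Tuple[str, str]  # (prompt, expected substring in the answer)


def evaluate(generate: GenerateFn, examples: List[Example]) -> Dict[str, float]:
    """Return substring-match accuracy and mean wall-clock latency per prompt."""
    if not examples:
        raise ValueError("examples must not be empty")
    correct = 0
    latencies = []
    for prompt, expected in examples:
        start = time.perf_counter()
        output = generate(prompt)
        latencies.append(time.perf_counter() - start)
        # Crude correctness check: case-insensitive substring match.
        if expected.strip().lower() in output.strip().lower():
            correct += 1
    return {
        "accuracy": correct / len(examples),
        "mean_latency_s": mean(latencies),
    }


def compare(models: Dict[str, GenerateFn], examples: List[Example]) -> Dict[str, Dict[str, float]]:
    """Run the same examples through every model and collect the metrics."""
    return {name: evaluate(fn, examples) for name, fn in models.items()}


# Hypothetical stand-ins for models #1 and #2; swap in real generate functions.
def stub_model_a(prompt: str) -> str:
    return "Paris" if "France" in prompt else "I don't know"


def stub_model_b(prompt: str) -> str:
    answers = {"France": "Paris", "Japan": "Tokyo"}
    for country, capital in answers.items():
        if country in prompt:
            return f"The capital is {capital}."
    return "unsure"


# Hypothetical two-item eval set, for illustration only.
EXAMPLES: List[Example] = [
    ("What is the capital of France?", "Paris"),
    ("What is the capital of Japan?", "Tokyo"),
]

if __name__ == "__main__":
    results = compare({"model_1": stub_model_a, "model_2": stub_model_b}, EXAMPLES)
    for name, metrics in results.items():
        print(name, metrics)
```

To use this with real checkpoints, each stub would be replaced by a function that wraps the model's own generation call. Keeping the harness independent of any particular library means the same example set and metrics can be reused for the base model in step 3.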
