Gopi Krishna Tummala

Tag: dpo

All the articles with the tag "dpo".

  • Advanced MLOps & Production
    45 MIN READ

    Post-Training Playbook: SFT, LoRA, DPO, and GRPO from First Principles

    Pre-training gives a model knowledge; post-training gives it behavior. A practitioner's breakdown of SFT, LoRA/QLoRA, DPO, and GRPO — with the memory math, concrete configs, and interview reasoning that separates candidates who've done this from candidates who've read about it.