Gopi Krishna Tummala

Posts

  • Intermediate MLOps & Production
    30 MIN READ

    ML Pipeline Orchestration: Temporal, Airflow, Kubeflow, Ray — Which Layer Does What

    A precise mental model for ML pipeline orchestration that maps each tool to its layer: durable backend workflows (Temporal), data schedulers (Airflow, Prefect, Dagster), ML-native pipeline frameworks (Kubeflow, Metaflow, ZenML), and distributed compute engines (Ray). Built for engineers who need to answer 'design an ML pipeline' in interviews. Includes 2025-2026 updates: Airflow 3, KFP v2, Ray 2.x, MLflow 3.

  • Advanced MLOps & Production
    45 MIN READ

    Training Frameworks: ZeRO, FSDP, and the Memory Math That Gets You Hired

    A practitioner's guide to distributed training frameworks — the memory formulas, parallelism strategies, and failure-mode reasoning that ML infra interviews actually test. Covers DDP, FSDP, DeepSpeed ZeRO, 3D parallelism, and fault tolerance.

  • Advanced MLOps & Production
    40 MIN READ

    Datasets & Dataloaders: The Art of Never Starving Your GPU

    GPU utilization is a lagging indicator; the real battle is in the data pipeline. A practitioner's deep dive into PyTorch DataLoader internals, zero-copy data pumps, WebDataset streaming, and the questions these topics raise in ML system design interviews.

  • Advanced MLOps & Production
    45 MIN READ

    Post-Training Playbook: SFT, LoRA, DPO, and GRPO from First Principles

    Pre-training gives a model knowledge; post-training gives it behavior. A practitioner's breakdown of SFT, LoRA/QLoRA, DPO, and GRPO — with the memory math, concrete configs, and interview reasoning that separates candidates who've done this from candidates who've read about it.