Gopi Krishna Tummala

Tag: pytorch

All the articles with the tag "pytorch".

  • Advanced MLOps & Production
    45 MIN READ

    Training Frameworks: ZeRO, FSDP, and the Memory Math That Gets You Hired

    A practitioner's guide to distributed training frameworks — the memory formulas, parallelism strategies, and failure-mode reasoning that ML infra interviews actually test. Covers DDP, FSDP, DeepSpeed ZeRO, 3D parallelism, and fault tolerance.

  • Advanced MLOps & Production
    40 MIN READ

    Datasets & Dataloaders: The Art of Never Starving Your GPU

    GPU utilization is a lagging indicator: the real battle is in the data pipeline. A practitioner's deep dive into PyTorch DataLoader internals, zero-copy data pumps, WebDataset streaming, and the questions these topics raise in ML system design interviews.