Gopi Krishna Tummala

Posts

All the articles I've posted.

  • Advanced MLOps & Production
    25 MIN READ

    The Infrastructure-First MLOps Roadmap: From Data DNA to Agentic AI

    Standard MLOps advice tells you to learn Git and Docker. But for the next generation of AI Engineers, that's just the baseline. This roadmap focuses on the Infrastructure Round: a deep dive into how data is structured for speed, how it's fed into models, how those models scale across clusters, and how we squeeze every drop of performance out of the silicon.

  • Advanced MLOps & Production
    40 MIN READ

    Life of a Tensor: A Deep Dive into Production Inference

    A comprehensive deep dive into production inference optimization, tracing the path of a request through LLM and diffusion model serving systems and the bottlenecks it meets from gateway to GPU kernel execution.

  • Advanced MLOps & Production
    35 MIN READ

    vLLM and the Trilogy of Modern LLM Scaling

    How PagedAttention, Continuous Batching, Speculative Decoding, and Quantization unlock lightning-fast, reliable large language model serving.