Gopi Krishna Tummala

Tag: gpu

All the articles with the tag "gpu".

  • Advanced MLOps & Production
    40 MIN READ

    Life of a Tensor: A Deep Dive into Production Inference

    A deep dive into production inference optimization that traces the path of a request through LLM and diffusion model serving systems and examines the bottlenecks from gateway to GPU kernel execution.