Gopi Krishna Tummala

Tag: ml-infrastructure

All the articles with the tag "ml-infrastructure".

  • Life of a Tensor: A Deep Dive into Production Inference

    Advanced MLOps & Production 25 min

    A deep dive into production inference optimization that traces the path of a request through LLM and diffusion model serving systems, covering the bottlenecks from the gateway to GPU kernel execution.