Gopi Krishna Tummala

Tag: ml-infrastructure

All the articles with the tag "ml-infrastructure".

Advanced MLOps & Production
40 MIN READ

Life of a Tensor: A Deep Dive into Production Inference

A deep dive into production inference optimization that traces the path of a request through LLM and diffusion model serving systems and examines the bottlenecks from the gateway to GPU kernel execution.

January 28, 2025