Gopi Krishna Tummala

Tag: systems

All the articles with the tag "systems".

Advanced MLOps & Production
40 MIN READ

The Custom Kernel Craze — Handcrafting GPU Performance

Why modern AI teams are handcrafting GPU kernels—from FlashAttention to Triton code—and how silicon-level tuning is the new frontier of MLOps.

November 11, 2025

Advanced MLOps & Production
35 MIN READ

vLLM and the Trilogy of Modern LLM Scaling

How PagedAttention, Continuous Batching, Speculative Decoding, and Quantization unlock lightning-fast, reliable large language model serving.

November 10, 2025