Video Diffusion Fundamentals: The Temporal Challenge
Why video is harder than images, the DiT revolution for video, and how diffusion models learn temporal consistency. Covers V-DiT, AsymmDiT, and the mathematical foundations of video generation.
All the articles I've posted.
A deep dive into how datasets and dataloaders power modern AI—from the quiet pipeline that feeds models to the sophisticated tools that make training efficient. Understanding the hidden engine that keeps AI systems running.
Why L5 autonomy is harder than a moon landing. Understanding ODD, latency loops, compute constraints, and the probability of failure in autonomous systems.
From photons to decisions: How machines reconstruct 3D reality from 2D data. Covers cameras, IPM, radar, LiDAR, and sensor fusion in an intuitive, first-principles approach.
If you don't know where your eyes are relative to your feet, you trip. Covers intrinsics, extrinsics, SE(3) transforms, online vs. offline calibration, and time synchronization.
From GPS to centimeter accuracy: How autonomous vehicles know their exact position. Covers GNSS, IMU, wheel odometry, scan matching, and the Kalman Filter fusion that creates the "Blue Line."