By Gopi Krishna Tummala
The Story: The Most Under-Appreciated Part
The Analogy: If you don't know where your eyes are relative to your feet, you trip. If you don't know where your cameras are relative to your LiDAR, your perception fails.
Calibration is the most under-appreciated part of the autonomous stack. It's invisible when it works, catastrophic when it fails. A 1-degree error in camera-LiDAR calibration causes roughly a 17 cm error at 10 m distance. At 30 mph, that's the difference between hitting a pedestrian and missing them.
The "Oh S**t" Scenario: The Misaligned Sensor
The Failure Mode: Your vehicle has been driving for 6 months. A camera-LiDAR calibration drifts by 0.5 degrees (thermal expansion, vibration, or a minor impact). You don't notice it; the error is small.
Then you encounter a scenario: a pedestrian is 20 m ahead. Your camera sees them. Your LiDAR sees them. But because of the calibration error, when you project the LiDAR points onto the camera image, they are offset by about 17 cm.
Your fusion algorithm thinks: "The camera sees a person here, but the LiDAR sees something 17 cm away. These don't match. Must be a false positive."
Result: The pedestrian is ignored. Near-miss collision.
Why This Happens:
- Calibration drift: Sensors move relative to each other over time
- No detection: Small errors are hard to detect without explicit monitoring
- Cascading failure: Small calibration errors cause large perception errors
The Solution: Online calibration, which continuously monitors and corrects calibration errors in real time.
Intrinsics: Lens Distortion
Intrinsics describe the internal properties of a camera: how it maps 3D rays to 2D pixels.
The Pinhole Model (Ideal)
The Math:

$$ s \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K \begin{bmatrix} X \\ Y \\ Z \end{bmatrix} $$

Where $K$ is the intrinsic matrix:

$$ K = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} $$

Parameters:
- $f_x, f_y$ = focal lengths (in pixels)
- $(c_x, c_y)$ = principal point (image center, in pixels)
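To make the pinhole mapping concrete, here is a minimal numpy sketch; the focal lengths and principal point below are made-up illustrative values, not a real camera's calibration:

```python
import numpy as np

# Illustrative intrinsics (not from a real camera)
K = np.array([[1000.0,    0.0, 640.0],   # f_x, 0,   c_x
              [   0.0, 1000.0, 360.0],   # 0,   f_y, c_y
              [   0.0,    0.0,   1.0]])

def project(K, P):
    """Project a 3D camera-frame point P = (X, Y, Z) to pixel coordinates."""
    uvw = K @ P                 # homogeneous image coordinates
    return uvw[:2] / uvw[2]     # divide by depth Z

# A point 2 m ahead, 0.5 m to the right, 0.1 m up (camera frame: Z forward)
p = project(K, np.array([0.5, -0.1, 2.0]))  # array([890., 310.])
```

Note the divide-by-depth step: this is why a small angular error grows into a larger pixel offset for nearby objects.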
Lens Distortion (Real-World)
The Problem: Real lenses have distortion: straight lines in the world appear curved in the image.
Types of Distortion:
- Radial Distortion: Caused by lens shape (barrel or pincushion)
- Tangential Distortion: Caused by lens misalignment
The Math:

Radial Distortion:

$$ x_{dist} = x \,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6), \quad y_{dist} = y \,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) $$

Where:
- $r^2 = x^2 + y^2$ (distance from the image center, in normalized coordinates)
- $k_1, k_2, k_3$ = radial distortion coefficients

Tangential Distortion:

$$ x_{dist} = x + 2 p_1 x y + p_2 (r^2 + 2x^2), \quad y_{dist} = y + p_1 (r^2 + 2y^2) + 2 p_2 x y $$

Where $p_1, p_2$ = tangential distortion coefficients
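The combined radial-plus-tangential model (often called Brown-Conrady) is a few lines of code; the coefficient values in the sketch below are made up purely for illustration:

```python
def distort(x, y, k1, k2, k3, p1, p2):
    """Apply radial + tangential distortion to normalized image coords (x, y)."""
    r2 = x * x + y * y                          # squared distance from center
    radial = 1 + k1 * r2 + k2 * r2**2 + k3 * r2**3
    x_d = x * radial + 2 * p1 * x * y + p2 * (r2 + 2 * x * x)
    y_d = y * radial + p1 * (r2 + 2 * y * y) + 2 * p2 * x * y
    return x_d, y_d

# Illustrative coefficients: mild barrel distortion pulls points inward
x_d, y_d = distort(0.3, 0.2, k1=-0.1, k2=0.01, k3=0.0, p1=0.001, p2=0.001)
```

Undistortion (going the other way) has no closed form and is typically solved iteratively, which is one reason calibration toolkits precompute undistortion maps.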
The Calibration Problem: Estimate $K$ and the distortion coefficients from images of a known pattern (e.g., a checkerboard).
Extrinsics: Rigid Body Transforms
Extrinsics describe the position and orientation of one sensor relative to another (or relative to the vehicle frame).
The Transform
The Math:

$$ p_{target} = R \, p_{source} + t $$

Where:
- $p_{source}$ = point in the source frame
- $p_{target}$ = point in the target frame
- $R$ = rotation matrix (3×3)
- $t$ = translation vector (3×1)

Example: Transform a LiDAR point to the camera frame:

$$ p_{cam} = R_{cam \leftarrow lidar} \, p_{lidar} + t_{cam \leftarrow lidar} $$
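A minimal numpy sketch of this transform; the rotation (90 degrees about X) and lever arm below are illustrative, not a real calibration:

```python
import numpy as np

# Illustrative extrinsics: rotate 90 degrees about the X axis, plus a small
# mounting offset between the sensors (meters).
R = np.array([[1.0, 0.0,  0.0],
              [0.0, 0.0, -1.0],
              [0.0, 1.0,  0.0]])
t = np.array([0.1, -0.2, 0.05])

p_lidar = np.array([5.0, 1.0, 0.5])   # point measured in the LiDAR frame
p_cam = R @ p_lidar + t               # same point in the camera frame
```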
Why This Matters
The Fusion Problem: To fuse camera and LiDAR data, you need to know:
- Where is the LiDAR relative to the camera? (extrinsics)
- How does the camera project 3D to 2D? (intrinsics)
The Error Propagation:

If the calibration is off by a rotation angle $\theta$ (in radians) and a translation distance $d$, then at range $r$:
- Angular error: lateral offset $\approx r\theta$ (for small $\theta$)
- Translation error: constant offset $d$, independent of range

Example: At $r = 20$ m, if $\theta = 1° \approx 0.0175$ rad:

$$ e \approx 20 \,\text{m} \times 0.0175 \approx 0.35 \,\text{m} $$

That's roughly the width of a person. A calibration error can cause you to miss a pedestrian.
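The small-angle arithmetic is worth making explicit in code, since it is the sanity check you run on every calibration budget:

```python
import math

def lateral_error(range_m, angle_deg):
    """Approximate lateral offset e = r * theta caused by a rotation error."""
    return range_m * math.radians(angle_deg)

e = lateral_error(20.0, 1.0)   # 1-degree error at 20 m -> ~0.35 m
```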
Homogeneous Coordinates
The Problem: Rotation and translation are separate operations. This makes composition of transforms awkward.
The Solution: Homogeneous coordinates represent rotation and translation as a single matrix operation.
The Math
Homogeneous Representation:

$$ \begin{bmatrix} p_{target} \\ 1 \end{bmatrix} = T \begin{bmatrix} p_{source} \\ 1 \end{bmatrix} $$

Where the 4×4 transformation matrix $T$ is:

$$ T = \begin{bmatrix} R & t \\ 0^\top & 1 \end{bmatrix} $$

Composition of Transforms:

If you have two transforms $T_{A \to B}$ and $T_{B \to C}$:

$$ T_{A \to C} = T_{B \to C} \, T_{A \to B} $$

The Intuition: Transforming from frame A → B → C is the same as transforming A → C directly.

Example: Transform from LiDAR → Vehicle → Camera:

$$ T_{cam \leftarrow lidar} = T_{cam \leftarrow veh} \, T_{veh \leftarrow lidar} $$
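The chained composition above can be sketched with 4×4 matrices; the two transforms below use identity rotations and made-up offsets, purely for illustration:

```python
import numpy as np

def make_T(R, t):
    """Pack rotation R (3x3) and translation t (3,) into a 4x4 homogeneous matrix."""
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = t
    return T

# Illustrative mounts (identity rotations, made-up offsets in meters)
T_veh_lidar = make_T(np.eye(3), np.array([1.5, 0.0, 2.0]))    # LiDAR -> vehicle
T_cam_veh = make_T(np.eye(3), np.array([-0.2, 0.1, -1.4]))    # vehicle -> camera

# One matrix now takes LiDAR points straight to the camera frame
T_cam_lidar = T_cam_veh @ T_veh_lidar

p_lidar_h = np.array([10.0, 0.0, 0.0, 1.0])   # homogeneous LiDAR point
p_cam_h = T_cam_lidar @ p_lidar_h
```

The payoff is that an arbitrarily long chain of frames collapses into a single matrix multiply per point.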
SE(3): Lie Groups and Lie Algebras
SE(3) is the Special Euclidean Group: the set of all rigid body transforms (rotations + translations).
Why SE(3) Matters
The Problem: How do you optimize over rotations? Rotation matrices have constraints:
- $R^\top R = I$ (orthonormal)
- $\det(R) = +1$ (no reflection)
These constraints make optimization difficult.
The Solution: Lie Groups and Lie Algebras
Lie Algebra: $\mathfrak{se}(3)$

The Math:

A transform $T \in SE(3)$ can be represented by its Lie algebra element $\xi \in \mathbb{R}^6$:

$$ \xi = \begin{bmatrix} \rho \\ \phi \end{bmatrix} $$

Where:
- $\rho \in \mathbb{R}^3$ = translation component
- $\phi \in \mathbb{R}^3$ = rotation component (axis-angle representation)

The Exponential Map:

$$ T = \exp(\xi^\wedge) $$

Where $\wedge$ is the "hat" operator that converts $\xi$ to a 4×4 matrix.

The Logarithm Map:

$$ \xi = \log(T)^\vee $$
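The rotation part of the exponential map is Rodrigues' formula for $\mathfrak{so}(3)$, sketched below; the full $\mathfrak{se}(3)$ exponential additionally couples translation through the left Jacobian, which is omitted here for brevity:

```python
import numpy as np

def hat(v):
    """Skew-symmetric 'hat' matrix of a 3-vector."""
    x, y, z = v
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])

def exp_so3(phi):
    """Rodrigues' formula: axis-angle vector phi -> rotation matrix."""
    theta = np.linalg.norm(phi)
    if theta < 1e-12:
        return np.eye(3)          # near zero rotation, exp is identity
    A = hat(phi / theta)          # unit-axis skew matrix
    return np.eye(3) + np.sin(theta) * A + (1 - np.cos(theta)) * (A @ A)

R = exp_so3(np.array([0.0, 0.0, np.pi / 2]))   # 90 degrees about Z
```

No matter what 3-vector you feed in, the output is a valid rotation matrix, which is exactly why optimizers prefer this parameterization.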
Why This Helps
Optimization: Instead of optimizing over constrained rotation matrices, you optimize over unconstrained Lie algebra elements $\xi \in \mathbb{R}^6$.
The Calibration Problem:
Minimize the reprojection error:

$$ \xi^* = \arg\min_{\xi} \sum_i \left\| \pi\big(T(\xi)\, P_i\big) - p_i \right\|^2 $$

Where:
- $P_i$ = 3D point (from LiDAR)
- $p_i$ = 2D observation (from camera)
- $\pi$ = projection function
- $T(\xi)$ = transform parameterized by the Lie algebra
Gradient-based optimization (e.g., Levenberg-Marquardt) can now optimize over $\xi$ directly.
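To see this objective in action without a full Levenberg-Marquardt implementation, the sketch below minimizes reprojection error over a single yaw parameter by brute-force search on synthetic data; the points, observations, and search grid are all made up for illustration:

```python
import numpy as np

def rot_z(yaw):
    """Rotation about Z: the yaw slice of the Lie-algebra rotation."""
    c, s = np.cos(yaw), np.sin(yaw)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def pi_proj(P):
    """Ideal pinhole projection with f = 1 and principal point at the origin."""
    return P[:2] / P[2]

# Synthetic data: observations generated with a hidden true yaw of 0.1 rad
P_world = np.array([[1.0, 0.5, 4.0], [-0.5, 0.2, 3.0], [0.3, -0.4, 5.0]])
true_yaw = 0.1
obs = np.array([pi_proj(rot_z(true_yaw) @ P) for P in P_world])

def cost(yaw):
    """Sum of squared reprojection errors for a candidate yaw."""
    return sum(float(np.sum((pi_proj(rot_z(yaw) @ P) - o) ** 2))
               for P, o in zip(P_world, obs))

yaws = np.linspace(-0.5, 0.5, 1001)
best_yaw = yaws[int(np.argmin([cost(y) for y in yaws]))]   # recovers ~0.1
```

A real calibrator replaces the grid search with gradient steps over all six components of $\xi$, but the objective being minimized is the same.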
Calibration Methods: Online vs. Offline
Offline Calibration
The Setup: Use a known calibration target (checkerboard, AprilTag, etc.) in a controlled environment.
The Process:
- Collect data: Capture images/point clouds of the calibration target from multiple viewpoints
- Extract features: Detect corners/markers in images, extract points in LiDAR
- Optimize: Minimize reprojection error to estimate calibration parameters
The Math:

$$ \min_{K,\, d,\, T} \sum_i \left\| \pi(K, d, T P_i) - p_i \right\|^2 $$

Where:
- $P_i$ = 3D point on the calibration target
- $p_i$ = observed 2D point in the image
- $\pi$ = projection function (with distortion coefficients $d$)
Pros:
- Accurate (millimeter-level precision)
- Can calibrate all parameters at once
- Well-understood, standard approach
Cons:
- Requires controlled environment
- Time-consuming (30-60 minutes)
- Doesn't handle calibration drift
Online Calibration
The Setup: Continuously estimate calibration from natural scene observations (no calibration target needed).
The Process:
- Detect correspondences: Find matching features between camera and LiDAR
- Estimate transform: Use RANSAC or optimization to find best transform
- Monitor drift: Track calibration over time, detect when it drifts
The Math:

Correspondence-based:

Given correspondences $\{(P_i, p_i)\}$:

$$ T^* = \arg\min_T \sum_i \left\| \pi(T P_i) - p_i \right\|^2 $$

RANSAC: Robustly handle outliers (false correspondences).
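Here is a minimal RANSAC sketch for the 3D-3D version of this problem (e.g., LiDAR points matched to camera-derived 3D points), using the Kabsch algorithm on minimal three-point samples; the iteration count, threshold, and synthetic data are illustrative:

```python
import numpy as np

def kabsch(A, B):
    """Best-fit R, t with B ~ R @ A + t, for N x 3 point sets A, B."""
    ca, cb = A.mean(axis=0), B.mean(axis=0)
    H = (A - ca).T @ (B - cb)
    U, _, Vt = np.linalg.svd(H)
    d = np.sign(np.linalg.det(Vt.T @ U.T))
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T   # guard against reflections
    return R, cb - R @ ca

def ransac_rigid(A, B, iters=200, thresh=0.05, seed=0):
    """Robustly estimate a rigid transform from correspondences with outliers."""
    rng = np.random.default_rng(seed)
    best_mask = np.zeros(len(A), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(A), size=3, replace=False)   # minimal sample
        R, t = kabsch(A[idx], B[idx])
        mask = np.linalg.norm(A @ R.T + t - B, axis=1) < thresh
        if mask.sum() > best_mask.sum():
            best_mask = mask
    R, t = kabsch(A[best_mask], B[best_mask])   # refit on all inliers
    return R, t, int(best_mask.sum())

# Synthetic usage: 10 correspondences, one corrupted into a gross outlier
rng = np.random.default_rng(1)
A = rng.normal(size=(10, 3))
c, s = np.cos(0.3), np.sin(0.3)
R_true = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
t_true = np.array([0.5, -0.2, 1.0])
B = A @ R_true.T + t_true
B[0] += np.array([5.0, 0.0, 0.0])              # bad correspondence
R_est, t_est, n_inliers = ransac_rigid(A, B)   # recovers R_true, t_true
```

The final refit on the inlier set is the step that restores accuracy after the minimal samples have done the outlier rejection.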
Pros:
- No calibration target needed
- Handles calibration drift
- Can run continuously
Cons:
- Less accurate than offline (centimeter-level)
- Requires good correspondences
- Computationally expensive
Production Practice: A hybrid approach is commonly used:
- Offline calibration at factory (high accuracy baseline)
- Online calibration in the field (monitor and correct drift)
Time Synchronization: PTP and Timestamps
The Problem: Different sensors capture data at different times. If you fuse camera and LiDAR data, but the camera image is from time $t$ and the LiDAR scan is from time $t + 50$ ms, you're fusing data from different moments.
The Real-World Twist: At 30 mph, you travel 2.2 feet in 50 ms. If you fuse a camera image with a LiDAR scan that's 50 ms later, objects will be misaligned by up to 2.2 feet.
PTP (Precision Time Protocol)
The Setup: All sensors synchronize to a master clock using PTP (IEEE 1588).
The Math:

Clock Synchronization: PTP exchanges four timestamps ($t_1$: Sync sent by the master, $t_2$: Sync received by the slave, $t_3$: Delay_Req sent by the slave, $t_4$: Delay_Req received by the master):

$$ \text{offset} = \frac{(t_2 - t_1) - (t_4 - t_3)}{2}, \qquad \text{delay} = \frac{(t_2 - t_1) + (t_4 - t_3)}{2} $$

Where:
- offset = clock offset of the slave relative to the master (measured via PTP)
- delay = one-way network delay (measured via PTP)

PTP achieves microsecond-level synchronization, which is good enough for sensor fusion.
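The offset/delay computation from the IEEE 1588 handshake is two lines; the timestamps in the sketch below are made-up values where the slave clock runs 1 ms ahead and the true one-way delay is 0.1 ms:

```python
def ptp_offset_delay(t1, t2, t3, t4):
    """IEEE 1588 offset/delay from the four handshake timestamps (seconds).

    t1: master sends Sync        t2: slave receives it
    t3: slave sends Delay_Req    t4: master receives it
    """
    offset = ((t2 - t1) - (t4 - t3)) / 2.0   # slave clock minus master clock
    delay = ((t2 - t1) + (t4 - t3)) / 2.0    # one-way network delay
    return offset, delay

# Illustrative values: slave 1 ms ahead, 0.1 ms one-way delay
offset, delay = ptp_offset_delay(0.0, 0.0011, 0.0020, 0.0011)
```

The derivation assumes the network delay is symmetric in both directions; asymmetric paths bias the offset estimate, which is why timing-critical setups keep sensor networks short and switched with PTP-aware hardware.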
Timestamping
The Process:
- Hardware timestamping: Each sensor timestamps data at capture time (not processing time)
- PTP synchronization: All timestamps are in the same time reference
- Temporal alignment: When fusing, align data by timestamp (not by arrival time)
The Challenge: Different sensors have different latencies:
- Camera: 10-20ms (readout + processing)
- LiDAR: 50-100ms (scan time)
- Radar: 5-10ms (processing)
The Solution: Predict forward or interpolate backward to align timestamps.
Example: If the camera image is at time $t$ and the LiDAR scan is at $t - 50$ ms:
- Option 1: Predict the LiDAR points forward to time $t$ (using a motion model)
- Option 2: Interpolate the camera image backward to time $t - 50$ ms (not possible, so use the closest frame)
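Option 1 can be sketched with a constant-velocity motion model; for a static scene, points appear to move backward as the ego vehicle moves forward. The ego velocity and time gap below are illustrative:

```python
import numpy as np

def predict_points(points, ego_velocity, dt):
    """Shift static-scene points by -v * dt to account for ego motion over dt."""
    return points - ego_velocity * dt

points_t0 = np.array([[10.0, 0.0, 0.0],    # meters, vehicle frame (X forward)
                      [20.0, 1.0, 0.0]])
v_ego = np.array([13.4, 0.0, 0.0])          # ~30 mph forward
aligned = predict_points(points_t0, v_ego, dt=0.05)   # bridge the 50 ms gap
```

Moving objects need their own velocity estimates on top of this; ego compensation alone only fixes the static background.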
The Intuition: Laser Pointer on a Unicycle
The Analogy: You're holding a laser pointer while riding a unicycle. The laser pointer jitters. Is the jitter because:
- Your hand is shaking? (sensor noise)
- The unicycle is wobbling? (vehicle motion)
- Both? (combined effect)
The Calibration Problem: Similarly, if a LiDAR point appears to move, is it because:
- The LiDAR measurement is noisy? (sensor noise)
- The vehicle is moving? (ego motion)
- The calibration is wrong? (extrinsic error)
- All of the above? (combined effect)
The Solution: Motion compensation: account for vehicle motion, then analyze the residual error to detect calibration drift.
The Math:

Motion Compensation:

$$ p_{corrected} = T_{ego}(t_1 \to t_2)\; p_{raw} $$

Where $T_{ego}(t_1 \to t_2)$ accounts for vehicle motion between timesteps $t_1$ and $t_2$.

Calibration Monitoring:

If the calibration is correct, then after motion compensation, corresponding points from camera and LiDAR should align. If they don't, the calibration has drifted.
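A toy drift monitor, assuming we already have matched camera features and motion-compensated, projected LiDAR points; the pixel values and the 2-pixel threshold are illustrative numbers to tune per camera:

```python
import numpy as np

def drift_score(projected_lidar_px, camera_feature_px):
    """Mean pixel residual between projected LiDAR points and matched features."""
    residuals = np.linalg.norm(projected_lidar_px - camera_feature_px, axis=1)
    return float(residuals.mean())

# Illustrative matches: a consistent 3-pixel shift hints at extrinsic drift
proj = np.array([[100.0, 50.0], [200.0, 80.0], [310.0, 120.0]])
feat = np.array([[103.0, 50.0], [203.0, 80.0], [313.0, 120.0]])

score = drift_score(proj, feat)
drifted = score > 2.0   # pixels; threshold tuned per camera in practice
```

The telltale signature of drift is a *systematic* residual (all points shifted the same way), as opposed to the zero-mean scatter produced by sensor noise.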
Summary: The Bedrock of Perception
Calibration is the foundation of sensor fusion:
- Intrinsics: Know how each sensor maps the world to measurements
- Extrinsics: Know where sensors are relative to each other
- Time sync: Know when each sensor captured its data
- Monitoring: Continuously verify calibration hasnβt drifted
The Path Forward:
With calibrated sensors, we can now:
- Fuse sensor data (Module 6)
- Localize the vehicle (Module 4)
- Detect objects (Module 5)
- Track them over time (Module 6)
Graduate Assignment: Camera-LiDAR Calibration
Task: Implement offline calibration between a camera and LiDAR.
Setup:
- Calibration target: Checkerboard (known 3D positions)
- Data: Camera images + LiDAR point clouds of the checkerboard
Deliverables:
- Extract checkerboard corners from camera images
- Extract checkerboard points from LiDAR scans
- Implement optimization to estimate (extrinsics) and (intrinsics)
- Visualize: Project LiDAR points onto camera image β do they align?
Extension: Implement online calibration using natural scene correspondences.
Further Reading
- Module 2: Eyes and Ears (Sensors)
- Module 6: Merging Senses (Sensor Fusion & Tracking)
- AutoCalib Research: Automatic Camera Calibration
This is Module 3 of "The Ghost in the Machine" series. Module 4 will explore localization: knowing where you are in the world with centimeter-level accuracy.