Lightweight BEV Transformer – Attention & Temperature
Completed

A tiny PyTorch codebase for playing with transformer attention on simple BEV (bird's-eye-view) grids and seeing how the softmax temperature changes attention patterns and downstream predictions.
BEV transformer attention
2025

About this project
- Generates synthetic BEV grids (32×32) with rectangles standing in for:
- cars (5×3 cells)
- pedestrians (2×2 cells)
- Trains a small transformer over BEV patches:
- patch embedding: 4×4 → 8×8 = 64 patches
- 2 encoder layers, 2 heads, 64-dim embeddings
- Uses object queries + soft-argmax detection head:
- predicts object centers (x, y in [0, 1])
- predicts class (car vs pedestrian)
- Exposes internal self-attention maps so you can:
- overlay them on the BEV grid
- compare different temperatures in the softmax
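The temperature comparison above boils down to dividing the attention logits by a scalar before the softmax. A minimal pure-NumPy sketch (illustrative only, not the repo's PyTorch code; function and variable names are mine):

```python
import numpy as np

def attention_with_temperature(q, k, v, temperature=1.0):
    """Scaled dot-product attention with an extra softmax temperature.

    Lower temperature -> sharper, near one-hot attention maps;
    higher temperature -> flatter, more diffuse maps.
    """
    d = q.shape[-1]
    scores = q @ k.T / (np.sqrt(d) * temperature)
    scores -= scores.max(axis=-1, keepdims=True)   # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ v, weights

rng = np.random.default_rng(0)
q = rng.normal(size=(4, 8))   # 4 query tokens, 8-dim
k = rng.normal(size=(6, 8))   # 6 key tokens
v = rng.normal(size=(6, 8))

_, sharp = attention_with_temperature(q, k, v, temperature=0.1)
_, flat = attention_with_temperature(q, k, v, temperature=10.0)
# Low temperature concentrates mass: each row's max weight is larger.
print(sharp.max(axis=-1))
print(flat.max(axis=-1))
```

Since the softmax's maximum probability grows monotonically as temperature shrinks, the low-temperature maps always look "peakier" when overlaid on the grid.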
The whole thing is CPU-friendly and stays under ~100k parameters.
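The soft-argmax head mentioned above can be sketched as a softmax over a score map followed by an expectation over cell coordinates, which yields differentiable (x, y) centers in [0, 1]. A hedged pure-NumPy illustration, with names and grid conventions of my own choosing rather than the repo's:

```python
import numpy as np

def soft_argmax_2d(scores):
    """Differentiable 'argmax': softmax over an H x W score map,
    then the expected (x, y) position, normalized to [0, 1]."""
    h, w = scores.shape
    flat = scores.ravel()
    flat = np.exp(flat - flat.max())               # stable softmax
    probs = (flat / flat.sum()).reshape(h, w)
    ys, xs = np.mgrid[0:h, 0:w]
    # expectation over cell centers, normalized into [0, 1]
    x = float((probs * (xs + 0.5) / w).sum())
    y = float((probs * (ys + 0.5) / h).sum())
    return x, y

# A strongly peaked 8x8 map (the patch-grid size) returns roughly
# the peak cell's center: here (5.5/8, 2.5/8).
scores = np.zeros((8, 8))
scores[2, 5] = 50.0
x, y = soft_argmax_2d(scores)
print(x, y)
```

Because the expectation is a weighted sum, gradients flow through every cell, which is what lets the detection head train end-to-end without a hard argmax.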
Gallery
attention_layer0.png (layer-0 attention maps)