Vista: A Generalizable Driving World Model with High Fidelity and Versatile Controllability
Last updated on May 22, 2024, at 08:00 AM PDT. The Chrome browser is recommended.
1. High-Fidelity Open-World Prediction
Videos in this section are 5 seconds long, at 10 Hz and 576×1024 resolution.
[48 video clips, each showing a realistic driving view.]
2. Continuous Long-Horizon Rollout
Videos in this section are 16 seconds long, at 10 Hz and 576×1024 resolution.
[16 video clips, each showing a realistic driving view.]
3. Zero-Shot Action Controllability
In this section, we use either [trajectory] or [angle+speed] to control the ego-vehicle. Hover the mouse over a video to see its action type; for clarity of demonstration, the displayed action types are derived from [trajectory] and [angle+speed].
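To make the idea of deriving a coarse action type from the two control formats concrete, here is a minimal Python sketch. It is not Vista's implementation: the function names, the ego-frame convention (x forward, y left, in meters), the sign convention (positive angle means left), and all thresholds are assumptions chosen only for illustration.

```python
import math
from typing import List, Tuple

# Hypothetical thresholds, chosen for illustration only.
TURN_THRESHOLD_DEG = 10.0   # bearing or steering angle beyond which we call it a turn
STOP_DISTANCE_M = 0.5       # total displacement below which we call it a stop
STOP_SPEED_MPS = 0.5        # speed below which we call it a stop


def action_from_trajectory(waypoints: List[Tuple[float, float]]) -> str:
    """Label a trajectory by the bearing of its final waypoint.

    Assumes waypoints are (x, y) in the ego frame, starting from the origin,
    with x pointing forward and y pointing left.
    """
    if not waypoints:
        raise ValueError("trajectory must contain at least one waypoint")
    x_end, y_end = waypoints[-1]
    if math.hypot(x_end, y_end) < STOP_DISTANCE_M:
        return "stop"
    bearing_deg = math.degrees(math.atan2(y_end, x_end))
    if bearing_deg > TURN_THRESHOLD_DEG:
        return "turn left"
    if bearing_deg < -TURN_THRESHOLD_DEG:
        return "turn right"
    return "go straight"


def action_from_angle_speed(angle_deg: float, speed_mps: float) -> str:
    """Label an [angle+speed] command (positive angle assumed to mean left)."""
    if abs(speed_mps) < STOP_SPEED_MPS:
        return "stop"
    if angle_deg > TURN_THRESHOLD_DEG:
        return "turn left"
    if angle_deg < -TURN_THRESHOLD_DEG:
        return "turn right"
    return "go straight"


if __name__ == "__main__":
    print(action_from_trajectory([(5.0, 0.0), (10.0, 0.0)]))
    print(action_from_trajectory([(5.0, 1.0), (8.0, 5.0)]))
    print(action_from_angle_speed(-15.0, 5.0))
```

Under these assumptions, both control formats map to the same small set of labels, which is one plausible way a demo page could show a readable action type regardless of which control format drives the model.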