Vista: A Generalizable Driving World Model with
High Fidelity and Versatile Controllability

Last updated on May 22, 2024, 08:00 AM PDT.
The Chrome browser is recommended.

1. High-Fidelity Open-World Prediction

Videos in this section are 5 seconds long, at 10 Hz and 576×1024 resolution.

2. Continuous Long-Horizon Rollout

Videos in this section are 16 seconds long, at 10 Hz and 576×1024 resolution.

3. Zero-Shot Action Controllability

In this section, we use either [trajectory] or [angle+speed] to control the ego-vehicle.
Hover the mouse over a video to see its action type. For clarity of demonstration, the displayed action types are derived from the [trajectory] and [angle+speed] inputs.
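
As a rough illustration of how a coarse action type could be derived from the two control formats, here is a minimal Python sketch. It is not the procedure used for this page: the function names, the label strings ("turn left", "go straight", "turn right", "stop"), the thresholds, and the sign conventions (ego frame with x forward and y to the left, positive angle meaning left) are all assumptions made for the example.

```python
import math
from typing import List, Tuple


def label_from_trajectory(
    waypoints: List[Tuple[float, float]],
    lateral_thresh_m: float = 1.0,  # hypothetical threshold
    stop_thresh_m: float = 0.5,     # hypothetical threshold
) -> str:
    """Label a trajectory given as ego-frame (x, y) waypoints in meters.

    Assumed convention: x points forward, y points left.
    """
    if not waypoints:
        raise ValueError("waypoints must not be empty")
    x_end, y_end = waypoints[-1]
    # Almost no displacement at the final waypoint: treat as stopped.
    if math.hypot(x_end, y_end) < stop_thresh_m:
        return "stop"
    # Otherwise classify by the lateral offset of the final waypoint.
    if y_end > lateral_thresh_m:
        return "turn left"
    if y_end < -lateral_thresh_m:
        return "turn right"
    return "go straight"


def label_from_angle_speed(
    angle_deg: float,
    speed_mps: float,
    angle_thresh_deg: float = 5.0,  # hypothetical threshold
    stop_thresh_mps: float = 0.1,   # hypothetical threshold
) -> str:
    """Label an [angle+speed] command.

    Assumed convention: positive angle steers left.
    """
    if abs(speed_mps) < stop_thresh_mps:
        return "stop"
    if angle_deg > angle_thresh_deg:
        return "turn left"
    if angle_deg < -angle_thresh_deg:
        return "turn right"
    return "go straight"
```

For example, under these assumptions `label_from_trajectory([(2.0, 0.1), (8.0, 3.0)])` yields "turn left", because the final waypoint lies 3 m to the left of the starting pose.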