Demo video (played at 4x speed): segment_1.mp4
We've implemented interactive generation with Carla (0.9.15). The main components are:
- Server-side Carla, which maintains the simulation state.
- Server-side simulation script, which configures the environment, ego car, sensors, and traffic manager.
- Server-side streaming generation, which reads condition data from Carla and writes the generated frames to the video streaming server.
- Server-side video streaming server, which publishes the video stream for the client video player.
- Client-side Carla control, which controls the ego car in the simulation world with the keyboard.
- Client-side video player, which receives the generated result.
The dataflow is:
- Carla control
- Carla (configured by the simulation script)
- Streaming generation
- Video streaming server
- Video player
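The ordering above can be sketched as a chain of Python generators. This is only a toy illustration of the direction of the dataflow: every function name and data shape below is hypothetical, and none of it is the project's actual code.

```python
def control_events():
    """Stand-in for the client-side Carla control (keyboard input)."""
    yield from ["throttle", "steer_left", "brake"]


def simulate(controls):
    """Stand-in for Carla: turn each control into per-frame condition data."""
    for tick, control in enumerate(controls):
        yield {"tick": tick, "control": control}


def generate(conditions):
    """Stand-in for streaming generation: one frame per condition."""
    for condition in conditions:
        yield "frame(tick=%d, control=%s)" % (
            condition["tick"], condition["control"])


def publish(frames, sink):
    """Stand-in for the video streaming server: hand frames to the player."""
    for frame in frames:
        sink.append(frame)


received = []  # what the client video player would display
publish(generate(simulate(control_events())), received)
```

Control input travels from the client to the server, and only generated frames travel back, so the client never needs a GPU.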
The server requires:
- GPU (NVIDIA A100 recommended)
- Network accessibility
- Python 3.9 or 3.10
- Carla == 0.9.15
- mediamtx
The client requires:
- Windows or Ubuntu (the platforms supported by the Carla Python API)
- Python 3.9 or 3.10
- ffmpeg
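The Python and tool requirements above can be checked before launching anything. The following is a minimal sketch, assuming the tools are installed as executables on `PATH` under the names `ffplay` (shipped with ffmpeg) and `mediamtx`. The function name `check_requirements` is hypothetical.

```python
import shutil
import sys

# Python versions listed in the requirements above.
SUPPORTED_PYTHON = {(3, 9), (3, 10)}


def check_requirements(tools, python_version=None, which=shutil.which):
    """Return a list of human-readable problems; an empty list means OK.

    `tools` is the list of executable names expected on PATH. The names are
    assumptions about how the binaries are installed on your machine.
    """
    if python_version is None:
        python_version = sys.version_info[:2]
    problems = []
    if tuple(python_version) not in SUPPORTED_PYTHON:
        problems.append(
            "Python %d.%d is not 3.9 or 3.10" % tuple(python_version))
    for tool in tools:
        if which(tool) is None:
            problems.append("'%s' was not found on PATH" % tool)
    return problems


if __name__ == "__main__":
    # Client side: ffmpeg provides ffplay. On the server, pass ["mediamtx"].
    for problem in check_requirements(["ffplay"]):
        print("WARNING:", problem)
```

The check does not cover the GPU or the Carla installation, which still need to be verified separately.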
The interactive generative model is a reduced specification of CTSD 3.5 (smaller model size, fewer views, lower resolution), trained from scratch on autonomous driving data. The reduction lowers the overhead of model inference.
Base Model | Temporal Training Style | Prediction Style | Configs | Checkpoint Download |
---|---|---|---|---|
SD 3.5 | Diffusion forcing transformer | FIFO diffusion | Config | Checkpoint |
- Download the base model (for the VAE and text encoders) and the model checkpoint, then edit the config.
- Launch the video streaming server following the official guide.
- Launch Carla:
{CARLA_ROOT}/CarlaUE4.sh -RenderOffScreen -quality-level=Low
- Configure Carla by editing the config template, then run:
PYTHONPATH=src python src/dwm/utils/carla_simulation.py -c configs/experimental/simulation/carla_simulation_town10_nusc_3views.json --client-timeout 150
- Edit the generation config template (e.g. Carla endpoint, video streaming options), then run:
PYTHONPATH=src python src/dwm/streaming.py -c configs/experimental/streaming/ctsd_35_xs_p6_tirda_bm_nwao_streaming.json -l output/ctsd_35_xs_p6_tirda_bm_nwao_streaming -s rtsp://{VIDEO_STREAMING_ENDPOINT}/live --fps 2
- Launch the video player after the server-side streaming begins:
ffplay -fflags nobuffer -rtsp_transport tcp rtsp://{VIDEO_STREAMING_ENDPOINT}/live
- Launch the Carla control after the server-side streaming begins:
python src\dwm\utils\carla_control.py --host {CARLA_SERVER_ADDRESS} -p {CARLA_SERVER_PORT}
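The steps above start processes that depend on each other: the simulation script and the Carla control need Carla to be listening, and the video player needs the streaming server. When scripting the launch, a readiness check along the following lines may help. It is a minimal sketch that only tests whether a TCP port accepts connections. The helper name `wait_for_port` is hypothetical, and the hosts and ports in the usage note are placeholders for your own setup.

```python
import socket
import time


def wait_for_port(host, port, timeout=60.0, interval=1.0):
    """Poll until a TCP connection to (host, port) succeeds.

    Returns True once the port accepts a connection, or False if `timeout`
    seconds pass first.
    """
    deadline = time.monotonic() + timeout
    while True:
        try:
            with socket.create_connection((host, port), timeout=interval):
                return True
        except OSError:
            # Not listening yet (or unreachable): retry until the deadline.
            if time.monotonic() >= deadline:
                return False
            time.sleep(interval)
```

For example, `wait_for_port("localhost", 2000)` waits for Carla on its default RPC port, and the same call with the streaming server's host and RTSP port can gate the video player. An open port only shows that the process is listening, not that the map has finished loading, so a generous `--client-timeout` may still be needed.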
- Generation speed.
- Latency due to the denoising queue.
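The second item can be illustrated with a toy model of a FIFO denoising queue. This sketch assumes that each step enqueues one fresh noisy frame, denoises every queued frame by one level, and emits the head frame once it is clean. The queue length of 4 used in the example is hypothetical, and the project's real queue length and scheduling may differ.

```python
from collections import deque


def simulate_fifo_queue(queue_length, total_steps):
    """Toy model of a FIFO denoising queue.

    Each step enqueues one fresh noisy frame (tagged with the step at which
    its conditions were read), denoises every queued frame by one level, and
    dequeues the head frame once it is fully denoised.

    Returns a list of (output_step, condition_step) pairs.
    """
    queue = deque()  # items are [condition_step, remaining_denoise_steps]
    outputs = []
    for step in range(total_steps):
        queue.append([step, queue_length])
        for frame in queue:
            frame[1] -= 1
        if queue[0][1] == 0:
            condition_step, _ = queue.popleft()
            outputs.append((step, condition_step))
    return outputs


def queue_latency_seconds(queue_length, fps):
    """Delay between reading a condition and emitting its frame, assuming
    one queue step per output frame at the given frame rate."""
    return (queue_length - 1) / fps
```

With a hypothetical queue of 4 frames, `simulate_fifo_queue(4, 6)` returns `[(3, 0), (4, 1), (5, 2)]`: every emitted frame reflects conditions read 3 steps earlier. At `--fps 2` that corresponds to `queue_latency_seconds(4, 2) == 1.5` seconds, on top of any network and player delay.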