Overview
Vehicle-to-everything (V2X) collaborative perception has emerged as a promising solution to address the limitations of single-vehicle perception systems. However, existing V2X datasets are limited in scope, diversity, and quality. To address these gaps, we present Mixed Signals, a comprehensive V2X dataset featuring 45.1k point clouds and 240.6k bounding boxes collected from three connected autonomous vehicles (CAVs) equipped with two different configurations of LiDAR sensors, plus a roadside unit with dual LiDARs.
Our dataset provides point clouds and bounding box annotations across 10 classes, ensuring reliable data for perception training. We provide a detailed statistical analysis of the dataset's quality and extensively benchmark existing V2X methods on it. The Mixed Signals dataset is ready to use, with precise alignment and consistent annotations across time and viewpoints. This work was a collaboration between the University of Sydney ACFR, Cornell University, INRIA ACENTAURI, and other institutions. View the preprint: arXiv:2502.14156.
Geographic Location
The data collection took place at the intersection of Abercrombie Street and Myrtle Street in Sydney, Australia, where the roadside unit is located. The three vehicles recorded LiDAR data for two hours during peak rush hour, repeatedly passing through the intersection. This allowed them to capture interactions between the vehicles and other agents on the road, such as pedestrians, cyclists, and other vehicles.

Vehicles and Devices
The Mixed Signals dataset is collected with three connected autonomous vehicles (CAVs), consisting of two electric vehicles and one urban vehicle, plus a roadside unit (RSU). (a) The small electric vehicles are outfitted with an OS1 128-beam LiDAR mounted at a 15° angle relative to the vehicle's body at a height of 1.63 meters. (b) The urban vehicle is equipped with an OS1 128-beam LiDAR at a height of 1.9 meters. (c) The RSU consists of two LiDARs: an OS1 64-beam (TOP) and an OSDome 128-beam (DOME), mounted on a pole at the intersection at a height of 2.5 meters.
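For quick reference, the sensor setup above can be summarized programmatically. The Python snippet below is a minimal sketch we provide for illustration only; the keys and field names are our own and are not part of the dataset release.

```python
# Hypothetical summary of the sensor setup described above.
# Keys and field names are illustrative only, not an official dataset format.
SENSOR_SETUP = {
    "electric_vehicle_1": {"lidar": "OS1-128",    "height_m": 1.63, "tilt_deg": 15.0},
    "electric_vehicle_2": {"lidar": "OS1-128",    "height_m": 1.63, "tilt_deg": 15.0},
    "urban_vehicle":      {"lidar": "OS1-128",    "height_m": 1.90, "tilt_deg": None},  # tilt not specified
    "rsu_top":            {"lidar": "OS1-64",     "height_m": 2.50, "tilt_deg": None},
    "rsu_dome":           {"lidar": "OSDome-128", "height_m": 2.50, "tilt_deg": None},
}
```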
Annotations

Figure: our annotations maintain consistent poses over time, highlighting their high quality.
Our annotations consist of bounding boxes parameterized by the center location, three dimensions (length, width, height), rotation (represented as a quaternion), and track IDs. To generate these annotations for each data sample, we first aggregate the point clouds of all agents in the coordinate frame of the roadside unit's top LiDAR, focusing the annotators' attention on the intersection of interest. Professional annotators then use the Segments.ai annotation tool to label objects and localize each one with a 3D bounding box. Labels fall into 10 categories: car, truck, pedestrian, bus, electric vehicle, trailer, motorcycle/bike, bicycle, portable personal mobility, and emergency vehicle.
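To make the annotation format concrete, the following Python sketch shows one plausible representation of a single bounding box record and how its corners can be recovered from the quaternion rotation. The class and field names are our own assumptions for illustration and may differ from the actual dataset schema.

```python
from dataclasses import dataclass
import numpy as np

@dataclass
class BoxAnnotation:
    """Illustrative record for one annotated object (field names are assumptions,
    not the dataset's official schema)."""
    center: np.ndarray     # (3,) box center x, y, z in the RSU top-LiDAR frame
    dims: np.ndarray       # (3,) length, width, height in meters
    quat_wxyz: np.ndarray  # (4,) rotation as a quaternion (w, x, y, z)
    track_id: int          # identity persistent across frames
    category: str          # one of the 10 classes, e.g. "car", "pedestrian"

    def rotation_matrix(self) -> np.ndarray:
        """Convert the (w, x, y, z) quaternion to a 3x3 rotation matrix."""
        w, x, y, z = self.quat_wxyz / np.linalg.norm(self.quat_wxyz)
        return np.array([
            [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
        ])

    def corners(self) -> np.ndarray:
        """Return the 8 box corners, shape (8, 3), in the RSU top-LiDAR frame."""
        # Axis-aligned corner offsets around the origin, then rotate and translate.
        signs = np.array([[sx, sy, sz]
                          for sx in (-1, 1) for sy in (-1, 1) for sz in (-1, 1)])
        local = signs * (self.dims / 2.0)
        return local @ self.rotation_matrix().T + self.center
```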
We provide the annotation instructions given to the annotators in this document.
References
In Proceedings of ICCV, 2025. View the arXiv preprint: arXiv:2502.14156
@article{luo2025mixed,
  title   = {Mixed Signals: A Diverse Point Cloud Dataset for Heterogeneous LiDAR V2X Collaboration},
  author  = {Luo, Katie Z and Dao, Minh-Quan and Liu, Zhenzhen and Campbell, Mark and Chao, Wei-Lun and Weinberger, Kilian Q and Malis, Ezio and Fremont, Vincent and Hariharan, Bharath and Shan, Mao and Worrall, Stewart and Berrio Perez, Julie Stephany},
  journal = {arXiv preprint arXiv:2502.14156},
  year    = {2025}
}