IPS300+ is the first multimodal dataset available for roadside perception tasks in large-scale urban intersection scenes. Its key features are as follows.

●The first multimodal dataset (including point clouds and images) available for roadside perception tasks in large-scale urban intersection scenes. The point clouds remain usable at ranges up to 300m.

●The most challenging dataset with the highest label density. The dataset includes 14198 frames, with an average of 319.84 labels per frame.

●3D bounding boxes are labeled at 5Hz, which provides dense ground-truth data for 3D target detection and tracking tasks.

●A feasible and affordable solution for IPS construction and a wireless approach for time synchronization and spatial calibration are provided.
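The label-density figures above can be turned into rough totals with a few lines of arithmetic. The sketch below assumes that all 14198 frames are labeled frames sampled at 5Hz; the variable names are illustrative and not part of the dataset.

```python
# Figures quoted in the feature list above.
FRAMES = 14198
AVG_LABELS_PER_FRAME = 319.84
LABEL_RATE_HZ = 5  # assumption: every counted frame is a labeled frame at 5Hz

# Approximate number of 3D boxes in the whole dataset.
total_labels = FRAMES * AVG_LABELS_PER_FRAME

# Approximate labeled recording time under the 5Hz assumption.
labeled_seconds = FRAMES / LABEL_RATE_HZ

print(f"approx. total labels: {total_labels:,.0f}")
print(f"approx. labeled duration: {labeled_seconds / 60:.1f} min")
```

Under these assumptions the dataset holds roughly 4.5 million boxes over a little more than 47 minutes of labeled data.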


●A Robosense Ruby-Lite Lidar: 80 beams; 10Hz; detection range: 230m; single echo mode; hFOV: 0°-360°; vFOV: -25° to 15°; angle resolution: 0.2°; 144000 points in a single frame.

●Two Sensing-SG5 color cameras: 30Hz; 5.44MP; CMOS: Sony IMX490RGGB; rolling shutter; HDR dynamic range: 120dB; lens hFOV: -60° to 60°; lens vFOV: -32° to 32°; lens focal length: 4.57mm; image size: 1920×1080. The two cameras form a binocular system with a baseline of 1.5m.

●A BS-280 GPS used for positioning and time synchronization; timing error: less than 1µs. (Since the IPUs are static in the WGS84 coordinate system, the GPRS output is set to 1Hz.)

●An industrial computer for data collection: CPU: Intel I7-10700; memory: 16G; Ubuntu 20.04; ROS Noetic.
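The Lidar figures in the list above are mutually consistent, which a short check makes visible. The sketch assumes that the 0.2° angle resolution is the horizontal resolution and that each beam returns one point per horizontal step in single echo mode; the function names are hypothetical.

```python
def points_per_frame(h_fov_deg: float, angle_res_deg: float, beams: int) -> int:
    """Points in one sweep: horizontal steps times number of beams."""
    columns = round(h_fov_deg / angle_res_deg)  # 360 / 0.2 -> 1800 steps
    return columns * beams


def samples_per_cycle(fast_hz: int, slow_hz: int) -> int:
    """How many samples of the faster stream fall into one cycle of the slower one."""
    return fast_hz // slow_hz


# Lidar: 360 deg hFOV, 0.2 deg resolution (assumed horizontal), 80 beams.
print(points_per_frame(360.0, 0.2, 80))

# Cameras run at 30Hz and the Lidar at 10Hz.
print(samples_per_cycle(30, 10))
```

The first call reproduces the 144000 points per frame quoted for the Lidar; the second shows that three camera frames are captured during each Lidar sweep.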


The data collection site is the intersection of Chengfu Road and Zhongguancun East Road. The central area of this intersection measures 60m x 50m and carries heavy traffic every day. The two red dots in the diagram mark the two platforms set up on the roadside.

The detailed sensor parameters are listed below.

Sensor | Information
Robosense Ruby-Lite Lidar | 80 beams, 10Hz, detection range: 230m, single echo mode, hFOV: 0°-360°, vFOV: -25° to 15°, angle resolution: 0.2°
Sensing-SG5 color camera | 30Hz, 5.44MP, CMOS: Sony IMX490RGGB, rolling shutter, HDR dynamic range: 120dB, resolution: 1920×1080
BS-280 GPS | Timing error: less than 1µs

Data analysis

The 3D bounding box annotations cover 7 categories on both the point clouds and the RGB images. In most frames, there are around 40 to 60 pedestrians and around 250 to 270 cars.

Data Structure



 │ calib_file.txt
 │ README.txt



# Point clouds from the two IPUs after the preprocessing described in our article. (Labeling was done on these point clouds.)
 │ ├─IPU1
 │ │ ├─IPU1_cam1
# Raw images (distorted) of each camera.
 │ │ ├─IPU1_cam2
 │ │ ├─IPU1_pcd
# Raw point clouds (None removed) of each Lidar.
 │ │ └─IPU1_cam1_undistort
# Undistorted images in jpg format.
 │ └─IPU2
 │   ├─IPU2_cam1
 │   ├─IPU2_cam2
 │   ├─IPU2_cam1_undistort
 │   └─IPU2_pcd


 │ ├─json
# Labels in json format.
 │ ├─txt
# Labels in txt format.
 │ └─README.txt




 │ README.txt


 │ turned_left.7z
 │ went_straight.7z


 │ │ 2020-12-16-17-27-48.7z.001
 │ │ 2020-12-16-17-27-48.7z.002
 │ │ 2020-12-16-17-27-48.7z.003
 │ │ 2020-12-16-17-27-48.7z.004
 │ │ 2020-12-16-17-34-57.7z.001
 │ │ 2020-12-16-17-34-57.7z.002
 │ │ 2020-12-16-17-34-57.7z.003
 │ │ 2020-12-16-17-41-38.7z
 │ │ 2020-12-16-17-44-50.7z.001
 │ │ 2020-12-16-17-44-50.7z.002
 │ │ 2020-12-16-17-44-50.7z.003
 │ │ 2020-12-16-17-44-50.7z.004
 │ └─2020-12-16-17-50-21.7z


 │ 2020-12-16-17-27-45.7z
 │ 2020-12-16-17-34-55.7z
 │ 2020-12-16-17-41-29.7z
 │ 2020-12-16-17-44-49.7z
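Several archives in the listing above are split into numbered volumes (.7z.001, .7z.002, ...), and a split archive can only be extracted when all of its volumes are present. The following sketch groups downloaded file names by archive and reports missing volume numbers before extraction. The helper names are hypothetical, and the check can only detect gaps below the highest volume number it sees.

```python
import re
from collections import defaultdict

# Matches split volumes such as "2020-12-16-17-27-48.7z.001".
PART_RE = re.compile(r"^(?P<base>.+\.7z)\.(?P<part>\d{3})$")


def group_archives(names):
    """Map each archive base name to the sorted list of files that belong to it."""
    groups = defaultdict(list)
    for name in names:
        match = PART_RE.match(name)
        if match:
            groups[match.group("base")].append(name)
        elif name.endswith(".7z"):
            groups[name].append(name)  # single-file archive
    return {base: sorted(parts) for base, parts in groups.items()}


def missing_parts(parts):
    """Return volume numbers missing below the highest volume found."""
    numbers = sorted(int(p.rsplit(".", 1)[1]) for p in parts if PART_RE.match(p))
    if not numbers:
        return []  # not a split archive
    return [n for n in range(1, numbers[-1] + 1) if n not in numbers]


downloaded = [
    "2020-12-16-17-27-48.7z.002",
    "2020-12-16-17-27-48.7z.001",
    "2020-12-16-17-41-38.7z",
]
for base, parts in group_archives(downloaded).items():
    print(base, parts, "missing:", missing_parts(parts))
```

Once all volumes of an archive sit in the same folder, pointing the 7-Zip command line tool at the first volume (for example `7z x 2020-12-16-17-27-48.7z.001`) extracts the whole set.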





 │ ├─README.txt
 │ ├─label
 │ │ ├─json
 │ │ │ 000000_LABEL.json
# Labels in json format.
 │ │ ├─txt
 │ │ │ 000000_LABEL.txt
# Labels in txt format.
 │ │ └─README.txt
 │ └─data


 │ ├─IPU1
 │ │ ├─IPU1_cam1
# Raw images (distorted) of each camera.
 │ │ ├─IPU1_cam2
 │ │ ├─IPU1_pcd
# Raw point clouds (None removed) of each Lidar.
 │ │ └─IPU1_cam1_undistort
# Undistorted images in jpg format.
 │ └─IPU2
 │   ├─IPU2_cam1
 │   ├─IPU2_cam2
 │   ├─IPU2_cam1_undistort
 │   └─IPU2_pcd
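Given the layout above, a first step when working with the data is usually to index the label files by frame. The sketch below relies only on the `label/json` and `label/txt` folders and the `000000_LABEL.json` / `000000_LABEL.txt` naming shown in the listing. The function name is hypothetical, and the contents of the label files are not parsed because their schema is not described here (see the README.txt shipped with the labels).

```python
from pathlib import Path


def index_labels(label_dir):
    """Map frame id -> {'json': Path, 'txt': Path} for files named <id>_LABEL.<ext>."""
    index = {}
    for ext in ("json", "txt"):
        folder = Path(label_dir) / ext
        if not folder.is_dir():
            continue  # tolerate a download that contains only one label format
        suffix = f"_LABEL.{ext}"
        for path in sorted(folder.glob(f"*{suffix}")):
            frame_id = path.name[: -len(suffix)]  # "000000_LABEL.json" -> "000000"
            index.setdefault(frame_id, {})[ext] = path
    return index


if __name__ == "__main__":
    # "label" is the folder from the listing; adjust the path to your download.
    labels = index_labels("label")
    print(f"{len(labels)} labeled frames found")
```

Keeping the frame id as a string preserves the zero padding, which makes it easier to match labels against data files later, assuming the data files use the same frame numbering.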





The IPS300+ data is published under the CC BY-NC-SA 4.0 license. You may need to request the download password by email.

Thanks to Datatang (Stock Code: 831428) for providing us with professional data annotation services. Datatang is the world’s leading AI data service provider. We have accumulated numerous training datasets and provide on-demand data collection and annotation services. Our Data++ platform could greatly reduce data processing costs by integrating automatic annotation tools. Datatang provides high-quality training data to 1,000+ companies worldwide and helps them improve the performance of AI models.

Tsinghua University

Haidian District, Beijing, 100084, P. R. China

© Copyright 2016-2020 All Rights Reserved