Whether autonomous driving can effectively handle challenging scenarios such as bad weather and complex traffic
environments remains in doubt. One critical difficulty is that single-agent perception struggles to obtain
complementary perceptual information in multi-condition scenes, for example under occlusion. To investigate the
advantages of collaborative perception in high-risk driving scenarios, we construct a multi-condition challenging
dataset for large-range vehicle-infrastructure cooperative perception, called V2XScenes, which includes seven typical
multi-modal sensor layouts along a successive road section. In particular, each selected scene is labeled with a specific
condition description, and we provide unique global object tracking IDs across the entire road section and
sequential frames to ensure consistency. Comprehensive cooperative perception benchmarks for 3D object detection and
tracking are provided, and quantitative results based on state-of-the-art methods demonstrate the effectiveness of
collaborative perception in corner-case conditions.