[Hongke Solutions] SimData: A High-Fidelity Virtual Dataset Generation Solution Based on aiSim

01 Preamble

In the research and development of autonomous driving perception systems, the performance of AI models is highly dependent on large-scale, high-quality perception datasets. Currently, widely adopted open-source datasets in the industry include KITTI, nuScenes, and the Waymo Open Dataset, among others; these datasets have laid a critical foundation for the evolution of autonomous driving algorithms.

However, building real-world perception datasets is no easy task. Companies must invest significant human, material, and time resources, and must also contend with data collection constraints, privacy compliance, time-consuming annotation, and the difficulty of capturing extreme or rare scenarios.

Against this backdrop, high-fidelity virtual datasets are rapidly emerging as a new frontier in the research of perception algorithms for autonomous driving. Through virtual data generated by simulation platforms, R&D teams can not only rapidly scale up the volume of data but also flexibly simulate complex road conditions, adverse weather, and rare unexpected events, thereby providing perception models with more comprehensive and diverse training samples.

Based on this, Hongke has launched a brand-new high-fidelity virtual dataset: SimData. SimData is built on the aiSim simulation platform. With its high-precision physical modeling and realistic visual rendering, it can efficiently generate synchronized data from multiple sensors (including cameras, LiDAR, millimeter-wave radar, IMUs, and more), with multimodal characteristics that closely align with real-world data.

SimData's data structure strictly follows the nuScenes dataset format specification, so developers can use the official nuscenes-devkit directly for data analysis and visualization. This significantly reduces the time and effort R&D personnel need to adapt to it.

This article analyzes SimData's core features and development process and demonstrates its performance in a typical autonomous driving perception task. The official SimData release report and related comparison test results will be published shortly, so please stay tuned for Hongke's latest updates and technical insights.

02 Analysis of the SimData Development Process and Architecture

Sensor Layout

In the aiSim simulation platform, we have rigorously replicated the sensor configuration and layout of the nuScenes dataset to ensure a high degree of consistency in data structure and multimodal synchronization.

The simulated vehicle is equipped with 6 surround-view cameras, 5 millimeter-wave radars, 1 LiDAR, 1 inertial measurement unit (IMU), and 1 global positioning system (GPS) receiver. The cameras and radars sample at 40 Hz, while the LiDAR samples at 80 Hz, which supports synchronized multi-sensor data acquisition with fine temporal resolution.

The spatial layout, positions, and orientations of the various sensors are shown in the figure below:

Overall view (left), front view (right)
Left view (left), top view (right)

Unlike nuScenes, all sensors in SimData use a unified FLU (Forward-Left-Up) coordinate system, whereas the camera sensors in the original nuScenes dataset use an RDF (Right-Down-Forward) coordinate system.

During dataset construction, we applied coordinate system transformations and alignment to all annotation files so that the coordinate definitions are logically consistent with nuScenes. As a result, users do not need to spend extra effort on coordinate discrepancies when deploying SimData; data parsing and secondary development work the same way as with native nuScenes. The figure below illustrates the typical layout of the sensors in nuScenes and their coordinate system definitions.
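To make the difference between the two axis conventions concrete, the following minimal sketch converts a single 3D point between RDF and FLU. The function names are hypothetical and chosen for illustration only; this is not SimData's actual conversion code, which also has to handle rotations and the annotation files.

```python
def rdf_to_flu(point):
    """Convert (right, down, forward) coordinates to (forward, left, up)."""
    right, down, forward = point
    # forward stays forward; left is the opposite of right; up is the opposite of down.
    return (forward, -right, -down)


def flu_to_rdf(point):
    """Convert (forward, left, up) coordinates to (right, down, forward)."""
    forward, left, up = point
    return (-left, -up, forward)


# A point 10 m ahead, 2 m to the right, and 1 m below the sensor origin.
point_rdf = (2.0, 1.0, 10.0)
point_flu = rdf_to_flu(point_rdf)
print(point_flu)  # (10.0, -2.0, -1.0)
```

The two functions are inverses of each other, so a round trip returns the original point.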

Data Structures

The SimData dataset is fully compatible with nuScenes in terms of structural design and directory organization. For algorithm engineers and developers already familiar with the nuScenes framework, deploying, using, and parsing SimData requires no additional adaptation, conversion, or learning effort.

The figure below illustrates the overall directory structure of the SimData dataset. It follows the same organization as nuScenes, with the goal of seamless compatibility and tool-level interoperability.

The specific core directories and file structure are described below:

  • maps folder: Contains all high-definition map (HD map) image files, which provide precise geospatial information and serve as background references for scenes.

  • samples folder: Stores keyframe data for each sensor type. One frame is selected every 0.5 seconds as a keyframe. The folder specifically includes:

    • 6 camera channels (.jpg format)

    • 5 millimeter-wave radar point cloud channels (.pcd format)

    • 1 LiDAR point cloud channel (.bin format)

  • sweeps folder: Stores the continuous sensor data (excluding keyframes), which is primarily used to build temporal information and to support perception tasks such as multi-frame fusion.

  • v1.0-* folder: Stores the annotations and metadata for the sensors. All files are in .json format and cover core elements such as timestamps, ego-pose parameters, labels, and scene descriptions.
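The split between samples and sweeps can be pictured with a small sketch. It assumes frames are identified by timestamps in microseconds and that a frame becomes a keyframe once at least 0.5 seconds have passed since the previous keyframe; the article does not describe the actual selection rule, so the function name and logic below are illustrative assumptions.

```python
KEYFRAME_INTERVAL_US = 500_000  # 0.5 s expressed in microseconds


def split_keyframes(timestamps_us, interval_us=KEYFRAME_INTERVAL_US):
    """Partition frame timestamps into keyframes (samples) and the rest (sweeps)."""
    keyframes, sweeps = [], []
    next_due = None  # earliest timestamp at which the next keyframe may be taken
    for ts in sorted(timestamps_us):
        if next_due is None or ts >= next_due:
            keyframes.append(ts)
            next_due = ts + interval_us
        else:
            sweeps.append(ts)
    return keyframes, sweeps


# One second of 40 Hz data: a frame every 25,000 microseconds.
frames = [i * 25_000 for i in range(41)]
keyframes, sweeps = split_keyframes(frames)
print(keyframes)    # [0, 500000, 1000000]
print(len(sweeps))  # 38
```

With a 40 Hz sensor this leaves 19 intermediate frames between consecutive keyframes, which is the kind of data the sweeps folder is described as holding.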

The relationships between the individual .json annotation files are also fully consistent with the nuScenes dataset. They are illustrated below using the official nuScenes schema diagram:

In the SimData dataset, each record in every data file is identified by a globally unique UUID (Universally Unique Identifier) that serves as its token. These tokens link the different tables of the dataset to one another. With just three core files, sample.json, sample_data.json, and sample_annotation.json, users can efficiently obtain the vast majority of the annotations and structured information.
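The token mechanism amounts to a keyed lookup between tables. The sketch below uses tiny inline stand-ins for scene.json and sample.json; the short tokens, the scene name, and the helper names are placeholders for illustration, whereas real files contain UUID strings and more fields.

```python
import json

# Minimal stand-ins for two tables; real tokens are UUID strings.
SCENE_JSON = '[{"token": "scene-0", "name": "demo-scene"}]'
SAMPLE_JSON = (
    '[{"token": "smp-0", "scene_token": "scene-0",'
    ' "timestamp": 0, "prev": "", "next": ""}]'
)


def index_by_token(records):
    """Build a token -> record dictionary for one table."""
    return {record["token"]: record for record in records}


scenes = index_by_token(json.loads(SCENE_JSON))
samples = index_by_token(json.loads(SAMPLE_JSON))


def scene_of(sample_token):
    """Follow a sample's scene_token into the scene table."""
    return scenes[samples[sample_token]["scene_token"]]


print(scene_of("smp-0")["name"])  # demo-scene
```

Indexing each table once keeps every later cross-table lookup a constant-time dictionary access.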

sample.json

sample.json records the basic information for each keyframe.

  • Each keyframe has a unique sample_token that identifies that frame of data.

  • Using scene_token, developers can look up in scene.json the simulation scenario to which the sample belongs.

  • The file also provides token pointers to the previous frame (prev) and the next frame (next), which can be used to reconstruct the sequence of consecutive frames.
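The prev and next pointers form a linked list over the keyframes of a scene. A minimal sketch of walking that list is shown below; it assumes, as in nuScenes, that an empty string marks the end of the chain, and the sample records are shortened placeholders.

```python
def iter_scene_samples(samples_by_token, first_sample_token):
    """Yield the samples of a scene in temporal order by following 'next'."""
    token = first_sample_token
    while token:  # an empty string ends the chain
        sample = samples_by_token[token]
        yield sample
        token = sample["next"]


# Three keyframes spaced 0.5 s apart (timestamps in microseconds).
samples_by_token = {
    "a": {"token": "a", "timestamp": 0, "prev": "", "next": "b"},
    "b": {"token": "b", "timestamp": 500_000, "prev": "a", "next": "c"},
    "c": {"token": "c", "timestamp": 1_000_000, "prev": "b", "next": ""},
}

ordered = [s["token"] for s in iter_scene_samples(samples_by_token, "a")]
print(ordered)  # ['a', 'b', 'c']
```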

sample_data.json

Using sample_token, developers can retrieve from sample_data.json the detailed multi-sensor data for the corresponding frame, specifically:

  • ego_pose_token: Reference to the ego vehicle's pose. Look it up in ego_pose.json to obtain the pose (3D position and orientation) at that specific moment.

  • calibrated_sensor_token: Reference to the calibration of the corresponding sensor. Look it up in calibrated_sensor.json to obtain the sensor's intrinsic and extrinsic parameters.

  • filename: The file path of the sensor's raw data. If the data comes from a camera, the record also includes the image height (height) and width (width).

  • timestamp: Timestamp (unit: microseconds), used for hard time synchronization among multiple sensors.

  • is_key_frame: Boolean, used to indicate whether a specific frame is a keyframe.

  • next / prev: Tokens pointing to the next and previous frames, respectively, thereby enabling precise temporal association.
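The fields above can be combined into a simple join. The following sketch collects the keyframe records of one sample and resolves their pose and calibration references; the helper name, the placeholder tokens, and the shortened records are assumptions for illustration, and ego_poses and calibrated_sensors are assumed to be already indexed by token.

```python
def sensor_records_for_sample(sample_token, sample_data, ego_poses, calibrated_sensors):
    """Return the keyframe sensor records of one sample with pose and calibration resolved."""
    results = []
    for sd in sample_data:
        # Keep only the keyframe entries that belong to the requested sample.
        if sd["sample_token"] != sample_token or not sd["is_key_frame"]:
            continue
        results.append({
            "filename": sd["filename"],
            "timestamp_s": sd["timestamp"] / 1_000_000,  # microseconds -> seconds
            "ego_pose": ego_poses[sd["ego_pose_token"]],
            "calibration": calibrated_sensors[sd["calibrated_sensor_token"]],
        })
    return results


sample_data = [
    {"sample_token": "smp-0", "is_key_frame": True, "timestamp": 500_000,
     "filename": "samples/CAM_FRONT/frame0.jpg",
     "ego_pose_token": "pose-0", "calibrated_sensor_token": "cal-cam"},
    {"sample_token": "smp-0", "is_key_frame": False, "timestamp": 525_000,
     "filename": "sweeps/CAM_FRONT/frame1.jpg",
     "ego_pose_token": "pose-1", "calibrated_sensor_token": "cal-cam"},
]
ego_poses = {"pose-0": {"translation": [0.0, 0.0, 0.0]},
             "pose-1": {"translation": [0.3, 0.0, 0.0]}}
calibrated_sensors = {"cal-cam": {"translation": [1.5, 0.0, 1.6]}}

records = sensor_records_for_sample("smp-0", sample_data, ego_poses, calibrated_sensors)
print(records[0]["filename"])  # samples/CAM_FRONT/frame0.jpg
```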

sample_annotation.json

sample_annotation.json records the 3D annotations of the objects in each keyframe and can be joined to the other tables through sample_token. Its main fields are as follows:

  1. instance_token: A unique identifier for an object instance. Developers can look it up in instance.json to find the instance's category_token (category information), as well as the tokens marking the object's first and last appearances. The category_token can then be looked up in category.json to retrieve the semantic category name of that instance.

  2. visibility_token: Visibility rating, divided into four levels; a higher number indicates greater visibility of the object. The specific definitions can be found in visibility.json.

  3. Geometry and pose information of the target object; these pose data are defined in the sensor coordinate system:

    • Center point location (translation)

    • Dimensions (size)

    • Rotation (rotation), in quaternion format

  4. Point cloud statistics: The number of LiDAR points (num_lidar_pts) and the number of millimeter-wave radar points (num_radar_pts) contained within the bounding box.

  5. Frame association: The tokens of the target instance's annotations in the previous and next frames.
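Two common operations on an annotation record are resolving its category name and reading a heading angle from the quaternion. The sketch below assumes the nuScenes quaternion order [w, x, y, z] and returns the yaw about the third axis of whichever frame the rotation is expressed in; the helper names and the placeholder records are illustrative.

```python
import math


def category_name(annotation, instances, categories):
    """Resolve annotation -> instance -> category name via tokens."""
    instance = instances[annotation["instance_token"]]
    return categories[instance["category_token"]]["name"]


def yaw_from_quaternion(rotation):
    """Yaw angle in radians from a [w, x, y, z] quaternion."""
    w, x, y, z = rotation
    return math.atan2(2.0 * (w * z + x * y), 1.0 - 2.0 * (y * y + z * z))


instances = {"inst-0": {"category_token": "cat-0"}}
categories = {"cat-0": {"name": "vehicle.car"}}
annotation = {
    "instance_token": "inst-0",
    "translation": [10.0, 2.0, 0.8],
    "size": [1.9, 4.5, 1.6],
    # A 90-degree rotation about the third axis.
    "rotation": [math.cos(math.pi / 4), 0.0, 0.0, math.sin(math.pi / 4)],
}

print(category_name(annotation, instances, categories))  # vehicle.car
print(round(math.degrees(yaw_from_quaternion(annotation["rotation"])), 3))  # 90.0
```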

03 Examples of SimData and Perception Model Deployment

Usage and Truth Value Visualization

SimData can be analyzed directly with nuscenes-devkit, and the calls are exactly the same as for the native nuScenes dataset. Here is an example:

from nuscenes.nuscenes import NuScenes

data_path = '/path/to/SimData'  # dataset root; replace with your local path
nusc = NuScenes(version='v1.0-custom', dataroot=data_path, verbose=True)

Once the NuScenes object has been created, developers can use the full toolchain provided by nuScenes to analyze the SimData dataset and train perception models. Combined with visualization libraries such as cv2 or matplotlib, the dataset can be visualized intuitively in 3D:

  • 6-channel camera image output with Ground Truth (GT) bounding boxes:

  • Synchronized LiDAR point cloud with the corresponding annotations in bird's-eye view (BEV):
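As a starting point for such visualizations, the sketch below collects the raw file path of every sensor channel for one keyframe. It relies on two nuscenes-devkit features, the sample list with its per-channel data dictionary and get_sample_data_path; the helper name keyframe_file_paths is hypothetical. It takes the NuScenes object as an argument, so it does not import the devkit itself.

```python
def keyframe_file_paths(nusc, sample_index=0):
    """Return {channel: file path} for one keyframe of a loaded NuScenes object."""
    sample = nusc.sample[sample_index]
    paths = {}
    # sample['data'] maps each sensor channel to its sample_data token.
    for channel, sample_data_token in sample["data"].items():
        paths[channel] = nusc.get_sample_data_path(sample_data_token)
    return paths


# Usage with the object created earlier (requires nuscenes-devkit and the dataset):
#   paths = keyframe_file_paths(nusc)
#   nusc.render_sample(nusc.sample[0]["token"])  # devkit's built-in overview plot
```

The returned paths can be passed to cv2.imread for the camera images, while the devkit's own render functions draw the ground-truth boxes.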

Demonstration of BEVFormer Detection Results

The following shows the object detection performance of BEVFormer-tiny on SimData, using pre-trained weights that were trained directly on the native nuScenes dataset, without any incremental training or fine-tuning on SimData (zero-shot inference):

  1. BEVFormer official repository: https://github.com/fundamentalvision/BEVFormer/tree/master

  2. BEVFormer paper: https://arxiv.org/pdf/2203.17270

Conclusion

This article discussed the importance of virtual datasets for the research and practical deployment of autonomous driving perception algorithms, and introduced SimData, a new high-fidelity virtual perception dataset generated with the aiSim simulation platform.

It explained SimData's data architecture, underlying schema, and parsing methods, and ran a cross-dataset check with a mainstream open-source perception model (BEVFormer) to show the usability of this synthetic dataset in real-world R&D settings.

Going forward, the Hongke team will release more detailed test data and metric comparison reports in the coming weeks to further quantify and validate the domain consistency between SimData and real-world datasets. Through this series of technical studies, we aim to demonstrate the fidelity of the aiSim simulation environment and to provide researchers and autonomous driving developers with a high-quality, plug-and-play, scalable virtual perception data resource that supports the research, iteration, and training of autonomous driving perception algorithms.

Please stay tuned for Hongke's upcoming announcements about the official virtual dataset. If you would like to learn more about our solutions for autonomous driving simulation and virtual datasets, please feel free to contact us.
