Automatic Image Annotation for Mapped Features Detection
Detecting road features is a key enabler for autonomous driving and localization. For instance, reliable detection of poles, which are widespread in road environments, can improve localization. Modern deep learning-based perception systems need a significant amount of annotated data. Automatic annotation avoids time-consuming and costly manual annotation. Because automatic methods are prone to errors, managing annotation uncertainty is crucial to ensure a proper learning process. Fusing multiple annotation sources on the same dataset can be an efficient way to reduce errors. This not only improves the quality of annotations, but also improves the learning of perception models. In this paper, we consider the fusion of three automatic annotation methods in images: feature projection from a high-accuracy vector map combined with a lidar, image segmentation, and lidar segmentation. Our experimental results demonstrate the significant benefits of multi-modal automatic annotation for pole detection through a comparative evaluation on manually annotated images. Finally, the resulting multi-modal fusion is used to fine-tune an object detection model for pole base detection using unlabeled data, showing overall improvements achieved by enhancing network specialization. The dataset is publicly available.
This paper is available on arXiv.
The results are demonstrated in a video available on YouTube.
The video showcases a segment of a driving sequence from the dataset used in this study and provides a detailed presentation of the results.
Initially, the video highlights the automatic annotations generated by the three methods proposed in the paper. It then demonstrates the process of merging these annotation sources, using differently colored crosses to indicate the level of consensus among the methods. Specifically, annotations validated by all methods are distinguished from ambiguous ones.
Next, the video illustrates how black patches were added to address uncertainties in the annotations. Finally, it presents the results of pole base detection using a YOLOv7 neural network. This network was trained on high-consensus automatic annotations, with the input images modified to mask ambiguous objects.
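The fusion and masking steps described above can be sketched in a few lines of Python. This is an illustrative approximation, not the paper's exact algorithm: it assumes each annotation source yields pole-base points as (x, y) pixel coordinates, and the matching radius and patch size are hypothetical parameters chosen here for demonstration.

```python
import numpy as np

def fuse_annotations(sources, radius=10.0):
    """Greedily match pole-base points across annotation sources.

    sources: one list of (x, y) points per annotation method.
    Returns (consensus, ambiguous): cluster centres confirmed by every
    source vs. those missed by at least one source.
    """
    pts = [(x, y, s) for s, plist in enumerate(sources) for (x, y) in plist]
    used = [False] * len(pts)
    consensus, ambiguous = [], []
    for i, (x, y, s) in enumerate(pts):
        if used[i]:
            continue
        cluster, seen = [i], {s}
        for j in range(i + 1, len(pts)):
            xj, yj, sj = pts[j]
            # Match at most one point per source, within the pixel radius.
            if not used[j] and sj not in seen and np.hypot(x - xj, y - yj) <= radius:
                cluster.append(j)
                seen.add(sj)
        for j in cluster:
            used[j] = True
        centre = tuple(np.mean([pts[j][:2] for j in cluster], axis=0))
        (consensus if len(seen) == len(sources) else ambiguous).append(centre)
    return consensus, ambiguous

def mask_ambiguous(image, ambiguous, half=16):
    """Black out a square patch around each ambiguous annotation."""
    out = image.copy()
    h, w = out.shape[:2]
    for (x, y) in ambiguous:
        x, y = int(round(x)), int(round(y))
        out[max(0, y - half):min(h, y + half),
            max(0, x - half):min(w, x + half)] = 0
    return out
```

Under this sketch, only high-consensus points would be kept as training labels, while the masked patches prevent the detector from being penalized (or wrongly supervised) on objects whose annotation status is uncertain.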
Citation
@inproceedings{noizet2024,
author = {Noizet, Maxime and Xu, Philippe and Bonnifait, Philippe},
title = {Automatic {Image} {Annotation} for {Mapped} {Features}
{Detection}},
booktitle = {2024 IEEE/RSJ International Conference on Intelligent
Robots and Systems (IROS)},
  pages = {9367--9373},
date = {2024-10-16},
url = {https://ieeexplore.ieee.org/document/10801773},
doi = {10.1109/IROS58592.2024.10801773},
langid = {en},
abstract = {Detecting road features is a key enabler for autonomous
driving and localization. For instance, a reliable detection of
poles which are widespread in road environments can improve
localization. Modern deep learning-based perception systems need a
significant amount of annotated data. Automatic annotation avoids
time-consuming and costly manual annotation. Because automatic
methods are prone to errors, managing annotation uncertainty is
crucial to ensure a proper learning process. Fusing multiple
annotation sources on the same dataset can be an efficient way to
reduce the errors. This not only improves the quality of
annotations, but also improves the learning of perception models. In
this paper, we consider the fusion of three automatic annotation
methods in images: feature projection from a high accuracy vector
map combined with a lidar, image segmentation and lidar
segmentation. Our experimental results demonstrate the significant
benefits of multi-modal automatic annotation for pole detection
through a comparative evaluation on manually annotated images.
Finally, the resulting multi-modal fusion is used to fine-tune an
object detection model for pole base detection using unlabeled data,
showing overall improvements achieved by enhancing network
specialization. The dataset is publicly available.}
}