3D Semantic Novelty Detection via Large-Scale Pre-Trained Models

1Politecnico di Torino
<name>.<surname>@polito.it
2 Istituto Italiano di Tecnologia
<name>.<surname>@iit.it

Abstract

Shifting deep learning models from lab environments to real-world settings entails preparing them to handle unforeseen conditions, including the chance of encountering novel objects from classes not included in their training data. Such occurrences can pose serious threats in various applications. The task of semantic novelty detection has attracted significant attention in recent years, mainly on 2D images, overlooking the complex 3D nature of the real world.

In this study, we address this gap by examining the geometric structures of objects within 3D point clouds to effectively detect semantic novelty. We advance the field by introducing 3D-SeND, a method that harnesses a large-scale pre-trained model to extract patch-based object representations directly from its intermediate feature representation. These patches are used to precisely characterize each known class. At inference, a normality score is obtained by assessing whether a test sample can be reconstructed predominantly from patches of a single known class or from multiple classes. We evaluate 3D-SeND on real-world point cloud samples when the reference known data are synthetic, and demonstrate that it excels in both standard and few-shot scenarios. Thanks to its patch-based object representation, 3D-SeND's predictions can be visualized, providing a valuable explanation of the decision process. Moreover, the inherent training-free nature of 3D-SeND allows for its immediate application to a wide array of real-world tasks, offering a compelling advantage over approaches that require a task-specific learning phase.

Method

Main Method Summary

Our method is divided into three phases:

  1. First, we obtain a deep learning model for feature extraction. Our method is agnostic to both the training objective and the output type; the only requirement is that the model has a good internal representation of generic 3D point clouds.
  2. Then, we cut off the last layers of the model and extract all the reference patches from the support set data. This step generates a patchified representation of our CAD reference models, which we store in a unified memory bank. Optionally, if the memory bank grows too large, we can down-sample it via coreset subsampling.
  3. Finally, we feed a test sample to our method, patchify it with the same feature extractor, and compare it against the memory bank. If the extracted features closely match features that all belong to the same known class, the object is classified as ID (in-distribution); if the patches match multiple classes, or lie far away in feature space, it is classified as OOD (out-of-distribution). See the qualitative results for a visual intuition of our scoring function.
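The optional memory-bank down-sampling in step 2 can be done with greedy k-center (farthest-point) selection, a standard coreset construction. The sketch below is illustrative, not the paper's exact implementation; the function name and Euclidean metric are our assumptions.

```python
import numpy as np

def coreset_subsample(features: np.ndarray, n_keep: int, seed: int = 0) -> np.ndarray:
    """Greedy k-center selection over patch features (illustrative sketch).

    Returns indices of the kept subset. Each new pick is the point
    farthest from the current selection, so the reduced memory bank
    still covers the feature space.
    """
    rng = np.random.default_rng(seed)
    n = features.shape[0]
    selected = [int(rng.integers(n))]  # random starting point
    # distance from every point to its nearest selected point
    min_dist = np.linalg.norm(features - features[selected[0]], axis=1)
    for _ in range(n_keep - 1):
        idx = int(np.argmax(min_dist))  # farthest point from current coreset
        selected.append(idx)
        new_dist = np.linalg.norm(features - features[idx], axis=1)
        min_dist = np.minimum(min_dist, new_dist)
    return np.array(selected)
```

Because each pick maximizes the distance to the current coreset, redundant near-duplicate patches are dropped first while rare geometry is retained.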
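The class-consistency idea in step 3 can be sketched as a nearest-neighbor vote over patches: a test object whose patches all retrieve close matches from one class scores high, while mixed-class or distant matches score low. This is a minimal illustration of the scoring logic, not 3D-SeND's exact normality score; the function name and the way agreement and distance are combined are our assumptions.

```python
import numpy as np

def normality_score(test_patches: np.ndarray,
                    bank_feats: np.ndarray,
                    bank_labels: np.ndarray) -> float:
    """Illustrative patch-consistency score (higher = more likely ID).

    For each test patch, retrieve the nearest memory-bank patch and its
    class. The score combines the fraction of patches agreeing with the
    majority class and the mean nearest-neighbor distance.
    """
    votes, dists = [], []
    for p in test_patches:
        d = np.linalg.norm(bank_feats - p, axis=1)
        nn = int(np.argmin(d))
        votes.append(int(bank_labels[nn]))
        dists.append(float(d[nn]))
    counts = np.bincount(np.array(votes))
    agreement = counts.max() / len(votes)   # patch-vote purity in [0, 1]
    mean_dist = float(np.mean(dists))       # how close the matches are
    return agreement / (1.0 + mean_dist)
```

An ID sample (all patches matching one class at small distances) yields a score near 1, whereas patches split across classes, or matched only at large distances, drive the score down.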

Qualitative Results

BibTeX

@ARTICLE{Rabino3dsend,
  author={Rabino, Paolo and Alliegro, Antonio and Tommasi, Tatiana},
  journal={IEEE Access},
  title={3D Semantic Novelty Detection via Large-Scale Pre-Trained Models},
  year={2024},
  volume={12},
  pages={135352--135361}}