Research on Feature Descriptors used for Point Cloud Registration

Team SLAMer: Ma Teng, Sharad Maheshwari, Mingxuan Líu

Supervised by Alessandro Luchetti & Prof. Mariolino De Cecco

1. Problem description: 

To achieve SLAM, we first need information about the environment. Here we use a 3D camera in Unity to generate PCD files (find more in the appendix). The main task of our work is then to register two 3D point clouds captured from the same environment at different shooting angles and recover the transformation between them.

2. Implementation: 

The process is divided into five steps: preprocessing the point clouds, keypoint extraction, computing feature descriptors, coarse registration (Sampling and Consensus – Initial Alignment) and fine registration (Iterative Closest Point). All processing is done using the PCL library in C++.

2.1 Point Cloud Data Generation: 

To test the performance of point cloud registration with different descriptors under different levels of noise, we used an RGB-Depth 3D camera in Unity to capture 12 sets of dense point cloud data with varying degrees of noise in a flat scene.

 

2.2 Pre-process: 

Input point clouds are pre-processed as needed: removing NaN values, filtering out outliers, and down-sampling.
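
As an illustration, a minimal C++ sketch of this pre-processing with PCL could look as follows; the voxel size and outlier thresholds are example values, not the exact parameters used in our experiments.

#include <pcl/point_types.h>
#include <pcl/point_cloud.h>
#include <pcl/filters/filter.h>                       // removeNaNFromPointCloud
#include <pcl/filters/voxel_grid.h>
#include <pcl/filters/statistical_outlier_removal.h>

// Pre-process a raw cloud: drop NaNs, down-sample, remove outliers.
pcl::PointCloud<pcl::PointXYZRGB>::Ptr
preprocess(const pcl::PointCloud<pcl::PointXYZRGB>::Ptr& input)
{
  pcl::PointCloud<pcl::PointXYZRGB>::Ptr no_nan(new pcl::PointCloud<pcl::PointXYZRGB>);
  pcl::PointCloud<pcl::PointXYZRGB>::Ptr down(new pcl::PointCloud<pcl::PointXYZRGB>);
  pcl::PointCloud<pcl::PointXYZRGB>::Ptr clean(new pcl::PointCloud<pcl::PointXYZRGB>);

  // 1. Remove NaN points produced by the depth camera.
  std::vector<int> indices;
  pcl::removeNaNFromPointCloud(*input, *no_nan, indices);

  // 2. Down-sample with a voxel grid to reduce point density.
  pcl::VoxelGrid<pcl::PointXYZRGB> voxel;
  voxel.setInputCloud(no_nan);
  voxel.setLeafSize(0.01f, 0.01f, 0.01f);   // 1 cm voxels (assumed value)
  voxel.filter(*down);

  // 3. Remove statistical outliers.
  pcl::StatisticalOutlierRemoval<pcl::PointXYZRGB> sor;
  sor.setInputCloud(down);
  sor.setMeanK(50);                // neighbours considered per point (assumed)
  sor.setStddevMulThresh(1.0);     // distance threshold in standard deviations (assumed)
  sor.filter(*clean);

  return clean;
}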

2.3 Keypoint Detection:

In this step, one of two keypoint detectors is used: ISS (Intrinsic Shape Signatures) or SIFT (Scale-Invariant Feature Transform). Other options include Harris3D and NARF.
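
As an example, SIFT3D keypoints can be detected in PCL roughly as follows (PCL derives an intensity value from RGB for the scale space); the scale and contrast parameters are illustrative and would need tuning for our Unity scenes.

#include <pcl/point_types.h>
#include <pcl/keypoints/sift_keypoint.h>
#include <pcl/search/kdtree.h>

// Detect SIFT3D keypoints on an RGB cloud.
pcl::PointCloud<pcl::PointWithScale>
detectSiftKeypoints(const pcl::PointCloud<pcl::PointXYZRGB>::Ptr& cloud)
{
  pcl::SIFTKeypoint<pcl::PointXYZRGB, pcl::PointWithScale> sift;
  pcl::search::KdTree<pcl::PointXYZRGB>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZRGB>);

  sift.setSearchMethod(tree);
  sift.setScales(0.005f, 6, 4);      // min scale, octaves, scales per octave (assumed)
  sift.setMinimumContrast(0.005f);   // contrast threshold (assumed)
  sift.setInputCloud(cloud);

  pcl::PointCloud<pcl::PointWithScale> keypoints;
  sift.compute(keypoints);
  return keypoints;
}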

2.4 Computing Feature Descriptors:

In this step, five descriptors are evaluated: FPFH, SI (Spin Image), SHOT, CSHOT, and SIFT (used as both feature extractor and descriptor). A new data-driven descriptor, 3DMatch, is also explored theoretically.

The first five descriptors are based on histograms of the local geometry of the point cloud, while 3DMatch is a deep learning technique that learns a robust representation of keypoints.
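
As an example of a histogram-based descriptor, the PCL sketch below computes FPFH descriptors at the keypoints (assumed here to have been converted to pcl::PointXYZ); the normal-estimation and descriptor search radii are illustrative values only.

#include <pcl/point_types.h>
#include <pcl/features/normal_3d.h>
#include <pcl/features/fpfh.h>
#include <pcl/search/kdtree.h>

// Compute FPFH descriptors at the keypoints, using the full cloud as support surface.
pcl::PointCloud<pcl::FPFHSignature33>::Ptr
computeFpfh(const pcl::PointCloud<pcl::PointXYZ>::Ptr& surface,
            const pcl::PointCloud<pcl::PointXYZ>::Ptr& keypoints)
{
  pcl::search::KdTree<pcl::PointXYZ>::Ptr tree(new pcl::search::KdTree<pcl::PointXYZ>);

  // Surface normals are required by all the histogram-based descriptors.
  pcl::NormalEstimation<pcl::PointXYZ, pcl::Normal> ne;
  pcl::PointCloud<pcl::Normal>::Ptr normals(new pcl::PointCloud<pcl::Normal>);
  ne.setInputCloud(surface);
  ne.setSearchMethod(tree);
  ne.setRadiusSearch(0.03);          // normal-estimation radius (assumed)
  ne.compute(*normals);

  // FPFH at the keypoints only, with normals from the support surface.
  pcl::FPFHEstimation<pcl::PointXYZ, pcl::Normal, pcl::FPFHSignature33> fpfh;
  pcl::PointCloud<pcl::FPFHSignature33>::Ptr descriptors(new pcl::PointCloud<pcl::FPFHSignature33>);
  fpfh.setInputCloud(keypoints);
  fpfh.setSearchSurface(surface);
  fpfh.setInputNormals(normals);
  fpfh.setSearchMethod(tree);
  fpfh.setRadiusSearch(0.05);        // descriptor radius, larger than the normal radius (assumed)
  fpfh.compute(*descriptors);
  return descriptors;
}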

2.5 Coarse registration:

To give ICP a good initial state, SAC-IA (Sample Consensus Initial Alignment) is used to coarsely register the point clouds before ICP. SAC-IA shares the underlying idea of RANSAC (Random Sample Consensus), which is also widely used in registration when a known model is fitted.
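
A minimal sketch of SAC-IA with PCL, assuming FPFH descriptors, could look as follows; the sample distance, correspondence distance, and iteration count are illustrative.

#include <pcl/point_types.h>
#include <pcl/registration/ia_ransac.h>   // SampleConsensusInitialAlignment

// Coarse alignment: match descriptors between source and target keypoints
// and estimate an initial rigid transform.
Eigen::Matrix4f
coarseAlign(const pcl::PointCloud<pcl::PointXYZ>::Ptr& src_keypoints,
            const pcl::PointCloud<pcl::FPFHSignature33>::Ptr& src_descriptors,
            const pcl::PointCloud<pcl::PointXYZ>::Ptr& tgt_keypoints,
            const pcl::PointCloud<pcl::FPFHSignature33>::Ptr& tgt_descriptors)
{
  pcl::SampleConsensusInitialAlignment<pcl::PointXYZ, pcl::PointXYZ, pcl::FPFHSignature33> sac_ia;
  sac_ia.setInputSource(src_keypoints);
  sac_ia.setSourceFeatures(src_descriptors);
  sac_ia.setInputTarget(tgt_keypoints);
  sac_ia.setTargetFeatures(tgt_descriptors);
  sac_ia.setMinSampleDistance(0.05f);          // minimum distance between sampled points (assumed)
  sac_ia.setMaxCorrespondenceDistance(0.2);    // reject matches farther than this (assumed)
  sac_ia.setMaximumIterations(500);            // number of RANSAC-style iterations (assumed)

  pcl::PointCloud<pcl::PointXYZ> aligned;
  sac_ia.align(aligned);                        // source keypoints after the coarse transform
  return sac_ia.getFinalTransformation();       // fed into ICP as the initial guess
}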

2.6 Fine registration:

After SAC-IA, we obtain an approximate transformation matrix that gives ICP a good initial state. ICP finds correspondences between points in a greedy way (closest point): it computes correspondences between the two point clouds, then computes and applies a transformation that minimises the distance between corresponding points. This process is repeated until convergence. For best performance, ICP should receive point clouds that are not severely misaligned, which is why SAC-IA is run first. ICP can be based on two methods: SVD and non-linear least squares.
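
A minimal PCL sketch of this fine-registration step, taking the SAC-IA result as the initial guess, could look as follows; the thresholds are illustrative values.

#include <pcl/point_types.h>
#include <pcl/registration/icp.h>

// Fine registration: refine the SAC-IA estimate with ICP on the pre-processed clouds.
Eigen::Matrix4f
fineAlign(const pcl::PointCloud<pcl::PointXYZ>::Ptr& source,
          const pcl::PointCloud<pcl::PointXYZ>::Ptr& target,
          const Eigen::Matrix4f& initial_guess)   // transform returned by SAC-IA
{
  pcl::IterativeClosestPoint<pcl::PointXYZ, pcl::PointXYZ> icp;
  icp.setInputSource(source);
  icp.setInputTarget(target);
  icp.setMaxCorrespondenceDistance(0.1);     // ignore pairs farther than 10 cm (assumed)
  icp.setMaximumIterations(50);              // iteration cap (assumed)
  icp.setTransformationEpsilon(1e-8);        // convergence on transform change
  icp.setEuclideanFitnessEpsilon(1e-6);      // convergence on mean squared error

  pcl::PointCloud<pcl::PointXYZ> aligned;
  icp.align(aligned, initial_guess);          // start from the coarse transform
  return icp.getFinalTransformation();
}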

3. Comparison and analysis of pipelines:

Implementation of pipelines: 

(1) SIFT3D + RANSAC + ICP 

(2) SIFT3D + Spin Image + RANSAC + ICP 

(3) SIFT3D + FPFH + RANSAC + ICP 

(4) SIFT3D + SHOT + RANSAC + ICP 

(5) SIFT3D + CSHOT + RANSAC + ICP 

Dependencies: Point Cloud Library (PCL), C++, Visual Studio 2019, Unity

3.1 Performance:

The number of registered point pairs is used as the performance metric. Figure 1 presents the number of registered points for each pipeline using the different feature descriptors as the noise level of the point cloud increases from 0 (0%) to 4.0 (100%) in our experiment. It can be clearly seen that, first, the SI descriptor keeps a relatively better performance as the noise increases. In contrast, the SIFT descriptor, i.e. using the keypoints detected by SIFT directly as features, has the lowest performance. Moreover, where the other geometry-based descriptors lose performance, the CSHOT descriptor, which uses color information, performs better than the rest.

3.2 Efficiency:

During the experiments, we also recorded the computational cost of these descriptors. Figure 2 presents the running time for calculating the descriptors of the source and target keypoints as the noise level of the point cloud increases from 0 (0%) to 4.0 (100%). We can clearly see that the SI descriptor is the most efficient, whereas SHOT and CSHOT are the most computationally expensive descriptors under the current search radius. However, if the search radius is increased, the efficiency of these descriptors decreases to different degrees.

4. Conclusion:

Overall, SI performs the best among these descriptors, offering a good balance between performance and efficiency. In particular, SI is suitable for real-time applications.

However, the performance and efficiency of these descriptors depend heavily on the parameter settings. For instance, during the experiments we found that the efficiency of FPFH decreased dramatically with an increasing search radius: with a larger search radius the FPFH descriptor becomes very slow, while the other descriptors are not very sensitive to this parameter. To obtain a clearer and more comprehensive understanding of these feature descriptors, more experiments on parameter tuning should be carried out in the future.