RESEARCH ON HIGH-FREQUENCY ODOMETRY IN LASER SLAM USING POINT-SUBMAP MATCHING

: The stability of front-end odometry output is the cornerstone of real-time mapping and high-precision localization. However, in autonomous driving, factors such as sensor types, driving speed, and signal occlusion can impact system performance, particularly causing trajectory drift, reduced accuracy, and decreased reliability during high-speed travel. This study employs a point-by-point update approach for odometry, achieving high-frequency odometry output and mapping updates nearly matching the point sampling rate, effectively eliminating point cloud distortions and providing solid support for tracking high-speed motion and navigation localization. This method emphasizes maintaining registration accuracy and robustness under high-frequency output conditions. Moreover, by utilizing backward propagation to optimize the estimation of posture and position, and ignoring inter-frame processing, this research implements an iterative extended Kalman filter, achieving tight coupling with IMU data. Additionally, the use of the iVox data structure accelerates point cloud searching, enhancing system processing efficiency. Testing in various scenarios has demonstrated that our method exhibits superior performance and robustness across different scene data, showcasing its potential application in the field of autonomous driving.


INTRODUCTION
SLAM (Simultaneous Localization and Mapping) technology stands as a pivotal innovation in robotics and computer vision, enabling robots and autonomous systems to adeptly navigate uncharted environments while concurrently constructing a map of their surroundings.This technology finds extensive application in autonomous vehicles, drone navigation, and indoor robot maneuvering.Vision-based SLAM systems excel in precise positioning, yet they face challenges in generating dense maps.These challenges arise partly due to environmental factors like lighting and partly due to the constraints of computing resources.on the other hand, LIDAR offers accurate, dense, and direct 3D point cloud data, while the inertial measurement unit (IMU) provides real-time positional information.Together, they form a synergistic, tightly coupled LIO (Lidar-Inertial odometry) system.
Recent advancements in lidar odometry have focused on accumulating lidar points into frames to extract feature points (areas and lines), then matching and merging a point cloud frame with a previously matched submap to create a new submap, or alternatively, conducting direct frame-to-frame matching.This approach somewhat mitigates computational strain; however, it struggles with rapid map drift, particularly in high-speed vehicular movement, due to the reliance on substantial overlap between matching frames.One major drawback of scan-to-map registration is that the odometry is estimated at the rate of the scan (or frame), limiting the odometry output frequency at the frame rate.The limited output frequency will cause a delay in the odometry equal to the scan duration.In this paper.
In this work, we address these issues by point-by-point state update.More specifically, our contributions are as follows: 1) We propose a point-wise LiDAR-inertial odometry (LIO) framework, that integrates LiDAR points at their precise moment of capture, bypassing the need for frame accumulation.This approach effectively eliminates motion distortion within the frame,facilitating high frequency odometry outputs and mapping updates that closely match the point sampling rate; 2) In order to compensate for potential errors in point-submap matching, the system integrates IMU and LIDAR data into the backend.This integration forms a tightly coupled LIO system, utilizing Iteratively Extended Kalman Filter (IEKF) to construct state and motion equations; 3) To enhance the point cloud search rate and maintain the real-time performance of the LIO system, the frontend employs an incremental voxel (IVox) data structure.This structure tracks all map points within a specific map capacity, incorporates an LRU cache into the submap, and passively deletes long-unused voxels when capacity is exceeded.

LiDAR Odometry
Recent advancements in 3D LiDAR odometry and mapping have predominantly harnessed the LiDAR Odometry and Mapping framework (Zhang and Singh, 2014).In this framework, raw LiDAR points are amalgamated into a frame (or scan) to delineate feature points, such as edges and planes.These identified feature points are subsequently aligned with the preceding scan to produce odometry data at the scan frequency (typically 10Hz), and a compilation of several recent scans forms a condensed submap.This submap is then incrementally integrated into the global map at a reduced rate (around 1Hz), facilitating the refinement of the LiDAR's pose in relation to the map.This bifurcated structure of scan-to-scan and scan-to-map integration underpins the LOAM architecture and has catalyzed a plethora of subsequent scholarly inquiries.Notably, Lego-LOAM enhances scan-to-scan matching accuracy through the incorporation of ground constraints, while LINS (Qin et al.,2020) amalgamates IMU data with scan registration.Concurrently, other research endeavors have concentrated on amplifying computational efficiency or augmenting accuracy within this domain.
While the delineation between scan-to-scan and scan-to-map processes substantially mitigates the computational load for odometry, it precipitates a swift accrual of drift during the scanto-scan registration phase.Furthermore, this registration necessitates significant overlap between successive scans, a condition that may not be consistently achievable with small FoV solid-state LiDARs.In response to these challenges, direct scan-to-map (or scan-to-local map) methodologies have gained traction, employing strategies such as point maps, ICP, NDT, Surfel maps, or voxel maps.To specifically address the constrained FoV challenge presented by solid-state LiDARs, some methods have introduced parallel scan-to-map techniques.Moreover, the incorporation of IMU measurements into the scan-to-map registration process, within an optimized iterated Kalman filter framework, has emerged as a promising solution.A pivotal element of the scan-to-map methodology is the sustenance of an adaptable map structure that supports the incremental integration of new scan points while ensuring efficient data access.The advent of an incremental k-d tree, or ikd-Tree, as a map structure, has played a crucial role in equipping systems like FAST-LIO2 (Xu et al.,2022) with the capability to execute real-time odometry and mapping for spinning LiDARs at 10Hz and solid-state LiDARs at 100Hz, even on resource-constrained ARM-based computing platforms.
A notable limitation of the scan-to-map registration process is its dependency on the scan (or frame) rate, which inherently restricts the odometry update frequency to this rate.Consequently, there is a latency in the odometry updates equivalent to the scan duration, creating an unnecessary constraint on the odometry bandwidth, as dictated by the Nyquist Shannon sampling theorem.This challenge has been partially mitigated by innovations such as Lola-SLAM (Karimi et al.,2021) and LLOL (Qu et al.,2022), which segment a scan into multiple subscans, registering each to the map instantaneously upon reception, thus achieving odometry frequencies of 160Hz and 80Hz, respectively.This remarkable enhancement in high-frequency state updates drastically reduces latency to the microsecond level and substantially increases the odometry bandwidth, representing a significant advancement in the field.
Within the expansive domain of spatial data management, voxel hashing stands out as a compelling alternative to traditional hierarchical tree-based architectures for visual SLAM and volumetric 3D reconstruction tasks.Implementations that utilize hashed voxels for map management and nearest neighbor searches unveil the capacity to significantly improve incremental mapping processes in LiDAR-Inertial Odometry (LIO) systems.This method showcases a scalable and efficient approach, which is particularly advantageous for real-time applications, highlighting its potential to revolutionize the landscape of spatial data handling and analysis.

Formulation of Inertial Measurements
In the realm of synergizing Inertial Measurement Unit (IMU) data with LiDAR point registration to augment robotic navigation and mapping capabilities, two primary paradigms emerge: the loosely coupled and tightly coupled approaches.Loosely coupled frameworks utilize IMU measurements to formulate an initial pose estimate, which serves as a precursor for the subsequent scan registration endeavors (Zhao et al.,2019) .In contrast, tightly coupled methodologies integrate IMU data and LiDAR points within a cohesive optimization structure, delineating two main implementations: the Extended Kalman Filter (EKF)-based and the optimization-based techniques.EKF-based (Hemann et al.,2016) methods meticulously assimilate IMU measurements within the EKF's propagation phase to refine pose estimations, which are then coalesced with LiDAR measurements in the EKF's update phase.Alternatively, optimization-based approaches employ IMU data preintegration to ascertain relative pose constraints, integrating these preintegrated constraints with point registration discrepancies to bolster system robustness and precision.Tightly coupled methods are generally recognized for their superior robustness and accuracy over loosely coupled alternatives.In conventional tightly coupled systems, IMU data are processed as inputs within a kinematic model, aiding in the propagation for EKF implementations or pre-integration across a frame's duration.However, this conventional approach may face saturation issues when the robotic motion exceeds the measurement limits of the IMU.In contrast, some research initiatives leverage IMU data to measure angular velocity and linear acceleration, as forecasted by a continuous-time trajectory model.This model then concurrently refines trajectory parameters and LiDAR scan registration factors, effectively counteracting IMU saturation during vigorous movements (Gentil et al.,2019).Our approach resonates with this concept, employing IMU data as output measurements from the model, but innovates by depicting robot motion as a stochastic process intertwined with kinematic dynamics.This unique amalgamation allows for the simultaneous update of the state with LiDAR points within an EKF framework, proficiently managing saturated IMU data during extreme dynamics, like severe vibrations and rapid movements.To the best of our knowledge, this functionality to operate with saturated IMU data within LiDAR-inertial systems is unparalleled in existing studies, signifying a pivotal breakthrough in the domain.

METHOD
Figure 1.System overview of our approach.
Our system architecture, as depicted in Figure 1, operates on sequentially sampled LiDAR points, updating the state at their respective timestamps, thereby yielding an odometry output at an exceptionally high rate, practically achieving 3-6kHz.Specifically, for each received LiDAR point, we conduct a search for a corresponding plane in the map.For state estimation purposes, points in the new scan are registered to the map points maintained in a large local map if they match with the planes deduced in the map, utilizing a tightly-coupled iterated Kalman filter framework.The global map points within the large local map are organized by an incremental voxel structure, iVox (Bai et al.,2022).An LRU (Least Recently Used) cache mechanism is implemented in the local map to automatically purge long-unused voxels from iVox, thereby ensuring system efficiency.Thus, iVox tracks all map points within a designated area, which has a specific length and is employed to compute residuals in the state estimation module.The optimized pose eventually registers the points from the new scan into the global coordinate system, integrating them into the map by inserting into iVox at an odometry rate.This integration exemplifies our system's sophisticated mechanism for dynamic map management and state estimation, marking a significant stride in LiDAR-inertial navigation technology.

State Estimation
LiDAR systems typically operate through point-by-point sampling.Our methodology processes each point directly, rather than between frames, thus obviating the need to estimate the LiDAR pose for each point in the point cloud relative to a specific moment in time.However, to enhance positioning accuracy and system stability, we have adopted the backward propagation method introduced in FAST-LIO, which estimates the LiDAR pose at the end of each scan for every point based on IMU measurements.This approach allows for precise and stable localization by directly correlating each LiDAR point with its respective pose at the scan's conclusion, leveraging the high-resolution temporal data provided by the IMU.
The system nominal state x , error state x  , IMU input u and noise  are defined as follows:  Using the IMU measurements as input, the nominal state values are predicted using a discrete-time propagation model: with a linearized discrete-time error state transfer equation( 7), the covariance matrix of the error state is forward propagated as equation ( 8).
For the detailed forms of readers can refer to FAST-LIO (Xu et al.,2021).In addition, the result of the forward propagation are used for motion compensation and registration.We follow the direct method of FAST-LIO2 by downsampling the sub-frame lidar points and searching the plane for each point, then constructing point-toplane residuals after fitting the local plane patches: Where p is the corresponding j-th lidar point,means the -th k iteration.Using the point-to-plane residuals as measurements, data association and error state update operations are performed iteratively FAST-LIO2: During the above iterations, we have simultaneously performed correction of the nominal state and reset of the error state.When the iterations converge, the state covariance is updated with the following equation:

Data Structure of IVox :
In iVox, point clouds are initially stored within sparse voxels, which are indexed and hashed into an unordered map via a hash function.Given the sparse nature of LiDAR point clouds, we eschew volumetric representations like TSDF, opting instead for a sparse hashing map that only stores voxels containing at least one point.The hash indices are computable using any spatial hashing algorithm, facilitating efficient and effective storage and retrieval of the point cloud data within the iVox structure, optimizing both memory usage and computational resources.Within each voxel in iVox, points are stored as a vector, forming the fundamental internal structure.K-Nearest Neighbor (k-NN) searches are conducted within each voxel with a complexity of   O n ，where n is the number of points in the voxel.To prevent excessive point insertion into a single voxel during the incremental mapping phase of map construction, a threshold is established.The k-NN search is constrained within a predefined range and unfolds in three stages.Given an iVox structure V and a query point P ， the following steps are executed ： 1) Identify the voxel index and adjacent voxels, denoted as S. 2) Traverse each voxel in S, conducting a search for up to K nearest neighbors within each voxel.3) Aggregate the search results and select the optimal K nearest neighbors.
The k-NN search within iVox is relatively straightforward and efficient compared to tree-based algorithms.

Incremental Mapping:
The incremental mapping in iVox encompasses both the addition and deletion of points.Addition is straightforward; to enhance voxel search efficiency, an octree data structure is integrated within each voxel, facilitating the process through the insertion of new points and the creation of new voxels as required.To prevent excessive point accumulation in a single voxel, a filtering approach akin to VoxelGrid, similar to that employed in FastLIO2, is utilized to skip unnecessary point insertions.Given that the nearest neighbor voxel indices are precomputed, a current point is not inserted if any neighbor is closer to the center of the voxel grid.
The leaf size of the filter is a crucial parameter in balancing precision and speed; a larger leaf size prevents too many insertions into a single voxel but can reduce the accuracy of the k-NN search.Drawing from the experiences with Faster-LIO, a leaf size of 0.5 meters typically performs well across most datasets.Incremental deletion in iVox differs from the k-d tree structure, where the latter traverses the entire map for operations, a process too resource-intensive when applied to an entire voxel hash map.Therefore, instead of actively removing points outside the current field of view, we employ a Least Recently Used (LRU) cache strategy within a time window to maintain the local map.In addition to the voxel hash map, we also keep a queue of recently accessed voxels and establish a maximum size for it.Should the voxel count exceed this limit, we eliminate the oldest voxels in memory.Given that both insertion and deletion operations in the hash map are of O(1) complexity, this approach is sufficiently efficient and accurate for our LIO system, balancing precision and operational speed effectively.

Platforms
The experimental setup in this study employs the AGILE mobile platform, which is equipped with a nine-axis IMU for precise inertial navigation, four infrared obstacle avoidance sensors, and wheel motors as the propulsion components, achieving a maximum speed of 5 m/s and a load capacity of 50 kg.The control system utilizes a high-performance industrial computer with an i7 CPU and 16G of memory, supporting the ROS operating system, ensuring robust computational power and stability.Additionally, the experimental platform is outfitted with HESAI's Pandar40 LiDAR, capable of 360degree omnidirectional scanning, high-density point cloud output, and long-range distance measurement, facilitating efficient, accurate, and comprehensive environmental perception.
On the software side, the experimental platform is operational on Ubuntu 20.04 and integrates with the ROS, offering robust software support and an extensive algorithm library for the experiments

Mapping effect analysis
The mapping results of a park, a teaching building, and a corridor are respectively showcased in Figures 5, 6 b1) and ( c1)) and our method (as seen in ( b2) and ( c2)).Finally, to illustrate inframe motion distortion, points from a scan cycle (0.1 seconds) in the selected local areas are displayed (sub-figure (d)), along with a further zoom-in (sub-figure (e)), for both FAST-LIO2 (as (d1) and (e1)) and our method (as (d2) and (e2)), where green points represent the LiDAR points registered in the current scan, and white points denote the mapping results accumulated up to the current scan.
Sub-figure (c) of Figures 5, 6, and 7 reveals that the overall map of FAST-LIO2 (c1) is noticeably thicker than our map (c2).This phenomenon is attributed to the in-frame motion compensation within each individual scan, as depicted in sub-figures (e) of Figures 5, 6, and 7.It is observed that all red points surrounding the selected wall should belong to the same plane; however, for FAST-LIO2 (e1), these points are dispersed off the wall due to in-frame motion distortion.This distortion is significantly mitigated in our method, demonstrating enhanced performance in maintaining structural consistency in the mapping process.FAST-LIO2 utilizes backward propagation based on IMU measurements to project all scanned points to the pose at the end of the scan.In contrast, our method optimizes the estimation of pose and position, discarding the inter-frame processing, a process that can be easily affected by IMU measurement noise, bias estimation errors, and the limited IMU sampling rate.Specifically, sensor vibrations cause rapid changes in acceleration and angular velocity within an IMU sample interval, leading to significant IMU propagation errors under the assumption that angular velocity and acceleration are constant during the sampling interval.In contrast, by integrating LiDAR points at their actual sampling times without accumulating any points, our approach fundamentally eliminates motion distortion, as demonstrated in sub-figures (c1) and (c2).

Mapping accuracy analysis
In the SLAM algorithm domain, the current standard for accuracy evaluation involves the use of the odometry and SLAM evaluation tool (evo) to determine the trajectory's precision.The primary evaluation metrics include Absolute Pose Error (APE) and Relative Pose Error (RPE).APE analysis compares the pose data computed by the system against the assumed true trajectory based on continuous time data, deriving trajectory discrepancies at each corresponding time point to quantify the trajectory error.Conversely, RPE assesses the root mean square error of the Lie algebra for each pose along the trajectory over continuous time, factoring in both rotational and translational errors of the motion pose.The error metrics for APE and RPE cover various dimensions, encompassing maximum, minimum, average, median error values, standard deviation, root mean square error, and other statistical indices, providing a comprehensive evaluation of the trajectory accuracy.
Regarding the robustness of the laser SLAM system, this study conducted multiple experiments on the gate, street, and hall sequences of the M2DGR dataset to assess the algorithm's robustness across different scenarios.The performance was compared with the well-regarded FAST-LIO2 algorithm, and various error metrics were analyzed.The findings are as follows: The   We employed seven sequences to evaluate the mapping accuracy of our system.Figures 8, 9, and 10 depict the Absolute Trajectory Error (ATE) comparisons between FAST-LIO2 and our method across different scenarios within these sequences, with our method represented on the right side of each figure.In Figures 8 and 10, the trajectory accuracy margin between the two methods is minimal, illustrating that our point-by-point update strategy facilitates high-frequency state updates.Figure 9, set in a scenario featuring dense rotations and moving objects, suggests that our algorithm may better exploit the high resolution and update rate of LiDAR data, providing a more coherent and detailed environmental representation.This capability allows for a more precise capture of dynamic changes in the environment, thereby demonstrating superior mapping accuracy.Table1.Comparison of RMSE (meters) for ours, FAST-LIO2 and Faster-LIO on public sequences.
Table 1 illustrates that our system achieved comparable or superior accuracy relative to other algorithms.Notably, in the hall_05 sequence, where the robot executed numerous rotations, our system enhanced the precision significantly using the pointby-point update strategy and the ESKS framework.However, in the street_06 and gate_02 sequences, the accuracy of our system and Faster-LIO was observed to be lower than that of FAST-LIO2, which can be attributed to the inherent precision advantage of the k-d tree-based LIO method over voxel-based approaches.In these experiments, FAST-LIO2, Faster-LIO, and our system all employed a 0.5-meter voxel filter to process the raw point cloud data.These experimental outcomes underscore the high robustness and precision of our method under dynamic and challenging environmental conditions, demonstrating its potential applicability in the realms of autonomous driving and robotic navigation.

CONCLUSION
This paper integrates a robust LiDAR-Inertial Odometry (LIO) framework based on an innovative point-by-point update scheme that refreshes the system state at the actual sampling time of each point, thereby avoiding point accumulation within a frame.This strategy effectively addresses the longstanding issue of in-frame motion distortion and enables high-frequency odometry outputs, facilitating the tracking of rapid movements.Moreover, the integration of map points through the iVox data structure accelerates map searching.Our method has been rigorously tested across various scenes and motion patterns in both public datasets and proprietary data collections, demonstrating excellent mapping results and comparable odometry accuracy against other advanced algorithms, proving its capability in mapping tasks under high-speed motion.Future work aims to develop a reliable mapping method that relies solely on LiDAR, independent of inertial measurement units, to further enhance the robustness and efficiency of the mapping process.
IMU position, velocity, attitudein   the global frame, a b ， g b are the IMU biases and G g is thegravity in the global frame.
n  are the measurement noise of m a , m  and ba n , b n  are the random walk noise of IMU biases.
of the j-th plane patch respectively, l j

Figure 2 .
Figure 2. Illustration of direct registration of LiDAR point to map.

Figure 4 .
Figure 4. Update the iVox with LRU cache.Relocate the newly created and recently utilized voxels to the back of the cache, and purge those that have remained unused for a predetermined duration.
, and 7.In each figure, we initially present a global view of the final mapping outcome (sub-figure (a)), followed by a focus on specific local areas containing large planes, such as walls (subfigure (b)).We investigate the consistency of points on the wall (sub-figure (c)), which facilitates a comparison of mapping accuracy between FAST-LIO2 (as seen in (

Figure 5 .
Figure 5. Illustration of motion distortion for the Park.

Figure 6 .
Figure 6.Illustration of motion distortion for the teaching building sequence.

Figure 7 .
Figure 7. Illustration of motion distortion for the corridor.

Figure 8 .
Figure 8.Comparison of APE on the M2DGR dataset's gate_02 sequence, with FAST-LIO2 on the left and our method on the right.

Figure 9 .
Figure 9.Comparison of APE on the M2DGR dataset's hall_05 sequence, with FAST-LIO2 on the left and our method on the right.

Figure 10 .
Figure 10.Comparison of APE on the M2DGR dataset's street_06 sequence, with FAST-LIO2 on the left and our method on the right.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLVIII-4/W10-2024 8th International Conference on Smart Data and Smart Cities (SDSC), 4-7 June 2024, Athens, Greece 3 P  R .The s is the voxel size, x n , y n , z n are large prime numbers and N is the size of hash map, correspondingly.