SMARTPHONE LEVEL INDOOR/OUTDOOR UBIQUITOUS PEDESTRIAN POSITIONING 3DMA GNSS/VINS INTEGRATION USING FGO

This paper discusses the challenges of ubiquitous smartphone pedestrian positioning in urban canyons and GNSS-denied areas such as indoor spaces. Existing sensor-based techniques, including GNSS, INS, and VIO, have limitations that affect positioning accuracy and reliability. A machine learning-based approach using a Support Vector Machine (SVM) is proposed to perform indoor/outdoor (IO) detection from GNSS measurement data. The proposed system integrates local estimates from VIO with 3D mapping aided (3DMA) GNSS measurements using Factor Graph Optimization (FGO), together with an IO detection switch, to estimate a precise pose and eliminate global drift. The effectiveness of the system is evaluated through real-world experiments that produce notable outcomes.


INTRODUCTION
Smartphone positioning is challenging in urban canyons: Smart mobility has recently become popular, driven by various sensors, cutting-edge intelligence, and next-generation networks. Among these intelligent techniques, a wide variety of mobile positioning methods have been proposed over time to support Location-Based Services (LBS). Smartphones are fast becoming a key instrument in mobile positioning. They carry several sensors, for instance, Wi-Fi, an inertial sensor, a magnetometer, and a monocular camera (Sheta et al., 2018). The positioning system can be configured with different combinations of sensors to provide reliable localization information and achieve all-round positioning. The challenge of mobile Global Navigation Satellite System (GNSS) positioning is sensor perception in urban areas. Rajak (Rajak et al., 2021a) and Wen (Wen et al., 2021a) (Wen et al., 2019) have validated the unsatisfactory performance of GNSS positioning in the urban environment with conventional positioning methods. Poor positioning accuracy strongly affects user experience, especially for smartphone users. Extra information can be provided to aid the positioning in urban canyons. 3D building models provide perception of the actual environment and enable a software-based aiding approach for low-cost devices, namely 3D mapping aided (3DMA) GNSS (Groves, 2016). Existing studies by (Zhong & Groves, 2022) and (Ng et al., 2021) show a superior positioning performance of 3DMA GNSS in urban canyons. Doppler measurements are often integrated with the position solution to provide smoother positioning results. (Ng et al., 2022) integrates 3DMA GNSS and the velocity estimated from the Doppler frequency in a loosely coupled manner using Factor Graph Optimization (FGO). This study also adopts the loosely coupled 3DMA GNSS and Doppler velocity to provide more accurate positioning in a global frame.
GNSS suffering from outages in indoor scenarios: Augmenting GNSS with different sensors as a fused solution can improve its performance in urban cities, particularly when the operating environment affects the GNSS, for example when solution outages occur because too few satellites are visible to resolve the position. GNSS positioning degrades in the indoor environment because the received signals are attenuated and scattered by numerous objects. For example, (Rajak et al., 2021a) found that the signal strength of GPS decreased by 10 to 12 decibels, leading to a rapid reduction in positioning accuracy. In addition, scale degeneracy occurs in rotation-only or constant-velocity motions because of the lack of direct distance measurements (Zhu et al., 2019). Hence, IO (Indoor/Outdoor) detection is the key to ubiquitous positioning. To accomplish a ubiquitous indoor localization framework, many researchers have conducted IO detection using GPS measurements (Rajak et al., 2021a) (Pei et al., 2009) (Wang et al., 2020). The framework should not require any extra infrastructure but should use the existing built-in sensors of the smartphone. The recent developments in IO detection have the potential to assist positioning.

Visual/inertial odometry can help but is subject to drift over time:
To assist indoor positioning, visual-inertial odometry is fully employed in the GNSS-denied area for bridging the GNSS gaps (Huai et al., 2015). Although Visual-Inertial Odometry (VIO) is hard to implement within the speed and latency constraints, all stages are refined with a nonlinear optimization (He et al., 2018). The effectiveness of each sensor varies between indoor and outdoor environments. Consequently, the attenuated and scattered signal received indoors is not used to estimate the indoor position. GNSS, the Inertial Measurement Unit (IMU), and the monocular camera are each used for their respective strengths. A loosely coupled GNSS/VINS (Visual Inertial Navigation System) integration is developed in this paper.

Adaptive 3DMA GNSS/VINS integration is promising:
The multi-sensor integration approach applied for the 3DMA GNSS/VINS is FGO in a loosely coupled way. FGO is a nonlinear optimization represented as a probabilistic graphical model. After factorization, the problem is transformed into a factor graph that models the relationship between the poses and estimates their values. It adapts to various changes in the dynamic environment (Chen et al., 2016). Moreover, FGO has been demonstrated to use most of the feature constraints to obtain the optimal trajectory, resulting in higher accuracy and efficiency and a robust estimation (Rajak et al., 2021b).
The proposed method in this paper: In this research, smartphone-level ubiquitous mobile pedestrian positioning is proposed. The first objective of the research is to use a machine learning-based method, the Support Vector Machine (SVM), to classify and detect IO transitions from the GNSS measurements. As satellite signals cannot be received reliably indoors, the GNSS positioning performance degrades when moving from the outdoor environment into the indoor environment during seamless positioning. The IO detection assists the system in switching from GNSS/VINS to VINS. The second objective is the GNSS/VINS alignment. The global positions and local poses must be expressed in the same coordinate system. Through the continuous dynamic motion of the smartphone, the heading of the device can be derived from the accelerations of the accelerometer together with the position and velocity measurements of the GNSS receiver. The third objective is the integration of the loosely coupled GNSS/VINS using FGO. FGO constrains all factors relating to the GNSS/VINS performance and the IO detection results to smooth indoor and outdoor transitions. The result is a low-cost system that achieves precise loosely coupled positioning in a dynamic and complex high-density environment with a monocular camera, a Micro Electro Mechanical Systems (MEMS) IMU, and GNSS for performance evaluation. This can improve the overall accuracy to the sub-meter level while reducing cost and improving accessibility. This paper is organized as follows: In Section 2, relevant literature is discussed. In Section 3, a system overview, SVM classification, 3DMA GNSS/VINS, and FGO are presented. In Section 4, the experiment results of the 3DMA GNSS/VINS are shown. Finally, the paper is concluded in Section 5.

Introduction to GNSS/VINS integration for pedestrian positioning
Pedestrian positioning in urban and indoor environments has been a topic of research interest, with GNSS and VINS being two common positioning technologies. The complementary nature of these technologies makes them suitable for integration to provide precise and continuous positioning. However, integrating these technologies faces challenges due to measurement errors, such as multipath, signal obstructions, and drift.
Improving GNSS performance in challenging areas: GNSS is widely applied to provide continuous positioning in the global frame in absolute coordinates. However, the performance is usually unsatisfactory due to the blockage or reflection of the signals by buildings, resulting in non-line-of-sight (NLOS) reception and multipath effects (Groves, 2013). Improving the GNSS positioning can benefit the whole positioning system, and researchers are trying to mitigate the NLOS error to improve the GNSS performance alone. Recent research has explored integrating VINS for pedestrian positioning using loosely and tightly coupled integration approaches, combining measurements from GNSS, INS, and visual sensors for accurate pose estimation in GPS-denied environments. Evaluation of this approach in real urban scenarios has confirmed its ability to improve positioning accuracy compared to standalone GNSS or INS systems (Falco et al., 2017). Despite the potential benefits, GNSS/VINS integration still faces several challenges and limitations, including dealing with signal loss, reducing computational complexity, and improving robustness in various environmental conditions. One significant challenge of GNSS/VINS integration is the occurrence of multipath errors.

Smartphone-level pedestrian positioning: challenges and opportunities
Smartphone-level pedestrian positioning is essential for LBS. However, providing accurate positioning in real time poses challenges due to smartphones' limited processing power, memory, and battery life. State-of-the-art positioning techniques, such as GNSS and VIO, have been used in smartphone-level pedestrian positioning. Recently, various techniques have been proposed to improve pedestrian positioning accuracy using smartphones. One such technique is Pedestrian Dead Reckoning (PDR), which uses inertial sensors such as accelerometers and gyroscopes. An efficient PDR algorithm was developed using a low-cost MEMS IMU for smartphones and was shown to be low-cost, simple, and easy to use compared to other methods (Jimenez et al., 2009). Even though the most significant advantage of PDR is its independence from infrastructure (Pratama et al., 2012), it also has limitations, such as low accuracy and drift.
Integrating VIO with other sensors has been proposed to address these challenges and achieve more precise localization. One such approach is to integrate VIO with Light Detection and Ranging (LiDAR) using Simultaneous Localization and Mapping (SLAM) algorithms (Debeunne & Vivet, 2020). SLAM algorithms use visual and LiDAR data to generate a more accurate and detailed map of the surroundings, improving accuracy, robustness, and efficiency. LiDAR-based pedestrian tracking is also a promising technique due to its high accuracy and robustness in various environments. However, this approach is limited by high computational complexity and cost.
One popular approach to improve GNSS positioning in urban canyons is to use a 3D building model to identify and correct the NLOS reception error, namely 3DMA GNSS (Groves, 2016). 3DMA GNSS is usually implemented as a particle-based approach. Measurements are predicted at each distributed position hypothesis candidate. The candidate with the highest similarity between the modeled and the actually received measurements is assumed to be the receiver location. 3DMA GNSS is commonly divided into shadow matching and ranging-based 3DMA GNSS. Shadow matching (Wang et al., 2013) (Wang et al., 2015) matches the satellite visibility across distributed locations. Meanwhile, ranging-based 3DMA GNSS provides reflection delays for the NLOS-predicted pseudo-range modelling. The delay estimation can be done with a geometrical approach, such as ray-tracing GNSS (Hsu et al., 2016) (Miura et al., 2015) and Skymask 3DMA (Ng et al., 2020). Both approaches validate the signal transmission path and calculate the reflection delay with the predicted reflecting point. The likelihood-based ranging (Zhong & Groves, 2022) uses a skew-normal distribution to model the NLOS delay measurements statistically. It then remaps the errors to the LOS case with a normal distribution. Extending the single-epoch positioning approach to temporally connected epochs can increase the robustness of the positioning performance. (Zhong & Groves, 2022) adopts a grid filter to distribute positioning candidates evenly to improve the smoothness of the solution. An alternative is to use FGO to connect the temporal domain as a batch optimization to increase the overall robustness and smoothness. (Ng et al., 2022) integrates 3DMA GNSS with the velocity estimated from Doppler measurements as a loosely coupled solution, and the states are optimized via FGO. The results show that it can provide a more robust trajectory for pedestrian applications.
VINS positioning and navigation: VINS are integrated navigation systems that combine visual and inertial sensors to estimate a platform's position and orientation. Despite the promising results of VINS in various applications, they face several challenges, including time drift and accumulated error, particularly in complex environments. The residual errors associated with visual and inertial measurements include the discrepancy between the prediction by the IMU and the position estimated by visual odometry. Different algorithms have been evaluated to address these issues, such as keyframe-based methods, inertial measurement preintegration, nonlinear optimization methods, and machine learning-based methods (Leutenegger et al., 2015) (Zhang & Scaramuzza, 2018) (Forster et al., 2017) (Han et al., 2019). For instance, high accuracy and efficiency can be obtained by preintegrating the inertial measurements between selected keyframes into single relative motion constraints and using keyframes to reduce computational complexity (Forster et al., 2017). While the nonlinear optimization method can attain a highly accurate state estimation (Zhang & Scaramuzza, 2018), real-time optimization quickly becomes infeasible as the trajectory grows over time, because the inertial measurements arrive at a high rate (Forster et al., 2017). DeepVIO has explored the potential applications of VIO in various fields and identified future research directions, such as deep learning techniques to improve performance (Han et al., 2019). Nonetheless, time drift is the primary uncertainty when using VINS.

Review of existing GNSS/VINS integration approaches for pedestrian positioning
The Extended Kalman Filter (EKF) is an effective method for sensor fusion, but it relies on a linear approximation of the system dynamics, leading to reduced accuracy in nonlinear systems. Furthermore, the EKF's computational complexity scales quadratically with the number of 3D landmarks, limiting its scalability (He et al., 2018). In dense urban areas, the EKF fails to achieve optimal performance due to the accumulation of Gaussian errors (Wen et al., 2021b). Nonlinear optimization methods have been proposed to address this issue. FGO is another approach for pedestrian positioning that models the relationships between the observed measurements and the unknown states of the system, leading to high accuracy and efficiency. FGO handles noisy or incomplete data better than the EKF and is better equipped to handle changing dynamics, whereas the EKF struggles with nonlinearities in the system dynamics. However, FGO's computational requirements may be higher, and accurate modeling of the system dynamics is necessary for optimal performance. Ultimately, the choice of approach depends on the specific application and environmental conditions. To mitigate the challenges associated with integrating VINS and GNSS, including errors in VINS measurements (e.g., drift) and errors in GNSS measurements (e.g., multipath and signal obstructions), the integration of 3DMA GNSS and VINS using FGO can enhance accuracy, robustness, and reliability in complex and challenging environments.

Development of a framework for Integrating 3DMA GNSS and VINS using FGO for Pedestrian Positioning
In summary, this research develops a framework for smartphone-level pedestrian positioning to improve accuracy, robustness, and efficiency in urban and indoor environments. Several challenges, such as signal obstructions, drift, and multipath, limit the effectiveness of the existing techniques. To address these challenges, the proposed framework integrates 3DMA GNSS and VINS using FGO to enhance accuracy, robustness, and reliability in complex and challenging environments. The framework emphasizes the importance of IO detection, uses a machine learning-based method (SVM) to distinguish indoor from outdoor, and thereby increases the robustness of the positioning performance.

Figure. 1 Flowchart of the proposed system GNSSVINS-IO FGO
The positioning framework proposed in this paper is shown in Figure 1. This paper demonstrates a machine learning-based method that uses the SVM to classify IO. Features from the GNSS measurements, such as the received and used satellite numbers and the elevation angle, are extracted for IO classification.
Improving the GNSS positioning accuracy is the key to satisfactory performance under global coordinates. This study integrates the state-of-the-art 3DMA GNSS algorithms, namely shadow matching and likelihood-based ranging 3DMA GNSS. The 3DMA GNSS solution is then optimized with Doppler measurements using FGO in a loosely coupled fashion to increase the robustness. The full implementation can be found in the referenced work. 3DMA GNSS distributes the hypothesis positioning candidates around the initial position. Measurements are then simulated at each candidate and compared with the received measurements. The solution combining shadow matching and likelihood-based ranging 3DMA GNSS is then passed to the FGO integration.
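The candidate-based matching described above can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the square grid, and the `toy_visibility` lookup (a stand-in for a real 3D building model) are all assumptions made for illustration. Each candidate is scored by the fraction of satellites whose predicted visibility agrees with the observed visibility, in the spirit of shadow matching.

```python
from typing import Callable, Dict, List, Tuple

Candidate = Tuple[float, float]  # (east, north) in metres, hypothetical local grid


def make_grid(center: Candidate, half_width: float, step: float) -> List[Candidate]:
    """Distribute position hypotheses on a square grid around the initial position."""
    n = int(round(half_width / step))
    return [(center[0] + i * step, center[1] + j * step)
            for i in range(-n, n + 1) for j in range(-n, n + 1)]


def score(candidate: Candidate, observed: Dict[str, bool],
          predict_visibility: Callable[[Candidate, str], bool]) -> float:
    """Fraction of satellites whose predicted visibility matches the observation."""
    matches = sum(predict_visibility(candidate, sat) == vis
                  for sat, vis in observed.items())
    return matches / len(observed)


def best_candidate(candidates, observed, predict_visibility) -> Candidate:
    """Pick the hypothesis with the highest similarity score."""
    return max(candidates, key=lambda c: score(c, observed, predict_visibility))


# Toy stand-in for a 3D building model (assumption): a wall along east = 0
# hides the western satellite from receivers east of it, and vice versa.
def toy_visibility(candidate: Candidate, sat: str) -> bool:
    east = candidate[0]
    if sat == "west_sat":
        return east < 0.0
    if sat == "east_sat":
        return east > 0.0
    return True  # high-elevation satellite, always visible


observed = {"west_sat": False, "east_sat": True, "zenith_sat": True}
grid = make_grid(center=(0.0, 0.0), half_width=10.0, step=5.0)
estimate = best_candidate(grid, observed, toy_visibility)
```

In this toy case every candidate east of the wall matches all three observations, so the visibility pattern only constrains the across-street direction; a ranging-based likelihood would be needed to resolve the remaining ambiguity.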
Meanwhile, the receiver velocity and clock drift are estimated from the Doppler measurements of every satellite at each epoch via the least-squares (LS) method (Wen & Hsu, 2021). The 3DMA GNSS solution is also integrated with the Doppler-estimated velocity using FGO to increase the position robustness. The FGO structure consists of three factors. The first error factor constrains the optimized state to the 3DMA GNSS solution, weighted by the diagonal variance matrix of the 3DMA GNSS. The other two factors connect two consecutive epochs based on the motion propagation model and the constant velocity motion model (Li et al., 2018), given in Equation (3). These factors depend on the time difference between the two consecutive epochs, on a diagonal covariance associated with the velocity along the x-, y-, and z-axes, and on the averaged diagonal covariance matrix of the two epochs. FGO therefore minimizes the total error of the three cost functions of the loosely coupled 3DMA GNSS over the state set of the receiver, which yields the optimal state set.
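Because the symbols of the original expressions did not survive, the following LaTeX block gives one plausible form of the three factors and the total cost, written only from the verbal description above. All notation is assumed: x_t is the receiver position state, v_t the Doppler-derived velocity, \hat{x}^{3DMA}_t the 3DMA GNSS solution, \Delta t the time difference between consecutive epochs, and the \Sigma terms the covariance matrices mentioned in the text. It should be read as a sketch, not as the authors' equations.

```latex
% Assumed notation; a sketch of the loosely coupled 3DMA GNSS / Doppler factors.
\begin{align}
e^{\mathrm{3DMA}}_{t} &= \left\| \hat{\mathbf{x}}^{\mathrm{3DMA}}_{t} - \mathbf{x}_{t} \right\|^{2}_{\boldsymbol{\Sigma}^{\mathrm{3DMA}}_{t}} \\
e^{\mathrm{MP}}_{t} &= \left\| \mathbf{x}_{t+1} - \mathbf{x}_{t} - \mathbf{v}_{t}\,\Delta t \right\|^{2}_{\boldsymbol{\Sigma}^{\mathrm{v}}_{t}} \\
e^{\mathrm{CV}}_{t} &= \left\| \mathbf{v}_{t+1} - \mathbf{v}_{t} \right\|^{2}_{\bar{\boldsymbol{\Sigma}}_{t,t+1}} \\
\boldsymbol{\chi}^{*} &= \arg\min_{\boldsymbol{\chi}} \sum_{t} \left( e^{\mathrm{3DMA}}_{t} + e^{\mathrm{MP}}_{t} + e^{\mathrm{CV}}_{t} \right)
\end{align}
```

Here \|\cdot\|^{2}_{\Sigma} denotes the squared Mahalanobis norm, so a factor with a larger covariance contributes less to the total cost.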
To assist with indoor positioning, visual-inertial odometry is fully employed in the GNSS-denied area for bridging the GNSS gaps (Leutenegger et al., 2015). As the input, the images, the angular velocities, and the accelerations are pre-processed. A pose graph is defined by skipping frames, pre-integrating the IMU between keyframes, and representing most of the IMU measurements as a single pose constraint so that the IMU data remain manageable. The IMU pre-integration includes posterior IMU bias correction and is aligned with the visual measurements to generate the VIO, which provides times, positions, orientations, and velocities. The VIO is then loosely fused with GNSS, and a global optimization is performed. All stages are refined with a nonlinear optimization (Shen et al., 2015). Besides the visual and inertial factors, the effectiveness of each sensor varies between indoor and outdoor environments. Consequently, a switch factor is also applied for indoor positions, where only attenuated and scattered signals are received. A loosely coupled GNSS/VINS integration is then generated to achieve pedestrian positioning, in which GNSS, the IMU, and the monocular camera are each used for their respective strengths. This research focuses on providing reliable sensor fusion in two ways: 1) selecting the reliable sensor for integration; 2) providing complete robustness during integration. As a result, reliable smartphone-level ubiquitous mobile pedestrian positioning is proposed. There are three contributions to this study:
1) Developing a machine learning-based IO detection method that uses GNSS measurements as features to select the reliable sensor during integration and maximize the positioning performance.
2) Aligning the coordinate frames between the INS in a local frame and the GNSS in a global frame.
3) Loosely integrating the GNSS and VINS solutions as a batch using FGO to provide complete robustness.

Indoor Outdoor Detection Approach
IO detection: Indoor refers to a physically confined area; outdoor refers to an area that is not completely confined (Bai et al., 2022) (Yan et al., 2019).
Figure. 2 Flowchart of the SVM
To classify IO, an SVM is applied to GNSS features extracted for IO classification, including the received and used satellite numbers, the received average carrier-to-noise ratio (C/N0), and the elevation angle. It is a supervised learning approach that produces a binary result. The data was collected on a university campus, both indoors and outdoors, and the method is fully implementable on the phone. Figure 2 illustrates the algorithm framework of the proposed IO detection. The algorithm relies on GNSS information for which the class is already known for a subset of the elements, such as the received and used satellite numbers, the received average C/N0, and the elevation angle, and these elements need to be categorized as Indoor or Outdoor. The binary classification result is based on the initial assumption about the relationships between the elements, expressed as pattern data, which either belong to Indoor or do not belong to Indoor.
The whole framework consists of two primary phases. The first is the training phase, which takes the presumptive classification (IO) and the expression data (GNSS measurements) as inputs to generate a set of weights. These weights are then used in the subsequent phase. The presumptive classification of IO was done manually to provide ground truth information. The second is the classification phase, which uses the weights from the training phase and the expression data to assign a score to each element. Each element is classified into the relevant category or excluded depending on the score.
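The two phases above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's classifier: it trains a linear SVM by batch sub-gradient descent on the regularised hinge loss, the feature values are invented, and the hyper-parameters (`epochs`, `lr`, `reg`) are arbitrary. The kernel, features scaling, and training procedure actually used in the paper are not specified in the text.

```python
from typing import List, Sequence, Tuple

Sample = Sequence[float]  # (satellites used, mean C/N0 in dB-Hz, mean elevation in deg)


def standardize(data: List[Sample]) -> Tuple[List[float], List[float]]:
    """Per-feature mean and standard deviation of the training set."""
    n, d = len(data), len(data[0])
    means = [sum(row[k] for row in data) / n for k in range(d)]
    stds = [max((sum((row[k] - means[k]) ** 2 for row in data) / n) ** 0.5, 1e-9)
            for k in range(d)]
    return means, stds


def train_linear_svm(data, labels, epochs=500, lr=0.05, reg=0.01):
    """Training phase: batch sub-gradient descent on the regularised hinge loss."""
    means, stds = standardize(data)
    xs = [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in data]
    w, b = [0.0] * len(means), 0.0
    for _ in range(epochs):
        grad_w, grad_b = [reg * wk for wk in w], 0.0
        for x, y in zip(xs, labels):
            margin = y * (sum(wk * xk for wk, xk in zip(w, x)) + b)
            if margin < 1.0:  # sample violates the margin, so it contributes
                grad_w = [g - y * xk / len(xs) for g, xk in zip(grad_w, x)]
                grad_b -= y / len(xs)
        w = [wk - lr * g for wk, g in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b, means, stds


def classify(sample: Sample, model) -> str:
    """Classification phase: score the sample with the learned weights."""
    w, b, means, stds = model
    x = [(v - m) / s for v, m, s in zip(sample, means, stds)]
    decision = sum(wk * xk for wk, xk in zip(w, x)) + b
    return "indoor" if decision >= 0.0 else "outdoor"


# Hypothetical, manually labelled training epochs (+1 = indoor, -1 = outdoor).
train_x = [(4, 18.0, 55.0), (3, 15.0, 60.0), (5, 20.0, 50.0), (2, 14.0, 65.0),
           (18, 38.0, 35.0), (22, 41.0, 30.0), (20, 40.0, 33.0), (16, 36.0, 38.0)]
train_y = [1, 1, 1, 1, -1, -1, -1, -1]
model = train_linear_svm(train_x, train_y)
```

A new epoch would then be labelled with, for example, `classify((3, 16.0, 58.0), model)`. Standardising the features first keeps the satellite count, C/N0, and elevation angle on comparable scales, which matters for a margin-based classifier.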

VINS
After the camera image features are detected, tracked, and matched to the next image, a visual trajectory is constructed from Structure from Motion (SfM). It must be aligned with the IMU to recover the scale, velocity, gravitational acceleration, and IMU deviation (Zhu et al., 2019). However, updating the velocity requires every IMU rotation in the world coordinate system, and updating the translation requires the IMU velocities and rotations in the world coordinate system. Since the IMU runs at hundreds of Hz, it is impractical to update all the states at every measurement. Therefore, the pose graph is defined by skipping frames, pre-integrating the IMU between keyframes, and representing most of the IMU measurements as a single pose constraint.
IMU pre-integrations: This study uses the continuous-time quaternion-based approach previously developed in (Qin et al., 2018), with the inclusion of IMU biases as described in (Qin et al., 2019) (Qin & Shen, 2018). To avoid redundancy, this section provides only a concise overview of the method. IMU measurements, which are measured in the body frame, combine the force for countering gravity and the platform dynamics, and are affected by the acceleration bias, the gyroscope bias, and additive noise. The raw gyroscope and accelerometer measurements can be expressed as functions of these quantities. The additive noise in the accelerometer and gyroscope measurements is assumed to follow a Gaussian white noise distribution, and the accelerometer and gyroscope biases are modeled as random walks whose derivatives are also Gaussian white noise (Equation 6). Equations 7 and 8 are computed using only the IMU measurements within the time span between two consecutive frames, expressed in a reference frame given the bias.
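For reference, the measurement model described in words above is commonly written as follows in the cited VINS-Mono line of work (Qin et al., 2018). The symbols are assumed here because the originals were lost in this text: hats denote raw measurements, b_a and b_\omega are the accelerometer and gyroscope biases, n_a and n_\omega the additive noises, R^t_w the rotation from the world frame to the body frame, and g^w the gravity vector in the world frame.

```latex
% Assumed notation, following the cited VINS-Mono formulation.
\begin{align}
\hat{\mathbf{a}}_{t} &= \mathbf{a}_{t} + \mathbf{b}_{a_t} + \mathbf{R}^{t}_{w}\,\mathbf{g}^{w} + \mathbf{n}_{a} \\
\hat{\boldsymbol{\omega}}_{t} &= \boldsymbol{\omega}_{t} + \mathbf{b}_{\omega_t} + \mathbf{n}_{\omega} \\
\mathbf{n}_{a} &\sim \mathcal{N}\!\left(\mathbf{0}, \boldsymbol{\sigma}_{a}^{2}\right), \quad
\mathbf{n}_{\omega} \sim \mathcal{N}\!\left(\mathbf{0}, \boldsymbol{\sigma}_{\omega}^{2}\right) \\
\dot{\mathbf{b}}_{a_t} &= \mathbf{n}_{b_a}, \quad
\dot{\mathbf{b}}_{\omega_t} = \mathbf{n}_{b_\omega}
\end{align}
```

The first two lines correspond to the raw accelerometer and gyroscope measurements, the third to the Gaussian white noise assumption, and the last to the random-walk bias model.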

GNSS VINS Alignment
Yaw offset: The VINS outputs three-dimensional position coordinates and a quaternion. Using the gravitational acceleration measured by the IMU, two rotational degrees of freedom of the local frame (roll and pitch) can be aligned with the ENU coordinate system. To recover the 4-DoF transformation between the local and global frames, the remaining task is to determine the yaw offset between the ENU frame and the local world frame.
The alignment between VINS and GNSS uses, among other inputs, the relative positions of consecutive epochs from GNSS and minimizes a cost function that involves the rotation matrix from ENU to ECEF and the rotation matrix from the local frame to ENU. After this step, the transformation between the ENU and local world frames is completely calibrated. Therefore, the final estimated positions are in a global frame. Because the yaw angle changes every time the system restarts, the yaw offset between the ENU and local world frames must be estimated again for each run.
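As a hedged illustration of the yaw alignment, the sketch below estimates a single yaw angle from pairs of horizontal displacements, one expressed in the gravity-aligned local frame and one in ENU. It uses the closed-form least-squares solution for a 2D rotation; the function names and the toy displacements are assumptions, and the paper's cost function (which also involves the ENU-to-ECEF rotation) may differ.

```python
import math
from typing import List, Tuple

Vec2 = Tuple[float, float]  # horizontal displacement between consecutive epochs


def estimate_yaw_offset(local_steps: List[Vec2], enu_steps: List[Vec2]) -> float:
    """Yaw (rad) minimising sum ||enu - R(yaw) @ local||^2 over all step pairs."""
    sin_sum = sum(lx * gy - ly * gx
                  for (lx, ly), (gx, gy) in zip(local_steps, enu_steps))
    cos_sum = sum(lx * gx + ly * gy
                  for (lx, ly), (gx, gy) in zip(local_steps, enu_steps))
    return math.atan2(sin_sum, cos_sum)


def rotate(v: Vec2, yaw: float) -> Vec2:
    """Rotate a horizontal vector counter-clockwise by yaw."""
    c, s = math.cos(yaw), math.sin(yaw)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])


# Toy data (assumption): local VINS steps and the same steps seen in ENU.
local_steps = [(1.0, 0.0), (0.5, 0.8), (-0.3, 1.2)]
true_yaw = 0.6
enu_steps = [rotate(v, true_yaw) for v in local_steps]
yaw = estimate_yaw_offset(local_steps, enu_steps)
```

Using displacements rather than absolute positions removes the unknown translation between the frames, which is why continuous motion of the device is needed before the heading becomes observable.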

Factor Graph Optimization
The multi-sensor integration approach applied for the GNSS/VINS is FGO in a loosely coupled way. It is a non-linear optimization represented as a probabilistic graphical model and transformed into a factor graph that models the relationship between the poses and estimates their values. It can improve the accuracy of the system and mitigate drift and error accumulation. After the global and local frame alignment, the result is further optimized. The VINS estimates are represented as factors in the factor graph, while the GNSS measurements are incorporated as constraints to optimize the estimates. The system states are estimated as a set X in Equation 11, which also initializes the system; x_t is the state currently being estimated, where p is position, v is velocity, and q is orientation.
At each epoch, the position, velocity, and orientation are estimated. There are three factors: the VINS factor in Equation 13, the GNSS factor in Equation 14, and the switching factor in Equation 15. All measurements are then factorized.

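\mathcal{X}^{*} = \arg\min_{\mathcal{X}} \sum_{t} \left( \left\| \mathbf{r}_{\mathrm{VINS},t} \right\|^{2} + \left\| \mathbf{r}_{\mathrm{GNSS},t} \right\|^{2} + \left\| \mathbf{r}_{\mathrm{IO},t} \right\|^{2} \right) \quad (16)
where the three residual terms correspond to the VINS, GNSS, and switching factors of Equations 13 to 15.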
Finally, the whole problem is organized as a series of factors formulated from the VINS, GNSS, and IO switching estimates. The optimum system state is given by Equation 16 once the estimation problem has been transformed and every cost corresponding to its specific measurement is minimized. The optimal system state is the one that maximizes the posterior given all measurements, that is, the maximum a posteriori (MAP) estimate (Cao et al., 2022).
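To make the role of the switching factor concrete, the following Python sketch fuses VINS relative steps with GNSS positions along a single axis. It is a simplified stand-in, not the paper's formulation: the switch is assumed to be a binary gate driven by the IO detector, the noise levels `sigma_vins` and `sigma_gnss` are invented, and the linear problem is solved directly through its tridiagonal normal equations instead of with Ceres.

```python
from typing import List


def solve_tridiagonal(diag: List[float], off: List[float], rhs: List[float]) -> List[float]:
    """Thomas algorithm for a symmetric tridiagonal system (off[i] couples i and i+1)."""
    n = len(diag)
    c, d = [0.0] * n, [0.0] * n
    denom = diag[0]
    for i in range(n):
        if i > 0:
            denom = diag[i] - off[i - 1] * c[i - 1]
            d[i] = (rhs[i] - off[i - 1] * d[i - 1]) / denom
        else:
            d[i] = rhs[i] / denom
        if i < n - 1:
            c[i] = off[i] / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):  # back substitution
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def fuse(vins_steps, gnss, outdoor, sigma_vins=0.2, sigma_gnss=1.0):
    """Minimise VINS relative-step factors plus IO-gated GNSS position factors."""
    n = len(gnss)
    w_v, w_g = 1.0 / sigma_vins ** 2, 1.0 / sigma_gnss ** 2
    diag, off, rhs = [0.0] * n, [0.0] * (n - 1), [0.0] * n
    for t, step in enumerate(vins_steps):  # residual: p[t+1] - p[t] - step
        diag[t] += w_v
        diag[t + 1] += w_v
        off[t] -= w_v
        rhs[t] -= w_v * step
        rhs[t + 1] += w_v * step
    for t, (z, is_outdoor) in enumerate(zip(gnss, outdoor)):
        switch = 1.0 if is_outdoor else 0.0  # IO detector gates the GNSS factor
        diag[t] += switch * w_g
        rhs[t] += switch * w_g * z
    return solve_tridiagonal(diag, off, rhs)


# Toy walk (assumption): 1 m steps, VINS with 10 % scale drift, bad GNSS indoors.
truth = [float(t) for t in range(10)]
vins_steps = [1.1] * 9
outdoor = [t < 4 or t > 6 for t in range(10)]
gnss = [p if o else p + 30.0 for p, o in zip(truth, outdoor)]

with_switch = fuse(vins_steps, gnss, outdoor)
without_switch = fuse(vins_steps, gnss, [True] * 10)
```

In this toy walk the gated solution stays within a metre of the truth, while the ungated one is pulled several metres off by the three indoor fixes. At least one epoch must be flagged as outdoor; otherwise the problem has no absolute reference and the system is singular.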

Figure. 6 Xiaomi Mi8
We used the Xiaomi Mi8 smartphone shown in Figure 6 to obtain the IMU, GNSS, and image data in this experiment. The Mi8 is equipped with a triple-axis MEMS IMU (TDK-InvenSense ICM-20690) at 100 Hz, a GNSS receiver with a Broadcom BCM47755 chip, and a 1280 × 640 resolution monocular camera with 1.4 µm pixel size at 30 FPS. The Broadcom BCM47755 chip receives GPS (L1+L5), Galileo (E1+E5a), QZSS (L1+L5), GLONASS (L1), and Beidou (B1) at 1 Hz. In our system, the intrinsic and extrinsic parameters of the camera and IMU of the Xiaomi Mi8 are calibrated, and the window size is set to 10, the same as in VINS-Mono, but the loop closure function is disabled. Ceres is the library chosen for the optimization. The experiments are executed on a desktop personal computer equipped with an Intel i9-9900K at 3.6 GHz and 31.2 GB of memory.

Experiment Results
After that, this section also selects some of the experiment results as a case study.

Figure. 9 The B1 absolute positioning error comparison
The third study is Experiment B1; the position error is shown in Figure 9. It can be observed that the VINS/GNSS integration mitigates the drift caused by the VINS itself and provides a better positioning performance. After integrating the IO detection, the positioning performance is generally better than without IO detection.

CONCLUSION
This study introduces a framework based on factor graph optimization that integrates local estimates derived from prior research on VIO with 3DMA GNSS measurements and an IO detection switch. The proposed system enables precise local pose estimation and eliminates global drift. The effectiveness of this system is demonstrated through real-world experiments, which produced notable outcomes. However, there are still some limitations. The trained model is not applicable to all smartphone devices, as only one smartphone was used in the experiment. Since the dataset covers only the campus, other environments remain unexplored. Further improvements are expected in handling environmental factors in dynamic and complex urban canyons.

Formulation
The formulation shown above is a non-linear least squares problem, in which each residual block combines a loss function and a cost function. First, the parameter blocks and residual blocks are defined to compute the residual values. Then, all parameter blocks and residual blocks are set up to build the problem. The solver used here is Ceres Solver. The structure is shown in Figure 3.
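For readers without the original expression at hand, the general non-linear least squares form documented by Ceres Solver is reproduced below as a reference. The symbols follow the Ceres documentation and are assumptions with respect to this paper: \rho_i is the loss function, f_i the cost function, and the tuple of x terms the parameter blocks on which the i-th residual block depends.

```latex
% General form as documented by Ceres Solver; symbols assumed for this paper.
\min_{\mathbf{x}} \; \frac{1}{2} \sum_{i} \rho_{i}\!\left( \left\| f_{i}\!\left( x_{i_1}, \dots, x_{i_k} \right) \right\|^{2} \right)
```

With the identity loss \rho_i(s) = s, this reduces to an ordinary sum of squared residuals; a robust loss reduces the influence of outlying measurements.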

Figure. 3 System state illustration.

Figure. 4 Trajectory A
Figure. 5 Trajectory B
In the experiment, two scenarios are conducted at two places at the Hong Kong Polytechnic University to assess the performance. The first scenario starts indoors, transitions to outdoors, and returns indoors. Conversely, the second scenario starts outdoors, transitions indoors, and returns outdoors. Trajectories A and B are shown in Figures 4 and 5, respectively. This section compares the methods in Table 1.
Table. 1 The RMSE of absolute positioning error along with the time and SVM accuracy comparison between 3DMA GNSSVINS-IO FGO, 3DMA GNSSVINS FGO, VINS-Mono, and 3DMA GNSS in all experiments
In general, VINS-Mono provides the worst positioning performance, with an error of 4 m or more. This is because VINS-Mono accumulates positioning drift over time. After integrating the GNSS in the position solution domain, the positioning performance improves to within 4 m on average. Together with the IO switching factor, the RMSE of GNSSVINS-IO-FGO outperforms the others across all experiments. It can also be seen that trajectory A yields lower IO detection accuracy than trajectory B, because trajectory A has a more complex environment than trajectory B.
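The RMSE of the absolute positioning error used in this comparison can be computed as in the short Python sketch below. The function name and the sample coordinates are illustrative assumptions; the estimated and ground-truth positions are assumed to be time-aligned and expressed in the same frame.

```python
import math
from typing import Sequence


def rmse(estimates: Sequence[Sequence[float]], truths: Sequence[Sequence[float]]) -> float:
    """Root mean square of the per-epoch Euclidean positioning errors."""
    errors = [math.dist(e, g) for e, g in zip(estimates, truths)]
    return math.sqrt(sum(err ** 2 for err in errors) / len(errors))


# Toy example (assumption): two epochs with errors of 5 m and 0 m.
example = rmse([(3.0, 4.0), (0.0, 0.0)], [(0.0, 0.0), (0.0, 0.0)])
```

Because the errors are squared before averaging, a few epochs with large errors, such as a burst of bad fixes during an indoor transition, dominate the RMSE.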
Figure 7 shows the absolute positioning error of Experiment A1. It can be observed that the VINSGNSS-IO suppressed the positioning error in general. However, a wrongly estimated covariance of 3DMA GNSS may degrade the optimization results, such as in epochs around 620 to 650 (+1.677e9): because the covariance does not bound the actual positioning error, FGO wrongly trusts the erroneous 3DMA GNSS solution and the performance degrades. The VINSGNSS FGO is affected even more seriously by the large GNSS error with small covariance occurring around epoch 615 (+1.677e9), which deforms the whole optimized result. Besides, IO detection can reduce the effect of large positioning errors.

Figure. 8
Figure 8 shows the absolute positioning error of Experiment A2. It can be observed that the VINSGNSS-IO recovered the positioning when no GNSS signals were available. However, the VINS error was not suppressed. Since it is added to FGO, the proposed method gets a worse result than 3DMA GNSS.

Figure. 10 The B2 absolute positioning error comparison
The last study is Experiment B2; the position error is shown in Figure 10. There is a large positioning error between times 2,050 and 2,100. However, the positioning error is successfully suppressed after integrating with GNSS. The result with IO detection provides a generally better positioning performance, especially at epochs around 2,030. The peak positioning error is successfully suppressed, so the overall positioning error of VINSGNSS-IO is much lower than the result without IO.