MATHEMATICAL AND PHYSICAL APPROACHES TO INFER ABSOLUTE ZENITH WET DELAYS FROM DOUBLE DIFFERENTIAL INTERFEROMETRIC OBSERVATIONS USING ERA5 ATMOSPHERIC REANALYSIS

: Atmospheric water vapor (WV) is one of the driving constituents of the atmosphere. The modelling and forecasting of WV and derived quantities like precipitable water is reliable on regional scales but challenging on small scales because of its high spatial and temporal variation. Interferometric synthetic aperture radar (InSAR) can be exploited to retrieve integrated atmospheric water vapor (IWV) from path delay observations along the radar line of sight. InSAR-derived IWV maps feature a very high spatial resolution but the double-differential interferometric observations only provide changes of IWV between acquisition times and with respect to a certain spatial reference. In this study we present a method to derive the absolute IWV by combining ERA5 numerical weather model data with differential path delay observations from InSAR time series. We propose different functional approaches to merge the regional trend of WV from ERA5 with the high resolution IWV signal from InSAR. We apply this to a Sentinel-1 Persistent Scatterer InSAR time series in the Upper Rhine Graben and validate against IWV observations at GNSS stations of the Upper Rhine Graben Network.


INTRODUCTION
SAR interferometry (InSAR) is susceptible for changes of atmospheric water vapor (IWV) which can be derived from path delay observations along the radar line of sight. Persistent Scatterer Interferometry (PSI) is an enhanced approach to process long InSAR time series where only those image pixels are chosen, which show high coherence (low phase variance) throughout the time series. PSI enables -apart from height and motion determination -the estimation of differential slant delays due to varying water vapor in the atmosphere. Moreover, single observations of zenith delays (ZD) can be estimated from a PSI stack under certain assumptions and with respect to an arbitrary constant, which appears as a rank deficiency in the estimation procedure. In particular for modern SAR satellite imagery, ZD maps are characterized by a high spatial resolution that can be further used e.g., as input to water vapor tomography.
The high spatial resolution of the InSAR data compared to other meteorological space-borne instruments makes them attractive to address meteorological questions (Hanssen, 2001). Space-borne C-band SAR instruments as, e.g., on the European Remote Sensing satellites (ERS-1/2) and onboard the Envisat satellite are successfully used to construct water vapor maps (Hanssen, 2001, Heublein et al., 2015. The higher sensitivity and resolution of X-band sensors offers an even more detailed analysis as shown, e.g., by Qin (2013) for an urban example. In turn, the wide footprint of the C-band sensors on the current Sentinel-1 (A/B) satellite missions enables large areas to be investigated, still with a moderately high resolution (Mateus et al., 2017).

PROBLEM DEFINITION
A common approach to overcome the rank deficiency of this estimation process is the so called 'zero-mean-assumption', which constraints a zero average delay over all acquisition times for each PS-point. However, this approach neglects local variations, for example height dependent variations due to stratospheric layering or specific regional variations of weather characteristics. Even a conventional DEM-dependent stratification approach can only accomplish a rough approximation of the true situation. In addition, the resulting delays are still relative with respect to an arbitrary constant but for the usage in weather models, absolute values need to be determined. Spatial phase gradients with long wavelengths, so called phase ramps in PSI stacks can be caused by several influences, such as orbital ramps, processing errors, solid earth tides and ocean loading effects or atmospheric signals otherwise. The distinguishment of those contributions is only possible with further knowledge, while a full estimation and reduction from the observables seems hardly realizable in the case of Sentinel-1. Alshawaf et al. (2015a) derive absolute IWV maps with a sequential approach too. First, a least squares inversion is applied to PSI delays under the constraint of zero temporal mean, whereby the PSI delays have been low-pass filtered before to suppress long-wave signal. Subsequently, these maps are combined with complementary maps from GNSS which provide more reliable absolute and long-wave IWV but at low spatial resolution. Comparing the final combined IWV maps with IWV maps from the MEdium Resolution Imaging Spectrometer (MERIS) onboard the Envisat satellite, shows good spatial correlation of up to 92%.

DATA AND PREPROCESSING
In this work, we focus on the processing of the PSI phase observations, which are already unwrapped and corrected. The database is a stack of 169 scenes of Sentinel-1A/B data, acquired in Interferometric Wide Swath Mode (IW) at an altitude of around 690 km and a swath width of 250 km. The ground resolution is about 5 x 20 m. All data are recorded along ascending orbit 88 between March 2015 and July 2019 with the acquisition time of 17:26 UTC. In this study, we used the VV polarized data and the SRTM-1 as reference digital elevation model during preprocessing. The orbit correction was performed with the provided precise orbit files. Those resulting double differential slant delays are corrected for any linear deformations. The data is available online (Fersch et al., 2021).
ERA5 is the fifth generation atmospheric reanalysis provided by ECMWF (Hersbach et al., 2020). With more input data and advanced algorithms in assimilation, ERA5 represents the stateof-the-art atmospheric reanalysis. In this work, we employed the pressure level ERA5 product for the calculation of auxiliary meteorological data. We downloaded the humidity, temperature, and geopotential data with a spatiotemporal resolution of 0.25 • × 0.25 • , 37 vertical levels, and 1-hourly.
As the ERA5 data are provided as spatial grids, spatial interpolation or extrapolation is needed in the calculation of meteorological variables at the PS positions (see Appendix C of  and the references therein). In addition, the ERA5 data are provided at integral hours, the values at two nearest hours were linearly interpolated to the PS epoch.

GNSS observations from the GURN (GNSS Upper Rhine
Graben Network) were collected and processed by using GAMIT software. The tropospheric delays were corrected with Vienna Mapping Function 1 (VMF1) and associated gridded map products of a priori ZHD and ZWD. The first-order of ionospheric delay was removed by using dual-frequency observations and the second-order delay was modelled. The observations are weighted according to elevation and the observations with their elevations lower than 10 • were discarded. The tropospheric delay data are available online . In this study, we used the 25 GNSS stations included in the area covered by the PSI dataset. Additionally 5 stations are chosen as validation stations for specific analysis. More on this separation of the stations can be read in (Fersch et al., 2021).

METHODOLOGY
In this study, we consider different approaches to infer additional information from weather models such as ERA5 to resolve the rank deficiency by (temporal) fixing of absolute ZWDs and including statistics on (spatial) variances of average delays. The workflow is presented in Figure 1.
The data basis of this study are the PSI phase observations. Firstly, the observations are mapped from slant to zenith direction using a simple cosine function. Afterwards, an inversion is performed to obtain the atmospheric phase screens for each scene from the interferometric observations. Traditionally the zero-mean assumption is used as restriction of this underdetermined inversion problem. To introduce this condition means that the same long-term mean value of 0 is assumed for each PS point. This neglects the fact, that regional characteristics such as height differences and specific meteorological factors influence the true mean value. We therefore performed studies on the ERA5 data to determine the validity of the zero mean assumption. Those tests are performed on the ZWD values obtained by conversion of the phase values to delays in mm. The main influence on the long-term mean is due to the different heights. But even after subtracting this effect, systematic deviations from the zero-mean assumption can be seen for our stack as shown in figure 2. Further tests also showed that those deviations may vary for different years or epochs included in the calculations. This leads to the first correction introduced in our study, the mean correction which is applied onto the APS values. Using statistics of the ERA5 model at the specific PSpoints and at the specific SAR acquisition times, ERA5 mean values of the Zenith Wet Delay (ZWD) can be determined for each PS and introduced using the following equation: Such mean values are more realistic, as they include regional variabilities and systematic height dependent differences.
As the ERA5 model describes the atmospheric delays at longwavelengths reliably, it is a suitable dataset to prune the longwavelength components of the InSAR signal leaving only the atmospheric contribution. Different approaches are presented to introduce the information from the ERA5 model to the PSI dataset. Firstly, simple mathematical models can be used to adjust the difference of the long wavelength component between the mean-corrected PSI ZWD and the nominal ERA5 model. Using low-order polynomial functions assures that only long wavelength components are corrected. We use bivariate polynomials of degree 0, 1, 2 and a thin-plate spline approximation. The latter is based on the MATLAB function tpaps (The MathWorks Inc., 2022). In the case of the polynomial approx- The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLVIII-M-1-2023 39th International Symposium on Remote Sensing of Environment (ISRSE-39) "From Human Needs to SDGs", 24-28 April 2023, Antalya, Türkiye  imations, the functional description is given with the following equations: with ϕ, λ geographic coordinates a, b, c.d parameters As we have over 4 million PS points we use a bootstrapping approach to calculate the ideal parameter estimation for the polynomial approximations. In a loop every 100th point is used, changing the starting index in each run. The mean values of all those 100 parameter sets is then used as final estimation set. Taking into account, that the spline estimation takes for one loop more time then the 100 loops for the other functions, we only use 1 estimation for this approach. Every 1000th point is used for the thin-plate spline interpolation.
An example of the different correction functions and their results using the dataset from 2017/11/24 17:24 UTC is shown in Figure 3. In subfigures c, e, g the correction functions are displayed illustrating the amount of detail captured by the functions.
Another approach is via the spectral description of the long wavelength signal components. After resampling on an equidistant grid of 2 km resolution, the datasets are Fourier transformed and splitted at a certain cutoff wavelength, which is determined using the grid resolution of the ERA5 dataset. While the short wavelength component of the observed signal is maintained, the components with longer wavelengths of the PSI ZWD signal are replaced by the ERA5 model. As the weather model data is originally on a coarse grid of 0.25°resolution, only low frequency components can be used. At the same time, the high spatial variability from the PSI ZWD is preserved. The mask used for filtering the components of the ERA5 data is a gaussian filtering mask. The high frequency parts of the SAR ZWD are then determined by the inversed mask. The result of the filtering can be visualised in power spectra of the datasets. The radially averaged power spectra of the PSI, ERA5 and final ZWD are shown in Figure 5. The final dataset first follows the curve of the ERA5 and then smoothly changes to follow the SAR dataset for shorter wavelengths.
In Figure 4 the different parts are visualized using again the dataset of 2016/02/03 17:24 UTC: (a) shows the low frequency part of the ERA5 data set, (b) the low frequency part of the SAR data which is removed by the filter, (c) displays the high frequency part of the SAR data, which is used for the final product and (d) is the final product using (a) and (c).
The applied corrections influence the absolute values in the regional context and in the long-wavelength range, but do not lower the advantage of the spatially high-resolution InSAR-data and thus, allows the analysis of local variations with absolute ZWD. The proposed upgrading of the zero-mean assumption does not require any commercial data, as both the Sentinel-1 data and weather model data from ERA5 are freely available online.
The final product are high resolution IWV maps, which are generated through transformation of the ZWD using mean temperatures from ERA5 data. For details of this transformation, see (Fersch et al., 2021).
The validation of the different methods is performed on the final IWV products using the GNSS data. For the polynomial approaches the 10 nearest neighbors are chosen for each GNSS station. All values from the PS points are then corrected for the height differences between itself and the GNSS station. For the spectral approach the two nearest grid points are chosen and height corrected. The mean value of the neighboring PS or grid points is then used for time series analyses. Those are computed using the Kling-Gupta-efficiency metric KGE and its constituents: correlation r, the relative variability α (DMEAN) and the bias ratio β (DSTD) (Gupta et al., 2009) as described in equation 5.

RESULTS
We show the results of the upgrading of the ZWD estimation and discuss the plausibility and suitability of the different approaches. The validation of the resulting absolute IWV is performed using GNSS data, including local characteristics at the GNSS sites, which might be contained in the InSAR dataset, but not in the ERA5 model. A first impression of the fitness of the approaches is given by Figure 6b for the GNSS station MNBL (Montbéliard) and the spline correction. Each datapoint represents one epoch included in the SAR and GNSS data. Directly visible are the seasonal dependencies of the IWV amount, which is significantly higher and more variable in summer. That can be explained due to the higher temperatures and accordingly higher water vapour content in the atmosphere as well as the higher turbulence.
The overall evaluation of the different methods is performed on all 25 GNSS stations as shown in Figure 6a. Although more complex functions result in better results, the overall improvement is slight. The best performance is achieved with the thinplate spline interpolation, followed by the spectral correction in the frequency domain. Notable is the improvement of the The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLVIII-M-1-2023 39th International Symposium on Remote Sensing of Environment (ISRSE-39) "From Human Needs to SDGs", 24-28 April 2023, Antalya, Türkiye  validation results from using only ERA5 data to the fused data sets.
A more detailed look at different seasons of the year shows Figure 6c. Clearly the best performance for all approaches is achieved in autumn, which is known for stable weather conditions. In contrast to that, the summer is the most difficult to estimate. The biggest differences between the methods are consequently in summer, followed by spring.
To further go into detail on the reasons of the different performances, we take a look at the different components of the KGE analysis in summer for all the methods. Figure 6d displays the components of KGE. Clearly the results are very similar for the relative bias and variability (DMEAN and DSTD), the biggest differences are visible in the correlation and causing the differences in the KGE.

CONCLUSIONS
We presented a workflow to insert mean-correction of the APS and multiple approaches to infer the long wavelength component from ERA5 into the PSI ZWD products. This enables the production of high resolution IWV maps with absolute values derived from PS In-SAR observations. All methods show a significant improvement from the simple zero-mean assumption and should be selected according to the respective requirements. Although the best results were achieved using the spline interpolation, this method comes with high computational costs and needs to be adjusted to avoid overfitting. Physically straight forward is the spectral filtering approach in the frequency domain, which on the other hand leads to a reduced horizontal resolution of the product. For simple applications it might be sufficient to have a polynomial approach, which can be run significantly faster than the spline estimation.