SPATIALLY AWARE LANDSLIDE SUSCEPTIBILITY PREDICTION USING A GEOGRAPHICAL RANDOM FOREST APPROACH
Keywords: Landslide Susceptibility, Geographical Random Forest, Machine Learning, Mapping Unit, Spatial Autocorrelation
Abstract. Landslide susceptibility prediction has become increasingly reliant on non-geographically oriented (i.e., aspatial) machine learning algorithms. While these approaches have shown growing success, they are often criticized for their limited consideration of spatial autocorrelation and local variation across geographical space, and thus for neglecting spatial non-stationarity. To address this research gap, this work applies a geographical random forest (GRF) approach and contrasts it with the conventional random forest (RF) algorithm. To this end, the study area, encompassing the Lake Sapanca Basin and its surroundings, was subdivided into 4,452 slope-based mapping units. The performance of both predictive models was then measured using overall accuracy (OA) and area under the curve (AUC). The results revealed that the GRF (OA = 80.82% and AUC = 85.22%) outperformed the RF algorithm (OA = 75.34% and AUC = 82.50%) by approximately 5 percentage points in OA and approximately 3 percentage points in AUC. The Wilcoxon signed-rank test confirmed that the predictions of the two models differed significantly at the 95% confidence level. Slope emerged as the globally most influential factor, but local interpretations revealed notable variations in the importance of causative factors depending on location. For instance, curvature was the most important geospatial covariate in around one-third (34.23%) of the slope units, mostly concentrated in the northernmost zones of the study area, whereas elevation was the most important factor for 14.67% of the slope units, primarily located in the southern region.
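As an illustration of the two evaluation metrics named in the abstract, the following minimal sketch computes OA and AUC for a binary landslide/non-landslide classification. It is not the study's implementation: the function names, the 0.5 decision threshold, and the toy labels and probabilities are all hypothetical, and AUC is computed here as the fraction of (landslide, non-landslide) pairs that the model ranks correctly, with ties counted as half.

```python
def overall_accuracy(y_true, y_prob, threshold=0.5):
    """Share of mapping units whose thresholded prediction matches the label.

    The 0.5 threshold is an assumption made for this sketch.
    """
    correct = sum((p >= threshold) == bool(t) for t, p in zip(y_true, y_prob))
    return correct / len(y_true)


def auc(y_true, y_prob):
    """Area under the ROC curve via pairwise ranking.

    Equals the probability that a randomly chosen positive unit receives a
    higher score than a randomly chosen negative unit (ties count as 0.5).
    """
    pos = [p for t, p in zip(y_true, y_prob) if t == 1]
    neg = [p for t, p in zip(y_true, y_prob) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 1 = landslide unit, 0 = non-landslide unit.
y_true = [1, 1, 1, 0, 0, 0]
y_prob = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]

print(f"OA  = {overall_accuracy(y_true, y_prob):.4f}")
print(f"AUC = {auc(y_true, y_prob):.4f}")
```

The pairwise form of AUC is quadratic in the number of units, which is acceptable for a few thousand mapping units; a rank-based formulation would be preferable for larger inventories.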