IMPROVING VOLUNTEERED GEOGRAPHIC DATA QUALITY USING SEMANTIC SIMILARITY MEASUREMENTS
Keywords: Volunteered geographic information, Data quality, Semantic similarity, Semantic heterogeneity, OpenStreetMap
Abstract. Studies have analysed the quality of volunteered geographic information (VGI) datasets, assessing the positional accuracy of features and the completeness of specific attributes. While it has been shown that VGI can, in some contexts, reach a high positional accuracy, these works have also highlighted a large spatial heterogeneity in positional accuracy and completeness, as well as in the semantics of the objects. Such high semantic heterogeneity of VGI datasets is a significant obstacle to a number of possible uses of the data.
This paper proposes an approach for both improving the semantic quality and reducing the semantic heterogeneity of VGI datasets. The improvement of the semantic quality is achieved by automatically suggesting attributes to contributors during the editing process. The reduction of semantic heterogeneity is achieved by automatically notifying contributors when two attributes are too similar or too dissimilar. The approach was implemented as a plugin for OpenStreetMap, and different examples illustrate how this plugin can be used to improve the quality of VGI data.
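The notification step can be illustrated with a minimal sketch. It is not the plugin's implementation: the abstract does not name the similarity measure, so a simple token-overlap (Jaccard) score stands in for it here. The function names `jaccard_similarity` and `check_attributes` and both threshold values are hypothetical and chosen only for illustration.

```python
def jaccard_similarity(a: str, b: str) -> float:
    """Token-overlap score in [0, 1]; a stand-in for a real semantic similarity measure."""
    tokens_a = set(a.lower().replace("_", " ").split())
    tokens_b = set(b.lower().replace("_", " ").split())
    if not tokens_a and not tokens_b:
        return 1.0  # two empty values are treated as identical
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def check_attributes(attr_a: str, attr_b: str,
                     too_similar: float = 0.8,
                     too_dissimilar: float = 0.1) -> str:
    """Return a notification label for a pair of attribute values.

    The thresholds are assumed values, not taken from the paper.
    """
    score = jaccard_similarity(attr_a, attr_b)
    if score >= too_similar:
        return "too similar"
    if score <= too_dissimilar:
        return "too dissimilar"
    return "ok"
```

Under these assumptions, `check_attributes("bus_stop", "bus stop")` flags the pair as too similar, while `check_attributes("bus_stop", "lake")` flags it as too dissimilar. In practice the token-overlap score would be replaced by whatever semantic similarity measure the approach relies on.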