A QUALITY ANALYSIS AND UNCERTAINTY MODELING APPROACH FOR CROWD-SOURCING LOCATION CHECK-IN DATA
Keywords: Crowd-sourcing, Location check-in, Quality analysis, Spatial registration, Uncertainty, Error distribution
Abstract. Location check-in data, which have developed along with social networks, are a form of user-generated, crowd-sourced geospatial data. With their massive volume, rich information content, and high currency, check-in data provide a new data source for geographic information services such as location-based services. However, crowd-sourced data suffer from significant quality issues, which directly affect their usability. In this paper, a data quality analysis approach is designed for location check-in data and a check-in data uncertainty model is proposed. First, the quality issues of location check-in data are discussed. Then, based on the characteristics of check-in data, a quality analysis and data processing approach is proposed: a standard dataset is used as a reference to estimate an affine transformation for the check-in dataset, with the RANSAC algorithm adopted for outlier elimination. Subsequently, drawing on GIS data uncertainty theory, an uncertainty model of the processed check-in data is established. Finally, using location check-in data obtained from jiepang.com as experimental data and selected navigation data as the reference standard, multiple quality analysis and uncertainty modeling experiments are conducted. A comprehensive analysis of the experimental results verifies the feasibility of the proposed quality analysis and processing approach and the usability of the proposed uncertainty model. The results indicate that the approach has practical significance for the study of quality issues in crowd-sourced geographic data.
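The registration step described in the abstract (an affine transformation estimated against a reference dataset, with RANSAC used to reject outliers) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 3-point minimal sample, the residual threshold, the iteration count, and the final least-squares refit on the consensus set are all assumptions chosen for illustration, and the sketch assumes that matched check-in and reference point pairs are already available as NumPy arrays.

```python
import numpy as np


def fit_affine(src, dst):
    """Least-squares 2D affine fit so that dst ~= [x, y, 1] @ params (params is 3x2)."""
    design = np.hstack([src, np.ones((len(src), 1))])
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)
    return params


def apply_affine(params, pts):
    """Apply a 3x2 affine parameter matrix to an (n, 2) array of points."""
    return np.hstack([pts, np.ones((len(pts), 1))]) @ params


def ransac_affine(src, dst, threshold=1.0, iterations=200, seed=0):
    """Estimate an affine transform from matched points, rejecting outliers.

    src: (n, 2) check-in coordinates; dst: (n, 2) matched reference coordinates.
    threshold (same units as the coordinates) and iterations are illustrative
    values, not taken from the paper.
    Returns the refitted parameters and a boolean inlier mask.
    """
    rng = np.random.default_rng(seed)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iterations):
        # Three point pairs are the minimal sample for a 2D affine transform.
        sample = rng.choice(len(src), size=3, replace=False)
        candidate = fit_affine(src[sample], dst[sample])
        residuals = np.linalg.norm(apply_affine(candidate, src) - dst, axis=1)
        inliers = residuals < threshold
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    if best_inliers.sum() < 3:
        raise ValueError("RANSAC did not find a consensus set of at least 3 points")
    # Refit on the whole consensus set for a more stable final estimate.
    return fit_affine(src[best_inliers], dst[best_inliers]), best_inliers
```

In such a sketch, the residuals of the inlier points after transformation would be the natural input to a subsequent error-distribution analysis, while the rejected points correspond to the outliers eliminated during registration.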