CROWDSOURCING BIG TRACE DATA FILTERING: A PARTITION-AND-FILTER MODEL
Keywords: Crowdsourcing traces, Trace partitioning, GPS trace filtering, GPS baseline
Abstract. GPS traces collected through crowdsourcing are low-cost and informative, and they are emerging as a new big data source for urban geographic information extraction. However, the precision of crowdsourced traces in urban areas is very low because of low-end GPS devices and urban canyons formed by tall buildings, which makes it difficult to mine high-precision geographic information such as lane-level road information. In this paper, we propose an efficient partition-and-filter model for filtering trajectories, which consists of trajectory partitioning and trajectory filtering. For the partitioning step, the partition with position and angle constraint algorithm is used to split a trajectory into a set of sub-trajectories based on distance and angle constraints. Then, the trajectory filtering with expected accuracy method is used to filter the sub-trajectories according to the similarity between GPS tracking points and GPS baselines constructed by the random sample consensus (RANSAC) algorithm. Experimental results demonstrate that the proposed partition-and-filter model can effectively extract high-quality GPS data with the expected accuracy from various crowdsourcing trace data sets.
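The two stages summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that points are already projected to planar coordinates in metres, that the baseline of a sub-trajectory is a straight line, and that "similarity" is the perpendicular distance to that line. All function names, parameters, and thresholds (partition, ransac_baseline, filter_segment, max_gap, max_turn, expected_accuracy) are hypothetical.

```python
import math
import random


def angle_diff(a, b):
    """Smallest absolute difference between two headings, in radians."""
    d = abs(a - b) % (2 * math.pi)
    return min(d, 2 * math.pi - d)


def partition(points, max_gap, max_turn):
    """Split a trace where consecutive points are farther apart than max_gap
    or the heading changes by more than max_turn (assumed constraints)."""
    if not points:
        return []
    segments = [[points[0]]]
    prev_heading = None
    for p, q in zip(points, points[1:]):
        gap = math.dist(p, q)
        heading = math.atan2(q[1] - p[1], q[0] - p[0])
        turn = 0.0 if prev_heading is None else angle_diff(heading, prev_heading)
        if gap > max_gap or turn > max_turn:
            segments.append([q])      # start a new sub-trajectory at q
            prev_heading = None
        else:
            segments[-1].append(q)
            prev_heading = heading
    return segments


def point_line_distance(pt, a, b):
    """Perpendicular distance from pt to the infinite line through a and b."""
    (x0, y0), (x1, y1), (x2, y2) = pt, a, b
    num = abs((x2 - x1) * (y1 - y0) - (x1 - x0) * (y2 - y1))
    return num / math.hypot(x2 - x1, y2 - y1)


def ransac_baseline(points, tolerance, iterations=200, seed=0):
    """Return the pair of sample points whose line has the most inliers,
    or None if no valid pair was drawn."""
    rng = random.Random(seed)
    best, best_count = None, -1
    for _ in range(iterations):
        a, b = rng.sample(points, 2)
        if a == b:                    # duplicate coordinates define no line
            continue
        count = sum(point_line_distance(p, a, b) <= tolerance for p in points)
        if count > best_count:
            best, best_count = (a, b), count
    return best


def filter_segment(points, expected_accuracy):
    """Keep the points that lie within expected_accuracy of the baseline."""
    if len(points) < 2:
        return []
    baseline = ransac_baseline(points, expected_accuracy)
    if baseline is None:
        return []
    a, b = baseline
    return [p for p in points if point_line_distance(p, a, b) <= expected_accuracy]
```

In this sketch, a trace would first be passed to partition, and each resulting sub-trajectory would then be passed to filter_segment with the accuracy the application expects. Curved roads would need a different baseline model than the straight line assumed here.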