PAIRS (RE)LOADED: SYSTEM DESIGN & BENCHMARKING FOR SCALABLE GEOSPATIAL APPLICATIONS
Keywords: big data analytics, ML, AI, distributed geospatial data structures, Hadoop, HBase, Spark, GeoMesa, PAIRS Geoscope
Abstract. In this paper we benchmark a previously introduced big data platform for analyzing remote sensing and other geospatial-temporal data. The platform, called IBM PAIRS Geoscope, is built on open source big data technologies (Hadoop/HBase) that are, in principle, scalable in storage and compute to hundreds of petabytes. Currently, PAIRS hosts multiple petabytes of curated, geospatial-temporally indexed data. It organizes all data as key-value pairs and performs analytics close to the data to minimize data movement.