OGC-AI: A Retrieval-Augmented Large Language Model Interface for Open Geospatial Consortium Web Services
Keywords: OGC Standards, Large Language Models, Geospatial Data, Geo-AI Integration, Natural Language Interface
Abstract. In this research, OGC-AI is presented as a retrieval-augmented large language model (LLM) interface that enables plain-language access to Open Geospatial Consortium (OGC) web services while keeping organization-internal endpoints private. Standards documents and service metadata are automatically harvested and indexed; at inference time, relevant snippets are retrieved to compose syntactically correct, standards-compliant requests, execute them via a secure proxy, and return grounded answers with source links. As of 30 April 2025, the corpus comprises 397 documents across 92 OGC standards, spanning both legacy and modern APIs commonly used in Spatial Data Infrastructures (SDIs). Two use cases illustrate the approach: (i) using OGC-AI to handle complex SensorThings API requests, and (ii) generating a working CesiumJS example that consumes geospatial data from OGC API services. A retrieval-augmented strategy is favored over cache-augmented alternatives to accommodate a large, evolving standards landscape. Current limitations (e.g., multi-step analytics, semantic disambiguation, dependence on upstream document structures) are discussed, and a roadmap toward interactive mapping, task decomposition, and quantitative evaluation is outlined. By lowering the skill barrier to OGC-compliant data access, OGC-AI advances the FAIR principles, especially Accessibility and Reusability, within established SDIs.
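To make the retrieve-compose-execute flow described in the abstract concrete, the sketch below outlines one possible realization in Python. It is illustrative only: the keyword-overlap retriever, the Snippet structure, the llm_complete() placeholder, and the proxy parameterization are assumptions introduced here for exposition, not the actual OGC-AI implementation.

```python
# Minimal sketch of a retrieval-augmented OGC request pipeline (assumptions,
# not the paper's implementation): snippets are ranked by naive keyword
# overlap, the LLM call is a placeholder, and the proxy URL is illustrative.

from dataclasses import dataclass

import requests


@dataclass
class Snippet:
    standard: str    # e.g. "OGC SensorThings API Part 1" (illustrative)
    source_url: str  # link returned to the user for grounding
    text: str        # excerpt from the indexed standards document


def retrieve(query: str, corpus: list[Snippet], k: int = 3) -> list[Snippet]:
    """Rank snippets by keyword overlap with the user query (toy retriever)."""
    terms = set(query.lower().split())
    scored = sorted(
        corpus,
        key=lambda s: len(terms & set(s.text.lower().split())),
        reverse=True,
    )
    return scored[:k]


def llm_complete(prompt: str) -> str:
    """Placeholder for the LLM backend; expected to return a request URL."""
    raise NotImplementedError("wire up an LLM backend of your choice")


def answer(query: str, corpus: list[Snippet], proxy_url: str) -> dict:
    """Compose a standards-compliant request from retrieved snippets,
    execute it via a proxy, and return the response with source links."""
    snippets = retrieve(query, corpus)
    prompt = (
        "Using only the OGC standard excerpts below, write a single valid "
        "request URL that answers the question.\n\n"
        + "\n---\n".join(s.text for s in snippets)
        + f"\n\nQuestion: {query}\nRequest URL:"
    )
    request_url = llm_complete(prompt).strip()
    # The proxy keeps organization-internal endpoints private: the client only
    # sees the proxy, which forwards the generated request upstream.
    response = requests.get(proxy_url, params={"target": request_url}, timeout=30)
    return {
        "answer": response.text,
        "sources": [s.source_url for s in snippets],
    }
```

The separation into retrieve(), llm_complete(), and answer() mirrors the stages named in the abstract (retrieval of relevant snippets, composition of a standards-compliant request, execution via a secure proxy, and grounded answers with source links), under the stated simplifying assumptions.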