RESTFUL IMPLEMENTATION OF CATALOGUE SERVICE FOR GEOSPATIAL DATA PROVENANCE

Provenance, also known as lineage, is important for understanding the derivation history of data products. Geospatial data provenance helps data consumers evaluate the quality and reliability of geospatial data. In a service-oriented environment, where data are often consumed or produced by distributed services, provenance can be managed by following the same service-oriented paradigm. The Open Geospatial Consortium (OGC) Catalogue Service for the Web (CSW) can be used for the registration and query of geospatial data provenance by extending the ebXML Registry Information Model (ebRIM). Recent advances in the REpresentational State Transfer (REST) paradigm have shown great promise for the easy integration of distributed resources. RESTful Web Services aim to provide a standard way for Web clients to communicate with servers based on REST principles. The existing approach to provenance catalogue services could be improved by adopting a RESTful design. This paper presents the design and implementation of a catalogue service for geospatial data provenance following the RESTful architectural style. A middleware named REST Converter is added on top of the legacy catalogue service to support a RESTful interface. The REST Converter is composed of a resource request dispatcher and six resource handlers. A prototype service is developed to demonstrate the applicability of the approach.


INTRODUCTION
With the advancement of Web Service technologies, Geographic Information System (GIS) has evolved into Geographic Information Service (GIService). In a GIService environment, where data are often consumed or produced by distributed services, geospatial data provenance becomes important since it helps data providers and consumers to evaluate the quality and reliability of geospatial data. Consequently, new challenges have emerged, including sharing and integrated analysis of geospatial data provenance in a service-oriented environment. The Open Geospatial Consortium (OGC) Catalogue Service for the Web (CSW) specifies the interface to query and register descriptive information (metadata) about geospatial data, services and related resources. It can be used for the registration and query of geospatial data provenance by extending the ebXML Registry Information Model (ebRIM) (Yue et al., 2011).
Recent advances in the REpresentational State Transfer (REST) paradigm have shown great promise for the easy integration of distributed resources. REST is a set of principles and constraints imposed on the communication between clients and servers in designing distributed systems and applications (Fielding, 2000). RESTful Web Services aim to provide a standard way for Web clients to communicate with servers based on REST principles. The existing approach to provenance catalogue services could be improved by adopting a RESTful design. This paper presents the design and implementation of a catalogue service for geospatial data provenance following the RESTful architectural style. In the study, the PROV-DM model developed by the World Wide Web Consortium (W3C) is extended to represent geospatial data provenance. It is combined with a catalogue service to provide the storage, management, and sharing of geospatial data provenance. According to the extended model, five kinds of resource entities are defined, namely, Data resource, Service resource, Provenance resource, DataList resource, and ServiceList resource. According to REST principles, every resource is assigned a unique Uniform Resource Identifier (URI). These identified resources can be retrieved via HTTP GET on their URIs. A middleware named REST Converter is added on top of the legacy catalogue service to support a RESTful interface. The REST Converter is composed of a resource request dispatcher and six resource handlers. A prototype service is developed to demonstrate the applicability of the approach.

Geospatial Data Provenance
Some of the earliest research on geospatial data provenance can be traced back to the early 1990s, when the lineage of derived products in GIS was characterized as information that describes the materials and transformations applied to derive the data (Lanter, 1991).

REpresentational State Transfer
REST is a hybrid style derived from several network-based architectural styles and defines a uniform connector interface (Fielding, 2000). As Roy Fielding is one of the principal contributors to the Web protocols, REST can be considered a post-hoc description of the features that made the Web successful. It should be understood that REST is not a standard, but an architectural style. REST describes six constraints applied to the architecture, namely client-server, stateless, cacheable, layered system, code on demand (optional), and uniform interface. Any application conforming to these constraints can be characterized as "RESTful". The fundamental principles in the design of a REST interface can be summarized as follows: identification of resources; manipulation of resources through representations; self-descriptive messages; and hypermedia as the engine of application state.
The resource, referenced with a unique URI, is the central idea of REST. Only a limited set of operations is used to manipulate resources: HTTP GET to retrieve resources; HTTP PUT to create resources; HTTP POST to update resources; HTTP DELETE to remove resources. Resources are manipulated through their representations, which can be in more than one format (HTML, XML, etc.). Relationships between resources are expressed by hyperlinks, through which the state of the representation is transferred.
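To make the uniform interface concrete, the following minimal sketch applies the four methods listed above to an in-memory store keyed by URI. The class name ResourceStore, the handle method, the example URIs, and the status codes returned are illustrative assumptions and are not part of the service described in this paper; the roles of PUT and POST follow the mapping given above.

```python
from typing import Dict, Optional, Tuple


class ResourceStore:
    """Toy in-memory store keyed by URI (hypothetical, for illustration only)."""

    def __init__(self) -> None:
        self._resources: Dict[str, str] = {}

    def handle(self, method: str, uri: str,
               body: Optional[str] = None) -> Tuple[int, Optional[str]]:
        """Apply one HTTP-style method to a URI; return (status, representation)."""
        if method == "GET":       # retrieve a representation
            if uri in self._resources:
                return 200, self._resources[uri]
            return 404, None
        if method == "PUT":       # create, following the mapping in the text
            self._resources[uri] = body or ""
            return 201, None
        if method == "POST":      # update an existing resource
            if uri not in self._resources:
                return 404, None
            self._resources[uri] = body or ""
            return 200, None
        if method == "DELETE":    # remove the resource
            if self._resources.pop(uri, None) is None:
                return 404, None
            return 204, None
        return 405, None          # any other method is not part of the interface


store = ResourceStore()
store.handle("PUT", "/data/1", "<Data/>")
print(store.handle("GET", "/data/1"))
```

Because every resource is addressed the same way, a client needs only the URI and one of the four methods; no operation-specific request structure has to be learned.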

APPROACH
In order to build a RESTful Catalogue Service for geospatial data provenance, a middleware called REST Converter is added on top of the legacy catalogue service. In this way, the legacy interface of the catalogue service is preserved so that it can interoperate with existing Web Services, while REST principles are also supported. The design of a RESTful Catalogue Service for geospatial data provenance is based on a 3-tier Web architecture (Figure 1). In the development of the RESTful Catalogue Service for geospatial data provenance, two key issues need to be addressed. Since the resource is the most essential element in RESTful Web Services, one key issue is how to identify all resources with global URIs. The other issue is how to comply with REST principles while at the same time reusing legacy catalogue services.

Identifying resources
In the implementation of RESTful Web Services, identifying all required resources is a crucial step. With a particular focus on provenance entities, including the data and services used in deriving geospatial data products, we define five types of resources, namely Data, Service, Provenance, DataList, and ServiceList. In the provenance, a data product could have descendants or ancestors, which are represented using Inputs and Outputs resources. URIs are then assigned to these resources. For the Provenance resource, a parameter is provided for users to choose the encoding mode. The default value is "w3c", and it can also be "ebrim" or "iso". For the DataList and ServiceList resources, parameters such as the resource name, keywords describing the resource, and a bounding box can be appended to the URI to act as query filters.
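The URI scheme described above can be sketched as a pair of small helper functions. The service address, the datalist and geoprov path segments, and the keyword, bbox, and outputSchema parameters are taken from the example requests given later in this paper; the function names and the decision to omit outputSchema when the default "w3c" encoding is requested are assumptions made for illustration.

```python
from typing import Optional, Sequence
from urllib.parse import urlencode

# Service address taken from the example request in the paper.
BASE = "http://localhost:8081/geoprov/rest"
# Encoding modes named in the text; "w3c" is the default.
ENCODINGS = ("w3c", "ebrim", "iso")


def data_list_uri(keyword: Optional[str] = None,
                  bbox: Optional[Sequence[float]] = None) -> str:
    """Build a DataList URI, appending optional filters as key-value pairs."""
    params = {}
    if keyword is not None:
        params["keyword"] = keyword
    if bbox is not None:
        # Bounding box as four comma-separated numbers.
        params["bbox"] = ",".join(f"{v:g}" for v in bbox)
    query = urlencode(params, safe=",")  # keep commas readable in the bbox
    return f"{BASE}/datalist" + (f"?{query}" if query else "")


def provenance_uri(data_id: str, encoding: str = "w3c") -> str:
    """Build the Provenance URI of a data product (hypothetical helper)."""
    if encoding not in ENCODINGS:
        raise ValueError(f"unsupported encoding mode: {encoding}")
    uri = f"{BASE}/datalist/{data_id}/geoprov"
    # Assumption: the default encoding needs no explicit parameter.
    return uri if encoding == "w3c" else f"{uri}?outputSchema={encoding}"


print(data_list_uri("test", (-180, -90, 180, 90)))
print(provenance_uri("urn:uuid:c96b4e97-0494-4d6e-a70e-cdecf40a02c4", "iso"))
```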
As resources are always returned through a representation in a RESTful Web Service, their encodings have to be defined. Resources can be encoded in XML, JSON, or other formats. Since the extended PROV-DM model is represented in XML in this study, we provide resources with XML representations. Figure 2 shows an example of the XML representation of a geospatial data resource. The XML representation contains links to related resources, following which state transfer can be achieved.
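As a rough illustration of how a client could follow such links, the sketch below parses a small XML document and collects the hyperlinks it carries. The element and attribute names (Data, name, link, rel, href) and the inputs URI are hypothetical; the actual representation is the one shown in Figure 2, which is not reproduced here.

```python
import xml.etree.ElementTree as ET
from typing import Dict

# Hypothetical representation of a Data resource; not the schema of Figure 2.
SAMPLE = """<Data id="urn:uuid:c96b4e97-0494-4d6e-a70e-cdecf40a02c4">
  <name>example-dataset</name>
  <link rel="provenance"
        href="http://localhost:8081/geoprov/rest/datalist/urn:uuid:c96b4e97-0494-4d6e-a70e-cdecf40a02c4/geoprov"/>
  <link rel="inputs"
        href="http://localhost:8081/geoprov/rest/datalist/urn:uuid:c96b4e97-0494-4d6e-a70e-cdecf40a02c4/inputs"/>
</Data>"""


def related_links(xml_text: str) -> Dict[str, str]:
    """Return a mapping from link relation to target URI."""
    root = ET.fromstring(xml_text)
    return {link.get("rel"): link.get("href") for link in root.iter("link")}


for rel, href in related_links(SAMPLE).items():
    print(rel, "->", href)
```

A client that holds only the URI of a Data resource can thus discover its provenance and related resources from the representation itself, without constructing further queries.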

Designing a REST Converter
The core module of a RESTful Catalogue Service for geospatial provenance is the REST Converter. The REST Converter extracts the user's query from the URI, encodes it as a valid CSW request, sends the request to the CSW interface, and converts the response into an XML representation of the resource. The CSW request is created according to the operation type of the user's RESTful request. If a request of the form GET /geoprov/rest/datalist/{dataId} HTTP/1.1 is sent, a GetRecordById request is constructed; if a request whose URI carries key-value pairs is sent (e.g., GET /geoprov/rest/datalist?keyword=test&bbox=-180,-90,180,90 HTTP/1.1), a GetRecords request is constructed according to the URI and request parameters. Subsequently, the constructed CSW request is submitted to the CSW service. When the REST Converter receives the response from the CSW service, it invokes a pre-defined XSLT style sheet (see Figure 5) to convert the response into an XML representation, which is sent to the client as the result. All resources listed in Section 3.1 have been tested. They can be conveniently accessed by simply appending their URIs to the service address. Taking the provenance resource as an example, we can get the provenance information of data "urn:uuid:c96b4e97-0494-4d6e-a70e-cdecf40a02c4" by following the URI http://localhost:8081/geoprov/rest/datalist/urn:uuid:c96b4e97-0494-4d6e-a70e-cdecf40a02c4/geoprov.
In addition, the KVP parameter outputSchema can be appended to the URI to set the encoding mode. In the traditional CSW approach, users need to be familiar with the CSW-ebRIM model and construct the corresponding CSW queries. For RESTful requests, all users need to know is the resource's URI and the corresponding HTTP method.
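The dispatching rule described in this section can be sketched as follows. The two mappings (a datalist URI with an identifier to GetRecordById, and a datalist URI with key-value pairs to GetRecords) and the outputSchema values come from the text; the function names, the returned tuple, and the treatment of a datalist URI without filters as a GetRecords request are assumptions made for illustration. The sketch stops before the CSW request is actually encoded and before the XSLT transformation of the response.

```python
from typing import Dict, Tuple
from urllib.parse import parse_qs, urlsplit

PREFIX = "/geoprov/rest/"            # path prefix from the example requests
ENCODINGS = {"w3c", "ebrim", "iso"}  # encoding modes named in the text


def to_csw_operation(request_target: str) -> Tuple[str, Dict[str, str]]:
    """Map a RESTful datalist request to a CSW operation name and arguments."""
    parts = urlsplit(request_target)
    if not parts.path.startswith(PREFIX):
        raise ValueError("not a REST Converter path")
    segments = [s for s in parts.path[len(PREFIX):].split("/") if s]
    if not segments or segments[0] != "datalist":
        raise ValueError("only datalist requests are covered by this sketch")
    if len(segments) == 1:
        # Key-value pairs act as filters of a GetRecords query.
        filters = {k: v[0] for k, v in parse_qs(parts.query).items()}
        return "GetRecords", filters
    if len(segments) == 2:
        # A single identifier selects one record.
        return "GetRecordById", {"id": segments[1]}
    raise ValueError("unsupported resource path")


def encoding_mode(request_target: str) -> str:
    """Read the outputSchema parameter, falling back to the default "w3c"."""
    query = parse_qs(urlsplit(request_target).query)
    mode = query.get("outputSchema", ["w3c"])[0]
    if mode not in ENCODINGS:
        raise ValueError(f"unsupported encoding mode: {mode}")
    return mode


print(to_csw_operation("/geoprov/rest/datalist?keyword=test&bbox=-180,-90,180,90"))
print(to_csw_operation("/geoprov/rest/datalist/urn:uuid:c96b4e97-0494-4d6e-a70e-cdecf40a02c4"))
```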

CONCLUSIONS AND FUTURE WORK
This paper presents the design and implementation of a catalogue service for geospatial data provenance following the RESTful architectural style. It illustrates how REST principles can be applied to a catalogue service to facilitate the management and sharing of geospatial data provenance. A RESTful approach to the design and standardization of geospatial services will lead to both improved interoperability and greater accessibility for these services. The results demonstrate the applicability of the RESTful Catalogue Service for managing geospatial data provenance. The work also paves the way for investigating how the REST architectural style could be used in geospatial service development.
Future work will include implementing more optional CSW operations in the REST Converter, and improving the integration and interoperation of geospatial data provenance with other geospatial services.

Figure 1. Architecture of RESTful geospatial provenance service

Figure 2. Example of a Data resource represented in XML

Figure 3. Mechanism of the REST Converter

Figure 4. Class overview in REST Converter

Figure 5. Fragment of XSLT style sheet

Figures 6 and 7 show response examples of the provenance resource with PROV-DM and ISO 19115 encoding, respectively.

Figure 6. Response of provenance resource with PROV-DM encoding

A RESTful Web Service is a Web API developed based on HTTP and REST principles. Because of the scalability and simplicity of this approach, OGC formed a RESTful Service Policy Standards Working Group (SWG) in September 2011. This SWG aims to develop a policy standard for the structure and content of a RESTful approach to the implementation of geospatial services.