MAKING SPATIAL STATISTICS SERVICE ACCESSIBLE ON CLOUD PLATFORM

Web services can bring together applications running on diverse platforms, so that users can access and share data, information and models more effectively and conveniently through a web service platform. Cloud computing has emerged as a paradigm of Internet computing in which dynamic, scalable and often virtualized resources are provided as services. With the rapid growth of massive data and the constraints of network capacity, traditional web service platforms face prominent problems in calculation efficiency, maintenance cost and data security. In this paper, we offer a spatial statistics service based on the Microsoft cloud. An experiment was carried out to evaluate the availability and efficiency of this service. The results show that the spatial statistics service is conveniently accessible to the public and processes data efficiently.
About the first author: Mu Xiaoyan, E-mail: iccgs@whu.edu.cn


INTRODUCTION
Web services can bring together applications running on diverse platforms, enable the exchange of database information, and allow applications originally meant for internal use to be made available over the Internet (Ferris & Farrell, 2003). Cloud computing has emerged as a paradigm of Internet computing in which dynamic, scalable and often virtualized resources are provided as services. With the rapid growth of massive data and the constraints of network capacity, traditional web service platforms face prominent problems in calculation efficiency, maintenance cost and data security. To address these problems, individual applications and enterprise storage are being deployed on clouds. When a service is deployed in the cloud, infrastructure such as storage and networks, which the developer must supply in a traditional host-client pattern, becomes available through the Internet from anywhere.
In this paper, we propose several spatial statistics services: a matrix operations service, a statistical calculation service and a regression analysis service. Integrating these services, we report an effort to develop a service system that provides the eigenvector-based spatial filtering (ESF) method for regression analysis with spatial data. The ESF service system gives users access to this powerful regression method by provisioning a series of related services with the support of cloud computing. With these services, users can efficiently and conveniently perform regression analysis on their own spatial data.
Microsoft Windows Azure is chosen as the platform on which to develop and deploy these services and the ESF service system. Windows Azure consists of a variety of services on top of a common platform, and it supports applications built on the .NET Framework and other popular programming languages supported on Windows systems (Zhang et al., 2010). Azure is selected because of our familiarity with the .NET Framework and its development tools, and because it provides the full range of services needed to develop an ESF system that takes advantage of parallel computing and easy Web access. The remainder of this paper is arranged as follows. The second section analyzes the ESF algorithm. The third section designs the service system and the Web page layout. Subsequently, an experiment is presented to demonstrate the availability and accessibility of these services and the accuracy of the eigenvector-based spatial filtering regression model.

ANALYSIS OF ESF ALGORITHM
Ordinary least squares (OLS) is a commonly used method in regression analysis. However, when dealing with spatial data, the existence of spatial autocorrelation often leads to violations of the basic assumptions of OLS (Tobler, 1970; Hubert et al., 1981; Getis, 1990). The eigenvector-based spatial filtering (ESF) service provides an effective spatial regression model for handling spatial autocorrelation. The effect of the variables in the regression model is divided into two parts: a spatial influence and a non-spatial influence. Once the spatial influence has been extracted and filtered out, the basic method can be used for the regression analysis.
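The split into spatial and non-spatial influence can be written compactly. The formulation below is the one commonly used in the ESF literature, not necessarily the exact form implemented in this service; the symbols y, X, C, E_k, beta, gamma and the centring of the connectivity matrix are assumptions introduced here for illustration.

```latex
% Assumed notation (not taken from the original text):
%   y   : n x 1 response vector
%   X   : n x p matrix of explanatory variables (including the intercept)
%   C   : n x n spatial connectivity matrix
%   E_k : n x k matrix whose columns are the selected eigenvectors
%   1   : n x 1 vector of ones
\begin{align}
  \left(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^{\top}\right) C
  \left(I - \tfrac{1}{n}\mathbf{1}\mathbf{1}^{\top}\right)
    &= E \Lambda E^{\top}, \\
  y &= X\beta + E_k\gamma + \varepsilon,
  \qquad \varepsilon \sim N(0, \sigma^{2} I).
\end{align}
```

Under this reading, the term E_k gamma absorbs the spatial influence, so the remaining model can be estimated with ordinary least squares.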

ESF algorithm process
(1) Generate the spatial connectivity matrix.
(2) Calculate the eigenvalues and eigenvectors of the connectivity matrix.
(3) Select eigenvectors as spatial filters.
(4) Perform ordinary least squares regression analysis with the original explanatory variables and the selected eigenvectors.
(5) Assess the quality of the model.
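Steps (1) to (3) can be illustrated with a minimal pure-Python sketch. It is not the COM implementation described in this paper. The function names, the rook-adjacency grid, the double centring of the connectivity matrix and the 0.25 eigenvalue-ratio threshold are all assumptions chosen for illustration, and a simple cyclic Jacobi iteration stands in for whatever eigen-solver the service uses.

```python
import math


def rook_connectivity(rows, cols):
    """Step 1: binary connectivity matrix for a rows x cols grid (rook adjacency)."""
    n = rows * cols
    c = [[0.0] * n for _ in range(n)]
    for r in range(rows):
        for k in range(cols):
            i = r * cols + k
            if k + 1 < cols:                      # neighbour to the right
                c[i][i + 1] = c[i + 1][i] = 1.0
            if r + 1 < rows:                      # neighbour below
                c[i][i + cols] = c[i + cols][i] = 1.0
    return c


def double_center(c):
    """Return M C M with M = I - 11'/n (assumed centring; c must be symmetric)."""
    n = len(c)
    row_mean = [sum(row) / n for row in c]
    grand_mean = sum(row_mean) / n
    return [[c[i][j] - row_mean[i] - row_mean[j] + grand_mean
             for j in range(n)] for i in range(n)]


def jacobi_eigen(a, tol=1e-10, max_sweeps=100):
    """Step 2: eigenpairs of a symmetric matrix by cyclic Jacobi rotations.

    Returns a list of (eigenvalue, eigenvector) sorted by descending eigenvalue.
    """
    n = len(a)
    a = [row[:] for row in a]                     # work on a copy
    v = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    for _ in range(max_sweeps):
        off = math.sqrt(sum(a[i][j] ** 2
                            for i in range(n) for j in range(n) if i != j))
        if off < tol:
            break
        for p in range(n - 1):
            for q in range(p + 1, n):
                if abs(a[p][q]) < 1e-15:
                    continue
                theta = (a[q][q] - a[p][p]) / (2.0 * a[p][q])
                t = math.copysign(1.0, theta) / (abs(theta) + math.sqrt(theta * theta + 1.0))
                cs = 1.0 / math.sqrt(t * t + 1.0)
                sn = t * cs
                for k in range(n):                # rotate columns p and q
                    akp, akq = a[k][p], a[k][q]
                    a[k][p] = cs * akp - sn * akq
                    a[k][q] = sn * akp + cs * akq
                for k in range(n):                # rotate rows p and q
                    apk, aqk = a[p][k], a[q][k]
                    a[p][k] = cs * apk - sn * aqk
                    a[q][k] = sn * apk + cs * aqk
                for k in range(n):                # accumulate eigenvectors
                    vkp, vkq = v[k][p], v[k][q]
                    v[k][p] = cs * vkp - sn * vkq
                    v[k][q] = sn * vkp + cs * vkq
    pairs = [(a[j][j], [v[i][j] for i in range(n)]) for j in range(n)]
    pairs.sort(key=lambda pair: pair[0], reverse=True)
    return pairs


def select_filters(pairs, min_ratio=0.25):
    """Step 3: keep eigenvectors whose eigenvalue ratio to the largest one is high."""
    lam_max = pairs[0][0]
    return [(lam, vec) for lam, vec in pairs
            if lam > 0 and lam / lam_max >= min_ratio]


# Usage on a hypothetical 3 x 3 grid of regions.
pairs = jacobi_eigen(double_center(rook_connectivity(3, 3)))
filters = select_filters(pairs)
print(len(filters), "eigenvectors selected as candidate spatial filters")
```

The selected eigenvectors would then be appended to the explanatory variables as extra columns for the OLS regression in step (4).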

The concrete implementation of the ESF algorithm
The Component Object Model (COM) from Microsoft is one of the mature component technologies and is widely employed in Microsoft's operating systems and application software. Component technology is a powerful and popular way to achieve software reuse, shorten programming time, decrease maintenance costs and allow programs to be upgraded dynamically. A Dynamic Link Library (DLL) provides a convenient way to share data and code. It can save disk space significantly because different applications can share the same DLL functions. Various programming tools can be used to write DLL files, such as Visual C++, C++ Builder and Delphi. The eigenvector-based spatial filtering algorithm is written with COM technology in the Microsoft Visual C++ 6.0 environment. The corresponding DLLs are then generated so that they can be accessed dynamically at run time and invoked by the shared service.

The overall framework and functional module
The ESF service system is divided into four major modules: a fundamental function module, a matrix calculation module, a statistical computing module and a regression calculation module. There are four classes, one for each module. The fundamental function class does not need to be exposed to users as an interface, so it serves as a base class, while the other three classes are interface classes. The next section gives detailed descriptions of these classes.

The design of the class
(1) CComBase is a common class that implements the fundamental functions. It contains functions used by the subsequent calculations.
(2) CmatrixBase is a class for matrix operations and is ultimately exposed to the user as an interface. Users can invoke it externally without understanding the details of its implementation. The interfaces and functions are shown in Table 1.

The generation of DLL
The function framework for the COM component is first built in Visual C++ 6.0 according to the system structure designed in the previous section, which keeps the code clearly organized. Once the code has been written, clicking the DLL generation button in the Build tab generates the DLL. Users can invoke the interface functions from other programs after registering the DLL on their computers.

The system design of ESF service
The DLL component functions are wrapped as web services, which are then deployed on the cloud platform for users to access. The web service interfaces are written so that invoking them performs the regression analysis. The specific implementation process is shown in Figure 2.
Figure 2. Web service process
The ESF service system is a more complete service system that includes user management, document management, data processing and other functions. The service architecture is shown in Figure 3.
Design the website page and invoke the Web service

The website design of the service
We create two websites, a web service website and an ESF service website that integrates the related web services, since users may need one or several services. On the web service website, the name of each interface corresponds to its function. Each web service carries brief instructions on its interface so that users can clearly understand its use. The user and result interfaces show that the ESF service is conveniently accessible to the public and processes requests efficiently. In addition, because they are deployed on the Microsoft cloud, these spatial statistics services can be accessed remotely, anywhere and at any time, by entering the website address into a browser.

Comparison of different regression results
The area, population and GDP of each province are selected as experimental data. Regression analysis is performed on these data in both the GeoDa software and the ESF system. The regression results are shown in Table 5.
Table 5. Regression results in different software
Illustration: GeoDa-Classic: regression result with the GeoDa software; ESF-Simple (eigenvectors chosen by a simple ratio): regression result with the ESF system.
In all the regressions, four parameters are calculated to assess the quality of the models. For convenience of comparison, the significance tests for the regression equation and the regression parameters are not listed here. Among those parameters, both R-squared and adjusted R-squared are used to test the goodness of fit of the regression model: a higher value means a better fit and a lower value means a worse fit. The Akaike information criterion is used to test the accuracy of the regression model: a lower value means better accuracy and a higher value means worse accuracy.
Result: As displayed in Table 5, the ESF service system achieves a better fit than the GeoDa software and improves the accuracy of the regression model.
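For readers who want to reproduce the quality measures, the sketch below computes R-squared, adjusted R-squared and the Akaike information criterion from observed and fitted values. It assumes the usual Gaussian log-likelihood form of AIC; the function name fit_statistics and the toy numbers are hypothetical, and the exact formulas used by GeoDa and by the ESF system may differ in constant terms.

```python
import math


def fit_statistics(y, fitted, n_params):
    """Return (R-squared, adjusted R-squared, AIC) for a least-squares fit.

    n_params counts every estimated coefficient, including the intercept
    and any eigenvector coefficients added by spatial filtering.
    """
    n = len(y)
    mean_y = sum(y) / n
    sse = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))   # residual sum of squares
    sst = sum((yi - mean_y) ** 2 for yi in y)                # total sum of squares
    r2 = 1.0 - sse / sst
    adj_r2 = 1.0 - (1.0 - r2) * (n - 1) / (n - n_params)
    # Gaussian log-likelihood evaluated at the maximum-likelihood variance sse / n.
    log_lik = -0.5 * n * (math.log(2.0 * math.pi) + math.log(sse / n) + 1.0)
    aic = -2.0 * log_lik + 2.0 * n_params
    return r2, adj_r2, aic


# Toy data, purely illustrative (not the provincial data used in the experiment).
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
predicted = [1.1, 1.9, 3.2, 3.8, 5.0]
r2, adj_r2, aic = fit_statistics(observed, predicted, n_params=2)
print(round(r2, 4), round(adj_r2, 4), round(aic, 4))
```

Because adjusted R-squared and AIC both penalize the number of coefficients, they give a fairer comparison than plain R-squared when the ESF model adds eigenvectors as extra regressors.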

CONCLUSION
This paper proposed an ESF service system and several statistics services based on a cloud platform, Windows Azure. Deploying the system on the cloud significantly improves the availability and accessibility of these services. With this system, users can conveniently access the spatial filtering method and use the services to analyze spatial data efficiently. The eigenfunction-based spatial filtering method can eliminate spatial autocorrelation to some extent and improve the accuracy of the regression model.

Table 1. Introduction for the CmatrixBase interface and function
(3) Cstatistic is an interface class for the statistical computing functions. It implements an intermediary step of the ESF algorithm and can be invoked by users. The interfaces and functions are shown in Table 2.

Table 2. Introduction for the Cstatistic interface and function
(4) Ceigenfunction is an interface class for regression computing. It implements an important step of the ESF algorithm and can be invoked by users. The interfaces and functions are shown in Table 3.

Table 3. Introduction for the Ceigenfunction interface and function

Table 4. Introduction for web service interface and function