ESTIMATION OF GRASS YIELD IN A LARGE REGION BASED ON A GEOGRAPHICALLY WEIGHTED REGRESSION MODEL

Grass yield reflects grassland productivity and is the basis for managing animal husbandry production. Remote sensing has become an efficient and feasible means of estimating grass yield. In this study, Geographically Weighted Regression (GWR) was applied to grass yield estimation. The spatial characteristics of the field samples were taken into account, so that each sample has a local function covering the area around it. The parameters of this function are determined by a weighting function based on the spatial distance between the sample and the samples around it. GWR is well suited to modelling without spatial stationarity, and as a result it achieves a markedly better model fit. Based on the GWR model, grassland production can be estimated well. Qinghai Province, with an area of about 0.72 million square kilometres, was taken as an example. It is an important province on the Qinghai-Tibet Plateau, where grassland is closely tied to the local animal husbandry economy and directly affects regional ecosystem security. Landsat TM data from 2013 and field samples were used to estimate production. The input parameters, OSAVI and FVC, have correlation coefficients above 97% with grass yield. A total of 201 samples were used for modelling, and the accuracy is 87.27%, about 47 percentage points higher than that of the multiple linear regression model, a widely used traditional statistical model. Another 220 samples were used to verify the results, and the accuracy reaches 81.3%. Our results indicate that the grass yield of Qinghai Province in 2013 was 1.018×10⁸ tons. The difference between our estimate and the figure from the professional department is less than 10%.
* Corresponding author. chfluo@casm.ac.cn, +86-10-63880548, +86 135 2276 0996.


INTRODUCTION
As the natural forage resource for livestock production, grassland productivity directly reflects the carrying capacity of pasture. There have been many studies on estimating grass yield with remote sensing technology (Jin Y X, 2011; Benie G B, 2005; Liu X Y, 2010). The issues in focus include how to estimate grass yield quickly and efficiently, how to monitor grassland productivity over a large region, and how to understand its current state and development trend. These issues have become key topics in ecology and grassland science, and they are also practical problems that grassland management urgently needs to solve (Xu B, 2007). Two kinds of model are widely used to estimate grass yield: the biological-physical model and the statistical empirical model (Lv H Y, 2010). The former is usually applied to large regions with coarse-resolution data, such as NOAA and MODIS, and its results have low accuracy. When high-accuracy estimation is needed for a local region, the modelling becomes more complex and is strongly affected by the natural characteristics of the geographical area (Chen J, 2009). The statistical empirical model, which is built from ground truth data and remote sensing data, is relatively simple and practicable. However, because the spatial heterogeneity found in nature is usually ignored in practice, high estimation accuracy is difficult to achieve (Tao W G, 2007). According to the first law of geography, the closer features are to each other, the more similar they are (Miller H J, 2004). It is necessary to take this rule into account in modelling in order to obtain better results.
The Geographically Weighted Regression (GWR) model makes it possible to carry out regression analysis on the relationships between spatial features. Grass yield is related to geographical location, and the adjacency of geographically close features gives grass yield its spatial correlation. In this study, GWR was adopted for modelling because it can reflect the spatial characteristics of grass yield. Another contribution is that the estimation was carried out with 30 m resolution imagery over a large region of about 720,000 square kilometres.

Study Area
Qinghai Province is one of the four largest provinces in China and spans 8 degrees of latitude. It is also one of the five major national pastoral areas, with grassland accounting for 50.46% of its total land area. Grassland resources are abundant: the available pasture area accounts for 87% of the province's natural grassland area and about 15% of the available grassland area in China (Shen Y C, 1991). Grassland is mainly found on the Southern Qinghai Plateau, in the Qilian Mountains and in the mountains along the south-eastern margin of the Qaidam Basin. There are 9 grassland categories: alpine meadow, alpine grassland, alpine meadow grassland, alpine desert, temperate steppe, temperate desert, temperate desert steppe, mountain meadow and lowland meadow. Alpine meadow and alpine grassland form the main body of the natural grassland, covering about 2948.16×10⁴ hm² and accounting for 80.88% of the province's total grassland area. With remote sensing technology, grass yield can be monitored comprehensively at the macro scale, which is of great significance for balancing livestock development and ecological construction (Yan D L, 2007).

Data and Preprocessing
Landsat 8 TM data with a resolution of 30 m were used in the study. Because grass growth is at its best in August each year, images acquired within one month before or after 15 August were selected. In total, 46 scenes were involved.
The field data come from the ground survey carried out by the local grassland supervision department in August 2013. There are about 600 typical and representative samples in total.
The preprocessing included radiometric calibration, atmospheric correction, terrain correction and cloud removal.
FLAASH was used for atmospheric correction and the C-correction model (Wang S, 2013) for terrain correction. The method of Li B X (2010) was used to remove the effects of cloud.
In the GWR parameter estimation, β̂ is the estimate of β; the first row elements of the independent variable matrix are 1; X is the independent variable matrix of the model factors; Y is the vector of values of the dependent variable, that is, the measured ground data of grass yield; W is a square matrix of weights relative to the position (u_i, v_i).

Input Parameter
Many studies use vegetation indices to estimate grass yield from remote sensing data (Skinner R H, 2011; JOSEM.PARCEIO, 1997). Weather elements such as temperature, rainfall and dryness are also used as input parameters (Li W J, 2012; Wei Y X, 2012). In general, the correlation coefficient (Xu H L, 2013) and the estimation accuracy are two important indices for comparing and choosing input parameters. In this study, five vegetation indices (VIs) were compared: the normalized difference vegetation index (NDVI), enhanced vegetation index (EVI), soil adjusted vegetation index (SAVI), modified soil adjusted vegetation index (MSAVI) and optimized soil adjusted vegetation index (OSAVI) (Chen P F, 2007). After several tests with the five VIs, weather elements and fraction of vegetation coverage (FVC), EVI and FVC were finally adopted as the input parameters for estimating grass yield.
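The five vegetation indices compared above can be computed from surface reflectance as in the following minimal sketch. The formulas are the commonly published definitions and are not taken from this paper; the function name, the band arguments (blue, red and near-infrared reflectance in the range 0 to 1), the soil factor L = 0.5 for SAVI, the constant 0.16 for OSAVI and the EVI coefficients are assumptions made for illustration, and the reflectance values are hypothetical.

```python
import math


def vegetation_indices(blue, red, nir, soil_l=0.5):
    """Return common vegetation indices for one pixel of surface reflectance (0-1)."""
    ndvi = (nir - red) / (nir + red)
    evi = 2.5 * (nir - red) / (nir + 6.0 * red - 7.5 * blue + 1.0)
    savi = (1.0 + soil_l) * (nir - red) / (nir + red + soil_l)
    # MSAVI in its closed, self-adjusting form (no explicit soil factor)
    msavi = (2.0 * nir + 1.0
             - math.sqrt((2.0 * nir + 1.0) ** 2 - 8.0 * (nir - red))) / 2.0
    osavi = (nir - red) / (nir + red + 0.16)
    return {"NDVI": ndvi, "EVI": evi, "SAVI": savi, "MSAVI": msavi, "OSAVI": osavi}


# Hypothetical reflectance values for a vegetated pixel
vi = vegetation_indices(blue=0.04, red=0.06, nir=0.40)
for name, value in vi.items():
    print(name, round(value, 4))
```

In a study like this one, each index would be computed per pixel and then correlated with the measured grass yield at the sample locations to choose the input parameters.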
The method described by Brian Johnson (2012) was used to estimate FVC, and its accuracy for the entire study area in August 2013 is 85.8%.

Modelling with GWR
To model with GWR, the fresh weight of grass measured on the spot was chosen as the dependent variable, and EVI and FVC were the independent variables. After modelling, GWR gives a separate equation for every sample. With the 201 samples, the modelling accuracy of GWR is 87.3%, while that of OLS is 40.35%. The statistical results of the model parameters also show that the statistical indices for GWR are better than those for OLS. Here, AICc is the corrected Akaike Information Criterion, r is the correlation coefficient, r² is the coefficient of determination and r² Adj is the adjusted r-square.
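The idea of one local equation per sample, in contrast to a single OLS equation, can be illustrated with the following minimal sketch. It is not the authors' implementation: the Gaussian kernel form exp(-(d/b)²), the bandwidth value, the helper names and the synthetic data (random stand-ins for EVI and FVC on a 5 × 5 grid of sample locations) are all assumptions made for illustration.

```python
import math
import random


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def weighted_fit(rows, y, w):
    """Weighted least squares: solve (X^T W X) beta = X^T W y, with an intercept column."""
    xs = [[1.0] + list(r) for r in rows]
    p = len(xs[0])
    xtwx = [[sum(wi * xi[j] * xi[k] for wi, xi in zip(w, xs)) for k in range(p)]
            for j in range(p)]
    xtwy = [sum(wi * xi[j] * yi for wi, xi, yi in zip(w, xs, y)) for j in range(p)]
    return solve(xtwx, xtwy)


def gwr_fit(coords, rows, y, bandwidth):
    """Return one coefficient vector per sample, using Gaussian distance weights."""
    betas = []
    for ui, vi in coords:
        w = [math.exp(-((math.hypot(ui - uj, vi - vj) / bandwidth) ** 2))
             for uj, vj in coords]
        betas.append(weighted_fit(rows, y, w))
    return betas


# Synthetic example: the effect of the first predictor grows from west to east
rng = random.Random(0)
coords = [(float(i), float(j)) for i in range(5) for j in range(5)]
rows = [(rng.random(), rng.random()) for _ in coords]  # stand-ins for (EVI, FVC)
y = [1.0 + (1.0 + 0.5 * u) * x1 + 2.0 * x2 for (u, _), (x1, x2) in zip(coords, rows)]

betas = gwr_fit(coords, rows, y, bandwidth=1.5)
ols = weighted_fit(rows, y, [1.0] * len(y))  # equal weights reduce to ordinary OLS
print("OLS coefficients:", ols)
print("GWR coefficients at the first sample:", betas[0])
print("GWR coefficients at the last sample:", betas[-1])
```

With equal weights the same routine reduces to OLS, which makes the contrast between one global equation and one equation per sample easy to inspect.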

Grass yield estimating
When estimating grass yield for the whole region, the input parameters were the raster data of EVI and FVC, and the value was calculated pixel by pixel. For each pixel, the nearest sample was found and the fitting equation of that sample was used to calculate the grass yield of the pixel. For a large region the amount of data is huge, and the key to improving efficiency is to identify the nearest sample in the shortest time.
After repeated trials, it was found that building a quadtree index over all samples clearly improves the processing speed.
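The nearest-sample lookup described above can be sketched with a point quadtree (the "four forks tree" index). This is an illustrative sketch, not the authors' code: the class and function names, the leaf capacity, the depth limit, the bounding box and the randomly generated sample locations and coefficients are all hypothetical. It assumes every sample lies inside the root bounding box.

```python
import math
import random


class QuadTree:
    """Point quadtree over sample locations; each leaf holds up to `cap` samples."""

    def __init__(self, x0, y0, x1, y1, cap=4, depth=0):
        self.box = (x0, y0, x1, y1)
        self.cap, self.depth = cap, depth
        self.points = []      # (x, y, sample_id), stored in leaves only
        self.children = None  # four sub-quadrants once the leaf is split

    def _child_for(self, x, y):
        x0, y0, x1, y1 = self.box
        mx, my = (x0 + x1) / 2.0, (y0 + y1) / 2.0
        return self.children[(1 if x >= mx else 0) + (2 if y >= my else 0)]

    def insert(self, x, y, sample_id):
        if self.children is not None:
            self._child_for(x, y).insert(x, y, sample_id)
            return
        self.points.append((x, y, sample_id))
        if len(self.points) > self.cap and self.depth < 16:  # depth limit guards duplicates
            x0, y0, x1, y1 = self.box
            mx, my = (x0 + x1) / 2.0, (y0 + y1) / 2.0
            d = self.depth + 1
            self.children = [QuadTree(x0, y0, mx, my, self.cap, d),
                             QuadTree(mx, y0, x1, my, self.cap, d),
                             QuadTree(x0, my, mx, y1, self.cap, d),
                             QuadTree(mx, my, x1, y1, self.cap, d)]
            pts, self.points = self.points, []
            for px, py, sid in pts:
                self._child_for(px, py).insert(px, py, sid)

    def _box_dist(self, x, y):
        x0, y0, x1, y1 = self.box
        return math.hypot(max(x0 - x, 0.0, x - x1), max(y0 - y, 0.0, y - y1))

    def nearest(self, x, y, best=(math.inf, None)):
        """Return (distance, sample_id) of the sample closest to (x, y)."""
        if self._box_dist(x, y) >= best[0]:
            return best  # this quadrant cannot contain a closer sample
        if self.children is None:
            for px, py, sid in self.points:
                d = math.hypot(px - x, py - y)
                if d < best[0]:
                    best = (d, sid)
            return best
        for child in sorted(self.children, key=lambda c: c._box_dist(x, y)):
            best = child.nearest(x, y, best)
        return best


def estimate_pixel(tree, coefs, x, y, evi, fvc):
    """Apply the local equation of the nearest sample to one pixel."""
    _, sid = tree.nearest(x, y)
    b0, b1, b2 = coefs[sid]
    return b0 + b1 * evi + b2 * fvc


# Hypothetical sample locations and local coefficients (intercept, EVI, FVC)
rng = random.Random(1)
samples = [(rng.uniform(0, 100), rng.uniform(0, 100)) for _ in range(201)]
coefs = [(rng.uniform(0, 1), rng.uniform(0, 2), rng.uniform(0, 2)) for _ in samples]
tree = QuadTree(0.0, 0.0, 100.0, 100.0)
for sid, (sx, sy) in enumerate(samples):
    tree.insert(sx, sy, sid)

print(estimate_pixel(tree, coefs, x=42.0, y=17.0, evi=0.35, fvc=0.6))
```

The search visits the closest quadrants first and skips any quadrant whose bounding box is farther away than the best sample found so far, which is what avoids comparing every pixel against every sample.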
The estimated grass yield of Qinghai Province in 2013 is shown in Figure 4. Grass yield decreases from southeast to northwest. The regions with high grass yield lie in the southeast and include Haibei, Haidong, Huangnan, Guoluo and the south-eastern part of Yushu. The Haixi Autonomous Prefecture is desert and wilderness and therefore has the lowest grass yield. According to the estimation, the total grass yield of Qinghai Province in 2013 is 1.018×10⁸ tons. Non-grass areas were excluded from the calculation; they were identified from the thematic map of the second grassland survey of Qinghai Province.

Precision verification
Root mean squared error (RMSE) (Jin Y X, 2011) was used to verify the accuracy. In theory, RMSE = 0 means that the model is error-free, which is the ideal state. In practical applications, the closer the predicted values are to the measured values, the better the model fit. The actual estimation accuracy of the model is 1 - RMSE.
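The accuracy measure above can be sketched as follows. The text does not say how RMSE is normalized so that 1 - RMSE can be read as a percentage; this sketch assumes that each error is taken relative to the measured value. That normalization, the function names and the sample values are assumptions, not the paper's definition.

```python
import math


def relative_rmse(predicted, measured):
    """Root mean squared relative error between predicted and measured yields."""
    squared = [((p - m) / m) ** 2 for p, m in zip(predicted, measured)]
    return math.sqrt(sum(squared) / len(squared))


def accuracy(predicted, measured):
    """Estimation accuracy as described in the text: 1 - RMSE."""
    return 1.0 - relative_rmse(predicted, measured)


# Hypothetical measured and predicted yields at three verification samples
measured = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]
print(accuracy(predicted, measured))
```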
A total of 210 ground truth samples were used for precision verification, and the resulting accuracy is 81.3%. The total grass yield estimated in the study was also compared with that obtained from ground surveys by the local grassland department. The difference is less than 10%.

CONCLUSIONS
Location is introduced into the estimation model through GWR. The parameters differ between samples at different geographical locations, which reflects the spatial differences between these points. In other words, the same factors influence the dependent variable differently at different geographical positions. GWR can therefore adequately reflect the spatial variation of grass yield over a large region. The experiments in this study show that GWR clearly improves the goodness of fit of the model; at the same time, both the estimation accuracy and the verification accuracy exceed 80%. The difference between the remote sensing estimate and the ground survey is less than 10%. This technical method can be used for long-term continuous monitoring over large areas.
When grass yield is estimated with the GWR model and remote sensing data, the estimation accuracy depends on the ground samples. It is directly affected by the spatial distribution of the ground samples and by how well they represent grassland production capacity. The high accuracy in this study benefited from typical and representative samples. At present, the grassland department spends a great deal of manpower and material resources every year on ground surveys. With the method described above, using only RS data combined with ground samples, it can be explored whether the number of ground samples can be reduced and what the minimum number of samples is within an acceptable range of estimation error. Such work would reduce the amount of field work for the grassland department.
Based on grass yield estimated quickly with remote sensing, it is meaningful to carry out further research on valuing the service functions of the grassland ecosystem in near real time. From both a theoretical and a practical point of view, such research supports understanding regional ecological importance, scientific management and rational use of grass resources, maintaining the ecological balance of grassland, arranging livestock production reasonably, protecting grass ecosystems and establishing a comprehensive economic accounting system. Future work will focus on ecological evaluation.

Figure 1. Sketch map of the study area

Figure 2. The mosaic image of the study area after preprocessing

Method
The GWR model is an extension of the traditional regression model in which the spatial characteristics of the data are introduced into the model. The geographic locations of the samples are brought into the modelling on the basis of multivariate linear regression, and the regression coefficients are assumed to be functions of sample location. During modelling, GWR calculates a local function for every observed sample, with coefficients determined by a weighting function of the adjacency between samples. GWR handles the non-stationarity of the model well, so the goodness of fit improves and better simulation results are obtained. The GWR model is expressed following Xuan H Y (2007).
According to Tobler's first law of geography, observations near point i have a greater influence on the parameter estimates at point i than observations far from i. Building on the linear regression model, GWR assumes that the regression coefficients are an arbitrary function of the observation location, which brings spatial character into the model. The model calculates a local equation at each point; the weights of the observations no longer remain constant in the regression but depend on their proximity to point i. In practice, the spatial weight matrix is calculated with the Gaussian distance, exponential distance or tricube distance function. This study chose the Gaussian distance function to determine the weights.
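The formulation described in this section can be written in standard GWR notation as in the sketch below. It is given under assumptions: the symbols $(u_i, v_i)$ for the coordinates of sample $i$, $p$ for the number of independent variables, $d_{ij}$ for the distance between samples $i$ and $j$, and $b$ for the bandwidth are introduced here for illustration, and the exact form of the Gaussian kernel used in the study (for example, whether the exponent carries a factor of 1/2) is not stated in the text.

```latex
% Local regression model at sample i
y_i = \beta_0(u_i, v_i) + \sum_{k=1}^{p} \beta_k(u_i, v_i)\, x_{ik} + \varepsilon_i

% Weighted least squares estimate of the local coefficients
\hat{\beta}(u_i, v_i) = \left( X^{T} W(u_i, v_i)\, X \right)^{-1} X^{T} W(u_i, v_i)\, Y

% Gaussian distance weight (one common form; assumed)
w_{ij} = \exp\!\left( -\left( \frac{d_{ij}}{b} \right)^{2} \right)
```

Here $W(u_i, v_i)$ is the diagonal matrix with entries $w_{ij}$, so samples close to point $i$ contribute more to its local coefficients than distant ones.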

Figure 2. The FVC map of Qinghai Province in August 2013

Figure 3. The distribution map of the samples for modelling
The multiple linear regression model is as follows:
Y = 0.684X1 - 0.006X2 + 2.706 (4)

Figure 4. The grass yield map of Qinghai Province in 2013

Table 1 .
By contrast, there is only one equation if the binary linear regression model is used. A total of 201 field samples were involved in building the grass yield estimation model. As shown in Table 2, the parameters of the GWR model change with geographical location.
The range of change of the parameters of the GWR model