IMPLEMENTATION OF THE DISTRIBUTED PARALLEL PROGRAM FOR GEOID HEIGHTS COMPUTATION USING MPI AND OPENMP

Much research have been carried out using optimization algorithms for developing high-performance program, under the parallel computing environment with the evolution of the computer hardware technology such as dual-core processor and so on. Then, the studies by the parallel computing in geodesy and surveying fields are not so many. The present study aims to reduce running time for the geoid heights computation and carrying out least-squares collocation to improve its accuracy using distributed parallel technology. A distributed parallel program was developed in which a multi-core CPU-based PC cluster was adopted using MPI and OpenMP library. Geoid heights were calculated by the spherical harmonic analysis using the earth geopotential model of the National Geospatial-Intelligence Agency(2008). The geoid heights around the Korean Peninsula were calculated and tested in diskless-based PC cluster environment. As results, for the computing geoid heights by a earth geopotential model, the distributed parallel program was confirmed more effective to reduce the computational time compared to the sequential program. * Corresponding author.


INTRODUCTION
These days, global navigation satellite systems (GNSS) have been widely used and geoid is a very important element which makes it possible to determine orthometric height using GNSS positioning.Moreover, many countries have continued to make efforts to develop their own high-precision geoid model (Chandler and Merry, 2010;Kuroishi et al., 2002;Roman et al. 2009;Toth et al. 2000) Since global geopotential models (GGMs) well expresses long wavelength component of the Earth's gravity field (Daho et al. 2008;Krynski and Lyszkowicz 2006), they not only provide basis for the gravity field to develop high-precision geoid model but also have important meanings as reference surface for calculating a local geoid (Bae et al., 2011;Dawod et al., 2010).In addition, countries which are difficult to develop geoid models have been using GGMs for rough calculation of geoid heights (or undulations) and gravity anomaly through spherical harmonic analysis (SHA) (Lee et al., 2008) Geopotential data that is recently observed by GRACE satellite has contributed to more precise GGMs development and especially these (geopotential data) have been greatly contributed to development of Earth gravitational model 2008 (EGM2008).EGM2008 is complete to degree and order 2,159 and includes additional spherical harmonic coefficients extending to degree 2,190 and order 2,159 (Pavlis et al., 2008;Pavlis et al., 2012).
Moreover, because GGMs-derived geoid heights are referred to a global vertical datum and these are fitted with a local vertical datum by GPS/levelling-derived geoid heights, the accuracy of these (geoid heights) could be enhanced.For this purpose, leastsquares collocation (LSC) method is widely used (Kotsakis et al., 2008;Lee et al., 2008).
As mentioned above, although SHA and LSC are mostly used to evaluate accuracy and suitability of ultra-high degree geopotential models for certain local areas or countries, they all contain time-consuming problem.In geoscience simulations, to solve this problem, resources are distributed parallel computation is getting popularity to reduce data processing time.This paper is intended to carry out LSC to fit ultra-high degree SHA by EGM2008 using distributed parallel computation on PC cluster and present its performance evaluation.
EGM2008 is the first-ever global model that is capable of resolving the Earth's gravity field beyond spherical harmonic degree 2000.EGM96 that is a widely used GGM has spatial resolution of 30 arc minutes whereas EGM2008 is a highresolution GGM of Earth's gravity field that allows computation of geoid heights down to a resolution of 5 arc minutes (Dawod et al., 2010;Pavlis et al. 2008).
International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XXXIX-B4, 2012 XXII ISPRS Congress, 25 August -01 September 2012, Melbourne, Australia Especially, unlike conventional GGMs, EGM2008 contains gravity data of 15' resolution for Asian regions and also gravity data calculated from SRTM terrestrial data.Therefore, it is known to be 3-6 more accurate than a previous model, EGM96 (Pavlis et al., 2008;Pavlis et al., 2012).

GPS/levelling data
GPS/levelling data is a very important factor not only to evaluate accuracy of the developed gravimetric geoid models but also fit gravimetric geoid models to local vertical datum.
And accuracy for GGMs is evaluated primarily by crossvalidation between geoid heights calculated from SHA and geometric geoid heights calculated from GPS/levelling data.This paper, in order to evaluate the accuracy of EGM2008derived geoid heights and to fit the EGM2008-derived geoid heights to local vertical datum, used a total of 1,175 points of the GPS/levelling data which is acquired from unified control points (UCPs) installed with 10 km spacing in South Korea (Figure 1).
Figure 1.Distribution of a total of 1175 points of the GPS/levelling data (blue square)

Spherical harmonic analysis
SHA is a process of decomposing a function on a sphere into components of various wavelengths using surface spherical harmonic as base functions.Newtonian gravitational potential V satisfies Poisson's equation and gravitational potential satisfies Laplace's equation in free space.Solution of Laplace's equation can be expressed by harmonic functions series.And disturbing potential is calculated and then it substitutes Bruns's formula and finally geoid heights N is calculated.Further details on this spherical harmonic analysis are given in Rapp and Pavlis (1990), Heiskanen and Moritz (1996), and Kenyon and Pavlis (1996).

Least-squares collocation
Least-squares collocation in Gravity field addresses correlation between signal and noise given by gravity data, and using this relation predicts signal where observation is not carried out.
The LSC represents one of the major theoretical and practical foundations of modern physical geodesy (Tscherning, 1986;Tsaoussi, 1989).
In this paper, we used LSC to fit the EGM2008-derived geoid heights with geometric geoid heights calculated from GPS/levelling data .The most important rule to apply LSC is that data needs to be centered before it is collocated (Moritz 1980).In other words, trend has to be removed from the raw data such that mean of the data would be equal to zero (Forsberg et al., 2003).Procedure to remove this trend can be obtained by applying various trend models to raw data.
The introduction of the GRAVSOFT package programmed by Forsberg et al. (2003) was a turning point in LSC application and adoption (Darbeheshti, 2009).In this paper, in order to fit geoid surface using LSC, GEOIP and GEOGRID in the GRAVSOFT written by Fortran 77 language are used and program of GCOMB is ported by C++ language and then used.Further details on this spherical harmonic analysis are given in Forsberg et al. (2003) and Darbeheshti (2009).

MPI and OpenMP programming
Generally, parallel programming is classified as distributed programming using the message-passing interface (MPI) and multi-core CPU-based parallel programming using the open multi-processing (OpenMP).
MPI is a standardization for message passing and currently the de facto standard and the parallelization method for developing the high-performance computing (HPC) applications on distributed memory architecture (Message Passing Interface Forum, 1994;Wittwer, 2006;Chorley and Walker, 2010).The message-passing programming model is based on the abstraction of a parallel computer with a distributed address space where each processor has a local memory to which it has exclusive access (Rauber and Rünger, 2010).However, in the case of the communication of high volume data in the lowperformance distributed computing environment which consists of low-speed network environment, MPI application's performance can be reduced.
OpenMP is a standardization and an application programming interface (API) that supports multi-platform shared memory multi-processing programming.It consists of a set of compiler directives, library routines, and environment variables that influence run-time behavior (Chapman et al, 2007;Refianti et al., 2011).Many of the major compiler vendors (e.g.Microsoft, Intel, HP) support OpenMP technology.And one of advantages of the OpenMP technology is to allow relatively fast development of parallel applications through the global access of application memory address space (Jin et al., 2011).Especially, it is available in multiple operating system environments.However, the OpenMP technology does not support the distributed computing environment.
In order to overcome this problem, researches to solve timeconsuming problem through hybrid MPI and OpenMP approach which combined MPI and OpenMP have been recently proceeding.

Parallelization modelling
This research performed parallelization modeling by applying hybrid approach to analyze accuracy of geoid heights that is finally determined by fitting the EGM2008-derived geoid heights to geometric geoid heights calculated from GPS/levelling data and determine optimum correlation length.
The parallelization modeling consists of SHA and LSC (Figure 2).And a task of the parallelization model is separated into one cluster master and cluster nodes.The cluster master performs to allocate tasks and gather results in each node.And the cluster nodes perform tasks of allocated regions.
First, the master node is to allocate EGM2008 coefficients, working scopes to compute EGM2008-dervied geoid heights to each cluster node.Here, generally, working scopes used for distributed parallel processing are divided with grid shapes but their associated Legendre function values are same in same latitude (Xiao and lu, 2007).Therefore, the working scopes can be divided based on latitude.Next, the master node performs to gather EGM2008-dervied geoid heights and then distributes the EGM2008-derived geoid heights and working scopes for LSC tasks to each cluster node.Next, each cluster node carries out LSC fitting using the EGM2008-derived geoid heights, geometric geoid heights calculated from GPS/levelling data and correlation length and explores correlation length which has minimum standard deviation at corresponding node.Finally, the cluster master performs to gather optimum correlation length in each node and then chooses final optimum correlation length.And hybrid MPI and OpenMP application to carry out SHA and LSC is ported so that it can be executed on Linux platform written with Fortran 77 language (Forsberg et al., 2003;Rapp, 1982;Yecai, 1994).These are implemented by C++ using MPICH2 and OpenMP library and also compiled by GNU C Compiler.

Computing platform
Diskless-based PC cluster system is implemented for distributed parallel computing using hybrid MPI and OpenMP approach proposed in this paper (Table 1).This cluster system consists of 1 cluster master and 16 cluster nodes, and its network environments is made of 100MB bps switch hub.Operating system software for cluster master used Community ENTerprise Operating System (CentOS) 6.2 x86 64bit, with GCC 4.4.6 compiler used for compilation of code and MPI library used MPICH2 1.2.1 (http://www.mcs.anl.gov/mpi/mpich2).Cluster nodes are structured to be driven by diskless method using Perceus 1.6.1 (http://www.perceus.org/)which is a program package made by Infiscale for driving diskless cluster while each node of cluster system is structured in a way that OS image stored at cluster master is connected by PXE network booting method.From the results of analysis, the EGM2008-derived geoid heights are calculated for a total of 357,601 grid points with 1 arc minute of interval for latitude and longitude respectively.And LSC was performed between 0.1 and 150 km by 0.1 km interval.And broadcast time of EGM2008 and the EGM2008derived geoid heights , runtime for SHA of EGM2008, runtime to find optimum correlation length, and total computation time are measured (Figure 3). 2 processors means 1 node and 1 processor means an environment without node i.e. test result by serial algorithm at single computer not at cluster system.As results, the broadcast time of EGM2008 and the EGM2008derived geoid heights needed approximately 4 seconds at 1 node (2 processors) and approximately 18 seconds for maximum node (32 processors).The share (ratio) of broadcast time in total computation time showed almost 0%.This means that although it has almost no influence upon total computation time, it increased linearly depending on the number of nodes.
For SHA, the calculations were possible with approximately 3.6% of 1 processor.And LSC fitting was possible to be calculated with approximately 3.8% of 1 processor.The total computation time was approximately 6 hours and 40 minutes (0 node) and approximately 15 minutes (maximum node).This is similar to the result of SHA research using supercomputer, which means parallization modeling for SHA and LSC fitting suggested by this paper is well executed in low-performance cluster system (Xiao and lu, 2007).

CONCLUSION
In this paper, we proposed spherical harmonic analysis used to evaluate suitability of GGMs and hybrid MP and OpenMP approach to reduce data processing time for least-squares collocation fitting.
Hybrid implementation using parallelization modeling showed more advanced result in calculation speed than serial implementation.Moreover, cluster system with lowperformance not high-performance computing (HPC) like supercomputer performed SHA and LSC fitting more effectively.

Figure 2 .
Figure 2. Parallelization model flow chart of SHA and LSC

Figure 3 .
Figure 3. Running time of hybrid MPI and OpenMP application for computing of SHA and LSC.(a) broadcast time of EGM2008 and the EGM2008-derived geoid heights, (b) runtime for SHA of EGM2008, (c) runtime to find optimum correlation length using LSC fitting, and (d) total computation time.

Table 1 .
Summary of the diskless-based PC cluster system