EVALUATION OF FUTURE INTERNET TECHNOLOGIES FOR PROCESSING AND DISTRIBUTION OF SATELLITE IMAGERY

Satellite imagery data centres are designed to operate a defined number of satellites, so difficulties appear when new satellites have to be incorporated into the system. This occurs because traditional infrastructures are neither flexible nor scalable. With the appearance of Future Internet technologies, new solutions can be provided to manage large and variable amounts of data on demand. These technologies optimize resources and facilitate the appearance of new applications and services in the traditional Earth Observation (EO) market. The use of Future Internet technologies for the EO sector was validated with the GEO-Cloud experiment, part of the Fed4FIRE FP7 European project. This work presents the final results of the project, in which a constellation of satellites records the whole Earth surface on a daily basis. The satellite imagery is downloaded into a distributed network of ground stations and ingested into a cloud infrastructure, where the data is processed, stored, archived and distributed to the end users. The processing and transfer times inside the cloud, the workload of the processors, automatic cataloguing and accessibility through the Internet are evaluated to determine whether Future Internet technologies present advantages over traditional methods. The applicability of these technologies to provide high added value services is also evaluated. Finally, the advantages of using federated testbeds to carry out large scale, industry driven experiments are analysed by evaluating the feasibility of an experiment developed in the European infrastructure Fed4FIRE and its migration to a commercial cloud: SoftLayer®, an IBM Company.


INTRODUCTION
EO satellite missions make use of on-site infrastructures dedicated to the reception, storage, processing and distribution of the images. The state of the art in remote sensing raises difficulties when new satellites with improved capabilities are incorporated into a system. Traditional infrastructures are neither flexible nor scalable, presenting limitations to manage large amounts of data, to provide services under variable demand, or to redesign the system.
The appearance of Future Internet technologies such as cloud computing facilitates the optimization of resources and overcomes the previous limitations. The main characteristics of cloud computing are elasticity, scalability and on-demand use of resources (Armbrust, 2010). GEO-Cloud is an experiment inside the Fed4FIRE FP7 European Project that makes use of the federated testbeds PlanetLab Europe, Virtual Wall and BonFIRE; its objective is to find out whether Future Internet technologies provide viable solutions for the Earth Observation market. For that purpose, traditional on-site and cloud computing infrastructures are compared.
In addition, this paper includes a feasibility analysis for the migration of a system, designed to be tested in Fed4FIRE, to a public commercial cloud infrastructure: SoftLayer® (SoftLayer, 2015).
The project team was based on three equally important pillars, essential for its success: technology, industry sector and integration. On the technology side, IBM played a significant role with its recently acquired SoftLayer®; Elecnor Deimos contributed its industry experience as a satellite systems integrator and operator; and Itera Process acted as an integrator with deep knowledge of both sides, cloud computing vendors and the industry sector, providing trusted assistance to manage cloud computing technology for high performance computing in commercial infrastructures.
The paper is structured as follows: Section 1 introduces the work herein presented; Section 2 describes the Fed4FIRE infrastructure, including the testbeds used, and the SoftLayer® cloud infrastructure; Section 3 explains the GEO-Cloud experiment; Section 4 discusses the characteristics of having a ground segment implemented in a traditional fashion or in the cloud; Section 5 summarizes the main conclusions of this work.

Fed4FIRE
Fed4FIRE (Fed4FIRE, 2014) is a European FP7 project that federates heterogeneous experimentation testbeds for Future Internet research. In the GEO-Cloud project, the testbeds used were Virtual Wall, PlanetLab and BonFIRE, described as follows:
 Virtual Wall: an emulation environment for experimenting with advanced networks, distributed software and service evaluation, operated by the University of Ghent (Virtual Wall, 2015).
 PlanetLab: a globally distributed network testbed, used in GEO-Cloud to obtain the real network impairments between the cloud nodes and the ground stations (González, et al., 2014).
 BonFIRE: a multi-cloud experimentation platform (Kavoussanakis, et al., 2013) in which the GEO-Cloud data centre was implemented.

SoftLayer®

Each of the data centres that constitute SoftLayer® boasts an identical modular and scalable deployment platform with servers, storage, routers, firewalls and load balancers. The heart of the SoftLayer® service is a software platform and management system that automates every aspect of the infrastructure as a service (IaaS), providing a high performance and robust cloud platform. Furthermore, it improves customer control and transparency, reducing human errors and optimizing costs.
SoftLayer® has designed and deployed a global, interconnected platform that meets the key operational and economic requirements of cloud infrastructure across a broad portfolio of dedicated and shared devices: physical and virtual servers, hourly compute instances and four-way, octo-core bare metal servers, along with a wealth of storage, networking and security components.
SoftLayer® is here proposed as an infrastructure to match the GEO-Cloud business needs, from storage and networking to software, monitoring and security. Moreover, the infrastructure facilitates automated deployment and direct management through the customer portal and the application programming interface (API).

Description
GEO-Cloud consists of a cloud-based Earth Observation system capable of covering the demand of highly variable services (Becedas, 2014). The system comprised i) a Space System Simulator implemented in Virtual Wall, reproducing the behaviour of a constellation of 17 satellites acquiring the whole Earth surface on a daily basis and 12 ground stations receiving the images; ii) a data centre implemented in the BonFIRE multi-cloud system (Kavoussanakis, et al., 2013), which ingested, processed, archived, catalogued and distributed the images to the end users on demand (Pérez, et al., 2014); and iii) an experiment using PlanetLab nodes to obtain the real network impairments between the cloud nodes and the ground stations implemented in Virtual Wall (González, et al., 2014).
The GEO-Cloud architecture was tested and validated with predesigned scenarios emulating realistic use cases of EO satellites such as crisis response, infrastructure monitoring, land coverage and precision agriculture.
In these scenarios, the constellation of satellites acquired images of the Area of Interest and downloaded them to the ground station network; the images were then ingested into the cloud to be processed, archived and catalogued, and finally distributed through a web service to the end users.
The main measurable parameters of the system performing under these scenarios include the processing time of each processing level, the time required to deliver the final products to the end users from image acquisition, and the real-time scalability of the cloud system.

Architecture
The GEO-Cloud experiment architecture consists of three parts: a) a simulated constellation of satellites and a network of ground stations implemented in Virtual Wall; b) a cloud architecture to process, store and distribute geo-images; and c) an experiment in the PlanetLab testbed to obtain the network impairments between the BonFIRE nodes and the Virtual Wall ground stations. The PlanetLab experiment was implemented and tested, and its results are described in (González, et al., 2014). Figure 1 describes the architecture of the GEO-Cloud system. On top, the constellation of 17 satellites and the ground station network implemented in Virtual Wall communicate with the BonFIRE multi-cloud through the Orchestrator. The Orchestrator ingests the raw data images and manages their processing in the Processing Chain module and their archiving and cataloguing in the Archive and Catalogue module. More information on the architecture of the system can be found in (Pérez, et al., 2014).
The functions of the components in the architecture are the following:
 Orchestrator: it manages the ingestion of images into the cloud, the automatic distribution of the raw data to the processors and the cataloguing of the products. If a processing chain is occupied, the Orchestrator replicates the complete chain in a new machine.
 Processing Chain: this component is in charge of processing the payload raw data from the satellite imagery to produce image products at different processing levels: i) the L0 level decodes the raw data, providing L0 image products; ii) the L0R level transforms the L0 image products into squared images; iii) the L1A processor takes the L0R products as input and performs geolocation and radiometric calibration of the images; iv) the L1B level resamples the image and makes a more precise geolocation; and v) the L1C level ortho-rectifies the images of the L1B products by using ground control points.
 Archive and Catalogue: this module stores and catalogues every product generated by the Processing Chain module. It consists of the archive and catalogue submodules. The archive manages the storage, allowing the management of large amounts of data and optimizing the operations with the file system. The catalogue provides a web interface and a CSW interface.
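The Orchestrator's replication behaviour described above can be sketched as follows. This is a minimal illustration, not the GEO-Cloud code: the class and method names are hypothetical, and "replicating a chain" is reduced to creating an object where the real system provisions a new machine.

```python
# Sketch of the dispatch policy: reuse an idle processing chain if one
# exists, otherwise replicate the complete chain (a new machine in the
# real system). Names are illustrative, not from the GEO-Cloud software.

from dataclasses import dataclass


@dataclass
class ProcessingChain:
    chain_id: int
    busy: bool = False

    def ingest(self, scene: str) -> str:
        # Stands in for the full L0 -> L0R -> L1A -> L1B -> L1C sequence;
        # here we simply tag the scene with the final product level.
        self.busy = True
        return f"{scene}:L1C"


class Orchestrator:
    def __init__(self) -> None:
        self.chains: list[ProcessingChain] = []

    def dispatch(self, scene: str) -> tuple[int, str]:
        # Reuse an idle chain if available...
        for chain in self.chains:
            if not chain.busy:
                return chain.chain_id, chain.ingest(scene)
        # ...otherwise replicate the complete chain on a new machine.
        chain = ProcessingChain(chain_id=len(self.chains))
        self.chains.append(chain)
        return chain.chain_id, chain.ingest(scene)

    def release(self, chain_id: int) -> None:
        # Called when a chain finishes its scene and becomes free again.
        self.chains[chain_id].busy = False
```

Under this policy the number of chains grows only while all existing chains are occupied, which is the elasticity property the experiment relies on.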

Results of the experiment
The cloud architecture was implemented in BonFIRE to check and validate the GEO-Cloud experiment in an experimental cloud platform. Different scenarios were executed, and the processing time, the cataloguing time and the delivery time (from the ingestion of the raw data into the system to the availability of the images in the web service) were measured. After validating the system in BonFIRE, it was migrated to SoftLayer®. In total, 1146 performance trials were carried out to obtain the results.

Traditional infrastructures
Since the first space missions, ground segments have comprised at least one antenna for the communications and a data centre for the processing of the science data, telemetry and control of the satellite by telecommanding. Processing and satellite control, although closely related (e.g. telemetry is essential to know the status of the satellite and to act consequently by using the telecommands), can be separated in some missions. Earth Observation satellites in Low Earth Orbit usually make use of polar stations for image downloading, taking advantage of the more frequent passes at high latitudes to have visibility with the ground station once per orbit, download more images and receive commands. However, it is also common to have the main data centre in the local area of the satellite operator to download images and have a local link for telecommanding.
Large scale operators and space agencies such as NASA with DSN (see NASA, 2015) and ESA with Estrack (see ESA, 2013) have distributed worldwide networks to control many satellites.
In the case of private companies, the antennas and data centres are designed ad hoc for each mission. This design precludes the scalability of the facilities for future missions, forcing operators to redesign the system and assume the costs of the expansion. For an example architecture and a suite of ground segment software products for a use case, see (González, 2014b).

Cloud Infrastructures
Traditional business applications have always been expensive and complex. Cloud computing eliminates these complexities because the infrastructure and its associated costs are externalized: they become the responsibility of the cloud provider. It works like a utility: the user only pays for what is really needed and used, upgrades are automatic, and enlarging or reducing the service is a simple process.
Cloud computing features can be summarized as follows: a) it provides on-demand computing and storage resources; b) it has broad network access, which allows customers to access the platform from everywhere; c) it provides elasticity, whereby resources can be elastically provisioned, deployed and released; d) it provides a service to monitor, control and report the infrastructure behaviour; and e) it introduces a new consumption model that measures the service utilized by each consumer and charges accordingly, in a `pay-as-you-go' fashion.

To establish the discussion, GEO-Cloud is considered as a basis. The constellation described downloaded 1.64 TB of raw data, after compression, on a daily basis. This means that 73.24 TB of imagery products were generated every day across the different commercial products. With cloud computing this amount of data can be managed on demand. Besides, no effort in designing and deploying the data centre is required, since the infrastructure is already provided. If a private infrastructure had to be deployed, it would require time and investment to appropriately design, deploy, test and maintain it, whereas public cloud infrastructures eliminate those associated costs. We can therefore affirm that public cloud computing reduces the total cost of ownership (TCO).
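The daily volumes quoted above already make the dimensioning pressure visible. A back-of-the-envelope sketch using only those two figures (1.64 TB of raw data and 73.24 TB of products per day; the retention assumption is ours, purely for illustration):

```python
# Sizing sketch from the figures in the text: 1.64 TB of compressed raw
# data and 73.24 TB of derived products generated per day.

RAW_TB_PER_DAY = 1.64
PRODUCTS_TB_PER_DAY = 73.24


def amplification_factor() -> float:
    # How much larger the generated products are than the raw downlink.
    return PRODUCTS_TB_PER_DAY / RAW_TB_PER_DAY


def cumulative_archive_tb(days: int) -> float:
    # Storage the archive must scale to if every raw scene and product
    # is retained (an illustrative worst-case retention policy).
    return days * (RAW_TB_PER_DAY + PRODUCTS_TB_PER_DAY)


print(f"amplification: {amplification_factor():.1f}x")
print(f"archive after 30 days: {cumulative_archive_tb(30):.1f} TB")
```

The roughly 45x amplification from raw data to products, and an archive growing by about 75 TB per day, are exactly the kind of variable, fast-growing demand that on-demand cloud storage absorbs without upfront dimensioning.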
Another characteristic of cloud computing is that the number of ingested images does not matter: they can all be processed in parallel in different processing chains. This is possible because of the elasticity and scalability of the cloud.
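The parallelism described above can be sketched with standard-library tools: each ingested image is handed to its own worker, mirroring "one processing chain per image", so total wall time stays close to that of a single chain instead of growing with the image count. The function names and the per-scene placeholder are illustrative assumptions, not GEO-Cloud code.

```python
# Sketch: every ingested scene gets its own worker, the software analogue
# of replicating one processing chain per image in an elastic cloud.

from concurrent.futures import ThreadPoolExecutor


def process_scene(scene: str) -> str:
    # Placeholder for the full L0 -> L1C chain applied to one image.
    return f"{scene}:processed"


def process_batch(scenes: list[str]) -> list[str]:
    # One worker per scene; results come back in input order.
    with ThreadPoolExecutor(max_workers=len(scenes)) as pool:
        return list(pool.map(process_scene, scenes))
```

In a real deployment the workers would be separate machines or containers rather than threads, but the dispatch pattern is the same.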
Nevertheless, the data centre designer shall pay attention to the following: the virtualization of instances in a public cloud affects the processing time of the imagery data, making it variable. This happens because virtualization allows cloud providers to share the same physical machine among several users. To avoid large variability in the satellite imagery processing times, consider that fully dedicated resources provide more stability.
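One simple way to quantify the variability warned about above is the coefficient of variation (standard deviation over mean) of repeated processing-time measurements. The sample values below are made up for illustration; they only encode the qualitative pattern described in the text (noisy shared VMs versus stable dedicated hardware).

```python
# Sketch: comparing processing-time stability on shared versus dedicated
# resources via the coefficient of variation. Timings are illustrative.

from statistics import mean, stdev


def coefficient_of_variation(times_s: list[float]) -> float:
    # Relative spread of the measurements: stdev normalised by the mean.
    return stdev(times_s) / mean(times_s)


shared_vm_times = [102.0, 131.0, 98.0, 155.0, 110.0]   # noisy neighbours
dedicated_times = [101.0, 103.0, 100.0, 102.0, 101.0]  # stable hardware
```

A designer can run such measurements during commissioning and switch to dedicated instances when the observed coefficient of variation exceeds what the service-level targets tolerate.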
Furthermore, cloud infrastructures have their servers distributed across different locations, sometimes in different countries, as in the experimental multi-cloud platform BonFIRE, and even in different continents, as in SoftLayer®. This can be an advantage, since several antennas distributed around the world can download the images to a nearby node in the cloud, reducing latencies and costs in the transfer of data. If a traditional localized data centre were used instead, dedicated channels from the distributed network of antennas to the data centre would have to be deployed, incurring a high cost.
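The routing idea above (each ground station downloads to the closest cloud node) reduces to a minimum-latency assignment once per-node latencies are measured. The station and node names and the latency figures below are invented for illustration:

```python
# Sketch: assign each ground station to the cloud node with the lowest
# measured latency. Station names, node names and latencies are made up.

def nearest_node(latency_ms: dict[str, float]) -> str:
    # Pick the node whose measured latency is smallest.
    return min(latency_ms, key=latency_ms.get)


station_latencies = {
    "svalbard": {"eu-node": 40.0, "us-node": 95.0, "asia-node": 160.0},
    "alaska":   {"eu-node": 120.0, "us-node": 35.0, "asia-node": 110.0},
}

assignment = {
    station: nearest_node(latencies)
    for station, latencies in station_latencies.items()
}
```

In GEO-Cloud the corresponding real measurements came from the PlanetLab impairment experiment (González, et al., 2014).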
A drawback of public cloud computing infrastructures is that they cannot be used in some applications in which the location of the files must be perfectly known and controllable, or in which legal requirements on the use and storage of data must be fulfilled. Studying the possibility of having full control of instance management and location is a must in these cases. SoftLayer®, for example, can provide full controllability and location control of the data, but the specific legal aspects of each application and country have to be considered.
Cloud computing also contributes to the global distribution of the imagery products obtained. Any user in any part of the world can access a web service and visualize or download the imagery products. Furthermore, if many users access the service at the same time, the auto-scaling of the cloud resources makes it possible to offer on-demand service to all of them. If a private infrastructure were used instead, this could not be done in a flexible and scalable manner, and the communications could be saturated under high demand.
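The auto-scaling behaviour described above amounts to letting the replica count of the distribution web service follow concurrent demand instead of a fixed worst-case sizing. A minimal sketch, in which the per-replica capacity and the scaling bounds are illustrative assumptions rather than GEO-Cloud parameters:

```python
# Sketch of a demand-driven scaling rule for the distribution web service.
# Capacity per replica and the min/max bounds are illustrative.

import math


def replicas_needed(concurrent_users: int,
                    users_per_replica: int = 100,
                    min_replicas: int = 1,
                    max_replicas: int = 50) -> int:
    # Round demand up to whole replicas, then clamp to the allowed range.
    wanted = math.ceil(concurrent_users / users_per_replica)
    return max(min_replicas, min(max_replicas, wanted))
```

A fixed private infrastructure corresponds to pinning this value at its maximum permanently; the cloud lets it drop back to the minimum when demand fades, which is where the cost advantage comes from.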
One of the main objectives of the project was the evaluation of the social and economic viability of using cloud computing for EO services. Economic issues arise as an important handicap regarding storage capability, although the scalability of the system and the reduced cost of on-demand processing and distribution are reasons to consider an economic advantage versus traditional on-premises infrastructures.
Social aspects that affect the comparison between cloud computing and conventional EO infrastructures are not easily identifiable. Security in the cloud shall be taken into account; it is an issue that numerous companies are working to solve in the coming years. However, cloud computing offers many security solutions and, most importantly, a professional infrastructure that in many cases provides more security than on-premises infrastructures; the perceived lack of security is more a sensation caused by having one's business virtualized than a fact. Another social benefit of the use of cloud computing is the robustness and efficiency of the services that can be provided thanks to the reduction of processing, storage, communications and distribution costs, which facilitates access to remote sensing technologies.

Hybrid infrastructure versus cloud
Cloud computing is efficient when on-demand processing has to be done. However, if the input of data is constant, a hybrid infrastructure could be a better solution. The processing of satellite imagery could be done on premises, because the dimensioning of the system can easily be calculated from the data generation rate, while the distribution of the generated products to end users could be done in the cloud, since the demand is variable and non-deterministic.
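The claim that constant input makes on-premises dimensioning easy can be made concrete: with a fixed daily scene count and a known per-scene processing time, the required processor count is a one-line calculation. The numbers and names below are illustrative assumptions, not GEO-Cloud figures.

```python
# Sketch: dimensioning on-premises processors for a constant input rate.
# Scene counts and per-scene times are illustrative assumptions.

import math


def processors_required(scenes_per_day: int,
                        minutes_per_scene: float,
                        hours_available: float = 24.0) -> int:
    # Total daily processing load divided by one processor's daily
    # capacity, rounded up to whole machines.
    total_minutes = scenes_per_day * minutes_per_scene
    return math.ceil(total_minutes / (hours_available * 60.0))
```

No such closed-form sizing exists for the distribution side, where demand is variable and non-deterministic, which is precisely why the hybrid design pushes distribution to the cloud.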
The hybrid infrastructure is also a good solution for those applications and countries with legal limitations on storing files in servers distributed across different countries. In this case the processing and storage would be done on premises and the distribution to global users from the cloud. Distributing the final products through the cloud is attractive because an elastic distribution service can be implemented.

The previous characteristics make Fed4FIRE a valuable infrastructure to carry out academic and industry driven experiments. Regarding the latter, Fed4FIRE was a key factor in the development of GEO-Cloud: the system was designed and developed to be experimented with in Fed4FIRE, but it was directly migrated to SoftLayer® with good results.

Use of SoftLayer®
The whole architecture and all software modules were perfectly portable to the commercial cloud infrastructure, and no extra effort was required to implement the GEO-Cloud system in the SoftLayer® infrastructure.
In SoftLayer® all the characteristics of the cloud were enhanced, since it is specifically designed for high performance computing. There we could validate the performance of the GEO-Cloud system in a commercial cloud infrastructure.
The interaction with SoftLayer® was intuitive thanks to the learning and training carried out in BonFIRE. SoftLayer® provided an API and a web portal to manage the implementation of the system. Besides, a wide set of VMs with different operating systems was offered.
In SoftLayer®, dedicated VMs and physical machines were also selectable, which made the system performance very stable. This infrastructure is oriented more towards industry than experimentation, facilitating the deployment, monitoring and control of solutions for the market. Thus both systems are complementary: while Fed4FIRE facilitates experimentation and learning to deploy industry driven systems, SoftLayer® offers the professional solution to deploy the commercial application. Table 3 shows the main features of both cloud infrastructures.

CONCLUSIONS
This paper described the culmination of the GEO-Cloud work, showing the main results of the research and its portability to a commercial infrastructure.
It was shown that cloud computing technology provides many benefits to remote sensing, such as reduction of the total cost of ownership, reduction of the time to user, parallelization of processing, scalable storage, global distribution for variable service demands, and externalization of services, although it also has some limitations, such as legal constraints for some applications. A hybrid solution combining the best of traditional infrastructures and public cloud computing has also been discussed.
Furthermore, the usefulness of European experimental infrastructures such as Fed4FIRE was demonstrated: they provide experimentation testbeds to trial innovative ideas and to train in Future Internet technologies before porting the final solution to a commercial infrastructure such as SoftLayer®, which provides professional and optimized solutions for cloud based high performance computing.
The necessity of studying case by case each application to be ported to cloud computing was also discussed, because of legal constraints, architectural designs and storage costs, in order to optimize the final solution and to evaluate which type of implementation is best for every specific application.
The final conclusions of this work can be summarized as follows:
Conclusion 1: The system architect has to appropriately design the system to be implemented in the cloud.

Figure 2.

Figure 1. Cloud Architecture in BonFIRE.

Fed4FIRE is an Integrated Project under the European Union's Seventh Framework Programme (FP7) in the topic of Future Internet Research and Experimentation. The consortium of this project is composed of 29 partners distributed worldwide, and it is coordinated by iMinds, Belgium. The project establishes a heterogeneous, scalable and federated framework for experimenters in which a large number of European facilities are integrated. Those facilities focus on different areas of cloud computing and networking, such as wireless networking, cloud federations, smart cities, grid computing and mobile networks, among others. The number of testbeds is increasing with new open calls of the Fed4FIRE project. Currently, the federation includes, among others, the following:
 Community-Lab: a distributed infrastructure for researchers to experiment with community networks for creating digital and social environments.
 UltraAccess: it provides several optical network protocols and resources to experiment with Quality of Service (QoS) features, traffic engineering and virtual Local Area Networks (LAN).

Cloud deployment models include the following:
b) Federated cloud: it can be formed by different clouds of any kind. Its main feature is that, although constituted by different clouds, it provides an interface as if it were only one, i.e. it offers transparency for the user, abstracting him from the combination of infrastructures that constitute the federation.
c) Public cloud: it can be used by the general public, academia and other organizations.
d) Hybrid cloud: it combines public cloud with private cloud infrastructures.

Table 3. Comparison between BonFIRE and SoftLayer®.

Consider that a specific architecture implemented in cloud computing can delay the image processing time if distributed storage is used, in comparison with a cloud architecture using local storage. Which storage distribution to use depends on the application.
Conclusion 2: Cloud computing eliminates data transfers by using NAS or SAN shared storage instead of local storage. The performance of NAS and SAN is lower than that of local storage, but transparency and unification are achieved.
Conclusion 3: Cloud computing can reduce the time to deliver the imagery products to the end user, since scalability allows parallel processing and automatic archiving and cataloguing of imagery products.
Conclusion 4: Cloud computing facilitates the transfer of data from ground stations distributed around the world to commercial cloud nodes.
Conclusion 5: Cloud computing eliminates the costs and effort of deploying, dimensioning, testing and maintaining a traditional infrastructure.
Conclusion 6: Cloud computing can lengthen the satellite imagery processing time because of the virtualization of the instances; fully dedicated resources can provide more stable processing times instead.
Conclusion 7: For remote sensing applications requiring full control and a specific location of the files stored in the servers, cloud computing is at a disadvantage compared with private infrastructures.
Conclusion 8: Cloud computing contributes to the global distribution of the imagery products obtained.
Conclusion 9: Cloud computing is positioned as a good alternative for remote sensing services in terms of socioeconomic advantages that imply not only cost reductions but also social benefits.
Conclusion 10: Hybrid infrastructures can deal with legal limitations on data storage and contribute to the global distribution of services.
Conclusion 11: If the generation of data is constant, hybrid infrastructures could provide a feasible solution by implementing the processing of data on premises and the distribution of the final products in the cloud to attend to variable demands.
Conclusion 12: Cloud computing for online services allows auto-scaling when several users access the servers simultaneously and eliminates the risk of overload in the web services.