GLOBAL HUMAN SETTLEMENT ANALYSIS FOR DISASTER RISK REDUCTION

The Global Human Settlement Layer (GHSL) is supported by the European Commission, Joint Research Centre (JRC) in the frame of its institutional research activities. The scope of GHSL is to develop, test and apply the technologies and analysis methods integrated in the JRC Global Human Settlement analysis platform, for applications in support of global disaster risk reduction (DRR) initiatives and of regional analysis in the frame of the European Cohesion policy. The GHSL analysis platform uses geo-spatial data, primarily remotely sensed data and population data. GHSL also cooperates with the Group on Earth Observations on task SB-04 (Global Urban Observation and Information), and with various international partners, including the World Bank and United Nations agencies. Some preliminary results integrating global human settlement information extracted from Landsat data records of the last 40 years with population data are presented.


INTRODUCTION
Global actors and global decision-making processes need accurate and globally consistent data for evidence-driven reasoning: testing of hypotheses, development of concepts, monitoring and understanding of trends, and exploration of alternative scenarios. Since the 1960s, space and airborne remote sensing technologies have contributed to building consistent data series describing the physical characteristics of the global atmosphere, oceans and land masses. Census survey techniques reporting on the amount of population, houses and productive activities have a history of more than 5,000 years, dating back to the origin of the state and of the urban social organization of humanity. The same can be said of cadastral and geometrical land survey techniques, which have been an essential element in the development of the human environment since the beginning of recorded history. Apart from hunter-gatherer societies, today the absolute majority of human beings spend the dominant part of their lifetime in a built, artificial environment that supports, by different means, both their symbolic-cultural and their practical-functional needs. From the material and practical point of view, this artificial environment includes closed built-up structures and their open neighborhood spaces such as roads, squares and gardens. Together, these can be described as the basic physical or material elements of the human settlement.
The understanding of global human settlements is critical for a large number of issues, including housing and urban development, poverty reduction, sustainable development, climate change, crisis management and disaster risk reduction, to name just a few. But despite their importance, and despite the long history of survey techniques, apparently basic questions about global human settlements remain unanswered. For example: how many square or cubic meters have we built in the last 40 years? What are the specific spatial-temporal trends and patterns? What are the occupancy conditions and the density of persons in these spaces? Detailed human settlement data collected by census survey techniques are very local, expensive and therefore rare, and difficult to harmonize globally. Consequently, large inconsistencies and large data gaps exist at the global level, inherited from different national standards, nomenclatures and resource availability for census data collection. The definition of what is considered an urban area can vary from localities of 200 or more inhabitants (Iceland) to settlements having 50,000 or more inhabitants with 60 per cent or more of the houses located in the main built-up areas and 60 per cent or more of the population (including their dependents) engaged in manufacturing, trade or other urban types of business (Japan), with a whole range of heterogeneous nation-specific conditions in between1. Similar and even more challenging issues affect fine-scale land surveys and cadastral data: they are very expensive to produce, collect and manage, and from the global surface perspective they are a rarity rather than the norm. Recent developments in crowd-sourcing and fine-scale cartographic open data collection, such as the OpenStreetMap2 (OSM) project, may contribute to the general picture, but they are far from the completeness and consistency needed for global decision-making processes.
Broad-scale land surveys using standard remote sensing technologies have demonstrated the capacity to map the global landmass at a sustainable cost in several application domains, but the map of the global human settlement is still largely incomplete. Available global surveys report total settlement surfaces that can differ by up to one order of magnitude (Potere et al., 2009). The global bias and gain functions associated with the different sensors and information retrieval methods are still largely unknown as regards the detection of the basic components of human settlements: houses, roads, and open spaces. This is particularly true for the detection of these components against different combinations of background surfaces and landscape patterns, over a representative set of the heterogeneous fine-scale cases found across the globe. As a consequence, global remote-sensing-derived information about human settlements risks injecting large indeterminacy and systematic uncontrolled bias into the models that take this information as input.
The density, the heterogeneity and the dynamics of human settlements and their interactions with the environment are fundamental pieces of information that we need to have at hand to help us keep in balance the use and the regenerative capacity of our planet's resources. But the current picture of the human footprint is incomplete. The majority of small and medium-sized settlements, critical for accounting for and understanding the impact of people on the globe, remain largely invisible. The big dots may be visible, but not the all-important connections between them. And the truly vulnerable, such as those dwelling in refugee camps, shantytowns and slums, are effectively missing from our global understanding3. The aim of the GHSL project is to help fill these information gaps by exploiting remote sensing technologies. In particular, GHSL aims to contribute to the global assessment of human settlement surfaces and their spatial dynamics, using fine-scale remote sensing data (Pesaresi and Ehrlich, 2009, Pesaresi et al., 2011, Pesaresi et al., 2013). Extensive tests using the Landsat data archive of the last 40 years were conducted during 2014; some first results combining human settlement information extracted from satellite data with population data are shown here.

FINE-SCALE REMOTE SENSING OF GLOBAL HUMAN SETTLEMENTS
Despite the large potential of today's remote sensing technologies, concrete attempts to create fine-scale information layers describing human settlements are so far only few. On the active remote sensing side, published examples include a method aiming to produce global urban area extents using ASAR 75-m-resolution input data (Gamba and Lisini, 2012) and a method aiming to produce global urban footprints using finer-resolution TanDEM-X data (Esch et al., 2013). While very promising, and based on input data with available global coverage, these technologies have not yet provided a complete global classification test. On the passive remote sensing side, a method for global urban area mapping that integrates ASTER data at 15-m resolution with extensive GIS data used in post-processing and information masking was proposed by Miyazaki et al. (2013). General issues with these methods are i) the adoption of fixed cut-off values in the feature space, which are difficult to apply globally and independently of the background landscape or data collection parameters, and ii) the adoption of rigid rule-based data masking strategies in the post-processing phase in order to improve the accuracy of the classification output. Post-classification data masking strongly increases the dependency of the satellite data classification output on third-party sources, and consequently strongly decreases the added value of satellite-derived information.
First examples of systematic, sample-based analysis of global urbanization using Landsat data include the processing of 120 cities for two epochs (1990, 2000) (Angel et al., 2005) and a systematic analysis of 27 current mega cities using multi-temporal Landsat data (years 1975, 1990, 2000) coupled with TerraSAR-X data (year 2010), presented by Taubenböck et al. (2012). The first documented trial to generate global land cover by automatic classification of Landsat input data is the production of the Finer Resolution Observation and Monitoring of Global Land Cover (FROM-GLC), as reported by Gong et al. (2013). In FROM-GLC only one epoch (circa 2006) was processed, and the impervious surfaces were mapped with unsatisfactory classification accuracy (Ban et al., 2015, Gong et al., 2013). Subsequent experimental activity tried to inject into the FROM-GLC output the urban or impervious information derived from third-party, low-resolution satellite-derived information sources (Yu et al., 2014). Finally, a 30-m-resolution global land cover (GlobeLand30) was produced, as reported by Chen et al. (2014). GlobeLand30 processes two global Landsat data collections (years 2000 and 2010), integrates several internationally available land cover products in the output, and relies heavily on manual editing of the final maps by domain experts (Chen et al., 2014). General issues with the above methods are i) the considerable cost of the manual operations still needed, including training set collection, tuning of data processing parameters and output editing, and ii) the large computational cost of the adopted supervised classifiers. These issues risk making the cost of porting these methods to global, multi-temporal, fine-scale data classification scenarios prohibitive.

THE GHSL PROJECT
The Global Human Settlement Layer (GHSL) project is supported by the European Commission, Joint Research Centre, with the objective of designing and testing new technologies able to generate global fine-scale representations of the physical characteristics of human settlements. In particular, the GHSL project is focused on innovative automatic image information extraction processes, using metric and decametric scale satellite data as input (Pesaresi et al., 2013). The target information collected by the GHSL project is the built-up structure or building, aggregated into built-up areas and then into settlements according to explicit spatial composition laws. Buildings are the primary sign and empirical evidence of human presence on the global surface that is observable by current remote sensors. As opposed to standard remote sensing practices based on urban land cover or impervious surface notions, the GHSL semantic approach is continuous and quantitative, and is centered on the presence of buildings and their spatial patterns, thus making the information gathering independent of any prior abstract rural/urban definition (Pesaresi and Ehrlich, 2009).
The GHSL project assumes an inclusive concept of the building, including temporary structures observable in refugee and internally displaced people (IDP) camps, and poor structures in shanty towns and slums. From the GHSL methodological perspective, automatic information gathering processes are a necessary condition for sustainable, detailed global surveys, but also for the reproducibility and public control of the information, thus contributing to objective and evidence-based support to decision processes. Because of the institutional mandate of the project, the core scope of the human settlement theme proposed here is to support the global security and crisis management (GSCM) application domains with global, fine-scale information (Pesaresi et al., 2010, Freire et al., 2014, Freire et al., 2015a, Freire et al., 2015b, Florczyk et al., 2015a). In these applications, physical characteristics of the settlements, such as the number of buildings, their surface, their typology (size, height) and their spatial patterns, are important information supporting decisions with large social and economic impact. This information typically supports activities such as damage and reconstruction assessment, impact assessment, disaster early warning and alerting, loss estimation, exposure and risk mapping, and post-disaster needs assessment (PDNA), to mention just a few. From this perspective, Remote Sensing (RS) data are potentially an interesting source because they are independent, globally consistent, up to date, synoptic, and objective. This is especially true for global actors involved in GSCM, such as the World Bank, United Nations agencies and other governmental bodies, that need to operate globally and multilaterally with standardized and internationally comparable models, even in areas where little or no fine-scale information is provided by the local authorities.
In Pesaresi et al. (2013), the capacity to discriminate built-up areas was demonstrated with optical sensors in the spatial resolution range of 0.5 m to 10 m and an extensive test set including more than 50 million square kilometers of mapped surface. The same system was successfully applied with 2.5-m input sensor resolution to produce large national coverages in Brazil (Kemper et al., 2013) and China (Lu et al., 2013) using CBERS-2B panchromatic data, and a continental coverage of Europe (Ferri et al., 2014, Florczyk et al., accepted 2015b) using SPOT-5 and SPOT-6 multispectral pan-sharpened data. Figure 1 shows an example of the human settlement information extracted by these technologies.
To date, these experiments are the largest and most general known attempts to apply automatic data classification techniques to the mapping of built-up areas using this class of image data as input. A new inter-scale, inter-sensor machine learning method was introduced by Pesaresi et al. (2013) in the discrete classification field. The aim of the method is to replace the expensive expert-driven collection of training set data with systematic access to open-source spatial information that already provides proxies (by scale or by thematic content) of the requested information. A similar approach in the continuous classification field was independently proposed by Sexton et al. (2013) for solving the problem of estimating global, 30-m-resolution continuous fields of tree cover. These learning methods work in the scenario where the whole data universe under processing is labeled by one or more training data sets. The approach assumes that modern technology allows researchers to analyse the whole population rather than just inspecting a smaller sample: N, the number of observations in the sample, is equal to all (Mayer-Schönberger and Cukier, 2013). This approach was demonstrated to be robust against large errors in the training set induced by scale generalization and/or by omission and commission errors in the data sources, thus allowing the expert-driven tuning of classification parameters to be replaced with adaptive optimization techniques that automatically estimate the best parameters for the specific scene under process. A similar approach can be used for consistency optimization of the global fine-scale information mosaics (Syrris and Pesaresi, 2013), training semisupervised classifiers (Li et al., 2014) to 15 meters. Figure 2 shows the output of the automatic recognition of built-up areas as implemented in the alpha release of the Landsat GHSL.
The concept of Human Settlement adopted in this study relies on the classical notion inherited from settlement geography, defined as "...the description and analysis of the distribution of buildings by which people attach themselves to the land." (Stone, 1965). The building is the basic sign of human presence that is physically observable by remote sensing technologies. Consistently with this approach, all the classes of settlement used in this study are derived by spatial generalization of the basic information about the presence of buildings, as detected in the available remotely sensed data.
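The adaptive parameter estimation described earlier in this section, in which a coarse and possibly erroneous reference layer takes the place of expert tuning, can be illustrated with a minimal sketch. It is not the GHSL implementation: the feature values, the candidate cut-offs and the agreement score (mean of sensitivity and specificity) are assumptions chosen only to show how a scene-specific cut-off could be selected automatically instead of being fixed globally.

```python
def best_threshold(feature, reference, candidates):
    """Return the candidate cut-off that best agrees with a proxy reference.

    feature    -- per-pixel values of a hypothetical built-up index
    reference  -- per-pixel 0/1 labels taken from a coarser existing map
                  (assumed to contain some omission/commission errors)
    candidates -- cut-off values to evaluate for this scene
    """
    def balanced_agreement(threshold):
        tp = fp = tn = fn = 0
        for value, label in zip(feature, reference):
            predicted = value >= threshold
            if predicted and label:
                tp += 1
            elif predicted and not label:
                fp += 1
            elif not predicted and label:
                fn += 1
            else:
                tn += 1
        sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        # Averaging the two rates keeps a rare class from being ignored.
        return (sensitivity + specificity) / 2.0

    return max(candidates, key=balanced_agreement)


# Hypothetical scene: the second reference label is a commission error.
scene_feature = [0.1, 0.2, 0.3, 0.7, 0.8, 0.9]
scene_reference = [0, 1, 0, 1, 1, 1]
scene_cutoff = best_threshold(scene_feature, scene_reference, [0.25, 0.5, 0.75])
print(scene_cutoff)
```

Because the cut-off is re-estimated for every scene, a single wrong reference label shifts the score only slightly, which is the kind of robustness to training-set errors that the text refers to.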
Buildings are constructions above ground which are intended or used for the shelter of humans, animals or things, for the production of economic goods, or for the delivery of services. This working definition is a thematic characterization of the building class as defined by the Infrastructure for Spatial Information in Europe (INSPIRE) standard (Infrastructure for Spatial Information in Europe, 2011). In particular, with respect to INSPIRE, the underground building case was excluded and the condition of permanency of the built-up structures was relaxed (Pesaresi et al., 2013). The reason for the first change is clearly related to the limits of the adopted remote sensing technology, while the second is related to the application domain. Global security and crisis management applications often require monitoring temporary or semi-permanent settlements such as refugee camps, but also informal and poor built-up structures that do not fall inside the standard building class. The built-up area BUΨ is the set of all the spatial units collected by the specific sensor-information-model Ψ and containing a building or part of it. The sensor-information-model Ψ embeds the spatial detail (scale) of the sensor used to extract the thematic information, and the thematic detail (information sensitivity and specificity) allowed by the adopted information extraction method coupled with the characteristics of the available input data.
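The dependence of the built-up area on the spatial detail of the sensor can be made concrete with a small sketch. It assumes, purely for illustration, that buildings are given as axis-aligned boxes in meters and that the spatial units are square grid cells; the function name and the sample coordinates are hypothetical and do not come from the GHSL workflow.

```python
import math


def built_up_cells(buildings, cell_size):
    """Return the set of (column, row) grid cells containing a building or part of it.

    buildings -- iterable of (xmin, ymin, xmax, ymax) boxes in meters (hypothetical input)
    cell_size -- edge length of a square spatial unit in meters (the sensor scale)
    """
    cells = set()
    for xmin, ymin, xmax, ymax in buildings:
        col_first = math.floor(xmin / cell_size)
        row_first = math.floor(ymin / cell_size)
        # A box ending exactly on a cell edge does not reach into the next cell.
        col_last = max(col_first, math.ceil(xmax / cell_size) - 1)
        row_last = max(row_first, math.ceil(ymax / cell_size) - 1)
        for col in range(col_first, col_last + 1):
            for row in range(row_first, row_last + 1):
                cells.add((col, row))
    return cells


# Two hypothetical buildings observed at two different spatial resolutions.
sample_buildings = [(5, 5, 15, 15), (25, 5, 40, 12)]
for size in (30, 10):
    cells = built_up_cells(sample_buildings, size)
    print(size, len(cells), len(cells) * size * size)  # scale, units, area in m2
```

The same two buildings occupy 2 units (1800 square meters) at 30 m and 8 units (800 square meters) at 10 m, which is why the built-up area is always stated relative to the sensor-information-model that produced it.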
Figure 3 compares the human settlement information extracted from satellite sensors at different spatial resolutions in the area of Chicago-Detroit (US). The top panel shows the artificial surfaces as reported by the ESA GlobCover4 using 300-m-resolution MERIS satellite imagery. The bottom panel shows the built-up areas as reported by the JRC GHSL using 30-m-resolution Landsat satellite data as input. The gain of new information obtained with finer-resolution sensors and advanced data classification techniques is evident. Figure 4 shows the multitemporal information encoded in the GHSL Landsat for three large cities: from top to bottom, Shanghai, San Francisco, and Paris. Red marks the built-up areas existing before 1975. Orange, yellow and white encode the years 1990, 2000, and 2014.
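The multitemporal encoding used in Figure 4 can be sketched as follows. The sketch assumes one binary built-up mask per epoch, stored as a flat list of pixels; this data layout and the function name are illustrative assumptions, while the epochs and colors are those given in the text.

```python
# Colors used in Figure 4 for the epoch in which a pixel is first seen as built-up.
EPOCH_COLORS = {1975: "red", 1990: "orange", 2000: "yellow", 2014: "white"}


def first_built_epoch(masks):
    """Return, for each pixel, the earliest epoch in which it is built-up.

    masks -- dict mapping epoch (year) to a list of 0/1 values, one per pixel
             (hypothetical layout). Pixels never detected as built-up give None.
    """
    epochs = sorted(masks)
    pixel_count = len(masks[epochs[0]])
    earliest = [None] * pixel_count
    for epoch in epochs:
        for index, built in enumerate(masks[epoch]):
            if built and earliest[index] is None:
                earliest[index] = epoch
    return earliest


# Four hypothetical pixels: built before 1975, by 1990, by 2000, and never built.
sample_masks = {
    1975: [1, 0, 0, 0],
    1990: [1, 1, 0, 0],
    2000: [1, 1, 1, 0],
    2014: [1, 1, 1, 0],
}
epochs_per_pixel = first_built_epoch(sample_masks)
colors_per_pixel = [EPOCH_COLORS.get(epoch) for epoch in epochs_per_pixel]
print(epochs_per_pixel, colors_per_pixel)
```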

THE GEO WORKING GROUP
The JRC GHSL project supports the Global Human Settlement Working Group (GHS WG) in the frame of the Group on Earth Observations (GEO). The scope of the GHS WG is to contribute to the GEO task SB-04-C1 (Global Urban Observation and Information) by establishing and fostering a new community of practice focused on specific goals, in particular on testing the production and the use of new global human settlement information products derived by integrating global remote sensing data, environmental data, and population and socio-economic data analysis.
The scope is global and multidisciplinary, with a particular emphasis on testing the use and integration of new global fine-scale information products made available by developments in remote sensing technology and by the establishment of open public data access policies. The GHS WG is committed to develop a
4 http://due.esrin.esa.int/page_globcover.php
One important GHS WG objective is to improve the sharing of tools and data across the different expert domains. In October 2014, JRC decided to share the alpha release of GHSL Landsat with the partners of the working group, for early testing and model integration activities before the official release planned in 2016. This new information layer was extracted from Landsat global data records of the last 40 years through the porting of the GHSL production workflow to the 10-m to 75-m input image resolution range. To date, the list of user applications of the fine-scale GHSL information includes population spatial modeling, census planning, poverty mapping, slum mapping, regional development and planning, transport planning, urban and global climate modeling, spatial epidemics analysis, water analysis, ecological studies, environmental protection, agricultural fragmentation studies, and historical landscape protection. Their geographical scope may include national, regional/continental, and global coverages.
Some preliminary results integrating information extracted from satellite data and population census data are reported here. Figure 5 shows the aggregated global results for population and built-up areas in the years 1975, 1990, 2000 and 2014, as extracted from the World Bank Open Data5 and the JRC GHSL sources, respectively. Population and built-up areas are clearly directly related, with global built-up areas growing more rapidly than population. This is empirical evidence that, globally, we are increasing the amount of built-up space per capita. The built-up area per capita can be linked to land-use efficiency and to socio-economic development, which demands proportionally more built space for housing and services. Such dynamic data will be a crucial input for understanding and modeling the evolution of human settlements in the coming years. The global trend shown in Figure 5 evidently averages a multiplicity of heterogeneous local situations. Figure 6 shows the estimated amount of built-up area per capita in the years 1975, 1990, 2000, and 2014 in the countries with more than 50 million inhabitants. The large global disparity and socio-economic divide between the two extremes of the list is evident: 560 square meters of built-up area per inhabitant in the US against 12 square meters of built-up area per capita in Bangladesh. Such data can potentially support the definition of globally consistent and evidence-based indicators contributing to the monitoring of international policy processes such as the Sustainable Development Goals (SDGs) and Disaster Risk Reduction (DRR).
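The per-capita indicator discussed above is a simple ratio, sketched here for clarity. The function name and the sample area and population are hypothetical placeholders, not GHSL or World Bank figures; only the two per-capita values of 560 and 12 square meters are taken from the text.

```python
def built_up_per_capita(built_up_km2, population):
    """Return built-up area per inhabitant in square meters.

    built_up_km2 -- total built-up area in square kilometers
    population   -- number of inhabitants (must be positive)
    """
    if population <= 0:
        raise ValueError("population must be positive")
    return built_up_km2 * 1_000_000 / population


# Hypothetical country: 1000 km2 of built-up area and 2 million inhabitants.
example_value = built_up_per_capita(1000, 2_000_000)
print(example_value)

# Ratio between the two extremes reported in the text (US and Bangladesh).
extremes_ratio = 560 / 12
print(round(extremes_ratio, 1))
```

With the values reported in the text, an inhabitant of the US is associated with roughly 47 times more built-up area than an inhabitant of Bangladesh.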

Figure 1: Example of the European settlement information extracted by the GHSL platform from SPOT 2.5-m-resolution input imagery in Athens (Greece). Top: input image. Bottom: output of the automatic data classification. Dark brown: built-up areas (buildings); green: vegetated open spaces; white: other open spaces.

Figure 2: Example of built-up areas extracted from Landsat data processing in the GHSL workflow in the Veneto region (Italy). Top: high-resolution image of a scattered settlement pattern. Bottom: output of the automatic recognition of built-up areas using 30-m-resolution input Landsat data.

Figure 3: Comparison of the human settlement information extracted from satellite sensors at different spatial resolutions in the area of Chicago-Detroit (US). Top: artificial surfaces as reported by the ESA GlobCover using 300-m-resolution MERIS satellite imagery. Bottom: built-up areas as reported by the JRC GHSL using 30-m-resolution Landsat satellite data.

Figure 4: Examples extracted from the alpha release of the GHSL Landsat Multitemporal. From top to bottom: Shanghai, San Francisco, and Paris. Red: built-up areas detected before 1975. Orange, yellow, and white encode the built-up areas detected in the 1990, 2000, and 2014 epochs.

Figure 5: Evolution of global population and built-up areas in the last 40 years. The fine-scale built-up areas are estimated by the JRC GHSL using Landsat input imagery.

Figure 6: Estimated amount of built-up area per capita in the years 1975, 1990, 2000 and 2014 in countries with more than 50 million inhabitants.