AGROFORESTRY IN THE ALAS MERTAJATI OF BALI, INDONESIA. A CASE STUDY IN SUSTAINABLE SMALL-SCALE FARMING PRACTICES AND GEOAI

This paper describes new results from a land survey of the Alas Mertajati in Central Bali based on multi-spectral data collected from a new class of commercially available satellites. We use these assets to create the first representations of sustainable small-scale farming, specifically agroforestry, in the study area. We describe the process of producing the results, in particular establishing ground truth for complex land cover and land use classes, and discuss how input from stakeholders can be included in the creation of these representations. Furthermore, we describe an open-source software environment developed to create our classification pipeline, with a focus on shallow learners, collaborative workflows and intuitive visualization of results. The text ends with a discussion of bad maps, that is, maps that contain outdated data, and why such maps are now problematic, particularly in resource-constrained contexts.


INTRODUCTION
Bali, east of Java and west of Lombok, is part of the vast Indonesian archipelago covering some 2 million km² and encompassing over 17,000 individual islands, of which only about a third are inhabited (Ministry of Environment and Forestry Indonesia, 2020). In the center of Bali one finds the Bedugul area and the Alas Mertajati, spanning elevations from 1,200 m to over 2,100 m (Figure 1). While Bali is a tropical island, the climate in the highlands of the Mertajati is cooler and wetter than in areas at sea level. The Alas Mertajati is noteworthy because it functions as a source of freshwater for the island of Bali, as a site of traditional sustainable agriculture practices and as home to the Tamblingan, an indigenous group with ancestral bonds to the lands established at least since the 10th century AD (Utami, 2021). This text extends prior research efforts. It responds to the availability of new remote sensing assets with planet-wide coverage, and to the impacts of this condition on the majority world.

EXPANDING ON PRIOR RESEARCH
As we described previously in detail (Böhlen, Liu & Iryadi, 2022a), our project is focused on the Alas Mertajati in Central Bali. The territory is currently claimed as ancestral lands by the Tamblingan community (Suryawan, 2021). At the same time, the lands have been designated as a state forest by the Indonesian government. While both entities claim to protect the forest along sustainable principles (Strauss, 2015), each entity interprets the responsibilities and benefits of sustainable actions in different ways. Our goal is to describe how, and under which conditions, GeoAI based on newly available remote sensing assets can assist in representing the complexities of sustainable food production and land use in the Alas Mertajati. Furthermore, we consider the value of this opportunity for the Tamblingan in their ongoing effort to document established sustainable land use practices and monitor protected and sacred forest areas. Building on our prior inquiry, we apply the new remote sensing assets with an open-source GIS analysis framework to include the representation of agroforestry, an important land cover category that has to date not been addressed in machine learning generated land cover analysis of the study area.
We also discuss our collaborations with the non-governmental organization WISNU (WISNU, 2023), a trusted partner of the Tamblingan community, as well as with the Baga Raksa Alas Mertajati (BRASTI, 2023), an organization dedicated to conservation, documentation, education, and networking of the Catur Desa, the four villages Gobleg, Munduk, Gesing and Umajero that constitute the Dalem Tamblingan.

DATA SOURCES
Our previous research utilized 4-band (R-G-B-NIR) satellite assets provided by Planet Labs (PlanetScope, PS). With these assets, which offer 4-band spectral, 3.7 m/pixel spatial, and 24-hour temporal resolution, we produced the first detailed land cover map of the Alas Mertajati that included mixed forests, clove forests and old growth homogeneous forests (Böhlen, Liu & Iryadi, 2022b).
Figures 3 and 4 illustrate the differences between the previous generation Dove, the current S-Dove and the Sentinel-2 satellites in regard to spatial and spectral resolution. Sentinel-2 offers a spatial resolution of at best 10 m/pixel, depending on the specific band, rendering it unsuitable for studying small-scale agroforestry. While the spatial resolution remains unchanged, the spectral resolution of S-Dove is higher than that of Dove in spectra of relevance for agricultural land cover conditions at large, including agroforestry.
Figure caption: Spectral reflectance of individual bands from Dove satellites (bands 1-4), S-Dove satellites (bands 2, 3, 6 and 8) and Sentinel-2 (bands 1-4), where: Am = mean of agroforestry spectral response (with coffee arabica as dominant species); Ay = agroforestry spectral minimum range (Am - standard deviation); Az = agroforestry spectral maximum range (Am + standard deviation).
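The quantities Am, Ay and Az defined in the caption can be illustrated with a minimal sketch. It assumes that per-band reflectance samples are available as plain lists and that the sample standard deviation is used; neither the function names nor the reflectance values below come from the study, and they serve only to show how the mean and the one-standard-deviation range are derived and how two classes can be checked for overlapping ranges.

```python
from statistics import mean, stdev


def spectral_envelope(samples):
    """Return one (Ay, Am, Az) tuple per band: (mean - sd, mean, mean + sd).

    samples: list of reflectance vectors, one per ground-truth pixel,
    all with the same number of bands.
    """
    envelope = []
    for band_values in zip(*samples):
        am = mean(band_values)
        sd = stdev(band_values)  # sample standard deviation (assumption)
        envelope.append((am - sd, am, am + sd))
    return envelope


def overlapping_bands(env_a, env_b):
    """Indices of bands where the [Ay, Az] ranges of two classes intersect."""
    return [
        i
        for i, ((lo_a, _, hi_a), (lo_b, _, hi_b)) in enumerate(zip(env_a, env_b))
        if lo_a <= hi_b and lo_b <= hi_a
    ]


# Hypothetical 4-band reflectance samples, for illustration only.
agroforestry = [
    [0.04, 0.07, 0.05, 0.40],
    [0.05, 0.08, 0.06, 0.44],
    [0.06, 0.09, 0.07, 0.48],
]
mixed_forest = [
    [0.03, 0.06, 0.04, 0.46],
    [0.04, 0.07, 0.05, 0.50],
    [0.05, 0.08, 0.06, 0.54],
]

env_agro = spectral_envelope(agroforestry)
env_forest = spectral_envelope(mixed_forest)
shared = overlapping_bands(env_agro, env_forest)
print(shared)
```

In this toy data the ranges of the two classes intersect in every band, which is the kind of spectral overlap that makes a class hard to separate.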

DATA COLLECTION AND VALIDATION
We augment our remotely sensed S-Dove satellite data with field level data collected by the research team and verified in exchanges with our local partners, WISNU and BRASTI. These ground truth data verification steps are an integral part of our analysis pipeline and an important part of our project philosophy. The iterative consultation with our local partners generates significant opportunities for insights and contextualization. The nuances of land cover categories and their relationship to land use practices are captured in discussions with the local experts in video meetings. These exchanges sensitize the research team to the lived conditions on the ground, and help translate the debate into GIS-conformant reference data in the form of labelled regions of interest (ROIs). This is a time-consuming process that unfolds over several months. The collaboration informs the data verification as well as the evaluation of analytical results. Specifically, it supports a differentiated process of establishing the 'best' results. For example, while the research team seeks to optimize the outcome of the resultant land cover maps in terms of statistical metrics, the selection of the final land cover map is left to the local land use experts from WISNU and BRASTI. Details on the reasons why this makes sense even from the perspective of optimization are offered below. Finally, we use insights gained from interactions with our local partners to devise a visualization regime that allows non-GIS experts to more easily read the results of the machine learning analysis.

SOFTWARE FOR COLLABORATION
In order to support our inquiry, we created COCKTAIL (Cocktail, 2022), an open-source data collection and analysis pipeline based on the QGIS, ORFEO and GDAL libraries. This repository allows one to automate the collection of Sentinel-2 and PS satellite assets, and to perform machine learning analysis procedures on the ingested data.
COCKTAIL contains modules to quickly determine GIS features of interest to our research, including the Normalised Difference Built-up Index (NDBI) as well as the Normalised Difference Vegetation Index (NDVI), and to apply these features directly onto raster imagery collected from Planet Labs servers (Planet Labs, 2023) and the European Space Agency repositories (ESA, 2022). COCKTAIL can be used for object-based classification via Random Forest, Support Vector Machine and Neural Networks through the ORFEO machine learning library (ORFEO, 2023). Moreover, textural information can be added as an additional layer of information to the classifiers. Our cloud-based classification pipeline allows us to perform permutations of hyperparameter combinations, and to keep track of the results in a sharable environment.
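The two indices named above follow the standard normalized-difference form: NDVI = (NIR - Red) / (NIR + Red) and NDBI = (SWIR - NIR) / (SWIR + NIR). The sketch below is a minimal illustration in plain Python and is not taken from the COCKTAIL code base; the function names, the nested-list raster layout and the reflectance values are assumptions. Note that NDBI requires a shortwave-infrared band, which Sentinel-2 provides.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b), or 0.0 when the denominator is zero."""
    total = a + b
    return (a - b) / total if total != 0 else 0.0


def ndvi(nir, red):
    """Normalised Difference Vegetation Index for one pixel."""
    return normalized_difference(nir, red)


def ndbi(swir, nir):
    """Normalised Difference Built-up Index for one pixel."""
    return normalized_difference(swir, nir)


def index_raster(func, band_a, band_b):
    """Apply a two-band index pixel by pixel to rasters stored as nested lists."""
    return [
        [func(a, b) for a, b in zip(row_a, row_b)]
        for row_a, row_b in zip(band_a, band_b)
    ]


# Hypothetical 2 x 2 reflectance rasters, for illustration only.
nir = [[0.5, 0.3], [0.4, 0.0]]
red = [[0.1, 0.3], [0.1, 0.0]]

ndvi_raster = index_raster(ndvi, nir, red)
print(ndvi_raster)
```

In a production pipeline the same arithmetic would typically be applied to whole raster arrays rather than nested lists; the per-pixel form is used here only to keep the example dependency-free.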
COCKTAIL is designed to facilitate collaboration in multiple ways. First, the code is open source, runs on Linux OS (currently Ubuntu 20.04 LTS) and can easily be deployed to a remote computing environment. Second, COCKTAIL is designed to facilitate sharing. It keeps track of all pertinent settings and parameters used in the various analysis steps, such that a given result can be more easily and reliably replicated. It also manages the storage space and moves results out of the development environment to low-cost storage space. This latter step can significantly reduce project costs, as data storage on machine learning enabled cloud environments is much more expensive than generic data storage. Together, these elements make COCKTAIL a useful enabler of collaborative GIS inquiry and GeoAI analysis.
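The idea of enumerating hyperparameter combinations and recording the settings of each run so that a result can be replicated can be sketched as follows. This is a minimal illustration under stated assumptions, not the COCKTAIL implementation: the function names, the record fields, the image identifier and the SVM option values are all hypothetical.

```python
import hashlib
import itertools
import json


def parameter_grid(options):
    """Yield one dict per combination of the listed hyperparameter values."""
    keys = sorted(options)
    for values in itertools.product(*(options[k] for k in keys)):
        yield dict(zip(keys, values))


def run_record(image_id, classifier, params, roi_version):
    """Bundle the settings of one run with an identifier derived from them."""
    settings = {
        "image_id": image_id,
        "classifier": classifier,
        "params": params,
        "roi_version": roi_version,
    }
    # Serialize with sorted keys so identical settings give an identical id.
    canonical = json.dumps(settings, sort_keys=True)
    settings["run_id"] = hashlib.sha256(canonical.encode("utf-8")).hexdigest()[:12]
    return settings


# Hypothetical options; names and values are illustrative only.
svm_options = {"kernel": ["linear", "rbf"], "C": [1, 10, 100]}

records = [
    run_record("example_image", "svm", params, 13)
    for params in parameter_grid(svm_options)
]
print(json.dumps(records[0], indent=2))
```

Deriving the identifier from the settings themselves means that two collaborators who run the same configuration obtain the same identifier, which makes shared results easier to compare.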

CONCEPTUAL CHALLENGES
The challenges outlined in our previous work remain significant: complex land cover categories require detailed scrutiny due to various kinds of heterogeneous land use scenarios (Zen, 2019). Informed by the first round of experiments, we defined together with our local partners a revised set of land cover categories. It was essential to the project to base our technical inquiry on concepts of land cover and land use relevant to our partners and their concerns.
Table 1. Land cover classes used to describe the landscape of the Alas Mertajati. This selection adds the categories agroforestry and clouds to our previous selection, and combines multiple agriculture variations into a single category.
In the tropics, satellites in orbit perceive agricultural plots in lush greens, or barren browns in short succession. Moreover, heavy cloud cover that accompanies hot and humid climates significantly reduces the opportunity for a cloudless view of the lands.
The next sections describe the significance of the prime target of this current investigation, agroforestry, and how we proceeded to represent this category within the previously described collection of land cover categories of significance to tropical environments, and the highlands of Bali in particular.

Agroforestry
Agroforestry refers to land use systems that combine woody perennials such as trees, shrubs, palms, and bamboos with agricultural crops and animals in unique temporal and spatial arrangements (Lundren, 1982). Agroforestry systems are 3-dimensional arrangements with more species than other agroecosystems (Poffenberger, 1990) arranged in a time-varying spatial structure. Agroforestry does not rely on a single crop, and farmers can respond to changes in water and soil conditions with altered plantings. Diverse plantings in agroforestry plots produce a complex root system that allows the soil to absorb and hold water at higher rates. This condition reduces runoff and acts as an erosion barrier (Yuniti, 2022).
While an agroforestry plot may appear unorganized and haphazard, it is in fact designed to permanently maintain a high level of productivity across a variety of plants, even under environmental stress. A well-maintained agroforestry plot constitutes a robust, sustainable and efficient use of arable land. Because agroforestry plots are adaptive and less sensitive to fluctuations in rainfall, they increase the resilience of rural farmers and improve food security. Agroforestry systems are also typically smaller scale operations managed by a limited number of farmers with family and personal connections to the lands. Moreover, the selection of species in an agroforestry system is often informed directly by established, cross-generational local ecological knowledge. The configuration of plant species in agroforestry sites also reduces the need for fertilizers and pesticides. For example, nothing more than manure from small-scale animal husbandry is typically applied to clove trees in agroforestry plots of the Alas Mertajati.
Bali has developed a variety of agroforestry configurations, including the abian, a field located at some distance from a residential area; the kebon, a garden located close to a residence; and the telajakan, a green space directly adjacent to a residence. Accessibility is important, specifically along the steep slopes of the hills and valleys surrounding the volcanic landscape. For that reason, plots are typically located in proximity to footpaths and small roads, facilitating the transport of produce from the fields to market. The sizes of agroforestry plots range from a fraction of a hectare to five or more hectares, small by all accounts compared to industrial agriculture farms. In the areas of concern to the Tamblingan, agroforestry plots often combine banana trees, coffee plants, avocado, jackfruit, guava, clove, palm trees and taro plants. Usually the plots have a subset of these species as dominant plantings, and some areas remain semi-wild.
There is a fluid boundary between agroforestry and mixed tropical gardens. In some cases, the same plants occur in either configuration. Typically, the mixed garden condition is more structured, more intensely managed and closer to larger roadways while agroforestry sites, often surrounded by forested territory, tend to be less intensely managed.
Despite the significance of agroforestry as a form of sustainable land use practice, agroforestry is not an official land cover classification category recognized by BIG, the Indonesian agency charged with mapping the archipelago and its resources. In 2018, the Peta Kita initiative was launched with the goal of bringing together land use, land tenure and "other spatial data" into a single database for Indonesia (Jon, 2018), (Gokkon, 2018). Given that Indonesia has a plethora of officially sanctioned maps, some for mining, others for oil and gas exploration, and yet another set for forestry, the initiative is an attempt not only to unify these disparate representations but also to exert control over the resources mapped across all of these categories.
As opposed to monoculture plots, agroforestry sites are difficult to detect in satellite imagery due to the variability of plantings, the 3-D spatial arrangements, plot irregularity and the small plot sizes. In a climate that knows no interruption to plant growth, and where plants can flower in weekly intervals, change is a constant, and agroforestry sites produce and display more change than monoculture plots. Our analysis approach relies on a combination of supervised machine learning classification, fine-tuned and balanced data samples across all categories, and human feedback. Specifically, we expand the generic human feedback loop to include feedback from stakeholders who will be impacted by the information we produce. In this case, our feedback includes concerns from WISNU and BRASTI. This feedback itself is the product of an iterative process, as described above.
One lesson from our study is that tropical land cover classes with complex and varying use cases, such as mixed forests, rice paddies and agroforestry, are difficult to detect reliably. Moreover, when such complex land cover classes are included in a single classification task, confusion across categories can be amplified. There are two main reasons for this condition. First, these categories can be challenging to identify in the field, even in the presence of local experts. We experienced this condition in January 2023 as the field team collected close to 100 ground truth samples of forest, rice paddies, gardens, agriculture, and agroforestry plots with a local expert. On several occasions, plots previously considered as agroforestry were re-assigned to either mixed forests or mixed garden plots. Even local experts can in some cases disagree on whether a particular plot is a wild-mixed garden or a well-kept agroforestry site. Moreover, upon inspection by members of WISNU and our research team, some plots were yet again reassigned to a different category. Second, the overlap of surface features outlined above generates an overlap in spectral responses that poses significant challenges to the classification pipeline, as Table 2 makes manifest. Unbalancing the ground truth reference data with additional samples to better represent one class can easily lead to reduced classifier performance in other categories, as described in the following section.

ANALYTICAL RESULTS
Tuning the selection of ROIs is time-consuming and difficult to control. Additional data does not always result in better outcomes.
Improvement in one category can result in deterioration in another category. Through trial and error, we eventually settled on between 15 and 20 ground truth samples per category. Some categories fared better with slightly smaller sample sizes, and some with slightly larger sample sizes (rice paddies). The best f1-score we could achieve in the detection of agroforestry was 0.6188, and in the cases of mixed forests and rice paddies the best f1-scores were 0.9458 and 0.8306, respectively. We purposely refrained from increasing the number of ROIs beyond 20 to prevent overfitting.
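For readers less familiar with the metric, the f1-score of a class is the harmonic mean of its precision P and recall R, F1 = 2PR / (P + R). The sketch below shows how per-class f1-scores can be derived from a confusion matrix. The matrix values are hypothetical and do not reproduce the results reported above; the function name and the row and column convention are assumptions.

```python
def per_class_f1(confusion, labels):
    """Compute the f1-score of each class from a square confusion matrix.

    confusion[i][j] is the number of samples of true class i
    that the classifier assigned to class j.
    """
    scores = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                  # missed samples of class k
        fp = sum(row[k] for row in confusion) - tp   # other classes labelled k
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        denom = precision + recall
        scores[label] = 2 * precision * recall / denom if denom else 0.0
    return scores


# Hypothetical confusion matrix, for illustration only.
labels = ["agroforestry", "mixed forest", "rice paddy"]
confusion = [
    [6, 3, 1],
    [2, 18, 0],
    [1, 0, 9],
]

scores = per_class_f1(confusion, labels)
for label, score in scores.items():
    print(f"{label}: {score:.4f}")
```

Because every misclassified sample is a false negative for one class and a false positive for another, a change in the reference data that helps one class can lower the score of a neighbouring class.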
Given that agroforestry plots are small, and our spatial resolution only 3.7 m/pixel, identification of suitable ground truth for agroforestry pushes our endeavour to its limits. Agroforestry detection must be seen here as an edge condition, a condition that is only now starting to be detectable given the current state of remote sensing assets at hand. While ground sample collection is always an important part of land cover analysis, the data collection step takes on more significance where categories have overlapping and varying spectral characteristics, the selection of which can lead to different analytical results, as is the case with agroforestry.
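The relation between plot size and spatial resolution can be made concrete with a short calculation. The sketch below estimates how many pixel footprints cover a plot of a given area at 3.7 m/pixel and at 10 m/pixel; the quarter-hectare plot size is a hypothetical example of "a fraction of a hectare", and the function name is an assumption.

```python
def pixels_per_plot(area_ha, gsd_m):
    """Approximate number of pixel footprints covering a plot.

    area_ha: plot area in hectares (1 ha = 10,000 square metres)
    gsd_m:   ground sample distance in metres per pixel
    """
    return area_ha * 10_000 / (gsd_m ** 2)


# Hypothetical quarter-hectare plot at the two resolutions discussed.
for gsd in (3.7, 10.0):
    count = pixels_per_plot(0.25, gsd)
    print(f"{gsd} m/pixel: about {count:.0f} pixels")
```

Under these assumptions a quarter-hectare plot covers roughly 183 pixels at 3.7 m/pixel but only 25 pixels at 10 m/pixel, before accounting for mixed pixels along the plot boundary.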
Given the variance of the analytical results and the impossibility of generating one solution with f1-scores above a desirable 90% threshold for all land cover categories in our target collection, it appears logical to include additional criteria in the selection process. Specifically, we include concerns particular to our local partners in determining which kinds of errors should be avoided, and which set of ground truth samples best supports the goal of adequately representing the lands. One example of the outcome of this consultation process is illustrated in Figures 7 and 8. A variety of grasses extend from the border of Lake Tamblingan into the shallow waters and turn the edges of the lake into swamp land with a spectral signature similar to that of rice paddies, a condition observed with the 4-band Dove and replicated in the new 8-band S-Dove imagery. The production of food stands in opposition to the cultural significance of Lake Tamblingan and the adjacent Ulun Danu Beratan temple, protected sites venerated since the 10th century. For that reason, any land cover representation that suggests commercial activities on these sacred lands would be considered by the Tamblingan to be a particularly undesirable classification error. With these concerns in mind, we changed the previous "best" solution (#11 in Table 2) to include additional data samples, resulting in the 13th version of the ROI sample collection, with the corresponding analytical results shown in Figure 8. We unbalanced the dataset to include more samples around Lake Tamblingan to address this particular case. As a result, the number of false positive instances of rice paddies at the edges of the lake has been noticeably reduced.
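One way to make such a stakeholder concern measurable is to count how many pixels of an unwanted class fall inside a protected zone, and to compare candidate maps on that count alongside their f1-scores. The sketch below is a hypothetical illustration of that idea and not the procedure used in this study: the function names, the class labels, the 3 x 3 toy maps and the mask are all invented for the example.

```python
def count_class_in_mask(class_map, mask, target):
    """Count pixels labelled `target` at positions where the mask is True."""
    return sum(
        1
        for map_row, mask_row in zip(class_map, mask)
        for label, inside in zip(map_row, mask_row)
        if inside and label == target
    )


def least_offending(candidates, mask, target):
    """Name of the candidate map with the fewest `target` pixels in the mask."""
    return min(
        candidates,
        key=lambda name: count_class_in_mask(candidates[name], mask, target),
    )


# Hypothetical protected-zone mask and candidate maps, for illustration only.
protected = [
    [True, True, False],
    [True, True, False],
    [False, False, False],
]
candidates = {
    "candidate_a": [
        ["rice paddy", "water", "forest"],
        ["rice paddy", "water", "rice paddy"],
        ["forest", "forest", "forest"],
    ],
    "candidate_b": [
        ["water", "water", "forest"],
        ["rice paddy", "water", "rice paddy"],
        ["forest", "forest", "forest"],
    ],
}

counts = {
    name: count_class_in_mask(grid, protected, "rice paddy")
    for name, grid in candidates.items()
}
preferred = least_offending(candidates, protected, "rice paddy")
print(counts, preferred)
```

A count of this kind does not replace the judgment of local experts; it only gives the discussion a shared number to refer to.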

Data visualization
The following section offers a visual overview of the three top performing ROI sets and the corresponding SVM classification. The source for all experiments was a crisp and cloud-free S-Dove image from May 30th, 2022.
Table 3. Results from Support Vector Machine (top), Random Forest (middle) and Neural Network (bottom) classifiers operating on the optimal set of ROIs. In the case of the Neural Network, the polygons were converted to a points-based shapefile, based on the raster ROIs. Each ROI contains 1 to 3 distinct points (557 points in total).
During discussions with our local partners, we realized that the mapping of hue to category created some confusion when applied to all categories simultaneously. The distinction between photographic representation and analysis-based mapping was obscured by the detailed pixel level visual colour mapping, specifically where the hue selection coincided or overlapped with human perception; i.e., 'blue' for water and 'green' for forest. As a consequence, we decided to clarify the results by representing a single category per image in an arbitrary colour, and setting all others to null. The output from this operation is then filtered with a morphological operator (erosion) to remove noisy pixels. That result is in turn linearly blended with the near-infrared (or alternately green) band of the corresponding satellite image and contrast enhanced. The visual results of this conceptual focusing are shown below.
There are two main areas of settlements, one west of Lake Tamblingan and one adjacent to and south of Lake Buyan.
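The single-category visualization steps described above (isolate one class, erode, blend with a grey band) can be sketched in plain Python. This is a minimal illustration under stated assumptions rather than the project's implementation: the 3 x 3 structuring element, the blend weight, the colour, the toy rasters and all function names are hypothetical, and the contrast stretch is applied to the band before blending as a simplification.

```python
def category_mask(class_map, target):
    """Binary mask: 1 where the pixel belongs to `target`, 0 elsewhere."""
    return [[1 if label == target else 0 for label in row] for row in class_map]


def erode(mask):
    """Binary erosion with a 3 x 3 square; border pixels are set to 0."""
    rows, cols = len(mask), len(mask[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(1, rows - 1):
        for c in range(1, cols - 1):
            if all(mask[r + dr][c + dc] for dr in (-1, 0, 1) for dc in (-1, 0, 1)):
                out[r][c] = 1
    return out


def stretch(band):
    """Linear contrast stretch of a band to the range [0, 1]."""
    flat = [v for row in band for v in row]
    lo, hi = min(flat), max(flat)
    span = hi - lo
    return [[(v - lo) / span if span else 0.0 for v in row] for row in band]


def blend(mask, band, colour, alpha=0.5):
    """Linearly blend `colour` over the grey band where the mask is 1."""
    image = []
    for mask_row, band_row in zip(mask, band):
        row = []
        for m, g in zip(mask_row, band_row):
            if m:
                row.append(tuple(alpha * ch + (1 - alpha) * g for ch in colour))
            else:
                row.append((g, g, g))
        image.append(row)
    return image


# Hypothetical 5 x 5 class map: a 3 x 3 block plus one isolated noisy pixel.
class_map = [
    ["agroforestry" if 1 <= r <= 3 and 1 <= c <= 3 else "forest" for c in range(5)]
    for r in range(5)
]
class_map[0][4] = "agroforestry"

# Hypothetical near-infrared band with values 0..24.
nir = [[r * 5 + c for c in range(5)] for r in range(5)]

mask = category_mask(class_map, "agroforestry")
cleaned = erode(mask)
image = blend(cleaned, stretch(nir), colour=(1.0, 0.0, 1.0))
print(sum(map(sum, mask)), sum(map(sum, cleaned)))
```

In the toy example the erosion removes the isolated pixel and shrinks the block to its centre, which illustrates both the noise removal and the loss of area that the operator entails for small plots.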

UN-MAPPING WATER RESOURCES
As outlined in our previous work, water resources are likely the most significant natural assets within our study area. Unfortunately, the observations we made regarding the limitations of satellite imagery in adequately capturing information on water resources in the tropics also apply to the newest S-Dove configuration. Hydrography of small streams in the tropics requires laborious field work that is difficult to automate. This fact makes maintenance and quality control of existing maps a challenge. -Geoportal, 2022). In our previous work, we described the significance of water resources for the island at large as well as for the indigenous groups in particular. Here we add observations on the validity of this official hydrology map.
An informal survey of a part of the Alas Mertajati in January 2023 showed that the map does not adequately represent the water streams of the area. Ten sample sites were visited, and two of them were found to contain no water. Several others appeared to have reduced water flows. This informal survey occurred during the rainy season, in the wet month of January, when the highlands of the Alas Mertajati receive up to 2000 mm of rain (Bali Besar Meteorologi, 2022). Figure 11 shows the dry streambed of one of these small rivers identified in Figure 10. This informal survey demonstrates the need to update the existing reference hydrology map. Not only is the map incorrect, it also suggests an abundance of water that no longer exists. An update is particularly called for given the informal and formal observations and experiences of acute water shortages across the island (Cole, 2015), (Cole, 2021) over the last few years. Limited and critical resources are under stress not only due to the effects of climate change but also due to the demands of an ever-growing tourist industry, a topic fraught with baggage for the Balinese, especially as the impacts and costs of over-tourism are assessed (Sperling, 2020).

DISCUSSION
The iterative process of collecting and evaluating ROI datasets for shallow machine learners that we outlined earlier shifts here from an operation focused on optimization to one that includes the potential cultural impact produced by classification outputs.
Given that our process was only able to capture the distribution of agroforestry with an f1-score of 62%, one might dismiss the results as statistically unconvincing. While the numerical results are in fact of rather low confidence, we believe that establishing a precedent in mapping difficult and important sustainable land cover categories takes precedence over numerical accuracy. To be clear, we are not advocating for abandoning established quantitative metrics in the evaluation of classifier performance. Yet particularly where numerical results fail to deliver a convincing and crisp outcome, additional criteria can, and we argue should, be included. Given that the 8-band imagery central to our operation has only been available for one year, we were not able to explore synergies from state-of-the-art deep learning in GeoAI. If ongoing advances in GeoAI for terrain analysis and agriculture can serve as an indicator (Tong, 2020), (Wang, 2021), (Linaza, 2021), better solutions for the detection of agroforestry will become available in the near future. Yet even with large collections of high quality spectral and spatial data, GeoAI will be charged with weighing the impacts of its operations against the interests of those most affected by them. And while future classification operations will produce crisper and more precise outcomes, errors will always occur, and it will remain critical to devise methods by which one can observe the potential impacts of these errors on stakeholders and make the impacts topics for discussion.
While the informal stream survey demonstrates the need for an updated hydrology map of this particular study site, it also points to a wider issue that accompanies the new big data regime in remote sensing. Outdated datasets not only formalize erroneous conditions, their very existence can prevent change from occurring, simply because they occupy the position of an official reference. While some datasets can be easily updated, others, such as this hydrology map, require expensive field surveys and 'resist' change as they are expensive to replace. Such datasets are only recognized as the flotsam they in fact are when actively queried. The need for continuous updates, system upgrades, and the inertia that opposes change will place additional stress on under-resourced remote sensing operations in the majority world.