UAV and Deep Learning: Detection of selected riparian species along the Ganga River

Environmental protection and sustainable natural resource management are being recognised worldwide as essential goals to safeguard human health and wellbeing. Riparian zones, which face the highest decline in freshwater biodiversity, are of prime conservation priority because they are essential for regulating climate, preserving aquatic-terrestrial biodiversity, maintaining groundwater recharge and restoring rivers. In today's fast-paced, data-driven environment, artificial intelligence (AI) offers answers to a wide range of problems, including biodiversity conservation and wildlife management. Leveraging advancements such as Uncrewed/Unmanned Aerial Vehicles (UAVs) and AI has resulted in innovative strides in wildlife conservation. This study utilised UAV imagery to record high-resolution data of aquatic habitats and species along the Ganga River and employed deep learning algorithms to analyse the data. Through extensive field surveys in the Hastinapur Wildlife Sanctuary, 7,025 photos representing a variety of environments, including 20,000 annotated samples of aquatic animals such as turtles and gharials, were generated. Vision-based computing capabilities, in the form of a pattern recognition model, were developed to identify these species. To enrich the dataset and improve the model, we used different image pre-processing techniques. Slight rotation (±5 degrees), minor cropping (up to 10%), and adjustments in brightness, saturation, and shear (±15%) were applied. Controlled blur (up to 0.5%) and exposure modifications (±5%) were also applied to the image dataset to improve accuracy. Three Convolutional Neural Network (CNN) architectures, the single-stage detectors YOLO v7, YOLO v8, and Roboflow 3.0, were used for detecting the selected species. Results show that YOLO v8 excels, achieving a mean average precision (mAP) of 98.8% for gharial and 92.2% for turtle detection, with a rapid average detection time of 0.308 seconds per frame at 3200 x 3200 resolution. Additionally, our model demonstrates real-time species detection capability through innovative frame sampling techniques with UAVs. This methodology provides a promising technique to collect scientific data on IUCN Red Listed turtles and critically endangered gharials, allowing detection, monitoring, and real-time counting with minimal intrusion. In conclusion, the fusion of UAVs and deep learning promises to revolutionize habitat monitoring, aiding conservation decision-making.


Role of AI and UAVs in conservation efforts
Artificial intelligence is offering solutions to the most pressing problems of the 21st century in nearly every sector. Today the world is data driven, and rapid data analysis is the need of the hour. Wildlife conservation and natural resource management have also moved to technology-based advancements such as Unmanned Aerial Vehicles and Artificial Intelligence. Traditionally, wildlife studies were an off-mainstream subject studied in isolation. Today, however, they are globally recognised as essential for human health and wellbeing. Riparian areas have gained priority due to their significance in river rejuvenation, biodiversity conservation, climate regulation, and water and food provisioning, amongst numerous other ecosystem services. Aquatic biodiversity is also amongst the most rapidly declining, with many endemic species entering the IUCN Red List of threatened biota. Scientific studies and the right kind of data can help direct judicious decisions, thereby aiding conservation. The different ways of monitoring, tracking, detecting, and counting wildlife species include methods such as point counts, transect counts, area searches, pellet analysis, nest counts, photographic surveys, bioacoustic monitoring, and aerial surveys (Hong et al., 2019; Platts, 1987; Ralph et al., 1995). Meanwhile, UAVs are able to collect important environmental data and high-resolution imagery while navigating difficult terrain, deep forests, and isolated locations that are inaccessible to people. UAVs offer a non-invasive approach and the ability to cover areas that are challenging for human access, minimizing disturbances to wildlife and their habitats (Bennett et al., 2020; Tripathi et al., 2024; Witczuk et al., 2018). This allows researchers and wildlife managers to track wildlife numbers, habitat changes, and illegal activities without disrupting fragile ecosystems. UAVs equipped with various sensors enable animal monitoring, detection, counting, and even real-time surveillance and processing. Delineation of distinct riparian habitat types such as river bank, water body, aquatic vegetation, grassland, trees, shrub, grass, sand bars, and built-up areas is possible with high-resolution data acquired from UAVs. With the assistance of deep learning-based object detection techniques, UAVs facilitate informed, data-backed decision-making, paving the way for a multitude of on-ground applications. This study explores vision-based computing capability to train a model to recognise patterns and detect aquatic animals in very high-resolution drone imagery of the Gangetic riverscape.

Object detection techniques
Object detection algorithms are computer vision techniques that use patterns to train a model to detect a specific object. They are categorized into two main types. The first type, known as two-stage detectors, operates in two steps to identify objects. These models prioritize accuracy in recognizing and locating objects precisely. Examples of two-stage detectors include DEtection Transformer (DETR), Faster Region-Based Convolutional Neural Network (R-CNN), Mask R-CNN, Cascade R-CNN, Feature Pyramid Networks (FPN), and Region-based Fully Convolutional Networks (R-FCN) (Carranza-García et al., 2021; Piao et al., 2022). The second type, one-stage detectors, is engineered for speed and can detect objects in real time. Examples of one-stage detectors include various versions of YOLO, SSD, and RetinaNet (Cheng et al., 2020; Diwan et al., 2023; Jiang et al., 2022; Li et al., 2022).
One-stage detectors offer advantages over two-stage detectors in object detection due to their speed, simplicity, efficiency, scale invariance, and adaptability. They perform detection in a single pass, making them faster and simpler to implement, with reduced computational resources. Additionally, they excel at detecting objects at various scales and are more adaptable to different datasets. However, the choice depends on factors such as application requirements, speed, accuracy, and resource constraints. Several studies investigate two-stage detectors that forgo speed in favour of greater accuracy (Carranza-García et al., 2021; Du et al., 2020). In our study, we opted for a single-stage detector to maximize accuracy in real-time scenarios. We employed widely accepted detection models, including YOLO v7, YOLO v8, and Roboflow 3.0, to accurately identify and count specific species within the study area shown in Figure 1.

Study Area
The National River of India, the Ganga, is the country's longest river, fostering unique biodiversity. The Hastinapur Wildlife Sanctuary (HWS) in the state of Uttar Pradesh (UP), with an area of 2073 km², falls under the Gangetic Plain (7A) biogeographic zone. Covering a length of 110 km along the Ganga River, HWS is situated between 28°46′-29°35′ N and 77°30′-78°30′ E, spreading over the Hapur, Amroha, Meerut, Bijnor, and Muzaffarnagar districts of UP. HWS provides habitat for various wildlife species including water birds, gharials, turtles, nilgai, swamp deer, otters and dolphins. The study was conducted in the post-winter months of February to April in 2022 and 2023. For this study, UAVs were flown along the Gangetic river banks to scan for gharials and turtles basking on the river islands and sand bars. Illustrates the selected species samples (gharial and turtle) taken by UAVs.

Selection of UAVs and data collection
An important protocol to follow when planning a UAV-based survey is obtaining permission from the concerned forest and administrative departments. We obtained drone-flying permissions from the four district magistrates and three forest divisions where the survey was conducted.
We used two drones to collect data: a DJI Mavic 2 Enterprise Zoom with a 12-megapixel camera (4000 x 3000 pixels) and a 1/2.3" CMOS sensor, and a DJI Mavic Pro with a similar 12-megapixel camera (4000 x 3000 pixels) and the same type of sensor. The Mavic Pro can record 4K video at 30 frames per second, and its camera is kept stable by a 3-axis mechanical gimbal. Detailed specifications are provided in Table 1. The DJI Pilot and DJI Go 4 mobile applications were used for gathering data. We flew the UAVs at dawn (7:00 to 8:30 AM) and during the day (2:30 to 4:30 PM) at strategic locations within the HWS along the Ganga River, at multiple altitudes ranging from 20 m to 90 m, based on inputs from previous field work.

Dataset Preparation and Processing
In this study, we employed various image pre-processing techniques, namely rotation, crop, shear, saturation, exposure, blur and brightness adjustments, to improve the robustness and versatility of the animal detection models. To improve model generalisation, our method included rotation between -5 and +5 degrees, which created minor orientation differences. We also included cropping by up to 10% so that the model could identify animals in images even if they were partially obscured. In addition, shear, brightness, and saturation modifications between -15% and +15% were used to improve the model's adaptability to various lighting scenarios. As further image enhancement techniques, controlled blurring of up to 0.5% and exposure modifications between -5% and +5% were used.
Our experiments involved analyzing 7025 images collected using these two UAVs during field surveys in the Hastinapur Wildlife Sanctuary along the Ganga River. The dataset comprised UAV images and video recordings that were converted to images. The original dataset comprised 3247 images of gharial, 1952 images of turtles, and 1926 images of riverscape comprising sandbars, river islands and river banks. This dataset was augmented to 21,075 images. A total of twenty thousand image samples of aquatic fauna, including gharials and turtles, were annotated.
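The augmentation ranges described above can be sketched as a small parameter sampler. This is only an illustration, not the tooling used in the study: the names `AugmentationParams`, `sample_params` and `plan_augmentation` are hypothetical, uniform sampling within each range is an assumption, and the choice of three variants per source image is inferred from the reported totals (7,025 x 3 = 21,075).

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class AugmentationParams:
    rotation_deg: float    # rotation angle in degrees
    crop_fraction: float   # fraction of the image cropped away
    shear_pct: float       # shear, in percent
    brightness_pct: float  # brightness change, in percent
    saturation_pct: float  # saturation change, in percent
    exposure_pct: float    # exposure change, in percent
    blur: float            # blur strength, in the unit reported in the text


def sample_params(rng: random.Random) -> AugmentationParams:
    """Draw one augmentation setting inside the ranges reported in the text.

    Uniform sampling is an assumption made for this sketch.
    """
    return AugmentationParams(
        rotation_deg=rng.uniform(-5.0, 5.0),
        crop_fraction=rng.uniform(0.0, 0.10),
        shear_pct=rng.uniform(-15.0, 15.0),
        brightness_pct=rng.uniform(-15.0, 15.0),
        saturation_pct=rng.uniform(-15.0, 15.0),
        exposure_pct=rng.uniform(-5.0, 5.0),
        blur=rng.uniform(0.0, 0.5),
    )


# Assumption: three variants per image, inferred from 21,075 / 7,025 = 3.
VARIANTS_PER_IMAGE = 3


def plan_augmentation(n_images: int, seed: int = 0) -> list[list[AugmentationParams]]:
    """Return, for each source image, the settings of its augmented variants."""
    rng = random.Random(seed)
    return [
        [sample_params(rng) for _ in range(VARIANTS_PER_IMAGE)]
        for _ in range(n_images)
    ]
```

Calling `plan_augmentation(7025)` yields 21,075 parameter sets in total, matching the size of the augmented dataset; applying each setting to pixels would be left to an image library of choice.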

Methodology
This methodology outlines the entire process from survey planning to model deployment, focusing on image acquisition, annotation, pre-processing, model selection, training, evaluation, and deployment. Figure 2 shows the methodology flow diagram, which incorporates YOLO v7, YOLO v8, and Roboflow Object Detection Model 3.0 for object detection of the riverine species gharial (Gavialis gangeticus) and turtles from images collected along the Ganga River using UAVs. False negatives are instances where the model incorrectly identifies an object as absent when it is actually present. N is the total number of predictions made by the detection model.

Result
This study attempts to take ecological monitoring one step further by employing deep learning algorithms to analyse data collected by UAVs. The experiments involved an extensive analysis of 7025 images collected during field surveys conducted in the Hastinapur Wildlife Sanctuary (HWS) along the Ganga River. These images covered a diverse array of habitats and conditions, contributing to a comprehensive dataset comprising 20,000 annotated samples of aquatic fauna, including gharials and turtles. We evaluated three distinct CNN architectures for object detection: YOLO v7, YOLO v8, and Roboflow Object Detection Model 3.0. The results from our trials revealed that the YOLO v8 model, considered state-of-the-art, achieves remarkable accuracy and efficiency in detecting Gharial, with mean average precision (mAP) at 98.4%, accuracy at 98.8%, and recall at 98.6%. Other models, such as YOLO v7 and Roboflow, also show promising results, as illustrated in Table 2. YOLO v7 leads in accuracy, while Roboflow 3.0 excels in precision. Comparison with the other models indicated a significant improvement in the performance of the YOLO v8 model. The average detection time was 0.308 s per frame at 3200 x 3200 resolution. Further, our model also proves efficient at real-time detection using a frame-sampling technique with UAVs. A comparison graph of all three models for both species is shown in Figure 5. We conclude that UAVs combined with deep learning techniques have strong potential to support habitat monitoring and informed decisions. The ongoing research represents progress in developing a real-time, fully automated system for observing freshwater species, such as Gharial and turtles, in a non-invasive manner for in-field applications.
In this way, technology can contribute significantly to the conservation of Critically Endangered species identified by the IUCN, such as Gharial (Gavialis gangeticus), by generating accurate counts and population estimates and by providing better coverage of habitats with minimal intrusion.
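As an illustration of how per-frame detections could be turned into the species counts mentioned above, the following minimal sketch tallies detections per class above a confidence cut-off. The `Detection` structure, the function name `count_species` and the 0.5 threshold are assumptions made for this example; they are not taken from the study's pipeline.

```python
from collections import Counter
from typing import Iterable, NamedTuple


class Detection(NamedTuple):
    label: str         # predicted class name, e.g. "gharial" or "turtle"
    confidence: float  # detector confidence in [0, 1]


def count_species(detections: Iterable[Detection],
                  min_confidence: float = 0.5) -> Counter:
    """Count detections per class, ignoring those below min_confidence.

    The default threshold of 0.5 is a placeholder, not a value from the study.
    """
    return Counter(
        d.label for d in detections if d.confidence >= min_confidence
    )


# Example with made-up detections from a single frame.
frame_detections = [
    Detection("gharial", 0.97),
    Detection("gharial", 0.42),  # below the threshold, not counted
    Detection("turtle", 0.88),
    Detection("turtle", 0.91),
]
counts = count_species(frame_detections)
```

Here `counts["gharial"]` is 1 and `counts["turtle"]` is 2. Counting across consecutive video frames would additionally require tracking or de-duplication so that the same animal is not counted twice.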

Discussion
Our study effectively used deep learning and UAV technologies to transform ecological monitoring in riparian areas. We analysed over 7,000 high-resolution UAV pictures to produce a comprehensive dataset of 20,000 annotated samples of gharials and turtles. This demonstrates the utility of UAVs for large-scale ecological data collection. Three CNN architectures, YOLOv7, YOLOv8, and Roboflow 3.0, produced promising results. This study paves the way for UAV-based deep learning systems to become a primary instrument for non-invasive, real-time animal monitoring and habitat conservation.
The capacity to track endangered animals in real time allows researchers to devise more effective conservation methods.
Targeting vulnerable species like gharials adds significance by showcasing the potential for conservation efforts.
The gharial species, with a single subspecies, is found only in India, restricted to fewer than about 1000 individuals, mainly in the Gangetic river system in the Chambal, Gandak, Ganga, Ghaghara, Girwa, Rapti, Ramganga, Son and Ken rivers, and in the Mahanadi (Lang et al., 2019). As the gharial is a critically endangered, endemic, specialist freshwater species, monitoring of its population is crucial for conservation efforts. Though population estimates in rivers such as the Chambal and Gandak (Katdare et al., 2011; Nair et al., 2012; Panda et al., 2023; Whitaker et al., 2007) have been performed intensively, similar exercises in other rivers are at a nascent stage (Vashistha et al., 2023). The Ganga basin is home to 15 turtle species, of which 12 are IUCN Red Listed as threatened, and many of them are also listed under Schedule I of the Indian Wildlife (Protection) Act. Small reptiles such as turtles are extremely sensitive to river pollution, fishing and climate change. Population estimates of turtles in their natural habitat are also limited. They are also difficult to record through field surveys due to their elusive nature.
Field surveys also often miss alternate river channels, especially in highly braided rivers such as the Ghaghara and Ganga. UAVs in such conditions provide uniformity and better coverage. Distribution range, hotspot identification and population estimates are essential for devising correct strategies for rehabilitation and breeding programs. Along the Ganga, the distribution of the species is restricted to protected areas such as the Hastinapur WLS. Despite this, these critical habitats along the Upper Ganga River have been greatly harmed by human activities, threatening the survival of species, particularly in this study area (Paul et al., 2021; Tripathi et al., 2022). Hence, regular monitoring of the area using UAVs can closely track anthropogenic activities and offer better protection to animals.
While this study focused on achieving high accuracy in real-time species detection using YOLO v8, we also explored potential optimizations for further efficiency gains. One key consideration for real-time applications is processing speed. UAV imagery can capture video streams with high frame rates, but subsequent frames often exhibit minimal changes. Our study investigated the feasibility of frame rate selection as a method to optimize processing efficiency without compromising detection accuracy. This technique involves strategically skipping specific frames within the video stream while ensuring that information critical for object detection is not lost.
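A simple way to see why frame skipping matters is to compare the video frame rate with the reported detection time. The sketch below assumes a 30 frames-per-second stream (the rate stated for the Mavic Pro) and the reported average of 0.308 s per frame; the function names are hypothetical, and fixed-stride sampling is only one possible strategy, not necessarily the one used in the study.

```python
import math


def sampling_stride(video_fps: float, seconds_per_frame: float) -> int:
    """Smallest stride k such that processing every k-th frame keeps pace
    with the incoming stream."""
    if video_fps <= 0 or seconds_per_frame <= 0:
        raise ValueError("video_fps and seconds_per_frame must be positive")
    # Frames that arrive while one frame is being processed, rounded up.
    return max(1, math.ceil(video_fps * seconds_per_frame))


def sampled_indices(n_frames: int, stride: int) -> list[int]:
    """Indices of the frames that would be sent to the detector."""
    return list(range(0, n_frames, stride))


# 30 fps x 0.308 s = 9.24 frames arrive per detection, so every 10th frame.
stride = sampling_stride(video_fps=30.0, seconds_per_frame=0.308)
one_second = sampled_indices(n_frames=30, stride=stride)
```

Under these assumptions the detector would see about three frames per second of video (`one_second` is `[0, 10, 20]`), which is usually sufficient for basking animals that move little between frames.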
By analyzing the video data and identifying frames with minimal changes, such as those showing stationary objects or slow movements, we can potentially skip those frames and focus processing power on frames with a higher likelihood of containing new information or movement. This approach could significantly reduce the computational load on the system, allowing for real-time processing at potentially higher resolutions or with additional functionalities.
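One possible reading of change-based frame selection is sketched below: a frame is kept only if it differs enough from the last kept frame. The mean absolute pixel difference, the threshold value and the function name `select_changed_frames` are assumptions chosen for illustration; the study does not specify its change measure.

```python
from typing import Iterable, List

import numpy as np


def select_changed_frames(frames: Iterable[np.ndarray],
                          threshold: float) -> List[int]:
    """Return indices of frames that differ from the last kept frame by more
    than `threshold` (mean absolute pixel difference)."""
    kept: List[int] = []
    reference = None
    for index, frame in enumerate(frames):
        # Cast to float so uint8 subtraction cannot wrap around.
        current = np.asarray(frame, dtype=np.float32)
        if reference is None or np.mean(np.abs(current - reference)) > threshold:
            kept.append(index)
            reference = current  # compare later frames against this one
    return kept


# Synthetic example: three near-identical frames, then a clear change.
still = np.zeros((4, 4), dtype=np.uint8)
slight = still + 1
changed = np.full((4, 4), 100, dtype=np.uint8)
kept = select_changed_frames([still, still, slight, changed, changed], threshold=5.0)
```

In this synthetic example only frames 0 and 3 are kept, so the detector would run on two of the five frames. A suitable threshold for real UAV footage would have to be tuned, since camera motion and water glare also change pixel values.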
Further research is needed to determine the optimal frame selection strategy for various scenarios and species detection tasks. However, the initial exploration in this study suggests that frame rate selection holds promise for enhancing the efficiency of real-time wildlife monitoring using UAVs and deep learning models such as YOLOv8.
The optical camera used in this study senses only three bands: red, green, and blue. However, advanced multispectral sensors that also detect in the red-edge and near-infrared bands can provide useful information about vegetation, habitat and water quality.

Conclusion
This study bridges the gap between cutting-edge artificial intelligence and practical ecological monitoring in riparian habitats. By leveraging deep learning algorithms, particularly YOLOv8, we demonstrate the successful analysis of high-resolution UAV imagery for real-time species detection. Our research contributes by enhancing monitoring capabilities and real-time detection of selected species, which can boost conservation efforts.
Our study reiterates that UAVs can aid observations of aquatic life in the Ganga River. High-resolution imagery can also be utilised for genus-level differentiation. Morphometric measurements are an important parameter for scientific observation of reptiles. We were also able to determine accurate measurements of body size and the length of body parts. Habitat parameters such as elevation, bank slope, soil type, substrate, nearest vegetation, and level of disturbance can also be recorded using UAVs.
Future research could focus on enhancing detection techniques and improving real-time species detection. Populating datasets and enriching data quality with field surveys can allow the model to be upgraded to include more riparian species, thereby offering detection and enumeration for more priority species. Such measures can have a huge impact on data-driven conservation management.
Shatakshi Sharma for their support during field visits and lab work.

Figure 1. [a] The study area is shown on the map, located in the Hastinapur Wildlife Sanctuary (HWS), Uttar Pradesh, India. The surveyed locations were along the Ganga River. [b & c]

Figure 2. Methodology chart representing the proposed methodology flow diagram of how the three models are used to detect the selected species using an RGB camera mounted on a UAV, including these steps.
Further, we evaluated the Precision, Accuracy, mAP, and F1 score of the model. Precision represents how accurate the model's positive predictions are. It is calculated by dividing the true positive predictions by the total positive predictions, which include both true positives and false positives. Recall is a metric that quantifies the model's capacity to retrieve all relevant instances from all real positive occurrences in the dataset. It is sometimes referred to as Sensitivity or True Positive Rate. The F1 score is a single metric that combines both precision and recall, giving a balanced view of the model's performance in binary classification tasks. It ranges from 0 to 1, where higher values mean a better balance between precision and recall. Mean average precision (mAP) is calculated as the average precision across all classes or categories of objects detected by the model, and is widely adopted as a metric for assessing the overall effectiveness of object detection models. It also gauges the average precision across a range of Intersection over Union (IoU) thresholds.
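The metric definitions above can be written out as a short sketch. The formulas are the standard ones; the function names and the box format `(x_min, y_min, x_max, y_max)` are choices made for this illustration, and the mAP helper simply averages per-class average precision values that are assumed to have been computed elsewhere.

```python
from typing import Mapping, Tuple

Box = Tuple[float, float, float, float]  # (x_min, y_min, x_max, y_max)


def precision(tp: int, fp: int) -> float:
    """True positives divided by all positive predictions."""
    return tp / (tp + fp) if (tp + fp) else 0.0


def recall(tp: int, fn: int) -> float:
    """True positives divided by all actual positives."""
    return tp / (tp + fn) if (tp + fn) else 0.0


def f1_score(p: float, r: float) -> float:
    """Harmonic mean of precision and recall."""
    return 2 * p * r / (p + r) if (p + r) else 0.0


def iou(a: Box, b: Box) -> float:
    """Intersection over Union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def mean_average_precision(ap_per_class: Mapping[str, float]) -> float:
    """Average of per-class average precision values."""
    if not ap_per_class:
        return 0.0
    return sum(ap_per_class.values()) / len(ap_per_class)
```

As a consistency check, `f1_score(0.903, 0.887)` evaluates to about 0.895, in line with the F1 score of 0.89 reported for YOLO v8 on turtle detection from a precision of 90.3% and a recall of 88.7%.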

Figure 4. Images captured by UAVs demonstrating the detection of Gharial within bounding boxes, with each box including the species name and the calculated mean average precision (mAP).
YOLO v8 outperforms YOLO v7 and Roboflow 3.0 in most metrics. YOLO v8 achieves the highest precision (90.3%) and mAP (92.2%), indicating its accuracy in identifying turtles. It also exhibits strong recall (88.7%) and F1 score (0.89), indicating its balance between precision and recall. Roboflow 3.0 follows closely in precision but lags in recall and F1 score compared to YOLO v8, as shown in Figure 4.

Figure 5. Comparison of the three models (YOLO v8, YOLO v7, and Roboflow 3.0) for Gharial and Turtle detection, showing accuracy, precision and recall.

Table 1. Key characteristics of the two UAVs used in this study.

Table 2. Performance parameters (precision, mAP, and F1 score) of the three models in detecting Gharial.

Table 3. Performance parameters (precision, mAP, and F1 score) of the three models in detecting Turtle.