Affordable Sensors for Speditive and Accurate Documentation of Built Heritage: First Tests and Preliminary Results

Unregulated human interventions, the tumultuous backdrop of wars, and the relentless impact of climate change take a significant toll on the structures that stand as custodians of our cultural identity and historical legacy. Documenting their status is an essential first step toward preservation and restoration. This contribution analyzes and compares datasets acquired through three distinct methods, namely a Terrestrial Laser Scanner (TLS), Apple’s iPhone 15 Pro with no additional tool, and the iPhone 15 Pro paired with the Pix4D viDoc RTK Rover, to assess the quality and accuracy of the resulting point clouds and to determine the feasibility of each technique for the expeditious documentation of endangered built heritage. The use of mass-distributed low-cost sensors provides a rapid and cost-effective means to capture detailed 3D models of built heritage while, at the same time, democratizing the process of documentation. Low-cost sensors not only facilitate documentation but also enhance the efficiency of conservation efforts. The precision of LiDAR sensors helps identify structural vulnerabilities, allowing conservationists to prioritize interventions and allocate resources wisely. Furthermore, detailed imagery enables conservationists to monitor subtle structural changes over time, acting as an early warning system against potential threats. As an additional aid, the RTK Rover enables centimeter-level positioning accuracy while maintaining a compact and rapidly deployable design, and the direct measurement of Ground Control Points (GCPs) significantly streamlines the surveying process and reduces the equipment footprint.


Introduction
The assets that make up the wealth of a territory and its inhabitants are known as Cultural Heritage (CH). Recognizing the value of specific resources comes with the need for their protection and preservation, for which predetermined sets of rules have to be established. In Italy, for instance, CH has been safeguarded by a series of laws, including the Italian Constitution of 1947. Article Nine states: "The Republic shall promote the development of culture and of scientific and technical research. It shall safeguard the natural beauties and the historical and artistic heritage of the Nation. It shall safeguard the environment, biodiversity and ecosystems, also in the interest of future generations." In fact, after the Second World War and its great destruction of Built Heritage, it became evident that such damage had to be contained. For this exact reason, UNESCO was established by the United Nations in 1945 to promote world peace and security through international cooperation in education, arts, science, and culture. Since 1978, UNESCO has protected World Heritage Sites (WHS), that is, landmarks or areas designated for their cultural or historical significance and considered of outstanding value for humanity. As of 2023, 1199 sites have been selected from 168 countries. With 59 areas on the list, Italy has the largest number of sites, five of which are hosted in the Piedmont region, where Turin is located.
When referring to Cultural Heritage, people often think of Built Heritage, such as monuments or buildings. Still, the term can also refer to other physical assets, such as paintings and sculptures, or to a natural environment. At the same time, it can also cover intangible heritage like knowledge, oral history, and even traditional traits transmitted through generations. The common goal of all humanity should be the protection of CH, which has to be preserved for future generations. The recording and documenting of Cultural Heritage is one of the most fundamental parts of the protection process, especially when dealing with at-risk heritage, and it relies, at least for tangible heritage, on different expeditious techniques from the field of Geomatics. The use of complex and heavy instrumentation like laser scanners, GNSS static receivers, or total stations can significantly impact the time needed for the acquisition, and RTK-capable Uncrewed Aircraft Systems (UAS) for photogrammetric reconstruction are not always suitable.

Related Works
Mobile mapping systems (MMS) are a widely used technology for the spatial data collection of large-scale projects such as city mapping. They work by equipping moving platforms with laser scanning or imaging sensors, and they can also be used for built heritage documentation (Alsadik & Jasim, 2018; Bonfanti et al., 2021), even in underground scenarios where TLS is scarcely used due to limited accessibility and non-uniform lighting conditions (Di Stefano et al., 2021). Besides MMS, recent studies showed that other LiDAR technologies, such as TLS or Airborne LiDAR sensors (ALS), have been used to document Forgotten Cultural Heritage (Maté-Gonzalez et al., 2022). Other low-cost sensors, such as digital cameras (Fanar et al., 2017) and multi-camera LiDAR systems, have been analyzed for cultural heritage conservation and for the spatial and temporal preservation of tangible and intangible heritage (Breggion et al., 2023). Since its release in 2020, the LiDAR sensor on Apple's devices has also been studied for different applications, such as industrial 3D scanning of small objects (Vogt et al., 2021), forest inventory (Gollob et al., 2021), monitoring of snow depth changes in small areas through time (King et al., 2022), 3D survey of rocks and cliffs for geological purposes (Luetzenburg, Kroon & Bjørk, 2021), or human body measurements (Zamotsin et al., 2022). Finally, some initial tests have also been carried out for heritage documentation purposes in different scenarios: small-medium objects, monuments, exterior façades, and indoor mapping (Murtiyoso et al., 2021; Teppati Losè et al., 2022), and for 3D modeling in indoor and outdoor environments (Díaz-Vilariño et al., 2022). For all these applications, the results obtained with Apple's LiDAR sensor have been compared with a TLS or photogrammetry reference dataset.

iPhone 15 Pro sensors
Apple first introduced the LiDAR sensor on its devices in 2020, with the 4th generation of the iPad Pro series and the 12th generation of the iPhone Pro series, with the primary intent of taking Augmented Reality (AR) to a new level. Capturing a large amount of high-resolution data over the entire field of view in an instant makes it possible to map an environment continuously. Among the available surveying techniques, the sensors available on the iPhone 15 Pro (Table 1) take advantage of laser scanning and photogrammetry, which allow the retrieval of accurate information within a short period. Laser scanning offers multiple applications in ground and aerial surveys. Terrestrial Laser Scanning (TLS) is particularly useful in inaccessible environments, becoming a valuable tool in mine engineering, architecture, and heritage documentation. Some laser scanners are known as "time of flight" (TOF) scanners, since they determine distances by measuring the time a short laser pulse takes to travel from the scanner to the object and back. What makes Apple's LiDAR scanner significant is the specific technology used to sense and measure depth. Apple's device uses structured light for Face ID, projecting a regular pattern of 30,000 dots visible only to IR cameras for depth estimation, and it uses direct time-of-flight (dTOF) for the LiDAR sensor. Along with the LiDAR sensor, the cameras are also exploited in the acquisition process through the principles of photogrammetry, the technique used to obtain physical information such as the color, shape, or dimensions of an object or environment by extracting data from photographic images. It is particularly significant for assets out of human reach or for fragile objects, since physical contact is avoided.
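The time-of-flight principle described above can be illustrated with a minimal sketch. It only shows the basic relation between round-trip time and distance (d = c · t / 2); the function names are hypothetical, and the speed of light in vacuum is assumed in place of the value in air. It is not meant to represent how Apple's sensor processes its measurements.

```python
# Minimal sketch of the direct time-of-flight relation d = c * t / 2.
# Assumption: speed of light in vacuum; the value in air differs by
# roughly 0.03 %, which is ignored here.
SPEED_OF_LIGHT = 299_792_458.0  # metres per second


def tof_distance(round_trip_seconds: float) -> float:
    """Distance to the target from the measured round-trip time of a pulse."""
    if round_trip_seconds < 0:
        raise ValueError("round-trip time must be non-negative")
    # The pulse travels to the object and back, hence the division by two.
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0


def round_trip_time(distance_m: float) -> float:
    """Inverse relation: time a pulse needs to reach a target and return."""
    if distance_m < 0:
        raise ValueError("distance must be non-negative")
    return 2.0 * distance_m / SPEED_OF_LIGHT


# A target at 5 m, the maximum range quoted for Apple's LiDAR in this paper.
t = round_trip_time(5.0)
print(f"{t * 1e9:.2f} ns")  # prints "33.36 ns"
```

The example shows why such sensors need very fine timing: a 5 m range corresponds to a round trip of only a few tens of nanoseconds.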

Low-cost Time-of-Flight Sensors
Apple's solution is just the latest in providing time-of-flight sensors on mass-produced mobile devices at a reasonable price. The main disadvantage of LiDAR technology is its cost, with prices for TLS starting from just under 20 000 € and going up to over 100 000 € depending on range, features, and speed, which reduces its appeal and availability for the vast majority of the public. Over the last two decades, though, there have been different affordable approaches to laser scanning and time-of-flight cameras, mainly focused on AR applications but which can also be exploited for other purposes, such as robotics, geomatics, and 3D documentation. Noteworthy among these is the work done by Nintendo and Microsoft with the Wii and the Kinect, respectively.
The first one uses a sensor bar with five infrared LEDs at each end to calculate the distance and angle between the remote and the sensor bar by triangulation. The second one uses an RGB camera, a depth sensor, and an infrared projector to create a 3D model of the environment by calculating the time of flight of the transmitted near-infrared light to measure the distance of each point on the player's body. Finally, before Apple's solution of providing LiDAR sensors on a mass-produced mobile device, Google and Samsung tried implementing depth sensors with Project Tango and with the Galaxy S10, S20+, and S20 Ultra, respectively, but with a lukewarm reception from the public. They both took advantage of depth and motion sensors and of the primary RGB camera to accurately measure three-dimensional objects, offering virtual simulation for Augmented Reality scenarios or video games.

viDoc RTK rover
On top of the high-precision sensors offered by the iPhone 15 Pro, an auxiliary component can be used: the viDoc RTK rover (Figure 1), produced by viGram GmbH, a German company specializing in construction and surveying. The rover works via NTRIP technology and can be mounted to the back of the device to be used along with Pix4D's proprietary mobile application, Pix4Dcatch. The app itself is free, but some credits must be purchased for the online data processing, which can also be done on a PC for free using Pix4Dmatic or other photogrammetric reconstruction software. The proposed workflow is designed for iOS devices equipped with the LiDAR sensor (iPhone and iPad Pro series released after 2020), but it also works for other selected Apple and Android devices. The viDoc itself is about the same size as the mobile device, with only the antenna slightly protruding upwards, so the surveying instrumentation remains lightweight and pocket-sized (Table 2).

Datasets and Methodology
The assessment is carried out on two notable examples of built heritage in Turin, Piedmont, the home of the Savoy Royal Family and later the first Italian capital. The first is the statue of a lying lion (Figure 2a) at the base of a monument to Giuseppe Garibaldi (1807-1882), one of the leading patriots who contributed to Italian unification. The second is the external porticade of Castello del Valentino, which is included in the UNESCO WHS list as part of the former Royal Savoy residences and which now houses the Faculty of Architecture of Politecnico di Torino (Figure 2b).

Acquisition Process
Ground Control Points (GCPs) are placed and measured for each dataset to assess the quality of the viDoc RTK rover point positioning compared to the Stonex S990a GNSS receiver. The ground-truth receiver is used first to measure each point's coordinates, by positioning the end of the telescopic pole at the center of the target and leveling it with a physical bubble (Figure 3a). The same procedure is carried out with the mobile sensor which, once leveled with a digital bubble on the screen, automatically measures the distance between the antenna phase center and the target center using the laser on its bottom (Figure 3b). For the acquisition phase, different software and hardware are used: Pix4Dcatch with the viDoc, the same application with no additional sensor, and another app, Scaniverse, to verify whether the data acquired by Pix4Dcatch is more accurate. Each acquisition for the chosen scenarios lasts around one minute and essentially consists of walking slowly and steadily around the object to be surveyed while looking at the screen to estimate the areas that still need to be captured. After this phase, the positions of the points measured with the RTK rover are manually assigned on the images acquired in Pix4Dcatch. This step requires selecting one point and locating it in a minimum of two images. Once this procedure has been repeated on at least three GCPs, the processing done online on Pix4Dcloud or locally on Pix4Dmapper takes them into account to improve accuracy.
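The leveled GCP measurement described above amounts to projecting the antenna position down to the target. The following is a minimal sketch under stated assumptions: the device is perfectly leveled, so the laser points straight down, and the laser reading is taken as the full distance between antenna phase center and target center, as stated in the text. The type and function names, as well as the coordinates, are hypothetical and do not come from the viDoc or Pix4Dcatch software.

```python
from typing import NamedTuple


class Point3D(NamedTuple):
    """Projected coordinates in metres (hypothetical local frame)."""
    east: float
    north: float
    height: float


def reduce_to_target(antenna: Point3D, laser_distance_m: float) -> Point3D:
    """Project the antenna phase-centre position down to the target centre.

    Assumption: the device is leveled, so the target lies vertically
    below the antenna and only the height changes.
    """
    if laser_distance_m < 0:
        raise ValueError("laser distance must be non-negative")
    return Point3D(antenna.east, antenna.north,
                   antenna.height - laser_distance_m)


# Hypothetical reading: antenna held 1.5 m above the target centre.
antenna_position = Point3D(east=100.0, north=200.0, height=250.0)
gcp = reduce_to_target(antenna_position, laser_distance_m=1.5)
print(gcp)
```

If the device were tilted, the horizontal coordinates would shift as well, which is why the leveling step matters in this simplified model.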

Validation and First Results
To assess the quality of the viDoc RTK Rover, two ground truths have been used. The first one concerns the quality of the GCP positioning, with the GCPs measured both by the rover and by the Stonex S990a GNSS Receiver (Table 3). The second one concerns the difference in density and distance between the point clouds (PCs) generated using the real-time positioning correction offered by the viDoc and the one acquired statically by the Faro Focus X 330 TLS (Table 4).
The same comparisons are made for the device with no additional sensor, using Pix4Dcatch and Scaniverse.

GCPs Coordinate Validation
The first assessment deals with the GCPs, whose coordinates are measured by the ground truth GNSS Receiver in RTK mode and by the viDoc rover. These positions are also calculated on the PCs generated by Pix4Dcatch and Scaniverse using the iPhone's internal GPS. Tables 5 and 6 summarize, for each direction, the average differences (Av. Δ) between the ground truth and the other acquisitions for the two datasets, as well as the standard deviations (σ) and the minimum and maximum values. This first test shows that using the RTK rover significantly improves the positioning accuracy of the GCPs and, therefore, the georeferencing of the whole point cloud. In the first scenario, using only GPS, the average planar error is about 1,4 meters (1,49 m in X and 1,27 m in Y direction), and the elevation error is greater than 5 meters. The same values acquired by the RTK rover have errors lower than 3 centimeters for both planar and elevation measurements. Similarly, in the second scenario, the average planar error is 2,6 meters (1,8 m in X and 3,4 m in Y direction), and the elevation error is 2,9 meters. Once again, using the RTK rover, the errors decrease significantly, to about 2 and 3 centimeters. On top of that, this first assessment shows that Pix4Dcatch, the app used with the RTK rover, is more accurate than Scaniverse, which for the first scenario reaches a slightly better planar difference of about 1,1 meters (1,4 m in X and 0,7 m in Y direction) but an average elevation error of 54,9 meters. In the second scenario, the results are similar, with an average planar error of about 1,3 meters (0,17 m in X and 2,45 m in Y direction) and an average elevation error greater than 53 meters.
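The per-direction statistics reported in Tables 5 and 6 can be reproduced with a short sketch like the one below. It is an illustration only: the coordinates are hypothetical, the function name is an assumption, and the choice of absolute differences and of the population standard deviation is also an assumption, since the paper does not state which definitions were used.

```python
from statistics import mean, pstdev


def axis_stats(reference, measured):
    """Per-axis summary of the differences between two lists of GCPs.

    Each GCP is an (x, y, z) tuple in metres. Assumptions: absolute
    differences and population standard deviation.
    """
    if not reference or len(reference) != len(measured):
        raise ValueError("both lists must be non-empty and of equal length")
    summary = {}
    for axis, name in enumerate(("X", "Y", "Z")):
        diffs = [abs(m[axis] - r[axis]) for r, m in zip(reference, measured)]
        summary[name] = {
            "mean": mean(diffs),
            "std": pstdev(diffs),
            "min": min(diffs),
            "max": max(diffs),
        }
    return summary


# Hypothetical GCP coordinates in a local frame (not the paper's data).
reference_gcps = [(0.0, 0.0, 0.0), (10.0, 0.0, 0.0), (10.0, 5.0, 0.0)]
rover_gcps = [(0.02, -0.01, 0.03), (10.01, 0.02, 0.01), (9.98, 5.01, -0.02)]

stats = axis_stats(reference_gcps, rover_gcps)
for name, values in stats.items():
    print(name, {key: round(value, 3) for key, value in values.items()})
```

The planar figures quoted in the text are consistent with the arithmetic mean of the X and Y averages (for instance, 1,49 m and 1,27 m give about 1,4 m), which could be added to the summary in the same way.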

Point Cloud Displacement with respect to TLS
All the acquired PCs have been compared to the ground truth acquired using the TLS Faro Focus X 330. Once all the point clouds have been aligned to the one acquired with the Faro Focus X 330 and segmented to remove all the unnecessary elements, two analyses have been carried out. The first one concerns the spatial density of points, while the second deals with the distance from the ground truth after the ICP (Iterative Closest Point) alignment. For the first analysis, a radius of 2 cm is set, within which the number of neighboring points is computed (Table 9).
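The density measure described above, the number of neighbors within a 2 cm radius, can be sketched in plain Python as follows. This is not the implementation used by CloudCompare or Leica Cyclone 3DR: it is a simple grid-hashing approach for illustration, the function name is hypothetical, and the query point itself is excluded from the count, which is an assumption.

```python
from collections import defaultdict
from itertools import product
from math import floor


def neighbor_counts(points, radius=0.02):
    """Number of other points within `radius` (metres) of each point.

    Points are hashed into cubic cells with side equal to the radius, so
    every neighbor of a point lies in its own cell or in one of the 26
    adjacent cells.
    """
    grid = defaultdict(list)
    for index, (x, y, z) in enumerate(points):
        cell = (floor(x / radius), floor(y / radius), floor(z / radius))
        grid[cell].append(index)

    radius_sq = radius * radius
    counts = []
    for i, (x, y, z) in enumerate(points):
        cx, cy, cz = floor(x / radius), floor(y / radius), floor(z / radius)
        count = 0
        for dx, dy, dz in product((-1, 0, 1), repeat=3):
            for j in grid.get((cx + dx, cy + dy, cz + dz), ()):
                if j == i:
                    continue  # assumption: the point itself is not counted
                px, py, pz = points[j]
                if (px - x) ** 2 + (py - y) ** 2 + (pz - z) ** 2 <= radius_sq:
                    count += 1
        counts.append(count)
    return counts


# Hypothetical toy cloud: three close points and one isolated point.
demo_cloud = [(0.0, 0.0, 0.0), (0.01, 0.0, 0.0),
              (0.0, 0.015, 0.0), (0.5, 0.5, 0.5)]
counts = neighbor_counts(demo_cloud)
print(counts)                     # prints [2, 2, 2, 0]
print(sum(counts) / len(counts))  # mean number of neighbors: 1.5
```

Averaging the counts over a whole cloud gives a mean value comparable in kind to those reported in Table 9.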

The PCs acquired by Pix4Dcatch have a much greater number of points than those generated with Scaniverse. In comparison with the ground truth, on the other hand, the point cloud obtained with the viDoc is much sparser for the Porticade of the Valentino Castle (597,7 for the TLS vs. 133,3 for the viDoc, as average points in a 2 cm radius). Still, it is denser for the Lion Statue (66,9 for the TLS vs. 334,9 for the viDoc, as average points in a 2 cm radius). This can be explained by the fact that, by moving around with a mobile device, it is easier to capture small details from orientations that would be inaccessible to a fixed TLS. For the statue, the TLS was positioned relatively far away, while in the other scenario it was positioned directly inside the porticade, allowing for a denser reconstruction. The analysis of the distances with respect to the TLS datasets after the ICP alignment is processed in Leica Cyclone 3DR. A maximum range of 10 cm is considered, to separate the actual point cloud distances from the points that represent a change in the environment between the acquisition epochs (e.g., flower pots or chairs in the Valentino Castle positioned after the TLS acquisition).
Usually, when scanning an environment with a mobile device, a drift in the acquisition may occur as time passes or in the presence of a loop closure, especially in indoor environments. However, since the test on the RTK rover would have been less significant in an indoor scenario, and since the datasets are captured rapidly without loop closure, no significant drift can be noticed (Figure 6). Table 10 summarizes the distances for each acquisition with respect to the ground truth. In the first scenario, after the ICP alignment with respect to the Faro Focus X 330, more than 80% of the points of the cloud acquired with Pix4Dcatch RTK fall within a distance of 2 cm. Using only the GPS, this percentage is slightly lower, at 77,7%; for Scaniverse, it drops to 40%.
For the second scenario, more than 50% of the points acquired using the RTK rover fall within the same range, while GPS alone yields just above 33% of the points within a distance of 2 cm. At first sight, the result obtained with Scaniverse seems better, with 60% of points within this range. Still, it must be taken into account that this value is a percentage of the number of points of the segmented cloud: 60% of 289,9 thousand points for Scaniverse against 51% of 11,4 million points for Pix4Dcatch RTK. The results of this distance assessment suggest the soundness of the acquisition process of Pix4Dcatch, both with the RTK rover and on its own, as well as of the processing, which for this contribution was done via cloud but can also be done locally using Pix4Dmatic or other photogrammetric reconstruction software.
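The distance assessment above can be sketched as follows. The example computes, for a toy cloud, the share of points within 2 cm of the reference after discarding points farther than the 10 cm maximum range, and then converts the percentages quoted in the text into absolute point counts. The function names are hypothetical, the nearest-neighbor search is brute force rather than what Leica Cyclone 3DR does, and computing the percentage over the points kept after the 10 cm cut is an assumption.

```python
from math import dist


def nearest_distances(cloud, reference):
    """Brute-force distance from each point to its nearest reference point."""
    return [min(dist(p, q) for q in reference) for p in cloud]


def share_within(distances, threshold, max_range=0.10):
    """Percentage of points within `threshold` metres of the reference.

    Points beyond `max_range` are treated as changes in the scene and
    discarded first (assumption: the percentage refers to the kept points).
    """
    kept = [d for d in distances if d <= max_range]
    if not kept:
        return 0.0
    return 100.0 * sum(d <= threshold for d in kept) / len(kept)


def points_within(total_points, percentage):
    """Absolute number of points corresponding to a percentage of a cloud."""
    return round(total_points * percentage / 100.0)


# Hypothetical toy data: the third point plays the role of a new object.
reference_cloud = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
test_cloud = [(0.0, 0.0, 0.01), (1.0, 0.0, 0.03), (0.5, 0.0, 0.0)]
distances = nearest_distances(test_cloud, reference_cloud)
print(share_within(distances, threshold=0.02))  # prints 50.0

# Percentages and cloud sizes quoted in the text for the second scenario.
print(points_within(289_900, 60))     # Scaniverse: prints 173940
print(points_within(11_400_000, 51))  # Pix4Dcatch RTK: prints 5814000
```

In absolute terms, the Pix4Dcatch RTK cloud therefore has roughly thirty times more points within 2 cm of the ground truth than the Scaniverse cloud, despite the lower percentage.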

Conclusions and Future Possibilities
The solution offered by viGram and Pix4D for a rapid and accessible mapping of the built heritage provided promising results, with the RTK rover showing low discrepancies (< 3 cm planar and elevation errors) from the GCPs measured by the ground-truth GNSS receiver in RTK mode. At the same time, the app itself was compared to a second one, Scaniverse, to verify that the best available data quality was being used.
The point clouds generated by the second app were significantly less dense than those of Pix4Dcatch, partly because its data processing can only be done on the fly in a few seconds, while the cloud processing takes several minutes. Among the many free apps taking advantage of the LiDAR sensor, others that the authors have not considered may offer better results.
A subscription-based app, Dot 3D, available for selected iOS, Android, and Windows devices, is currently working on integrating GNSS/RTK receivers into its workflow for the automatic and precise georeferencing of point clouds. Still, no result has been published at the time of this contribution. The use of mobile mass-distributed devices with high-resolution sensors for the documentation of built heritage lays the foundations for the democratization of surveying techniques, thanks to the integration of a mobile device with an RTK rover that is pocket-sized and much less expensive than a TLS paired with a GNSS receiver. The precision of such a compact device can be crucial in rapidly identifying structural vulnerabilities and subtle changes over time, allowing for swift responses and resource allocation against potential threats.
The price difference between the two sets of instrumentation justifies some limitations, such as the maximum acquirable range of 5 meters for Apple's LiDAR sensor, which could be extended by mounting the mobile device on a telescopic pole to reach higher areas. In addition, an integration with high-resolution, compact, and relatively inexpensive UAS, such as the DJI Mini series, can be obtained by using the GCPs measured by the RTK rover for the aerial photogrammetric reconstruction of objects of notable dimensions or of areas out of reach of human operators. The complete instrumentation would still be much less expensive and more lightweight than the ground truth one (Table 11). The viDoc itself, like all RTK/GNSS receivers, is not suited for indoor scenarios. However, GCPs measured precisely outdoors in a mixed-environment acquisition can still be taken into account for a point cloud adjustment in the processing phase. Similarly, like all GNSS receivers, the device cannot function independently in poor-signal areas, as it needs a base station to send positioning corrections.
Along with the expeditious documentation of built heritage, this compact and lightweight configuration could also be exploited for many other activities, such as documenting trenches and construction sites, or collision and forensic reconstruction.

Acknowledgments
The viDoc RTK rover tested in this contribution was provided to the authors in the context of the Pix4Dcatch RTK desktop grant for university research, for which a workflow was proposed on the use of photogrammetry integrated with TLS and UAS for a complete and accurate point cloud reconstruction for green architecture. The ground truth datasets have been acquired by Politecnico di Torino's Department of Architecture and Design (DAD), specifically by the Laboratory of Geomatics for Cultural Heritage (LabG4CH) research team, and have been kindly made available for the analyses and comparisons of this contribution.

Figure 2. The two case studies: a lion statue at the base of Garibaldi's monument (a) and the porticade of Valentino's Castle (b) in Turin, Italy.

Figures 4 and 5 show significant portions of the two point clouds generated with the viDoc RTK rover using Pix4Dcatch.

Figure 4. N° of Neighbors (radius 0,02 m) on the point cloud of the Lion Statue generated by the Pix4Dcatch RTK

Figure 6. Distances of the point clouds captured by Pix4Dcatch with viDoc RTK rover with respect to Faro Focus X 330

Table 1. Main specifications of iPhone 15 Pro

Table 2. viDoc RTK Rover main features

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLVIII-1-2024 ISPRS TC I Mid-term Symposium "Intelligent Sensing and Remote Sensing Application", 13-17 May 2024, Changsha, China

Table 3. Main specifications of Stonex S990a GNSS Receiver

Table 6. Comparison of the GCPs on the Castle Porticade with respect to the ground truth

The comparison with the ground truth acquired using the TLS Faro Focus X 330 has been carried out through the open-source software CloudCompare and the commercial software Leica Cyclone 3DR. Once imported into CloudCompare, the clouds have been aligned to the ground truth via ICP (Iterative Closest Point) to minimize the distances. Each cloud has then been segmented to analyze only the overlapping part of the datasets, excluding from the comparison all the areas not shared by the different PCs or close to the original point cloud borders, which are more jagged. Tables 7 and 8 list the main specifications of the PCs acquired for each dataset: the ground truth (Faro Focus X 330), Pix4Dcatch with the RTK rover (P4DC RTK), Pix4Dcatch without any additional sensor (P4DC), and Scaniverse (SV).

Table 8. Dimensions and number of points for each point cloud of the Porticade of the Valentino Castle

Table 9. Number of Neighbors (radius 0,02 m) for each point cloud in the two scenarios

Table 10. Percentage of Points for each distance range with respect to the ground truth