MULTI-CAMERA LIDAR SYSTEM FOR SPATIAL AND TEMPORAL PRESERVATION OF THE INTANGIBLE CULTURAL HERITAGE

Cultural heritage preservation is becoming increasingly important, and advances in technology are enabling more effective preservation of both tangible and intangible cultural heritage. This paper proposes the development of a system for documenting intangible cultural heritage using multiple calibrated and synchronized LiDAR cameras for immersive and dynamic surveying of complex scenes. The prototype will be assembled using low-cost LiDAR sensors, specifically the Intel® RealSense™ L515. By using several devices, the system will ensure full coverage of the scene to be acquired. The biggest development challenge is the calibration phase, which, if carried out properly, allows a correctly oriented point cloud to be collected from each individual device. The point clouds will be recorded at a frequency of 30 Hz to create a dynamic, time-varying point cloud, which can then be viewed in a virtual reality environment. The affordability of the components, combined with the greater completeness of the data, will make it possible to acquire scenes, actions, and events from a more comprehensive perspective. The goal is to lay the groundwork for multi-platform and multi-sensor technologies capable of acquiring more data, at a higher level of detail, in intangible cultural heritage preservation efforts.


INTRODUCTION
Along with global advancements in conservation, the meaning of cultural heritage has evolved (Vecco, 2010). According to the United Nations Educational, Scientific, and Cultural Organization (UNESCO), heritage has a very broad meaning that includes both tangible and intangible cultural heritage. The former is concerned with things like monuments, collections of items, and archaeological discoveries. For the latter, the 2003 UNESCO Convention for the Safeguarding of the Intangible Cultural Heritage 1 identifies five domains:
• oral traditions and expressions, including language and storytelling;
• performing arts such as singing, dancing, theater, and feasting;
• social practices, festivals, and rituals;
• knowledge of nature and the cosmos;
• knowledge and skills required to make traditional crafts.
Some nations also recognize regional variations, including traditional plays and games, gastronomic practices, animal husbandry, and pilgrimage and memorial sites. Preservation, the deliberate act of keeping cultural heritage for the future, is presently practiced in historical museums, cultural centers, scientific research, education, and other settings. The use of different 3D technologies is one way to develop this process (Liu, 2022). They give access to cultural heritage components that are difficult to reach in the physical world (Morlando et al., 2012). Preservation spans several areas of study: documentation, security, reconstruction, restoration, conservation, dissemination, and spreading. Documentation relates to the storage of different kinds of material. Conservation strengthens the transmission of the important messages and ideals carried by cultural heritage. Dissemination of tangible and intangible cultural heritage items involves their representation and visualization using contemporary technologies (Bercigli, 2019).
Spreading is the process of reaching as many potential receivers as possible to introduce them to cultural heritage (Skublewska-Paszkowska et al., 2022; Doulamis et al., 2012). Faster and cheaper surveying methods are increasingly being researched in geomatics (Wang et al., 2015). Digital technologies facilitate intangible cultural heritage documentation procedures and offer safer, more reliable data gathering. Additionally, combining different 3D technologies improves comprehension of the subject and adds fresh perspectives to the research. The concept is to take advantage of geomatics tools to record objects and actions in a 4D (x, y, z, t) space, i.e. to describe an object's shape and location in three-dimensional space over a certain period of time (Rallis et al., 2019). The idea is to produce a sequence of three-dimensional point clouds at fixed intervals (Weny et al., 2005). If a video clip (which has always been used to document actions and objects in their interaction) is a sequence of two-dimensional images recorded at a precise frame rate, the aim here is to create sequences of three-dimensional models that can be explored in space and time. Such a system will make it possible to generate digital twins of objects and actions. As an example, consider the recording of a piece played on the violin. There are various degrees of documentation, including audio recording of the harmony, photo documentation of the performer's posture, and video filming of both manual skills and performance. However, if we wanted to record the position and motion of the fingers at every moment as they produce each sound, we could create 3D scans at intervals of 1/30 of a second. The result would be a four-dimensional recording in which the performance is documented in space and time, allowing the viewer to experience it with complete freedom.
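The 4D (x, y, z, t) idea can be illustrated with a minimal data structure. The sketch below is not the prototype's software: the class name `DynamicPointCloud` and its methods are hypothetical, and it simply assumes that frames arrive at a fixed rate (30 Hz here), so that the timestamp of each frame follows from its index.

```python
from dataclasses import dataclass, field
from typing import List, Tuple

Point = Tuple[float, float, float]  # (x, y, z) in metres


@dataclass
class DynamicPointCloud:
    """Hypothetical container: a sequence of point clouds at a fixed rate."""
    rate_hz: float = 30.0
    frames: List[List[Point]] = field(default_factory=list)

    def append(self, points: List[Point]) -> None:
        """Store one frame (one static point cloud)."""
        self.frames.append(list(points))

    def timestamp(self, index: int) -> float:
        """Time in seconds of frame `index`, assuming a constant frame rate."""
        return index / self.rate_hz

    def frame_at(self, t: float) -> List[Point]:
        """Return the frame closest to time t, clamped to the recording."""
        if not self.frames:
            raise ValueError("empty sequence")
        index = int(round(t * self.rate_hz))
        index = max(0, min(index, len(self.frames) - 1))
        return self.frames[index]

    def as_xyzt(self) -> List[Tuple[float, float, float, float]]:
        """Flatten the sequence into (x, y, z, t) tuples."""
        return [
            (x, y, z, self.timestamp(i))
            for i, pts in enumerate(self.frames)
            for (x, y, z) in pts
        ]


if __name__ == "__main__":
    seq = DynamicPointCloud()
    seq.append([(0.0, 0.0, 0.0)])
    seq.append([(0.1, 0.0, 0.0), (0.2, 0.0, 0.0)])
    print(seq.as_xyzt())
```

Navigating the recording in time then amounts to calling `frame_at(t)`, while navigating in space is left to whichever viewer renders the returned points.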
Therefore, the goal of this study is to develop a working prototype able to record intangible cultural heritage using multiple precisely calibrated LiDAR cameras. The development of such systems has accelerated significantly for two main reasons. The first is the introduction to the market of new, low-cost, high-performance sensors, which are primarily used for self-driving cars and facial recognition (Kyungpyo et al., 2021; Nam et al., 2021). The second is the ongoing research into new tools for representing reality in VR/AR environments (Zhang et al., 2022; Fritsch et al., 2017; Ioannidis et al., 2020). Although numerous technologies on the market enable the acquisition of 360-degree 3D scenes, they usually only allow the acquisition of a static model (Pesce et al., 2015). These systems consist mainly of a series of cameras arranged around the scene and triggered simultaneously to capture synchronized images, which are subsequently post-processed with dedicated software. The main problem is that these systems need a considerable amount of hardware and, consequently, of economic resources. The aim of this paper is to evaluate the potential and the limitations of a prototype built from three Intel® RealSense™ L515 depth cameras, a new type of low-cost LiDAR sensor capable of acquiring dynamic point clouds at a frequency of 30 Hz. This type of technology is generally used for robotics applications. The goal is to apply it to the dynamic survey of intangible cultural heritage.

Working principle and technical specifications
The prototyped system for the dynamic survey of intangible cultural heritage uses three carefully calibrated and synchronized LiDAR sensors arranged to create a single dynamic, properly georeferenced point cloud that can be visualized and navigated in both the temporal and spatial domains. The sensors are Intel® RealSense™ L515 units. Each of these consists of a LiDAR sensor (ToF), an RGB sensor, and an inertial module. The LiDAR uses an infrared laser, a MEMS mirror, a photodiode and an ASIC visual processing module. The MEMS mirror redirects the laser beam over the entire field of view. The photodiode captures the reflected laser beam, which is then processed by the ASIC module to generate a point cloud as output. This process is performed for the entire image surface at a maximum frequency of 30 Hz and a maximum resolution of 1024 x 768 (786,432 points). The RGB sensor allows the point clouds generated by the device to be colored. The maximum range for depth sensing is 9 m and the maximum field of view is 70° x 55°. The type of material and the illumination greatly affect the quality of the acquired points 2. The device is designed specifically for indoor use, since sunlight disturbs its operation. Within a range of 50 cm to 2 m the device achieves good accuracy, with errors remaining below 2 cm (Breggion et al., 2022). The three LiDARs will be mounted on a static support and connected to an Ubuntu PC that will perform the stitching and data processing. The three devices will be hardware synchronized using a Raspberry Pi mini-PC.
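The specifications quoted above translate into a few useful figures. The following sketch is only a back-of-the-envelope calculation under simplifying assumptions: it treats the field of view as an ideal pyramid, ignores lens distortion, and takes the stated maximum resolution and frame rate at face value. The function names are illustrative and not part of any SDK.

```python
import math


def points_per_frame(width: int = 1024, height: int = 768) -> int:
    """Number of depth samples in one frame at the stated maximum resolution."""
    return width * height


def points_per_second(width: int = 1024, height: int = 768,
                      rate_hz: int = 30) -> int:
    """Upper bound on points produced per second by one sensor."""
    return points_per_frame(width, height) * rate_hz


def footprint_m(distance_m: float, hfov_deg: float = 70.0,
                vfov_deg: float = 55.0) -> tuple:
    """Width and height (m) of the area seen at a given distance,
    assuming an ideal pyramidal field of view."""
    width = 2.0 * distance_m * math.tan(math.radians(hfov_deg / 2.0))
    height = 2.0 * distance_m * math.tan(math.radians(vfov_deg / 2.0))
    return width, height


if __name__ == "__main__":
    print("points per frame :", points_per_frame())
    print("points per second:", points_per_second())
    w, h = footprint_m(1.0)
    print(f"footprint at 1 m : {w:.2f} m x {h:.2f} m")
```

Under these assumptions a single sensor delivers at most about 23.6 million points per second, which gives a sense of the processing and storage load that three devices place on the host PC.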

System assembly
The system is assembled from the three LiDAR sensors, which must be properly georeferenced and synchronized to achieve an optimal result. The devices are placed so as to ensure total coverage of the survey object: one of them is placed in a nadir position, and the other two are placed diametrically opposite each other (Figure 1). All three are oriented toward a common point where the survey object will be placed. The devices are positioned about 1 m away from the capture point, which allows for an acquisition volume of about 50 cm³.
Figure 1. Assembled system.
The first obstacle to overcome in prototyping the system is the interference that multiple devices cause in the generated point clouds. The three devices emit laser beams of the same wavelength and, in the designed system, all point toward the same location (Figure 2); this produces significant signal interference and consequent noise in the extracted data. For this reason, an additional device must be included in the system to synchronize the sensors, which is done by turning them off and on in a coordinated sequence. Using the GPIO channels of a Raspberry Pi and a Python script, it is possible to send a 3.3 V pulse to the sensors. While the signal is high the LiDAR records data; when the signal is off the sensor does not acquire any data, which gives the next sensor time to acquire without interference (Figures 3, 4 and 5). With external hardware synchronization, the pulse width determines how many frames are taken. The initial frame is not transmitted to the host immediately after the external trigger turns on the laser. First, a 10 ms laser warm-up period is required. Then, depending on when the signal is received, the system may wait up to three frames before sending a valid frame to the host.
Next, the synchronized frames are processed by an Ubuntu PC, on which the transformation parameters between devices were also recorded. These parameters are essential to merge the three separate point clouds into one correctly georeferenced cloud. In order to calculate the transformation parameters, it is first necessary to determine the center of capture of each device. To do this, a point cloud was acquired from each of the depth cameras already placed in the mount. Next, one of the cameras was set as the reference, with origin coordinates x, y, z = 0, 0, 0 and rotations omega, phi and kappa = 0, 0, 0.
The other two clouds were then oriented to the reference one by collimating known points on a checkerboard. After that, it was only necessary to convert the resulting roto-translation matrix into the coordinates of the capture centers and the three rotation angles, which were then converted from gons to the sexagesimal degrees required by the software to correctly orient the scans. The RealSense™ SDK 3 was installed on the PC to which the three devices were connected. The SDK is necessary for communication between the cameras and the different application packages required for data processing. The management, orientation and visualization of the point clouds were done through the Rviz software included in the ROS (Robot Operating System) application package 4. The software allows the clouds to be recorded at a frequency of 30 Hz and viewed in its visualization mode (Widodo et al., 2018).
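The orientation step can be sketched in a few lines. This is an illustration under stated assumptions, not the procedure implemented in the software used for the prototype: the rotation convention R = Rz(kappa) · Ry(phi) · Rx(omega) is assumed here because the text does not specify one, and all function names are hypothetical. The only fixed fact is the unit conversion, since 400 gon correspond to 360°.

```python
import math


def gon_to_deg(angle_gon: float) -> float:
    """Convert gons (400 per full turn) to sexagesimal degrees (360 per turn)."""
    return angle_gon * 0.9


def _matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def rotation_matrix(omega_deg: float, phi_deg: float, kappa_deg: float):
    """Rotation matrix for an ASSUMED convention R = Rz(kappa) Ry(phi) Rx(omega)."""
    o, p, k = (math.radians(a) for a in (omega_deg, phi_deg, kappa_deg))
    rx = [[1, 0, 0],
          [0, math.cos(o), -math.sin(o)],
          [0, math.sin(o), math.cos(o)]]
    ry = [[math.cos(p), 0, math.sin(p)],
          [0, 1, 0],
          [-math.sin(p), 0, math.cos(p)]]
    rz = [[math.cos(k), -math.sin(k), 0],
          [math.sin(k), math.cos(k), 0],
          [0, 0, 1]]
    return _matmul(rz, _matmul(ry, rx))


def transform_point(r, t, p):
    """Apply p' = R p + t to a single (x, y, z) point."""
    return tuple(sum(r[i][j] * p[j] for j in range(3)) + t[i] for i in range(3))


def merge_clouds(clouds, poses):
    """Bring every cloud into the reference frame and concatenate them.
    Each pose is (omega_deg, phi_deg, kappa_deg, (tx, ty, tz))."""
    merged = []
    for cloud, (omega, phi, kappa, t) in zip(clouds, poses):
        r = rotation_matrix(omega, phi, kappa)
        merged.extend(transform_point(r, t, p) for p in cloud)
    return merged


if __name__ == "__main__":
    reference = [(0.0, 0.0, 1.0)]
    second = [(1.0, 0.0, 0.0)]
    poses = [(0.0, 0.0, 0.0, (0.0, 0.0, 0.0)),               # reference camera
             (0.0, 0.0, gon_to_deg(100.0), (1.0, 2.0, 3.0))]  # made-up pose
    print(merge_clouds([reference, second], poses))
```

The reference camera keeps an identity pose, matching the choice of origin 0, 0, 0 and zero rotations described in the text; the pose of the second camera in the usage example is invented purely for demonstration.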

RESULTS
The use of multiple LiDAR sensors connected to the same processor requires a large amount of resources, in terms of both processing power and data storage. It is therefore necessary to understand the benefits, but also the physical limitations, of this type of device. As far as processing is concerned, they have a clear competitive advantage, since the result, i.e. the point cloud, is generated almost in real time. This distinguishes them from a system composed only of "ordinary" cameras, which needs very substantial processing power. Another element in favor of these devices is that fewer of them are required than in a photogrammetric system, which would need a much larger number of cameras to obtain noise-free point clouds. At the same time, their limitations and shortcomings must be considered. The need for external synchronization results in a low frame-rate output for each camera. Consequently, the 3D video is not as smooth and flowing as normal video, which requires a minimum frequency of 24 Hz. Ideally, the devices should acquire at a higher frequency. The biggest problem, however, is the excessive time the sensor needs to start up and shut down when it receives the synchronization signal. The startup time of the devices is 60 ms. There is some delay between the moment the laser is turned on by the external trigger and the moment the first frame is forwarded to the host. A 10 ms laser warm-up period is required. The system may then wait up to three frames before transmitting a valid frame to the host, depending on when the signal is received. This waiting period ensures that a high-quality frame is sent to the host (Figure 7). For example, to capture 3 consecutive frames the signal needs to be high for 160 ms (Mulla et al., 2021).
Considering that the three cameras are switched on and off about 6 times per second in total, about 360 ms out of every 1000 ms are spent just on camera startup, which leads to empty frames within the video. For this reason, it was necessary to rewrite the synchronization code so that each synchronization signal starts at least 50 ms before the previous one is switched off. This slightly increased the number of frames per second, but it is still not sufficient to obtain a smooth video. Another specific issue with this type of device is the background noise that occurs between frames. Even if the noise due to other external elements is removed, the devices generate another type of noise, called "temporal noise" 5 by the manufacturer, which introduces a noticeable uncertainty in the position of the acquired points from one frame to the next and thus degrades the quality of the final data. This noise has several causes: first, the very low power of the emitted laser signal; second, the movement of the MEMS mirror that directs the beam, which, probably because of small variations in the position of the beam during the movement, introduces uncertainties in the final result. It also affects the determination of the relative orientation parameters among the three devices and consequently leads to an incorrect orientation among the acquired clouds.
5 https://www.intelrealsense.com/download/7691/
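The timing figures reported in this section can be checked with a short calculation. The sketch below rests on an assumed model that happens to reproduce the quoted numbers: the pulse width is taken as the 60 ms startup time plus one 30 Hz frame period per requested frame, which gives 160 ms for 3 frames. The function names are hypothetical, and the schedule only lists on/off times; on the Raspberry Pi each event would correspond to driving a 3.3 V GPIO pin, a step omitted here because the text does not name the library used.

```python
FRAME_PERIOD_MS = 1000.0 / 30.0   # one frame at 30 Hz
STARTUP_MS = 60.0                 # startup time quoted in the text


def pulse_width_ms(n_frames: int, startup_ms: float = STARTUP_MS,
                   frame_period_ms: float = FRAME_PERIOD_MS) -> float:
    """ASSUMED model: startup time plus one frame period per frame."""
    return startup_ms + n_frames * frame_period_ms


def pulses_per_second(pulse_ms: float, overlap_ms: float = 0.0) -> int:
    """Whole trigger pulses that fit in one second when each pulse starts
    `overlap_ms` before the previous one ends."""
    return int(1000.0 // (pulse_ms - overlap_ms))


def startup_overhead_ms(pulse_ms: float, overlap_ms: float = 0.0,
                        startup_ms: float = STARTUP_MS) -> float:
    """Time per second spent on sensor startup rather than acquisition."""
    return pulses_per_second(pulse_ms, overlap_ms) * startup_ms


def round_robin_schedule(n_sensors: int, n_pulses: int, pulse_ms: float,
                         overlap_ms: float = 0.0):
    """List of (sensor_index, t_on_ms, t_off_ms), cycling through the sensors."""
    step = pulse_ms - overlap_ms
    return [(k % n_sensors, k * step, k * step + pulse_ms)
            for k in range(n_pulses)]


if __name__ == "__main__":
    width = pulse_width_ms(3)
    print(f"pulse width for 3 frames: {width:.0f} ms")
    print("pulses per second, no overlap:", pulses_per_second(width))
    print("startup overhead, no overlap :", startup_overhead_ms(width), "ms")
    print("pulses per second, 50 ms overlap:", pulses_per_second(width, 50.0))
    for sensor, t_on, t_off in round_robin_schedule(3, 4, width, 50.0):
        print(f"sensor {sensor}: on at {t_on:.0f} ms, off at {t_off:.0f} ms")
```

Without overlap this model gives 6 pulses per second and 360 ms of startup overhead, in line with the values reported above. With a 50 ms overlap the same model fits about 9 pulses into each second, which is consistent with the modest gain in frame rate that was observed.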

CONCLUSIONS
The preservation of tangible and intangible cultural heritage is becoming increasingly important in modern culture. For this reason, research into new, faster and more accurate methodologies for surveying this type of object is also gaining importance. Standard acquisition methods currently on the market are primarily designed to acquire static objects. Intangible assets are more difficult to acquire because they are dynamic and variable. Recording motion requires sensors with a much higher acquisition rate. Currently, moving objects are mainly documented using video cameras, but these do not allow metric and spatial data to be acquired, and they are mostly limited to a single point of view determined in advance during the acquisition phase. The goal of this paper is to investigate, through the tools of geomatics, a methodology for documenting intangible cultural heritage with the flexibility offered by the non-invasive technologies typical of remote sensing. The proposed system for mapping intangible cultural heritage using multiple calibrated and synchronized LiDAR cameras has the potential to change the way we survey and preserve cultural heritage. Low-cost LiDAR sensors like the Intel® RealSense™ L515 provide an affordable solution that can capture dynamic, time-varying point clouds. This system can be used to acquire scenes, actions, and events from a more comprehensive perspective, providing a higher level of detail in cultural heritage preservation efforts. The main challenge in designing the system is the calibration phase, which is needed to generate a correctly oriented point cloud from the individual devices. The main limitation of the system is the low number of frames it is able to generate per second. This problem is due primarily to the need for an external trigger, which is necessary to reduce the noise generated by interference between devices.
The problem could be bypassed by mounting the devices so that they acquire different portions of the scene. This would avoid the need for external synchronization hardware, but the scene coverage would be greatly reduced. Nevertheless, the results obtained from the prototype demonstrate the potential of the proposed approach for acquiring dynamic and detailed 3D models of intangible cultural heritage (Figure 6). Future research can explore the potential of multi-platform and multi-sensor technologies to acquire even more data at a higher level of detail. The development of these technologies can contribute to the conservation, documentation, and dissemination of cultural heritage, enabling people from different parts of the world to experience and appreciate it. In conclusion, the proposed system is an innovative and cost-effective solution that can have a significant impact on cultural heritage preservation efforts.