ADVANCED 3D PARAMETRIC HISTORIC CITY BLOCK MODELING COMBINING 3D SURVEYING, AI AND VPL

ABSTRACT: The presented research aims to define a parametric modelling methodology that allows, in a short time and at a sustainable cost, the digital acquisition, modelling and semantic structuring of urban city blocks to facilitate 3D city modelling applied to historic centres. The methodology is based on field surveying and derives 3D data for the realisation of a parametric City Information Model (CIM). This is pursued through the adoption of parametric modelling as the main method, combined with AI procedures such as supervised machine learning. In particular, the Visual Programming Language (VPL) Grasshopper is adopted as the main working environment. The proposed methodology, called Scan-to-CIM, is developed to automate the cognitive operations of interpretation and input of surveying data performed in the field, in order to create LoD4 city block models in a semi-automatic way. The Scan-to-CIM methodology is applied to a city block located in the historic centre of Catania, Italy.


INTRODUCTION
In recent years, 3D semantic models have been increasingly used in many disciplines. In particular, the field of 3D city models is developing increasingly innovative and automated procedures for the reconstruction of semantic 3D city models from point clouds usually obtained from airborne surveys useful for risk assessment procedures. Hence, many case studies in literature deal with this subject at the territorial and urban scale (LoD 0, 1, 2), rarely going into architectural details (LoD 3 and 4), which are instead fundamental to the development of effective City Information Modeling (CIM) procedures, merging BIM and GIS procedures. These activities have produced a considerable amount of training datasets related to urban objects acquired in relation to a territorial scale, therefore not very usable for transition to an architectural scale. On the other hand, analysing the training datasets related to HBIM applications, they mainly concern monumental architectures, thus too specific to be used in an urban environment. There are currently limited training datasets that allow reverse modeling operation in the urban environment at the architectural scale. Moreover, these datasets contain architectural configurations that are often not found in the Mediterranean area. This work aims to define a semi-automatic 3D semantic modeling procedure (called Scan-to-CIM) that allows the creation of City Information Models to support urban cultural heritage risk assessment activities. Specifically, in this research we propose a fast reconstruction of LoD4 city models of urban aggregates, combining Machine Learning (ML) semantic segmentation of point clouds and parametric reverse modelling via Visual Programming Languages (VPL). Our research questions are: is it possible to adopt parametric modelling via VPL for reverse modelling from classified clouds using machine learning? Is it possible to conduct instance segmentation of openings on facades using VPL? 
What are the advantages of the Scan-to-CIM approach proposed in this work over conventional approaches to the construction of 3D city models?

3D city modeling: definitions and standards
Currently, a 3D City model can be defined as "a digital representation, with three-dimensional geometries, of common objects in an urban environment, with buildings usually being the most prominent objects" (Arroyo Ohori et al., 2022). This virtual representation is usually adopted to store, visualize and interact with digital urban data acquired from reality that include terrain, building, vegetation as well as roads and transportation systems models. Virtual 3D city models' ability to visually integrate diverse geoinformation into a unified framework is one of its distinguishing features. As a result, they enable the creation and management of complex urban information environments (Döllner et al., 2007;Billen et al., 2014;Zhu et al., 2009). The generation of 3D city models can take place from different types of acquisitions, data and processing methods, such as: photogrammetry (terrestrial and aerial), laser scanning, extrusions from 2D cad, CAD/BIM model conversion, procedural modelling, crowdmapped opendata, etc. Generally, 3D city models are defined with respect to a data structure and format according to the type of data sources available, the expertise of those producing them and the type of output expected. The applications of 3D city models cover almost all disciplines, so their generation and management are topics of considerable scientific relevance. In Biljecki (2017), 29 different applications of city models are identified. These include analyses for solar irradiance estimation, energy demand estimation, inhabitant estimation, visualisation of the urban environment for navigation systems, visibility analysis, shadow studies for urban climate analyses, applications for land registry, urban planning, facility management, emergency response, etc. In the context of this research, semantic 3D city models will be discussed. The motivation for semantic 3D city models is the fact that they allow information to be extracted from the city model (e.g. 
how many inhabitants are there in a city block? what are the construction years and techniques of a building?). City models that do not allow queries and/or interactions cannot be called semantic 3D city models; they are only 3D representations of a territory (e.g., a textured 3D mesh produced from a photogrammetric aerial block). Semantic models are data models in which the relevant objects (and their components) are structured according to a hierarchy and have attributes linked to them. They are a collection of objects belonging to different classes (building, road, bridge, tree, etc.). Each object possesses at least one geometric representation and may also possess attributes. In addition, each object is decomposed into homogeneous parts, each with its own geometric representation and attributes. An object of the building class, for example, can be decomposed into walls, floors, windows, roof, etc. For this reason, a 3D city model is defined as spatio-semantically coherent if there is an unambiguous relationship between each decomposed element and its host object, both geometrically and semantically. Conceptually, these models are structured in much the same way as BIM models, which rely on families and parameters (Arroyo Ohori et al., 2022). Given the great variety of 3D city model types, it became necessary to define an international standard for the semantic data structure, so that models obtained from different data and processes are all constituted in the same way. The modeling standard for 3D information models of cities and urban systems is CityGML 2.0 from the Open Geospatial Consortium (OGC), which identifies five levels of detail (LOD) in the three-dimensional representation, from 0 (footprint on the ground) to 4 (building modelled both internally and externally), and uses a set of classes to describe city characteristics (Figure 1).
However, applications involving 3D city models have scaled down to the architectural scale requiring greater precision and more advanced management of collected geometries and attributes. Furthermore, the impossibility of some digital acquisition techniques in particular urban environments has highlighted certain drawbacks that the adoption of such LODs raises during the modelling phase (Machl, 2013;Biljecki et al., 2016;Löwner et al., 2016). According to Biljecki et al. (2016), one of the main problems with these ambiguities is the lack of standards that relate acquisition techniques to the models obtained, thus generating confusion over the use of nomenclature. How should we classify a building that is represented only by a prism (LOD1) but has surfaces inside it that represent floors (LOD4)? To answer this type of questions, a review of the LOD concept was proposed to reach 16 LODs obtained by defining 4 versions for LODs ranging from 0 to 3. LOD4 is excluded, as for urban applications it is currently not much used due to the difficulty of acquiring data concerning the interior of buildings (privacy issues) (Biljecki et al., 2016).
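The hierarchical structure of a semantic 3D city model described in this section (objects with a class, attributes and decomposed parts that can be queried) can be illustrated with a minimal Python sketch. It is not the CityGML data model: the `CityObject` class, the `total_attribute` helper and the `inhabitants` attribute are hypothetical names chosen for illustration, and geometry is omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class CityObject:
    """Hypothetical semantic object: a class label, attributes and child parts."""
    name: str
    object_class: str                      # e.g. "city_block", "building", "wall"
    attributes: dict = field(default_factory=dict)
    parts: list = field(default_factory=list)

    def add(self, part):
        """Attach a decomposed part to this host object and return the part."""
        self.parts.append(part)
        return part

    def walk(self):
        """Yield this object and, recursively, every object it is decomposed into."""
        yield self
        for part in self.parts:
            yield from part.walk()


def total_attribute(root, object_class, key):
    """Sum a numeric attribute over all objects of one class below `root`."""
    return sum(
        obj.attributes.get(key, 0)
        for obj in root.walk()
        if obj.object_class == object_class
    )


if __name__ == "__main__":
    block = CityObject("block A", "city_block")
    b1 = block.add(CityObject("building 1", "building", {"inhabitants": 12}))
    block.add(CityObject("building 2", "building", {"inhabitants": 8}))
    b1.add(CityObject("north wall", "wall"))
    # Query of the kind mentioned in the text: inhabitants in a city block.
    print(total_attribute(block, "building", "inhabitants"))
```

A textured mesh offers no equivalent of `total_attribute`, which is the practical difference between a semantic model and a plain 3D representation.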

City Information Modeling approaches
The definition of City Information Modeling (CIM) is an issue that has been widely debated internationally in recent years (Simonelli et al., 2018;Xu et al., 2021). An outlining according to information modeling standards is at least complex given the hybrid nature of the BIM and GIS environment. Therefore, it is necessary to identify what characteristics allow a CIM model to be defined as opposed to a 3D GIS or BIM extended to the urban scale. In agreement with Xue, F. et al. (2021), the meaning that the 'I' of 'Information' takes on in CIM versus GIS and BIM provides the correct key. According to one of its first definitions, it is called the Urban Information Model which "integrate the multidimensional urban aspects like economy, society and environment with 3D urban model plus temporal dimension. Urban information model will provide comprehensive information support to various urban planning application systems" (Hamilton et al., 2005). One of the most established and recognised definitions in the literature is that a City Information Model consists of a system of urban elements represented by 2D and 3D elements containing information, linked by semantic relationships (Stojanovski, 2013). Nowadays, there is a wide range of CIM applications covering different disciplines (Xue et al., 2021). It is possible to distinguish three main approaches for developing CIM models: bottom-up, top-down and parametric. The first one (or model driven) focuses more on remote sensing on site acquisition (close-and mid-range laser scanning and photogrammetry) with subsequent manual and semi-automatic modeling processes in BIM and CAD environments (Pelliccio et al., 2017;Zhang et al, 2021;Avena et al., 2021;Parrinello et al., 2020). These procedures often merge BIM and GIS data enabling the users to make queries and display models on web-based platform. 
In these models, Computational Design (CD) is applied through Visual Programming Languages (VPL) to link, sort and merge metadata between models and environments, but not for modeling purposes. The top-down (or data-driven) procedures deal mainly with long-range remote sensing techniques (e.g., airborne LiDAR data) and geodata (coming from online open-data sources or datasets held by local institutions), which are further developed inside GIS-based procedural modeling environments (Biljecki et al., 2015; Nys et al., 2020; Wang et al., 2018; Pârvu et al., 2018). Top-down models usually do not need any further integration (unlike bottom-up models), with the exception of indoor data, which are inserted via the conversion of IFC files into CityGML objects (Biljecki et al., 2021). These models are closer to the definition of CIM since they are based on the CityGML standard, where the city is treated as a whole system composed of different objects with geometries and metadata (OGC CityGML 3.0, 2012). In these models, CD is applied by using traditional textual programming languages to create algorithms that derive building geometries from point cloud segmentations. The development paradigms for CIM presented so far are very expensive in terms of the technologies and expertise needed. Therefore, they are not sustainable except for large cities, leaving small and medium centres excluded from the potential utility of CIM for emergency management. The third approach used for generating CIM models is often called 'parametric urbanism' (De Jesus et al., 2018). This approach is characterized by CD workflows that often interoperate with open data and remote sensing products. The main work environments are VPLs connected with CAD software.
In particular, the VPL Grasshopper, thanks to several plugins dedicated to 3D city modeling, has supported the development of several research activities related to the CIM paradigm (De Jesus et al., 2018;Calvano et al., 2019;Fink & Koenig, 2019;La Russa & Genovese, 2021). The parametric approach relates to the previous ones regarding responsiveness between files of different nature (e.g., IFC and SHP), interaction with digital survey products and standards for 3D city models (CityGML).

AI for 3D semantic segmentation and classification
The adoption of digital acquisition techniques is now an established practice in both industry and research. Acquisition and registration workflows are increasingly automated, speeding up work and reducing errors. However, the use of the resulting 3D data has remained unchanged for a long time: point clouds are used as static references for generating views and sections, which reduces the potential of the digital product itself. For this reason, and as a result of the steadily expanding use of 3D models, 3D data classification has recently become a very active research area. Automatically grouping large datasets into homogeneous regions with comparable properties has become increasingly important in many domains, including robotics (Maturana et al., 2015), autonomous driving (Wang et al., 2017), urban planning (Xu et al., 2014), heritage (Grilli and Remondino, 2019) and geospatial applications (Özdemir et al., 2021). The objective is to automatically classify semantically continuous portions of point clouds (e.g., walls, windows, columns, etc.) in order to optimize modeling operations on point clouds with automatic and semi-automatic workflows. This can be achieved by applying Artificial Intelligence methods to geospatial data, in particular Machine Learning (ML) and Deep Learning (DL) techniques (Döllner, 2020; Matrone et al., 2020; Pierdicca and Paolanti, 2021). ML classifiers, such as Support Vector Machine (SVM) and Random Forest (RF), are trained using a collection of features and training data with associated label information (i.e., classes). Defining the right features is fundamental to obtaining a training phase efficient enough to semantically segment the full dataset based on the prediction of the classifier used. Extracting and/or generating the right features can significantly change the obtained results (Georgianos et al., 2015; Guo et al., 2016; Weinmann et al., 2014).
In this work, we rely on Random Forest as described in Grilli and Remondino (2019), with the following steps: (i) neighborhood selection, (ii) feature extraction, (iii) feature selection, (iv) manual annotation, and (v) classification (Weinmann et al., 2016). First, distinct geometric features are extracted at various scales. After a multi-scale classification with a Random Forest classifier, only the most relevant features are retained iteratively and the classification is re-run. Finally, the results are compared using standard confusion-matrix metrics.
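The final comparison step can be sketched in plain Python as follows. This is a minimal illustration, not the evaluation code used in this work: it assumes that "standard confusion matrix ratings" refers to overall accuracy and per-class precision, recall and F1 score, and the function names and toy labels are hypothetical.

```python
from collections import Counter


def confusion_matrix(true_labels, predicted_labels, classes):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(true_labels, predicted_labels))
    return [[counts[(t, p)] for p in classes] for t in classes]


def overall_accuracy(matrix):
    """Share of points whose predicted class equals the annotated class."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return correct / total if total else 0.0


def per_class_scores(matrix, classes):
    """Precision, recall and F1 for each class, derived from the matrix."""
    scores = {}
    for i, name in enumerate(classes):
        tp = matrix[i][i]
        fn = sum(matrix[i]) - tp                     # true class i, predicted otherwise
        fp = sum(row[i] for row in matrix) - tp      # predicted i, true class otherwise
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        scores[name] = {"precision": precision, "recall": recall, "f1": f1}
    return scores


if __name__ == "__main__":
    classes = ["wall", "opening", "string_course"]
    annotated = ["wall", "wall", "wall", "opening", "opening", "string_course"]
    predicted = ["wall", "wall", "opening", "opening", "opening", "wall"]
    m = confusion_matrix(annotated, predicted, classes)
    print(m, overall_accuracy(m), per_class_scores(m, classes))
```

Comparing these scores between runs with different feature subsets is one way to decide which features to retain.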

DEVELOPED METHODOLOGY
The concept behind Scan-to-CIM is to automate the cognitive operations of interpreting and retrieving survey data performed by the surveyor. Hence the first step is the digital survey campaign as the goal is to achieve and manage the CIM model at a LOD higher than 3 (envelope with openings). The digital survey conducted is predominantly terrestrial since there may be many limitations to the use of drones within urban centres. The survey is therefore conducted using active sensor technologies that allow a sufficient degree of geometric detail in the produced point cloud. Once the point cloud cleaning and registration operations have been completed, the Random Forest workflow (Grilli and Remondino, 2019) is undertaken. Facade components useful for the identification of planes and openings are annotated in the point cloud and assigned to a specific class (e.g. walls, windows, etc.). This is the only manual operation in the semantic segmentation process, together with the calibration of the parameters for instance segmentation of the semantic components. The segmented point cloud, together with the footprints on the ground in the cartography, become the input data of the parametric CIM modeling VPL code. The code produces a model with a level of detail equal to LOD 3.1 (exterior architectural level without roof geometry). The algorithm also provides for the export of lower-level models meaning that LOD 1 and LOD 2.5 (without roofs but with semantic subdivision and interior floors) are available simultaneously. The clustering of the cloud is managed within Grasshopper's VPL environment thanks to dedicated plugins. At this stage, LOD 4 is developed. Currently, this research presents two solutions for this LOD. The first is manual and consists of the two-dimensional restitution within the three-dimensional models of the lines that define the main internal walls. 
These lines are interpreted by the algorithm that subdivides them by level and extrudes them defining the mean plane of the septa. This solution is particularly effective when documents can be found that describe the internal layout (at least of the ground floor), but it is also useful in the case of complex-shaped buildings for which it is difficult to define a construction rule for the internal layout. The second solution is automated but depends on the presence of a specific building type. In fact, the literature review shows that in some building types, patterns of internal layouts are repeated very regularly. This makes it possible to construct shape grammar rules capable of predicting the internal layout of the building under analysis (Figure 2).
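The first (manual) LOD 4 solution, in which drawn partition lines are subdivided by level and extruded, can be sketched outside Grasshopper as follows. This is only an illustration under stated assumptions: the function name `extrude_partitions` is hypothetical, the same 2D line is assumed to apply to every storey, and the floor elevations are assumed to be known from the slabs of the model.

```python
def extrude_partitions(segments, floor_elevations):
    """Turn 2D partition lines into one vertical quad per storey.

    segments: list of ((x1, y1), (x2, y2)) lines drawn in plan.
    floor_elevations: ascending slab heights; consecutive pairs bound a storey.
    Returns a list of dicts holding the storey index and four 3D vertices that
    describe the mean plane of the partition on that storey.
    """
    walls = []
    storeys = zip(floor_elevations, floor_elevations[1:])
    for level, (z_bottom, z_top) in enumerate(storeys):
        for (x1, y1), (x2, y2) in segments:
            quad = [
                (x1, y1, z_bottom),
                (x2, y2, z_bottom),
                (x2, y2, z_top),
                (x1, y1, z_top),
            ]
            walls.append({"level": level, "quad": quad})
    return walls


if __name__ == "__main__":
    lines = [((0.0, 0.0), (5.0, 0.0)), ((2.5, 0.0), (2.5, 4.0))]
    for wall in extrude_partitions(lines, [0.0, 4.0, 8.0]):
        print(wall["level"], wall["quad"])
```

Keeping the storey index with each quad preserves the link between a partition and the floor it belongs to, in line with the semantic hierarchy used for the rest of the model.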

CASE STUDY
An urban block in the historical centre of Catania (Italy) was considered (Figure 3). However, the code can be replicated for several blocks by adding the corresponding point clouds. The architectural features of this aggregate (floor bands, stone-framed balcony doors, pilasters) and its construction technique, a masonry structure designed and built only for vertical loads and made of lava stone combined with brick elements, make this aggregate exemplary of the historical centre of Catania. The entire aggregate has a rectangular shape (approximately 90 x 40 m) and comprises five main building units of different typologies. The topographic urban map was available on the municipality's website, whereas the digital survey was conducted with a Leica Geosystems RTC360 terrestrial laser scanner. The 16 scans (6 mm at 10 m distance) took about 40 minutes, and a final registered point cloud of more than 30 million points was produced (Figure 4).

APPLICATION AND RESULTS
For the semantic segmentation of the surveyed point cloud, it was first necessary to define the classes: walls, openings and string courses (Figure 5). The next step was the manual annotation of some random portions of the cloud by means of CloudCompare (Figure 6). The extracted geometric features include: roughness (0.2 m), verticality (0.1 m), omnivariance (0.1 m), Z coordinate, height from ground, planarity (0.2 m), planarity (0.5 m) and mean curvature (0.2 m). The training dataset contains approximately 2 million points, i.e. less than 10% of the entire dataset. This process was necessary as there are no trained models for architectures that match the architectural features of the historical centre of Catania.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLVIII-M-2-2023 29th CIPA Symposium "Documenting, Understanding, Preserving Cultural Heritage: Humanities and Digital Technologies for Shaping the Future", 25-30 June 2023, Florence, Italy
The subsequent instance segmentation was carried out within Grasshopper thanks to the Cockroach plugin, which allows the import and processing of point clouds within Grasshopper. Then, the three clouds (walls, openings and string courses), together with the ground footprints of the block obtained from the topographic map, become the input data of the VPL algorithm for the generation of the CIM model (Figure 8). The VPL algorithm was developed considering the specific modeling requirements of the case study (no survey of summit parts: LOD 3.1), while still including steps that allow for modifications if the workflow differs from the case study. Following the general methodology, the reference for this code was the sequence of LODs of CityGML, with references to the proposed 16 LODs (Biljecki et al., 2016) wherever possible. LOD 3 is then defined: the starting point is the openings cloud, which is subdivided into various units by facade clustering.
The semantic point cloud is further subdivided with respect to the reference floor (the creation of the slabs was essential for this operation). After this, the windows are grouped by horizontal bands of openings. Following Vestartas et al. (2020) and the Cockroach clustering approach (Figure 9), classes can be grouped until the individual windows are obtained. In this way, each window is a separate cloud which is, however, semantically linked to the sequence floor, facade, building and city block. The objective of the clustering is to create a bounding box parallel to the facade and obtain a surface that can be used to trim the facade and create the opening. This process is negatively affected by the noise present between openings, which can create false bounding boxes in terms of size and location. Therefore, proper data cleaning, performed with the SOR (Statistical Outlier Removal) algorithm available as a component of Cockroach, was necessary. Moreover, obstacles present on the ground (e.g., parked cars, commercial signs, vehicular traffic, etc.) prevented many openings from being correctly identified during the survey campaign. However, given the extremely regular pattern of openings in the building types under study, it is possible to recreate such openings with a tolerance that is acceptable at the urban scale under consideration. Once the openings have been defined, they receive the same semantic structure as the clouds, thus becoming linked to the floor and the facade to which they belong (as well as to the building and the entire city block). Each opening, like every other component of the CIM model, has an index assigned to it, which also defines its semantic hierarchy. In the case of openings, the index consists of four numbers: (i) the building to which the opening belongs, (ii) the building facade, (iii) the floor level and (iv) the identifier of the specific opening (Figure 10).
Figure 10. Visualisation of coloured openings in relation to the facade they belong to. Each opening has an individual index describing its semantics thanks to DataTree structure manipulation in Grasshopper.
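The four-number index of the openings can be illustrated with a short Python sketch that mimics a DataTree path with a tuple key. It is not the Grasshopper definition used in this work: the function `index_openings`, the representation of each opening by the centroid of its cluster, and the use of slab elevations to assign the floor are assumptions made for illustration.

```python
from bisect import bisect_right


def index_openings(centroids, slab_elevations, building_id, facade_id):
    """Assign (building, facade, floor, opening) indices to opening centroids.

    centroids: list of (s, z), with s the position along the facade and z the
    height of the centroid of one opening cluster.
    slab_elevations: ascending slab heights; an opening belongs to the storey
    whose slab lies directly below its centroid.
    """
    by_floor = {}
    for s, z in centroids:
        # Index of the last slab at or below z; clamp points below the first slab.
        floor = max(bisect_right(slab_elevations, z) - 1, 0)
        by_floor.setdefault(floor, []).append((s, z))

    indexed = {}
    for floor in sorted(by_floor):
        # Within one horizontal band, number openings along the facade.
        for k, centroid in enumerate(sorted(by_floor[floor])):
            indexed[(building_id, facade_id, floor, k)] = centroid
    return indexed


if __name__ == "__main__":
    openings = [(6.0, 1.5), (2.0, 1.4), (2.1, 5.6), (6.0, 5.5)]
    for path, centroid in index_openings(openings, [0.0, 4.0, 8.0], 0, 1).items():
        print(path, centroid)
```

Because the key encodes the whole hierarchy, selecting all openings of one facade or one floor reduces to filtering on a prefix of the tuple.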
An analysis of the accuracy of the reverse modelling operations conducted so far was also carried out. This analysis is fundamental since the building footprints obtained from 2D CAD models were considered reliable for generating the CIM model. In addition, the analysis highlighted the presence of surfaces lying outside the average vertical plane. This information can contribute significantly to understanding the behaviour of the city block under seismic actions. The analysis computed the distances from the walls point cloud to the CIM model. The walls point cloud was chosen because it is the one that best represents the envelope of the city block. In particular, a 20 cm range from the facades generated for LOD 3.1 was considered, in order to ensure the most reliable analysis with respect to the selected point cloud. To evaluate the results, the point cloud was coloured with a gradient representing the distances. In addition, the percentages of points included and their average deviation in the intervals considered were calculated. A visual analysis of the results (Figure 11) shows that the largest deviations occur in the areas where most of the moldings are located. The non-linearity of the facades is also evident, especially near the north-east corner of the city block. However, the facade areas without moldings show acceptable levels of deviation (under 10 cm, Figure 12). The percentage of points included and the average deviation were calculated for two ranges (±10 cm and ±20 cm). The ±20 cm range contains 92% of the total points, with an average deviation of 2.2 cm, while the ±10 cm range contains 80% of the total points, with an average deviation of 2.1 cm. Given the urban scale of the model, these values are acceptable with respect to the CIM purpose of this research work. Finally, two approaches were taken to create the required LOD4 models.
For in-line buildings, the distribution of interior spaces is widely discussed in the literature and due to the simplicity of the configuration of these buildings, it is possible to automate the creation of interior partitions relevant for the structure. On the other hand, concerning buildings with open and closed courtyards, their internal configuration is very complex to predict using rule-based algorithms. In addition, the example configurations found in the literature appear to oversimplify in comparison to real conditions where urban morphologies always determine different configurations. For these reasons, it was decided to identify the internal partitions manually on the model. The modelling of these partitions is based on collected material where possible, while where there is no source, it was assumed based on external observation of the building configuration. At this point, the CIM model is geometrically and semantically complete. The entire pipeline produces the model shown in Figure 12 in some 20 seconds. The total size of the Grasshopper file is approximately 7 Mb. The total size of the clouds, after filtering and cleaning operations, is approximately 15 Mb. It should be noted that in addition to the LOD4 model, the pipeline also produces the same model at smaller details (LOD 3, 2, 1, 0), so it is possible to choose which model to handle according to project needs.
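The statistics of the cloud-to-model deviation analysis reported earlier in this section (share of points within a range and their average deviation) can be reproduced with a few lines of Python. This is a minimal sketch, not the tool used in this work: it assumes that signed cloud-to-model distances in metres are already available, that "average deviation" means the mean absolute distance of the points inside the range, and the name `deviation_summary` is hypothetical.

```python
def deviation_summary(distances, tolerance):
    """Share of points within +/- tolerance and their mean absolute deviation.

    distances: signed point-to-model distances (same unit as tolerance).
    Returns (share, mean_abs_deviation); both are 0.0 when no point qualifies.
    """
    inside = [abs(d) for d in distances if abs(d) <= tolerance]
    share = len(inside) / len(distances) if distances else 0.0
    mean_abs = sum(inside) / len(inside) if inside else 0.0
    return share, mean_abs


if __name__ == "__main__":
    # Toy distances in metres, not survey data.
    sample = [0.01, -0.02, 0.05, 0.15, -0.30]
    for tol in (0.10, 0.20):
        share, mean_abs = deviation_summary(sample, tol)
        print(f"+/-{tol:.2f} m: {share:.0%} of points, mean deviation {mean_abs:.3f} m")
```

Reporting both values for nested ranges, as done in this work for ±10 cm and ±20 cm, shows how much of the envelope lies close to the modelled planes and how much is affected by moldings or out-of-plane surfaces.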

DISCUSSION AND CONCLUSIONS
The work aimed to explore the potential of fast 3D surveying techniques for the creation of informed and responsive 3D city models of historical centres. The AI-based Scan-to-CIM workflow offers an innovative approach to semi-automated modeling from segmented point clouds through VPL. The reconstruction of an entire urban block can take 3 to 4 hours at most, from acquisition to generation of the CIM model. The VPL algorithm allows different models to be obtained depending on the LoD of the project. The critical points of the workflow lie in the clustering steps of the point cloud, especially those relating to openings, which are often affected by noise and classification errors due to acquisition conditions (open or closed windows, obstacles, etc.). However, this is easily solved by manual cleaning of the cloud. Regarding the segmentation with Random Forest, the application demonstrated that the approach is also valid for case studies that differ in architectural style from those already known in the literature. Furthermore, this application is among the first to apply a parametric VPL approach to this type of segmentation and data scale, leaving several paths open for future experimentation. The advantage of using VPL also lies in the real-time display of the results of the code being developed, which allows the programmer to design the code more easily through a faster trial-and-error process than that of the classic textual programming typical of applications in the field of geomatics. Although the technologies used in the proposed pipeline are sophisticated, they make the procedure semi-automatic, so that even non-experts in the field can carry out the required operations without the need for high levels of expertise. In comparison to other model-driven experiences reported in the state of the art (Pelliccio et al., 2017; Zhang et al., 2021; Avena et al., 2021; Parrinello et al., 2020), this procedure removes a lot of manual work with little use of resources.
The use of a VPL modeling environment, compared to BIM modeling environments, allows for greater flexibility, especially in export possibilities. Indeed, it is possible to switch quickly from the VPL environment to the BIM environment (Robert McNeel & Associates, 2023), and the same applies to the GIS environment or any other analysis environment involving the use of three-dimensional models (Fabricius et al., 2021). The complexity of the investigated case study helped to point out criticalities and advantages for setting up expeditious protocols for urban survey and CIM generation. The creation of a CIM supports the understanding of the historical-morphological values of the building-environmental context, and the proposed methodology can be easily replicated for larger urban blocks or entire cities, given the availability of adequate point clouds.