A WORKFLOW FOR VALIDATION AND EVALUATION OF A DYNAMIC VISUALIZATION PLATFORM FOR UNDERGROUND ANTIQUITIES

ABSTRACT: Smart Eye is a research project focused on developing and operating, in real time, an innovative system that visualizes, in an Augmented Reality (AR) environment, known monuments and finds of an archaeological site that are invisible because they are covered by the ground or by constructions. Visitors to an archaeological site will be able to observe the covered antiquities (3D models and descriptive information) on the screen of their mobile device (smartphone and/or tablet). The system integrates mobile positioning and pose estimation technologies, AR algorithms, Geographic Information Systems (GIS) and spatial databases. The AR application is implemented by IT scientists and is subsequently evaluated in the field to identify both technical problems (e.g. are files uploaded at high speed?) and problems identified by the average user of the AR application (e.g. would it be better if the button that displays the 3D model of the archaeological excavation were larger on the screen or of a different color?). The paper presents, on the one hand, the validation and evaluation protocols according to the relevant literature and, on the other, the exact evaluation methodology of the Smart Eye system. In addition, the problems identified and the way they were solved during the 1st evaluation of the application are presented, as well as the problems identified during the 2nd evaluation.


INTRODUCTION
The applications of Augmented Reality (AR) are mainly focused on Medicine, Military, Industry-Maintenance-Repair, Education, Information-Entertainment and Tourism-Culture. In the latter area in particular, applications are being developed that guide the user and provide real-time information about hotels, museums, restaurants and archaeological sites (e.g. an audio description of the history of the archaeological site, reconstructions of ancient monuments, visualizations of findings, panoramic images, etc.) (Kostaras, 2010;Bekele et al., 2018;Bruno et al., 2019;Dragoni et al., 2019;Liritzis et al., 2021;Azuma, 1997;Piekarski et al., 1999;Hollerer et al., 1999;Morandi and Tremari, 2017;Pedersen et al., 2017;Birkfellner et al., 2000;Umeda et al., 2000;Figl et al., 2001;Stetten et al., 2001;Pierdicca et al., 2015;Galatis et al., 2016;Pietieric et al., 2016;Billinghurst et al., 2001;Klopfer et al., 2002;Doil et al., 2003;Azuma et al., 2001;Pyssysalo et al., 2000;Vlahakis et al., 2002;Vlahakis et al., 2001;Vlahakis et al., 2002;Reilly et al., 2006;Schönin et al., 2006;Paelke and Sester, 2010;Eggert et al., 2014). Smart Eye is a research project on developing and operating, in real time, an innovative system that visualizes, in an Augmented Reality (AR) environment, known monuments and finds of an archaeological site that are invisible because they are covered by the ground or by constructions. Visitors to an archaeological site will be able to observe the covered antiquities (3D models and descriptive information) on the screen of their mobile device (smartphone and/or tablet) (Fig. 1-4) (Kaimaris et al., 2021a;Kaimaris et al., 2021b;Efkleidou et al., 2022). The system integrates mobile positioning and pose estimation technologies, AR algorithms, Geographic Information Systems (GIS) and spatial databases.
The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XLVIII-M-2-2023 29th CIPA Symposium "Documenting, Understanding, Preserving Cultural Heritage: Humanities and Digital Technologies for Shaping the Future", 25-30 June 2023, Florence, Italy
Figure 1. a. [...] sites (Thermi and Toumpa) of Smart Eye are located, b. The Smart Eye system's Web GIS user interface (archaeological site of Toumpa). The blue dot indicates the user's location and the red pins indicate the points of interest (AR markers).
Figure 2. a. The location of each info-point is visible through AR markers, which are part of the Smart Eye system's AR interface, b. After the selection of the info-point, the pop-up window with textual or multimedia information appears.
This paper presents the software/application evaluation methodologies found in the international literature, the evaluation process of the Smart Eye system, the problems identified during the 1st evaluation of the application and the way they were solved, as well as the problems identified during the 2nd evaluation.

APP-SOFTWARE EVALUATION
Usability is a key factor in the quality of a software/application, in the sense of learnability, understandability, operability and attractiveness to the user. In this context, an application should ensure that the user can learn it easily and quickly, that it performs well in operation (meeting the user's objectives), that its use is easy to remember, that incorrect operations are kept to a minimum (and are easy to recover from) and, finally, that the user is subjectively satisfied with the interaction (Kostaras, 2010;Bevan et al., 1991;Nielsen, 1993;Gilb, 1996;Jones, 1997). An evaluation of usability is required both during the design/development of the application and during its operation (before it is released to the public). The many methods for evaluating the usability of an application fall into two general categories: analytical and empirical (Kostaras, 2010;Dumas and Redish, 1993;Lindgaard, 1994). Analytical evaluation methods are intended to simulate end-user behavior, are carried out in the laboratory and usually do not require the participation of end-users.
This category includes the Cognitive Walkthrough method (which identifies design flaws in the application under evaluation; it is applied either in the early phases of application development, to identify and resolve design flaws, or during the completion phase), the Pluralistic Walkthrough method (which involves the application developers and usually representative users and is mainly applied during the early stages of product development) and the Heuristic Evaluation method (which focuses on the design of the graphical user interface and the flow of dialogues, messages and actions to be taken in a given task; it is applied in all phases of development and after the system has been completed) (Kostaras, 2010;Nielsen and Mack, 1994;Dumas and Redish, 1993;Nielsen and Molich, 1990;Nielsen, 1993). Empirical evaluation methods are carried out either in the laboratory or on-site, with the participation of representative users and application experts. They are divided into experimental methods and inquiry methods. Experimental evaluation methods are carried out in laboratories and record user reactions and behaviors (the results concern the identification, number and type of errors, as well as the time taken to complete a task). They are divided into the Performance Measurement method (collection of quantitative data) and the Thinking Aloud Protocol method (direct, qualitative feedback from the user, which gives a better understanding of the user's way of thinking and of the terminology used to express an idea or function, and which should then be incorporated into the design). Inquiry methods of evaluation are usually carried out in the field where the application operates and record the views of the end/ordinary users (collecting data related to the preferences, needs and specificities of the users).
They can be divided into Focus Groups (dialogue/discussion to collect user opinions and experiences after using the application) and Questionnaires (which provide useful information/answers to specific questions and use three (3) types of questions: closed type, i.e. with pre-selected answers, open type, i.e. with the possibility of developing text, and complex type, which is a combination of the two previous types) (Kostaras, 2010;Rubin, 1994;Sharp et al., 2007;Dumas and Redish, 1999;Nielsen and Landauer, 1993;Jorgensen, 1990;Monk et al., 1993;Dix et al., 2004). Of the software/application evaluation methods presented in the previous paragraphs, some can be applied to augmented reality applications only after appropriate adaptation of processes/tools/rules, while others can be applied directly (e.g. Heuristic Evaluation, Focus Groups, Questionnaires) as they are simpler to implement.

1ST EVALUATION OF SMART EYE APP
The 1st evaluation was essentially a set of successive evaluations of the application by the authors (who are involved in its development and operation), both in laboratory and real-life conditions. They were carried out from February to May 2022 and used a hybrid evaluation model based on analytical and experimental evaluation methods to simulate end-user behavior, identify design flaws in the application, identify general and specific design issues in the system screens and in the dialogue/message/action flow, identify errors and the time needed to complete each option/command, etc. Specifically, the main evaluation axes of the system were related to:
• Location and orientation management system
• Augmented reality module management system
• Management and visualization of 3D Models and content system
• End-user interface
The evaluation of the position and orientation management system focused on the mechanism that places the AR markers in their actual positions, as well as on correcting problems arising, on the one hand, from the orientation of the trenches and, on the other, from anything related to the user's position with respect to the points of interest. Two major issues were identified and subsequently resolved. The first was that the connection between the ublox and the tablet was dropping, so that the user's position relative to the AR content could not be updated. This led to the development of a code class that first checks which IP the system connects to and then manages the connection so that it does not disconnect. The second issue was that the accuracy with which the ublox determined the position was unclear. The solution was an icon that displays the position accuracy in different colors, in three classes (less than 50 cm, from 50 cm to 1 m, greater than 1 m).
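The three accuracy classes behind the icon can be illustrated with a minimal Python sketch. It is not the project's code: the function and class names, the specific colors and the assumption that the receiver reports a horizontal accuracy estimate in metres are all hypothetical; only the class limits (50 cm and 1 m) come from the text.

```python
from enum import Enum


class AccuracyClass(Enum):
    """Accuracy classes from the text; the colors are assumed for illustration."""
    HIGH = "green"     # less than 0.5 m
    MEDIUM = "orange"  # from 0.5 m to 1 m
    LOW = "red"        # greater than 1 m


def classify_accuracy(horizontal_error_m: float) -> AccuracyClass:
    """Map a reported horizontal accuracy estimate (metres) to an icon class."""
    if horizontal_error_m < 0:
        raise ValueError("accuracy estimate must be non-negative")
    if horizontal_error_m < 0.5:
        return AccuracyClass.HIGH
    if horizontal_error_m <= 1.0:
        return AccuracyClass.MEDIUM
    return AccuracyClass.LOW


if __name__ == "__main__":
    # Example readings in metres (made-up values).
    for reading in (0.06, 0.75, 2.4):
        print(reading, classify_accuracy(reading).value)
```

In this sketch both limits of the middle class are inclusive, following the wording "from 50 cm to 1 m"; a real implementation could draw the boundaries differently.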
The evaluation of the management system of the augmented reality module involved checking the augmented reality objects, both in the scene with the points of interest (AR markers) and in the scene with the trenches and their hotspots. One of the issues identified was that in some cases the AR markers appeared clustered together. The solution proposed and adopted was not to display AR markers that are more than 15 m away from the user. During the evaluation of the system for managing and visualizing the 3D models and their content, the major issue identified was the size of the 3D models, which significantly slowed down the application. The solution proposed and subsequently implemented was to reduce their size as far as possible while ensuring high visual quality of the trenches. Finally, during the evaluation of the end-user interface, the main issues identified and subsequently resolved were menu buttons that had to be moved to their correct position, buttons that were not functional and were restored, buttons that were redesigned because they did not match the mock-ups, etc.
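The 15 m display rule for AR markers described above can be sketched as a simple distance filter. This is an illustrative Python sketch, not the Smart Eye implementation: the `Marker` type, its field names and the assumption that user and marker positions are available as projected coordinates in metres are all hypothetical; only the 15 m threshold comes from the text.

```python
import math
from dataclasses import dataclass

MAX_MARKER_DISTANCE_M = 15.0  # threshold stated in the text


@dataclass(frozen=True)
class Marker:
    """Hypothetical AR marker with projected coordinates in metres."""
    name: str
    easting: float
    northing: float


def visible_markers(user_easting, user_northing, markers,
                    max_distance_m=MAX_MARKER_DISTANCE_M):
    """Return the markers that lie within max_distance_m of the user."""
    return [
        m for m in markers
        if math.hypot(m.easting - user_easting,
                      m.northing - user_northing) <= max_distance_m
    ]


if __name__ == "__main__":
    # Made-up marker positions relative to a user standing at (0, 0).
    demo = [Marker("A", 3.0, 4.0), Marker("B", 15.0, 0.0), Marker("C", 20.0, 0.0)]
    print([m.name for m in visible_markers(0.0, 0.0, demo)])
```

A planar distance is adequate at this scale if the coordinates are already in a metric projected system; with raw latitude/longitude values a geodesic distance would be needed instead.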

2ND EVALUATION OF SMART EYE APP
The 2nd evaluation, regarding user satisfaction and, in turn, the effectiveness of the application, took place at the end of June and the beginning of July 2022, with the participation of members of the Smart Eye research team and the PersLab (who do not participate in Smart Eye). The evaluation followed the inquiry method of assessment, specifically through questionnaires. The evaluators were nine (9) participants, either engineers or archaeologists, who are non-expert/amateur users (with either no technical expertise or no archaeological/heritage knowledge) and, therefore, did not participate in the development of the application (coding, software design, hardware development, etc.). The evaluation was conducted as follows: every time the participants used a function of the application (interface, menus, capabilities), they filled in two questionnaires (Tab. 1). The questionnaires referred to the application's efficiency and effectiveness and to each user's individual experience. Participants were able to complete these questionnaires while they were engaged in specific tasks.

Questionnaire 1
Questions on the functionality of the application related to the map interface
1. How easy is it to navigate the map interface?
2. How easy did you think the change of the backgrounds was?
3. Do you find that the legend helps to search for information?
4. Is it easy to navigate the map interface using the corresponding tools?
5. Is it easy to search the map interface using the corresponding tool?
6. Do you find that the pop-up information window is clearly visible and provides the necessary information?
Questions on the functionality of the application related to the augmented reality environment
7. How easy is it to navigate the augmented reality environment?
8. How easy is it to identify augmented reality markers in space?
9. How easily accessible are the 3D models and information of points of interest?
10. Is there a correct representation of the 3D models?
11. Is the hot spot content of the 3D models functionally accessible?
12. Is it easy to transition from one excavation phase to another?
13. Is the way of informing the user about the accuracy of his/her position in space clearly visible?
14. Are the tools for transitioning from the augmented reality environment to the map environment and vice versa easy to use?

Questionnaire 2
Questions about the user experience
1. How easy is it to navigate the application?
2. How easy is it to navigate the points of interest?
3. How clear is the functionality of the app?
4. Does the app meet the needs of navigating the archaeological site?
5. Did you notice any problems with the data displayed in the app?
6. Did you notice any problems in the functionality of the application?
7. Did you notice any problems with the graphical interface of the application?
8. Overall, how satisfied are you with the application for navigating the archaeological site using augmented reality?

Aiming to document different use behaviours, the participants were instructed to use the Smart Eye system freely during their exploration. As a result, each participant spent a different amount of time and interest on each feature and area, according to his or her special interests. These differences in interests, speed and movement highlighted that the Smart Eye app could not cover the users' needs because of the low accuracy of the devices' built-in sensors (direction/angle, location). In particular, the location accuracy range was six to seven, whilst errors appeared in the orientation sensors of the mobile devices. The combined result of these two issues was that the AR models were 'moving' (not on a great scale, but noticeably) in different directions as visitors moved around. The use of ublox technology and the development and improvement of the new software and hardware for an external unit solved the location inaccuracies in the field during the evaluation of the system. This technology improved the location accuracy error to ca. 5-7 cm. The 'movement' of AR models was almost eliminated, while, in some cases, the orientation problem of the trenches still remained. Apart from the above, less important problems, such as the small font size of the informational texts at info-points and hotspots and some distortions in the images of artefacts and archaeological features, were easily solved. One in three users also stated that it would be better if the 3D models were clearer/sharper. To address this, the team is currently testing different brightness and illumination settings for each device's screen under ambient light conditions, so that the contrast is the most suitable and the 3D rendering (and not the resolution) of the models is improved.
Overall, the users' experience of the Smart Eye system was positive, with particular comments on its efficiency and effectiveness, and users felt that their visit to the archaeological sites was aided and improved. Only one participant mentioned that the user might need to be familiarized with the app and the AR environment at some point during the use of the system. In general, the history of the archaeological sites and their remains were revealed to the participants on the screen during their visit to each site. The Smart Eye research team is currently evaluating the results, as the app must be updated and improved so that its evaluation by the general public in Spring 2023 (late April or early May) is successful and positive.

CONCLUSIONS
Evaluating the usability of an augmented reality application is a process that can only benefit its creators. It should be carried out both during design and development and during operation (before the application is released to the general public), in a systematic and organized manner, both in the laboratory and in the field/area where the application is used. Depending on the stage of prototype development, evaluations should involve everyone from the staff implementing the building blocks of the application to the ordinary user. There are several methods (and procedures) for evaluating software/applications, and in the case of augmented reality applications these methodologies can be adapted, some with more difficulty and some more easily.
In this paper, two consecutive evaluations of the Smart Eye application were presented, which made it possible, on the one hand, to identify issues and, on the other, to address and correct them systematically. In addition, the final evaluation of the system, which will take place at the end of April or early May 2023, is another valuable opportunity to identify issues that may remain, so that they can be addressed and resolved in time before the app is released to the general public.