Precision and Accuracy Parameters in Structured Light 3-D Scanning

Structured light systems are popular in part because they can be constructed from off-the-shelf, low-cost components. In this paper we quantitatively show how common design parameters affect precision and accuracy in such systems, supplying a much-needed guide for practitioners. Our quantitative measure is the established VDI/VDE 2634 (Part 2) guideline, using precision-made calibration artifacts. Experiments are performed on our own structured light setup, consisting of two cameras and a projector. We focus on the influence of calibration design parameters, the calibration procedure and the encoding strategy, and present our findings. Finally, we compare our setup to a state-of-the-art metrology-grade commercial scanner. Our results show that comparable, and in some cases better, results can be obtained using the parameter settings determined in this study.


INTRODUCTION
Structured Light (SL) systems enable robust, high-quality capture of 3D geometry and are actively used in several fields. These systems can be constructed using commercial off-the-shelf (COTS) hardware, making them accessible and affordable. The obtainable accuracy and precision of such systems vary considerably and depend mainly on several design parameters. The influence of these parameters has not been studied extensively in the literature. To our knowledge, no previous study has systematically investigated the effect of common parameter choices on the final result and quantified it using an established standard.
To address this gap, we investigate how common design choices influence precision and accuracy. Our analysis is based on our own active stereo-vision setup consisting of two industrial cameras and a consumer projector. We empirically determine the parameter selection that yields maximum performance, and quantify it using the VDI/VDE 2634 (Part 2) guideline. Finally, we compare our results to a commercial metrology-grade scanner (GOM ATOS III Triple Scan) as a benchmark against the state of the art, and obtain comparable results. Throughout this study we seek to employ widely available and accepted methods and models used in such systems to obtain easily reproducible results.
The contribution of this paper lies in the attempt to quantitatively answer the following questions:
• What calibration parameters should be included in the calibration procedure?
• What angular range of observations is required in the calibration procedure?
• How many observations are required for calibration?
• Which SL encoding strategy is the overall best performer?
We believe this to be valuable information for practitioners wanting to build their own system, e.g. as part of research projects or industrial implementations.
This paper is structured as follows. Section 2 covers related work. Section 3 gives an overview of our experimental setup. Sections 4, 5 and 6 cover our investigations on calibration parameters, calibration observations and encoding strategies, respectively. In Section 7 we compare our system to a commercial system, and we conclude in Section 8.

RELATED WORK
Much work has been devoted to the field of SL systems, e.g. (24, 7, 6, 11, 30). These contributions have mostly dealt with the methodological development of such systems, whereas less focus has been placed on quantitative accuracy and precision analysis. One of the most important factors with respect to accuracy is system calibration. While recent focus has been placed on projector-camera calibration (32, 18, 17), we here consider an active stereo vision setup (14, 31, 33), without projector calibration. Precision is considered to be mostly dependent on the encoding strategy. A vast selection of methods has been proposed; see (26, 10, 8) for recent surveys. While many of these methods aim to reduce the number of patterns, the number of outliers and the computational complexity, less focus has been placed on precision. Here, we compare selected encoding strategies from a precision and accuracy perspective.
Characterising SL systems in terms of accuracy is a challenging and ongoing problem which, despite its relevance, has seen only a few published guidelines and standards. The only currently published standard is the German VDI/VDE 2634 Part 2 guideline (1, 13), Optical 3-D measuring systems - Optical systems based on area scanning. This guideline aims to capture the complex nature of such a system, using a number of length and shape measurements throughout the scanning volume. Researchers have already accepted this guideline for the evaluation of 3D scanning systems (4, 20, 3, 2). We here argue that the guideline is lacking to some extent. Firstly, because it relies on low frequency artifacts, it fails to capture the frequency response characteristics of SL systems. Secondly, the artifacts are optically ideal for SL scanning. Therefore, results only indicate 'best case' performance for that particular material. The standard is, however, well suited for relative measures, e.g. for acceptance testing and benchmarking purposes.
Limited work has been conducted on SL parameter investigations and their effect on overall performance (19). However, to the authors' knowledge, no quantitative evaluation has been performed on how the different SL parameters directly influence the final results as defined by the VDI/VDE guideline.

Figure 1: Our structured light system setup with two high-resolution industrial cameras, a Full HD LED projector and a rotation stage mounted on a rigid aluminum mount. Specifications are given in Table 1.

The International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences, Volume XL-5/W8, 2016. LowCost3D (LC3D), Sensors, Algorithms, Applications, 1-2 December 2015, Berlin, Germany. This contribution has been peer-reviewed. doi:10.5194/isprsarchives-XL-5-W8-7-2016

EXPERIMENTAL SETUP
Our structured light setup, as seen in Figure 1, consists of two industrial cameras, a projector and a rotation stage; specifications are given in Table 1. Figure 2 shows the calibration plate and Figure 3 shows the VDI/VDE 2634 (2) measurement artifacts used during this study. The artifacts consist of a flat, white-painted aluminum plate and two ceramic spheres separated by a known distance. Both artifacts have been measured according to procedure T3-01 of ISM3D using a coordinate measurement machine (CMM), and traceability has been established through the virtual CMM method. Nominal values and their associated uncertainties are listed in Tables 2 and 3.
Following the VDI/VDE 2634 (2), we use four quality parameters:
• Probing error form, PF, which describes the radial range of residuals from a least squares fit sphere with up to 0.3% of the worst points rejected.
• Probing error size, PS, measuring the signed deviation between the least squares fit diameter and the nominal diameter. Again, up to 0.3% of the worst points are rejected.
• Sphere distance error, SD, denoting the signed difference between the estimated and nominal distance between the spheres. Up to 0.3% of the worst points are rejected.
• Flatness, F, which is the range of residuals from the measured points to a least squares fitted plane, with up to 0.3% of the worst points rejected.
PF and PS are measured using one of the spheres at 10 positions within the system's FOV. SD is measured with the ball-bar at 7 positions, while F is determined using the flat at 6 positions. These positions are illustrated in Figure 4.
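As an illustration of how these quality parameters can be evaluated once a sphere or plane has been fitted, the following Python sketch computes the trimmed residual range used for PF and F, together with the signed errors PS and SD. The function names are hypothetical, and the sketch assumes that the residuals come from an existing least squares fit; a fuller implementation would typically repeat the fit after rejecting points.

```python
import math


def trimmed_range(residuals, reject_fraction=0.003):
    """Range (max - min) of signed residuals after discarding up to
    `reject_fraction` of the points with the largest absolute residual.
    Applies to PF (sphere residuals) and F (plane residuals)."""
    n_reject = int(math.floor(reject_fraction * len(residuals)))
    kept = sorted(residuals, key=abs)  # smallest |residual| first
    if n_reject:
        kept = kept[:-n_reject]        # drop the worst points
    return max(kept) - min(kept)


def probing_error_size(fitted_diameter, nominal_diameter):
    """PS: signed deviation of the fitted sphere diameter from nominal."""
    return fitted_diameter - nominal_diameter


def sphere_distance_error(center_a, center_b, nominal_distance):
    """SD: signed difference between measured and nominal centre distance."""
    return math.dist(center_a, center_b) - nominal_distance
```

With 1001 residuals, for example, at most 3 points are rejected, so a single gross outlier no longer dominates the reported range.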

CALIBRATION PARAMETERS
The industry standard models that are essential for calibration of an SL system contain several parameters. Which of these parameters to include in the calibration process is unclear. To solve for the calibration parameters we employ the commonly used method proposed by Zhang (33). We use the 4 parameter pinhole model with the addition of up to five lens distortion parameters. Hence, the camera is modeled by the focal lengths (fx, fy), the principal point (cx, cy) and the lens distortion coefficients.
The use of a non-unit aspect ratio (i.e., fx ≠ fy) makes it possible to model non-square pixels and/or capture compound non-uniformity in the lens. Likewise, estimation of the principal point, (cx, cy), makes it possible to describe cameras in which the principal ray does not strike the image sensor in its exact center. With quality components such as ours, we would expect these parameters to be unnecessary. At the same time, the inclusion of these parameters increases the risk of false estimation, numerical instability and non-convergence. In fact, it has been shown that principal point estimation is especially prone to misinterpretation, and that the parameter can often be neglected in cameras of medium to long focal length (25).
Radial lens distortion is modeled using three distortion coefficients (k1, k2, k3), and tangential distortion using two parameters (p1, p2). This five parameter "Brown-Conrady" model is widely accepted (5).
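For reference, the conventional form of this camera and distortion model can be written as follows. This is a sketch assuming the widely used formulation (as implemented, for example, in OpenCV). The symbols $(X, Y, Z)$ for a point in camera coordinates, $(x, y)$ for normalized image coordinates, $(x_d, y_d)$ for distorted coordinates, $(u, v)$ for pixel coordinates and $r$ for the radial distance are notation introduced here for illustration.

```latex
\begin{align}
x &= X/Z, \qquad y = Y/Z, \qquad r^2 = x^2 + y^2, \\
x_d &= x\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + 2 p_1 x y + p_2 (r^2 + 2 x^2), \\
y_d &= y\,(1 + k_1 r^2 + k_2 r^4 + k_3 r^6) + p_1 (r^2 + 2 y^2) + 2 p_2 x y, \\
u &= f_x\, x_d + c_x, \qquad v = f_y\, y_d + c_y .
\end{align}
```

In this form, fixing the aspect ratio corresponds to $f_x = f_y$, fixing the principal point corresponds to holding $(c_x, c_y)$ at the image center, and disabling a distortion term corresponds to setting its coefficient to zero.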
The stereo relationship between cameras is described using three rotations and three translations. Due to weak inter-dependencies, the calibration can be performed individually per camera, followed by stereo calibration. Still, the risk of over-fitting and converging to local minima remains, and therefore higher order distortion parameters are used only when considered relevant. To investigate these factors, we calibrate using 8 different configurations of parameters and evaluate by means of the VDI/VDE quality parameters. Each calibration is performed using 81 observations of the calibration board, evenly sampled in the range from −40 to 40 degrees relative to the baseline.
Figure 5 shows performance results for the different calibration parameter configurations. The baseline setting generally yields sub-millimeter results. The free aspect ratio (fx ≠ fy) and principal point estimation degrade the performance relative to the baseline.
These results show that in a typical setup, omitting the principal point estimation makes calibration significantly more stable. Enabling the first two distortion coefficients yields a significant improvement. This is especially noticeable in the sphere distance metric, SD, which is a measure of accuracy.
No significant improvement is obtained with additional distortion parameters.

Conclusion
Given our setup, only the k1 and k2 distortion coefficients are required for accurate calibration. The inclusion of both aspect ratio and principal point estimation makes the calibration procedure unstable, and considerably better results are obtained without them. With their removal, we see consistently low values of PF, SD and F, while the estimation of sphere sizes (PS) is biased towards positive values. This indicates that one should carefully consider which camera model is used.

CALIBRATION OBSERVATIONS
An important question in calibration is in which poses the calibration board needs to be observed. Viewing the calibration board at very shallow angles means higher uncertainty in point localization. In addition, the effect of non-planarity becomes larger. However, it is necessary to observe some degree of foreshortening for focal length estimation (33).
In this section we attempt to determine the optimal angular range of observations relative to the baseline. We tested 8 different ranges, starting from −5° to 5° relative to the baseline and ending at −40° to 40°. For each range, we evenly sample 11 images of the calibration board. For the rotations performed, most foreshortening will be observed around the rotation axis, thus constraining the focal length parameter fx well. With a fixed aspect ratio, this in turn also constrains fy.
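The observation schedule described above can be sketched in a few lines of Python. The function name is hypothetical, and the 5 degree increment between the 8 tested ranges is an assumption inferred from the stated end points (±5° and ±40°); it is not stated explicitly.

```python
def observation_angles(half_range_deg, n_observations):
    """Evenly spaced board angles (degrees) from -half_range to +half_range."""
    if n_observations < 2:
        raise ValueError("need at least two observations to span a range")
    step = 2.0 * half_range_deg / (n_observations - 1)
    return [-half_range_deg + i * step for i in range(n_observations)]


# Assumed: the 8 tested ranges grow in 5 degree steps, from +/-5 to +/-40.
half_ranges = [5 * k for k in range(1, 9)]
schedules = {h: observation_angles(h, 11) for h in half_ranges}
```

Under these assumptions, the ±40° range with 11 observations gives one image every 8°, whereas the 81 observations used in the previous section correspond to one image per degree.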
The results from the experiment can be seen in Figure 6. Increased foreshortening affects the sphere distance parameter (SD) positively, indicating better calibration. In general, the results are quite comparable for all ranges. Comparing to Figure 5, it is also apparent that using 11 observations and 81 observations ranging from −40° to 40° yields similar results.

Conclusion
In terms of accuracy, it is slightly beneficial to use a large angular range during calibration. However, even a smaller amount of foreshortening is sufficient to accurately estimate parameters. We opt for the 80° range. Furthermore, the difference between 11 and 81 observations is negligible; thus, for the sake of practicality, we proceed by using the former.

ENCODING STRATEGIES
The encoding strategy of a structured light system determines how correspondences are found, and can be expected to be a major factor in system precision. We identify three main categories of algorithms which are relevant in this setting:
• Fourier methods, prominently phase shifting (PS) methods (28).
• Binary boundary based methods, such as the Gray code method.
• Line shifting (12), which is the same principle underlying triangulation based laser line scanners, with multiple lines sweeping the scene simultaneously.
Phase Shift (PS) based methods encode the scene by means of a series of shifted sinusoidal patterns. The phase is then recovered and matched between cameras (16). The advantage is that the scene can be densely encoded using as few as 3 patterns, and more can naturally be added to increase precision. For correct encoding, the projector-camera system should have a close to linear intensity response. The frequency of the sinusoids can also be altered, with higher frequencies yielding higher precision at the cost of phase ambiguities, which then have to be "unwrapped" using additional patterns. Our PS implementation performs 32 steps at the highest frequency sinusoidal pattern (period 19.2 px), and unwraps the resulting phase using two sets of lower frequency patterns (34). The total number of projected patterns is 64.
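The phase recovery described above can be sketched as follows for a single pixel. This is a minimal illustration assuming ideal sinusoidal intensities and a linear intensity response. The function names, the pattern offset and amplitude, and the simple single-period unwrapping cue are assumptions made for the example; the unwrapping shown is not necessarily the method of (34).

```python
import math


def make_patterns_at_pixel(phase, n_steps, offset=0.5, amplitude=0.4):
    """Ideal intensities seen at one pixel for n_steps shifted sinusoids."""
    return [offset + amplitude * math.cos(phase - 2.0 * math.pi * k / n_steps)
            for k in range(n_steps)]


def wrapped_phase(intensities):
    """Recover the wrapped phase in [0, 2*pi) from N >= 3 shifted samples."""
    n = len(intensities)
    if n < 3:
        raise ValueError("phase shifting needs at least 3 patterns")
    s = sum(v * math.sin(2.0 * math.pi * k / n) for k, v in enumerate(intensities))
    c = sum(v * math.cos(2.0 * math.pi * k / n) for k, v in enumerate(intensities))
    return math.atan2(s, c) % (2.0 * math.pi)


def unwrap_with_coarse(phase_fine, phase_coarse, n_periods):
    """Resolve the period index of a fine phase using a coarse phase that
    spans [0, 2*pi) exactly once across the encoded width (assumed cue)."""
    k = round((n_periods * phase_coarse - phase_fine) / (2.0 * math.pi))
    return phase_fine + 2.0 * math.pi * k
```

Because the offset and amplitude cancel in the ratio of the two sums, the recovered phase is insensitive to uniform brightness changes. Assuming the pattern spans a 1920 column projector, a period of 19.2 px corresponds to 100 periods, which is the ambiguity the unwrapping step has to resolve.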
Binary boundary based methods, such as the Gray code method, encode scene points directly by means of binary codes, which are decoded and matched in the cameras. These methods are flexible in the number of patterns, and allow for the natural addition of redundant information, which can then be used to reject outliers. Feature point locations can be estimated with sub-pixel accuracy. Our Gray code implementation uniquely encodes every other column in projector space, and employs patterns and their inverses for added robustness. Boundaries are detected at the intersection of a pattern and its inverse with sub-pixel accuracy. The total number of patterns is 20.
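The following Python sketch illustrates the ingredients of such a Gray code scheme: the binary-reflected code itself, the resulting pattern count, and a sub-pixel boundary estimate from a pattern and its inverse. The function names are hypothetical, linear interpolation is one common choice for the intersection and is assumed here, and the projector width of 1920 columns is an assumption based on the Full HD projector.

```python
def gray_encode(n):
    """Binary-reflected Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def gray_decode(g):
    """Inverse of gray_encode."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


def pattern_count(projector_columns, column_step=2, use_inverse=True):
    """Patterns needed to label every `column_step`-th projector column."""
    n_codes = -(-projector_columns // column_step)  # ceiling division
    n_bits = max(1, (n_codes - 1).bit_length())
    return n_bits * (2 if use_inverse else 1)


def subpixel_boundary(x0, diff0, x1, diff1):
    """Position where (pattern - inverse) changes sign between the
    neighbouring pixels x0 and x1, found by linear interpolation."""
    if diff0 * diff1 > 0:
        raise ValueError("no sign change between the two samples")
    if diff0 == diff1:  # both differences are zero
        return 0.5 * (x0 + x1)
    return x0 + (x1 - x0) * diff0 / (diff0 - diff1)
```

Under the stated assumptions, `pattern_count(1920)` returns 20: every other column of 1920 gives 960 codes, which need 10 bits, and each bit pattern is projected together with its inverse.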
Line shifting can be performed with a single laser line as the projection source; with a digital projector, however, many lines can be projected in parallel. Correspondence points are found at the peak of the stripes. Several methods exist for sub-pixel peak detection (29). For Gühring's line shifting method, we employ Gray codes to partition the encoded space into 256 unique regions. Within each of these regions, a single projector line then walks across the region in 8 steps, resulting in a total of 28 patterns. The peak of each single line is determined as the zero crossing of the first derivative, using a Gaussian derivative filter of size 5 px.
The SL system is calibrated with the previously determined angular range of 80 degrees and 11 positions. Furthermore, we use the (k1, k2) parameter selection, as previously identified. A comparison of the VDI/VDE quality parameter results for these three encoding strategies is shown in Figure 7. A summary of the results is given in Table 4.
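The stripe peak detection used for line shifting can be illustrated on a single image row. The sketch below convolves the row with a 5-tap Gaussian derivative kernel and interpolates the positive-to-negative zero crossing linearly. The function names are hypothetical, and the kernel standard deviation of 1 px is an assumption, since only the filter size is specified.

```python
import math


def gaussian_derivative_kernel(size=5, sigma=1.0):
    """Odd-length kernel sampling the first derivative of a Gaussian."""
    half = size // 2
    return [-x / sigma ** 2 * math.exp(-x * x / (2.0 * sigma ** 2))
            for x in range(-half, half + 1)]


def stripe_peaks(signal, size=5, sigma=1.0):
    """Sub-pixel stripe peaks: zero crossings (from + to -) of the
    Gaussian-smoothed first derivative of a 1-D intensity profile."""
    kernel = gaussian_derivative_kernel(size, sigma)
    half = size // 2
    n = len(signal)
    # Convolution restricted to samples where the kernel fits entirely.
    deriv = {i: sum(signal[i - j] * kernel[j + half]
                    for j in range(-half, half + 1))
             for i in range(half, n - half)}
    peaks = []
    for i in range(half, n - half - 1):
        d0, d1 = deriv[i], deriv[i + 1]
        if d0 > 0.0 and d1 <= 0.0:
            peaks.append(i + d0 / (d0 - d1))  # linear interpolation
    return peaks
```

Intensity valleys between stripes produce negative-to-positive crossings and are therefore ignored. In practice a threshold on stripe contrast would also be needed to suppress peaks caused by noise; that step is omitted here.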
The precision of the different strategies can be indirectly estimated from the spherical form parameter PF. This is because the calibration spheres cover only a small part of the scanning volume, whereas the flat plane occupies a substantial part. Flat plane measurements will thus be more affected by the quality of calibration and lens distortion correction, both of which directly affect precision and accuracy. From the results we confirm that the PS method is more tolerant to depth of field limitations, with near and far positions showing no signs of degradation. The PS method also shows superior precision characteristics in the PF parameter. For the probing error PS, there is a clear bias in both the phase shifting and Gray code methods, whereas Line Shift appears bias-free. Figure 8 illustrates sphere fitting results for the three methods. Here it is seen that the phase shifting and Gray code methods have systematic positive residuals towards the lateral edge in the horizontal (encoding) direction. This in turn leads to a slight overestimation of sphere diameters.
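The sphere fitting that underlies such residual plots can be illustrated with an algebraic least squares fit. This is a sketch only: it uses the linearised (algebraic) formulation, which is not necessarily the fitting procedure used for the reported results, and all function names are hypothetical.

```python
import math


def solve_linear(a, b):
    """Solve a small dense system a x = b by Gaussian elimination
    with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (m[r][n] - sum(m[r][c] * x[c] for c in range(r + 1, n))) / m[r][r]
    return x


def fit_sphere(points):
    """Algebraic least-squares sphere fit; returns (center, radius).
    Solves |p|^2 = 2 c.p + d in the least-squares sense, d = r^2 - |c|^2."""
    n = len(points)
    mean = [sum(p[i] for p in points) / n for i in range(3)]
    # Centre the data first to keep the normal equations well conditioned.
    shifted = [(x - mean[0], y - mean[1], z - mean[2]) for x, y, z in points]
    rows = [(2.0 * x, 2.0 * y, 2.0 * z, 1.0) for x, y, z in shifted]
    rhs = [x * x + y * y + z * z for x, y, z in shifted]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(4)] for i in range(4)]
    atb = [sum(r[i] * v for r, v in zip(rows, rhs)) for i in range(4)]
    cx, cy, cz, d = solve_linear(ata, atb)
    radius = math.sqrt(d + cx * cx + cy * cy + cz * cz)
    return (cx + mean[0], cy + mean[1], cz + mean[2]), radius


def radial_residuals(points, center, radius):
    """Signed distance of each point from the fitted sphere surface."""
    return [math.dist(p, center) - radius for p in points]
```

Positive radial residuals concentrated at the lateral edges of the visible cap pull the fitted radius upwards, which is consistent with the positive bias in PS described above.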

Conclusion
In the quality parameters PF (Spherical form) and F (Flatness), the PS encoding strategy provides the best results in all artifact positions.The sphere diameter is biased positively in the PS and Gray methods, while it appears unbiased with Line Shift.Overall it appears that the PS method is the best performing method.

COMPARISON TO METROLOGY SCANNER
Finally, we have compared our system to a high-end commercial scanner (GOM ATOS III Triple Scan), which has a FOV of 240 × 320 × 240 mm, similar to the FOV of our system (see Table 1).
In this experiment we used the PS algorithm, and calibrated as in the preceding experiment. Again, the quality measures defined by the VDI/VDE 2634 (2) were measured. Results are shown in Figure 9. The results show that our system is more precise in terms of PF and exhibits lower variance. However, a bias is present in PS, whereas the commercial system appears free of such bias. It is apparent from the sphere spacing term (SD) that the commercial system achieves better accuracy.
Interestingly, the GOM scanner shows significantly better results in the flatness error metric, F, than in the sphere form metric, PF, where one would expect similar performance (as seen in our system). The reasons for this can only be speculated upon, as the scanning procedure and reconstruction are proprietary. That being said, some form of smoothing favoring planar surfaces might be at work.

Conclusion
The commercial scanner yields unbiased sphere diameter results and achieves higher accuracy. Since inaccuracy is a deterministic error component, this indicates that a custom calibration method could be advantageous.

DISCUSSION AND CONCLUSION
In this paper we have shown through quantitative analysis how the most common parameters within structured light systems affect the overall performance. Our quantitative measure is the accepted VDI/VDE 2634 (2) guideline, which captures critical parameters in terms of precision and accuracy. We performed a series of experiments on our experimental setup using precision-made calibration artifacts. We started by investigating calibration parameters as defined by the most commonly used models, and then determined the angular foreshortening and the number of observations required to yield the best results. We proceeded by comparing three commonly used encoding algorithms in order to determine the best method. Finally, we compared our setup to a metrology-grade commercial scanner, using the previously determined parameters. Our results show that comparable, and in some cases better, results can be obtained using standard methods and models if care is taken in the choice of parameters. We expect these findings to be of help to practitioners wanting to build their own SL systems.
Even though the VDI/VDE guideline indirectly captures some of the error sources, such as depth of field, calibration performance and acquisition noise, it is lacking to some extent. The suggested calibration shapes consist of low frequency features; thus a Gaussian filtering operation on the measured point cloud will yield better results for some of the parameters. Although the guideline states that all filtering operations must be noted, in many cases such filtering operations are required by, or inherent in, the triangulation algorithms at a pre-point-cloud level. The use of such filtering will affect the frequency response of the system, where low-pass operations will limit the system's capability of resolving high frequency features.
In order to better analyze a system's performance, a frequency analysis must be conducted, indicating whether any such smoothing is taking place. Such a frequency response characterization calls for an additional calibration artifact in the form of a high frequency feature. In addition, results from the VDI/VDE guideline only provide a quantitative evaluation of the artifacts used. Thus the results cannot be transferred to other, less optically ideal materials.
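The effect of smoothing on low and high frequency content can be illustrated with a one-dimensional height profile. The sketch below is a simplified stand-in for filtering a point cloud: the function names, the kernel width and the chosen test frequencies are assumptions for illustration, not part of the guideline or of our processing chain.

```python
import math


def gaussian_kernel(sigma, radius):
    """Normalised, truncated Gaussian weights of length 2 * radius + 1."""
    weights = [math.exp(-k * k / (2.0 * sigma * sigma))
               for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth(profile, kernel):
    """Filter the interior of a 1-D height profile (edge samples dropped)."""
    r = len(kernel) // 2
    return [sum(kernel[j + r] * profile[i + j] for j in range(-r, r + 1))
            for i in range(r, len(profile) - r)]


def peak_to_valley(profile):
    """Range of a profile, analogous to a form or flatness range."""
    return max(profile) - min(profile)


kernel = gaussian_kernel(sigma=2.0, radius=6)
coarse = [math.sin(2.0 * math.pi * i / 200) for i in range(400)]  # slow shape
fine = [math.sin(2.0 * math.pi * i / 4) for i in range(400)]      # fine detail
coarse_kept = peak_to_valley(smooth(coarse, kernel)) / peak_to_valley(coarse)
fine_kept = peak_to_valley(smooth(fine, kernel)) / peak_to_valley(fine)
```

Here `coarse_kept` stays close to 1 while `fine_kept` drops to a small fraction, so a filter of this kind would hardly change measurements of smooth artifacts while removing most of the fine surface detail. This is why a high frequency artifact is needed to reveal such smoothing.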

Figure 2: The calibration plate used in this study, sitting on a rigid wooden support frame. It is manufactured from 400 × 280 × 12 mm unhardened float glass. A high resolution printed checkerboard is glued onto the flat surface.

Figure 4: Measurement positions used throughout the paper. The outer frame represents the FOV, as seen from the cameras (Position 1 being closest). Left: ball-bar positions used for the sphere distance SD. Right: positions of the flat used for the flatness error metric, F.

Figure 5: Results obtained with different camera and lens models. Colors represent different positions of the dumbbell or flat artifact according to Fig. 4 (Position 1 being the leftmost bar). Baseline denotes the pinhole model with fixed aspect ratio, fixed principal point and without distortion parameters. "ar" adds aspect ratio estimation and "pp" adds principal point estimation. The other groups show results when different combinations of lens parameters are used. From this we see significant improvements when lens distortion parameters are added. The inclusion of both aspect ratio and principal point estimation makes the calibration procedure unstable, and considerably better results are obtained without them.

Figure 6: Results obtained with different angular ranges of the calibration plate relative to the camera baseline. Colors represent different positions of the artifacts according to Fig. 4 (Position 1 being the leftmost bar). We see that in terms of accuracy, it is slightly beneficial to use a large angular range during calibration. However, even a smaller amount of foreshortening is sufficient to accurately estimate parameters.

Table 1: Technical specifications of our structured light setup.

Table 2: Specification of the dumbbell used for our experiments.

Table 3: Specification of the flat plane used for our experiments.

Table 4: Interpretation of the algorithm performance.

These classes of encoding strategies have fundamentally different error characteristics. The binary and line shifting methods may be very robust against point outliers, but PS patterns are often less affected by projector and camera blur due to their low-frequency nature.