ROBUST VISION-BASED POSE ESTIMATION ALGORITHM FOR A UAV WITH KNOWN GRAVITY VECTOR

Accurate estimation of camera external orientation with respect to a known object is one of the central problems in photogrammetry and computer vision. In recent years this problem has been gaining increasing attention in the field of UAV autonomous flight. Such an application requires real-time performance and robustness of the external orientation estimation algorithm. The accuracy of the solution strongly depends on the number of reference points visible in the given image. The problem only has an analytical solution if 3 or more reference points are visible. However, in limited visibility conditions it is often necessary to perform external orientation with only 2 visible reference points. In such a case the solution can be found if the direction of the gravity vector in the camera coordinate system is known. A number of algorithms for external orientation estimation for the case of 2 known reference points and a gravity vector have been developed to date. Most of these algorithms provide an analytical solution in the form of a polynomial equation that is subject to large errors in the case of complex reference point configurations. This paper is focused on the development of a new computationally efficient and robust algorithm for external orientation based on the positions of 2 known reference points and a gravity vector. The implementation of the algorithm for guidance of a Parrot AR.Drone 2.0 micro-UAV is discussed. The experimental evaluation of the algorithm demonstrated its computational efficiency and its robustness against errors in reference point positions and against complex configurations.


INTRODUCTION
The problem of external orientation estimation is one of the central problems in photogrammetry and computer vision. It can be stated as the estimation of the six parameters of external orientation that define the spatial position and orientation of the camera coordinate system with respect to the global object coordinate system (Luhmann et al., 2014). This problem is commonly known as the problem of camera calibration. In the computer vision community this problem is sometimes referred to as the Perspective-n-Point (PnP) problem, where n represents the number of available reference points.

Related work
It was shown by (Fischler and Bolles, 1981) that at least 3 reference points are required to find a solution to the pose estimation problem. However, a direct solution of the P3P problem may not be found for weak reference point configurations. A direct solution of the P4P problem for an arbitrary reference point configuration was proposed in (Abidi and Chandra, 1995). A large number of algorithms have been developed for the solution of the PnP problem with n ≥ 4. Such algorithms are more robust against noise in the detected reference point positions but require an iterative approach to solve an overdetermined system of equations. Hence the computational complexity of state-of-the-art PnP algorithms is usually $O(n^5)$.
In (Lepetit et al., 2008) a non-iterative robust EPnP algorithm for n ≥ 4 was proposed. Due to the non-iterative approach the computational complexity of EPnP grows linearly with respect to n. A different approach to the PnP problem, utilizing an error metric based on collinearity in object space (as opposed to image space), is proposed in (Lu et al., 2000). The LHM iterative algorithm is based on this approach and provides a fast and globally convergent solution for PnP problems with n ≥ 3.
All methods presented above perform the pose estimation using only the information about the reference point positions in the object space and their projections in the image space. However, even a human operator may have difficulty interpreting a scene in an image that was captured with an unnatural camera orientation (i.e. when the zenith direction or local gravity vector does not coincide with the vertical axis). In contrast, a human usually has no difficulty interpreting a scene perceived directly by vision, because the human visual system has a hint for estimating the head rotation: the direction of the local gravity vector is provided by the vestibular system.
Hence if the direction of the gravity vector with respect to the camera is known, the pose estimation problem can be simplified. A closed-form solution to the pose estimation problem with known vertical direction was proposed by (Kukelova et al., 2011). The uP2P algorithm finds the solution for the camera pose and an unknown rotation angle about a single axis using collinearity equations. To simplify the equations, trigonometric functions in the rotation matrix are eliminated using a substitution. The resulting system of equations can be reduced to a single polynomial of degree two in one variable, which is then solved for the unknown rotation angle. Such an approach was shown to be robust against small errors in the gravity vector direction and against errors in reference point positions of up to 1 pixel. However, as the substitution uses the square of a tangent function of the unknown rotation angle, the solution may become unstable when the unknown angle is close to 90°.
Another solution of the P2P problem with known gravity vector direction is proposed in (D'Alfonso et al., 2014). In this method it is assumed that both the camera and the object carrying the reference points are equipped with sensors that provide the orientation of the gravity vector. Hence pose estimation with respect to a moving object is possible. The performance of the method is close to the performance of uP2P. However, the estimation of the rotation angle is also unstable if the angle is close to 90°.
Pose estimation for an arbitrary configuration of 2 points is often required in practical applications such as initial estimation of parameters for bundle adjustment or UAV autonomous flight. Clearly, if 2 points are projected into a single point in the image, the pose of the camera cannot be estimated. Such a weak configuration occurs if the line that connects the 2 points is normal to the image plane. Direct solutions of collinearity equations for weak configurations are generally unstable.
However, the robustness of the solution can be improved if the rotation angle is estimated using homogeneous coordinates on a unit sphere. Such an approach is widely used for vanishing point detection and rotation estimation (Barnard, 1983), (Kalantari et al., 2008), (Förstner, 2010). This paper is focused on the development of a new algorithm for Minimal number of points Linear pose estimation with known Zenith direction (MLZ). The MLZ algorithm provides robust P2P pose estimation for weak configurations of reference points.

Paper outline
The rest of the paper is organized as follows. In the second part the developed MLZ algorithm is presented: the problem statement is given and unit sphere based rotation angle estimation is discussed. The second part is concluded with the solution of the pose estimation problem and the algorithm pipeline. The third part is dedicated to the experimental evaluation of the algorithm using a computer simulation and a Parrot AR.Drone 2.0 micro-UAV. The paper is concluded with the analysis of the experiments and prospects for further research.

Problem statement
Assume a camera and two reference points in the field of view of the camera. Three coordinate systems are defined. The object coordinate system $O_oX_oY_oZ_o$ is related to some object of interest in the observed scene and is defined as follows: the $O_oX_oZ_o$ plane is normal to the gravity vector, and the $Y_o$ axis is normal to the $X_o$ and $Z_o$ axes. The origin is related to some point of the observed scene and is selected appropriately for a given problem (figure 1).
The origin of the image coordinate system $O_iX_iY_iZ_i$ is located in the upper left pixel, the $X_i$ axis is directed to the right, and the $Y_i$ axis is directed downwards.
The origin of the camera coordinate system $O_cX_cY_cZ_c$ is located in the perspective center, the $X_c$ axis is collinear with the $X_i$ axis, the $Y_c$ axis is collinear with the $Y_i$ axis, and the $Z_c$ axis is normal to the $X_c$ and $Y_c$ axes. The rotation of the camera coordinate system with respect to the object coordinate system is defined by the rotation matrix $R_{oc}$ (equation (1)), which is composed of three elementary rotations: $R_{\alpha}$ is the rotation matrix around the Y axis, $R_{\omega}$ is the rotation matrix around the X axis, and $R_{\varkappa}$ is the rotation matrix around the Z axis.
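As an illustration of how a rotation matrix of this kind can be assembled, the following Python sketch builds the three elementary rotations and multiplies them. The multiplication order $R_{\alpha} R_{\omega} R_{\varkappa}$, the right-handed sign convention and all function names are assumptions made for this example; the text above fixes only the axis that each angle belongs to.

```python
import math

def rot_x(omega):
    """Elementary rotation around the X axis (angle omega, radians)."""
    c, s = math.cos(omega), math.sin(omega)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]

def rot_y(alpha):
    """Elementary rotation around the Y axis (angle alpha, radians)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def rot_z(kappa):
    """Elementary rotation around the Z axis (angle kappa, radians)."""
    c, s = math.cos(kappa), math.sin(kappa)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation_oc(alpha, omega, kappa):
    """Compose the three rotations; the order used here is an assumption."""
    return matmul(rot_y(alpha), matmul(rot_x(omega), rot_z(kappa)))

# With alpha = 0 only the angles known from the gravity vector contribute.
R_oc0 = rotation_oc(0.0, math.radians(5.0), math.radians(-3.0))
```

Under this assumed order, setting α to zero leaves a matrix that depends only on ω and ϰ, which is the role played by the known gravity direction in the rest of the section.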

Figure 1. Coordinate systems
The developed algorithm should provide an accurate estimation of the camera external orientation for a given pair of reference points $X_0$, $X_1$ with known coordinates in image space $x_0$, $x_1$ and a known direction of the gravity vector in the camera coordinate system, provided as rotation angles ω and ϰ. The estimation should be robust against outliers in the image space coordinates of reference points and against weak configurations of reference points.

Unit sphere based rotation estimation
The unit sphere mapping provides a method to avoid singularity during estimation of the position of a vanishing point. In (Barnard, 1983) a method for camera rotation estimation based on pairs of parallel lines in object space is proposed. Each pair of parallel lines is projected onto a unit sphere centered at the perspective center. Each projection is contained in a plane that passes through the center of the unit sphere. A vector from the center of the sphere to the point where the intersection line of these planes meets the sphere gives the direction vector of the parallel lines in the camera coordinate system. Hence the rotation of the camera with respect to a given pair of parallel lines can be found.
In the MLZ algorithm it is proposed to estimate the rotation of the camera using unit sphere mapping. Let α be the rotation angle of the camera coordinate system with respect to a line $L$ that passes through the reference points $X_0$, $X_1$ (figure 2). A second line $G$ that is parallel to $L$ is required to find the angle α. Assume that the line $L$ is coplanar with the plane $O_cX_cZ_c$. Let us introduce an ancillary plane $P$ that contains the line $L$ and is parallel to the $Y_i$ axis. Then the line $G$ that is parallel to the line $L$ can be found as the intersection of the plane $P$ and the plane $O_cX_cZ_c$. The projection $g$ of the line $G$ on the image plane will be collinear with the $X_i$ axis. Then the rotation angle α can be found using the virtual line $g$ and the projection $l$ of the line $L$ (equation (2)), where $n$ is the cross product of the direction vectors $x_0'$, $x_1'$ on the unit sphere, and $m$ is the cross product of the unit vector $y$ and the vector $n$.
To find the rotation angle for a general case let us introduce ancillary coordinate systems (figure 3). The origin of the point coordinate system $O_pX_pY_pZ_p$ is located at the point $X_0$, the $X_p$ axis is parallel to the line $L$, the $Y_p$ axis is coplanar with the plane $P$ and normal to the $X_p$ axis, and the $Z_p$ axis is normal to the $X_p$ and $Y_p$ axes. Let $R_{op}$ be the rotation matrix from the object coordinate system to the point coordinate system. Let $R_{oc0}$ be the rotation matrix from the object coordinate system to the camera coordinate system in the case when the angle α is equal to zero and the angles ω and ϰ are known. Then the coordinate system … The angle α' can be found using the projection $g'$ of the line $G$ and the line $l$ using equation (2).
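The construction with the vectors $n$ and $m$ can be sketched in Python for the special case of figure 2. This is a sketch under stated assumptions and not the authors' equation (2): the line $L$ is taken to be parallel to the plane $O_cX_cZ_c$, the angle is measured from the $Z_c$ axis and folded into $[0, \pi)$, and the function names are hypothetical.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def normalize(v):
    """Scale a 3-vector to unit length, i.e. map it onto the unit sphere."""
    length = math.sqrt(sum(c * c for c in v))
    return tuple(c / length for c in v)

def line_angle_on_unit_sphere(ray0, ray1):
    """Angle between line L and the Z_c axis from two viewing rays.

    ray0 and ray1 point from the perspective center towards the two
    reference points, expressed in the camera coordinate system.
    """
    y_axis = (0.0, 1.0, 0.0)
    # n is normal to the plane through the perspective center and line L.
    n = cross(normalize(ray0), normalize(ray1))
    # m lies in the X_c Z_c plane; for a line parallel to that plane it is
    # parallel to the line itself (assumption of this sketch).
    m = cross(y_axis, n)
    if math.hypot(m[0], m[2]) < 1e-12:
        raise ValueError("degenerate configuration: direction is undefined")
    # atan2 stays well defined at 90 degrees, unlike a tangent substitution.
    return math.atan2(m[0], m[2]) % math.pi

# Example: a line at 30 degrees to the optical axis, 0.5 units off the
# X_c Z_c plane (all numbers are illustrative).
a = math.radians(30.0)
p0 = (0.2, 0.5, 3.0)
p1 = (p0[0] + 2.0 * math.sin(a), p0[1], p0[2] + 2.0 * math.cos(a))
angle = line_angle_on_unit_sphere(p0, p1)
```

Because the line is undirected, the angle is only defined modulo 180°, which is why the sketch folds the result with a modulo operation.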

Determination of the perspective center
To find the location of the perspective center $O$ in the object coordinate system, the direction vectors $x_0'$, $x_1'$ in the camera coordinate system can be used. Let $X_{c0}$, $X_{c1}$ be the direction vectors obtained by rotating the vectors $x_0'$, $x_1'$ with the rotation matrix $R_{oc}^{-1}$. The point of intersection of the lines $l_0$ and $l_1$ that pass through the points $X_0$, $X_1$ and are defined by the direction vectors $X_{c0}$, $X_{c1}$ respectively gives the location of the perspective center $O$. In real cases the lines $l_0$, $l_1$ will be skew due to measurement errors. Then the position of the perspective center $O$ can be defined from the end points of the perpendicular between the lines $l_0$ and $l_1$ (Luhmann et al., 2013), which are expressed through the direction cosines $a_0$, $b_0$, $c_0$ of the line $l_0$ and $a_1$, $b_1$, $c_1$ of the line $l_1$.
… $X_0$, $X_1$ except the case when the line $l$ crosses the … of the projection (i.e. except the case when the lines $l$ and $g'$ coincide).
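The skew-line step can be illustrated with a short Python sketch. It computes the end points of the common perpendicular between two lines with the standard least-squares formulas and takes their midpoint as the perspective center. Using the midpoint is an assumption of this example, and the function names are hypothetical.

```python
def dot(a, b):
    """Dot product of two 3-vectors."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def common_perpendicular(p0, u, p1, v):
    """End points of the shortest segment between two lines.

    Line l0 passes through p0 with direction u, line l1 passes through p1
    with direction v. Directions do not need to be normalized.
    """
    w = tuple(p0[i] - p1[i] for i in range(3))
    a, b, c = dot(u, u), dot(u, v), dot(v, v)
    d, e = dot(u, w), dot(v, w)
    denom = a * c - b * b
    if abs(denom) < 1e-12 * a * c:
        raise ValueError("lines are parallel: no unique perpendicular")
    s = (b * e - c * d) / denom   # parameter of the foot point on l0
    t = (a * e - b * d) / denom   # parameter of the foot point on l1
    foot0 = tuple(p0[i] + s * u[i] for i in range(3))
    foot1 = tuple(p1[i] + t * v[i] for i in range(3))
    return foot0, foot1

def perspective_center(p0, u, p1, v):
    """Midpoint of the common perpendicular (assumed estimate of O)."""
    foot0, foot1 = common_perpendicular(p0, u, p1, v)
    return tuple(0.5 * (foot0[i] + foot1[i]) for i in range(3))

# Two rays that meet exactly at (0, -1000, 4000); coordinates in mm.
X0, X1 = (-500.0, 0.0, 0.0), (500.0, 0.0, 0.0)
O = perspective_center(X0, (500.0, -1000.0, 4000.0),
                       X1, (-500.0, -1000.0, 4000.0))
```

When the two rays intersect exactly, both foot points coincide with the intersection; with noisy rays the midpoint splits the residual gap evenly between the two lines.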

ALGORITHM EVALUATION
To evaluate the MLZ algorithm both real experiments and computer simulation were used. Experiments were performed using a Parrot AR.Drone 2.0 micro-UAV and an external motion capture system. The position of the UAV was measured independently by the motion capture system and by the MLZ algorithm. The configuration of the evaluation system is presented below.
The MLZ algorithm
1. Get coordinates of reference points $X_0$, $X_1$ in the object space and coordinates of their projections $x_0'$, $x_1'$ in the camera space.
2. Calculate the rotation matrix $R_{oc0}$ using equation (1) and known values of angles ω and ϰ.

System configuration
The Parrot AR.Drone 2.0 UAV is a quadcopter equipped with two cameras. The first camera is forward looking and provides 720p video at 20 fps. The second camera is directed downwards and provides 320p video at 60 fps. For evaluation of the algorithm the forward looking camera was used. The quadcopter is also equipped with a precise 3-axis gyroscope that provides an accurate measurement of angles ω and ϰ (pitch and roll angles). Technical specifications of the AR.Drone 2.0 UAV are presented in table 2.

Motion capture system
To record the ground truth position of the UAV with respect to the object coordinate system the 'Mosca' motion capture system was used (Knyaz, 2015). The 'Mosca' system can provide information about the UAV position and rotation with high accuracy and at a high sample rate.
Table 3. Technical characteristics of the motion capture system
An original camera calibration and external orientation procedure was used to reach a high accuracy of 3D measurements. The calibration procedure is highly automated owing to the use of original coded targets (Knyaz, 2010) for identifying and measuring image coordinates of reference points. The system calibration provides an accuracy of 0.01% of the working area of the motion capture system.

The external orientation of the motion capture system is performed after choosing a working space and a camera configuration for motion capture. For the external orientation a special test field is used. It defines the object coordinate system in which 3D coordinates are calculated. For registration of the UAV flight a flat upright standing test field was used.
The motion capture system works with special (optionally retro-reflective) artificial targets that mark required points of interest on a captured object. An original technique for target detection and identification is proposed, based on similarity analysis of targets and epipolar based determination of point correspondences. For artificial targets, image coordinates are determined with sub-pixel accuracy using a centroid operator. To register the flight a set of circular targets was located on the bottom side of the UAV. They define the UAV coordinate system $O_uX_uY_uZ_u$. The targets #1 and #2 define the $Z_u$ axis. The origin of the coordinate system is located between points #1 and #2. The $X_u$ axis passes through the point #3. The $Y_u$ axis is normal to the $X_u$ and $Z_u$ axes. The transformation from the UAV coordinate system to the camera coordinate system was estimated using a calibrated stereo image pair (figure 5).
Figure 5. Orientation of the UAV coordinate system with respect to the UAV frame and circular targets. The UAV lies upside down, the bottom side is visible
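A centroid operator of the kind mentioned for target measurement can be sketched as an intensity-weighted mean over a small image window. The sketch below is a generic illustration and not the operator used in the 'Mosca' system; the function name and the optional background threshold are assumptions.

```python
def weighted_centroid(patch, threshold=0.0):
    """Sub-pixel center of a bright target inside a small image window.

    patch is a list of rows of intensity values. Pixels at or below the
    threshold are ignored. Returns (x, y) in window coordinates, with x
    counted along columns and y along rows.
    """
    total = sum_x = sum_y = 0.0
    for row_index, row in enumerate(patch):
        for col_index, value in enumerate(row):
            weight = max(value - threshold, 0.0)
            total += weight
            sum_x += weight * col_index
            sum_y += weight * row_index
    if total == 0.0:
        raise ValueError("no pixel above the threshold in the window")
    return sum_x / total, sum_y / total

# A target whose brightness is shifted towards the right-hand pixel.
window = [[0, 0, 0],
          [0, 2, 6],
          [0, 0, 0]]
center = weighted_centroid(window)
```

The weighting is what yields sub-pixel resolution: the estimated center moves continuously as intensity shifts between neighboring pixels.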

Algorithm evaluation using a UAV
To evaluate its performance on the UAV, the algorithm was implemented in dedicated software for quadcopter guidance. Two kinds of experiments were performed: a hovering flight and a flight along a given trajectory. For both experiments the MLZ algorithm provided the information about the UAV position and rotation. A simple feedback control based on the distance between the next waypoint and the estimated UAV position was used.
Two reference points are required as the input for the MLZ algorithm. During the flight experiment reference points were detected automatically at 18 FPS using a circular target detection algorithm. The coordinates of circular targets were calculated with sub-pixel accuracy. To account for the distortion of the camera lens the Brown-Conrady distortion model was used (Brown, 1966). Two reference points were chosen from the test field. Hence the object coordinate system of the MLZ algorithm coincided with the motion capture coordinate system. Such an approach simplified the processing of the sequences recorded by the motion capture system.
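The lens distortion step can be illustrated with a commonly used form of the Brown-Conrady model with two radial and two tangential coefficients. The sketch below works in normalized image coordinates; the coefficient values, the number of terms and the fixed-point inversion are assumptions for illustration and are not taken from the AR.Drone calibration.

```python
def distort(x, y, k1, k2, p1, p2):
    """Apply radial (k1, k2) and tangential (p1, p2) distortion."""
    r2 = x * x + y * y
    radial = 1.0 + k1 * r2 + k2 * r2 * r2
    xd = x * radial + 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
    yd = y * radial + p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
    return xd, yd

def undistort(xd, yd, k1, k2, p1, p2, iterations=20):
    """Invert the model by fixed-point iteration.

    Starts from the distorted point and repeatedly removes the distortion
    predicted at the current estimate. Converges for moderate distortion.
    """
    x, y = xd, yd
    for _ in range(iterations):
        r2 = x * x + y * y
        radial = 1.0 + k1 * r2 + k2 * r2 * r2
        dx = 2.0 * p1 * x * y + p2 * (r2 + 2.0 * x * x)
        dy = p1 * (r2 + 2.0 * y * y) + 2.0 * p2 * x * y
        x = (xd - dx) / radial
        y = (yd - dy) / radial
    return x, y

# Illustrative coefficients only.
coeffs = dict(k1=-0.2, k2=0.05, p1=0.001, p2=-0.0005)
xd, yd = distort(0.3, -0.2, **coeffs)
x, y = undistort(xd, yd, **coeffs)   # close to (0.3, -0.2)
```

Corrected coordinates of this kind are what a pose estimator expects as input, since the collinearity equations assume an ideal pinhole projection.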

Computer simulation
To find the dependency between the accuracy of the image space coordinates of reference points and the measurement error of the MLZ algorithm a computer simulation was performed. For the simulation two points $X_0$, $X_1$ with random positions were chosen. The distance from the points to the origin of the object coordinate system was equal to 500 mm. All reference point configurations were used except those that result in an angle α between the line $l$ and the optical axis of the camera smaller than 1 degree. The perspective center of the camera was located at the point $O = (0, -1000, 4000)^T$ (coordinates in mm). The parameters of the internal orientation of the camera were equal to the parameters of the camera of the Parrot AR.Drone 2.0 UAV.
The coordinates of the projections $x_0$, $x_1$ of the points $X_0$, $X_1$ were found using collinearity equations. A uniform random error vector $e$ was added to the coordinates of the points $x_0$, $x_1$. The distorted coordinates were used as the input for the MLZ algorithm. For each position error $e$ from 0 to 10 pixels, 1000 point configurations with different values of the angle α from 0° to 89° were calculated. For each configuration the position error of the MLZ algorithm was calculated. The averaged results are summarized in figure 7 using the boxplot representation.
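The data generation step of such a simulation can be sketched as follows. The example projects a point that is already expressed in the camera coordinate system with a pinhole model and then perturbs the result. The focal length, the principal point and the choice of an independent uniform error in each coordinate are assumptions for illustration.

```python
import random

def project(point_cam, focal_px, cx, cy):
    """Pinhole projection of a point given in the camera coordinate system."""
    x, y, z = point_cam
    if z <= 0.0:
        raise ValueError("point is behind the camera")
    return cx + focal_px * x / z, cy + focal_px * y / z

def add_uniform_noise(xy, max_error_px, rng):
    """Shift each image coordinate by a uniform error in [-e, e] pixels."""
    return (xy[0] + rng.uniform(-max_error_px, max_error_px),
            xy[1] + rng.uniform(-max_error_px, max_error_px))

rng = random.Random(0)                 # fixed seed for repeatable runs
focal_px, cx, cy = 1000.0, 640.0, 360.0   # hypothetical internal orientation
ideal = project((400.0, -200.0, 4000.0), focal_px, cx, cy)
noisy = [add_uniform_noise(ideal, e, rng) for e in range(0, 11)]
```

Each noisy pair of projections would then be passed to the pose estimator, and the difference between the estimated and the true perspective center would give one sample of the position error.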

CONCLUSIONS
A new algorithm for estimation of the external orientation of a camera with known gravity vector was developed. The MLZ algorithm estimates the pose of a camera with respect to the given object coordinate system. To perform the estimation, the algorithm must be supplied with the coordinates of two reference points in the object space, their projections in the image space and the direction of the gravity vector in the camera coordinate system.
The algorithm was implemented in dedicated software for guidance of the Parrot AR.Drone 2.0 UAV. The evaluation of the algorithm was performed using computer simulation and flight experiments. The assessment of the algorithm showed that it is robust against errors in the image space coordinates of reference points of up to 5 pixels. The analysis of the computer simulation has shown that the algorithm is robust with respect to the configuration of reference points.
The overall performance of the algorithm was accurate enough to perform a guided flight along the desired trajectory and a hovering flight with a quadcopter equipped with a frontal camera.
The MLZ algorithm could also be used for a fast and robust estimation of the initial parameters of the external orientation for a bundle adjustment.
In further work it is planned to compare the algorithm performance with the performance of other modern P2P algorithms for cameras with known gravity vector direction, such as the methods proposed in (Kukelova et al., 2011) and (D'Alfonso et al., 2014). To study the dependency between the field of view of the camera and the measurement error of the MLZ algorithm, further experiments with different kinds of UAVs and different cameras are required.

Figure 2. MLZ rotation estimation for a case when line L is coplanar to plane $O_cX_cZ_c$
Figure 3. MLZ rotation estimation for a general case

Figure 4. Assembled motion capture system, the test field and Parrot AR.Drone 2.0 UAV
Figure 6 shows two trajectories of the UAV. The first trajectory was recorded by the motion capture system. The second trajectory was estimated by the MLZ algorithm. The analysis of the trajectories shows that the algorithm was accurate enough to guide the quadcopter along the desired trajectory. The distortion of the box trajectory was caused by insufficient controllability of the UAV with a simple feedback control.

Figure 6. Flight trajectory recorded by the motion capture system and the trajectory estimated by the MLZ algorithm

Figure 7. Dependency between noise in the position of reference points and the measurement error of the MLZ algorithm

Table 2. Technical specifications of AR.Drone 2.0 UAV
The motion capture system includes up to four IEEE 1394 cameras that work in a synchronized mode at a frame rate of up to 100 frames per second under the control of a personal computer (PC). The PC provides accurate calculation of the 3D coordinates of points of interest. The assembled system is shown in figure 4. The system can be extended with more cameras by including an additional PC station. The control commands for the UAV are sent over a WiFi interface. The synchronization is established by turning on LED lamps on the UAV at the beginning of the capture session. Technical characteristics of the motion capture system are presented in table 3.