Closed-form camera pose and plane parameters estimation for moments-based visual servoing of planar objects

Image moments are global descriptors of an image and can be used to achieve decoupling properties in visual servoing control. However, only a few methods completely decouple the control. This study introduces a novel closed-form camera pose estimation method based on the image moments of planar objects. Whereas traditional position-based visual servoing estimates the pose of a camera relative to an object, the proposed method directly estimates the pose of the initial camera relative to the desired camera. Because this estimation relies on the plane parameters, a plane parameters estimation method based on moments invariant to 2D rotation, 2D translation, and scale is also proposed. From the two estimation methods, a completely decoupled position-based visual servoing control scheme is adopted. The new scheme is asymptotically stable as long as the object plane remains in the camera field of view. Simulation results demonstrate the effectiveness of the two estimation methods and the advantages of the proposed visual servo control scheme over the classical method.


Introduction
Visual servoing control refers to the use of computer vision data to control the motion of a robot.1 In visual servoing, two closely linked problems are subjects of active research2: the design of visual features pertinent to the robotic task to be realized, and the design of a control scheme using the chosen visual features such that the desired behavior is obtained during servoing. The latter commonly aims at ensuring an exponential, decoupled decrease of the error. For the former, image-based visual servoing (IBVS) employs the observed parameters of geometric primitives (points, straight lines, ellipses, and cylinders) in the image as features.[3][4][5] In position-based visual servoing (PBVS),[6][7][8] the geometric primitives are instead used to reconstruct the camera pose, which then serves as the input of the control law. Both approaches subject the image stream to an ensemble of measurement processes, including image processing, image matching, and visual tracking, from which the visual features are determined.9 With large measurement errors, PBVS can be affected by instability in pose estimation, and IBVS designed from image points is subject to local minima, singularities, and a limited convergence domain,10 owing to image mismatch and to the strong nonlinearities and coupling in the interaction matrix of image points. As a solution to these issues, image moments were introduced for visual servoing.[11][12][13][14] Image moments have a broad spectrum of applications in image analysis, such as invariant pattern recognition,15,16 pose estimation,17,18 and reconstruction.19 A set of moments computed from a digital image represents the global characteristics of the image shape and provides rich information on the different geometrical features of the image.20 Because these descriptors are global and nongeometrical, they avoid image-processing steps such as feature extraction, matching, and tracking.18
Moments are also useful for achieving decoupling properties and for choosing a minimal number of features to control all the degrees of freedom of a camera. Therefore, visual servoing using image moments obtains a large convergence domain and adequate robot trajectories, thanks to the reduced nonlinearities and coupling in the interaction matrix of suitable combinations of moments.
The analytical form of the interaction matrix related to any moment computed from segmented images was determined, and the result was applied to classical geometric primitives.11 However, this method requires a configuration in which the object and camera planes are parallel at the desired position. Another drawback is the strong coupling of the interaction matrix. Tahri and Chaumette12 designed a decoupled control scheme that weakens the coupling of the visual features when the object is parallel to the image plane. A generalization of this property to the case where the desired object plane is not parallel to the image plane has achieved excellent experimental results, but it must determine which virtual rotation to apply to the camera, and the angular velocities about the x- and y-axes remain coupled. To control the rotational motions around the x- and y-axes simultaneously with the translational motions along the same axes, new visual features computed from low-order shifted moment invariants were proposed.13 On the basis of the shifted moments, the selection of a unique feature vector independent of the object shape was proposed and exploited in IBVS. Although this method significantly enlarges the convergence domain of the closed-loop system, the interaction matrix is still coupled. The above methods are all IBVS schemes. At present, few studies exist on PBVS control schemes based on image moments, whereas hybrid visual servoing is very popular. A novel moment-based 2-1/2-D visual servoing method for grasping textureless planar parts was proposed by He et al.14 Instead of applying high-order image moments, it uses rotation features, providing a decoupled interaction matrix that has full rank and no local minimum in the control scheme.
However, its real-time estimation of the relative rotation of the textureless parts is based on cross-correlation analysis and is not a closed-form solution.
To handle the above issues more effectively, this study proposes a closed-form camera pose estimation method based on the image moments of planar objects. The method directly estimates the relative pose between the initial camera and the desired camera, rather than the pose of the camera relative to the object plane. Because this estimation relies on the plane parameters, a plane parameters estimation method based on invariant moments is also proposed. Finally, we adopt a PBVS control scheme whose interaction matrix is completely decoupled.
The rest of the article is organized as follows. The second section discusses preliminary knowledge; the third section introduces camera pose estimation based on image moments; the fourth section is devoted to plane parameters estimation based on the invariant moments; the fifth section discusses the PBVS control scheme and stability analysis; the sixth section presents the simulation results obtained from the co-simulation of MATLAB and CoppeliaSim; and finally, the seventh section outlines the conclusions and future work.

Preliminaries
This section introduces the preliminary knowledge used in this study. The pinhole camera model and imaging geometry, which play important roles in the subsequent derivations, are introduced first. Then, the relationships between the image moments seen by two cameras are described, laying the foundation for the camera pose estimation between the two views.

The pinhole camera model and imaging geometry
From the pinhole camera model,21 a 3D point P = [X, Y, Z]^T in the camera frame projects onto the image as the 2D homogeneous normalized coordinates p = [x, y, 1]^T, with p = P/Z. The image-plane measurement (in pixels) of a point is p̄ = [p_u, p_v, 1]^T. The relationship between p and p̄ is p = K^{-1} p̄, where K is a nonsingular matrix containing the intrinsic parameters of the camera.22 To simplify the notation, we assume that every quantity is expressed in the normalized space, which is equivalent to assuming a calibrated camera, that is, full knowledge of the calibration matrix K.
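As a small illustration of the model above, the following sketch converts a pixel measurement to normalized coordinates and projects a 3D point; the calibration matrix K is a hypothetical example, not taken from the article.

```python
import numpy as np

# Hypothetical intrinsic matrix K (focal lengths and principal point are
# illustrative values only).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

def project(P):
    """Pinhole projection: p = P / Z for a 3D point P = [X, Y, Z]^T."""
    return P / P[2]

def normalize(p_bar, K):
    """Map a pixel measurement p_bar = [p_u, p_v, 1]^T to normalized
    coordinates p = K^{-1} p_bar."""
    return np.linalg.solve(K, p_bar)
```

A point at the principal point maps to the origin of the normalized space, as expected for a calibrated camera.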
The homogeneous normalized coordinates of the image points in the two cameras of Figure 1(a) are ^n p = [^n x, ^n y, 1]^T and ^o p = [^o x, ^o y, 1]^T, and they satisfy a certain relationship. Assuming that a point ^b P on the plane b is represented as ^n P and ^o P in the camera frames F_n and F_o, respectively, then

^n P = ^n Z ^n p,  ^o P = ^o Z ^o p    (1)
^o P = ^o R_n ^n P + ^o t_n    (2)

where ^n Z and ^o Z are the depths of the observed point ^b P relative to the camera frames F_n and F_o, respectively, and ^o R_n and ^o t_n are the rotation matrix and translation vector of the camera frame F_n relative to the camera frame F_o, respectively. Substituting for ^n P and ^o P using equation (1), equation (2) can be written as

^o Z ^o p = ^n Z ^o R_n ^n p + ^o t_n    (3)

The following only considers objects that are planar or have a planar limb surface. In this case, the depth Z of any point of the object plane can be expressed as a continuous function of its image coordinates (see the study of Espiau and Chaumette3 for more details):

1/Z = A x + B y + C    (4)

where A, B, and C are the object plane parameters, and we define N = [A, B, C]^T. The parameters of a plane containing typical primitives have been studied by Espiau and Chaumette3 and Chaumette et al.23 To find the parameters of a plane containing an arbitrary pattern, a general method for estimating the plane parameters based on invariant moments is introduced in the fourth section.

Image moments
Moments are generic (and intuitive) descriptors computed from several kinds of objects, defined either from closed contours or from a set of points. The (p+q)th-order 2D geometric moments are denoted by m_pq and can be expressed as

m_pq = ∫∫_z x^p y^q f(x, y) dx dy    (5)

where z is the region of the normalized space in which the image intensity function f(x, y) is defined. The intensity centroid (x_g, y_g) is given by

x_g = m_10 / m_00,  y_g = m_01 / m_00    (6)

The moments computed with respect to the intensity centroid are called centered moments and are defined as

μ_pq = ∫∫_z (x − x_g)^p (y − y_g)^q f(x, y) dx dy    (7)

The following considers objects defined by a binary region; therefore, we assume that f(x, y) = 1 over the whole region that defines the object.
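For a binary region sampled on a grid, the integrals in equations (5)-(7) reduce to sums over the region's points (f = 1). A minimal sketch, with the region represented as an array of (x, y) samples:

```python
import numpy as np

def moment(points, p, q):
    """Geometric moment m_pq of equation (5), approximated as a sum over
    the (x, y) samples of a binary region (f = 1)."""
    x, y = points[:, 0], points[:, 1]
    return np.sum(x**p * y**q)

def centered_moment(points, p, q):
    """Centered moment mu_pq of equation (7), taken about the centroid
    (x_g, y_g) = (m_10/m_00, m_01/m_00) of equation (6)."""
    m00 = moment(points, 0, 0)
    xg, yg = moment(points, 1, 0) / m00, moment(points, 0, 1) / m00
    x, y = points[:, 0], points[:, 1]
    return np.sum((x - xg)**p * (y - yg)**q)
```

By construction, the first-order centered moments μ_10 and μ_01 are always zero.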
In the studies of Espiau and Chaumette3 and Chaumette,11 the interaction matrix L_μpq related to any centered moment defined by equation (7) has been determined:

μ̇_pq = L_μpq v_c    (8)

where v_c = [υ_c, ω_c]^T is the relative kinematic screw between the camera and the object, with υ_c and ω_c the translational and rotational velocity components, respectively. The interaction matrix L_μpq can be calculated as

L_μpq = [μ_vx  μ_vy  μ_vz  μ_wx  μ_wy  μ_wz]    (9)

where

μ_vx = −(p+1) A μ_pq − p B μ_{p−1,q+1}
μ_vy = −q A μ_{p+1,q−1} − (q+1) B μ_pq
μ_vz = −A μ_wy + B μ_wx + (p+q+2) C μ_pq
μ_wx = (p+q+3) μ_{p,q+1} + p x_g μ_{p−1,q+1} + (p+2q+3) y_g μ_pq − 4p n_11 μ_{p−1,q} − 4q n_02 μ_{p,q−1}
μ_wy = −(p+q+3) μ_{p+1,q} − (2p+q+3) x_g μ_pq − q y_g μ_{p+1,q−1} + 4p n_20 μ_{p−1,q} + 4q n_11 μ_{p,q−1}
μ_wz = p μ_{p−1,q+1} − q μ_{p+1,q−1}

and n_pq is defined as n_pq = μ_pq / μ_00.
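The closed-form entries above translate directly into code. A sketch under the assumption that the centered moments are supplied in a dict keyed by (p, q), with out-of-range indices treated as zero:

```python
import numpy as np

def L_mu(p, q, A, B, C, mu, xg, yg):
    """Row of the interaction matrix L_mu_pq of equation (9).
    mu maps (p, q) to the centered moment mu_pq; missing keys
    (negative indices) are treated as 0."""
    def m(i, j):
        return mu.get((i, j), 0.0)
    n11, n20, n02 = m(1, 1) / m(0, 0), m(2, 0) / m(0, 0), m(0, 2) / m(0, 0)
    mwx = ((p + q + 3) * m(p, q + 1) + p * xg * m(p - 1, q + 1)
           + (p + 2 * q + 3) * yg * m(p, q)
           - 4 * p * n11 * m(p - 1, q) - 4 * q * n02 * m(p, q - 1))
    mwy = (-(p + q + 3) * m(p + 1, q) - (2 * p + q + 3) * xg * m(p, q)
           - q * yg * m(p + 1, q - 1)
           + 4 * p * n20 * m(p - 1, q) + 4 * q * n11 * m(p, q - 1))
    mvx = -(p + 1) * A * m(p, q) - p * B * m(p - 1, q + 1)
    mvy = -q * A * m(p + 1, q - 1) - (q + 1) * B * m(p, q)
    mvz = -A * mwy + B * mwx + (p + q + 2) * C * m(p, q)
    mwz = p * m(p - 1, q + 1) - q * m(p + 1, q - 1)
    return np.array([mvx, mvy, mvz, mwx, mwy, mwz])
```

For p = q = 0 and A = B = 0 (plane parallel to the image plane), the row reduces to [0, 0, 2Cμ_00, 3y_gμ_00, −3x_gμ_00, 0], which matches the well-known area dynamics.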

Relationship between the image moments of two cameras
If the two cameras are not parallel to the object plane (see Figure 1(a)), then ^n Z and ^o Z in equation (3) are not constant, which complicates the calculation of the image moments. Therefore, we first consider the simple situation in which both cameras are parallel to the object plane (see Figure 1(b)). We denote the initial and transformed image coordinates of the feature pattern by (^{n'}x, ^{n'}y) and (^{o'}x, ^{o'}y), respectively, the corresponding intensity functions by f(^{n'}x, ^{n'}y) and g(^{o'}x, ^{o'}y), and their image moments by m^{n'}_pq and m^{o'}_pq. Therefore

m^{n'}_pq = ∫∫ (^{n'}x)^p (^{n'}y)^q f(^{n'}x, ^{n'}y) d(^{n'}x) d(^{n'}y)
m^{o'}_pq = ∫∫ (^{o'}x)^p (^{o'}y)^q g(^{o'}x, ^{o'}y) d(^{o'}x) d(^{o'}y)    (10)

Assuming that the image intensity values are preserved during the transformation, we have f(^{n'}x, ^{n'}y) = g(^{o'}x, ^{o'}y). Further, d(^{o'}x) d(^{o'}y) = D d(^{n'}x) d(^{n'}y), where D is the Jacobian of the transformation. According to equation (3), when the two cameras are parallel to the object plane (see Figure 1(b)), the coordinate transformation reduces to a 2D rotation, a 2D translation, and a scaling:

^{o'}x = L(^{n'}x cos Δγ − ^{n'}y sin Δγ) + t_x / ^{o'}Z
^{o'}y = L(^{n'}x sin Δγ + ^{n'}y cos Δγ) + t_y / ^{o'}Z    (11)

where L = ^{n'}Z / ^{o'}Z, and ^{n'}Z and ^{o'}Z are the depths of the observed point ^b P relative to the camera frames F_{n'} and F_{o'}, respectively. The expressions relating the image moments are derived by substituting the coordinate transformation (11), with different values of p and q, into equation (10). In particular, D = L², so m^{o'}_00 = L² m^{n'}_00, and the centered moments up to second order satisfy

μ^{o'}_20 = L⁴ (μ^{n'}_20 cos²Δγ − 2 μ^{n'}_11 sin Δγ cos Δγ + μ^{n'}_02 sin²Δγ)
μ^{o'}_02 = L⁴ (μ^{n'}_20 sin²Δγ + 2 μ^{n'}_11 sin Δγ cos Δγ + μ^{n'}_02 cos²Δγ)
μ^{o'}_11 = L⁴ ((μ^{n'}_20 − μ^{n'}_02) sin Δγ cos Δγ + μ^{n'}_11 (cos²Δγ − sin²Δγ))    (12)

Camera pose estimation based on image moments
This section estimates the relative homogeneous transformation between the two camera frames from the image moments of the object plane seen by the two cameras and from the plane parameters. In other words, the rotation matrix ^oR_n and the translation vector ^ot_n in Figure 1(a) are estimated using image moments and plane parameters. The estimation is a closed-form solution.

Estimate the rotation matrix of the two camera frames
The estimation of the rotation matrix ^oR_n in Figure 1(a) is introduced in this section. First, to make the imaging plane of each camera parallel to the object plane, the camera frames F_n and F_o are rotated by the rotation matrices ^{n'}R_n and ^{o'}R_o, respectively. In other words, the camera frames F_n and F_o in Figure 1(a) are converted to the camera frames F_{n'} and F_{o'} in Figure 1(b). Then, the rotation matrix ^{o'}R_{n'} of camera frame F_{n'} relative to F_{o'} in Figure 1(b) is calculated from the image moments. Finally, the rotation matrix ^oR_n is obtained as

^oR_n = (^{o'}R_o)^T ^{o'}R_{n'} ^{n'}R_n    (13)

Any rotation matrix R can be expressed by Z–Y–X Euler angles as

R = R_z(γ) R_y(β) R_x(α)

where R_z(·), R_y(·), and R_x(·) are rotation operations about the coordinate frame axes z, y, and x, respectively, and we write c_α = cos(α), s_α = sin(α), and similarly for β and γ. Appendix 1 explains the relationship between the rotation matrix ^oR_n and the Z–Y–X Euler angles. The following introduces the calculation of the Euler angles of the rotation matrices ^{o'}R_o, ^{o'}R_{n'}, and ^{n'}R_n.
The calculation of the rotation matrices ^{n'}R_n and ^{o'}R_o. From equation (4), the equation of the object plane in the camera frame can be expressed as AX + BY + CZ − 1 = 0. Therefore, the normal vectors of the object plane b in Figure 1(a) are ^nN = [^nA, ^nB, ^nC]^T and ^oN = [^oA, ^oB, ^oC]^T in the frames F_n and F_o, respectively. ^nN and ^oN are normalized as ^nn = [^na, ^nb, ^nc]^T and ^on = [^oa, ^ob, ^oc]^T. To make the imaging plane of the camera parallel to the object plane, the camera frame F_n is rotated (see Figure 1(a)). The Euler angles of this rotation matrix ^{n'}R_n are ^nγ = 0, ^nβ, ^nα, so we obtain

^{n'}R_n = R_y(^nβ) R_x(^nα)    (14)

The unit normal vector of the object plane in the rotated frame F_{n'} is ^{n'}n = [0, 0, 1]^T. Therefore, we get

^{n'}R_n ^nn = [0, 0, 1]^T    (15)

To calculate the Euler angles ^nβ and ^nα, equation (15) is expanded as

^na c_{^nβ} + ^nb s_{^nα} s_{^nβ} + ^nc c_{^nα} s_{^nβ} = 0
^nb c_{^nα} − ^nc s_{^nα} = 0    (16)

As a result, the solution of equation (16) is

^nα = atan2(^nb, ^nc),  ^nβ = atan2(−^na, √(^nb² + ^nc²))    (17)

Similarly, the camera frame F_o in Figure 1(a) can also be rotated; the Euler angles of this rotation matrix ^{o'}R_o are ^oγ = 0, ^oβ, ^oα, and ^oβ and ^oα are obtained by analogy with equation (17), giving

^{o'}R_o = R_y(^oβ) R_x(^oα)    (18)
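Equation (17) can be checked numerically. A sketch (function name is ours) that builds ^{n'}R_n from the plane parameters and verifies that it sends the unit plane normal to [0, 0, 1]^T:

```python
import numpy as np

def rot_to_align(N):
    """Rotation R = R_y(beta) @ R_x(alpha) of equation (14) that maps the
    unit normal n = N/||N|| of the plane N = [A, B, C]^T onto [0, 0, 1]^T,
    with alpha and beta given by equation (17)."""
    a, b, c = N / np.linalg.norm(N)
    alpha = np.arctan2(b, c)
    beta = np.arctan2(-a, np.hypot(b, c))
    ca, sa = np.cos(alpha), np.sin(alpha)
    cb, sb = np.cos(beta), np.sin(beta)
    Rx = np.array([[1.0, 0.0, 0.0], [0.0, ca, -sa], [0.0, sa, ca]])
    Ry = np.array([[cb, 0.0, sb], [0.0, 1.0, 0.0], [-sb, 0.0, cb]])
    return Ry @ Rx
```

Using the plane parameters quoted later in the simulations, R n recovers [0, 0, 1]^T to machine precision.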
The calculation of the rotation matrix ^{o'}R_{n'}. First, the images n and o are rotated to n' and o' by ^{n'}R_n and ^{o'}R_o, respectively. The following shows how to estimate the rotation matrix ^{o'}R_{n'} of frame F_{n'} relative to frame F_{o'} from the image moments.
Because the z-axes of the frames F_{n'} and F_{o'} are parallel to each other (see Figure 1(b)), ^{o'}R_{n'} can be calculated from the centered image moments:

^{o'}R_{n'} = R_z(Δγ)    (19)

where Δγ = ^{o'}γ − ^{n'}γ, and ^{n'}γ and ^{o'}γ are the orientation angles of the inertial principal axes of the target plane b in the images of cameras n' and o'. The orientation angles can be calculated as (see the study of Horn24 for more details)

γ = (1/2) arctan(2 μ_11 / (μ_20 − μ_02))    (20)

where μ_11, μ_20, and μ_02 are the second-order centered moments defined by equation (7). Because equation (20) defines each angle only up to an ambiguity, we work with the double angle: sin(2Δγ) and cos(2Δγ) can be expressed as

sin(2Δγ) = sin(2 ^{o'}γ) cos(2 ^{n'}γ) − cos(2 ^{o'}γ) sin(2 ^{n'}γ)
cos(2Δγ) = cos(2 ^{o'}γ) cos(2 ^{n'}γ) + sin(2 ^{o'}γ) sin(2 ^{n'}γ)    (21)

Using

2 sin(Δγ) cos(Δγ) = sin(2Δγ),  cos²(Δγ) − sin²(Δγ) = cos(2Δγ)    (22)

the solution of equation (22) is

Δγ_1 = (1/2) atan2(sin(2Δγ), cos(2Δγ)),  Δγ_2 = Δγ_1 + π    (23)

Two solutions are thus obtained for Δγ from equation (23), but only one is correct. The following explains how to choose the correct solution.
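A minimal sketch of equations (20)-(23): the orientation angle of each image follows from its second-order centered moments, and their difference yields two candidates for Δγ separated by π (selecting between them uses the third-order moments described next).

```python
import numpy as np

def orientation_angle(mu20, mu02, mu11):
    """Principal-axis orientation, equation (20), computed with the
    quadrant-aware double-angle form: gamma = 0.5 * atan2(2*mu11, mu20 - mu02)."""
    return 0.5 * np.arctan2(2.0 * mu11, mu20 - mu02)

def delta_gamma_candidates(mu_n, mu_o):
    """Two candidate in-plane rotations of equation (23), given the
    second-order centered moments (mu20, mu02, mu11) of each image."""
    dg = orientation_angle(*mu_o) - orientation_angle(*mu_n)
    return dg, dg + np.pi
```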
Selection between the angles Δγ_1 and Δγ_2. This section introduces the method for selecting the correct angle Δγ by means of the third-order moments.
The third-order centered moments of the two images are related through the rotation Δγ (equations (24) and (25)); the transformation coefficients satisfy A_11 = cos(Δγ), A_12 = −sin(Δγ), A_21 = sin(Δγ), and A_22 = cos(Δγ). From equations (25) and (10), a residual J is computed (equation (26)) that compares the third-order moments predicted with the candidate Δγ_1 against the measured ones. If Δγ_1 is correct, then J ≈ 0; otherwise, J becomes large. A positive threshold K_m is therefore defined (usually K_m = 1 can be taken): if J > K_m, Δγ_2 is the correct solution; otherwise, Δγ_1 is the correct solution.
Finally, the relative rotation matrix ^oR_n of the two camera frames F_n and F_o is calculated by equation (13), where ^{o'}R_o, ^{o'}R_{n'}, and ^{n'}R_n are defined in equations (18), (19), and (14), respectively.

Estimate the translation vector of the two camera frames
This part introduces the estimation method of the translation vector ^ot_n in Figure 1(a).
From the rotations of the camera frames F_n and F_o introduced in the previous section, we obtain

^{n'}Z ^{n'}p = ^{n'}R_n ^nZ ^np,  ^{o'}Z ^{o'}p = ^{o'}R_o ^oZ ^op    (27)

where ^{n'}R_n and ^{o'}R_o are defined in equations (14) and (18), respectively, and ^{n'}Z and ^{o'}Z can be calculated as

^{n'}Z = 1 / ^{n'}C,  ^{o'}Z = 1 / ^{o'}C    (28)

where ^{n'}C and ^{o'}C are the third components of the rotated plane parameters ^{n'}N = ^{n'}R_n ^nN = [0, 0, ^{n'}C]^T and ^{o'}N = ^{o'}R_o ^oN = [0, 0, ^{o'}C]^T. Substituting for ^np, ^op, and ^oR_n using equations (27), (28), and (13), equation (3) can be converted to

^{o'}Z (^{o'}R_o)^T ^{o'}p = ^{n'}Z ^oR_n (^{n'}R_n)^T ^{n'}p + ^ot_n    (29)

Because the imaging planes of the camera frames F_{n'} and F_{o'} are parallel to the object plane, ^{n'}Z and ^{o'}Z are constant. Applying equation (29) at the intensity centroids and using the moment relations (12), the relationship between the image moments of cameras o' and n' can be expressed as

(^{o'}R_o)^T ^{o'}V = ^oR_n (^{n'}R_n)^T ^{n'}V + ^ot_n    (30)

where ^{o'}V and ^{n'}V are defined as

^{o'}V = ^{o'}Z [^{o'}x_g, ^{o'}y_g, 1]^T,  ^{n'}V = ^{n'}Z [^{n'}x_g, ^{n'}y_g, 1]^T

As a result, the translation vector ^ot_n is obtained:

^ot_n = (^{o'}R_o)^T ^{o'}V − ^oR_n (^{n'}R_n)^T ^{n'}V    (31)

This completes the calculation of the homogeneous transformation matrix.
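One way to read equation (31) in code: since equation (3) holds in particular at corresponding centroids, ^ot_n follows once the rotations and the centroid vectors ^{n'}V and ^{o'}V are known. A self-contained numerical check on synthetic data (all function names are ours; the test rotations are arbitrary stand-ins for the alignment rotations):

```python
import numpy as np

def translation_o_t_n(R_on, R_npn, R_opo, V_np, V_op):
    """Equation (31): ^o t_n = (^{o'}R_o)^T ^{o'}V - ^oR_n (^{n'}R_n)^T ^{n'}V,
    with ^{n'}V = ^{n'}Z [x_g, y_g, 1]^T and ^{o'}V = ^{o'}Z [x_g, y_g, 1]^T."""
    return R_opo.T @ V_op - R_on @ (R_npn.T @ V_np)

def rot_z(t):
    """Elementary rotation about the z-axis (used only to build test data)."""
    c, s = np.cos(t), np.sin(t)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
```

Feeding it a synthetic point expressed in both frames recovers the ground-truth translation exactly, since V = Z p is just the point's coordinates in the rotated frame.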

Plane parameters estimation based on image moments
The homogeneous transformation calculation introduced in the third section requires the plane parameters in the frames F_n and F_o. In visual servo control, the latter are usually known, but the former are difficult to obtain. We therefore use the known information of the image o and the plane parameters ^oN to estimate the plane parameters ^nN. The methods introduced in the studies of Chaumette11 and Chaumette et al.23 apply only to typical primitives. The following presents a plane parameters estimation method that is suitable for any pattern.
Generally, the plane parameters ^oN are known, so we can calculate the rotation matrix ^{o'}R_o and obtain the rotated image o'. However, the available plane parameters Ñ (an initial estimate of ^nN) contain errors. Similarly, the rotation matrix ^{ñ'}R_n and the rotated image ñ' can be calculated (note that if there were no error in Ñ, then ñ' = n'). Theoretically, the moments of the images o' and n' that are invariant to 2D rotation, 2D translation, and scale are the same, so we can correct the plane parameters Ñ according to these invariant moments.
We choose two moments w = [φ_1, φ_2]^T that are invariant to 2D rotation, 2D translation, and scale25 (equation (32)). It is easy to see that ^{n'}w = ^{o'}w, so ^{n'}w can be calculated from the image o'. According to equation (9), the interaction matrix related to the invariant moments w has the form

L_φ = [0  0  0  φ_wx  φ_wy  0]    (33)

We want to rotate the image ñ' to an image n' that is parallel to the image o', so υ_c = 0 and only the rotational motion is to be calculated. Therefore, we get

^{ñ'}ẇ = L_φ,ω ^{ñ'}ω_c    (34)

where L_φ,ω (the two nonzero columns of L_φ) is not affected by the errors in the plane parameters Ñ. The invariant moments can be expressed as a Taylor expansion (35); truncating it at first order and substituting for ^{ñ'}ẇ using equation (34), we can approximate equation (35) as

^{n'}w ≈ ^{ñ'}w + L_φ,ω ^{ñ'}ω_c δt    (36)

from which ^{ñ'}v_c = ^{ñ'}ω_c δt is obtained by inverting L_φ,ω. The rotation matrix ^{n'}R_{ñ'} of the frame F_{ñ'} relative to F_{n'} can then be calculated as

^{n'}R_{ñ'} = e^{[^{ñ'}v_c]_×}    (37)

where [^{ñ'}v_c]_× is the 3 × 3 skew-symmetric matrix representation of ^{ñ'}v_c. Now we correct the initial plane parameters Ñ. First, the image ñ' is rotated to n' by the rotation matrix ^{n'}R_{ñ'}. Noting that

m^{n'}_00 = S / (^{n'}Z)²    (38)

where S is the area of the object plane, the plane parameters in the frame F_{n'} are ^{n'}N = [0, 0, ^{n'}C]^T. According to equations (28) and (38), ^{n'}C can be expressed as

^{n'}C = 1 / ^{n'}Z = √(m^{n'}_00 / S)    (39)

The final corrected plane parameters ^nN are

^nN = (^{n'}R_{ñ'} ^{ñ'}R_n)^T ^{n'}N    (40)

The plane parameter error can be written as e_N = Ñ − ^nN*, where the desired plane parameters satisfy ^nN* ≈ ^nN. Finally, we design the correction scheme as

Ñ ← Ñ − λ_N ê_N    (41)

where ê_N = Ñ − ^nN.
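The exact pair (φ_1, φ_2) from reference 25 is not reproduced here; as an illustrative stand-in, the classical Hu-style pair below is likewise invariant to 2D translation, 2D rotation, and scale, using the normalized centered moments η_pq = μ_pq/μ_00² for p + q = 2:

```python
import numpy as np

def invariants(mu):
    """Two moments invariant to 2D translation, rotation, and scale.
    mu maps (p, q) to the centered moment mu_pq of a region-type object;
    this Hu-style pair is only a stand-in for the article's (phi_1, phi_2)."""
    mu00 = mu[(0, 0)]
    n20 = mu[(2, 0)] / mu00**2
    n02 = mu[(0, 2)] / mu00**2
    n11 = mu[(1, 1)] / mu00**2
    return np.array([n20 + n02, (n20 - n02)**2 + 4.0 * n11**2])
```

For a planar region scaled by a factor s, μ_pq scales as s^{p+q+2}, so both components are unchanged; the correction scheme (41) then drives the discrepancy between ^{ñ'}w and ^{n'}w to zero.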

PBVS and stability analysis
This section first introduces the PBVS control scheme, which is based on camera pose estimation described in the third section. Then, the stability of the method will be analyzed. This visual servoing scheme has been studied by Chaumette and Hutchinson. 26 We briefly revisit the PBVS control scheme.
In this study, we define the desired and current feature values as s* = 0 and s = [^ot_n, θu]^T, in which ^ot_n is the translation vector calculated using equation (31) and θu is the angle/axis parameterization of the rotation matrix ^oR_n calculated by equation (13). The vision-based control scheme minimizes the error

e_s(t) = s(t) − s* = s(t)    (42)

The relationship between ė_s and v_c is given by ė_s = L_s v_c, where an estimate L̂_s of the interaction matrix L_s related to e_s can be calculated as

L̂_s = [[^oR_n, 0], [0, L_θu]]    (43)

where L_θu is given by27

L_θu = I_3 − (θ/2)[u]_× + (1 − sinc(θ) / sinc²(θ/2)) [u]_×²

and sinc(x) is the cardinal sine, defined such that x sinc(x) = sin(x) and sinc(0) = 1. According to equation (43), rotation and translation are decoupled, which allows a simple control scheme:

v_c = −λ_s L̂_s^{-1} e_s    (44)

Next, we briefly analyze the stability. A Lyapunov function candidate is defined as

V = (1/2)(e_s^T e_s + e_N^T e_N)    (45)

Differentiating the Lyapunov function with respect to time along equations (44) and (41), equation (45) is transformed into

V̇ = −λ_s e_s^T L_s L̂_s^{-1} e_s − λ_N e_N^T ê_N    (46)

If ^nN* ≈ ^nN is satisfied and the object plane always remains in the camera field of view, then both −λ_s e_s^T L_s L̂_s^{-1} e_s ≈ −λ_s e_s^T e_s < 0 and −λ_N e_N^T ê_N ≈ −λ_N e_N^T e_N < 0 are guaranteed. Hence V̇ < 0, and the system is asymptotically stable.
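Because L̂_s in equation (43) is block-diagonal and L_θu^{-1} θu = θu, the control law reduces to two independent proportional terms. A sketch (the gain and frame conventions here are our assumptions):

```python
import numpy as np

def pbvs_velocity(R_on, t_on, theta_u, lam=0.5):
    """Decoupled PBVS law v_c = -lam * Lhat_s^{-1} e_s for e_s = [t, theta*u]:
    upsilon_c = -lam * R_on^T @ t_on   (translation block of equation (43)),
    omega_c   = -lam * theta_u         (using L_theta_u^{-1} theta_u = theta_u)."""
    return np.concatenate([-lam * (R_on.T @ t_on), -lam * theta_u])
```

Both error components then decrease exponentially, which matches the straight-line camera trajectories reported in the simulations.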

Simulation results
We evaluate the control scheme proposed in this article. Considering a vision sensor and an object plane as examples, the co-simulation was conducted on the MATLAB 2020b and CoppeliaSim 4.1 platforms. Several different patterns were used, namely an "octopus," a "whale," and a "flame."

Simulation results using camera pose estimation
This part mainly verifies the effectiveness of the camera pose estimation method introduced in the third section, so we assume that the plane parameters in the frame F_n are known (^nN = [−0.3716, −0.0567, 0.5307]^T). We consider the case where the image and object planes with the "octopus" pattern are nonparallel at the desired position (^oN = [0.2986, 0.1526, 0.8273]^T). Figure 2(a) shows the green and red contours representing the desired and initial image contours, respectively. The PBVS scheme is then used for camera control.
The obtained results are given in Figure 2 and show the good behavior of both the camera pose estimation and the control law. The translation component of the visual features is ^ot_n, so the obtained camera trajectory is a straight line (see Figure 2(b)). Although the corresponding displacement is very large (t_x = 0.1084 m, t_y = 0.3403 m, t_z = −0.5524 m, θu_x = 0.1370 rad, θu_y = 0.4581 rad, and θu_z = −2.1889 rad), we can note in Figure 2(c) and (d) the decoupled, exponential decrease of the six components of the corresponding displacement and of the six camera velocity components. Because the interaction matrix is decoupled, the components of the corresponding displacement in Figure 2(c) all converge to less than 10^{-4}. Even with added noise, the pose estimation method proposed in this article remains suitable when the image and object planes are nonparallel at the desired position.
In addition, the method proposed in this article is suitable for the case |θu_z| > π/2, whereas some methods based on image moments, such as those proposed by Chaumette11 and Tahri and Chaumette,12 do not have this advantage. Two simple situations are designed below to illustrate this point.
The visual features in the method of Tahri and Chaumette12 are

s = [x_n, y_n, a_n, φ_1, φ_2, α]^T    (47)

where φ_1 and φ_2 can be calculated by equation (32), and

x_n = a_n x_g,  y_n = a_n y_g,  a_n = ^oZ √(a*/a)    (48)

with a = m_00 the object area in the image and a* its desired value. The "octopus" contour continues to be adopted, but θu_x = 0 and θu_y = 0 (equivalent to A = 0, B = 0). One situation is |θu_z| < π/2, the other |θu_z| > π/2 (see Figure 3(a) and (b)). The obtained results are given in Figure 3. The method in this article makes the feature errors (e^a_s) and pose errors ([e^a_{tx}, e^a_{ty}, e^a_{tz}, e^a_{θu_x}, e^a_{θu_y}, e^a_{θu_z}]^T) converge to less than 10^{-4} in both situations. However, the method proposed by Tahri and Chaumette12 is only suitable for |θu_z| < π/2 (see Figure 3(c) and (e)). For |θu_z| > π/2, this method can only converge the feature error (e^g_s) to less than 10^{-4} (see Figure 3(d)), while the pose error ([e^g_{tx}, e^g_{ty}, e^g_{tz}, e^g_{θu_x}, e^g_{θu_y}, e^g_{θu_z}]^T) retains a large error, namely e_{θu_z} = −π (see Figure 3(f)). The reason is that the feature driving the velocity ω_z is α, which is only valid for θu_z ∈ [−π/2, π/2]. Therefore, the calculation of ^{o'}R_{n'} proposed in "The calculation of the rotation matrix ^{o'}R_{n'}" section has a clear advantage when |θu_z| > π/2.

Simulation results using plane parameters estimation
This part mainly verifies the plane parameters estimation method introduced in the fourth section. We simulated a situation where the desired image and the object plane with the "whale" pattern are nonparallel at the desired position (^oN = [0.6016, 0.2122, 1.0419]^T and ^nN = [−0.3716, −0.0567, 0.5307]^T). Errors of 27% are randomly added to ^nN to obtain the plane parameters Ñ = [−0.4617, −0.0775, 0.6778]^T. Note that ^nN is unknown, but Ñ is known. Figure 4(a) shows the green and red contours representing the desired and initial image contours, respectively. The PBVS schemes with and without plane parameters estimation are then used to control the cameras.
The obtained results are given in Figure 4. The camera trajectories controlled by the PBVS schemes with and without plane parameters estimation are shown in Figure 4(b). Due to the early correction of the plane parameter errors (Figure 4(c)), the trajectory obtained by the former method differs slightly from a straight line. Although the corresponding displacement is large (t_x = −0.9133 m, t_y = −0.7534 m, t_z = −0.1625 m, θu_x = −0.2980 rad, θu_y = 1.0479 rad, and θu_z = 1.1873 rad), the pose errors obtained by the former method still all converge to less than 10^{-4}, whereas the pose errors ([e^b_{tx}, e^b_{ty}, e^b_{tz}, e^b_{θu_x}, e^b_{θu_y}, e^b_{θu_z}]^T) obtained by the latter cannot converge to a small value (see Figure 4(d)). The camera velocities produced by the PBVS schemes are shown in Figure 4(e) and (f); no component of the camera velocity oscillates in either method.
As a result, the plane parameters estimation method proposed in this article can effectively eliminate the parameter error, and satisfactory results are obtained with the PBVS scheme.

Simulation results compared to the classical method
This part shows the comparison between the method proposed in this article and the classical method proposed by Tahri and Chaumette.12 The latter's visual features are expressed as equation (47), and its interaction matrix estimate L̂ (the mean of the current and desired interaction matrices) largely improves the system behavior.28 The former's interaction matrix is calculated by equation (43). We consider the case where the image and object planes with the "frame" pattern (see Figure 5(a)) are parallel at the desired position (^oN = [0, 0, 1.1765]^T and ^nN = [−0.0083, −0.2941, 0.5290]^T). The corresponding displacement is t_x = −1.0735 m, t_y = −0.4274 m, t_z = −0.8021 m, θu_x = −0.4925 rad, θu_y = 0.1439 rad, and θu_z = −0.5007 rad.
First, we assume that the plane parameters ^nN are known and then use the two methods for visual servo control. The obtained results are shown in Figure 5.
The camera trajectories controlled by the two methods are shown in Figure 5(b). Because our method adopts a PBVS control scheme, the camera trajectory is a straight line, whereas the classical method adopts an IBVS control scheme, so its camera trajectory is a curve. Both methods make the feature errors converge to 10^{-4} (see Figure 5(c) and (d)) and the pose errors converge to 10^{-4} (see Figure 5(e) and (f)). However, the camera velocities produced by our method are more stable than those produced by the classical method (see Figure 5(g) and (h)). This is because the interaction matrix computed by our method has a better condition number than that of the classical method, since the former is completely decoupled (equation (43)). The boxplots of the condition numbers of the interaction matrices obtained by the two methods with known plane parameters are shown in Figure 6. The maximum, minimum, and mean condition numbers obtained by the classical method are 3206.2393, 65.1757, and 209.3060, respectively, whereas those obtained by our method are 1.0217, 1.0000, and 1.0014, respectively. Therefore, the method proposed in this article has obvious advantages over the classical method when the plane parameters ^nN are known.
Finally, 30% errors are randomly added to ^nN to obtain the plane parameters Ñ = [−0.0116, −0.1860, 0.3650]^T. The two methods are then used for visual servo control. The obtained results are shown in Figure 7.
The camera trajectories controlled by the two methods are shown in Figure 7(a). Although there are errors in the plane parameters, both methods converge the visual features to 10^{-4} (see Figure 7(c) and (d)). However, our method also converges the pose error to 10^{-4}, while the classical method only converges it to 10^{-2} (see Figure 7(e) and (f)). This is because our method estimates the plane parameters and quickly converges the plane parameter errors to 10^{-4}, whereas the classical method does not have this advantage (see Figure 7(b)). As seen in Figure 7(g) and (h), the camera velocities calculated by the proposed method are again more stable than those of the classical method. The boxplots of the condition numbers of the interaction matrices obtained by the two methods with erroneous plane parameters are shown in Figure 8. The maximum, minimum, and mean condition numbers obtained by the classical method are 2244.6194, 65.6998, and 130.4685, respectively, whereas those obtained by our method are 1.0214, 1.0000, and 1.0015, respectively; the latter are significantly smaller. Therefore, the method proposed in this article still has obvious advantages over the classical method when the plane parameters Ñ are erroneous.

Conclusion
This study proposes two new estimation methods based on image moments, used to estimate the camera pose and the plane parameters, respectively. The former, a closed-form solution, directly estimates the relative pose between the initial camera and the desired camera rather than the pose of the camera relative to the object plane. The latter uses moments invariant to 2D rotation, 2D translation, and scale to estimate the plane parameters. From both estimation methods, the article employs a PBVS scheme for the object plane. The simulation results have validated our approaches. One advantage of the two estimation methods is that they do not require image-processing steps such as feature extraction, matching, and tracking; another is that the PBVS scheme is suitable for the case |θu_z| > π/2. In addition, the condition number of the interaction matrix calculated by the proposed method is very small, which is very important for visual servo control. However, some particular configurations can make the object plane leave the camera field of view, which affects the stability of the visual servoing control. Future work will be devoted to designing a visual servoing scheme that increases stability while preserving the decoupling of the interaction matrix.