A correspondence selection method based on same object and same position constraints

Establishing robust correspondences between two images is important for computer vision tasks. However, in real scenes incorrect correspondences are inevitable, whatever correspondence matching algorithm is adopted, because of complex factors such as illumination changes and occlusion. To reduce the number of incorrect correspondences, this paper proposes an algorithm with same object and same position constraints (SOSPC) that removes wrong correspondences from the given putative correspondences. The algorithm is based on the fact that, in a given image pair, correct correspondences are located at the same position on the same objects. To select the correspondences on the same objects, an object matching method based on the correspondences selected by GMS is proposed. To select the correspondences at the correct positions, an iterative fundamental matrix estimation method based on clustering is presented. The experimental results validate the effectiveness of the same object and same position constraints, and the method achieves state-of-the-art performance on five datasets.


Introduction
Finding feature correspondences is a fundamental task in computer vision, with a wide range of applications such as structure-from-motion, 1 multiview stereo, 2 image retrieval, 3 simultaneous localization and mapping, 4 and face verification. 5 To acquire accurate correspondences, a popular two-step strategy is used. In the first step, the initial correspondence set is computed by a correspondence matching algorithm (e.g. SIFT, 6 ORB, 7 or A-KAZE 8 ). In the second step, more robust correspondences are picked out of the initial correspondence set by a correspondence selection algorithm. The second step raises the ratio of correct correspondences, which in most cases improves the chance that subsequent operations succeed.
Most correspondence matching algorithms establish correspondences between two images by extracting keypoints and local image descriptors. 6,7,9,10 Each keypoint has a descriptor that describes its texture, illumination, and other features, and the keypoints with the most similar descriptors are matched. However, many correspondence matching algorithms share two problems: (1) because real scenes are complex, it is difficult for them to adapt simultaneously to different imaging conditions, such as changing viewpoints, repeated structures, different illumination, occlusion, and similar textures; (2) many matching algorithms consider only local descriptors and ignore global structure information, which yields many correspondences that have similar local structure but belong to different objects.
For the above reasons, the matching results of the first step inevitably contain many incorrect matches, as shown in Figure 1(a) and (b), no matter which correspondence matching algorithm is applied. It is therefore necessary to remove the incorrect matches in the second step with a correspondence selection algorithm. Correspondence selection algorithms select robust correspondences based on different principles. For example, some methods (e.g. RANSAC 11 and USAC 12 ) select robust correspondences by estimating a transformation model from one image to the other, and other methods (e.g. GMS 13 and VFC 14 ) select correspondences based on the consistency of local geometric structures. However, these methods have their own drawbacks. In the real world the transformation between an image pair can be complicated, so methods that estimate a transformation model can hardly pick out all correct correspondences with a single model; for example, in Figure 1(c) the correct correspondences on some objects are not selected by RANSAC. Methods based on local geometric structures often select correspondences that match the right regions, but they may not match the positions of correspondences accurately at the pixel level. Therefore, correspondence selection still deserves further study.
In this paper we focus on the second step and propose a method named SOSPC to select robust correspondences from the initial correspondence set. Based on the fact that a correct correspondence lies on the same object and at the same position on that object, we introduce these two constraints into the selection mechanism of our method. For the same object constraint, the same objects in different images are matched by our method based on GMS, 13 and correspondences that link different objects are removed. For the same position constraint, the fundamental matrix of each object is calculated by an iterative method, and correspondences at wrong positions are removed according to the obtained fundamental matrix.
In summary, the contributions of this paper are as follows: (1) The constraint that correct correspondences lie on the same objects is introduced into our algorithm. To select the correspondences on the same objects, an object matching method using the correspondences selected by GMS 13 is proposed. (2) The constraint that correct correspondences are located at the same position on the same objects is integrated into the correspondence selection algorithm. An iterative fundamental matrix estimation method based on clustering is presented to improve the precision of the estimated fundamental matrix. (3) Our method achieves state-of-the-art performance compared with other correspondence selection algorithms on several challenging datasets.

Correspondence matching algorithms
The most popular strategy for establishing relationships among keypoints in different images has three stages: extracting keypoints from the images, computing descriptors of the keypoints, and matching the keypoints across the image pair based on the descriptors. The best-known correspondence matching approaches are SIFT 6 and SURF. 9 However, because both use a Gaussian scale space, some details and noise are smoothed, which affects localization accuracy and distinctiveness. To solve this problem, the KAZE method 15 uses nonlinear diffusion filtering to increase distinctiveness. To decrease the computational cost while keeping the benefits of nonlinear diffusion filtering, the A-KAZE 8 method was designed on top of the Fast Explicit Diffusion framework. For scenes in which computational resources are limited, ORB 7 and BRISK 16 were proposed; both combine modifications of the FAST corner detector 17 with binary descriptors based on BRIEF. 18

Correspondence selection algorithms
In recent decades, many correspondence selection algorithms have been presented. Generally speaking, correspondence selection methods can be divided into parametric and non-parametric methods. Parametric methods select robust correspondences mainly by estimating a global transformation model. ACC, 19 a parametric method, uses the Hessian affine detector, which is invariant to affine transformations, to estimate local homography matrices as constraints. The best-known correspondence selection algorithm is RANSAC, 11 which randomly samples some correspondences and estimates a planar homography matrix or a fundamental matrix, keeping the model with the maximum number of inliers. Several variants of RANSAC have been developed, such as MLESAC, 20 PROSAC, 21 LO-RANSAC, 22 and USAC, 12 which also select robust correspondences by estimating a global homography or essential matrix. But such methods have some drawbacks: (1) the accuracy of the estimated global transformation degrades, especially when there are many incorrect correspondences; (2) they often rely on a predefined parametric model, which is not suitable for non-rigid image transformations.
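The hypothesize-and-verify loop that RANSAC follows can be illustrated with a minimal sketch. To stay short it swaps in a toy 2-D translation model for the homography or fundamental matrix used in practice; the function name `ransac_translation`, the threshold, and the toy data are illustrative assumptions, not part of any cited method.

```python
import random

def ransac_translation(corrs, threshold=3.0, iters=200, seed=0):
    """Hypothesize-and-verify on a toy 2-D translation model.

    corrs: list of ((xa, ya), (xb, yb)) putative correspondences.
    Returns (best_translation, inlier_indices).
    """
    rng = random.Random(seed)
    best_t, best_inliers = None, []
    for _ in range(iters):
        # Minimal sample: a single correspondence fixes a translation.
        (xa, ya), (xb, yb) = rng.choice(corrs)
        tx, ty = xb - xa, yb - ya
        # Verification: count correspondences that agree with the hypothesis.
        inliers = [
            i for i, ((ua, va), (ub, vb)) in enumerate(corrs)
            if ((ua + tx - ub) ** 2 + (va + ty - vb) ** 2) ** 0.5 < threshold
        ]
        # Keep the hypothesis supported by the largest number of inliers.
        if len(inliers) > len(best_inliers):
            best_t, best_inliers = (tx, ty), inliers
    return best_t, best_inliers

# Toy data: four correspondences share the shift (10, 5); two are outliers.
corrs = [((x, y), (x + 10, y + 5)) for x, y in [(0, 0), (4, 1), (7, 9), (2, 6)]]
corrs += [((1, 1), (50, 60)), ((3, 3), (-20, 8))]
model, inliers = ransac_translation(corrs)
```

The sketch also makes the first drawback visible: when outliers dominate, most samples produce hypotheses with little support, so more iterations are needed before a good model is drawn.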
Non-parametric methods rest on different principles. A popular design strategy is based on the consistency of local geometric structures or appearance (feature similarity). Considering that correct feature correspondences on the same object have coherent transformations, CoSeg-HV 23 integrates image cosegmentation into feature matching and applies Hough voting and its inverted variant to establish correspondences. LPM 24 removes false matches from the initial correspondence set according to spatial neighborhood structures. Its authors formulate the spatial structure constraint as a mathematical model and derive a closed-form solution with linearithmic time complexity, from which the true matches are selected. VFC 14 assumes that the noise around correct matches and around wrong matches follows different distributions, and estimates the probability that a match is correct by maximum likelihood estimation. Lee et al. 25 propose a robust correspondence selection method based on local neighborhoods. In this method the feature of each correspondence is a local graph built from the features of neighboring correspondences, and the algorithm selects robust correspondences by comparing the similarity of local graphs. GMS 13 incorporates smoothness constraints into matching and shows that the numbers of correspondences around true matches and around false matches follow different distributions, which links correspondence counts to match quality. Thus correct matches can be determined from the number of correspondences around a correspondence. Because such methods distinguish true from false correspondences mainly by local structure and impose no position constraint at the pixel level, their accuracy is sometimes lower.

Problem definition
Given two images $I_A$ and $I_B$, $C_{initial} = \{c_1, c_2, \ldots, c_i, \ldots, c_M\}$ is the set of initial correspondences obtained by the correspondence matching algorithm, where $c_i = \{p_{ai}, p_{bi}\}$, and $p_{ai}$ and $p_{bi}$ are keypoint locations in $I_A$ and $I_B$, respectively. $C_{selected}$ is the subset of $C_{initial}$ chosen by the correspondence selection algorithm. Our goal is to put as many robust correspondences from $C_{initial}$ as possible into $C_{selected}$. In our algorithm, the selection of robust correspondences is based on the same object constraint and the same position constraint.

The same object constraint
The same object constraint means that correct correspondences should lie on the same objects in an image pair. Therefore the objects in the two images are first distinguished and matched, and then correspondences that link different objects are removed.
To distinguish different objects in an image, the image is divided into several colored regions by a segmentation algorithm, 26 as shown in Figure 2. The set of regions in $I_A$ is denoted as $Re_A = \{Re_{a1}, \ldots, Re_{ai}, \ldots, Re_{aU}\}$, and the set of regions in $I_B$ is denoted as $Re_B = \{Re_{b1}, \ldots, Re_{bi}, \ldots, Re_{bV}\}$, where $U$ and $V$ denote the number of regions in $I_A$ and $I_B$, respectively. Each region, shown in one color, is regarded as an object, although it may not be a real object in the scene.
The same object constraint is defined as follows: when $Re_{ai}$ and $Re_{bj}$ correspond to the same object in the image pair, a correspondence $c_k$ whose $p_{ak}$ lies in $Re_{ai}$ satisfies the same object constraint only if $p_{bk}$ lies in $Re_{bj}$.
Based on this definition, the relationship between the same objects in the two images must be established first. An intuitive approach is to match the objects using the initial correspondences. In practice, however, this is difficult because many initial correspondences are not reliable enough. Matching regions with the initial correspondences causes many incorrect object matches, which reduces the number of correct correspondences obtained.
In this section, we propose an object matching method that utilizes the correspondences selected by the GMS method. 13 GMS uses the supporting correspondences of a correspondence to determine whether it is robust. Let $\xi(p)$ denote the neighborhood of point $p$; the neighbor correspondence set of $c_i$ is $\{c_j \mid p_{aj} \in \xi(p_{ai})\}$, and the supporting correspondence set of $c_i$ is $\{c_j \mid p_{aj} \in \xi(p_{ai}), p_{bj} \in \xi(p_{bi})\}$. Under the smooth motion assumption, the number of supporting correspondences of $c_i$ follows a binomial distribution, as shown in Figure 3. When $c_i$ is correct, the number of elements in the supporting correspondence set follows the binomial distribution drawn as the blue curve; when $c_i$ is wrong, it follows the binomial distribution drawn as the red curve. In Figure 3, the threshold used to distinguish robust correspondences is related to the number of elements in the neighbor correspondence set of $c_i$.
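The support-counting idea can be sketched as follows. This is a simplified illustration rather than the GMS implementation: the neighborhood of a point is taken to be its square grid cell, and the threshold form `alpha * sqrt(n)`, the cell size, and the names `cell_of` and `gms_like_filter` are assumptions made for the example.

```python
from collections import Counter
from math import sqrt

def cell_of(point, cell_size):
    """Index of the square grid cell that contains a point (x, y)."""
    x, y = point
    return (int(x // cell_size), int(y // cell_size))

def gms_like_filter(corrs, cell_size=20.0, alpha=1.0):
    """Keep correspondences whose support reaches alpha * sqrt(n).

    corrs: list of ((xa, ya), (xb, yb)).
    n counts the correspondences whose I_A keypoint shares the cell of
    p_ai (the neighbor set, including c_i itself); the support counts the
    other correspondences that share both the I_A cell and the I_B cell.
    """
    cells = [(cell_of(pa, cell_size), cell_of(pb, cell_size)) for pa, pb in corrs]
    neighbors = Counter(ca for ca, _ in cells)
    pairs = Counter(cells)
    keep = []
    for i, (ca, cb) in enumerate(cells):
        support = pairs[(ca, cb)] - 1  # exclude c_i itself
        if support >= alpha * sqrt(neighbors[ca]):
            keep.append(i)
    return keep

# Five correspondences move together; the last one jumps to a distant cell.
corrs = [((x, y), (x + 40, y + 20))
         for x, y in [(1, 1), (5, 3), (8, 12), (15, 7), (18, 18)]]
corrs.append(((10, 10), (110, 110)))
kept = gms_like_filter(corrs)
```

Note that every kept correspondence only agrees with its neighbors at the cell level, which is why such a filter identifies matching regions more reliably than exact pixel positions.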
Although GMS yields some robust correspondences, many of the correspondences it selects match the same regions between the images rather than the exact position at the pixel level. Fortunately, this characteristic is well suited to object matching.
Another problem is that, owing to the limitations of the image segmentation algorithm, a region of a certain color may not be a complete real object, which can cause many correct correspondences to be lost. For example, in Figure 2, as a result of image segmentation, the real object that consists of $Re_{a1}$, $Re_{a2}$, and $Re_{a3}$ is the same as the object that consists of $Re_{b1}$ and $Re_{b2}$. In this case, if only the correspondences $c_i$ with $p_{ai}$ in $Re_{a1}$ and $p_{bi}$ in $Re_{b1}$ are selected, the correspondences $c_j$ with $p_{aj}$ in $Re_{a1}$ and $p_{bj}$ in $Re_{b2}$ are removed. To retain as many correct correspondences as possible, such a one-to-one region mapping model cannot be adopted.
For a region $Re_{ai}$, each $c_k$ whose $p_{ak}$ lies in $Re_{ai}$ is selected as a voter to determine the corresponding region in $I_B$. If $p_{bk}$ lies in $Re_{bj}$, $c_k$ votes for $Re_{bj}$ as the corresponding region of $Re_{ai}$, and the number of votes that $Re_{bj}$ receives is denoted as $|Re^{ai}_{bj}|$. Likewise, for a region $Re_{bi}$, each $c_k$ whose $p_{bk}$ lies in $Re_{bi}$ is selected as a voter to determine the corresponding region in $I_A$. If $p_{ak}$ lies in $Re_{aj}$, $c_k$ votes for $Re_{aj}$ as the corresponding region of $Re_{bi}$, and the number of votes that $Re_{aj}$ receives is denoted as $|Re^{bi}_{aj}|$. The corresponding region of $Re_{ai}$ in $I_B$ is represented by $Re^{ai*}_{bj}$, and the corresponding region of $Re_{bi}$ in $I_A$ is represented by $Re^{bi*}_{aj}$. They are determined by equations (1) and (2), respectively. In this way, the corresponding region of every region in the other image is determined.

$$Re^{ai*}_{bj} = \arg\max \{ |Re^{ai}_{b1}|, \ldots, |Re^{ai}_{bk}|, \ldots, |Re^{ai}_{bV}| \} \quad (1)$$

$$Re^{bi*}_{aj} = \arg\max \{ |Re^{bi}_{a1}|, \ldots, |Re^{bi}_{ak}|, \ldots, |Re^{bi}_{aU}| \} \quad (2)$$

For a correspondence $c_k$ whose $p_{ak}$ is in $Re_{ai}$, equation (3) determines whether it satisfies the same object constraint. For a correspondence $c_k$ whose $p_{bk}$ is in $Re_{bi}$, equation (4) does. A correspondence can thus be selected as a true match by either equation (3) or equation (4), which alleviates the loss of correct correspondences caused by imperfect segmentation.
A true correspondence $c_k$ whose $p_{ak}$ is in $Re_{ai}$ is put in set $S^0_{ai}$, and a true correspondence $c_k$ whose $p_{bk}$ is in $Re_{bj}$ is put in set $S^0_{bj}$. All the correspondences that satisfy equation (3) or (4) form the set $S^0 = \{S^0_{a1}, \ldots, S^0_{ai}, \ldots, S^0_{aU}, S^0_{b1}, \ldots, S^0_{bj}, \ldots, S^0_{bV}\}$.
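The two-directional voting above can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' code: regions are looked up through hypothetical callables `region_a` and `region_b` that map a point to a region id, the voters stand in for the GMS-selected correspondences, and the acceptance test is one reading of the rule that a correspondence is kept when either direction agrees.

```python
from collections import Counter, defaultdict

def match_regions(voters, region_a, region_b):
    """Vote for the corresponding region of every region, in both directions.

    Returns (a_to_b, b_to_a), the analogues of equations (1) and (2).
    """
    votes_ab, votes_ba = defaultdict(Counter), defaultdict(Counter)
    for pa, pb in voters:
        ra, rb = region_a(pa), region_b(pb)
        votes_ab[ra][rb] += 1
        votes_ba[rb][ra] += 1
    a_to_b = {ra: c.most_common(1)[0][0] for ra, c in votes_ab.items()}
    b_to_a = {rb: c.most_common(1)[0][0] for rb, c in votes_ba.items()}
    return a_to_b, b_to_a

def same_object_filter(corrs, voters, region_a, region_b):
    """Indices of correspondences accepted by either voting direction."""
    a_to_b, b_to_a = match_regions(voters, region_a, region_b)
    keep = []
    for i, (pa, pb) in enumerate(corrs):
        ra, rb = region_a(pa), region_b(pb)
        if a_to_b.get(ra) == rb or b_to_a.get(rb) == ra:
            keep.append(i)
    return keep

# Toy segmentation by x coordinate: one real object spans a1 <-> {b1, b2}.
region_a = lambda p: "a1" if p[0] < 50 else "a2"
region_b = lambda p: "b1" if p[0] < 30 else ("b2" if p[0] < 60 else "b3")
voters = [((10, 0), (10, 0)), ((20, 0), (15, 0)), ((30, 0), (20, 0)),
          ((40, 0), (40, 0)), ((45, 0), (50, 0)),
          ((60, 0), (70, 0)), ((80, 0), (90, 0))]
corrs = voters + [((25, 0), (80, 0))]  # last one links a1 to b3
a_to_b, b_to_a = match_regions(voters, region_a, region_b)
kept = same_object_filter(corrs, voters, region_a, region_b)
```

In the toy data the correspondences from a1 to b2 lose the forward vote (a1 maps to b1) but survive through the backward vote (b2 maps to a1), which is the over-segmentation case the text describes.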

The same position on the same object constraint
To describe the position constraint at the pixel level, our algorithm uses the fundamental matrix. Following Faugeras, 27 equation (5) holds for a correspondence $c_i$, where $p_{ai} = (u_1, v_1, 1)$ and $p_{bi} = (u_2, v_2, 1)$ are the homogeneous image coordinates in $I_A$ and $I_B$, respectively. $F$ is the fundamental matrix, $C_1$ is the camera that captures $I_A$, and $C_2$ is the camera that captures $I_B$. $K_1$ and $K_2$ are the internal parameters of $C_1$ and $C_2$, respectively, $R_{c_1 c_2}$ denotes the rotation matrix from the $C_2$ coordinate system to the $C_1$ coordinate system, and $S$ is a matrix that depends only on the translation vector $t_{c_1 c_2}$, which is the coordinate of the optical center of $C_2$ in the $C_1$ coordinate system. From equation (5), if the internal parameters of $C_1$ and $C_2$ are the same, $F$ depends only on $R_{c_1 c_2}$ and $t_{c_1 c_2}$. Many accurate algorithms, for example RANSAC and its variants, use a single fundamental matrix to describe the position constraint at the pixel level. However, if the relative positions of different objects change between the two images, such methods select the correspondences of only one object, because, according to Theorem 1, these objects then correspond to different fundamental matrices. Guided by Theorem 1 and the goal of describing the position constraint at the pixel level, the designed algorithm needs to obtain the fundamental matrix of each object.

Figure 3. The distributions of correspondence number. 13

Theorem 1. Each object corresponds to a fundamental matrix. If the relative positions of the objects are unchanged in $I_A$ and $I_B$, the objects correspond to the same fundamental matrix; otherwise, they correspond to different fundamental matrices.

Proof: As shown in Figure 4, $O_a$ and $O_b$ are two objects. $O_{a1}$ denotes $O_a$ captured by $C_1$, and $O_{a2}$ denotes $O_a$ captured by $C_2$. The coordinate system of $O_a$ when $C_1$ captures $I_A$ is denoted by $a_1$, and the coordinate system of $O_a$ when $C_2$ captures $I_B$ is denoted by $a_2$. $R_{a_1 b_1}$ denotes the rotation matrix from the coordinate system of $O_{b1}$ to the coordinate system of $O_{a1}$, and $R_{a_2 b_2}$ denotes the rotation matrix from the coordinate system of $O_{b2}$ to the coordinate system of $O_{a2}$.
With respect to the coordinate system of $O_a$, the rotation matrix $R_{c_1 c_2}$ from the $C_2$ coordinate system to the $C_1$ coordinate system is represented by ${}^{a}R_{c_1 c_2}$, which is obtained by equation (6). The translation vector $t_{c_1 c_2}$ is represented by ${}^{a}t_{c_1 c_2}$, which is calculated by equation (7),
where $p_{a_2 c_2}$ is the coordinate of the optical center of $C_2$ in the coordinate system of $a_2$.
With respect to the coordinate system of $O_b$, the rotation matrix from the $C_2$ coordinate system to the $C_1$ coordinate system is denoted by ${}^{b}R_{c_1 c_2}$, which is obtained by equation (8). The translation vector ${}^{b}t_{c_1 c_2}$ is calculated by equation (9).
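(a) When the relative position between $O_a$ and $O_b$ is unchanged in images $I_A$ and $I_B$: The condition is equivalent to that R a … Equations (8) and (9) can be rewritten as follows: … From equations (10) and (11), it can be seen that ${}^{a}R_{c_1 c_2}$ and ${}^{b}R_{c_1 c_2}$ are the same, and ${}^{a}t_{c_1 c_2}$ and ${}^{b}t_{c_1 c_2}$ are also the same; thus $O_a$ and $O_b$ correspond to the same fundamental matrix in image pairs.
(b) When the relative position between $O_a$ and $O_b$ is changed in images $I_A$ and $I_B$: The condition is equivalent to that R a … Equations (8) and (9) can be rewritten as follows: …

Epipolar distance. As shown in Figure 5, by the properties of the fundamental matrix, the epipolar lines in $I_A$ and $I_B$ are $l_1$ and $l_2$, respectively,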
where $m$, $n$, and $h$ are the parameters of the straight line $mu + nv + h = 0$ in the image. The distance $d_1$ from $p_{ai}$ to $l_1$ and the distance $d_2$ from $p_{bi}$ to $l_2$ are given by equations (16) and (17); both are epipolar distances. The minimum epipolar distance of $c_i$ under $F$ is defined by equation (18).
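A minimal sketch of the epipolar distance computation follows. It rests on assumptions that the text here does not fix: the convention $p_{bi}^T F p_{ai} = 0$ (so that $l_2 = F p_{ai}$ and $l_1 = F^T p_{bi}$), the usual point-to-line distance $|mu + nv + h| / \sqrt{m^2 + n^2}$, and the reading of the minimum epipolar distance as $\min(d_1, d_2)$. The helper names are hypothetical.

```python
from math import hypot, inf

def mat_vec(m, v):
    """Product of a 3x3 matrix (list of rows) and a 3-vector."""
    return [sum(m[r][c] * v[c] for c in range(3)) for r in range(3)]

def transpose(m):
    return [[m[c][r] for c in range(3)] for r in range(3)]

def point_line_distance(p, line):
    """Distance from p = (u, v, 1) to the line m*u + n*v + h = 0."""
    m, n, h = line
    norm = hypot(m, n)
    if norm == 0.0:
        return inf  # degenerate line, e.g. the point coincides with the epipole
    return abs(m * p[0] + n * p[1] + h) / norm

def min_epipolar_distance(F, pa, pb):
    """min(d1, d2) under the assumed convention pb^T F pa = 0."""
    l2 = mat_vec(F, pa)             # epipolar line of pa in I_B
    l1 = mat_vec(transpose(F), pb)  # epipolar line of pb in I_A
    d1 = point_line_distance(pa, l1)
    d2 = point_line_distance(pb, l2)
    return min(d1, d2)

# Toy geometry: a sideways camera shift, so epipolar lines are image rows.
F = [[0.0, 0.0, 0.0],
     [0.0, 0.0, -1.0],
     [0.0, 1.0, 0.0]]
d_same_row = min_epipolar_distance(F, (10.0, 20.0, 1.0), (35.0, 20.0, 1.0))
d_off_row = min_epipolar_distance(F, (10.0, 20.0, 1.0), (35.0, 24.0, 1.0))
```

With this toy matrix a correspondence is consistent exactly when both keypoints lie on the same image row, so the second pair is four pixels off its epipolar line.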
An iterative fundamental matrix estimation method based on clustering (IFMEM). To find the fundamental matrix of each object, the correspondences in $S^0$ are used. The correspondence set $S^0_{ai}$ in $S^0$ consists of the correspondences on $Re_{ai}$. From $S^0_{ai}$, a seed fundamental matrix $F^0_{ai}$ that corresponds to $Re_{ai}$ is calculated by RANSAC. 11 The seed fundamental matrices that correspond to the other regions $Re_{aj}$ and $Re_{bk}$ are calculated in the same way, and together they form the seed fundamental matrix set $F^0 = \{F^0_{a1}, \ldots, F^0_{ai}, \ldots, F^0_{aU}, F^0_{b1}, \ldots, F^0_{bj}, \ldots, F^0_{bV}\}$. However, these seed fundamental matrices cannot be used directly to select correspondences at the same position on the same object: because the accuracy of image segmentation and object matching is limited, some correct correspondences may have been removed while some wrong correspondences remain. As a result, not all seed fundamental matrices are reliable, so some incorrect correspondences would be selected and many correct correspondences would be ignored.
To improve the precision and increase the number of selected correct correspondences, IFMEM is proposed. If the epipolar distance of a correspondence $c_i$ calculated with the fundamental matrix $F^0_{aj}$ is less than $t_0$, which means that $c_i$ is on the object roughly described by $F^0_{aj}$, $c_i$ is put into set S … In equation (19), $a$ is a factor between 0 and 1.
As $t$ becomes smaller, only more precise points are preserved, and the estimated fundamental matrix becomes more accurate. Finally, $F^l$ is used to select correspondences from $C_{initial}$, and the correspondences whose epipolar distance is < 3.0 are put into $C_{selected}$. The workflow of the whole algorithm is shown in Figure 6, and it is also written as Algorithm 1. Let $N$ denote the number of initial feature matches, $M$ the number of objects segmented in an image, $L$ the maximum number of iterations used to calculate a fundamental matrix, $O_{seg}$ the time complexity of the image segmentation algorithm, and $O_{ransac}$ the time complexity of calculating one fundamental matrix. The time complexity of the object matching algorithm based on GMS is $O(M^2 N)$, and the time complexity of calculating the fundamental matrix of each object is $O(L M O_{ransac})$, so the time complexity of the whole algorithm is $O_{seg} + O(M^2 N) + O(L M O_{ransac})$. $O_{seg}$ depends on the adopted image segmentation algorithm, and the time complexity of different segmentation algorithms varies greatly. $O_{ransac}$ depends on $N$, and the number of iterations grows as the required accuracy increases. In practice, an upper limit on the number of iterations can be set, which bounds the time complexity of the whole algorithm.
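The shrinking-threshold refinement described for IFMEM can be sketched as a generic loop. Several points are assumptions: the update $t \leftarrow a \cdot t$ is one reading of the factor $a$ between 0 and 1, the epipolar convention is $p_{bi}^T F p_{ai} = 0$, and the estimator is a toy stand-in (`estimate_row_shift`, a one-parameter family of fundamental matrices) where the method itself uses RANSAC. All function names are hypothetical.

```python
from math import hypot

def epipolar_distance(F, pa, pb):
    """min(d1, d2) under the assumed convention pb^T F pa = 0."""
    l2 = [sum(F[r][c] * pa[c] for c in range(3)) for r in range(3)]  # F pa
    l1 = [sum(F[r][c] * pb[r] for r in range(3)) for c in range(3)]  # F^T pb
    d1 = abs(sum(l * x for l, x in zip(l1, pa))) / hypot(l1[0], l1[1])
    d2 = abs(sum(l * x for l, x in zip(l2, pb))) / hypot(l2[0], l2[1])
    return min(d1, d2)

def refine(corrs, seed_F, estimate, t0=8.0, a=0.5, max_iters=4, min_support=3):
    """Alternate between gathering support and re-estimating F."""
    F, t = seed_F, t0
    for _ in range(max_iters):
        support = [c for c in corrs if epipolar_distance(F, *c) < t]
        if len(support) < min_support:
            break  # too few points to re-estimate; keep the last model
        F = estimate(support)
        t *= a  # assumed update rule: shrink the threshold each round
    return F

def row_shift_F(k):
    """Toy fundamental matrix whose epipolar constraint is v_b = v_a + k."""
    return [[0.0, 0.0, 0.0], [0.0, 0.0, -1.0], [0.0, 1.0, k]]

def estimate_row_shift(support):
    """Stand-in estimator: average row shift of the supporting pairs."""
    k = sum(pb[1] - pa[1] for pa, pb in support) / len(support)
    return row_shift_F(k)

# Five correspondences shift rows by about 5 pixels; two are outliers.
rows = [20.0, 35.0, 50.0, 65.0, 80.0, 40.0, 70.0]
shifts = [4.8, 5.1, 5.0, 5.2, 4.9, 11.0, -2.0]
corrs = [((10.0, v, 1.0), (30.0, v + r, 1.0)) for v, r in zip(rows, shifts)]
F_final = refine(corrs, row_shift_F(2.0), estimate_row_shift)
selected = [i for i, c in enumerate(corrs) if epipolar_distance(F_final, *c) < 3.0]
```

In the toy run the rough seed first admits one outlier, the re-estimated model then excludes it under the tighter threshold, and the final selection with the 3.0 pixel cut keeps only the five consistent correspondences.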

Experiments
In this section, the performance evaluation and analysis of the proposed method are reported. The open source library OpenCV is used to detect the initial correspondences with the ORB algorithm.

Datasets
Five datasets are employed in our experiments: Person, Gerrard, Graham, South, and MultiObjects. The first four datasets each contain 400 pairs of images, which are obtained from reference 1, and their ground-truth camera parameters are provided. Based on the camera parameters, the fundamental matrix of each image pair is calculated, and then the corresponding epipolar distance of each correspondence is calculated. Correspondences with epipolar distance < 3.0 are regarded as the ground truth. These four datasets pose many challenges for selecting robust correspondences. Graham, Gerrard, and South exhibit changed viewpoints, repeated structures, and different illumination. In Person, there are many rotations between image pairs. All four datasets contain many similar textures, which produce many incorrect correspondences. MultiObjects is a self-made dataset that contains 45 pairs of images. Each image pair contains several objects, some of which may be occluded by other objects, and the relative positions of these objects usually change between the two images. The fundamental matrix that corresponds to each object is provided. Correspondences with epipolar distance < 3.0 are regarded as the ground truth.

Performance evaluation
In the following, we follow references 13 and 24 and evaluate our algorithm on the five datasets by precision, recall, and F-measure. They are defined as follows:

$$Precision = \frac{TP}{TP + FP}, \quad Recall = \frac{TP}{TP + FN}, \quad F\text{-}measure = \frac{2 \cdot Precision \cdot Recall}{Precision + Recall}$$

where TP and FP are the numbers of correct and incorrect correspondences selected by a specific algorithm, respectively, and FN is the number of correct correspondences in the initial correspondences that are not selected by the algorithm.
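For concreteness, the three measures can be computed from index sets as in the sketch below. It assumes the standard definitions (precision TP/(TP+FP), recall TP/(TP+FN), and the F-measure as their harmonic mean); the function name and the convention of returning 0.0 for empty denominators are choices made for the example.

```python
def precision_recall_f(selected, ground_truth):
    """Precision, recall and F-measure of a set of selected correspondences.

    selected: indices chosen by a selection algorithm.
    ground_truth: indices of the correct correspondences in the initial set.
    """
    selected, ground_truth = set(selected), set(ground_truth)
    tp = len(selected & ground_truth)   # correct and selected
    fp = len(selected - ground_truth)   # incorrect but selected
    fn = len(ground_truth - selected)   # correct but not selected
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f_measure = 2 * precision * recall / total if total else 0.0
    return precision, recall, f_measure

# Four selected correspondences, three of them among five correct ones.
p, r, f = precision_recall_f([0, 1, 2, 3], [1, 2, 3, 4, 5])
```

Because the F-measure is a harmonic mean, an algorithm that keeps only a handful of very safe correspondences scores high precision but is penalized through recall.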

Verification experiments
In this subsection, the first experiment is used to verify Theorem 1. According to Theorem 1, each object corresponds to a fundamental matrix. Figure 7(a) shows the initial correspondences obtained by ORB. The correct correspondences on $O_1$ and $O_2$ are selected manually to calculate the fundamental matrices $F_{o1}$ and $F_{o2}$, respectively. Figure 7(b) and (c) show the results of correspondence selection based on $F_{o1}$ and $F_{o2}$, respectively. From Figure 7, it can be seen that $F_{o1}$ and $F_{o2}$ select the correspondences on $O_1$ and $O_2$, respectively. The experimental result agrees with Theorem 1.
To justify the effects of the two proposed constraints, we also conducted two comparison experiments, both on a subset of the five datasets. The first comparison experiment examines the effect of the same object constraint, and the results are shown in Table 1. The precision of GMS is higher than that of the initial correspondences, which means that GMS selects more robust correspondences from the initial correspondences. When the initial correspondences are used to match the same objects, the precision and F-measure of the selected correspondences improve over the initial correspondences, which shows that the same object constraint has a positive effect on correspondence selection. When the robust correspondences selected by GMS are used to match the same objects in image pairs, the best precision and F-measure are obtained. This experiment shows that using robust correspondences for object matching improves the performance of the same object constraint.
The second comparison experiment examines the effect of the same position on the same object constraint, and the results are shown in Table 2. First the correspondences selected by the same object constraint using GMS are obtained, and then the same position constraint is applied. The same position constraint is tested in two ways: one uses the seed fundamental matrices $F^0$ to select the correct correspondences, and the other uses $F^l$. Table 2 shows that the same position constraint raises the precision and F-measure considerably, which indicates that it further improves the performance of our algorithm on top of the same object constraint. Table 2 also shows that the same position constraint with $F^l$ performs better than with $F^0$ alone. Because the correspondences in $S^0$ are neither reliable nor numerous enough, the calculated fundamental matrix set $F^0$ is not accurate enough, which results in many correct correspondences being ignored. But by iteration, the proposed IFMEM method can use more and more robust correspondences to calculate the fundamental matrix.

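Algorithm 1.
Input: the initial correspondences $C_{initial}$.
Output: the correspondences $C_{selected}$ selected by our algorithm.
1: By the segmentation algorithm, 26 images $I_A$ and $I_B$ are divided into regions $Re_A$ and $Re_B$, respectively.
2: By the GMS method the reliable correspondences are selected, and they are used to establish the mapping from $Re_A$ to $Re_B$ and the mapping from $Re_B$ to $Re_A$ [equations (1) and (2)].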

Contrast experiments of different methods
Our method is also compared with competitive correspondence selection methods such as RANSAC, GMS, and VFC on the five datasets. Table 3 shows the performance of these correspondence selection methods. Because different thresholds of an algorithm affect precision and recall differently, the precision and recall in Table 3 are reported at the threshold that gives the highest F-measure for each algorithm. Owing to similar textures in these datasets, there are many incorrect initial correspondences, for example in the result shown in Figure 8(a). Table 3 shows that our algorithm achieves the best precision and F-measure on the first four datasets. On MultiObjects, our method achieves the best F-measure. Although RANSAC achieves the best precision there, its selected correspondences often lie on only one object, and the correspondences on other objects are often discarded, as shown in Figure 9(a). In addition, our method and GMS are both tested on the TUM-RGB dataset, 28 and the results are shown in Table 4. Because many images in this dataset are consecutive, both methods achieve higher accuracy than on the previous datasets. Table 4 shows that GMS achieves higher precision, whereas our method achieves higher recall and F-measure. This is because our method fits a fundamental matrix to the correspondences of every matched object, so more relatively robust correspondences are selected than with GMS, and our method obtains a better overall result. Furthermore, we analyze the performance of the different correspondence selection algorithms and explain the results on a pair of images taken from the Person dataset and shown in Figure 8.
Figure 8(a) shows the initial correspondences obtained by ORB, and Figure 8(b)-(e) show the correspondence selection results of RANSAC, GMS, VFC, and our method, respectively. Table 5 records the indicators of these correspondence selection algorithms on this image pair. The image pair contains many similar textures, the camera motion between the two images is large, and there are many wrong initial correspondences in Figure 8(a). RANSAC often performs excellently when there are few wrong correspondences, but in these datasets the images contain many similar textures (e.g. leaves), which cause many wrong correspondences, so the model estimated by RANSAC is strongly affected. GMS often selects correspondences that match the same regions, but it cannot judge whether correspondences are true or false at the pixel level, so its precision is lower than that of RANSAC. Our method matches many of the same objects across the image pair and removes the wrong correspondences that link different objects. It then calculates the fundamental matrix of each object, and with IFMEM more and more robust correspondences are used to calculate the fundamental matrix, which improves its precision. With these fundamental matrices, many correct correspondences are selected. The Precision-Recall (PR) curves are drawn in Figure 10(a)-(e), obtained by varying the thresholds of these algorithms on Graham, Gerrard, Person, South, and MultiObjects. In the first four figures, the curves of our approach lie to the upper right of the other PR curves, which further indicates that our method performs best. In the last figure, our method achieves a high recall while maintaining a high accuracy, which also indicates better performance.

Conclusion
In this paper, we have proposed the correspondence selection method SOSPC to select the correct correspondences from the initial correspondences. In our approach, the same object constraint and the same position on the same object constraint are integrated into the correspondence selection algorithm. To match the same objects in an image pair, we propose an object matching method that uses the correspondences selected by GMS. To improve the precision of the fundamental matrix, we propose the IFMEM method, which calculates the fundamental matrix iteratively. Finally, we have conducted several experiments that verify the effectiveness of the proposed constraints and show that our approach achieves the best performance among the compared correspondence selection algorithms on the given datasets.

Figure 1. (a) The initial correspondences obtained by the A-KAZE algorithm, 8 (b) the initial correspondences obtained by the ORB algorithm, 7 and (c) the ORB correspondences selected by the RANSAC algorithm. 11

Figure 2. (a) The divided regions in $I_A$ and (b) the divided regions in $I_B$. The arrow from $Re_{a1}$ to $Re_{b1}$ indicates that the corresponding region of $Re_{a1}$ in $I_B$ is $Re_{b1}$.

Theorem 1. Each object corresponds to a fundamental matrix. If the relative positions of the objects are unchanged in $I_A$ and $I_B$, the objects correspond to the same fundamental matrix; otherwise, they correspond to different fundamental matrices. Proof: As shown in Figure 4, $O_a$ and $O_b$ are two objects. $O_{a1}$ denotes $O_a$ captured by $C_1$, and $O_{a2}$ denotes $O_a$ captured by $C_2$. The coordinate system of $O_a$ when $C_1$ captures $I_A$ is denoted by $a_1$, and the coordinate system of $O_a$ when $C_2$ captures $I_B$ is denoted by $a_2$. $R_{a_1 b_1}$

${}^{a}R_{c_1 c_2}$ and ${}^{b}R_{c_1 c_2}$ are the same, and ${}^{a}t_{c_1 c_2}$ and ${}^{b}t_{c_1 c_2}$ are also the same; thus $O_a$ and $O_b$ correspond to the same fundamental matrix in image pairs. (b) When the relative position between $O_a$ and $O_b$ is changed in images $I_A$ and $I_B$:

Figure 4. The diagram of the relative position of objects in image pairs: (a) the relative positions of the objects are unchanged in the image pair, (b) the relative positions of the objects are changed in the image pair, (c) the situation into which Figure 4(b) is converted when the coordinate system of $O_{a1}$ is assumed static, and (d) the situation into which Figure 4(b) is converted when the coordinate system of $O_{b1}$ is assumed static.

Figure 5. The illustration of epipolar distance. $O_1$ is the optical center of the camera that takes image $I_A$, and $O_2$ is the optical center of the camera that takes image $I_B$. $d_1$ and $d_2$ are both epipolar distances.

Figure 6. The workflow of SOSPC with the same object and same position constraints.

Figure 7. The experiments to verify that each object corresponds to a fundamental matrix: (a) the initial correspondences, (b) the correspondences selected by $F_{o1}$, and (c) the correspondences selected by $F_{o2}$.

Figure 8. (a) The initial matches obtained by ORB, (b) the matches selected by RANSAC, (c) the matches selected by GMS, (d) the matches selected by VFC, and (e) the matches selected by our method SOSPC.

Figure 9. (a) The matches selected by RANSAC, (b) the matches selected by GMS, (c) the matches selected by VFC, and (d) the matches selected by our method SOSPC.

images $I_A$ and $I_B$ are divided into regions $Re_A$ and $Re_B$, respectively. 2: By the GMS method the reliable correspondences are selected, and they are used to establish the mapping from $Re_A$ to $Re_B$ and the mapping from $Re_B$ to $Re_A$ [equations (1) and (2)].

Table 1. The experiment about the same object constraint.

Table 2. The experiment about the same position constraint.

Table 3. Evaluation results of the five correspondence selection algorithms on the five datasets.

Table 4. Evaluation results of GMS and our method on the TUM-RGB dataset.

Table 5. Evaluation results on an image pair.