Three-dimensional nonrigid reconstruction based on probability model

Most approaches to nonrigid motion reconstruction are shape-based; however, trajectory-based methods built on the discrete cosine transform (DCT) are also prominent. Because of the transform characteristics of the DCT, the correlation of the data is well extracted, so that the data are compressed more effectively. However, it is important to select the number and sequence of DCT trajectory basis appropriately. A high value of K (the number of trajectory basis) increases both the reconstruction error and the computational cost. On the other hand, a low value of K excludes information components, so that the structure of the object cannot be fully represented and accuracy is poor. Once the number of trajectory basis is determined, their combination also has a considerable influence on the reconstruction algorithm. This article selects an appropriate number and combination of trajectory basis by analyzing the spectrum of the re-projection errors, thereby realizing automatic selection of the trajectory basis. Then, by combining this with the probability framework of the matrix normal distribution of a low-rank model, the energy information of the high-frequency part is retained, which not only helps maintain accuracy but also improves reconstruction efficiency. The proposed method can reconstruct the three-dimensional structure of sparse data under more precise prior conditions and at lower computational cost.


Introduction
Three-dimensional (3-D) reconstruction encompasses many fields such as image processing, stereoscopic vision, and biological engineering, and has attracted considerable research interest in computer vision. The 3-D motion reconstruction of a nonrigid body is an important technique for virtual representation of the objective world. Generally, 3-D motion reconstruction involves recovering camera rotation matrix R and 3-D structure S of a nonrigid body from a given set of 2-D dynamic image sequences.
There are currently four mainstream reconstruction schemes: shape-based 3-D reconstruction, 1 force-based 3-D reconstruction, 3-D reconstruction based on the shape trajectory, and trajectory-based 3-D reconstruction. The shape-based approach introduces a large number of unknowns, which makes the algorithm complex and limits its scope. The advantage of force-based 3-D reconstruction is that it formulates the problem in the low-rank force space of the deformation, which better explains the acquired a priori information and represents the behavior of the actual object more accurately. However, during reconstruction, in addition to determining the forces, the rotation, and the structure S, the elastic model of the object must also be estimated, which increases the uncertainty of the reconstruction result. The advantage of 3-D reconstruction based on the shape trajectory is that it combines the strengths of the trajectory basis and the shape basis; its disadvantage is that additional a priori unknowns are introduced. Although the trajectory-based method overcomes the limitations of the above three methods, the number and type of predefined trajectory basis are difficult to select, and both directly affect the reconstruction accuracy. In particular, the main difficulty in nonrigid structure from motion (NRSFM) is that many different 3-D shapes can produce similar observed images, so the re-projection constraints alone are not sufficient to obtain a unique shape. Therefore, more prior knowledge of the deformation of the structure and the motion of the camera is needed. The automatic basis selection-probability model proposed in this article can effectively solve the problems caused by the number of trajectory basis, the combination of trajectory basis, and complex prior knowledge.

Related work
Most existing methods adopt the matrix factorization used in rigid reconstruction 5-9 and use prior information in the form of a low-rank shape basis. 10,11 Similarly, a low-rank model has been proposed that constrains the motion of every point on the object through a predefined trajectory basis. 3 The disadvantage of these methods is that they must factorize a matrix whose size is proportional to the number of input points, so they can only be applied to relatively low-resolution shapes.
In the NRSFM, recovering the camera motion and the time-varying 3-D shape from 2-D point tracks alone is an under-constrained problem, because different 3-D objects can produce similar 2-D observations in the camera. A few algorithms have been proposed that impose rigid constraints on a nonrigid body when solving its 3-D motion reconstruction by factorization. Costeira and Kanade 12 constructed an orthogonal projection model and then applied the factorization method to reconstruct the structure and motion of independently moving objects. However, its application scope was limited, and it could not handle a changing linear combination. Bascle and Blake 13 proposed decomposing the shape of the reconstruction target into a linear combination of basic shapes, which reduced the reconstruction problem to solving for the basic shape coefficients; this was also the prototype of shape-basis models for the 3-D motion reconstruction of a nonrigid body. Torresani et al. 14 adopted low-rank constraints to track a nonrigid body, that is, a time constraint method combining the simulation of the shape basis coefficients as a linear dynamic system 15 and the establishment of a nonrigid body deformation distribution model with layered factors. 10 Rabaud and Belongie 16 removed the linear basis representation and proposed a method for learning the shape structure from a video. The advantage of this method is that it incorporates temporal regularization to prevent the camera and structure from changing excessively and ambiguously between frames. Agudo and Moreno-Noguer 17 introduced the force model into 3-D nonrigid reconstruction, which has the advantage of formulating the problem in the low-rank force space of the deformation and gives a better physical interpretation of the obtained prior information.
However, if force data and real data are lacking, the reconstruction becomes difficult. Recently, research on NRSFM has introduced the concept of compressibility to enforce a union of subspaces, in which each shape instance 18 uses a different set of shape basis. Applying the above models converts the NRSFM into a trilinear problem, which can be solved using factorization 11 or an optimization strategy that enforces spatial 10 or temporal 19,20 smoothness, or compact 3-D shapes. 21 The advantage of this approach is that it can reduce excessive and ambiguous changes of the camera and structure between frames.
Akhter et al. 22 proposed that the trajectory of each point can be restricted to a low-dimensional subspace. The advantage of this trajectory-based method is that an object-independent basis, such as the discrete cosine transform (DCT), can be used to reduce the number of unknown parameters and improve the accuracy of the 3-D structure. Gotardo and Martinez 23 combined the shape basis and the trajectory basis by modeling the shape basis coefficients with the DCT. Zhu et al. 24 pointed out that sequences with poor reconstruction ability could be remedied by adding rigid key frames. They also emphasized the necessity of selecting the number of trajectory basis K, instead of using all the DCT basis or applying regularization to the coefficient vectors to obtain a sparse solution set. However, Zhu et al. 24 could not exploit the known distribution of DCT coefficients in natural signals. 25 In addition, many methods use a predefined basis to restrict the trajectory of each target point, thus transforming the trilinear problem into a bilinear problem, 22 which greatly simplifies the computation. 26 In the study of Valmadre and Lucey, 27 a prior on the trajectory is introduced by using the differentials of the 3-D points. Combining the shape basis with the trajectory basis has the advantage that it can generate a temporally smooth trajectory of a nonrigid shape in the linear shape space. 3 In this study, we first use automatic selection of the trajectory basis, which not only avoids the overly large or overly small K values used in previous methods but also selects a reasonable combination of trajectory basis, so that the efficiency and accuracy of the reconstructed structure are maximized.
Secondly, a matrix normal distribution is used to model the known trajectory space, 10,16 which combines spatial smoothness with the inherent temporal smoothness of the subspace. Building on the probabilistic model proposed by Agudo and Moreno-Noguer, 28 this study adds more accurate prior information and provides a more accurate decomposition. The experiments show the accuracy, versatility, and efficiency of our approach on sparse data sets.

Low-rank model NRSFM
The standard matrix decomposition method is generally used for the NRSFM problem. Suppose the motion model being studied consists of $F$ frames, with $P$ feature points marked on each frame, and let $\mathbf{w}^i_j = [u^i_j, v^i_j]^{T}$ denote the 2-D image projection of feature point $j$ in frame $i$. All frames and feature points are arranged into a $2F \times P$ observation matrix $W$ as follows

$$W = \begin{bmatrix} \mathbf{w}^1_1 & \cdots & \mathbf{w}^1_P \\ \vdots & \ddots & \vdots \\ \mathbf{w}^F_1 & \cdots & \mathbf{w}^F_P \end{bmatrix}$$

According to Agudo and Moreno-Noguer, 28 $\mathbf{w}^i_j$ is the zero-mean coordinate, which is obtained by subtracting the average translation vector from the original coordinate. Unlike other sparse 22 technologies, the method proposed in this article can deal with track points lost owing to occlusion or outliers. We initialize the low-rank trajectory prediction model with the 2-D values $\mathbf{w}^i_j$. These initial predictions are then used to calculate $R$ and $\Phi$, where $\Phi$ is a matrix of unknown coefficient vectors $\boldsymbol{\phi}_j \in \mathbb{R}^{3K \times 1}$, one for each of the points $j \in \{1, \ldots, P\}$. After estimating $R$ and $\Phi$, the trajectory $\mathbf{w}_j$ of point $j$ (the $j$-th column of $W$) can be further refined as follows

$$\mathbf{w}_j = R\,(B \otimes I_3)\,\boldsymbol{\phi}_j$$

where $\otimes$ represents the Kronecker product. Matrix $B$ is the trajectory basis matrix obtained after automatically selecting the basis, and matrix $R$ is block diagonal, consisting of the $2 \times 3$ rotations $R_i \in \mathbb{R}^{2 \times 3}$ of an orthographic camera. In short, the NRSFM problem can be stated as follows: given a 2-D trajectory matrix $W$, recover the pose parameters $R$ and the 3-D shape $S$ simultaneously. The observation matrix $W$ can be decomposed into $W = RS$, with $R \in \mathbb{R}^{2F \times 3F}$ and $S \in \mathbb{R}^{3F \times P}$. Then, we further decompose it into $W = \Lambda A$, with $\Lambda \in \mathbb{R}^{2F \times 3K}$ and $A \in \mathbb{R}^{3K \times P}$, so that the maximum rank of matrix $W$ is $3K$. We can use singular value decomposition (SVD) to decompose it as follows

$$W = \hat{\Lambda}\hat{A}$$

Updating the transition matrix

Because the factorization obtained from the SVD is not unique, the matrices $\hat{\Lambda}$ and $\hat{A}$ are in general different from $\Lambda$ and $A$, respectively. For any invertible correction matrix $Q \in \mathbb{R}^{3K \times 3K}$, $\hat{\Lambda}Q$ and $Q^{-1}\hat{A}$ are also valid factorizations.
Therefore, to restore the true structure, we can obtain

$$\Lambda = \hat{\Lambda}Q, \qquad A = Q^{-1}\hat{A}$$

Instead of the entire matrix $Q$, we only need to estimate three columns of $Q$, denoted $Q_{|||}$, to correct $\hat{\Lambda}$ and $\hat{A}$. Thus, for $F$ frames, there are $3F$ constraints and $9K$ unknown parameters in $Q_{|||}$. The rotation matrix $R$ can be estimated by determining $Q_{|||}$.
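The rank bound and the factorization ambiguity described above can be checked numerically. The following is a minimal sketch on synthetic data, not the authors' implementation: it assumes a sequential orthonormal DCT-II basis for $B$ (of size $F \times K$), random rotations, and random coefficients; the helper names `dct_basis` and `random_orthographic_rotations` are hypothetical.

```python
import numpy as np

def dct_basis(F, K):
    """First K columns of the orthonormal DCT-II basis for F frames (F x K)."""
    n = np.arange(F)[:, None]                  # frame index
    k = np.arange(K)[None, :]                  # frequency index
    B = np.sqrt(2.0 / F) * np.cos(np.pi * (2 * n + 1) * k / (2 * F))
    B[:, 0] /= np.sqrt(2.0)                    # rescale the DC column to unit norm
    return B

def random_orthographic_rotations(F, rng):
    """Block-diagonal R (2F x 3F): two orthonormal rows of a random 3x3 orthogonal matrix per frame."""
    R = np.zeros((2 * F, 3 * F))
    for f in range(F):
        Q3, _ = np.linalg.qr(rng.standard_normal((3, 3)))
        R[2 * f:2 * f + 2, 3 * f:3 * f + 3] = Q3[:2]
    return R

rng = np.random.default_rng(0)
F, P, K = 40, 25, 3                            # frames, points, trajectory basis (synthetic sizes)

B = dct_basis(F, K)                            # F x K trajectory basis
Phi = rng.standard_normal((3 * K, P))          # 3K x P coefficients (synthetic)
R = random_orthographic_rotations(F, rng)      # 2F x 3F camera matrix
S = np.kron(B, np.eye(3)) @ Phi                # 3F x P structure, S = (B kron I_3) Phi
W = R @ S                                      # 2F x P observations

# SVD factorization W = Lam_hat @ A_hat, truncated to 3K columns/rows.
U, s, Vt = np.linalg.svd(W, full_matrices=False)
rank = int(np.sum(s > 1e-9 * s[0]))
Lam_hat = U[:, :3 * K] * np.sqrt(s[:3 * K])
A_hat = np.sqrt(s[:3 * K])[:, None] * Vt[:3 * K]

# Any invertible Q gives another valid factorization of the same W.
Qm = rng.standard_normal((3 * K, 3 * K))
same_W = np.allclose((Lam_hat @ Qm) @ (np.linalg.inv(Qm) @ A_hat), W)
print(rank, same_W)
```

The sketch only illustrates why a corrective matrix $Q$ is needed; estimating $Q_{|||}$ from the orthonormality constraints is not shown.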

Automatic base selection
The type, number, and combination of trajectory basis considerably influence the performance of the NRSFM algorithm, and the DCT trajectory basis is the optimal general trajectory basis. Once the type of trajectory basis is determined, it is important to select the number and combination of trajectory basis.
In this article, an automatic selection algorithm for the trajectory basis is proposed. 29 The error between the actual 3-D structure S and the structure S_1 obtained from the first SVD decomposition is analyzed in the frequency domain. In addition, the K value is expanded and the resulting 3-D shape error is compared with that of the restoration using the sequential trajectory basis. The optimal trajectory basis is then selected, which greatly reduces the 3-D reconstruction error.
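The frequency-domain selection described here can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the published algorithm: the per-frame error is assumed to be the mean Euclidean point error, the spectrum is computed with an orthonormal DCT-II, and the returned amplitude ratio is a hypothetical stand-in for the check against the error level. The function names are hypothetical, and indices are 0-based.

```python
import numpy as np

def dct_matrix(F):
    """Full orthonormal DCT-II matrix (F x F); column k is the k-th basis vector."""
    n = np.arange(F)[:, None]
    k = np.arange(F)[None, :]
    B = np.sqrt(2.0 / F) * np.cos(np.pi * (2 * n + 1) * k / (2 * F))
    B[:, 0] /= np.sqrt(2.0)
    return B

def per_frame_error(S, S1):
    """Assumed error measure: mean Euclidean point error per frame.

    S and S1 are 3F x P structures whose rows 3f, 3f+1, 3f+2 hold x, y, z of frame f.
    """
    F = S.shape[0] // 3
    diff = (S - S1).reshape(F, 3, -1)
    return np.linalg.norm(diff, axis=1).mean(axis=1)

def select_trajectory_basis(frame_error, K):
    """Pick the K DCT frequencies with the largest error-spectrum amplitude.

    Returns the sorted 0-based frequency indices and the fraction of total
    spectrum amplitude they carry (an assumed acceptance measure).
    """
    G = dct_matrix(frame_error.shape[0]).T @ frame_error   # 1-D DCT of the error
    amp = np.abs(G)
    chosen = np.sort(np.argsort(amp)[::-1][:K])
    ratio = amp[chosen].sum() / amp.sum()
    return chosen, ratio

# Usage on a synthetic error signal made of three known DCT components.
F = 64
Bfull = dct_matrix(F)
err = 2.0 * Bfull[:, 3] + 1.5 * Bfull[:, 5] + 1.0 * Bfull[:, 10]
indices, ratio = select_trajectory_basis(err, K=3)
print(indices, ratio)
```

Selecting by amplitude rather than taking the first K frequencies is what distinguishes the automatic combination from the sequential one.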

Automatically select trajectory basis-probability model
Solution and improvement of correlation matrix C

In real life, the deformations observed in motion are often not isolated, such as the movement of the face or of the entire body, and there is similarity between the points of the moving object. Therefore, in the matrix normal distribution, we use a symmetric matrix C as a covariance matrix. Then, we assume that the observation matrix W is formed by a low-rank matrix C combined with a noise term E. Therefore, the following idealized robust principal components analysis (PCA) problem can be obtained.
For the observation matrix $W$, we write $W = C + E$, according to Liu et al., 30 where $C$ is a low-rank matrix and $E$ is a sparse matrix. We can obtain a conceptual solution to the above problem, which can be expressed as follows

$$\min_{C,E}\ \|C\|_{*} + \lambda\|E\|_{1} \quad \text{s.t.} \quad W = C + E$$

where $\|\cdot\|_{*}$ is the nuclear norm, $\|\cdot\|_{1}$ is the $L_1$ norm, and $\lambda$ weights the sparse term. Both the $L_1$ norm and the $L_0$ norm promote sparsity, and $L_1$ is widely used because its optimization properties are superior to those of $L_0$. Agudo and Moreno-Noguer 28 introduced $C$ into the covariance matrix of the probability model. Following Costeira and Kanade, 12 the method for solving the covariance matrix is modified, and the expression of the robust PCA problem is converted accordingly. This problem can be solved with the exact augmented Lagrange multiplier method; however, its computational cost is relatively high. Therefore, an inexact augmented Lagrange multiplier (IALM) method is used instead. Compared with the exact algorithm, the inexact Lagrange multiplier method 31 considerably improves computing speed while maintaining accuracy.
As the penalty parameter $\mu_k$ grows, the exact Lagrange multiplier method converges linearly, and the faster $\mu_k$ grows, the faster the convergence. However, when $\mu_k$ is very large, the solution of the subproblem converges slowly. Therefore, the IALM algorithm starts by reducing the computation time of the subproblems.
Automatically select the trajectory basis algorithm

(1) A nonrigid body structure $S_1$ is obtained by SVD decomposition of the observation matrix $W$.
(2) For each frame, calculate the error errS(j) between the actual nonrigid body structure $S$ and the structure $S_1$ obtained from the SVD, where p is the number of columns in the observation matrix.
(3) Apply the 1-D DCT to the error errS(j), where z is the number of frames, and determine the spectrum $G_i$ and its magnitude $|G_i|$.
(4) Select a combination of trajectory basis according to the obtained error spectrum amplitude: select the K frequency points with the largest amplitude, use them to represent the actual error level $\tau$, and check whether the corresponding amplitudes of the selected K points satisfy the selection condition.

In the exact Lagrange multiplier method, the iterative thresholding (IT) algorithm is used to solve the subproblem

$$(C_{k+1}, E_{k+1}) = \arg\min_{C,E}\ \|C\|_{*} + \lambda\|E\|_{1} + \langle Y_k, W - C - E\rangle + \frac{\mu_k}{2}\|W - C - E\|_F^2$$

where $C_{k+1}$ and $E_{k+1}$ are, respectively, the values of $C$ and $E$ updated after the $k$th iteration, $\mu_k$ is the penalty parameter, and $Y_k$ is the Lagrange multiplier. According to the experiment, 31 the above subproblem does not need to be solved exactly for $C_{k+1}$ and $E_{k+1}$. A single SVD step yields a sufficiently good approximation of the optimal solution to achieve the desired effect. Therefore, in the IALM algorithm, we remove the iterative solution of the subproblem by the IT algorithm and replace it with a direct SVD solution, as follows

$$(U, \Gamma, V) = \operatorname{svd}\!\left(W - E_k + \mu_k^{-1}Y_k\right), \qquad C_{k+1} = U\,\mathcal{S}_{\mu_k^{-1}}[\Gamma]\,V^{T}$$

$$E_{k+1} = \mathcal{S}_{\lambda\mu_k^{-1}}\!\left[W - C_{k+1} + \mu_k^{-1}Y_k\right]$$

Among them, $\mathcal{S}_{\varepsilon}[x] = \operatorname{sign}(x)\max(|x| - \varepsilon, 0)$ is the soft threshold applied to $x$. In this manner, one layer of the iterative loop is removed, considerably reducing the calculation time. After obtaining the accelerated correlation matrix $C_{k+1}$, we normalize it and force its diagonal to be unity, using the Hadamard product $\odot$ and the vector of ones $\mathbf{1}_N$.
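The inexact augmented Lagrange multiplier iteration for the low-rank plus sparse split $W = C + E$ can be sketched as below. This is a simplified version of the commonly used IALM scheme for robust PCA, not the authors' code: the initialization, the weight `lam = 1/sqrt(max(m, n))`, and the growth factor `rho = 1.5` are conventional defaults assumed here, the penalty is increased at every iteration without an upper cap, and the final diagonal normalization of the correlation matrix is not included.

```python
import numpy as np

def soft_threshold(x, eps):
    """Element-wise soft threshold: sign(x) * max(|x| - eps, 0)."""
    return np.sign(x) * np.maximum(np.abs(x) - eps, 0.0)

def rpca_ialm(W, lam=None, rho=1.5, tol=1e-7, max_iter=500):
    """Split W into a low-rank part C and a sparse part E with an inexact ALM loop."""
    m, n = W.shape
    if lam is None:
        lam = 1.0 / np.sqrt(max(m, n))          # assumed default weight of the sparse term
    norm_W = np.linalg.norm(W, 'fro')
    spec = np.linalg.norm(W, 2)                 # largest singular value of W
    Y = W / max(spec, np.abs(W).max() / lam)    # Lagrange multiplier initialization
    mu = 1.25 / spec                            # initial penalty parameter
    E = np.zeros_like(W)
    C = np.zeros_like(W)
    for _ in range(max_iter):
        # One SVD with singular-value thresholding replaces the inner iterative solver.
        U, s, Vt = np.linalg.svd(W - E + Y / mu, full_matrices=False)
        C = (U * soft_threshold(s, 1.0 / mu)) @ Vt
        E = soft_threshold(W - C + Y / mu, lam / mu)
        resid = W - C - E
        Y = Y + mu * resid                      # multiplier update
        mu = rho * mu                           # grow the penalty parameter
        if np.linalg.norm(resid, 'fro') <= tol * norm_W:
            break
    return C, E

# Usage on synthetic data: rank-2 matrix plus 5% sparse outliers.
rng = np.random.default_rng(1)
L0 = rng.standard_normal((60, 2)) @ rng.standard_normal((2, 60))
E0 = np.zeros((60, 60))
mask = rng.random((60, 60)) < 0.05
E0[mask] = 5.0 * rng.choice([-1.0, 1.0], size=int(mask.sum()))
W_obs = L0 + E0
C_hat, E_hat = rpca_ialm(W_obs)
print(np.linalg.norm(C_hat - L0) / np.linalg.norm(L0))
```

Each outer iteration costs one SVD, which is the saving relative to solving the inner subproblem to convergence.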

Adjusting prior rows and column covariance
The matrix normal distribution uses a Kronecker-structured covariance, which provides a natural way of combining the dependencies in matrix-valued data. Specifically, the normal random variable X is itself a matrix. The distribution is parameterized by a mean matrix and two covariance matrices, which represent the covariance of the rows and columns of matrix X. The prior covariance matrix has a considerable influence on the 3-D structure restoration. A properly processed prior covariance can increase the stability and accuracy of 3-D structure restoration. In other words, a more accurate prior covariance provides a better restoration of the 3-D structure movement. Therefore, we adjusted the initial row covariance $I_{3K}$ and column covariance inverse $C^{*-1}$ accordingly, where K is the number of trajectory basis

$$\operatorname{vec}(\Phi) \sim \mathcal{N}\!\left(\mathbf{0};\ \Sigma_c \otimes \Sigma_s\right) \tag{17}$$

Then, $Y$ follows a matrix normal distribution (equation (18)), which, according to Gupta and Nagar, 32 can be simplified (equation (19)). The logarithmic likelihood function of the parameters then follows (equation (20)), from which the maximum likelihood estimation (MLE) can be obtained (equation (21)). However, for any $k \neq 0$, if we define $\tilde{\Sigma}_c = k\Sigma_c$ and $\tilde{\Sigma}_s = \Sigma_s/k$, then $\tilde{\Sigma}_c \otimes \tilde{\Sigma}_s = \Sigma_c \otimes \Sigma_s$. Both estimates produce the same covariance matrix of the population. To solve this problem, according to Glanz and Carvalho, 33 we propose the following modification to the model

$$\operatorname{vec}(Y) \sim \mathcal{N}\!\left(\operatorname{vec}(M);\ \varsigma^2\,\Sigma_c \otimes \Sigma_s\right) \tag{22}$$

where $M$ is the mean matrix. To estimate $\varsigma^2$, a new likelihood function can be obtained in a manner similar to equation (20), and the MLE of $\varsigma^2$ follows from it. However, because the variance scale is determined by $\varsigma^2$, when $\varsigma^2 < 1$, the update of the matrices $\hat{\Sigma}_c$ and $\hat{\Sigma}_s$ will be abnormal. This study uses the empirical value instead. We need to consider the scale constraints of $\Sigma_c$ and $\Sigma_s$ when deriving their MLE. For this, we use the results of Glanz and Carvalho. 33

Parameter solution

The observed 2-D trajectory point matrix $W$ is corrupted by Gaussian noise, which is represented by the matrix $N \in \mathbb{R}^{2F \times P}$, and $W$ is redefined accordingly. According to Agudo and Moreno-Noguer, 28 we can include the accelerated correlation matrix $C^{*}$ in the probability model in the form of a covariance, where $\operatorname{vec}(\cdot)$ represents the vectorization operator of a matrix.

E step: In step E, we estimate the conditional distribution of the latent variable $Y$ by applying Bayes' rule. According to the properties of the matrix-variate normal distribution, 32,34 this distribution is also a Gaussian distribution.

M step: We update the model parameters $A$ and $\sigma^2$, where the non-singular matrix $D$ corresponds to the covariance matrix of the centered observations.

Experimental results

Selection of base
The number and combination of trajectory basis have a considerable influence on the structural error when reconstructing 3-D nonrigid bodies. This study uses spectrum analysis to analyze the per-frame error between the actual nonrigid structure S of a known data set and the nonrigid structure S_1 obtained from the SVD decomposition of the observation matrix W. From this, the number of trajectory basis K and their combination are determined. Then, the structural errors obtained with the automatically selected trajectory basis and with the sequential trajectory basis are compared, and the combination of trajectory basis with the smaller reconstruction error is taken as the final trajectory basis. The recovery method of Figure 1 is based on previous studies, 6-9 which also employ a similar nonrigid reconstruction method. Figure 1 shows the 3-D nonrigid structure of the yoga data set restored with the trajectory basis combination shown in the literature. 3,5,10,11,12,14,15,16,19 From left to right, the images are the elevation views of frames 50, 140, 210, and 240. Figure 2 compares the mean 3-D point errors on the yoga data set at frames 50, 100, 150, 200, 250, and 300 for the automatic selection method and the sequential selection method, where the sequential method uses the first K trajectory basis. It can be clearly seen that the automatic selection of basis is superior to the sequential selection of basis.
However, different data sets adopt different trajectory basis forms. For example, in the drink data set, the sequential selection of basis performs better than the automatic selection of basis, as shown in Figure 3.

Method comparison
For the quantitative evaluation, we follow the indicators used in the studies of Dai et al. 35 and Gotardo and Martinez 3 and report the average rotation error $e_R$ and the standardized average 3-D error $e_S$, which are defined as

$$e_R = \frac{1}{F}\sum_{f=1}^{F}\left\|R_f - \hat{R}_f\right\|_F$$

where, in frame $f$, $\hat{R}_f$ is the estimated rotation matrix and $R_f$ is the corresponding ground-truth rotation. $e_S$ is calculated as

$$e_S = \frac{1}{\sigma F P}\sum_{f=1}^{F}\sum_{p=1}^{P} e_{fp}, \qquad \sigma = \frac{1}{3F}\sum_{f=1}^{F}\left(\sigma_{fx} + \sigma_{fy} + \sigma_{fz}\right)$$

where $e_{fp}$ is the 3-D reconstruction error of point $p$ in frame $f$, and $\sigma_{fx}$, $\sigma_{fy}$, and $\sigma_{fz}$ represent the standard deviations of the x, y, and z coordinates of the original shape in frame $f$. When ground-truth 3-D data or rotation data are available, we provide $e_S$ and $e_R$, respectively, where K is the number of trajectory basis. Table 1 shows the errors of the five 3-D reconstruction methods. The missing entry in Table 1 is due to the lack of a ground-truth rotation matrix R for the dance data, so that this error cannot be analyzed.
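The two indicators can be computed as in the following minimal sketch. It assumes the standard definitions from the cited studies: $e_R$ is the mean Frobenius norm of the per-frame rotation difference, and $e_S$ is the mean Euclidean point error divided by the average per-frame standard deviation of the ground-truth x, y, and z coordinates. The array layouts and function names are assumptions made for illustration.

```python
import numpy as np

def rotation_error(R_true, R_est):
    """Average rotation error e_R.

    R_true, R_est: arrays of shape (F, 2, 3), one orthographic rotation per frame.
    """
    return float(np.mean(np.linalg.norm(R_true - R_est, axis=(1, 2))))

def normalized_3d_error(S_true, S_est):
    """Standardized average 3-D error e_S.

    S_true, S_est: arrays of shape (F, 3, P) holding x, y, z of P points per frame.
    """
    e_fp = np.linalg.norm(S_true - S_est, axis=1)     # (F, P) Euclidean point errors
    sigma = np.mean(np.std(S_true, axis=2))           # (1 / 3F) * sum of sigma_fx + sigma_fy + sigma_fz
    return float(np.mean(e_fp) / sigma)

# Usage on synthetic data: a constant unit shift along x gives e_fp = 1 everywhere.
rng = np.random.default_rng(2)
S_gt = rng.standard_normal((10, 3, 20))
shift = np.zeros((1, 3, 1))
shift[0, 0, 0] = 1.0
e_S = normalized_3d_error(S_gt, S_gt + shift)
R_gt = rng.standard_normal((10, 2, 3))
e_R = rotation_error(R_gt, R_gt + 1.0)
print(e_S, e_R)
```

Dividing by the coordinate spread makes $e_S$ comparable across data sets of different physical scale.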
Because the code of the comparative papers 10,22,35 is not available, the errors of these methods are taken from the study of Agudo and Moreno-Noguer. 28 In the experiment, the K values of the five data sets in the subspace are yoga (K = 11), pick-up (K = 12), drink (K = 13), stretch (K = 12), and dance (K = 5). Table 1 shows the average rotation error $e_R$ and Table 2 shows the standardized average 3-D error $e_S(K)$.
Further, the stretch data set is used as the experimental data set. The blue trajectory represents the error curve after reconstruction with the automatically selected basis, and the red trajectory represents the error curve after reconstruction with the sequentially selected trajectory basis. The graph shows the coordinates of each feature point in each dimension. The reconstruction error of the automatic trajectory basis selection method was low. The reconstruction error shown in the figure was obtained from $T_x$, $T_y$, $T_z$, the initial motion trajectories before reconstruction, and $\hat{T}_x$, $\hat{T}_y$, $\hat{T}_z$, the motion trajectories obtained after reconstruction. Figures 4 and 5 show an analysis of the reconstruction errors of the trajectory basis 3,5,10,11,12,14,15,16,17,18,20 and of the first 11 trajectory basis in order; moreover, a comparison of the reconstruction errors and overall errors of each point in the X, Y, and Z coordinates is shown. Figure 6 shows, for frames 50, 100, 150, 200, 250, and 300, the percentage of points whose error is lower than that obtained with the sequentially selected basis. It can be clearly seen that the error level of the 11 automatically selected trajectory basis is much lower than that of the trajectory basis selected in sequence. That is, the reconstruction accuracy and time are considerably improved, which demonstrates that the method proposed in this study can improve efficiency on the premise of ensuring reconstruction accuracy. Table 3 compares the execution time (in seconds) of the probabilistic correlation point trajectory approach (PCPTA) 28 with that of the block matrix method (BMM), 25 which are two highly accurate and advanced methods to recover 3-D nonrigid body structures. All methods are executed in MATLAB 2018b, where K is the number of trajectory basis.
From Table 2, we can see that the performance of the proposed method is better in terms of time and accuracy. Before acceleration, the method shows a lower error level, because the correlation matrix C is computed with higher accuracy. Although the accelerated method slightly increases the error, its speed increases significantly. Unfortunately, the source code of the first two methods is not open. Thus, we cannot complete the comparison under noisy observations. The reason why our unaccelerated method takes longer than the PCPTA method is that we use the automatic selection of basis and the ADJUST method. Figures 7 to 9 show the structure of the pick-up data set recovered by the proposed automatically selecting trajectory basis-probability model after acceleration. The sparse 3-D nonrigid body recovery structures of frames 5, 50, 150, and 200 are shown in order from left to right. Here, $\tau = 0.96$, the number of trajectory basis is K = 9, and the combination of the obtained trajectory basis positions is selected as described in the literature. 3,5,10,11,14,15,18,19 It can be seen that, compared with the previous methods, 10,22,28,35 the reduction in the number of trajectory basis does not significantly affect the recovery accuracy but can improve the recovery efficiency.

Figure 11 shows an example of the cubes toy, with the recovery structures of frames 1, 72, 197, and 200, respectively. In the cubes model, the number of trajectory basis is K = 2, and the trajectory is selected as described in the literature. 5,10 It should be mentioned that although both cubes are rigid individually, they are connected as a whole by a wire. So, when another wire that is connected to one cube is pulled, they move like a nonrigid body as a whole. This kind of movement is simpler than that of the human body or the dinosaur toy. Therefore, the value of K here is relatively low.
In the dinosaur model, the trajectory is selected as described in the literature. 3,5,10,12,13,14,15,16,18,19,24 It can be seen from the figure that the method in this article can recover the 3-D nonrigid structure of the two toy models well. The data set is from http://mocap.cs.cmu.edu. In particular, the dinosaur toy and cubes toy examples lack the ground-truth 3-D structure matrix S, so the standardized average 3-D error $e_S$ cannot be analyzed.

Adjustment to $\varsigma^2$
For the dance data set, when the parameter $\varsigma^2$ in equations (5) to (13) is obtained as 0.2072, which is less than 1, $\hat{\Sigma}_c$

Figure 6. Stretch data set: percentage of points in frames 50, 100, 150, 200, 250, and 300 for which the automatically selected basis is better than the sequential DCT basis. DCT: discrete cosine transform.

Conclusion
This study adopts a model that combines automatic selection of the trajectory basis with a probability framework. The former determines the number and combination of trajectory basis so that the trajectory basis maximizes the reconstruction accuracy when recovering nonrigid structures. The latter incorporates the low-rank trajectory model into the probability framework of a matrix normal distribution and, by using more precise prior conditions, improves the restoration efficiency of a 3-D nonrigid structure within the allowable range of error. The combination of the two methods achieves accurate reconstruction of sparse data sets. More importantly, the proposed reconstruction method is more accurate and efficient than most previous methods. In future work, we want to improve the solution of the correlation matrix C, in both accuracy and speed. In addition, we hope to find or synthesize useful dense data sets for 3-D nonrigid reconstruction. Dense data sets are closer to actual daily life activities and more authentic; however, this will be a huge challenge.