Near-field sound source localization using principal component analysis–multi-output support vector regression

In this article, principal component analysis (PCA), a method widely applied to image compression and feature extraction, is introduced to reduce the dimension of the input feature variables of support vector regression (SVR), and a method for joint estimation of near-field angle and range based on PCA dimension reduction is proposed. Signal-to-noise ratio (SNR) and computational load are the decisive factors affecting the performance of the algorithm. PCA fuses the main characteristics of the training data and discards redundant information, so the SNR is improved and the computational load is reduced accordingly. SVR is used to model the signal, and the upper triangular elements of the signal covariance matrix are usually used as its input features. Since the covariance matrix has many upper triangular elements, using them directly as input features slows the training to some extent. PCA is therefore used to reduce the dimensionality of the upper triangular elements of the covariance matrix of the known signal, and the reduced features are used as the input of a multi-output support vector regression (MSVR) machine to construct the near-field parameter estimation model, from which the parameter estimates of an unknown signal are obtained. Simulation results show that this method has high estimation accuracy and training speed, adapts well to low SNR, and outperforms the back-propagation neural network algorithm and the two-step multiple signal classification algorithm.


Introduction
The estimation of direction of arrival (DOA) has been widely used in many research fields, such as passive location, sonar array direction finding, seismic and geological resource detection, and mobile communication. Traditional DOA estimation algorithms include the maximum likelihood method,1 the propagator method,2 the multiple signal classification (MUSIC) algorithm,3 the estimation of signal parameters via rotational invariance techniques (ESPRIT) algorithm,4 and other algorithms.5,6 Traditional spatial spectrum estimation algorithms usually assume that the source is located in the far field of the array, that is, the range from the source to the array is large enough that the spherical wavefront of the radiated signal can be approximated as a plane wavefront at the receiving array. However, when the source is close to the array, the curvature of the wavefront across the aperture cannot be ignored. In that case, the location of a near-field source must be described jointly by the DOA and range parameters, so high-resolution direction-finding algorithms based on the far-field hypothesis cannot be applied directly to the near-field case. Near-field source parameter estimation has become a hot issue because of its broad applicability and engineering relevance, and many DOA estimation algorithms suited to near-field scenarios have been proposed.7-9 In terms of the MUSIC algorithm, Huang and Barkat10 extended the traditional far-field MUSIC method to near-field sources; their method requires singular value decomposition of a multidimensional matrix and a spectral peak search, so it is very computationally intensive.

(1School of Physics and Optoelectronic Engineering, Xidian University, Xi'an, China)
Starer and Nehorai11 proposed an improved MUSIC algorithm based on a path-following method, which converted the 2-dimensional (2-D) search problem of the near-field MUSIC algorithm into a 1-dimensional (1-D) search problem and estimated the 2-D parameters of the source iteratively. Challa and Shamsunder12 proposed a method based on high-order cumulants for locating near-field sources, which showed superior performance; however, this kind of method needs to compute the cumulants. Zhang et al.13 proposed a reduced-dimension MUSIC algorithm based on a directional-matrix splitting method; the problem was converted into the optimization of a reduced-dimension spectral function, so the spectral search involves only the angle domain. The advantage of the traditional MUSIC algorithm is its easy implementation and high resolution at a high signal-to-noise ratio (SNR); the disadvantage is that it is computationally intensive and performs poorly at low SNR.
In recent years, intelligent algorithms have been developed, such as neural network algorithms,14-16 the support vector machine (SVM) algorithm,17,18 and other artificial intelligence algorithms.19,20 The SVM is a machine learning method developed by Vapnik,21 established on the Vapnik-Chervonenkis (VC) dimension theory of statistical learning and the structural risk minimization principle. The SVM has been successfully applied to the design of spread-spectrum receivers, speech recognition, image processing, regression problems, and other fields. Machine learning establishes the relationship between the input and output of a model through training data, so it is not affected by array errors and other factors and has good robustness. Studies applying SVM to near-field sources have so far rarely been reported at home or abroad, and compressed sensing and other methods adopt step-by-step estimation because of their computational complexity.
A near-field source signal received by a linear array is characterized by 2-D parameters, range and angle, so direct parameter estimation is computationally heavy. Most subspace-based near-field parameter estimation methods need to decouple range and angle under specific array arrangements and certain approximation conditions. Support vector regression (SVR) parameter estimation is usually single-parameter regression; when multiple near-field source signals impinge on the receiving array simultaneously, SVM regression becomes the regression of multiple 2-D parameters, and the algorithm becomes more complex. For this reason, SVM regression for near-field sources has rarely been reported. The estimation accuracy and the generalization performance of the algorithm depend strongly on the training process: improving both requires a very large training data set, which greatly increases the amount of data to be processed during training. There is thus an urgent need to reduce the amount of computation, and principal component analysis (PCA) attracted our attention. As a basic mathematical analysis method, PCA is often used in face recognition, image compression, feature extraction, and other fields. The advantages of PCA are data compression, multidimensional dimensionality reduction, and noise reduction, which reduce redundancy and overfitting; the operation is simple and imposes no parameter restrictions. In this article, the PCA method is introduced into the SVM regression algorithm for near-field parameter estimation, and PCA is used to extract and denoise the input features of the SVM regression algorithm.
This article uses an SVR method to model the signal. In most of the literature, the upper triangular elements of the covariance matrix of the received signal are used as the input of the SVR machine. However, the number of upper triangular elements in the covariance matrix is large, resulting in too high an input feature dimension, overly long training and testing times, and high algorithm complexity. In this article, a method of near-field acoustic source localization based on PCA and SVR is presented. The upper triangular elements of the covariance matrix of the received data are extracted first; feature extraction and dimension reduction are then performed through PCA, yielding fewer features than the original set. The dimension-reduced features are used as the input of the SVR, the incident angle and range are used as the outputs for training, and the mapping between the input and output is thereby obtained. Simulation results show that this method has high estimation accuracy and practical computational speed, adapts well to low SNR, and its prediction precision is better than that of the traditional methods.

Array structure and receiving data model
In this article, a scalar sound pressure sensor array is adopted as the data-receiving model for the near-field sound source. The array elements are spatially omnidirectional sensors uniformly distributed on the x-axis; the number of elements is M, and the spacing between adjacent elements is d. The schematic diagram of the array structure is shown in Figure 1.
Suppose that K narrow-band, non-Gaussian, independent stationary signals impinge on the above array from the near field. Let the array element at the origin of coordinates (0, 0) be the reference element, and let $r_k$ and $\theta_k$ denote the range and the elevation angle between the kth source and the reference element, respectively.
At sampling time t, the output of the mth element of the uniform linear array can be expressed as

$$x_m(t) = \sum_{k=1}^{K} s_k(t)\, e^{j\tau_{mk}} + n_m(t) \quad (1)$$

where $s_k(t)$ is the kth source signal, $n_m(t)$ is the noise on the mth element, $\tau_{mk}$ represents the propagation phase difference of the kth signal arriving at the mth element with respect to the reference element of the array, $j$ denotes the imaginary unit, and $1 \le m \le M$.
In the near-field case, the range from the kth source to the mth element satisfies the approximate relationship

$$r_{mk} = \sqrt{r_k^2 + \left[(m-1)d\right]^2 - 2 r_k (m-1) d \sin\theta_k} \quad (2)$$

with $\tau_{mk} = 2\pi (r_{mk} - r_k)/\lambda_k$. After the Fresnel approximation is adopted, $\tau_{mk}$ can be expressed as8

$$\tau_{mk} = (m-1)\mu_k + (m-1)^2 \phi_k \quad (3)$$

Thus, the received signal on the mth element can be obtained as follows

$$x_m(t) = \sum_{k=1}^{K} s_k(t)\, e^{j\left[(m-1)\mu_k + (m-1)^2\phi_k\right]} + n_m(t) \quad (4)$$

where $\mu_k = -2\pi d \sin\theta_k / \lambda_k$, $\phi_k = \pi d^2 \cos^2\theta_k / (\lambda_k r_k)$, $\lambda_k$ is the wavelength of the kth signal, $\theta_{mk}$ represents the angle between the kth source incident on the mth element and the y-axis, and $r_{mk}$ is the range from the kth signal to the mth element.
Equation (4) can be written in matrix form as

$$\mathbf{X}(t) = \mathbf{A}\mathbf{S}(t) + \mathbf{N}(t) \quad (5)$$

where $\mathbf{A} = [\mathbf{a}(\theta_1, r_1), \ldots, \mathbf{a}(\theta_K, r_K)]$ with $\mathbf{a}(\theta_k, r_k) = [1, e^{j(\mu_k + \phi_k)}, \ldots, e^{j[(M-1)\mu_k + (M-1)^2\phi_k]}]^T$, $\mathbf{S}(t) = [s_1(t), \ldots, s_K(t)]^T$, and $\mathbf{N}(t)$ is the noise vector. The signal is sampled with N snapshots, and the covariance matrix of the signal is

$$\mathbf{R}_s = \frac{1}{N}\mathbf{X}\mathbf{X}^H \quad (6)$$

where $\mathbf{X}^H$ is the conjugate transpose of the matrix $\mathbf{X}$.
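As a concrete illustration, the data model above can be sketched in a few lines of NumPy. The element spacing, frequencies, and source parameters below are hypothetical placeholders, not the simulation settings used later in the article, and the complex upper-triangular entries are split into real and imaginary parts as one common convention for real-valued features.

```python
import numpy as np

def near_field_steering(M, d, wavelength, theta, r):
    """Fresnel-approximated near-field steering vector (equations (3)-(5))."""
    mu = -2 * np.pi * d * np.sin(theta) / wavelength
    phi = np.pi * d**2 * np.cos(theta)**2 / (wavelength * r)
    m = np.arange(M)  # element 0 is the reference element at the origin
    return np.exp(1j * (m * mu + m**2 * phi))

def simulate_snapshots(M, d, wavelength, thetas, ranges, N, snr_db, rng):
    """Received data X (M x N) for K independent narrow-band sources plus noise."""
    A = np.column_stack([near_field_steering(M, d, wavelength, t, r)
                         for t, r in zip(thetas, ranges)])
    S = (rng.standard_normal((A.shape[1], N))
         + 1j * rng.standard_normal((A.shape[1], N))) / np.sqrt(2)
    noise_power = 10 ** (-snr_db / 10)
    noise = np.sqrt(noise_power / 2) * (rng.standard_normal((M, N))
                                        + 1j * rng.standard_normal((M, N)))
    return A @ S + noise

def covariance_features(X):
    """Upper triangular elements of the sample covariance matrix (equation (6))."""
    Rs = X @ X.conj().T / X.shape[1]
    upper = Rs[np.triu_indices(Rs.shape[0])]
    return np.concatenate([upper.real, upper.imag])  # real-valued feature vector

rng = np.random.default_rng(0)
wavelength = 1.0
X = simulate_snapshots(M=8, d=wavelength / 4, wavelength=wavelength,
                       thetas=[0.2, -0.5], ranges=[2.5, 3.0],
                       N=1024, snr_db=10, rng=rng)
features = covariance_features(X)
print(features.shape)  # 36 complex upper-triangular entries -> 72 real values
```

Since the diagonal of the covariance matrix is real, a slightly smaller feature vector is possible; the article works with the 36 upper-triangular entries directly.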

Principle of PCA-MSVR algorithm
Multi-output support vector regression algorithm

In order to solve the regression estimation of multiple variables, the multi-output support vector regression (MSVR) proposed by Pérez-Cruz et al.22 is a generalization of standard SVR. This article gives only a brief introduction to MSVR; for more details, please refer to the literature.23-25 In this article, the upper triangular elements of the covariance matrix $\mathbf{R}_s$ are used to construct the feature matrix $\mathbf{R}$, and the sample input features $\mathbf{R}$ and outputs $\mathbf{y} = [\mathbf{y}_1, \ldots, \mathbf{y}_i, \ldots, \mathbf{y}_Z]$ are used as the training data of the SVR machine, where $\mathbf{y}$ is the set of arrival angles and ranges of the near-field source signals. Assume the given sample data are $\{(\mathbf{R}_i, \mathbf{y}_i)\}_{i=1}^{Z}$. The problem MSVR needs to solve is how to select $\mathbf{W}$ and $\mathbf{b}$ to minimize the error of the regression result $\tilde{\mathbf{y}}_i = \phi(\mathbf{R}_i)\mathbf{W} + \mathbf{b}$. The minimization objective function is constructed as follows

$$L_p(\mathbf{W}, \mathbf{b}) = \frac{1}{2}\sum_j \|\mathbf{w}^j\|^2 + C\sum_{i=1}^{Z} L(u_i) \quad (7)$$

The $\varepsilon$-insensitive loss function can be extended to a multidimensional form by using the $L_2$ norm of the output error vector in place of the original 1-D error, and can be defined as

$$L(u) = \begin{cases} 0, & u < \varepsilon \\ (u - \varepsilon)^2, & u \ge \varepsilon \end{cases} \quad (8)$$

where $u_i = \|\mathbf{e}_i\| = \sqrt{\mathbf{e}_i^T \mathbf{e}_i}$, $\mathbf{e}_i = \mathbf{y}_i - \phi(\mathbf{R}_i)\mathbf{W} - \mathbf{b} = \mathbf{y}_i - \tilde{\mathbf{y}}_i$, $\varepsilon$ is the allowable deviation, and $C$ is a hyperparameter used to determine the trade-off between regularization and the reduction of the error terms.
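The multidimensional ε-insensitive loss defined above can be sketched directly (the quadratic form used here; names are illustrative):

```python
import numpy as np

def eps_insensitive_loss(Y, Y_pred, eps):
    """Quadratic eps-insensitive loss on the L2 norm of the error vectors.

    Y, Y_pred: (Z, Q) arrays of true and predicted multi-dimensional outputs.
    Returns sum_i L(u_i), with L(u) = 0 for u < eps and (u - eps)^2 otherwise.
    """
    u = np.linalg.norm(Y - Y_pred, axis=1)        # u_i = ||e_i||
    return np.sum(np.maximum(u - eps, 0.0) ** 2)

# Errors inside the eps-tube incur zero loss; outside, a quadratic penalty.
Y = np.array([[1.0, 2.0], [3.0, 4.0]])
Y_pred = np.array([[1.0, 2.05], [3.0, 5.0]])
print(eps_insensitive_loss(Y, Y_pred, eps=0.1))  # -> approximately 0.81
```

The first sample's error norm (0.05) lies inside the tube and contributes nothing; the second contributes (1.0 - 0.1)², illustrating how only large joint errors are penalized.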
The optimization problem is solved by an iterative process in which each step uses the previous solution ($\mathbf{W}^k$ and $\mathbf{b}^k$) to obtain the next one until the optimum is reached. To optimize equation (7), an iteratively reweighted least squares (IRWLS) procedure is applied.
The first-order Taylor expansion of the loss in the target function (7) around the current solution is

$$L_p' = \frac{1}{2}\sum_j \|\mathbf{w}^j\|^2 + C\sum_{i=1}^{Z}\left[ L(u_i^k) + \left.\frac{dL(u)}{du}\right|_{u_i^k} (u_i - u_i^k) \right] \quad (9)$$

Furthermore, a second-order (quadratic) approximation is obtained from equation (7)

$$L_p'' = \frac{1}{2}\sum_j \|\mathbf{w}^j\|^2 + \frac{1}{2}\sum_{i=1}^{Z} a_i u_i^2 + CT \quad (10)$$

where

$$a_i = \frac{C}{u_i^k}\left.\frac{dL(u)}{du}\right|_{u_i^k} = \begin{cases} 0, & u_i^k < \varepsilon \\ \dfrac{2C(u_i^k - \varepsilon)}{u_i^k}, & u_i^k \ge \varepsilon \end{cases}$$

and CT is a constant term independent of $\mathbf{W}$ and $\mathbf{b}$. Taking the partial derivatives of $L_p''$ with respect to $\mathbf{w}^j$ and $b^j$ and setting them equal to zero gives

$$\frac{\partial L_p''}{\partial \mathbf{w}^j} = \mathbf{w}^j - \boldsymbol{\Phi}^T \mathbf{D}_a \left(\mathbf{y}^j - \boldsymbol{\Phi}\mathbf{w}^j - b^j \mathbf{1}\right) = \mathbf{0} \quad (11)$$

$$\frac{\partial L_p''}{\partial b^j} = -\mathbf{a}^T \left(\mathbf{y}^j - \boldsymbol{\Phi}\mathbf{w}^j - b^j \mathbf{1}\right) = 0 \quad (12)$$

Equations (11) and (12) can be written in matrix form as

$$\begin{bmatrix} \boldsymbol{\Phi}^T \mathbf{D}_a \boldsymbol{\Phi} + \mathbf{I} & \boldsymbol{\Phi}^T \mathbf{a} \\ \mathbf{a}^T \boldsymbol{\Phi} & \mathbf{a}^T \mathbf{1} \end{bmatrix} \begin{bmatrix} \mathbf{w}^j \\ b^j \end{bmatrix} = \begin{bmatrix} \boldsymbol{\Phi}^T \mathbf{D}_a \mathbf{y}^j \\ \mathbf{a}^T \mathbf{y}^j \end{bmatrix} \quad (13)$$

where $\boldsymbol{\Phi} = [\phi(\mathbf{R}_1), \ldots, \phi(\mathbf{R}_Z)]^T$, $\mathbf{a} = [a_1, \ldots, a_Z]^T$, $(\mathbf{D}_a)_{ij} = a_i \delta(i - j)$, and $\mathbf{y}^j = [y_{1j}, \ldots, y_{Zj}]^T$. The inner-product kernel function $k(\mathbf{x}_i, \mathbf{x}_j) = \phi^T(\mathbf{x}_i)\phi(\mathbf{x}_j)$ is usually used to replace the explicit nonlinear mapping. Using the representer theorem, it can be shown under fairly general conditions that the best solution of the learning problem can be expressed as a linear combination of the training samples in the feature space, $\mathbf{w}^j = \boldsymbol{\Phi}^T \boldsymbol{\beta}^j$. Substituting this expression into equations (11) and (12) yields

$$\begin{bmatrix} \mathbf{K} + \mathbf{D}_a^{-1} & \mathbf{1} \\ \mathbf{a}^T \mathbf{K} & \mathbf{a}^T \mathbf{1} \end{bmatrix} \begin{bmatrix} \boldsymbol{\beta}^j \\ b^j \end{bmatrix} = \begin{bmatrix} \mathbf{y}^j \\ \mathbf{a}^T \mathbf{y}^j \end{bmatrix} \quad (14)$$

where $(\mathbf{K})_{ij} = k(\mathbf{x}_i, \mathbf{x}_j)$ is the kernel matrix. The IRWLS procedure can be summarized in the following steps:

1. Initialization: set $k = 0$, $\boldsymbol{\beta}^k = \mathbf{0}$, $\mathbf{b}^k = \mathbf{0}$, and compute $u_i^k$ and $a_i$.
2. Compute the solution of equation (14), label it $\boldsymbol{\beta}^s$ and $\mathbf{b}^s$, and determine the descent direction $\mathbf{P}^k = [\boldsymbol{\beta}^s; \mathbf{b}^s] - [\boldsymbol{\beta}^k; \mathbf{b}^k]$.
3. Find the step size $\eta^k$ by a heuristic line search; the solution of the next iteration is $[\boldsymbol{\beta}^{k+1}; \mathbf{b}^{k+1}] = [\boldsymbol{\beta}^k; \mathbf{b}^k] + \eta^k \mathbf{P}^k$.
4. Compute $u_i^{k+1}$ and $a_i$, set $k = k + 1$, and go back to step 2; continue until $L_p$ no longer decreases.
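The IRWLS steps above can be sketched in NumPy. This is a simplified illustration under stated assumptions (RBF kernel, a plain step-halving line search rather than the heuristic of the original papers, and inactive samples handled by restricting the linear system to the active set), not the authors' exact implementation:

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    """Gaussian kernel matrix between the rows of X and Y."""
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-gamma * d2)

def objective(K, beta, b, Y, eps, C):
    """L_p of equation (7): regularizer plus quadratic eps-insensitive loss."""
    u = np.linalg.norm(Y - K @ beta - b, axis=1)
    reg = 0.5 * np.sum(beta * (K @ beta))  # 0.5 * sum_j w_j^T w_j via the kernel
    return reg + C * np.sum(np.maximum(u - eps, 0.0) ** 2)

def msvr_fit(K, Y, eps=0.05, C=100.0, max_iter=50, tol=1e-8):
    Z, Q = Y.shape
    beta, b = np.zeros((Z, Q)), np.zeros(Q)
    L_old = objective(K, beta, b, Y, eps, C)
    for _ in range(max_iter):
        u = np.linalg.norm(Y - K @ beta - b, axis=1)
        a = np.where(u >= eps, 2 * C * (u - eps) / np.maximum(u, 1e-12), 0.0)
        S = a > 0                      # samples outside the eps-tube
        if not S.any():
            break
        n = S.sum()
        # bordered linear system of equation (14), restricted to active samples
        top = np.hstack([K[np.ix_(S, S)] + np.diag(1.0 / a[S]), np.ones((n, 1))])
        bot = np.hstack([(a[S] @ K[np.ix_(S, S)])[None, :], [[a[S].sum()]]])
        sol = np.linalg.solve(np.vstack([top, bot]),
                              np.vstack([Y[S], (a[S] @ Y[S])[None, :]]))
        beta_s = np.zeros_like(beta); beta_s[S] = sol[:n]; b_s = sol[n]
        # step 3: line search by step halving on the true objective
        eta = 1.0
        while eta > 1e-10:
            beta_new = beta + eta * (beta_s - beta)
            b_new = b + eta * (b_s - b)
            L_new = objective(K, beta_new, b_new, Y, eps, C)
            if L_new <= L_old:
                break
            eta /= 2
        beta, b = beta_new, b_new
        if L_old - L_new < tol:        # step 4: stop when L_p no longer decreases
            break
        L_old = L_new
    return beta, b

# Toy usage on synthetic two-output data (placeholder, not array features)
rng = np.random.default_rng(1)
X = rng.uniform(-1.0, 1.0, (40, 3))
Y = np.column_stack([X[:, 0] + X[:, 1], X[:, 2] ** 2])
K = rbf_kernel(X, X, gamma=2.0)
beta, b = msvr_fit(K, Y)
pred = K @ beta + b  # equation (14) solution applied to the training samples
```

The line search guarantees that the objective is non-increasing, which is what makes the iteration safe even when the full IRWLS step overshoots.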
The convergence proof of the above algorithm is given in Sánchez-Fernández et al.24 For each new feature vector $\mathbf{R}$, the jth output can be calculated as $\tilde{y}_j = \phi^T(\mathbf{R})\mathbf{w}^j = \sum_{i=1}^{Z} \beta_{ij}\, k(\mathbf{R}_i, \mathbf{R})$. Now, defining the matrix $\boldsymbol{\beta} = [\boldsymbol{\beta}^1, \boldsymbol{\beta}^2, \ldots, \boldsymbol{\beta}^N]$, the N outputs can be expressed as $\tilde{\mathbf{y}} = \boldsymbol{\beta}^T \mathbf{k}(\mathbf{R}) + \mathbf{b}$, where $\mathbf{k}(\mathbf{R}) = [k(\mathbf{R}_1, \mathbf{R}), \ldots, k(\mathbf{R}_Z, \mathbf{R})]^T$. Since the covariance matrix $\mathbf{R}_s$ has many upper triangular elements, using them directly as training data increases the complexity of the algorithm; with larger sample sets, the training time grows and the training speed drops. Therefore, without affecting the estimation performance, a PCA algorithm is introduced to reduce the dimension of $\mathbf{R}$ and the number of input features, thereby reducing the complexity of the algorithm. For example, for an array of eight elements, the covariance matrix has 36 upper triangular elements; after dimension reduction by PCA, the number of features can be brought down to about 8, so the input dimension is reduced to roughly a quarter of the original. This algorithm is described in detail below.
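The feature-count arithmetic in this example can be checked directly (M = 8 is the array size used later in the simulations; the variable names are illustrative):

```python
M = 8
upper_tri = M * (M + 1) // 2           # upper triangular elements, diagonal included
reduced = 8                            # principal components kept (as in the article)
print(upper_tri, reduced / upper_tri)  # 36 and about 0.22, i.e. roughly a quarter
```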

PCA algorithm
PCA is a statistical method that replaces the original indicators, which have a certain correlation, with a new set of independent comprehensive indicators obtained by recombining them.26 The main idea is to establish a feature mapping from a high-dimensional space to a low-dimensional space, reducing the original complex features to a few principal features so that the original feature information is retained as much as possible while the new features are mutually uncorrelated. This set of linearly independent features obtained by an orthogonal transformation is called the principal components. After the dimensionality of the upper triangular elements of the signal covariance matrix is reduced by PCA, redundant information is discarded, which increases the effective sampling density of the samples. At the same time, when the data are affected by noise, the eigenvectors corresponding to the smallest eigenvalues are often related to the noise, so discarding the redundant information removes noise to a certain extent.27 In recent years, PCA has been widely used in various fields with gratifying results. In this article, the PCA method is applied to the dimension reduction and denoising of the support vector input features. The main components of the input feature variables are extracted to reduce data redundancy and overfitting, and thus to reduce the dimension and computational complexity of the regression model. The specific steps are as follows:

1. The covariance matrices of the Z groups of sample signals are calculated, the upper triangular elements are extracted to form row vectors, and the signal feature matrix $\mathbf{R}'$ composed of the Z row vectors is obtained.
2. The signal feature matrix $\mathbf{R}'$ is normalized to obtain the sample feature matrix $\mathbf{R}''$.
3. The covariance matrix $\mathbf{C}_{R''}$ (of size $A \times A$, where A is the number of features) of the normalized sample feature matrix $\mathbf{R}''$ is calculated.
4. Eigenvalue decomposition of $\mathbf{C}_{R''}$, computed over the E training samples, is performed to obtain its eigenvalues $\lambda_i$ and the corresponding eigenvectors $\mathbf{q}_i$

$$\mathbf{C}_{R''}\,\mathbf{q}_i = \lambda_i \mathbf{q}_i$$

where E is the number of training samples.

5. The percentage contribution of each eigenvalue is found, and the larger eigenvalues are selected, that is

$$G = \frac{\sum_{i=1}^{a} \lambda_i}{\sum_{i=1}^{A} \lambda_i}$$

In the above expression, G is the cumulative contribution rate; generally, a value greater than 85% can be considered to contain the vast majority of the information.28 In this article, the cumulative contribution rate is set to 95%.

6. The eigenvectors corresponding to the a largest eigenvalues are assembled into a matrix $\mathbf{Q} = [\mathbf{q}_1, \mathbf{q}_2, \ldots, \mathbf{q}_a]$ of size $A \times a$, and the sample input features $\mathbf{R}'''$ of the MSVR are obtained by the linear transformation $\mathbf{R}''' = \mathbf{Q}^T \mathbf{R}''$.
$\mathbf{R}'''$ after PCA dimensionality reduction is used in place of $\mathbf{R}$ as the input feature of the MSVR, and the training data are trained using the MSVR. For convenience, the algorithm described above is called the PCA-MSVR method. The steps of the PCA-MSVR method are as follows; its performance is examined in the simulation results.
1. Data preparation. The covariance matrix of the training sample signal is computed, and the upper triangular elements are extracted as the feature matrix of the signal.
2. Dimensionality reduction. PCA is used to reduce the dimension of the signal's feature matrix.
3. Model building. The training data are trained using MSVR to obtain a predictive model of the signal.
4. Performance estimation. The prediction data are used to make predictions and evaluate the performance.
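An end-to-end sketch of this four-step pipeline, using scikit-learn as an off-the-shelf stand-in: note that `MultiOutputRegressor(SVR(...))` trains independent single-output SVRs rather than the true MSVR described above, and the features and targets below are random placeholders rather than real array snapshots.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.multioutput import MultiOutputRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVR

rng = np.random.default_rng(0)

# Step 1 (placeholder): 90 training samples of 36 covariance-derived features,
# with two outputs per sample (an "angle" and a "range")
features = rng.standard_normal((90, 36))
targets = np.column_stack([features[:, :6].sum(axis=1) * 0.1,    # synthetic angle
                           2.1 + 0.5 * np.abs(features[:, 6])])  # synthetic range

# Steps 2-3: PCA keeping 95% cumulative contribution, then one SVR per output
model = make_pipeline(StandardScaler(),
                      PCA(n_components=0.95),
                      MultiOutputRegressor(SVR(kernel="rbf", C=10.0, epsilon=0.01)))
model.fit(features, targets)

# Step 4: predict on unseen samples
test = rng.standard_normal((5, 36))
pred = model.predict(test)
print(pred.shape)  # (5, 2): one (angle, range) pair per test sample
```

Passing a float in (0, 1) as `n_components` is scikit-learn's built-in way of selecting the smallest number of components whose explained variance reaches that fraction, matching the 95% cutoff used in this article.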
Under the condition that the estimation accuracy is almost unchanged, the PCA-MSVR algorithm retains the characteristic information of the signal with as little data as possible. The PCA-MSVR algorithm also needs no eigen-decomposition of the test data or spectral peak search at estimation time, and can therefore be realized quickly.

Simulation results and performance analysis
Two near-field, narrow-band, non-Gaussian stationary sound source signals are incident on the uniform linear sensor array shown in Figure 1. The receiving array is composed of eight elements, the inter-element spacing is $d = \lambda_{min}/4$, the signal frequencies are set as $[f_s/8, f_s/10]$, where $f_s$ is the sampling frequency, the number of snapshots is 1024, and the noise is Gaussian white noise. The angular spacing of the training sample data is set as $\Delta\theta = 8°$, the range interval is $\Delta r = 0.05\lambda_{min}$, where $\lambda_{min}$ is the wavelength corresponding to the frequency $f_s/8$, the training angle range is $[-\pi/2, \pi/2]$, and the training range between the two sources and the reference element is $[2.1\lambda_{min}, 3.1\lambda_{min}]$. The number of sample data groups is 180, divided into two equal parts, one for training and the other for testing. Two untrained signals are then taken for prediction, and the simulation results are shown in Figures 2-7.

Figure 2 shows the contribution rate of each principal component to the signal characteristics; it can be seen that the contribution rates of the first few principal components are the highest. Figure 3 shows the cumulative contribution rate of the principal components; the cumulative contribution rate of the first eight principal components is 99%. Since the cumulative contribution threshold in this article is 95%, the first eight principal components are selected.

Figure 6 plots the root mean square error (RMSE) of the DOA estimates obtained by the two-step MUSIC, back-propagation (BP), MSVR, general regression neural network (GRNN), and the proposed PCA-MSVR algorithms at various SNR levels. As can be seen from Figure 6, over the considered SNR range (at or above -10 dB), the RMSE of the proposed PCA-MSVR algorithm is nearly the same as that of the MSVR algorithm, and the DOA estimation precision of the PCA-MSVR and MSVR algorithms is notably better than that of the BP and two-step MUSIC algorithms.
When the SNR is in the range of -5 to 20 dB, the RMSE of the proposed PCA-MSVR algorithm is nearly the same as those of the MSVR and GRNN algorithms; the DOA estimation precision of the PCA-MSVR, MSVR, and GRNN algorithms is notably better than that of the BP and two-step MUSIC algorithms. Although the performance is comparable, the calculation amount of the proposed PCA-MSVR algorithm is significantly less than those of the MSVR and GRNN algorithms.
The complexity of the training process is determined by the convergence behavior of the algorithm, so it is difficult to give a quantitative analysis of the complexity by a closed-form formula. The training time of the PCA-MSVR algorithm is compared with that of the MSVR algorithm in Figure 7: the training time of PCA-MSVR is about 0.06 s, whereas that of MSVR is about 0.08 s. Compared with the MSVR algorithm, the complexity of the PCA-MSVR algorithm is significantly reduced.
For further illustration, the frequency setting in the simulation experiment is reset to $[f_s/8, f_s/8]$, so the incoherent signals become coherent signals; the other conditions remain unchanged, and the proposed method is then used to estimate the parameters of the near-field coherent signals. The simulation results are as follows. Figures 8 and 9 are the scatter diagrams of the DOA and range estimates for the two near-field coherent signals. It can be seen from these two figures that the estimated DOA and range values of the near-field signals fit the actual ones well, so the proposed algorithm can also estimate the parameters of coherent near-field sources.

Conclusion
In this article, an SVR method for jointly estimating elevation and range is implemented with PCA dimensionality reduction. With the performance guaranteed, the computational complexity is markedly reduced. First, the upper triangular elements of the covariance matrix of the received signal are extracted from the sample data, and their dimensionality is reduced through PCA. Second, the reduced-dimension matrix is taken as the input feature of the MSVR machine. Finally, the multi-output SVR algorithm is used for modeling to obtain the parameter model of the near-field estimation. PCA greatly reduces the dimension of the SVM input features, which reduces the complexity of data processing and shortens the training time accordingly. At the same time, the noise is suppressed without losing the essential information of the original data, so the SNR is improved and, as a result, the estimation accuracy is improved. The proposed method outperforms the BP and GRNN algorithms at low SNR. The method imposes no special requirements on the array structure and is suitable for both uniform and nonuniform linear arrays. Because the model parameters are obtained through data training, array errors do not affect the accuracy of the parameter estimation. Simulation results show that the proposed method based on PCA dimension reduction and multi-output SVR has high estimation accuracy.