A new K-means singular value decomposition method based on self-adaptive matching pursuit and its application in fault diagnosis of rolling bearing weak fault

Sparse decomposition has excellent adaptability and high flexibility in describing arbitrary complex signals based on redundant and over-complete dictionary, thus having the advantage of being free from the limitations of traditional signal processing methods such as wavelet and fast Fourier transform being imposed by orthogonal basis. Sparse decomposition provides an effective approach for feature extraction of intricate vibration signals collected from rotating machinery. Self-learning over-complete dictionary and pre-defined over-complete dictionary are the two dictionary construction modes of sparse decomposition. Normally, the former mode owns the virtues of much more adaptive and flexible than the latter one, and several kinds of classical self-learning over-complete dictionary methods have been arising in recent years. K-means singular value decomposition is a classical self-learning over-complete dictionary method and has been used in image processing, speech processing, and vibration signal processing. However, K-means singular value decomposition has relative low reconstruction accuracy and poor stability to enhance the desired features. To overcome the above-mentioned shortcomings of K-means singular value decomposition, a new K-means singular value decomposition sparse representation method based on traditional K-means singular value decomposition method was proposed in this article, which uses the sparse adaptive matching pursuit algorithm and an iterative method based on the minimum similarity of atomic structure. The effectiveness and advantage of the proposed method were verified through simulation and experiment.


Introduction
Sparse decomposition (SD) is a relative new signal processing method which could capture the essential feature of the analyzed signal without the premise of orthogonal basis expansion, and it has been attracted more and more attention in the area of mechanical fault diagnosis. Dictionary construction is one of the key steps of SD, and there are two ways to construct the over-complete dictionary: pre-defined dictionary and self-learning dictionary. Normally, the pre-defined dictionary SD method has the shortcomings of needing prior knowledge of the analyzed signal and the selflearning dictionary SD method has the disadvantage of poor interference robustness. To solve the abovementioned problems, a large amount of literatures have been arising in recent years. A hierarchical discriminating SD method was proposed in Jiao et al., 1 which could isolate the interferences effectively. In order to address the same above-mentioned problems, a parametric impulsive dictionary based on Laplace wavelets is designed for impulsive fault feature extraction of rolling bearing in Sun et al., 2 which has better feature extraction performance. The traditional orthogonal matching pursuit method was improved and an adapted dictionary-free orthogonal matching pursuit (ADOMP) method was proposed in Huang et al., 3 which has the advantages of flexibility and noise robustness, and the adapted dictionary orthogonal matching pursuit method presented potential advantages in early weak fault feature extraction of rolling bearing. An improved variational mode decomposition (VMD) method was proposed in Li et al., 4 which overcame the shortcomings of VMD in reasonable selection of algorithm parameters. Besides, the improved VMD was combined with adaptive sparse code shrinkage denoising and used in fault diagnosis of rotating machinery. To overcome the difficulty of fault identification in planet bearing due to the intricate kinematics and multiple modulation effects, a classification method based on dictionary learning was introduced in Zhao et al., 5 which has the virtues of avoiding the feature design requirement in common intelligent diagnosis method. To alleviate the drawback of pursuit efficiency of optimal atom in SD, a new SD method based on timefrequency spectrum segmentation was proposed in Yan et al., 6 and the higher decomposition efficiency and better approximation precision of the proposed method were verified by conduction of simulations and experiments. The empirical wavelet transform was improved based on the idea of SD and the sparsity-guided empirical wavelet transform method was proposed and used in fault feature extraction of rolling bearing, 7 and the effectiveness of the proposed method was verified by single and multiple bearing fault signals. A new SD signal processing method based on multi-scale chirplet was proposed in fault diagnosis of gearbox vibration signals, 8 and the feasibility of the proposed method was validated by both simulations and experiments. A novel intelligent fault diagnosis method of rolling bearing was proposed by combining SD with neighborhood preserving deep extreme learning machine in Li et al. 9 A supervised feature extraction method named supervised regularized sparse filtering was proposed in feature extraction of rotating machinery which provided a new way for optimizing the solution of sparsity. 10 Considering the sparsity of impact force, an enhanced sparse regularization method based on reweighted l1norm minimization is developed for reducing the peak force error and improving the identification accuracy of impact force. 11 A novel gear fault diagnosis method based on the structured sparsity time-frequency analysis is proposed. 12 Following the emerging sparse representation theory, a combined method of sparse reconstruction and delay-and-sum (DAS) for highresolution Lamb wave inspection is proposed. 13 Although a large number of rolling bearing fault diagnosis methods based on SD have been arising as stated above, most of them are pre-defined dictionary SD methods which have the drawback of lack of flexibility. Besides, most of the self-learned SD methods own the disadvantage of low computational efficiency. In this article, an improved K-means singular value decomposition (KSVD) sparse representation method based on traditional KSVD method was proposed using the sparse adaptive matching pursuit algorithm and an iterative method based on the minimum similarity of atomic structure to overcome the abovementioned problems. Through the verifications of simulation and experiment, the proposed method not only could capture the latent features of analyzed signal successfully, but also has the advantage of higher computational efficiency.
This article is organized as follows. The basic theory of traditional KSVD is presented in section ''Traditional KSVD.'' The theory and details of the proposed method are presented in section ''Improved KSVD.'' The simulation and experiment verifying the feasibility and effectiveness of the proposed method are carried out in sections ''Simulation'' and ''Experiment,'' respectively. Besides, the analyzed experimental results using the Autogram method are also presented in section ''Experiment'' to prove the advantage of the proposed method. Conclusions obtained from the above results are presented in section ''Conclusion''.

Traditional KSVD
KSVD could be regarded as the extension of K-means. Optimal matching pursuit and singular value decomposition (SVD) are used together in KSVD to improve the convergence rate of sparse coding and dictionary atom due to the reason that they could synchronize the updating of sparse coefficients and dictionary atom.
The purpose of KSVD algorithm is to find the dictionary D that makes the coefficient S as sparse as possible to represent the object signal sparsely where Y is the trained signal, and D is the over-complete matrix (column number greater than row number) or complete matrix (column number equal to row number). The column vectors (s 1 , s 2 , :::s i , :::, s N ) in S are corresponding to Y , which means that Y is composed linearly of the columns in D according to s i , and it could be transformed into mathematical model as the following equation Equation (2) could be rewritten as follows Finding the global optimal dictionary D once is an NP-Hard (non-deterministic polynomial-time hardness) problem giving the trained data Y , and the optimal solution is only approximated step by step. There are mainly two steps in KSVD: sparse coding and dictionary updating.

Sparse coding
First, set an initialized dictionary D and use it to represent the given data Y sparsely. The sparse coefficients S of the given trained data Y on dictionary D could be obtained using optimal matching pursuit. Now, DS could be regarded as the sum of the product of each column in D and the corresponding row in S, which is expressed by the following equation where d i represents the ith column of D, and s T i represents the ith row of S.

Dictionary updating
The updating of dictionary D is carried out column by column. First, suppose S and D being fixed, and the kth column d k in D and the kth row s T i in S are to be updated, then there is the following equation In equation (5), it is only to adjust d k and s T k to make the error between E K and d k s T k as small as possible. The main steps of traditional KSVD are shown as follows: Step 1: Initialization Set the initialization matrix D 0 2 R N 3 K and the initialization iteration value J = 1.
Step 2: Sparse coding Solve S approximately using optimal matching pursuit algorithm and set the error e, which satisfies equation (2), and the sparse coefficient s i is obtained to represent y i .
Step 3: Dictionary updating Restrict E k by only selecting the columns corresponding to w k and make the non-zero positions in E k be the same as s T k , then the new E k is obtained, which is marked as , v 1 is the first column of the right vector singular matrix V , and u 1 is the first column of the left vector singular matrix U . 5. Update J = J + 1.
Step 4: Output The learned dictionary D J is obtained.

Improved KSVD
The traditional KSVD uses optimal matching pursuit to solve the sparse coefficients approximately in the stage of sparse coding, and equation (2) is used as the object function due to the reason that the sparsity of the sparse coefficients is unknown. So the accuracy of the reconstructed signal is determined by the set residual error e. Besides, the increase in the iteration number also might cause the non-convergence of e, which brings out the poor reconstruction result of the reconstructed signal. To solve the above-mentioned problems, an improved KSVD method is proposed in this article.
In the stage of sparse coding, the self-adaptive matching pursuit (SAMP) 14 is used instead of optimal matching pursuit in improved KSVD method. Because the sparsity of the sparse coefficients is not known, it could be described adaptively by changing the step length. Besides, the iteration termination conditions of SAMP are stricter than optimal matching pursuit the latter is determined by the setting iteration number and the residual error e, and the former is determined by the set iteration number and the updated residual error e or the updated residual error e bigger than the residual error e. So, much more accurate dictionary atom and sparse coefficients could be obtained in improved KSVD than traditional KSVD.
Furthermore, the K atoms in dictionary D J do not need to be updated in the improved KSVD method, and the atoms among the K atoms owing the greatest impact on the reconstructed signal are only to be updated. Calculate the structural similarity 15 of the residual error e corresponding to D J , and the structural similarity between signal x and y could be defined as follows where u x and u y are the means of x and y, respectively; d 2 x and d 2 y are the variances of x and y, respectively; d xy is the covariance of x and y; c 1 and c 2 are the constants; and C SS (x, y) 2 ½0, 1.
The C SS of residual error E k is the C SS of Y and P i6 ¼k d i s T i , and compare all the C SS corresponding to E 1 , E 2 , :::, E and select the residual error E with the smallest value of C SS . Once the minimum residual error E is determined, the corresponding dictionary atom d k and sparse coefficients s T k are also updated. Then, calculate the C SS values of all K residual errors corresponding to the updated d k and s T k and compare their values, and the smallest one is selected. The above-stated iteration process is repeated to avoid the blindness of traditional KSVD algorithm in selecting updated atoms.
The main steps of improved KSVD are as follows: Step 1: Initialization Set the initialization matrix D 0 2 R N 3 K and the initialization iteration value J = 1.
Step 2: Sparse coding Solve S accurately using SAMP algorithm, which satisfies equation (3), and the sparse coefficient s i is obtained to represent y i . Input: Sensing matrix Y with a size of M 3 N and the M-dimensional observed signal y. Set the relevant initial values as follows: the initial residual error r 0 = y, supporting set O = F, step length L = S = 1, index set T add , the total number of iterations I, and the initial iteration number t = 1.
The main steps of sparse coding are as follows: 1. Calculate the correlation coefficients u = fu j ju j = jhr, u j ij, j = 1, 2, . . . , N g, where u j is the column vector of Y, and L maximum values in u are selected and composed as set T add = max 1 ł j ł L fjujg.

Calculate and obtain the sparse signal
Select the largest absolute value among s t and mark as s tL , and mark the corresponding Y Ot as Y OtL and the corresponding supporting set as O tL , that is, F = O tL . 5. Update the residual error r tnew = y À Y OtL s t . 6. If there is r tnew = 0, then stop the iteration.
Otherwise, if there is k r t k 2 ø k r tÀ1 k 2 , update iteration step as L = L + S and go back to Step 1 and continue the iteration. If the above two conditions are not satisfied, then O t = F, r t = r tnew and the iteration number is added by 1. 7. Reconstruct the sparse signal s using the nonzero terms in O t . 8. Output the reconstructed sparse signal s.
Step 3: Dictionary updating Calculate the residual error E k and make it satisfy E k = Y À P i6 ¼k d i s T i , and calculate the C SS of E k and then select the smallest C SS . 3. Restrict E k by only selecting the columns corresponding to w k and make the non-zero positions in E k be the same as s T k , and new E k is obtained, which is marked as E R k . 4. Decompose E R k = U DV T using SVD and update dictionary atom asd k = u 1 ,s T k = D½1, 1 Á v 1 . 5. Update the iteration number as J = J + 1.
Step 4: Output The learned dictionary D J is obtained.

Simulation
First, the outer race fault simulation of rolling bearing is carried out. The local pitting of inner race or outer race is often manifested as the early weak fault of rolling bearing. In the section, the rolling bearing fault model whose mathematical equation can be expressed as equation (7) 16,17 is used to verify the feasibility of the proposed method. t i is the tiny fluctuation around mean period T . Set the sampling frequency as f s = 25, 600 Hz. The outer race fault characteristic frequencies (FCFs) are set as f o = 25 Hz. Assume the random slide between the rolling element and race is normally distributed whose standard deviation is 2% of the shaft rotation ratio The time-domain waveform of the simulation signal of rolling bearing outer race fault is shown in Figure 1 from which the impulsive characteristic is evident, and the time interval between each shock is the reciprocal of FCF of f o , that is, 1/25 s. In order to simulate the vibration signal of rolling bearing early weak fault, add white noise into the simulation signal, and the signalto-noise ratio (SNR) is 210 dB. The noised simulation signal is shown in Figure 2 from which the impulsive characteristic of the simulation signal is buried by the strong background noise.
Before applying the improved KSVD as introduced in section ''Improved KSVD'' on the noised signal, the noised signal should be segmented into several training samples to increase the training efficiency and the length of each training sample is set as 64 point empirically, and the overlap rate between each sample is 0 to avoid the reuse of training signals. The number of the learned dictionary atoms is set as 256 points (256 points are enough to reflect a complete impact characteristic when rolling bearing fails) and the length of each dictionary atom is also set as 256 points for calculation efficiency. When the iteration number is chosen as 4, the matching speed is highest with higher accuracy through repeated verification, so the iteration number is set as 4. Apply the improved KSVD on the signal as shown in Figure 2, and there are a total of 256 dictionary atoms being learned. Here, part of the learned dictionary atoms is shown in Figure 3. Then, reconstruct the analyzed signal using the learned dictionary atoms, and the last reconstructed signal is shown in Figure 4;   it is evident that the impulsive components are extracted successfully by the proposed method. Apply the envelope demodulation spectral analysis on the reconstructed signal, and the result is shown in Figure 5 from which the outer race FCF with its harmonics is extracted successfully. Since the proposed method is mainly developed from the traditional KSVD method, the traditional KSVD and the proposed method are compared. As analyzed above, the early failure of rolling bearing is manifested as periodic impacts. Therefore, the reconstruction accuracy of the fault signal is important. It is the extraction of the occurred impacts which we should focus on. So kurtosis of the reconstructed signal (KRS) could be used as a good indicator reflecting the de-noising effectiveness between the traditional KSVD and the proposed method. Add different levels of white noise into the simulation signal as shown in Figure 1 and the SNR varies from 210 to 2 dB, and the KRS using the traditional KSVD and the proposed method is shown in Figure 6 from which the advantages of proposed method over the traditional KSVD method are evident.
Besides, the Autogram 18 is also used as comparison, which is a new and effective rolling bearing signal processing method. Envelope analysis is one of the most advantageous methods for rolling element bearing diagnostics, but finding a suitable frequency band for demodulation has been a substantial challenge for a long time. Introduction of the spectral kurtosis (SK) and Kurtogram mostly solved this problem, but in situations where SNR is very low or in the presence of non-Gaussian noise, these methods will fail. This major drawback may noticeably decrease their effectiveness, and the goal of Autogram is to solve the abovementioned problem. Vibration signals from rolling element bearings exhibit high levels of second-order cyclostationarity, especially in the presence of localized faults. The autocovariance function of a second-order cyclostationary signal is periodic and the proposed method, named Autogram, takes advantage of this property to enhance the conventional Kurtogram. The Autogram computes the kurtosis of the unbiased autocorrelation of the squared envelope of the demodulated signal, rather than the kurtosis of the filtered time signal. The analysis result of the simulation signal using the Autogram method is presented to verify the advantage of the proposed method. Figure 7 is the analysis result of the noised simulation signal as shown in Figure 2 using Autogram. The key parameters such as center frequency and bandwidth of band-pass filter could be obtained based on Figure 7: center frequency f c = 1000 Hz and bandwidth (B w ) = 800 Hz. Construct an optimal band-pass filter using the obtained key parameters to filter the noised signal as shown in Figure 2, and then apply the envelope demodulation spectral analysis on the filtered signal; the analysis results are shown in the bottom chart of Figure 7 from which the FCF of outer race could not be identified, and it shows that the proposed method has the advantage of noise robustness compared with Autogram method.
The simulation of rolling bearing inner race fault model is same as outer race fault: Set the inner race FCF as f i = 40 Hz, sampling frequency as f s = 25, 600 Hz, and rotating frequency as f r = 10 Hz. When the pitting occurs in the inner raceway of rolling bearing, its absolute position changes with the rotation speed, and the modulation phenomenon will arise in the vibration signal. Ideally, the FCF inner race is not only extracted, but also the modulation characteristic could be extracted successfully. Same as the above   outer race fault simulation analysis processes, the corresponding analysis results are presented from Figures  8-13, and the last analysis result using the proposed method is shown in Figure 12 from which the FCF inner race is not only extracted, but also the modulation characteristic is extracted perfectly.

Experiment
In most of the relevant literatures relating to rolling bearing fault diagnosis, the defective bearings with typical simulated or seeded damages are used in bearing diagnosis research. These damages are mature faults   and evident which are less capable to discover the nature degradation of rolling bearing in early stage. In order to validate our proposed method and truly reflect the real natural degradation, accelerated fatigue test of rolling bearing is carried out in this article. Refer to Wang et al. 19 for the details and processes of the test. In Wang et al., 19 the data at 2297th minute are taken as the data of early weak fault stage of rolling bearing. In this article, the proposed method could identify the fault feature much earlier than the method used in Wang et al. 19 to avoid the occurrence of sudden catastrophic accidents, and the data at 2290th minute are analyzed using the proposed method. Note that because the experiment is conducted in the case of accelerated bearing run-to-failure condition, even 1 min earlier will make significant difference. The time-domain waveform of the data at 2290th minute is shown in Figure 14 from which no impulsive characteristic could be identified due to the interference of strong background noise. Apply the envelope demodulation spectral analysis on the signal as shown in Figure 14, and the corresponding result is presented in Figure 15; the spectral lines are chaotic and the inner race FCF could not be extracted. These further prove that the traditional envelope demodulation spectral analysis is not fit for early weak fault diagnosis of rolling bearing. Before applying improved KSVD on the signal as shown in Figure 14, divide the raw signal into segments with 256 points per each segment first. Then, the dictionary is initialized by choosing 256 segments randomly as columns, which is the same as the simulation case.
Parts of the learned dictionary atoms are presented in Figure 16. Same as the ideology presented in simulation, construct the signal using the learned dictionary Figure 10. Part of the learned dictionary atoms using the improved KSVD method on the signal as shown in Figure 9.  atoms, and the corresponding reconstructed bearing early weak fault signal is shown in Figure 17. Finally apply the envelope demodulation spectral analysis method on the signal as shown in Figure 17, and the obtained result is shown in Figure 18 from which the inner race FCF of the test bearing is extracted successfully. To further illustrate the high effectiveness of the improved KSVD in fault feature extraction of rolling bearing early weak fault, the analysis results of the original signal displayed in Figure 14 using other methods are presented.
1. First, same as simulation signal, the analysis result of signal as shown in Figure 14 using Figure 13. The analysis result of the signal as shown in Figure 10 using the Autogram method. Figure 14. The time-domain waveform of the data at 2290th minute of the accelerated fatigue test of rolling bearing. Figure 15. The envelope demodulation spectral of the signal as shown in Figure 14.
Autogram is presented in Figure 19, from which the inner race FCF could not be identified, and the advantage of the proposed method over Autogram is further verified.
2. The second method being used as comparison is the traditional KSVD method. Sparse extraction of impulse by adaptive dictionary (SpaEIAD) being proposed by Chen et al. 20 is a successful application of traditional KSVD-based denoising method, and the analysis result of signal as shown in Figure 14 using SpaEIAD is shown in Figure 20. It could be observed that although impulses are extracted and noise level is decreased, the noise is still strong by being compared with Figure 17. Envelope demodulation spectral analysis is also applied on the de-noised signal as shown in Figure 20 to validate whether the inner race FCF could be detected or not, and the result is shown in Figure 21 from which the advantage of the improved KSVD over SpaEIAD is verified.

Conclusion
Early weak fault diagnosis of rolling element bearing is a difficult and hot study area. Aiming at the difficult problem of extracting early weak fault features of rolling bearings and the potential of sparse representation self-learning dictionary method in early weak fault diagnosis of rolling bearings, this article proposes an improved KSVD method based on traditional KSVD method, in which the matching pursuit algorithm used in the construction of traditional KSVD sparse dictionary is changed to sparse adaptive matching pursuit algorithm. Figure 16. Part of the learned dictionary atoms using the improved KSVD method on the signal as shown in Figure 14.  The effectiveness and advantage of the proposed method are verified through simulation first. The improved KSVD method could extract the fault feature of rolling early weak fault successfully, and this virtue is verified by the rolling bearing simulation signals interfered by strong noise. Besides, the accuracy of signal reconstruction based on the proposed method is higher than that based on traditional KSVD method, and it is verified by comparing the kurtosis of the signals obtained by traditional KSVD and improved KSVD methods.
Subsequently, the accelerated fatigue test of rolling bearing was carried out and the early weak fault signal of the test was used to testify the effectiveness and advantage of the proposed further. Besides, the traditional KSVD and Autogram method are also applied   Figure  14 using the SpaEIAD method. Figure 21. The envelope demodulation spectral of the signal as shown in Figure 20.
on the early weak fault signal of the test and used as comparison to verify the advantage of the proposed method.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.