Compound mechanical fault diagnosis based on CMDE

Fault diagnosis is important for the safe operation of rotating machinery. Within the fault diagnosis framework, entropy-based methods are a promising tool for feature extraction and signal processing. Among them, diversity entropy has attracted increasing attention owing to its high consistency, strong robustness, and high calculation efficiency. However, its multiscale procedure leads to unstable complexity estimation at higher scales, which results in poor clustering performance when analyzing compound mechanical fault signals. To address this issue, this paper presents a novel feature extraction method called composite multiscale diversity entropy (CMDE). CMDE uses the mean complexity value of multiple sliding windows at each scale to enhance stability, which enables diversity entropy to extract richer fault information from deeper scales for the compound fault diagnosis of rotating machinery. The stability of CMDE is evaluated using synthetic gear signals, and CMDE is then applied to compound mechanical fault diagnosis. The experimental results show that CMDE achieves the highest diagnosis accuracy among the compared entropy-based feature extraction methods.


Introduction
Fault diagnosis techniques aim to provide reliable assurance for the safe operation of rotating machinery. Because intelligent fault diagnosis offers fast response, easy installation, and low cost, it has been widely used in modern industry. 1,2 In general, intelligent fault diagnosis can be divided into three steps: data acquisition, feature extraction, and fault classification. Among these, feature extraction is the most crucial. 3 For extracting features from measured signals, entropy-based methods have become a hot topic because they require no prior knowledge, need no preprocessing, and are easy to implement. 4,5 Commonly used entropy methods include sample entropy (SE), 6 fuzzy entropy (FE), 7 and permutation entropy (PE). 8 However, it is hard to use them directly in their original form for feature extraction, because a unidimensional feature provides insufficient information for analyzing complex signals. To solve this problem, Costa et al. 9 developed multiscale sample entropy (MSE), which introduced the multiscale procedure to extend a unidimensional entropy method into a multidimensional feature extraction tool. In the multiscale procedure, the raw signal is decomposed into multiple scale time series by the Haar wavelet, and the scale factor controls the decomposition layer of the Haar wavelet. The MSE value is then obtained by calculating the SE value under each scale factor. Inspired by MSE, permutation entropy and fuzzy entropy were also extended into multiscale permutation entropy (MPE) 10,11,12 and multiscale fuzzy entropy (MFE). 13,14 Recently, Wang et al. 15 proposed a new entropy metric called multiscale diversity entropy (MDE). Unlike the existing methods, MDE uses the diversity of the orbits in the phase space to quantify the dynamical complexity of a time series.
Compared with MFE, MSE, and MPE, MDE has the following merits:
(1) High consistency. Diversity entropy estimates the dynamical complexity with high consistency, whereas PE, SE, and FE may yield inconsistent results for some periodic systems.
(2) High calculation efficiency. Diversity entropy has the lowest computational complexity, saving at least half of the computing time compared with PE, SE, and FE.
(3) Strong robustness. Compared with PE, SE, and FE, diversity entropy is the most robust when periodic signals are contaminated by Gaussian white noise.
Although MDE has the above-mentioned advantages, it encounters difficulty in extracting features from mechanical compound faults. A compound fault means that multiple faults occur in the machine simultaneously, for example, the bearing and the gear fail at the same time. In this circumstance, the measured vibration signal consists of multiple features from different components, which requires the multiscale procedure to explore richer features from deeper scales. The multiscale procedure used in MDE is in fact a Haar wavelet. As the scale increases, the fault information hidden at different spectral resolutions can be extracted. The richer fault information hidden in the deeper scales can provide sufficient fault features for the classifier to recognize the compound fault accurately (see the detailed discussion in Section 4 "Results and analysis"). However, the original multiscale procedure used in MDE has a defect: the multiscale time series is greatly shortened at high scales. This leaves the diversity entropy with insufficient information to estimate the dynamical complexity, so it behaves unstably at high scale factors. This phenomenon restricts the application of diversity entropy in the compound fault diagnosis of rotating machinery.
To remedy this defect, composite multiscale diversity entropy (CMDE) is proposed in this paper. In the proposed CMDE method, a moving averaging window is used to generate several composite multiscale time series under each scale factor. The mean complexity value of these composite multiscale time series is then taken as the CMDE value. With the composite multiscale procedure, the stability of diversity entropy at high scales is greatly enhanced. This enables diversity entropy to extract richer fault information from deeper scales for the compound fault diagnosis of rotating machinery.
Based on CMDE and the k-nearest neighbor (kNN) classifier, an intelligent fault diagnosis framework for mechanical compound faults is proposed. The effectiveness of the proposed method is evaluated using simulated and experimental signals. The results show that the proposed CMDE has the best feature extraction ability compared with MSE, MPE, MFE, MDE, CMSE, CMPE, and CMFE. Moreover, the results show that the composite multiscale procedure effectively improves the stability of diversity entropy at high scales, which provides a more reliable estimate of dynamical complexity for feature extraction.
The rest of this paper is organized as follows. Section "Methodology" introduces the basic definition of the proposed method. Section "Simulation evaluation" briefly presents the simulated gear signals and evaluates the stability of the proposed CMDE. Section "Experiment evaluation" evaluates the performance of the proposed fault diagnosis framework using the experimental signals. Finally, Section "Conclusions" summarizes the conclusions.

Diversity entropy
For an arbitrary time series $X = \{x_1, x_2, \cdots, x_i, \cdots, x_N\}$, $i \in [1, N]$, where $N$ is the data length, the diversity entropy (DE) is computed in four steps.
Step 1. Reconstruct the time series $X$ into a series of orbits with embedding dimension $m$, as expressed in equation (1).
Step 2. Construct the similarity set $D(m)$ as in equation (2). $D(m)$ is obtained by calculating the cosine similarity between each pair of adjacent orbits, as in equation (3).
Step 3. Divide the range $[-1, 1]$ into $\varepsilon$ intervals, where $\varepsilon$ is called the number of symbols in diversity entropy. Then count the frequency with which the elements of $D(m)$ fall into each interval to obtain the state probabilities $(P_1, \ldots, P_\varepsilon)$. Obviously, $\sum_{k=1}^{\varepsilon} P_k = 1$. The physical meaning of this step is to symbolize the similarity set $D(m)$ with $\varepsilon$ symbols, so that the infinitely many states of the orbits are mapped onto a finite number of symbols.
Step 4. Calculate the diversity entropy value as in equation (4).
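For reference, the four steps can be written compactly as below. This is a reconstruction from the step descriptions, assuming the standard cosine similarity and a Shannon entropy normalized by the logarithm of the number of symbols; the orbit indexing and the symbol names ($y_i(m)$, $d_i$, $\varepsilon$) are assumptions rather than a verbatim copy of the original equations.

```latex
\begin{align}
Y(m) &= \{y_1(m), \dots, y_{N-m+1}(m)\}, \qquad
        y_i(m) = \{x_i, x_{i+1}, \dots, x_{i+m-1}\} \tag{1} \\
D(m) &= \{d_1, \dots, d_{N-m}\}, \qquad
        d_i = d\bigl(y_i(m), y_{i+1}(m)\bigr) \tag{2} \\
d(y_i, y_j) &= \frac{\sum_{k=1}^{m} y_i(k)\, y_j(k)}
        {\sqrt{\sum_{k=1}^{m} y_i(k)^2}\,\sqrt{\sum_{k=1}^{m} y_j(k)^2}} \tag{3} \\
DE(m, \varepsilon) &= -\frac{1}{\ln \varepsilon}
        \sum_{k=1}^{\varepsilon} P_k \ln P_k \tag{4}
\end{align}
```

With the $1/\ln\varepsilon$ normalization, the value lies in $[0, 1]$: it is 0 when all similarities fall into one interval and 1 when they are spread uniformly over all $\varepsilon$ intervals.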
Because the last step of diversity entropy is defined as a Shannon entropy, DE has the same mathematical properties: when the DE value approaches 1, the dynamical complexity of the time series is high; when it approaches 0, the dynamical complexity is low. Within the range [0, 1], the diversity entropy value increases monotonically with the dynamical complexity of the time series.
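The four steps can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `diversity_entropy`, the default parameters, the zero-norm guard, and the use of equal-width histogram bins over [-1, 1] are assumptions made for this example.

```python
import numpy as np

def diversity_entropy(x, m=4, eps=30):
    """Diversity entropy of a 1-D series with embedding dimension m and eps symbols."""
    x = np.asarray(x, dtype=float)
    if x.size < m + 1:
        raise ValueError("series too short for the chosen embedding dimension")
    # Step 1: orbits y_i = (x_i, ..., x_{i+m-1}), shape (N - m + 1, m).
    orbits = np.lib.stride_tricks.sliding_window_view(x, m)
    # Step 2: cosine similarity between each pair of adjacent orbits.
    a, b = orbits[:-1], orbits[1:]
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    # Assumption: a zero-norm orbit gets similarity 0 instead of raising.
    cos = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    cos = np.clip(cos, -1.0, 1.0)  # guard against rounding just outside [-1, 1]
    # Step 3: split [-1, 1] into eps equal intervals and estimate probabilities.
    counts, _ = np.histogram(cos, bins=eps, range=(-1.0, 1.0))
    p = counts[counts > 0] / counts.sum()
    # Step 4: Shannon entropy normalized by ln(eps), so the value lies in [0, 1].
    return float(-np.sum(p * np.log(p)) / np.log(eps)) + 0.0

rng = np.random.default_rng(0)
de_noise = diversity_entropy(rng.standard_normal(2048))
de_sine = diversity_entropy(np.sin(2 * np.pi * np.arange(2048) / 256))
```

In this sketch a slowly varying sine yields a lower value than white noise, because adjacent orbits of a smooth signal are nearly parallel and their similarities concentrate in a few intervals.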

Multiscale diversity entropy
To extend the unidimensional entropy method into a multidimensional feature extraction tool, diversity entropy is combined with the multiscale procedure as shown in Figure 1. The multiscale diversity entropy is computed in two steps:
Step 1. For a given time series $X = \{x_1, x_2, \cdots, x_i, \cdots, x_N\}$, segment $X$ into multiscale time series as in equation (5), where $\tau$ represents the scale factor in the multiscale analysis, which should be a positive integer. For $\tau = 1$, the time series $\{y^{(1)}\}$ is the original time series.
Step 2. Calculate the entropy value of each multiscale time series as in equation (6), where $m$ and $\varepsilon$ indicate the embedding dimension and the number of symbols, which have the same meaning as in Section "Diversity entropy."
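The two steps can be sketched as follows, assuming the usual coarse-graining form of the multiscale procedure (non-overlapping window means of length equal to the scale factor). The helper names `coarse_grain` and `multiscale_diversity_entropy` are hypothetical, and the compact `diversity_entropy` is repeated only so the block runs on its own.

```python
import numpy as np

def diversity_entropy(x, m=4, eps=30):
    """Compact diversity entropy (orbits, cosine similarity, histogram, entropy)."""
    x = np.asarray(x, dtype=float)
    if x.size < m + 1:
        raise ValueError("series too short for the chosen embedding dimension")
    orbits = np.lib.stride_tricks.sliding_window_view(x, m)
    a, b = orbits[:-1], orbits[1:]
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    cos = np.clip(np.divide(num, den, out=np.zeros_like(num), where=den > 0), -1.0, 1.0)
    counts, _ = np.histogram(cos, bins=eps, range=(-1.0, 1.0))
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log(p)) / np.log(eps)) + 0.0

def coarse_grain(x, tau):
    """Step 1: mean of consecutive non-overlapping windows of length tau."""
    x = np.asarray(x, dtype=float)
    n_seg = x.size // tau  # leftover samples at the end are dropped
    return x[: n_seg * tau].reshape(n_seg, tau).mean(axis=1)

def multiscale_diversity_entropy(x, max_scale=20, m=4, eps=30):
    """Step 2: one DE value per scale factor 1..max_scale."""
    return np.array([diversity_entropy(coarse_grain(x, tau), m, eps)
                     for tau in range(1, max_scale + 1)])

signal = np.random.default_rng(0).standard_normal(1024)
features = multiscale_diversity_entropy(signal, max_scale=20)
```

Note that at scale 20 a 1024-point signal is reduced to only 51 points, which illustrates why the estimate becomes less reliable as the scale grows.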

Composite multiscale diversity entropy
The original multiscale procedure used in MDE has a defect: the multiscale time series is greatly shortened at high scales. This leaves the diversity entropy with insufficient information to estimate the dynamical complexity, so it behaves unstably at high scales. This phenomenon restricts the application of diversity entropy in mechanical compound-fault diagnosis. To remedy this defect, composite multiscale diversity entropy (CMDE) is proposed in this paper. The stability of the two procedures is explored and discussed in detail in Section "Simulation evaluation." The composite multiscale diversity entropy is calculated in the following steps:
Step 1. For a given time series $X = \{x_1, x_2, \cdots, x_i, \cdots, x_N\}$, segment it into composite multiscale time series as in equation (7), where $\tau$ represents the scale factor, which has the same meaning as in the multiscale procedure in Section "Multiscale diversity entropy." For $\tau = 1$, the time series $\{y^{(1)}\}$ is the original time series.
Step 2. Compute the diversity entropy value of each composite multiscale time series and take the mean of these values under each scale factor as the CMDE value, where $m$ and $\varepsilon$ indicate the embedding dimension and the number of symbols, which have the same meaning as in Section "Diversity entropy."
The composite multiscale procedure is shown in Figure 2. CMDE uses repeated estimation of DE from the composite multiscale time series at high scales, which avoids the unreliable estimation caused by the shortened time series used in MDE and thus remedies the shortcoming of the original multiscale procedure. Note that the scale factor $\tau$ has the same meaning as in MDE, so each scale factor $\tau$ corresponds to a single DE value. The composite multiscale procedure only enhances the stability of the diversity entropy, while its physical meaning is unchanged.
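The composite procedure can be sketched as below, assuming the usual composite coarse-graining in which, at scale factor tau, the window grid is started at each of the tau possible offsets and the resulting DE values are averaged. The function name `composite_multiscale_diversity_entropy` and the offset convention are assumptions; the compact `diversity_entropy` is repeated so the block is self-contained.

```python
import numpy as np

def diversity_entropy(x, m=4, eps=30):
    """Compact diversity entropy (orbits, cosine similarity, histogram, entropy)."""
    x = np.asarray(x, dtype=float)
    if x.size < m + 1:
        raise ValueError("series too short for the chosen embedding dimension")
    orbits = np.lib.stride_tricks.sliding_window_view(x, m)
    a, b = orbits[:-1], orbits[1:]
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    cos = np.clip(np.divide(num, den, out=np.zeros_like(num), where=den > 0), -1.0, 1.0)
    counts, _ = np.histogram(cos, bins=eps, range=(-1.0, 1.0))
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log(p)) / np.log(eps)) + 0.0

def composite_multiscale_diversity_entropy(x, max_scale=30, m=4, eps=30):
    """Mean DE over the tau shifted coarse-grained series at each scale tau."""
    x = np.asarray(x, dtype=float)
    features = np.empty(max_scale)
    for tau in range(1, max_scale + 1):
        values = []
        for k in range(tau):          # offset of the first averaging window
            seg = x[k:]
            n_seg = seg.size // tau   # leftover samples at the end are dropped
            y = seg[: n_seg * tau].reshape(n_seg, tau).mean(axis=1)
            values.append(diversity_entropy(y, m, eps))
        features[tau - 1] = np.mean(values)
    return features

signal = np.random.default_rng(0).standard_normal(1024)
cmde = composite_multiscale_diversity_entropy(signal, max_scale=30)
```

Each scale still yields a single feature, so the feature vector has the same length as for MDE; only the number of DE evaluations per scale grows from 1 to tau.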

kNN classifier
kNN is a commonly used supervised learning model that has few parameters, is easy to implement, and is computationally efficient. In this paper, kNN serves as the classifier in the fault diagnosis framework, and a detailed discussion of kNN is beyond the scope of this paper. The principle of kNN is shown in Figure 3. The main calculation steps of the kNN classifier are briefly reviewed as follows:
Step 1. Calculate the distance between the test sample and each training sample.
Step 2. Rank the distances in ascending order.
Step 3. Select the k samples with the smallest distances.
Step 4. Count how often each label occurs among these k samples.
Step 5. Return the label with the highest frequency among the k samples as the predicted class of the test sample.
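The five steps map directly onto a short function. The sketch below assumes the Euclidean distance and breaks voting ties in favor of the label encountered first among the nearest samples; neither choice is specified in the text, and the name `knn_predict` is hypothetical.

```python
import numpy as np
from collections import Counter

def knn_predict(train_x, train_y, test_x, k=5):
    """Predict a label for each row of test_x by majority vote of k neighbors."""
    train_x = np.asarray(train_x, dtype=float)
    test_x = np.atleast_2d(np.asarray(test_x, dtype=float))
    train_y = list(train_y)
    predictions = []
    for sample in test_x:
        # Step 1: distance from the test sample to every training sample.
        dist = np.linalg.norm(train_x - sample, axis=1)
        # Step 2: rank the distances in ascending order.
        order = np.argsort(dist, kind="stable")
        # Step 3: keep the k closest training samples.
        nearest = order[:k]
        # Step 4: count how often each label occurs among them.
        votes = Counter(train_y[int(i)] for i in nearest)
        # Step 5: the most frequent label is the prediction.
        predictions.append(votes.most_common(1)[0][0])
    return predictions

train_x = [[0, 0], [0, 1], [1, 0], [5, 5], [5, 6], [6, 5]]
train_y = ["a", "a", "a", "b", "b", "b"]
labels = knn_predict(train_x, train_y, [[0.2, 0.2], [5.5, 5.5]], k=3)
```

In the diagnosis framework, each row of `train_x` would be the multiscale entropy feature vector of one vibration sample.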

Compound fault diagnosis frame based on CMDE and kNN
In practical engineering, multiple components of rotating machinery may fail at the same time. In this circumstance, the measured vibration signal consists of multiple features from different components, which requires the multiscale procedure to explore richer features from deeper scales. However, the original MDE has poor stability at high scales. Thus, this paper proposes a novel compound fault diagnosis framework based on CMDE and kNN, shown in Figure 4. It is implemented in three stages: data acquisition, CMDE-based feature extraction, and kNN-based fault classification.

Simulation setting
In this section, synthetic gear signals are used to evaluate the stability of the proposed CMDE method. The synthetic gear signals simulate the normal meshing and pitting faults of two spur gears, as shown in Figure 5(a) and (b). The synthetic signals are generated following Liang et al. 16 Here we briefly review the basic assumptions and the dynamic model of the synthetic gear signals. The basic assumptions of the simulation are as follows.
(1) The two gears are standard spur gears, and the meshing between the gears is simplified as a spring-damper structure.
(2) Friction is ignored.
(3) The support of the shaft and bearing is simplified as a spring-damper structure.
(4) The lumped parameter method is adopted, which regards each component of the system as a mass point and the connection between components as a spring-damper structure.
(5) The vibration in the x direction is a free response and disappears due to inherent damping. In this paper, we focus only on the motion in the y direction.
The dynamic model is given in Figure 6. The vertical motion of the input/output gear in the y direction is expressed as equations (9)-(10),
where $m_1$/$m_2$ indicates the mass of the input/output gear, $y_1$/$y_2$ indicates the linear displacement of the input/output gear in the y direction, $F_k$/$F_c$ indicates the stiffness/damping force between the gears, $F_{k1}$/$F_{c1}$ indicates the stiffness/damping force of the input bearing, and $F_{k2}$/$F_{c2}$ indicates the stiffness/damping force of the output bearing. The rotary motion of the input/output gear is expressed as equations (11) and (12), where $I_1$/$I_2$ indicates the rotational inertia of the input/output gear, $\theta_1$/$\theta_2$ indicates the angular displacement of the input/output gear, $M_{pk}$/$M_{pc}$ indicates the stiffness/damping moment of the input coupling, $R_{b1}$/$R_{b2}$ indicates the base circle radius of the input/output gear, and $M_{gk}$/$M_{gc}$ indicates the stiffness/damping moment of the output coupling. The rotary motion of the motor/load is expressed as equations (13) and (14),
where $I_m$/$I_b$ indicates the rotational inertia of the motor/load, $\theta_m$/$\theta_b$ indicates the angular displacement of the motor/load, and $M_1$/$M_2$ indicates the input/output torque. The values of the forces and moments are expressed as equations (15)-(24), for instance
$M_{pk} = k_p(\theta_m - \theta_1)$ (19)
where $k_1$/$k_2$ indicates the vertical radial stiffness of the input/output bearings, $c_1$/$c_2$ indicates the vertical radial damping of the input/output bearings, $k_p$/$k_g$ indicates the torsional stiffness of the input/output flexible coupling, $c_p$/$c_g$ indicates the damping coefficient of the input/output flexible coupling, $k_t$ indicates the total mesh stiffness, and $c_t$ indicates the mesh damping coefficient. Additionally, white Gaussian noise is added to the synthetic signals at a signal-to-noise ratio of 10 dB. The data length is set to 1024 points.
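Adding white Gaussian noise at a prescribed signal-to-noise ratio can be sketched as follows. The example assumes that signal and noise power are both measured as mean squares, which is the common convention but is not stated in the text; the function name `add_white_gaussian_noise` and the sine test signal are hypothetical.

```python
import numpy as np

def add_white_gaussian_noise(signal, snr_db=10.0, rng=None):
    """Return signal plus zero-mean white Gaussian noise at the requested SNR (dB)."""
    rng = np.random.default_rng() if rng is None else rng
    signal = np.asarray(signal, dtype=float)
    signal_power = np.mean(signal ** 2)
    # SNR_dB = 10 * log10(P_signal / P_noise)  =>  P_noise = P_signal / 10^(SNR_dB / 10)
    noise_power = signal_power / 10.0 ** (snr_db / 10.0)
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise

# Stand-in clean signal (not the gear model): a long sine so the SNR estimate is tight.
clean = np.sin(2 * np.pi * np.arange(100_000) / 200)
noisy = add_white_gaussian_noise(clean, snr_db=10.0, rng=np.random.default_rng(0))
measured_snr_db = 10 * np.log10(np.mean(clean ** 2) / np.mean((noisy - clean) ** 2))
```

At 10 dB the noise power is one tenth of the signal power, so the noise standard deviation is about 0.32 times the signal's root-mean-square value.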

Results and analysis
After the synthetic gear signals are generated as described in Section "Simulation setting," MDE and CMDE are used to estimate their dynamical complexity. Each method is run 20 times, and the standard deviations of the diversity entropy at each scale are plotted in Figure 7. For a fair comparison, MDE and CMDE are set with the same parameters: embedding dimension $m = 4$, number of symbols $\varepsilon = 30$, and maximum scale factor $\tau = 30$. Figure 7(a) and (b) show that the standard deviation of MDE increases with the scale factor. This agrees with the discussion in Section "Methodology": since the averaging window grows with the scale factor, the multiscale time series is greatly shortened. The short multiscale time series cannot provide sufficient information for DE to estimate the dynamical complexity. Thus, the diversity entropy value fluctuates at high scales, which appears as unstable behavior in Figure 7. An entropy value with a large standard deviation makes the extracted features unreliable at high scales, which restricts the wider application of diversity entropy.
Conversely, CMDE gives a stable estimate of the dynamical complexity at high scales. As can be seen from both Figure 7(a) and (b), the standard deviation of CMDE remains no more than 2% even at scale 30. Benefiting from the composite multiscale procedure, CMDE repeatedly estimates the dynamical complexity and averages the estimates, which yields a nearly constant diversity entropy value. From a classification perspective, a stable multiscale feature is more conducive to categorization.
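The stability comparison can be reproduced in spirit with a small experiment. The sketch below is not the gear simulation used above: it assumes a hypothetical two-tone stand-in signal, a hypothetical 12,800 Hz sampling rate, and fresh 10 dB white Gaussian noise in each of the 20 runs. All function names are illustrative.

```python
import numpy as np

def diversity_entropy(x, m=4, eps=30):
    """Compact diversity entropy (orbits, cosine similarity, histogram, entropy)."""
    x = np.asarray(x, dtype=float)
    orbits = np.lib.stride_tricks.sliding_window_view(x, m)
    a, b = orbits[:-1], orbits[1:]
    num = np.sum(a * b, axis=1)
    den = np.linalg.norm(a, axis=1) * np.linalg.norm(b, axis=1)
    cos = np.clip(np.divide(num, den, out=np.zeros_like(num), where=den > 0), -1.0, 1.0)
    counts, _ = np.histogram(cos, bins=eps, range=(-1.0, 1.0))
    p = counts[counts > 0] / counts.sum()
    return float(-np.sum(p * np.log(p)) / np.log(eps)) + 0.0

def shifted_coarse_series(x, tau):
    """The tau coarse-grained series obtained by shifting the window grid."""
    series = []
    for k in range(tau):
        seg = x[k:]
        n_seg = seg.size // tau
        series.append(seg[: n_seg * tau].reshape(n_seg, tau).mean(axis=1))
    return series

def std_per_scale(clean, scales, runs=20, snr_db=10.0, seed=0):
    """Standard deviation over noisy runs of MDE and CMDE at each scale."""
    rng = np.random.default_rng(seed)
    noise_std = np.sqrt(np.mean(clean ** 2) / 10.0 ** (snr_db / 10.0))
    mde = np.empty((runs, len(scales)))
    cmde = np.empty((runs, len(scales)))
    for r in range(runs):
        x = clean + rng.normal(0.0, noise_std, clean.size)
        for c, tau in enumerate(scales):
            de = [diversity_entropy(y) for y in shifted_coarse_series(x, tau)]
            mde[r, c] = de[0]         # zero shift is the ordinary multiscale series
            cmde[r, c] = np.mean(de)  # composite value averages all shifts
    return mde.std(axis=0), cmde.std(axis=0)

t = np.arange(1024) / 12_800  # hypothetical sampling rate
clean = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 900 * t)
std_mde, std_cmde = std_per_scale(clean, scales=range(1, 31))
```

Comparing `std_mde` and `std_cmde` scale by scale gives the same kind of curve as Figure 7; at scale 1 the two coincide because there is only one possible window offset.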

Experiment setting
To evaluate the performance of the proposed CMDE-kNN method, a compound mechanical fault experiment was conducted on the Machinery Fault Simulator test rig produced by SpectraQuest. The test rig and its schematic diagram are shown in Figure 8(a) and (b), respectively. The experiment used an electric motor to drive a three-way gearbox. The rotating speed of the motor was set to 3000 rpm. The three-way gearbox consists of one pair of straight-cut bevel gears with a gear ratio of 18:27 (see Figure 8(b)). The load was set to 5 in-lbs of torque. The accelerometer was mounted on the top of the gearbox to acquire the vibration signal. The sampling frequency was 12,800 Hz.
Note that since the rotating speed of the motor was set to 3000 rpm, one revolution takes 0.02 s. A sample with a data length of 1024 points (0.08 s) therefore contains four revolutions. Theoretically, for periodic signals, the data length should be greater than one period, so a sample of 1024 points contains sufficient fault information for analysis.
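The sample-length argument is a short piece of arithmetic, reproduced here from the stated motor speed, sampling frequency, and data length. The variable names are illustrative.

```python
motor_speed_rpm = 3000       # stated motor speed
sampling_rate_hz = 12_800    # stated sampling frequency
sample_length = 1024         # stated data length in points

# One motor revolution: 60 s / 3000 rev = 0.02 s.
revolution_period_s = 60.0 / motor_speed_rpm
# One sample: 1024 points / 12,800 Hz = 0.08 s.
sample_duration_s = sample_length / sampling_rate_hz
# Motor revolutions covered by one sample: 0.08 s / 0.02 s = 4.
revolutions_per_sample = sample_duration_s / revolution_period_s
# Points recorded per motor revolution: 12,800 Hz * 0.02 s = 256.
points_per_revolution = sampling_rate_hz * revolution_period_s
```

The count refers to revolutions of the motor shaft; the driven gear turns at a different rate because of the 18:27 gear ratio.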
The different health conditions were realized by replacing the test bearing and test gear with artificially damaged ones, as shown in Figure 9. The test bearing was a 6202 deep groove ball bearing. Six types of faults were designed in this experiment: pitting in the driving gear (PIT, see Figure 9(a)), broken tooth in the driving gear (BRO, see Figure 9(b)), missing tooth in the driving gear (MIS, see Figure 9(c)), ball fault (BF, see Figure 9(d)), inner race fault (IRF, see Figure 9(e)), and outer race fault (ORF, see Figure 9(f)). In total, there are 16 health conditions, as listed in Table 1. There are 100 samples in each health condition and 1600 samples in total. In each test, we randomly choose 50% of the samples from each health condition as training samples, and the remaining samples are treated as testing samples. The data length of each sample is 1024 points. The waveforms of the 16 health conditions are shown in Figure 10.
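The per-condition random split can be sketched as below. The function name `stratified_split` and the integer label encoding 0..15 are assumptions made for this example; the 16 conditions with 100 samples each follow the description above.

```python
import numpy as np

def stratified_split(labels, train_fraction=0.5, rng=None):
    """Randomly pick train_fraction of each class for training, the rest for testing."""
    rng = np.random.default_rng() if rng is None else rng
    labels = np.asarray(labels)
    train_parts, test_parts = [], []
    for cls in np.unique(labels):
        idx = rng.permutation(np.flatnonzero(labels == cls))  # shuffle within the class
        n_train = int(round(train_fraction * idx.size))
        train_parts.append(idx[:n_train])
        test_parts.append(idx[n_train:])
    return np.sort(np.concatenate(train_parts)), np.sort(np.concatenate(test_parts))

# 16 health conditions with 100 samples each, encoded as labels 0..15.
labels = np.repeat(np.arange(16), 100)
train_idx, test_idx = stratified_split(labels, rng=np.random.default_rng(0))
```

Splitting within each condition keeps the classes balanced, so every condition contributes 50 training and 50 held-out samples in each repetition.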
For comparison, MDE, MPE, MFE, MSE, CMPE, CMFE, and CMSE were used to identify the health conditions with the same procedure as CMDE. The parameters of each entropy-based method were set to their best values 17 as listed in Table 2. For fairness, the scale factor was the same for all eight methods, and kNN was used as the classifier for each method.

Results and analysis
The final classification results are shown in Figure 11, from which three conclusions can be drawn. First, all the composite multiscale-based entropy methods outperform their multiscale-based counterparts. For example, the diagnosis accuracy of CMPE (84.76%) is higher than that of MPE (52.96%). This can be attributed to the stable entropy estimation produced by the composite multiscale procedure: the moving averaging procedure repeatedly estimates the dynamical complexity and averages the estimates to obtain a nearly constant entropy value. Thus, the kNN classifier can distinguish the fault types much more easily.
Second, the standard deviation of the composite multiscale-based entropy methods is smaller than that of the multiscale-based entropy methods. This can also be attributed to the stability of the composite multiscale procedure: the small standard deviation at high scales makes the cluster boundaries much clearer, which makes it much easier for the kNN classifier to categorize the samples near the boundaries. The cluster boundaries are discussed in detail later.
Last, CMDE achieves the highest accuracy among the eight methods. Benefiting from the three advantages of diversity entropy, 15 MDE achieves the highest accuracy among the four multiscale-based entropy methods. Likewise, CMDE achieves the highest accuracy among the four composite multiscale-based entropy methods. This suggests that the final classification accuracy of a multiscale-based method depends on the entropy method itself: the stronger the feature extraction ability of the entropy method, the higher the clustering ability of the fault features.
Additionally, the relationship between scale and accuracy has been investigated, and the result is shown in Figure 12, which plots the test accuracy of MDE and CMDE for different numbers of features (scales). For MDE, the accuracy increases with the scale when the scale is less than 16. This indicates that the measured vibration signal of a compound fault consists of multiple features from different components, which requires the multiscale procedure to explore richer features from deeper scales. The multiscale procedure used in MDE is in fact a Haar wavelet: as the scale increases, the fault information hidden at different spectral resolutions can be extracted. However, the accuracy decreases when the scale exceeds 16, which is consistent with the unstable MDE estimates at high scales reported in Section "Simulation evaluation."
To show the feature extraction ability of CMDE more intuitively, the Fisher score (FS) and visualized features are adopted in this section. The Fisher score is a simple criterion for judging the separability of the extracted features. FS uses the ratio of the inter-class distance to the intra-class distance, as in equation (25). A larger FS value indicates better clustering ability of the features.
where $S_B$ is the distance between different health conditions, and $S_W$ is the distance within a single condition. When the ratio of $S_B$ to $S_W$ is maximized, the samples can be separated easily. $N$ indicates the total number of samples, $C$ represents the total number of health conditions, $N_i$ indicates the number of samples in health condition $i$, $m_i$ is the mean vector of the samples in condition $i$, $m_0$ is the mean vector of all samples, and $x_{i,j}$ is the two-dimensional feature vector of the $j$th sample in health condition $i$.
The visualized features and FS of each method are displayed in Figure 13. Four conclusions can be made as follows:
First, among the multiscale-based methods (Figure 13(a) MDE, Figure 13 ...
Second, the class centers of the composite multiscale-based methods are much clearer than those of the multiscale-based methods. The most obvious example is MFE and CMFE: for MFE, we can hardly observe the class centers in Figure 13(c), whereas for CMFE, each type of sample forms a clear cluster in Figure 13(g). This demonstrates the effectiveness of the composite multiscale procedure in feature extraction.
Third, although MDE (Figure 13(a)) forms explicit class centers, the inter-class distance is very small; in other words, the clusters lie close to each other. Moreover, some samples are far from their class center, for example, condition 15 (the pink samples in Figure 13(a)). For CMDE in Figure 13(e), not only is the inter-class distance larger, but there are also fewer samples away from the class centers.
Last, CMDE achieves the highest FS (7.87), which indicates that CMDE has the best feature extraction ability among the eight methods. This result agrees with the classification accuracy (Figure 11). It suggests that the proposed CMDE has the best feature extraction ability and that this advantage does not depend on the classifier.
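The Fisher score used above can be sketched as a ratio of between-class to within-class scatter. The exact form of equation (25) is not reproduced here, so the code assumes a common trace-ratio variant: $S_B$ is the sample-weighted squared distance of the class means from the global mean, and $S_W$ is the summed squared distance of samples from their own class mean, both divided by $N$. The function name `fisher_score` is hypothetical.

```python
import numpy as np

def fisher_score(features, labels):
    """Ratio of between-class to within-class scatter (assumed trace-ratio form)."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    n_total = features.shape[0]
    m0 = features.mean(axis=0)                 # mean vector of all samples
    s_b = 0.0
    s_w = 0.0
    for cls in np.unique(labels):
        x_i = features[labels == cls]          # samples of health condition i
        m_i = x_i.mean(axis=0)                 # mean vector of condition i
        s_b += x_i.shape[0] * np.sum((m_i - m0) ** 2)
        s_w += np.sum((x_i - m_i) ** 2)
    return (s_b / n_total) / (s_w / n_total)

# Two tight clusters, four units apart along the first feature.
points = np.array([[0.0, 0.0], [0.0, 2.0], [4.0, 0.0], [4.0, 2.0]])
classes = np.array([0, 0, 1, 1])
score = fisher_score(points, classes)

# Moving the clusters farther apart raises the score.
far_points = points + np.array([[0.0, 0.0], [0.0, 0.0], [4.0, 0.0], [4.0, 0.0]])
far_score = fisher_score(far_points, classes)
```

For the first toy data set the between-class term is 16 and the within-class term is 4, giving a score of 4; doubling the separation of the clusters quadruples the between-class term.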

Conclusions
The conclusions of this paper are summarized as follows:
(1) A novel feature extraction method called composite multiscale diversity entropy (CMDE) has been proposed. CMDE uses the mean complexity value of the composite multiscale time series to enhance the stability of diversity entropy at high scales, which enables diversity entropy to extract richer fault information from deeper scales for the compound fault diagnosis of rotating machinery.
(2) The simulation results show that the stability of diversity entropy is greatly improved, which allows richer fault information to be extracted from deeper scales.
(3) The experimental results show that the proposed CMDE has the best feature extraction ability in compound mechanical fault diagnosis compared with the existing entropy-based methods.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.