Condition monitoring of water pump bearings using ensemble classifier

Bearing faults are reported to be the major cause of centrifugal pump (CP) failures. Limited literature is available on diagnosing minor scratches on the bearing surface through non-intrusive condition monitoring techniques. Recent research on the analysis of bearing scratches through non-intrusive motor current analysis (MCA) has shown encouraging results, in which machine learning and convolutional neural networks (CNNs) were compared for the classification of healthy and faulty bearings (holes and scratches). A fault classification accuracy of 89.26% was reported for MCA combined with machine learning and CNN algorithms, which is low. The key factors behind the low accuracies were identified as the low amplitudes of the harmonics in the MCA spectrum, the magnitude of environmental noise, and the use of conventional feature extraction techniques. This paper tackles this problem by developing a novel feature extractor (NFE) that extracts powerful features from the integrated current and voltage sensor data. The NFE is built on a threshold-based decision mechanism that identifies the location of the feature harmonic, extracts the feature, measures the amplitude of the fault component, and compares it with a derived threshold. Experimental data have been collected for bearing ball (BB), bearing cage (BC), inner race (IR), and outer race (OR) faults, and the performance of the NFE has been tested with an ensemble classifier (CatBoost). The NFE achieves better classification accuracy (99.2% for an individual feature and 100% with a combination of two or more features) than previously reported methods.


Introduction
CPs usually operate in industrial environments with high temperatures, humidity, dust, and noise. [1][2][3] The bearing is one of the critical components of CPs and is prone to various faults. Figure 1 indicates that CP maintenance cost is the highest in the industry, and Figure 2 shows that malfunctions in bearings are the most frequent among all components. 4 Thus, the fault diagnosis of CP bearings is of great importance. [5][6][7][8][9][10] Bearings are responsible for more than 41% of machine breakdowns; thus, this paper investigates various faults in bearings. The bearing structure and its components are shown in Figure 3. 11,12 Most of the literature has focused on diagnostic techniques for holes and cracks in the IR and OR. [13][14][15][16] Only two papers in the published literature focus on the diagnosis of minor scratches in the OR. 17,18 The diagnosis of minor scratches in the IR is not reported in the literature. [19][20][21][22] Thus, the scope of this study is to examine minor scratches in the IR and OR along with faults in the BB and BC.
The typical sensors used in the fault diagnosis of CPs are accelerometers (vibration sensors), current transducers (CTs), voltage transducers (VTs), noise sensors, temperature sensors, and magnetic flux sensors. 23 Fault diagnosis techniques are usually named after the sensor type used for data collection. Vibration analysis, thermal analysis, magnetic flux analysis, noise analysis, and acoustic emission techniques are categorized as intrusive techniques, as their sensors are installed on the machine surface. Intrusive techniques are well known in the industry, and ISO standards are available to categorize machine failures based on the sensor data. However, some machines are located in positions where access for sensor fitting and status monitoring is not easy, so intrusive techniques are not suitable for such applications. Furthermore, the high cost of the sensors is another major disadvantage of intrusive techniques. [24][25][26][27][28][29] In the past, researchers developed a non-intrusive condition monitoring technique based on the fast Fourier transform of the motor line current and analysis of the fault-related harmonics. This technique was named motor current analysis (MCA). [30][31][32][33][34] Several papers have been published in the past decade to improve the performance of MCA by developing pre-processing algorithms that make the condition monitoring system reliable, efficient, and economical while reducing its complexity. [35][36][37][38] Although MCA is a better alternative to intrusive techniques, the harmonics associated with defects are suppressed by the amplitude of the line frequency, giving false alarms. Another reported issue with MCA is its high tendency toward false alarms in a highly noisy environment.
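The suppression of fault harmonics by the line-frequency component can be illustrated with a small numerical sketch. The signal below is synthetic (the supply frequency, fault sideband frequency, and sampling parameters are illustrative assumptions, not values from the paper), and a naive DFT stands in for the FFT used in practice:

```python
import math
import cmath

def spectrum_db(x, fs):
    """Naive DFT magnitude spectrum in dB. Fine for short illustrative
    signals; a real system would use an FFT library."""
    n = len(x)
    out = {}
    for k in range(n // 2):
        s = sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        out[k * fs / n] = 20 * math.log10(abs(s) / n + 1e-12)
    return out

# Hypothetical 500 Hz sampling, 1 s record (illustrative, not from the paper)
fs, n = 500, 500
sig = [math.sin(2 * math.pi * 50 * t / fs)           # dominant supply component
       + 0.01 * math.sin(2 * math.pi * 73 * t / fs)  # weak fault-related sideband
       for t in range(n)]
spec = spectrum_db(sig, fs)
# The 73 Hz fault component sits about 40 dB below the 50 Hz line component,
# which is why small defects are easily masked in the current spectrum.
```

This is exactly the regime the paper targets: the defect harmonic is present but tens of decibels below the line frequency, so noise variations can easily hide it.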
Fault diagnosis of machines through artificial intelligence (AI) has been a trend in recent years, and various AI algorithms such as support vector machines (SVM), the Naive Bayes classifier (NBC), k-nearest neighbor (k-NN), and neural networks (NN) have been used in the literature for the diagnosis and classification of bearing faults. [39][40][41] However, low classification accuracies have been observed for conventional machine learning models in the diagnosis and classification of bearing scratch faults. 17,18,[42][43][44][45][46][47] Furthermore, the capability of individual features to determine fault classification accuracy has never been tested. The literature review given in the earlier paragraphs highlights the limitations of intrusive and non-intrusive condition monitoring techniques and points to the significant improvements needed for non-intrusive techniques to enable reliable fault diagnosis in highly noisy environments. Low classification accuracy has been reported as the main issue of conventional machine learning models. The literature on bearing fault diagnosis gives a potential research direction for reliable diagnosis and classification of scratch faults in the OR and IR. Thus, the contributions of this paper are: (1) a novel feature extractor (NFE) that extracts only powerful features from the IPS data, with the objective of enhancing the classification accuracy of ensemble classifiers; and (2) an ensemble classifier (CatBoost) developed to verify the effectiveness of the NFE and to test the capability of individual features as well as combinations of IPS features for fault diagnosis.
The rest of the paper has been structured as: Section 2 presents the mathematical steps for the feature identification, selection, extraction, and NFE development. Section 3 describes the condition monitoring setup. The results and discussions are provided in Section 4. Finally, the conclusion has been presented in Section 5.

Development of novel feature extractor
The features for bearing ball (BB), bearing cage (BC), inner race frequency (IRF), and outer race frequency (ORF) are shown in Table 1. The derivation of the fault features and the sample calculations are shown in Appendix A.
In Table 1, f_x1 represents the principal harmonic, f_x2 the lower sideband, and f_x3 the upper sideband.
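Although Table 1 and Appendix A are not reproduced here, the characteristic fault frequencies for the BB, BC, IR, and OR faults follow the classic bearing kinematics, and the current-spectrum sidebands sit at the supply frequency plus or minus the fault frequency. The sketch below uses hypothetical bearing geometry (ball count, diameters, shaft speed are illustrative assumptions):

```python
import math

def bearing_fault_frequencies(n_balls, d_ball, d_pitch, f_rot, contact_deg=0.0):
    """Classic kinematic fault frequencies of a rolling-element bearing
    (Hz), given rotation frequency f_rot and bearing geometry."""
    r = (d_ball / d_pitch) * math.cos(math.radians(contact_deg))
    return {
        "BPFO": 0.5 * n_balls * f_rot * (1 - r),                # outer-race defect
        "BPFI": 0.5 * n_balls * f_rot * (1 + r),                # inner-race defect
        "FTF":  0.5 * f_rot * (1 - r),                          # cage defect
        "BSF":  (d_pitch / (2 * d_ball)) * f_rot * (1 - r**2),  # ball defect
    }

def mca_sidebands(f_supply, f_fault, k=1):
    """Lower and upper sidebands of a fault frequency around the supply
    frequency in the motor current spectrum."""
    return abs(f_supply - k * f_fault), f_supply + k * f_fault

# Hypothetical geometry and speed, for illustration only
freqs = bearing_fault_frequencies(n_balls=8, d_ball=7.9e-3,
                                  d_pitch=33.5e-3, f_rot=24.0)
f_x2, f_x3 = mca_sidebands(50.0, freqs["BPFO"])  # lower/upper sidebands
```

A useful sanity check on such formulas is that BPFO + BPFI = n_balls × f_rot and BPFO = n_balls × FTF.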
The novel feature extractor (NFE) has been developed using the following steps:
1. The data collected from the voltage and current sensors is converted into a frequency spectrum using the IPS algorithm.
2. The normal bearing data is used as a benchmark for amplitude calculation and comparison.
3. The frequencies associated with the faults, f_x1, f_x2, and f_x3, are identified.
4. An algorithm automatically extracts f_x1, f_x2, and f_x3 from the spectrum.
5. The magnitudes of the extracted components are measured.
6. The amplitude difference between the benchmark values and the values of f_x1, f_x2, and f_x3 is calculated. A zero difference represents a healthy bearing; a non-zero difference indicates the presence of a bearing scratch.
7. The magnitudes of those f_x1, f_x2, f_x3 features whose amplitude difference is greater than zero are compared with the threshold value.
8. The features whose values exceed the threshold are segregated and used in the ensemble learning algorithm for bearing fault classification.
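The steps above can be sketched as a small routine. The spectra are represented as frequency-to-amplitude mappings; the function names, the bin-matching tolerance, and the numeric values are illustrative assumptions, not the paper's implementation:

```python
def nfe_extract(spectrum, benchmark, fault_freqs, threshold, tol=0.5):
    """Threshold-based NFE sketch: keep only fault components whose
    amplitude rise over the healthy benchmark exceeds the threshold.

    spectrum / benchmark : dict mapping frequency (Hz) -> amplitude (dB)
    fault_freqs          : dict mapping feature name -> expected frequency
    """
    def peak(spec, f0):
        # Locate the spectral bin nearest the expected fault harmonic
        f = min(spec, key=lambda fb: abs(fb - f0))
        return spec[f] if abs(f - f0) <= tol else float("-inf")

    robust = {}
    for name, f0 in fault_freqs.items():
        diff = peak(spectrum, f0) - peak(benchmark, f0)
        # Zero difference -> healthy; below-threshold rise -> weak feature
        if diff > 0 and diff >= threshold:
            robust[name] = diff
    return robust

# Toy spectra: the f_x2 sideband rises 15 dB over the benchmark, f_x3 only 1 dB
benchmark = {50.0: -6.0, 73.4: -60.0, 123.4: -62.0}
spectrum  = {50.0: -6.0, 73.4: -45.0, 123.4: -61.0}
robust = nfe_extract(spectrum, benchmark,
                     {"f_x2": 73.4, "f_x3": 123.4}, threshold=10.0)
```

With these toy numbers only f_x2 survives the threshold, mirroring the segregation of robust versus weak features described above.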
The flow chart of the NFE development has been shown in Figure 4.

Experimental procedure
The system developed for the performance monitoring of the centrifugal pump is shown in Figure 5. The current and voltage sensors are placed on the electric power line. The data collected from the sensors is interfaced through an NI PXIe 6363 and examined in LabVIEW. Four faults are simulated in the bearing: Type 1 is a ball defect, Type 2 is a broken cage, Type 3 is a scratch of 0.5-mm width, 0.5-mm depth, and 5-mm length on the inner surface of the bearing, and Type 4 is a scratch on the outer surface of the bearing with the same dimensions as Type 3. The simulated bearing faults are shown in Figure 6. The NFE has been used to extract the amplitude of each feature (f_x1, f_x2, f_x3) shown in Table 1 and to calculate the amplitude difference for each type of fault. If the output of the NFE is zero, the bearing is normal; if it is non-zero, a bearing fault is present. Figures 7 to 10 show that the various faults cause amplitude differences in the range of 10-18 dB. Such small amplitude differences could be missed when the machines operate in an industrial environment, where noise variations are sometimes much larger than the fault amplitudes. Such a scenario would cause missed or false detections in an automatic fault detection system. This issue has been addressed here by comparing the amplitude differences with a threshold value derived with the noise variations taken into consideration. The harmonic components whose amplitudes are higher than the threshold value are considered robust features and are segregated and fed to the ensemble learning classifier; those whose amplitudes are lower than the threshold are weak features and are neglected. The feature segregation is shown in Table 2 and the threshold derivation in Appendix B.
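Appendix B is not reproduced here, but one plausible form of a noise-aware threshold is a few standard deviations above the amplitude variation observed on healthy-bearing recordings. This rule (including the factor k) is an assumption for illustration, not the paper's derivation:

```python
import statistics

def derive_threshold(noise_diffs_db, k=3.0):
    """Hypothetical noise-aware threshold (dB): k population standard
    deviations above the mean amplitude variation measured at the fault
    bins of healthy-bearing spectra. k is an assumed design margin."""
    return statistics.mean(noise_diffs_db) + k * statistics.pstdev(noise_diffs_db)

# Example: healthy-bearing amplitude variations of 0, 2, and 4 dB
thr = derive_threshold([0.0, 2.0, 4.0])
```

A fault rise of 10-18 dB (Figures 7 to 10) would then only be accepted when it clears this margin, which is what suppresses false alarms from noise variation.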

Performance of fault classification algorithm
The features of the normal bearing (NB) and the four fault classes shown in Table 1 are extracted from the IPS through the NFE algorithm and utilized by the ensemble learning algorithm for bearing fault classification. The total number of samples is 640, of which 70% are used for training the algorithm and 20% for testing it and measuring its performance. The total number of features for classification is three, defined as A1, A2, and A3, where A1 is the amplitude at feature f_x1, A2 the amplitude at f_x2, and A3 the amplitude at f_x3. These are shown in the IPS spectra in Figures 7 to 10.
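The 70%/20% partition of the 640 samples can be sketched as a simple shuffled index split (the seed and the use of the remaining 10% are assumptions; the paper does not state them):

```python
import random

def split_indices(n_samples, train_frac=0.7, test_frac=0.2, seed=0):
    """Shuffled train/test index split matching the paper's 70%/20%
    partition. The remaining 10% is left unassigned here, since the
    paper does not state how (or whether) it is used."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)          # deterministic shuffle
    n_train = int(train_frac * n_samples)
    n_test = int(test_frac * n_samples)
    return idx[:n_train], idx[n_train:n_train + n_test]

train_idx, test_idx = split_indices(640)      # 448 training, 128 test samples
```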
CatBoost is a decision-tree gradient-boosting method. Its uniqueness mainly comprises three points. First, it reduces target leakage by modifying gradient boosting with an ordered boosting technique. Second, the algorithm works efficiently with small datasets. Third, it can handle a wide range of data types and formats. Since its inception, CatBoost has been used in many other areas, including finance, and with many different datasets, including time-series data. Each category gets a new binary feature in place of the original variable. Additionally, the algorithm uses random permutations to estimate leaf values while selecting the tree structure, in order to avoid the overfitting that is common with conventional gradient boosting methods. When dealing with categorical features during model training, CatBoost uses efficient modified target-based statistics that handle them properly, saving a significant amount of computational time. CatBoost's ordered boosting process is another key feature. In conventional gradient-boosted trees (GBTs), a prediction model is built by performing multiple boosting steps on all of the training data; as a result, the model's predictions shift, creating a form of target leakage. The ordered boosting architecture used by CatBoost overcomes this problem.
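The ordered target statistics mentioned above can be sketched in a few lines: each sample's categorical value is encoded using only the targets of samples that precede it in a permutation, so a sample's own label never leaks into its encoding. This is a simplified stand-in (in CatBoost the permutation is random and several permutations are used; here the given order plays that role, and the prior and smoothing constant are assumed defaults):

```python
def ordered_target_stats(categories, targets, prior=0.5, a=1.0):
    """CatBoost-style ordered target statistics (simplified sketch):
    encode category c at position i using only targets of earlier
    samples with category c, smoothed toward an assumed prior."""
    sums, counts, encoded = {}, {}, []
    for c, y in zip(categories, targets):
        s, n = sums.get(c, 0.0), counts.get(c, 0)
        encoded.append((s + a * prior) / (n + a))  # uses only past targets
        sums[c] = s + y                            # update AFTER encoding,
        counts[c] = n + 1                          # so y never leaks into itself
    return encoded

enc = ordered_target_stats(["A", "A", "B", "A"], [1, 0, 1, 1])
# First "A" sees no history (pure prior); later "A"s see only earlier labels
```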
The output of a single tree for a feature set $X$ can be written as

$$f_i(X) = \sum_{s \in S} \left( c_1 y_1 + c_2 y_2 \right),$$

where $S$ denotes the different paths to the leaf nodes in the decision tree, $c_1$ and $c_2$ denote the total weight coefficients in the left and right leaves, respectively, and $y_1$ and $y_2$ denote the formula values in the left and right leaves, respectively. The block diagram of the CatBoost classifier is shown in Figure 11, and its performance is shown in Table 3.
The accuracies of the individual features and of the combinations of features fed to the CatBoost classifier are shown in Table 3, and the confusion matrices for the various feature combinations are shown in Figure 12. CatBoost achieves a classification accuracy of 100% for individual features as well as for various combinations of features.

Comparison of the performance with other machine learning techniques
The classification accuracy of the CatBoost classifier has been compared with that of other well-known machine learning classifiers using the same dataset, and the summary of the results is shown in Table 4. It is concluded that the CatBoost classifier gives better accuracy than the Support Vector Machine, Naïve Bayes Classifier, and Gradient Boost Classifier.

Comparison of the performance with other published papers
The comparison with other published papers indicates that the performance of the proposed CatBoost approach is significantly improved. For example, the authors of reference 17 investigated minor scratches on the bearing outer surface. They used non-intrusive MCA for data collection and frequency analysis for feature extraction, and several classification algorithms such as SVM, k-NN, NBC, and CNN were used to measure the classification accuracy; however, they could achieve a maximum of 89.26% accuracy. Vakharia et al. 48 used a minimum permutation entropy based method for best wavelet feature selection. The authors of reference 52 used an improved AdaBoost classifier fed with multi-sensor data for motor fault diagnosis and achieved a classification accuracy of 92.38%; however, the multi-sensor data collection setup has a high cost, and the AdaBoost performance was not satisfactory. Zhang et al. 53 used time-domain vibration analysis of a gearbox, with wavelet packet decomposition as the feature extraction method and an AdaBoost classifier, achieving a classification accuracy of 96.94%. The CatBoost classifier used in the present work has shown better accuracy. This improvement in classification accuracy reflects the significance of the proposed novel feature extractor, the NFE. The comparison of the proposed work with other published papers is summarized in Table 5.

Conclusions
This paper has developed a novel feature extractor for the classification of various bearing faults. The locations of the features (f_x1, f_x2, f_x3) have been derived through mathematical models, and the novel NFE method has been developed to extract these features from the IPS plot. The NFE eliminates the impact of noise by adopting a thresholding technique. The amplitudes (A1, A2, A3) of the extracted fault features (f_x1, f_x2, f_x3) are measured and compared with the benchmark data to verify the amplitude variation. An ensemble learning approach, the CatBoost classifier, has been developed to classify various machine health conditions using individual features as well as combinations of the extracted features. It is concluded that the proposed method gives satisfactory classification accuracies compared with previously published techniques. The performance (classification accuracy) comparison of the developed feature extraction technique with other state-of-the-art methods has been summarized in Table 5.