Multiway dynamic nonlinear global neighborhood preserving embedding method for monitoring batch process

Aiming at the dynamic and nonlinear characteristics of batch process, a multiway dynamic nonlinear global neighborhood preserving embedding algorithm is proposed. For the nonlinear batch process monitoring, kernel mapping is widely used to eliminate nonlinearity by projecting the data into high-dimensional space, but the nonlinear relationships between batch process variables are limited by many physical constraints, and the infinite-order mapping is inefficient and redundant. Compared with the basic kernel mapping method which provides an infinite-order nonlinear mapping, the proposed method considers the dynamic and nonlinear characteristics with many physical constraints and preserves the global and local structures concurrently. First, the time-lagged window is used to remove the auto-correlation in time series of process variables. Second, a nonlinear method named constructive polynomial mapping is used to avoid unnecessary redundancy and reduce computational complexity. Third, the global neighborhood preserving embedding method is used to extract structures fully after the dynamic and nonlinear characteristics are processed. Finally, the effects of the proposed algorithm are demonstrated by a mathematical model and the penicillin fermentation process.


Introduction
Batch process mainly exists in food, semiconductor, chemical production, and so on. It is vital to develop approaches that ensure product quality and production safety requirements. As one of the popular monitoring methods, data-driven statistic process monitoring (SPM) is widely used in batch process. [1][2][3] Compared with the continuous process, the data of batch process have the structure of three-way data array. For the array, tensor and tucker models are used to deal with data directly. [4][5][6] Some SPM methods unfold three-way array data to two-way array data, and then traditional SPM models of batch process are established, [7][8][9] such as multiway principal component analysis (MPCA) 8 and multiway partial least squares (MPLS). 10,11 The methods unfolding three-way array are along the batch direction or variable direction, but the conventional linear monitoring methods cannot describe the underlying nonlinearity of process variables. 12 For the nonlinear characteristic, the regular methods combine kernel mapping, which linearizes the relationship with high-dimensional projection, such as kernel principal component analysis (KPCA), 13,14 kernel independent component analysis (KICA), 15 and kernel Fisher discriminant analysis (KFDA). 16 However, they only preserve global structure, and the neighborhood structure of data is neglected. The loss of neighborhood structure would inevitably influence the effect of process monitoring.
As one of the manifold learning algorithms, neighborhood preserving embedding (NPE) can preserve the local structure of data and has been widely used in processing monitoring. [17][18][19][20][21] As a nonlinear extension of NPE, 22 kernel NPE tries to preserve local topology relationship. But kernel NPE only extracts the neighborhood structure and ignores the global structure. 23,24 Therefore, to reveal the inherent properties of process data effectively, Tong and Yan 25 proposed a method that considered the global-local structures. Hui and Zhao 26 divided the batch process variables into related and independent variables and then used the corresponding method to realize process monitoring. Zhao and Tao 27 proposed a tensor global-local model in batch process monitoring. Although the global and local structures were preserved in dimension reduction, in real industrial process, the nonlinear relationships were often limited by physical constraints, resulting in kernel mapping being exorbitant. [28][29][30] The type of kernel function determines the dimensions of feature space. The radial basis kernel is widely used to solve nonlinear relationships between variables by providing an infinite-order mapping. 31 But the infinite-order mapping would lead to a higher computational complexity; especially for batch process, it needs to unfold the three-array data into two-array data. After unfolding, the row would be larger than before, so if the kernel mapping is used in batch process, the computational complexity is very high and may be beyond the computer memory.
In the present work, multiway dynamic nonlinear global neighborhood preserving embedding (MDNGNPE) method is proposed for dynamic nonlinear batch process monitoring. Because the data at sampling time t are related to the data before and after sampling time t, the time-lagged windows are used to remove the auto-correlation in time series of process variables. Then, for the nonlinear characteristic of process data, constructive polynomial mapping (CPM) is used for achieving nonlinear mapping and for considering the physical constraints. The mapped data samples are extracted to the global and local data structures using global neighborhood preserving embedding (GNPE). The MDNGNPE method is applied in a numerical process and the penicillin fermentation process to verify the monitoring effects.

NPE
The NPE preserves reconstruction neighbor relationships in projected low-dimensional subspace. For training dataset X, NPE can extract a linear projection matrix A to project the dataset X into low-dimensional subspace. The NPE procedures can expressed as follows: 1. Construct the adjacency graph: the neighbors are defined by k-nearest neighbors (knn). If x q is one of the knn of x g , there is an edge between the qth and gth node; otherwise, there is no edge. 2. Calculate the weight matrix W: the weight of node q to node g can be represented by W q, g ; if there is no edge, W q, g is set as 0. The weight can be computed as follows where P q k g = q 1 W q, g = 1, g = 1, 2, . . . , m, q 1 , . . . , q k , is determined by the adjacency graph.
3. Compute the projection: the projection matrix P can be obtained as follows where Y T = p T X, M = (I À W) T (I À W), and Y T Y = p T XX T p T = 1. The matrix P can be obtained by solving the following generalized eigenvector problem

MDNGNPE method
Three-way array data unfolding Batch process consists of batches, variables, and sampling points. Therefore, the data are expressed as X(I3J3Z) (I, J, and Z represent batches, variables, and sampling points, respectively). Here, we unfold X(I3J3Z) into X(I3ZJ). Then, X(I3ZJ) is rearranged as X(ZI3J). 21,32 This unfolding method reflects its dynamic characteristic over time. The detailed flowchart is shown in Figure 1. Dynamic nonlinear global neighborhood preserving embedding method The auto-correlation in time series of process variables widely exists in real industrial process. To remove autocorrelation, an efficient method is to utilize time-lagged data matrix. After batch process data are unfolded by hybrid approach, each variable is augmented on the original time series. The time-lagged data matrix is as follows where d is the window length, R is the total number of samples, and X(r) = ½x 1, r , x 2, r , . . . , x J, r T is the measurement vector of the rth sampling time (j = 1, 2, . . . , J is the measurement variable). The nonlinear relationship is limited by some physical constraints. The kernel mapping is used to solve the nonlinear relationship between variables with an infinite-order mapping; this may cause redundancy, and the computational complexity is high. CPM is introduced to employ nonlinear mapping. 28 For training data X 2 R m3n , after building time-lagged data matrix X d 2 R m3nd (nd = n3d, d is the window length), GNPE is used to maintain the global-local structure.
For the NPE algorithm, the local features can be extracted by equation (5) after seeking the nearest k neighbor points where Y T = p T X and M = X(I À W) T (I À W)X T ; the constraint is p T XX T p = 1. The global structure is preserved by seeking the maximum direction of variance, as shown in equation (6 where x = ( P n i = 1 x i )=n and G = (X À X i )(X À X i ) T . Since the low-dimensional space Y and the original space X maintain the same maximum variance direction, the same global structures are preserved in the extracted low-dimensional space. GNPE preserves both the global and local structures, and the optimal objective function is shown in equation (7) J Equation (7) can be converted to equation (8) by the projection relationship y T i = p T x i .

Fault detection
After unfolding three-dimensional data matrix X(I 3 J 3 Z) into X(ZI3J), we use the proposed MDNGNPE method to establish the statistic model. For the order of nonlinear mapping, K is specified by the user. After each iteration, squared prediction error (SPE) and Hotelling's T 2 statistics can be calculated by equation (14) SPE can be obtained by the online data X new , P(k) = ((B(k)) T B(k)) À1 B(k) is the transformation matrix of the kth step and k 2 f0, 1, . . . , Kg is the iteration times, L(k) = ((Y(k)) T Y(k))=(n À 1), and n is the row of Y(k).H(k) is obtained using equations (10)- (13).
The monitoring statistics of MDNGNPE are given as follows The SPE and T 2 control limits can be determined by kernel density estimation (KDE). 33 Here, the window length is chosen as 2 to remove the auto-correlation of process variables. 28,34 Monitoring steps Offline monitoring.
Step 1. Unfold three-way array data X(I3J3Z) into X(ZI3J) and normalize; Step 2. Construct the time-lagged data matrix using equation (4); Step 3. CPM is introduced to employ nonlinear mapping; Step4. GNPE is used to reduce dimensions after each iteration of CPM nonlinear mapping, and then the MDNGNPE method is obtained; Step 5. Calculate SPE and T 2 statistics based on MDNGNPE; Step 6. Compute control limits of SPE and T 2 using KDE.
Online monitoring.
Step 1. Normalize the new batch data; Step 2. Construct the time-lagged data matrix for the new samples; Step 3. Obtain the nonlinear mapping of new timelagged data matrix and then project into dimensional reduced subspace using GNPE; Step 4. Calculate SPE and T 2 statistics for new samples; Step 5. Judge whether the SPE and T 2 statistics exceed the control limits.
The monitoring flowchart based on the MDNGNPE method is shown in Figure 2.

Simulation verification and analysis
In order to verify the performance of MDNGNPE, two cases are selected for verification. The first one is a numerical process with dynamic nonlinear characteristic; the second one is a well-known penicillin fermentation process. In these two cases, the monitoring effects of MPCA, multiway neighborhood preserving embedding (MNPE), multiway global neighborhood preserving embedding (MGNPE) and MDNGNPE are compared and analyzed.

Numerical process
A nonlinear dynamic numerical system is as follows 7,35 q(s) = Aq(s À 1) + Bl 2 (s À 1) where q 2 R 331 , l 2 R 231 is the correlation input, v(s) is the Gaussian distribution noise with 0 mean and 0.01 variance, and w(s) is the Gaussian distribution noise with 0 mean and 1 variance. A, B, C, and D are as follows The process monitoring variables are x(s) = ½l 1 (s), l 2 (s), y 1 (s), y 2 (s), y 3 (s), and the normal operation dataset of 20 batches is generated, in which the small differences of variable correlation coefficients in each simulation are considered, and the whole duration is composed of 300 samples for each batch.
Furthermore, an abnormal case is added to test the effect of MDNGNPE: variable l 2 is linearly increased by 0:083(s À 150) from the 151th sampling point to the end (the whole duration is composed of 300 samples).
MPCA, MNPE, MGNPE, and MDNGNPE are applied to monitor the numerical system. Monitoring charts are shown in Figures 3-6, respectively. Figure 3 shows the MPCA monitoring results, and the SPE and T 2 statistics alarm the fault at the 251th and 223th sample, but the fault is added at the 151th sampling point, due to which there is a large delay in fault detection. Figure 4 shows the SPE and T 2 statistics of MNPE; we can see that the fault is detected at the 241th and 240th sample, respectively, which causes a large delay in fault detection. Figure 5 shows the monitoring results of MGNPE; the SPE and T 2 statistics alarm the fault at the 241th and 232th sample, respectively, due to which there is a large delay. By contrast, the monitoring results of MDNGNPE are shown in Figure 6; the SPE statistic detects the fault at the 215th sample, and the T 2 statistic detects the fault at the 205th sample. It is obvious that MDNGNPE responds more quickly and detects the fault earlier than the other three methods. It shows that MDNGNPE has better monitoring performance than MPCA, MNPE, and MGNPE. This is due to the fact that the MDNGNPE method can fully extract the nonlinear dynamic characteristic of the process data.

Penicillin fermentation process
The benchmark simulation of penicillin production is a typical dynamic, nonlinear batch process. In this paper, batch process data were generated by the standard simulation platform Pensim 2.0 in the penicillin fermentation process. 36 Pensim2.0 developed by Illinois Institute of Technology to study the typical batch process more conveniently can produce penicillin fermentation process data under different initial conditions and different working conditions for analysis and research. The production process of penicillin fermentation is shown in Figure 7. The fermentation process lasts for a total of 400 h, and the sampling interval is 1 h. A total of 36 normal batches are obtained. The 10 measurement variables which can represent the main characteristics of process are chosen, so the normal data X (36 3 10 3 400) can be obtained.   The fault detection capability is evaluated using six fault cases that involve different fault types and magnitudes. The details of fault cases are shown in Table 1. Tables 2 and 3 show the monitoring results. We can see from the tables that step fault 1 and step fault 3 can be detected by four methods. Although step fault 5 cannot be detected by these four methods immediately, the performance by the MDNGNPE method is better. There is a delay in fault detection because of the slow spread of the change in glucose substrate rate through the related variables. In contrast to step faults, the ramp faults are hard to detect because of slow changes in its variables. For ramp fault 2, ramp fault 4, and ramp fault 6, the MDNGNPE monitoring method first detects the fault and synthesizes fault detection rate (FDR) and false alarm rate (FAR), which can accurately distinguish normal and abnormal states.
Fault 2 is the ramp fault of aeration rate. Figures 8-11 show the monitoring results of MPCA, MNPE, MGNPE, and MDNGNPE methods under fault 2, respectively. Figure 8 shows the fault detection result of MPCA; the SPE and T 2 alarm the fault at the 300th and 308th sample, and it cannot detect the fault on time when fault 2 occurs; the false alarm occurs at the first 40 samples in T 2 statistics. Figure 9 shows the SPE and T 2 statistics of MNPE; the fault is detected at the 329th and 302th sample, due to which there is a large delay in fault detection; it also generates false alarm at first 40 sampling data in SPE and T 2 statistics. Figure 10 shows the monitoring results of MGNPE; SPE detects the fault at the 307th sample and T 2 detects the fault at the 259th sample; it also generates false alarm before 45 sampling point data of SPE. By contrast, the SPE and T 2 statistics of MDNGNPE in Figure 11 alarm the fault at the 182th and 198th sample, which gives early fault detection results and has higher FDR. Generally, MDNGNPE has better monitoring performance than MPCA, MNPE, and MGNPE.
The second test case is fault 6 in which the fault occurs at the 150th sample point and continues to the end. MPCA, MNPE, MGNPE, and MDNGNPE monitoring charts of fault 6 are shown in Figures 12-15, respectively. Figure 12 shows the fault detection result    Figure 13 that the SPE and T 2 of MNPE alarm the fault at the 264th and 300th sample; it also generates false alarm at first 40 samples in SPE and T 2 . Figure 14 shows the monitoring chart of MGNPE; the SPE and T 2 alarm the fault at the 281th and 234th sample, due to which there is a large delay in fault alarm, and the false alarm occurs at first 40 sample data in SPE and T 2 . By contrast, the SPE and T 2 of MDNGNPE in Figure 15

Conclusion
In this paper, an MDNGNPE method is proposed. First, the time-lagged window is constructed to solve the problem of auto-correlation in time series. For nonlinear characteristic, an effective nonlinear mapping method is used to retain the nonlinear structure between process variables, which considers the physical limits when nonlinear mapping is employed. Second, after employing the time-lagged window and nonlinear mapping, GNPE is used to preserve global and local structures and reduce the complexity of nonlinear mapping. Finally, the simulation results obtained from the numerical simulation and the penicillin production process prove the effect of the proposed method. But the parameters in nonlinear mapping need to be set by experience. Therefore, we will build a mathematical model in the future to calculate specific parameters, instead of setting by experience.