Power load forecasting in energy system based on improved extreme learning machine

Accurate power load prediction allows the start and stop of generating units in the power grid to be scheduled economically and reasonably, and helps maintain the safety and stability of power grid operation. First, a chicken swarm optimizer based on a nonlinear dynamic convergence factor (NCSO) is proposed as an improvement of the chicken swarm optimizer (CSO). The NCSO optimizer introduces a nonlinear dynamic convergence factor and a Levy mutation strategy. Compared with the CSO optimizer, the convergence speed and convergence accuracy of the NCSO optimizer are clearly improved. Second, the random parameters of the extreme learning machine (ELM) model are optimized by the NCSO optimizer, and the NCSO-optimized extreme learning machine (NCSOELM) model is established. Finally, the NCSOELM model is used to predict the power load and is compared with the back propagation (BP), support vector machine (SVM), and CSO-optimized extreme learning machine (CSOELM) models. The experimental results show that the fitting accuracy of the NCSOELM model is high, with a determination coefficient r² above 90%. The root mean square error of the NCSOELM model is 0.87, 0.41, and 0.25 smaller than that of the SVM, BP, and CSOELM models, respectively. The experiments show that the proposed model has a high fitting effect and a low prediction error, which is of positive significance for the economic and safe operation of the energy system.

model, and considered the difference of seasonal load. The non-stationary and nonlinear characteristics of load series increase the difficulty of forecasting. For this reason, Mohan et al. (2018) proposed a data-driven load forecasting model based on dynamic mode decomposition. The biggest advantage of this model is that it can identify the external factors that affect the characteristics of load data. Liu et al. (2019) proposed a hybrid method for power load forecasting. Because high noise in the original power load data affects the prediction results, the data are first preprocessed to reduce the impact of noise. Then the hyperparameters of the SVM are optimized by a whale optimizer, and finally the power load is predicted by the SVM model. Ceperic et al. (2013) established a support vector regression (SVR) model to predict the power load. This model uses a feature selection algorithm to determine the model inputs and a particle swarm optimization algorithm to optimize the hyperparameters, so as to reduce operator interaction. Cevik and Cunkas (2015) used fuzzy logic to forecast power load. To improve the prediction effect, the samples are first grouped according to the load characteristics, and fuzzy logic is then used to predict the load. For machine learning models based on kernel functions, such as SVR, the choice of kernel function has a great impact on the forecasting results. Che and Wang (2014) proposed a new selection algorithm to choose the kernel function of the model, so as to improve the forecasting effect.
In this paper, the extreme learning machine (ELM) model is used as the power load forecasting model. Compared with classical forecasting methods, the ELM model has faster computing speed and stronger generalization ability. Compared with the SVM model, which is suited to small samples, the ELM model is less sensitive to the number of samples and is more widely applicable. First, based on the CSO optimizer, the NCSO optimizer is proposed. In the NCSO optimizer, a nonlinear dynamic convergence factor and a Levy mutation strategy are introduced to improve the convergence ability of the optimizer. Second, to improve the forecasting effect of the ELM model, its parameters are optimized by the NCSO optimizer, and the NCSOELM prediction model is established. Finally, the power load is forecasted by the NCSOELM model and compared with the SVM, BP, and CSOELM models. The test results show that the proposed model has high forecasting accuracy and a good fitting effect. This paper has three main contributions. First, it proposes the NCSOELM model for power load forecasting. Second, it proposes the NCSO optimizer and applies it to the field of load forecasting. Third, accurate prediction of the power load supports the safe and stable operation of the power system.
The rest of this paper is organized as follows. The next section introduces the basic principles of ELM model, NCSO optimizer, and NCSOELM prediction model. The "NCSO optimizer test and power load forecasting" section introduces the test process of NCSO optimizer and the prediction results of each model for power load.

ELM model principle
The ELM is developed from the single-hidden-layer feedforward NN. A feedforward NN is sensitive to the learning rate (Liu et al., 2020a). When the learning rate is small, the NN converges slowly and takes longer to train; when the learning rate is large, the convergence of the NN is unstable (Huang et al., 2011, 2012). ELM is an improvement of the feedforward NN: it learns faster and generalizes better. Therefore, ELM is widely used in the fields of prediction, pattern recognition, and fault diagnosis.
The ELM model consists of three layers: the input layer, the hidden layer, and the output layer. Each layer is composed of neural nodes, which are connected by connection weights. Suppose the input layer, the hidden layer, and the output layer have a, b, and c neural nodes, respectively. The connection weight $D_1$ between the input layer and the hidden layer is a $b \times a$ matrix (Huang et al., 2010; Li et al., 2015; Wang, 2016; Wang et al., 2017; Zong et al., 2013), and the connection weight $D_2$ between the hidden layer and the output layer is a $b \times c$ matrix with entries $D_{2,kl}$. Suppose that the ELM model has F training samples. The input matrix of the sample set is G, of size $a \times F$, and the output matrix is H, of size $c \times F$. Let the activation function of the hidden layer be $J(\cdot)$. Then the output of the network is $I = [i_1, i_2, \ldots, i_F]_{c \times F}$, with

$$i_j = \begin{bmatrix} i_{1j} \\ i_{2j} \\ \vdots \\ i_{cj} \end{bmatrix}_{c \times 1} = \begin{bmatrix} \sum_{k=1}^{b} D_{2,k1}\, J(D_{1,k}\, g_j + e_k) \\ \sum_{k=1}^{b} D_{2,k2}\, J(D_{1,k}\, g_j + e_k) \\ \vdots \\ \sum_{k=1}^{b} D_{2,kc}\, J(D_{1,k}\, g_j + e_k) \end{bmatrix}_{c \times 1}, \quad j = 1, 2, \ldots, F \tag{7}$$

where $D_{1,k}$ is the kth row of $D_1$, $g_j$ is the jth column of G, and $e_k$ is the threshold of the kth hidden node. Equation (7) can be written in matrix form as

$$L D_2 = I^{T}$$

where L is the $F \times b$ output matrix of the hidden layer. The least squares solution is

$$\hat{D}_2 = L^{+} I^{T}$$

where $L^{+}$ is the generalized inverse of the hidden-layer output matrix L.
Compared with the feedforward NN, the ELM model is faster to compute because its connection weights and thresholds are randomly initialized and remain unchanged during training. However, if these randomly initialized hyperparameters are poorly chosen, the amount of calculation increases and the forecasting accuracy of the model suffers. Therefore, this study addresses this shortcoming of the ELM model by using an intelligent optimizer to optimize the hyperparameters.
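The ELM training procedure described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names `elm_fit` and `elm_predict`, the sigmoid activation, and the uniform initialization range [−1, 1] are assumptions made for illustration. Only the overall scheme (fixed random input weights and thresholds, output weights obtained through the generalized inverse) follows the text.

```python
import numpy as np


def sigmoid(x):
    """Assumed activation J; the text does not fix a particular choice."""
    return 1.0 / (1.0 + np.exp(-x))


def elm_fit(G, H, b, rng):
    """Train an ELM. G: (a, F) inputs, H: (c, F) targets, b: hidden nodes."""
    a = G.shape[0]
    D1 = rng.uniform(-1.0, 1.0, size=(b, a))  # random input weights, never updated
    e = rng.uniform(-1.0, 1.0, size=(b, 1))   # random hidden thresholds, never updated
    L = sigmoid(D1 @ G + e).T                 # hidden-layer output matrix, (F, b)
    D2 = np.linalg.pinv(L) @ H.T              # least squares output weights, (b, c)
    return D1, e, D2


def elm_predict(G, D1, e, D2):
    """Return the network output for inputs G as a (c, F) matrix."""
    return (sigmoid(D1 @ G + e).T @ D2).T


# Usage on a toy regression problem (synthetic data, not the load data set).
rng = np.random.default_rng(0)
G_toy = np.linspace(-3.0, 3.0, 50).reshape(1, 50)
H_toy = np.sin(G_toy)
D1, e, D2 = elm_fit(G_toy, H_toy, b=20, rng=rng)
fitted = elm_predict(G_toy, D1, e, D2)
```

Because the only trained quantity is `D2`, training reduces to a single pseudoinverse, which is why the result depends on how well the random `D1` and `e` happen to be chosen.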

Chicken swarm optimizer (CSO)
The chicken swarm optimizer (CSO) imitates the hierarchy and behavior characteristics of chickens (Meng et al., 2014). The CSO optimizer includes cocks, hens, and chicks. Compared with classical optimizers such as the particle swarm optimizer and the ant colony optimizer, it has stronger convergence ability and robustness. The CSO optimizer follows these rules (Al Shayokh and Shin, 2017; Fu et al., 2019; Tiana et al., 2017):
1. The population is divided into groups.
2. In each group, the cock is the leader; the hens are the most numerous, but their foraging ability is weaker than that of the cock; the foraging ability of the chicks is the worst, and they forage around the hens.
3. The hierarchical relationship of the population is rebuilt at regular intervals.
4. The positions of cocks, hens, and chicks are updated according to their respective motion rules.
The position of each individual in the CSO optimizer represents a possible solution to the actual problem. There are n particles in the population. Suppose the numbers of cocks, hens, and chicks are $n_c$, $n_h$, and $n_k$, respectively. Assuming the flock is optimized in an m-dimensional space, the position of the ith chicken in the jth dimension at time t can be expressed as $W_i^j(t)$. The cock has the best fitness value and the largest foraging range. The cock's position is updated as follows

$$W_i^j(t+1) = W_i^j(t)\,\bigl[1 + \mathrm{Randn}(0, \sigma^2)\bigr]$$

$$\sigma^2 = \begin{cases} 1, & fit_i \le fit_k \\ \exp\!\left(\dfrac{fit_k - fit_i}{|fit_i| + \varepsilon}\right), & \text{otherwise} \end{cases}$$

where $\mathrm{Randn}(0, \sigma^2)$ is a Gaussian random number with mean 0 and variance $\sigma^2$, $fit_i$ is the fitness value of the ith cock, $fit_k$ ($k \in [1, n]$, $k \ne i$) is the fitness value of another cock, and $\varepsilon$ is a very small constant that prevents the denominator from being 0.
In the CSO optimizer, the groups compete with each other, and hens can steal food from other groups. The hen's position is updated as follows

$$W_i^j(t+1) = W_i^j(t) + R_1\,\mathrm{Rand}\,\bigl[W_{c1}^j(t) - W_i^j(t)\bigr] + R_2\,\mathrm{Rand}\,\bigl[W_{c2}^j(t) - W_i^j(t)\bigr]$$

$$R_1 = \exp\!\left(\dfrac{fit_i - fit_{c1}}{|fit_i| + \varepsilon}\right), \qquad R_2 = \exp\bigl(fit_{c2} - fit_i\bigr)$$

where Rand is a random number in [0, 1]; $c1 \in [1, n]$ is the cock in the group of the ith hen; and $c2 \in [1, n]$ ($c1 \ne c2$) is a cock or hen in another group. When $R_1 = 0$, the hen only scrambles for food from other groups; when $R_2 = 0$, the hen only follows the cock of its own group for food.
Chicks forage around the hens and have the smallest foraging range. The chick's position is updated as follows

$$W_i^j(t+1) = W_i^j(t) + BL\,\bigl[W_m^j(t) - W_i^j(t)\bigr]$$

where $BL \in [0, 2]$ is the following coefficient and $W_m^j$ is the position of the mother hen of the ith chick.
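As a hedged illustration, the three motion rules can be sketched in Python following the standard CSO formulation of Meng et al. (2014) for a minimization problem. The function names, the value of `EPS`, and the use of NumPy's random generator are assumptions for illustration; group assignment and the periodic rebuilding of the hierarchy are omitted.

```python
import numpy as np

EPS = 1e-12  # assumed small constant that keeps the denominators away from 0


def cock_variance(fit_i, fit_k):
    """Variance of the cock's Gaussian step, given another cock's fitness."""
    if fit_i <= fit_k:
        return 1.0
    return float(np.exp((fit_k - fit_i) / (abs(fit_i) + EPS)))


def update_cock(w_i, fit_i, fit_k, rng):
    """Cock: scale the current position by (1 + Gaussian noise)."""
    sigma = np.sqrt(cock_variance(fit_i, fit_k))
    return w_i * (1.0 + rng.normal(0.0, sigma, size=w_i.shape))


def update_hen(w_i, w_c1, w_c2, fit_i, fit_c1, fit_c2, rng):
    """Hen: follow the own-group cock (c1) and a chicken of another group (c2)."""
    r1 = np.exp((fit_i - fit_c1) / (abs(fit_i) + EPS))
    r2 = np.exp(fit_c2 - fit_i)
    return (w_i
            + r1 * rng.random(w_i.shape) * (w_c1 - w_i)
            + r2 * rng.random(w_i.shape) * (w_c2 - w_i))


def update_chick(w_i, w_m, bl):
    """Chick: move toward the mother hen with following coefficient bl in [0, 2]."""
    return w_i + bl * (w_m - w_i)


# Usage: one chick halfway between its position and its mother's position.
new_chick = update_chick(np.zeros(2), np.array([1.0, 2.0]), bl=0.5)
```

Note that `np.exp(fit_c2 - fit_i)` can overflow when fitness values differ by several hundred; a practical implementation would need to guard against this, for example by clipping the exponent.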
CSO based on nonlinear dynamic convergence factor (NCSO)
a. Nonlinear dynamic convergence factor
When solving more complex problems, the optimization ability of the CSO optimizer is limited. The cock represents the strongest particle in the population, with the best fitness value and the largest search ability. If the cock, as the leader of the population, cannot jump out of a local optimum, the convergence ability of the whole population will be limited. Therefore, in this study, a nonlinear dynamic convergence factor is introduced to improve the convergence ability of the cock.
Equation (18) gives the mathematical model of the nonlinear convergence factor, where $Z_{max} = 0.9$, $Z_{min} = 0.4$, $t_{max}$ is the maximum number of iterations, and t is the current number of iterations. As shown in Figure 1, in the early stage of the iteration the nonlinear dynamic convergence factor is large, which gives the cock a large search range and therefore a strong global optimization ability; in the middle and later stages of the iteration the factor is small, which gives the cock a strong local optimization ability. The cock's position update is modified to include this factor. High-quality population initialization also has a great impact on accelerating the convergence of the CSO optimizer.

b. Levy mutation strategy
In the CSO optimizer, the fitness value of the chick is the worst and easily falls into local extremes. In this study, Levy mutation strategy was used to improve the foraging range of chicks. The foraging method based on Levy flight strategy has stronger natural adaptability. Levy distribution shows short-distance searching and occasional long-distance jumping. This method can better maintain the diversity of the population.
Levy's random step size a is computed from two variables c and h that obey normal distributions, whose variance is defined through the gamma function Γ. The schematic diagram of a two-dimensional Levy flight is shown in Figure 2. The chick position is then updated based on the Levy mutation strategy. The optimization process of the NCSO optimizer is shown in Figure 3.
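A Levy step built from two normally distributed variables and the gamma function is commonly generated with Mantegna's algorithm. The sketch below assumes that this is the construction intended here; the names `mantegna_sigma` and `levy_step` and the stability index `beta = 1.5` are illustrative assumptions, and the way the step enters the chick position update is not shown.

```python
import math

import numpy as np


def mantegna_sigma(beta):
    """Standard deviation of the numerator variable c in Mantegna's algorithm."""
    num = math.gamma(1.0 + beta) * math.sin(math.pi * beta / 2.0)
    den = math.gamma((1.0 + beta) / 2.0) * beta * 2.0 ** ((beta - 1.0) / 2.0)
    return (num / den) ** (1.0 / beta)


def levy_step(beta, size, rng):
    """Draw Levy-distributed step sizes a = c / |h|**(1/beta)."""
    c = rng.normal(0.0, mantegna_sigma(beta), size)  # c ~ N(0, sigma_c^2)
    h = rng.normal(0.0, 1.0, size)                   # h ~ N(0, 1)
    return c / np.abs(h) ** (1.0 / beta)


# Usage: mostly short steps with occasional long jumps.
rng = np.random.default_rng(0)
steps = levy_step(beta=1.5, size=10000, rng=rng)
```

The heavy tail of the resulting distribution is what produces the short-distance searching with occasional long-distance jumps described above.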

Improved ELM
In the ELM model, the connection weights and thresholds are chosen randomly. When the random weights and thresholds are not appropriate, the calculation cost of the model increases and its prediction accuracy suffers. In this study, the hyperparameters of the ELM model are optimized by the NCSO optimizer, and the power load is predicted by the resulting NCSOELM model. The process by which the NCSOELM model predicts the power load is shown in Figure 4.
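One way to couple a swarm optimizer to the ELM, sketched here under stated assumptions, is to let each particle encode the input weights and thresholds as a flat vector and to score the particle by the training error of the resulting ELM. The function name `elm_fitness`, the sigmoid activation, and the choice of training RMSE as the fitness value are assumptions for illustration; the paper's exact fitness definition is not given in the text.

```python
import numpy as np


def elm_fitness(particle, G, H, b):
    """Score one particle: training RMSE of the ELM it encodes.

    particle: flat vector of length b * (a + 1), holding the b x a input
              weights followed by the b hidden thresholds.
    G: (a, F) training inputs, H: (c, F) training targets.
    """
    a = G.shape[0]
    D1 = particle[: b * a].reshape(b, a)
    e = particle[b * a : b * a + b].reshape(b, 1)
    L = (1.0 / (1.0 + np.exp(-(D1 @ G + e)))).T   # hidden-layer outputs, (F, b)
    D2 = np.linalg.pinv(L) @ H.T                  # output weights by least squares
    residual = L @ D2 - H.T
    return float(np.sqrt(np.mean(residual ** 2)))


# Usage with synthetic data: a lower value means a better particle.
rng = np.random.default_rng(0)
G_toy = rng.random((3, 40))
H_toy = rng.random((1, 40))
score = elm_fitness(rng.uniform(-1.0, 1.0, 5 * (3 + 1)), G_toy, H_toy, b=5)
```

With this encoding the optimizer searches a space of dimension b(a + 1), while the output weights are still obtained in closed form for every candidate.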

NCSO optimizer test
To test the optimization capability of the NCSO optimizer presented in this study, this section uses testing functions to test the convergence performance of NCSO and compare it with the CSO optimizer.
The four test functions are the Sphere, Schwefel, Griewank, and Rastrigin functions (Li et al., 2019a, 2018a; Liu et al., 2019b). The search ranges of the variables of the four functions are [−100, 100], [−10, 10], [−600, 600], and [−5.12, 5.12], respectively. The optimal values of the test functions are all 0. The population size of the NCSO and CSO optimizers is set to 20, the number of iterations is set to 500, and the hierarchy is rebuilt every 10 iterations. The dimension of each test function is set to 30, and each function is tested 15 times with the CSO optimizer and 15 times with the NCSO optimizer.
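For reference, the four benchmarks can be written as follows using their usual textbook definitions. The search range [−10, 10] suggests that the Schwefel function meant here is Schwefel's problem 2.22; this is an assumption, since the text does not give the formula.

```python
import numpy as np


def sphere(x):
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2))


def schwefel_2_22(x):
    """Schwefel's problem 2.22 (assumed variant): sum |x_i| + prod |x_i|."""
    a = np.abs(np.asarray(x, dtype=float))
    return float(np.sum(a) + np.prod(a))


def griewank(x):
    x = np.asarray(x, dtype=float)
    i = np.arange(1, x.size + 1)
    return float(np.sum(x ** 2) / 4000.0 - np.prod(np.cos(x / np.sqrt(i))) + 1.0)


def rastrigin(x):
    x = np.asarray(x, dtype=float)
    return float(np.sum(x ** 2 - 10.0 * np.cos(2.0 * np.pi * x) + 10.0))


# Usage: all four functions reach their optimum 0 at the origin.
origin_values = [f(np.zeros(30)) for f in (sphere, schwefel_2_22, griewank, rastrigin)]
```

Sphere and Schwefel 2.22 are unimodal, while Griewank and Rastrigin are multimodal, so together they probe both convergence speed and the ability to escape local optima.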
The test results of the NCSO and CSO optimizers are listed in Table 1. The results show that the convergence accuracy of the NCSO optimizer is significantly higher than that of the CSO optimizer. The CSO optimizer converges to the optimal value only for the Rastrigin function; for the other three functions it does not. The NCSO optimizer converges to the optimal value for the Sphere, Griewank, and Rastrigin functions. For the Schwefel function, the convergence result of the NCSO optimizer is closer to the optimal value than that of the CSO optimizer.
Because of the nonlinear dynamic convergence factor, the NCSO optimizer has stronger local and global convergence capabilities. The Levy mutation strategy maintains the diversity of the population and, to a certain extent, enhances the ability of the population to jump out of local optima. To analyze the convergence capacity of the CSO optimizer and NCSO optimizer, Figure 5 plots the iterative curves of the two optimizers. As shown in Figure 5, for the four test functions, the iteration curve of the NCSO optimizer drops faster than that of the CSO optimizer, because the nonlinear dynamic convergence factor accelerates the convergence rate.
Figure 3. Optimization process. (Flowchart; the recoverable steps include updating the chick position according to equation (23) and updating the optimal individual and optimal location.)

Power load forecasting based on NCSOELM model
The simulation data used in this paper come from the sample data provided by the European intelligent technology network. One week of power load data is selected from the sample set for model training and testing. The week contains 336 samples, that is, 48 samples per day, and the time series curve of the samples is shown in Figure 6.
Relative error (RE) and root mean square error (RMSE) are used to evaluate the fitting error of the model, and the determination coefficient (r²) is used to evaluate the fitting degree. In these indices, Num is the total number of samples, $R_k$ is the sample value, and $F_k$ is the predicted value of the sample.
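The three indices can be computed as in the following sketch. RE and RMSE use their usual definitions. For r² the common form 1 − SS_res/SS_tot is assumed, which may differ from the exact expression used by the authors; the function names are illustrative.

```python
import numpy as np


def relative_error(R, F):
    """Per-sample relative error in percent: |F_k - R_k| / |R_k| * 100."""
    R, F = np.asarray(R, dtype=float), np.asarray(F, dtype=float)
    return np.abs(F - R) / np.abs(R) * 100.0


def rmse(R, F):
    """Root mean square error over Num samples."""
    R, F = np.asarray(R, dtype=float), np.asarray(F, dtype=float)
    return float(np.sqrt(np.mean((F - R) ** 2)))


def r2(R, F):
    """Determination coefficient, assumed here as 1 - SS_res / SS_tot."""
    R, F = np.asarray(R, dtype=float), np.asarray(F, dtype=float)
    ss_res = np.sum((R - F) ** 2)
    ss_tot = np.sum((R - np.mean(R)) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Usage with three made-up load values (R: true, F: predicted).
R_demo = [100.0, 200.0, 300.0]
F_demo = [110.0, 190.0, 300.0]
re_demo = relative_error(R_demo, F_demo)   # 10 %, 5 %, 0 %
```

RMSE is expressed in the units of the load itself, whereas RE and r² are dimensionless, which is why the three indices are reported side by side.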
First, the sample power load data of the first six days are used as the training sample, and the power load data of the last day are used as the prediction sample. The power load is predicted by the NCSOELM model on the seventh day and compared with the prediction results of the BP, SVM, and CSOELM models. The prediction results on the seventh day of each model are shown below.
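The day-based split can be made explicit with a small helper. The week of 336 samples implies 48 samples per day; the function name `split_by_day` and the placeholder array standing in for the load series are assumptions for illustration.

```python
import numpy as np


def split_by_day(load, train_days, samples_per_day=48):
    """Split a load series into leading training days and trailing test days."""
    load = np.asarray(load, dtype=float)
    if load.size % samples_per_day != 0:
        raise ValueError("series length must be a whole number of days")
    cut = train_days * samples_per_day
    if not 0 < cut < load.size:
        raise ValueError("train_days must leave at least one day for testing")
    return load[:cut], load[cut:]


# Usage: six training days and one test day from a placeholder week of data.
week = np.arange(336.0)
train, test = split_by_day(week, train_days=6)   # 288 and 48 samples
```

Keeping the split chronological, rather than shuffling, matches the forecasting setting: the model only sees data that precede the period it predicts.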
The power load curve predicted by each model is shown in Figure 7(a). All four forecast curves reflect the fluctuation trend of the true value curve. Figure 7(b) shows the REs of the four models. The RE intervals of the SVM, BP, CSOELM, and NCSOELM models are [0.02%, 7.03%], [0.05%, 6.88%], [0.03%, 7.68%], and [0.15%, 6.61%], respectively. For the one-day power load prediction, the RMSE and r² values of the four forecasting models are shown in Table 2.
As shown in Table 2, the determination coefficient r² values of the four models are about 90%. The NCSOELM model has the highest r² value, 90.66%, which indicates that it has the best fitting degree. For the RMSE evaluation index, the NCSOELM model has the smallest value, 16.16, and the SVM model has the largest, 17.92, indicating that the forecasting errors of the NCSOELM model are the smallest.
Second, the power load sample data from the first five days are used as the training set, and the power load data from the next two days are used as the prediction set. The SVM, BP, CSOELM, and NCSOELM models are used to forecast the power load for the two days. Figure 8(a) shows the forecasting curves, which still reflect the changes of the two-day true value curve. Figure 8(b) shows the forecasting error curves. The RE intervals of the BP, SVM, CSOELM, and NCSOELM models are [0.03%, 6.69%], [0.01%, 7.16%], [0.55%, 7.99%], and [0.01%, 6.66%], respectively. The error interval of the NCSOELM model is the narrowest, which indicates that the NCSOELM model has higher forecasting stability.
The results of the two-day power load forecast evaluation are shown in Table 3. As shown in Table 3, the four models achieve a good fitting effect, and the determination coefficient r² of each model reaches 91%. The NCSOELM model has the highest determination coefficient, 91.83%, which is 0.58, 0.23, and 0.25 percentage points higher than those of the SVM, BP, and CSOELM models, respectively. The NCSOELM model also has the lowest RMSE value, 15.87, which is 0.87, 0.41, and 0.25 smaller than the RMSE values of the SVM, BP, and CSOELM models, respectively.

Conclusions
With the development of clean energy, electric power has become one of the most important clean energy sources and a guarantee for the development of all walks of life. Mining historical power load data is of great significance for the energy system. To promote the economic operation and development of the energy system, the NCSOELM model is proposed to predict the power load. By using the NCSOELM model to predict the power load of one day and of two days, the following conclusions are obtained.
1. To improve the local and global search ability of the CSO optimizer, the NCSO optimizer based on a nonlinear dynamic convergence factor is proposed, and a Levy mutation strategy is used in the NCSO optimizer.
2. Compared with the CSO optimizer, the NCSO optimizer converges faster. For the Sphere, Griewank, and Rastrigin functions, the NCSO optimizer converges to 0.
3. For one- and two-day power load forecasting, the NCSOELM model shows a high fitting effect and a low prediction error. The determination coefficients of the NCSOELM model are all above 90%. For the two-day forecast, the NCSOELM model has the lowest RMSE value, 15.87, which is 0.87, 0.41, and 0.25 smaller than the RMSE values of the SVM, BP, and CSOELM models, respectively.
4. The economic operation of the energy system can be supported by accurately forecasting the power load.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Natural Science Foundation of Hebei Province of China (Project No. E2018202282) and the key project of Tianjin Natural Science Foundation (Project No. 19JCZDJC32100).