A new algorithm for considering green communication and excellent sensing performance in cognitive radio networks

Multi-node cooperative sensing can effectively improve the performance of spectrum sensing. In multi-node cooperation, each node sends its own sensing data to the fusion center, which fuses the local sensing results and makes a global decision. The more nodes participate, the more local data are generated, so when the number of nodes is large, the global decision is delayed. To achieve real-time spectrum sensing, the fusion center needs to fuse the data from each node quickly. In this article, a fast big data fusion algorithm is proposed to improve the real-time performance of the global decision. The algorithm improves computing speed by reducing repeated computation. A reinforcement learning mechanism is used to mark the processed data. When the same environment parameter appears again, the fusion center can directly call the nodes under that parameter environment without conducting the sensing operation again. This greatly reduces the amount of data processed and improves the data processing efficiency of the fusion center. Experimental results show that the proposed algorithm can reduce the computation time while improving the sensing performance.


Introduction
Developments in wireless communication technology have increased the demand for spectrum resources, which are limited. 1,2 To address this problem, cognitive radio networks have been proposed to improve the utilization of existing spectrum resources. 3,4 In this context, spectrum sensing is a fundamental component of cognitive radio networks. 5 Sensing nodes need to perform spectrum sensing quickly and accurately in order to use idle frequency bands efficiently without interfering with the primary user. 6 This raises another issue: due to path loss, shadow fading, and hidden terminals, it is difficult for a single sensor node to accurately detect the primary user's status. 7,8 Cooperative sensing can effectively overcome these effects by fusing detection information from multiple nodes in different geographical locations. 9 In centralized cooperative sensing, the cognitive network contains a dedicated data fusion center. It collects local sensing results from the nodes participating in cooperative sensing, judges the current usage of the authorized bands, and then either broadcasts the decision in the network or directly controls and schedules the sensing nodes. 10 However, collecting local sensing results creates substantial communication overhead when a large number of cognitive users (sensing nodes) participate. 11,12 A proposed review method addressed this problem by examining the observed values of the sensing nodes and only allowing nodes with sufficient information to send their decision values (0 or 1) to the fusion center. 13 This method reduces communication overhead but also reduces sensing performance.
To address the excessive overhead created by equal gain fusion, 14,15 a double threshold method is used for cooperative spectrum sensing, in which each node adopts double threshold detection and sends the detected value directly to the fusion center, which then makes the judgment. Combining the judgments of each node with its own judgment, the fusion center makes two judgments to determine whether the primary user exists. This approach combines soft and hard fusion but performs two operations in the fusion center, thus increasing the computational cost. Furthermore, a hierarchical cooperative spectrum detection method has been proposed to solve the problem of excessive cooperative sensing overhead. 16,17 Here, when a node's observation value lies between two thresholds, the region between the thresholds is evenly divided into four parts, which are quantized with 2 bits. The sensing node then sends the 2 bits of information to the fusion center. Compared with the equal gain fusion method, this reduces communication overhead. However, the sensing performance of hard fusion decreases. 18 Spectrum sensing performance directly affects the throughput of cognitive users, 19 and multi-node cooperative spectrum sensing is a common way to improve it. 20 However, when multiple nodes participate in cooperative sensing, the volume of sensing data increases greatly, and the fusion center cannot process it in time. The resulting decision delay affects the security of the primary user or the throughput of cognitive users. To make a timely decision, the large amount of data at the fusion center must be processed quickly, which requires selecting the data of some nodes to reduce the amount of data processed. Therefore, to achieve real-time spectrum sensing, the fusion center needs to fuse the data of each node quickly.
In this article, a fast big data fusion algorithm is proposed to improve the real-time performance of the global decision. The algorithm improves computing speed by reducing repeated computation. A reinforcement learning mechanism is used to mark the processed data. When the same environment parameter appears again, the fusion center can directly call the nodes under that parameter environment without conducting the sensing operation again. This greatly reduces the amount of data processed and improves the data processing efficiency of the fusion center. Experimental results show that the proposed algorithm can reduce the computation time while improving the sensing performance.

System model
This study designed a simulated cognitive radio system consisting of a primary user (PU) and 16 nodes (cognitive users). 21 Each node communicates with the fusion center through a channel, while the fusion center fuses information from each node to determine whether the primary user channel is idle. The system model is illustrated in Figure 1.

Derivation of the optimum local detection threshold
For spectrum sensing, every SU independently performs an energy detection process. The signal received by an SU is determined as follows 22

$$x(m) = h\,s(m) + v(m)$$

where $s(m)$ is the PU signal, $v(m)$ is additive white Gaussian noise with zero mean and variance $\sigma_v^2$, $m$ is the index of the sampling point, $M$ is the number of samples, and $h$ is the channel gain. When the PU is absent (i.e. $s(m) = 0$), the free-channel hypothesis is denoted by $H_0$, whereas the busy-channel hypothesis is denoted by $H_1$, as follows 23

$$H_0: \; x(m) = v(m), \qquad H_1: \; x(m) = h\,s(m) + v(m)$$

Let $E$ be the average energy collected by an SU, expressed as follows 24

$$E = \frac{1}{M} \sum_{m=1}^{M} |x(m)|^2$$

Each SU makes its local decision according to a single threshold $\lambda_o$, with probability of detection $P_d$ and probability of false alarm $P_f$. 25 Both probabilities depend on the signal power $\sigma_s^2$, the noise power $\sigma_v^2$, and the complementary cumulative distribution function $Q(x)$, which is described as follows 26

$$Q(x) = \frac{1}{\sqrt{2\pi}} \int_x^{\infty} e^{-t^2/2}\, dt$$

Assuming that the presence probability of the PU is $P(H_1) = \beta$, $0 < \beta < 1$, and the absence probability of the PU is $P(H_0) = 1 - \beta$, the probability of detection error $P_e$ is as follows 27

$$P_e = (1 - \beta)\, P_f + \beta\, (1 - P_d)$$

where $P_e$ is a quadratic function of $\lambda_o$, and we can derive $\partial^2 P_e / \partial \lambda_o^2 > 0$. The optimal threshold $\lambda_o$ is obtained by setting $\partial P_e / \partial \lambda_o = 0$. 28 It is easy to obtain the optimal threshold $\lambda_o$ if the signal power $\sigma_s^2$, the noise power $\sigma_v^2$, and the SNR at the SU's receiving terminal are known.
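The relationship between the threshold and the error probability can be explored numerically. The sketch below is illustrative only: it assumes real-valued samples and the usual Gaussian (central limit) approximation of the average energy, which may differ from the exact expressions used by the authors, and the function names are hypothetical. It evaluates the false alarm, detection, and error probabilities for a candidate threshold and scans the interval between the noise power and the signal-plus-noise power for the threshold with the lowest error.

```python
import math


def q_function(x):
    """Tail probability of the standard normal distribution."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))


def false_alarm_prob(threshold, noise_power, num_samples):
    # Assumed approximation: under H0 the average energy has mean
    # noise_power and standard deviation noise_power * sqrt(2 / M).
    scale = noise_power * math.sqrt(2.0 / num_samples)
    return q_function((threshold - noise_power) / scale)


def detection_prob(threshold, signal_power, noise_power, num_samples):
    # Assumed approximation: under H1 the same form holds with the
    # total power (signal + noise) in place of the noise power.
    total = signal_power + noise_power
    scale = total * math.sqrt(2.0 / num_samples)
    return q_function((threshold - total) / scale)


def error_prob(threshold, signal_power, noise_power, num_samples, beta):
    """P_e = (1 - beta) * P_f + beta * (1 - P_d)."""
    p_f = false_alarm_prob(threshold, noise_power, num_samples)
    p_d = detection_prob(threshold, signal_power, noise_power, num_samples)
    return (1.0 - beta) * p_f + beta * (1.0 - p_d)


def numeric_optimal_threshold(signal_power, noise_power, num_samples,
                              beta, steps=2000):
    """Scan [noise_power, noise_power + signal_power] for the lowest P_e."""
    low, high = noise_power, noise_power + signal_power
    candidates = (low + (high - low) * k / steps for k in range(steps + 1))
    return min(
        candidates,
        key=lambda t: error_prob(t, signal_power, noise_power,
                                 num_samples, beta),
    )


best = numeric_optimal_threshold(signal_power=1.0, noise_power=1.0,
                                 num_samples=1024, beta=0.5)
```

A numeric scan is used here instead of a closed-form threshold so that the sketch does not depend on the exact form of the optimal-threshold equation.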

Estimating the optimal threshold
If the number of sampling points in a sensing cycle is $M = 2L$, where $L$ is a positive integer, the $M$ points can be divided into two equal sections: (1) the first $M/2$ sampling points, denoted $x_1$; and (2) the last $M/2$ sampling points, denoted $x_2$. With $M_1 = M/2$, let $E_1$ and $E_2$ be the average energies of the two sections. If $E_1 < E_2$, $x_1$ denotes AWGN and $x_2$ denotes the sum of signal and AWGN. Let $E_1$ denote the estimated noise power $\hat{\sigma}_v^2$ and $E_2$ denote the estimated sum of signal and noise power $\hat{\sigma}_v^2 + \hat{\sigma}_s^2$; the estimated signal power is then $\hat{\sigma}_s^2 = E_2 - E_1$. As such, the estimated SNR of the received signal $x(m)$ is expressed as follows

$$\hat{\gamma} = \frac{\hat{\sigma}_s^2}{\hat{\sigma}_v^2} = \frac{E_2 - E_1}{E_1}$$

The estimate of the optimal threshold, $\hat{\lambda}_o$, is obtained from these estimated quantities. Here, it should be noted that the condition for equation (2) to have a solution, equation (15), must be satisfied; this condition can be transformed into equation (16). When the number of sampling points is $M = 1024$ and the SNR is $\gamma = -20$ dB, mathematical derivation shows that $\beta$ must satisfy $\beta \in (0, 0.731)$ for equation (16) to hold. This means that during a sensing cycle with a sufficient number of sampling points (even in the case of low SNR and when the probability of the primary user signal is uncertain), equation (15) holds and there are solutions to equation (14).
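The two-section estimate can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name is hypothetical, and treating the lower-energy half as the noise-only section when the first half happens to be the stronger one is an assumption made here for symmetry.

```python
def estimate_snr(samples):
    """Estimate noise power, signal power, and SNR from two half-sections."""
    if len(samples) < 2 or len(samples) % 2 != 0:
        raise ValueError("need an even, non-zero number of samples")

    half = len(samples) // 2
    # Average energy of the first and second half of the sensing cycle.
    e1 = sum(abs(x) ** 2 for x in samples[:half]) / half
    e2 = sum(abs(x) ** 2 for x in samples[half:]) / half

    # Assumption: the half with lower energy contains noise only.
    noise_power = min(e1, e2)
    signal_power = max(e1, e2) - noise_power
    if noise_power == 0:
        raise ValueError("noise power estimate is zero; SNR undefined")

    return noise_power, signal_power, signal_power / noise_power


# Toy input: the first half has unit energy, the second half energy 4.
noise, signal, snr = estimate_snr([1, -1, 1, -1, 2, -2, 2, -2])
```

The returned SNR is a linear ratio; converting it to decibels would require `10 * math.log10(snr)`.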

Adaptive double energy thresholds
To avoid erroneous judgments due to SNR variations at the receiving end, a lower threshold $\lambda_l$ and an upper threshold $\lambda_h$ are set based on the optimal threshold $\lambda_o$ (Figure 2). 29 In Figure 2, $d$ is the distance between $\lambda_o$ and the lower threshold $\lambda_l$, or between $\lambda_o$ and the upper threshold $\lambda_h$. The following is thus obtained

$$\lambda_l = \lambda_o - d, \qquad \lambda_h = \lambda_o + d$$

For weak PU signal detection, the threshold $\lambda_o$ should be decreased. However, it should not be lower than $\sigma_v^2$; otherwise, noise would be detected as a PU signal. Likewise, $\lambda_o$ should not be larger than $\sigma_v^2 + \sigma_s^2$; otherwise, the PU signal would be missed. To ensure a lower probability of false alarm and a higher probability of detection, we thus put a limiting range on the threshold $\lambda_o$, as follows

$$\sigma_v^2 \le \lambda_o \le \sigma_v^2 + \sigma_s^2$$

Combining equations (18) and (19) yields equation (20). Different from conventional double threshold settings, we introduce a control parameter $\varepsilon$ to fine-tune the double thresholds; according to equation (20), $\varepsilon$ is subject to a corresponding constraint. When $\varepsilon = 0$, the scheme is equivalent to the single threshold case. Thus, $\lambda_l$ and $\lambda_h$ can be rewritten in terms of $\varepsilon$. The parameter $\varepsilon$ is an impact factor for the double thresholds.

Quantization and coding based on adaptive double energy thresholds
We consider the cognitive radio network shown in Figure 1, in which each node communicates with the fusion center. This study assumes that the channel between each node and the fusion center is perfect. $\lambda_{h,i}$, $\lambda_{l,i}$, $\lambda_{o,i}$, and $w_i$ ($i = 0, 1, \ldots, N-1$) denote the upper threshold, lower threshold, optimal threshold, and weight of node $i$, respectively.

Calculating node weights
Assume $E_i$ is the average energy collected by the $i$th node. First, if $E_i$ is not lower than the upper threshold $\lambda_{h,i}$, its weight equals 1 and the node decides that the PU is present. Next, if $E_i$ is not larger than the lower threshold $\lambda_{l,i}$, its weight equals 0 and the node decides that the PU is absent. Finally, if $E_i$ lies between $\lambda_{l,i}$ and $\lambda_{h,i}$, the node cannot determine whether the primary user is present, and the upper threshold $\lambda_{h,i}$ is used as a comparison value. That is, $E_i$ is normalized by $\lambda_{h,i}$ and the weight equals the normalized result. The weight calculation is expressed as follows

$$w_i = \begin{cases} 1, & E_i \ge \lambda_{h,i} \\ 0, & E_i \le \lambda_{l,i} \\ E_i / \lambda_{h,i}, & \lambda_{l,i} < E_i < \lambda_{h,i} \end{cases}$$

where $i = 0, 1, \ldots, N-1$ is the index of the SUs. After the $N$ nodes have performed local spectrum sensing, the $N$ weights form a set $u = \{w_0, w_1, \ldots, w_{N-1}\}$. Figure 3 shows the assigned fusion weights and the cooperative spectrum sensing algorithm. According to Figure 3 and equation (24), both the weights of the SUs and the global sensing performance change when the two thresholds are altered. As such, it is highly important to select an optimal $\varepsilon$ that establishes two proper thresholds, thus improving sensing performance. A grid search is conducted to obtain the best value of $\varepsilon$ in part 4. The results are memorized through reinforcement learning strategies to obtain better sensing performance and higher sensing efficiency.
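The three-way weight rule can be written as a short function. The sketch below follows the rule as described in the text; the function and argument names are hypothetical, and the thresholds are taken as given inputs rather than derived from the control parameter.

```python
def node_weight(energy, lower_threshold, upper_threshold):
    """Return the fusion weight of one node from its average energy."""
    if lower_threshold > upper_threshold:
        raise ValueError("lower threshold must not exceed upper threshold")
    if energy >= upper_threshold:
        return 1.0  # confident that the PU is present
    if energy <= lower_threshold:
        return 0.0  # confident that the PU is absent
    # Uncertain region: normalize the energy by the upper threshold.
    return energy / upper_threshold


# One weight per node, computed from per-node energies and thresholds.
energies = [5.0, 0.5, 2.0]
weights = [node_weight(e, lower_threshold=1.0, upper_threshold=4.0)
           for e in energies]
```

Because the energy in the uncertain region is strictly below the upper threshold, the normalized weight always stays below 1.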

Cooperative spectrum sensing based on quantization and coding
After obtaining its weight $w_i$ ($i = 0, 1, \ldots, N-1$), an SU encodes it according to the following rules. First, if $w_i = 1$, the weight is encoded as 1, expressed as $C_{i,1} = 1$, and sent to the fusion center; the SU thus transmits 1 bit of data and consumes 1 unit of energy (e.g. $e_j = 1$). Next, if $w_i = 0$, the weight is encoded as 0, expressed as $C_{i,2} = 0$; the SU again sends 1 bit of data to the fusion center and consumes 1 unit of energy. Finally, if $0 < w_i < 1$, the weight is encoded as a 3-bit code $d_3 d_2 d_1$; the SU thus sends 3 bits of data to the fusion center and consumes 3 units of energy.
The fusion center decodes $C_i$ after receiving the sensing result from an SU. The decoding rules fall into three cases. First, if $C_i = 1$, the decoded data are $D_i = 1$. Second, if $C_i = 0$, the decoded data are $D_i = 0$. Third, if $C_i = d_3 d_2 d_1$, the decoded data are $D_{i,3}$, where $0.125 \le D_{i,3} \le 0.875$ with a resolution of 0.125; in this way, the contribution of a node to cooperative spectrum sensing matches its weight.
The fusion center applies majority-rule fusion after decoding the data received from all SUs. The fusion is expressed as follows

$$\Re = \sum_{i=0}^{N-1} D_i$$

where $\Re$ is the fusion result, $i$ is the index of the nodes, and $D_i$ is the decoded data for node $i$. The fusion center compares $\Re$ with $N/2$ to decide whether there is a signal from the PU. The code algorithm summarizes the coding-based cooperative spectrum sensing.
Note that in the code algorithm, the transmissions of the SUs are a mix of 1-bit and 3-bit messages. As such, the algorithm can improve sensing performance while reducing communication overhead.
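The coding, decoding, and fusion steps can be sketched as follows. Several details are assumptions rather than statements from the text: the intermediate weight is rounded to the nearest multiple of 0.125 and clamped to the range 0.125 to 0.875, the 3-bit code is read as a binary fraction with eight levels, and the PU is declared present when the summed decoded values reach at least half the number of nodes. All function names are hypothetical.

```python
def encode_weight(weight):
    """Encode a weight as '1', '0', or a 3-bit string."""
    if weight >= 1.0:
        return "1"
    if weight <= 0.0:
        return "0"
    # Assumed quantizer: nearest of the levels 1/8 ... 7/8.
    level = min(7, max(1, round(weight * 8)))
    return format(level, "03b")


def decode_code(code):
    """Map a received code back to a decoded value D_i."""
    if code in ("0", "1"):
        return float(code)
    return int(code, 2) / 8.0


def transmission_cost(code):
    """Energy units consumed: one unit per transmitted bit."""
    return len(code)


def fuse(codes):
    """Sum the decoded values and compare the sum with N / 2."""
    total = sum(decode_code(code) for code in codes)
    # Assumption: a sum equal to N / 2 counts as 'PU present'.
    return total >= len(codes) / 2


codes = [encode_weight(w) for w in (1.0, 1.0, 0.5, 0.0)]
pu_present = fuse(codes)
total_bits = sum(transmission_cost(code) for code in codes)
```

In this toy run three nodes send a single bit and one node sends three bits, which illustrates how the mixed scheme keeps the total number of transmitted bits low.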

Grid search algorithm
A grid search can be used to obtain the optimal $\varepsilon$ according to the feedback of the global $P_d$. We conducted a grid search to train parameters in order to improve search efficiency. The trained parameters included the SNR (−25 to 0 dB with a step of 1 dB) and $\varepsilon$ (0 to 0.5 with a step of 0.05). These were saved as prior knowledge in a knowledge base. Spectrum sensing then directly invokes the optimal $\varepsilon$ for a given SNR according to the prior knowledge. Figure 4 shows the grid search algorithm used for parameter training.
In Figure 4, $k$ and $j$ stand for the SNR and the control parameter $\varepsilon$, respectively; $(k_n, j_n)$ is a prior knowledge group, where $j_n$ is the optimal $\varepsilon$ under $k_n$, $n = 1, 2, \ldots, 26$. If a new SNR appears, the algorithm immediately trains new parameters, which are saved as prior knowledge in the knowledge base.
The grid search algorithm process is described as follows:
1. When a new SNR $k_i$ appears, the fusion center conducts a real-time search to find the optimal $\varepsilon_i$ and obtain the highest $P_d$, proceeding to step 3. Here $k_i$ is the $i$th newly appearing SNR and $\varepsilon_i$ is the corresponding optimal control parameter. These results are output when $P_d$ and $\varepsilon_i$ are returned. Furthermore, $\varepsilon_i$ and $k_i$ become a prior knowledge pair that the fusion center learns through a learn function $f(\cdot)$ and saves in a storage library $u$, where $i$ is a positive integer.
2. When the SNR $k_i$ is not new, the fusion center uses the learned knowledge to select the optimal $\varepsilon_i$ directly through $f^{-1}(\cdot)$, a function that reads knowledge from the storage library.
3. Under SNR $k_i$, the range of the parameter $\varepsilon_i$ is divided into 10 equal grids by 11 grid points; $\varepsilon_{i,j}$ is the $j$th grid point, $j \in \{0, 1, 2, \ldots, 10\}$, and $\Delta\varepsilon = 0.05$ is the search step. The search proceeds from $\varepsilon_{i,0}$ to $\varepsilon_{i,10}$ in steps of $\Delta\varepsilon$.
4. $P_{d,j}$ is the $P_d$ of the $j$th grid point. When $P_{d,j} = 1$ for some $j \in \{0, 1, 2, \ldots, 10\}$, the search stops, and $P_{d,j}$ and the corresponding $\varepsilon_{i,j}$ are returned to step 1. Otherwise, the search continues.
5. When the real-time search has finished, the highest probability $P_{d,\max}$ and the corresponding optimal parameter $\varepsilon_{i,j}$ are returned to step 1, where $f_1^{-1}(\cdot)$ is a function that seeks $\varepsilon_{i,j}$ through $P_{d,\max}$.
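The search-then-remember behavior of these steps can be sketched with a dictionary acting as the storage library. This is an illustrative sketch, not the authors' code: the class and function names are hypothetical, and `toy_detection_prob` is a made-up stand-in for the global detection probability that a real system would measure or simulate.

```python
def grid_search_epsilon(detection_prob, snr_db, step=0.05, points=11):
    """Scan 11 grid points of epsilon and return (best_epsilon, best_pd)."""
    best_eps, best_pd = 0.0, -1.0
    for j in range(points):
        eps = j * step
        p_d = detection_prob(snr_db, eps)
        if p_d > best_pd:
            best_eps, best_pd = eps, p_d
        if p_d >= 1.0:
            break  # early stop once the detection probability reaches 1
    return best_eps, best_pd


class KnowledgeBase:
    """Stores (SNR, optimal epsilon) pairs so searches are not repeated."""

    def __init__(self, detection_prob):
        self._detection_prob = detection_prob
        self._store = {}

    def optimal_epsilon(self, snr_db):
        if snr_db not in self._store:
            # New SNR: run the real-time search and learn the result.
            eps, _ = grid_search_epsilon(self._detection_prob, snr_db)
            self._store[snr_db] = eps
        # Known SNR: read the stored value directly.
        return self._store[snr_db]


evaluations = []


def toy_detection_prob(snr_db, eps):
    """Made-up P_d curve peaking at epsilon = 0.3 (illustration only)."""
    evaluations.append(eps)
    return 0.9 - (eps - 0.3) ** 2


knowledge = KnowledgeBase(toy_detection_prob)
first = knowledge.optimal_epsilon(-20)   # triggers a grid search
second = knowledge.optimal_epsilon(-20)  # answered from memory
```

The second call performs no new evaluations, which is the source of the time saving the article attributes to the learned knowledge.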

Reinforcement learning based on the grid search algorithm
The learning process is as follows:
1. After the fusion center finishes executing the grid search, the obtained grid coordinates are collected in a matrix $A$. The first column of $A$ holds the value of the control parameter $\varepsilon$, while the second column holds the value of the signal-to-noise ratio $a$. Each row holds the optimal control parameter found by executing the grid search algorithm in a specific radio environment.
2. The fusion center sends matrix $A$ and the matrix description to the cognitive users.
3. Each cognitive user memorizes the data from matrix $A$, sets its local detection threshold according to the data, and preserves the best setting. When the radio environment matches one in memory, the optimal threshold for that environment can be invoked directly to perform the next spectrum sensing process.
4. In the case of a new radio environment, the fusion center must alter the range of $a$ values and re-execute the grid search algorithm (i.e. execute steps 1-3 again).

Experiments and evaluation
This study designed three groups of Monte Carlo simulation experiments to evaluate the performance of the cooperative spectrum sensing method. The simulation experiments set the PU signal to BPSK, the bandwidth to 100 kHz, and the sensing duration to 100 ms. 30 The PU was placed in the center of a 1000 × 1000 m square and surrounded by 16 evenly distributed sensing nodes. The simulation scenario is shown in Figure 1. The probability of PU channel occupancy was set to $\beta = 50\%$, while the transmitting power of the PU signal was 100 mW. 31 Each node sampled 20 points; the noise power range was set to between 0 and 2 dB, the path-loss exponent was 2.7, the standard deviation of the shadowing was 5 dB, and the mean of the multipath Rayleigh fading was 1. 32-34 Figure 5 compares the detection probabilities ($P_d$) of the traditional fusion methods and the new fusion method proposed in this article. The traditional methods used for comparison were the AND, OR, and Majority fusion methods. As seen in Figure 5, the detection probability ($P_d$) obtained by the new fusion method was higher than that obtained by the traditional methods. The gain is more obvious at low SNR because each sensing node uses double-threshold energy detection and calculates its weight according to the signal energy it receives. Each node then contributes to the spectrum sensing process in proportion to its own weight. As such, the new fusion method reflects the actual role of each node more accurately than the traditional methods. Figure 6 compares the probability of error ($P_e$) of the grid search algorithm used in this study with that of other algorithms (i.e. fixed-single and fixed-double threshold algorithms). As seen in Figure 6, the grid search algorithm exhibited the lowest error probability during spectrum sensing.
This is because the best detection threshold can be obtained through the grid search algorithm in any radio environment. The other two algorithms have fixed thresholds and can therefore not adapt to noise fluctuations. While the probability of error (P e ) increases when SNR decreases in all cases, the grid search algorithm results in the smallest increase. Figure 7 shows a sensing speed comparison between reinforcement and non-reinforcement learning. As seen, reinforcement learning takes less sensing time than non-reinforcement learning under the same SNR conditions. This is because reinforcement learning can directly invoke detection thresholds in the same environment from the repository. If reinforcement learning is not used, then every spectrum sensing procedure requires a grid search algorithm to find the optimal threshold; this requires more sensing time. Sensing time decreases when SNR increases because the radio environment is simpler; with an increased signal-to-noise ratio, less information is stored and judgments are easier to make.
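For context, the AND, OR, and Majority baselines used in the detection probability comparison of Figure 5 are standard hard-decision fusion rules. The sketch below shows one common formulation; the function names are hypothetical, and counting an exact tie as "present" in the majority rule is an assumption, since conventions differ.

```python
def and_rule(decisions):
    """PU declared present only if every node reports presence."""
    return all(decisions)


def or_rule(decisions):
    """PU declared present if at least one node reports presence."""
    return any(decisions)


def majority_rule(decisions):
    """PU declared present if at least half of the nodes report presence."""
    votes = sum(1 for decision in decisions if decision)
    return votes >= len(decisions) / 2


# Local 1-bit decisions from four nodes (1 = PU present, 0 = PU absent).
local_decisions = [1, 0, 1, 1]
results = (
    and_rule(local_decisions),
    or_rule(local_decisions),
    majority_rule(local_decisions),
)
```

These rules weight every node equally, which is the behavior the weighted fusion method in this article is compared against.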
To verify the fast fusion algorithm, we compared the processing time with and without the fast fusion algorithm, in each case under the same number of nodes. To highlight the advantages of the fast algorithm proposed in this article, we observed the data processing time under different numbers of nodes. Table 1 shows the processing time for different numbers of nodes. The processing environment was MATLAB 7.0 on a computer with an Intel(R) Core(TM) i5-8500 CPU at 3.00 GHz, 8 GB of RAM, and a 64-bit operating system.
It can be seen from Table 1 that the fast algorithm used by the fusion center effectively reduces the data processing time; the average processing time is reduced by 18%. The larger the number of nodes, the more obvious the advantage of the fast algorithm in dealing with big data. When the number of nodes exceeds 30, the fast algorithm takes less time to process the data than the non-fast approach takes for 25 nodes, which demonstrates the advantage of the fast algorithm in this article.

Conclusion
This article studies a new sensing data fusion algorithm that can process the sensing data of each node quickly and without delay. In a cognitive radio network, different nodes have different sensing data due to their different geographical locations, and the contribution of each node's sensing data to cooperative sensing also differs. The fusion center uses a reinforcement learning mechanism to select cooperating nodes by identifying the sensing performance of each node. This reduces the amount of data to be processed to a certain extent and enables the fusion center to process the data sent by each node quickly, so that no decision delay is caused. This greatly improves the throughput of cognitive users while protecting the primary users. The experimental results show that the fast big data fusion algorithm in this article can effectively reduce the data processing time: the average processing time with the fast algorithm is 18% less than without it. When the number of nodes exceeds 30, the fast algorithm takes less time to process the data than the non-fast approach takes for 25 nodes, which demonstrates the advantage of the fast algorithm in this article. Furthermore, the algorithm reduces the processing time of node data while improving the sensing performance and increasing the throughput of cognitive users, which is of great significance. However, at present only the fast big data algorithm in the fusion center has been implemented, not an energy-saving algorithm in the nodes themselves; the latter is the goal of follow-up research.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.