Game algorithm based on link quality: Wireless sensor network routing game algorithm based on link quality

Aiming at the problems of low data transmission efficiency and uneven energy consumption caused by unreliable link communication in the routing process of wireless sensor networks, this article designs a routing game algorithm based on link quality. In this article, the index for evaluating link quality is defined first. Then, the link quality, node residual energy, and minimum jump transmission strategy are integrated into the utility function to establish a game model to determine the best next hop transmission node. Finally, the routing optimal transmission path is obtained according to the analysis of the existence of Nash equilibrium in the game. In the simulation experiment, the influence of the change of link quality parameters on the performance of the algorithm is analyzed, and the proposed algorithm is compared with non-linear weight particle swarm optimization (NWPSO) algorithm and Low Energy Adaptive Clustering Hierarchy-Improvement (LEACH-IMPT) algorithm in three aspects: the number of surviving nodes, network lifetime, and network energy consumption. The results show that the network lifetime of this method is 16.8% longer than that of LEACH-IMPT algorithm and 7.5% longer than that of NWPSO algorithm. This shows that the algorithm can effectively balance the network energy consumption and prolong the network life cycle. In addition, according to the routing path obtained in the simulation experiment, the optimality of its link quality is verified in the real experimental environment, and the experimental results prove the feasibility of the method in this article in practice.


Introduction
A wireless sensor network (WSN) is a wireless network composed of a large number of mobile or stationary sensor nodes that organize themselves and communicate in a multi-hop manner. WSNs are widely used in national defense and military, industrial control, ecological monitoring, intelligent transportation, and other fields. 1,2 The nodes cooperatively perceive, collect, process, and transmit data in the network coverage area. They are often deployed in complex environments, where they are prone to failure because of their communication methods and limited energy, and replacing faulty nodes manually is difficult and expensive. In addition, in practical applications, 1 the topology of WSNs changes frequently, 3,4 so the design of a reasonable routing mechanism plays an important role in WSNs.
The introduction of game theory in WSNs provides a new model for the study of routing problems. 5 Game theory mainly describes how participants choose strategies to maximize their benefits according to their environment, and how conflicts between the chosen strategies are handled and balanced. In WSNs, the choice of a data transmission path is similar to a game: nodes also choose strategies based on their environment to maximize revenue. However, data transmission is affected by both the rationality and the selfishness of the nodes. On one hand, a rational node decides whether to participate in collaboration according to its own condition, so as to extend its survival time and the network life cycle as much as possible. 6,7 On the other hand, because the energy of a node is very limited, the node may also behave selfishly: it refuses to cooperate in order to save energy now, without considering the long-term benefits of collaborative data forwarding. Node selfishness greatly affects the cooperation between nodes, resulting in uneven energy consumption and a shortened network life cycle during routing. 8 Therefore, game theory is introduced to restrain the selfish choices of nodes in the routing process and to resolve strategy conflicts between nodes, and thereby to address the problem of uneven energy consumption in the network.
In existing routing protocol research, most protocols ignore link quality when selecting the data forwarding node and use only the node residual energy and hop count as criteria; a few use link quality as a criterion but do not consider node energy. However, in actual applications, sensor nodes are extremely susceptible to interference when transmitting data, so it is impractical to assume that the link is in an ideal state. We find that combining link quality with node residual energy and a minimum-hop forwarding strategy in a game model gives better results, and it also provides an opportunity to verify the link quality in real experiments. A routing protocol 9 that combines these indicators is of great significance for improving the reliability of data transmission and reducing node energy consumption. In response to the problems of low routing data transmission efficiency and uneven energy consumption caused by unreliable link communication, the main contributions of this article are as follows:
1. Combining the sensitivity of the received signal strength indication (RSSI) with the accuracy of the link quality indicator (LQI), a link quality evaluation index better suited to the actual application environment is established.
2. According to the characteristics of nodes when forwarding data, the idea of game theory is introduced into the routing mechanism, and a game model is established whose utility function uses link quality, current node energy, and minimum hop count as indicators. Through the utility function, the cluster head node with the maximum benefit is selected as the next hop forwarding node, the Nash equilibrium is reached, and the optimal data transmission path is obtained.
3. In the simulation experiment, the variation curves of network residual energy and life cycle are given for different proportions of RSSI and LQI in the link quality formula. A comparison with other algorithms shows the advantages of the proposed algorithm.
4. An experimental platform was built to verify the link quality in a real experimental environment. The link quality of different data transmission paths was compared according to the RSSI and LQI measured by the cluster head nodes, and the experimental results verified the practical feasibility of the method.
The rest of this article is organized as follows. We briefly review the relevant work in section "Related work." In section "Description and energy consumption model," we describe the problems that this method can solve and establish the energy consumption model. Section "Design and analysis of routing game model" introduces the design and analysis of the model in this article. Section "Routing game algorithm" introduces the detailed design of the game algorithm based on link quality (GABLQ). Section "Experiment" introduces the experimental setup and the analysis of the experimental results. Finally, we summarize our work in section "Conclusion."

Related work
At present, researchers have proposed many routing protocols based on different characteristics of WSNs. The common goal of studying these routing protocols is to establish a reasonable routing mechanism in the network to improve the reliability of data transmission, maximize the balance of network energy consumption, and extend the network life cycle.
This article introduces the existing research work from two aspects: reducing network energy consumption and improving data transmission reliability. In terms of reducing network energy consumption, Junli et al. 10 proposed a new energy-saving routing algorithm based on software-defined wireless sensor network (SDWSN), which establishes a distance queue based on messages collected from nodes and calculates the nearest node to transmit data. The proposed routing algorithm makes use of the advantages of centralized control and topology management in software-defined networking (SDN), reduces the energy consumption of broadcast and reception, and performs well in prolonging the network lifetime. Zhang et al. 11 proposed an efficient hop count routing protocol (EHCR) for mobile ad hoc networks. The protocol assumes that all nodes have different transmit powers and sets a power threshold. When a node receives a path search message from the source node, it compares its own power with the threshold. If its power is greater than the threshold, it chooses the path with fewer hops; if it is less than the threshold, it chooses the path with more hops. This method weighs the transmission power against the number of hops when selecting the path, and thereby reduces network consumption. Wu et al. 12 proposed a clustering routing protocol based on an improved particle swarm optimization algorithm to reduce network energy consumption. First, the protocol considers the node position and residual energy, and improves the fitness function of the particle swarm algorithm to select cluster head nodes. Second, a multi-hop method based on the minimum spanning tree is designed to select the optimal path for data transmission. Kaur and Singh 13 proposed the Optimized Cost Effective Routing protocol (OCER), in which the genetic algorithm (GA) is used to solve the multiobjective cost optimization problem, and the path from the forwarding node to the sink node is determined according to the optimal solution obtained by the GA, which reduces the energy consumption of the node when transmitting data.
With the increasingly widespread application of WSNs, considering only network energy consumption is too simple and unrealistic, so researchers began to consider the reliable transmission of data. Liao et al. 14 analyzed the influence of hop count on network connectivity and path reliability in wireless multi-hop networks through modeling, which provided a theoretical basis for the study of wireless multi-hop networks. Tall et al. 15 proposed a link quality and delay based Composite Load Balancing routing protocol (ComLoB). This protocol solves the problem of data packet overflow caused by limited storage space by evenly distributing the traffic load, and thereby achieves a better data transmission rate. Mostafaei 16 proposed an algorithm based on distributed learning automata to meet the end-to-end reliability and delay requirements of WSNs in multi-hop data transmission. This method uses the advantages of the distributed learning automaton (DLA) to find the minimum number of nodes that meet the required quality of service (QoS), and improves the network efficiency under multi-constrained QoS parameters. Zhang et al. 17 proposed a high-speed-road scenario based adaptive routing algorithm (RAR) to solve the problem of frequent changes in network topology. The algorithm first uses greedy routing to select relays in the candidate node domain, and then selects the data forwarding path according to the link status between nodes and the degree of the next hop node. This method shows good routing stability.
The data-centric routing mechanisms above have greatly improved network performance. As sensor nodes have become more intelligent, scholars have proposed node-centric approaches that introduce game theory to analyze the behavior of nodes in routing. Shen et al. 18 used evolutionary game theory to study trust decisions and their dynamics, which play a key role in the stability of the whole network. They explain the evolution of the selection behavior of sensor nodes (SNs), and propose and prove a theorem stating that evolutionarily stable strategies can be obtained under different parameter values. In addition, a trust mechanism is introduced to ensure the security and stability of WSNs. Attiah et al. 19 proposed an evolutionary game for efficient routing in WSNs. They analyze the routing problem in WSNs and use evolutionary game theory to show how sensor nodes evolve their routing strategies to transmit data packets efficiently and stably. This method effectively reduces network congestion and prolongs the network life. Deng et al. 20 proposed a distributed energy equilibrium routing algorithm based on a hierarchical idea and a Markov game, aimed at the energy hole and funnel effect of WSNs. Based on the traditional cluster structure and hierarchical routing model, the algorithm designs a balanced structure and hierarchical routing model to adjust the relationship between the amount of transmitted data, the transmission power, and the number of nodes. This method effectively prolongs the network life cycle and reduces the average energy consumption of nodes. Umar et al. 21 proposed an adaptive data communication method based on game-theoretic rewards to counter the selfish or noncooperative behavior of nodes. This method encourages nodes to cooperate in the forwarding process through a reward and punishment mechanism, and effectively balances the workload of network nodes. Yang et al. 22 proposed a game theory method to balance the energy consumption of clusters, which addresses how to maximize the benefits of sensor nodes while obtaining a balanced strategy, and which greatly improves the network lifetime. Du et al. 23 proposed a cross-layer optimization energy balance topology control algorithm (COETC). In this method, game theory is introduced to establish the topological game model. Cross-layer information such as node degree, network connectivity, and medium access control (MAC) layer interference is introduced into the utility function to achieve good energy balance and high energy efficiency. Wu et al. 24 introduced game theory into the allocation of member nodes to maintain the load balance of cluster head nodes, and proposed a clustering method for WSNs based on energy balance and game theory. This method achieves better performance in prolonging the life cycle and balancing the energy of WSNs.
Compared with the data-centric routing pattern, the node-centric routing protocol based on game theory is more in line with practical applications. As far as we know, most of the current routing game algorithms only consider the energy consumption of the network and ignore the reliable transmission of data. However, the reliable transmission of data is also crucial in practical applications. Therefore, we study a routing game algorithm that comprehensively considers link quality and network energy consumption.

Problem description
In WSNs, all sensor nodes are self-managing individuals. Because these nodes are limited by their own energy and communication distance, they generally need to cooperate with each other to complete data transmission tasks in the network. In practical applications, nodes consume energy when transmitting data. A rational node decides whether to participate in collaboration according to its own condition, so as to extend its survival time and the network life cycle as much as possible. A selfish node, however, does not consider long-term interests and refuses to cooperate in order to save the energy consumed in data transmission, so not all nodes actively take part in cooperative forwarding. When a cluster head node transmits data to the base station (BS) in a multi-hop manner, there are multiple transmission paths. The existence of selfish nodes on these paths greatly affects the effective collaboration between the nodes, resulting in problems such as uneven energy consumption and a shortened network life cycle. Therefore, a reasonable routing mechanism is urgently needed to suppress the selfish choices of nodes.
To encourage cluster head nodes to collaborate and to reduce conflicts between nodes while each maximizes its own interests, we build a game model based on the rational characteristics of nodes. In this model, a cluster head node calculates its utility value according to the utility function and plays a game with other nodes to become a forwarding node, so as to extend its survival time as much as possible. Most traditional routing protocols use the number of hops from a node to the destination node as the basis for selecting the next hop node, but when a node with a smaller hop count is selected, the single-hop distance between the nodes becomes larger. 25 In this case, data transmission is unaffected only when the link is very reliable. In the actual environment, however, when the single-hop distance between nodes is large, the link is susceptible to interference from electromagnetic waves, shadowing, and so on. The resulting poor link quality usually manifests as more data retransmissions and longer delays, which further increases the energy consumption of the node. Link quality is therefore also used in the utility function as one of the criteria for selecting the next hop node, so that it can help reduce energy consumption and extend the network life cycle.

Consumption model
In WSNs, nodes consume energy when they perceive, process, and transmit data. Data transmission accounts for the largest proportion of this consumption, so we mainly analyze the transmission energy consumption of nodes when designing routing algorithms. This energy consumption consists of two parts: the energy consumed when receiving data, $E_{Rx}$, and the energy consumed when sending data, $E_{Tx}$. The energy consumption $E_{Rx}(l)$ of a node receiving $l$ bits of data is 26

$$E_{Rx}(l) = l \, E_{elec} \tag{1}$$

The energy consumption when sending data includes two parts: the energy consumption of the transmitting circuit and the energy consumption of the transmitting amplifier. When the distance between nodes is $d$, the energy $E_{Tx}(l, d)$ consumed to send $l$ bits of data is 26

$$E_{Tx}(l, d) = \begin{cases} l \, E_{elec} + l \, \varepsilon_{fs} \, d^{2}, & d < d_0 \\ l \, E_{elec} + l \, \varepsilon_{amp} \, d^{4}, & d \ge d_0 \end{cases} \tag{2}$$

In the formula, $E_{elec}$ represents the energy consumed by the wireless circuit to receive or send one unit of data, and $d_0$ is the critical value of the node distance 27

$$d_0 = \sqrt{\frac{\varepsilon_{fs}}{\varepsilon_{amp}}} \tag{3}$$

The free space model is used when $d < d_0$, and the multi-path attenuation model is used when $d \ge d_0$; $\varepsilon_{fs}$ and $\varepsilon_{amp}$ represent the power amplification coefficients in the free space model and the multi-path attenuation model, respectively.
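The energy consumption model described above can be sketched in a few lines of Python. This is a minimal illustration, assuming the usual first-order radio model with a squared-distance amplifier term below the threshold $d_0$ and a fourth-power term at or above it. The numeric parameter values are illustrative assumptions and are not taken from this article.

```python
import math

# Illustrative parameter values (assumptions, not taken from this article).
E_ELEC = 50e-9         # J/bit, circuit energy to send or receive one bit
EPS_FS = 10e-12        # J/bit/m^2, amplifier coefficient, free space model
EPS_AMP = 0.0013e-12   # J/bit/m^4, amplifier coefficient, multi-path model


def crossover_distance(eps_fs=EPS_FS, eps_amp=EPS_AMP):
    """Threshold d0 at which both amplifier terms give the same energy."""
    return math.sqrt(eps_fs / eps_amp)


def rx_energy(l_bits, e_elec=E_ELEC):
    """Energy consumed by a node to receive l_bits of data."""
    return l_bits * e_elec


def tx_energy(l_bits, d, e_elec=E_ELEC, eps_fs=EPS_FS, eps_amp=EPS_AMP):
    """Energy consumed by a node to send l_bits of data over distance d."""
    d0 = crossover_distance(eps_fs, eps_amp)
    if d < d0:
        # Free space model: amplifier energy grows with d^2.
        return l_bits * (e_elec + eps_fs * d ** 2)
    # Multi-path attenuation model: amplifier energy grows with d^4.
    return l_bits * (e_elec + eps_amp * d ** 4)
```

With these assumed values the threshold distance is roughly 88 m, so a long single hop is charged with the fourth-power term, which is why hop count alone is a poor selection criterion.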

Design and analysis of routing game model
This article considers the link quality during data transmission in addition to the network energy consumption when designing the routing mechanism. In this section, we first introduce the link quality evaluation index, then establish a routing game model based on this index, and finally analyze the model.

Link quality evaluation index
In WSNs, sensor nodes are easily affected and disturbed by the surrounding environment in the process of data transmission. The design and simulation of many existing routing protocols ignore the evaluation of link quality, generally on the assumption that the communication link is ideal. In practical applications, however, communication links often fluctuate randomly. Such fluctuations may reduce network throughput and increase energy consumption. Therefore, it is of great significance to choose a reasonable link quality measurement method that accurately evaluates the link quality in the application, which is the basis for achieving reliable network communication. 28

At present, there are three commonly used link quality evaluation indicators: RSSI, LQI, and packet reception rate (PRR). RSSI is the received signal strength indication, which refers to the signal power at the receiving end in the process of wireless communication and is usually expressed in terms of $P_r$, the power of the receiving end when receiving the signal. $P_r$ is related to the transmitting power $P_t$ of the sending end and the path loss $Loss$ during transmission. The loss can be described by the model of free space transmission loss. In free space, let $G_t$ be the transmit antenna gain and $d$ the propagation distance. On the spherical surface with the sending end as its center and radius $d$, the power per unit area $s$ is

$$s = \frac{P_t G_t}{4 \pi d^{2}} \tag{6}$$

When the receiving end is a parabolic antenna, let $G_r$ be the receiving gain of the antenna and $\lambda$ the signal wavelength. The effective area $A$ of the antenna can then be expressed as

$$A = \frac{G_r \lambda^{2}}{4 \pi} \tag{7}$$

Then, the received power $P_r$ can be expressed as

$$P_r = s A = \frac{P_t G_t G_r \lambda^{2}}{(4 \pi d)^{2}} \tag{8}$$

According to the definition of path loss, equation (8) can be rewritten as

$$Loss = \frac{P_t}{P_r} = \frac{(4 \pi d)^{2}}{G_t G_r \lambda^{2}} \tag{9}$$

From the above formula, the path loss of free space expressed in decibels is

$$Loss\,(\mathrm{dB}) = 10 \lg \frac{P_t}{P_r} = 20 \lg \frac{4 \pi d}{\lambda} - 10 \lg (G_t G_r) \tag{10}$$

In practical applications, equation (10) is usually rewritten in terms of the distance $d$ between the transmitting end and the receiving end, the signal frequency $f$, and the signal attenuation factor $n$, which ranges from 2 to 4.

LQI is the link quality indicator. When supported by hardware devices, both RSSI and LQI can be measured in real time, and can quickly reflect changes in link quality at a low network cost. 29 The value of LQI is correlated with RSSI and can be calculated from the RSSI value. 30 Compared with these two evaluation parameters, the PRR has no advantage, because nodes consume more energy and time when counting transmission times and expected transmission times. In actual measurement, LQI describes the link quality more accurately than RSSI, but its value does not respond promptly, whereas the RSSI value changes sensitively with the network environment. We therefore use both RSSI and LQI to comprehensively evaluate the link quality. 31 When node $i$ sends data to node $j$, the link quality $LQ(i, j)$ of the link $ij$ is represented by the $RSSI(i, j)$ and $LQI(i, j)$ read by node $j$ when receiving data, where $RSSI(i, j) \in [RSSI_{min}, 0]$ and $LQI(i, j) \in [0, LQI_{max}]$. $\lambda$ and $\eta$ represent the impact of RSSI and LQI on LQ, respectively. When the transmission power of the nodes is the same, the larger the values of $RSSI(i, j)$ and $LQI(i, j)$, the higher the link quality.
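As an illustration of the quantities above, the following Python sketch computes the free-space path loss in decibels and combines RSSI and LQI into a single link quality score. The path loss function follows directly from the free-space derivation. The `link_quality` function is only a hypothetical combination: it assumes that both readings are normalized to [0, 1] over their stated ranges and then weighted by `lam` and `eta`, and the default values of `rssi_min`, `lqi_max`, and the weights are illustrative assumptions rather than values from this article.

```python
import math


def free_space_path_loss_db(d, wavelength, g_t=1.0, g_r=1.0):
    """Free-space path loss 10*lg(P_t / P_r) in dB for distance d (metres)."""
    return (20 * math.log10(4 * math.pi * d / wavelength)
            - 10 * math.log10(g_t * g_r))


def link_quality(rssi, lqi, rssi_min=-100.0, lqi_max=255.0, lam=0.5, eta=0.5):
    """Hypothetical link quality score in [0, 1] when lam + eta == 1.

    rssi is expected in [rssi_min, 0] and lqi in [0, lqi_max]; both are
    normalized so that larger readings give a larger score.
    """
    if not rssi_min <= rssi <= 0:
        raise ValueError("rssi outside [rssi_min, 0]")
    if not 0 <= lqi <= lqi_max:
        raise ValueError("lqi outside [0, lqi_max]")
    rssi_norm = 1.0 - rssi / rssi_min   # rssi_min -> 0.0, 0 dBm -> 1.0
    lqi_norm = lqi / lqi_max            # 0 -> 0.0, lqi_max -> 1.0
    return lam * rssi_norm + eta * lqi_norm
```

Doubling the distance adds about 6 dB of free-space loss, and under this assumed normalization a link with stronger RSSI and higher LQI always scores higher, which matches the monotonic behavior stated in the text.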

Game model
Because game theory deals well with strategy conflicts between nodes and encourages cluster head nodes to cooperate, this article builds a game model based on the rational preference characteristics of nodes to study routing problems in the network. First, the three elements of the game model are defined as follows:
1. $M$ is the set of participants, $M = \{1, 2, \ldots, m\}$, where $m$ is the number of all participants.
2. $S$ is the strategy space of the participants, and $s = (s_i, s_{-i}) \in S$ denotes a strategy combination, where $s_i$ represents the strategy selected by participant $i$, and $s_{-i}$ represents the strategies selected by the remaining $m - 1$ participants.
3. $u = \{u_1, u_2, \ldots, u_m\}$ represents the utility functions of the participants, and $u_i$ represents the maximum utility value that participant $i$ can achieve in the strategy combination $(s_i, s_{-i})$.
In the game model of this article, all the cluster head nodes in the monitoring area are participants in the whole game process. This process is dynamically divided into different stages by the cluster head nodes that are willing to cooperate, and the set of participants in each stage is $N = \{1, 2, \ldots, n\}$. Suppose the current cluster head node $i$, which has data to transmit, initiates a game. If the strategy for choosing neighbor cluster head node $j$ is defined as $s_{ij}$, then the strategy set of participant cluster head node $i$ is expressed as $S_i = \{s_{i1}, \ldots, s_{ij}, \ldots, s_{in}\}$. When cluster head node $j$ is selected as the next hop cluster head node, the value of $s_{ij}$ is 1; otherwise, the value of $s_{ij}$ is 0. In this article, only one cluster head node can be selected as the next hop node, so exactly one entry in the strategy set of cluster head node $i$ has the value 1.
In the game model, the best next hop cluster head node is determined according to the node's utility function value. The utility function combines the number of hops needed for the cluster head node to forward data to the BS, the current energy, and the link quality between the cluster head nodes. Here, $Mhc_{max}$ is the maximum hop count between a participant cluster head node and the BS in the monitoring area, $Mhc_i$ is the minimum number of hops on the path along which the current cluster head node $i$ forwards data to the BS, $LQ(i, j)$ is the link quality, and $E_i$ and $E_{init}$ are the current residual energy and the initial energy of the cluster head node $i$, respectively. The term $Mhc_{max}/Mhc_i$ increases the probability that a cluster head node with a smaller hop count to the BS becomes the next hop cluster head node. The term $E_i/E_{init}$ reduces the probability that a cluster head node with less remaining energy becomes the next hop node. $\alpha$, $\beta$, and $\gamma$ are weighting factors that adjust the influence of each parameter on the game result according to the actual network situation.
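The next hop selection described above can be sketched as follows. This is a minimal illustration, not the article's exact utility function: it assumes an additive weighted form in which `alpha` weights the hop term, `beta` the energy term, and `gamma` the link quality term. The `Candidate` class, the function names, and the default weights are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    node_id: int
    min_hops: int        # minimum hop count to the BS (at least 1)
    energy: float        # current residual energy
    link_quality: float  # link quality towards this candidate


def utility(c, mhc_max, e_init, alpha=1 / 3, beta=1 / 3, gamma=1 / 3):
    """Assumed additive utility: fewer hops, more energy, better link."""
    return (alpha * mhc_max / c.min_hops
            + beta * c.energy / e_init
            + gamma * c.link_quality)


def select_next_hop(candidates, mhc_max, e_init, **weights):
    """Pick the candidate with the largest utility.

    Returns the chosen node id and the strategy set as a dict in which
    exactly one entry is 1 (selected) and all others are 0.
    """
    best = max(candidates, key=lambda c: utility(c, mhc_max, e_init, **weights))
    strategy = {c.node_id: int(c is best) for c in candidates}
    return best.node_id, strategy
```

For example, with candidates `Candidate(1, 2, 0.4, 0.9)`, `Candidate(2, 2, 0.5, 0.3)`, and `Candidate(3, 3, 0.5, 0.9)`, `mhc_max=4`, and `e_init=0.5`, the sketch selects node 1: node 2 has more energy but a much poorer link, and node 3 is one hop farther from the BS. Changing the weights shifts this trade-off.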

Model analysis
When the game reaches a certain stage, all participants choose their own optimal response strategy, and they will not actively deviate from the current strategy selection when other players' strategies remain unchanged. At this time, the game reaches a stable state, which is called game equilibrium.
Definition 1 (Nash equilibrium). If, for any participant $i \in M$ and any strategy $s_i \in S_i$,

$$u_i(s_i^{*}, s_{-i}^{*}) \ge u_i(s_i, s_{-i}^{*})$$

holds, then the strategy combination $s^{*} = (s_i^{*}, s_{-i}^{*})$ is a Nash equilibrium of the game $G = \{M, S, u\}$. 32
Property 1. There is at least one Nash equilibrium in a finite perfect information game. 33
In the routing game model in this article, the utility function combines the link quality between nodes, the node energy, and the minimum forwarding hop count to determine the next hop cluster head node. This maximizes the benefit of the participants, that is, their own survival time and the network life. When the game reaches a steady state with the highest return, the strategies of all participants are optimal, and the set of these optimal strategies is the Nash equilibrium. 34 In this article, the game is dynamically divided into different stages in time, and each stage of the game is initiated by the cluster head node determined in the previous stage. The specific process is as follows:
1. Determine the participant set $N$. When the cluster head node $i$ has data to transmit, it initiates the game, and all the cluster head nodes in the perceived range are participants in the game.
2. Determine the strategy of the participants. According to the information of the participants, the utility function is calculated to get the utility function set $u$. The node with the largest utility function value is the next hop cluster head node, and the strategy set $s$ of the participants is obtained.
3. Initiate the game repeatedly. The cluster head node determined in Step (2) initiates a new game to determine the following next hop cluster head node, and so on. When the participating cluster head nodes in a stage play the game, they can observe the strategy information of the cluster head nodes determined in the previous stage. In the end, the strategies of all participants constitute the optimal data transmission route, that is, the game reaches the Nash equilibrium.
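The staged process above can be sketched as a loop in which each selected node initiates the next stage. This is a simplified illustration under stated assumptions: `neighbors` is a hypothetical mapping from each node to its candidate next hops, `utility_of` is a hypothetical callback that returns the utility of candidate `j` as seen by initiator `i`, and forwarding directly to the BS whenever it is in range is an assumption of the sketch.

```python
def build_path(source, bs, neighbors, utility_of):
    """Repeat the stage game until the data reaches the base station.

    neighbors[i] lists the candidate next hops of node i;
    utility_of(i, j) returns the utility of candidate j for initiator i.
    """
    path = [source]
    current = source
    while current != bs:
        # Participants of this stage: candidates not already on the path.
        candidates = [n for n in neighbors.get(current, []) if n not in path]
        if not candidates:
            raise RuntimeError(f"no forwarding candidate at node {current}")
        if bs in candidates:
            nxt = bs  # assumption: deliver directly when the BS is in range
        else:
            nxt = max(candidates, key=lambda j: utility_of(current, j))
        path.append(nxt)
        current = nxt  # the selected node initiates the next stage
    return path
```

For example, with `neighbors = {"A": ["B", "C"], "B": ["BS"], "C": ["BS"]}` and utilities of 0.7 for link A to B and 0.9 for link A to C, the sketch returns the path A, C, BS.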

Routing game algorithm
The implementation of the GABLQ routing algorithm has the following requirements for WSNs: (1) when a node broadcasts, only some of the nodes can receive its information; (2) nodes have storage space and a certain computing power; (3) each node can obtain its own remaining energy; and (4) the hardware of all nodes except the BS is simple, and the RSSI can be obtained directly when a packet is received. The GABLQ algorithm consists of three stages: the hop count determination stage, the routing game stage, and the data transmission stage, which are explained in detail in the following.

Hop count determination stage
Each cluster head node establishes its own neighbor list, which is formatted as shown in Table 1. ID(j) is the ID information of the neighbor cluster head node j, Mhc j is the minimum hop count information from the neighbor cluster head node j to the BS, E j is the current remaining energy value of neighbor cluster head node j, LQ(i, j) represents the link quality of the link formed between the current cluster head node i and the neighbor cluster head node j, and u j is the utility function value of the neighbor cluster head node j.
In order to obtain the minimum hop count $Mhc$ from each cluster head node to the BS, the minimum hop count of all cluster head nodes is initially set to infinity, and the minimum hop count of the BS, $Mhc_{BS}$, is 0 by default. The BS establishes the network topology by broadcasting the information ER (Establish Routing), which contains the minimum hop count $Mhc_{BS} = 0$ of the BS itself. After receiving this ER message, any neighbor cluster head node $j$ updates its own hop value $Mhc_j$ to $Mhc_{BS} + 1$. After the hop count update, the cluster head node $j$ in turn broadcasts the information ER, which includes its own $ID(j)$ and $Mhc_j$, to its neighbor cluster head nodes. When any neighbor cluster head node $i$ compares the received $Mhc_j$ with its own $Mhc_i$, there are four cases:
Case 1: if $Mhc_i$ is still infinity, update $Mhc_i = Mhc_j + 1$ and add cluster head node $j$ to its neighbor list.
Case 2: if $Mhc_i \le Mhc_j$, ignore this $Mhc_j$ message and keep its hop value unchanged.
Case 3: if $Mhc_i = Mhc_j + 1$, add cluster head node $j$ to its neighbor list.
Case 4: if $Mhc_i \ge Mhc_j + 2$, clear the neighbor list of cluster head node $i$, set $Mhc_i = Mhc_j + 1$, and add cluster head node $j$ to its neighbor list.
After the information update of all the cluster head nodes in the network is completed, each cluster head node obtains the minimum hop count information to the BS and obtains a list storing all neighbor cluster heads.
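The four cases of the hop count determination stage can be sketched as a message-driven flood that starts at the BS. In this illustration, `links` is a hypothetical mapping from each node to the nodes within its radio range, and messages are processed from a first-in, first-out queue. Both the queue order and the rebroadcast after a Case 4 update are assumptions of the sketch.

```python
import math
from collections import deque


def determine_hop_counts(links, bs="BS"):
    """Return (mhc, neighbor_list) after flooding ER messages from the BS.

    links[n] lists the nodes within radio range of n (assumed symmetric).
    """
    mhc = {node: math.inf for node in links}
    mhc[bs] = 0
    neighbor_list = {node: [] for node in links}
    queue = deque([bs])  # nodes that still have to broadcast an ER message
    while queue:
        j = queue.popleft()
        for i in links[j]:
            if i == bs:
                continue  # the BS never updates its own hop count
            if mhc[i] == math.inf:
                # Case 1: first ER message received by node i.
                mhc[i] = mhc[j] + 1
                neighbor_list[i].append(j)
                queue.append(i)
            elif mhc[i] <= mhc[j]:
                # Case 2: sender is not closer to the BS, so ignore it.
                continue
            elif mhc[i] == mhc[j] + 1:
                # Case 3: another neighbor that is one hop closer.
                if j not in neighbor_list[i]:
                    neighbor_list[i].append(j)
            else:
                # Case 4: a strictly shorter route was found.
                neighbor_list[i] = [j]
                mhc[i] = mhc[j] + 1
                queue.append(i)  # assumption: rebroadcast the new value
    return mhc, neighbor_list
```

With `links = {"BS": ["A", "B"], "A": ["BS", "B", "C"], "B": ["BS", "A", "C"], "C": ["A", "B"]}`, nodes A and B end with a hop count of 1 and node C with a hop count of 2 and the neighbor list A, B. Every neighbor list therefore contains only nodes that are one hop closer to the BS.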

Routing game stage
When the cluster head node i has data transmission requirements, it broadcasts the Hello information, which includes the cluster head node number ID(i) and its minimum hop count Mhc i to the BS. After receiving the Hello message sent by cluster head node i, the neighbor cluster head node j reads and stores the minimum hop information Mhc i , and packs its own ID number and current energy value E j into a feedback message to the cluster head node i. After the cluster head node i receives the feedback information, it parses it, deletes the neighbor cluster head nodes with insufficient energy from the neighbor list, and updates the neighbor list. The cluster head node i initiates a game after updating the neighbor list, and sends request information to the remaining cluster head nodes in the list to request their u values. The cluster head node j that receives the request information reads the RSSI(i, j) and LQI(i, j) corresponding to the link ij to calculate LQ(i, j), and combines the current energy value E j and the minimum hop count Mhc j to calculate the utility function value u j , which is fed back to the game initiator cluster head node i. Cluster head node i arranges the u values of all remaining neighbor cluster head nodes in descending order, and selects the neighbor cluster head node with the largest u value as the next hop cluster head node.

Table 1. The head of the neighbor list of the cluster head node i.
ID(j) | Mhc j | E j | u j

Data transmission stage
Once the next-hop cluster head node j is determined, it becomes the new game initiator and repeats the game to determine its own next-hop cluster head node, and so on until the data reaches the BS. During data transmission, the ordering of the utility function values u in the neighbor list changes continually because of variations in link quality and node energy consumption. Two cases can be distinguished:
1. After the data transmission path is determined by the routing game algorithm, the cluster head node calculates the corresponding link quality for each data packet received. When the link quality LQ(i, j) changes abruptly, the corresponding utility function value u j also changes sharply. Cluster head node i then marks the neighbor cluster head node j as a suspicious node (SN) and the link ij as a suspicious link. If the marked link returns to its original level after five subsequent communications, the mark is removed; otherwise cluster head node j is deleted from the neighbor list of cluster head node i, and the cluster head node with the next best u j is selected.
2. During data transmission, the energy of each node decreases continuously. When cluster head node i judges from the feedback information that the current energy E j of the next-hop cluster head node j is insufficient to receive and forward data normally, cluster head node i selects the neighbor cluster head node with the second-best u value in its own neighbor list to forward the data.
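The suspicious-link rule in case 1 can be sketched as a small state machine. The sketch rests on several assumptions that the text does not fix: an "abrupt" change is modeled as a deviation from the baseline LQ larger than a hypothetical jump_threshold, LQ is taken to be normalized to [0, 1], and the mark is cleared as soon as the link recovers within the five-communication window (the text could also be read as checking only at the fifth communication). The class name LinkMonitor and the return labels are illustrative.

```python
class LinkMonitor:
    """Tracks one link ij and flags it when LQ(i, j) changes abruptly."""

    def __init__(self, baseline_lq, jump_threshold=0.2, probation=5):
        self.baseline = baseline_lq           # LQ when the path was selected
        self.jump_threshold = jump_threshold  # hypothetical definition of "abrupt"
        self.probation = probation            # five subsequent communications
        self.suspicious = False
        self.remaining = 0

    def observe(self, lq):
        """Feed the LQ of one received packet; return 'ok', 'suspicious' or 'remove'."""
        deviates = abs(lq - self.baseline) > self.jump_threshold
        if not self.suspicious:
            if deviates:
                self.suspicious = True        # mark node j and link ij
                self.remaining = self.probation
                return "suspicious"
            return "ok"
        # The link is on probation: count this communication.
        self.remaining -= 1
        if not deviates:
            self.suspicious = False           # back to the original level
            return "ok"
        if self.remaining == 0:
            return "remove"                   # caller deletes j from the neighbour list
        return "suspicious"


# A link that degrades and never recovers is removed on the fifth follow-up packet.
bad_link = LinkMonitor(baseline_lq=0.8)
bad_trace = [bad_link.observe(lq) for lq in (0.78, 0.4, 0.4, 0.4, 0.4, 0.4, 0.4)]

# A link that recovers during probation has its mark cleared.
good_link = LinkMonitor(baseline_lq=0.8)
good_trace = [good_link.observe(lq) for lq in (0.3, 0.79)]
```

On a "remove" result the initiator would fall back to the neighbor with the next best u value, as described in the text.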

Algorithm analysis
In this algorithm, each cluster head node uses a linked list to store the information of its neighbor cluster head nodes. Assuming that the number of neighbor cluster head nodes j is n, the information that the cluster head node needs to store after establishing the neighbor list is the ID(j), the minimum hop count Mhc j, the residual energy E j, and the utility function value u j of each neighbor cluster head node. The space occupied by this information is 2n. When determining the data transmission path, each hop node only monitors the information of the participants in the previous stage and does not store the information of all nodes, so the space complexity is O(n). During the game, the node with the largest utility function value u j becomes the next-hop node, and the calculation of the u j values is distributed among the neighbor cluster head nodes, which effectively balances the load. In addition, for the data generated by the cluster head nodes, assume that the data generation rate is m, the running time of the network is T, and the number of nodes is P. The amount of data stored by a node after the network has run for time T is then positively related to m, T, and P. In the extreme case in which the stored data never finds its destination node, the storage complexity is C = O(mTP).

Experimental simulation
In order to verify the performance of the algorithm in this article, we conducted simulation experiments with MATLAB 2016. The simulation parameters are shown in Table 2.
Effect of tunable parameters on algorithm performance. The parameters λ and η represent the proportions of RSSI and LQI, respectively, in the link quality evaluation index, and choosing their values is important for analyzing the performance of the algorithm. We fixed λ = 10, set η to 3, 5, and 7 in turn for simulation experiments, and compared network performance to select the best parameter combination. Network performance is evaluated by the remaining energy of the network and its life cycle. First, we deployed 200 cluster head nodes in a 500 m × 500 m square area to compare the remaining energy of the network in the three cases; second, we compared the life cycle of the network in the three cases while varying the number of nodes in the same area. Each experiment was repeated five times under the same conditions and the results were averaged. The effects of parameters λ and η on network performance are shown in Figures 1 and 2. Figure 2 shows the residual energy curve of the network for the three parameter combinations as the number of operating rounds increases. In all three curves, the residual energy decreases rapidly after the network starts running, the rate of decline gradually slows once the network has run for a certain number of rounds, and a small amount of energy remains at the end of network operation. Overall, the slope of the curve is largest at λ = 10 and η = 3, second largest at λ = 10 and η = 5, and smallest at λ = 10 and η = 7, which indicates that the network consumes energy most slowly at λ = 10 and η = 7.
Comparing the three curves at the same number of rounds, the curve for λ = 10 and η = 7 has the most residual energy. For λ = 10 and η = 3, the rate of energy reduction gradually decreases once the network has run for 800 rounds, and the remaining energy stops changing as it approaches 1200 rounds; for λ = 10 and η = 5, the network energy keeps decreasing evenly until 1445 rounds; and for λ = 10 and η = 7, the curve flattens after 1400 rounds and the network stops running after nearly 1600 rounds. In terms of the number of operating rounds, λ = 10 and η = 7 runs longest, λ = 10 and η = 5 is second, and λ = 10 and η = 3 runs for the fewest rounds. This shows that when λ = 10, the larger η is, the less energy the network consumes and the longer it runs.
For ease of discussion, the network life cycle is defined as the number of running rounds at which 50% of the nodes in the network have died. Figure 3 compares the life cycle of the network for the three parameter combinations when 100, 150, 200, 250, and 300 cluster head nodes are deployed in a 500 m × 500 m square area. Overall, as the number of cluster head nodes in the network increases, the life cycle of the network increases significantly for all three parameter combinations. This can be understood as follows: a larger number of cluster head nodes reduces the network energy consumption to a certain extent and thus extends the network life cycle. In all five deployments, the network lifetime is longest when λ = 10 and η = 3 and shortest when λ = 10 and η = 7, which shows that the network life cycle becomes shorter as η increases. In summary, comparing both the residual energy and the life cycle of the network for the three parameter combinations, λ = 10 and η = 5 is the best compromise, and we use this set of parameters in all the following experiments.
Comparison of algorithm performance. As shown in Table 3, in Xiang et al., 35 the non-linear weight particle swarm optimization (NWPSO) algorithm defines the fitness function according to the residual energy of nodes and the average distance between nodes to find the best cluster head node. The Low Energy Adaptive Clustering Hierarchy-Improvement (LEACH-IMPT) algorithm in Wang et al. 36 chooses the best data transmission route according to the energy consumption, hop count, and remaining energy of the path. Both algorithms perform well in terms of network lifetime. Compared with these two algorithms, the novelty of the routing algorithm in this article is that it not only combines the residual energy of nodes and the minimum number of hops but also considers the link quality between nodes. To verify the feasibility and effectiveness of this algorithm, the two algorithms are simulated on the MATLAB simulation platform, and comparative experiments are carried out from three aspects: the number of surviving nodes, network survival time, and network energy consumption. Experimental environment: 200 cluster head nodes are deployed in a 500 m × 500 m area, the BS is located at the center of the area, and the remaining parameters are shown in Table 2. Before conducting the comparative experiments, we set the weighting factors of the utility function in the GABLQ algorithm to α = 0.2, β = 0.3, and γ = 0.5 to analyze the performance of the algorithm in this article.
Note to Table 3. GABLQ: game algorithm based on link quality; WSN: wireless sensor network; NWPSO: nonlinear weight particle swarm optimization; SDWSN: software-defined wireless sensor network.
Network survival time is one of the important criteria for measuring network performance. To facilitate discussion, we divide the network survival time into two parts: the stable period, in which the number of surviving nodes remains unchanged, that is, the period before the first dead node appears; and the unstable period, in which the number of nodes is decreasing. Figure 4 compares the number of surviving nodes of the different algorithms as the number of running rounds increases. It can be seen that the network survival time of the algorithm in this article is significantly longer than that of the NWPSO and LEACH-IMPT algorithms. During the stable period, the number of surviving nodes is 200 for all three algorithms, but the length of the stable period differs. With the LEACH-IMPT algorithm, the first dead node appears after the network has run for 600 rounds. The stable periods of the NWPSO and GABLQ algorithms last until 837 rounds and 984 rounds, respectively, so the stable period of the algorithm in this article is the longest. In the unstable period, the number of surviving nodes of the GABLQ algorithm is always greater than that of the other two algorithms at the same round. With the LEACH-IMPT algorithm the network stops running at the 1130th round, with the NWPSO algorithm at the 1320th round, and with the algorithm in this article at the 1428th round. Judging by performance in both the stable and unstable periods, this algorithm is superior to the other two. Figure 5 compares the network energy consumption of the three algorithms as the number of operating rounds increases.
It can be seen that the energy consumption of the network with this algorithm is lower than with the NWPSO and LEACH-IMPT algorithms. With all three algorithms, the network consumes energy quickly in the early stage of operation. After a certain number of rounds, the energy consumption rate gradually slows down, and a small amount of energy remains at the end of network operation. In the early stage of network operation, the slope of the energy consumption curve of the GABLQ algorithm is smaller than that of the NWPSO and LEACH-IMPT algorithms, which shows that the network consumes energy more slowly with this algorithm than with the other two, and that it has consumed the least energy at any given number of rounds. With the LEACH-IMPT algorithm, network energy consumption increases slowly after 620 rounds until the end at 1130 rounds. With the NWPSO algorithm, network energy consumption increases slowly between 800 and 1320 rounds. With the algorithm in this article, however, network energy consumption increases uniformly up to 1060 rounds and then quickly reaches its maximum at 1420 rounds, which shows that the network energy consumption of this algorithm is more balanced.
From the number of surviving nodes in the network in Figure 4, the number of running rounds at FDT, HDT, and LDT can be further obtained, as shown in the histogram of Figure 6. Here, FDT is the number of running rounds at which the first node in the network dies, HDT is the number of running rounds at which 50% of the nodes have died, and LDT is the number of running rounds at which the remaining cluster head nodes can no longer form a complete transmission path. In all three cases, the number of running rounds of the algorithm in this article is significantly greater than that of the NWPSO and LEACH-IMPT algorithms. From the running rounds at HDT, it can be concluded that, for the same number of cluster head nodes, the network life cycle is longest with the algorithm in this article, followed by the NWPSO algorithm, and shortest with the LEACH-IMPT algorithm.
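As a small illustration of how FDT and HDT can be read off a surviving-node curve, the sketch below scans a per-round list of alive counts. The function name death_rounds and the input format are assumptions made for this example, and the sample series is invented rather than taken from the simulation. LDT is deliberately omitted, because deciding whether the remaining cluster head nodes still form a complete path to the BS requires the network topology, not just the count.

```python
def death_rounds(alive_per_round, total_nodes):
    """Return (FDT, HDT) as 1-based round numbers, or None where not reached.

    alive_per_round[k] is the number of surviving nodes after round k + 1.
    FDT: first round with at least one dead node.
    HDT: first round with at most half of the nodes alive.
    """
    fdt = hdt = None
    for round_no, alive in enumerate(alive_per_round, start=1):
        if fdt is None and alive < total_nodes:
            fdt = round_no
        if hdt is None and alive <= total_nodes / 2:
            hdt = round_no
            break          # FDT is necessarily set by now, nothing left to find
    return fdt, hdt


# Invented ten-node example: first death in round 3, half dead in round 5.
example_curve = [10, 10, 9, 8, 5, 4, 0]
fdt, hdt = death_rounds(example_curve, total_nodes=10)
```

Applied to the per-round output of a simulation run, the same scan yields the stable period (FDT) and the network life cycle as defined in this article (HDT).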

Experimental verification
In a real experimental environment, the link quality in a WSN is often subject to interference from external environmental factors. To assess the feasibility of applying the algorithm of this article in practice, we built an experimental platform to verify it. As shown in Figures 7 and 8, the equipment used in the experimental platform is the STM32W108 development board and the STM32W108 radio frequency transceiver module, which conforms to the IEEE 802.15.4/Zigbee standard, is powered by battery, and can be connected to a JLink for program burning and online debugging. The development board supports random selection among channels 11-26, and the radio frequency transceiver module sends broadcast packets on the channel randomly selected by the development board.
Experimental design. First, we simulate a rectangular monitoring area of 60 m × 30 m in MATLAB. Eleven cluster head nodes are deployed in the network; the source cluster head with data to transmit is located at (5 m, 5 m), and the BS is located at (60 m, 30 m). The remaining 10 forwarding cluster head nodes are initialized in this area, and the perception radius of each node is set to 20 m. The remaining parameters are shown in Table 2. Running the routing algorithm in this article yields the data transmission path from the source node to the BS. On this path, the data generated by the source cluster head node reaches the BS in three hops, that is, two games are played along the transmission path to determine the two forwarding nodes. We set up an experimental platform in an outdoor scene of the same size. After the nodes are deployed at the locations initialized by the algorithm, the network is constructed, and the effectiveness of the algorithm is verified by comparing the link quality of each link in each hop.
Outdoors, we select a rectangular area of 60 m × 30 m, deploy the nodes according to the node location information in the algorithm, and run the network after deployment, as shown in Figure 9. All the nodes are mounted on tripods. The three blue circles evenly distributed near the source cluster head node (CH send) represent the cluster member nodes. The red circles are the cluster head nodes (CH 1 and CH 6) selected in the simulation experiment, the yellow circles (CH 1, CH 2, CH 3, CH 4, and CH 5) are the first-hop neighbor cluster head nodes, and the green circles (CH 6, CH 7, CH 8, CH 9, and CH 10) represent the second-hop cluster head nodes. To make the observations clearer, we deployed intra-cluster nodes only for the source cluster head node, connected the cluster head nodes on the selected path to the PC with a serial cable, and transferred the data to the PC via USB. When the source cluster head node sends data packets to the five neighbor cluster head nodes (indicated by yellow circles), we monitor the RSSI values of these five cluster head nodes and analyze the corresponding LQI values. After the experiment, the quality of each link is obtained by analyzing the RSSI and LQI data of the cluster head nodes in the first hop, and the effectiveness of the algorithm is verified by comparing the link quality of each link in each hop.

Experimental analysis.
In the link quality evaluation index in this article, the RSSI and LQI are used to evaluate the link quality. The higher the RSSI and LQI values, the better the link quality. We collect the RSSI values of all cluster head nodes in the actual scene and compare the RSSI values of all neighboring cluster head nodes in the first hop and the second hop, as shown in Figures 10 and 12. The LQI values of all links are then compared according to the RSSI values, as shown in Figures 11 and 13.
We define the five candidate cluster head nodes in the first hop as CH 1, CH 2, CH 3, CH 4, and CH 5. Figure 10 shows the RSSI at the source cluster head node and at each candidate neighbor cluster head node in the first hop. Figure 10(a) shows the RSSI of the source cluster head node (CH send). After the source cluster head node forwards the data of its cluster member nodes I, II, and III, Figure 10(b)-(f) compares the RSSI changes of the five neighboring cluster head nodes. Among them, the neighbor cluster head node CH 1 shown in Figure 10(b) is the first-hop optimal cluster head node obtained by running the algorithm. From the overall fluctuation range, the fluctuation range of the RSSI value of CH 1 shown in Figure 10 ... However, when CH send forwards the data of its cluster member node I to the neighbor cluster head nodes, the received signal strength of CH 1 shown in Figure 10(b) is the best, averaging −69 dBm; CH 2 shown in Figure 10(c) follows with an average of −70 dBm; and the received signal strength of CH 5 shown in Figure 10(f) is the weakest, with an RSSI fluctuating between −78 and −82 dBm. When CH send forwards the data of cluster member node II to the neighbor cluster head nodes, the RSSI value of CH 2 shown in Figure 10(c) stays between −70 and −75 dBm and its curve is the smoothest; in contrast, the RSSI value of CH 1 shown in Figure 10(b) lies between −66 and −74 dBm, and although the difference between its maximum and minimum is 8 dB, its RSSI is significantly better than that of CH 2; in addition, the RSSI values of the remaining three neighbor cluster head nodes are all lower than that of CH 1. When CH send forwards the data of cluster member node III to the neighbor cluster head nodes, the RSSI curves of CH 2 shown in Figure 10(c) and CH 5 shown in Figure 10(f) have the smallest fluctuation amplitude, with a difference of 5 dB between the maximum and minimum RSSI values, although the RSSI value of CH 2 is larger. The RSSI curves of CH 1, CH 3, and CH 4 fluctuate strongly, but the RSSI value of CH 1 is the largest among the three. In summary, considering the change in RSSI of the five neighbor cluster head nodes after the first-hop routing, the received signal strength of the neighbor cluster head node CH 1 shown in Figure 10(b) is the best.
Figure 10. RSSI comparison of cluster head nodes in the first hop: (a) RSSI curve when CH send receives data from cluster member nodes, (b) RSSI curve of neighbor cluster head node CH 1, (c) RSSI curve of neighbor cluster head node CH 2, (d) RSSI curve of neighbor cluster head node CH 3, (e) RSSI curve of neighbor cluster head node CH 4, and (f) RSSI curve of neighbor cluster head node CH 5.
Figure 11. LQI comparison of cluster head nodes in the first hop: (a) cluster member node I, (b) cluster member node II, and (c) cluster member node III.
Figure 12. RSSI comparison of cluster head nodes in the second hop: (a) RSSI curve when CH 1 receives data from cluster member nodes, (b) RSSI curve of neighbor cluster head node CH 6, (c) RSSI curve of neighbor cluster head node CH 7, (d) RSSI curve of neighbor cluster head node CH 8, (e) RSSI curve of neighbor cluster head node CH 9, and (f) RSSI curve of neighbor cluster head node CH 10.
According to the definition of LQI, we can get the LQI through the RSSI of the node. The effective range of LQI is [0,255], and the larger the value of LQI, the better the link quality. Figure 11 shows the LQI of each neighbor cluster head node at different times in the first hop. Figure 11(a)-(c) shows the comparison of LQI when CH send forwards the data of cluster member nodes I, II, and III to the next hop, respectively. For the convenience of comparison, when LQI is less than or equal to 0, we collectively express it as 1. Viewed as a whole, the LQI of CH 1 and CH 2 is significantly greater than the remaining three neighbor cluster head nodes. When receiving the data of node I, the LQI difference between CH 1 and CH 2 is small, but when receiving the data of node II and node III, the LQI of CH 1 is obviously larger than that of CH 2 , so the LQI of CH 1 is better than the LQI of CH 2 as a whole.
Combining Figures 10 and 11, it can be concluded that the link quality between CH 1 and the source cluster head node CH send is the best among all the candidate cluster heads in the first hop. After the first-hop selection, we use the selected best neighbor cluster head node as the new source cluster head node and perform the second-hop link quality verification. We define the five candidate neighbor cluster head nodes in the second hop as CH 6, CH 7, CH 8, CH 9, and CH 10. Figure 12 shows the RSSI of the optimal first-hop cluster head node and of each neighbor cluster head node in the second hop, where Figure 12(a) shows the RSSI curve of the optimal first-hop cluster head node CH 1. After CH 1 forwards the data of cluster member nodes I, II, and III, Figure 12(b)-(f) compares the RSSI changes of the five neighboring cluster head nodes in the second hop, where the neighbor cluster head node CH 6 shown in Figure 12(b) is the second-hop optimal cluster head node obtained by running the algorithm. Overall, the RSSI of CH 6 shown in Figure 12(b) fluctuates between −66 and −73 dBm, that of CH 7 shown in Figure 12(c) between −70 and −76 dBm, that of CH 8 shown in Figure 12(d) between −71 and −79 dBm, that of CH 9 shown in Figure 12(e) between −73 and −82 dBm, and that of CH 10 shown in Figure 12(f) between −76 and −85 dBm.
However, when CH 1 forwards the data of cluster member node I to the neighbor cluster head nodes, the RSSI of CH 7 shown in Figure 12(c) remains between −70 and −72 dBm, and its curve is the smoothest. The RSSI of the remaining four neighbor cluster head nodes fluctuates over a wider range, but the RSSI value of CH 6 shown in Figure 12(b) is greater than or equal to −70 dBm at all times, which is better than the RSSI of CH 7. When CH 1 forwards the data of cluster member node II to the neighbor cluster head nodes, the RSSI value of CH 6 shown in Figure 12(b) averages −70 dBm; the RSSI values of CH 7 shown in Figure 12(c), CH 8 shown in Figure 12(d), and CH 9 shown in Figure 12(e) all lie between −70 and −80 dBm; and the RSSI value of CH 10 shown in Figure 12(f) is the smallest, always below −80 dBm. When CH 1 forwards the data of cluster member node III to the neighbor cluster head nodes, the RSSI curves of CH 6 shown in Figure 12(b) and CH 7 shown in Figure 12(c) fluctuate more gently than the others, but the average RSSI of CH 7 remains at −76 dBm, whereas the average RSSI of CH 6 remains at −74 dBm. In addition, the RSSI values of the remaining three neighbor cluster head nodes are lower and their curves are relatively volatile. Considering the change in RSSI of the five neighbor cluster head nodes after the second-hop routing, the received signal strength of the neighbor cluster head node CH 6 shown in Figure 12(b) is the best. Figure 13 shows the LQI of each neighbor cluster head node at different times in the second-hop selection. Figure 13(a)-(c) shows the LQI comparison of all neighboring cluster head nodes in the second hop when the first-hop node CH 1 forwards the data of node I, node II, and node III to the next hop.
Across the three panels, as the network running time increases, the LQI of CH 6 is significantly larger than that of the remaining four neighbor cluster head nodes. When receiving the data of node I, the LQI values of CH 6 and CH 7 are much larger than those of the other neighboring cluster head nodes. The LQI value of CH 6 is above 30 at every moment, whereas some LQI values of CH 7 fall below 30, so the LQI of CH 6 is better. Similarly, the LQI value of CH 6 is still the largest in Figure 13(b) and (c).
Combining Figures 12 and 13, it can be concluded that, among all the candidate cluster heads of the second hop, the link quality between CH 6 and the optimal first-hop cluster head node CH 1 is the best. In summary, in the selection of both the first hop and the second hop, the link quality of the data transmission path selected by the algorithm in this article is optimal. On the experimental platform, we compared the link quality of each link during the operation of the algorithm, and the experimental results prove the effectiveness of the proposed algorithm.
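The per-candidate comparison used in this analysis (average received signal strength and fluctuation range) can be sketched in a few lines of Python. The function names summarize_rssi and strongest_candidate are assumptions, and the sample readings are illustrative values chosen for the example, not the measured data behind the figures. Ranking by mean RSSI alone is a simplification; the analysis above also weighs the fluctuation range and the LQI.

```python
from statistics import mean


def summarize_rssi(samples_dbm):
    """Map each candidate to its mean RSSI (dBm) and max-min spread (dB)."""
    return {
        ch: {"mean": mean(values), "spread": max(values) - min(values)}
        for ch, values in samples_dbm.items()
    }


def strongest_candidate(samples_dbm):
    """Return the candidate with the highest mean RSSI (closest to 0 dBm)."""
    stats = summarize_rssi(samples_dbm)
    return max(stats, key=lambda ch: stats[ch]["mean"])


# Illustrative readings only: one candidate is stronger but less steady.
readings = {
    "CH1": [-66, -70, -74],
    "CH2": [-70, -72, -75],
}
summary = summarize_rssi(readings)
winner = strongest_candidate(readings)
```

In this made-up example the first candidate wins on mean RSSI even though its spread is larger, which mirrors the kind of trade-off discussed for CH 1 and CH 2 in the first hop.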

Conclusion
Reasonable routing protocols can effectively extend the life cycle of WSNs. Unlike traditional routing protocols, this article proposes a routing game algorithm based on link quality (GABLQ). First, a link quality evaluation index better suited to practical application environments is established by combining RSSI and LQI. Then, according to the characteristics of node data forwarding, game theory is introduced into the routing mechanism, and the link quality, node residual energy, and minimum hop forwarding strategy are incorporated into the utility function to establish a game model. On this basis, the cluster head node with the maximum benefit is selected through the utility function as the next-hop forwarding cluster head node, so that the Nash equilibrium is reached and the optimal data transmission path is obtained. In addition, we set up a real experimental environment to verify the link quality in multi-hop transmission. Using the deployment locations and transmission path obtained from the simulation experiment, we carried out networking experiments in the actual environment, and the experimental results prove the feasibility of this method.
In future work, we will further improve the GABLQ algorithm by taking node coverage and data transmission rate into account, and design an Internet of Things monitoring scheme for effective regional monitoring and reliable data transmission by setting the weights of the different parameters in the utility function. Verifying energy consumption, life cycle, packet loss rate, and delay for large-scale multi-hop transmission in a real experimental scenario is also a valuable research direction.

Author contributions
Z.H. has contributed toward the algorithms and the analysis. As the supervisor of J.H., he has proofread the paper several times and provided guidance throughout the whole preparation of the manuscript. J.H. has contributed toward the algorithms, the analysis, and the experiments and wrote the paper. J.D. has simulated the comparison part of the algorithm in the experimental simulation and modified the relevant definitions of the paper. X.D. and N.Q. have revised the equations, helped in writing the introduction and the related works, and critically revised the paper. All authors read and approved the final manuscript.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.