A Betweenness Calibration Topology Optimal Control Algorithm for Wireless Sensor Networks

In self-organized wireless sensor networks (WSNs), any two sensor nodes can connect if they are within each other's communication range. Therefore, the physical topology of a WSN is usually strongly connected. Sensor nodes must frequently receive and process data from their many neighbors, which consumes a great amount of energy. Severe wireless channel collisions also cause low throughput and a high packet loss ratio during data transmission. To improve transmission performance and save scarce energy, a logical topology generated from the physical one is necessary for self-organized WSNs. Based on complex network theory, this paper proposes a novel betweenness addition edges expansion (BAEE) algorithm. With betweenness calibration, the BAEE algorithm expands the minimum-cost edges to optimize the network topology. Two performance metrics, the connectivity robustness function $R(G)$ and the efficiency function $E(G)$, are used to evaluate the network's robustness and invulnerability. $R(G)$ measures the topology connectivity, and $E(G)$ evaluates the network's capability to exchange information. Simulations under various random failure and intentional attack scenarios show that BAEE can effectively optimize the WSN topology, improving both the network's connectivity robustness and its efficiency in exchanging information.


Introduction
Wireless sensor networks (WSNs) are a class of self-organized wireless communication networks, in which many sensor nodes collect, process, and exchange information acquired from the physical environment or the monitor objects and then send it to the external base station, called Sink [1]. WSNs have a wide range of potential applications including environment monitoring, smart grid, medical systems, and robotic exploration [2].
There are two main difficulties in WSNs' design: (1) the limited and nonreplenishable energy supply and (2) the limited transmission bandwidth and high packet loss rate caused by the out-of-order distributed communication. Hence, the energy control algorithm and robust infrastructure are necessary to prolong the networks' lifetime and improve the communication performance.
Topology optimal control (TOC), which aims to design a good logical network infrastructure, is one of the key techniques used in wireless self-organized sensor networks [3]. A network is regarded as connected if there is at least one route between any two sensor nodes. Because of the omnidirectional antenna, any two nodes in a WSN can communicate if the Euclidean distance between them is less than the communication range. Therefore, the physical topology of a WSN is usually strongly connected. Every node frequently receives and processes data from its many neighbors, which consumes a great amount of energy. The minimum energy network connectivity (MENC) problem was defined and proved to be an NP-complete problem [4].
In the research on TOC, the previous research can be classified into two types based on the optimized objects: physical topology control algorithms (PTCA) and logical topology control algorithms (LTCA) [5,6]. PTCA adjust sensor nodes' transmission power to control the physical topology. On the other hand, LTCA restrict one node connected with a certain number of neighbors to satisfy the network connectivity. This neighbor reduction mechanism helps to reduce the routing overhead and relieve the channel collision problems.
Unlike wired communication networks, such as IP networks, a WSN is a dynamic network. Many factors cause the dynamic structure, from system hardware to application, because unattended sensor nodes have miniature sizes (mm scale for smart dust motes), limited battery power, and low-reliability hardware circuits that must cope with harsh conditions. Other factors that may affect network connectivity and communication among sensor nodes are fading, signal strength, obstacles, weather conditions, interference, and so forth [7,8]. An immutable topology structure is not enough for WSNs, because any dynamic change will break the original optimization and reduce the network performance.
To overcome this critical problem, this paper proposed a novel betweenness addition edges expansion algorithm (BAEE). With the betweenness parameter, BAEE algorithm expanded the minimum-cost edges to optimize the network topology with maximum improving of the efficiency function values. The preliminary simulation results, compared with Fiedler-vector-based strategy, showed that our algorithm could obtain more robust topology with higher invulnerability under both the random failures and intentional attack scenarios.
This paper is organized as follows. Section 1 introduces the TOC problem in WSNs, and Section 2 presents the related work. The mathematical description of the problem and the model building are presented in Section 3. Section 4 describes the BAEE algorithm in detail. Section 5 presents simulation results to demonstrate the effectiveness of the algorithm. Section 6 concludes the paper.

Related Work
There are three types of approaches in the previous TOC research, presented as follows. (1) Control each node's emission power to reduce the strong connectivity of the physical topology, thereby saving energy and prolonging the network lifespan. Rodoplu and Meng [9] introduced the notions of relay region and enclosure for the purpose of power control. It was shown that the network was strongly connected if every node maintained links with the nodes in its enclosure. As the transmission power is reduced, the topology connectivity becomes thin. Building a minimum-power-connected topology is a multiobjective optimization problem. (2) Reduce the total number of working nodes in the WSN, and let the other nodes hibernate. This also reduces the topology complexity. Moreover, the approach helps to reduce the interference that exists in wireless networks, which means that a greater signal-to-noise ratio is obtained at the receiving nodes. The most common schemes based on this principle are sensor-MAC (S-MAC) [10], timeout-MAC (T-MAC) [11], and data-gathering MAC (D-MAC) [12]. (3) Control each sensor node's logical degree in the logical topology, which helps to reduce MAC layer contention and improve space reuse. A lower logical degree may also help to mitigate the hidden and exposed terminal problems. The clustering topology control strategy is one of the effective approaches, similar to the spanning-tree structure in WSNs. The low-energy adaptive clustering hierarchy (LEACH) [13] is the most notable clustering algorithm for wireless sensor networks. LEACH combines the ideas of energy-efficient clustering with application-specific data aggregation to achieve good performance. Its improved algorithm, power-efficient gathering in sensor information systems (PEGASIS) [14], is a chain-based clustering scheme. Another effective topology structure is the spanning tree [15]. Li et al. [16] proposed a fully distributed topology control algorithm called LMST.
A similar method, -local MST, was addressed by Li et al. [17].
As the number of sensor nodes increases, the topology of large-scale WSNs becomes more and more complex. TOC is a multiobjective optimization problem for which the global optimal solution is very difficult to find; the degree-constrained minimum spanning-tree problem is one example. Some heuristic methods were developed to improve the optimization performance [18][19][20][21][22][23]. In [18], the authors proposed two heuristics based on a minimum spanning-tree algorithm and a broadcast incremental power method, respectively. Konstantinidis developed a genetic algorithm with local search that performs better than the MST heuristic [19]. Guo presented an improved discrete particle swarm optimization algorithm for generating topology schemes [20]. A simulated annealing algorithm was designed in [21], and it was also applied to the minimum broadcast tree problem, one type of physical topology control problem [22]. In [23], ant colony optimization, a swarm intelligence framework inspired by ant foraging behavior, is applied to physical topology control.
The above heuristic algorithms focus on the solution procedure of the optimization problem itself, in which topology control is abstracted into a multiobjective optimization problem. We analyze the topology control problem from a different viewpoint: we use complex network theory to calculate the network's long-range and short-range connectivity, and then propose a novel BAEE algorithm that improves the network's robustness and invulnerability by expanding the minimum-cost edges.

The Network Model and the Parameters of Complex Network
The formal definition of the TOC problem in WSNs is presented as follows. In a given sensor area, there is a set of wireless nodes $V = \{v_1, v_2, \ldots, v_N\}$, and $E$ is the set of communication links. When an adjacent pair $v_i$, $v_j$ shares the same wireless medium, $(v_i, v_j) \in E$ indicates that both nodes are within their wireless transmitting range $r_0$. Therefore, the wireless sensor network is represented as a simple digraph $G = (V, E)$. Because of the omnidirectional antenna, any two nodes in a WSN can communicate if the Euclidean distance between them is less than the communication range. Therefore, a WSN's topology is usually strongly connected. Complex network theory is used in this paper to analyze this type of strongly connected topology. Some parameters of complex networks are presented first in the following section.
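The connection rule in this model (two nodes are linked when their Euclidean distance is less than the communication range) can be illustrated with a short Python sketch. The function names `build_physical_topology` and `random_positions` are hypothetical and not part of the paper; the adjacency-set representation is an assumption made for illustration.

```python
import math
import random


def build_physical_topology(positions, comm_range):
    """Return adjacency sets for the physical topology.

    Nodes i and j are linked when their Euclidean distance is
    strictly less than comm_range (the rule stated in the text).
    """
    n = len(positions)
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(positions[i], positions[j]) < comm_range:
                adj[i].add(j)
                adj[j].add(i)
    return adj


def random_positions(n, side, seed=0):
    """Place n nodes uniformly at random in a side x side square field."""
    rng = random.Random(seed)
    return [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
```

For example, `build_physical_topology(random_positions(100, 900.0), 300.0)` would generate a topology with the node count, field size, and range used in the simulation section, although the random placement itself will differ from the authors' experiment.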

The Parameters of Complex Network.
International Journal of Distributed Sensor Networks
A complex network's attributes can be described by its key parameters: degree distribution, clustering coefficient, average path length, and betweenness [24,25].
(1) Cumulative Degree Distribution Function. The degree of a node in a network is its number of connections, and the degree distribution $p(k)$ is the probability distribution of these degrees over the whole network. The cumulative degree distribution function $P(k)$ is the probability that a node has a degree not less than $k$:

$$P(k) = \sum_{k' = k}^{\infty} p(k').$$

(2) Clustering Coefficient. In graph theory, a clustering coefficient is a measure of the degree to which nodes in a graph tend to cluster together. The local clustering coefficient of a node $v$ in a graph quantifies how close its neighbors are to being a clique, that is, a complete graph. Let $\lambda_G(v)$ be the number of triangles on $v \in V(G)$ for an undirected graph $G$. That is, $\lambda_G(v)$ is the number of subgraphs of $G$ with three edges and three nodes, one of which is $v$. Let $\tau_G(v)$ be the number of triples on $v \in V(G)$. That is, $\tau_G(v)$ is the number of subgraphs (not necessarily induced) with two edges and three nodes, one of which is $v$, such that $v$ is incident to both edges. Then, the local clustering coefficient can be defined as

$$C_v = \frac{\lambda_G(v)}{\tau_G(v)}.$$

The clustering coefficient for the whole network is given as the average of the local clustering coefficients of all of the nodes:

$$C = \frac{1}{N} \sum_{i=1}^{N} C_i.$$

Evidence suggests that, in most real-world networks, nodes tend to create tightly knit groups characterized by a relatively high density of ties; this likelihood tends to be greater than the average probability of a tie randomly established between two nodes.
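The two parameters defined above can be computed directly from an adjacency-set representation of an undirected graph. The following is a minimal sketch under that assumption; the function names are hypothetical, and nodes with fewer than two neighbors are assigned a local clustering coefficient of zero, which is a common convention rather than something stated in the text.

```python
from collections import Counter
from itertools import combinations


def degree_distribution(adj):
    """Fraction of nodes having each degree k."""
    n = len(adj)
    counts = Counter(len(neighbors) for neighbors in adj.values())
    return {k: c / n for k, c in counts.items()}


def cumulative_degree_distribution(adj):
    """Fraction of nodes whose degree is not less than k."""
    p = degree_distribution(adj)
    return {k: sum(pk for kk, pk in p.items() if kk >= k) for k in p}


def local_clustering(adj, v):
    """Triangles on v divided by triples centered on v."""
    k = len(adj[v])
    if k < 2:
        return 0.0  # convention assumed here: no triples means coefficient 0
    triangles = sum(1 for a, b in combinations(adj[v], 2) if b in adj[a])
    triples = k * (k - 1) / 2
    return triangles / triples


def average_clustering(adj):
    """Average of the local clustering coefficients over all nodes."""
    return sum(local_clustering(adj, v) for v in adj) / len(adj)
```

For a triangle with one pendant node attached, the three triangle nodes have local coefficients 1, 1, and 1/3, and the pendant node has 0, so the network coefficient is 7/12.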
(3) Average Path Length. Average path length is defined as the average number of steps along the shortest paths for all possible pairs of network nodes. It is a measure of the efficiency of information or mass transport on a network. The definition is

$$L = \frac{1}{N(N-1)} \sum_{i \neq j} d_{ij},$$

where $d_{ij}$ is the shortest distance between nodes $i$ and $j$. Average path length is one of the most robust measures of network topology.
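For an unweighted graph, the shortest distances can be obtained with a breadth-first search from every node. The sketch below assumes a connected, undirected graph stored as adjacency sets and averages over all ordered node pairs; the function names are hypothetical.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return dist


def average_path_length(adj):
    """Mean shortest-path length over all ordered pairs of distinct nodes."""
    n = len(adj)
    total = 0
    for s in adj:
        dist = bfs_distances(adj, s)
        if len(dist) < n:
            raise ValueError("graph is not connected")
        total += sum(dist.values())
    return total / (n * (n - 1))
```

On a four-node path graph the pairwise distances are 1, 2, 3, 1, 2, and 1, so the average path length is 10/6.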
(4) Betweenness. There are two definitions: the vertex betweenness $B(v)$ and the edge betweenness $B(e)$. Here, we use $B(v)$ as the example. The betweenness $B(v)$ is a centrality measure of a node within a graph; it quantifies the number of times a node acts as a bridge along the shortest path between two other nodes:

$$B(v) = \sum_{s \neq v \neq t} \frac{\sigma_{st}(v)}{\sigma_{st}}.$$

Here, $\sigma_{st}$ is the total number of the shortest paths from node $s$ to node $t$, and $\sigma_{st}(v)$ is the number of those paths that pass through $v$.
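The vertex betweenness can be computed efficiently with Brandes' algorithm, which is a standard technique and not necessarily the procedure used by the authors. The sketch below assumes an unweighted, undirected graph stored as adjacency sets and counts each unordered pair of endpoints once; the function name is hypothetical.

```python
from collections import deque


def vertex_betweenness(adj):
    """Brandes' algorithm for unweighted, undirected graphs."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        # Breadth-first search that counts shortest paths from s.
        stack = []
        preds = {v: [] for v in adj}
        sigma = {v: 0 for v in adj}
        dist = {v: -1 for v in adj}
        sigma[s] = 1
        dist[s] = 0
        queue = deque([s])
        while queue:
            u = queue.popleft()
            stack.append(u)
            for w in adj[u]:
                if dist[w] < 0:
                    dist[w] = dist[u] + 1
                    queue.append(w)
                if dist[w] == dist[u] + 1:
                    sigma[w] += sigma[u]
                    preds[w].append(u)
        # Accumulate pair dependencies in order of decreasing distance.
        delta = {v: 0.0 for v in adj}
        while stack:
            w = stack.pop()
            for u in preds[w]:
                delta[u] += sigma[u] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Assumption: each unordered pair (s, t) is counted once, so halve.
    return {v: b / 2 for v, b in bc.items()}
```

On a four-node path, each of the two inner nodes lies on the shortest paths of two endpoint pairs, so their betweenness is 2 and that of the end nodes is 0.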

Two Evaluation Functions for Measuring the Network's Robustness and Invulnerability.
Based on the above four parameters, the connectivity robustness function and the efficiency function can be defined and used to evaluate the network's robustness and invulnerability.
(1) Connectivity Robustness Function $R(G)$. Connectivity robustness refers to the capability of the remaining nodes to stay connected when some of the nodes or edges in the network are removed [26]. In the network $G = (V, E)$ with $N$ nodes, the connectivity robustness is defined as follows:

$$R(G) = \frac{S(G)}{N - N_r}.$$

Here, $S(G)$ is the size of the largest connected component remaining after the removal of $N_r$ nodes. The connectivity robustness is normalized; that is, $0 < R(G) \le 1$. The maximum value $R(G) = 1$ is obtained when $G$ is still a connected graph after the $N_r$ nodes are removed.
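The following sketch computes the connectivity robustness under the assumption that it equals the size of the largest remaining connected component divided by the number of remaining nodes, which is consistent with the normalization described above. The function names are hypothetical.

```python
from collections import deque


def largest_component_size(adj, removed):
    """Size of the largest connected component once `removed` nodes are gone."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


def connectivity_robustness(adj, removed=()):
    """Largest remaining component divided by the number of remaining nodes."""
    removed = set(removed)
    remaining = len(adj) - len(removed)
    if remaining <= 0:
        raise ValueError("no nodes remain")
    return largest_component_size(adj, removed) / remaining
```

On a five-node path, removing the middle node leaves two components of size 2 among 4 remaining nodes, so the value is 0.5; removing an end node leaves the graph connected, so the value is 1.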
(2) Efficiency Function $E(G)$. Instead of $C$ and $L$, the network is characterized in terms of how efficiently it propagates information on a global and on a local scale, respectively defined as the global efficiency function $E_{glob}(G)$ and the local efficiency function $E_{loc}(G)$ [25,27]. We assume that the efficiency $\varepsilon_{ij}$ in the communication between nodes $i$ and $j$ is inversely proportional to the shortest distance: $\varepsilon_{ij} = 1/d_{ij}$, $\forall i, j$. In this definition, when there is no path in the graph between $i$ and $j$, $d_{ij} = +\infty$, and consistently $\varepsilon_{ij} = 0$. The global efficiency of the graph $G$ can be defined as

$$E_{glob}(G) = \frac{1}{N(N-1)} \sum_{i \neq j \in G} \frac{1}{d_{ij}}.$$

The local efficiency function can be defined as the average efficiency of local subgraphs as follows:

$$E_{loc}(G) = \frac{1}{N} \sum_{i \in G} E_{glob}(G_i),$$

where $G_i$ is the subgraph of the neighbors of $i$, which is made by $k_i$ nodes and at most $k_i(k_i - 1)/2$ edges. It is important to notice that the quantities $\{d_{lm}\}$ are the shortest distances between nodes $l$ and $m$ calculated on the graph $G_i$. Both the global and the local efficiency are already normalized; that is, $0 \le E_{glob}(G) \le 1$ and $0 \le E_{loc}(G) \le 1$.
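Both efficiency functions can be sketched with breadth-first searches, treating unreachable pairs as contributing zero. The code assumes an undirected, unweighted graph stored as adjacency sets; the function names are hypothetical, and a neighbor subgraph with fewer than two nodes is assigned efficiency zero as a convention.

```python
from collections import deque


def global_efficiency(adj):
    """Average of 1/d over all ordered pairs; unreachable pairs add 0."""
    n = len(adj)
    if n < 2:
        return 0.0  # convention assumed here for trivial graphs
    total = 0.0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    total += 1.0 / dist[w]
                    queue.append(w)
    return total / (n * (n - 1))


def local_efficiency(adj):
    """Average global efficiency of each node's neighbor subgraph."""
    total = 0.0
    for i in adj:
        neighbors = adj[i]
        # Subgraph induced by the neighbors of i (node i itself excluded).
        sub = {u: adj[u] & neighbors for u in neighbors}
        total += global_efficiency(sub)
    return total / len(adj)
```

A complete graph reaches the maximum value 1 for both functions, whereas a three-node path has a global efficiency of 5/6 and a local efficiency of 0 because no node's neighbors are linked to each other.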
The maximum values of the efficiency, $E_{glob}(G) = 1$ and $E_{loc}(G) = 1$, are obtained in the ideal case of a completely connected graph, that is, in the case in which the graph has all of the $N(N-1)/2$ possible edges and $d_{ij} = 1$, $\forall i, j$. In the efficiency-based formalism, a network is extremely efficient in exchanging information when both its global and local efficiency values are high. Moreover, the description of a network in terms of its efficiency can be extended to unconnected networks and, more importantly, with only a few modifications, to weighted networks. A weighted network is one in which a weight is associated with each of the edges. Such a network needs two matrices to be described. Consider the following.
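(1) The usual adjacency matrix $A = [a_{ij}]$, which records the existence or nonexistence of a link, is defined as follows:

$$a_{ij} = \begin{cases} 1, & \text{if there is a link between nodes } i \text{ and } j, \\ 0, & \text{otherwise.} \end{cases}$$

(2) The second matrix is the weights matrix $W = [w_{ij}]$ associated with the links, where $w_{ij}$ can be defined as the communication cost, depending on the optimization problem. Obviously, a weighted network optimization problem, such as the TSP (traveling salesman problem), is more complex than topology control.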
In this paper, we focus instead on the simpler case of unweighted networks topology. We will use the connectivity robustness function and the global efficiency function to evaluate the network's robustness and invulnerability under the random failures and the intentional attacks.

Description of Betweenness Addition Edges Expansion Algorithm
In order to improve the network's robustness and invulnerability under random failures and intentional attacks, a novel betweenness addition edges expansion algorithm is proposed. Based on traffic analysis in practical WSNs, we find that communication connections are usually event driven, and both the vertex betweenness $B_{vet}$ and the edge betweenness $B_{edg}$ follow heavy-tailed distributions. As defined above, the betweenness is the number of the shortest paths through node $v$ or edge $e$, which shows the importance of $v$ and $e$ in network transmission. A vertex $v$ with high betweenness $B_{vet}$ bears more packet switching and is a core vertex in the network. An edge $e$ with high betweenness $B_{edg}$ bears more traffic flows and is a key edge for the network's connectivity. In order to improve the network's robustness and avoid transmission congestion, the network's core parts should be identified. BAEE carefully selects special vertex pairs and connects them with edges according to the betweenness addition strategy. After the optimization, the network diameter can be effectively reduced, and the transmission delay is shortened. The BAEE process is given as follows.
(1) According to the vertex betweenness $B(v)$ formula and the network adjacency matrix $[a_{ij}]_{N \times N}$, the betweenness of each vertex is calculated and saved in the column vector $B$.

(2) Using the betweenness column vector $B$, a new betweenness-plus column vector $[B^{+}]$ is calculated.

The process of the bubble method is as follows.

(a) Compare the first element, that is, the 0th, with the next element; if it is smaller, switch them. Continue comparing sequentially so that the smallest-valued element eventually moves into the last unit; that element does not move anymore.

According to the above optimization approach, an experimental simulation was carried out to evaluate the algorithm's performance. A detailed analysis of the results is given in the following section.
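The steps above leave several details open, such as how vertex pairs are scored and how many edges are added. The sketch below is one possible reading and not the authors' implementation. It assumes that the betweenness-plus value of a non-adjacent vertex pair is the sum of the two vertex betweenness values, that the bubble pass sorts these values in descending order (as the compare-and-switch rule suggests), and that the `k` top-ranked pairs receive new edges. All function names and the parameter `k` are hypothetical.

```python
from itertools import combinations


def betweenness_plus(adj, betweenness):
    """Assumed scoring: sum of betweenness for each non-adjacent pair."""
    return [
        (betweenness[i] + betweenness[j], i, j)
        for i, j in combinations(sorted(adj), 2)
        if j not in adj[i]
    ]


def bubble_sort_descending(items, key):
    """Bubble method: swap neighbors when the earlier one is smaller,
    so the smallest element sinks to the last unit after each pass."""
    items = list(items)
    for end in range(len(items) - 1, 0, -1):
        for i in range(end):
            if key(items[i]) < key(items[i + 1]):
                items[i], items[i + 1] = items[i + 1], items[i]
    return items


def expand_edges(adj, betweenness, k):
    """Add edges between the k top-ranked pairs (an assumed selection rule)."""
    ranked = bubble_sort_descending(
        betweenness_plus(adj, betweenness), key=lambda item: item[0]
    )
    new_adj = {v: set(neighbors) for v, neighbors in adj.items()}
    added = []
    for _, i, j in ranked[:k]:
        new_adj[i].add(j)
        new_adj[j].add(i)
        added.append((i, j))
    return new_adj, added
```

The input graph is left unmodified, so the original and the expanded topology can be evaluated side by side. If the intended rule selects the lowest-scoring pairs instead, only the comparison inside `bubble_sort_descending` needs to be reversed.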

Simulation and Performance Evaluation
In the simulation scenario, 100 sensor nodes are randomly placed in a 900 m × 900 m field. Each node's radio propagation range is 300 m. After the self-organization process, a strongly connected physical topology is established. To reduce interference, the neighbors of each sensor node are controlled based on the traffic requirements, and a logical topology is then generated, which is the topology actually needed for data transmission, as shown in Figure 1.
The connectivity robustness function $R(G)$ and the efficiency function $E(G)$ for the initial network are calculated as follows: $R(G) = 1$, because it is a connected graph; $E(G) = 0.226$. In the simulation, the BAEE algorithm is used to optimize the network topology and is compared with the Fiedler-vector-based strategy (FVBS), another well-known method for TOC presented in [28]. The main idea of FVBS is to add a link between the node pair $(i, j)$ with the maximal $|v_i - v_j|$, the absolute difference between the $i$th and $j$th elements of the Fiedler vector of $G$. Because the Fiedler vector is related to the algebraic connectivity of $G$, to keep the evaluation fair, the simulation results are analyzed through the connectivity robustness function $R(G)$ and the efficiency function $E(G)$ rather than the algebraic connectivity.
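The FVBS selection rule can be illustrated without external libraries by approximating the Fiedler vector with shifted power iteration on the graph Laplacian. This is a sketch under stated assumptions, not the procedure of [28]: the graph is assumed connected and undirected, only non-adjacent pairs are considered as candidates, and the function names and the iteration count are hypothetical. A linear algebra library would normally be preferred for large graphs.

```python
import math
from itertools import combinations


def fiedler_vector(adj, iterations=2000):
    """Approximate the eigenvector of the second-smallest Laplacian eigenvalue.

    Power iteration is applied to (shift * I - L) while the constant
    vector is projected out, so the iteration converges to the Fiedler
    vector when the starting vector is not orthogonal to it.
    """
    nodes = sorted(adj)
    n = len(nodes)
    if n < 2:
        raise ValueError("need at least two nodes")
    index = {v: i for i, v in enumerate(nodes)}
    shift = 2 * max(len(adj[v]) for v in nodes) + 1  # exceeds every eigenvalue of L
    x = [float(i) for i in range(n)]
    for _ in range(iterations):
        mean = sum(x) / n
        x = [xi - mean for xi in x]  # remove the constant component
        y = []
        for v in nodes:
            i = index[v]
            laplacian_x = len(adj[v]) * x[i] - sum(x[index[w]] for w in adj[v])
            y.append(shift * x[i] - laplacian_x)
        norm = math.sqrt(sum(yi * yi for yi in y))
        x = [yi / norm for yi in y]
    mean = sum(x) / n
    x = [xi - mean for xi in x]
    return {v: x[index[v]] for v in nodes}


def fvbs_candidate(adj):
    """Non-adjacent pair with the largest Fiedler-vector difference."""
    f = fiedler_vector(adj)
    pairs = [
        (abs(f[i] - f[j]), i, j)
        for i, j in combinations(sorted(adj), 2)
        if j not in adj[i]
    ]
    return max(pairs)[1:] if pairs else None
```

On a four-node path the Fiedler vector changes monotonically along the path, so the selected pair is the two end nodes.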
In the simulation, BAEE and FVBS are first used to optimize the original topology, generating the new topologies $G_{BAEE}$ and $G_{FVBS}$. To keep the comparison fair, the same number of edges is added in $G_{BAEE}$ and $G_{FVBS}$, as shown in Table 1.
Then, identical random failures and intentional attacks are applied to the two topologies. The robustness and invulnerability are evaluated by the two performance metrics $R(G)$ and $E(G)$.

The Experiments under Random Failures.
Random failures mean that nodes in the network fail at random, and the edges connected to the failed nodes fail at the same time. Because of low-reliability hardware circuits, limited battery power, and harsh wilderness conditions, sensor node failures often occur in practical applications. Figure 2 shows the connectivity robustness function $R(G)$ as the number of randomly failed nodes increases. From Figure 2, we can observe that the two optimized topologies have higher $R(G)$ values than the original network under random failures. Moreover, BAEE is better than FVBS: its $R(G)$ value is 5.23% higher on average, which means that the optimized network has a stronger capability of maintaining connectivity. Unlike the other two curves, the $R(G_{BAEE})$ curve is stable. For example, in the scenario with 11 failed nodes, the $R(G_{BAEE})$ curve does not fluctuate, unlike the sharp declines of the $R(G_{FVBS})$ and $R(G)$ curves, which indicates that the BAEE algorithm has better "resistance."

Figure 3 shows the efficiency function $E(G)$ under random failures. As the number of failed nodes increases, the efficiency of all three networks decreases, because failed nodes break certain shortest paths. However, $E(G_{BAEE})$ is higher than $E(G_{FVBS})$ and $E(G)$, by 13.78% and 23.59%, respectively. This shows that the BAEE algorithm can optimize the network and reach high efficiency in exchanging information for ubiquitous data-centric wireless sensor networks.
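A random-failure experiment of this kind can be sketched as follows. The code assumes that the connectivity robustness is the largest remaining component divided by the number of remaining nodes and that failed nodes are drawn uniformly without replacement; the function names and the `seed` parameter are hypothetical, and the sketch does not reproduce the authors' exact setup.

```python
import random
from collections import deque


def largest_component_size(adj, removed):
    """Size of the largest connected component once `removed` nodes are gone."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            u = queue.popleft()
            size += 1
            for w in adj[u]:
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
        best = max(best, size)
    return best


def robustness_under_random_failures(adj, max_failures, seed=0):
    """Robustness value after 0, 1, ..., max_failures random node failures."""
    if max_failures >= len(adj):
        raise ValueError("at least one node must survive")
    rng = random.Random(seed)
    order = rng.sample(sorted(adj), max_failures)
    curve = []
    for k in range(max_failures + 1):
        removed = set(order[:k])
        curve.append(largest_component_size(adj, removed) / (len(adj) - k))
    return curve
```

Using the same `seed` for the original and the optimized topologies applies an identical failure sequence to each, which matches the fairness requirement described in the text.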

The Experiments under Intentional Attacks.
Intentional attack is another kind of incident for wireless sensor networks. Based on partial information about the network, an enemy can accurately attack the weakest parts and break down the whole system. A network should therefore have a more robust topology to resist intentional attacks. In the following experiments, two types of attacks are simulated: (1) nodes with high vertex betweenness are made to fail; (2) links with high edge betweenness are made to fail. The two metrics $R(G)$ and $E(G)$ are again used to measure the algorithms' performance. Figure 4 presents the connectivity robustness function $R(G)$ under intentional attacks in which high-betweenness nodes fail. From the three curves, we can see that both the BAEE and FVBS algorithms improve the original network's invulnerability. BAEE is also stronger than FVBS when the number of failed nodes is more than 6; the gap is approximately 25.74%. When the number of failed nodes keeps increasing and exceeds 13, the values of $R(G_{BAEE})$ and $R(G_{FVBS})$ decline sharply and coincide with $R(G)$. The reason is that the original network has its inherent structural quality, and TOC algorithms can improve the network performance only to a limited extent.
The efficiency function $E(G)$ under node failures is shown in Figure 5. The BAEE algorithm optimizes the network and reaches a high value of $E(G_{BAEE})$, which is 22.98% higher than $E(G_{FVBS})$ on average. Moreover, the same two aberration points that occurred in the above experiments also appear in this experiment. When the number of failed nodes is more than 6, the $E(G_{FVBS})$ and $E(G)$ curves show a sudden drop, but the network topology optimized by the BAEE algorithm avoids this fluctuation, showing stronger stability. When the number of failed nodes is more than 13, both $E(G_{BAEE})$ and $E(G_{FVBS})$ decline sharply and coincide with $E(G)$, which again reflects the network's inherent structural quality.
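The node attack can be sketched by removing nodes in decreasing order of vertex betweenness and recording the efficiency after each removal. The sketch assumes that the betweenness values are computed once on the intact network and passed in as a dictionary, and that the efficiency is normalized by the number of remaining nodes; neither choice is stated in the text, and the function names are hypothetical.

```python
from collections import deque


def global_efficiency(adj, removed=frozenset()):
    """Average of 1/d over ordered pairs of surviving nodes."""
    alive = [v for v in adj if v not in removed]
    n = len(alive)
    if n < 2:
        return 0.0
    total = 0.0
    for s in alive:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            u = queue.popleft()
            for w in adj[u]:
                if w not in removed and w not in dist:
                    dist[w] = dist[u] + 1
                    total += 1.0 / dist[w]
                    queue.append(w)
    return total / (n * (n - 1))


def efficiency_under_attack(adj, betweenness, max_failures):
    """Efficiency after removing the 0, 1, ..., max_failures top-ranked nodes."""
    # Assumption: static ranking by the betweenness of the intact network.
    order = sorted(adj, key=lambda v: -betweenness[v])
    return [
        global_efficiency(adj, set(order[:k]))
        for k in range(max_failures + 1)
    ]
```

For a star with four leaves, the intact efficiency is 0.7, and removing the center (the only node with nonzero betweenness) isolates every leaf, so the efficiency drops to 0.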
Another attack strategy, in which high-betweenness links fail, is also implemented in the experiments. The top 20 high-betweenness links are broken sequentially to evaluate the network's performance; the curves of $R(G)$ are presented in Figure 6. The results show that, under the high-betweenness edge attack, the FVBS algorithm can no longer improve the network's invulnerability, as shown by the coinciding curves $R(G_{FVBS})$ and $R(G)$. However, the BAEE algorithm is effective under this kind of attack. While fewer than 12 links are broken, $R(G_{BAEE})$ is 9.89% higher on average than $R(G_{FVBS})$ and $R(G)$. Figure 7 presents the efficiency function $E(G)$ under the intentional edge attack. BAEE also reaches a higher value, $E(G_{BAEE})$, than FVBS, $E(G_{FVBS})$, and the original network, $E(G)$, with an average increase of 31.41% over $E(G_{FVBS})$ and 50.88% over $E(G)$. These results indicate that the BAEE algorithm has obvious advantages against intentional edge attacks.

Betweenness Addition Edges Expansion algorithm
(1) Calculate each vertex's betweenness with the vertex betweenness $B(v)$ formula.

Conclusions
Because of the omnidirectional antenna, any two sensor nodes in a WSN can connect if they are within each other's communication range. Therefore, the physical topology of a WSN is usually strongly connected. Every node must frequently receive and process data from its many neighbors, which consumes a large amount of energy. To improve transmission performance and save scarce energy, a logical topology generated from the physical one, together with further dynamic optimization, is necessary for self-organized wireless sensor networks.
Based on topology vulnerability analysis, this paper proposes a topology optimization control algorithm, BAEE. The algorithm calculates the vertex betweenness and expands special edges with the minimum cost. Two metrics, the connectivity robustness function $R(G)$ and the efficiency function $E(G)$, are used to measure the network performance. $R(G)$ measures topology connectivity, and $E(G)$ evaluates the network's capability to exchange information. Detailed definitions are presented in this paper. Using numerical simulations under various random failure and intentional attack scenarios, we measured the performance of BAEE and compared it with the Fiedler-vector-based strategy for TOC. The results are very promising and show that the proposed algorithm reaches higher connectivity robustness and efficiency function values than the compared strategy, which means that the network optimized by BAEE has robust connectivity and a highly efficient capability to exchange information.