An Energy-Efficient and Fault-Tolerant Convergecast Protocol in Wireless Sensor Networks

Simple graph theory is commonly employed in topology control for wireless sensor networks (WSNs). An inherent problem of small-granularity algorithms is their high computational complexity and large solution space when managing large-scale WSNs. In addition, the computed transmission paths have low fault tolerance because sensor nodes are unattended and wireless channels are fragile. This paper uses hyper-graph theory to address these practical problems and proposes a spanning hyper-tree algorithm (SHTa) to compute the minimum-transmitting-power set of delivery paths for WSN convergecast. The paper makes three main contributions: (1) we present a novel hyper-graph model that abstracts large-scale, high-connectivity WSNs into a robust hyper-tree infrastructure; (2) we present a precise mathematical derivation that solves the “hyper-tree existence” problem; (3) we propose SHTa to compute the set of delivery paths, which forms the minimum-power-transmitting convergecast hyper-tree. Variable-scale hyper-edges serve as computing units, which limits the solution space and reduces computational complexity. Mutual backup delivery paths within one hyper-edge improve fault tolerance. Experimental results show that SHTa computes paths with shorter latency and lower energy consumption than previous algorithms. Furthermore, in dynamic experimental scenarios, SHTa retains robust transmission quality and high fault tolerance.


Introduction
Self-organized wireless sensor networks can be used to cooperatively detect and perceive real objects. Sensors can communicate and exchange information among themselves without human intervention. This is achieved by integrating technologies, including sensors, embedded calculations, distributed information processing, and wireless communication. Wireless sensor networks have huge potential in civil and military applications, such as smart grid, smart home, healthcare monitoring, and intelligent transport.
Self-organized wireless sensor networks are made up of highly distributed systems of small-size, wireless unattended sensors. Each sensor is capable of detecting devices' current operating conditions, such as temperature, noise, vibration, or output signals. This data is preprocessed, transmitted, and exchanged in a machine-to-machine (M2M) network [1,2]. There is a need for reliable, scalable, and smart protocols and algorithms for self-organized M2M networks or sensor networks.
In traditional communication networks, simple graph theory is commonly used [3,4]. However, a large-scale self-organized wireless sensor network consists of hundreds or thousands of nodes with a complex topology, so a large number of control messages are required to establish transmission paths. Moreover, because sensor nodes and wireless communication links have low reliability, many real-time control messages are needed to maintain an established path. These tasks use a significant amount of bandwidth and consume extra energy.
To solve this problem, we use hyper-graph theory and propose the Spanning Hyper-Tree algorithm (SHTa) to create a concise and robust hyper-graph infrastructure for large-scale, high-connectivity self-organized wireless networks. To the best of our knowledge, it is the first hyper-graph model for self-organized wireless network architecture.
International Journal of Distributed Sensor Networks
Because a dynamic hyper-edge is the minimum computing unit during routing in this type of hyper-graph architecture, fewer packets are used, which saves energy and prolongs the network's lifetime. Multiple connected pairs within a hyper-edge provide high bandwidth and a low loss rate during transmission, which effectively improves network fault tolerance. Moreover, SHTa solves the "hyper-tree existence uncertainty problem," a new problem that does not arise in the simple graph model. The axiom "any graph has its spanning tree" does not hold for hyper-graphs; that is, not every hyper-graph has a loop-free spanning hyper-tree. SHTa provides an effective method for spanning a hyper-tree, and we give a strict mathematical proof of the certainty theorem.
The remainder of the paper is organized as follows. Section 2 introduces background material on wireless communication network architecture and optimal routing problems. The hyper-graph model of wireless self-organized sensor networks is presented in Section 3. Section 4 describes SHTa in detail, followed by a proof of its validity. Section 5 presents the computer simulation and evaluation. Finally, Section 6 concludes the paper and outlines future research.

Convergecast with Data Aggregation in Wireless Sensor Networks
For the peer-to-peer (P2P) communication model, the Dijkstra and Bellman-Ford algorithms are often employed to build a shortest path tree (SPT), as in OSPF, which is used in IP backbone networks. Each OSPF router stores an SPT rooted at itself, and packets are forwarded along the branches of the SPT to achieve the minimum cost. Unlike P2P communication, a self-organized wireless sensor network collects data from every sensor node at the "Sink," which is called convergecast. During transmission, data aggregation is used to eliminate redundancy in the collected data. Many algorithms have been proposed to establish a data aggregation tree, such as EADAT [5], E-Span [6], and HEED [7]. These algorithms set the transmitting energy consumption as the link weight and build an SPT as the aggregation tree. Reference [8] presented the DCTC algorithm to detect and track a mobile target. DCTC uses Dijkstra's algorithm to establish the collection tree, which is also an SPT.
Because of data aggregation in the intermediate nodes, not every data packet is transmitted all the way from source to destination. A wireless sensor network can be generalized as a data centre (DC), and the optimum number of transmissions required per datum in the DC is equal to the number of edges in the Minimum Steiner Tree (MST). Therefore, the MST, not the SPT, is the true minimum-total-cost tree in convergecast protocols with data aggregation. Figure 1 shows an example that explains this optimization problem. Three nodes transmit information to the Sink. The SPT and MST are shown in Figures 1(a) and 1(b), respectively. Table 1 presents the two routing configurations: the total cost of the SPT method is 3.82, which is larger than that of the MST. Therefore, the MST is better than the SPT. If edge weights are defined as energy consumption, the MST is exactly the optimal energy consumption tree in the wireless sensor network.
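The difference between the two trees can be sketched on a toy topology. The graph below, its integer weights, and the function names are hypothetical and are not the values of Figure 1 or Table 1; the Steiner tree is found by brute force, which is only feasible for very small graphs.

```python
import heapq
from itertools import combinations

# Hypothetical topology: a sink, one relay "r", and three sources.
EDGES = {
    ("sink", "s1"): 20, ("sink", "s2"): 20, ("sink", "s3"): 20,
    ("sink", "r"): 15,
    ("r", "s1"): 6, ("r", "s2"): 6, ("r", "s3"): 6,
}
SOURCES = ["s1", "s2", "s3"]


def adjacency(edges):
    adj = {}
    for (u, v), w in edges.items():
        adj.setdefault(u, []).append((v, w))
        adj.setdefault(v, []).append((u, w))
    return adj


def spt_aggregated_cost(edges, root, sources):
    """Sum of distinct SPT edges used by the sources (each edge paid once)."""
    adj = adjacency(edges)
    dist, parent = {root: 0}, {root: None}
    heap = [(0, root)]
    while heap:  # Dijkstra from the sink
        d, u = heapq.heappop(heap)
        if d > dist[u]:
            continue
        for v, w in adj[u]:
            if d + w < dist.get(v, float("inf")):
                dist[v], parent[v] = d + w, u
                heapq.heappush(heap, (d + w, v))
    used = set()
    for s in sources:  # walk each source back to the sink
        node = s
        while parent[node] is not None:
            used.add(frozenset((node, parent[node])))
            node = parent[node]
    return sum(w for (u, v), w in edges.items() if frozenset((u, v)) in used)


def connects(subset, terminals):
    """True if the chosen edges join all terminals into one component."""
    adj = adjacency({e: 0 for e in subset})
    seen, stack = {terminals[0]}, [terminals[0]]
    while stack:
        u = stack.pop()
        for v, _ in adj.get(u, []):
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return all(t in seen for t in terminals)


def steiner_cost_bruteforce(edges, terminals):
    """Cheapest edge subset connecting all terminals (exponential search)."""
    best = float("inf")
    for k in range(1, len(edges) + 1):
        for subset in combinations(edges, k):
            if connects(subset, terminals):
                best = min(best, sum(edges[e] for e in subset))
    return best


print(spt_aggregated_cost(EDGES, "sink", SOURCES))          # 60
print(steiner_cost_bruteforce(EDGES, ["sink"] + SOURCES))   # 33
```

In this toy case every source has a direct shortest path to the sink, so the SPT pays for three long links, whereas the Steiner tree shares the relay link once and is cheaper overall.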

Hypergraph Model for Wireless Self-Organized Sensor Networks
Whether for IP backbone networks, cellular mobile networks, or ad hoc networks, simple graph theory is the main tool for research on architecture control [8-11] and routing protocol computation [3,4,12-14]. In large-scale wireless self-organized sensor networks, however, the number of sensor nodes can be hundreds or thousands of times that of a backbone or mobile network. Each node can connect to any neighbour through its omnidirectional antenna, which creates high node connectivity and complicates topology control. A simple graph algorithm with fine granularity often has high computational complexity and uses a large amount of memory. Furthermore, in wireless sensor networks, a single transmitting path has low fault tolerance because of the low reliability of sensor nodes and wireless links. During data transmission, many control messages must be transmitted frequently to maintain the connectivity of a delivery path, which may use much of the links' bandwidth and consume a significant amount of energy.
To solve this problem, hyper-graph theory is used as a novel mathematical tool to generalize high-connectivity wireless self-organized networks into a concise and robust hyper-graph infrastructure. As far as we know, it is the first hyper-graph architecture model for wireless sensor networks. In the model, special nodes and the edges connecting them are generalized as hyper-edges. Because the hyper-edge is the minimum computing unit, fewer extra packets are used as hyper-edges grow, and energy consumption is effectively reduced.
Definition 1. Let $X = \{x_1, x_2, \ldots, x_n\}$ be a finite set, and let $\varepsilon = \{E_i \mid i \in I\}$ be a family of subsets of $X$. If the following two conditions are satisfied: (1) $E_i \neq \emptyset$ for all $i \in I$; (2) $\bigcup_{i \in I} E_i = X$, then $H = (X, \varepsilon)$ is called a hyper-graph.
We describe a wireless self-organized sensor network as a hyper-graph $H = (X, \varepsilon)$, in which $X = \{x_1, x_2, \ldots, x_n\}$ is the set of sensor nodes. Special characteristics of nodes are represented as a hyper-edge, that is, $E_i = \{N_1, N_2, \ldots, N_j, e_1, e_2, \ldots, e_k\}$, and the hyper-edge set is $\varepsilon = \{E_1, E_2, \ldots, E_m\}$. It is obvious that a cluster in a simple graph is a special type of hyper-edge, and the hyper-graph extends the cluster concept.
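A minimal sketch of this representation, assuming the standard hyper-graph conditions (every hyper-edge is a non-empty subset of X, and the hyper-edges together cover X). The function names and the example node identifiers are hypothetical.

```python
def is_hypergraph(vertices, hyperedges):
    """Check the two assumed conditions for H = (X, E).

    vertices:   frozenset of node identifiers (the set X)
    hyperedges: dict mapping a hyper-edge name to a frozenset of nodes
    """
    covered = set()
    for members in hyperedges.values():
        # Each hyper-edge must be a non-empty subset of X.
        if not members or not members <= vertices:
            return False
        covered |= members
    # Together the hyper-edges must cover every node of X.
    return covered == vertices


def vertex_degree(node, hyperedges):
    """Number of hyper-edges that contain the given node."""
    return sum(1 for members in hyperedges.values() if node in members)


# Hypothetical example: two clusters of sensors that share node 3.
X = frozenset({1, 2, 3, 4, 5})
E = {"E1": frozenset({1, 2, 3}), "E2": frozenset({3, 4, 5})}
print(is_hypergraph(X, E), vertex_degree(3, E))
```

A cluster in the simple-graph sense corresponds here to one entry of the dictionary, which is why a node such as 3 can belong to more than one hyper-edge.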
In the hyper-graph model, we should also establish an MST for optimal convergecast. However, the binary relation between hyper-edges and vertices in a hyper-graph is not the one-to-one mapping between vertices and edges found in a simple graph; it is more complex. Therefore, the axiom "any graph has its spanning tree" is invalid for hyper-graphs; that is, not every hyper-graph has a loop-free spanning hyper-tree. An example of a hyper-graph with no hyper-tree is shown in Figure 2. The two hyper-edges are split, and by Theorem 2 a hyper-tree does not exist, because the bipartite graph $G_H$ shown in Figure 2(b) contains the loop (1-Hyperedge 1-10-Hyperedge 2-1).

Theorem 2. Hyper-graph H is a hyper-tree, if and only if the bipartite graph G H is a tree.
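Theorem 2 suggests a direct test, sketched here under the assumption that $G_H$ is the usual incidence graph with one vertex per node, one vertex per hyper-edge, and an edge for each membership. The function names and the membership lists in the example are hypothetical; only the shared nodes 1 and 10 follow the loop described for Figure 2.

```python
def incidence_graph(hyperedges):
    """Bipartite graph: ("x", node) on one side, ("E", name) on the other."""
    adj = {}
    for name, members in hyperedges.items():
        e = ("E", name)
        adj.setdefault(e, set())
        for node in members:
            x = ("x", node)
            adj[e].add(x)
            adj.setdefault(x, set()).add(e)
    return adj


def is_tree(adj):
    """A graph is a tree iff it is connected and has |V| - 1 edges."""
    if not adj:
        return False
    n_edges = sum(len(nbrs) for nbrs in adj.values()) // 2
    start = next(iter(adj))
    seen, stack = {start}, [start]
    while stack:
        u = stack.pop()
        for v in adj[u]:
            if v not in seen:
                seen.add(v)
                stack.append(v)
    return len(seen) == len(adj) and n_edges == len(adj) - 1


def is_hypertree(hyperedges):
    return is_tree(incidence_graph(hyperedges))


# Two hyper-edges sharing nodes 1 and 10 close the loop
# 1 - Hyperedge 1 - 10 - Hyperedge 2 - 1, so no hyper-tree exists.
looped = {"H1": {1, 2, 3, 10}, "H2": {1, 10, 11, 12}}
# Sharing a single node keeps the incidence graph acyclic.
chained = {"H1": {1, 2, 3}, "H2": {3, 4, 5}}
print(is_hypertree(looped), is_hypertree(chained))  # False True
```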
To ensure that a hyper-graph certainly has a hyper-tree, we proved that two conditions must be satisfied. The precise mathematical proof is as follows. First, if there is no hyper-cycle in $H = (X, \varepsilon)$, the proposition is true. Otherwise, if there are hyper-cycles, a break-cycle method is used. In a hyper-cycle $C$, three connecting hyper-edges $E_i$, $E_j$, $E_k$ can always be found, and there must be a 2-degree chained hyper-edge among them; after it is removed, $H$ is still connected. Repeating this process until no hyper-cycle remains yields a hyper-graph $T = (X', \varepsilon')$ with $X' = X$ and $\varepsilon' \subseteq \varepsilon$, which is the final spanning hyper-tree.
In Section 4, we present a novel topology control algorithm that splits hyper-edges, establishes a hyper-graph satisfying the above two conditions, and spans the minimum hyper-tree for minimum-energy-consumption convergecast.

Minimum Spanning Hyper-Tree Algorithm
In the implementation of SHTa, a generalized synchronization mechanism based on "synchronous rounds" is used. We first describe this synchronization mechanism.
Time synchronization is an important feature of distributed systems, including wired and wireless communication systems. Many time synchronization schemes have been designed, including GPS [16] and the Network Time Protocol (NTP) [17] used in IP network applications. In M2M and sensor networks, time synchronization is also used frequently for various purposes, including sensor data fusion, coordinated actuation, and power-efficient duty cycling: for example, integrating a time series of proximity detections into a velocity estimate; measuring the time of flight of sound to localize its source; distributing a beamforming array; suppressing redundant messages by recognizing that they describe duplicate detections of the same event by different sensors; or supporting energy-efficient scheduling and power management. Several good time synchronization algorithms, such as Reference Broadcast Synchronization (RBS) [18], TINY/MINI-SYNC [19], and Level Synchronization [20], have been proposed to provide time accuracy in wireless self-organized sensor networks.
Compared with accurate time-slot synchronization, a generalized synchronization mechanism based on "synchronous rounds" saves the large number of timescale check packets that are otherwise needed to ensure accurate time synchronization, and it reduces the complexity of communication protocol design, thereby reducing transmission energy consumption. In a generalized synchronous mechanism, each processor unit completes two steps during one synchronous round: in the first step, the processor transmits event-driven messages to its neighbors; in the second step, once it has received any valid messages, the processor switches its current state according to a state transition function.
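The two-step round can be sketched as a small simulation. This is an illustrative model only: the hop-count transition function, the rule that a node sends only after its state changed, and all names are assumptions chosen for the example, not part of SHTa.

```python
def run_synchronous_rounds(neighbors, initial_state, transition, max_rounds=100):
    """Simulate generalized synchronous rounds on a fixed topology.

    neighbors:     dict node -> list of neighbouring nodes
    initial_state: dict node -> state (None means "no information yet")
    transition:    function(current_state, received_messages) -> new state
    """
    state = dict(initial_state)
    # Event-driven sending: only nodes whose state just changed transmit.
    changed = {n for n, s in state.items() if s is not None}
    rounds = 0
    while changed and rounds < max_rounds:
        # Step 1: every active node sends its state to its neighbours.
        inbox = {n: [] for n in neighbors}
        for sender in changed:
            for nb in neighbors[sender]:
                inbox[nb].append(state[sender])
        # Step 2: nodes that received valid messages apply the transition.
        changed = set()
        for node, messages in inbox.items():
            if messages:
                new_state = transition(state[node], messages)
                if new_state != state[node]:
                    state[node] = new_state
                    changed.add(node)
        rounds += 1
    return state, rounds


def hop_transition(current, messages):
    """Deterministic transition: keep the smallest known hop count to the sink."""
    candidate = min(messages) + 1
    return candidate if current is None or candidate < current else current


# Hypothetical chain topology: sink - a - b - c
topology = {"sink": ["a"], "a": ["sink", "b"], "b": ["a", "c"], "c": ["b"]}
start = {"sink": 0, "a": None, "b": None, "c": None}
print(run_synchronous_rounds(topology, start, hop_transition))
```

All messages of a round are collected before any state changes, so the outcome does not depend on the order in which nodes are processed within a round.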
When the synchronous network is a deterministic system, a state transition function given the same valid input must produce the same output every time. Mapping the two steps onto a sensor node processor, the following two operations are implemented. Without loss of generality, suppose SHTa is executing the $k$th "synchronous round." Each main hyper-tree $\varepsilon_m^k$ starts to search for a minimal power chain hyper-edge connecting it with its neighbours. Each node $v_i$ at the edge of $\varepsilon_m^k$ broadcasts an R_mPCHe message. Any node receiving this message executes SHTa as in Algorithm 2.
As soon as two main hyper-edges $\varepsilon_m^k$ and $\varepsilon_n^k$ confirm their common minimal power chain hyper-edge, Op_HC messages are broadcast in these three units. Any node that receives this message performs the operations described in Algorithm 3.
Whenever SHTa cannot find any new chained hyper-edge $\varepsilon_j^k$ or perform a consolidation operation, the algorithm stops and the spanning forest has merged into a hyper-tree. In the following, we prove that this hyper-tree is exactly a minimum spanning hyper-tree.
We restate the conclusion about the spanning hyper-tree produced by SHTa as follows. In a hyper-graph $H = (X, \varepsilon)$, let $\bigcup_j E_j = \{(X, \varepsilon) : 1 \le j \le k\}$ be one of the hyper-graph's spanning forests. If $e$ is the minimum weight chain hyper-edge in the set $\bigcup_j E_j$, there must be a hyper-tree made up of $\bigcup_j E_j$ and $e$. Moreover, this hyper-tree is the minimum hyper-tree among all spanning hyper-trees that include $\bigcup_j E_j$.
Proof by Contradiction. Suppose that the conclusion is false; that is, there is a hyper-tree $T$ that includes $\bigcup_j E_j$ but does not include $e$, and $T$ is strictly less than any other hyper-tree that includes $\bigcup_j E_j$ and $e$. Now add $e$ to $T$ to obtain the graph $T'$. Obviously, there is a cycle in $T'$ that includes another chain hyper-edge $e'$, with $e' \neq e$ and $e' \in \bigcup_j E_j$. By definition, $\mathrm{weight}(e') \geq \mathrm{weight}(e)$. Hence $e'$ can be safely deleted from $T'$, which yields another hyper-tree $T''$ that includes $\bigcup_j E_j$ and $e$. The power of $T''$ is not larger than the power of $T$, which contradicts the assumption about $T$. The supposition is therefore wrong, and the original proposition is true.
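The exchange argument above has the same shape as the classical cut property for minimum spanning trees. As an analogy only, the following sketch runs Borůvka-style fragment merging on a simple graph: fragments stand in for main hyper-edges, the cheapest outgoing edge stands in for the minimal power chain hyper-edge, and the union step stands in for consolidation. It is not SHTa itself, and the example graph and weights are hypothetical.

```python
def boruvka_mst(nodes, edges):
    """Minimum spanning forest by repeated fragment merging.

    nodes: iterable of node identifiers
    edges: list of (weight, u, v) tuples for an undirected graph
    """
    parent = {n: n for n in nodes}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    tree = []
    while True:
        # Each fragment picks its cheapest outgoing edge; comparing the
        # full (weight, u, v) tuple breaks ties consistently.
        cheapest = {}
        for w, u, v in edges:
            ru, rv = find(u), find(v)
            if ru == rv:
                continue
            for root in (ru, rv):
                if root not in cheapest or (w, u, v) < cheapest[root]:
                    cheapest[root] = (w, u, v)
        if not cheapest:
            break  # no fragment can be extended any further
        # Merge fragments along the selected edges.
        for w, u, v in sorted(set(cheapest.values())):
            ru, rv = find(u), find(v)
            if ru != rv:
                parent[ru] = rv
                tree.append((w, u, v))
    return tree


nodes = ["a", "b", "c", "d", "e"]
edges = [(1, "a", "b"), (2, "b", "c"), (3, "a", "c"), (4, "c", "d"),
         (5, "d", "e"), (6, "c", "e"), (7, "b", "d")]
mst = boruvka_mst(nodes, edges)
print(sorted(mst), sum(w for w, _, _ in mst))
```

The loop stops when no fragment has an outgoing edge, which mirrors the stopping rule in which no new chained hyper-edge or consolidation is possible.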

Computer Simulation
This section evaluates the performance of the novel algorithm using simulation. First, seven different sensor scenes are studied: in a 200 × 200 m² square area, sensor nodes are uniformly dispersed, with the number of nodes ranging from 50 to 350 in steps of 50. Each node has a radio range of 40 m. We use this environment to simulate how network density affects the energy consumed while spanning the tree or hyper-tree. Then, two transmission performance metrics, average latency and packet loss ratio, are evaluated when data packets are delivered following SHTa, in comparison with Directed Diffusion (DD) [21] and its improved algorithm EADD [22]. We use the same parameters as [21]: (1) the 802.11 MAC protocol is used to ensure data link connectivity; (2) the idle power dissipation is set to about 35 mW, the receiving power dissipation to 395 mW, and the transmitting power dissipation to 660 mW; (3) events are modelled as 64 bytes and information control packets as 36 bytes. Finally, the simulations use a fixed event generation model: after every ten-second interval, ten nodes are randomly selected as sensor sources and generate constant bit rate (CBR) data streams with packet intervals of 0.1 seconds. The duration of each data stream is 5 seconds.
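The density of the seven scenes can be explored with a short script. This is a sketch under assumptions: nodes are drawn independently and uniformly, two nodes are neighbours when their distance is at most the radio range, and the seed and function name are hypothetical, so the printed values are not those reported in the paper.

```python
import math
import random
import statistics


def degree_stats(n_nodes, side=200.0, radio_range=40.0, seed=0):
    """Return (max, mean, standard deviation) of node degree for one scene."""
    rng = random.Random(seed)
    points = [(rng.uniform(0, side), rng.uniform(0, side))
              for _ in range(n_nodes)]
    degree = [0] * n_nodes
    for i in range(n_nodes):
        for j in range(i + 1, n_nodes):
            # Two nodes are neighbours if they are within radio range.
            if math.dist(points[i], points[j]) <= radio_range:
                degree[i] += 1
                degree[j] += 1
    return max(degree), statistics.mean(degree), statistics.pstdev(degree)


# Seven scenes: 50 to 350 nodes in steps of 50.
for n in range(50, 351, 50):
    d_max, d_avg, d_std = degree_stats(n)
    print(f"{n:3d} nodes: max={d_max:3d} avg={d_avg:6.2f} std={d_std:5.2f}")
```

Ignoring border effects, the expected degree grows roughly linearly with the number of nodes, since each node covers a disc of fixed area inside a square of fixed size.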
We first compute the maximum and average node degree and the standard deviation of node degree in the seven scenes to analyze network density, as shown in Table 2. Figure 3 shows the average dissipated energy per packet as a function of network size. DD and EADD have almost the same energy consumption, about half that of flooding. SHTa consumes less energy than DD and EADD. As the network size increases, SHTa can save 23.7% energy compared with that of directed diffusion. This is because SHTa uses the hyper-edge as its computing granularity: with the consolidation operation, hyper-edges become larger and fewer during SHTa processing, which is completely different from operating on individual nodes. It therefore effectively reduces the number of overhead packets and the size of the solution space, which reduces energy consumption and prolongs the network's lifetime.
Figure 4 plots the average latency observed as a function of network size. Because they use the shortest path, the DD and EADD algorithms have lower delays than the flooding algorithm. Because more available energy gives nodes a faster response time, EADD selects these energy-rich nodes as relay stations and achieves a lower delay than DD. Unlike the shortest-edge path algorithms, SHTa uses hyper-edges, in which a set of identified nodes and edges forming multiple paths transmit information at the same time; therefore SHTa achieves the lowest delay among the four algorithms.
Algorithm 2. A node v_j ∈ V receives an R_mPCHe message and implements the following operation in SHTa:
    ...
    {
        ignore this R_mPCHe packet;
        return UNSUCCESS;  // cannot recover the new delivery path
    }
    if (MPCHe.Source_E_ID == v_j.He_ID)  // v_j and R_mPCHe belong to the same main hyper-edge
    {
        ignore this R_mPCHe;
        R_mPCHe.ttl = R_mPCHe.ttl - 1;
    }
    else  // v_j receives this R_mPCHe for the first time
        if (v_j ∈ ε_n^k)
            R_mPCHe.Tran_Pw = Pw(v_i, v_j);  // sub-minimal power is transmission power in ε^k ...
Figure 5 presents another performance metric, the packet loss rate. SHTa and flooding have lower values than DD and EADD. Compared with the single path in DD or EADD, the multiple delivery paths within one hyper-edge in SHTa, or the duplicate transmissions in the flooding algorithm, improve transmission reliability.
We also study the impact of dynamics in wireless sensor networks with 10%, 20%, and 30% of nodes failing at random. Figure 6 presents the average dissipated energy per packet as a function of network size. As the failure percentage increases from 10% to 30%, both DD and SHTa consume significantly more energy per transmitted packet; the increase for DD is 33.40%, which is larger than the 28.48% for SHTa. Figure 7 presents the average latency measurement. The results show that SHTa also provides the lower average delay for the various fault percentages, for example, 0.259 s on average when 30% of nodes fail. This is mainly because SHTa provides mutual backup delivery paths within one hyper-edge, which improves fault tolerance. In the final experiment, we evaluated the packet loss rate as the percentage of faulty nodes increases. Figure 8 clearly shows that SHTa drops fewer data packets than DD: 8.82% when 30% of nodes fail, whereas DD performs worse, at 17.12%, in the same situation. These results show that SHTa is highly robust and can offer a significant performance gain in networks with a high fault percentage.
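The benefit of mutual backup paths can be illustrated with a deliberately simplified reliability model. The assumptions are ours and are not taken from the simulation: relays fail independently with the same probability, a single path needs every relay alive, and a hyper-edge hop succeeds if at least one of its backup relays is alive. The function and parameter names are hypothetical, and the numbers are not meant to reproduce Figures 6 to 8.

```python
def single_path_delivery(p_fail, hops):
    """Probability that every relay on one path is alive."""
    return (1.0 - p_fail) ** hops


def hyper_edge_delivery(p_fail, hops, backups):
    """Probability that each hop has at least one live relay
    among `backups` mutually redundant relays."""
    return (1.0 - p_fail ** backups) ** hops


# Failure percentages used in the dynamic experiments; hop count and
# number of backups are illustrative choices.
for p in (0.10, 0.20, 0.30):
    single = single_path_delivery(p, hops=4)
    backed = hyper_edge_delivery(p, hops=4, backups=2)
    print(f"p_fail={p:.2f}: single path={single:.3f}, two backups={backed:.3f}")
```

Under these assumptions the per-hop failure probability drops from p to p raised to the number of backups, which is why redundancy inside a hyper-edge helps most when the failure percentage is high.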

Conclusion
To consistently provide reliable communication services for machine-to-machine applications, scalable and smart network architecture control algorithms are needed for wireless self-organized communication networks. This paper generalizes large-scale wireless self-organized sensor networks into a concise and robust hyper-graph infrastructure and proposes an algorithm called SHTa to obtain the minimum spanning hyper-tree. We verified the algorithm's validity by mathematical deduction and computer simulation. The experimental results show that SHTa saves more energy and has lower latency and packet loss rates than previous algorithms, and that it is more robust in the dynamic experiments. These results show that SHTa is an effective technique for wireless sensor networks and M2M applications.