A Cluster-Based Consensus Algorithm in a Wireless Sensor Network

In this paper, we propose an average connectivity degree cluster (ACDC) gossip algorithm to improve the convergence speed and the accuracy of consensus when a common decision about a certain phenomenon is needed in a distributed network. We analyze the effects of the initial values, the network topology (regular and irregular), and the number of clusters on the convergence rate of the algorithm as well as on the accuracy of the value at consensus. A utility function based on two parameters, iteration count and relative error, is developed to help network designers make an optimal decision according to their requirements. An irregular sensor model based on the degree of irregularity (DOI) of the radio range is introduced to evaluate the robustness of the algorithm. The simulation results demonstrate that, for any initial value and network topology, the proposed ACDC gossip algorithm yields results that are 50% closer to the real average value than the referenced standard gossip and grid cluster gossip algorithms. With different DOI values, the ACDC gossip algorithm still reaches a lower relative error than the other gossip algorithms, which demonstrates that the algorithm is robust enough to be executed in the network.


Introduction
The advancement of radio-equipped modules and the miniaturization of electronic components motivate the development of wireless sensor networks (WSNs), in which numerous distributed sensor nodes are deployed to perform a wide variety of applications, such as monitoring, surveillance, security, health care, and load balancing [1,2]. Nodes are usually deployed randomly in an ad hoc manner, and for certain tasks, the detection values at different nodes are conditionally independent. Conventionally, tasks are executed in a centralized manner that is straightforward to implement. However, this approach does not scale with an increasing number of nodes, and it is sometimes expensive or even impossible to deploy and maintain such a central controller [3]. Thus, management techniques and distributed decision-making algorithms that organize these multiple distributed agents to carry out a task cooperatively have been extensively studied in recent years. Individual detection by one node in a distributed dynamic WSN is not sufficient for decision making without knowledge of the global network. A consistent decision must be reached among these geographically dispersed sensor nodes through some type of information exchange mechanism. Reaching this decision, which is based on common interests, is referred to as reaching consensus using the detection values of the sensor nodes.
International Journal of Distributed Sensor Networks
Although the consensus algorithm has been thoroughly studied in the control area, it is also of vital importance in distributed sensor networks. It acts as a way to achieve a globally optimal decision in a totally decentralized way, without sending all the sensors' data to a fusion center [4]. Recently, the most attractive consensus algorithm has been the gossip algorithm [5], in which pairs of nodes are selected at random to exchange and update their values. Compared with a routing algorithm, it is robust and easily implemented. It is a distributed iterative information exchange scheme, so little effort needs to be spent on route discovery and route maintenance. However, random information exchange between neighbors also leads to overhead and increases the time needed to reach consensus in the network. In addition, the connectivity of the network affects the accuracy of the final consensus value. Therefore, the following two issues should be taken into account when designing the gossip strategy in the network.
(i) Decrease the amount of time required to reach consensus, that is, the convergence rate of the algorithm [6].
(ii) Improve the accuracy of the node value when reaching consensus, that is, the convergence accuracy of the algorithm.
Much research has focused on improving the convergence rate of the gossip algorithm. Geographic gossip, which combines the gossip algorithm with geographic routing, was recently proposed [7]. This algorithm increases the diversity of the pairwise gossip operation by randomly choosing pairwise gossip nodes within the entire network rather than selecting them from adjacent nodes. An improved version of geographic gossip is path averaging [8], in which the average is computed at each node along the route between the exchanging pair of nodes. However, in these two mechanisms, the probability of packet loss increases when messages are sent along longer routes. Additionally, as the distance between the pairwise nodes that are exchanging information increases, extra energy is consumed to set up and maintain the two-way route between them. The broadcast gossip algorithm, which takes advantage of the broadcast characteristic of the wireless medium, was proposed in [9]. This scheme enables all the neighbors of the wake-up node to listen to the data transmission and perform updates. Another approach that makes use of the broadcast characteristic of the wireless medium is eavesdropping gossip [10], in which each node can overhear the data broadcast by its neighbors and chooses its exchange partner based on all the data that it has received. Subsequently, cluster-based gossip algorithms have been proposed [3,11,12]. In [3], each node has a timer that is decremented by 1 at each time step, and a node becomes a cluster head when its timer expires. This cluster formation method is simple and easy to implement and to distribute. However, it is impractical in reality because some of the nodes may not have a chance to join a cluster. In [11], an algorithm that combines clustering and geographic routing was proposed for a large-scale sensor network. The network is first divided into grid clusters, and then the standard gossip algorithm is executed to reach a local consensus in each cluster area. A representative node is subsequently chosen in each cluster, and pairwise gossip is executed among these representative nodes via multihop routing until the consensus goal is reached. However, in the cluster stage, the nodes are divided into groups according to their locations. Thus, an imbalance in the number of nodes in different clusters slows the convergence rate for reaching consensus. In [12], the authors analyzed the data transmission scheme in a cluster-based wireless network, but they did not mention how to form a cluster, and if a cluster head fails, the whole network can no longer reach consensus.
Regarding cluster mechanisms, different clustering algorithms have been proposed. For example, the Lowest Identifier (LID) algorithm [13], which chooses the node with the lowest ID as the cluster head, is a simple clustering method. Its cluster formation method is similar to that of [3], and its cluster head selection method is similar to that of [11]. The Highest-Connectivity Degree Algorithm (HCDA) [14] is a connectivity-based cluster formation algorithm that relies on the number of neighbors of a node. In HCDA, there is no restriction on the number of nodes in a cluster. When the number of nodes in one cluster is too large, the burden on the cluster head becomes too heavy, which may lead to a communication bottleneck. The lifetime of the whole network is then short because of the imbalance of the network load. There are also other algorithms, such as the distributed clustering algorithm [15], distributed mobility adaptive clustering [16], and the weighted clustering algorithm [17], which introduce weights into the selection of a cluster head. None of the above algorithms considers the impact of the number of nodes in one cluster on the network capacity and throughput. Therefore, in this paper, we introduce a throughput/capacity-aware cluster mechanism for the gossip algorithm and evaluate its convergence performance.
The contributions of this paper are the following.
(i) A new Average Connectivity Degree Cluster (ACDC) based gossip algorithm is proposed to improve the convergence speed and the consensus accuracy.
(ii) Most research studies regarding consensus only consider the network topology as a random geometric graph (RGG). Few studies have considered the impact of regular, random, and small-world graphs on the convergence rate of consensus [18,19]. No study has analyzed the convergence rate for an irregular network topology. Hence, in this paper, we investigate the proposed gossip algorithm in irregular network topologies, such as C-shaped, I-shaped, and O-shaped topologies.
(iii) A utility function that combines the convergence rate and consensus accuracy is proposed to provide an optimal choice reference for WSN users with regard to their purposes.
(iv) An irregular sensor model is introduced to evaluate the robustness of the algorithm.
This paper is organized as follows: Section 2 presents the proposed network model and describes the problem to be solved. Section 3 provides details about the ACDC-based gossip algorithm. Section 4 summarizes the algorithm performance evaluation and analysis. Section 5 presents our conclusions.

Network Model and Problem Formulation
Assume that $n$ static sensor nodes are independently deployed in a unit square area and that the network topology is represented as $G = (V, E, r)$, where $V = \{1, 2, 3, \ldots, n\}$ represents the set of nodes and $r$ is the connectivity radius. A pair of nodes $(i, j)$ is connected and can communicate directly if their Euclidean distance is smaller than $r$. The edge set is stored in $E$, and the set of one-hop neighbors of node $i$ is denoted by $N(i) = \{ j \in V ; (i, j) \in E \}$. The degree of this node, which is equal to its number of neighbors, is defined as $d_i = |N(i)|$ [20].
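The network model above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names `random_geometric_graph` and `is_connected`, the fixed random seed, and the radius value 0.3 are assumptions made for the example, not part of the original model description.

```python
import math
import random

def random_geometric_graph(n, r, seed=0):
    """Place n nodes uniformly in the unit square and link pairs closer than r."""
    rng = random.Random(seed)
    pos = [(rng.random(), rng.random()) for _ in range(n)]
    neighbors = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            # symmetric link when the Euclidean distance is below the radius
            if math.dist(pos[i], pos[j]) < r:
                neighbors[i].add(j)
                neighbors[j].add(i)
    return pos, neighbors

def is_connected(neighbors):
    """Depth-first search from node 0; True if every node is reached."""
    seen, stack = {0}, [0]
    while stack:
        for j in neighbors[stack.pop()]:
            if j not in seen:
                seen.add(j)
                stack.append(j)
    return len(seen) == len(neighbors)

pos, N = random_geometric_graph(n=100, r=0.3)
degree = {i: len(N[i]) for i in N}   # degree of a node = size of its one-hop neighbor set
```

Checking `is_connected(N)` before running a consensus experiment is useful because average consensus is only guaranteed on a connected topology.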
Each node $i$ in the network has an initial value $x_i(0)$, representing an observation of some type. The initial value vector of all nodes is defined as $x(0) = [x_1(0), x_2(0), \ldots, x_n(0)]$. In this paper, we deal with average consensus, which means that the consensus equilibrium value is equal to the average of the initial values held by the nodes. It has been reported that average consensus is reached when the communication topology is fixed and connected [21]. A connected network is one in which a path exists between every pair of nodes [20]. The average of these values is $x_{ave} = (1/n) \sum_{i=1}^{n} x_i(0)$ [21]. At the $t$th iteration, each node $i$ maintains an estimate $x_i(t)$ that is generally different from that of the other nodes. A vector $x(t) = [x_1(t), x_2(t), \ldots, x_n(t)]$ collects the values of all the nodes. Suppose the network is connected and the communication relationship is symmetric; that is, node $i$ and node $j$ can correctly receive information from each other over the wireless link between them in a given time slot. The ultimate goal of consensus is to drive the estimate vector $x(t)$ arbitrarily close to the average vector $x_{ave}\mathbf{1} = [x_{ave}, x_{ave}, \ldots, x_{ave}]$ with a minimal amount of information exchange. To match the distributed nature of a WSN, the gossip algorithm adopts an asynchronous time model to trigger each node to wake up and execute the algorithm. The clock in each node is assumed to tick according to a Poisson process.
During the gossip process, at the $(t-1)$th iteration, node $i$ randomly chooses a neighbor node $j$ with which to exchange information, and their values are updated according to (1).
The metric proposed in [20] is used to evaluate the convergence rate of reaching consensus. This metric is the normalized difference between the consensus value and the real average value.
Definition 1. For a randomized algorithm that uses $\epsilon$ as the constraint on the calculation, for any $0 < \epsilon < 1$, the averaging time $T_{ave}(\epsilon)$ is the earliest time at which the node value vector $x(t)$ is close to the average vector with a probability greater than $1 - \epsilon$ [20]; this is expressed in (2).
With (2), we define the purpose of the algorithm as satisfying the probability condition in (2) with the smallest iteration number $t$. In Section 3, we introduce a new algorithm to meet this goal.
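For reference, a common way to write the pairwise update and the averaging time is sketched below. It assumes the standard randomized gossip formulation of [20], with $x_i(t)$ the value of node $i$ at iteration $t$, $x_{ave}$ the average of the initial values, and $\mathbf{1}$ the all-ones vector. These forms are an assumption based on the verbal description in this section and may differ in notation from (1) and (2).

```latex
% (1) pairwise averaging between node i and its chosen neighbor j
x_i(t) = x_j(t) = \frac{x_i(t-1) + x_j(t-1)}{2}

% (2) epsilon-averaging time: earliest t at which the normalized error
%     exceeds epsilon with probability at most epsilon
T_{\mathrm{ave}}(\epsilon) = \sup_{x(0)} \inf \left\{ t :
  \Pr\!\left( \frac{\lVert x(t) - x_{\mathrm{ave}}\mathbf{1} \rVert}{\lVert x(0) \rVert} \ge \epsilon \right) \le \epsilon \right\}
```

The update in the first line preserves the sum of all node values, which is why the network average is the only possible consensus value under pairwise averaging.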

Average-Connectivity-Degree-Cluster-(ACDC-) Based Gossip Algorithm
In this section, we first introduce the average connectivity degree cluster (ACDC) scheme. Then, based on this cluster mechanism, we execute the gossip algorithm in the network to reach a consensus.
Assume that in an $n$-node wireless network, each node is capable of transmitting at $W$ bits per second. It has been shown [22] that the throughput obtained by a node in the network for a randomly chosen destination is only $\Theta(W/\sqrt{n})$ bits per second under a noninterference protocol, even if the nodes are optimally located, the traffic pattern is optimally assigned, and the transmission range is optimally chosen. Note that the expression $f(n) = \Theta(g(n))$ is asymptotic notation, meaning that as $n \to \infty$, the function $f(n)$ is equal to $g(n)$ to within a constant factor.
Suppose that the network is divided into $m$ clusters. The network system model is shown in Figure 1. Here, we define the cluster level as the first tier. The cluster head level forms the second tier. Assume that different clusters are connected by a backbone network and that the throughput of cluster $k$ is $\lambda_k$ ($k = 1, \ldots, m$). Then, the throughput of the network, $\lambda$, can be obtained by calculating the throughput of all the clusters in the network during a certain time interval. Usually, the throughput of each cluster is different because the number of nodes in each cluster varies. Normally, the capacity of the backbone network is constant. Consequently, when the throughputs of some clusters are much higher than those of others, congestion arises in the backbone network. In the worst case, the whole network may even be paralyzed.
Then, with the quantity defined in (3), the throughput of each cluster from the network perspective should satisfy (4) [18], expressed in bits per second.
From the cluster perspective, the throughput of each node is given by (5). The throughput of a cluster is composed of the throughput of each node in the cluster. Thus, according to (3), (4), and (5), we obtain (6). Consequently, if the relationship between the number of nodes and the number of clusters can be maintained as described in (6), the network can be stabilized. The number of nodes in each cluster can also be calculated as in (7). This cluster scheme is used to establish a stable cluster-based network. Under this scheme, the load in each cluster is the same. Compared with grid cluster-based gossip, this scheme alleviates the data transmission burden in certain clusters. The flow chart describing this algorithm is given in Figure 2.
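One way to arrive at a balance condition of this kind is sketched below. It is a hedged reading, not a transcription of (3)-(7): the symbols $m$ (number of clusters) and $W$ (per-node transmission rate), and the choice to equate the aggregate throughput of the cluster-head tier with the aggregate throughput inside one cluster, are assumptions. The result agrees with the cluster size of about $\sqrt{n}$ nodes used later in the text (10 clusters for the 100-node network).

```latex
% aggregate throughput of m clusters treated as nodes of the backbone tier
\lambda_{\text{backbone}} = m \cdot \Theta\!\left(\frac{W}{\sqrt{m}}\right)
                          = \Theta\!\left(W\sqrt{m}\right)

% aggregate throughput inside one cluster holding n/m nodes
\lambda_{\text{cluster}} = \frac{n}{m} \cdot \Theta\!\left(\frac{W}{\sqrt{n/m}}\right)
                         = \Theta\!\left(W\sqrt{n/m}\right)

% balancing the two tiers
W\sqrt{m} = W\sqrt{n/m}
\;\Longrightarrow\; m = \sqrt{n}, \qquad \frac{n}{m} = \sqrt{n}
```

For $n = 100$ this gives $m = 10$ clusters of 10 nodes each.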
In this algorithm, each node belongs to only one cluster. Although we obtain the optimal number of nodes in each [...] the cluster head is that it is more convenient and faster to broadcast the final decision to other nodes via the wireless links connected to it. If there are fewer than $(\sqrt{n} - 1)$ neighbors around the chosen cluster head, we first form the cluster and then choose the node with the second highest degree in the network. Then, we add some of its neighbors to the cluster until the number of nodes in the cluster equals $(\sqrt{n} - 1)$. If the square root in (7) is not an integer, the integer closest to it is used as the number of nodes in each cluster. After the cluster formation, the remaining nodes join the nearest cluster. Consequently, the variation in the number of nodes per cluster is small, and a balanced throughput can be obtained. Note that the clusters are formed virtually in the proposed method; the original connections between nodes in the network are not changed.
When consensus is reached in the different clusters, the gossip algorithm is executed at the cluster head level. Because the transmission radius of each node is constant, information exchange between cluster heads can only be accomplished by multihop transmission. When consensus is reached at the cluster head level, each cluster head broadcasts its value to the nodes that are located in its cluster and directly connected to it. Nodes update their values when they receive data from the cluster head. However, if a node is not directly connected to the cluster head, it must obtain the new value through its neighbors. In Figure 2, the part circled with a red dashed line is the ACDC cluster formation process.
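The cluster formation step can be illustrated with the following simplified Python sketch. It is not the flow chart of Figure 2: the function name `acdc_clusters`, the rule of recruiting the closest free neighbors first, and the use of the nearest cluster head as a proxy for the "nearest cluster" are assumptions made for this example.

```python
import math
import random

def acdc_clusters(pos, neighbors):
    """Greedy sketch: the highest-degree free node becomes a head and recruits
    free one-hop neighbors until the cluster holds about sqrt(n) nodes."""
    n = len(pos)
    size = round(math.sqrt(n))            # target nodes per cluster, head included
    cluster_of, heads = {}, []
    while len(heads) < round(n / size):
        free = [i for i in range(n) if i not in cluster_of]
        if not free:
            break
        # degree is counted on the original graph, since clusters are virtual
        head = max(free, key=lambda i: len(neighbors[i]))
        cluster_of[head] = head
        heads.append(head)
        members = [j for j in neighbors[head] if j not in cluster_of]
        members.sort(key=lambda j: math.dist(pos[head], pos[j]))  # assumed tie rule
        for j in members[: size - 1]:
            cluster_of[j] = head
    # leftover nodes join the cluster whose head is nearest (assumed proxy)
    for i in range(n):
        if i not in cluster_of:
            cluster_of[i] = min(heads, key=lambda h: math.dist(pos[i], pos[h]))
    return heads, cluster_of

rng = random.Random(1)
pos = [(rng.random(), rng.random()) for _ in range(100)]
neighbors = {i: {j for j in range(100)
                 if j != i and math.dist(pos[i], pos[j]) < 0.3}
             for i in range(100)}
heads, cluster_of = acdc_clusters(pos, neighbors)
```

Here `cluster_of[i]` is the head of the cluster that node `i` belongs to, so the mapping itself enforces that each node belongs to exactly one cluster.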

Numerical Results
The performance of the gossip algorithm is evaluated based on the system model developed in the previous section. Suppose that 100 independent nodes are randomly deployed in a 100 × 100 m² area and that each node has a transmission radius of 30 m. The convergence rate and convergence accuracy of our proposal are compared with those of the standard gossip algorithm [20] and the grid cluster gossip algorithm [11]. In the standard gossip algorithm, a node randomly chooses one of its neighbors with which to perform information exchange.
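A minimal sketch of the standard gossip baseline under this setup is shown below. The function name `standard_gossip`, the uniform initial values in [0, 1], and the random seeds are assumptions for illustration, and the asynchronous Poisson clocks are approximated by waking one uniformly chosen node per iteration.

```python
import math
import random

def standard_gossip(x0, neighbors, iterations, seed=0):
    """Pairwise averaging: a random node wakes up, picks a random neighbor,
    and both replace their values by the mean of the pair."""
    rng = random.Random(seed)
    x = list(x0)
    nodes = [i for i in neighbors if neighbors[i]]    # skip isolated nodes
    for _ in range(iterations):
        i = rng.choice(nodes)
        j = rng.choice(sorted(neighbors[i]))
        x[i] = x[j] = (x[i] + x[j]) / 2
    return x

rng = random.Random(2)
n, side, radius = 100, 100.0, 30.0                    # 100 nodes, 100 m x 100 m, 30 m radius
pos = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(n)]
neighbors = {i: {j for j in range(n)
                 if j != i and math.dist(pos[i], pos[j]) < radius}
             for i in range(n)}
x0 = [rng.uniform(0, 1) for _ in range(n)]            # assumed initial observations
x = standard_gossip(x0, neighbors, iterations=2000)
```

Each exchange keeps the sum of the node values unchanged, so the spread of `x` shrinks toward the average of `x0` as the iterations proceed.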
In the grid gossip algorithm, the whole network is first split into grids with equal areas. Because the network area is square, the number of grids into which the network area can be divided is $k^2$ ($k$ is a positive integer, $k \in \mathbb{Z}^+$). Here, for $k = 2$, 3, and 4, the square network area is divided into 4, 9, and 16 grids, respectively. In the ACDC algorithm, the clusters are formed according to the number of nodes in the network. Hence, according to (7), the 100-node network is divided into 10 clusters of about 10 nodes each, organized in two tiers. During the simulation, we set the iteration number $t = 2000$ and the relative error constraint $\epsilon = 0.0001$.
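The grid partition used by the grid cluster baseline can be sketched as follows. The function name `grid_cluster` and the row-major cell numbering are assumptions for this example.

```python
def grid_cluster(pos, side, k):
    """Map each (x, y) position to one of k*k equal square cells (row-major index)."""
    cell = side / k
    labels = []
    for x, y in pos:
        col = min(int(x // cell), k - 1)   # clamp points lying on the far edge
        row = min(int(y // cell), k - 1)
        labels.append(row * k + col)
    return labels

sample = [(5.0, 5.0), (95.0, 5.0), (50.0, 50.0), (100.0, 100.0)]
labels = grid_cluster(sample, side=100.0, k=2)   # [0, 1, 3, 3]
```

Because cells are assigned purely by location, an irregular deployment can leave some cells empty or nearly empty, which is the imbalance that the ACDC scheme is meant to avoid.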

Convergence Performance Evaluation.
The relative error is defined in (8). This expression represents the closeness of the consensus value to $x_{ave}$ after a given number of iterations. The performance is evaluated using the Monte Carlo method for three types of initial values: Gaussian distribution, independent identical distribution, and linear variation.
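The evaluation metric and the three families of initial values can be sketched as below. This is an assumption-laden illustration: the normalized form of `relative_error` follows the metric of [20] rather than a transcription of (8), the independent identical distribution is taken to be uniform, and the Gaussian parameters and clipping are chosen only so that all three families share the same bounds.

```python
import math
import random

def relative_error(x, x0):
    """Normalized distance ||x - x_ave * 1|| / ||x0|| (assumed form of the metric)."""
    avg = sum(x0) / len(x0)
    num = math.sqrt(sum((v - avg) ** 2 for v in x))
    den = math.sqrt(sum(v ** 2 for v in x0))
    return num / den

def initial_values(kind, n, low=0.0, high=1.0, seed=0):
    """Generate n initial observations within [low, high]."""
    rng = random.Random(seed)
    if kind == "linear":
        return [low + (high - low) * i / (n - 1) for i in range(n)]
    if kind == "uniform":                  # assumed reading of "independent identical"
        return [rng.uniform(low, high) for _ in range(n)]
    if kind == "gaussian":
        mu, sigma = (low + high) / 2, (high - low) / 6
        # clip so that all three families share the same bounds
        return [min(high, max(low, rng.gauss(mu, sigma))) for _ in range(n)]
    raise ValueError(kind)

x0 = initial_values("linear", 100)
avg = sum(x0) / len(x0)
```

A Monte Carlo run would repeat the gossip simulation over many seeds and average `relative_error` at each iteration.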
The same boundary is set for the above three types of initial value distributions. The convergence rates based on a regular network topology (e.g., the random geometric graph in Figure 3) and on irregular network topologies (e.g., the C-shaped, I-shaped, and O-shaped topologies in Figures 5-7) are also evaluated.
Based on the network topology in Figure 3, Figure 4 shows the convergence rates of the standard gossip, grid cluster gossip, and proposed ACDC gossip algorithms. The following six conclusions can be derived from Figure 4.
(1) In Figure 4(a), the convergence rate comparison initialized with linear variation is presented. The convergence rate of the cluster-based gossip algorithm is observed to be faster than that of the standard gossip algorithm. Based on 2,000 iterations, the ACDC gossip algorithm is nearly 50% closer to the real average value than the grid cluster-based gossip algorithm with 4 grids. However, it takes a longer time for the ACDC algorithm to reach consensus compared with other grid gossip algorithms. For example, 410, 620, and 780 iterations are needed for the 9-grid, 4-grid, and 16-grid cluster-based gossip algorithms, respectively, to reach consensus. However, nearly 1,010 iterations are needed for the ACDC gossip algorithm to reach consensus.
(2) In Figure 4(b), the first 100 iterations of Figure 4(a), representing intracluster gossip, are shown. In the first 30 iterations, the 16-grid gossip algorithm is observed to be faster than the other algorithms. This result is due to the presence of 16 pairs of nodes exchanging information at the same time in this algorithm, whereas there are 4, 9, and 10 pairs of nodes in the 4-grid, 9-grid, and ACDC algorithms, respectively. Based on this figure, 35, 35, 90, and more than 100 iterations are required for the ACDC gossip, 16-grid gossip, 9-grid gossip, and 4-grid gossip algorithms, respectively, to reach consensus in each cluster. Therefore, during the intracluster gossip, the relative error of the 16-grid gossip algorithm is less than that of the ACDC gossip algorithm at the same iteration, which means that the convergence rate of the 16-grid cluster-based gossip algorithm is faster than that of the ACDC gossip algorithm. Similar results can be obtained for the comparison between the 16-grid and 4-grid gossip algorithms or between the 16-grid and 9-grid gossip algorithms. Please refer to the appendix for the proof.
(3) In Figure 4(a), the area circled in blue represents intercluster gossip. During this period, there are 16, 9, 4, and 10 cluster heads for the 16-grid, 9-grid, 4-grid, and ACDC gossip algorithms, respectively, in the second tier. At each iteration, only two cluster heads are chosen to exchange their information. The values of all other nodes in the network are kept constant. Therefore, the variation in the relative error is notably small.
(4) In Figure 4(a), the area circled in purple indicates the broadcast period. In this period, each cluster head broadcasts the consensus value to the nodes in its cluster area. All of the nodes in the network update their values at the same time. Hence, there is a rapid decrease in the relative error.
(5) In Figure 4(a), the area circled in yellow illustrates the stabilized period, which means that consensus has been reached. As shown in the figure, the relative error of the ACDC gossip algorithm is $10^{-5}$ after 2,000 iterations, which is much lower than those of the grid gossip (approximately $10^{-2}$) and standard gossip ($10^{-1.7}$) algorithms.
(6) Figure 4(c) compares the convergence rates of the different gossip algorithms for initial values with a Gaussian distribution, and Figure 4(d) compares the convergence for initial values with an independent identical distribution. The results shown in both figures confirm that the proposed algorithm attains higher convergence accuracy than the conventional algorithms, independent of the initial values. The consensus value and the relative error vary with the initial values.
In Figure 5, the convergence rates based on a C-shaped network topology are illustrated. Based on this figure, the ACDC-based gossip algorithm is able to achieve a much lower relative error compared with other gossip algorithms, indicating that the value held by each node is much closer to the desired value.
In Figure 6, the convergence rates based on an I-shaped network topology are shown. For an I-shaped topology, the cluster-based gossip algorithm with 9 grids has the highest relative error. The second worst relative error is observed for the 16-grid cluster-based gossip algorithm. Furthermore, the 4-grid cluster-based gossip algorithm reaches consensus more quickly than the ACDC gossip algorithm, but the relative error of the latter is 50% lower because the number of nodes at the cluster head level in the 4-grid gossip algorithm is less than that in the ACDC algorithm.
In Figure 7, the convergence rates based on an O-shaped network topology are presented. The results for an O-shaped topology also demonstrate that the proposed ACDC algorithm is superior to the referenced algorithms with regard to accuracy. The 16-grid cluster gossip algorithm is the slowest scheme to reach consensus for this topology.
Based on the results presented in Figures 4-7, it can be concluded that our ACDC gossip algorithm is superior to the referenced algorithms in terms of convergence accuracy. This increase in accuracy arises because the accuracy of the algorithm depends primarily on the cluster scheme. In particular, in irregular network topologies, the grid clusters are divided according to node locations, so some areas contain clusters with no nodes or with only a small number of nodes. The imbalanced number of nodes in different clusters results in a lower accuracy of the final consensus value. For example, in the C-shaped topology, steps are observed in the ACDC gossip curve (shown as the green line) because several loops are executed to reach consensus during the first 2,000 iterations. We can also observe that increasing the number of clusters in the grid cluster gossip algorithm does not improve the consensus accuracy. Therefore, we omit the simulations in which the network is divided into 25 or more grids.

Utility Comparison.
A utility function is developed to evaluate the efficiency of these algorithms. We define the utility $U$ with the mathematical formula in (A.7). This function consists of two parameters, iteration and relative error. Here, we denote the iteration number by $t$ and the relative error by $e$. In (A.7), $\alpha$ is a weight parameter that represents the importance placed on iteration or relative error by users. In this equation, when the value of $t$ is smaller, the value of $U$ is larger. A larger value of $U$ can also be obtained with a smaller value of $e$. According to this equation, network designers can find an optimal pair of values from which they obtain the highest utility. The consensus iteration number and relative error for Gaussian-distributed initial values in different topologies are given in Table 1.
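To make the trade-off concrete, the following Python sketch uses a hypothetical weighted utility that only reproduces the qualitative properties stated above (larger for fewer iterations, larger for smaller relative error, with a weight between the two). It is not the formula in (A.7): the linear form, the normalizers `t_max` and `e_max`, and the sample inputs (taken loosely from the linear-variation results discussed for Figure 4(a)) are assumptions.

```python
def utility(t, e, alpha, t_max, e_max):
    """Hypothetical utility: higher for fewer iterations t and smaller error e.
    alpha weights speed; (1 - alpha) weights accuracy."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    speed_score = 1.0 - t / t_max          # 1 when instantaneous, 0 at t_max
    accuracy_score = 1.0 - e / e_max       # 1 when exact, 0 at e_max
    return alpha * speed_score + (1.0 - alpha) * accuracy_score

# Illustrative inputs: a fast but less accurate scheme versus a slower, more accurate one.
fast = dict(t=410, e=1e-2)
accurate = dict(t=1010, e=1e-5)
for alpha in (0.1, 0.9):
    u_fast = utility(alpha=alpha, t_max=2000, e_max=1e-1, **fast)
    u_accurate = utility(alpha=alpha, t_max=2000, e_max=1e-1, **accurate)
```

With a small weight on speed the more accurate scheme scores higher, and with a large weight the faster scheme does, which mirrors the way the weight parameter is used to choose between algorithms.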
Using the simulation results in Table 1 as parameters for our utility function in (A.7), we obtain the graphs in Figure 8.
Based on Figure 8, it can be concluded that when $\alpha$ is greater than 0.6, that is, when the network designers pay much more attention to the convergence rate, the 4-grid gossip algorithm is a better choice in the C-shaped network topology, while the 9-grid gossip algorithm is a better choice for attaining higher utility in the other types of topologies. When the accuracy requirement is not very strict and minimizing the time spent on data transmission is desired, a grid cluster-based gossip algorithm is the superior choice. If $\alpha$ is less than 0.6, the network designers place greater importance on the accuracy of the executed algorithm. In this case, the utility of our proposed algorithm is much greater than that of the other algorithms, making it the preferable choice.

Robustness Analysis.
Robustness in a WSN refers to how well an algorithm performs under adverse conditions, such as packet loss, node death, or node mobility. For node death and node mobility, the gossip algorithm exchanges data between random pairs of nodes, which allows it to cope with topology changes. It is not necessary to consider the routing protocol. If some of the neighbors around a node die or move out of its communication range, the node can pick another neighbor with which to exchange information. Here, we do not consider the case of isolated nodes. Therefore, all the algorithms mentioned in this paper are robust to changes in the network topology.
Packet loss is usually caused by variation in wireless links. The radio irregularity of each node, which is caused by the propagation medium and hardware devices, is the main reason for asymmetric links [23]. When a signal propagates within a wireless medium, it may be reflected, diffracted, and scattered [24]. Reflection occurs when a signal wave impinges on an object whose dimensions are larger than the wavelength of the signal. It usually occurs at the surface of the earth, buildings, and walls. Diffraction occurs when the signal is obstructed by a sharp irregular surface, and scattering occurs when the medium through which the signal wave travels contains a large number of objects with dimensions that are smaller than the signal wavelength. Consequently, the transmission radii in different directions vary. The other reason for an irregular radius is the hardware difference between nodes, especially in antenna gain, even if the manufacturer is the same.
To evaluate whether our algorithm works well in an irregular radio scenario, we introduce the irregular sensor model proposed in [25]. For each sensor node, the radio propagation range is predefined as $R$, and the effective radio range $R_{eff}$ follows a Gaussian distribution with a mean of $R$ and a standard deviation of DOI, where DOI represents the degree of irregularity of $R_{eff}$. Here, we define the range of $R_{eff}$ to be from $R_{min} = R - 3 \cdot \mathrm{DOI}$ to $R_{max} = R + 3 \cdot \mathrm{DOI}$.
The DOI model defines an upper and a lower bound on signal propagation. If the distance between two nodes is no larger than $R_{min}$, we assume that the wireless link between these two nodes is symmetric, which means that they can receive each other's data correctly. If the distance between two nodes is beyond $R_{max}$, there is no communication between them. If the distance between two nodes is between $R_{min}$ and $R_{max}$, we assume that there are three possible scenarios: symmetric communication; asymmetric communication, which means that one node can receive the other node's data correctly while the opposite link is disrupted; and no communication.
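The link classification of this model can be sketched as follows. The function names `effective_ranges` and `link_type`, the per-node (rather than per-direction) effective range, and the clipping of the Gaussian draw to the stated bounds are assumptions made for the example.

```python
import random

def effective_ranges(n, R, doi, seed=0):
    """Draw one effective range per node from N(R, DOI), clipped to R +/- 3*DOI."""
    rng = random.Random(seed)
    r_min, r_max = R - 3 * doi, R + 3 * doi
    return [min(r_max, max(r_min, rng.gauss(R, doi))) for _ in range(n)]

def link_type(d, r_eff_i, r_eff_j):
    """Classify the link between nodes i and j separated by distance d."""
    i_to_j = d <= r_eff_i      # j can hear i
    j_to_i = d <= r_eff_j      # i can hear j
    if i_to_j and j_to_i:
        return "symmetric"
    if i_to_j or j_to_i:
        return "asymmetric"
    return "none"

ranges = effective_ranges(n=100, R=30.0, doi=2.0)   # all values fall in [24, 36]
```

Because every effective range lies between the lower and upper bounds, any pair closer than the lower bound is classified as symmetric and any pair farther apart than the upper bound as having no link, matching the two boundary cases described above.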
We define DOI = [0, 0.5, 1, 2, 3], where DOI = 0 means that the transmission radius is constant. When DOI = [0.5, 1, 2, 3], the relationship between $R_{min}$, $R$, and $R_{max}$ is shown in Figure 9. Table 2 and Figure 10 show how the relative error at consensus changes with DOI. Compared with the other gossip algorithms, our algorithm achieves the lowest relative error regardless of the network topology, nearly three orders of magnitude smaller than those of the other algorithms.

Conclusions
In this paper, we investigated the convergence rates of gossip algorithms in a dense, randomly deployed WSN with three types of sensor observation attributes. An average-connectivity-degree-cluster- (ACDC-) based gossip algorithm was proposed to improve the convergence of consensus decisions. We analyzed the effects of sensor observation attributes, network topology, and the number of clusters on the convergence rate of reaching consensus. We also developed a utility function that considers the number of iterations and the relative error. An irregular sensor model was introduced to evaluate the robustness of the algorithm. The simulation results show that the proposed ACDC gossip algorithm is much more accurate than the standard gossip and grid cluster-based gossip algorithms for any type of topology. When users place more importance on accuracy, the proposed ACDC algorithm should be selected. However, if minimizing the time spent on information transmission is desired, the 4-grid or 9-grid cluster-based gossip algorithms are better choices. We also showed that our ACDC algorithm is robust in all the network topologies used in this paper. In the future, our objective will be to improve the convergence rate of the proposed algorithm.

Appendix
This appendix is used to prove that the convergence rate of 16-grid gossip is faster than that of the other gossip algorithms during the intracluster communication period. Take the 16-grid gossip and ACDC algorithms as examples.
From this comparison, we can conclude that in the first 100 iterations, 16-grid gossip converges faster than the ACDC algorithm. Similar results can be obtained when comparing the 16-grid gossip algorithm with the other grid gossip algorithms.