A Dynamic Virtual Force-Based Data Aggregation Algorithm for Wireless Sensor Networks

In wireless sensor networks (WSNs), energy efficiency is critical for increasing network lifetime. WSNs consist of low-cost nodes with constrained energy. Present study demonstrates that data aggregation is an effective approach to reduce energy consumption in WSNs. Current data aggregation algorithms, such as cluster-based, tree-based, and chain-based data aggregation algorithm, incur high overhead to maintain structures and in dynamic scenarios incur much more cost to continuously reconstruct aggregate route. In this paper, we propose a dynamic virtual force-based algorithm (VFE) for data aggregation which can adapt to different scenario changes. Inspired by the concept of cost field and virtual force, VFE constructs dynamic routing without structure overhead, which makes data aggregation more efficient. The simulations confirm that VFE achieves significant energy saving and prolongs network lifetime.


Introduction
Wireless sensor networks are applied to numerous applications, for example, wild habitat monitoring, forest fire detection, building safety monitoring, military surveillance, and so on [1]. With the development of technology, the size of sensor node is getting smaller, so the battery capacity is constrained and the computing performance is getting stronger so the energy consumption is increased. The deployment of sensor nodes is always without careful preplanning and performed in an ad hoc manner [2]. Therefore, it is critical to design WSNs data aggregation algorithm that saves energy and consequently prolongs the network lifetime [3]. If a data aggregation algorithm maximizes the function of sensor network it is energy efficient [4]. In order to use the constrained energy more efficiently, most existing data aggregation algorithms attempt to find the optimal aggregation at nodes.
However, it is insufficient to only focus on the problem of energy efficiency when designing data aggregation algorithm. Other factors such as network lifetime, data accuracy, and data latency should also be taken into account. Previous research shows that nodes drain their energy much more faster if they are closer to the sink [5]. This energy consumption difference could make some nodes fail while others working and in this case, the network lifetime could be dramatically reduced. Therefore, it should be rational to keep suitable balance between energy efficiency and energy consumption difference [6]. With the data aggregation mechanism, a node could hold data received or self-generated data for a while and aggregate those data then send out aggregated data [7]. Data aggregation attempts to collect the most critical data from other sensors and make it available to the sink in an energy efficient manner with minimum data latency [4]. Data packets spatially and temporally convergent are two main methods for efficient data aggregation.
VFE maintains a cost field based on gradient and it provides sensor nodes with the direction to the sink. The concept of "gradient" usually means to set the neighboring nodes at a global direction towards destined sink. Thus, each sensor node obtains the direction to send sensing data to the sink. The cost field based design is inspired by a common, natural phenomenon: mountain water flows down into a valley, and the altitude at each point serves as the guide to direct the water from a high post to a low post and eventually to the valleys bottom, which has the lowest altitude [8]. It can be set up according to any common form, such as hop count, 2 International Journal of Distributed Sensor Networks energy consumption, or physical distance [9]. In VFE the gradient is established based on the minimum energy consumption from node to sink.
Khatib [10] first proposed the virtual force theory and it was applied in the research of the mobile robotics route plan and obstacle avoidance. Howard et al. [11] applied it to the coverage problem of WSNs. Assuming that each sensor node attracts the other nodes with virtual force and attract them to transmit data to it, we consider the sensor node as a mass and then calculate its virtual force.
The structure of this paper is as follows. Section 2 introduces some related works. Section 3 presents the design of this algorithm. Section 4 is the evaluation of three different algorithms. Section 5 is the conclusion.

Related Works
However, current data aggregation algorithms, such as cluster-based, tree-based, and chain-based data aggregation algorithms, maintain a relatively fixed structures and cannot adapt to scenario change; we define those data aggregation algorithms as static algorithms. Heinzelman et al. [12] proposed low-energy adaptive clustering hierarchy (LEACH) algorithm and LEACH was the first cluster formation algorithm. LEACH divides the network into different clusters randomly; in each cluster it assigns a node as cluster head and cluster head transport and aggregates data from its cluster to the sink. Younis and Fahmy [13] proposed hybrid energy efficient distributed (HEED) algorithm; the ideal is to form clusters efficiently for extending network lifetime and select cluster head according to node residual energy and a secondary parameter. Lindsey et al. [14] proposed a typical chain-based data aggregation protocol, power efficient data gathering protocol for sensor information systems (PEGA-SIS); it employs greedy algorithm to construct the chain. Greedy chain formation assumes global knowledge of the network and it will incur high overhead. Ding et al. [15] proposed a method which generates minimum spanning tree (MST) by implementing Prim algorithm. These data aggregation routing algorithms may perform well in static scenario where nodes function properly.
Nevertheless in dynamic scenario, situations changing rapidly and nodes may move or fail unexpectedly, for example, monitoring of active volcano, difficult terrain border lands, battlefields, and so forth. In those scenarios, situations change rapidly and those routing algorithms incur high route constructing and maintenance overhead. Yue et al. [16] proposed a novel cluster-based data aggregation algorithm. The algorithm implemented cluster head rotation technique in each cluster and it helps in balancing the energy dissipation. Zechinelli-Martini et al. [17] proposed an energy aware data aggregation model, in which temporal database is implemented in the sensors and the intention of this aggregation model is to achieve WSNs energy efficiency by minimizing the packet transmissions. Esnaashari and Meybodi [18] proposed a learning automata protocol which relies on gathering similar sensors' data only, and it consists of two phases: route discovery and route selection, and these available routes consider the shortest discovered paths to the sink and have the same length. Li et al. [7] proposed a lifetime balanced data aggregation algorithm; that is, instead of focusing on energy consumption of individual node, design a unique algorithm to balance the nodal lifetime. Said et al. [19] proposed a new effective data aggregation protocol to reduce the energy consumption in WSNs; it uses in-network aggregation to distribute the processing all over the aggregate route to avoid unbalanced energy consumption on specific nodes until they run out. There still are plenty of novel algorithms such as those described in [20,21]. These data aggregation routing schemes introduced some good solutions for dynamic scenario.
In this paper, we propose a dynamic data aggregation routing algorithm. We achieve this algorithm by building up the cost field based on the minimum energy consumption from node to sink, and then each node uses the virtual force to select the optimal node to be the next hop. Hence, we can use dynamic routes for data aggregation.

Design of Virtual Force-Based Energy Efficient Algorithm (VFE)
In the following analysis, the nodes are considered to be uniformly and randomly placed in a two-dimensional space. All nodes have equal, omnidirectional transmission patterns of range and the sink energy supply is adequate.

Cost Field.
At the beginning of VFE, the sensor network establishes a cost field and the cost field provides the direction from node to the sink. Since a node decides whether it should forward a packet, it does not have to keep the information that the data should be forwarded to which neighbor [22]. When the transmission power of a node is adjustable, the total amount of energy consumed in transmitting a packet along a path is a better routing metric than hop count [9]. The probability that a neighbor node selects its next hop depends on this node's gradient. Ye et al. [8] use the overhead of one packet per node to construct the optimal cost field and devised an efficient backoff-based cost field scheme. Define the cost as the minimum energy consumed by transmitting and recieving a packet from node to the sink. VFE adopts the backoff-based cost field. In a general way to set up a cost field is networkwide flooding. The flooding scheme rebroadcasts node's cost whenever it receives a lower cost. However, the backoff-based cost field scheme allows a node to broadcast it's cost after a certain delay and results in incuring much less overhead than the flooding scheme. In [9], the authors proposed an improved cost field establishment algorithm. However, cost field is not the main issue in this paper, and the backoffbased cost field setup algorithm is sufficient for both energy efficiency and computing complexity.
Initially sink broadcasts an ADV (advertisement) message and the initial cost is 0; each node sets its initial cost of ∞. If a node hearing an ADV message from other nodes while ADV message propagates in the whole network, then the node adds the heard ADV cost as its own cost, which is the cost between this node and the sender. It compares this new cost to the heard message (if there exist other cost), and if the new cost is smaller, it sets itself as the cost of this node. A node may hear many ADV messages before a node settles its minimum cost, to avoid significant redundancy with too many duplicated messages in a node. Instead of broadcasting immediately after obtaining a lower cost, we defer the node to broadcast its cost until it heard the message with minimum cost. Thus, the node may broadcast with its optimal cost and only broadcast once. Eventually, each node can implicitly offer the direction to send sensing data to the sink.
Once the cost field is established, data packets then can flow along minimum energy consumption paths to the sink. Note that these periodic broadcasts ADV messages throughout the whole network may cause excessive overhead in WSNs [23]. Furthermore, network topology may change, for instance, wireless link breakage, sensor node failure, or mobility; some gradient value will be changed and then require more frequent broadcast. Thus it will drain out nodes' energy on that path quickly and cause packets congestion. However, to reduce energy consumption in those nodes and extend network lifetime, VFE uses the cost field to restrict the direction of packets transmission, and nodes should choose the optimal path with the virtual force.
The dynamic routing scheme employed by VFE is based on a virtual force model. It is assumed that each node initially has a single sensed information to send to the sink in the network and each node attracts its neighbors with virtual force (in this paper, we define neighbors of nodes' as nodes within one hop away from node , as shown in Figure 1, and all the grey dots are neighbors of node ), and the virtual force is a force which goes along a straight line between two sensor nodes; those blue arrows are virtual force between node and its neighbors. The virtual force between node and node V, V , is directly proportional to the mass of the node itself and the mass of neighboring sensor nodes and inversely proportional to the square of the distance between them. We define node's energy consumption ratio and data amount as "mass of sensor" and the equation of virtual force can be shown as follows: where is a constant coefficient and and V are mass of node and node V, while V is the distance between and V. The virtual force of node is defined in terms of the mass of neighboring nodes to it and it determines which node should be the next hop. The mass of node describes the energy consumption ratio and the data amount of the node, so the mass of sensor is as follows: where residual is the residual energy of node and initial is the initial energy of node and is responsible for aggregation node V's data with its own. We have where and denote the data amount of node before and after aggregation and is the packet node sent to node and 1 + 2 + ⋅ ⋅ ⋅ + are aggregated data from neighbors and is data aggregation ratio. In (2), ∈ {0, 1}, which means that when value of increases, the proportion of energy consumption will increase and the proportion of data amount will decrease; as a result, mass of node will correspondingly change, and the virtual force between different nodes will change too. The parameter is changeable, while node's energy falling too quick, and reduces the value of and the corresponding mass of this node will decrease. The parameter is of great importance to extend the lifetime of network. In (2), residual / initial reflects the ratio of energy consumption; hence node is much more likely to choose the node with less energy consumption to be the next hop. 1/ is inversely proportional to the data amount of node , and node is more likely to choose the node with less data amount to be the next hop. Figure 1 illustrates our basic idea. We suppose that the dotted line in Figure 1 is the current cost field which is established through the broadcast of ADV messages at initial stage. Node is attracted by all its neighbors in different directions and the power of those virtual forces is different. In this theory node needs to determine which neighbor should be the next hop in the aggregate route.

Selection of Neighbor Nodes.
Our goal is to send sensor nodes' data packets to sink in dynamic aggregate routes and to consider the energy balance, so we will select the maximum virtual force along the cost field. Hence the neighbor with the maximum virtual force will be the next hop in the aggregate route and it is a dynamic process. While the data aggregation process works, the residual energy and obtained data of nodes are diverse from each other and the virtual forces of neighbor nodes are changing; in this way, we can achieve dynamic routing and extend network lifetime and effective data aggregation. The process of VFE is summarized as Algorithm 1. Figure 2 is an example of neighbor nodes selection. As shown in Figure 2, all the grey dots are neighbors of node 4 International Journal of Distributed Sensor Networks (1) Initialization: nodes, sink (2) sink broadcast ADV messages (3) if nodes heard the message with minimum cost then (4) build cost field (5) end if (6) node (source node) detect event (7) for all node 's neighbors do (8) neighbors calculate virtual force (9) end for (10) node choose the maximum virtual force node along the cost field as next hop (11)   ; blue arrows are virtual force between node and its neighbors; the black arrow is the direction of cost field. The virtual force is proportional to the length of arrows; then > > . In a normal way, node should choose node as the next hop. However in VFE, the direction of cost field presents the direction to the sink, so the node with maximum virtual force along the cost filed should be the next hop; in Figure 2, node should choose node as the next hop.

Evaluation
This section shows the effect of the proposed algorithm on energy consumption. We use OMNET++ 4.3 simulator to evaluate the performance of our algorithm and the parameters of our simulations are shown in Table 1. Sensor nodes are uniformly distributed in a 100 × 100 m 2 field. There are 2 sources in the corner of the field, an event report every round, with a packet size of 32 bytes. After the cost field is established, to avoid frequent change of aggregate routes and reduce data calculation which is caused by variety of virtual force, we specify that virtual force will be recalculated after 200 rounds.
We compare VFE with two existing data aggregation algorithms, power efficient data gathering and aggregation protocol-power aware (PEDAP-PA) [24] and energy efficient data aggregation ant colony algorithms (DAACA) [25]. In PEDAP-PA, it constructs minimum spanning tree (MST) by using Prim algorithm, where data packets transmit and aggregate along tree routes, and it provides near optimal lifetime for the first node although slightly decreasing the lifetime of the last node. In DAACA, authors consider each node as an artificial ant which is raised in ant colony optimization. Each node estimates the amount of pheromones and the remaining energy to compute the probabilities for selecting the next hop. After certain rounds of transmissions, the pheromones adjustments are performed; then it can build the optimal route. Figure 3 shows the time of all nodes dead of network (we define round as a packet deliver from source node to sink). In PEDAP-PA, the Prim algorithm is used to construct the tree routes which can save energy with neighbor nodes; however, this is a centralized method; the sink will consume much more energy than other nodes. The energy cost in maintaining the network is high; hence nodes' energy consumption is quicker than the other two algorithms. In DAACA, there is no overhead for topology or MST reconstruction; therefore, the average energy cost is lower than PEDAP-PA. But the aggregate route of DAACA is static, so data packets will be sent along maximum pheromones routes and the energy of the nodes on these routes will drain quickly. In VFE, there is no overhead for structure maintenance and aggregate route is dynamic, so nodes' energy consumption is smaller than those other two algorithms. Assuming that a sensor network is properly functioning if 70% sensor nodes are alive and as shown in Figure 3, the network lifetime of VFE is longer than those two algorithms. Figure 4 shows the average remaining energy of three data aggregation algorithms. All sensor nodes are uniform, but different algorithms will cause huge difference in energy consumption. As expected, the PEDAP-PA consumes much more energy than DAACA and VFE. In PEDAP-PA, the sink needs to broadcast the topology of the network to all nodes in the network periodically; hence it consumes most energy. Although in DAACA, nodes need no topology of the network, when events source of the network changes frequently, it needs depositing global pheromones and because the established global pheromones are useless, the energy consumption is substantial. Whereas in VFE, it maintains a cost field based on the minimum energy consumption, and each node will compare the remaining energy and the amount of data packets with its neighbors to achieve balanced usage of energy. Figure 5 shows the total transmitted data amount of three data aggregation algorithms.  much less data than DAACA and VFE. In PEDAP-PA, as the energy consumption is faster than those two algorithms, the whole network's lifetime is less; thus transmitted data amount is less. In DAACA, because of the frequent reconstruction of aggregate routes which cannot adapt to dynamic scenario, the performance of this algorithm is worse. In VFE, while sensor nodes choose the next hop, they will calculate virtual force of their neighbors, and nodes will choose the maximum virtual force node as the next hop. In other words, node will choose neighbor nodes with more remaining energy and less data amount as next hop, and there are different routes existing at the same time. If some routes are congested or there are too many data packets in some nodes, new data packets can be delivered to the sink through different routes. Hence, our algorithm can avoid data congestion effectively and balance energy consumption of the whole network. Figure 6 shows that different cause different network lifetimes in VFE. As shown in Figure 6, while = 0.3, VFE can achieve the longest network lifetime, = 0.7; the network lifetime of VFE is the shortest. The parameter is of great importance in our mechanism. Though = 0.3 the network lifetime is longer than the other two situations but it will reduce node's mass. Node's mass is reduced and the possibility to be choosen as the next hop by other nodes will decrease too, and it will be sharply reduced in data transmission. In different applications we can choose the more suitable and achieve the best effect.

Conclusion
In this paper, we propose an energy efficient data aggregation algorithm adapting to dynamic routing to collect data in WSNs. Existing data aggregation algorithms cannot adapt to dynamically changing scenario and cannot make data aggregation effective. To achieve efficient data aggregation, we want packets to concentrate in space. But nodes on this path would drain their energy quickly. Motivated by the cost field and virtual force theories, we propose a virtual forcebased dynamic routing algorithm which makes data aggregation more efficient.