An Energy-Efficient and Scalable Secure Data Aggregation for Wireless Sensor Networks

Because sensors in wireless sensor networks (WSNs) are resource-constrained and battery-powered, energy consumption is always a major concern. Data aggregation is an essential technique for reducing communication overhead and energy consumption. Since many applications require data privacy, security must also be taken into consideration. In this paper, we propose an energy-efficient, secure, highly accurate, and scalable scheme for data aggregation (EESSDA). The main idea of EESSDA is to achieve secure data aggregation by establishing secure channels and applying a slicing technique. EESSDA needs no encryption or decryption operations during data aggregation, which saves energy and yields highly accurate aggregation results. Moreover, EESSDA does not require shared information to be deployed between nodes in advance, which gives the network good scalability. Our analysis and simulations show that EESSDA has lower communication overhead, higher efficiency and accuracy, and better privacy preservation and scalability than existing schemes.


Introduction
Wireless sensor networks (WSNs) are composed of a large number of sensor nodes that cooperatively monitor physical or environmental conditions, such as temperature, humidity, or noise, at different locations. WSNs have become increasingly popular in many military and civilian applications [1][2][3]. In the military field, for example, WSNs are used to identify and locate targets for potential attacks; in the civilian field, wearable sensors track a patient's blood pressure, blood sugar, heart rate, and so forth to monitor the patient's health.
Sensor nodes are usually constrained in energy, communication, storage, and computation capability, especially those powered by batteries that cannot be replaced at will. Therefore, WSNs must save energy to increase network lifetime. According to [4], a node consumes approximately the same amount of energy to compute 800 instructions as it does to send a single bit of data. Hence, reducing the amount of traffic is a crucial way to save energy. WSNs usually generate large amounts of highly redundant raw data, so it is important to develop efficient data processing techniques that reduce redundant data and the amount of transmission. Data aggregation [5][6][7][8][9][10] is an efficient method to eliminate data redundancy and save energy. However, data in WSNs are transmitted wirelessly over multiple hops, which makes it easy for a malicious attacker to capture and eavesdrop on them. In many applications, WSNs encounter serious security problems, so a data aggregation scheme must not only condense raw data and reduce the amount of transmission in the network, but also keep the network at a high level of security.
International Journal of Distributed Sensor Networks
Generally, the security requirements of a privacy-preserving data aggregation scheme can be satisfied using encryption. Privacy-preserving data aggregation schemes are classified into two types: hop-by-hop and end-to-end. In the hop-by-hop fashion, aggregator nodes must decrypt all the sensor data they receive, aggregate the data according to the corresponding aggregation function, and encrypt the aggregation result before sending it to the next-hop node [11][12][13][14]. End-to-end privacy-preserving data aggregation schemes perform data aggregation through homomorphic encryption. An intermediate (cluster head) node receives the ciphertexts from leaf (cluster) nodes and aggregates them with its own encrypted sensor data; the result is then sent to the next node [15,16]. Obviously, these schemes cause considerable latency and energy consumption because of the decryption/encryption process. In [17], a new privacy-preserving data aggregation protocol was proposed. The Sink shares a random number (key) with each sensor node. Each sensor node then simply adds the random number to its data and obtains pseudodata, which are aggregated along the aggregation tree to the Sink. Knowing all the shared numbers, the Sink can obtain the real aggregation result by subtraction. However, a portion of the sensor nodes may fail to participate in the data aggregation because of collisions, which is hard for the Sink to track. In that case, the Sink still subtracts all the shared numbers from the pseudoaggregation result, which may yield an aggregation result with a large deviation.
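The deviation problem described for the protocol in [17] can be made concrete with a small sketch. It assumes plain integer addition and invented readings and shared numbers; the function names `pseudo_sum` and `sink_recover` are hypothetical and are not taken from [17].

```python
import random

def pseudo_sum(readings, shared, reporting):
    """In-network sum of (reading + shared number) over the nodes that reported."""
    return sum(readings[i] + shared[i] for i in reporting)

def sink_recover(pseudo_total, shared):
    """The Sink subtracts every shared number, whether or not the node reported."""
    return pseudo_total - sum(shared.values())

rng = random.Random(7)  # fixed seed; the values are illustrative only
readings = {i: rng.randint(0, 50) for i in range(5)}
shared = {i: rng.randint(1_000, 9_999) for i in range(5)}

everyone = list(readings)
exact = sink_recover(pseudo_sum(readings, shared, everyone), shared)

# Node 3 is lost to a collision, but the Sink cannot tell.
survivors = [i for i in everyone if i != 3]
skewed = sink_recover(pseudo_sum(readings, shared, survivors), shared)

print("all nodes report:", exact, "true sum:", sum(readings.values()))
print("node 3 missing:  ", skewed, "sum of delivered readings:",
      sum(readings[i] for i in survivors))
```

When every node reports, the subtraction is exact. When one node is missing, the recovered value is off by that node's whole shared number, which is typically far larger than any single reading.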
In this paper, we propose a secure, energy-efficient, scalable, and highly accurate scheme for data aggregation (EESSDA). In EESSDA, a secure channel is established between each sensor and its neighbor (i.e., the two sensors share a common random number) so that messages can be transmitted without encrypting private data. Because the data of leaf nodes would otherwise be disclosed to intermediate nodes, EESSDA adopts a technique similar to SMART in [11], in which each node slices its private data randomly into $J$ pieces, keeps one piece for itself, and sends the remaining pieces, encrypted, to its neighboring nodes. Unlike SMART, our method prevents this disclosure by slicing only the data of leaf nodes, and it generates much less traffic because only the leaf nodes need to decompose their data into slices, while intermediate nodes need to send only one message for data aggregation. In short, EESSDA requires no encryption/decryption operations during aggregation and reduces the amount of traffic. Consequently, EESSDA yields highly accurate aggregation results, since it involves fewer collisions and avoids the case in [17] where the Sink subtracts the random number of a failed node. In addition, because EESSDA does not require shared information to be deployed between nodes in advance, the network has good scalability, which is essential for cheap sensors that are easily lost. Theoretical analysis and simulation results demonstrate that EESSDA performs well in terms of security, energy efficiency, accuracy, and scalability.
Our contributions in this paper are as follows.
(1) Privacy: EESSDA provides end-to-end data confidentiality through secure channels and the "slicing and assembling" technique applied on leaf nodes.
(2) Energy efficiency: EESSDA requires no encryption/decryption while processing data aggregation, which reduces energy consumption and latency. In addition, only the leaf nodes need to perform "slicing and assembling," so EESSDA greatly reduces the amount of traffic and the energy consumed.
(3) Accuracy: EESSDA reduces the amount of traffic and the latency and needs no encryption/decryption, which improves the accuracy of the aggregation because data packets have less chance to collide.
(4) Scalability: in EESSDA, each node only needs to be preloaded with $k$ keys randomly drawn from the key-pool, which gives the network good scalability and makes the scheme more suitable for dynamic networks.
The rest of the paper is organized as follows. In Section 2, we review related work on secure data aggregation. Section 3 introduces the network model and design goals. In Section 4, we describe the EESSDA scheme in detail and analyze its scalability. Section 5 evaluates the proposed scheme through analysis and simulation. Finally, we summarize our conclusions in Section 6.

Related Work
In typical WSNs, sensor nodes are usually resource-constrained and battery-limited. There has been extensive work on data aggregation schemes that increase the lifetime of WSNs by reducing the amount of traffic and resource consumption. However, these aggregation schemes were designed without security in mind. In practice, WSNs may be deployed in untrusted environments, such as battlefields, where an adversary may compromise nodes and reveal sensitive information. Hence, privacy preservation is a key technology for extending the applications of WSNs, and secure data aggregation has become an active research topic in WSNs [11][12][13][14][15][16]18].
Several secure data aggregation schemes based on hop-by-hop encryption have been proposed. In [12], SDAP was proposed based on the principles of divide-and-conquer and commitment-and-attest. First, SDAP dynamically partitions the topology tree into multiple logical groups (subtrees) of similar sizes using a probabilistic approach. A commitment-based hop-by-hop aggregation is then performed in each group to generate a group aggregation result, which is the criterion the base station uses to determine whether the group is suspicious. In [11], the authors proposed two privacy-preserving data aggregation schemes, CPDA and SMART, for additive aggregation. The CPDA scheme leverages algebraic properties of polynomials to calculate the desired aggregate value. The SMART scheme builds on slicing techniques and the associative property of addition. In [13], the iPDA scheme was proposed to improve data integrity on the basis of the SMART scheme. In iPDA, data privacy is achieved through the data slicing and assembling technique, and data integrity is achieved through redundancy by constructing two disjoint aggregation trees to collect the data of interest. However, iPDA has high communication overhead and low aggregation accuracy because of the slicing technique and because each sensor node has to send its data to both aggregation trees. EEHA [14] preserves data privacy in a manner similar to SMART, with the nodes divided into leaf nodes and intermediate nodes. In EEHA, only leaf nodes use the slicing and assembling technique to preserve data privacy; an intermediate node simply aggregates its private data, the data pieces received from leaf nodes, and the data from its child nodes into a new aggregate, which protects the privacy of its own data. Hence, compared with SMART, EEHA has less communication overhead and higher data accuracy.
The schemes in [15,16] use privacy homomorphic encryption to allow aggregation of encrypted data. In CDA [15], each sensor node splits its data into $d$ parts ($d \geq 2$), encrypts them using a public key, and transmits them to the aggregator node. The aggregator node operates on the ciphertext, computes an aggregated value from it, and sends it to the sink. The IPHCDA scheme [16] employs an elliptic curve cryptography-based homomorphic encryption algorithm to offer data confidentiality along with hierarchical data aggregation. IPHCDA partitions the network into several regions and employs a different public key in each region. Data aggregators perform aggregation over the encrypted data and transmit the aggregated data to the base station. The base station not only classifies the aggregated data based on the encryption keys, but also achieves data integrity by verifying the MAC of the aggregated data.
In addition, the KIPDA scheme [18] is a noncryptographic method that obfuscates data by hiding them among a set of camouflage values, enabling indistinguishability for data aggregation. KIPDA defines a message set consisting of the actual data and camouflage values for MIN/MAX aggregation. The message set is an array of values in which the actual data and camouflage values are assigned to specific positions according to predefined policies that guarantee 100% accuracy of the aggregation, while the attacker cannot distinguish between the actual data and the camouflage values. Because the data are not encrypted, they are aggregated easily and efficiently with minimal in-network processing delay, but the level of privacy is relatively low.

System Model
3.1. Network Model. WSNs are composed of a large number of resource-constrained sensors equipped with nonrechargeable batteries. We use a tree structure to organize the sensors for the task of data aggregation, as shown in Figure 1. There are three types of nodes in the sensor network: the Sink, intermediate nodes, and leaf nodes. The Sink is the node to which the aggregation result is destined. The intermediate nodes serve as aggregator nodes, which are responsible for forwarding queries, aggregating the received data with their own data, and sending the aggregate to their parent nodes. The leaf nodes use the "slicing and assembling" technique to protect data privacy: they decompose their private data into pieces, send the pieces to neighboring nodes, and assemble the piece they kept with the pieces they received into new results, which are sent to their parents. Typical aggregation functions include SUM, AVERAGE, COUNT, MAX, and MIN. We focus on additive aggregation functions because all the typical aggregation functions can be reduced to the additive aggregation function SUM [17].
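As a minimal illustration of the reduction to SUM, the sketch below derives COUNT, AVERAGE, and variance from additive aggregates only. The readings are invented, and the helper name `additive_aggregate` is hypothetical; the reductions for MAX and MIN referred to in [17] are not shown here.

```python
def additive_aggregate(values):
    """Stand-in for the in-network SUM aggregate."""
    return sum(values)

readings = [21, 23, 22, 26]  # hypothetical sensor readings

# COUNT: every node contributes the constant 1.
count = additive_aggregate([1 for _ in readings])

# AVERAGE: SUM divided by COUNT.
total = additive_aggregate(readings)
average = total / count

# VARIANCE: needs one more additive aggregate, the sum of squares.
sum_of_squares = additive_aggregate([x * x for x in readings])
variance = sum_of_squares / count - average ** 2

print(count, total, average, variance)
```

Each derived statistic costs one extra additive aggregation round (or one extra field per message), so a scheme that protects SUM also protects these statistics.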

Attack Model.
A malicious attacker can launch a variety of attacks to undermine data security. We consider the following two cases.
(1) Eavesdropping: eavesdropping is the most common and easiest form of attack on data privacy and is the focus of this paper. We assume the attacker can eavesdrop on the entire network.
(2) Compromising sensor nodes: after compromising one or more sensor nodes, an adversary can obtain their data and keys and perform the following attacks. First, the adversary can use the keys obtained from compromised nodes to decrypt the ciphertext of private data sent by other nodes. Second, the adversary can use several colluding compromised nodes to collect and infer the private data of other nodes.

Design Goals.
The main goal of a secure data aggregation scheme is to maintain data privacy for each node in the WSN. At the same time, the scheme must take efficiency, accuracy, and scalability into account. Therefore, a desirable secure data aggregation scheme should meet the following criteria.
(1) Privacy: to broaden the range of WSN applications, data aggregation must guarantee the privacy of data. Each node should know only its own data. However, the wireless link is vulnerable to eavesdropping by attackers who want to reveal private data. Furthermore, some compromised nodes may collude to uncover the private data of other nodes. A good secure data aggregation scheme should be robust to such attacks.
(2) Efficiency: the goal of data aggregation is to reduce the number of messages transmitted within the WSN by using in-network processing to eliminate redundant data, thereby reducing resource and energy usage. To protect the privacy of data, some additional overhead is unavoidable in secure data aggregation. However, a good private data aggregation scheme should keep this overhead as small as possible.
(3) Accuracy: data aggregation results may be used to make critical decisions, so the accuracy of the aggregation results must be guaranteed during the process of data aggregation. Accuracy is therefore a crucial criterion for assessing the performance of secure data aggregation schemes.
(4) Scalability: cheap sensor nodes are prone to failure, which makes the topology of WSNs dynamic. When some nodes fail or new nodes are deployed, the secure data aggregation scheme must continue to operate correctly. A good secure data aggregation scheme should scale easily.

Key Setup for Security Channel.
Neighboring nodes establish a secure channel with the help of encryption. In this paper, key management adopts the random key distribution mechanism proposed in [19]. The key distribution consists of three phases.
(1) Key predistribution: a large key-pool of $K$ keys and their corresponding identities is generated. Each node in the WSN randomly selects $k$ keys from the key-pool. These $k$ keys form the key ring of the sensor node.
(2) Shared-key discovery: each sensor node finds out which neighbors share a common key with it by exchanging discovery messages. If two neighboring nodes share a common key, there is a secure link between them.
(3) Path-key establishment: if two neighboring nodes do not share a common key, a secure link between them is established over a path of two or more hops.
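In the random key distribution mechanism, the probability that any pair of neighboring nodes possesses at least one common key is $p_{\text{connect}}$. When two neighboring nodes transmit a message encrypted with their common key, the probability that a third node possesses the same key is $p_{\text{overhear}}$. With $K$ the size of the key-pool and $k$ the size of each key ring, details can be seen in the following formula:

$$p_{\text{connect}} = 1 - \frac{((K - k)!)^2}{(K - 2k)!\,K!}, \qquad p_{\text{overhear}} = \frac{k}{K}.$$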
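The two probabilities can be evaluated numerically. The sketch below assumes the standard random key predistribution model of [19], with a pool of `pool_size` keys and rings of `ring_size` keys (both names are illustrative): two rings fail to intersect with probability C(K-k, k) / C(K, k), which is algebraically equal to ((K-k)!)^2 / ((K-2k)! K!), and a third node holds one given key with probability k/K.

```python
from math import comb

def p_connect(pool_size, ring_size):
    """Probability that two random key rings share at least one key."""
    # comb(n, k) is 0 when k > n, so the result is 1 once rings must overlap.
    disjoint = comb(pool_size - ring_size, ring_size) / comb(pool_size, ring_size)
    return 1 - disjoint

def p_overhear(pool_size, ring_size):
    """Probability that a third node's ring contains one given key."""
    return ring_size / pool_size

# Hypothetical sizes, chosen only to show the trend.
for ring in (50, 75, 100, 150):
    print(ring, round(p_connect(10_000, ring), 3), p_overhear(10_000, ring))
```

Enlarging the key ring raises connectivity, but it also raises the chance that a bystander can decrypt a link, so the ring size trades connectivity against exposure.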

Energy-Efficient and Scalable Secure Data Aggregation (EESSDA) Scheme
In this section, we present the details of our proposed secure data aggregation scheme, which is energy-efficient, scalable, and highly accurate. The EESSDA scheme consists of five steps: (1) aggregation tree construction; (2) secure channel establishment; (3) slicing; (4) assembling and mixing; and (5) aggregation. Because of the dynamic nature of WSNs, this section also describes how to deploy new nodes and handle failed nodes.
4.1. Secure Data Aggregation. The scheme consists of five steps, whose detailed procedures are as follows.
Step 1 (aggregation tree construction). A common technique for data aggregation is constructing an aggregation tree. There are various methods for building an aggregation tree. We construct the aggregation tree using the method described in TAG [10]. The network is organized as a tree rooted at the Sink node, and each sensor node has a shortest routing path to the Sink. In addition, by setting path selection conditions during the construction of the aggregation tree, we ensure that every parent-child pair shares at least one common key, as shown in Figure 2.
Step 2 (secure channel establishment). The aggregation tree is composed of intermediate nodes and leaf nodes.
(2) Leaf nodes: each leaf node establishes a secure channel with its parent node. In addition, each leaf node establishes secure channels with those of its neighbors, or nodes within $h$ hops, that share at least one common key with it.
After the secure channels are established, a sensor node transmits its data (sensing data, aggregation results, and slices) through them. The sending node adds the random number of the secure channel to the data and sends the resulting pseudodata to the destination node, which recovers the real data by subtracting the random number. For example, when $N_5$ sends the slice $V_{51}$ to $N_1$, the process is $N_5 \to N_1: C_{51} = (V_{51} + r_{51}) \bmod M$, and $N_1$ recovers $V_{51} = (C_{51} - r_{51}) \bmod M$, where $r_{51}$ is the random number shared by $N_5$ and $N_1$, $C_{51}$ is the pseudodata, and $M$ is the modulus.
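The masking step on a secure channel amounts to modular addition and subtraction of the shared random number. The following sketch illustrates it; the function names `mask` and `unmask` and the concrete numbers are assumptions made for illustration, not values from the paper.

```python
def mask(value, shared_random, modulus):
    """Sender side: add the channel's shared random number, modulo M."""
    return (value + shared_random) % modulus

def unmask(pseudo, shared_random, modulus):
    """Receiver side: subtract the same random number, modulo M."""
    return (pseudo - shared_random) % modulus

M = 40_000        # hypothetical modulus, larger than any possible aggregate
r51 = 31_337      # hypothetical random number shared by the two endpoints
v51 = 17          # slice to be sent

c51 = mask(v51, r51, M)
print(c51, unmask(c51, r51, M))

# Wrap-around is harmless because both sides reduce modulo M.
print(unmask(mask(39_999, 5, M), 5, M))
```

Only one addition and one subtraction are needed per message, which is why this step is much cheaper than a cipher operation.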
Step 3 (slicing). Because a leaf node holds only its own data, each leaf node protects the confidentiality of its data by slicing them into pieces before sending data to its parent node. We adopt a slicing technique similar to that of SMART [11]. Each leaf node $N_i$ slices its primitive data $V_i$ randomly into $J$ pieces, where $J$ depends on the number of its secure channels; that is, $V_i = \sum_{j=1}^{N} V_{ij}$, where $V_{ij}$ denotes the piece of data sent from node $N_i$ to node $N_j$. If no data are sent from node $N_i$ to node $N_j$, then $V_{ij} = 0$.
Figure 3 describes the slicing step, in which one of the $J$ pieces is kept at node $N_i$ itself and the remaining $J - 1$ pieces are sent to the neighbors of $N_i$ (we take $h = 1$ here) through secure channels. For example, leaf $N_8$ slices its data $V_8$ randomly into 3 pieces; $N_8$ then keeps $V_{88}$ and uses secure channels to send the remaining 2 pieces to the neighbor nodes $N_7$ and $N_9$, respectively, which are selected from its neighbor set (a set that also contains $N_2$).
Step 4 (assembling and mixing). First, all nodes wait for a certain time $\Delta$, which guarantees that all slices have been received. Then, each leaf $N_i$ adds up all the received slices and the slice it kept for itself to get a new result $V'_i$. In Figure 3, the mixing result of every leaf node is obtained in this way.
Step 5 (aggregation). After a leaf node has mixed the received slices into a new result, it sends the new result to its parent through a secure channel. The intermediate nodes receive the new results $V'_i$ sent by their child nodes and may also receive slices $V_{ij}$ sent by leaves. Once an intermediate node has received all the data from its child nodes and leaf nodes, it performs an aggregation operation to get a new result and forwards it to its parent through a secure channel; the parent in turn forwards its aggregation result along the tree. Eventually the aggregation result reaches the Sink. For example, $N_2$ receives the masked mixing results of its children and recovers $V'_4$ and $V'_5$ by subtracting the random numbers $r_{24}$ and $r_{25}$ of the corresponding secure channels. Then $N_2$ aggregates all the data, including its own private data, into a new result $V'_2 = (V'_4 + V'_5 + V_2) \bmod M$. Finally, $N_2$ sends the result to the Sink through their secure channel; that is, $N_2 \to \text{Sink}: C_2 = (V'_2 + r_2) \bmod M$, as shown in Figure 3.
Algorithm 1 illustrates the 3-step process of $N_1$ in Figure 3.
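The slicing and mixing steps can be sketched as follows, under simplifying assumptions: slices are drawn uniformly modulo a modulus `M`, every leaf sends one slice to each listed neighbor, and the secure-channel masking is omitted because it cancels at the receiver. The node labels, readings, and function names are hypothetical.

```python
import random

def slice_value(value, pieces, modulus, rng):
    """Split `value` into `pieces` random shares that sum to it modulo `modulus`."""
    parts = [rng.randrange(modulus) for _ in range(pieces - 1)]
    parts.append((value - sum(parts)) % modulus)
    return parts

def slice_and_mix(readings, neighbors, modulus, rng):
    """Each node keeps one slice, sends the rest, then sums what it holds."""
    mixed = {node: 0 for node in readings}
    for node, value in readings.items():
        targets = [node] + neighbors[node]          # first slice stays local
        shares = slice_value(value, len(targets), modulus, rng)
        for target, share in zip(targets, shares):
            mixed[target] = (mixed[target] + share) % modulus
    return mixed

M = 40_000
rng = random.Random(3)
readings = {7: 20, 8: 35, 9: 12}                   # hypothetical leaf readings
neighbors = {7: [8], 8: [7, 9], 9: [8]}            # slice destinations per leaf

mixed = slice_and_mix(readings, neighbors, M, rng)
print(mixed)
print(sum(mixed.values()) % M, sum(readings.values()))
```

The individual mixed results look random, yet their sum modulo `M` equals the sum of the original readings, which is the property the aggregation step relies on.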

Aggregation Algorithm.
The pseudocode of EESSDA for every node is given in Algorithm 2. The algorithm is composed of three phases. The first phase (lines 1-4) is the predeployment stage, comprising construction of the aggregation tree (line 1) and establishment of the secure channels (lines 2-4). The second phase (lines 5-9) is the slicing-mixing operation, in which one loop enumerates all leaf nodes. Each leaf node slices its primitive data (line 6), mixes all the received slices together with its own slice (line 7), and sends the mixing result to its parent node (line 8). The third phase (lines 10-17) is the data aggregation operation, in which one loop over the intermediate nodes performs the mixing and aggregation operations. First, the intermediate node mixes all the slices received from leaf nodes (lines 11-13); second, it aggregates all the data from its child nodes; finally, it sends the aggregation result to its parent node (line 15). Intermediate nodes in turn forward the aggregation result along the tree. Eventually the aggregation result reaches the Sink, and the EESSDA algorithm is completed.
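The aggregation phase can be sketched as a bottom-up pass over the tree. The version below is a simplified, centralized simulation rather than the per-node protocol: `values` holds each node's contribution after mixing, `link_random` holds one shared random number per child-parent link, and all names and numbers are assumptions for illustration.

```python
import random

def aggregate(node, children, values, link_random, modulus):
    """Return the aggregate of the subtree rooted at `node`, modulo `modulus`."""
    total = values[node] % modulus
    for child in children.get(node, []):
        subtotal = aggregate(child, children, values, link_random, modulus)
        r = link_random[(child, node)]
        pseudo = (subtotal + r) % modulus          # what travels on the link
        total = (total + pseudo - r) % modulus     # parent removes the mask
    return total

# Hypothetical tree: the Sink has two intermediate children.
children = {"sink": ["n1", "n2"], "n1": ["n3", "n4"], "n2": ["n5"]}
values = {"sink": 0, "n1": 5, "n2": 7, "n3": 11, "n4": 13, "n5": 17}

M = 1000
rng = random.Random(1)
link_random = {(c, p): rng.randrange(M) for p, cs in children.items() for c in cs}

result = aggregate("sink", children, values, link_random, M)
print(result, sum(values.values()))
```

Each intermediate node sends exactly one masked message to its parent, matching the single-message behavior described for intermediate nodes.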

Scalability.
Because of the dynamic nature of WSNs, the network may need to deploy new nodes. In existing aggregation or query schemes, when a new node is deployed, the network needs to distribute shared information in advance between the new node and the Sink, its parent node, or the root node of its subtree, which makes network expansion very difficult. In addition, when some nodes in the WSN fail, the aggregation scheme needs to ensure that the network still performs aggregation correctly. EESSDA handles both cases and therefore has good scalability, as described below.
(1) Deploying new nodes. When a new node $N_i$ is deployed into the network, it establishes secure channels with those neighbors that share at least one common key with it. Then $N_i$ selects, from its neighbor set, the node $N_j$ with the smallest number of hops as its parent node. $N_i$ is then successfully deployed and becomes a leaf node in the network. The WSN reconstructs the aggregation tree on the basis of the number of newly deployed nodes or a time interval.
(2) Failed nodes. When the parent of node $N_i$ is found to have failed after a certain time, $N_i$ marks its parent as a failed node and selects a new parent node from its neighbors. As above, $N_i$ selects as its parent the neighbor node that has a secure channel with $N_i$ and the smallest number of hops. $N_i$ then updates its own number of hops and broadcasts a request to its child nodes to modify their hop numbers. When the total number of failed nodes is small, the network can still work properly.
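The parent selection rule shared by both cases can be sketched as below. The data structures (`neighbors`, `hops`, `key_rings`) and the function name `choose_parent` are hypothetical; a shared key is taken as a stand-in for an established secure channel.

```python
def choose_parent(node, neighbors, hops, key_rings, failed=frozenset()):
    """Pick the live neighbor with a shared key and the smallest hop count."""
    candidates = [
        n for n in neighbors[node]
        if n not in failed and n in hops and key_rings[node] & key_rings[n]
    ]
    if not candidates:
        return None  # no usable neighbor: the node stays detached for now
    return min(candidates, key=lambda n: hops[n])

# Hypothetical neighborhood of a newly deployed node.
neighbors = {"new": ["a", "b", "c"]}
hops = {"a": 2, "b": 1, "c": 1}
key_rings = {"new": {1, 2}, "a": {2, 9}, "b": {7, 8}, "c": {1, 5}}

parent = choose_parent("new", neighbors, hops, key_rings)
hops["new"] = hops[parent] + 1
print(parent, hops["new"])

# If that parent later fails, the same rule picks the next best neighbor.
fallback = choose_parent("new", neighbors, hops, key_rings, failed={"c"})
print(fallback)
```

Node `b` is closer to the Sink than `a` but shares no key with the new node, so it is never a candidate; this is the constraint that lets a node join without any pre-shared information beyond its key ring.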

Simulation and Performance Analysis
In this section, we evaluate the performance of EESSDA through theoretical analysis and simulation, covering communication overhead, computation overhead, energy efficiency, accuracy, and privacy preservation. Based on the WSN simulator in [20], we use C# and MATLAB to implement a simulator that executes the EESSDA and SMART schemes. We ran these two schemes on a real-world data set from Intel Lab Data [21] to compare their communication overhead, computation overhead, energy efficiency, accuracy, privacy preservation, and so on.

Algorithm 1: Illustration of the three steps in EESSDA.

(1) Construct an aggregation tree on top of TAG;
(2) Ensure that all parent-child nodes share a common key;
(3) Set waiting time Δ;
(4) Establish secure channel;
(5) foreach leaf node do
(6)     perform slicing operation and wait;
(7)     perform mixing operation and get new result;
(8)     send the new result to its parent node;
(9) end for
(10) foreach intermediate node do
(11)     if receives slice from leaf node then
(12)         perform mixing operation
(13)     end if
(14)     if receives all child nodes data or time elapsed then
(15)         perform aggregate operation and send aggregation result to its parent node;
(16)     end if
(17) end for
Algorithm 2: EESSDA algorithm.

Simulation Setting.
The simulation runs on a PC with a Core i3-3220 CPU, 4 GB of memory, and Windows 7. We assume a network of 400 sensor nodes, randomly deployed over a 400 × 400 m² area. The transmission range of a sensor node is 50 m and the data rate is 1 Mbps. According to [18], for a TelosB mote the energy used to transmit and receive 1 bit of data is $E_t = 0.72$ μJ and $E_r = 0.81$ μJ, respectively, and encrypting/decrypting 10 bits of data with RC4 costs 8.92 μJ. Each point in the figures is the average of 20 simulation runs, each using one randomly generated WSN topology.

Communication Overhead.
The communication overhead of EESSDA consists of two parts: $C_{sc}$, the establishment of secure channels, and $C_{data}$, data transmission. When node $N_i$ establishes a secure channel with node $N_j$, $N_i$ needs to send an encrypted random number and receive an ACK from node $N_j$. The encrypted data and the ACK are of $m$ bits and 1 bit, respectively. On average, each node $N_i$ builds secure channels with $n_i$ neighbor nodes. Because a secure channel is bidirectional, each channel needs to be established only once for the two nodes it connects. In EESSDA, leaf nodes and intermediate nodes behave differently.
(1) Intermediate node: it receives all the data from its child nodes and leaf nodes, performs an aggregation operation with its own data to get a new aggregation result, and sends the result to its parent. We suppose the reading at a node is in the range $[0, d_{\max}]$, so the data transferred through a secure channel are of $\lceil \log(d_{\max} \cdot N) \rceil$ bits.
(2) Leaf node: it slices its data into $J$ pieces and sends $J - 1$ pieces to neighboring nodes.
We consider a network with $N$ sensor nodes in which the percentage of leaf nodes in the aggregation tree is $\rho$. To improve the level of security, EESSDA re-establishes the secure channels in each period. We assume that $T$ runs of data aggregation are performed during each period, so the cost of establishing the secure channels is shared among the $T$ runs of that period. In our experiments, we implemented EESSDA and SMART on the same previously constructed aggregation tree. In SMART, each node needs to send $J$ messages for secure data aggregation ($J - 1$ messages during the slicing step and then one message for data aggregation). Hence, the communication overhead of SMART is $N \cdot J$ messages per run, multiplied by the message size.
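The per-run message counts implied by this description can be tallied in a few lines. This is a rough sketch under stated assumptions, not the paper's closed-form expressions: it counts data messages only, ignores the amortized cost of establishing secure channels, and assumes each leaf sends its slices plus one mixed result while each intermediate node sends one message. The leaf fraction of 0.5 is a hypothetical value.

```python
def smart_messages(num_nodes, slices):
    """SMART: every node sends (slices - 1) slice messages plus one result."""
    return num_nodes * slices

def eessda_messages(num_nodes, slices, leaf_fraction):
    """EESSDA data messages: only leaves slice; intermediates send one message."""
    leaves = round(num_nodes * leaf_fraction)
    intermediates = num_nodes - leaves
    return leaves * slices + intermediates

nodes, slices = 400, 3
for leaf_fraction in (0.3, 0.5, 0.7):   # hypothetical tree shapes
    e = eessda_messages(nodes, slices, leaf_fraction)
    s = smart_messages(nodes, slices)
    print(leaf_fraction, e, s, round(1 - e / s, 3))
```

The saving in data messages shrinks as the share of leaf nodes grows, since leaves behave like SMART nodes; the measured reductions also include the channel establishment cost, which this tally leaves out.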
Figure 4 shows the communication overhead of EESSDA and SMART ($J = 3$, $T = 3$) under different epoch durations. From Figure 4, we can see that EESSDA reduces the communication overhead by 20% compared with SMART; moreover, the ratio of reduction changes with the parameter settings. When $T = 3$, the reduction in communication overhead of EESSDA relative to SMART grows as $J$ increases, up to 27%, as shown in Figure 5. Figure 6 illustrates the communication overhead of EESSDA and SMART with respect to $T$ ($J = 3$). We can conclude that EESSDA is more efficient than SMART (except when $T = 1$, that is, when the scheme establishes secure channels before every run of data aggregation), and that as $T$ increases, the communication overhead of the scheme per run is reduced.

Energy Consumption.
Energy consumption involves two aspects: communication and computation. Computation involves encryption/decryption operations and modular arithmetic operations. Encryption/decryption consumes much more energy than modular arithmetic, so we consider only the cost of encryption/decryption. Figure 7 shows that EESSDA saves 45% of the energy consumed by SMART. In EESSDA, encryption/decryption occurs only during the secure channel establishment step. Each secure channel requires one encryption and one decryption; that is, $\mathrm{Num}_{dec} = \mathrm{Num}_{enc} = \frac{1}{2}\sum_{i=1}^{N} n_i$, where $n_i$ is the number of secure channels of node $N_i$. In SMART, each data transmission requires one encryption and one decryption, so the number of encryptions and decryptions is $\mathrm{Num}_{dec} = \mathrm{Num}_{enc} = N \cdot J$. Therefore, SMART consumes much more energy than EESSDA. As $J$ increases, SMART performs more encryptions and decryptions and consumes more energy, as shown in Figure 8. Figure 9 shows that EESSDA consumes much less energy than SMART, and the gap widens toward the larger values of the parameter varied in that figure. EESSDA can therefore extend the network lifetime considerably compared with SMART.
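The two operation counts can be compared directly. In this sketch, `channels_per_node` lists the number of secure channels at each node (hypothetical values), and the `runs` parameter is an added assumption that multiplies the SMART count over several aggregation runs, whereas the EESSDA count is incurred once per channel establishment.

```python
def eessda_crypto_ops(channels_per_node):
    """One encryption per secure channel; each channel is listed at both ends."""
    total = sum(channels_per_node)
    if total % 2:
        raise ValueError("each channel must be counted at both endpoints")
    return total // 2

def smart_crypto_ops(num_nodes, slices, runs=1):
    """One encryption per transmitted message, repeated in every run."""
    return num_nodes * slices * runs

channels_per_node = [2, 3, 3, 2]     # hypothetical 4-node network, 5 channels
print(eessda_crypto_ops(channels_per_node))
print(smart_crypto_ops(num_nodes=4, slices=3))
print(smart_crypto_ops(num_nodes=4, slices=3, runs=3))
```

The EESSDA count depends only on the number of channels, so it stays fixed within a period while the SMART count keeps growing with every run.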

Privacy.
To preserve privacy during data aggregation, the primitive data produced by the sensor nodes must not be disclosed to neighbor nodes or attackers. To address privacy, SMART adopts the "slicing and assembling" technique together with encryption: nodes divide their primitive data into several pieces, send the encrypted pieces to neighbor nodes, aggregate the data received from their child nodes or neighbor nodes, and then route the aggregated result to the Sink. Our EESSDA scheme uses the "secure channel" to ensure that the data are not disclosed to other nodes or attackers. In EESSDA, data privacy is ensured differently for leaf nodes and intermediate nodes. Leaf nodes use the "slicing and assembling" technique mentioned above. An intermediate node aggregates the received data with its own data to conceal its primitive data, so the data produced by an intermediate node are not disclosed to its parent node. We first evaluate the privacy of the secure channel and then analyze the privacy of leaf nodes and intermediate nodes, respectively.
(1) Secure channel: a secure channel is a common random number shared by two nodes. There are two situations in which the random number is revealed. (a) A compromised neighbor node holds the communication key and is able to decrypt the random number. From [11], the probability that the node has the communication key in its key ring is $k/K$, where $K$ is the size of the key-pool and $k$ is the size of a key ring. Moreover, at the end of each period, EESSDA re-establishes the secure channels to improve the level of security. (b) Guessing the random number: because the random number $r_{ij}$ is uniformly distributed in the range $[0, M]$, the probability of guessing it correctly is $1/M$, where $M = d_{\max} \cdot N$. Because $r_{ij}$ is uniformly distributed, $(V_{ij} + r_{ij}) \bmod M$ is also uniformly distributed in the range $[0, M]$. Therefore, when node $N_i$ sends data to node $N_j$ through a secure channel, the attacker cannot infer $V_{ij}$ by eavesdropping. The probability $p_s$ that the random number is leaked is therefore a capped product that depends on the key-sharing probability above, the average number of compromised neighbor nodes, and a security coefficient for the different runs of data aggregation during each period.
(2) Leaf node: leaf node $N_i$ slices its primitive data $V_i$ into $J$ pieces and sends $J - 1$ pieces to its neighbors through secure channels. The probability that each piece $V_{ij}$ leaks is $p_s$. Only if an attacker breaks the $J - 1$ outgoing pieces and the mixing result $V'_i$ of node $N_i$ can it recover the primitive data held by $N_i$. By the above analysis of the secure channel, the probability that an attacker breaks $J - 1$ pieces is $p_s^{J-1}$. The value $V'_i$ is aggregated from the $J$th piece and all the pieces received from other leaf nodes, so the probability that $V'_i$ leaks is $p_s^{m+1}$, where $m$ is the number of pieces sent to node $N_i$. The privacy preservation performance of a leaf node is measured by the resulting leak probability, which can be approximated in terms of these quantities and of $n_{\text{child}}$, the number of child nodes.
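The uniformity argument for a masked value can be checked exhaustively for a small modulus. The sketch below assumes residues in the range 0 to M - 1 and a toy modulus of 12; the function name `masked_distribution` is hypothetical.

```python
from collections import Counter

def masked_distribution(value, modulus):
    """Count how often each masked value occurs over all equally likely masks."""
    return Counter((value + r) % modulus for r in range(modulus))

M = 12  # toy modulus so the full distribution can be enumerated
for value in (0, 5, 11):
    counts = masked_distribution(value, M)
    # Every residue appears exactly once, whatever the hidden value is.
    print(value, sorted(counts.items()))
```

Because the distribution of the masked value is identical for every hidden value, an eavesdropper who sees only the masked value learns nothing about the data, provided the random number itself stays secret.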
Figure 10 compares the privacy preservation performance of EESSDA and SMART ($J = 3$, $T = 3$); both schemes provide good privacy. Figure 11 illustrates the privacy preservation with respect to $J$ ($T = 3$); we can conclude that the larger the value of $J$ (the number of slices into which each leaf node decomposes its primitive data), the better the privacy that can be achieved. However, a larger $J$ also yields a larger communication overhead. Therefore, although both EESSDA and SMART can achieve considerable privacy, the former has less communication overhead and energy consumption. By setting different values of $J$, EESSDA can achieve a better balance than SMART between privacy preservation and communication overhead/energy consumption.

Aggregation Accuracy.
In the ideal situation, when there is no data loss in the network, the scheme should produce 100% accurate aggregation results. However, due to collisions over wireless channels and data processing delays, data may be lost or delayed, so the aggregation accuracy may be lower than in the ideal situation. We define the accuracy metric as the ratio between the aggregation result obtained by the scheme and the real result over all individual sensor nodes. Because this paper focuses on the additive aggregation function, the accuracy is the ratio of the aggregated sum to the true sum of all individual sensor readings.
Figure 12 shows the accuracy of EESSDA and SMART ( = 3) with respect to epoch duration. The accuracy increases as the epoch duration increases. Two reasons contribute to this [11]: (1) with a longer epoch duration, the data packets sent within the epoch have less chance of colliding; (2) with a longer epoch duration, the data packets have a better chance of being delivered within the deadline. Moreover, EESSDA has better accuracy than SMART, especially when the epoch duration is small. This is because SMART spends considerable time on encryption/decryption operations, while EESSDA needs none during data aggregation. The chance of collisions is therefore lower and the probability of delivery within the deadline is higher, which improves the aggregation accuracy.
Figure 13 illustrates the aggregation accuracy of EESSDA with respect to the selection of . We can conclude that the accuracy of EESSDA is not sensitive to . However, as the value of  grows, the aggregation accuracy decreases slightly. This is mainly because when a data item is sliced into more pieces, more data packets must be sent to neighboring nodes. Hence, more collisions occur, which reduces the aggregation accuracy.
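The accuracy metric used in this subsection can be written as a short Python sketch for the additive case. The function name and the sample readings are hypothetical, and the lost packet is simulated by simply dropping one reading from the collected sum; this is an illustration of the ratio, not the simulation setup used for the figures.

```python
def aggregation_accuracy(collected_sum, readings):
    """Ratio of the sum obtained by the scheme to the true sum of all readings."""
    true_sum = sum(readings)
    if true_sum == 0:
        raise ValueError("true sum is zero; the ratio is undefined")
    return collected_sum / true_sum


# Hypothetical readings from four sensor nodes.
readings = [20, 22, 19, 21]

# Ideal case: nothing is lost, so the collected sum equals the true sum.
ideal = aggregation_accuracy(sum(readings), readings)

# Lossy case: the packet carrying the third reading collides and is lost.
lossy = aggregation_accuracy(sum(readings) - readings[2], readings)

print(ideal, lossy)
```

A value of 1.0 corresponds to the ideal situation; each packet lost to a collision or a missed deadline pulls the ratio below 1.0.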

Conclusions
Providing efficient and privacy-preserving data aggregation is a challenging problem in WSNs. We propose the EESSDA scheme for secure data aggregation in WSNs. Unlike general data aggregation schemes that preserve data privacy through encryption, EESSDA achieves data privacy through secure channels. Because EESSDA does not need encryption/decryption operations during data aggregation, it saves the energy those operations would consume, reduces the data processing time, and consequently improves the aggregation accuracy. In addition, EESSDA ensures that the network aggregates correctly when new nodes are deployed or a few nodes fail, so the scheme has good scalability. We compare the performance of EESSDA and SMART; simulation results show that EESSDA reduces communication overhead by 20% or more and energy consumption by 40% compared with SMART. Our scheme also provides higher aggregation accuracy and better scalability than SMART.
Intermediate nodes: each intermediate node establishes a secure channel with its parent or child node; that is, every pair of parent-child nodes shares a common secret random number. For example, node   establishes a secure channel with its parent node   .  selects a random number   , encrypts   using its shared key   with   , and then sends the result to   .  receives the encrypted data and gets the random number   by decrypting the data using its shared key   . Thus   is the secure channel between   and   , where   =   .
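Once a parent and a child share a random number, the child can hide its data by adding that number modulo the range of possible aggregation values, and the parent can remove it again before aggregating. The Python sketch below illustrates this for one parent with two children. It is a simplified illustration under stated assumptions: all names and the modulus are hypothetical, the shared random numbers are assumed to be already established (the encrypted key exchange is not modeled), and slices arriving from leaf nodes are omitted.

```python
import random

MODULUS = 1000  # hypothetical range of possible aggregation values


def mask(value, shared_random, modulus=MODULUS):
    """Child side: hide the value with the random number shared with the parent."""
    return (value + shared_random) % modulus


def unmask(masked, shared_random, modulus=MODULUS):
    """Parent side: remove the shared random number to recover the value."""
    return (masked - shared_random) % modulus


rng = random.Random(1)
r41 = rng.randrange(MODULUS)  # secure channel between child 4 and parent 1
r51 = rng.randrange(MODULUS)  # secure channel between child 5 and parent 1

v4, v5 = 17, 25               # hypothetical data held by the two children
sent4 = mask(v4, r41)         # what an eavesdropper sees on link 4 -> 1
sent5 = mask(v5, r51)         # what an eavesdropper sees on link 5 -> 1

# The parent strips each mask and adds the results; no decryption is needed.
v1 = (unmask(sent4, r41) + unmask(sent5, r51)) % MODULUS
print(v1)  # 42
```

The result is exact only while the true sum stays below the modulus, which is why the modulus must cover the full range of possible aggregation values.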
3 ,  6 ,  7 ,  9 }; that is,  8 →  7 : V 87 +  87 MOD ,  8 →  9 : V 89 +  89 MOD , where  is the range of possible aggregation values.
V 4 = V 4 +  41 MOD , V 5 = V 5 +  51 MOD 
Leaf nodes slices:  51 = V 51 +  51 MOD  (2)
Aggregation: V 1 = (V 4 −  41 ) + (V 5 −  51 ) + ( 51 −  51 )
(3) Intermediate node: the intermediate node   receives data sent by its child nodes or leaf nodes, aggregates these data with its own sensor data, and then sends the aggregation result to its parent. Only if an attacker breaks all incoming data of node   and the aggregation result V  of   will it be able to crack the sensor data of   .  measures the performance of the privacy preservation of an intermediate node. Consequently,   is estimated as