Spatial-Temporal Correlative Fault Detection in Wireless Sensor Networks

Wireless sensor networks (WSNs) have been used extensively in a range of applications to facilitate real-time critical decision-making and situation monitoring. Accurate data analysis and decision-making rely on the quality of the gathered WSN data. However, sensor nodes are prone to faults and are often unreliable because of their intrinsic nature or the harsh environments in which they are deployed. Using dirty data from faulty sensors not only has negative effects on the analysis results and the decisions made but also shortens the network lifetime and can waste large amounts of limited, valuable resources. In this paper, the quality of a WSN service is assessed, focusing on abnormal data derived from faulty sensors. The aim was to develop an effective strategy for locating faulty sensor nodes in WSNs. The proposed fault detection strategy is decentralized, coordinate-free, and node-based, and it uses time series analysis and spatial correlations in the collected data. Experiments using a real dataset from the Intel Berkeley Research Laboratory showed that the algorithm can give a high level of accuracy and a low false alarm rate when detecting faults, even when there are many faulty sensors.


Introduction
A wireless sensor network (WSN) consists of spatially distributed autonomous sensors. A WSN operated in self-organization and multihop mode can be used to perceive physical or environmental conditions (such as the temperature, sound level, or pressure) in a target region and to acquire, process, and transmit data describing these conditions. Wireless sensor networks were developed for military applications, but they have since been used to monitor and control industrial processes, to monitor machine health, and in other applications [1, 2].
Low-cost, large-scale sensor nodes are often deployed in uncontrolled or even harsh environments, so they are prone to developing faults and becoming unreliable. Faults may occur at different levels of the sensor network for different reasons, such as the depletion of batteries or the failure of a physical component, and faults can lead to incorrect readings being included in a dataset, packet losses during communication, and errors occurring in the middleware and software [3, 4]. From a "data-centric" point of view, faulty sensors may cause dirty data to be generated, and this may cause limited resources to be wasted, shorten the network lifetime, affect the analysis of the data and the decisions that are made, and even lead to the failure of the entire network. It is therefore desirable to exclude data from faulty sensors to ensure that the quality of the service is maintained. Faults in WSNs are defined as observations that are not consistent with the expected behaviour. Fault tolerance and detection in systems controlling machines and in distributed systems have previously been studied intensively [5-8]. Traditional detection methods cannot be directly applied to WSNs because of the limited resources available and the need for the large-scale deployment of sensors. It is challenging to develop an accurate detection method that has the characteristics required for use in WSNs.
The aim of the work presented here was to develop a system for locating faulty sensors in a WSN. We propose a localized fault detection algorithm for identifying faulty sensors. The algorithm uses spatial and temporal correlations in WSN data to define normal behaviour and then identifies faults as they occur. The main benefits of our proposed method are outlined below.
(1) The system identifies temporal outliers by performing a time series analysis at each node and then revises the results using spatial correlations with neighbouring sensor data and local majority voting. Four new, simple, and effective spatial strategies are proposed for revising the anomalies. Performance evaluations showed that different strategies and thresholds may be applicable in different situations.
(2) The localized decision-making strategies in the proposed method are aimed at decreasing the number of communications and avoiding detection status diffusion, thereby reducing energy consumption.
(3) Simulations showed that our detection algorithm does not need to be given the physical positions of the sensors and that it works well even when there are many faulty sensors.
The algorithm was found to be able to successfully identify faulty sensors even when half of the neighbouring sensors had failed or when there were few neighbouring sensors. Even if a node had no neighbouring sensors, the algorithm degenerated into a time series analysis method but still worked well. The remainder of this paper is organized as follows. The literature on fault detection in WSNs is reviewed in Section 2. The network model and data model are defined in Section 3. The proposed spatial-temporal fault detection scheme is described in detail in Section 4. Performance evaluations are presented in Section 5, and our conclusions are drawn in Section 6.

Related Work
Fault detection in WSNs is an attractive field of research. Krishnamachari and Iyengar [9] developed a Bayesian fault recognition algorithm for disambiguating binary fault features. The algorithm they developed was simple and consumed little energy, but the authors assumed that measurement faults were equally likely to occur at every sensor node and used a binary mode to represent the measurements made by the sensors, and these aspects of the algorithm may limit the scope of its applications. Ni and Pottie [10] used hierarchical Bayesian space-time modelling to detect faults. This method was more complex than the first-order linear AR modelling method, but the spatial and temporal correlations were not explicitly calculated in either method, and the existence of such correlations was simply assumed.
A distributed fault detection scheme in which the collected data were spatially correlated was proposed by Chen et al. [11]. In this method, each sensor uses its neighbours to identify its initial status, determines whether its status is "good" or "faulty" from its neighbours' statuses, and then sends its final state to the adjacent sensor nodes. The false alarm rate using this method was low when the probability of sensor faults occurring was low. Jiang [12] improved the scheme described above by defining new detection criteria and increased the fault detection accuracy for different average numbers of neighbouring nodes and node failure ratios. However, in both of the methods described [11, 12] each sensor needs to communicate with its neighbouring nodes at least three times, and such frequent communication between adjacent nodes consumes a large amount of energy. Lee and Choi [13] proposed a fault detection algorithm in which comparisons between neighbouring nodes were made and the decision made at each node was disseminated. The use of the dissemination process meant that the network had a relatively high level of energy consumption.
Some researchers have used artificial intelligence approaches to detect faults. Rassam et al. [14] assessed PCA-based anomaly detection models. Siripanadorn et al. [15] integrated self-organizing map (SOM) and discrete wavelet transform (DWT) data to detect anomalies. Nandi et al. [16] introduced two different detection probabilities and used a model selection approach, a multiple model selection approach, and Bayesian model averaging methods to solve the detection problem. Such algorithms consume relatively large amounts of resources.
Sharma et al. [17] described and analysed four qualitatively different classes of fault detection method (rule-based, linear least-squares estimation (LLSE), time series forecasting, and hidden Markov models (HMMs)) using real-world datasets. They found that each of the four method types sat at a different point on the accuracy/robustness spectrum. Their evaluation showed that no single method perfectly detected the different types of faults and that using hybrid methods could help eliminate false positives or false negatives. They also found that using two methods in sequence could allow more low-intensity faults to be detected, at the expense of a slightly higher number of false positives, than could be detected using one method alone.
Zhang et al. [18] introduced five different detection methods that were based on spatial correlations, temporal correlations, or both. Of these algorithms, the temporal and spatial outlier detection method, the spatial predicted outlier detection method, and the spatial and temporal integrated outlier detection method used both spatial and temporal correlations, but they were complex methods, and each node required rather large amounts of computing resources because it had to receive a number of parameters to allow it to predict its neighbours' observations from its previous observations. In this paper we attempt to identify simple and efficient strategies that combine spatial correlations with the time series analyses that have previously been used [17, 18] for the distributed and online detection of faults in a WSN.

Network Model and Data Model
We assume that sensors are randomly deployed or placed in predetermined locations and that every sensor has the same transmission range. Each sensor node is able to locate its neighbours within its transmission range using a broadcast/acknowledge protocol. The communication graph for a WSN can be represented as a digraph G = (V, E), in which V is the set of sensor nodes in the network and E is the set of edges connecting the sensor nodes. Two nodes v_i and v_j are said to have an edge e_ij = (v_i, v_j) in the graph if the distance between them is less than the transmission range r. In this paper, faulty sensors are defined as those for which the observations do not match the expected behaviour. Nodes with faulty sensors remain capable of receiving, sending, and processing data. Only sensor nodes with permanent communication faults (including a lack of power) are removed from the network.
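The neighbour-discovery step can be sketched as follows (a minimal sketch in Python; the function name, the coordinate dictionary, and the example ranges are illustrative assumptions, not from the paper — in a deployed WSN the neighbour table would be built with the broadcast/acknowledge protocol rather than from known positions):

```python
import math

def build_neighbour_graph(positions, tx_range):
    """Build the communication digraph G = (V, E): an edge (v_i, v_j)
    exists when the distance between the two nodes is below tx_range."""
    edges = set()
    ids = list(positions)
    for i in ids:
        for j in ids:
            if i != j and math.dist(positions[i], positions[j]) < tx_range:
                edges.add((i, j))
    return edges

# Illustrative layout: nodes 1 and 2 are mutual neighbours; node 3 is out of range.
positions = {1: (0.0, 0.0), 2: (5.0, 0.0), 3: (20.0, 0.0)}
edges = build_neighbour_graph(positions, tx_range=10.0)
```

Because every sensor is assumed to have the same transmission range, the resulting digraph is symmetric: whenever (v_i, v_j) is an edge, so is (v_j, v_i).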
We define the data model for a sensor network using spatial and temporal correlations. Individual faulty sensor nodes are not relevant to this model. A spatial-temporal correlation implies that adjacent sensor nodes make similar observations and that each node's values are similar to its previous values in the time series. We let v_i and v_j be neighbouring nodes and let

Algorithm.
The proposed fault detection algorithm is based on spatial-temporal correlations and uses the definitions given above.The algorithm can be broken down into four steps.
Step 1. The value for each node v_i is predicted from an analysis of its time series.
A node will have similar values throughout its time series, so temporal correlations can be used to construct an efficient time series model to allow changes in the values to be modelled and forecast. The multiplicative seasonal autoregressive integrated moving average (SARIMA) time series model is a general model for analysing time series. The SARIMA model can be written explicitly as [19]

φ_p(B) Φ_P(B^s) ∇^d ∇_s^D x_t = θ_q(B) Θ_Q(B^s) ε_t,

where x_t is a time series, ε_t is a residual sequence, B is the lag operator, ∇ = 1 − B is the nonseasonal difference operation, ∇_s = 1 − B^s is the seasonal difference operation, φ_p(B) and Φ_P(B^s) are the nonseasonal and seasonal autoregressive polynomials, respectively, θ_q(B) and Θ_Q(B^s) are the nonseasonal and seasonal moving average polynomials, respectively, p, q, P, and Q are the orders of the four polynomials just described, respectively, and d and D are the nonseasonal and seasonal difference orders, respectively. The time series analysis involves four substeps. (1) Model identification: the choice of a time series model is focused on the selection of the parameters p, d, q, P, D, Q, and s by observing the sample autocorrelations and sample partial autocorrelations. For example, if P = D = Q = s = 0, the model degenerates into the autoregressive integrated moving average (ARIMA) model, and if d = P = D = Q = s = 0, the model is called an autoregressive moving average (ARMA) model. (2) Parameter estimation: the historical data are used to estimate the parameters of the tentatively selected model. (3) Diagnostic checking: various diagnostic tests are used to check the suitability of the tentatively selected model. (4) Model forecasting: the selected model is used, with the previous observations, to predict the next observation.
The choice of a time series model for the sensor measurements is determined by the nature of the phenomenon being evaluated. It is possible that a complex seasonal model could be the most appropriate for making the predictions, but using a complex model means that more parameters and more computationally intensive training will be required. Determining the best-fitting time series model for the phenomena of interest is an important task for a resource-constrained WSN, but it is not the focus of this work. Our results obtained using real-world datasets showed that the simple AR(p) model used here allows faults in a time series of temperature measurements to be detected effectively.
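As a sketch of the forecasting substep, a one-step-ahead AR(2) prediction of the kind used on the temperature series might look like this (the coefficient values here are illustrative assumptions; in the paper the model parameters are estimated offline from the previous day's data):

```python
def ar2_forecast(series, phi1, phi2, c=0.0):
    """One-step-ahead AR(2) forecast:
    x̂(t+1) = c + phi1 * x(t) + phi2 * x(t-1)."""
    return c + phi1 * series[-1] + phi2 * series[-2]

temps = [18.0, 18.2, 18.4, 18.5]                     # recent temperature samples
pred = ar2_forecast(temps, phi1=1.2, phi2=-0.25, c=0.9)   # ≈ 18.5
```

Each node only needs its last two observations and the fixed coefficients, which is why the AR(p) choice suits a resource-constrained WSN.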
Step 2. Run a threshold test to determine the preliminary state of each node v_i.
Once each node v_i knows its measured value x_i(t) and predicted value x̂_i(t) at time t, we can obtain the likelihood state LS_i of node v_i using a threshold test with the temporal threshold θ1.
The state LS_i (LG for likely good, LF for likely faulty), based on temporal correlations, is a preliminary test result. Sensors can be determined to be either good or faulty in this phase.
Step 3. Compute a comparison value between each node v_i and each member of Neighbour(v_i).
A comparison test result c_ij is generated by sensor v_i based on the measurements of its neighbour v_j using a predefined threshold value θ2. Obviously, a faulty sensor can generate arbitrary measurements, and the comparison test result c_ij will then exceed θ2.
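The comparison test of Step 3 can be sketched as follows (the binary 0/1 encoding of c_ij is an assumption for illustration, since the paper's equation is not reproduced here; the underlying test compares the difference between the two readings against θ2):

```python
def comparison_result(x_i, x_j, theta2):
    """Step 3: comparison test between node v_i and neighbour v_j.
    Returns 0 when the two readings agree (|x_i - x_j| <= theta2)
    and 1 when they disagree."""
    return 0 if abs(x_i - x_j) <= theta2 else 1

comparison_result(18.4, 18.6, theta2=2.5)   # -> 0 (neighbours agree)
comparison_result(18.4, 35.0, theta2=2.5)   # -> 1 (arbitrary faulty reading)
```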
Step 4. Analyse the results using different judging rules to determine the final state of node v_i.
Each sensor sends its preliminary state to all of its neighbours. There are four possible relationships between node v_i and its neighbours, and they are defined in (2), with the numbers of neighbours in each of the four cases denoted g_i, h_i, f_i, and u_i. Obviously, g_i and h_i imply that node v_i may be good, and f_i and u_i imply that node v_i may be faulty. The relationship between node v_i and its neighbours can follow the pattern shown in Figure 1. The node is marked with its preliminary state LG or LF, and the edge is marked with the comparison test result c_ij.
The final state of node v_i depends on its neighbours' states and their comparison test results. According to the voting strategy, if the majority of the neighbours conclude that v_i is good, it is considered to be fault-free. We use the rules shown in (3)-(6) to determine the final state of node v_i and to illustrate different detection effects.

Rule 2a. Consider
The function ToFS(LS_i) converts LG (or LF) into FG (or FF). The criteria shown above share a common implied condition: if g_i = h_i = f_i = u_i = 0, then FS_i = ToFS(LS_i). This means that the final state of node v_i is based on its own preliminary state LS_i when node v_i does not have any neighbouring nodes. This condition may decrease the likelihood of false alarms occurring if the majority of the network nodes are faulty.
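The voting of Step 4 can be sketched as follows. This is only one plausible reading: the paper's rules (3)-(6) are not reproduced here, so the exact rule definitions, and in particular the `include_self` switch used below to mimic the difference between a Rule 1a-like and a Rule 1b-like vote, are assumptions for illustration. The no-neighbour fallback FS_i = ToFS(LS_i) follows the text directly.

```python
def final_state(ls_i, g, h, f, u, include_self=True):
    """Step 4: majority vote on the final state of node v_i.
    g, h: neighbour evidence that v_i is good; f, u: evidence that it
    is faulty.  With no neighbours (all counts zero) the preliminary
    state stands (FS_i = ToFS(LS_i)).  include_self is an assumed knob:
    whether the node's own preliminary state counts as one extra vote."""
    if g == h == f == u == 0:
        return "FG" if ls_i == "LG" else "FF"   # ToFS(LS_i)
    good, bad = g + h, f + u
    if include_self:
        if ls_i == "LG":
            good += 1
        else:
            bad += 1
    return "FG" if good >= bad else "FF"

# Node v_8 from the worked example: LS_8 = LG, g_8 + h_8 = 2, f_8 + u_8 = 3.
final_state("LG", 2, 0, 3, 0, include_self=True)    # -> "FG" (Rule 1a-like)
final_state("LG", 2, 0, 3, 0, include_self=False)   # -> "FF" (Rule 1b-like)
```

Under these assumptions the two variants reproduce the split decision on node v_8 described in the worked example below.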
The threshold values θ1 and θ2 and the time series model parameters are preset in each node's processing unit at the time of deployment. The consistency of the diagnosis is not checked by propagating states, because such a validation process would consume a large amount of communication or computing resources and could even cause detection errors. The fault detection algorithm described above is summarized below.

Spatial-Temporal Correlative Fault Detection Algorithm (STCFD).
We have the following.
Step 1. Consider the following.
(1) Each sensor node v_i establishes its own NeigTab(v_i) and sets the predefined thresholds θ1 and θ2.
(2) The parameters p, d, q, P, D, Q, and s are set for the time series model for the sensor nodes.
Step 2. Consider the following.
(1) The difference between the measured value for each node v_i and its predicted value at time t, |x_i(t) − x̂_i(t)|, is compared with θ1 to obtain the preliminary state LS_i. The values x_i(t) and LS_i are sent to Neighbour(v_i), and NeigTab(v_i) is updated.
Step 3. Consider the following.
(1) The differences between the value for node v_i and the values for its neighbours at time t, |x_i(t) − x_j(t)|, are compared with θ2 to obtain the comparison test results c_ij.

Step 4. Consider the following.

(1) The counts g_i, h_i, f_i, and u_i are computed from the neighbours' preliminary states and the comparison test results, and the final state FS_i of node v_i is determined using one of Rule 1a, Rule 1b, Rule 2a, and Rule 2b.

Example.
In this section we present an example to illustrate our algorithm. A partial set of sensor nodes in a wireless sensor network with some faulty nodes is shown in Figure 2. We examined nodes v_1-v_12. If two nodes are neighbours, they are connected by a line. For convenience, each edge is marked with its value c_ij and each node v_i is marked with its preliminary state LS_i.
The results of performing the detection algorithm are shown in Table 1 (the detection process and results for the algorithm used on the wireless sensor network shown in Figure 2). The final detected states and the preliminary states of the nodes were consistent except for nodes v_4 and v_8; the remaining nodes up to v_12 were found to be good, except node v_7, which was faulty. Node v_4 was found to be faulty in the time series analysis, but it was found to be good using all four of our algorithm rules because most of its neighbouring nodes believed that it was good. Node v_8 was first considered to be good but was then found to be faulty under some of the rules and good under the others. With LS_8 = LG and (g_8 + h_8) − (f_8 + u_8) = 2 − 3 = −1 < 0, node v_8 is considered to be good using Rule 1a but faulty using Rule 1b. Similarly, it is considered to be good using Rule 2a but faulty using Rule 2b. The differences between the results found using different rules may affect the detection accuracy of the algorithm, and this will be discussed in detail in Section 5.

Performance Evaluation
The proposed STCFD depends on a number of parameters (the threshold values θ1 and θ2, the average number of neighbours, and the number of faulty nodes in a target area). The performance of the STCFD was evaluated using a real-world dataset with different values for these parameters. We assumed that faults were independent of each other in the experiments. We used the detection rate (DR) and the false positive rate (FPR) to evaluate the detection performance. The DR is the ratio of the number of detected faulty sensor nodes to the total number of faulty nodes. The FPR is the ratio of the number of fault-free sensor nodes that are diagnosed as faulty to the total number of fault-free nodes. An effective detection technique should achieve a high DR and a low FPR. Here, nodes with some transient faults were treated as fault-free nodes. In the time series analysis we forecast the sensor measurements for time t + 1 from the measurements up to time t. We used EViews 6.0 for the time series analysis and MATLAB 7.0 as a computing tool.
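The two metrics can be computed directly from their definitions (a minimal sketch; the node-set sizes are illustrative):

```python
def detection_metrics(true_faulty, detected_faulty, all_nodes):
    """DR  = detected faulty nodes / total faulty nodes.
    FPR = fault-free nodes diagnosed as faulty / total fault-free nodes."""
    faulty = set(true_faulty)
    detected = set(detected_faulty)
    good = set(all_nodes) - faulty
    dr = len(detected & faulty) / len(faulty)
    fpr = len(detected & good) / len(good)
    return dr, fpr

# 10 nodes, 2 truly faulty; the detector flags both plus one good node.
dr, fpr = detection_metrics({2, 5}, {2, 5, 7}, set(range(1, 11)))
# dr = 1.0 (both faulty nodes found), fpr = 0.125 (1 of 8 good nodes flagged)
```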

Real-World Dataset from the Intel Berkeley Research Laboratory. The Intel Berkeley Research Laboratory (IBRL) dataset was used as a real-world dataset for testing the proposed model. The data were collected from 54 Mica2Dot sensors that were deployed in the IBRL between 28 February and 5 April 2004. The position of each node in the deployment area is shown in Figure 3. The Mica2Dot sensors had weather boards and collected time-stamped topology information and humidity, temperature, light, and voltage values once every 31 s. The data were collected using the TinyDB in-network query processing system, which was built on the TinyOS platform. The IBRL dataset included a log of about 2.3 million readings from these sensors [20].
The temperature dataset for two consecutive days (from 00:00 to 24:00), 28 February and 29 February, was selected from the experimental data. The data from 28 February were used to estimate the SARIMA model parameters. The fault detection methods were evaluated using the data from 29 February. The transmission range of the nodes was chosen to ensure that the sensors had different average numbers of neighbours in the simulation runs. An example of the network topology is shown in Figure 4. The relationships between the transmission range r, the average number of neighbours, and the maximum difference between neighbours are shown in Table 2. We use the transmission range r (rather than the average number of neighbours) for the mapping relationships in the following discussion.

Data Preprocessing.
The proposed STCFD is a distributed parallel algorithm, and it requires time-synchronized data as input. However, different nodes cannot collect data at exactly the same time, because the Mica2Dot sensors miss some of the data packets. The IBRL dataset could not therefore be used directly in the STCFD, and we had to preprocess the data before they were input to the algorithm. We used a smoothing window to modify the original data to keep the gathered data synchronized. The smoothing window outputs the average value, giving samples at specified time intervals. The smoothing window size could be set at 10, 20, 30, or 60 min, as required.
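The smoothing-window preprocessing can be sketched as follows (a minimal sketch; the function name and the toy timestamps are illustrative — it buckets raw (timestamp, value) samples into fixed windows and outputs one average per window, which is what keeps the per-node series synchronized):

```python
def smooth(samples, window_min=30):
    """Average raw (timestamp_min, value) samples into fixed-size
    windows: one output value per window_min-minute interval."""
    buckets = {}
    for t, v in samples:
        buckets.setdefault(int(t // window_min), []).append(v)
    return [sum(vs) / len(vs) for _, vs in sorted(buckets.items())]

raw = [(1, 18.0), (12, 18.4), (33, 18.8), (55, 19.0)]
smooth(raw)   # -> approximately [18.2, 18.9]
```

With a 30-min window, a 24-hour day yields exactly 48 synchronized samples per node, matching the series lengths used below.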
The raw IBRL data are shown in Figure 5(a), and the data processed using a smoothing window of 30 min are shown in Figure 5(b). The smoothing window was 30 min in the analysis described below. Each sensor node could acquire 48 time series samples between 00:00 and 24:00 on 29 February. Analysing the 48 samples from each sensor node on 28 February easily showed that the samples satisfied AR(2), and the desired prediction data were easily obtained. For example, the sampling data and their forecasts for node 1 on 29 February are shown in Figure 6.
Some artificial anomalies with slightly deviating statistical characteristics were inserted to allow the anomaly detection models to be evaluated using these samples. The maximum temperature in the 2592 samples from the 54 nodes on 29 February was 23.0°C and the minimum was 14.9°C. Taking the maximum difference between neighbouring sensors into account, random measured values were generated using the temperature ranges (0, 12) and (26, 40) as faults, and they were inserted into the normal dataset.
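The fault-injection step can be sketched as follows (a minimal sketch; the function name, the seeded generator, and the toy readings are illustrative assumptions — the fault ranges (0, 12) and (26, 40) are the ones described above):

```python
import random

def inject_faults(readings, node_ids, n_faulty, rng=random.Random(0)):
    """Replace the readings of n_faulty randomly chosen nodes with
    random values drawn from the fault ranges (0, 12) and (26, 40)."""
    faulty = rng.sample(sorted(node_ids), n_faulty)
    out = dict(readings)
    for nid in faulty:
        lo, hi = rng.choice([(0.0, 12.0), (26.0, 40.0)])
        out[nid] = rng.uniform(lo, hi)
    return out, set(faulty)

readings = {i: 18.0 + 0.1 * i for i in range(1, 11)}     # normal temperatures
corrupted, faulty_set = inject_faults(readings, readings.keys(), n_faulty=3)
```

Injected values always fall well outside the observed 14.9-23.0°C band, so a correctly tuned detector should flag exactly the nodes in `faulty_set`.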

Experimental Results.
The DR and FPR values produced in simulations using different parameters and rules were compared. We assumed that the network was available when the number of faulty nodes was less than half the total number of nodes in the WSN. Sensors were randomly chosen to be faulty, and the number of faulty sensors was set in the range 1-27. We will now discuss the effects of using different parameters on the algorithm in the experiments described below. Each experimental result was the average of 30 independent runs.

Experiment I: Variable θ1, θ2 = 2.5, and r = 10 m

In Experiment I we determined the faulty node detection rate and false alarm rate using different θ1 values and algorithm Rule 1a, Rule 1b, Rule 2a, and Rule 2b. The θ1 parameter was set at 0.5, 1.5, 3, 5, 7, or 9, and θ2 and r were set at 2.5 and 10 m, respectively.
As can be seen in Figures 7 and 8, increasing the number of faulty nodes caused the DR using Rule 1a and Rule 1b to decrease gradually. The DR using Rule 1a was higher when θ1 was 1.5 or 3 than when it was 0.5, 5, or 7 (Figure 7). Increasing θ1 caused the FPR using Rule 1a to decrease. The FPR decreased from 6.26% to 2.39% when θ1 was 0.5, but the FPR using Rule 1a was only 0% when θ1 was 7.
In Figure 8, larger θ1 values gave better DR values using Rule 1b but not larger FPR values. Increasing θ1 from 1.5 to 3 caused the FPR to fall sharply. Increasing θ1 from 3 to 5 and then to 7 caused the FPR to change little. Increasing θ1 to 9 caused the FPR using Rule 1b to increase rather than decrease.
It can be seen from Figures 9 and 10 that the DR using Rule 2a and Rule 2b stayed almost the same as the number of faulty nodes was changed. However, the DR using Rule 2a was affected by the θ1 value, and increasing θ1 decreased the detection rate. The DR using Rule 2a was about 99.96% when θ1 was 0.5, but the DR was only about 92% when θ1 was 7. Fortunately, increasing θ1 decreased the FPR. The FPR was approximately 0% when θ1 was 7, and when θ1 was 0.5 the FPR using Rule 2a continued to increase (from 2.2% to 3.6%).
The θ1 value hardly affected the DR using Rule 2b (Figure 10), and the DR was around 99.95% when θ1 was 0.5, 1.5, 3, 5, or even 7. However, the FPR using Rule 2b was influenced by the θ1 value. The FPR using Rule 2b was lower when θ1 was 1.5 or 3 than when it was 0.5, 5, or 7.
Two main conclusions were drawn from Experiment I. (1) The θ1 value can affect the DR and FPR using all four rules, but it is impossible to find a θ1 value that gives both a high DR and a low FPR. The θ1 value that gives an optimal effect will depend on the application. (2) The DRs found using algorithm Rule 1a and Rule 1b are related to the numbers of faulty nodes. The DRs found using algorithm Rule 2a and Rule 2b are almost independent of the numbers of faulty nodes. However, the FPRs using Rule 2a and Rule 2b increase as the number of faulty nodes increases.

Experiment II: Variable θ2

It can be seen from Figures 11 and 12 that the DRs obtained using Rule 1a and Rule 1b were similar. When the number of faulty nodes was relatively low, a relatively small θ2 value gave a relatively high DR; when the number of faulty nodes was relatively high, a relatively small θ2 value gave a relatively low DR. With larger θ2 values, the corresponding DR decreased more slowly as the number of faulty nodes increased.

The DR using Rule 1a was higher when θ2 was 1.5 than when θ2 was 3.5 when the number of faulty nodes was ≤11. The DR when θ2 was 1.5 decreased from 99.638% to 98.808%, but the DR when θ2 was 3.5 decreased from 99.384% to 98.762%. When the number of faulty nodes was >11, the DR was lower when θ2 was 1.5 than when θ2 was 3.5, and the rate decreased more quickly when θ2 was 1.5 than when θ2 was 3.5. The DR when θ2 was 1.5 decreased from 98.43% to 70.078%, but the DR when θ2 was 3.5 only decreased from 98.436% to 80.725%. It was difficult to find a θ2 value that gave a better DR than the other values for all of the different faulty node patterns.
The best θ2 value was approximately the same for the FPRs using Rule 1a or Rule 1b. The FPR was lower when θ2 was 2.5, 3.5, 4.5, or 5.5 than when θ2 was 1.5 or 9 (Figure 11(b)). The FPR was lower when θ2 was 3.5, 4.5, or 5.5 than when θ2 was 1.5, 2.5, or 9 (Figure 12(b)).

Figure 5: (a) Raw data for node 1 from 00:00 to 24:00 on 29 February and (b) time series samples for node 1 from 00:00 to 24:00 on 29 February produced using a smoothing window of 30 min.

The effects on the DR and FPR using Rule 2a and Rule 2b are clearly shown in Figures 13 and 14. Smaller θ2 values gave higher DRs (up to 100%). Larger θ2 values gave lower FPRs. The DR was about 100% when θ2 was 1.5 but only about 81.5% when θ2 was 9, and the FPR when θ2 was 1.5 increased from 3.01% to 6.20%, but when θ2 was 9 the FPR only increased from 0% to 0.88% (Figure 13).

Two main conclusions were drawn from Experiment II. (1) The best θ2 value gave optimal effects, but the optimal value depends on the application. (2) The DRs achieved using algorithm Rule 1a and Rule 1b are related to the number of faulty nodes. The DRs achieved using algorithm Rule 2a and Rule 2b are almost independent of the number of faulty nodes, but the FPRs increase as the number of faulty nodes increases.

Experiment III: Variable r, θ1 = 1.5, and θ2 = 2.5

The faulty node detection accuracies achieved using Rule 1a, Rule 1b, Rule 2a, and Rule 2b and the numbers of faulty nodes for different numbers of neighbours are shown in Figures 15 and 16. The transmission range r was set to 8 m, 10 m, and 12 m (from Table 2), implying that the average number of neighbours was 5.67, 8.19, and 10.56, respectively.
Increasing r from 8 m to 12 m caused the DRs using both Rule 1a and Rule 1b to increase but the FPRs to decrease (Figure 15). Increasing the number of faulty nodes caused both the DR and the FPR to decrease, but when r was increased the corresponding FPR decreased more slowly.
The DRs using Rule 2a and Rule 2b were almost independent of the average number of neighbours (Figure 16). Increasing the average number of neighbours caused the FPRs using Rule 2a and Rule 2b to decrease, but the FPRs increased as the number of faulty nodes increased. Even when half of the total number of sensors were faulty, the DRs using Rule 2a and Rule 2b were still about 99.95%. The FPR using Rule 2a was 0.13% when r was 8 m, 0.04% when r was 10 m, and only 0.03% when r was 12 m. The FPR using Rule 2b was 4.47% when r was 8 m, 1.22% when r was 10 m, and only 0.58% when r was 12 m.
In brief, Experiment III showed that increasing the number of neighbours may improve the detection performance of algorithm Rule 1a, Rule 1b, Rule 2a, and Rule 2b.

Experiment IV: Comparing the Accuracies of Detection Achieved Using Different Algorithms.
In the following experiments we compared the detection performances of our STCFD, a single time series model (STM) [18], and a distributed fault detection (DFD) algorithm [11].
(1) STCFD and STM. We used two scenarios to compare the performances of the STCFD and the STM. In the first scenario the STM gave a high detection rate and a high false positive rate; when θ1 = 5, the STM gave a low detection rate and a low false positive rate.
The STM DR was about 100% and the FPR remained at about 8%, and neither was affected by the number of faulty nodes (Figure 17). Increasing the number of failed nodes caused the DRs and FPRs using Rule 1a and Rule 1b to decrease nonlinearly (Figure 17(a)). The DRs decreased from about 93% to about 70%, the FPR using Rule 1a decreased from about 6.27% to 2.39%, and the FPR using Rule 1b decreased from about 7.68% to 2.56%. The DRs using Rule 2a and Rule 2b were about 99.95% and were not related to the number of faulty nodes (Figure 17(b)), the FPR using Rule 2a increased from about 2.19% to 3.63%, and the FPR using Rule 2b increased from about 2.47% to 4.96%, and both of these FPRs were less than the STM FPR (8%). Rule 1a and Rule 1b gave lower false alarm rates than the STM, but at the expense of a lower detection rate. Rule 2a and Rule 2b gave clearly lower false alarm rates than the other methods and maintained a high detection rate that was equivalent to that achieved using the STM.
It can be seen from Figure 18 that the FPR using the STM was 0% but that the DR using the STM was only about 98%. The DR using Rule 1a decreased from 98.11% to 75.76%, and the DR using Rule 2a was about 98%. Neither Rule 1a nor Rule 2a improved the detection accuracy, because their DRs were not higher than the DR achieved using the STM.
The DR using Rule 2b was about 99.95%, which was about 2% higher than was achieved using the STM. The FPR using Rule 2b only increased from 0.18% to 1.62%. When the number of faults was <16, the DR using Rule 1b was higher than was achieved using the STM (by 1.90% to 0.03%), and the corresponding FPR was also higher than was achieved using the STM (by 0.18% to 0.01%). These results imply that the FPRs increased less than the DRs using algorithm Rule 2b and Rule 1b relative to the STM, so Rule 2b and Rule 1b gave better detection performances than the STM. Experiment IV showed that an STCFD rule can be chosen that will increase the detection rate or decrease the false positive rate under different conditions, improving the detection accuracy over that achieved using the STM.
(2) STCFD and DFD. Different types of parameters are available, so it is difficult to compare the different algorithms exactly. We chose relatively moderate values for the appropriate parameters so that we could compare the STCFD and DFD algorithms in general terms. We used θ1 = 1.5 and θ2 = 2.5. The DRs and FPRs achieved using the STCFD and DFD methods, plotted against the numbers of faulty sensors, are shown in Figure 19. The DRs using the DFD, Rule 1a, and Rule 1b decreased as the number of faulty nodes increased. When the number of faulty nodes was <12, the DR using the DFD was >99.0% and was slightly higher than or at least equal to the DRs achieved using Rule 1a and Rule 1b. When the number of faulty nodes increased from 12 to 24, the DR using the DFD decreased sharply until it fell below 70%, but the DRs using Rule 1a and Rule 1b only decreased to about 84%. Some sensors may not have had enough neighbours for a correct and complete analysis to be performed, so some faulty sensors were not diagnosed as faulty using the DFD. Increasing the number of faulty sensors caused the FPR using the DFD to increase from 0.65% to 2.4% and then to decrease to 1.62%.
The DRs using Rule 2a and Rule 2b were higher than the DR using the DFD, and were still about 99.95% when 27 sensors were faulty. Furthermore, the FPR using Rule 2a increased only from 0% to 0.06%, and the FPR using Rule 2b increased from 0.22% to 1.20%. It is clear that our STCFD gives better fault detection accuracy than the DFD, especially when the number of faulty nodes in the target network is high.
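The DFD-style spatial comparison discussed above flags a node whose reading disagrees with most of its neighbours, which is why its accuracy collapses when many neighbours are themselves faulty. A minimal sketch of such a neighbour-majority test, assuming a single scalar reading per node (the names and the exact majority criterion are illustrative, not the paper's precise DFD rule):

```python
def spatial_vote(node_reading, neighbour_readings, theta2):
    """Return True when the node looks faulty to its neighbourhood.

    A node "agrees" with a neighbour when their readings differ by at
    most the spatial threshold theta2; a node that agrees with fewer
    than half of its neighbours is tentatively flagged as faulty.
    """
    agree = sum(abs(node_reading - r) <= theta2 for r in neighbour_readings)
    return agree < len(neighbour_readings) / 2.0
```

With many faulty neighbours the majority itself becomes unreliable, which matches the sharp DR drop observed for the DFD beyond 12 faulty nodes.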

Complexity Analysis.
There are usually strict constraints on the resources available when WSN algorithms are run, and this is a significant difference from other scenarios in which such algorithms are used. A very accurate but resource-hungry method is hardly applicable to WSNs. We therefore analysed the complexity of the STCFD method, including its communication, computation, and memory demands.
The computational complexity was found to depend mainly on the time series model used to calculate the forecasts and comparison statistics for each node and its spatial neighbours. Different SARIMA models give different levels of complexity. For the AR(p) model used in our experiment, the computational complexity of one forecast was O(p), where p is the number of regression coefficients. The maximum computational complexity at each node was therefore O(p + c·n), where c is a constant and n is the number of neighbours. The communication complexity of the STCFD was low: in each run period, each node sent only one of its own observations to its neighbours to allow the spatial comparison to be performed, which is fewer transmissions than are required by previous algorithms [11][12][13]. The memory complexity came mainly from keeping previous observations and algorithm parameters (the selected SARIMA parameters, the threshold values θ1 and θ2, and so on) in memory. This may be represented as O(p + m), where m is the number of variables required. The overheads involved in storing the temporal and spatial correlation parameters were found to be negligible.
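The O(p) cost attributed to the autoregressive forecast can be seen directly in a one-step AR(p) forecast, which is just a length-p dot product over the most recent observations. A minimal sketch (names are illustrative):

```python
def ar_forecast(history, coeffs, intercept=0.0):
    """One-step AR(p) forecast: z_hat = c + sum_k(phi_k * z_{t-k}).

    history: past observations, oldest first.
    coeffs:  AR coefficients phi_1..phi_p (phi_1 weights the most
             recent observation), so the loop runs p times -> O(p).
    """
    p = len(coeffs)
    recent = reversed(history[-p:])  # most recent observation first
    return intercept + sum(phi * z for phi, z in zip(coeffs, recent))
```

Running this once per sampling period at each node, plus one comparison per neighbour, gives the O(p + c·n) per-node bound stated above.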

Conclusions
In this paper we present a method for detecting distributed faults in coordinate-free WSNs. The method is based on spatial and temporal correlations in WSN data. Firstly, we used a time series analysis method to determine a preliminary detection state for each node. Secondly, we used the spatial correlations of adjacent nodes to obtain a comparison test result for the neighbours of each node. Finally, we used four rules (Rule 1a, Rule 1b, Rule 2a, and Rule 2b) to determine the final detection results. Our algorithm involves three parameters: the temporal threshold θ1, the spatial threshold θ2, and the average number of neighbours.
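The three-stage decision summarized above can be sketched as follows. The fusion step shown is one illustrative rule (a temporal alarm confirmed by a neighbour majority), not the exact form of Rule 1a/1b/2a/2b, which are defined earlier in the paper; all names are hypothetical.

```python
def temporal_flag(observation, forecast, theta1):
    """Stage 1: preliminary state from the time series forecast.

    The node is tentatively faulty when its observation deviates from
    the model forecast by more than the temporal threshold theta1.
    """
    return abs(observation - forecast) > theta1

def fuse(temporal_is_faulty, neighbour_flags):
    """Stages 2-3: combine the temporal state with neighbour tests.

    neighbour_flags: booleans, True when a neighbour's comparison test
    also indicates a fault at this node. Here a node is declared
    faulty only when the temporal test fires and at least half of the
    neighbour tests agree.
    """
    support = sum(neighbour_flags)
    return temporal_is_faulty and support >= len(neighbour_flags) / 2.0
```

Requiring spatial confirmation is what keeps the false positive rate low when a single forecast is occasionally wrong.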
Experimental results obtained using a real dataset led to the following conclusions. (1) The detection rate decreases as the number of faulty nodes increases. (2) The values of θ1 and θ2 depend on the specific application, and the larger the average number of neighbours each node has, the better. (3) Overall, our localized fault detection algorithm gives a high detection accuracy and a low false alarm rate even when there is a large set of faulty sensors, and it gives better detection accuracies than the STM and DFD algorithms.
Our algorithm is promising, and we intend to extend it to see how it behaves in extremely large deployments. In further research and simulations, we will apply our proposed algorithms in real situations, focusing on the following four aspects: (1) using more complex data structures, with seasonal, cyclical, and other trends; (2) inserting different fault types (short faults, noise faults, and constant faults) or fault intervals into real data for detection tests; (3) using different SARIMA parameter values for the forecast for each node, rather than the uniform values used in the experiments described above; and (4) optimizing the analysis process to decrease the complexity of the algorithm.

Figure 1: Relationships between a node v_i and its neighbours.

Figure 2: Illustration of a fault detection algorithm for a wireless sensor network.

Figure 3: Sensor nodes deployed to collect the Intel Berkeley Research Laboratory dataset.

Figure 6: Sampling data and forecasts for node 1 on 29 February (using a smoothing window of 30 min).

Figure 18: (a) Detection rate (DR) and false positive rate (FPR) for the single time series model (STM) and for Rule 1a and Rule 1b when θ1 = 5, θ2 = 2.5, and the transmission range r = 10 m, and (b) DR and FPR for the STM and for Rule 2a and Rule 2b when θ1 = 5, θ2 = 2.5, and r = 10 m.

Figure 19: (a) Detection rate (DR) and false positive rate (FPR) for the distributed fault detection method (DFD) and for Rule 1a and Rule 1b, and (b) DR and FPR for the DFD and for Rule 2a and Rule 2b.
at times t and t+1, respectively. The conditions that need to be satisfied by v_i and v_j are then |d_ij(t)| ≤ θ1 and |d_ij(t) − d_ij(t+1)| ≤ θ2, in which θ1 and θ2 may vary depending on the application.
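The pairwise consistency test described above bounds both the difference between two neighbours' readings at one time step and the change in that difference over consecutive steps. A minimal sketch of such a check, assuming the differences have already been computed (names are hypothetical):

```python
def pair_consistent(d_t, d_t1, theta1, theta2):
    """Pairwise neighbour test over two consecutive time steps.

    d_t, d_t1: the difference between the two nodes' readings at the
               current and the next time step.
    theta1:    bound on the instantaneous difference.
    theta2:    bound on how much that difference may change.
    Returns True when the pair looks mutually consistent.
    """
    return abs(d_t) <= theta1 and abs(d_t - d_t1) <= theta2
```

Bounding the change as well as the difference itself tolerates a constant offset between two healthy sensors while still catching a reading that drifts away from its neighbour.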

Table 2: Relationships between the transmission range r, the average number of neighbours, and the maximum difference between neighbours.