Distributed networked localization using neighboring distances only through a computational topology control approach

For large-scale wireless sensor networks, the nonlinear localization problem, where only neighboring distances are available to each individual sensor node, has attracted great research attention. In general, distributed algorithms for this problem are prone to failures in which the localization becomes trapped in local minima. Focusing on this issue, this article proposes a fully distributed algorithm built on a novel mechanism, in which each individual node is allowed to computationally interact with a random subset of its neighbors, to help the localization escape from local minima. Theoretical analyses reveal that, under the proposed algorithm, any local minimum of the localization is unstable, and the global optimum is finally achieved with probability 1 after sufficiently many iterations. Numerical simulations are also given to demonstrate the effectiveness of the algorithm.


Introduction
Wireless sensor networks have attracted great attention, since a large number of simple sensor nodes working cooperatively can bring plenty of advantages. With the increasing requirement for cooperation among sensors, the problem of node localization arises, since position information plays a crucial role in complex collaboration. However, as the network scale grows, manually configuring positions for each node becomes impractical, and equipping all nodes with high-precision Global Navigation Satellite System (GNSS) receivers is not only costly but also strongly environment dependent. Nowadays, it is widely believed that localizing nodes using relative measurements is of great benefit to large-scale sensor networks. Among the varieties of inter-node information, the relative distance is favored for its low cost, unlimited field of view, and high precision. Therefore, plenty of research interest has been focused on localization using relative distances. However, the main issue troubling this research is that, even in the absence of measurement noise, distance-based localization is a nonconvex optimization problem with many local minima.
In general, node localization algorithms can be classified into two categories: centralized ones and distributed ones. [1][2][3][4][5][6] The centralized ones mainly include elaborately designed heuristic optimization algorithms, such as the genetic algorithm, the simulated annealing algorithm, ant colony optimization, and particle swarm optimization. Various model-based nonlinear fitting algorithms are also included, 7 where a central model of the relative structure with respect to node positions is used for collecting and fitting all the distance measurements among nodes. However, centralized algorithms rely completely on global information and centralized computation, place heavy requirements on network conditions, and lack feasibility and flexibility with respect to network scale, 8 despite the fact that they can obtain an excellent solution after sufficiently many iterations.
In contrast, distributed algorithms behave in a decentralized manner without using any global information. Distributed algorithms can be further divided, according to whether the localization process is simultaneous for all nodes or not, into incremental algorithms and concurrent algorithms. The former locate nodes sequentially in an incremental way, where positions are solved starting from the group of anchor nodes and proceeding from nearby nodes to those farther away. In each iteration, any node whose position has been solved is designated as an anchor node. [2][3][4][5] These algorithms are simple to understand and easy to implement. However, they are error-prone, because errors accumulate along the solving paths. This drawback can cause a fatal localization crash for large-scale networks. In contrast, the latter solve positions for all nodes simultaneously, where each individual node in the network updates its position estimate synchronously in each iteration following the same protocol. [9][10][11][12][13][14][15][16][17][18][19][20][21][22][23] In general, concurrent algorithms are more competent for large-scale networks and have become the research emphasis in distributed localization problems.
In the studies of concurrent algorithms, many works focus on the problem of localization refinement, where rough position estimates are to be refined into more accurate ones. [12][13][14][15][16] Toward this goal, the position estimate of each node is often considered as a distribution rather than a single point; nodes then refine their position estimates by modeling the distribution characteristics, intersecting the neighboring distributions, and fusing data within the intersections. If models of both the localization process and the noises are available, estimates can be refined by making the noisy measurements best fit the model from the perspective of probability distributions, and a refined localization result with extremely high precision can thus be obtained. However, the refinement is deeply restricted by two major assumptions, without which the process would fail. First, the models are assumed to be exact enough to describe the actual signals. Second, rough position estimates are assumed to be known a priori to the nodes. To obtain a proper starting point for the refinement localization, research turns to the study of lost-in-space localization, where nodes have no prior knowledge of positions and need to produce position estimates from a large solution space. Toward this goal, position estimates can hardly be treated as distributions, since the initial error of the starting points would probably exceed the assumed bounds of the probability distribution models. Generally, most research treats the position estimate as a single point instead. 9,11,[17][18][19][20]22,23 In the studies by Priyantha et al., 9 Howard et al., 18 and Gotsman and Koren, 19 a distributed localization algorithm, which can be referred to as the mass-spring relaxation (MSR) algorithm, is developed using the gradient descent method, guaranteeing convergence to local optima.
MSR regards the estimates of node positions as distinct virtual masses, and the resulting distances among masses as springs. Depending on the error between the resulting distance and the measured one, every spring is recognized as either ''compressed'' or ''stretched.'' Along the gradient direction, a compressed spring provides a repulsive force that pushes two neighboring nodes apart, and a stretched spring provides an attractive force that pulls the nodes together. Virtual masses thus cooperatively keep moving under the spring forces until they reach a balanced assignment, which is regarded as the solution of the localization. However, a balanced assignment under MSR may correspond only to a local minimum, because the distance-based localization optimization is highly nonconvex even in the absence of measurement noises. 9,17,22 To solve the local-minima issue, varieties of methods have been considered. For example, the information of hop-counts among nodes can be used in advance to generate an excellent starting point that is close to the global minimum, for the subsequent localization process using the MSR algorithm. 9 But the starting point is not always good enough, especially when the network connectivity is large. The information of additional mobile anchor nodes is also helpful, because varying distances can be introduced to save the localization from local minima. [24][25][26] But this method brings difficulties in engineering implementation as well. Without requiring any additional information, Zhu and Ding 11 proposed an MSR-based ''pull-only'' algorithm, where only the stretched springs are retained and the compressed ones are ignored. But this method requires the problem to be convex, that is, all ordinary nodes need to lie inside the convex hull of the anchor nodes. Naraghi-Pour and Rojas 22 proposed an observer for monitoring localization errors and used an error-triggered strategy.
Under this strategy, active disturbance noises are additionally applied to the position estimates whenever the observed localization errors are large. This method is suitable for dealing with nonconvex problems, but it brings additional issues in designing the toggle conditions and generating the noises. Besides, the global optimum would be unstable under this method, since the toggle conditions cannot be exact in the presence of uncertain measurement noises. We should point out that, for the sake of brevity, our survey of previous work here is far from complete and contains only the work most related to ours. For a more complete survey, we refer to the studies by Yassin et al., 2 Paul and Sato, 6 and Naraghi-Pour and Rojas. 22 Focusing on the local-minima issue in distance-based localization problems, this article proposes a concise topology-controlled mass-spring relaxation (TCMSR) algorithm that allows each node to computationally interact with only a random subset of its neighbors, to help the localization escape from local minima. The significant contributions of the algorithm are fourfold: (1) the global minimum remains stable under the algorithm, while any local minimum does not; (2) the algorithm finally achieves the global optimum with probability 1 after sufficiently many iterations; (3) no additional information is needed to assist the localization; on the contrary, less information is used by the algorithm; (4) the algorithm addresses the issue at the topology control level only and does not conflict with other existing algorithms, meaning that TCMSR can be used to improve these existing algorithms.
The rest of this article is organized as follows: Section ''Preliminaries'' reviews the preliminaries of the graph and the traditional mass-spring algorithm. Section ''TCMSR algorithm'' formally proposes the TCMSR algorithm along with theoretical analyses and discussions. In section ''Simulations,'' numerical simulations are performed to show the effectiveness of the proposed algorithm. Finally, concluding remarks are given in section ''Conclusion.''

Notation and definition
Consider a network consisting of N ordinary nodes and M anchor nodes at distinct positions in some physical region, where an anchor node is aware of its own position while an ordinary node is not. Labeling the ordinary nodes by 1, 2, . . . , N and the anchor nodes by N + 1, N + 2, . . . , N + M, we denote the ordinary node set by N = {1, 2, . . . , N} and the anchor node set by M = {N + 1, N + 2, . . . , N + M}. The interaction relationship among the nodes can be modeled by a directed graph G. A directed edge e_ij in G means that node i can receive information from node j, but not necessarily vice versa. To indicate the existence of the edges, the adjacency matrix A = [a_ij] associated with G is defined such that a_ij = 1 if the edge e_ij exists and a_ij = 0 otherwise. The Laplacian matrix L = [l_ij] associated with G is defined such that l_ii = Σ_{j∈N∪M, j≠i} a_ij, while l_ij = −a_ij for all i ≠ j. We say node j is a neighbor of node i if a_ij = 1 and denote the neighbor set of node i by N_i ⊆ N∪M, such that a_ij = 1 for all j ∈ N_i. Accordingly, |N_i| denotes the number of neighbors of node i. The neighbor relationship j ∈ N_i does not imply i ∈ N_j in a directed graph, but note that A is symmetric for undirected graphs. A graph is said to contain a directed spanning tree if there is a directed path that starts from an anchor node in the graph and connects all the other ordinary nodes. As an example, Figure 1 illustrates the sketch of a directed sensor network that contains directed spanning trees.
Denote by r*_i the actual position of each node i ∈ N∪M with respect to a certain coordinate system, and suppose some mechanism exists through which each ordinary node can measure the distances to its neighbors. Denote the directed distance measurements by d_ij for each ordinary node i ∈ N and its neighbors j ∈ N_i, such that d_ij = |r*_i − r*_j| + e_ij, with e_ij being a small bounded measurement noise.
The localization problem can be then defined in the following.
Localization problem. Given a sensor network with an undirected graph G containing a collection of N ordinary nodes and M anchor nodes, and the distance measurements of each node to its neighbors, the goal is to produce a set of position estimates r_i that are consistent with the actual positions of all ordinary nodes, that is, an assignment of points r_i such that r_i = r*_i for i ∈ N. It is worthwhile to mention that although the assignment might not be unique in some cases where specific mirror symmetry occurs or the network is incompletely observable, 27 the localization problem in this article only targets the case with sufficient observability and a unique embedding assignment.

MSR algorithm
The MSR algorithm is recalled here in detail in order to make its main principle clear. Although the MSR algorithm has different aliases (e.g. the anchor-free localization algorithm) in the research literature, 9,[17][18][19][20] its principle remains the same. In the MSR algorithm, the nodes are treated as distinct virtual masses, and the resulting distances among the masses are treated as a group of springs that connect neighboring nodes. Besides, the node position estimates are considered as the real-time positions of the masses, and the distance measurements are considered as the unstressed lengths of the springs. For every mass i in such a virtual first-order mass-spring system, nonzero repulsive or attractive velocities contributed by its neighbors act on it immediately as long as the springs are deformed, that is, whenever the resulting distances are not equal to the unstressed lengths. The resultant velocity of each mass is therefore obtained by combining all the component velocities contributed by its neighbors, and it drives the mass to move. The motion process continues until the nodes reach a balance, which is regarded as the solution to the localization problem.
In mathematical form, the MSR algorithm 9,17-20 updates the position estimate of each node following

r_i^[k+1] = r_i^[k] + g · Σ_{j∈N_i} a_ij · n_ij · (|r_i^[k] − r_j^[k]| − d_ij)    (1)

where the superscript [k] indicates the moment k; |·| is the vector norm operator; g is a proper positive scalar gain; r_i, a_ij, and d_ij are defined the same as before; and n_ij is the unit direction from r_i to r_j, given by n_ij = (r_j − r_i)/|r_j − r_i|.
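As an illustration, the synchronous update of equation (1) can be sketched in a few lines of Python. The data layout (dicts keyed by node id) and the gain value are illustrative assumptions, and anchor handling is omitted for brevity.

```python
import math

def msr_step(r, neighbors, d, g=0.05):
    """One synchronous MSR iteration (equation (1), sketch).

    r         : dict node -> (x, y) current position estimate
    neighbors : dict node -> list of neighbor ids (a_ij = 1)
    d         : dict (i, j) -> measured distance d_ij
    g         : small positive scalar gain
    """
    r_next = {}
    for i, ri in r.items():
        vx = vy = 0.0
        for j in neighbors[i]:
            rj = r[j]
            dist = math.hypot(rj[0] - ri[0], rj[1] - ri[1])
            if dist == 0.0:
                continue  # skip the zero-denominator singularity
            # unit direction n_ij from r_i to r_j
            nx, ny = (rj[0] - ri[0]) / dist, (rj[1] - ri[1]) / dist
            # component velocity v_ij = n_ij * (|r_i - r_j| - d_ij)
            e = dist - d[(i, j)]
            vx, vy = vx + nx * e, vy + ny * e
        r_next[i] = (ri[0] + g * vx, ri[1] + g * vy)
    return r_next
```

Iterating this step for a compressed two-node spring drives the pair apart until the estimated distance matches the measurement.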

Performance metric for localization errors
Global energy ratio. The global energy ratio (GER) is used to indicate the structural error of the localization; it is defined in Priyantha et al. 9 as a normalized root-sum-square of the relative errors between estimated and actual inter-node distances over all node pairs. The smaller the GER, the better the localization performance. In the absence of measurement noises, the global minimum of the localization always corresponds to a GER value near 0.
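Assuming GER is the normalized root-sum-square of relative pairwise distance errors (a sketch in the spirit of Priyantha et al.; 9 the exact normalization used there may differ), it can be computed as:

```python
import math

def global_energy_ratio(est, true):
    """Sketch of a GER-style structural-error metric.

    est, true : dict node -> (x, y) estimated / actual positions.
    Sums the squared relative error of every pairwise distance, then
    normalizes by the number of pairs. Assumed form, not the verbatim
    definition from the reference.
    """
    nodes = sorted(est)
    n = len(nodes)
    total = 0.0
    for a in range(n):
        for b in range(a + 1, n):
            i, j = nodes[a], nodes[b]
            d_est = math.dist(est[i], est[j])
            d_true = math.dist(true[i], true[j])
            total += ((d_est - d_true) / d_true) ** 2
    pairs = n * (n - 1) / 2
    return math.sqrt(total) / pairs
```

Because the metric depends only on pairwise distances, it is invariant to rigid translations of the whole assignment.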

TCMSR algorithm
In this part, we first develop a useful interacting mechanism for each node, then give a brief explanation of the role of the mechanism, and finally present the TCMSR algorithm.

Interacting mechanism
The mechanism is established such that, for all nodes in a sensor network, interaction no longer follows the physical topology itself but rather computational topologies that are produced upon it. Figure 2 illustrates the way of producing computational topologies upon the physical topology, where a series of controlled switches are imaginarily mounted on all the edges of the physical topology. Note that each undirected edge is considered as two independent directed edges and is mounted with two independent switches. For example, suppose that nodes i and j share an undirected edge in the physical topology; then we regard the edge as two oppositely directed edges e_ij and e_ji, and two independent switches, termed s_ij and s_ji, are mounted on the respective directed edges. The connectivity of each directed edge e_ij relies on the switch value s_ij. With s_ij taking different values (0 and 1) for any i and j, corresponding computational topologies can be produced.
Denote by G̃ the graphs of the computational topologies and by Ã = [ã_ij] the adjacency matrix associated with G̃, where

ã_ij = s_ij · a_ij    (2)

In this article, s_ij is a Boolean variable that is dynamically determined by

s_ij = 0 if rand() < t_i, and s_ij = 1 otherwise    (3)

where rand() is a random function that generates random float numbers within (0, 1), and 0 < t_i < 1 is a proper parameter. Note that the greater t_i is, the higher the probability for s_ij to take the value 0. The interacting mechanism developed in this article can then be stated as the rule that nodes interact following G̃ instead of G.
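A minimal sketch of the switch draw of equation (3); the function name and the use of Python's standard generator in place of rand() are illustrative assumptions.

```python
import random

def computational_neighbors(neighbors_i, t_i, rng=random):
    """Draw the random computational neighbor set O_i (equation (3), sketch).

    Each switch s_ij is 0 with probability t_i and 1 otherwise, so the
    expected computational degree is E[|O_i|] = (1 - t_i) * |N_i|.
    """
    return [j for j in neighbors_i if rng.random() >= t_i]
```

With t_i = 0 the full physical neighbor set is kept (TCMSR degrades into MSR); larger t_i prunes more edges on average.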

Brief explanation
Although the introduction of the mechanism seems trivial, its nontrivial contribution lies exactly in that it can effectively save the localization from being trapped in local minima. To explain this, we look into equation (1) and observe what happens at local minima. For brevity, we refer to the term n_ij · (|r_i − r_j| − d_ij) in equation (1) as v_ij, the component velocity acting on node i contributed by node j.
First of all, we explain the exact reason why traditional MSR may be trapped in local minima. Equation (1) expects that, for any node i, the estimate is driven by the component velocities until it arrives at the correct position, where v_ij = 0 holds for all j. However, equation (1) reveals that this equilibrium is not unique: the estimate also stops moving whenever the component velocities merely cancel out. This situation, to the best of the authors' knowledge, causes the occurrence of local minima. Indeed, a local minimum occurs exactly when Σ_{j∈N_i} v_ij = 0 while Σ_{j∈N_i} |v_ij| ≠ 0. To help better understand this, we take Figure 6 as an instantiation of a local-minimum-trapped localization. Without loss of generality, focus on the specific node i = 25, whose component velocities are listed in Table 1. Clearly, not all the component velocities equal 0, while their summation approximately equals 0. As a result, the estimate r_i cannot be refined any further using equation (1), implying that the estimation has fallen into a local minimum.
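The trap described above can be reproduced numerically. The geometry below (one node midway between two symmetric neighbors, both springs equally compressed) is a hypothetical instance constructed for illustration, not the configuration of Table 1.

```python
import math

# Node i sits at the origin between two neighbors placed symmetrically.
# Both measured distances are 2 while both estimated distances are 1,
# so each spring is compressed by the same amount and the two component
# velocities cancel although neither of them is zero.
ri = (0.0, 0.0)
springs = {(1.0, 0.0): 2.0, (-1.0, 0.0): 2.0}  # neighbor position -> d_ij

def v(ri, rj, d):
    """Component velocity v_ij = n_ij * (|r_i - r_j| - d_ij)."""
    dist = math.dist(ri, rj)
    nx, ny = (rj[0] - ri[0]) / dist, (rj[1] - ri[1]) / dist
    return (nx * (dist - d), ny * (dist - d))

vs = [v(ri, rj, d) for rj, d in springs.items()]
resultant = (sum(x for x, _ in vs), sum(y for _, y in vs))
total_magnitude = sum(math.hypot(x, y) for x, y in vs)
# The resultant velocity vanishes while total_magnitude stays at 2:
# node i is stuck although both springs remain deformed.

# Keeping only one random computational neighbor (|O_i| = 1) leaves a
# nonzero resultant, which destabilizes the spurious equilibrium.
one_sided = v(ri, (1.0, 0.0), 2.0)
```

The last line is exactly the effect the proposed mechanism exploits: any strict subset of the counteracting neighbors breaks the cancellation.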
Next, we explain why our mechanism can easily break the trap. By selecting a proper t_i < 1, there is a positive probability for some s_ij to be zero. As a result, the event O_i ≠ N_i happens with positive probability. Since O_i may be any subset of N_i, there always exists a positive probability for some O_i to make the resultant velocity of node i nonzero, that is, Σ_{j∈O_i} v_ij ≠ 0. Therefore, the local-minimum equilibrium immediately becomes unstable, and hence the localization gets a chance to jump out of the trap. This explanation briefly shows the validity of our mechanism; a more detailed proof is provided in the following section.

Formal proposal of the algorithm
The TCMSR algorithm can be directly proposed as

r_i^[k+1] = r_i^[k] + g_i · Σ_{j∈N_i} ã_ij · n_ij · (|r_i^[k] − r_j^[k]| − d_ij)    (4)

where [k], |·|, r_i, a_ij, d_ij, and n_ij are defined as in equation (1), ã_ij = s_ij · a_ij, s_ij is determined by equation (3), and the time-varying gain g_i satisfies 0 < g_i < 1/|O_i|, with |O_i| = Σ_{j∈N_i} ã_ij being the number of computational neighbors of node i at each time slice.
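A sketch of one TCMSR iteration per equation (4), combining the switch draw of equation (3) with the spring update. The data layout, the anchor handling, and the tolerance u in the gain are illustrative assumptions.

```python
import math
import random

def tcmsr_step(r, neighbors, d, anchors, t, u=0.01, rng=random):
    """One synchronous TCMSR iteration (equation (4), sketch).

    Same spring forces as MSR, but each ordinary node i interacts with a
    random computational neighbor subset O_i drawn via equation (3) and
    uses a gain g_i = 1/(|O_i| + u) < 1/|O_i|.
    """
    r_next = dict(r)
    for i in r:
        if i in anchors:
            continue  # anchors know their positions and do not move
        O_i = [j for j in neighbors[i] if rng.random() >= t[i]]  # eq. (3)
        if not O_i:
            continue
        g_i = 1.0 / (len(O_i) + u)
        vx = vy = 0.0
        for j in O_i:
            dist = math.dist(r[i], r[j])
            if dist == 0.0:
                continue  # skip the zero-denominator singularity
            nx, ny = (r[j][0] - r[i][0]) / dist, (r[j][1] - r[i][1]) / dist
            e = dist - d[(i, j)]  # deformation of spring (i, j)
            vx, vy = vx + nx * e, vy + ny * e
        r_next[i] = (r[i][0] + g_i * vx, r[i][1] + g_i * vy)
    return r_next
```

For example, an ordinary node ranged to two anchors converges to a position consistent with both distance measurements despite the randomly switching neighbor subsets.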
Before going on, some lemmas are given as follows.
Lemma 1. Let S_1, S_2, . . . , S_k be a finite set of stochastic, indecomposable, and aperiodic 28,29 (SIA) matrices. Then, for any infinite product of matrices drawn from this set, there exists a column vector y such that lim_{m→∞} S_{i_m} · · · S_{i_2} S_{i_1} = 1y^T, with 1 being the column vector of all ones. 28

Lemma 2. If the union of the set of simple graphs {G_1, G_2, . . . , G_m} ⊆ G has a spanning tree, then the matrix product D_m · · · D_2 D_1 is SIA, where D_i is a stochastic matrix corresponding to each simple graph G_i. 28

Lemma 3. Any local minimum will be unstable under the algorithm (equation (4)).

Proof. The proof can be performed simply by reduction to absurdity. Suppose that there exists a stable local minimum at time slice k. This implies that Σ_{j∈O_i^[k]} v_ij = 0 holds for all i at every subsequent time slice, while Σ_{j∈N_i} |v_ij| ≠ 0 for some i. However, since O_i is drawn randomly, there must come a time slice k' at which Σ_{j∈O_i^[k']} v_ij = 0 no longer holds for all i, with Σ_{j∈O_i^[k']} v_ij ≠ 0 for some i. This conflicts with the assumption that the local minimum is stable. Therefore, the proof is completed by concluding from the absurdity.

Lemma 4. The global minimum will stay stable under the algorithm (equation (4)).
Proof. For the global minimum, v_ij = 0 holds for all i and j. Therefore, for any time slice and any O_i, it always holds that Σ_{j∈O_i} v_ij = 0, which implies that the global minimum stays stable. This completes the proof.
With these lemmas, the following theorem can be studied.
Theorem 1. For a localization problem with an initial guess as the starting point, the global minimum of the localization can be finally achieved with probability 1 using the TCMSR algorithm (equation (4)) after sufficiently many iterations.
Proof. First of all, we show that the algorithm can converge at least to some minima. At any time, each node i has a current estimate r i on its position. Each node i also periodically sends this position estimate to all its neighbors. Therefore, each node knows its own estimated position and the estimated positions of all its neighbors. Using these position estimates, each node i calculates the estimated distance jr i À r j j to each neighbor j. It also knows the measured distance d ij to each neighbor j.
The difference between the estimated distance and the measured distance immediately generates either an attractive or a repulsive velocity v_ij = n_ij · (|r_i − r_j| − d_ij) that acts on node i in the direction n_ij, with n_ij = (r_j − r_i)/|r_j − r_i| being the unit vector from r_i toward r_j. Note that each neighbor j contributes a component velocity v_ij to node i only when s_ij = 1, and the resultant velocity v_i acting on each node i is the composition of all the contributed component velocities.
The structural energy E_ij of nodes i and j due to the difference between the measured and estimated distances is the square of the difference, that is, E_ij = (|r_i − r_j| − d_ij)^2, and the total energy of node i equals E_i = Σ_j E_ij. The energy E_i of each node i reduces when it moves by a small amount Δr_ij = ã_ij · g_i · v_ij in the direction of the velocity v_ij. Likewise, the total energy of the system E = Σ_i E_i reduces as each node moves. This implies that the algorithm can converge with a small g_i. Note that the small parameter g_i scales the exact amount Δr_ij by which each node i moves. To determine the value range of g_i, consider the local convergence near some converged minimum.
Denoting h_ij = −n_ij · d_ij (regardless of the zero-denominator singularity) and noting that r_i − r_j = −n_ij · |r_j − r_i|, one can rewrite equation (4) as

r_i^[k+1] = r_i^[k] − g_i · Σ_{j∈N_i} ã_ij · ((r_i^[k] − r_j^[k]) − h_ij^[k])    (5)

At any time, one can always treat h_ij as h_ij = h_i − h_j, with h_i being a proper vector for any i ∈ N. In particular, let h_i = r*_i for i ∈ M, since anchor nodes are aware of their actual positions. By introducing the error state r̃_i = r_i − h_i for any i, one can further rewrite equation (5) in the error-form equation

r̃_i^[k+1] = r̃_i^[k] − g_i · Σ_{j∈N_i} ã_ij · (r̃_i^[k] − r̃_j^[k]) − Δh_i^[k]    (6)

where Δh_i^[k] = h_i^[k+1] − h_i^[k]. Observing that h_i changes much more slowly than r_i as the algorithm converges (by inspecting h_ij = −n_ij · d_ij), one has Δh_i = o(r_i − r_j) as k increases. Therefore, around the optima, equation (6) can be simplified with a good approximation as

r̃_i^[k+1] = r̃_i^[k] − g_i · Σ_{j∈N_i} ã_ij · (r̃_i^[k] − r̃_j^[k])    (7)

Since equation (7) is component-wise in form, one can consider it in a one-dimensional case; the consideration of the other dimensions is the same. Equation (7) can then be rewritten in the compact matrix form r̃^[k+1] = (I − Γ L̃^[k]) r̃^[k] = D^[k] r̃^[k], where Γ = diag(g_1, . . . , g_{N+M}), and L̃ = [l̃_ij] is the Laplacian matrix associated with the graph G̃ of the computational topology.
Due to the fact that the physical graph G contains a spanning tree (since the given localization problem is sufficiently observable), the probability for a finite union {G̃^[k_1], G̃^[k_2], . . . , G̃^[k_m]} across enough time slices k_1, k_2, . . . , k_m to have a directed spanning tree equals 1. Therefore, a period of time slices [0, m_l] can be divided into a sequence of segments [0, m_1], [m_1 + 1, m_2], . . . , [m_{l−1} + 1, m_l], so that the union of the subgraphs in each segment has a spanning tree. Define the state transition matrix for the lth segment by S^[l] = D^[m_l] · · · D^[m_{l−1}+2] D^[m_{l−1}+1]. Since l̃_ii = |O_i| and 0 < g_i < 1/|O_i|, the disk theorem 30 indicates that D^[k] is stochastic. As a result, one obtains from Lemma 2 that S^[l] is SIA for any segment l. Following Lemma 1, the error states converge, that is, lim_{l→∞} S^[l] · · · S^[2] S^[1] r̃^[0] = 1 · e, with e being a constant. It means that the error states converge to a consensus, that is, r̃_i → r̃_j for any i and j. By noting that r̃_i = r_i − h_i = 0 holds for i ∈ M, one finally obtains r̃_i → 0 and, equivalently, r_i → h_i for any i. Altogether, it is shown that the algorithm converges at least to some local minimum. For ease of reference, this single localization process, which starts from a given starting point and converges to a minimum, is called an epoch. Second, Lemma 3 points out that any local minimum is unstable under the proposed algorithm. Since the computational topology is randomly produced upon the physical topology, the motion trajectory escaping from the current minimum varies between iterations and thus resets a varying starting point for the next localization epoch.
Third, since the starting points of the localization vary after local minima are escaped, the probability for a single localization epoch to converge to the global minimum, termed P, is larger than zero, that is, P > 0. Therefore, after enough epochs, the final probability for the localization to converge to the global minimum, termed P_global, can be obtained as

P_global = lim_{n→∞} (1 − (1 − P)^n) = 1

This indicates that the global minimum is eventually achieved using the algorithm.
Finally, Lemma 4 shows that the global minimum stays stable once it is achieved. Altogether, the proof is completed.
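The error-form consensus dynamics used in the proof (equation (7)) can be illustrated with a scalar sketch; the line graph, switching probability, and iteration count below are illustrative assumptions.

```python
import random

# Anchor node 0 has zero error; ordinary nodes 1 and 2 form the line
# graph 0 -- 1 -- 2. Each step, every ordinary node keeps a random
# subset of its neighbors (the switching computational topology) and
# uses a gain g_i < 1/|O_i|, so each update matrix is stochastic.
# Because unions of the switching graphs keep containing a spanning
# tree rooted at the anchor, the matrix products drive all errors to
# the anchor's value 0, as Lemmas 1 and 2 predict.
random.seed(7)
err = {0: 0.0, 1: 1.0, 2: -2.0}
neighbors = {1: [0, 2], 2: [1]}

for _ in range(400):
    nxt = {0: 0.0}  # the anchor's error stays zero
    for i in (1, 2):
        O_i = [j for j in neighbors[i] if random.random() >= 0.5]
        g_i = 1.0 / (len(O_i) + 0.01) if O_i else 0.0
        nxt[i] = err[i] - g_i * sum(err[i] - err[j] for j in O_i)
    err = nxt
# err has contracted toward 0: consensus with the anchor
```

Each row of the implied update matrix is a convex combination of the old errors, so the maximum error is nonincreasing and contracts whenever the anchor edge is switched on.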

Additional discussion
First, we discuss the parameter t_i in equation (3). Denote by O_i the neighbor set of node i in a computational topology for all i and by |·| the cardinality of a set. Then the number of neighbors of node i in a specific computational topology G̃ can be termed |O_i|. Denote by E[|O_i|] the mathematical expectation of |O_i| across all G̃; then equation (3) indicates that E[|O_i|] = (1 − t_i) · |N_i|. Clearly, the problem of choosing an optimal t_i is equivalent to the dual problem of choosing an optimal E[|O_i|].
To the best knowledge of the authors, there is hardly any mathematical method to exactly solve either the primal or the dual problem. In other words, the problem of choosing an optimal parameter remains open and needs further research.
In this article, we temporarily provide an empirical way to help determine a workable value for the parameter. On one hand, recalling the cause of the local-minimum trap, we know that the trap takes place because the neighboring components counteract each other. It is then clear that, for each node, the degree of counteraction becomes heavier as the number of neighbors gets larger (in analogy with molecular thermal motion), and hence a smaller E[|O_i|] is expected to help the localization escape from local minima. On the other hand, a too-small E[|O_i|] reduces the connectivity of the computational networks and is harmful to convergence. Combining both considerations, one obtains the following guideline: a relatively larger E[|O_i|] is beneficial to convergence but harmful to the search for the global optimum (e.g. when E[|O_i|] = |N_i|, TCMSR completely degrades into MSR), while a relatively smaller E[|O_i|] is beneficial to the search for the global optimum but harmful to convergence. As a tradeoff, we empirically consider 1 ≤ E[|O_i|] ≤ 2 to be of benefit. For the simulations of this research, in particular, we simply let E[|O_i|] = 2 for every node i; since E[|O_i|] = (1 − t_i) · |N_i|, t_i is determined as

t_i = 1 − 2/|N_i|

Second, it is worth mentioning that the time complexity of TCMSR is not higher than that of MSR and has no relation to the size of the sensor network. Compared to MSR, TCMSR introduces no extra action in the communication process and uses less neighboring information in the updating process. The only introduced action is the lightweight generation of s_ij for each computational neighbor. It means that TCMSR is applicable in all scenarios where MSR can be used. However, it should also be mentioned that TCMSR guarantees global optimization only under the assumption of enough iterations, so that in real scenarios where only finitely many iterations are allowed, TCMSR may not always achieve the global optimum.
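The empirical parameter choice above can be sketched as a small helper; the clamping for sparsely connected nodes (|N_i| < 2, where the formula would go negative) is an assumption not discussed in the text.

```python
def switch_probability(num_neighbors, target=2.0):
    """Empirical choice of t_i (sketch): solve E[|O_i|] = (1 - t_i)|N_i|
    for a target expected computational degree, clamped to [0, 1)."""
    if num_neighbors == 0:
        return 0.0
    return max(0.0, 1.0 - target / num_neighbors)
```

For instance, a node with 10 physical neighbors gets t_i = 0.8, so on average only 2 of its switches stay closed per time slice.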

Simulations
In this section, we verify the proposed TCMSR algorithm by numerical simulations and compare it with the traditional MSR algorithm. The only parameter g in MSR is set to 1/(2|N_i|) for each node i as reported, 9 and the main parameters in the proposed TCMSR algorithm are set as described in Theorem 1: g_i = 1/(|O_i| + u), where |O_i| is known to each node and updates with the iterations, and u = 0.01 is a small positive tolerance that guarantees g_i < 1/|O_i|. In total, three experiments are performed.
Specific case: with a fold-free (good) starting point

In the first experiment, we compare MSR and TCMSR with a fold-free starting point in an instantiation of sensor networks. The sensor network is deployed in a 400 units by 200 units (unit of length, for example, centimeter, meter, or kilometer) two-dimensional rectangular space, where M = 4 anchor nodes are placed at the four corners and N = 150 ordinary nodes are placed at distinct random locations. The physical topology is generated with a mean connectivity J_mean = 16.22 by letting the maximal communication radius be 60 units. In fact, measurement noises would be related to the specific measuring principles and many other factors. However, considering that noise is not a major issue for the particular problem discussed in this article, we do not adopt any physical model to describe the measuring noises and simply treat the noise as Gaussian. In particular, distance measurements among neighboring nodes are generated with added Gaussian noises whose 3σ boundaries are all 0.03 units. Figure 3 shows the actual position assignment and the interactions for the instantiation.
Both MSR and TCMSR are given the same fold-free starting point, which lies in a close neighborhood of the correct assignment. The localizations are then performed using the two algorithms, respectively, and their results are plotted together in Figure 4. It can be seen that both algorithms successfully make the localization converge to the global minimum (i.e. the correct assignment). Although the performances of the two algorithms seem similar, a slight difference lies exactly in the feature that TCMSR jitters more strongly than MSR does while converging. This is because TCMSR uses computational topologies for the interaction among nodes instead of the fixed physical topology, and the switching of computational topologies causes the jitter. However, the jitter reflects that TCMSR behaves in a more aggressive way while converging. It can also be seen that the jitter of TCMSR weakens and disappears after the algorithm has converged. This shows that the global minimum remains stable under the proposed TCMSR algorithm.

Specific case: with a random (bad) starting point
In the second experiment, we compare MSR and TCMSR with a purely random starting point. The same instantiation of sensor networks is used as described in the first experiment. Different from the first experiment, a random guess of the position assignment, which has a significant estimation error, is set as the starting point of both algorithms. Localization results are shown in Figure 5. It can be seen that MSR finally makes the localization converge to, and become trapped in, a local minimum with a significant bias, implying that the traditional MSR fails in the localization. For a clearer visualization, Figure 6 illustrates the localization errors in a two-dimensional view. The dot-dash lines clearly indicate that MSR has converged to a local minimum, where the localization estimate cannot be refined anymore. This is because the resultant velocity acting on each node has become extremely small, although not all the component velocities are small. Table 1 shows more details of the component velocities of an arbitrary node i = 25.
On the contrary, the proposed TCMSR helps the localization converge to the global optimum, as the localization error converges to zero successfully. This success is guaranteed by the effective mechanism, under which local minima can never stay stable. After sufficiently many iterations, the global minimum is eventually achieved using TCMSR.

Monte Carlo simulation
In the third experiment, we perform a Monte Carlo simulation for MSR and TCMSR, respectively. In total, 300 different sensor networks are generated by varying the number of ordinary nodes (50 ≤ N ≤ 300) and the mean connectivity degree (8 ≤ J_mean ≤ 40). For each generated sensor network, three different starting points are randomly guessed for both algorithms. Therefore, 900 localization runs are performed for each algorithm.
The results of the simulation show that the traditional MSR algorithm fails in about 60% of the localizations, exhibiting a high sensitivity to network conditions and starting points. On the contrary, the proposed TCMSR algorithm achieves the global minimum in all the simulations, even when starting from a purely random starting point (on the same scale as the correct assignment). The success rate of TCMSR is exactly 100%.
In summary, the simulations show that the proposed TCMSR algorithm works not only with a fold-free starting point but also with a purely random starting point. Compared to the traditional MSR algorithm, TCMSR performs better in achieving the global minimum, with more effective and robust performance. Considering that there is no simple way to ensure a good enough starting point, the proposed TCMSR algorithm shows a great advantage for practical localization problems.

Conclusion
This article deals with the distributed localization problem in wireless sensor networks. Traditional distributed algorithms for this problem are prone to local-minima failures. Focusing on this issue, a fully distributed algorithm is proposed under a novel mechanism, where each individual node is allowed to computationally interact with a random subset of its neighbors, helping the localization escape from local minima. Theoretical analyses reveal that any local minimum of the localization is unstable under the proposed algorithm, and that the global optimum is finally achieved with probability 1 after sufficiently many iterations. Numerical simulations have demonstrated the effectiveness of the algorithm.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the National Natural Science Foundation of China under Grants 61701498, 61605099, and 61703403, the Young Elite Scientists Sponsorship Program by CAST (2016QNRC001), the Youth Talent Program of Beijing High-level Innovation and