The impacts of weak links on topology discovery process in large-scale wireless multi-hop networks

In wireless multi-hop networks, especially large-scale ones, obtaining the network topology is of vital significance. In both proactive and reactive routing protocols, the source node needs to obtain the global or local topology before establishing an appropriate end-to-end route. Our previous research studied the impacts of weak links on reactive routing protocols, which can also be considered a local topology discovery process. In this article, to gain insight into the impacts of weak links on the topology discovery process, especially the global topology discovery on which proactive routing protocols rely, we apply a Markov chain to model the most commonly used topology discovery process in large-scale wireless multi-hop networks. Considering the fading characteristics of the wireless channel, we analyze the impacts of weak links on topology discovery algorithms. Simulation and theoretical results show that, as the network scale increases, weak links have great impacts on the stability and even on the feasibility of wireless multi-hop networks.


Introduction
There are two important kinds of routing protocols in wireless multi-hop networks: proactive and reactive. 1,2 Generally, in a proactive routing protocol, nodes collect the global topology by periodic broadcast and make routing decisions based on it. In a reactive routing protocol, by contrast, only the local topology between the source and destination nodes is collected before a routing decision is made, and it is kept only for a short term. Therefore, a topology discovery process is necessary for establishing an appropriate end-to-end route in wireless multi-hop networks. In other words, the source node must obtain the global or local network topology before calculating the end-to-end routes.
Owing to the fading characteristics of wireless propagation, weak links exist and have great impacts on the topology discovery of wireless multi-hop networks. 3 Even in a fully connected network, because of the weak links, the source nodes may fail to get the correct topology, which makes the routing decision unavailable. The impacts of weak links are reflected in the following two aspects of topology discovery:

Topology collection
Weak links result in weak connections. First, the connection relationship between two nodes, especially nodes on the edge of each other's coverage, may change from time to time owing to the weak links. For both proactive and reactive routing protocols, this dynamic change of network topology makes it hard to collect a real-time and correct topology, which greatly affects the performance of routing protocols. Second, because of the cumulative effect of the packet loss rate in wireless multi-hop networks, the topology information may be lost during multi-hop transmission. The failure of topology collection leads to the failure of route establishment.

Topology update
When topology changes are detected, the topology update information should be spread globally or locally. In reactive routing protocols, for example, once a node on the multi-hop route detects a change of connection, it starts a route repair process to inform the source node. On one hand, while the topology update information is spreading, the global or local topology may change again because of the weak connections. Therefore, before the last topology update information reaches its destination, another piece of topology update information is sent. In other words, the topology update information is useless for nodes multiple hops away. On the other hand, the topology update information may also be lost during multi-hop transmission.
In conclusion, the unavoidable weak links in wireless multi-hop networks have great impacts on the topology discovery process and make it difficult to obtain the correct topology in time. Topology discovery is the precondition of routing discovery in wireless multi-hop networks. However, the impacts of weak links on topology discovery or routing discovery are often ignored in recent studies, [4][5][6][7] and plenty of topology discovery algorithms have been proposed without considering weak links. Researchers who do take such impacts into account mainly focus on designing mechanisms to deal with weak links [8][9][10][11] and on proposing evaluation models for wireless multi-hop networks. 12 The power control method, 13 artificial intelligence (AI)-based link quality prediction, 14,15 software-defined networking (SDN) 16 architecture, and multi-channel mechanism 17 have been applied to the weak link problem, and good results have been achieved. However, the innate characteristics of weak links and how they affect topology discovery are seldom studied.
Our previous studies 18,19 focused on the impacts of weak links on reactive routing protocols, which can also be considered local topology discovery. We mainly analyzed how the weak links affect the discovery of the local topology from the source to the destination nodes. We applied a Markov chain to model the local topology discovery process and obtained the probability of routing success and the distribution of hop count. 18 The theoretical and simulation results show that it is difficult to obtain the local topology correctly and quickly owing to the impacts of weak links in large-scale wireless multi-hop networks.
In this article, we further analyze the impacts of weak links on topology discovery and focus on the global topology discovery which is the precondition of proactive routing protocols. Considering a commonly accepted topology discovery algorithm, two important parameters, the topology stable duration between a given node and its neighbor nodes and the spread time of topology change, are studied. By comparing these two parameters, we discuss the feasibility of topology discovery algorithms in multi-hop wireless networks.
The main contributions of this article are as follows:
1. We apply a Markov chain model to analyze the commonly accepted topology discovery algorithms.
2. We theoretically point out the limitation of large-scale wireless multi-hop networks with proactive routing protocols.
3. We offer an accurate research model to evaluate the feasibility of wireless multi-hop networks, and our results are of vital importance to network planning in large-scale wireless multi-hop networks.
The rest of this article is organized as follows. In section "System model," the topology discovery model and the channel model are presented. The Markov chain model is proposed to analyze the topology discovery process and the details of such model are discussed. The Markov chain model is solved in section "Model solving." Two important parameters, the topology stable duration and the spread time of topology change, are obtained. The simulation and theoretical results are compared in section "Simulation and theoretical results." In section "Conclusion," we conclude this article.

System model
In this section, we apply a Markov chain model to analyze the commonly accepted topology discovery process in wireless multi-hop networks. We consider a network with low node mobility which means that the position of nodes remains stationary during topology discovery process. The channel model, topology discovery process, and the Markov chain model will be discussed, respectively, in the following sections.

Channel model
Owing to the fading characteristic of wireless propagation, packets may be lost when transmitted between two nodes. In this article, we apply the log-normal shadow fading channel to model the wireless channel between neighbor nodes. The deployment of nodes in the wireless multi-hop network is shown in Figure 1. Let $d_{ij}$ denote the distance between nodes $i$ and $j$; the packet delivery rate between these two nodes is given by $P(d_{ij})$. The effective communication radius is $r_0$. We also assume that the MAC (media access control) layer protocol is able to avoid collisions and interference.
According to the log-normal shadow fading channel model, 20 the packet delivery rate $P(d_{ij})$ is determined by the distance $d_{ij}$, the path loss exponent $\alpha$, the standard deviation $\sigma$, and the effective communication radius $r_0$, which is the distance at which the point-to-point packet delivery rate drops to 50%. The model further involves the signal attenuation threshold $\beta_{th}$, the threshold of received signal strength $p_{r,th}$, and the sending signal strength $p_t$. The curve of $p(x)$ is shown in Figure 2. In this article, we set $\alpha = 3$, $\sigma = 4$, and $\beta_{th} = 69$ dB.
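The channel model can be explored numerically with a small sketch. The closed form below is an assumption, not a formula taken from this article: it is a commonly used expression for the link probability under log-normal shadowing, $p(x) = \frac{1}{2}\left[1 - \mathrm{erf}\left(\frac{10\alpha\log_{10}(x/r_0)}{\sqrt{2}\sigma}\right)\right]$ with $r_0 = 10^{\beta_{th}/(10\alpha)}$ measured in units of the reference distance. The function names `effective_radius` and `delivery_rate` are hypothetical.

```python
import math


def effective_radius(beta_th_db: float, alpha: float) -> float:
    """Distance at which the delivery rate is 50%.

    Assumed relation (not taken from the article):
    r0 = 10 ** (beta_th / (10 * alpha)), in units of the reference distance.
    """
    return 10.0 ** (beta_th_db / (10.0 * alpha))


def delivery_rate(d: float, r0: float, alpha: float = 3.0, sigma: float = 4.0) -> float:
    """Assumed log-normal shadowing delivery rate at distance d > 0."""
    if d <= 0 or r0 <= 0:
        raise ValueError("distances must be positive")
    # Mean path loss relative to r0, in dB, normalized by the shadowing spread.
    z = 10.0 * alpha * math.log10(d / r0) / (math.sqrt(2.0) * sigma)
    return 0.5 * (1.0 - math.erf(z))


if __name__ == "__main__":
    r0 = effective_radius(69.0, 3.0)  # parameters used in the article
    for d in (0.5 * r0, r0, 1.5 * r0):
        print(f"d = {d:7.1f}  P(d) = {delivery_rate(d, r0):.3f}")
```

Under this assumed form the delivery rate equals exactly 0.5 at $d = r_0$ and decreases monotonically with distance, which matches the description of $r_0$ in the text.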

Topology discovery process
As discussed above, there are two kinds of topology discovery: global and local. For reactive routing protocols, the local topology between the source node and the destination node should be discovered before establishing an end-to-end route. For proactive routing protocols, by contrast, each node generally has to obtain the global topology before calculating routes to other nodes. In this article, for the sake of generality, we focus on commonly accepted global topology discovery algorithms. The topology discovery process is shown in Figure 3. All nodes periodically broadcast HELLO packets to their neighbors with the same broadcast period $T$. The broadcast time of each node is randomly distributed within $T$. The HELLO packet contains the source node ID and its neighbors' IDs. Once a node has received a HELLO packet, it has all the topology information of the HELLO source. For example, node A broadcasts a HELLO packet and node B receives this packet. Then, node B adds node A into its one-hop neighbor list and adds node A's neighbors into its two-hop neighbor list.
In wired networks, if two nodes can receive each other's HELLO packets, they are definitely connected. However, in wireless networks, HELLO packets may be lost even if the two nodes are very close to each other. Therefore, taking the weak links into consideration, we study the most commonly used topology discovery algorithms. All nodes periodically broadcast HELLO packets from the beginning. For a given node, if it receives $M$ consecutive HELLO packets from the same source node, it adds the HELLO source into its neighbor list. Under the influence of weak links, if a node fails to receive the HELLO packets from a source during $K$ consecutive broadcast periods, it deletes the HELLO source from its neighbor list. The processing of HELLO packets in a given node is shown in Figure 4.
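The add/delete rule described above can be sketched as a small per-neighbor state machine. This is a minimal illustration under the stated rule only (add after M consecutive receptions, delete after K consecutive losses); the class name `NeighborTracker` and its method are hypothetical and not part of any protocol implementation.

```python
class NeighborTracker:
    """Tracks one HELLO source using the M-receive / K-loss rule."""

    def __init__(self, m: int, k: int) -> None:
        if m < 1 or k < 1:
            raise ValueError("M and K must be at least 1")
        self.m = m
        self.k = k
        self.connected = False
        # Consecutive receptions while disconnected,
        # consecutive losses while connected.
        self.run = 0

    def on_period(self, received: bool) -> bool:
        """Process one broadcast period and return the connection state."""
        if not self.connected:
            # A single loss forces the node to recount from zero.
            self.run = self.run + 1 if received else 0
            if self.run >= self.m:
                self.connected, self.run = True, 0
        else:
            # A single reception clears the loss counter.
            self.run = 0 if received else self.run + 1
            if self.run >= self.k:
                self.connected, self.run = False, 0
        return self.connected


if __name__ == "__main__":
    tracker = NeighborTracker(m=3, k=2)
    pattern = [True, True, False, True, True, True, False, True, False, False]
    print([tracker.on_period(r) for r in pattern])
```

With M = 3 and K = 2, the source is added only at the sixth period of the example pattern (after three receptions in a row) and removed at the tenth (after two losses in a row).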

Markov chain model
A node collects the HELLO packets from its neighbor nodes to obtain the topology information. Because all nodes have the same broadcast period and send HELLO packets periodically, in the absence of packet loss a node receives exactly one HELLO packet from each neighbor node during one broadcast period $T$.
For example, to guarantee the accuracy of the topology mentioned above, node A adds node B into its neighbor list after receiving $M$ consecutive HELLO packets from node B. Similarly, node A deletes node B from its neighbor list after losing $K$ consecutive HELLO packets from node B. To gain insight into the impacts of weak links on the topology discovery process, we apply a Markov chain to model the topology discovery process between two nodes. Some important assumptions are as follows:
1. The MAC layer protocol is considered perfect and the size of the HELLO packet is small enough to ignore collisions.
2. During a broadcast period $T$, because of the separation of broadcast times, the receptions of HELLO packets at a given node are considered independent.
3. Because of the space separation of neighbor nodes, the receptions of a given HELLO packet are independent.
4. For a given node, the transmissions of HELLO packets at different broadcast periods are independent.
Based on the assumptions and the topology discovery process above, we have the following facts:
1. Whether node A can receive the HELLO packet from node B at period $T_2$ depends on the delivery rate. In other words, the processing of the HELLO packet at period $T_2$ has nothing to do with the processing at period $T_1$.
2. The connection state change of node A and node B only depends on their last connection state.
We define a two-dimensional random variable $\{h_n, j_n\}$, where $h_n$ denotes the connection state of nodes A and B and $j_n$ represents the reception of the HELLO packets from node B at the end of $T_n$. In the state space, state $(h_n, j_n) = (0, i)$ denotes that nodes A and B are disconnected and node A has continuously received $i$ HELLO packets from node B at the end of $T_n$. State $(h_n, j_n) = (1, -j)$ denotes that nodes A and B are connected and node A has continuously lost $j$ HELLO packets from node B at the end of $T_n$.
Specifically, state $(h_n, j_n) = (0, M)$ and state $(h_n, j_n) = (1, -K)$ are instantaneous states. For state $(h_n, j_n) = (0, M)$, at the end of $T_n$, if node A has continuously received $M$ HELLO packets, the state should be $(h_n, j_n) = (1, M)$. It is the same for state $(h_n, j_n) = (1, -K)$.
Considering the above two facts, the next state depends only on the current state, that is, $P\{(h_{n+1}, j_{n+1}) \mid (h_n, j_n), \ldots, (h_0, j_0)\} = P\{(h_{n+1}, j_{n+1}) \mid (h_n, j_n)\}$. Thus, the topology discovery process can be modeled as a Markov chain $\{h_n, j_n\}$. The state-transition diagram is shown in Figure 5. At the beginning of the topology discovery process, nodes A and B are disconnected and no HELLO packet has been received yet. Therefore, the initial state is $(h_0, j_0) = (0, -K)$.
In Figure 5, for example, state $(h_n, j_n) = (0, M - 1)$ denotes that node A has continuously received $M - 1$ HELLO packets from node B and the connection relation has not yet been established. At the end of $T_{n+1}$, if node A does not receive the HELLO packet from node B, the state will change to $(h_{n+1}, j_{n+1}) = (0, -K)$, which means that node A has to recount the number of HELLO packets from node B. Otherwise, the state will change to $(h_{n+1}, j_{n+1}) = (1, M)$, which denotes that node A has continuously received $M$ HELLO packets from node B and is going to add node B into its neighbor list.

Topology update process
The topology update process is shown in Figure 6. If a node detects a change in its neighbor list, it broadcasts the new neighbor list to its neighbors in the next broadcast period. The new neighbor list is regarded as the topology update information and will be spread to the edge of the network according to the topology discovery process. Therefore, this article mainly focuses on analyzing the details of the spread of the new neighbor list.
As shown in Figure 6, at $T_1$, node A detects the change of connection between nodes A and B. Then, it broadcasts the new neighbor list to its neighbors at $T_2$. Without considering the packet loss along the path from node A to the edge of the network, node 1 detects the change of node A's neighbors at node A's broadcast time, node 2 detects the same change at node 1's broadcast time, and so on.
Let $T_{sA}$ and $T_{sA}^{B}$ denote the topology stable duration of node A and the topology stable duration between nodes A and B, respectively. The spread time of topology update information is denoted as $T_u$. The details of these three parameters will be analyzed in the following section.

Model solving
In this section, we solve the Markov chain model to get the transition probability matrix and the steady-state distribution. Then, the probability of state change for a given broadcast period and the distribution of $T_{sA}$ are obtained. Finally, we derive the closed form of $T_u$.

Markov chain model solving
The one-step transition probability of the two-dimensional Markov chain, that is, the probability of moving from state $(h_n, j_n)$ to state $(h_{n+1}, j_{n+1})$, is defined as $P\{(h_{n+1}, j_{n+1}) \mid (h_n, j_n)\}$. According to the state-transition diagram in Figure 5 and the topology discovery process, the transition probabilities are given in (3), where $p = p(d_{AB})$ denotes the packet delivery rate between nodes A and B. The first five equations in (3) mean that nodes A and B are disconnected at broadcast period $T_{n-1}$ and the state changes according to the reception of the next HELLO packet and the topology discovery algorithm. In the first equation, for example, node A has continuously received $i$ HELLO packets before broadcast period $T_n$. If node A successfully receives the HELLO packet from node B at broadcast period $T_n$, the state will change to $(0, i + 1)$. Similarly, the last five equations in (3) mean that nodes A and B are connected at broadcast period $T_{n-1}$ and the state changes according to the reception of the next HELLO packet and the topology discovery algorithm.
From the state-transition diagram, it is easy to see that there are $(K + M)$ states during the topology discovery process. For convenience, we map the two-dimensional states $\{h_n, j_n\}$ into one-dimensional states:
$$(h_n, j_n) \to g_n: \quad g_n = \begin{cases} M h_n + |j_n|, & j_n \neq M, -K \\ M, & j_n = M \\ M + K, & j_n = -K \end{cases}$$
Then, we have a new Markov chain $\{g_n\}$ whose states are in one-to-one correspondence with those of the two-dimensional Markov chain $\{h_n, j_n\}$. Its one-step transition probabilities follow from (3) through this mapping, and we define a $(K + M) \times (K + M)$ transition probability matrix $Q$ for the topology discovery process.
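Let $\pi = [\pi_1, \pi_2, \cdots, \pi_{M+K}]$ denote the steady-state distribution of $\{g_n\}$. It satisfies the steady-state equations (9), that is, $\pi Q = \pi$ with $\sum_{i=1}^{M+K} \pi_i = 1$. Solving equation (9), we get the steady-state distribution of the one-dimensional Markov chain $\{g_n\}$.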
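As a numerical companion to the model, the following sketch builds a transition matrix directly from the add/delete rule and the one-dimensional state mapping, then approximates the steady-state distribution by power iteration. It is an illustration reconstructed from the verbal description, not a transcription of equations (3) or (9); the function names are hypothetical. States are numbered as in the mapping: states 1 to M-1 are disconnected with that many consecutive receptions, state M is connected with no outstanding loss, states M+1 to M+K-1 are connected with 1 to K-1 consecutive losses, and state M+K is disconnected with no reception counted.

```python
def build_transition_matrix(p: float, m: int, k: int) -> list[list[float]]:
    """Row-stochastic matrix over states 1..M+K (stored at indices 0..M+K-1)."""
    n = m + k
    q = [[0.0] * n for _ in range(n)]
    lose = 1.0 - p
    for c in range(m):  # disconnected, c consecutive receptions so far
        src = (n if c == 0 else c) - 1
        q[src][c] += p          # reception: move to state c+1 (state M when c+1 == M)
        q[src][n - 1] += lose   # loss: recount from state M+K
    for j in range(k):  # connected, j consecutive losses so far
        src = m + j - 1
        q[src][m - 1] += p      # reception: back to state M
        q[src][m + j] += lose   # loss: state M+j+1 (state M+K when j+1 == K)
    return q


def steady_state(q: list[list[float]], tol: float = 1e-12,
                 max_iter: int = 1_000_000) -> list[float]:
    """Approximate pi with pi Q = pi by repeated multiplication."""
    n = len(q)
    pi = [1.0 / n] * n
    for _ in range(max_iter):
        nxt = [sum(pi[i] * q[i][j] for i in range(n)) for j in range(n)]
        if max(abs(a - b) for a, b in zip(nxt, pi)) < tol:
            return nxt
        pi = nxt
    return pi


if __name__ == "__main__":
    pi = steady_state(build_transition_matrix(p=0.5, m=3, k=2))
    print([round(x, 4) for x in pi])
```

Power iteration is used only to keep the sketch dependency-free; it converges for 0 < p < 1 because the chain then has self-loops and a single communicating class, but it can be slow when p is very close to 0 or 1.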

Distribution of topology stable duration
From the description of the topology discovery process and the state-transition diagram mentioned above, it is easy to divide the states into two categories: connected states $S_c = \{M, M + 1, \ldots, M + K - 1\}$ and disconnected states $S_d = \{1, \ldots, M - 1, M + K\}$. Two nodes are considered stable in both cases, that is, when they remain connected or remain disconnected. Therefore, $T_{sA}^{B}$ is composed of two parts: the time from a connected state to the first disconnected state, $T_{cd}$, and the time from a disconnected state to the first connected state, $T_{dc}$. In order to get $T_{sA}^{B}$, we have the following definitions:
Definition 1. For given states $i$ and $j$, the number of broadcast periods from state $i$ to state $j$ is denoted by $T_{ij}$.
Definition 2. For given states $i$ and $j$, the probability of first arrival $f_{ij}^{(n)}$ is the probability that the state transfers from $i$ to $j$ for the first time after $n$ broadcast periods. It can be written as
$$f_{ij}^{(n)} \triangleq P\{g_n = j,\ g_s \neq j,\ s = 1, 2, \ldots, n - 1 \mid g_0 = i\}$$
From the state-transition diagram, we know that for state $i \in S_c$, the first disconnected state it transfers to is state $j = M + K$. Similarly, for state $i \in S_d$, the first connected state it transfers to is state $j = M$. Therefore, the number of broadcast periods it takes from a connected state $i$ to the first disconnected state is $T_{i(M+K)}$, and the number it takes from a disconnected state $i$ to the first connected state is $T_{iM}$. Considering the steady-state distribution, we obtain $T_{cd}$ and $T_{dc}$, respectively, and then $T_{sA}^{B}$, in terms of $E(T_{ij})$, the average number of broadcast periods from state $i$ to state $j$, which is given by
$$E(T_{ij}) = \sum_{n=1}^{\infty} n f_{ij}^{(n)}$$
Let $Nb_A$ denote the neighbor set of node A. Considering the assumptions above, packet transmissions between different node pairs, for example (A, B) and (A, 1), are independent. Once a change of the connections between node A and its neighbors has been detected, the local topology change information will then be spread to the edge of the network. As shown in Figure 4, if the connection between nodes A and B changes, the local topology of node A changes. Therefore, $T_{sA}$ is also known as the duration between two consecutive topology changes.
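The average first-arrival time between two states can be computed without summing the first-arrival probabilities explicitly: for a fixed target state, the means satisfy a linear system obtained by conditioning on the first step. The sketch below is one way to do this, assuming the same rule-based transition matrix as described in the text (rebuilt here so the block is self-contained); the function names are hypothetical, and this is not the article's closed-form solution.

```python
def build_transition_matrix(p: float, m: int, k: int) -> list[list[float]]:
    """Transition matrix over states 1..M+K, derived from the M/K rule."""
    n = m + k
    q = [[0.0] * n for _ in range(n)]
    for c in range(m):                  # disconnected, c receptions in a row
        src = (n if c == 0 else c) - 1
        q[src][c] += p                  # reception
        q[src][n - 1] += 1.0 - p        # loss: recount
    for j in range(k):                  # connected, j losses in a row
        src = m + j - 1
        q[src][m - 1] += p              # reception
        q[src][m + j] += 1.0 - p        # loss
    return q


def mean_first_passage(q: list[list[float]], target: int) -> dict[int, float]:
    """Mean number of periods to first reach `target` (1-based) from each other state.

    Solves m_i = 1 + sum_{k != target} Q[i][k] * m_k by Gauss-Jordan elimination.
    Assumes the target is reachable from every state (0 < p < 1).
    """
    t = target - 1
    others = [s for s in range(len(q)) if s != t]
    size = len(others)
    # Augmented matrix of (I - Q restricted to non-target states) m = 1.
    a = [[(1.0 if r == c else 0.0) - q[r][c] for c in others] + [1.0]
         for r in others]
    for col in range(size):
        pivot = max(range(col, size), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(size):
            if r != col:
                factor = a[r][col] / a[col][col]
                a[r] = [x - factor * y for x, y in zip(a[r], a[col])]
    return {s + 1: a[i][size] / a[i][i] for i, s in enumerate(others)}


if __name__ == "__main__":
    m, k, p = 3, 2, 0.5
    q = build_transition_matrix(p, m, k)
    print("connected -> first disconnected:", mean_first_passage(q, m + k))
    print("disconnected -> first connected:", mean_first_passage(q, m))
```

Only the entries for connected source states matter when the target is state M + K, and only the entries for disconnected source states matter when the target is state M; how these are weighted by the steady-state distribution to form $T_{cd}$ and $T_{dc}$ is left to the article's own equations.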
We define the probability of topology change during a given broadcast period of node A as $P_{cA}$ and the probability of connection change between node A and its neighbor $i$ during a given broadcast period as $P_{cA}^{i}$. Since the connections between node A and its neighbors change independently, $P_{cA}$ follows from the $P_{cA}^{i}$ of all neighbors $i \in Nb_A$, and each $P_{cA}^{i}$ is obtained from the state-transition diagram. The average of $T_{sA}$ is then obtained from $P_{cA}$.

Spread time of topology update
As shown in Figure 6, node A detects the change of local topology and modifies its neighbor list. This topology update information will be spread to the edge of the network. While this topology update information is in transit, the local topology of node A may change again. As a result, the nodes at the edge of the network may never get the correct topology of node A. Therefore, it is necessary to figure out the relationship between $T_{sA}$ and $T_u$, which indicates the degree of the impacts of weak links on the topology discovery process.
We assume that all nodes apply the same topology discovery algorithm and have the same broadcast period $T$. As shown in Figure 6, each node selects its startup time uniformly within $(0, T)$. For the sake of simplicity, we also assume that there exists an appropriate multi-hop path from node A to the edge of the network. Then, we analyze the spread time of topology update information along this $h$-hop path, $T_u^{(h)}$. Node A detects the change of its connection with node B at node B's broadcast time $T_1$. Then, node A modifies its neighbor list and broadcasts a new HELLO packet without node B's information at its broadcast time $T_2$. Node 1 immediately detects the same change according to the new HELLO packet and modifies its own neighbor list at $T_2$. After that, node 1 broadcasts another new HELLO packet at its next broadcast time, at which point the topology update information has reached node 2. The spread time of topology update information from node A to node 2 is therefore the sum of the per-hop durations $T_{ij}$, where $T_{ij}$ denotes the duration from the time when node $i$ detects the change of topology to the time when node $j$ detects the same change. In general, $T_u^{(h)}$ is written as the sum of these durations over the $h$ hops of the path. From Figure 6, it is easy to see that $T_{ij}$ is a random variable which depends on the broadcast times of node $i$ and node $i - 1$. According to the assumptions above, the probability density function of $T_{ij}$ follows from the uniformly distributed startup times, which gives the average spread time of topology update. This average is also known as the time for the $h$-hop-away nodes to detect the change of node A's topology.
We define the topology effective time $\varepsilon$ as
$$\varepsilon = T_{sA} - T_u^{(h)}$$
which denotes the topology stable duration of node A as seen by the $h$-hop-away nodes. If the topology effective time $\varepsilon > 0$, the $h$-hop-away nodes get the correct local topology of node A before it changes. Otherwise, the $h$-hop-away nodes will never get the correct local topology of node A. Therefore, $\varepsilon$ is an important parameter that indicates the feasibility of topology discovery algorithms in large-scale wireless multi-hop networks.
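The comparison between stable duration and spread time can be sketched numerically. The block below rests on three assumptions that are not taken from the article's equations: (a) connection changes with different neighbors are independent, so the per-period change probability is one minus the product of the per-neighbor "no change" probabilities; (b) periods are treated as independent trials, so the mean stable duration is the period divided by that probability; (c) each per-hop delay is uniform on $(0, T)$, so the mean spread time over $h$ hops is $hT/2$. All function names and input values are hypothetical.

```python
import random


def prob_topology_change(per_neighbor_probs: list[float]) -> float:
    """Assumed: P_cA = 1 - prod(1 - P_cA^i) under independence."""
    keep = 1.0
    for p in per_neighbor_probs:
        keep *= 1.0 - p
    return 1.0 - keep


def mean_stable_duration(p_change: float, period: float) -> float:
    """Assumed geometric model: mean time between changes is T / P_cA."""
    return period / p_change


def mean_spread_time(hops: int, period: float) -> float:
    """Assumed uniform per-hop delay on (0, T): mean over h hops is h * T / 2."""
    return hops * period / 2.0


def simulate_spread_time(hops: int, period: float,
                         trials: int = 20000, seed: int = 1) -> float:
    """Monte Carlo estimate of the mean spread time under assumption (c)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(trials):
        total += sum(rng.uniform(0.0, period) for _ in range(hops))
    return total / trials


def effective_time(per_neighbor_probs: list[float], hops: int, period: float) -> float:
    """Mean stable duration minus mean spread time (positive means feasible)."""
    p_change = prob_topology_change(per_neighbor_probs)
    return mean_stable_duration(p_change, period) - mean_spread_time(hops, period)


if __name__ == "__main__":
    print("sparse, 3 hops :", effective_time([0.01] * 5, hops=3, period=1.0))
    print("dense, 20 hops :", effective_time([0.3] * 10, hops=20, period=1.0))
    print("simulated mean spread, 4 hops:", simulate_spread_time(4, 2.0))
```

Under these assumptions a sparse neighborhood with a short path yields a positive effective time, while a dense neighborhood with a long path yields a negative one, which is the qualitative behavior the text attributes to growing network scale.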

Simulation and theoretical results
In this section, we use the MATLAB simulation platform to build a linear multi-hop network with randomly distributed relay nodes. A common topology discovery algorithm and a fading channel model are applied in the network. Some important parameters are shown in Table 1. In order to gain insight into the impacts of weak links on the topology discovery process, we mainly analyze two important parameters: the topology stable duration $T_{sA}$ and the spread time of topology change $T_u^{(h)}$. The simulation and theoretical results are compared in the following sections.

Impacts on local topology stable duration
We analyze the local topology stable duration in two respects. First, for two given nodes A and B, we analyze the point-to-point topology stable duration $T_{AB}$ at different node distances. Second, we consider the neighbor discovery process of topology discovery algorithms and discuss the local topology stable duration $T_{sA}$.
Point-to-point topology stable duration. The point-to-point topology stable durations $T_{AB}$ with different topology discovery parameters are shown in Figures 7-9. It can be seen that the theoretical results are consistent with the simulation results. With the increase in $d_{AB}$, the stable duration decreases to a lowest point and then increases sharply. When two nodes are close enough or far enough apart, they remain connected or disconnected, respectively. In other words, their connection relationship is stable. When $d_{AB}$ is around $r_0$, which means that the packet delivery rate is around 50%, the point-to-point topology stable duration drops to its lowest point and the connection relationship changes quickly. In Figure 7, we set $M = 3$ and $K = 2, 3, 4$, respectively. It can be seen from the figure that when the distance $d_{AB}$ is smaller than $r_0$, the stable duration increases with $K$. However, if the distance $d_{AB}$ is greater than $r_0$, the increase of $K$ no longer affects the stable duration.
In Figure 8, we set $K = 3$ and $M = 2, 3, 4$, respectively. When the distance $d_{AB}$ is greater than $r_0$, the stable duration increases with $M$. Similarly, when the distance $d_{AB}$ is smaller than $r_0$, the change of $M$ does not affect the topology stable duration.
Comparing the simulation and theoretical results of Figures 7 and 8, we find that if the distance $d_{AB} < r_0$, the stable duration is mainly decided by $K$. As the distance $d_{AB}$ increases beyond $r_0$, the stable duration is mainly decided by $M$. The reason is that when the distance is small, the packet delivery rate is large. As a result, it is easy to receive $M$ consecutive HELLO packets but hard to lose $K$ consecutive HELLO packets. The wireless link is strong and these two nodes should be regarded as connected. A large $K$ tolerates occasional packet loss and keeps these two nodes connected for a longer duration. Conversely, if the distance $d_{AB} > r_0$, the link becomes weak and the packet delivery rate is small. The two nodes should be regarded as disconnected. In this case, a large $M$ makes it stricter to establish a connected relation between these two nodes. As a result, asymmetrical parameters could be applied in the design of topology discovery algorithms in consideration of the distance or the packet delivery rate.
Figure 9 shows the theoretical result of the stable duration when $M = K = 1, 2, \ldots, 5$. The stable duration increases with $M$ and $K$, so the values of $M$ and $K$ have a great effect on the topology stable duration, and large values of $M$ and $K$ make the network stable. However, if we set $M = K = 5$, a node detects a topology change only five broadcast periods later, and such a slow reaction is unacceptable in some applications. If we set $M = K = 1$, the topology stable duration becomes quite short, and the connection relationship changes too frequently. Therefore, appropriate values of $M$ and $K$ are important in designing topology discovery algorithms.
Local topology stable duration. The local topology stable duration characterizes the neighbor discovery process, which is important in topology discovery algorithms and even in proactive routing discovery algorithms. In this part, we analyze the performance of the common topology discovery algorithms in neighbor discovery under the impacts of weak links. In wireless multi-hop networks, the global topology changes too frequently and the global topology stable duration is not that important. Thus, we focus on the local topology stable duration and analyze the network stability.
The local topology stable durations $T_{sA}$ of node A are shown in Figures 10-12. With the increase in node density, the stable duration drops quickly. When the node density is high enough, which means that the number of node A's neighbors as well as the number of weak links is large, the local topology changes almost every broadcast period. In this case, node A broadcasts a different HELLO packet in almost every period and the remote nodes will never get the correct or stable local topology of node A. From Figures 10-12, we reach a conclusion similar to that for the point-to-point topology stable duration: $T_{sA}$ increases with $M$ and $K$. If the node density is high enough, increasing $M$ and $K$ hardly increases $T_{sA}$. However, when the node density is low, large values of $M$ and $K$ make the network stable. In Figures 10 and 11, we set $M = 3$, $K = 2, 3, 4$ and $K = 3$, $M = 2, 3, 4$, respectively.

Impacts on feasibility of topology discovery algorithms
Considering the propagation of topology update information, we compare $T_{sA}$ with $T_u^{(h)}$. The topology effective time $\varepsilon$ is applied to measure the feasibility of topology discovery algorithms. As shown in Figures 13-15, with the increase in network scale (the node density and hop count), $\varepsilon$ decreases. We set $M = K = 3, 4, 5$ in Figures 13-15, respectively. When the curved surface of $\varepsilon$ is above the flat surface of $\varepsilon = 0$, the topology discovery algorithm works well. Otherwise, when $\varepsilon$ drops below zero, $T_{sA}$ is less than $T_u^{(h)}$. Consequently, both the network radius and the node density are limited. If the network scale is beyond this limit, the topology discovery algorithm is not feasible.
No matter what kind of topology discovery or routing protocol is applied, proactive or reactive, the main idea of finding an appropriate end-to-end path is to discover the topology between source and destination nodes. The source node starts a topology discovery process and finally obtains the connection relationship between itself and its destination node(s). The connection relationship in wireless networks, especially in large-scale wireless multi-hop networks, is unstable and changes from time to time. Due to the weak links, unstably connected nodes may be considered disconnected by the topology discovery algorithms. As a result, when the change is detected, the topology update process is necessary. Both processes take time and, as mentioned above, the topology effective time $\varepsilon$ is an appropriate parameter to measure the feasibility of topology discovery algorithms in large-scale wireless multi-hop networks.

Conclusion
In this article, we apply a Markov chain model to analyze the topology discovery process and identify the factors that affect the feasibility of large-scale wireless multi-hop networks. Our earlier study pointed out the impacts of weak links on the process of reactive routing protocols. Here, we consider the most commonly used topology discovery process in proactive routing protocols and give an in-depth analysis of the impacts of weak links. Theoretical and simulation results have demonstrated that the topology discovery algorithms cannot maintain a stable topology. With the increase in network scale, the condition gets even worse. In fact, no matter what kind of routing protocol (or topology discovery algorithm) is applied, the impacts of weak links exist and have a non-negligible influence on the feasibility of wireless multi-hop networks.
In order to avoid the impact of weak links, the future research may focus on the following two aspects. First, the artificial intelligence technology can be applied in wireless link prediction or topology prediction. If a node can predict the existence of weak links in advance, the impact of weak links is weakened. 14,15 Second, as we discussed above, the existence of weak links seriously affects the routing or topology discovery process. It is one of the better solutions to minimize packet interaction in the process of routing or topology discovery. Therefore, the SDN-based networking strategies 16 can reduce the distributed packet interaction, which helps to improve the feasibility and stability of the wireless multi-hop networks to a certain extent.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the School of Information Engineering, Shaoguan University.