How Much Delay Has to Be Tolerated in a Mobile Social Network?

Message delivery in a mobile social network (MSN) is difficult due to the fact that the topology of such network is sparse and unstable. Various routing schemes for MSNs were proposed to make the message delivery robust and efficient. However, little research has been conducted to explore how much delay has to be tolerated for the message delivery from the source to the destination. Since the social relationships among nodes are stable during a certain period of time, it is expected that the delay of message delivery in MSNs could be modeled with a probability model. In this paper, we take the first step to address this issue. We firstly extract three routing models from the existing routing schemes for MSNs and then develop the probability models of the message transmission delay for each abstract routing model. The simulation results show that the theoretical models match very well the simulation trace statistics.


Introduction
Mobile social network is a type of delay-tolerant networks (DTN [1]) which take into consideration the sociality of the participating nodes of the networks [2].If the terminals of MSN are smartphones facilitated with rich sensing capability, it can function as a typical distributed sensor network [3].Similar to the traditional mobile ad hoc networks (MANETs), no fixed infrastructures are built in such networks, and communications between the nodes depend on short-distance wireless links, like Wi-Fi or even Bluetooth.Since equipment in an MSN is usually carried by people moving freely, those nodes may organize themselves into highly dynamic topology, and the links between the nodes are usually intermittent and connected by opportunity due to the mobility of the nodes.As a result, large transmission delay is allowed in MSN.So in some extent, MSN is considered as a type of delaytolerant networks.Although routing in an MSN might adopt the similar store-and-forward strategies [4] that dominate DTNs, it is essentially different from the DTN routing because the sociality of the user behavior can be cooperated explicitly to improve the efficiency of networking and communication in MSNs.Previous works [5][6][7] have shown that the performance of routing in MSN may heavily depend on the users' social behavior like that in a human social network, which is also the key characteristic for the design and analysis of the other issues in MSNs; see, for example [8,9].
Traditional MANET routing protocols such as AODV [10], DSR [11], DSDV [12], and LAR [13] make assumption that the topologies of the networks are fully connected.These protocols will fail to route any message if there is no precomputed route from source to destination at the time of message being sent.In an opportunistic MSN, nodes carry the data to be forwarded and also are ready to forward data for other nodes.What is more, the mobility of nodes can be exploited to forward data opportunistically upon the encounter with each other.The key problem here is hence how to design appropriate relay selection strategies to improve the opportunity or probability of data forwarding.This improvement will make the data forwarding more efficient with less delay.Relay selection strategies depending on the history of contacts among nodes were proposed to find the right forwarding relays [14,15].
While the end-to-end connections cannot be preset when the message is ready for transmission, social networks often demonstrate unique social characteristic like the so-called small-world phenomenon shown in the Milgram's 1967 mail transfer experiment [16].In a social network, two-people contacts frequently usually have social ties.Recently, several International Journal of Distributed Sensor Networks sociality-aware routing strategies have been proposed to improve the routing efficiency for the MSNs [17][18][19].
Both contact history-based and social network metricsbased routing schemes designed for MSNs are best-effort, and there is no delay guarantee for the message delivery from the source to the destination.While MSNs are transmission delay-tolerant, it is still curious that how much delay must be tolerated for the message to be delivered from source to the destination or the delay bounds for message delivery in certain scenarios.These bounds are particularly interesting for the future multimedia application over the MSNs.While most of the work for data delivery in MSNs focuses on the design of new routing schemes, researches about how to evaluate the message delay remain rare and we try to make a first step to fill this gap.
In our opinion, the difficulty of the delay estimation for MSNs comes from two aspects: the dynamics of network topology and the uncertainty of the routing path.Though the mobility of nodes makes the networks' physical topology change dynamically, the nodes' social behavior tries to maintain regular contacts among nodes with social ties.We believe, therefore, it is possible to predict the routing behavior, at least the bounds of end-to-end delay of message delivery in an MSN.
In this paper, we illuminate a unified framework for evaluating the delay of message delivery in MSNs.Firstly, we propose general routing models extracted from the existing routing schemes.Then, we evaluate the delay with stochastic process theory based on the extracted routing model.The contributions of our paper are in threefold.First, we extract characteristics from the existing MSN routing strategies and classify them into three general models.Second, we propose a general method to evaluate the end-to-end delay in an MSN.And last, we design simulation experiments to validate the theoretical model.To our knowledge, this is the first effort to estimate the end-to-end delay of message delivery in an MSN via a probability framework.
The rest of this paper is organized as follows.Section 2 reviews the existing work.Section 3 presents the abstract routing model of the MSNs.Section 4 analyzes the delay bounds of the three types of general routing models with stochastic process theory and gives out the estimation results.Section 5 validates the performance of the proposed probability delay model by comparing the theoretical result with the simulation, and Section 6 concludes the paper.

Related Work
There is much work on the routing schemes for MANETs [20], from which the DTNs were evolved.The researches about routing schemes in a DTN might originate from the work of Jain et al. [4], in which messages are to be moved end-to-end across a time-varying connected graph but the topology dynamics may be known in advance.After that, there is much progress on this topic [21,22].
Epidemic routing [23] which is originally designed for MANETs could deliver data fast and robustly by forwarding data to any encounters.To satisfy the limitation of energy and memory of nodes, a variety of relay selection strategies specially for DTNs are proposed [14,15,[17][18][19].Lindgren et al. proposed the ProPhet [14], making use of contact history to predict the probability of the meeting of the two nodes.Only nodes that can reach the destination with a higher probability will get the data from the relay.
Data dissemination in the delay tolerant MSNs is a critical component of many applications, for example, content update in an online social networks [24].Sociality-based routing schemes for the DTNs have also been studied in recent years.SimBet routing [17] uses two social metrics (centrality and similarity) to estimate or predict the probability that potential relay nodes may reach the destination.Unlike the ProPhet and Epidemic, only one copy of data exists in the networks by SimBet routing.After that, Daly and Haahr tried to improve SimBet by using multiple copy of data forwarding [18].Recently, Hui et al. [19] proposed a novel sociality-based forwarding algorithm, BUBBLE, which employed two social and structural metrics, namely, centrality and community.
Though there have been many research works on the design of routing schemes for the delay tolerant MSNs, model-based performance evaluation of these routing schemes is relatively scarce.Boldrini et al. [25] considered a utility-based cooperative data dissemination system in which the utility of data was defined based on the social relationships between users.Specifically, they designed a Markov model to characterize the data dissemination process in both its stationary and transient regimes.The main result of their analysis is that the data distribution process always converges to one of two possible stationary regimes.
Our study is obviously different from above work.In this paper, we concentrate on how to evaluate the delay from the source to the destination in a delay tolerant MSN by a probabilistic way.Based on the abstracted routing models, the data disseminating process is modeled as a stochastic process to estimate the delay to be experienced in such routing process.

Network Model.
In an MSN, nodes may be in a moving status and links between some nodes do not always exist.These characteristics make the topology of an MSN dynamic.However, during a certain time, the social relationships among the nodes are fixed.Being similar with [26,27], we assume the pairwise node intercontact time is exponentially distributed.We consider the links among nodes as the social relationships of their holders, and the network model can be described as follows.
Consider an MSN with  mobile nodes, which can be denoted as a graph  = (, ), where  is the set of nodes and  is the set of links between nodes.If there is social relationship between node  and node , let   ∈  denote the edge   between them.Letting   denote the weight of the link   , then the intercontact time between nodes  and  obeys exponential distribution with parameter   .And further, the contacts between nodes  and  form a homogeneous Poisson process with the contact rate   .

The Model of MSN
Routing.An MSN may experience frequent, persistent link partitioning and may never have a stable end-to-end path.Like in DTNs, routing in MSN employs the store-and-forward strategies over the opportunistic links.From various routing schemes existing currently, we extracted the following essential attributes which play the key roles for message delivery in an MSN.
(a) The relay selection strategy: this is without doubt the most important issue in the design of MSN routing, which directly affects the routing performance.Generally, we divided the strategies into two categories: nonstrategic that forwards data to any nodes it meets, and strategic that only forwarding data to those nodes which have better forwards quality.
(b) The data copying strategy: to improve the successful ratio of message delivery, the relays usually forward multiple copies of the data to their neighbors concurrently.However, this strategy significantly increases the consumption of the network resource.The data copying strategy here can be classified as single-copy and multicopy in general.The multicopy strategies can be further divided into finite-copy and infinitecopy strategies.While the routing scheme with finitecopy strategy usually forward fixed number of copies to the neighbors, the routing schemes with infinitecopy strategy simply forward the data to all the nodes it encounters.
(c) Clustering: while a good relay selection strategy can restrain the number of data copy to be forwarded, clustering is an effective approach to limit the overconsumption of the resource in a large scale MSN.According to whether the nodes will be clustered, the routing schemes for MSNs can be classified into layered routing and plain routing.
In this paper, only plain routing is considered.We extract three types of routing models from existing instances of routing schemes designed for MSNs based on the first two attributes described above.

Single-Copy Strategic Routing (SCSR).
The routing strategy refers that node carrying data will not indiscriminately forward data to whichever it encounters but only chooses those nodes that can forward the data to destination with one or more hops.The usual routing strategies include what we have mentioned before, for example, the contacts historybased and social metric-based relay selection.A typical routing scheme belonging to this type is SimBet.

Multicopy Routing without Strategies (MCR-WS)
. This is the simplest routing schemes.Mobile node will forward its carried data to any nodes it encounters and keep the data at the same time.An infinite-copy strategy is more popular than the finite multiple copy strategy.Epidemic routing belongs to this type of routing schemes.

Multicopy Strategic Routing (MCSR).
In this type of multicopy routing schemes, strategies are considered which makes routing decisions more complex.Since it combines the simplicity of multicopy routing and efficiency of strategic routing, it attracts more attention and most of the routing schemes belong to this type.Compared with MCR-WS routing, if the relay selection strategy is designed to be effective enough, MCSR could achieve similar performance while consuming less system resources.ProPhet is a typical routing scheme of this type.
3.3.Some Assumptions of the Routing Model.We make the following assumptions listed from weak to strong.
(1) There is enough time for the nodes to exchange their data when they contact.
(2) For the intercontact time is much longer than the data transmission time over the link, the latter would be ignored when evaluating the transmission delay.
(3) In the case of multicopy routing strategy, nodes' buffer capacity is large enough thus no packet will be discarded due to the lack of memory.( 4) TTL (time-to-live) field is set as time limitation and the data will be discarded actively if it does not arrive at the destination after TTL time's forwarding.
(5) All nodes will not receive the same data for two times by numbering the message with a global ID created by a Hash function.

Analysis
We will analyze the delay of message delivery in MSNs based on the three routing models in Section 3. Before the detailed analysis, we firstly present an important result in Lemma which is hypoexponentially distributed [28] according to the following lemma.
Lemma 1 (section 5.2.4 of [28]).For an opportunistic path with  hops, the corresponding edges weight as  1 ,  2 , . . .,  +1 , then the probability density function of  as in (1) is where By formula (2) and assumption 5, the expected delay of the path in Figure 1 is (3)

Delay of Multicopy Routing without Strategies.
We firstly analyze the case of MCR-WS and the typical routing of this type is Epidemic [23].Messages generated at source node will be forwarded to encountering nodes that have not received the data.Data will disseminate over the whole network quickly by multicopying in the network.As we know, MCR-WS usually enjoys the minimum delay but consumes much more system resources.
According to assumptions 1 and 2 in Section 3, the time needed to send data from the source to the destination is the sum of the intercontact time between intermediate nodes pairs.There might be several possible routing paths connecting the source to the destination.But due to the multicopy strategy, the transmission delay is always the minimum of delays along all the possible paths as the following lemma.
Proof.The proof is obvious and omitted.
Because the weight of every edge constructing a path can be got according to formula (2), the cumulative distribution function    () of each path's delay can be derived by Lemma 1.By Lemma 2, the cumulative distribution function of (, ) is given out in Proposition 3. Proposition 3. In the MCR-WS scheme, if { 1 ,  2 , . ..,  } are independent with each other, then the transmission delay from  to (, ) = min{ 1 ,  2 , . . .,   } follows the distribution and . By the position of the broken node and the next backing node, two subpaths can be found which have the same source   (broken node) and the same destination   (backing node).Because the two subpaths share no common intermediate nodes, they are independent.As a result, through Lemma 1 and Proposition 3, the expectation of transmission delay from   to   can be calculated as   , building a dummy connection between   and   instead of the paths which connecting them.The distribution of intercontact time   between   and   is approximated as exponential distribution with parameter      = 1/  .Through the above approximate treatment, two related paths could be merged into one path.What is more, if the same merging process is done on all the paths { 1 ,  2 , . . .,   }, a group of independent paths will be constructed.Finally, transmission delay evaluation of MCR-WS will be done using Proposition 3 directly.

Delay of Single-Copy Strategic
Routing.Now we consider the second case, SCSR.The strategy here refers to that relaying nodes will not forward data to any encounters but select those that could forward the data to the destination with better quality.There are several different relay selection strategies.Some maintain a local encountering probability by contact history and others calculate the utility value using social metrics.Despite of the distinct of the strategies, no strategy can guarantee that the relay selection will always reach the destination.It is assumed that the probability of selecting a right relay is a constant  Eval- .We will show how to estimate the value of  Eval- later.
Let  be the node forwarding data and () its neighbor set.Because not all neighbors can lead to the destination, let () ⊆ () denote nodes that can forward data to the destination and tend to be selected as the next relay by the strategy.Due to the single-copy strategy, only one of the nodes in () will receive data from .To make sure the data could be forwarded to the destination with highest probability, the intercontact history between node  and the candidates of relays is used to decide the probability of relay selection.Let   (  ) denote the probability that node   ∈ () is selected by  as the next relay.For node   ∈ (), it is obvious that the more frequently contacting with , the higher probability of   (  ).So we have (  ) :   (  ) =  (  , ) :  (  : ) ∀,  ∈  () . ( In most cases, there are many possible paths connecting the source  and the destination .All the possible path set can be found and it is denoted as PathSet (, ) = {( 1 , Prob 1 ), ( 2 , Prob 2 ), . . ., (  , Prob  )}, in which Prob  denotes the probability of selecting path   for message delivery.Assuming the path   includes the following nodes  1 (source),  2 , . . .,  −1 ,   (destination), then Because there is only one data copy in the network, with the assumption of independent probability of possible path selection, we can calculate the probability of the successful message delivery: What is more, it is easy to get the probability of successful message delivery of the real scenario and we denote it as  Real .Then letting  Eval =  Real , with ( 5), (6), and ( 7), the probability of selecting the proper relay  Eval- can be calculated.
Through above discussion, the preparation for the evaluation of SCSR has been done and then Proposition 4 is given as follows.
Proposition 4. In the type of SCSR, assume the possible routing path set is Path(, ) = {( 1 , Prob 1 ), ( 2 , Prob 2 ), . .., (  , Prob  )}, in which Prob  denotes the probability of selecting path   .And let   denote the transmission delay over path   .Then the delay  needed to send data from source to destination successfully has the following probability density function: where  Real = ∑  =1 Prob  and    () is the probability distribution function of the variable   .
Proof.Firstly, it is proved that   () satisfies the property of probability density function.Obviously, for all  ∈ ,   () ≥ 0 and it has So   () can be probability density function of variable .Then we prove that the relationship in (8) makes sense.
Because of the single-copy strategy, there is only one path that is selected to transmit data.The probability Prob  of each path being chosen can be calculated according to (5), (6), and (7), and the distribution functions of transmission delay   on each path are known; we have By the influence of policy, the data may not be sent to the destination for selecting the wrong path, which makes the transmission delay infinite.If the case of choosing the broken path is not taken into consideration, the sum of chosen probability of all the possible left paths ∑  =1 Prob  is less than 1, which makes the component of  uncompleted.But if we normalize the probability of selecting the proper path, that is to say, letting the probability Prob  be normalized as Prob  / Real , which would keep the component of  completed, then we can get the probability density function   () = ∑  =1    () × Prob  / Real under the condition of successful message delivery.
By Proposition 4, the expected time delay from the source node  to the destination  in this type of routing schemes can be easily got as

Delay of Multicopy Strategic
Routing.The multicopy strategy improves the probability of successful message delivery comparing with the SCSR.And the relay selection strategy reduces the overhead of MCR-WS.Spyropoulos et al. [15] proved that the performance of MCSR could achieve the performance of Epidemic routing if the strategy was designed properly.
The method of dealing with relay selection strategy is similar to that in SCSR.We still assume the probability of selecting the proper relay is a constant  Eval- .However, due to the multicopy strategy, every node in () is selected with the same probability  Eval- .All paths connecting the source  and destination  will be denoted as Path(, ) = {( 1 , Prob 1 ), ( 2 , Prob 2 ), . .., (  , Prob  )}.Then the probability of transferring data on path   is where |  | − 1 is the number of relays selected.Unless all the possible paths fail, the data will fail to be transmitted to the destination.So the probability of successfully sending data is then  Eval- and each probability Prob  can be calculated.To evaluate the transmission delay in the MCSR, Proposition 5 is given as follows.
Proposition 5.In the type of MCSR, node  sends message to node  and the all possible routing path set is Path(, ) = {( 1 , Prob 1 ), ( 2 , Prob 2 ), . . ., (  , Prob  )}.If the routing paths are independent with each other, the transmission delay  follows: Proof.We consider all the cases of  > .Firstly, if the data is sent successfully to the destination, it requires all transmission delay on every possible routing path that has   > .Because all   are independent with each other, through the distribution function of   , the probability of   >  for  = 1, 2, . . .,  is ∏  =1 (1 −    ()).For every possible routing path is selected at the probability of Prob  , the probability of   >  should be multiplied by Prob  .Then we have   () = 1 − ∏  =1 (1 −    ()) × Prob  .Secondly, we verify whether   () fulfills the definition of distribution function.Obviously, it is true that lim  → ∞   () = 1, for lim  → ∞    () = 1, for all  = 1, 2, . . ., .Proposition 5 requires that all possible correlated routing paths are independent with each other.Some fixing similar to MCR-WS should be done.Two related routing paths that share intermediate nodes will be merged.The merging process is as same as the process in MCR-WS, but the probability of the new merged routing path should be re-calculated.Assume, in the merging process, two subpaths which have  1 and  2 nodes, respectively, include their common head node   and tail node   .After they have been merged, the probability of having data flow from   to   is 1 − (1 − ( Eval- )  1 )×(1−( Eval- )  2 ).Through the path merging and probability modifying process, a new independent routing path will be got.Then using Proposition 5, the transmission delay in MCSR can be calculated.

Simulation
In this section, we validate our delay model by comparing the theoretical delay model with the simulation results.All of the three routing schemes corresponding to each routing model are included: Epidemic [23] for MCR-WS, SimBet [17] for SCSR, and ProPhet [14] for MCSR.

Simulation Setup.
The simulation experiment is developed based on the general discrete event simulation platform OMNET++ [29].Though the scale of network in our simulation could be set as any size only if it is within the resource constraints of the simulator, the theoretical delay model has no relevance to the scale.In the simulation, 30 nodes are randomly placed inside an area of 500 × 500 m 2 .IEEE 802.11bDCF is used as the MAC layer protocol.The radio propagation range of all nodes is set as 100 meters.The nodes move according to the Gauss-Markov mobility models [30], with a minimum speed of 0 m/s and a maximum speed of 5 m/s.When two nodes are in the radio propagation range, it is considered that they meet and can communicate with each other.As models omit the transmission delay of the channel, the transmission delay of all channels is set as 0. During the simulation, the source and destination nodes are selected randomly from the node set.Data packets are generated by the source node every five seconds and 3000 packets will be sent during this interval.The time-to-live fields of these packets are set as 15 s, which means the message will be discarded if it has not arrived at the destination after 15 seconds.The intercontact time of nodes meeting each other is recorded to provide the social ties for the modelbased numerical experiment.We approximate the parameter of the exponential distribution as the reciprocal of the average intercontact time.The parameter setting of ProPhet is as same as in [14], that is,  init = 0.75,  = 0.25, and  = 0.98.And the parameters of SimBet are just like in [17], that is,  =  = 0.5.To simulate the packet routing process, every data packet is forwarded on the meeting of the expected nodes according to the routing schemes and the transmission delay is traced.

Model Evaluation.
We try to evaluate the accuracy of the theoretical model by comparing the estimated delay with the corresponding statistics of the simulation result.We have already derived the probability density function or cumulative distribution function of the transmission delay of each routing model, respectively, as in (4), ( 8) and (14).Denote all the delays traced in the simulation as { 1 ,  2 , . . .,   } and divide the time interval [0, ] into constant  parts equally.The constant  should be carefully set.Then counting the number   of   belongs to each subinterval and let   / be the probability of the subinterval.In one simulation, the comparison results are shown in Figures 2, 3, and  4. It is obvious that the delay in simulation matches very well the delay calculated by the proposed theory model.
Then we run the simulation for 20 times of each routing model, each of which includes sending 3000 data packets.Each time the network scale is also  = 30 but with different network topology.Through the simulation, the average transmission delay is obtained and the expected delay by the model could also be calculated.Pearson Correlation Coefficient [31] is a measure of the correlation between two variables  and , giving a value between +1 and −1 inclusive.The more close to +1 or −1, the more related they are.It is commonly denoted with  and can be obtained from the following formula: We use Pearson Correlation Coefficient to show further how much the average transmission delay in simulation matches the theoretical result.The result presented in Table 1    shows that the simulation delays and theory-calculated delays are highly related.

Conclusions
In this paper, we proposed a probability model to estimate the delay of message delivery in the delay tolerant mobile social networks.We firstly extracted three general routing models from the existing various routing schemes for MSNs.According to an elegant result in probability theory, we constructed probability delay models for each of the three routing models.
Then the simulation experiments were designed to validate the accuracy of the theoretical delay model.It was found that the delay statistics from the simulation trace matched very well the theoretical results, which means that proposed model is quite accurate for the prediction of delay of message delivery in an MSN.
During the model construction, we omitted the storage limitation of the nodes in an MSN.In the future work we will enhance the proposed delay model considering the message loss from the storage limit.For only the plain routing schemes for MSNs being considered in the current work, delay estimation for the layered routing schemes is also a valuable extension to this work.