The Role of Heterogeneity in Autonomous Perimeter Defense Problems

When is heterogeneity in the composition of an autonomous robotic team beneficial and when is it detrimental? We investigate and answer this question in the context of a minimally viable model that examines the role of heterogeneous speeds in perimeter defense problems, where defenders share a total allocated speed budget. We consider two distinct problem settings and develop strategies based on dynamic programming and on local interaction rules. We present a theoretical analysis of both approaches and our results are extensively validated using simulations. Interestingly, our results demonstrate that the viability of heterogeneous teams depends on the amount of information available to the defenders. Moreover, our results suggest a universality property: across a wide range of problem parameters the optimal ratio of the speeds of the defenders remains nearly constant.


Introduction
An increasingly important task for robotic systems is defending an area against external agents that pose varying levels of threat. Examples include defending airports against intruding and flight-grounding drones [6], defending wildlife habitats against trespassing poachers [1], extinguishing and preventing the spread of devastating wildfires caused by human or natural activity [8], as well as military applications [13]. (This work was supported by the Army Research Laboratory as part of the Distributed and Collaborative Intelligent Systems and Technology (DCIST) Collaborative Research Alliance (CRA).)
In general, solutions to perimeter defense problems amount to finding strategies for a set of agents restricted to the perimeter of an area, who are tasked with defending the area from intruders trying to breach it [16].
Compared to a homogeneous team of robots, a team of robots with varying capabilities (a heterogeneous team) comes with its own set of advantages and challenges. Equipping different agents with different capabilities can lead to synergy effects, where the heterogeneous system outperforms the alternative homogeneous system composed of identical agents. As a result, over the last decade there has been significant interest in the robotics community in defining, exploring, and quantifying heterogeneity in different robot applications [19,14,11,7,12,10].
This paper investigates the impact of heterogeneity in multi-robot teams for the perimeter defense problem. We propose two optimal strategies, valid under different assumptions. The first strategy is based on dynamic programming (DP) [2]. It is optimal when the defenders are able to predict the locations of the incoming attacks, but it suffers from the curse of dimensionality and therefore carries a relatively high computational cost. The second strategy is based on local interaction rules and is optimal when the defenders have no information about the incoming attacks. This strategy can be efficiently computed in an online fashion, but it does not exploit any prior knowledge of the attack locations.
We prove the optimality of both strategies and analyze their time complexities. The algorithms are extensively validated in simulation. Our numerical experiments are two-dimensional, but the majority of the theoretical results remain valid in any dimension. This includes three-dimensional perimeters in applications involving drones, and higher-dimensional perimeters arising as constraint sets in a state space of arbitrary dimension.
Our results show that heterogeneity is beneficial when the defenders have access to information about the incoming attacks, and detrimental when they have no such information. Moreover, we exhibit a universality property: in the two-defender setting, the optimal ratio of the defenders' speeds remains nearly constant across a wide range of problem parameters.
Related work: Perimeter defense problems are a variant of pursuit-evasion problems, which have been studied extensively in the literature. The seminal work of Isaacs develops differential-game approaches to arrive at equilibrium strategies for one-pursuer, one-evader games [4]. Researchers from various communities have devoted considerable effort to variants of pursuit-evasion games involving multiple pursuers and evaders [20,21,3]; these works view the game from the pursuers' side, the evaders' side, or both. The curse of dimensionality poses a considerable challenge in solving problems involving multiple pursuers and evaders. The perimeter defense problem presented in this paper is a variant of the target guarding problem, first introduced by Isaacs [4], in which an agent is tasked with guarding a region of interest against an adversarial agent. Investigations of perimeter defense problems are still in their nascent stage. The review paper by Shishika and Kumar [16] surveys recent work on multi-robot perimeter defense [15,5,18,17]. Unlike the problems considered in those works, we consider a class of perimeter defense problems in which the number of attackers is much larger than the number of defenders.

The remainder of the paper is organized as follows. Section 2 contains our notation together with the problem statement. Sections 3 and 4 detail our theoretical results in the infinite and unit-time horizon cases, respectively. Section 5 concludes with simulation results.

Problem statement
In this paper, bold letters represent vectors and non-bold letters represent scalars. Calligraphic letters represent sets, and |S| denotes the cardinality of a set S.
For any positive integer n ∈ Z+, [n] denotes the set {1, 2, . . ., n}. For a domain X with x1, x2 ∈ X, dist(x1, x2) denotes the length of the shortest path between x1 and x2 contained inside X. For example, when X is a circle of radius r,

dist(x1, x2) = r min(|θ1 − θ2|, 2π − |θ1 − θ2|),     (1)

where θ1, θ2 are the polar angles of x1 and x2, respectively.
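As a quick illustration, the arc-length distance on a circle can be computed as follows (a minimal sketch; the function name and default radius are ours, not the paper's):

```python
import math

def circle_dist(theta1: float, theta2: float, r: float = 1.0) -> float:
    """Arc-length distance between two points on a circle of radius r,
    given their polar angles theta1, theta2 (in radians)."""
    dtheta = abs(theta1 - theta2) % (2 * math.pi)
    return r * min(dtheta, 2 * math.pi - dtheta)
```

The `min` takes care of wraparound, so antipodal points are at distance πr and never more.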

Perimeter defense against point attacks
For ease of reference, the notation of this section is summarized in Table 1.
Our problem is perimeter defense against point attacks by mobile defenders of varying speeds. Specifically, we have a perimeter X in d-dimensional space, with a distance metric dist, defended by m mobile defenders with speeds v1, . . ., vm, so that defender i at x ∈ X at time t can make it to x′ ∈ X at any time t′ ≥ t + dist(x, x′)/vi. Without loss of generality we order the defenders from fastest to slowest, i.e. v1 ≥ · · · ≥ vm, and we denote the speed vector by v = (v1, . . ., vm). Then n attacks (zj, tj) ∈ X × R≥0 arrive, where zj is the location on X at which attack j happens and tj is its time; WLOG we order the attacks by time, i.e. t1 ≤ t2 ≤ · · · ≤ tn.
Because attacks happen at fixed locations and times, they cannot react to the positions of the defenders.
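The reachability condition above reduces to a one-line check; this is an illustrative helper (names are ours), not part of the paper's algorithms:

```python
def can_reach(dist_xy: float, v: float, t: float, t_prime: float) -> bool:
    """Whether a defender of speed v, currently at distance dist_xy from a
    target location at time t, can be at that location by time t_prime >= t."""
    return dist_xy <= v * (t_prime - t)
```

For instance, a defender of speed 0.5 one unit of time away from an attack can cover at most distance 0.5 before it lands.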
Fig. 1: Three defenders facing three attacks, with the unit-time reachable set of each defender shown. Note that the third dimension is time; if an attack represents a physical object, it approaches from somewhere outside the circle, but we are only concerned with where and when it hits the perimeter. In this example the defenders are not allowed to leave the perimeter, so the size of the reachable set scales linearly with speed (until it covers the whole perimeter).
An attack (z j , t j ) is thwarted if and only if a defender is present, i.e. there is some defender i at z j at time t j ; otherwise, we say that the attack breaches the perimeter.The goal is to design a policy for the defenders that minimizes the number of attacks that breach the defenses, and to study the effectiveness of different defender speed combinations against attacks.
Additionally, the team of defenders has a horizon h under which they can see attacks: specifically, at time t, any attack (z j , t j ) is known to the defenders if and only if t j ≤ t + h.We will study in particular the unit horizon h = 1 and the infinite horizon h = ∞ (all attacks are visible from the start).
Finally, the defenders are allowed to start at t = 0 at any combination of locations in X ; they are even allowed to choose their starting locations based on the attack sequence (up to horizon h).
Given a speed vector v and a sequence of attacks {(zj, tj)}nj=1, we define Opt(v, {(zj, tj)}nj=1) as the minimum number of attacks from {(zj, tj)}nj=1 that defenders with speeds v let through under optimal play (with all attacks known). In some cases we deal with Opt(v, {(zj, tj)}nj=1) for one sequence of attacks over many defender speed vectors v; in that case we write Opt(v) for convenience.

Different settings
Within the above problem description there are several variations, mostly concerning how the attacks are generated and the length of the horizon h. We roughly divide attack sequences into two settings:
1. Any sequence of attacks (zj, tj) is legitimate.
2. Attacks must come at unit time intervals, i.e. tj = j.
Note that in setting 2 we do not lose generality by having the attacks happen at unit time intervals, since we can rescale the time units (and adjust the defenders' speeds accordingly). Since the index j is superfluous in setting 2, we refer to the sequence of attacks as z1, z2, . . ., zn, indexed by t.
In setting 1, we study the case where all attacks are known to the defenders from the start; our primary problems are to (i) find an algorithm for the defenders' movements that minimizes the number of breaches, and (ii) study the behavior of optimal defense against uniformly-random attacks (in both location and time) for different combinations of defenders. Since setting 1 is more general, the algorithms also apply to setting 2.
In setting 2, we study the cases where the attacks are (i) generated uniformly at random in location (the times are fixed) and (ii) generated by an adversary that wants to guarantee a breach with as few attacks as possible. We also consider both the case where all attacks are known to the defenders from the start (h = ∞) and the case where attack t only becomes known at time t − 1 (h = 1).
Remark 1. Here we deal with the case where the number of defenders is fixed, and the question is how fast to make each defender (in particular, whether or not to make them all the same speed). The alternative case of varying the number of defenders is investigated in the Appendix, especially the tradeoff between fewer, faster defenders and more, slower ones.

Dynamic programming with infinite horizon
We now give an algorithm which, given defender speeds v = (v1, . . ., vm) and attacks {(zj, tj)}nj=1, returns two things: (i) Opt(v, {(zj, tj)}nj=1) (the minimum number of attacks that must be let through); and (ii) the list of lists ℓ = (ℓ1, . . ., ℓm), where ℓi is the (sub)sequence of attacks which defender i should thwart. We refer to ℓ as a defense plan.
Recall that by default the attacks are sorted in order of arrival time (or the user should sort them before applying the algorithm).
Fig. 2: Computing f(6, 2, 4) (defender 1 must thwart attack 6, defender 2 attack 2, defender 3 attack 4) recursively; each defender is allowed to thwart attacks prior to these, but not afterwards. Since 6 is the maximum value, we consider the last attack that defender 1 can handle before 6: based on its speed, this can be 0 (defend nothing before 6), 1, or 3. Thus f(6, 2, 4) = min(f(0, 2, 4), f(1, 2, 4), f(3, 2, 4)) − 1.

The pseudocode is given in Alg. 1, in which we use the following notation: j = (j1, . . ., jm) ∈ {0, 1, . . ., n}m denotes a vector of attacks assigned to the defenders (with ji = 0 indicating no attack assigned to defender i; we allow the ji's to be non-distinct even though it is redundant); j−i(j′) denotes j with the ith entry replaced by j′; 1i(j′, j) is the indicator that defender i is capable of thwarting attack j after thwarting j′ (with 1i(0, j) = 1, since defenders can start anywhere); [•] + [•] denotes concatenation (of lists); and arg min (arg max) denote the sets of values minimizing (maximizing) their arguments. The for-loop in Alg. 1 iterates in lexicographic order, skipping f(0, . . ., 0) (whose value is already known) so that the recursion is well-founded. The proof of the following result is contained in the Appendix:

Theorem 1. Alg. 1 outputs the correct values of Opt(v, {(zj, tj)}nj=1) and ℓ.
Remark 2. Alg. 1 relies on the subtle point that i∗ ∈ arg maxi ji: if not, then we would not know whether to subtract 1 in the update; by choosing i∗ ∈ arg maxi ji, we remove the question of whether a defender assigned to a later attack could also thwart attack ji∗.
Remark 3. Alg. 1 assumes that the defenders can start at whatever locations they want, but it can be modified for fixed defender starting locations (or a set of possible starting locations) by redefining 1i(0, j) to indicate whether defender i can reach attack j from its starting location. It can also be modified for the important case where attacks cause varying amounts of damage, with attack j doing wj damage should it not be intercepted; see for instance the Iron Dome missile defense system, which prioritizes attacks based on potential damage estimates [13]. To make this modification, replace −1 with −wji∗(j) in line 7 and f(0, . . ., 0) = c = n with f(0, . . ., 0) = c = Σj wj.
Algorithm 1: Dynamic programming for infinite horizon defenders.
Choose j∗(j) ∈ arg minj′ { f(j−i∗(j)(j′)) : j′ < ji∗(j) and 1i∗(j′, ji∗(j)) = 1 }.

Given m defenders and n attacks, the number of computations needed to run Alg. 1 is on the order of (n + 1)m+1: we run through (n + 1)m values of j, and each update takes up to n time for the comparisons.
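To make the recursion concrete, here is a minimal Python sketch of the dynamic program for small instances on a circle of circumference 1. It follows the recursion described in the Appendix, but the code itself (names, the representation of attacks as (location, time) pairs) is our illustrative reconstruction, and it returns only the optimal value, not the defense plan:

```python
from itertools import product

def circ_dist(a: float, b: float) -> float:
    """Shortest distance on a circle of circumference 1."""
    d = abs(a - b) % 1.0
    return min(d, 1.0 - d)

def opt_defense(v, attacks):
    """Minimum number of breaches for defenders with speeds v against
    attacks [(z_1, t_1), ..., (z_n, t_n)].  State j assigns to each
    defender the last attack it thwarts (0 = none); f(j) is the minimum
    number of attacks let through under that constraint."""
    m, n = len(v), len(attacks)
    attacks = sorted(attacks, key=lambda a: a[1])  # sort by arrival time

    def ok(i, jp, j):
        """Can defender i thwart attack j right after thwarting jp?"""
        if jp == 0:
            return True  # defenders choose their start locations freely
        (z1, t1), (z2, t2) = attacks[jp - 1], attacks[j - 1]
        return circ_dist(z1, z2) <= v[i] * (t2 - t1)

    f = {(0,) * m: n}  # base case: no assignments, all n attacks breach
    best = n
    # Lexicographic order guarantees subproblems are computed first.
    for j in product(range(n + 1), repeat=m):
        if max(j) == 0:
            continue
        top = max(j)
        S = [i for i in range(m) if j[i] == top]
        i_star = S[0]  # any defender assigned to the latest attack
        cands = []
        for jp in range(top):  # last attack i_star thwarts before top
            if ok(i_star, jp, top):
                key = j[:i_star] + (jp,) + j[i_star + 1:]
                cands.append(f[key])
        # Subtract 1 only if i_star is the unique defender assigned to top.
        f[j] = min(cands) - (1 if len(S) == 1 else 0)
        best = min(best, f[j])
    return best
```

For example, a single defender of speed 1/2 can thwart attacks arriving at unit time intervals anywhere on the circle, since no point is farther than 1/2 away.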

Monotonicity-based computational acceleration
In order to investigate team heterogeneity, we compute Opt(v, {(zj, tj)}nj=1) for all v whose elements vi lie at g evenly-spaced locations in a range (vmin, vmax]. We refer to g as the number of grains. If we were to run Alg. 1 for all combinations v of speeds, the complexity would become O((n + 1)m+1 gm), which grows extremely large very quickly.
However, as each attack sequence is evaluated on all v, we can take advantage of the monotonicity of Opt over v to reduce the amount of computation needed.
In particular, for any attack sequence, Opt(v) ≤ Opt(v′) whenever v ≥ v′ elementwise, since faster defenders can always emulate slower ones and thus achieve an (at least) equally good result on any attack sequence. This means that if Opt(v) = Opt(v′′) = k for some v ≤ v′′ (elementwise), then Opt(v′) = k for every v′ with v ≤ v′ ≤ v′′; thus we know Opt(v′) = k for a range of v′ without having to run Alg. 1.
Taking the set of values v ∈ (vmin, vmax]m (at the given number of grains), for any attack sequence {(zj, tj)}nj=1 we can evaluate Opt(v, {(zj, tj)}nj=1) in a strategic order so as to minimize the number of times we need to run Alg. 1. This is discussed in greater detail in the Appendix.
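One way to realize such a strategic order for m = 2 is a recursive quadrant fill: evaluate the two extreme corners of a box of speed combinations, and if they agree, monotonicity pins every interior value. This is a hypothetical sketch of the idea, not necessarily the order used in the Appendix; `opt_fn` stands in for running Alg. 1 on a fixed attack sequence:

```python
def fill_opt_grid(speeds, opt_fn):
    """Evaluate opt_fn(v1, v2) on a g x g grid of speed combinations,
    skipping evaluations whenever monotonicity pins the value: Opt is
    non-increasing in each speed, so if the two extreme corners of a box
    agree, every grid point inside the box takes the same value.
    Returns the filled grid and the number of opt_fn calls made."""
    g = len(speeds)
    grid = [[None] * g for _ in range(g)]
    calls = 0

    def val(i, j):
        nonlocal calls
        if grid[i][j] is None:
            grid[i][j] = opt_fn(speeds[i], speeds[j])
            calls += 1
        return grid[i][j]

    def fill(i0, i1, j0, j1):
        # (i0, j0) is the slowest corner of the box, (i1, j1) the fastest.
        if val(i1, j1) == val(i0, j0):
            for i in range(i0, i1 + 1):
                for j in range(j0, j1 + 1):
                    grid[i][j] = grid[i0][j0]
            return
        # Corners disagree: split the box into (up to) four quadrants.
        im, jm = (i0 + i1) // 2, (j0 + j1) // 2
        for a, b in {(i0, im), (min(im + 1, i1), i1)}:
            for c, d in {(j0, jm), (min(jm + 1, j1), j1)}:
                fill(a, b, c, d)

    fill(0, g - 1, 0, g - 1)
    return grid, calls
```

Large constant regions (e.g. where both defenders are fast enough for perfect defense) are then filled with only two evaluations each, which is where the reported savings come from.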

Unit Horizon Theoretical Results
This section considers defenders with a unit horizon of incoming attacks. The general setup is:
- We consider a perimeter X homeomorphic to S1 (a circle), with distances determined by arc length and total length normalized to 1; we represent X = [−1/2, 1/2] (with −1/2 and 1/2 identified). For this setting we define the distance function dist(y1, y2) = min(|y1 − y2|, 1 − |y1 − y2|) (a rescaled version of (1)). The maximum possible value of dist(y1, y2) is 1/2, and we assume the defenders start at maximum distance from each other, i.e., at antipodal points.
- The n attackers are generated according to setting 2 from Section 2.2: attacker t appears at time t, uniformly (and independently) distributed over X.
- The defenders have a unit time horizon: at any given time they only see the next incoming attack, though they also know n and the current time t.
Therefore the defenders' policy can be thought of as a sequence of decisions taken at unit time intervals (i.e. when the next attack is revealed), which is naturally formulated as a Markov Decision Process (MDP) [9] with n steps, with the reward being the number of thwarted attacks.
To simplify the MDP we can remove one state variable: by symmetry, we can rotate X (or relabel it) so that at the beginning of any time step defender 1 is at location 0, and reflect it so that defender 2 is on the positive half. Thus the state at time t (just before the location of the next attack is revealed) can be described by a single parameter a(t), the distance between the two defenders. Then the next attack's location x(t + 1) is revealed in this coordinate system relative to the defenders' positions.
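The rotation-and-reflection argument can be sketched as a small normalization routine (an illustrative helper of ours, using a circle of circumference 1 with positions in [0, 1)):

```python
def normalize_state(p1: float, p2: float, x: float):
    """Rotate the circle (circumference 1) so defender 1 sits at 0, then
    reflect if needed so defender 2 lies in [0, 1/2].  Returns (a, x_rel):
    the defender separation and the attack location in the new frame."""
    a = (p2 - p1) % 1.0       # rotate: defender 1 -> 0
    x_rel = (x - p1) % 1.0
    if a > 0.5:               # reflect: defender 2 on the positive half
        a = 1.0 - a
        x_rel = (1.0 - x_rel) % 1.0
    return a, x_rel
```

After normalization, the MDP state is just `a`, and the attack location is drawn in the normalized frame; the defenders' absolute positions never need to be tracked.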

Policy and Reward
A unit-horizon policy is a function f whose inputs are a(t), x(t), and the number of remaining attacks, and whose output is f(a(t), x(t)) = a(t + 1), the distance between the two defenders at time t + 1. f must satisfy the feasibility condition that the defenders can actually achieve separation a(t + 1) within one time step given their speeds. The policy then produces a reward r(t), based on whether the chosen movement makes it possible for the current attack to be thwarted (r(t) = 1 if so, r(t) = 0 if not). The reason for this is that, by symmetry (of the perimeter and of the attacks), given the distance a(t + 1) = f(a(t), x(t)) between the defenders at the start of the next step, the ability of the defenders to stop future attacks does not depend on their absolute locations. Thus, if the defenders can stop the current attack and end at distance a(t + 1) for the next step, this is always preferable to ending at the same distance without making the capture. Hence r(t) = 1 under policy f if and only if this is possible, which splits into two cases: (i) defender 1 makes the capture (dist(0, x(t)) ≤ v1); (ii) defender 2 makes the capture (dist(a(t), x(t)) ≤ v2). If either is feasible, r(t) = 1; if neither is, r(t) = 0.

Remark 4. If dist(a(t), x(t)) > v2 and dist(0, x(t)) > v1, then neither defender can reach the next attack, and hence r(t) = 0 no matter what.
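Under this formulation, the single-step reward reduces to a reachability check. A minimal sketch (names ours), in the normalized frame where defender 1 sits at 0 and defender 2 at a:

```python
def dist_circ(y1: float, y2: float) -> float:
    """Shortest distance on a circle of circumference 1."""
    d = abs(y1 - y2) % 1.0
    return min(d, 1.0 - d)

def reward(a: float, x: float, v1: float, v2: float) -> int:
    """r(t): 1 if some defender can reach the attack at x within one time
    step (defender 1 at position 0, defender 2 at position a), else 0.
    This checks reachability only; it ignores the interaction with the
    chosen next-step separation a(t+1)."""
    return 1 if dist_circ(0.0, x) <= v1 or dist_circ(a, x) <= v2 else 0
```

This is exactly the condition in Remark 4 read in the positive direction: if both distance checks fail, no movement choice can earn the reward.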

Optimal defender policy
Fix a defender policy f. For a given total number N of incoming attacks and an initial distance a between the two defenders, we define the expected reward J(a; N) of the defenders as the expected total number of thwarted attacks, where the expectation is over the attack locations x(t). With this definition, we are interested in determining the policy f that leads to the highest expected reward. We show in the Appendix that for a wide range of values of v1, v2, and N, the optimal strategy should (i) always thwart the currently-known attack if possible.
We next prove that, subject to (i), the optimal policy should (ii) always maximize a(t + 1). That is, a(t + 1) is maximized for all inputs, over all policies that satisfy (i) (i.e. capture when possible).
We next show necessary and sufficient conditions for perfect defense, i.e. when no (fixed-time) attack sequence can force a breach.
Theorem 2 (The perfect defense theorem). For any pair of defenders with speeds v1, v2 where v2 ≤ v1, there exists a sequence of attacks that forces a breach if and only if 2v1 < 1 and v1 + 3v2 < 1; otherwise the defenders can defend indefinitely, even with a one-step horizon. Furthermore, if any sequence of attacks guarantees a breach, there is a sequence of at most 6 attacks that does so.
Both proofs are given in the Appendix.

Simulation Results
We conduct simulations for each of the settings from Section 2.2. Our experiments are run as follows:
1. Generate attacks {(zj, tj)}nj=1 randomly, either with fixed attack times tj = j or with uniformly-random attack times in [0, tmax].
2. Compute Opt(v, {(zj, tj)}nj=1) for v ∈ (vmin, vmax]m, at g intervals.
3. Repeat the above for T trials and average the resulting values for each v.
We conduct all of our experiments on a circular perimeter of circumference 1, where the defenders are not permitted to leave the perimeter (so maximally distant points are at opposite ends, at distance 1/2). Comparing the results sheds light on the conditions that favor heterogeneous defender teams and those that favor homogeneous teams and/or single super-defenders.
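Step 1 of this pipeline can be sketched as follows (an illustrative generator of ours, using a circle of circumference 1 with locations in [0, 1)):

```python
import random

def generate_attacks(n, setting, t_max=None, rng=None):
    """Sample a random attack sequence on the unit-circumference circle.
    setting 1: uniformly-random locations and times in [0, t_max];
    setting 2: uniformly-random locations at fixed unit times t_j = j.
    Returns the attacks sorted by time."""
    rng = rng or random.Random()
    if setting == 2:
        return [(rng.random(), float(j)) for j in range(1, n + 1)]
    times = sorted(rng.uniform(0.0, t_max) for _ in range(n))
    return [(rng.random(), t) for t in times]
```

Seeding the generator per trial makes every defender speed combination face the same T attack sequences, which is what makes the comparison across speed vectors fair.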
The structure of the simulations means that each combination of defender speeds is evaluated on the same set of attack sequences, which makes the comparison fairer. It also allows us to significantly speed up the computation when evaluating Opt(v, {(zj, tj)}nj=1) for many values of v on a single attack sequence, by exploiting the fact that Opt is a monotonically-decreasing step function of v (as described in Section 3.2).
The full list of parameters is given in Table 3.

Simulation Results
In Figure 3, we simulate sequences of n = 25 attacks in both settings, where the perimeter X is a circle of circumference 1 and there are m = 2 defenders; for uniformly-random attack times we set tmax = 25 to get the same density of attacks in both cases. This is analyzed over the speed range (vmin, vmax] = (0, 0.6] with g = 256 grains. The left column shows results for uniformly-random attack times; the right column shows results for fixed attack times. The results are given as surface plots, taking defender speeds v1, v2 and returning Opt(v1, v2) (ignoring the assumption in the analysis that v1 ≥ v2, so the plots are symmetric about the line v1 = v2). We give:
- Top row: Opt(v1, v2) for a single sequence of attacks. This can be viewed as T = 1, and is meant to visualize how adjusting the defenders' speeds changes the ability to defend against a particular sequence. Since Opt(v1, v2) takes integer values, we get a monotonically-decreasing step function.
- Middle and bottom rows: Opt(v1, v2) averaged over T = 200 randomly-generated attack sequences. The middle row gives the front view to show the overall shape; the bottom row gives the back view to show the ridge at v1 = v2. This ridge, which appears for both uniformly-random and fixed attack times, shows that on average homogeneous defenders are less efficient (per combined speed) than heterogeneous defenders.
From this we can make a number of interesting observations:
- Opt(v1, v2) is generally larger for uniformly-random attack times, as attacks that are close together in time are much harder to defend. In particular, with fixed attack times Opt(v1, v2) = 0 for sufficiently large defender speeds (a single defender of speed 1/2 already suffices, since no point of the perimeter is farther than 1/2 away).
- As mentioned, there is a ridge at v1 = v2 (the back view makes it clearly visible). This shows that on average, homogeneous defenders are less effective than well-designed heterogeneous ones.
- Under uniformly-random attack times, each 'half' (cutting at the v1 = v2 line) is empirically convex, while under fixed attack times, each 'half' is convex near the v1 = v2 ridge but becomes concave near the edge of the plot (as seen in the back view) and as the defender speeds increase (as can be seen on the edge in both views).
We also consider the question: what is the optimal mix of defender speeds? To answer this, we need to decide what to hold constant, since faster defenders are always better; an obvious starting point is to look at defenders of a fixed total speed and ask what ratio of speeds performs best. This also means that we are comparing defender teams whose reachable sets have equal total size (ignoring overlaps), and (because we evaluate over a grid of possible values of v) that we compare the values of Opt(v) along a diagonal line.
In Figure 4, we show the best (empirical) mixture: for each value of the total speed vtot = v1 + v2, the fraction of the speed 'budget' assigned to the slower defender that minimizes the average number of breaches. A value of 0.5 signifies that homogeneous defenders are best; a value of 0.0 signifies that a single super-defender is best; and a value in between signifies that some heterogeneous mix of defenders is best. These are based on the same experiments as shown in Figure 3. Note that the fixed-attack-times graph ends at vtot = 0.5; past that point, both a single super-defender and homogeneous defenders defend perfectly, so measuring the minimum no longer makes sense. However, it is striking that the benefits of a heterogeneous team persist so close to that threshold, and that the optimal ratio remains relatively stable over a wide range of speed 'budgets' in both settings.

Computational complexity of simulations:
The results of the monotonicity-based computational acceleration discussed in Section 3.2 can be seen in Figure 5, corresponding to the simulations shown in Figure 3. As before, the left-hand column gives the results for uniformly-random attack times and the right-hand column the results for fixed attack times, while the top row represents a single trial (corresponding to the top row of Figure 3) and the bottom row corresponds to the average over T = 200 trials.
Each square is a 256 × 256 grid, representing the 256² combinations of speeds v for which we want to compute Opt(v); the shade of a given point represents the fraction of trials in which Alg. 1 had to be run for that specific v (as opposed to the value already being known due to monotonicity), running from yellow (Alg. 1 never had to be run) to purple (Alg. 1 always had to be run). Note that because the top two graphs each represent a single trial, every point in them takes a value of either 0 or 1.
We note a few things: (i) the savings increase strongly where E[Opt(v)] is flatter (this is expected, since ∇vE[Opt(v)] corresponds to the probability that there is a step at v, and having a step nearby means the monotonicity condition is less likely to be satisfied); (ii) there are darker points at regular intervals (such as in the center), which correspond to the combinations that are evaluated earliest.
Even with m = 2 and the strategic use of monotonicity, which can save up to about 95% of the running time, the computational cost grows quickly.

Fig. 5: Monotonicity savings for the trials depicted in Figure 3. Uniformly-random attack times on the left, fixed attack times on the right. Axes are labeled by position in the vector of possible speeds (0 to g − 1). The top row is for one trial (corresponding to the single trials shown in Figure 3) and the bottom row is the average over 200 trials.

Simulations for unit horizon
Simulation results for the case of two defenders on a circular perimeter with unit horizon are shown in Figure 6. Note that in this case heterogeneity is not merely unhelpful but detrimental: the optimal speed allocation is either to assign the entire speed budget to one defender or to split it equally between the two.

Fig. 6: Unit-horizon case: 2 defenders evaluated at g = 128 grains for speeds (vmin, vmax] = (0, 0.5], over 10000 trials with n = 25 attacks. Left: back view; note the lack of the 'ridge' seen in Figure 3. Right: front view.

Conclusion
We introduced and studied a minimal model to map out how and why heterogeneity in robotic teams affects performance in perimeter defense applications.
On the one hand, we showed that a heterogeneous team achieves better performance when full information of the oncoming attacks is available to the defenders.Moreover, we uncovered a seemingly universal behavior, where the ratio of optimal defender speeds is nearly constant for a range of problem parameters.
On the other hand, we proved that heterogeneity is detrimental to the system's performance in the converse case where minimal attack information is available.These results suggest that heterogeneity is potentially a non-robust property, since less system information dramatically decreases its usefulness.
Future directions involve quantifying and studying the use of heterogeneity when intermediate levels of information are available to the defenders. This would probe the existence of a phase transition where heterogeneity changes from decreasing to improving system performance. Possible scenarios include varying the horizon length of incoming attacks between the cases of 1 and ∞ considered in this paper. Another scenario augments the unit time horizon with knowledge of the number of remaining attacks; in particular, we conjecture that even in this case the defenders should always capture attacks when possible and that heterogeneity remains detrimental. Lastly, we wish to perform numerical simulations for larger numbers of defenders.
Alg. 1 depends on the function f(j) : {0, 1, . . ., n}m → N, defined as follows: suppose that defender i (with speed vi) is required to thwart attack ji and then no others after it (defender i may thwart attacks arriving before tji; if ji = 0, then defender i is not allowed to thwart any attack); f(j) is the minimum number of attacks that can be let through under these constraints. Then the following hold: f(0, . . ., 0) = n (the base case from which we recursively compute f); and Opt(v, {(zj, tj)}nj=1) = minj f(j) (which allows us to extract the correct value by keeping track of this minimum).
We then want to recursively compute f(j) for all j ∈ {0, 1, . . ., n}m. This can be done by considering S = arg maxi ji, i.e. the set of defenders assigned to the latest attack in j. We then (arbitrarily) select a defender i∗ ∈ S and ask: suppose j′ is the last attack that defender i∗ thwarts before thwarting ji∗. Then j′ < ji∗ and 1i∗(j′, ji∗) = 1, since otherwise i∗ cannot defend both attacks. The best the defenders can do in this case is to let through f(j−i∗(j′)) attacks and then have defender i∗ thwart ji∗: if |S| = 1 (i∗ is the unique defender required to thwart ji∗), this is 1 more attack thwarted in total; otherwise it is redundant. Thus, minimizing over all possible j′, we have f(j) = minj′ f(j−i∗(j′)) − 1{|S| = 1}, where 1{|S| = 1} is the indicator function for |S| = 1. We also let j∗ be (any) j′ which attains this minimum, which will be important for reconstructing the optimal defense plan ℓ.
By iterating over all j in lexicographic order, we can compute f(j) using the above recursion (starting from f(0, . . ., 0) = n). We keep track of the minimum value (and the jmin which attains it) and output this as Opt(v, {(zj, tj)}nj=1). To reconstruct the defense plan, we start with jmin and read backward: the optimal defense plan for jmin starts by defending according to jmin−i∗(j∗) and then has i∗ defend attack jmini∗ at the end; we then recurse on jmin−i∗(j∗) until we arrive at j = (0, . . ., 0).

Proof of Proposition 1
To begin, we give a rough justification for assuming condition (i) in Proposition 1. We conjecture that it is always optimal to thwart a reachable attack; furthermore, we prove that under certain conditions (specifically, if the number of future attacks is sufficiently small relative to a parameter depending on the defender speeds) thwarting it is always optimal. Our setup is: the defenders are currently at distance a, and an attack comes in at x that can be reached by at least one of the defenders, after which N further (uniformly and independently distributed) attacks will follow. We compare two policies: f, which declines to capture in this case, and f∗, which follows the conjectured optimal strategy of always capturing and always maximizing the distance between the defenders conditional on whether or not a capture is made.
We let Jf(a; x; N) be the expected reward of following policy f from the given conditions (current separation a, attack at x, N attacks after that), and Jf∗(a; x; N) the expected reward of following f∗ (including the initial capture, which f does not make). If we can show that Jf(a; x; N) ≤ Jf∗(a; x; N), it proves that f∗ outperforms all policies that decline the capture at the current time. Since the result for a given N also holds for all smaller values of N, the optimal policy must always capture when possible, and thus, by Proposition 1 in the main text, the policy f∗ is optimal under the given conditions.
We also assume that the current separation between the defenders (before deciding whether to thwart the current reachable attack) is a ≥ 2v2. This assumption is justified because, by Proposition 1 in the main text, all else equal we want to maximize separation (J is monotonic in a), and a separation of 2v2 can always be maintained even when capturing, since the non-capturing defender can always maximize its distance to the current attack; even if the defenders' initial positions do not satisfy this, it can always be achieved after 2 attacks (even with captures).
, and let the current separation between the defenders be a ≥ 2v_2; then if there are N ≤ 1/w attacks left after the current attack, thwarting the current attack (if possible) is always optimal.
Proof. We assume that 2v_1 ≤ 1 and v_1 + 3v_2 ≤ 1 (otherwise, by Theorem 2 from the main work, it is possible to thwart all attacks no matter what, which by definition means the optimal policy thwarts the current attack).
Let s_f(i) and s_{f*}(i) be the expected sizes of the union of the reachable sets at step i after the current step, under policies f and f* respectively, where f does not thwart the attack and f* is the conjectured optimal policy. The first inequality is a generic upper bound on the size of the reachable-set union for defenders of speeds v_1, v_2; the second holds because if no capture is made, the defenders can achieve separation of at least v_1 + v_2, and if a capture is made, they can achieve separation of at least 2v_2 (the non-capturing defender maximizes separation), in which case the union of the reachable sets satisfies the given bound.
Thus, the probability that the attack at step i is reachable (with the current attack counted as step 0) is bounded by the above; subtracting using these bounds gives the desired comparison between J_f and J_{f*}. The extra "+1" comes from the fact that f* makes a capture at the current step, while f does not. Thus, we are done.
The value w is maximized when v_1 = 3/8 and v_2 = 1/8, yielding w = 1/4. We also show w as a function of v_1 and v_2 in Figure 7, with the regions defined by Theorem 2 in the main text removed, since these guarantee perfect defense and hence that all attackers can and should be captured. Note also that the above holds if the number of future attacks is random (unknown to the defenders), with E[N] replacing N.
We note that the bound above is not the best achievable by this method; it can be sharpened by noting that under f*, when a capture is not made, the union of the reachable sets on the next step has size min(1, 2(v_1 + v_2)) (maximizing separation without needing to capture allows the defenders to have non-intersecting reachable sets).
We next proceed with the proof of Proposition 1 in the main text.
Proof (of Proposition 1). The proof is by induction on the number of attacks N. We show (ii) together with additional auxiliary properties; the construction forces the slower defender to take the attack at 1/4 (which corresponds to −1/4 in the original sequence). Now we just need to find x, y (there is no a priori reason to expect that 2 additional attacks is the right number to use; it just happens to work) such that the sequence (−1/4, 1/4, x, y) forces the slower defender to defend the second attack.
Finding x, y: We assume that v_1 < 1/2 (otherwise it is trivial to defend against all attacks). Suppose we have −1/4, 1/4 as the first two attacks (in that order) and the faster defender takes −1/4. Since −1/4 and 1/4 are 1/2 apart, the slower defender must defend 1/4. We put the next attack at x = 1/4 − v_2 − ε; it is therefore out of reach of the slower defender and must be defended by the faster defender. We then put the last attack at y = 1/4 − v_2 − v_1 − 2ε, out of reach of the faster defender. It is out of reach of the slower defender too if starting from 1/4 and traveling 2v_2 is not sufficient to reach it (since the slower defender was at 1/4 two time steps ago). Considering the negative-direction arc from 1/4 to 1/4 − v_2 − v_1 − 2ε, which has length v_1 + v_2 + 2ε, and the positive-direction arc from 1/4 to 1/4 + 2v_2, which has length 2v_2, they fail to meet if their total length is less than 1, i.e. if v_1 + 3v_2 + 2ε < 1. Since we can choose ε > 0 as small as we like, this sequence works if v_1 + 3v_2 < 1 (and v_1 < 1/2). Thus for any pair of defenders satisfying this condition, for sufficiently small ε > 0 (in particular ε < (1 − (v_1 + 3v_2))/2 and ε < (1/2 − v_1)/2), the attack pattern cannot be defended.
Perfect defense: We conclude by showing the reverse direction, namely that any team of defenders with v_1 + 3v_2 ≥ 1 can always defend against any sequence of attacks. It is clearly sufficient (and necessary) to show this for the case v_1 + 3v_2 = 1. To do so, we show the stronger result that such a team can defend all attacks even with a 1-step visibility horizon (i.e. the defenders only need to know the location of the next attack). We say that the defenders have full coverage if the union of their reachable sets is the entire perimeter; this means that, for the next time step at least, the defense cannot be breached. If they can guarantee full coverage at every step, then they can defend against any sequence of attacks. Note that since v_2 ≤ v_1 and v_1 + 3v_2 = 1, we have v_1 + v_2 = 1 − 2v_2 ≥ 1/2 (with equality iff v_2 = v_1 = 1/4), and hence the two defenders have reachable sets large enough to get full coverage. Let s(t) be the distance between the defenders at time t. Then they have full coverage at step t if and only if 2v_2 ≤ s(t) ≤ 1 − 2v_2; since s(t) ≤ 1/2 ≤ 1 − 2v_2, this reduces to s(t) ≥ 2v_2. Our goal is now to show that if s(t) ≥ 2v_2, then no matter where the next attack x(t + 1) arrives, it can be defended in such a way that s(t + 1) ≥ 2v_2; the claim then follows by induction as long as s(0) ≥ 2v_2, which we guarantee by starting the defenders at opposite poles, i.e. s(0) = 1/2. If the attack is reachable by the slower defender (which can move v_2 to either side), it is easy to see that the slower defender can defend it while the faster defender imitates the movement (same distance and direction), making s(t + 1) = s(t) ≥ 2v_2. If the attack is outside the reachable set of the slower defender, then the faster defender must take it; but since the attack is outside the 2v_2-wide reachable set of the slower defender, that set contains a point at least 2v_2 away from the attack location (the point of the set farthest from the attack, which is either an endpoint or the attack's antipode). The slower defender then moves to maximize its distance from the attack, thus preserving s(t + 1) ≥ 2v_2.
Thus, we know that defenders satisfying v 1 + 3v 2 ≥ 1 can defend against any sequence of attacks, even with a one-step horizon.

Homogeneous defenders
For a team of homogeneous defenders with speed v facing a sequence of attacks {(z_j, t_j)}_{j=1}^n, we present the following matching-based algorithm for determining the minimum number m of such defenders needed to thwart all the attacks. The homogeneity of the defenders is key to this algorithm, which runs in time O(√n |E|), where |E| is the number of edges in the DAG G; this is in turn proportional to n^2 in the worst case, and hence the overall run time is O(n^{5/2}).
First, we build a directed acyclic graph (DAG) G on n nodes u_1, ..., u_n, each node representing an attack; we assume they are sorted by time t_1 ≤ t_2 ≤ ... ≤ t_n (if not, we sort them in O(n log n) time). We put a directed edge u_j → u_{j'} (where j' > j) if and only if dist(z_j, z_{j'}) ≤ v(t_{j'} − t_j), i.e. if attack j' can be reached by a defender which thwarts attack j.
Note that a (directed) path in G corresponds to a sequence of attacks that can be defended by a single defender. Thus, the goal is to decompose G into m directed paths which cover all vertices.
This path decomposition is done via a well-known reduction to maximum bipartite matching: we split each node u_j into u_j^in and u_j^out, add a directed edge u_j^in → u_j^out, and replace each edge u_j → u_{j'} with an edge u_j^out → u_{j'}^in. This does not change the minimum m.
Taking {u_j^out} and {u_j^in} as the two parts of a bipartite graph (and ignoring the u_j^in → u_j^out edges), any matching on this graph can be used to reconstruct a set of non-overlapping paths that cover all the vertices of G by putting back the u_j^in → u_j^out edges: since each u_j^in has exactly one edge coming out of it (to u_j^out), and each u_j^out has at most one edge coming out of it (exactly one if it is matched, none otherwise), one can start at any unmatched u_j^in and walk forward uniquely until reaching some unmatched u_j^out, producing one directed path. Since there is one such path starting at each unmatched u_j^in (and ending at each unmatched u_j^out), a matching of size k produces a set of m = n − k directed paths covering G (and conversely, any set of m directed paths yields a matching of size n − m). Thus, finding the maximum k gives the minimum number of defenders m.
Thus, we can use any standard maximum bipartite matching algorithm (e.g. Hopcroft–Karp) to compute the number of defenders needed.
Remark 5. Note that a complete matching (which would imply that 0 defenders could thwart all the attacks!) is by definition impossible because G is acyclic.
The algorithm is summarized in Alg. 2 for ease of reference.
1 Form DAG G with nodes u_1, ..., u_n and an edge u_j → u_{j'} iff dist(z_j, z_{j'}) ≤ v(t_{j'} − t_j);
2 Split each node u_j into u_j^in and u_j^out. Add edges u_j^in → u_j^out, and replace all edges u_j → u_{j'} with edges u_j^out → u_{j'}^in;
3 Use the Hopcroft–Karp algorithm to find a maximum bipartite matching of size k;
4 m ← n − k
Complexity: There are two parts to this algorithm, each of which could probably be sped up with a little thought: 1. building the DAG G and/or its bipartite counterpart; 2. computing the maximum matching.
Ignoring the dimension d, which we assume to be small, step (1) takes O(n^2) time, and step (2) with Hopcroft–Karp takes O(√n |E|), where |E| is the number of edges, which in the worst case means O(n^{5/2}).
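A compact implementation of the matching-based algorithm might look as follows. This is a sketch under our own naming conventions; for brevity it uses Kuhn's augmenting-path matching, which runs in O(n·|E|), rather than the Hopcroft–Karp algorithm named above (the returned value m is identical either way).

```python
def min_defenders(attacks, v):
    """Minimum number of speed-v defenders that thwart all attacks.
    attacks: list of (z, t) with positions z on a unit-circumference circle."""
    attacks = sorted(attacks, key=lambda a: a[1])  # sort by attack time
    n = len(attacks)

    def dist(a, b):
        # shortest arc distance on a circle of circumference 1
        d = abs(a - b) % 1.0
        return min(d, 1.0 - d)

    # adj[j] lists later attacks j2 reachable by a defender that thwarts attack j
    adj = [[j2 for j2 in range(j + 1, n)
            if dist(attacks[j][0], attacks[j2][0])
               <= v * (attacks[j2][1] - attacks[j][1])]
           for j in range(n)]

    # Kuhn's augmenting-path maximum bipartite matching:
    # match_to[j2] = j means edge j -> j2 is in the matching
    match_to = [-1] * n

    def try_augment(j, seen):
        for j2 in adj[j]:
            if j2 in seen:
                continue
            seen.add(j2)
            if match_to[j2] == -1 or try_augment(match_to[j2], seen):
                match_to[j2] = j
                return True
        return False

    k = sum(try_augment(j, set()) for j in range(n))
    return n - k  # minimum path cover size = matching deficiency
```

For example, three attacks at positions 0.0, 0.1, 0.5 at times 0, 1, 2 need two defenders of speed 0.15, but only one of speed 0.5.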

Simulation Results
The goal of the simulations is to explore the relationship between the speed and the number of defenders (if we halve the speed of the defenders, how many more do we need to keep the same level of effectiveness?). Our experiments follow this pattern: we compute M(v, {(z_j, t_j)}_{j=1}^n) for all v whose entries v_i lie at g evenly-spaced locations in the range (v_min, v_max] of speeds. We refer to g as the number of grains.
The full list of parameters is given in Table 3.
For ease of notation, we write M(v) for M(v, {(z_j, t_j)}_{j=1}^n); the expectation is over the attacks, i.e. over random (z_j, t_j). All experiments are for uniformly-random attack times on a circular (1-dimensional) perimeter of circumference 1.
Our simulations suggest a strong relationship between the speed v of the homogeneous defenders and the expected number E[M(v)] of defenders required to thwart all attacks: plotting these in a log-log plot reveals an almost linear relationship between log v and log(E[M(v)] − 1), as shown in Figure 8. This suggests that, for a particular distribution of attacks, there is some value α > 0 such that E[M(v)] − 1 is approximately proportional to v^{−α}; the slope of the line in the log-log plots in Figure 8 is −α. What α measures, roughly, is the balance of the speed-number tradeoff for the defenders: if we double the speed of the defenders, how many fewer defenders (on average) will we need to stop all the attacks? The nearly linear nature of Figure 8 indicates that, for a given (uniform in time and location) distribution of attacks, this tradeoff is roughly constant over a wide range of speeds. When α is small, it indicates that increasing the number of defenders is comparatively more important than increasing the speed (by the same proportion); in particular, if α = 1, then doubling the speed and doubling the number are equivalent.
We then examine what α is under different distributions of attacks: for a given value of t_max, we vary the number n of attacks and observe how α changes. The results are shown in Figure 9, with the y-axis showing the slope of the log-log plot for the given parameters (−α, as mentioned above). Note that α < 1 in all measured cases, meaning that increasing the number of defenders by some proportion c is always more effective than increasing the speed of the defenders by a factor of c. When there are few attacks (n = 5), α ≈ 0.8, but as n increases it drops to about 0.4 before leveling out, indicating a substantial decrease in the efficacy of increasing speed as compared to increasing the number of defenders. This general pattern holds for both plotted cases (t_max = 15 and t_max = 25), but the drop-off is sharper when t_max = 15.
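Extracting α from simulation data amounts to an ordinary least-squares fit in log-log space. The following is a minimal sketch (the function name `estimate_alpha` is our own), assuming each measured mean E[M(v)] exceeds 1 so the logarithm is defined:

```python
import math

def estimate_alpha(speeds, mean_defenders):
    """Fit E[M(v)] - 1 ~ C * v**(-alpha) by least squares on
    (log v, log(E[M(v)] - 1)); alpha is minus the fitted slope.
    Assumes every entry of mean_defenders is strictly greater than 1."""
    xs = [math.log(v) for v in speeds]
    ys = [math.log(m - 1.0) for m in mean_defenders]
    n = len(xs)
    xbar, ybar = sum(xs) / n, sum(ys) / n
    slope = (sum((x - xbar) * (y - ybar) for x, y in zip(xs, ys))
             / sum((x - xbar) ** 2 for x in xs))
    return -slope
```

On data generated exactly from the power law, the fit recovers α; on real simulation output it returns the best-fit slope of the log-log plot, as in Figures 8 and 9.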

The Monotonicity Acceleration
We now describe in greater detail how we evaluate Opt(v) in an unusual order so as to take advantage of the monotonicity of the function. While we have not made any rigorous attempt to optimize the ordering, the order we use was effective enough to reduce the computation of the simulations described in the main work (see Figures 3 and 5; left-hand column for uniformly-random attack times, right-hand for fixed attack times) by an average of between 93% and 99%, depending on the parameters; the savings are much more pronounced in the fixed-attack-times case, as the Opt function is generally flatter (and hence produces more cases where the upper bound matches the lower bound and the DP computation can be skipped). First, note that permuting v does not affect Opt(v); hence whenever Opt(v_1, v_2) is computed, we get the same value for Opt(v_2, v_1) (recall that in these simulations we drop the WLOG assumption that v_1 ≥ v_2). For simplicity we will talk about the indices of the speed values we compute for: the goal is to compute Opt(v) for all v with entries from a range of g values, which we denote as a vector w = (w_1, ..., w_g) with w_1 < w_2 < ... < w_g without loss of generality. We then define a function f : [g]^m → R by f(j_1, ..., j_m) = Opt(w_{j_1}, ..., w_{j_m}); this just means that the ith defender has speed v_i = w_{j_i}. Since w is monotonically increasing (in its entries) and Opt is monotonically decreasing, f is monotonically decreasing as well. For this section, as in the simulations, m = 2, so we can visualize [g]^m as a grid (as in Figure 5 in the main paper). The basic idea is to compute in order of decreasing powers of 2, which we refer to as levels: for instance, when g = 256 we compute first for indices that are multiples of 256, then multiples of 128, then 64, and so forth. In this way, at each level we can use the results from the level above to check the monotonicity condition and potentially save significant computation.
For simplicity, we only considered the upper and lower bounds generated by previously computed entries on the same row or column of the grid (two upper bounds, one from the row and one from the column, and two lower bounds, likewise one from the row and one from the column).
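The level-by-level scheme can be sketched as follows. This is an illustrative sketch, not the simulation code: `f` stands in for the expensive Opt evaluation, g is assumed to be a power of two, and for simplicity the sketch takes bounds from all already-computed entries in the same row or column rather than only the nearest ones.

```python
def evaluate_with_monotonicity(f, g):
    """Evaluate a coordinatewise-decreasing function f on the grid [1..g]^2,
    coarse-to-fine by power-of-two levels; a cell is skipped (its value
    inferred) whenever its upper and lower bounds from already-computed
    entries in the same row or column coincide. Assumes g is a power of two.
    Returns a dict {(i, j): value} covering the whole grid."""
    vals = {}
    step = g
    while step >= 1:
        for i in range(step, g + 1, step):
            for j in range(step, g + 1, step):
                if (i, j) in vals:
                    continue  # already computed at a coarser level
                # f is decreasing in each index, so entries with a smaller
                # index give upper bounds and larger indices give lower bounds
                ups = [v for (a, b), v in vals.items()
                       if (a == i and b < j) or (b == j and a < i)]
                los = [v for (a, b), v in vals.items()
                       if (a == i and b > j) or (b == j and a > i)]
                ub = min(ups) if ups else None
                lb = max(los) if los else None
                if ub is not None and lb is not None and ub == lb:
                    vals[(i, j)] = ub  # bounds match: skip the expensive call
                else:
                    vals[(i, j)] = f(i, j)
        step //= 2
    return vals
```

The flatter f is, the more often the bounds coincide and the evaluation is skipped, which is consistent with the larger savings observed for fixed attack times.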

Fig. 4: Empirical optimal ratio v_2/v_tot for various values of v_tot. Left: uniform attack times. Right: fixed attack times.

Fig. 7: The quantity w as a function of v_1 and v_2, with the regions defined by Theorem 2 in the main text removed.

Algorithm 2: Homogeneous case. Data: sequence of attacks {(z_j, t_j)}_{j=1}^n. Result: minimum number of defenders m required to defend all attacks.

Fig. 9: Relationship of −α (y-axis) to the number of attacks n (x-axis) from n = 5 to n = 40, uniform attack times, for two different values of t_max. For each value of n and each of T = 1000 random attack sequences, the value M(v) was computed for 50 linearly-spaced values of v from 0 to 4 (exclusive of 0), i.e. v = 0.08, 0.16, ..., 4.00, and the best-fit slope of the log-log plot was extracted.

Table 2: Parameters of the experiments.
(v_min, v_max]  Range of defender speeds (inclusive of v_max but not v_min)
g               Number of speed values (grains) measured within (v_min, v_max]

Table 3: Parameters of the experiments.
(v_min, v_max]  Range of defender speeds (inclusive of v_max but not v_min)
g               Number of speed values (grains) measured within (v_min, v_max]
The experiments proceed as follows:
1. Generate attacks {(z_j, t_j)}_{j=1}^n randomly, either with fixed attack times t_j = j or with uniformly random attack times in [0, t_max].
2. Compute M(v, {(z_j, t_j)}_{j=1}^n) for v ∈ (v_min, v_max] or v ∈ (v_min, v_max]^m, at g intervals.
3. Repeat the above for T trials and average the resulting values for each value of v or v.
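The three-step procedure above can be sketched as a Monte-Carlo driver. This is a hypothetical sketch: `M` is assumed to be any callable `M(v, attacks) -> int` (e.g. the matching-based algorithm for the homogeneous case), and the driver name `run_experiment` is our own.

```python
import random

def run_experiment(M, n, t_max, v_grid, trials, fixed_times=False, seed=0):
    """Generate `trials` random attack sequences of length n, evaluate
    M(v, attacks) for each speed v in v_grid, and return the per-speed
    averages as a dict {v: mean value}. Attack positions are uniform on a
    unit-circumference perimeter; times are either fixed (t_j = j) or
    uniform in [0, t_max]."""
    rng = random.Random(seed)
    totals = {v: 0 for v in v_grid}
    for _ in range(trials):
        attacks = [(rng.random(),
                    j + 1 if fixed_times else rng.uniform(0, t_max))
                   for j in range(n)]
        attacks.sort(key=lambda a: a[1])  # process attacks in time order
        for v in v_grid:
            totals[v] += M(v, attacks)
    return {v: totals[v] / trials for v in v_grid}
```

Averaging over T trials for each grid value of v (or each vector v in the heterogeneous case) yields the E[M(v)] estimates plotted in the figures.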