Trajectory planning for unmanned surface vehicles operating under wave-induced motion uncertainty in dynamic environments

We present a deliberative trajectory planning method to avoid collisions with traffic vessels. It also plans traversal across wavefields generated by these vessels and minimizes the risk of failure. Our method searches over a state-space consisting of pose and time. And, it produces collision-free and minimum-risk trajectory. It uses a lookup table to account for motion uncertainty and failure risk. We also present speed-up techniques to increase performance. Our wave-aware planner produces plans that (1) have shorter execution times and safer when compared to previously developed reactive planning schemes and (2) comply with user-defined wave-traversal constraints and Collision Regulations (COLREGs)


Introduction
Automated trajectory planning is a prerequisite for realizing autonomous unmanned vehicles. Collision avoidance with static and dynamic obstacles under perception uncertainty are basic requirements for generating useful trajectories. The focus of this article is a framework for deliberative trajectory planning in the presence of large semipermeable dynamic obstacles and motion uncertainty. Typically, trajectory planning for unmanned vehicles is decoupled into two parts. First, a path planning problem is solved to obtain a tentative path, ignoring dynamic obstacles. Then, a motion planning problem is solved, this time considering dynamic obstacles, using the tentative path as a reference to yield a final trajectory. The motion planning problem is conventionally solved using reactive planning approaches like generalized velocity obstacles 1 and velocity tuning. [2][3][4] These methods work well when the dynamic obstacles are small, and the introduction of these dynamic obstacles does not introduce drastic differences between the true optimal path and the tentative path. Thus, in the case of small dynamic obstacles, a reactively planned trajectory is likely to be close to the true optimal trajectory while following a tentative path computed while ignoring small dynamic obstacles. However, in the case of large dynamic obstacles, it is generally likely for reactively planned trajectories to be highly suboptimal. For instance, on the one hand, reactively avoiding (i.e. going around) large dynamic obstacles may introduce unnecessary slowdowns and increase execution time. On the other hand, punching through semipermeable dynamic obstacles may decrease execution time but increase the risk of failure and collisions. Therefore, handling large semipermeable dynamic obstacles requires deliberative planning that can carefully balance the competing priorities (i.e. reduce execution time, reduce failure/collision rate). For example, reducing failure/collision rate should be the first priority. When it is possible to have zero failures/collisions, reducing execution time is the next priority. In some situations such as those that arise in the marine robotics domain, 5 navigation around or through large permeable dynamic obstacles is required (Figure 1).
The wavefield generated by a marine vessel is a good example of a large semipermeable dynamic obstacle which can either be traversed or be avoided altogether. In a majority of practical scenarios, unmanned ground vehicles exhibit significantly less motion uncertainty compared to unmanned surface vehicles. Trajectory planning for unmanned ground vehicles is a well-studied problem and in the recent years, significant progress has been made in solving it using a wide variety of methods such as graph search, 6 stochastic tree search, 7,8 Markov decision processes (MDPs), and optimal control. 9,10 Graph search methods like state lattice search are popular global planning methods due to optimality guarantees and efficiency in large environments. Unmanned surface vehicles exhibit more motion uncertainty due to significant external disturbances like waves and winds. 11 Collision avoidance is more challenging under these circumstances and it involves not only staying a safe distance from static obstacles and dynamic obstacles but also following COLREGs. COL-REGs is a set of maritime navigation rules set by the International Maritime Organization (IMO) to minimize collisions and confusion among marine vessels. The effects of motion uncertainty are pronounced in smaller Unmanned Surface Vehicles (USVs) as they are more susceptible to wave disturbances. An MDP framework can rigorously handle motion uncertainty for environments with static obstacles. It requires a state transition model of the USV. A state transition model to account for wave disturbances is developed in Thakur et al. 12 for use in path planning amid static obstacles. In addition to ambient sea waves, wavefields generated by the motion of dynamic obstacles over the sea create yet another source of motion uncertainty and adds to the number of failure modes. The wavefield induces external disturbances that vary both spatially and temporally. Trajectory planning for USVs in the presence of dynamic obstacles needs to consider a state vector that includes position, orientation, velocity, and time. For time-varying environments, the regular MDP can be converted to a time-extended MDP. 13 Time-extended MDP over high-dimensional state spaces are complex and computationally prohibitive. 14,15 On the other hand, performing a search over a lattice of motion primitives requires careful collision risk assessment and heuristics tuned for dealing with dynamic obstacles in the presence of significant motion uncertainty. Large-scale partially observable Markov decision processes (POMDPs) can be approximately solved only in a reactive online fashion using methods developed in the literature. 16,17 This article builds on the work in Rajendran et al. 18 As in Rajendran et al., 18 we only focus on the environmental disturbance caused by wavefields, and hence, the effects of wind and water currents are not considered. In Rajendran et al., 18 dynamic obstacle heuristics (DOHs) was used requiring extensive precomputation for traffic vessels of different sizes and velocities using off-line simulations. This needs to be done for a planning scenario before the search can be performed. Computing lookup tables takes considerable amount of time. Therefore, this method cannot be used in practice in new scenarios. The method described in this article overcomes that limitation and uses online heuristics instead of precomputed lookup tables. Both methods lead to almost identical paths.
The new contributions of this article are the following. (1) We adapt the idea of space-time heuristics 19 to overcome the limitations of DOH used in our exploratory work. We also incorporate COLREGs rules into it to solve more difficult problems involving many dynamic obstacles. (2) We investigate the behavior of our method in four new maps that capture the types of environments the USV may be exposed to. (3) We characterize the behavior of our method under different operating conditions involving varying perception uncertainty, varying number of dynamic obstacles, and varying wavefield parameters.

Trajectory planning
Precomputed state-transition probabilities over differing sea states are used in Svec et al. 11 to handle motion uncertainty. Incorporating perception uncertainty turns MDPs into POMDPs. In Agha-mohammadi et al., 20 the probabilistic road map idea is extended for belief spaces using MDPs and feedback controllers. Feedback controllers are designed to drive the agent into a sampled state in belief space called a Feedback Information Road Map (FIRM) node. Then, through the use of a precomputed policy for the MDP over FIRM nodes, a sequence of controllers that drive the agent from the initial state to the final state is obtained. This way POMDPs become computationally tractable for problems of larger scale involving only static obstacles. However, it is hard to augment these methods to work in domains with spatiotemporal disturbances. In Blackmore et al., 21 chance constraints are used in the context of static obstacle avoidance, guaranteeing prespecified failure probability bounds. Belief-space (pose Â covariance space) search is used in static obstacle environments to incorporate motion uncertainty and generate plans that expedite arrival time and minimize final pose error covariance. 22,23 These works do not include dynamic obstacles. Terrain variation causes trajectory tracking error. The error characteristics of the controller are considered in the planning method proposed in Mellinger and Kumar 24 where tracking error is modeled as a function of terrain. The probability of successful navigation through the terrain is incorporated into the cost function. We take inspiration from Mellinger and Kumar 24 and introduce the notion of timevarying terrain. We also account for risk of failure through a robot-reliability model. 25 We also use methods in Greytak and Hover 26 to model trajectory tracking uncertainty and utilize low-level trajectory tracking error characteristics.

Trajectory and wave prediction
Accurate prediction of trajectories of moving obstacles is a fundamental requirement for deliberative trajectory planning. Several threads of research work are in progress to provide predictions. In the marine domain, automatic identification system (AIS) data 27 can be used for coarse grained trajectory prediction. AIS is susceptible to communication interruptions. 28 However, when it is used in conjunction with Gaussian Process Regression (GPR) based trajectory prediction methods, 8,[29][30][31] it can produce good estimates for long-term planning purposes. 32 Waves generated by moving vessels can be forecast up to 180 s into the future using radar systems. 33,34 We assume such a system is available in the sensor suite to aid deliberative planning.

Vehicle model
For trajectory planning purposes, we use a simplified kinematic vehicle model to describe the motion of the USV The USV pose is x ¼ ðx; y; Þ T where x and y denote the position in North-East-Down coordinates, denotes heading with respect to north. The USV control variables are u ¼ ðv; !Þ T and w is a zero-mean random process noise capturing the effects of environmental disturbances. A state s is defined by the tuple ðx; y; ; t; p s Þ encoding the USV pose, the time stamp, and probability of being operational up to a time duration t from the start time. A motion goal is a tuple MG ¼ ðx f ; y f ; f ; t f Þ encoding a pose ðx f ; y f ; f Þ to be reached at time instant t f . Given an initial state x i and a final state x f ¼ ðx f ; y f ; f Þ T specified by the motion goal MG, and letting w ¼ 0, a nominal trajectory xðtÞ satisfying xðt i Þ ¼ x i ; xðt f Þ ¼ x f can be computed using trajectory generation methods. 9,35 We assume that the underlying trajectory tracking controller of the USV will achieve the target pose x f by the time deadline t f . In our work, we use Dubins curves 36 as the optimal trajectory between motion goals. We also use only 70% of the maximum allowable USV speed as the controller may need to saturate the throttle occasionally to make up for lost time while rejecting disturbances. A motion goal set MGS r;n dirs ðsÞ is a set of motion goals that are a distance of r units away relative to a particular state s and directed at n dirs different angles. Each of the motion goals is given a label l corresponding to its direction from s. Thus, MGS r;n dirs ðsÞ ¼ fMG l g [ MG O , where l ¼ k Â 360=n dirs ; k 2 ½0; n dirs À 1. The label O indicates a wait action of duration Dt wait at the same pose. We assume a station-keeping controller 37 is active for the duration of the wait action. Figure 2 shows an example of a labeled motion goal set MGS r¼30m;n dirs ¼8 ðsÞ where s ¼ ðx ¼ 0; y ¼ 0; ¼ 0; t ¼ 0Þ. r is adapted to have multiple resolutions depending on how congested the local region is. This allows for high-resolution motion goals in areas which are difficult.

Traffic vessel profiles
A traffic vessel profile (TVP) is a collection describing the geometry and state of a traffic vessel underway. These data are generated by a trajectory prediction system. 31 A typical traffic vessel profile TVP i made up of the following quantities: (i) predicted trajectory p i ðtÞ, (ii) instantaneous position covariance matrix S i ðtÞ, (iii) vessel zones fZ k g 3 k¼1 where each Z k is an ellipse attached to and defined in the body frame of the vessel (see Figure 3). The vessel zones move in unison with the traffic vessels. Vessel zone 1 is a region with 100% probability of failure. Vessel zone 2 and 3 denote regions where significant wave disturbances are expected. These zones also specify the local wave front direction in that region. We assume that waves outside the zones 1, 2, and 3 do not affect the USV. The sizes of these vessel zones are application-dependent. In this work, the length of zone 1 is fixed to three boat-lengths long, and it can be made longer for vessels traveling at a higher relative speed compared the USV. The length of zones 2 and 3 have been chosen based on the attenuation of the wavefield and where the wavefield poses no danger to the USV. A set of TVPs is denoted by TVPS ¼ fTVP k g n tv k¼1 .

Environmental disturbances
We assume environmental disturbances are specified parametrically through a parameter vector defined for each state in the state space as in Figure 4. For example, at point D,   states are defined as in the Douglas sea scale. 38 Point C lies in a static obstacle region and hence, g C ¼ finvalid ¼ trueg is left undefined and handled as a special case. Whenever a traffic vessel passes through a point, for example, point A at a certain time t, the parameter vector is updated as In both cases A and B, the local wave amplitude and wave direction can be determined given the wave zone and the pose of the traffic vessel at t. The field near vessel in g indicates that the point is in some zone of some traffic vessel.
In this work, TVPs and environmental disturbances are modeled with the help of basic wave models. In practice, the wave interaction is more complex and estimation of wave parameters is required. With recent advances of computer vision and deep learning, it is possible to construct a method of wave parameter estimation from vision. This method does not have to be very accurate for use in a failure assessment module.

Failure model
We seek to model failure probabilities using a robot reliability model. 25 The probability that a robot remains functional up to time t is given by p s ðtÞ ¼ e Àt=M where M is the mean-time-between-failures (MTBF), which is estimated as the ratio of number of failures to the total number of attempts within an observation time window. Developing a precise model of the failure (i.e. capsize) of the USV in varying sea-states and waves is difficult. It involves allowing the robot to fail in the process of counting the number of failures within an observation time window. This is not practical. A safer alternative is to measure how close to failure the robot got to when exposed to a certain local disturbance. In this work, a local disturbance is described parametrically as the tuple consisting of sea-state and proximity to a vessel. Two sea-states are considered calm (undisturbed waters) and rough (choppy waters due to wind conditions/multiple traffic vessels). And, proximity to vessel is a Boolean flag that is determined by whether the USV happens to be in any of the wavezones around the traffic vessel (see Figure 5). If the USV is in proximity to a vessel, then the wavezone is also part of the local disturbance parameter. USVs typically have certain design limits such as roll limits or pitch limits. We define a failure event as an event when any of these design limits is exceeded. Thus, given a set of trajectory samples fs k ðtÞg n k k¼1 as in Appendix A of duration t 1 . For each trajectory sample, we count the number of failure events during t 1 . With failure events labeled and counted, compute M as in equation (3) for each trajectory sample. We aggregate the MTBF over trajectory samples for each combination of local disturbance parameter g. In this way, a lookup table that is indexed by g can be generated (see Table 2). Once MTBF estimates are computed for an USV operating in different local disturbance parameters, it can later be used in the robot reliability model to compute failure probabilities.

Problem formulation
, sensed by the perception system, (iv) a set of static obstacles defined by a set of polygons SO.
Compute: A sequence of states fs k g p k¼0 such that (i) s 0 ¼ s i , (ii) s p 2 G, (iii) the trajectory is COLREGs compliant, (iv) probability of success p s at s p is maximized, (v) expected time of execution is minimized.

Approach
The problem is an implicit graph search problem. We use the Theta* graph search algorithm 39 to solve it. We describe the essential aspects of the Theta* graph search implementation.

Node expansion and cost functions
A node n is a tuple defined as n ¼ fs; g; h; f g where g is the cost-to-come, h is the cost-to-go, f is the total cost (f ¼ g þ h), s is the state. An initial node n i is constructed as n i ¼ fs i ; 0; hðs i ; s f Þ; 0 þ hðs i ; s f Þg. When a parent node n p is visited, a set of child nodes fn c;k g N c k¼1 that contain with states corresponding to the nearby motion goals are assigned. Here N c is the total number of child nodes expanded for n p . Each of the child nodes also contains the reference trajectory xðs p ; s c;k Þ composed of the Dubins curve linking the parent state to the child state. During execution, there is motion uncertainty as the controller tracks the reference trajectory. This uncertainty is handled while evaluating the cost function. To describe the cost function, we focus on the expansion of an arbitrary parent node n p resulting in a child node n c ending in the state s c as in Figure 6. Let us assume the reference trajectory xðs p ; s c Þ starts at t ¼ t p and ends at t ¼ t c . To evaluate the cost function, the reference trajectory must be discretized into temporal segments. Performing regular discretization, we obtain a sequence of time samples TS ¼ ft q g n ts q¼1 where t q 2 ½t p ; t c , t qþ1 À t q ¼ Dt. Suppose that for a t 2 TS, the reference trajectory puts the USV state at s with the 2D position being p and time being t. And, similarly, all of traffic vessels are moved to t on their mean predicted trajectory. The uncertainty in the predicted trajectory will be considered in Cost assessment. Having moved the traffic vessels to their positions in the environment map, the local disturbance vector g is obtained by looking up p on the environment map and identifying which of the characteristic scenarios apply (see Figure 4). We will now see how g is used in the trajectory evaluation process.
Obtaining tracking error and failure model parameters. Local environmental conditions (such as local sea state) and being in the vicinity of a moving vessel affect the trajectory tracking performance of the USV as it moves along a reference trajectory. These conditions are encoded into g (Figure 4) and can be used to obtain tracking error and failure model parameters. If the USV were to be controlled in an open-loop configuration, the trajectory tracking error would build up over time. However, in the presence of a low-level stabilizing feedback controller, the tracking error can be bounded within an envelope around the reference trajectory. 40 Environmental disturbances and controller can be extensively studied off-line by either using Monte Carlo simulations. This way the various realizations of the transition trajectories can be obtained. 12 These trajectories can also be obtained by empirically evaluating them in real scenarios. With either approach, the goal is to characterize the USV motion as it is commanded to move in a straight line from an initial state s 0 toward a motion goal under parameterized disturbances in the environment. Appendix A outlines a simple procedure to identify tracking error parameters. We seek to model the tracking error and the probability of successful navigation along the reference trajectory as a function of the local disturbance parameter vector g. The tracking error can be characterized by observing cross-and along-track errors (see Figure 7). The cross-track error at any point in time is the deviation in a direction normal to the reference trajectory (i.e. lateral deviation). The along-track error at any point in time is the difference in the track-projected current position and the target point on the trajectory (i.e. lag or overshoot). The probability of successful navigation can be characterized by measuring MTBF under different conditions. To this end, we conducted preliminary experiments with a 16-foot Wave Adaptive Modular Vehicle (WAM-V) USV 37 in the Stranahan River, near Dania Beach, Florida. We chose the WAM-V platform due to its robustness under wave action. We could safely perform the physical experiments while not risking the total loss of the robot. However, this work is applicable to platforms that are not specifically designed to withstand waves that may cause hull damage. We collected data using wave gauges and the navigation sensors (Inertial Measurement Unit (IMU), GPS, Compass) of the USV ( Figure 8) and studied the effect of the waves generated by a traffic vessel on the  WAM-V. Due to the difficulties of controlling sea-state, only two states calm and rough were considered. Calm condition refers to a situation when water level across the test area is roughly constant, and no wind effects are seen on the surface of the water. Rough condition refers to a situation when water levels are changing due to interacting waves generated by passing traffic vessels and wind. In both conditions, the USV was made to traverse a 20 m linear path 10 times and the cross-track s ct and alongtrack s ct error standard deviation in the tracked trajectory was used to populate Table 1. Similarly, in both sea-state conditions, the pitch and roll angles recorded and the procedure in Determining MTBF is used to compute M r and M c in Table 2. M t is not generated using physical experiments instead we use a surrogate model defined by equation (4). In a very rough sea-state, the data from IMU are likely to be noisy and other sensing modalities might be needed for producing better estimates. Since we performed experiments in a water channel, our experiments were shielded from disturbances and did not experience noisy IMU measurements and we were able to observe roughly sinusoidal variations as the waves passed by. Tables 1 and 2 are indexed by the local disturbance parameters in g and they are briefly explained below.
Determining tracking error parameters. As for the tracking error model, when the query state lies over calm waters and is not in the vicinity of any traffic vessel, we use nominal values for tracking error parameters. Otherwise, the tracking error parameters are doubled.
Determining MTBF. Since traffic-free calm waters are very safe, M c ¼ 1. Since rough waters are not safe regardless of traffic, M r ¼ 600. Determining MTBF over calm waters near traffic vessels involves determining the angles between the waves and the heading of the USV as shown in Figure 5. If the query state is located within any zone of the traffic vessels, we determine the relative position p d from the center of the traffic vessel and the incidence angle q i between the USV heading and the wave direction of zone Z 0 (see Figure 5). p d is calculated in the body-fixed coordinates of the traffic vessel and d ¼ 1 0 ½ p d . Sailors typically prefer the incidence angle to be around 45, reducing the risk of broadside rolling as well as pitchpoling. 41 Adhering to this notion, we use the following assignment for MTBF a ¼ otherwise: Here M Z 0 is the MTBF value associated with the zone Z 0 and is defined as And, L z is the maximum trailing length of the traffic vessel zones.
In this work, the MTBF model was generated using a combination of simulation and physical experiments. These were done in an off-line manner. In future, these can be performed in an online manner as well. We envision that it will work in the following manner. Initially, we assume a conservative MTBF model. Using this conservative MTBF model, the USV can navigate safely. And, based on an explore-exploit tradeoff, we can explore unknown wave conditions and update MTBF model appropriately using a Bayesian inferencing scheme. This will enable exploring new wave conditions while safely updating the MTBF model parameters.
Cost assessment. Having determined the tracking error and the MTBF parameters using g, we proceed to evaluate the cost function along the reference trajectory xðs p ; s c Þ. Tracking error parameters s at and s ct encode the deviation along the path and the normal deviation from the position p,  respectively (see Figure 7). Thus, the actual position p of the USV at t is given by where c*N ð0; s ct Þ and a*N ð0; s at Þ. N ½ p and T ½ p are the normal and tangent unit vectors at p. An ellipse E t centered at p and aligned to the tangent direction T ½ p and normal direction N ½ p with axis lengths 6 s at and 6s ct describes a 99:7% confidence region within which the USV can be expected to be at t ¼ t (99:7% comes from the 68-95-99.7 empirical rule associated with the normal distribution).
For each ellipse E t q , the probability of failure p f ;q is computed as where M q is computed by averaging the MTBF values over a uniform set of points in E t q . More specifically, these MTBF values are computed by looking up each point in G and then using Table 2 to find the MTBF. Assuming independence of the failure events, the final probability of success for executing xðs p ; s c Þ is given by p s ¼ P n ts q¼1 ð1 À p f ;q Þ. Thus, the probability of success imparted to the child state is p s ðn c Þ :¼ p s ðn p Þ Â p s where p s ðn p Þ and p s ðn c Þ are the probabilities of success of the parent and child states, respectively. The child node n c is then assigned a cost-tocome gðn c Þ where the time cost is c t ¼ t c À t p , F f is a large constant with units of time discouraging risky motion causing failure, and F c is a large constant with units of time discouraging moves that violate COLREGs. Parameter c c 2 f0; 1g encodes the violation of COLREGs computed using. 42,43 Parameter ! f 2 ½0; 1 is a user defined weight parameter.
Accounting for perception uncertainty. To account for the perception uncertainty in the measurement of the TVPs, a Monte Carlo sampling scheme 44 is utilized. The Monte Carlo sampling approach is used as it is a general method and amenable to parallelization. In the previous section, g was computed using only the mean position of traffic vessels at t. However, to account for uncertainty, covariance information needs to be used. We pick one TVP, TV P i 2 TV PS, to illustrate the process. A set of possible instantaneous positions PS i ¼ fp k g n ps k¼1 is sampled from the distribution N ð p i ð tÞ; S i ð tÞÞ. We move the traffic vessel to each possible position p k 2 PS i and then look up g k as in Figure 4. We compute the corresponding M k using Table 2 and Figure 5. Finally, we compute the average M over all the computed M k . This average M is used in place of the original MTBF in equation (7).

Speed-up techniques
We describe some techniques to improve the search performance.
Static obstacle heuristics. The cost function includes the time duration of moving from one position to another (i.e. c t ). And, static obstacles do not move between planning cycles. In such a scenario, a time-to-goal map (also known as arrival time map) is beneficial. This map is termed Static Obstacle Heuristic (SOH) and it stores the lower bound on the time cost of going from any position in the workspace to the goal position g. This map is computed using the fast marching method (FMM) to approximately solve an Eikonal equation. 45,46 The level sets of the solution of the Eikonal equation correspond to the frontiers of a wave emanating from the goal position. The speed of the wave is set to the maximum speed of the USV. As the frontier (i.e. the open set in Dijkstra's algorithm) bends around corners, the time-to-goal estimate is more informative than the naive Euclidean distance-based estimate. The raw time-to-goal values given by the FMM are naturally not admissible unless they are scaled down by a factor that depends on the discretization used in computing the map. 6 After the appropriate scaling is applied, each cell in this map contains an underestimate of the time to reach the goal from that cell location. This value is used to compute an estimate of c t to be accumulated until the goal is reached. For a given query position s, a heuristic value h s ðs; gÞ is looked up from the SOH.
Space-time exploration heuristics. Developing admissible heuristics to handle dynamic and static obstacles is challenging and can often be just as hard as the original problem. However, by relaxing the original problem, effective heuristics can be generated. We follow Chen et al. 19 and drop to 3D configuration space by excluding (i.e. we use only x À y À t space) to generate a heuristic trajectory using a point robot kinematic model (relaxed version of the USV kinematic model) from initial state s i to the goal region G. Maximum speed v max of the point robot is the same as that of the USV. We also ignore zones 2 and 3 of traffic vessels such that only zone 1 is considered as a dynamic obstacle. In other words, for the purpose of heuristics generation, we allow the point robot to move freely across zone 2 and 3 without any penalty. Suppose the point robot is at the initial state p i ðx i ; y i Þ at the query time t i (see Figure 9(a)). We observe that set of all points this point robot can reach in Dt is given by a cone C i with apex at ðx i ; y i ; t i Þ, perpendicular height Dt, and radius v max Dt. We choose Dt such that it is as large as possible without the cone touching any configuration space obstacle. Suppose d c ðtÞ is the clearance distance between p i and any workspace obstacle at time t. Then This cone represents a collision-free volume in the 3D configuration space. To explore the space, the idea is to grow new cones at the top face of cone C i and continue this process until the goal region is reached by a certain cone. In this way, we generate a sequence of overlapping cones forming a collision-free volume in configuration space, which connects the initial state and the goal region.
Nodes are described using tuples of the form n k ¼ ðx k ; y k ; t k ; r k ; C k ; g k ; h k ; f k Þ. C k is the cone situated at ðx k ; y k ; t k Þ with radius r k ¼ v max =Dt k . The cost-tocome g k ¼ t k þ c c F c is the sum of time elapsed and COL-REGs violation cost. And, the cost-to-go is the value from the SOH lookup table, h k ¼ h s ðp k ; gÞ. The total cost is  (Figure 9(a)). For each candidate location ðx c ; y c ; t c Þ, a new cone C c is constructed with apex at the candidate location such that it has the largest height and does not collide with configuration space obstacles. If such a cone can be constructed, a new node n c ¼ ðx c ; y c ; t c ; r c ; C c ; g c ; h c ; f c Þ is generated where g c ¼ t c þ c c F c , h c ¼ h s ð ðx c ; y c Þ; gÞ, f c ¼ g c þ h c . Incorporation of COLREGs constraints through the term c c F c improves the effectiveness of the heuristics. Once Algorithm 1 ends, a sequence of cones CS ¼ fC q g starting at the initial state and ending in the goal region is obtained. Let t be the sequence of apexes of the cones in CS and it represents a coarse-grained trajectory in the ðx; y; tÞ space that avoids static and dynamic obstacles (see Figure 9(b)). We describe how the heuristic value is computed. We first project t to ðx; yÞ space and denote it as t p . For a given a query state s, we use the position component p of s to find the closest point p c on t p . The heuristic is computed as in Chen et al. 19 It is the sum of time difference between the end point of t and the closest point on t and time required to reach p c (i.e. jpÀp c j v max þ t end À t c where tðt c Þ ¼ p c and t end is the time stamp of the end point of t).
Adaptive motion goal set. To accelerate the Theta* graph search, the resolution of the motion goal set r and n dirs is modified depending on the distance of the parent node position from moving vessels and static obstacles. We use In open waters, longer strides are made and the progress toward the goal is faster. Table 3 summarizes the list of parameters used in the planner. These parameters are chosen by the user.

Selection of parameters
User preference parameters. User preference parameters balance safety and performance. The failure penalty F f can be chosen as the duration of time required to dispatch a functional USV in lieu of the failed USV. This choice roughly captures the real time impact a failed USV will cause. The COLREGs violation penalty F c should be much smaller than F f to favor COLREGs noncompliant trajectories over trajectories that lead to a collision or failure. The extent to which large time delays are accepted to minimize failure is controlled by ! f . The closer to ! f is to 1, the more conservative the trajectories are in terms of avoiding failure.
Planning speed parameters. r; n dirs : Decreasing r and increasing n dirs creates a denser motion goal lattice and thus increases planning times. Dt wait : Using longer wait state durations, Dt wait may introduce undesirable looping behavior as the cost of gyrating about a point may be cheaper than waiting at the same point for a duration of Dt wait . Introducing an extra cost term in equation (8) capturing energy cost will remove this behavior (e.g. c p ¼ PathLengthðn p ; n c Þ).

Results
We implemented the wave-aware trajectory planner (WA-TP) in MATLAB on a PC with Intel Xeon 3.5 GHz CPU and 32 GB RAM. A set of simulation experiments were conducted. We discuss the experiments and their results below.

Simulation experiments
We generated four different maps to test the trajectory planner. Traffic vessels at t ¼ 0 are shown in Figure 10. The USV and traffic vessels were simulated using three degrees-of-freedom dynamic model as in Klinger et al. 47 In the USV pose observations used for planning, noise was added following the distribution in equation (6). The response of traffic vessels to the motion of the USV is not simulated. Only the USV is allowed to react to the traffic vessels. Some of the important parameters used are map size ¼ 600 m Â 600 m, USV length ¼ L u ¼ 5 m, traffic vessel length ¼ L v ¼ 10 m, traffic vessel speed ¼ 1 À 2 m =s. The turning radius and maximum speed of the USV are 5 m and 2 m/s, respectively. The other parameters are presented in Table 3. The traffic vessels are positioned to result in interesting dynamic obstacle avoidance, as well as wavefield crossing behavior. We compare our approach (WA-TP) with three different configurations of the planning stack. These configurations are summarized in Table 4. Global path planner (G-PP) does not consider the dynamic obstacles and generates only kinematically feasible paths to the goal. It is realized by removing dynamic obstacles and removing the time dimension from the implementation of WA-TP. Conservative trajectory planner (C-TP) takes into consideration the traffic vessels along with their wavefields as a nonpermeable dynamic obstacles. This yields conservative trajectories. C-TP is realized by setting M ¼ 0 within the wavefield in the implementation of WA-TP. The configuration G-PP þ C-VO follows the generated path and avoids traffic vessels using conservative velocity obstacles (C-VOs). C-VO considers traffic vessels along with their wavefields as large obstacles. Velocity obstacle (VO) was not designed to handle waves. Therefore, combining G-PP with VO will lead to the vehicle going over the waves and a high rate of failures. Therefore, we only evaluated G-PP þ C-VO combination instead of G-PP þ VO.
The configuration C-TP þ VO follows the conservative trajectory generated by C-TP and avoids traffic vessels using regular VO. We tested random start and goal poses sampled from the sampling regions, as shown in Figure 10. For each sampled pair, we collected aggregated performance metrics for 20 runs of different configurations of the planning stack. Table 5 presents the execution time, collision count, and success probability. For each run, normalized execution time is computed relative to the execution time of WA-TP. Success probability is computed after each run by using the executed trajectory as in Cost assessment. The results show that WA-TPþVO generally reduces execution time with the exception of M1. However, G-PP þ C-VO incurs collisions in M1 as planning without the time dimension results in situations where collision is inevitable. This is also the case in M3 as well. Going into the busy channel just because it is a shorter route to the goal is dangerous and time-consuming. Furthermore, the success probabilities are greatly reduced in G-PP þ C-VO as the USV traverses wavefields without proper alignment. On the other hand, C-TP þ VO plans longer routes to prevent collisions and ensure higher success probability. This results in increased execution time compared to WA-TP þ VO. This is especially evident in M3 where the USV has to cross a busy channel to reach the goal point. C-TP þ VO waits for a completely clear channel, whereas WA-TP þ VO waits for an opportune moment where the wavefields are traversable.     Sensitivity to speed-up techniques. The planner was invoked in the test maps (M1, M2, M3) enabling and disabling subsets of these techniques to study the effects of speed-up techniques. Twenty start and goal points were randomly sampled in the sampling regions corresponding to each map. The baseline configuration of the planner uses Euclidean distance-based time heuristic (i.e. hðs; gÞ ¼ time taken using straight-line path from s to g using maximum speed). Also, the finest motion goal set and a sample size of n ps ¼ 30 is used. The results are presented in Table 6. SOH greatly reduces expansion count and planning time. Among the two heuristics (SOH and space-time exploration heuristics (STEH)), the effect of SOH and STEH is roughly equivalent in maps that do not have heavy dynamic obstacle traffic (M1, M2). In fact, there is an overhead of computing STEH, and it may increase planning time. However, in M3, where the dynamic obstacle traffic is heavy, STEH provides better heuristics compared to SOH due to the added time dimension. In M3, SOH guides the search through the busy channel, whereas STEH guides the search to go around the congestion and head south toward the central channel. In environments with heavy dynamic obstacle traffic, the overhead of computing STEH is justified. Using adaptive motion goals reduces the expansion count further and increases the solution cost at worst by 9%.
Sensitivity to MTBF parameter. We illustrate the effects of changing the MTBF parameter of the failure model summarized in Table 2. Figure 11 shows the initial configuration of USV and traffic vessels in the islands map. Traffic vessels were initialized to speeds between 1 m/s and 2 m/s. The USV was set up to use WA-TP þ VO and commanded to cross the busy channel ( Figure 11). In this experiment, the parameter controlling the danger level of the wave region (i.e. MTBF value M t ) is changed. Two different values of M t are used to simulate two different levels of danger. In Figure 12(a), the wavefields are made to be dangerous by assigning a lower MTBF parameter (M t ¼ 1000). This results in the USV take a longer trajectory to reach the goal point while altogether avoiding any contact with the wavefield. In Figure 12(b), the wavefields are made to be permeable by assigning a high MTBF parameter (M t ¼ 3000), making the traversal of the wavefield less dangerous. Thus, in this case, the USV attempts crossing the wavefield at an appropriate angle as it makes it way to the goal point.
Sensitivity to number of dynamic obstacles and perception uncertainty. We also studied how the algorithm scaled when the number of dynamic obstacles was varied. We used the map M4 and varied the number of dynamic obstacles. Random start and goal poses were sampled from the sampling  regions. We used this random sampling scheme to ensure the USV interacts with the dynamic obstacles. Figure 13 shows the mean planning time increasing roughly linearly in the number of dynamic obstacles. We used the scenario in Figure 14 to study how our method performs under different levels of perception uncertainty. Figure 15 illustrates the different behaviors exhibited by our method. In Figure 15(a), the USV stops at the mouth of the channel waiting for the traffic vessel to clear the channel. This behavior is due to the high perception uncertainty associated with the sensing of the traffic vessel. Entering the channel in this situation is a risky move. If the channel is entered without an accurate state estimate of the traffic vessel, a collision might be inevitable when the USV is unable to maneuver around the traffic vessel. This behavior is avoided. In Figure 15(b), the USV enters the channel and keeps to the right lane while the traffic vessel is underway. This behavior is due to the low perception uncertainty associated with the sensing of the traffic vessel. The USV avoids the traffic vessel and also complies with COLREGs.

Physical experiments
Experimental setup. A set of physical experiments using a 16 feet WAM-V USV at North Lake, Hollywood, Florida (see Figure 16), was performed to verify the effectiveness of the proposed approach. In place of real static and dynamic obstacles, we introduced virtual obstacles. They were given to the navigation system in lieu of the perception subsystem. This was done to perform the experiments without risking the loss of the WAM-V platform. Figures 17 and 18 show the scenario and the trajectory obtained by WA-TP and C-TP. In these scenarios, dynamic obstacles were moving at constant speeds between 1.0 m/s and 2.0 m/s. During experiments, there was gentle southeasterly winds ranging between 10 and 12 KTS. From Table 7, the execution times are observed to have reduced by as much as 25:1% under the proposed approach. A video clip showing some simulation and physical experiments is located at https://youtu.be/_VGxb4nRUwU.

Conclusion
Our approach plans trajectories over long distances while avoiding static obstacles and traffic vessels. It also avoids waves generated by them. Our approach executes these trajectories faster and safer. The execution times are reduced by at least 25% when compared to other methods. The proposed approach is able to react safely to varying levels of perception uncertainty. The incorporation of wavefields into the planning process improves the success probability of the planned trajectory. By using heuristics and adaptive motion goal sets, the planning effort and time has been reduced. However, further reduction is required for practical deployment in congested harbors. Harbors have speed limits ranging between 1.8 m/s and 5 m/s and reacting to traffic vessels potentially moving at those speeds requires faster planning times than possible by the current MATLAB implementation. In this work, simplifying assumptions such as trajectory of traffic vessels, the shape and size of the wave regions were made. However, these aspects of the planner can be changed and tuned for specific applications that are not necessarily in the marine domain. The heuristics and evaluation of transition cost under motion/perception uncertainty remain applicable in other domains where there are large semipermeable dynamic obstacles. There are a few potential directions of future work. The reactive behavior of traffic vessels to the actions of the USV was not captured in the planner. Furthermore, wind and current effects were not included in the planning process. These effects can be included when good estimates of the currents and wind direction are available. This will require development of new heuristics. Based on the assumption that multiple traffic vessels may not come very close to the USV, the interaction of wavefields by multiple traffic vessels was ignored. Including this interaction in the planning problem is essential in very congested environments. An optimized Cþþ implementation can be done for a faster planning time by utilizing GPU and multicore frameworks to perform the computations in the Monte Carlo scheme used for failure probability computations.

Authors' note
Opinions expressed are those of the authors and do not necessarily reflect the opinions of the sponsors.    low-level controller is commanded a motion goal to s 1 , it tries to track the reference in the presence of disturbances. Thus, the target position to be achieved at t ¼ t 1 is p 1 ¼ ðx 1 ; 0; 0Þ. We assume the motion goal is conservative enough to be feasible, even under heavy disturbances (i.e. s 1 can be reached by t 1 ). The controller is able to achieve the target position while absorbing the effect of disturbances (by temporarily maneuvering at faster or slower speeds than the reference speed). Many trials are conducted where the USV is repeatedly made to move from s 0 to s 1 under different conditions g (i.e. sea states and wave directions) and USV trajectory sðtÞ is recorded. The USV is subjected to a superposed waves of varying frequency and amplitude, as expected in a particular sea state as well as varying local waves generated by traffic vessels. The kth trial captures the evolution of the USV state under a particular realization of g. Given a set of trajectory samples fs k ðtÞg n k k¼1 , the deviation from the optimal reference trajectory can be characterized using two quantities: the maximal cross-track error standard deviation (s ct ) and the maximal along-track (s at ) error standard deviation.
s ct ¼ max q2½0;t 1 s½fN q ½t k ðqÞ À tðqÞg n k k¼1 s at ¼ max q2½0;t 1 s½fT q ½t k ðqÞ À tðqÞg n k k¼1 Here t k and t denote the x À y path associated with s k and x, respectively. s½Á denotes the standard deviation operator. N q ½Á and T q ½Á denote the normal projection and tangential projection operator on the reference path point tðqÞ. The maximal cross-track error standard deviation is the maximum standard deviation of the normal distance between the USV and the path over the entire time interval t 2 ½0; t 1 . The maximal along-track error standard deviation is the maximum standard deviation of the tangential distance between the USV and the reference position over the entire time interval t 2 ½0; t 1 . s ct and s at collectively capture the lateral position and speed uncertainty experienced by the USV as it travels toward a motion goal under influence of disturbance parameter g.