Online control synthesis for uncertain systems under signal temporal logic specifications

Signal temporal logic (STL) formulas have been widely used as a formal language to express complex robotic specifications, thanks to their rich expressiveness and explicit time semantics. Existing approaches for STL control synthesis suffer from limited scalability with respect to the task complexity and lack of robustness against the uncertainty, for example, external disturbances. In this paper, we study the online control synthesis problem for uncertain discrete-time systems subject to STL specifications. Different from existing techniques, we propose an approach based on STL, reachability analysis, and temporal logic trees. First, based on a real-time version of STL semantics, we develop the notion of tube-based temporal logic tree (tTLT) and its recursive (offline) construction algorithm. We show that the tTLT is an under-approximation of the STL formula, in the sense that a trajectory satisfying a tTLT also satisfies the corresponding STL formula. Then, an online control synthesis algorithm is designed using the constructed tTLT. It is shown that when the STL formula is robustly satisfiable and the initial state of the system belongs to the initial root node of the tTLT, it is guaranteed that the trajectory generated by the control synthesis algorithm satisfies the STL formula. We validate the effectiveness of the proposed approach by several simulation examples and further demonstrate its practical usability on a hardware experiment. These results show that our approach is able to handle complex STL formulas with long horizons and ensure the robustness against the disturbances, which is beyond the scope of the state-of-the-art STL control synthesis approaches.


Motivation and Related Work
The rapid growth of robotic applications, such as autonomous vehicles and service robots, has stimulated the need for new control synthesis approaches that safely accomplish more complex objectives such as nondeterministic, periodic, or sequential tasks. Temporal logics, such as linear temporal logic (LTL) (Baier and Katoen, 2008), metric interval temporal logic (MITL) (Koymans, 1990), and signal temporal logic (STL) (Maler and Nickovic, 2004), have shown their capability in expressing such objectives for dynamical systems in the last decade. Various control approaches have been developed accordingly.
LTL focuses on the Boolean satisfaction of properties by given signals, while MITL is a continuous-time extension that allows temporal constraints to be expressed. Existing control approaches that use LTL or MITL mainly rely on a finite abstraction of the system dynamics and a language-equivalent automaton (Gastin and Oddoux, 2001) or timed automaton (Alur et al., 1996) representation of the LTL or MITL specification. The controller is synthesized by solving a game over the product automaton (Belta et al., 2007, 2017; Zhou et al., 2016). Other control approaches include optimization-based (Wolff and Murray, 2016; Fu and Topcu, 2015) and sampling-based methods (Vasile and Belta, 2013; Kantaros and Zavlanos, 2018). STL is a more recently developed temporal logic, which allows the specification of properties over dense time. Due to a number of advantages, such as explicitly treating real-valued signals (Maler and Nickovic, 2004) and admitting quantitative semantics (Fainekos and Pappas, 2009), control synthesis under STL specifications has gained popularity in the last few years.
Different from LTL or MITL, automata-based methods have not been developed for STL specifications to the same extent due to their complexity. Existing approaches that deal with control synthesis under STL specifications include optimization (Raman et al., 2014, 2015; Sadraddini and Belta, 2015) and barrier function methods (Lindemann and Dimarogonas, 2018, 2019a; Yang et al., 2020). Optimization methods are mainly used for discrete-time systems. The idea is to encode STL formulas as mixed-integer constraints, and then the satisfying controller can be obtained by solving a series of optimization problems (Raman et al., 2014, 2015). An extension of the mixed-integer formulation is investigated for linear systems with additive bounded disturbances in Sadraddini and Belta (2015), where the controller is obtained by solving the optimization problem at each time step in a receding horizon fashion. One drawback of this approach is its exponential computational complexity, which makes it difficult to apply to STL formulas with long time horizons. Barrier function methods are mainly used for continuous-time systems. The idea is to transform the STL formula into one or several (time-varying) control barrier functions, and then obtain feedback control laws by solving quadratic programs (Lindemann and Dimarogonas, 2018, 2019a). This method is computationally efficient. However, as the existence and design of barrier functions are still open problems, it currently mainly applies to deterministic affine systems. In Yang et al. (2020), the authors consider linear cyber-physical systems with continuous-time dynamics and discrete-time controllers. The proposed offline trajectory planner is based on mixed-integer quadratic programming that utilizes control barrier functions to generate satisfying trajectories in continuous time. Other control synthesis approaches include sampling-based (Vasile et al., 2017; Karlsson et al., 2020) and learning-based methods (Venkataraman et al., 2020; Kapoor et al., 2020). In addition, control synthesis for multi-agent systems under STL specifications has recently been considered in Lindemann and Dimarogonas (2019b), Buyukkocak et al. (2021), and Sun et al. (2022).
We note that although various methods exist for control synthesis under STL specifications, guaranteeing robustness against uncertainties is still a challenging problem. The core contribution of this paper is on robust control synthesis for uncertain systems under STL specifications.

Main Contributions and Organization
Motivated by the above considerations, this work considers the online control synthesis problem for uncertain discrete-time systems under STL specifications. The paper is inspired by Chen et al. (2018b), where relationships between primitive STL formulas and reachable sets are developed, and Gao et al. (2022), where the notion of temporal logic tree is proposed for LTL. However, we note that it is far from straightforward to extend these results to general STL formulas. The contributions of our paper are summarized as follows:
(i) A real-time version of the satisfaction relation and a tube-based temporal logic tree (tTLT) are proposed for STL formulas. A correspondence between STL formulas and tTLTs is established via reachability analysis on the underlying systems. An algorithm is proposed for the automated construction of the tTLT. Note that the tTLTs in this paper are different from the TLTs defined for LTL formulas in Gao et al. (2022), due to the time constraints encoded in the STL formulas.
(ii) We show that the tTLT is an under-approximation for a broad fragment of STL formulas, i.e., all the trajectories that satisfy the tTLT also satisfy the corresponding STL formula.
(iii) We propose an online control synthesis algorithm based on the tTLT constructed from the STL formula. When the STL formula is robustly satisfiable and the initial state of the system belongs to the initial root node of the tTLT, it is proven that the trajectory generated by the proposed online control synthesis algorithm satisfies the STL formula.
The remainder of the paper is organized as follows. In Section 2, preliminaries and the problem under consideration are formulated. In Section 3, definitions of real-time STL semantics and tTLT are introduced. Section 4 establishes a semantic connection between STL and tTLT. Section 5 deals with the online control synthesis problem. The results are validated by simulations and experiments in Section 6. Conclusions are given in Section 7.
Notation. Let $\mathbb{R} := (-\infty, \infty)$ and $\mathbb{R}_{\ge 0} := [0, \infty)$. Let $\mathbb{N}$ be the set of natural numbers. Denote by $\mathbb{R}^n$ the $n$-dimensional real vector space and by $\mathbb{R}^{n \times m}$ the $n \times m$ real matrix space. Given a vector $x \in \mathbb{R}^n$, define $\|x\|$ and $x^T$ as the Euclidean norm and the transpose of $x$, respectively. Given a set $\Omega$, $\overline{\Omega}$ denotes its complement, $2^{\Omega}$ denotes its powerset, and $|\Omega|$ denotes its cardinality. The operators $\cup$ and $\cap$ represent set union and set intersection, respectively. In addition, we use $\wedge$ to denote the logical operator AND and $\vee$ to denote the logical operator OR. The set difference $A \setminus B$ is defined by $A \setminus B := \{x : x \in A \wedge x \notin B\}$.

Systems dynamics
Consider an uncertain discrete-time control system of the form
$$x_{k+1} = f(x_k, u_k, w_k), \tag{1}$$
where $x_k \in \mathbb{R}^n$, $u_k \in U$, and $w_k \in W$ denote the state, the control input, and the disturbance at time $t_k$, $k \in \mathbb{N}$, respectively. In the following, let us define the control policy.
Definition 2.1. A control policy $\nu = \nu_0 \nu_1 \ldots \nu_k \ldots$ is a sequence of maps $\nu_k : \mathbb{R}^n \to U$, $\forall k \in \mathbb{N}$. Denote by $U_{\ge k}$ the set of all control policies that start from time $t_k$.
One can see from Definition 2.1 that a control policy is a sequence of time-dependent functions, each of which maps from the state space to the input space.
Definition 2.2. A disturbance signal $w = w_0 w_1 \ldots w_k \ldots$ is called admissible if $w_k \in W$, $\forall k \in \mathbb{N}$. Denote by $W_{\ge k}$ the set of all admissible disturbance signals that start from time $t_k$.
The solution of (1) is defined as a discrete-time signal $x := x_0 x_1 \ldots$. We call $x$ a trajectory of (1) if there exist a control policy $\nu \in U_{\ge 0}$ and a disturbance signal $w \in W_{\ge 0}$ satisfying (1), i.e., $x_{k+1} = f(x_k, \nu_k(x_k), w_k)$, $\forall k \in \mathbb{N}$. We use $x^{\nu,w}_{x_0}(t_k)$ to denote the trajectory point reached at time $t_k$ under the control policy $\nu$ and the disturbance $w$ from state $x_0$ at time $t_0$.
The deterministic system is denoted by (2), and $x^{\nu}_{x_0}(t_k)$ denotes the solution at time $t_k$ of the deterministic system when the control policy is $\nu$ and the initial state is $x_0$ at time $t_0$.
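To make the roles of the control policy and the disturbance signal concrete, the following minimal sketch rolls out a trajectory $x_0 x_1 \ldots$ of a system of the form (1). The scalar dynamics `f`, the linear feedback maps, and the bound `w_max` are hypothetical choices made only for illustration; they are not taken from the paper.

```python
from typing import Callable, List, Sequence

def rollout(f: Callable[[float, float, float], float],
            policy: Sequence[Callable[[float], float]],
            disturbances: Sequence[float],
            x0: float) -> List[float]:
    """Return x_0 x_1 ... generated by x_{k+1} = f(x_k, nu_k(x_k), w_k)."""
    traj = [x0]
    for nu_k, w_k in zip(policy, disturbances):
        x_k = traj[-1]
        traj.append(f(x_k, nu_k(x_k), w_k))
    return traj

def is_admissible(w: Sequence[float], w_max: float = 0.1) -> bool:
    """Check w_k in W for all k, with the assumed set W = [-w_max, w_max]."""
    return all(abs(w_k) <= w_max for w_k in w)

# Hypothetical scalar dynamics: x_{k+1} = x_k + u_k + w_k.
def f(x: float, u: float, w: float) -> float:
    return x + u + w

# A policy is a sequence of state-feedback maps nu_k; here every nu_k is the same.
policy = [lambda x: -0.5 * x] * 5

nominal = rollout(f, policy, [0.0] * 5, 1.0)
```

With the zero disturbance signal the state is halved at every step, which corresponds to the deterministic system; any other admissible signal produces a different trajectory under the same policy.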

Signal temporal logic
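We use STL to concisely specify the desired system behavior. STL (Maler and Nickovic, 2004) is a predicate logic consisting of predicates $\mu$, which are defined through a predicate function $g_\mu : \mathbb{R}^n \to \mathbb{R}$ as
$$\mu := \begin{cases} \top, & \text{if } g_\mu(x) \ge 0, \\ \bot, & \text{if } g_\mu(x) < 0. \end{cases}$$
The syntax of STL is given by
$$\varphi ::= \top \mid \mu \mid \neg\varphi \mid \varphi_1 \wedge \varphi_2 \mid \varphi_1 U_I \varphi_2, \tag{3}$$
where $\varphi, \varphi_1, \varphi_2$ are STL formulas and $I = [a, b]$ is a time interval. The validity of an STL formula $\varphi$ with respect to a discrete-time signal $x$ at time $t_k$, written $(x, t_k) \models \varphi$, is defined inductively as in Raman et al. (2015). The signal $x = x_0 x_1 \ldots$ satisfies $\varphi$, denoted by $x \models \varphi$, if $(x, t_0) \models \varphi$. By using the "negation" operator $\neg$ and the "conjunction" operator $\wedge$, we can define "disjunction" $\varphi_1 \vee \varphi_2 = \neg(\neg\varphi_1 \wedge \neg\varphi_2)$. By employing the until operator $U_I$, we can define "eventually" $F_I \varphi = \top U_I \varphi$ and "always" $G_I \varphi = \neg F_I \neg\varphi$.
Definition 2.3. (Dokhanchi et al., 2014) The time horizon $\|\varphi\|$ of an STL formula $\varphi$ is defined inductively over the structure of $\varphi$.
Definition 2.4. (Satisfiability) Consider the deterministic system (2) and the STL formula $\varphi$. We say $\varphi$ is satisfiable from the initial state $x_0$ if there exists a control policy $\nu$ such that $x^{\nu}_{x_0} \models \varphi$.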
Definition 2.5. (Robust satisfiability) Consider the uncertain system (1) and the STL formula $\varphi$. We say $\varphi$ is robustly satisfiable from the initial state $x_0$ if there exists a control policy $\nu$ such that $x^{\nu,w}_{x_0} \models \varphi$, $\forall w \in W_{\ge 0}$.
Given an STL formula $\varphi$, let $S_\varphi$ denote the set of initial states from which $\varphi$ is (robustly) satisfiable.
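As an illustration of how the Boolean semantics and the time horizon can be evaluated on a finite discrete-time signal, here is a small sketch. It assumes unit sampling ($t_k = k$), integer interval bounds, and the common until variant in which $\varphi_1$ must hold on the closed interval up to the instant where $\varphi_2$ holds; the tuple encoding of formulas is a hypothetical convenience and is not part of the paper.

```python
def sat(phi, x, k=0):
    """Check (x, t_k) |= phi for a finite signal x, assuming t_k = k."""
    op = phi[0]
    if op == "true":
        return True
    if op == "pred":                      # ("pred", g) holds iff g(x_k) >= 0
        return phi[1](x[k]) >= 0
    if op == "not":
        return not sat(phi[1], x, k)
    if op == "and":
        return sat(phi[1], x, k) and sat(phi[2], x, k)
    if op == "until":                     # ("until", a, b, phi1, phi2)
        _, a, b, p1, p2 = phi
        for k2 in range(k + a, k + b + 1):
            # Assumed variant: phi1 holds at every instant in [k, k2].
            if sat(p2, x, k2) and all(sat(p1, x, j) for j in range(k, k2 + 1)):
                return True
        return False
    raise ValueError(f"unknown operator: {op}")

def eventually(a, b, phi):
    return ("until", a, b, ("true",), phi)

def always(a, b, phi):
    return ("not", eventually(a, b, ("not", phi)))

def horizon(phi):
    """Number of future samples needed to decide phi (standard recursion)."""
    op = phi[0]
    if op in ("true", "pred"):
        return 0
    if op == "not":
        return horizon(phi[1])
    if op == "and":
        return max(horizon(phi[1]), horizon(phi[2]))
    if op == "until":
        _, a, b, p1, p2 = phi
        return b + max(horizon(p1), horizon(p2))
    raise ValueError(f"unknown operator: {op}")

signal = [0, 1, 2, 3, 4]
reach_three = ("pred", lambda s: s - 3)   # mu: x >= 3
below_two = ("pred", lambda s: 2 - s)     # mu: x <= 2
```

The signal passed to `sat` must contain at least `horizon(phi) + 1` samples from the evaluation instant onward, otherwise the until loop indexes past the end of the list.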

Reachability operators
In this section, we define two reachability operators. The natural connection between reachability and temporal operators plays an important role in the approach proposed in this paper. The definitions of maximal and minimal reachable tube are given as follows.
Definition 2.6. Consider the system (1), three sets $\Omega_1, \Omega_2, C \subseteq \mathbb{R}^n$, and a time interval $[a, b]$. The maximal reachable tube from $\Omega_1$ to $\Omega_2$ is defined through the operator $R^M$: it collects all states in $\Omega_1$ at time $t_k$ from which there exists a control policy $\nu \in U_{\ge k}$ that, despite the worst disturbance signals, drives the system to the target set $\Omega_2$ at some time instant $t_{k'} \in [\max\{a, t_k\}, b]$ while satisfying the constraints defined by $C$ prior to reaching the target.
Definition 2.7. Consider the system (1), two sets $\Omega_1, \Omega_2 \subseteq \mathbb{R}^n$, and a time interval $[a, b]$. The minimal reachable tube from $\Omega_1$ to $\Omega_2$ is defined through the operator $R^m$: it collects all states in $\Omega_1$ at time $t_k$ from which, no matter what control policy $\nu$ is applied, there exists a disturbance signal that drives the system to the target set $\Omega_2$ at some time instant $t_{k'} \in [\max\{a, t_k\}, b]$. In this definition, the constraint set $C$ is redundant.
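The verbal description of the maximal reachable tube can be turned into a backward recursion when the state, input, and disturbance sets are finite. The sketch below is only an illustration under that assumption, with integer time steps ($t_k = k$) and a hypothetical saturated integrator as dynamics; the function names and argument order are chosen here and are not the paper's notation.

```python
def pre(target, states, inputs, disturbances, f):
    """States from which some input keeps the successor in target for every disturbance."""
    return {x for x in states
            if any(all(f(x, u, w) in target for w in disturbances) for u in inputs)}

def max_reach_tube(omega1, omega2, constraint, a, b, k,
                   states, inputs, disturbances, f):
    """Value at step k of a maximal reachable tube on a finite abstraction."""
    if k > b:
        return set()
    reach = set(omega2)                       # at the final step b only the target counts
    for j in range(b - 1, k - 1, -1):
        # Not yet at the target: stay in the constraint set and move on robustly.
        reach = constraint & pre(reach, states, inputs, disturbances, f)
        if j >= a:
            reach |= omega2                   # the target may already be reached at step j
    return omega1 & reach

# Hypothetical example: saturated integrator on {0, ..., 5}.
STATES = set(range(6))
INPUTS = {0, 1, 2}
DISTURBANCES = {-1, 0}

def f(x, u, w):
    return min(5, max(0, x + u + w))

tube_at_0 = max_reach_tube(STATES, {4, 5}, STATES, 0, 3, 0,
                           STATES, INPUTS, DISTURBANCES, f)
```

In this example the state 0 is excluded from the tube at step 0: even with the largest input, the disturbance can hold the state below 4 for all three steps.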

Problem formulation
Consider the following fragment of STL formulas, which is inductively defined as (5).
… $G_{[a_1,b_1]} \varphi_1$ within the encoded time intervals of the temporal operators $G_{[a_1,b_1]}$ and $U_{[a_2,b_2]}$, respectively. Nevertheless, we note that the fragment (5) is more general than most of the fragments considered in the literature studying online control synthesis, e.g., Lindemann and Dimarogonas (2018); Buyukkocak et al. (2022). The fragment (5) is expressive enough to specify a large number of robotic tasks, e.g., time-constrained reachability, supply-delivery, and safety.
The problem under consideration is formulated as follows.
Remark 2.2. Note that the objective of Problem 2.1 is not to synthesize a closed-form control policy $\nu$, which is in general computationally intractable for systems with continuous spaces. Instead, we aim at finding online a sequence of feedback control inputs in a way that is similar to receding horizon control.
The key idea to solve Problem 2.1 is as follows. We first transform the STL formula to an alternative tree-based representation, which we call tube-based temporal logic tree (tTLT), by leveraging reachability analysis, as detailed in Section 3. There exists a semantic connection between the STL formula and the corresponding tTLT, thanks to the reachability analysis, which is explained in Section 4. Based on this fact, we can perform control synthesis over the tTLT, instead of the STL formula. An online control synthesis algorithm is provided in Section 5.

Real-time STL semantics and tube-based temporal logic tree
In this section, a real-time version of STL semantics and a notion of tTLT are proposed. The real-time STL semantics establishes the satisfaction relation between a real-time signal and the STL formula. Based on this real-time semantics, we then propose the tTLT using the close connection between STL and reachability analysis.

Real-time STL semantics
Before proceeding, the following definition is required.
Definition 3.1. (Suffix and completions) Given a discrete-time signal $x = x_0 x_1 \ldots$, we say that a partial signal $s = s_l s_{l+1} \ldots$, $l \in \mathbb{N}$, is a suffix of the signal $x$ if $s_k = x_k$, $\forall k \ge l$. The set of completions of a partial signal $s$, denoted by $C(s)$, is given by $C(s) := \{x : s \text{ is a suffix of } x\}$.
Given a time instant $t_k$ and a time interval …
The real-time STL semantics is defined as follows.
Definition 3.2. Let $t_k$ be the starting time of any STL formula $\varphi$ to be evaluated. Given a partial signal $s = s_l s_{l+1} \ldots$ starting from time instant $t_l \ge t_k$, the real-time satisfaction of $\varphi$ with respect to the partial signal $s$, denoted by $(s, t_k, t_l) \models \varphi$, is recursively defined by Eq. (6).
The real-time satisfaction relation $(s, t_k, t_l) \models \varphi$ suggests that the partial signal $s$ is the suffix of a satisfying trajectory that starts from $t_k$. Using the induction rule, one can define the real-time STL semantics for "disjunction" $\varphi_1 \vee \varphi_2$, "eventually" $F_{[a,b]} \varphi$, and "always" $G_{[a,b]} \varphi$.
In parallel with Definitions 2.4 and 2.5, we define the STL satisfiability given a partial signal as follows.
Definition 3.3. Consider the deterministic system (2) and the STL formula $\varphi$. We say $\varphi$ is satisfiable from the state $x_k$ at time $t_k$ if there exists a control policy $\nu \in U_{\ge k}$ such that $(x^{\nu}_{x_k}, t_0, t_k) \models \varphi$.
Definition 3.4. Consider the uncertain system (1) and the STL formula $\varphi$. We say $\varphi$ is robustly satisfiable from the state $x_k$ at time $t_k$ if there exists a control policy $\nu \in U_{\ge k}$ such that $(x^{\nu,w}_{x_k}, t_0, t_k) \models \varphi$, $\forall w \in W_{\ge k}$.
Note that when $t_k = t_0$, Definitions 3.3 and 3.4 degenerate to Definitions 2.4 and 2.5, respectively. Given an STL formula $\varphi$, let $S_\varphi(t_k)$, defined in (7), denote the set of states from which $\varphi$ is robustly satisfiable at $t_k$. Then, we have the following results.
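Proof. Assume that $x_k \in S_{\varphi_1 \wedge \varphi_2}(t_k)$. According to Definition 3.2 and (7), one has that there exists a control policy $\nu \in U_{\ge k}$ such that $(x^{\nu,w}_{x_k}, t_0, t_k) \models \varphi_1 \wedge \varphi_2$, $\forall w \in W_{\ge k}$. Hence, the same policy robustly satisfies both $\varphi_1$ and $\varphi_2$, i.e., $x_k \in S_{\varphi_1}(t_k) \cap S_{\varphi_2}(t_k)$. The other direction may not hold because it could happen that for a state $x_k$, there exist two control policies $\nu_1, \nu_2 \in U_{\ge k}$ that ensure the robust satisfaction of $\varphi_1$ and $\varphi_2$ at $t_k$, respectively. However, there is no single control policy which ensures the robust satisfaction of $\varphi_1 \wedge \varphi_2$ at $t_k$.
Assume now that $x_k \in S_{\varphi_1}(t_k)$; then one has that there exists a control policy $\nu \in U_{\ge k}$ such that $(x^{\nu,w}_{x_k}, t_0, t_k) \models \varphi_1$, $\forall w \in W_{\ge k}$. Moreover, according to STL syntax, one further has $(x^{\nu,w}_{x_k}, t_0, t_k) \models \varphi_1 \vee \varphi_2$, $\forall w \in W_{\ge k}$. The other direction may not hold because it could happen that there exists no state from which either $\varphi_1$ or $\varphi_2$ is robustly satisfiable at $t_k$, i.e., $S_{\varphi_1}(t_k) = \emptyset$ and $S_{\varphi_2}(t_k) = \emptyset$. However, there exists a state $x^*_k$ from which there exists a control policy $\nu \in U_{\ge k}$ such that $(x^{\nu,w}_{x^*_k}, t_0, t_k) \models \varphi_1 \vee \varphi_2$, $\forall w \in W_{\ge k}$.
It is implied from Propositions 3.1 and 3.2 that the real-time satisfiable set of the STL formula can be inferred by set operations and reachability analysis, which makes it reasonable to develop the tTLT, a tree structure consisting of reachable tubes and operators. In the following section, we will detail the definition of the tTLT and how to construct a tTLT from a given STL formula using reachability analysis.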

Tube-based temporal logic tree and its construction
A tTLT is a variant of the TLT proposed in the recent work (Gao et al., 2022) for LTL formulas. Due to the time-dependent essence of STL formulas, the reachable sets in the TLT are replaced with the reachable tubes in the tTLT, which can explicitly incorporate the time constraints in the STL formulas. The intuition of the tTLT is that it indicates how a state trajectory should evolve in order to satisfy the time constraints embedded in an STL formula. In the following, a formal definition of the tTLT is introduced.
Definition 3.5. A tTLT is a tree for which the following holds:
• each node is either a tube node that maps from the nonnegative time axis, i.e., $\mathbb{R}_{\ge 0}$, to the subsets of $\mathbb{R}^n$, or an operator node that belongs to $\{\wedge, \vee, U_I, F_I, G_I\}$;
• the root node and the leaf nodes are tube nodes;
• if a tube node is not a leaf node, its unique child is an operator node;
• the children of any operator node are tube nodes.
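The structural conditions of Definition 3.5 can be captured by a small data structure together with a validity check. The sketch below is one possible encoding, not the authors' implementation: the class names, the string labels for operators, and the use of finite sets as stand-ins for subsets of $\mathbb{R}^n$ are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional

OPERATORS = {"and", "or", "U", "F", "G"}

@dataclass
class TubeNode:
    tube: Callable[[float], set]          # maps a time t >= 0 to a set of states
    child: Optional["OpNode"] = None      # unique operator child, None for a leaf

@dataclass
class OpNode:
    op: str                               # one of OPERATORS
    children: List[TubeNode] = field(default_factory=list)

def is_valid_ttlt(root) -> bool:
    """Check the conditions of Definition 3.5 on a candidate tree."""
    if not isinstance(root, TubeNode):    # the root must be a tube node
        return False
    stack = [root]
    while stack:
        node = stack.pop()
        if node.child is None:            # leaf tube node
            continue
        op_node = node.child
        if not isinstance(op_node, OpNode) or op_node.op not in OPERATORS:
            return False
        if not op_node.children:          # an operator node cannot be a leaf
            return False
        for c in op_node.children:
            if not isinstance(c, TubeNode):
                return False
            stack.append(c)
    return True

leaf_a = TubeNode(tube=lambda t: {1, 2})
leaf_b = TubeNode(tube=lambda t: {2, 3})
root = TubeNode(tube=lambda t: {2}, child=OpNode("and", [leaf_a, leaf_b]))
dangling = TubeNode(tube=lambda t: {2}, child=OpNode("and", []))
```

Storing the single operator child directly on the tube node enforces the "unique child" condition by construction, so the check only has to verify node types and that no operator node ends a path.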
The following result shows how to construct a tTLT for any given STL formula using reachability analysis.
Theorem 3.1. For the system (1) and every STL formula $\varphi$ in (3), a tTLT, denoted by $T_\varphi$, can be constructed from $\varphi$ through the reachability operators $R^M$ and $R^m$.
Proof. We follow three steps to construct a tTLT.
Step 1: Rewrite the STL formula $\varphi$ into the equivalent positive normal form (PNF). It has been proven in Sadraddini and Belta (2015) that each STL formula has an equivalent STL formula in PNF (i.e., negations only occur adjacent to predicates).
Step 2: For each predicate $\mu$ or its negation $\neg\mu$, construct the tTLT with only one tube node, namely $S_\mu = \{x : g_\mu(x) \ge 0\}$ for $\mu$ or $S_{\neg\mu}$ for $\neg\mu$. The tTLT of $\top$ or $\bot$ has only one tube node, which is $\mathbb{R}^n$ or $\emptyset$.
Step 3: Follow the induction rule to construct the tTLT $T_\varphi$. More specifically, we will show that given STL formulas $\varphi_1$ and $\varphi_2$, if the tTLTs can be constructed from $\varphi_1$ and $\varphi_2$, then the tTLTs can be constructed from $\varphi_1 \wedge \varphi_2$, $\varphi_1 \vee \varphi_2$, $\varphi_1 U_{[a,b]} \varphi_2$, $F_{[a,b]} \varphi_1$, and $G_{[a,b]} \varphi_1$.
Case 1: Boolean operators $\wedge$ and $\vee$. Consider two STL formulas $\varphi_1, \varphi_2$ and their corresponding tTLTs $T_{\varphi_1}, T_{\varphi_2}$. The root nodes of $T_{\varphi_1}$ and $T_{\varphi_2}$ are denoted by $X_{\varphi_1}(t_k)$ and $X_{\varphi_2}(t_k)$, respectively. The tTLT $T_{\varphi_1 \wedge \varphi_2}$ ($T_{\varphi_1 \vee \varphi_2}$) can be constructed by connecting $X_{\varphi_1}(t_k)$ and $X_{\varphi_2}(t_k)$ through the operator node $\wedge$ ($\vee$) and taking the intersection (or union) of the two root nodes, i.e., $X_{\varphi_1}(t_k) \cap X_{\varphi_2}(t_k)$ ($X_{\varphi_1}(t_k) \cup X_{\varphi_2}(t_k)$).
Case 2: Until operator $U_{[a,b]}$. Consider two STL formulas $\varphi_1, \varphi_2$ and their corresponding tTLTs $T_{\varphi_1}, T_{\varphi_2}$. The root nodes of $T_{\varphi_1}$ and $T_{\varphi_2}$ are denoted by $X_{\varphi_1}(t_k)$ and $X_{\varphi_2}(t_k)$, respectively. In addition, the leaf nodes of $T_{\varphi_1}$ are denoted by …, where $N$ is the total number of leaf nodes of $T_{\varphi_1}$. The tTLT $T_{\varphi_1 U_{[a,b]} \varphi_2}$ can be constructed by the following steps: 1) replace each leaf node …; 2) update $T_{\varphi_1}$ from the leaf nodes to the root node with the new leaf nodes; and 3) connect each leaf node of the updated $T_{\varphi_1}$ and the root node of $T_{\varphi_2}$, i.e., $X_{\varphi_2}(t_k)$, with the operator node $U_{[a,b]}$.
Case 3: Eventually and always operators $F_{[a,b]}$ and $G_{[a,b]}$. Consider an STL formula $\varphi_1$ and its corresponding tTLT $T_{\varphi_1}$. The root node of $T_{\varphi_1}$ is given by $X_{\varphi_1}(t_k)$. The tTLT …
Based on Theorem 3.1, Algorithm 1 is designed for the construction of the tTLT $T_\varphi$. It takes the syntax tree of the STL formula $\varphi$ as input. For an STL formula, the nodes of its syntax tree are either predicate or operator nodes. More specifically, all the leaf nodes are predicates and all other nodes are operators.

Algorithm 1 tTLTConstruction
Input: the syntax tree of STL formula $\varphi$. Return: the tTLT $T_\varphi$.
1: for each leaf node $\mu$ (or $\neg\mu$) of the syntax tree, do
2: Replace $\mu$ (or $\neg\mu$) by $S_\mu$ (or $S_{\neg\mu}$),
3: end for
4: for each operator node of the syntax tree through a bottom-up traversal, do
5: Construct $T_\varphi$ according to Theorem 3.1,
6: end for

Let us use the following example to show how to construct the tTLT.
… are predicates. The syntax tree of $\varphi$ is shown on the left-hand side of Figure 4. The corresponding tTLT for $\varphi$ (constructed using Algorithm 1) is shown on the right-hand side of Figure 4, where …
Remark 3.1. Given an STL formula $\varphi$ in positive normal form, let $N$ denote the number of Boolean operators and $M$ the number of temporal operators contained in $\varphi$. Let $T_\varphi$ be the tTLT corresponding to $\varphi$. Then, $T_\varphi$ has at most $2N$ complete paths. In addition, each complete path has at most $2(N + M) + 1$ nodes, of which at most $N + M$ are non-root tube nodes. Thus, one can conclude that $T_\varphi$ contains at most $4N(N + M) + 1$ nodes, of which at most $2N(N + M) + 1$ are tube nodes.
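The Boolean case of Theorem 3.1 (Case 1) is simple enough to sketch directly: the two sub-trees are attached to a new operator node, and the new root tube is the pointwise intersection or union of the two old root tubes. The snippet below assumes finite state sets and integer time steps, and the `Node` class and helper names are hypothetical; the until, eventually, and always cases require the reachability operators and are not covered here.

```python
from dataclasses import dataclass, field
from typing import Callable, FrozenSet, List, Optional

Tube = Callable[[int], FrozenSet[int]]

@dataclass
class Node:
    tube: Optional[Tube] = None      # set for tube nodes
    op: Optional[str] = None         # set for operator nodes
    children: List["Node"] = field(default_factory=list)

def predicate_leaf(states) -> Node:
    """Single-node tTLT of a predicate: a constant tube equal to its satisfying set."""
    s = frozenset(states)
    return Node(tube=lambda t: s)

def combine(op: str, t1: Node, t2: Node) -> Node:
    """Build the tTLT of phi1 AND phi2 (or phi1 OR phi2) from the two sub-trees."""
    if op == "and":
        root_tube = lambda t: t1.tube(t) & t2.tube(t)
    elif op == "or":
        root_tube = lambda t: t1.tube(t) | t2.tube(t)
    else:
        raise ValueError("only Boolean operators are handled in this sketch")
    # New root tube node -> operator node -> the two old root tube nodes.
    return Node(tube=root_tube, children=[Node(op=op, children=[t1, t2])])

t_and = combine("and", predicate_leaf({1, 2, 3}), predicate_leaf({2, 3, 4}))
t_or = combine("or", predicate_leaf({1, 2}), predicate_leaf({4}))
```

Because the root tube is defined as a function of time, the same composition works unchanged when the sub-tree roots are genuinely time-varying tubes rather than the constant ones used here.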

Semantic Connection between STL and tTLT
In this section, the semantic connection between an STL formula and its corresponding tTLT is derived. Before that, we first define the complete path and its segment.
Definition 4.1. A complete path $p$ of a tTLT is a path that starts from the root node and ends at a leaf node. It can be encoded in the form of $p = X_0 \Theta_1 X_1 \Theta_2 \ldots \Theta_{N_f} X_{N_f}$, where $N_f$ is the number of operator nodes contained in the complete path, $X_i : \mathbb{R}_{\ge 0} \to 2^{\mathbb{R}^n}$, $\forall i \in \{0, 1, \ldots, N_f\}$ represent tube nodes, and $\Theta_j \in \{\wedge, \vee, U_I, F_I, G_I\}$, $\forall j \in \{1, \ldots, N_f\}$ represent operator nodes. Any subsequence of a complete path is called a segment of the complete path.
Now, we define the maximal temporal segment for a tTLT, which plays an important role when simplifying the tTLT.
Definition 4.2. A maximal temporal segment (MTS) of a complete path of the tTLT is one of the following types of segment:
1) a segment from the root node to the parent of the first Boolean operator node ($\wedge$ or $\vee$);
2) a segment from one child of one Boolean operator node to the parent of the next Boolean operator node;
3) a segment from one child of the last Boolean operator node to the leaf node.
One can conclude from Definition 4.2 that any MTS starts and ends with a tube node and contains no Boolean operator nodes.
Definition 4.3. A time coding of (a complete path of) the tTLT is an assignment to each tube node $X_i$ of (the complete path of) the tTLT of an activation time instant $t_{\kappa_i}$, $\kappa_i \in \mathbb{N}$.
Now, we further define the satisfaction relation between a trajectory $x$ and a complete path of the tTLT.
Definition 4.4. Consider a trajectory $x := x_0 x_1 \ldots$ and a complete path $p = X_0 \Theta_1 X_1 \Theta_2 \ldots \Theta_{N_f} X_{N_f}$. We say $x$ satisfies $p$, denoted by $x \models p$, if there exists a time coding for $p$ such that …
This means that if a trajectory $x \models p$, it must visit each tube node $X_i$ of the complete path $p$ sequentially. In addition, we can further conclude from items iv)-v) that the trajectory $x$ has to stay in each tube node $X_i$ for sufficiently many time steps.
With Definition 4.4, the satisfaction relation between a trajectory $x$ and a tTLT $T_\varphi$ can be defined as follows.
Definition 4.5. Consider a trajectory $x$ and a tTLT $T_\varphi$. We say $x$ satisfies $T_\varphi$, denoted by $x \models T_\varphi$, if there exists a time coding $\{t_{\kappa_i}\}$ for $T_\varphi$ such that the output of Algorithm 2 is true.
The central idea of Algorithm 2 is to check the Boolean relation among sub-formulas of a given STL formula $\varphi$. For instance, assume $\varphi = \wedge_{i=1}^{n} \varphi_i$, where each $\varphi_i$, $\forall i = 1, \cdots, n$, contains no Boolean operators. Then one can get from Algorithm 1 that $T_\varphi$ has $n$ complete paths $p_i$, $i = 1, \cdots, n$, and each $p_i$ corresponds to a sub-formula $\varphi_i$. Then Algorithm 2 dictates that $x \models T_\varphi$ if and only if $x$ satisfies every complete path of $T_\varphi$. Assume now that $\varphi = \vee_{i=1}^{n} \varphi_i$; then Algorithm 2 dictates that $x \models T_\varphi$ if and only if $x$ satisfies at least one complete path of $T_\varphi$.

Algorithm 2 tTLTSatisfaction
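Input: a trajectory $x$, a tTLT $T_\varphi$, and a time coding $\{t_{\kappa_i}\}$. Return: true or false.
1: $T^c_\varphi \leftarrow$ Compression($T_\varphi$),
2: for each complete path $p$ of $T_\varphi$, do
3: if $x \models p$, then
4: set the corresponding leaf node of $p$ in $T^c_\varphi$ with true,
5: else
6: set the corresponding leaf node of $p$ in $T^c_\varphi$ with false,
7: end if
8: end for
9: set all the non-leaf tube nodes in $T^c_\varphi$ with false,
10: Backtracking($T^c_\varphi$),
11: return the root node of $T^c_\varphi$.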
Algorithm 2 takes as inputs a trajectory $x$, a tTLT $T_\varphi$, and a time coding $\{t_{\kappa_i}\}$, and outputs true or false. It works as follows. Given a tTLT $T_\varphi$, we first compress it via Algorithm 3 (line 1), so that the resulting compressed tree $T^c_\varphi$ contains only Boolean operator nodes and tube nodes. Then for each complete path $p$ of $T_\varphi$, if $x \models p$, one sets the corresponding leaf node of $p$ in $T^c_\varphi$ (note that $T^c_\varphi$ and $T_\varphi$ have the same set of leaf nodes) with true. Otherwise, one sets the corresponding leaf node of $p$ in $T^c_\varphi$ with false (lines 2-8). After that, we set all the non-leaf tube nodes of $T^c_\varphi$ with false (line 9) and the resulting tree becomes a Boolean tree (a tree with Boolean operator and Boolean variable nodes). Finally, we backtrack the Boolean tree $T^c_\varphi$ using Algorithm 4, and return the root node (lines 10-11).
We further detail the Compression algorithm (Algorithm 3) and the Backtracking algorithm (Algorithm 4) in the following. Algorithm 3 aims at obtaining a simplified tree with Boolean operator nodes and tube nodes only.
To do so, we first encode each MTS in the form of $X_1 \Theta_1 \ldots \Theta_{N_f - 1} X_{N_f}$ (line 3), and then replace it with one tube node (line 4). Algorithm 4 takes the compressed tree $T^c_\varphi$ as an input, and then updates the parent of each Boolean operator node through a bottom-up traversal. In Algorithm 4, PA($\Theta$) and CH$_1$($\Theta$), CH$_2$($\Theta$) represent the parent node and the two children of the Boolean operator node $\Theta \in \{\wedge, \vee\}$, respectively.
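The bottom-up update described for the Backtracking step can be illustrated on a Boolean tree whose leaves already carry the truth values assigned per complete path. In the sketch below, a compressed tree is encoded as nested tuples `(op, child_1, child_2)` with Boolean leaves; this encoding is an assumption made for brevity and does not reproduce the listing of Algorithm 4.

```python
def backtrack(tree):
    """Evaluate a compressed Boolean tree bottom-up.

    A leaf is a bool (True if the trajectory satisfies the corresponding
    complete path). An inner node is (op, child_1, child_2) with op in
    {"and", "or"}; its value plays the role of the parent PA(op) being
    updated from CH_1(op) and CH_2(op).
    """
    if isinstance(tree, bool):
        return tree
    op, child_1, child_2 = tree
    v1, v2 = backtrack(child_1), backtrack(child_2)
    if op == "and":
        return v1 and v2
    if op == "or":
        return v1 or v2
    raise ValueError(f"unknown Boolean operator: {op}")

# Pure conjunction: every complete path must be satisfied.
all_paths = ("and", True, ("and", True, True))
# Pure disjunction: one satisfied complete path is enough.
one_path = ("or", False, ("or", False, True))
# Mixed formula, e.g. phi_1 AND (phi_2 OR phi_3).
mixed = ("and", True, ("or", False, False))
```

For a pure conjunction or disjunction this reduces to the "every path" and "at least one path" conditions stated for Algorithm 2, while the recursion also handles nested combinations.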

Algorithm 3 Compression
Input: a tTLT $T_\varphi$. Return: the compressed tree $T^c_\varphi$.
1: for each complete path of $T_\varphi$, do
2: for each MTS, do
3: encode the MTS in the form of $X_1 \Theta_1 \ldots \Theta_{N_f - 1} X_{N_f}$,
4: replace the MTS with one tube node $\cup$ …

… Let $\{t_{\kappa_1}, t_{\kappa_2}, t_{\kappa_4}, t_{\kappa_5}\}$ be the time coding of the complete path $p_1$, where $t_{\kappa_1}$, $t_{\kappa_2}$, $t_{\kappa_4}$, and $t_{\kappa_5}$ are the activation time instants of the tube nodes $X_1$, $X_2$, $X_4$, and $X_5 := S_{\mu_1}$, respectively. Then, we have according to Definition 4.4 that a trajectory … In addition, the tTLT $T_\varphi$ contains 3 MTSs, i.e., $X_1$, … $\varphi$ is shown in Figure 5. If a trajectory $x$ satisfies both of the complete paths $p_1$ and $p_2$, the output of Algorithm 2 is true; otherwise, the output is false.
Definition 4.6. (Robustly satisfiable tTLT) The tTLT $T_\varphi$ is called robustly satisfiable for the system (1) with initial state $x_0$ if there exists a control policy $\nu \in U_{\ge 0}$ such that $x^{\nu,w}_{x_0} \models T_\varphi$, $\forall w \in W_{\ge 0}$.
The following theorem provides a formal semantic relation between the STL formula fragment in (5) and the corresponding tTLTs.
Theorem 4.1. Consider the uncertain system (1) with initial state $x_0$ and an STL formula $\varphi$ in (5). Let $T_\varphi$ be the tTLT corresponding to $\varphi$. Then, one has that $\varphi$ is robustly satisfiable for (1) if $T_\varphi$ is robustly satisfiable for (1).
Proof. From Definitions 2.5 and 4.6, one has that to prove Theorem 4.1, it is equivalent to prove $x^{\nu,w}_{x_0} \models T_\varphi \Rightarrow x^{\nu,w}_{x_0} \models \varphi$. In the following, we will first prove $x^{\nu,w}_{x_0}$ … where $\varphi_1$ and $\varphi_2$ in item iv) are STL formulas belonging to items ii) or iii).
Case i): For $\top$, predicates $\mu$, $\neg\mu$, and …
Case ii): We note that the proofs of the three are similar; therefore, in the following, we only consider the case $\varphi = \mu_1 U_{[a,b]} \mu_2$. The tTLT $T_\varphi$ can be constructed via Algorithm 1, which is shown in Figure 6.
… $x_k \in S_{\varphi_1}$. Moreover, from Definition 2.6, one has that i) and ii) together imply …
Case iii): We note that the proofs of the two are similar. In the following, we consider the case $\varphi = F_{[a_1,b_1]} G_{[a_2,b_2]} \mu_1$. The tTLT $T_\varphi$ can be constructed via Algorithm 1, which is shown in Figure 7.
Assume that $x^{\nu,w}_{x_0} \models T_\varphi$; then one has from Definition 4.4 that … $T_\varphi$; then one has from Definition 4.4 that $x^{\nu,w}_{x_0} \models T_{\varphi_1}$ and $x^{\nu,w}_{x_0} \models T_{\varphi_2}$. Moreover, since $\varphi_1$ and $\varphi_2$ belong to items ii) or iii), one can conclude from Case ii) and Case iii) that $x^{\nu,w}_{x_0}$ … The proof of the other direction is similar and hence omitted.
Then, we prove $x^{\nu,w}_{x_0}$ …, where $\varphi_1$ and $\varphi_2$ are STL formulas belonging to items ii) or iii).
Case v): $\varphi$ … is similar to Case iv). The other direction does not hold because for an uncertain system, it is possible that there exists a trajectory $x^{\nu,w}_{x_0}$ such that $x^{\nu,w}_{x_0} \models \varphi$; however, the initial state $x_0 \notin X^{\varphi}_{\mathrm{root}}(t_0)$ (due to Proposition 3.2), where $X^{\varphi}_{\mathrm{root}}$ denotes the root node of $T_\varphi$. In this case, $x^{\nu,w}_{x_0}$ does not satisfy $T_\varphi$.
The proof of $x^{\nu,w}_{x_0} \models T_\varphi \Rightarrow x^{\nu,w}_{x_0} \models \varphi$ for other STL formulas $\varphi$ in (5) can be completed inductively by combining Cases i)-v). Therefore, the conclusion follows.
Thanks to the semantic relation between the STL formulas in (5) and the corresponding tTLT, we are able to perform control synthesis over the tTLT, instead of the STL formulas, while preserving the correct-by-construction guarantee.The details of this control synthesis are provided in the next section.

Online Control Synthesis
This section concerns online control synthesis as defined by Problem 2.1. From Theorem 4.1, one can see that to guarantee the satisfaction of the STL formula $\varphi$ in (5), it is sufficient to find a control policy $\nu$ that guarantees the (robust) satisfaction of the corresponding tTLT $T_\varphi$. In the following, control synthesis algorithms are designed such that the tTLT $T_\varphi$ is satisfied based on Definitions 4.4 and 4.5.

Definitions and notations
Before proceeding, the following definitions and notations are needed.
Definition 5.1. The time horizon $|\Theta|$ of an STL operator …
Definition 5.2. A segment of the complete path of a tTLT is called a Boolean segment if it starts and ends with a tube node and contains only Boolean operator nodes. We say a tube node $X_j$ is reachable from $X_i$ by a Boolean segment if there exists a Boolean segment that starts with $X_i$ and ends with $X_j$.
Definition 5.3. If each node of a tree is either a set node that is a subset of $U$ or an operator node that belongs to $\{\wedge, \vee, U_I, F_I, G_I\}$, then the tree is called a control tree.
Each tube node $X_i$ of the tTLT $T_\varphi$ is characterized by the following two parameters:
• $t_a(X_i)$: the activation time of $X_i$,
• $t_h(X_i)$: the time horizon of $X_i$, i.e., the time at which $X_i$ is deactivated.
Denote by $T_\varphi(t_k)$ the resulting tree of $T_\varphi$ at time instant $t_k$. It is obtained by fixing the value of each tube node $X_i$ according to the activation time $t_a(X_i)$ (i.e., $T_\varphi(t_k)$ contains either set nodes or operator nodes). Let $S_i(t_k)$ be the $i$-th set node of $T_\varphi(t_k)$, where $S_i(t_k)$ corresponds to the tube node $X_i$. The relationship between $S_i(t_k)$ and $X_i$ can be described as follows: … Moreover, one has that …
At each time instant $t_k$, $T_\varphi(t_k)$ is characterized by
• $P(t_k)$: the set which collects all the set nodes of $T_\varphi(t_k)$, i.e., $P(t_k) = \cup_i S_i(t_k)$,
• $\Theta$: the set which collects all the operator nodes of $T_\varphi(t_k)$.
Given a state-time pair $(x_k, t_k)$, define $L : \mathbb{R}^n \times \mathbb{R}_{\ge 0} \to 2^{P(t_k)}$ as the labelling function, given by (9), which maps $(x_k, t_k)$ to a subset of $P(t_k)$. Moreover, define the function $B : \mathbb{R}^n \times \mathbb{R}_{\ge 0} \to 2^{P(t_k)}$, which maps $(x_k, t_k)$ to a set of valid set nodes in $P(t_k)$. The function $L(x_k, t_k)$ computes the subset of set nodes of $P(t_k)$ that contain $x_k$ at time $t_k$ (without consideration of the history trajectory), while the function $B(x_k, t_k)$ is further introduced to capture the fact that, given the history trajectory, not all set nodes in $L(x_k, t_k)$ are valid at time $t_k$. A rule for determining $B(x_k, t_k)$ given $L(x_k, t_k)$ is detailed in Algorithm 7 in the next subsection.
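As described above, the labelling function returns those set nodes of $P(t_k)$ that contain the current state. A minimal sketch follows, under the assumption that the state is scalar and each set node at time $t_k$ is an interval; the dictionary of named intervals is hypothetical, and the history-dependent filtering performed by $B$ is not modelled.

```python
from typing import Dict, Set, Tuple

def label(x_k: float, set_nodes: Dict[str, Tuple[float, float]]) -> Set[str]:
    """Return the names of the set nodes S_i(t_k) that contain x_k.

    set_nodes maps a node name to an interval (lo, hi) representing
    S_i(t_k) at the current time instant.
    """
    return {name for name, (lo, hi) in set_nodes.items() if lo <= x_k <= hi}

# Hypothetical snapshot P(t_k) of three set nodes at one time instant.
P_tk = {"S1": (0.0, 5.0), "S2": (3.0, 8.0), "S3": (6.0, 9.0)}
```

In a full implementation `P_tk` would be recomputed at every time instant from the tube nodes and their activation times, and the result of `label` would then be pruned to the valid nodes.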

Online control synthesis
In the following, we will first present the online control synthesis algorithm (and its sub-algorithms), and then an example is given to further explain how each sub-algorithm works.
Algorithm 6 (initialization, fragment): for each non-root and non-leaf tube node $X_i$, through a top-down traversal, do …; for each $X_j$ that is reachable from $X^{\phi}_{\mathrm{root}}$ by a Boolean segment (see Definition 5.2), do $t_a(X_j) \leftarrow t_0$; end for.
Algorithm 7 trackingSetNode
The online control synthesis algorithm is outlined in Algorithm 5. Before implementation, an initialization process (line 1) is required, which is outlined in Algorithm 6. Here, $t_a$ and $t_h$ are two functions that map each tube node $X_i$ to its activation time and time horizon, respectively. If $t_a(X_i)$ or $t_h(X_i)$ is unknown for $X_i$, its value is set to a dedicated symbol denoting an unknown value. Then, at each time instant $t_k$, a feasible control set $U(x_k, t_k)$ is synthesized (lines 2-11). This process contains the following steps:
1) find the subset of set nodes in $P(t_k)$ that are valid at time $t_k$, i.e., $B(x_k, t_k)$, via Algorithm 7 (line 2);
2) determine the activation time of each $X_i$ whose corresponding set node $S_i(t_k)$ belongs to $B(x_k, t_k)$ (if $X_i$ is visited for the first time, $t_a(X_i)$ is set to $t_k$; otherwise, i.e., if it has been visited before, it is unchanged) (lines 3-7);
3) calculate $T_\phi(t_{k+1})$ via Algorithm 8 (line 8);
4) build a control tree $T_u(t_k)$ (Definition 5.3) via Algorithm 9 (line 9), compress it via Algorithm 3 (line 10), and obtain the feasible control set $U(x_k, t_k)$ by backtracking the compressed control tree $T^c_u(t_k)$ via Algorithm 10 (line 11).
If the obtained feasible control set satisfies $U(x_k, t_k) = \emptyset$, the control synthesis process stops and returns NExis (lines 12-13); otherwise, the control input $\nu_k$ can be chosen as any element of $U(x_k, t_k)$ (one example is the minimum-norm element, $\nu_k = \arg\min_{\nu \in U(x_k, t_k)} \|\nu\|$) (line 15). Then, we implement the chosen $\nu_k$, measure $x_{k+1}$ (line 16), and finally compute the subset of set nodes that are possibly available at the next time instant $t_{k+1}$, i.e., $\mathrm{Post}(B(x_k, t_k))$, via Algorithm 11 (line 17).
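To make the control selection step concrete, the following minimal sketch assumes scalar dynamics $x_{k+1} = x_k + u_k + w_k$ with $|w_k| \le w_{\max}$ and interval-valued target and input sets. Under these assumptions the robust one-step feasible control set is itself an interval, and the minimum-norm element is obtained by clipping zero to that interval. The function names and the numbers are hypothetical; the paper's Algorithms 5 and 9 operate on general sets and on the full tree.

```python
from typing import Optional, Tuple, Union

Interval = Tuple[float, float]


def feasible_controls(x: float, target: Interval, u_bounds: Interval,
                      w_max: float) -> Optional[Interval]:
    """Controls u such that x + u + w lies in target for every |w| <= w_max.

    Assumes the scalar model x_next = x + u + w (illustrative only).
    Returns None when no admissible control exists.
    """
    lo = max(u_bounds[0], target[0] + w_max - x)
    hi = min(u_bounds[1], target[1] - w_max - x)
    return (lo, hi) if lo <= hi else None


def min_norm_control(u_set: Interval) -> float:
    """Element of the interval with the smallest absolute value."""
    lo, hi = u_set
    return min(max(0.0, lo), hi)


def step(x: float, target: Interval, u_bounds: Interval,
         w_max: float) -> Union[float, str]:
    """One iteration: return a control input, or "NExis" if the set is empty."""
    u_set = feasible_controls(x, target, u_bounds, w_max)
    if u_set is None:
        return "NExis"
    return min_norm_control(u_set)


# Reachable target: a control is returned.
print(step(x=0.5, target=(2.0, 3.0), u_bounds=(-1.0, 2.0), w_max=0.1))
# Target too far away for the input bounds: synthesis stops.
print(step(x=0.5, target=(5.0, 6.0), u_bounds=(-1.0, 2.0), w_max=0.1))
```

Shrinking the target by $w_{\max}$ on both sides is what makes the choice robust: any control in the returned interval keeps the successor state inside the target for every admissible disturbance.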

Algorithm 8 updatetTLT
We further detail Algorithms 6-11 in the following.
• In Algorithm 7, the remaining steps (lines 3-7) guarantee that $B(x_k, t_k)$ contains at most one set node for each complete path of $T_\phi(t_k)$.
• Algorithm 8 outlines the procedure of calculating $T_\phi(t_{k+1})$, given $T_\phi(t_k)$, $t_a$, and $B(x_k, t_k)$. It is designed based on (8).
• Algorithm 9 outlines the procedure of building a control tree $T_u(t_k)$, which is then used for control set synthesis. It is initialized as $T_\phi(t_k)$ (line 1). Then, each set node $S_i(t_k)$ that belongs to $B(x_k, t_k)$ is replaced with the corresponding feasible control set (lines 2-8); every other set node is replaced with $\emptyset$ (lines 9-11).
• Algorithm 10 is similar to Algorithm 4, which outlines the procedure of backtracking a compressed tree.
• Algorithm 11 outlines the procedure of finding the subset of set nodes that are possibly available at the next time instant $t_{k+1}$, given $B(x_k, t_k)$, $t_a$, and $T_\phi(t_{k+1})$. It is designed based on Definition 4.4, where the three cases (lines 4-8, 9-12, 13-16) correspond to items i)-iii) of Definition 4.4, respectively. It guarantees that the resulting trajectory visits each tube node of $T_\phi$ sequentially and stays in each tube node for sufficiently many time steps (as discussed for Algorithm 7).
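The control tree and its backtracking can be sketched as follows. The sketch assumes that the compressed control tree contains only Boolean operator nodes, that an $\wedge$ node combines its children by intersection and an $\vee$ node by union, and that control sets are finite sets of sampled input values. These are simplifying assumptions made for illustration; the paper's own pseudocode for Algorithms 9 and 10 is not reproduced here, and the tuple encoding and the name `backtrack` are hypothetical.

```python
from typing import FrozenSet, Tuple, Union

ControlSet = FrozenSet[float]
# A tree is either a leaf control set or a tuple (operator, left, right).
Tree = Union[ControlSet, Tuple[str, "Tree", "Tree"]]


def backtrack(node: Tree) -> ControlSet:
    """Bottom-up evaluation of a compressed control tree.

    Assumed combination rule: "and" -> intersection, "or" -> union.
    """
    if isinstance(node, frozenset):
        return node
    op, left, right = node
    left_set, right_set = backtrack(left), backtrack(right)
    if op == "and":
        return left_set & right_set
    if op == "or":
        return left_set | right_set
    raise ValueError(f"unsupported operator node: {op!r}")


# Leaves for set nodes in B(x_k, t_k) carry sampled feasible controls;
# a set node outside B(x_k, t_k) is replaced by the empty set.
tree = (
    "or",
    ("and", frozenset({-1.0, 0.0, 1.0}), frozenset({0.0, 1.0, 2.0})),
    frozenset(),
)
print(sorted(backtrack(tree)))
```

With this encoding, an empty result at the root corresponds to the case in which the synthesis stops and returns NExis.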

Algorithm 9 buildControlTree. Input: $T_\phi(t_k)$, $B(x_k, t_k)$, and $T_\phi(t_{k+1})$. Return: the control tree $T_u(t_k)$.
Next, an example is given to illustrate one iteration of the control synthesis algorithm (Algorithm 5).
Algorithm 10 Backtracking*. Input: a compressed tree $T^c_u(t_k)$. Return: the root node of $T^c_u(t_k)$. 1: for each Boolean operator node $\Theta$ of $T^c_u(t_k)$ through a bottom-up traversal, do …
Example 5.1. Consider the single-integrator control system $\dot{x} = u + w$ with a sampling period of one second. The corresponding discrete-time system is given by
The tTLT that corresponds to $\phi$ is plotted in Figure 4. Using Definitions 2.6 and 2.7, one can calculate that
The initial state is $x_0 = [0.5, 0.8]^T$, for which $x_0 \in X^{\phi}_{\mathrm{root}}(t_0)$. First, an initialization process is required, and one can get from Algorithm 6 that
Now, let us see how the feasible control set $U(x_0, t_0)$ is synthesized at time instant $t_0$.
3) Update the tTLT (thus obtaining $T_\phi(t_1)$) via Algorithm 8. The output $T_\phi(t_1)$ is given by
and the leaf nodes $S_{\mu_1}$ and $S_{\mu_3}$ are unchanged.
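For reference, the discrete-time model used in Example 5.1 can be recovered under the assumption that both the input and the disturbance are held constant over each sampling interval of length $\delta = 1$ second. This is the standard zero-order-hold derivation and may differ in notation from the equation in the original example.

```latex
x_{k+1} = x_k + \int_{t_k}^{t_k+\delta} (u_k + w_k)\,\mathrm{d}t
        = x_k + \delta\,(u_k + w_k)
        = x_k + u_k + w_k, \qquad \delta = 1 .
```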
The following theorem and corollary show the applicability and correctness of Algorithm 5.
Theorem 5.1. Consider the uncertain system (1) with initial state $x_0$ and an STL formula $\phi$ in (5). Assume that $\phi$ is robustly satisfiable for (1) and $x_0 \in X^{\phi}_{\mathrm{root}}(t_0)$. Then, by implementing the online control synthesis algorithm (Algorithm 5), one can guarantee that (i) the control set $U(x_k, t_k)$ is nonempty for all $k \in \mathbb{N}$; (ii) the resulting trajectory satisfies the formula, i.e., $\mathbf{x} \models \phi$.
Proof. The proof follows from the construction of the tTLT and Algorithms 5-11. The existence of a control input $\nu_k$ at each time step $t_k$ is guaranteed by the definition of maximal and minimal reachable sets (Definitions 2.6 and 2.7) and the construction of the tTLT (Proposition 3.1, Theorem 3.1, and Algorithm 1). Moreover, the design of Algorithms 5-11 guarantees that the resulting trajectory $\mathbf{x}$ satisfies the tTLT $T_\phi$, which implies $\mathbf{x} \models \phi$ as proven in Theorem 4.1.
Remark 5.1. The tTLT construction relies on the computation of backward reachable tubes. Over the past decade, new approaches (e.g., the decomposition-based approach (Chen et al., 2018a) and learning-based approaches (Allen et al., 2014; Bansal and Tomlin, 2021)) and software tools (e.g., the Hamilton-Jacobi Toolbox (Mitchell and Templeton, 2005) and the CORA Toolbox (Althoff, 2015)) have been developed to improve the efficiency of computing backward reachable tubes. Moreover, we remark that the reachable tubes needed for constructing the tTLT can be computed offline, which mitigates the online computational burden. On the other hand, although the exact computation of backward reachable sets/tubes is in general nontrivial for high-dimensional nonlinear systems, efficient algorithms exist for linear systems with polygonal input and disturbance sets (Kurzhanski and Pravin, 2014).
Remark 5.2. The online control synthesis algorithm (Algorithm 5) contains 7 sub-algorithms, i.e., Algorithm 3 and Algorithms 6-11. The overall computational complexity is dominated by Algorithm 9, in which one-step feasible control sets need to be computed. The computational complexity of Algorithms 3, 6, 7, 8, 10, and 11 is $O(1)$. Note that in Algorithm 8, the computation of reachable sets, which is required for the set node update, is done offline when constructing the tTLT.
Remark 5.3. Different from the mixed-integer programming formulation for STL control synthesis (Raman et al., 2014, 2015), where an entire control policy has to be synthesized at each time step, the control synthesis in our work is reactive in the sense that only the control input at the current time step is generated at each time step.

Case Studies
In this section, two examples illustrating the theoretical results are provided. We first perform a numerical simulation for car overtaking and then apply our algorithms to a car parking scenario.

Car overtaking example
We first consider a car overtaking example. This example specifies an overtaking task as an STL formula and then shows how to synthesize an overtaking controller with a safety guarantee.
As shown in Figure 9, we consider a scenario where an automated vehicle Veh 1 plans to move to a target set $S_{\mu_1}$ within 80 seconds. Since there is a broken vehicle Veh 2 in front of Veh 1 and another vehicle Veh 3 that moves in the opposite direction in the other lane, Veh 1 must overtake Veh 2 to reach $S_{\mu_1}$ and avoid Veh 3 for safety.
We describe the dynamics of the vehicle Veh 1 as in Murgovski and Sjöberg (2015):
where $\delta$ is the sampling period. The working space is
The initial state of Veh 3 is $[p_x^{\mathrm{ini}}, 2.5]^T$. Then, its position along the x-axis is $p_{x,k} = p_x^{\mathrm{ini}} + \delta (k - 1) v_x$, where $v_x$ denotes the velocity of Veh 3. To formulate the overtaking task, we define the following three sets as shown in Figure 9:
Let us choose the sampling period as $\delta = 0.2$ s. To respect the time constraint and the input constraint for Veh 1, we consider two possible solutions to the previous reachability problem: (1) fast overtaking: overtake Veh 2 before Veh 3 passes Veh 2; (2) slow overtaking: wait until Veh 3 passes Veh 2 and then overtake Veh 2.
The fast overtaking can be encoded into an STL formula:
Note that $S_{\mu_6}$ denotes the reachable set for the vehicle Veh 3 within the time interval $[0, 16]$ seconds, and 16 (which corresponds to the sampling index $k = 80$) is the maximal time instant at which the vehicle Veh 1 can reach the set $S_{\mu_5}$ in the spirit of $\phi_1$. Using Algorithm 1, one can construct the tTLT $T_{\phi_{\text{fast overtake}}}$ (see Figure 10), where
The slow overtaking can be encoded into an STL formula:
Note that $S_{\mu_7}$ denotes the reachable set for the vehicle Veh 3 within the time interval $[16, +\infty)$, and 16 (which corresponds to the sampling index $k = 80$) is the minimal time instant at which the vehicle Veh 1 can reach the set $S_{\mu_4}$ in the spirit of $\phi_2$. The tTLT $T_{\phi_{\text{slow overtake}}}$ can be constructed similarly to $T_{\phi_{\text{fast overtake}}}$.
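The timing bookkeeping in this example can be checked with a short sketch. It converts continuous times to sampling indices for $\delta = 0.2$ s and evaluates the stated position formula of Veh 3. The function names are hypothetical, and the values $p_x^{\mathrm{ini}} = 95$ and $v_x = -2$ are those reported for the fast overtaking case in this section.

```python
def time_to_index(t_seconds: float, delta: float) -> int:
    """Sampling index that corresponds to a continuous time t = k * delta."""
    return round(t_seconds / delta)


def veh3_position(k: int, p_ini: float, v_x: float, delta: float) -> float:
    """x-position of Veh 3 at index k, using p_ini + delta * (k - 1) * v_x."""
    return p_ini + delta * (k - 1) * v_x


delta = 0.2
print(time_to_index(16.0, delta))   # index of the 16-second bound
print(time_to_index(80.0, delta))   # horizon of the 80-second task
print(veh3_position(1, 95.0, -2.0, delta))
print(veh3_position(80, 95.0, -2.0, delta))
```

The second conversion gives the horizon length, in sampling instants, that the 80-second deadline imposes on the formula.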
In the following, two simulation cases are considered and the online control synthesis algorithm is implemented. In the fast overtaking, we choose the initial position $p_x^{\mathrm{ini}} = 95$ and the moving velocity $v_x = -2$ for the vehicle Veh 3 and the initial state $x_0 = [0.5, -2.5, 2]^T$ for Veh 1. One can verify that the specification $\phi_{\text{slow overtake}}$ is infeasible in this case. Figure 11(a) shows the position trajectories, from which we can see that the whole specification is fulfilled. The blue region denotes the set $S_{\mu_6}$. Figures 11(b)-(d) show the corresponding velocity and control trajectories.
In the slow overtaking, we choose the initial position $p_x^{\mathrm{ini}} = 80$ and the moving velocity $v_x = -3$ for the vehicle Veh 3 and the same initial position $x_0 = [0.5, -2.5]^T$ for Veh 1. In this case, one can verify that $\phi_{\text{fast overtake}}$ is infeasible. Figure 12(a) shows the position trajectories, from which we can see that the whole specification is fulfilled. The blue region denotes the intersection between the set $X$ and the set $S_{\mu_7}$. Figures 12(b)-(d) show the corresponding velocity and control trajectories.
Although the position trajectories in the two cases are similar, as shown in Figures 11(a) and 12(a), we highlight their difference through the evolution of the x-axis position over time in Figure 14. We use $k_1$, $k_2$, and $k_3$ in the fast overtaking, and their counterparts in the slow overtaking, to denote the minimal time instants at which Veh 1 reaches the sets $S_{\mu_4}$, $S_{\mu_5}$, and $S_{\mu_1}$, respectively. We can see that these two position trajectories satisfy the time intervals encoded in $\phi_1$ and $\phi_2$, respectively. Furthermore, in order to show the robustness, we run 100 realizations of the disturbance trajectories in the fast overtaking. The position trajectories for these 100 realizations are shown in Figure 13.
Finally, we report the computation time of this example, which was run in Matlab R2016a with the MPT toolbox (Herceg et al., 2013) on a Dell laptop with Windows 7, an Intel i7-6600U CPU at 2.80 GHz, and 16.0 GB RAM. We perform reachability analysis for constructing the tTLT offline, which takes 59.10 seconds. For online control synthesis, the minimal computation time at a single time step over 100 realizations is 0.23 seconds, while the maximal computation time is 1.07 seconds. The average computation time per time step is 0.31 seconds. We remark that the mixed-integer formulation is difficult to implement in this example. This is because the computational complexity of mixed-integer programming grows exponentially with the horizon of the STL formula, which in this example reaches up to 400 sampling instants, much longer than the horizons considered in the simulation examples of Raman et al. (2014, 2015) and Sadraddini and Belta (2015).

Car parking example
Next, we consider a car parking example. This example specifies a parking task as an STL formula and then shows how our algorithms perform on real hardware. We first perform reachability analysis for constructing the tTLT offline and then use the tTLT to synthesize a parking controller for the Small-Vehicles-for-Autonomy (SVEA) platform (Jiang et al., 2022).
As shown in Figure 15, we consider a scenario where an automated vehicle must enter the parking lot S µ1 , park in the designated parking spot S µ2 , and leave the parking lot through the exit S µ4 , where each step of the scenario has a specific deadline.Additionally, throughout the scenario, the vehicle must stay safe and avoid collisions with the parking lot walls and parked vehicles S µ3 .
We describe the underlying continuous dynamics of the automated vehicle as:
For the parking task, we set $\delta = 0.05$ s. We define the state sets in Figure 15
We let the full scenario be 60 seconds long and specify that the vehicle needs to enter the parking lot, park in the designated spot, and leave the parking lot within 10 seconds, 40 seconds, and 60 seconds, respectively. Then, this parking task can be encoded into the following STL formula:
First, we use Algorithm 1 to construct the corresponding tTLT $T_{\phi_{\text{parking}}}$ (see Figure 16), where the tube nodes $X_i$, $i = 1, \ldots, 8$, are computed in a bottom-up manner as in the previous example. Then, we implement the online control synthesis algorithm (Algorithm 5) on a SVEA vehicle using $T_{\phi_{\text{parking}}}$. For choosing a control policy within the constraints of the synthesized control sets, we apply the same approach as described in Section IV.C of Jiang et al. (2020).
For our evaluation, we initialize the SVEA vehicle with the initial state $x_0 = [1, 1.75, -\pi, 0]$. At this initial state, $\phi_3$ is robustly satisfiable. Figure 17 shows the position trajectory, where one can see that the specification is fulfilled. In Figure 18, we show the control input trajectories for acceleration and steering. We use $k_1$, $k_2$, and $k_3$ to denote the minimal time instants at which the automated vehicle reaches the sets $S_{\mu_1}$, $S_{\mu_2}$, and $S_{\mu_4}$. Using the synthesized controller, the SVEA vehicle realized $k_1 = 8.0$, $k_2 = 18.7$, and $k_3 = 48.7$, as illustrated in both Figures 17 and 18, confirming the satisfaction of $\phi_{\text{parking}}$.
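The reported reaching times can be compared against the three deadlines of the parking task with a few lines of code. The sketch below assumes that $k_1$, $k_2$, and $k_3$ are expressed in seconds, and it checks only the reach-by-deadline part of the task, not the collision-avoidance requirement or the full STL semantics. The dictionary keys and the function name are hypothetical.

```python
from typing import Dict

# Deadlines stated for the parking task (seconds).
deadlines = {"enter S_mu1": 10.0, "park in S_mu2": 40.0, "exit via S_mu4": 60.0}
# Reaching times reported for the hardware run (assumed to be in seconds).
reached = {"enter S_mu1": 8.0, "park in S_mu2": 18.7, "exit via S_mu4": 48.7}


def deadlines_met(times: Dict[str, float],
                  limits: Dict[str, float]) -> Dict[str, bool]:
    """For each sub-task, report whether it was completed by its deadline."""
    return {task: times[task] <= limits[task] for task in limits}


result = deadlines_met(reached, deadlines)
print(result)
print(all(result.values()))
```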
Finally, we report the computation time of this example, which was run in Matlab R2022b with the Level Set Method Toolbox (Mitchell and Templeton, 2005). We perform reachability analysis for constructing the tTLT offline on a Dell laptop with Ubuntu 20.04, an Intel i7-4600U CPU at 2.10 GHz, and 8.0 GB RAM, which takes 2371.81 seconds. We note that the offline computation time for constructing the tTLT can be significantly reduced by using the Python implementation (Bui et al., 2022). Throughout the parking task, we perform the online control synthesis on an NVIDIA Jetson TX2 embedded computer onboard the SVEA vehicle. The average computation time of the online control synthesis per time step is 0.001 seconds. A video demonstration of this experiment can be found at https://bit.ly/STLtTLT.

Conclusion
A novel approach for the online control synthesis of uncertain discrete-time systems under STL specifications was proposed in this paper. First, a real-time version of STL semantics and the notion of tTLT were introduced. Then, the formal semantic connection between an STL formula and its corresponding tTLT was derived, i.e., a trajectory satisfying a tTLT also satisfies the corresponding STL formula. Finally, an online control synthesis algorithm was designed for uncertain systems based on the connection between STL and tTLT. For the fragment of STL formulas under consideration, the soundness of the algorithm was proven. In the future, control synthesis for multi-agent systems under local and/or global STL specifications is of interest.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Figure 4. Example 3.1: syntax tree (left) and tTLT (right).
Backtracking. Input: a compressed tree $T^c_\phi$. Return: the root node of $T^c_\phi$. 1: for each Boolean operator node $\Theta$ of $T^c_\phi$ through a bottom-up traversal, do …
Let us continue with Example 3.1. The tTLT $T_\phi$ (right of Figure 4) contains 2 complete paths, i.e.,

Figure 5. Example 4.1: compressed tree T c ϕ , where Tϕ is plotted in

Figure 11.Trajectories for one realization of disturbance signal in the fast overtaking: (a) position trajectory; (b) velocity trajectory of x-axis; (c) control trajectory of x-axis; (d) control trajectory of y-axis.

Figure 12.Trajectories for one realization of disturbance signal in the slow overtaking: (a) position trajectory; (b) velocity trajectory of x-axis; (c) control trajectory of x-axis; (d) control trajectory of y-axis.
Figure 11(b) shows the velocity trajectory of $v_x$ and Figures 11(c)-(d) show the corresponding control inputs, where the dashed lines denote the control bounds. The cyan regions represent the synthesized control sets and the lines are the control trajectories. In the slow overtaking, we choose the initial position $p_x^{\mathrm{ini}} = 80$ and the moving velocity $v_x = -3$ for the vehicle Veh 3.

Figure 13.State trajectories for 100 realizations of disturbance signals in the fast overtaking.
Figure 12(b) shows the velocity trajectory of $v_x$ and Figures 12(c)-(d) show the corresponding control input trajectories of $a_x$ and $v_y$.

Figure 15.Scenario illustration: an automated vehicle needs to enter into the parking lot, park in the designated parking spot (blue), and leave the parking lot, while avoiding any collisions.

Figure 17.The position trajectory of a SVEA vehicle performing the parking task ϕ parking .

Figure 18.The velocity and heading trajectories in response to the acceleration and steering inputs throughout the parking task ϕ parking .

Here, φ 1 , φ 2 are formulas of class φ and ϕ 1 , ϕ 2 are formulas of class ϕ given in (5).
• Algorithm 6 calculates the functions $t_a$ and $t_h$ (lines 1-7) and $\mathrm{Post}(B(x_{-1}, t_{-1}))$ (lines 8-12).
• Algorithm 7 outlines the procedure of finding the subset of set nodes in $P(t_k)$ that are valid at time $t_k$, i.e., $B(x_k, t_k)$. This is the most important step of the control synthesis, and it relates to Algorithm 11 postSet. First, one needs to compute the subset of set nodes of $P(t_k)$ that contain $x_k$ at time $t_k$, i.e., $L(x_k, t_k)$ (line 1). Then, one has from Definition 4.4 that if a trajectory $\mathbf{x}$ satisfies one complete path of the tTLT, it must i) visit each tube node of the complete path sequentially and ii) stay in each tube node for sufficiently many time steps (Remark 4.1). Based on these two requirements, Algorithm 11 is designed to predict the subset of set nodes that are possibly available at the next time instant, i.e., $\mathrm{Post}(B(x_k, t_k))$.