Abstract: In this article, we study a pursuit–evasion game between two players with heterogeneous kinematics, where the pursuer has damped double-integrator dynamics and the evader has single-integrator dynamics. The pursuer aims to capture the evader as soon as possible, while the evader tries to avoid or delay capture. Traditional methods for solving pursuit–evasion games rely on the Hamilton–Jacobi–Isaacs (HJI) equations and retrogressive path equations, which are complicated and nonintuitive and therefore fail to yield a complete solution. To overcome these challenges, we develop an intuitive isochron-based method to thoroughly analyze all possible situations of the game and a concise geometric approach to calculate the optimal strategies, providing a complete solution to this game. Specifically, the isochron-based method leverages three main factors: the players' motion capability, the pursuer's capture capability, and the players' states. Based on these, we analyze the players' superiority and the geometric features of their isochrones and their intersections, and thereby obtain concise conditions that determine the game's outcome. For the success-capture cases, we propose a new geometric approach to calculate the target points of the players and then obtain the closed-loop state-feedback optimal pursuit and evasion strategies. We then derive the corresponding value function and validate it using the HJI equation. For the success-evasion cases, we exploit the intersection of the players' isochrones to design effective evasion strategies that ensure the evader can always avoid or delay capture. Finally, numerical simulations validate the effectiveness and applicability of our results.
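To make the notion of an isochron concrete, the following minimal sketch computes the set of positions each player can reach at time t under assumed models that the abstract does not spell out: an evader with bounded speed, and a pursuer obeying x'' = -damping * x' + u with a bounded control magnitude. Under these assumptions both sets are discs. All function names, parameters, and the containment test are illustrative and hypothetical; they are not the article's conditions or strategies.

```python
import math


def evader_isochron(pos, speed, t):
    """Assumed single integrator with speed bound `speed`:
    reachable positions at time t form a disc of radius speed * t."""
    return pos, speed * t


def pursuer_isochron(pos, vel, damping, u_max, t):
    """Assumed damped double integrator x'' = -damping * x' + u, |u| <= u_max.

    The free (u = 0) motion shifts the disc center along the initial velocity;
    the bounded control determines the disc radius.
    """
    decay = (1.0 - math.exp(-damping * t)) / damping
    center = (pos[0] + vel[0] * decay, pos[1] + vel[1] * decay)
    radius = (u_max / damping) * (t - decay)
    return center, radius


def evader_disc_covered(p_center, p_radius, e_center, e_radius, capture_radius=0.0):
    """Illustrative containment test (hypothetical, not the article's condition):
    True if the evader's disc lies inside the pursuer's disc inflated by the
    capture radius."""
    gap = math.dist(p_center, e_center)
    return gap + e_radius <= p_radius + capture_radius


if __name__ == "__main__":
    t = 3.0
    p_c, p_r = pursuer_isochron((0.0, 0.0), (1.0, 0.0), damping=1.0, u_max=2.0, t=t)
    e_c, e_r = evader_isochron((2.0, 0.0), speed=0.5, t=t)
    print(p_c, p_r, e_c, e_r, evader_disc_covered(p_c, p_r, e_c, e_r))
```

Sweeping t and comparing the two discs gives a rough picture of how the players' reachable sets evolve relative to each other, which is the kind of geometric comparison the abstract refers to.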
Abstract: In multiplayer pursuit–evasion games, it is crucial to address three practical aspects: the players' heterogeneity, distributed control, and the pursuers' goal of minimum makespan. However, these three topics have received limited attention in the existing literature, both separately and in combination. In this article, we address a multiplayer pursuit–evasion game that integrates all three, where pursuers with simple motion strive to capture as many evaders as possible, the evaders being modeled as damped double integrators, while minimizing the task makespan. To this end, we establish an overall framework and sequentially tackle four key issues. First, we propose a novel isochron-based method to derive the capture condition and obtain the optimal pursuit and evasion strategies. Second, we enhance the usability of the capture condition by geometrically deriving the analytical form of the pursuer's winning region with respect to a given evader. Third, leveraging the concept of winning regions, we determine a lower bound on the pursuers' sensing range that ensures sufficient information acquisition for evader allocation while avoiding selection conflicts. Finally, we propose a fully distributed allocation algorithm for each pursuer that allows the pursuer team to converge to the optimal evader allocation. Together, these contributions provide an effective solution to the entire problem. Various simulations show the effectiveness of the proposed methods.
Abstract: This article studies the data-driven reconstruction of firing rate dynamics of brain activity described by linear-threshold network models. Identifying the system parameters directly leads to a large number of variables and a highly nonconvex objective function. Instead, our approach introduces a novel reformulation that incorporates biological organizational features and turns the identification problem into a scalar variable optimization of a discontinuous, nonconvex objective function. We prove that the minimizer of the objective function is unique and establish that the solution of the optimization problem leads to the identification of all the desired system parameters. These results are the basis to introduce an algorithm to find the optimizer by searching the different regions corresponding to the domain of definition of the objective function. To deal with measurement noise in sampled data, we propose a modification of the original algorithm whose identification error is linearly bounded by the magnitude of the measurement noise. We demonstrate the effectiveness of the proposed algorithms through simulations on synthetic and experimental data.
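For readers unfamiliar with linear-threshold networks, the sketch below simulates one commonly used form of the model, tau * x' = -x + max(0, W x + u), with a forward-Euler step. The exact model, the parameterization, and the identification algorithm used in the article are not given in the abstract, so the function name `simulate_ltn`, its parameters, and the chosen form are assumptions for illustration only. A simulator like this could generate the kind of synthetic sampled data that an identification method would be tested on.

```python
def relu(z):
    """Threshold nonlinearity: passes positive input, clips negative input to 0."""
    return z if z > 0.0 else 0.0


def simulate_ltn(W, u, x0, tau=1.0, dt=0.01, steps=1000):
    """Forward-Euler simulation of the assumed model
    tau * x' = -x + max(0, W x + u) with constant input u.

    W is a list of rows (synaptic weights), u and x0 are lists of floats.
    Returns the state after `steps` Euler steps.
    """
    x = list(x0)
    n = len(x)
    for _ in range(steps):
        # Net input to each node: weighted firing rates plus external drive.
        drive = [sum(W[i][j] * x[j] for j in range(n)) + u[i] for i in range(n)]
        x = [x[i] + (dt / tau) * (-x[i] + relu(drive[i])) for i in range(n)]
    return x


if __name__ == "__main__":
    # Hypothetical two-node network: node 0 excites node 1, node 1 inhibits node 0.
    W = [[0.0, -0.5], [0.8, 0.0]]
    print(simulate_ltn(W, u=[1.0, 0.0], x0=[0.0, 0.0], steps=5000))
```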
Abstract: State estimation of robotic systems is essential to implementing feedback controllers, which usually provide better robustness to modeling uncertainties than open-loop controllers. However, state estimation of soft robots is very challenging because soft robots have theoretically infinite degrees of freedom while existing sensors only provide a limited number of discrete measurements. This work focuses on soft robotic manipulators, also known as continuum robots. We design an observer algorithm based on the well-known Cosserat rod theory, which models continuum robots by nonlinear partial differential equations (PDEs) evolving in geometric Lie groups. The observer can estimate all infinite-dimensional continuum robot states, including poses, strains, and velocities, by sensing only the tip velocity of the continuum robot, and hence it is called a “boundary” observer. More importantly, the estimation error dynamics is formally proven to be locally input-to-state stable. The key idea is to inject sequential tip velocity measurements into the observer in a way that dissipates the energy of the estimation errors through the boundary. The distinct advantage of this PDE-based design is that it can be implemented using any existing numerical implementation for Cosserat rod models. All theoretical convergence guarantees are preserved, regardless of the discretization method. We call this property “one design for any discretization.” Extensive numerical studies are included and suggest that the domain of attraction is large and that the observer is robust to uncertainties in tip velocity measurements and model parameters.
Abstract: In this article, we address the problem of learning optimal control policies for systems with uncertain dynamics and high-level control objectives specified as linear temporal logic (LTL) formulas. Uncertainty is considered in the workspace structure and in the outcomes of control decisions, giving rise to an unknown Markov decision process (MDP). Existing reinforcement learning (RL) algorithms for LTL tasks typically rely on exploring a product MDP state space uniformly (e.g., using an $\epsilon$-greedy policy), which compromises sample efficiency. This issue becomes more pronounced as the rewards get sparser and as the MDP size or the task complexity increases. We propose an accelerated RL algorithm that can learn control policies significantly faster than competitive approaches. Its sample efficiency relies on a novel task-driven exploration strategy that biases exploration toward directions that may contribute to task satisfaction. We provide theoretical analysis and extensive comparative experiments demonstrating the sample efficiency of the proposed method. The benefit of our method becomes more evident as the task complexity or the MDP size increases.
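The contrast between uniform and biased exploration can be sketched as follows. The first function is standard $\epsilon$-greedy action selection. The second is a hypothetical variant that, when exploring, picks a designated "hint" action with probability `bias` instead of sampling uniformly. The names `biased_epsilon_greedy`, `bias`, and `hint_action` are assumptions made for illustration; the abstract does not describe how the article's task-driven strategy chooses its exploration direction, so this is not the authors' algorithm.

```python
import random


def greedy(q_values):
    """Index of the action with the largest estimated value."""
    return max(range(len(q_values)), key=q_values.__getitem__)


def epsilon_greedy(q_values, epsilon, rng):
    """Standard epsilon-greedy: explore uniformly with probability epsilon."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return greedy(q_values)


def biased_epsilon_greedy(q_values, epsilon, bias, hint_action, rng):
    """Hypothetical biased variant: when exploring, take `hint_action`
    (an action assumed to make progress toward the task) with probability
    `bias`, and otherwise fall back to a uniform random action."""
    if rng.random() < epsilon:
        if rng.random() < bias:
            return hint_action
        return rng.randrange(len(q_values))
    return greedy(q_values)


if __name__ == "__main__":
    rng = random.Random(0)
    q = [0.1, 0.4, 0.2, 0.0]
    picks = [biased_epsilon_greedy(q, 0.3, 0.8, 3, rng) for _ in range(1000)]
    # Count how often each action was selected under the biased scheme.
    print([picks.count(a) for a in range(len(q))])
```

Keeping a nonzero share of uniform exploration (bias < 1) means every action retains a positive selection probability, which is the usual requirement for tabular RL convergence arguments.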
Abstract: In the conventional framework for distributed fault prognosis of discrete-event systems (DESs), it is assumed that observable events are always observed [this case is called static event observations (SEOs)]. However, this assumption may not hold in many DESs, such as sensor networks. This article introduces the concept of distributed fault prognosis with dynamic event observations (DEOs), in which observable events are not always observed. Communication models and extended models are constructed, and, based on them, an extended dynamic observation mask with two forms is constructed for each local prognoser to capture its aggregate information. To verify prognosability subject to DEOs, an algorithm is presented whose complexity is polynomial in the number of states but exponential in the number of local prognosers. Furthermore, a significant condition for prognosability subject to DEOs is derived. Finally, the obtained results are applied to an Alipay online trading system and an Industry 4.0 manufacturing system.
Abstract: Security of system behavior is a kind of information flow security, which is achieved by confusing intruders through the indistinguishability of system behaviors. Noninterference is a typical notion for describing information flow security, and multilevel intransitive noninterference (MINI) is an advanced variant of it. Since a rigorous approach to assessing MINI is lacking, this article develops one via observability theory. For systems modeled by labeled Petri nets (LPNs), two MINI properties, i.e., positive MINI (PMINI) and bipolar MINI (BMINI), are considered. First, a necessary and sufficient condition for their assessment is established via language equivalence. The language equivalence analyses for PMINI and BMINI are based on the existing trace equivalence and the proposed INI bisimulation, respectively. INI bisimulation describes negative noninterference more comprehensively than bisimulation. Second, another necessary and sufficient condition is established after transforming the MINI assessment problem into a nonblocking analysis problem. The core of this transformation is the stepwise construction of a nonblocking analyzer. The stepwise construction allows MINI assessment to proceed online and terminate at an appropriate time. In addition, it fully exploits the concurrency of LPNs so that MINI can be assessed in a multithreaded way. Both online and multithreaded MINI assessments can improve assessment efficiency.