Multiautonomous underwater vehicle consistent collaborative hunting method based on generative adversarial network

Time-varying ocean currents and the delay of underwater acoustic communication cause uncertainty when a single autonomous underwater vehicle (AUV) tracks a target and inconsistency in multi-AUV coordination, which makes it difficult for multiple AUVs to form a hunting alliance. To solve these problems, this article proposes a multi-AUV consistent collaborative hunting method based on a generative adversarial network (GAN). Firstly, the three-dimensional (3D) kinematic model of the AUV is established for the underwater 3D environment. Secondly, combined with the Laplacian matrix, the topology of the hunting alliance in the ideal environment is established, and the control rate of each AUV is calculated. Finally, using the GAN model, the control relationship after environmental interference is used as the input of the generative model, and the control rate in the ideal environment is used as the comparison object of the discriminative model. Iterative training of the GAN generates a control rate that adapts to the current interference environment, and this control rate is combined with the multi-AUV topological hunting model to achieve successful hunting of a noncooperative target. The experimental results show that the algorithm reduces the average hunting time to 62.53 s and increases the hunting success rate to 84.69%, which is 1.17% higher than that of the particle swarm optimization-constant modulus algorithm (PSO-CMA).


Introduction
An autonomous underwater vehicle (AUV) is an intelligent robot that can complete underwater work without an operator. 1 At present, AUVs have been studied by many scientists and applied to a variety of tasks. However, as the complexity of the work keeps increasing, a single AUV can no longer complete it independently, and multiple AUVs need to work together to complete the task.
In the field of multi-AUV hunting, scholars have proposed many new collaborative algorithms and hunting algorithms. Some of these algorithms use different optimization methods to reduce the amount of calculation in the hunting process, thereby reducing the hunting time. Some scholars have improved the hunting success rate by improving the control precision of the hunting AUVs. In the actual underwater environment, however, time-varying ocean currents and communication delays can have a great impact on the hunting system. How to achieve successful hunting of noncooperative targets in a complex underwater environment is therefore a major test for these algorithms.
At present, the generative adversarial network (GAN) has become one of the research hotspots and is mainly used in image fields, such as target recognition, image generation, and data enhancement. The combination of the generative model and the discriminative model can generate ideal target information from noise. In this article, GAN is applied to the multi-AUV hunting field. The influences of time-varying ocean currents and communication delays on the hunting system are treated as random noise. Using the combination of the generative model and the discriminative model, the algorithm can generate a control rate for the hunting AUV that is adapted to the complex interference environment. This reduces the effects of time-varying ocean currents and communication delays on multi-AUV hunting and improves the success rate of the algorithm.
In summary, scholars have made effective improvements in multi-AUV hunting. However, in real-world underwater environments, time-varying ocean currents and hydroacoustic communication delays cause significant interference with the multi-AUV hunting system. In this article, the GAN model is introduced into the multi-AUV hunting field and used to train a control rate that can adapt to the underwater interference environment and improve the hunting success rate of the multi-AUV hunting system, as shown in Figure 1.

Related work
In terms of multi-AUV control, many scholars have conducted extensive research in two-dimensional 2,3 and three-dimensional (3D) [4][5][6] environments. Aiming at the path tracking control problem of AUV, a new Lyapunov-based model predictive control method is proposed by Shen et al. 7 to improve the performance of noncooperative target tracking control and the robustness of tracking control. Some researchers 8,9 proposed a self-optimizing control method based on tracking differentiator and active disturbance rejection control (ADRC) theory to achieve target location and tracking. In terms of depth control of AUV, the literature 10 uses the internal shift mechanism to change the center of gravity of the AUV and proposes an adaptive learning control method based on distribution and deterministic learning to accurately and effectively control the depth of the AUV.
In the aspect of multi-AUV hunting, Cai et al. 11,12 used multi-AUV collaborative methods to identify and hunt noncooperative targets, which improved the robustness of the hunting process. Ni et al. 13 proposed a new method based on the spinal nerve system for the unknown 3D underwater environment, which can keep multi-AUV stable in formation without obstacle collision. For cooperative hunting of multi-AUV system, not only basic problems, such as path planning and collision avoidance, should be considered but also task assignments in a dynamic way. Cao et al. 14 proposed an integrated algorithm combining self-organizing map neural network and Glash's biological heuristic neural network method to improve the efficiency of multi-AUV collaborative hunting.
Multi-AUV systems are subject to communication delays in underwater environments. Some researchers [15][16][17] proposed a time compensation method or an enhanced time delay controller for the communication delay problem, which reduces the influence of underwater acoustic communication delay on the hunting alliance. Other studies 18,19 use an optimized communication topology to coordinate the multi-AUV hunting alliance, which allows hunting tasks to be carried out in the communication delay environment and improves the reliability of the algorithm. However, time-varying ocean currents also act during the hunting process. They bring great difficulties to the control of multi-AUV and need to be considered. Some researchers 20,21 used the topology information of sensor networks to reduce the impact of errors and make the system more stable.
At present, the GAN algorithm is mainly applied to image recognition, such as video recognition 22,23 and image translation. 24,25 Some scholars have applied the GAN model to other fields. Ren and Xu 26 proposed a fully data-driven approach for phasor measurement unit (PMU)-based prefault dynamic security assessment with incomplete data measurements, and it can reduce the impact of data loss on fault assessment. For the production planning and control problems of aircraft remanufacturing systems, Zheng et al. 27 proposed an adaptive replanning strategy with triggers and replanning procedures to solve the problem of different types of data imbalance in the reengineering system. Tang et al. 28 proposed a feature combination method based on prior knowledge, which is used to maximize a series of risk returns in the securities market. Zhang et al. 29 used the GAN method in the field of image recognition transportation, in which the proposed method has better versatility and flexibility in image recognition transportation.
The rest of this article is organized as follows. The second section provides an overview of the relevant literature. The third section describes the 3D kinematics equation of AUV, the multi-AUV topology hunting model based on noncooperative target escape point, and the multi-AUV consistency cooperative hunting method based on GAN. The fourth section analyzes the simulation experiment. Finally, the fifth section summarizes the article.

Autonomous underwater vehicle kinematics equation in three-dimensional space
To effectively control the hunting alliance so that it hunts noncooperative targets faster and more accurately, the AUV kinematics equation in 3D space is established. A fixed point on the sea level is taken as the origin $O$ of the inertial coordinate system. The $OX$ axis and the $OY$ axis are perpendicular to each other in the horizontal plane, and the $OZ$ axis is perpendicular to the $XOY$ plane and points toward the center of the Earth. The center of gravity $E$ of the AUV is taken as the origin of the carrier coordinate system: the $EX_0$ axis is the AUV forward direction, $EY_0$ is the traverse direction, and $EZ_0$ is the diving direction, as shown in Figure 2.
In Figure 2, $\varphi$, $\theta$, and $\psi$ correspond to the heeling angle, the trim angle, and the slant angle, respectively, in the inertial coordinate system of the AUV (the counterclockwise direction is positive). $u$, $v$, and $w$ are the three linear velocity components of the AUV in the carrier coordinate system, and $p$, $q$, and $r$ are the three angular velocity components of the AUV in the carrier coordinate system. In the inertial coordinate system, the state of the AUV is represented with six degrees of freedom by the vector $\eta = [x\ y\ z\ \varphi\ \theta\ \psi]^T$, where $x$, $y$, and $z$ are the positions in the inertial coordinate system, and the motion state is $V = [u\ v\ w\ p\ r\ q]^T$. The AUV cannot perform side shifting and roll under normal conditions. Letting $\varphi = 0$ and $p = v = 0$, the attitude in the inertial coordinate system becomes $\eta = [x\ y\ z\ \theta\ \psi]^T$, and the motion state becomes $V = [u\ w\ r\ q]^T$.

Topology construction of the hunting alliance
Multi-AUV hunts noncooperative targets in a 3D environment, as shown in Figure 3. Here, $AUV_1, AUV_2, \cdots, AUV_n$ are the hunting points of the AUV hunting alliance, $T$ is a noncooperative target, and the hunting points are evenly distributed around the target. It is assumed that the noncooperative target has the same dynamic model as the hunting alliance and that the communication topology of the alliance is fully connected, as shown in Figure 4. In a hunting system consisting of $n$ AUVs, the communication topology is described by the Laplacian matrix $L$. The kinetic equation of $AUV_i$ can be expressed as a nonlinear system in which $x_i \in R^n$ represents the system state, $u_i$ represents the system control rate, and $\nu_i$ is the system output; $f(\cdot)$, $g(\cdot)$, and $k(\cdot)$ are system functions with corresponding dimensions. Combined with the above communication topology relationship, the control rate $u_i(t)$ of $AUV_i$ is produced by a controller $\delta_i(\cdot)$ acting on $X(t)$ and $L(t)$, where $X(t) = [x_1^T(t)\ x_2^T(t)\ \cdots\ x_n^T(t)]^T$ represents the state of each individual of the multi-AUV system at time $t$ and $L(t)$ is the Laplacian matrix at time $t$. The specific process is shown in Figure 5.
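The Laplacian matrix of a fully connected communication topology can be illustrated with a short sketch. It assumes unit edge weights (every pair of AUVs communicates with weight 1), which the article does not state explicitly, and the function name `fully_connected_laplacian` is hypothetical.

```python
import numpy as np


def fully_connected_laplacian(n):
    """Laplacian L = D - A of an unweighted, fully connected graph with n nodes.

    Assumption: unit weights a_ij = 1 for i != j and a_ii = 0.
    """
    adjacency = np.ones((n, n)) - np.eye(n)      # a_ij = 1 between distinct AUVs
    degree = np.diag(adjacency.sum(axis=1))      # each AUV has n - 1 neighbors
    return degree - adjacency


# Example: a hunting alliance of 6 AUVs.
L = fully_connected_laplacian(6)
eigenvalues = np.linalg.eigvalsh(L)              # L is symmetric, so eigvalsh applies
```

Under this assumption every row of `L` sums to zero, the diagonal entries equal n - 1, and the off-diagonal entries equal -1. The single zero eigenvalue reflects that the topology is connected.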

Multiautonomous underwater vehicle consistent collaborative hunting control based on generative adversarial network
When the multi-AUV hunting alliance hunts a mobile noncooperative target, the ideal hunting point changes continually because the target tries to escape. Through the detection of the noncooperative target, a hunting model is established to determine the ideal hunting position of each hunting AUV. As shown in Figure 6, when the noncooperative target escapes from $T$ to $T'$, the ideal hunting points move with it. The noncooperative target has the escaping attribute. When the escape direction points to the center of the gap next to $AUV_i$, the escape probability is the largest, and this point is the escape point $x_d$. The distance from the noncooperative target to the point $x_d$ is $D$, and the distance from $AUV_i$ to the point $x_d$ is $d$. Only when $D > d$ is maintained can $AUV_i$ reach the escape point before the target, so that the noncooperative target is unable to escape the hunting alliance, as shown in Figure 7.
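The comparison between $D$ and $d$ can be checked numerically. The sketch below is a planar simplification under stated assumptions: the hunting points lie evenly on a circle around the target, and the escape point $x_d$ is taken as the midpoint between two adjacent AUVs. The function names `hunting_points` and `distance_condition_holds` are hypothetical.

```python
import numpy as np


def hunting_points(target, radius, n):
    """Place n hunting points evenly on a circle around the target (2D assumption)."""
    angles = 2.0 * np.pi * np.arange(n) / n
    return target + radius * np.column_stack((np.cos(angles), np.sin(angles)))


def distance_condition_holds(target, auvs):
    """Return True if D > d at every gap center between adjacent AUVs.

    D: distance from the target to the escape point x_d.
    d: distance from AUV_i to the same escape point.
    """
    next_auvs = np.roll(auvs, -1, axis=0)            # neighbor of each AUV on the ring
    escape_points = 0.5 * (auvs + next_auvs)         # assumed x_d: center of each gap
    D = np.linalg.norm(escape_points - target, axis=1)
    d = np.linalg.norm(escape_points - auvs, axis=1)
    return bool(np.all(D > d))


target = np.array([0.0, 0.0])
six = hunting_points(target, radius=10.0, n=6)
three = hunting_points(target, radius=10.0, n=3)
```

In this simplified geometry, $D = R\cos(\pi/n)$ and $d = R\sin(\pi/n)$ for a ring of radius $R$, so the condition holds for six evenly spaced AUVs and fails for three.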
Let $\eta(t)$ be the state of the noncooperative target at time $t$. The multi-AUV hunting alliance topology optimization model is defined in terms of $x_i(t)$, the actual state of $AUV_i$ at time $t$, and $x_j(t)$, the actual state of $AUV_j$ at time $t$. However, in the actual underwater environment, multi-AUV systems are subject to time-varying ocean currents, and the communication delay between AUVs is unpredictable; both have a great impact on the cooperation of multi-AUV. This article therefore trains and generates a cooperative hunting control strategy based on the GAN model.
According to the hunting alliance topology, a multi-AUV collaborative hunting strategy in an ideal environment can be generated. In the actual environment, however, interference leads to unpredictable deviation of the hunting AUVs from this strategy. Using the GAN model to train on the above data can generate a more ideal control rate $U_i(t)$, reduce the impact of the environment on the multi-AUV collaborative hunting strategy, and improve the robustness of the algorithm. The generative adversarial network is mainly composed of the generator $G$ and the discriminator $D$. The purpose of the discriminator $D$ is to maximize the distinction between real data and counterfeit data generated by the generator; this objective is given as equation (6). In equation (6), $x$ is the actual state of $AUV_i$ after the influence of time-varying ocean currents and communication delays. $P_{data}$ represents real data, which is represented in this article as the multi-AUV control rate in an ideal environment. $P_G$ represents the data generated by the generator. When the generator $G$ is fixed, the generated data can be approximated to the true data to the greatest extent. The goal of the generator is to generate a multi-AUV collaboration strategy that is closer to the ideal environment, so that the discriminator cannot identify the generated data as false. By optimizing the control rate in this way, a new control rate better suited to the interference environment is generated, which reduces the influence of time-varying ocean currents and communication delays on multi-AUV cooperative hunting strategies and improves the success rate of hunting.
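For reference, the discriminator and generator goals described above have the same structure as the standard GAN minimax objective. The block below writes that standard form with the symbols $P_{data}$, $P_G$, $D$, and $G$ used in this section. It is a sketch that assumes the article follows the standard formulation; the value function name $V(D, G)$ is introduced here for illustration only.

```latex
\begin{align}
\max_{D} V(D, G) &= \mathbb{E}_{x \sim P_{data}}\left[\log D(x)\right]
                  + \mathbb{E}_{x \sim P_{G}}\left[\log\left(1 - D(x)\right)\right] \\
G^{*} &= \arg\min_{G} \max_{D} V(D, G)
\end{align}
```

The first line rewards the discriminator for assigning high probability to ideal-environment control rates and low probability to generated ones. The second line states that the generator seeks the control rates for which even the best discriminator separates the two distributions least.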
The training of GAN is divided into two parts: discriminator training and generator training. During the discriminator training process, the label of the real data is set to 1 and the label of the data generated by the generator is set to 0. Then, the generated data are sent to the discriminator together with the real data, and the discriminator is trained to recognize which is the real data and which is the generated data. In the update process for the discriminator, $L$ is the loss function, $\theta_d$ is the discriminator parameter, $\gamma$ is the update step size, and $m$ is the size of the batch data. After determining the discriminator parameters, the parameters of the generator are updated. The multi-AUV control rate in the time-varying ocean current and communication delay interference environment is input to the generator, and the label of the control rate newly generated by the generator is set to 1. It is then sent to the discriminator for discrimination, and the error is fed back to the generator to update its parameters. The trained generator can generate a control rate $U_i(t)$ that resists time-varying ocean currents and communication delay interference. The discriminator and the generator are iteratively trained until the discriminator cannot distinguish between the generated data and the real data. Then, the control rate $U_i(t)$ can more accurately control the multi-AUV system in the complex underwater environment.
Figure 5. Hunting alliance topology optimization process.
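The alternating training described above can be sketched on a toy scalar problem. Everything here is an illustrative assumption rather than the article's network: the generator is a linear map, the discriminator is a logistic model, the loss is binary cross-entropy, and the samples are synthetic stand-ins for control rates. The names `generate`, `discriminate`, `discriminator_step`, and `generator_step` are hypothetical.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def generate(theta_g, x):
    """Toy linear generator: maps a disturbed control value to a corrected one."""
    a, c = theta_g
    return a * x + c


def discriminate(theta_d, u):
    """Toy logistic discriminator: probability that u is an ideal control value."""
    w, b = theta_d
    return sigmoid(w * u + b)


def d_loss(theta_d, real, fake):
    """Cross-entropy with real data labeled 1 and generated data labeled 0."""
    eps = 1e-12
    return -(np.mean(np.log(discriminate(theta_d, real) + eps))
             + np.mean(np.log(1.0 - discriminate(theta_d, fake) + eps)))


def g_loss(theta_g, theta_d, x):
    """Cross-entropy with generated data labeled 1 (generator's view)."""
    eps = 1e-12
    return -np.mean(np.log(discriminate(theta_d, generate(theta_g, x)) + eps))


def discriminator_step(theta_d, real, fake, gamma):
    """One gradient-descent step on d_loss with step size gamma."""
    w, b = theta_d
    err_real = discriminate(theta_d, real) - 1.0     # dLoss/dlogit for label 1
    err_fake = discriminate(theta_d, fake)           # dLoss/dlogit for label 0
    grad_w = np.mean(err_real * real) + np.mean(err_fake * fake)
    grad_b = np.mean(err_real) + np.mean(err_fake)
    return np.array([w - gamma * grad_w, b - gamma * grad_b])


def generator_step(theta_g, theta_d, x, gamma):
    """One gradient-descent step on g_loss with the discriminator held fixed."""
    a, c = theta_g
    w, _ = theta_d
    err = discriminate(theta_d, generate(theta_g, x)) - 1.0
    return np.array([a - gamma * np.mean(err * w * x),
                     c - gamma * np.mean(err * w)])


rng = np.random.default_rng(0)
m, gamma = 64, 0.1                                   # batch size and step size
ideal = rng.normal(1.0, 0.1, m)                      # stand-in: ideal control rates
disturbed = ideal + rng.normal(0.0, 0.5, m)          # stand-in: disturbed control rates
theta_d, theta_g = np.array([0.1, 0.0]), np.array([1.0, 0.0])
for _ in range(50):                                  # alternate the two updates
    theta_d = discriminator_step(theta_d, ideal, generate(theta_g, disturbed), gamma)
    theta_g = generator_step(theta_g, theta_d, disturbed, gamma)
```

The discriminator is updated first with the generator's output treated as fixed data, and the generator is then updated against the new discriminator, which mirrors the order given in the text.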
To achieve consistent synergy of multi-AUV hunting systems, a coordinated controller based on discrete control is designed so that each $AUV_i$ in the multi-AUV hunting alliance satisfies a consistency relationship in which $K$ is the controller gain, $x_j(t_k)$ represents the system state of $AUV_j$ at time $t_k$, $x_i(t_k)$ represents the system state of $AUV_i$ at time $t_k$, and $a_{ij}$ is an element of the adjacency matrix. By introducing the GAN model, the multi-AUV collaborative hunting strategy under time-varying ocean currents and communication delay interference is generated. It can effectively improve the success rate of hunting for noncooperative targets and make the algorithm more robust. The specific algorithm flow is provided in Table 1, and the schematic diagram is shown in Figure 8.
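A common discrete consensus law built from the same symbols ($K$, $a_{ij}$, $x_i(t_k)$, $x_j(t_k)$) is sketched below. Whether it coincides with the relationship used in the article is an assumption; the update rule, the function name `consensus_step`, and the numerical values are illustrative only.

```python
import numpy as np


def consensus_step(x, adjacency, K, dt):
    """One update x_i <- x_i + dt * K * sum_j a_ij * (x_j - x_i).

    x has shape (n, dim): one state row per AUV.
    """
    neighbor_sum = adjacency @ x                                 # sum_j a_ij x_j
    weighted_self = adjacency.sum(axis=1, keepdims=True) * x     # sum_j a_ij x_i
    return x + dt * K * (neighbor_sum - weighted_self)


n = 6
adjacency = np.ones((n, n)) - np.eye(n)       # fully connected, unit weights (assumed)
rng = np.random.default_rng(0)
x0 = rng.uniform(-5.0, 5.0, size=(n, 3))      # arbitrary initial 3D states
x = x0.copy()
for _ in range(50):
    x = consensus_step(x, adjacency, K=1.0, dt=0.1)
```

With a symmetric adjacency matrix the update preserves the mean state, and for this gain and step size the disagreement between AUVs shrinks at every step, so all rows of `x` approach the mean of `x0`.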

Simulation
The simulation runs on a small server with an E5-2630 v4 CPU (main frequency 2.2 GHz) and 32 GB of memory. The algorithm in this article is simulated in MATLAB R2016a under Windows 10. Let the AUV have a depth of 10 m and a speed of $u = 1.5$ m/s. The effect of time-varying ocean currents and communication delays on the AUV obeys a normal distribution with a mean of 0 and a standard deviation of 0.5. The initial state of the AUV is set to $[0\ 0\ 10\ 0]$, and the sampling period is 0.1 s. The GAN model is trained as shown in Figure 9. The red curve represents the real target information, and the blue curve represents the hunting AUV control information generated by the generator. In Figure 9(a), the blue lines are constantly approaching the red lines. Figure 9(b) shows the algorithm after 30 iterations of learning. Most of the blue lines are closer to the red line, indicating that the training results are very close to the true value. Figure 9(c) shows the optimal control strategy output by the system. During the training of the generator, the data error is shown in Figure 10. The final output error is controlled within 3%.
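The simulation settings above (depth 10 m, $u = 1.5$ m/s, 0.1 s sampling, zero-mean Gaussian disturbance with standard deviation 0.5) can be reproduced in a minimal sketch. It uses the standard reduced kinematic model for zero roll and zero sway, which may differ from the article's own equation, and a five-element state $[x, y, z, \theta, \psi]$. How the disturbance enters the model is not specified in the text, so adding it to the position is an assumption, and all function names are hypothetical.

```python
import numpy as np


def kinematics(state, u, w, q, r):
    """Time derivative of [x, y, z, theta, psi] with zero roll and zero sway."""
    _, _, _, theta, psi = state
    return np.array([
        u * np.cos(psi) * np.cos(theta) + w * np.cos(psi) * np.sin(theta),
        u * np.sin(psi) * np.cos(theta) + w * np.sin(psi) * np.sin(theta),
        -u * np.sin(theta) + w * np.cos(theta),   # z is positive downward
        q,                                        # trim (pitch) rate
        r / np.cos(theta),                        # heading (yaw) rate
    ])


def simulate(state, u, w, q, r, dt, steps):
    """Forward-Euler integration of the kinematic model."""
    state = np.asarray(state, dtype=float)
    for _ in range(steps):
        state = state + dt * kinematics(state, u, w, q, r)
    return state


def disturb(state, rng, sigma=0.5):
    """Assumed disturbance model: zero-mean Gaussian noise on the position only."""
    noisy = state.copy()
    noisy[:3] += rng.normal(0.0, sigma, size=3)
    return noisy


# Level flight at 10 m depth for 10 s with a 0.1 s sampling period.
final = simulate([0.0, 0.0, 10.0, 0.0, 0.0], u=1.5, w=0.0, q=0.0, r=0.0,
                 dt=0.1, steps=100)
noisy = disturb(final, np.random.default_rng(0))
```

With zero trim and heading angles the AUV moves straight along the $OX$ axis at constant depth, which gives a simple sanity check before disturbances and control are added.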
It is difficult to stabilize the control of the hunting AUV due to the influence of time-varying ocean currents and communication delays. In the simulation process, the algorithm is compared with the linear quadratic Gaussian controller with loop transfer recovery (LQG/LTR) algorithm, the fuzzy proportional-integral-derivative (PID) algorithm, and the fuzzy adaptive PID algorithm in the same environment. The control curves of the different algorithms for the hunting AUV in the interference environment are shown in Figure 11.
In Figure 11(a), when the LQG/LTR algorithm is employed, the maximum deviation of the plane position is 0.99 m and the average deviation is 0.6 m. For fuzzy PID and fuzzy adaptive PID, the maximum deviations are 0.8 and 0.7 m, and the average plane position errors are 0.57 and 0.46 m, respectively. With our algorithm, the maximum positional deviation stays within 0.5 m, and the average positional deviation stays within 0.42 m.
In Figure 11(b), the maximum deviation of the depth controlled by the LQG/LTR algorithm is 0.97 m and the average deviation is 0.43 m, while the maximum deviations of fuzzy PID and fuzzy adaptive PID are 0.92 and 0.6 m, respectively, and their average depth deviations are 0.46 and 0.33 m. When the AUV depth is controlled by the algorithm of this article, the maximum deviation is 0.42 m and the average deviation is 0.21 m. Compared with the three algorithms above, the algorithm proposed in this article is more stable in the interference environment.
Table 1. Multi-AUV consistent collaborative hunting method based on GAN.
Input: Target status information $[u\ v\ w\ p\ r\ q]^T$, number of AUVs $n$.
Output: The hunting point of a noncooperative target and its hunting results.
1. Calculate the multi-AUV hunting strategy under ideal conditions based on the target state;
2. Calculate the control rate $u_i(t)$ of $AUV_i$;
3. The hunting AUV is disturbed, and the actual position is in error with the ideal position;
4. GAN input $x$ is the actual location of the hunting AUV;
5. Generator $G$ generates a new control rate $U_i(t)$;
6. If discriminator $D$ determines that $U_i(t)$ is true
7.    Go to step 10;
8. If discriminator $D$ determines that $U_i(t)$ is false
9.    Go to step 5;
10. Calculate multi-AUV consistent collaborative control;
11. Output the control rate $U_i(t)$ of $AUV_i$;
GAN: generative adversarial network; AUV: autonomous underwater vehicle.
The multi-AUV consistent cooperative hunting control method based on GAN can reduce the influence of time-varying ocean currents and communication delays in the multi-AUV hunting process. The algorithm of this article is compared with other algorithms in the same interference environment for multi-AUV system consistency. Each initial position is generated by adding environmental errors to the hunting point of the corresponding AUV around the noncooperative target. The simulation results are shown in Figure 12, which gives the consistent collaborative error curves for 6, 8, and 12 hunting AUVs, respectively. When the number of hunting AUVs is 6, our algorithm has the minimum consistent collaborative error, 0.21 m. As the number of hunting AUVs increases, the consistency error changes. When the number of hunting AUVs is 12, the maximum consistency error of our algorithm is only 0.51 m. Among the other algorithms, the evolutionary artificial neural networks (EANNS) algorithm has the minimum error of 0.37 m when the number of hunting AUVs is 6, but its error increases with the number of hunting AUVs and reaches 0.94 m when the number of hunting AUVs is 12. From the above data analysis, our algorithm can effectively reduce the impact of time-varying ocean currents and communication delays on multi-AUV systems, but the convergence speed is slow, and the algorithm needs to be optimized in future work.
Different algorithms are also used to simulate noncooperative target hunting in the same interference environment, and the specific hunting times and success rates are provided in Table 2. When the number of hunting AUVs is 6, the BB algorithm has the minimum hunting time of 53.34 s, but its hunting success rate is only 65.29%. When the number of hunting AUVs is 8, the PF algorithm has the minimum hunting time of 62.58 s, while the algorithm proposed in this article has the highest hunting success rate of 84.74%. When the number of hunting AUVs increases to 12, the algorithm proposed in this article has a hunting success rate of 85.25% and the minimum hunting time of 56.37 s.
In summary, when the number of hunting AUVs is 6 and 8, the algorithm proposed in this article does not have the minimum hunting time, but the hunting success rate is the highest, and there is no large fluctuation. Under the interference environment of time-varying ocean current and communication delay, the average hunting success rate of the algorithm in this article is 84.69%, and the hunting time is 62.53 s. But the convergence speed of multiple AUV systems is relatively slow. In the future, we will continue to optimize the algorithm to reduce the hunting time and improve the success rate of hunting.

Summary
Consistent coordinated control of multiple AUVs is the key to the noncooperative target hunting process, but how to achieve successful multi-AUV hunting of noncooperative targets under the influence of time-varying ocean currents and communication delays is unclear. In view of this problem, this article proposed a multi-AUV consistent collaborative hunting method based on GAN. The GAN model is introduced into the multi-AUV collaborative hunting field. The generator is used to generate a coordinated control rate suitable for the current complex environment, and the successful hunting of noncooperative targets under time-varying ocean current and communication delay interference is realized. Experiments show that the algorithm proposed in this article achieves a good hunting effect, but it needs to be further improved in terms of hunting time. In future work, we will focus on this issue to make the algorithm more efficient.

Data availability statement
The data set in this article is a self-built multi-AUV data set. It contains confidential information, such as the performance parameters and tactical technical indicators of the AUV. Therefore, the data set is confidential and cannot be released.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.