Reliability evaluation of dynamic face recognition systems based on improved Fuzzy Dynamic Bayesian Network

The reliability of a face recognition system is characterized by fuzziness, randomness, and continuity. To measure it in unconstrained scenes, we identify and quantify the key broad-sense and narrow-sense factors that influence reliability, based on an analysis of the operation states of six dynamic face recognition systems in practical use at six public security bureaus. In this article, we propose a novel evaluation method based on the True Positive Identification Rate in dynamic and M:N mode and create a novel evaluation model of system reliability with the improved Fuzzy Dynamic Bayesian Network. Subsequently, we use Netica to infer the fuzzy reliability state probabilities of the six systems and identify the two most important factors with the improved fuzzy C-means algorithm. We verify the model by comparing the evaluation results with the actual achievements of these systems. Finally, we find several vulnerabilities in the least reliable system and put forward a few optimization strategies. The proposed method combines the advantages of the improved fuzzy C-means model with those of the dynamic Bayesian network to evaluate the reliability of dynamic face recognition systems, making the evaluation results more reasonable and realistic. It opens a new line of research on face recognition systems in unconstrained scenes and contributes to research on face recognition performance evaluation and system reliability analysis. In addition, the proposed method is of practical significance for improving the reliability of systems in use.


Introduction
Nowadays, the public security situation is confronted with two major difficulties. Traditional and nontraditional security threats are intertwined, and the complexity of social contradictions and the threat of terrorism are increasing. The situation has brought unprecedented pressure and challenges to public security agencies in their work of combating, preventing, managing, and controlling threats. However, the increase in police force is far from enough to meet the actual [...] the complexity and uncertainty of systems, there is fuzziness in variable states of reliability. The fuzzy Bayesian network emerged subsequently. The deficiencies of fuzzy set theory itself may cause information loss during fuzzy Bayesian network inference and solving. In addition, the above-mentioned research works consider only narrow-sense influencing factors of reliability and not broad-sense ones. For example, the recognition performance is an important factor that influences the reliability of the dynamic face recognition system.
At present, there are a few relevant research works on security systems. Lv [19] evaluates the reliability of a set of security equipment by testing their lifetimes and fitting their failure functions. The method is quite time-consuming, and it analyzes only internal failure factors. Qu et al. [20] create a failure risk evaluation index system for a video surveillance system and a fuzzy analytic hierarchy process model to evaluate the system failure probability, that is, to evaluate the reliability of the video surveillance system. The index system is mostly qualitative, and the evaluation method itself is rather subjective.
To address the problems in previous research works, we propose a relatively objective and effective reliability evaluation model with the improved Fuzzy Dynamic Bayesian Network. It is a novel reliability evaluation model that combines the improved fuzzy C-means with the dynamic Bayesian network. We evaluate the reliability of dynamic face recognition systems with the improved Fuzzy Dynamic Bayesian Network model, identify important influencing factors and vulnerabilities, and put forward a few optimization strategies.
The innovations and contributions of this work are summarized as follows:
1. We identify the influencing factors of the reliability of face recognition systems in unconstrained scenes, including both broad-sense and narrow-sense factors, and quantify them.
2. We propose a novel evaluation method with True Positive Identification Rate (TPIR) of face recognition systems in dynamic and M:N mode.
3. We create a novel evaluation model of system reliability with the improved Fuzzy Dynamic Bayesian Network, which can dynamically evaluate the reliability of systems characterized by fuzziness, randomness, and continuity. The Membership functions of the improved Fuzzy Dynamic Bayesian Network model are constructed from objective sample data and not from subjective experience, which is the main advantage of the proposed model over previous methods.
4. We conduct evaluation experiments on six dynamic face recognition systems in use and verify the model by comparing the evaluation results with the actual achievements of these systems. Experimental and verification results show that our approach is reasonable and realistic.

Approach
The Bayesian network, also known as the belief network, is one of the most effective theoretical models for representing and inferring from random uncertain knowledge. It has clear advantages in the logical description of random unascertained failures, but the simple Bayesian network is limited, mainly because it can only describe discrete random variables with finite states. Jousselme et al. [21] categorize unascertained knowledge into two kinds, ambiguity and vagueness, which correspond to random variables and fuzzy variables, respectively. The influencing factors of dynamic face recognition system reliability are random, fuzzy, and continuous. "Incidence" is used to indicate the possibility of a random event, and the Membership Degree is used to indicate the extent to which a fuzzy event belongs to a certain state. Both the possibility and the Membership Degree are continuous. For example, "high incidence of camera failure" is a mixed event with the characteristics of randomness and fuzziness. The occurrence of this event involves both probability and degree: "incidence" is random and "high" is fuzzy. To describe "high incidence of camera failure," we need to consider both the probability of "incidence" and the degree of "high." In many complicated cases, the variables are all random, fuzzy, and continuous. Describing such variables with the simple Bayesian network is equivalent to setting every degree to 1 or 0: a node variable either belongs to a certain state or does not belong to it at all, and intermediate states are not considered. The fuzzy Bayesian network and the Fuzzy Dynamic Bayesian Network can handle this kind of problem. The former combines fuzzy set theory with the simple Bayesian network [22-29] and is premised on the hypothesis that the system reliability state is independent of time.
The latter combines fuzzy set theory with the dynamic Bayesian network [30-32] and is premised on the hypothesis that the system reliability state changes with time. If we regard the state change process of the system reliability as a series of snapshots, each snapshot describes the state of the system reliability at a specific time. Every snapshot, also known as a time slice, is a fuzzy Bayesian network. The Fuzzy Dynamic Bayesian Network is a special form of the fuzzy Bayesian network, and it consists of a series of fuzzy Bayesian networks. Because the system reliability state is often time-dependent in reality, the Fuzzy Dynamic Bayesian Network is more suitable for describing system reliability. However, both the fuzzy Bayesian network and the Fuzzy Dynamic Bayesian Network have a major defect in how their Membership functions are constructed, which may make the evaluation results inaccurate.
The improved Fuzzy Dynamic Bayesian Network model consists of the improved fuzzy C-means and the dynamic Bayesian network. Its main idea is to use improved Membership functions in fuzzy sets to represent the states of continuous variables and to compute the prior probabilities with the degrees taken into account. The fuzzy C-means is a mathematical model used for the soft classification of samples according to their attributes [33, 34]. Because the Membership functions of the improved fuzzy C-means model are constructed from objective sample data, they have clear advantages in unascertained information processing over fuzzy Membership functions constructed from subjective experience. The improved fuzzy C-means model is good at processing fuzzy unascertained information, while the dynamic Bayesian network is good at processing random unascertained information. We combine them to create a new approach to evaluating the reliability of complex systems, which is the improved Fuzzy Dynamic Bayesian Network. In this section, we present only the approach of the improved fuzzy Bayesian network, and we extend it over a time series to obtain the improved Fuzzy Dynamic Bayesian Network model in the process of modeling in the next section.
Suppose $u_1, u_2, \ldots, u_s, \ldots, u_Q$ represent $Q$ dynamic face recognition systems that need to be evaluated. Let $U = \{u_1, u_2, \ldots, u_s, \ldots, u_Q\}$. $X_1, X_2, \ldots, X_d$ are influencing factors of system reliability, namely, evaluation indicators. We regard them as $d$ attribute nodes (root nodes) of the improved fuzzy Bayesian network. $X_{si}$ represents the $i$th attribute of $u_s$, which is a mixed variable that is both fuzzy and random. Suppose the $i$th attribute of system $u_s$ is observed $n_{si}$ times and its $j$th observed value is $x_{sij}$ ($s = 1, 2, \ldots, Q$; $i = 1, 2, \ldots, d$; $j = 1, 2, \ldots, n_{si}$).
We need to quantify to what degree $x_{sij}$ belongs to a certain state $G_k$ before evaluating system $u_s$. Let $G = \{G_1, G_2, \ldots, G_k, \ldots, G_C\}$ be the set of states, where $G_1, G_2, \ldots, G_k, \ldots, G_C$ are ordered and finite and $C$ is the number of states or classifications. $G_k$ also represents a mixed event classification. $P_{si}(G_k)$ represents the possibility that mixed event $G_k$ happens to $X_{si}$, that is, the fuzzy probability. Then $P_{si}(G_1), P_{si}(G_2), \ldots, P_{si}(G_C)$ are the fuzzy state probabilities of attribute node $X_{si}$. They are computed by equation (1), in which $x_{sij} \in G_k$ means that the $j$th observed value of the $i$th reliability factor in system $u_s$ belongs to fuzzy subset $G_k$, or that mixed event $G_k$ happens to observed value $x_{sij}$; $n_{si}$ is the number of observations; $n_{sk}$ denotes the number of $x_{sij}$ belonging to fuzzy subset $G_k$; and $\mu'_{sij}(G_k)$ is the fuzzy measure of $x_{sij}$ belonging to fuzzy subset $G_k$, also known as the Membership Degree. The absolute value of $\mu'_{sij}(G_k)$ cannot be known because of its uncertainty, but its relative value is easy to determine. As can be seen from equation (1), when $\mu'_{sij}(G_k)$ takes only the values 1 or 0, the model is just the simple Bayesian network. So the simple Bayesian network is a special form of the improved fuzzy Bayesian network, and the improved fuzzy Bayesian network is the general form of the simple Bayesian network.
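The relation between Membership Degrees and fuzzy state probabilities can be illustrated with a minimal sketch. It assumes that equation (1) averages the Membership Degrees of the observations of one attribute node, which is an assumption chosen to be consistent with the remark that degrees of 1 or 0 reduce the model to the simple Bayesian network (the probability then becomes the frequency n_sk / n_si). The function name and all numbers are hypothetical.

```python
from typing import Dict, List


def fuzzy_state_probabilities(memberships: List[Dict[str, float]]) -> Dict[str, float]:
    """Average per-observation membership degrees into fuzzy state probabilities.

    memberships[j][state] is the membership degree of the j-th observed value
    in `state`. Assumed form (not taken from the paper):
        P(G_k) = (1 / n) * sum_j mu_j(G_k)
    """
    if not memberships:
        raise ValueError("at least one observation is required")
    n = len(memberships)
    states = list(memberships[0].keys())
    return {k: sum(m[k] for m in memberships) / n for k in states}


# Hypothetical membership degrees for three observations of one factor.
fuzzy = [
    {"G1": 0.7, "G2": 0.2, "G3": 0.1},
    {"G1": 0.1, "G2": 0.6, "G3": 0.3},
    {"G1": 0.4, "G2": 0.4, "G3": 0.2},
]
fuzzy_probs = fuzzy_state_probabilities(fuzzy)

# Crisp special case: every degree is 0 or 1, so each probability is simply
# the fraction of observations assigned to that state.
crisp = [
    {"G1": 1.0, "G2": 0.0, "G3": 0.0},
    {"G1": 0.0, "G2": 1.0, "G3": 0.0},
    {"G1": 1.0, "G2": 0.0, "G3": 0.0},
    {"G1": 1.0, "G2": 0.0, "G3": 0.0},
]
crisp_probs = fuzzy_state_probabilities(crisp)
```

In the crisp case three of the four observations belong to G1, so the sketch returns the ordinary relative frequency for that state.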
To introduce the improved fuzzy C-means algorithm conveniently, we suppose $n_{si} = 1$, that is, attribute node $X_{si}$ of system $u_s$ has only one observed value, and we regard each system as a whole sample. Suppose $m_k$ is the center vector of classification $G_k$, with $m_k = (m_{k1}, m_{k2}, \ldots, m_{kd})$ ($k = 1, 2, \ldots, C$). $m$ is the centroid of the particle set composed of the $C$ classification centers, $m = (1/C) \sum_{k=1}^{C} m_k$. Obviously, it is a certainty classification. When we regard $m_k$ approximately as a sample in $G_k$ and regard a certain distance from $u_s$ to $m_k$ as a kind of similarity between $u_s$ and $G_k$, the certainty of the classification above is changed to uncertainty, and the uncertainty classification is closer to the real classification situation. Equation (2) defines such a distance from $u_s$ to $m_k$. It is a weighted distance with $w_y = \sigma_y^2 / \sum_{z=1}^{d} \sigma_z^2$, which is a kind of clustering weight. $w_y$ represents the proportion of the clustering contribution of $X_{sy}$ to that of all attributes, and $\sigma_z^2$ is the quasi-variance. The constant $\varepsilon$ limits the excessive influence on the Membership Degree when $\|u_s - m_k\|^2$ is excessively small. Equation (3) is the improved fuzzy Membership function, also known as the Membership Degree. It strictly satisfies three measurement criteria: non-negative boundedness, additivity, and normalization. It is therefore reasonable to regard it as a Membership function of the unascertained classification.
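The role of the weighted distance and of the small constant can be illustrated with a short sketch. It assumes a common inverse-distance membership form, normalized over the classifications; this is an assumption for illustration and not necessarily the authors' equations (2) and (3). The function names, the default value of `eps`, and the sample centers are hypothetical.

```python
from typing import List, Sequence


def weighted_sq_distance(x: Sequence[float], center: Sequence[float],
                         weights: Sequence[float]) -> float:
    """Weighted squared distance between a sample and a class center."""
    return sum(w * (a - b) ** 2 for w, a, b in zip(weights, x, center))


def membership_degrees(x: Sequence[float], centers: Sequence[Sequence[float]],
                       weights: Sequence[float], eps: float = 1e-3) -> List[float]:
    """Assumed inverse-distance membership, normalized over all classes.

    eps keeps the inverse finite when the sample sits on (or very near)
    a class center, so one class cannot dominate without bound.
    """
    inverse = [1.0 / (weighted_sq_distance(x, c, weights) + eps) for c in centers]
    total = sum(inverse)
    return [v / total for v in inverse]


# Hypothetical class centers for C = 3 states and d = 2 attributes.
centers = [[0.1, 0.1], [0.5, 0.5], [0.9, 0.9]]
weights = [0.5, 0.5]  # clustering weights, assumed to sum to 1

# A sample lying exactly on the middle center.
mu = membership_degrees([0.5, 0.5], centers, weights)
```

Under this assumed form the degrees are non-negative, bounded by 1, and sum to 1, which matches the measurement criteria named in the text.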
The general form of the objective function of the improved fuzzy C-means is denoted by $J$. We can then transform the improved fuzzy C-means into the following optimization problem: minimize $J$. When $J$ reaches its minimum, the corresponding value of $\mu_s(G_k)$ is optimal.
If we regard system $u_s$ as a sample and all systems as a sample set to compute the Membership Degree with the improved fuzzy C-means algorithm, the above-mentioned $\mu_s(G_k)$ is a comprehensive Membership Degree corresponding to all the indicators of system $u_s$. In this article, we regard each observed value $x_{sij}$ of attribute node $X_{si}$ as a sample and all the observed values of $X_{si}$ as a sample set, and compute the Membership Degree $\mu'_{sij}(G_k)$ corresponding to the single indicator $X_{si}$ with the improved fuzzy C-means algorithm. We compute the fuzzy probability of attribute node $X_{si}$ by substituting $\mu'_{sij}(G_k)$ into equation (1). Then we compute the average value of $X_{si}$, $\bar{x}_{si} = (1/n_{si}) \sum_{j=1}^{n_{si}} x_{sij}$ ($s = 1, 2, \ldots, Q$; $i = 1, 2, \ldots, d$). After substituting $\bar{x}_{si}$ ($s = 1, 2, \ldots, Q$; $i = 1, 2, \ldots, d$) into the improved fuzzy C-means algorithm, we compute the clustering weight $w_i$ of influencing factor $X_i$ of system reliability, whose value indicates the importance of influencing factor and evaluation indicator $X_i$.
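The averaging step and the clustering weight can be sketched as follows. The weight is the ratio of one attribute's quasi-variance to the sum over all attributes, as stated in the text. The definition of the quasi-variance itself is an assumption here: the sketch takes it to be the summed squared deviation of the class centers from their centroid on each attribute, which may differ from the definition used in the paper. All names and numbers are hypothetical.

```python
from typing import List, Sequence


def mean_observation(values: Sequence[float]) -> float:
    """Average of the observed values of one attribute of one system."""
    if not values:
        raise ValueError("at least one observation is required")
    return sum(values) / len(values)


def clustering_weights(centers: Sequence[Sequence[float]]) -> List[float]:
    """w_y = quasi_var_y / sum_z quasi_var_z for each attribute y.

    Assumption: the quasi-variance of attribute y is the summed squared
    deviation of the C class centers from their centroid on that attribute.
    """
    num_classes = len(centers)
    num_attrs = len(centers[0])
    centroid = [sum(c[y] for c in centers) / num_classes for y in range(num_attrs)]
    quasi_var = [sum((c[y] - centroid[y]) ** 2 for c in centers)
                 for y in range(num_attrs)]
    total = sum(quasi_var)
    if total == 0:
        raise ValueError("class centers coincide; weights are undefined")
    return [v / total for v in quasi_var]


# Hypothetical daily failure coefficients of one factor of one system.
x_bar = mean_observation([0.02, 0.00, 0.04])

# Hypothetical centers: attribute 0 separates the classes far more than
# attribute 1, so it should receive most of the weight.
centers = [[0.0, 0.4], [0.5, 0.5], [1.0, 0.6]]
w = clustering_weights(centers)
```

An attribute whose class centers are widely spread receives a large weight, which matches the reading in the text that a large clustering weight reflects a high degree of dispersion.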

Data acquisition
In this article, one part of the raw data comes from the 1-year maintenance records and local weather records of six dynamic face recognition systems in six public security bureaus. Because these data are rather sensitive, we use $u_1, u_2, \ldots, u_6$ to represent these six systems, respectively. After analyzing the maintenance records, we obtain relevant basic data, such as failure time, failure location, failure module, failure cause, number of image channels involved, and duration of failure. The other part of the raw data comes from the on-site performance evaluation of the six systems. In this article, we evaluate only the recognition performance of the six systems. These data are used as the basic data of the evaluation experiments.

Modeling
The first task of improved fuzzy Bayesian network modeling is to set the network nodes. We regard all specific influencing factors of reliability as the root nodes of the improved fuzzy Bayesian network, the subsystems as intermediate nodes, and the system reliability as the leaf node. The variables serving as the nodes of the model are summarized in Table 1. On the basis of analyzing the causal relationships between these nodes and the basic data, we create an adjacency matrix of the network nodes. We first construct a framework of the improved fuzzy Bayesian network. Then we extend it along the time axis, with an interval of 1 month, to obtain the improved Fuzzy Dynamic Bayesian Network. It consists of five time slices, namely, five snapshots of the fuzzy Bayesian network. Figure 1 shows a framework of two time slices. Each time slice has 21 root nodes (influencing factors of reliability), 6 intermediate nodes, and 1 leaf node. The other model parameters are as follows: the number of systems $Q = 6$, the factors $X_i(T)$ ($8 \le i \le 28$), and the number of node states or classifications $C = 3$. The three states of attribute nodes are "High," "Middle," and "Low," which are represented by $G_1$, $G_2$, and $G_3$, respectively. $n_{si}$ is the number of observations of factor $X_{si}$ of system $u_s$, and $1 \le n_{si} \le 31$. The conditional probabilities between parent and child nodes are determined by equations in which $X_c$ represents a child node, $X_p$ represents a parent node, "1" means the occurrence of mixed event $G_k$, and "0" means that the mixed event $G_k$ does not occur. The prior probabilities of the root nodes can be computed through the following procedure. To begin with, we need to set some intermediate parameters.
These parameters are statistical results related to the reliability factor failure coefficient $x_{sij}$ on the $j$th day, including the failure rate $\lambda_{sij}$, the $t$th failure restoration time $T^{R}_{sijt}$, and the number of image channels involved in the $t$th fault, $n_{sijt}$. From the restoration times we calculate the cumulative failure restoration time on the $j$th day. When an influencing factor of the reliability is narrow-sense, its failure coefficient $x_{sij}$ is calculated by an equation in which $V_s$ is the total number of image channels in system $u_s$.
When an influencing factor of the reliability is broad-sense (specifically, the recognition performance), its calculation method differs from that of the narrow-sense factors. Its calculation equation involves three rates: $R_{s1}$ is the face capture rate, $R_{s2}$ is the face qualification rate, and $R_{s3}$ is the True Positive Identification Rate ($\mathrm{TPIR}_s$) of system $u_s$ at 1% False Positive Identification Rate ($\mathrm{FPIR}_s$) in M:N mode, abbreviated as $\mathrm{TPIR}_s$ at 1% $\mathrm{FPIR}_s$. Because the recognition performance is stable and repeating the on-site evaluation experiments is inconvenient, we evaluate and compute the recognition performance only once for each system in each month.
Table 1. Variables serving as the nodes of the model.
Node      Variable
$X_2$     Acquisition subsystem
$X_3$     Transmission subsystem
$X_4$     Storage subsystem
$X_5$     Recognition subsystem
$X_6$     Display subsystem
$X_7$     Control subsystem
$X_8$     Power supply for front-end equipment
$X_9$     Camera
$X_{10}$  Light supplement lamp
$X_{11}$  Lens
$X_{12}$  Power supply for server room
$X_{13}$  Power supply for transmission line
$X_{14}$  Line
$X_{15}$  Electromagnetic shielding
$X_{16}$  Hardware of storage
$X_{17}$  Software of storage
$X_{18}$  Hardware of identification
$X_{19}$  Software of identification
$X_{20}$  Recognition performance
$X_{21}$  Power supply for display
$X_{22}$  Hardware of display
$X_{23}$  Software of display
$X_{24}$  Power supply for control
$X_{25}$  Hardware of control
$X_{26}$  Software of control
$X_{27}$  Weather
$X_{28}$  Adversary attack
The specific evaluation process of $R_{s1}$ is as follows: Randomly sample $N_{s1}$ 5-min clips captured by $N_{s1}$ face cameras of system $u_s$ in a certain proportion. Use a video-based pedestrian statistics algorithm to count the number of pedestrians in the $i$th clip, and let it be $A_{s1i}$. Remove repeated faces captured by the same camera in the corresponding period to get the actual number of faces captured by the $i$th camera, and let it be $A_{s2i}$. $R_{s1}$ is then computed from $A_{s1i}$ and $A_{s2i}$.
The specific evaluation process of $R_{s2}$ is as follows: Randomly sample 15,000 unblocked, clear, and frontal face images from the six systems in a certain proportion and give them a positive label. We call them qualified face images, namely, positive samples. Randomly sample 15,000 blocked, blurred, or non-frontal face images from the six systems in a certain proportion and give them a negative label. We call them unqualified face images, namely, negative samples. Design a convolutional neural network model and train it with the above-mentioned labeled sample data. Apply the trained convolutional neural network model to classify the face images captured by each of the above-mentioned cameras into two classifications, qualified and unqualified. Count the number of qualified face images captured by the $i$th camera, and let it be $A_{s3i}$. $R_{s2}$ is then computed from these counts.
The specific evaluation process of $R_{s3}$ is as follows: Randomly sample 1000 pairs of face images from each of LFW, LFW BLUFR, MegaFace, and a self-built face dataset. Each pair of faces belongs to the same identity. Divide all sample pairs into two face sets, each of which contains 4000 face images. One serves as a positive probe set and the other as a gallery set. Randomly sample 500 pairs of face images from each of the four above-mentioned face datasets to serve as a negative probe set. The negative probe set has no overlap with the positive probe set, and each pair of faces belongs to different identities. Conduct an evaluation experiment with the three face sets on the platform of system $u_s$ to get the number of True Positive outcomes ($\mathrm{TP}_s$) and that of False Negative outcomes ($\mathrm{FN}_s$). At this time, $M = 6000$ and $N = 4000$. Then $R_{s3}$, that is, $\mathrm{TPIR}_s$ at 1% $\mathrm{FPIR}_s$, is reported by the evaluation experiment and computed as $R_{s3} = \mathrm{TPIR}_s = \mathrm{TP}_s / (\mathrm{TP}_s + \mathrm{FN}_s)$. The images of the self-built face dataset are selected from the six dynamic face recognition systems in a certain proportion.
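The way a TPIR value at a fixed FPIR is obtained from similarity scores can be sketched as follows. This is a generic open-set identification sketch under stated assumptions, not the evaluation procedure of the systems themselves: each probe is assumed to yield the score of its top-ranked gallery candidate, a probe is accepted when that score is strictly above the threshold, and the function names and all scores are hypothetical.

```python
import math
from typing import List, Sequence, Tuple


def threshold_at_fpir(negative_top_scores: Sequence[float],
                      target_fpir: float = 0.01) -> float:
    """Pick a threshold so that at most target_fpir of negative probes
    (identities absent from the gallery) score strictly above it."""
    ranked = sorted(negative_top_scores, reverse=True)
    allowed = math.floor(target_fpir * len(ranked))  # tolerated false positives
    return ranked[allowed]


def tpir(positive_results: Sequence[Tuple[float, bool]], threshold: float) -> float:
    """TPIR = TP / (TP + FN) over positive probes.

    Each result is (top_score, top_is_mate). A probe counts as a true
    positive only if its top candidate is the correct identity and the
    score exceeds the threshold; every other positive probe is a miss.
    """
    tp = sum(1 for score, is_mate in positive_results
             if is_mate and score > threshold)
    fn = len(positive_results) - tp
    return tp / (tp + fn)


# Hypothetical top-1 scores of 200 negative probes: 0.000, 0.001, ..., 0.199.
negatives: List[float] = [i / 1000 for i in range(200)]
thr = threshold_at_fpir(negatives, target_fpir=0.01)
fpir = sum(1 for s in negatives if s > thr) / len(negatives)

# Hypothetical positive probes: (top score, whether the top candidate is the mate).
positives = [(0.90, True), (0.50, True), (0.15, True), (0.80, False)]
rate = tpir(positives, thr)
```

With these hypothetical scores, two of the 200 negative probes exceed the threshold, and two of the four positive probes are counted as true positives.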
From the series of experiments and calculations above, we obtain the reliability factor failure coefficients and some related parameters. For example, a batch of statistical results of system $u_5$ is shown in Table 2. The reliability factor failure coefficients serve as the observed values of the root nodes in the improved Fuzzy Dynamic Bayesian Network. For root node $X_i$, we take all the observed values of the six systems in 1 year as one sample set and put them into the improved fuzzy C-means algorithm to obtain the single-indicator Membership Degree $\mu'_{sij}(G_k)$ of $x_{sij}$ on each observation day. When we substitute the single-indicator Membership Degrees of each system into equation (1) on a monthly basis, we obtain the monthly fuzzy probabilities and the classification of $X_{si}$. Then we use the fuzzy probabilities of $X_{si}$ in the 12th month as the prior probabilities of the root nodes in the improved Fuzzy Dynamic Bayesian Network model. For each pair of adjacent nodes in two time slices, the state transition probabilities are determined by the statistics on monthly classification changes. One of the state transition matrices is shown in Table 3. This completes the modeling tasks of the improved Fuzzy Dynamic Bayesian Network.
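Deriving state transition probabilities from monthly classification changes can be sketched as a simple counting procedure. This is a minimal sketch under stated assumptions: transitions are counted between consecutive months and each row is normalized, a row with no observed transitions falls back to a uniform distribution (an assumption, not stated in the text), and the one-step propagation treats a single node in isolation. The function names and monthly sequences are hypothetical.

```python
from typing import Dict, List, Sequence

STATES = ["High", "Middle", "Low"]


def transition_matrix(monthly_states: Sequence[Sequence[str]],
                      states: Sequence[str] = STATES) -> Dict[str, Dict[str, float]]:
    """Estimate P(next state | current state) from month-to-month changes."""
    counts = {a: {b: 0 for b in states} for a in states}
    for sequence in monthly_states:
        for current, following in zip(sequence, sequence[1:]):
            counts[current][following] += 1
    matrix: Dict[str, Dict[str, float]] = {}
    for a in states:
        row_total = sum(counts[a].values())
        if row_total == 0:
            # Assumption: no evidence for this row, so use a uniform row.
            matrix[a] = {b: 1.0 / len(states) for b in states}
        else:
            matrix[a] = {b: counts[a][b] / row_total for b in states}
    return matrix


def propagate(prior: Dict[str, float],
              matrix: Dict[str, Dict[str, float]]) -> Dict[str, float]:
    """Advance one time slice: next[b] = sum_a prior[a] * P(b | a)."""
    return {b: sum(prior[a] * matrix[a][b] for a in prior) for b in prior}


# Hypothetical monthly classifications of one factor in two systems.
history: List[List[str]] = [
    ["High", "High", "Middle", "High"],
    ["Low", "Low", "Middle", "Middle"],
]
T = transition_matrix(history)
next_slice = propagate({"High": 0.6, "Middle": 0.3, "Low": 0.1}, T)
```

In a tool such as Netica, the resulting rows would be entered as the conditional probability table of the link between the same node in adjacent time slices.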

Inference and solving
In this article, the Netica software is used for inference and solving. First, the model is initialized. The initial state parameters of each node are all set to 0.333. When the fuzzy prior probabilities of the root nodes are put into the network, the root node states of the network are updated. As soon as the conditional probabilities and the state transition probabilities are put into the network, the network inference is triggered. Then the fuzzy probability distributions of all nodes are obtained, and with them the fuzzy reliability state probabilities of a system in each time slice. For system $u_1$, the sketch of inference and solving is shown in Figure 2. As can be seen in Figure 2, at time $T$ the fuzzy reliability state probabilities of system $u_1$ are 0.014, 0.104, and 0.882, which represent the possibilities that system $u_1$ has "High," "Middle," and "Low" reliability states, respectively. The summary evaluation results of all six systems are shown in Table 4. The clustering weights of the 21 indicators are shown in Table 5.
Note to Table 2: The data in this table are the statistical results based on a performance evaluation report and a day's maintenance records of system $u_5$, as well as the local weather severity of the day.

Analysis and verification of results
As shown in Table 4, at time $T + 4$, the "High" reliability state probabilities of $u_1$, $u_2$, $u_4$, and $u_6$ are higher than those of the other two states. The "High" reliability state probability of $u_1$ is the highest. According to the maximum probability identification criterion, we determine that the reliability states of $u_1$, $u_2$, $u_4$, and $u_6$ are all "High." Similarly, the "Middle" reliability state probabilities of $u_3$ and $u_5$ are higher than those of the other two states, so we determine that the reliability states of $u_3$ and $u_5$ are both "Middle." As can be seen in Table 5, two indicators have far larger clustering weights than the others: $X_{20}$ (recognition performance) and $X_{12}$ (power supply for server room). The percentages of the two indicators in the whole index system are 67.0027% and 29.4319%, respectively, which means that the dispersion degrees of the two reliability factor failure coefficients are relatively high across the six systems. The two indicators make greater contributions to the evaluation results. By comparing the two failure coefficients of the six systems with each other, we find that those of system $u_1$ are the lowest among the six systems, while those of $u_5$ are the highest. The primary reason is that system $u_1$ belongs to a subway police substation in a city in south China. The face cameras of system $u_1$ are distributed along walkways inside the subway. The favorable conditions of system $u_1$, such as high pedestrian flow volume, moderate illumination, a superior algorithm, and reliable power, give it the highest reliability. System $u_5$, in contrast, is deployed in a northern coastal city of China. Its unfavorable conditions, such as inappropriate camera layouts, frequent bad weather, unreliable power, and an inferior algorithm, give it the lowest reliability. This partly verifies that the evaluation results are realistic.
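The maximum probability identification criterion used above amounts to choosing the state with the largest probability. A minimal sketch, with hypothetical system labels and probabilities that are not taken from Table 4:

```python
from typing import Dict


def most_probable_state(state_probs: Dict[str, float]) -> str:
    """Return the reliability state with the largest probability."""
    return max(state_probs, key=state_probs.get)


# Hypothetical state probabilities for two systems (illustrative only).
hypothetical = {
    "system_a": {"High": 0.62, "Middle": 0.30, "Low": 0.08},
    "system_b": {"High": 0.21, "Middle": 0.55, "Low": 0.24},
}
decisions = {name: most_probable_state(p) for name, p in hypothetical.items()}
```

Under this rule the first hypothetical system is classified as "High" and the second as "Middle".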
In addition, we verify the evaluation results against three kinds of actual achievements of the systems: arresting criminal suspects, breaking cases, and issuing warnings about key populations. First, we use $\mathrm{FPIR}'_i$ at 1% $\mathrm{FNIR}'_i$ ($i = 1, 2, 3$) to represent the $i$th kind of actual achievement of a system. Then we take $\mathrm{FPIR}'_1$, $\mathrm{FPIR}'_2$, and $\mathrm{FPIR}'_3$ as three root nodes and the total actual achievements as a leaf node to construct a framework of the improved Fuzzy Dynamic Bayesian Network. The computation process of $\mathrm{FPIR}'_i$ is as follows: Suppose the probability that the target faces appear in the camera fields of view of the system equals 1. Regard all face images captured by the system in the evaluation period as a probe set of size $M'$ and the $i$th kind of watch targets in the evaluation period as a gallery set of size $N'_i$. Let $\mathrm{TP}'_i$ be the number of correct warnings and $\mathrm{FP}'_i$ be the number of false alarms in the evaluation period.
We substitute the known parameters of system $u_s$ into equation (14) to obtain the value of its $\mathrm{FPIR}'_i$. We take all the values of $1 - \mathrm{FPIR}'_i$ of the six systems in 1 year as the observed values of the root nodes in the improved Fuzzy Dynamic Bayesian Network. We compute the actual achievement state probabilities in the same way as in the previous reliability evaluation; they are shown in Table 6. Comparing the evaluation results of each system with its actual achievements, we find that the reliability state probabilities and the corresponding actual achievement state probabilities are very close to each other and that the reliability of each system is consistent with its actual achievements, which indicates that the evaluation results are reasonable. The comparisons of the evaluation results of $u_1$ and $u_5$ with their actual achievements are shown in Figure 3.
The primary purpose of the reliability evaluation is to find vulnerabilities in the light of the evaluation results and to improve the reliability effectively. According to Table 5, factor $X_{20}$ exerts the greatest influence on the reliability, and factor $X_{12}$ comes second. If the reliability of a system is low, these two indicators are the crux of the matter, and decreasing their failure coefficients is of great significance for improving the reliability of the system. For system $u_5$, the failure coefficients of $X_{20}$ and $X_{12}$ are relatively high, so the public security bureau should improve the reliability in these two respects. First, improve the system recognition performance: optimize the location layouts and deployment parameters of the front-end cameras, and continually improve or upgrade the face recognition algorithm. Second, reduce power failures in the server room: equip the server room with enough generators or expand the capacity of the uninterruptible power supply (UPS).

Conclusion
In this article, we propose a novel evaluation method with TPIR in dynamic and M:N mode, create a novel reliability evaluation model of face recognition systems in unconstrained scenes, and conduct evaluation experiments on six dynamic face recognition systems in practical use. We then use Netica to infer the fuzzy reliability state probabilities of the six systems and identify the two most important influencing factors. By comparing the evaluation results with the actual achievements of the six systems, we verify that the evaluation method is reasonable and that the evaluation results are objective. Finally, we put forward two strategies for improving the reliability of a system according to the evaluation results. These strategies are well targeted and practicable. The improved Fuzzy Dynamic Bayesian Network model proposed in this article is applicable to evaluating the reliability of complex systems characterized by fuzziness, randomness, and continuity. The proposed reliability evaluation method can contribute to the advancement of face recognition performance evaluation and system reliability analysis. At the same time, it is conducive to improving the effectiveness of dynamic face recognition systems.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by the Basic Special Project of Ministry of Public Security of China (grant number 2016GABJC01) and the National Key R&D Program of China (grant number 2016YFC0801003).