Modeling and analyzing malware diffusion in wireless sensor networks based on cellular automaton

Wireless sensor networks, multi-hop self-organized network systems formed by wireless communication, are vulnerable to malware diffusion, which breaks data confidentiality and service availability, owing to their low configuration and weak defense mechanisms. To reveal the rules of malware diffusion in deployed wireless sensor networks, we propose a model called Malware Diffusion Based on Cellular Automaton to describe the dynamics of malware diffusion. According to the model, we first derive the differential equations that reflect the state dynamics of sensor nodes under the cellular automaton. Then, we obtain the equilibrium points of the model to determine the threshold for whether malware will diffuse or die out in wireless sensor networks. Furthermore, we compute the basic reproduction number of the model using the next-generation matrix and prove the stability of the equilibrium points. Finally, via experimental simulation, we verify the effectiveness of the model, which can provide administrators with theoretical guidance on suppressing malware diffusion in wireless sensor networks.


Introduction
Wireless sensor networks (WSNs) are extensively used in real deployments, such as military and civil applications. 1,2 WSNs are characterized by self-organization, dynamic topology, and limited capacity, and each sensor node (SN) has a limited communication range, 3,4 so they are vulnerable to malware diffusion because of their low configuration and weak defense mechanisms. 5,6 Once malware targets a WSN by exploiting vulnerabilities in the network, the SN systems, or their hardware and software, it quickly diffuses from one SN to another within communication range. 7,8 Malware is an application with malicious intent, developed by attackers to damage SN systems by injecting malicious data, breaking data transmission, blocking the network, or exhausting the capacity of SNs until the WSN is paralyzed. [9][10][11][12][13] To address malware diffusion in WSNs, the primary task is to construct a malware diffusion model and reveal the rules of malware diffusion, which helps suppress it. [14][15][16] Epidemiology is commonly used to study the dynamics of malware diffusion, because malware diffusion shares characteristics with the propagation of epidemics. 17,18 Researchers have presented many models, 19,20 such as SI, SIS, SIRD, and SEIRD, to exhibit the changes among the SN states susceptible (S), exposed (E), infected (I), recovered (R), and dead (D).
To deal with malware diffusion in WSNs, we propose a model that describes the dynamics of malware diffusion based on cellular automaton (CA), [21][22][23] which is more appropriate than a Markov chain for describing deployed WSNs. First, it can reflect the heterogeneity of WSNs by considering the networking and communication process among SNs and by providing spatially and temporally discrete models of dynamical systems. 24 Second, the stochastic model established by a Markov chain usually lacks spatial consideration and is not well suited to describing the dynamics of malware diffusion, while a model based on CA can overcome these problems. In the proposed model, each square of the cellular lattice represents a cell, and the state of the cell is obtained from the fractions of SNs in states S, E, I, R, and D.
To solve the above problem, we first propose a model called Malware Diffusion Based on Cellular Automaton (MDBCA), in which the SNs are described by the five states S, E, I, R, and D. Second, we formulate differential equations to reflect the state dynamics of SNs. To find the threshold that determines whether malware will diffuse or die out in WSNs, we compute the equilibrium points of the model MDBCA for the malware-infected WSNs. Finally, we compute the basic reproduction number using the next-generation matrix and prove the stability of the equilibrium points.
The rest of this article is organized as follows. The second section reviews the related work. The third section proposes the model MDBCA and derives its differential equations, which reflect the dynamics of the fractions of SNs in states S, E, I, R, and D. The fourth section proves the existence of the equilibrium points of the model MDBCA, computes the basic reproduction number, and proves the stability of the model at the equilibrium points. The fifth section presents the experimental simulation and data analysis using relevant parameters. The sixth section concludes this article.

Related work
At present, many researchers have presented methods for addressing malware diffusion in WSNs. Liu et al. 25 used a stochastic game to predict the probability that malware adopts the spread behavior in WSNs. Shen et al. 26 extended traditional epidemic theory and constructed a malware propagation model with differential equations to represent the dynamics between states. Shen et al. 27 considered heterogeneous wireless sensor networks (HWSNs) and set up a dependability assessment mechanism for HWSNs with malware diffusion. Shen et al. 20 considered a clustered WSN under epidemic-malware propagation conditions. Jiang et al. 28 studied how WSNs can defend against malware based on a Stackelberg game. Acarali et al. 29 applied the principles of epidemic modeling to IoT networks consisting of WSNs. Wang et al. 30 used the pulse differential equation and epidemic theory to propose a method for preventing malware diffusion in WSNs.
Many researchers have studied epidemic spread using CA. Holko et al. 31 presented a new numerical, two-dimensional CA framework for simulating the spread of an infectious disease. Sharma and Gupta 32 analyzed SEIR (susceptible-exposed-infected-recovered) epidemic spread with time delay through a two-dimensional CA model. López et al. 33 addressed population heterogeneity and distribution in epidemic models using a CA approach. White et al. 34 introduced a theoretical model based on CA for simulating epidemic spreading.
However, some issues of CA-based malware diffusion in malware-infected WSNs have not yet been considered. The first issue is the mapping of elements between SNs and CA, and the second is how to determine the condition under which malware will diffuse or die out in the malware-infected WSNs. In this study, we address the first issue by proposing the model MDBCA, which reflects the dynamics of the fractions of SNs through differential equations. We address the second issue by proving the stability of the model MDBCA at the equilibrium points, which verifies the theoretical correctness mathematically.

Description of the model MDBCA
To simplify the research, we consider that the malware-infected WSNs are deployed in a two-dimensional cellular lattice with a side length of $L$. The lattice is divided into identical square areas, and each of them represents a cell of the CA. The spatial location of each cell is given by its two-dimensional coordinates. 34 The SNs are randomly or uniformly distributed in the two-dimensional cellular lattice.
To apply CA to the model MDBCA, it is important to describe the model elements, 35 including the cell, cellular state, cellular neighbor, and cellular state transition function, under the diverse characteristics of malware diffusion.
Definition 1. CA can be described by the 5-tuple $CA = \{C, N, Q, V, F\}$, where $C$ is the cell set, $N$ is the cell capacity, $Q$ is the cellular state set, $V$ is the cellular neighbor set, and $F$ is the transition function of the cellular state.
Definition 2. $C$ (cell set) is the set of areas into which the two-dimensional cellular lattice ($L \times L$) is divided, and each area represents one cell. The spatial position of a cell is uniquely identified by its horizontal coordinate $i$ and vertical coordinate $j$ in the lattice; $c_{i,j}$ represents the cell located at coordinate $(i, j)$. $C$ can be described by
$C = \{c_{i,j} \mid 1 \le i, j \le L\}$ (1)
Definition 3. $N$ (cell capacity) is the number of active SNs in each cell, which is a constant.

Definition 4. $Q$ is the cellular state set, $Q = \{S, E, I, R, D\}$.
Note that $Q$ includes the susceptible state S, the exposed state E, the infected state I, the immune state R, and the dead state D. Correspondingly, within a cell, we classify an SN into state S when it is prone to infection by malware but has not been infected; into state E when it has been infected but cannot yet diffuse the malware to its neighbors; into state I when it has been infected and can diffuse the malware to its neighbors; into state R when it is immune to the malware; and into state D when it has lost all functions, because it has either entirely consumed its energy or been damaged by malware. The state transition rates among SNs in a cell are shown in Figure 1.
In Figure 1, the symbol $\mu$ denotes the probability that a new SN in state S joins the malware-infected WSNs in each time step. To simplify the research, we suppose that the total number of active SNs $N$ in each cell is a constant and that $\mu$ also equals the death rate; in other words, the birth rate and the death rate are considered to be the same. When SNs in state S communicate with SNs in state I, the SNs in state S become infected: one part of them transforms from S to I at the rate $\tau$, and the other part transforms from S to E at the rate $\lambda$. After the latent period, the SNs in state E transform from E to I at the rate $\sigma$. For known malware, security patches are applied to cure the infected SNs, so the SNs in state I transform from I to R at the rate $\varepsilon$.
The state of the cell $c_{i,j}$ at time $t$ can be described as $Q^t_{i,j} = (S^t_{i,j}, E^t_{i,j}, I^t_{i,j}, R^t_{i,j})$, where $S^t_{i,j}$, $E^t_{i,j}$, $I^t_{i,j}$, and $R^t_{i,j}$ indicate the fractions of SNs in states S, E, I, and R, respectively, at time $t$ in the cell $c_{i,j}$. Because the number of active SNs $N$ in each cell is a constant and the SNs in state D are not included, we can conclude that
$S^t_{i,j} + E^t_{i,j} + I^t_{i,j} + R^t_{i,j} = 1$ (2)
Definition 5. $V$ is the cellular neighbor set, $V = \{v_{a,b} \mid 1 \le a, b \le L\}$, where $v_{a,b}$ is one of the neighbors of the cell $c_{i,j}$. The cellular neighbor set of the cell $c_{i,j}$ consists of the cells $v_{a,b}$ that lie within the communication radius $r$ of $c_{i,j}$.
Definition 6. $F$ is the cellular state transition function. Since the state transition of the cell $c_{i,j}$ is jointly determined by its own state $Q^{t-1}_{i,j}$ and its neighbor states $Q^{t-1}_{a,b}$, the state $Q^t_{i,j}$ can be described by
$Q^t_{i,j} = F(Q^{t-1}_{i,j}, Q^{t-1}_{a,b})$ (4)
With the effect of the state transition function and the transition rates among SNs, the dynamic transition relationship among the SNs of a cell can be summarized as system (5), which is the local transition system, subject to $1 \ge S^t_{i,j} > 0$, $1 > E^t_{i,j} \ge 0$, $1 > I^t_{i,j} \ge 0$, and $1 > R^t_{i,j} \ge 0$. Here, the parameter $m$ is defined as the communication connectivity between the cell $c_{i,j}$ and its neighbor $v_{a,b}$. Within the cell, the susceptible fraction $S^{t-1}_{i,j}$ enters states E and I at the fractions $\lambda S^{t-1}_{i,j} I^{t-1}_{i,j}$ and $\tau S^{t-1}_{i,j} I^{t-1}_{i,j}$, respectively, where $\lambda$ is the S-to-E rate and $\tau$ is the S-to-I rate of Figure 1. At the same time, through contact with the infected fractions $I^{t-1}_{a,b}$ of the neighboring cells, $S^{t-1}_{i,j}$ enters states E and I at fractions given by sums over the neighbor set weighted by $\lambda$ and $\tau$, respectively. The fraction $\sigma E^{t-1}_{i,j}$ of exposed SNs enters state I, and the fraction $\varepsilon I^{t-1}_{i,j}$ of infected SNs enters state R. The SNs in all states $S^{t-1}_{i,j}$, $E^{t-1}_{i,j}$, $I^{t-1}_{i,j}$, and $R^{t-1}_{i,j}$ enter state D at the rate $\mu$, which is also the fraction newly added to state S.
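The cell coordinates, the neighbor set, and the constraint of equation (2) can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the function names are hypothetical, the neighborhood is taken to be a Moore neighborhood of radius $r$, and cells outside the lattice are assumed to be dropped rather than wrapped around.

```python
from typing import List, Tuple


def moore_neighbors(i: int, j: int, L: int, r: int = 1) -> List[Tuple[int, int]]:
    """Return the coordinates (a, b) of the cells within radius r of cell (i, j).

    Coordinates run from 1 to L, as in the definition of the cell set.
    Assumption: cells beyond the lattice border are dropped (no wrap-around).
    """
    neighbors = []
    for a in range(max(1, i - r), min(L, i + r) + 1):
        for b in range(max(1, j - r), min(L, j + r) + 1):
            if (a, b) != (i, j):  # a cell is not its own neighbor
                neighbors.append((a, b))
    return neighbors


def is_valid_state(state: Tuple[float, float, float, float], tol: float = 1e-9) -> bool:
    """Check that (S, E, I, R) are fractions in [0, 1] that sum to 1, as in equation (2)."""
    return all(0.0 <= x <= 1.0 for x in state) and abs(sum(state) - 1.0) <= tol
```

With radius 1, an interior cell has eight neighbors, while a corner cell keeps only three under the no-wrap assumption.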

Stability analysis of the model MDBCA
System equilibrium points of the model MDBCA
A system equilibrium point is a state to which the system finally converges over time. As shown in system (5) above, the system reaches a stable state when the fractions $S^t_{i,j}$, $E^t_{i,j}$, $I^t_{i,j}$, and $R^t_{i,j}$ of the cell $c_{i,j}$ no longer change over time. The system equilibrium points can be divided into the malware-free equilibrium point and the endemic equilibrium point. At the malware-free equilibrium point, the fractions $I^t_{i,j}$ and $E^t_{i,j}$ converge to zero, that is, the SNs in states I and E are extinct. At the endemic equilibrium point, the fractions $I^t_{i,j}$ and $E^t_{i,j}$ are greater than 0, that is, SNs in states S, I, and E are all present. In other words, if the system converges to the malware-free equilibrium point, the malware in the WSNs will eventually die out, while if the system converges to the endemic equilibrium point, the malware will continue to diffuse, and the fractions $I^t_{i,j}$ and $E^t_{i,j}$ will eventually reach stable nonzero values.
Performing a Taylor expansion on system (5) and keeping only the first two terms, we obtain system (6). When $dS^{t-1}_{i,j}/dt = 0$, $dE^{t-1}_{i,j}/dt = 0$, $dI^{t-1}_{i,j}/dt = 0$, and $dR^{t-1}_{i,j}/dt = 0$, system (6) reaches an equilibrium solution. To simplify the calculation, a special case is considered: when the SNs are uniformly distributed in the cellular lattice, we have $N_{a,b}/N_{i,j} = 1$. Furthermore, we can analyze the reduced system $X = (S^t_{i,j}, I^t_{i,j}, E^t_{i,j})$ and compute $R^t_{i,j}$ from equation (2), which yields system (7). Solving system (7) directly, we identify two equilibrium points: $G_0 = (S_0, I_0, E_0, R_0) = (1, 0, 0, 0)$ and $G^* = (S^*, I^*, E^*, R^*)$. Here
$S^* = \dfrac{(\sigma + \mu)(\mu + \varepsilon)}{\lambda\sigma + \sigma\tau + \tau\mu}$ (8)
$E^* = \dfrac{\lambda\mu(\lambda\sigma + \sigma\tau - \sigma\mu + \tau\mu - \mu^2 - \sigma\varepsilon - \mu\varepsilon)}{(\lambda + \tau)(\sigma + \mu)(\lambda\sigma + \sigma\tau + \tau\mu)}$ (10)
From the viewpoint of epidemiology, the equilibrium point $G_0$ is called the malware-free equilibrium point, while $G^*$ is called the endemic equilibrium point. These equilibrium points can be used to analyze the dynamics of malware diffusion based on the model MDBCA in malware-infected WSNs.
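As a worked check, the closed forms above can be reproduced from a mean-field system. The form below is an assumption inferred from the transition rates of Figure 1 and from equations (8) and (10), with any neighbor contributions taken to be absorbed into the infection rates; it is a sketch and not necessarily the exact system (7). Here $\mu$ is the birth and death rate, $\lambda$ and $\tau$ are the S-to-E and S-to-I infection rates, $\sigma$ is the E-to-I rate, and $\varepsilon$ is the I-to-R rate.

```latex
% Assumed mean-field form (not taken verbatim from the article)
\[
\begin{aligned}
\frac{dS}{dt} &= \mu - (\lambda + \tau) S I - \mu S, &
\frac{dE}{dt} &= \lambda S I - (\sigma + \mu) E, \\
\frac{dI}{dt} &= \tau S I + \sigma E - (\varepsilon + \mu) I, &
\frac{dR}{dt} &= \varepsilon I - \mu R.
\end{aligned}
\]
% dE/dt = 0 gives E = \lambda S I / (\sigma + \mu); inserting this into dI/dt = 0 with I \neq 0:
\[
\tau S^* + \frac{\sigma \lambda S^*}{\sigma + \mu} = \varepsilon + \mu
\quad\Longrightarrow\quad
S^* = \frac{(\sigma + \mu)(\mu + \varepsilon)}{\lambda\sigma + \sigma\tau + \tau\mu}.
\]
% dS/dt = 0 and dR/dt = 0 then give the remaining components:
\[
I^* = \frac{\mu (1 - S^*)}{(\lambda + \tau) S^*}, \qquad
E^* = \frac{\lambda \mu (1 - S^*)}{(\lambda + \tau)(\sigma + \mu)}, \qquad
R^* = \frac{\varepsilon I^*}{\mu},
\]
\[
1 - S^* = \frac{\lambda\sigma + \sigma\tau - \sigma\mu + \tau\mu - \mu^2 - \sigma\varepsilon - \mu\varepsilon}
               {\lambda\sigma + \sigma\tau + \tau\mu}.
\]
```

Substituting the last line into the expression for $E^*$ reproduces equation (10) term by term, and the endemic point has positive components only when $S^* < 1$.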

Stability analysis of the malware-free equilibrium point of the model MDBCA
To verify the stability of system (7) at the malware-free equilibrium point $G_0$, we apply the next-generation matrix method to compute the basic reproduction number 36 $R_0$ of system (7) at $G_0$, because $R_0$ quantifies how many SNs are infected by one SN in state I.
If $R_0 < 1$, each infectious SN infects fewer than one SN on average, which means that the malware will die out. In other words, system (7) will be asymptotically stable at the malware-free equilibrium point $G_0$.
If $R_0 > 1$, each infectious SN infects more than one SN on average, meaning that the malware will continue to diffuse and there is an endemic equilibrium point. In other words, system (7) will be asymptotically stable at the endemic equilibrium point $G^*$.
In this article, the basic reproduction number $R_0$ of system (7) at the malware-free equilibrium point $G_0$ is computed with the next-generation matrix method.
Theorem 1. If $R_0 < 1$, the malware-free equilibrium point $G_0$ is locally asymptotically stable, whereas if $R_0 > 1$, the malware-free equilibrium point $G_0$ is unstable.
Proof. According to the stability theory of ordinary differential equations, an equilibrium point is locally asymptotically stable if all eigenvalues of the Jacobian matrix at that point have negative real parts. Expanding system (7) in a Taylor series at $G_0$ and ignoring the higher-order infinitesimals gives the linear approximation of system (7), whose coefficient matrix is the Jacobian matrix of system (7) at $G_0$. Its entries $J_{11}$, $J_{12}$, $J_{13}$, $J_{21}$, $J_{22}$, $J_{23}$, $J_{32}$, and $J_{33}$ are the partial derivatives of the right-hand sides of system (7) evaluated at the malware-free equilibrium point $G_0$, and the resulting matrix is denoted $J(G_0)$ in expression (14). The submatrix of matrix (15) and its determinant are then evaluated. When $R_0 < 1$, $\det(A_3) < 0$ and all the eigenvalues of matrix (15) are less than zero, so the malware-free equilibrium point $G_0$ is locally asymptotically stable. When $R_0 > 1$, some eigenvalues of matrix (15) are greater than zero; in other words, the malware-free equilibrium point $G_0$ is unstable.
Theorem 1 indicates that when $R_0 < 1$, the fractions of SNs in states S, I, E, and R converge to 1, 0, 0, and 0, respectively. That is to say, when $R_0 < 1$, the administrator need only keep implementing the current security measures, and the malware will eventually die out, regardless of the initial state fractions of SNs in the malware-infected WSNs.
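The threshold behavior stated in Theorem 1 can be checked numerically with a short sketch. It assumes a mean-field system of the form $dS/dt = \mu - (\lambda+\tau)SI - \mu S$, $dE/dt = \lambda SI - (\sigma+\mu)E$, $dI/dt = \tau SI + \sigma E - (\varepsilon+\mu)I$, which is inferred from the rates of Figure 1 and is not guaranteed to match system (7) exactly. Under that assumption, the next-generation matrix gives $R_0 = (\lambda\sigma + \sigma\tau + \tau\mu)/((\sigma+\mu)(\mu+\varepsilon))$, the reciprocal of $S^*$ in equation (8). The function names and parameter values are hypothetical.

```python
import cmath


def basic_reproduction_number(lam, tau, sigma, eps, mu):
    """Spectral radius of F V^{-1} for the assumed (E, I) infected subsystem."""
    return (lam * sigma + tau * (sigma + mu)) / ((sigma + mu) * (eps + mu))


def jacobian_eigenvalues_at_g0(lam, tau, sigma, eps, mu):
    """Eigenvalues of the assumed Jacobian at G0, with variables ordered (S, I, E).

    At G0 = (1, 0, 0) the Jacobian is block upper triangular:
        [[-mu, -(lam + tau), 0            ],
         [0,   tau - eps - mu, sigma      ],
         [0,   lam,           -(sigma+mu)]]
    so one eigenvalue is -mu and the other two come from the 2x2 (I, E) block.
    """
    a, b = tau - eps - mu, sigma
    c, d = lam, -(sigma + mu)
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4.0 * det)
    return [complex(-mu), (trace + root) / 2.0, (trace - root) / 2.0]


def is_locally_stable(eigenvalues):
    """Local asymptotic stability: every eigenvalue has a negative real part."""
    return all(ev.real < 0 for ev in eigenvalues)


if __name__ == "__main__":
    # Hypothetical parameter sets, chosen only to land on either side of R0 = 1.
    for params in ({"lam": 0.1, "tau": 0.05, "sigma": 0.2, "eps": 0.3, "mu": 0.1},
                   {"lam": 0.6, "tau": 0.4, "sigma": 0.2, "eps": 0.3, "mu": 0.1}):
        r0 = basic_reproduction_number(**params)
        stable = is_locally_stable(jacobian_eigenvalues_at_g0(**params))
        print(f"R0 = {r0:.3f}, G0 locally stable: {stable}")
```

In this assumed system the determinant of the 2x2 block equals $(\varepsilon+\mu)(\sigma+\mu)(1-R_0)$, which is why the sign change of one eigenvalue coincides with $R_0$ crossing 1.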

Simulations and verification
The experimental simulation and data analysis in this article were carried out with VC6.0 and MATLAB R2013a.
The key parameters of the experimental simulation are set as follows: 30,000 SNs are distributed in a two-dimensional square area with a size of $100 \times 100$, each square of the lattice is a cell, and all cells are fully connected with a communication radius of 1. The Moore neighborhood of each cell consists of its eight surrounding cells, and $m$ is set to 0.125. To validate Theorem 1, we design two cases: (1) the SNs are uniformly distributed in the cells and (2) the SNs are randomly distributed in the cells.
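The two deployment cases can be illustrated with a small Python sketch. It is only an assumption about how such a setup might be generated: "uniform" is read as every cell holding the same number of SNs, "random" as each SN being dropped into a cell chosen uniformly at random, and the function names and seed are hypothetical.

```python
import random
from typing import List


def distribute_uniform(total_sns: int, side: int) -> List[List[int]]:
    """Give every cell the same number of SNs (the total must divide evenly)."""
    cells = side * side
    if total_sns % cells != 0:
        raise ValueError("total_sns must be a multiple of side * side")
    per_cell = total_sns // cells
    return [[per_cell] * side for _ in range(side)]


def distribute_random(total_sns: int, side: int, seed: int = 0) -> List[List[int]]:
    """Drop each SN into a cell chosen uniformly at random (seeded for repeatability)."""
    rng = random.Random(seed)
    grid = [[0] * side for _ in range(side)]
    for _ in range(total_sns):
        grid[rng.randrange(side)][rng.randrange(side)] += 1
    return grid


if __name__ == "__main__":
    uniform = distribute_uniform(30_000, 100)
    scattered = distribute_random(30_000, 100)
    print("SNs per cell (uniform):", uniform[0][0])
    print("min/max SNs per cell (random):",
          min(map(min, scattered)), max(map(max, scattered)))
```

With 30,000 SNs on a $100 \times 100$ lattice, the uniform case places three SNs in every cell, while the random case lets the per-cell count vary around that average.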

Case 1: the SNs uniformly distributed in cells
Validating stability of the malware-free equilibrium point. In this section, the initial fraction $I^0_{i,j}$ is set to 0.01, 0.1, and 0.2, respectively.
Figure 2 shows the trends in the fraction of susceptible SNs under the condition of Theorem 1 for $I^0_{i,j} = 0.01$, $I^0_{i,j} = 0.1$, and $I^0_{i,j} = 0.2$. When $I^0_{i,j} = 0.01$, the fraction of susceptible SNs slowly decreases to 97% in the first five time steps and then slowly increases to a stable value of 100%. However, when $I^0_{i,j} = 0.1$ and $I^0_{i,j} = 0.2$, the fractions of susceptible SNs decrease to 80% and 68%, respectively, in the first 10 time steps and then gradually increase to a stable value of 100%.
Figure 3 shows the trends in the fraction of infectious SNs for the same three initial values. When $I^0_{i,j} = 0.01$, the fraction of infectious SNs slowly decreases to a stable value of 0. When $I^0_{i,j} = 0.1$ and $I^0_{i,j} = 0.2$, the fractions of infectious SNs gradually decrease to 2% in the first 20 time steps and then slowly decrease to a stable value of 0.
Figure 4 shows the trends in the fraction of exposed SNs for the same three initial values. When $I^0_{i,j} = 0.01$, the fraction of exposed SNs gradually increases to 0.2% in the first five time steps and then slowly decreases to a stable value of 0. When $I^0_{i,j} = 0.1$ and $I^0_{i,j} = 0.2$, the fractions of exposed SNs gradually increase to 1.05% and 1.7%, respectively, in the first five time steps and then gradually decrease to a stable value of 0.
Figure 5 shows the trends in the fraction of recovered SNs for the same three initial values. When $I^0_{i,j} = 0.01$, the fraction of recovered SNs slowly increases to 2% in the first 20 time steps and then slowly decreases to a stable value of 0. When $I^0_{i,j} = 0.1$ and $I^0_{i,j} = 0.2$, the fractions of recovered SNs gradually increase to 15% and 26%, respectively, in the first five time steps and then gradually decrease to a stable value of 0.
According to the simulation experiments shown in Figures 2-5, we conclude that when the SNs are uniformly distributed in the cells, the fractions of susceptible, infectious, exposed, and recovered SNs all converge to the malware-free equilibrium point $G_0$, regardless of the initial fraction of SNs in state I.

Case 2: the SNs randomly distributed in cells
Validating stability of the malware-free equilibrium point. In this section, the initial fraction $I^0_{i,j}$ is set to 0.01, 0.1, and 0.2, respectively.
Figure 6 shows the trends in the fraction of susceptible SNs under the condition of Theorem 1 for $I^0_{i,j} = 0.01$, $I^0_{i,j} = 0.1$, and $I^0_{i,j} = 0.2$. When $I^0_{i,j} = 0.01$, the fraction of susceptible SNs slowly decreases to 98% in the first five time steps and then slowly increases to a stable value of 100%. When $I^0_{i,j} = 0.1$ and $I^0_{i,j} = 0.2$, the fractions of susceptible SNs decrease to 81% and 68%, respectively, in the first nine time steps and then gradually increase to a stable value of 100%.
Figure 7 shows the trends in the fraction of infectious SNs for the same three initial values. When $I^0_{i,j} = 0.01$, the fraction of infectious SNs slowly decreases to a stable value of 0. When $I^0_{i,j} = 0.1$ and $I^0_{i,j} = 0.2$, the fractions of infectious SNs gradually decrease to 2% in the first 20 time steps and then slowly decrease to a stable value of 0.
Figure 8 shows the trends in the fraction of exposed SNs for the same three initial values. When $I^0_{i,j} = 0.01$, the fraction of exposed SNs gradually increases to 0.15% and then slowly decreases to a stable value of 0. When $I^0_{i,j} = 0.1$ and $I^0_{i,j} = 0.2$, the fractions of exposed SNs gradually increase to 1.05% and 1.7%, respectively, in the first 10 time steps and then gradually decrease to a stable value of 0.
Figure 9 shows the trends in the fraction of recovered SNs for the same three initial values. When $I^0_{i,j} = 0.01$, the fraction of recovered SNs slowly increases to 2% in the first 20 time steps and then slowly decreases to a stable value of 0. When $I^0_{i,j} = 0.1$ and $I^0_{i,j} = 0.2$, the fractions of recovered SNs gradually increase to 15% and 26%, respectively, in the first five time steps and then gradually decrease to a stable value of 0.
According to the simulation experiments shown in Figures 6-9, we conclude that when the SNs are randomly distributed in the cells, the fractions of SNs in the different states all converge to the malware-free equilibrium point $G_0$, regardless of the initial fraction of SNs in state I.
From these two cases, regardless of how the SNs are distributed in the cells (randomly or uniformly) and of the initial fraction of SNs in state I, the SNs in states E, I, and R die out over time. That is, when $R_0 < 1$, the fractions of susceptible, infectious, exposed, and recovered SNs all converge to the malware-free equilibrium point $G_0$. Therefore, Theorem 1 is verified.
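The qualitative behavior reported for both cases can be reproduced with a minimal forward-Euler sketch of a mean-field model. This is not the CA simulation of the article: it assumes the uniform case collapses to the system $dS/dt = \mu - (\lambda+\tau)SI - \mu S$, $dE/dt = \lambda SI - (\sigma+\mu)E$, $dI/dt = \tau SI + \sigma E - (\varepsilon+\mu)I$, $dR/dt = \varepsilon I - \mu R$, and the rate values, step size, and function name are hypothetical choices that give $R_0 < 1$.

```python
def simulate(i0, lam, tau, sigma, eps, mu, steps=2000, dt=0.1):
    """Integrate the assumed mean-field SEIR-type system with forward Euler.

    Starts from S = 1 - i0, E = 0, I = i0, R = 0 and returns the final (S, E, I, R).
    """
    s, e, i, r = 1.0 - i0, 0.0, i0, 0.0
    for _ in range(steps):
        contact = s * i  # susceptible-infected contact term
        ds = mu - (lam + tau) * contact - mu * s
        de = lam * contact - (sigma + mu) * e
        di = tau * contact + sigma * e - (eps + mu) * i
        dr = eps * i - mu * r
        s, e, i, r = s + dt * ds, e + dt * de, i + dt * di, r + dt * dr
    return s, e, i, r


if __name__ == "__main__":
    # Hypothetical rates; for this assumed system they give R0 of about 0.29.
    rates = {"lam": 0.1, "tau": 0.05, "sigma": 0.2, "eps": 0.3, "mu": 0.1}
    for i0 in (0.01, 0.1, 0.2):
        s, e, i, r = simulate(i0, **rates)
        print(f"I0 = {i0}: S = {s:.4f}, E = {e:.4f}, I = {i:.4f}, R = {r:.4f}")
```

For all three initial infected fractions, the trajectory of this sketch ends close to $(S, E, I, R) = (1, 0, 0, 0)$, which matches the convergence to $G_0$ described above.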

Conclusion
To discover the rules of malware diffusion in WSNs, this article proposes the model MDBCA, which is more appropriate for describing deployed WSNs and reflects the networking among SNs more realistically. First, after mapping the elements of SNs in the malware-infected WSNs to the model elements of CA, we obtain the differential equations that reflect the state dynamics of the SNs. Then, we obtain two equilibrium points of the model MDBCA: the malware-free equilibrium point and the endemic equilibrium point. Finally, we prove the stability of the malware-free equilibrium point using the basic reproduction number $R_0$. Numerical calculations and simulation experiments confirm the stability condition of the malware-free equilibrium point, which provides theoretical guidance for administrators to suppress malware diffusion in WSNs effectively.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work was supported by National Natural Science Foundation of China under grant No. 61772018.