Security performance analysis for cell-free massive multiple-input multiple-output system with multi-antenna access points deployment in presence of active eavesdropping

This article investigates the influence of multiple-antenna deployment at access points on physical layer security in cell-free massive multiple-input multiple-output systems. To embody the concept of network multiple-input multiple-output, cell-free massive multiple-input multiple-output employs a large number of geographically distributed access points, which jointly serve multiple users in the same time-frequency resource. Assuming a malicious active eavesdropper with knowledge of instantaneous channel state information, we derive an asymptotic closed-form expression of the achievable secrecy rate, which is used to evaluate network security performance. A distributed power allocation scheme is developed based on the channel fading of different users: every access point allocates a larger proportion of its local transmission power to the users with better channel conditions. We verify that multi-antenna access points outperform single-antenna ones in security performance. Compared with equal power allocation, the proposed power control algorithm further boosts the network security performance.


Introduction
In their pioneering papers, Telatar 1 and Foschini and Gans 2 of Bell Laboratories proposed the concept of the multiple-input multiple-output (MIMO) system, which provides spatial multiplexing and diversity gains by using multiple antennas at the transmitter and receiver. Thus, MIMO systems can significantly increase spectral efficiency and throughput without additional frequency, power, or time resources. As a scalable multiuser MIMO implementation, massive MIMO technology was introduced by Marzetta. 3,4 Each base station (BS) is equipped with a large array of antenna elements. Massive MIMO has become one of the key technologies for 5G networks, with a great capability of enhancing spectrum and energy efficiency as well as reducing transmission power and inter-cell interference through the deep mining of dimensional resources in the wireless communication environment. 5 According to the deployment of antenna arrays at the BS, massive MIMO systems fall into collocated and distributed systems. 6 The former is built with a compact large-scale antenna array at the BS, which has the advantages of low data sharing cost and low fronthaul requirements. The antennas of the latter are distributed over a wide area, which can provide uniformly good service for all authorized users as well as higher diversity against shadow fading. 7 Multicell distributed MIMO is a generalized concept that includes network MIMO, 8 coordinated multiple point (CoMP), 9 and distributed antenna systems (DASs). 10 In 2015, Nayebi et al. 11 first proposed the network architecture of cell-free massive MIMO, which has received widespread attention. The different structures of massive MIMO systems are illustrated in Figure 1. Chien et al. 12 investigated how to use power allocation to save load resources. To reduce the overhead of channel estimation, an aggregated channel estimation approach was introduced by Chien et al. 13 These studies have explored further performance advantages of cell-free massive MIMO.
With its dramatic array gain, massive MIMO has great potential in physical layer security (PLS), a topic that has developed in recent years. 14 Due to the stronger directivity of the signal beam, the massive MIMO system is naturally robust to passive eavesdropping. 15 To obtain more information, smart eavesdroppers usually act not only by overhearing but also by jamming. Time-division duplexing (TDD) mode has the advantage of making the forward and reverse channels of massive MIMO systems reciprocal. 16 It allows channel state information (CSI) to be acquired from uplink channel estimation alone, dispensing with downlink estimation. At the same time, it gives active eavesdroppers an opportunity to send spoofing pilot sequences, an attack known as a pilot spoofing attack. The attack decreases the accuracy of channel estimation and produces correlated precoding vectors between eavesdroppers and legitimate users, 17 resulting in severe information leakage. To mitigate the impact of active eavesdropping in massive MIMO systems, power control methods are commonly applied, such as ensuring the minimum transmit power required for reliable and secure communication, 18 minimizing the signal-to-interference-plus-noise ratio (SINR) of eavesdroppers, 19 and using game theory to maximize the SINR of legitimate users. 20 The security of the cell-free massive MIMO network under pilot spoofing attacks was first explored by Hoang et al., 21 who proposed a simple detection mechanism and designed power control schemes for secrecy rate maximization and power consumption minimization. To further address the threat of the active eavesdropper, Zhang et al. 22 analyzed the security performance of the multigroup multicasting scenario. In addition, a comparison of the lower bound on secrecy capacity between collocated and cell-free massive MIMO was presented by Timilsina et al. 23 The achievable rate of legitimate users and the rate leaked to active eavesdroppers were derived for both finite and infinite numbers of access points (APs). Research on security performance was expanded by introducing novel technologies. 24,25 Alageli et al. 24 studied secure communication assuming an untrusted dual-antenna energy harvester in the network: one antenna harvests energy, whereas the other intercepts information signals as an active eavesdropper. Elhoushy et al. 25 enhanced the robustness of network security against pilot spoofing attacks by applying a reconfigurable intelligent surface, which reconfigures the wireless propagation channels through software-controlled reflections. Under the more realistic consideration of low-resolution analog-to-digital converters, secure communication with active eavesdroppers was studied by Zhang et al. 26 However, the abovementioned research on PLS is all limited to single-antenna APs.
To the best of the authors' knowledge, few results exist for cell-free massive MIMO system models with multi-antenna APs. Yet channel hardening is easier to realize when each AP is equipped with 5 to 10 antennas. 27 With channel hardening, fading multi-antenna channels behave almost as deterministic scalar channels. This basic property simplifies resource allocation 28 and helps deliver higher performance 29 in the cellular massive MIMO system. Ngo et al. 7 showed that cell-free massive MIMO can also exploit channel hardening, but Chen and Bjoernson 30 showed that channel hardening cannot always be relied upon in the cell-free massive MIMO network. Studies on cell-free massive MIMO should therefore consider multi-antenna APs in order to rely on channel hardening. Vu et al. 31 proposed a novel full-duplex mode in which multiple multi-antenna APs provide uplink and downlink services for multiple single-antenna users in the same frequency band. Considering imperfect CSI, non-orthogonal pilots, and power control, Mai et al. 32 investigated the spectral efficiency of the cell-free massive MIMO system with respect to the number of antennas per AP and the number of users. Motivated by the aforementioned discussion, we explore the PLS of cell-free massive MIMO systems with multi-antenna APs in the presence of an active eavesdropper. First, closed-form expressions of the downlink achievable rate at the legitimate users and of the rate leaked to the wiretapper are derived. Second, we propose a distributed power control scheme that allocates power factors according to the estimated channel quality. This simple and feasible scheme performs power control locally at each AP. Moreover, the effect of the number of antennas per AP is analyzed through the achievable secrecy rate. Simulation results show that multi-antenna APs consistently outperform single-antenna APs in terms of achievable secrecy rate, and that dense AP deployment combined with the aforesaid power allocation scheme provides respectable security performance.
The remainder of this article is organized as follows. In the section "Channel and System Model," the system model, including the uplink channel estimation and downlink information transmission phases, is presented. In the section "Security Performance Analysis," we derive a set of general closed-form expressions for security performance analysis and present the downlink power allocation. This is followed by the section "Numerical Results and Discussion," where numerical results and discussion are presented. Finally, we provide our conclusion in the section "Conclusions."

Notations: Boldface letters denote vectors or matrices, for example, $\mathbf{x}$. $\mathbf{I}_N$ denotes the $N \times N$ identity matrix. The conjugate, transpose, Hermitian, and trace operators are indicated by $(\cdot)^{*}$, $(\cdot)^{T}$, $(\cdot)^{H}$, and $\mathrm{tr}(\cdot)$, respectively. $E\{\cdot\}$ and $\mathrm{Var}\{\cdot\}$ denote the expectation and variance of the random variable in brackets, respectively. $\mathbb{C}^{M \times N}$ stands for the $M \times N$ dimensional complex space. $|\cdot|$ and $\|\cdot\|$ represent the absolute value and the Euclidean norm, respectively. $\mathbf{x} \sim \mathcal{CN}(\bar{\mathbf{x}}, \boldsymbol{\Lambda})$ is a complex Gaussian random vector with mean vector $\bar{\mathbf{x}}$ and covariance matrix $\boldsymbol{\Lambda}$.

Channel model
We consider a cell-free massive MIMO system with M APs and K users in the presence of an active eavesdropper (Eve). Each AP is equipped with N antennas, and each user and Eve have a single antenna. All nodes are randomly located. The central processing unit (CPU) connects to every AP via backhaul, such as fiber, cable, or microwave, to perform joint signal processing, beamforming, and computing tasks. 6 Let us denote the channel response vector between the mth AP and the kth user as $\mathbf{g}_{mk} \in \mathbb{C}^{N \times 1}$, which can be expressed as

$$\mathbf{g}_{mk} = \beta_{mk}^{1/2}\,\mathbf{h}_{mk},$$

where $\beta_{mk}$ is the large-scale fading, including shadow fading and path loss. It is related to the distance between the mth AP and the kth user and can be considered a constant. 33 $\mathbf{h}_{mk} \in \mathbb{C}^{N \times 1}$ represents small-scale fading, which is the superposition of the signal components generated by reflection, scattering, and diffraction at various obstacles in the channel. In addition, we consider the Rayleigh fading channel, that is, the entries $[\mathbf{h}_{mk}]_i \sim \mathcal{CN}(0, 1)$ are independent and identically distributed (i.i.d.), where $[\mathbf{h}_{mk}]_i$ is the ith element of $\mathbf{h}_{mk}$, $i \in \{1, \ldots, N\}$.
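The channel model of this section can be illustrated with a short simulation. The following is a minimal sketch, assuming the channel vector is the product of the square root of the large-scale fading coefficient and an i.i.d. $\mathcal{CN}(0, 1)$ small-scale vector; the function names and the values of `beta_mk` and `N` are hypothetical and chosen only for illustration.

```python
import math
import random


def sample_cn01(rng):
    """Draw one CN(0, 1) sample: real and imaginary parts are N(0, 1/2)."""
    s = math.sqrt(0.5)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def sample_channel(beta_mk, n_antennas, rng):
    """Return g_mk = sqrt(beta_mk) * h_mk with i.i.d. CN(0, 1) entries in h_mk."""
    scale = math.sqrt(beta_mk)
    return [scale * sample_cn01(rng) for _ in range(n_antennas)]


rng = random.Random(0)
beta_mk = 0.25  # hypothetical large-scale fading coefficient
N = 4           # hypothetical number of antennas per AP
trials = 20000

# Average squared norm of the channel vector over many realizations.
mean_sq_norm = sum(
    sum(abs(x) ** 2 for x in sample_channel(beta_mk, N, rng))
    for _ in range(trials)
) / trials
print(mean_sq_norm)  # close to N * beta_mk
```

The empirical mean of the squared norm approaches $N\beta_{mk}$, which is the quantity that the large-scale fading fixes for each AP-user pair.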

System model
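The system model of the cell-free massive MIMO network is shown in Figure 2. We assume that the system operates in TDD mode. The length of a TDD coherence interval is $\tau_c$. Considering downlink secure transmission, each coherence interval consists of two phases: uplink channel estimation and downlink information transmission.
Uplink channel estimation: During the pilot training duration $\tau_{up}$, every legitimate user sends its own unique pilot sequence to all APs, and each AP processes the received mixture of orthogonal pilot sequences by multiplying it by the conjugate of each user's pilot sequence. The result is passed to an estimator, which can be chosen to meet different needs. In this article, we adopt the minimum mean square error (MMSE) estimation method, often referred to as the optimal estimate.
In detail, the system designs K vectors $\mathbf{p}_1, \mathbf{p}_2, \ldots, \mathbf{p}_K$ as standard and public pilot sequences. They satisfy $\mathbf{p}_i^H\mathbf{p}_j = 1$ for $i = j$ and $\mathbf{p}_i^H\mathbf{p}_j = 0$ for $i \neq j$. The kth user sends $\sqrt{\tau_{up}}\,\mathbf{p}_k \in \mathbb{C}^{\tau_{up} \times 1}$ to all APs, and Eve sends the same sequence as the targeted user. Without loss of generality, we assume that Eve intends to wiretap the confidential messages of the $k'$th user, that is, $\mathbf{p}_E = \mathbf{p}_{k'}$, and that the process is completely synchronous. Hence, the received pilot matrix at the mth AP is

$$\mathbf{Y}_{p,m} = \sqrt{\tau_{up}\,p_{ut}} \sum_{k=1}^{K} \mathbf{g}_{mk}\mathbf{p}_k^{T} + \sqrt{\tau_{up}\,p_{ue}}\,\mathbf{g}_{mE}\mathbf{p}_E^{T} + \mathbf{W}_{p,m},$$

where $p_{ut} \triangleq P_{ut}/\sigma^2$ and $p_{ue} \triangleq P_{ue}/\sigma^2$ are the normalized signal-to-noise ratios (SNRs) of each pilot symbol. $P_{ut}$ and $P_{ue}$ are the average uplink powers of each user and of Eve, respectively. $\mathbf{W}_{p,m} \in \mathbb{C}^{N \times \tau_{up}}$ is an additive white Gaussian noise (AWGN) matrix whose elements are i.i.d. $\mathcal{CN}(0, 1)$. Then, the received pilot matrix is projected onto $\mathbf{p}_k^{*}$ as

$$\mathbf{y}_{p,mk} = \mathbf{Y}_{p,m}\mathbf{p}_k^{*} = \sqrt{\tau_{up}\,p_{ut}}\,\mathbf{g}_{mk} + \sqrt{\tau_{up}\,p_{ue}}\,\mathbf{g}_{mE}\,\mathbf{p}_E^{T}\mathbf{p}_k^{*} + \mathbf{W}_{p,m}\mathbf{p}_k^{*},$$

where $\mathbf{p}_E^{T}\mathbf{p}_k^{*}$ equals 1 for $k = k'$ and 0 otherwise. The MMSE estimates $\hat{\mathbf{g}}_{mk}$ and $\hat{\mathbf{g}}_{mE}$ of $\mathbf{g}_{mk}$ and $\mathbf{g}_{mE}$ are computed locally at the mth AP from these projections. To facilitate later calculations, let us define $\boldsymbol{\gamma}_{mk} = E\{\hat{\mathbf{g}}_{mk}\hat{\mathbf{g}}_{mk}^{H}\}$ and $\boldsymbol{\gamma}_{mE} = E\{\hat{\mathbf{g}}_{mE}\hat{\mathbf{g}}_{mE}^{H}\}$. We can see that $\boldsymbol{\gamma}_{mk}$ is a diagonal matrix with identical elements, and so is $\boldsymbol{\gamma}_{mE}$. These two statistics describe the estimated large-scale fading at the APs. Assuming the antennas of each AP are uncorrelated, they are expressed directly and briefly as $\boldsymbol{\gamma}_{mk} = \lambda_{mk}\mathbf{I}_N$ and $\boldsymbol{\gamma}_{mE} = \lambda_{mE}\mathbf{I}_N$.
Downlink data transmission: During this phase, the mth AP transmits the normalized signal $s_{dk}$ intended for the kth user, with $E\{|s_{dk}|^2\} = 1$, and uses the uplink channel estimates to perform beamforming. The AP transmits the mixed signal for the K users as

$$\mathbf{s}_m = \sum_{k=1}^{K} \sqrt{p_{mk}}\,\mathbf{v}_{mk}\,s_{dk},$$

where $p_{mk}$ is the downlink power allocated by the mth AP to the kth user and $\mathbf{v}_{mk} \in \mathbb{C}^{N \times 1}$ represents the payload data precoding vector of the mth AP for the kth user. Maximum ratio transmission (MRT) is a simple linear precoding scheme that can provide nearly optimal performance. Therefore, the MRT scheme is used in this article, with $\mathbf{v}_{mk} = \hat{\mathbf{g}}_{mk}^{*}/\sqrt{\mathrm{tr}(\boldsymbol{\gamma}_{mk})}$. For the sake of simplicity, this normalization is chosen so that $E\{\|\mathbf{s}_m\|^2\} = \sum_{k=1}^{K} p_{mk}$. After that, the received signal at the kth user from all APs is given by

$$x_k = \sum_{m=1}^{M} \mathbf{g}_{mk}^{T}\mathbf{s}_m + w_{d,k} = \sum_{k'=1}^{K} \alpha_{kk'}\,s_{dk'} + w_{d,k},$$

where $\alpha_{kk'} = \sum_{m=1}^{M} \sqrt{p_{mk'}}\,\mathbf{g}_{mk}^{T}\mathbf{v}_{mk'}$ is the strength with which the kth user receives the message intended for the $k'$th user, and $w_{d,k} \sim \mathcal{CN}(0, \sigma^2)$ is AWGN.
As such, Eve receives the information signal through the corresponding effective gains $\alpha_{ek}$, which are defined in the same way with $\mathbf{g}_{mE}$ in place of $\mathbf{g}_{mk}$.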
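The effect of a pilot spoofing attack on the uplink estimate can be sketched numerically. The following is a minimal per-antenna sketch, assuming the textbook linear MMSE estimator for a projected pilot of the form "user channel plus Eve's channel plus unit-variance noise"; the coefficient formulas in `mmse_coefficients` are that standard result and are not copied from this article, and all parameter values are hypothetical.

```python
import math
import random


def cn(variance, rng):
    """Draw one zero-mean circularly symmetric complex Gaussian sample."""
    s = math.sqrt(variance / 2.0)
    return complex(rng.gauss(0.0, s), rng.gauss(0.0, s))


def mmse_coefficients(tau_up, p_ut, p_ue, beta_user, beta_eve, attacked):
    """Return (c, lam) for one AP-user pair.

    c scales the projected pilot into the estimate; lam is the per-antenna
    variance of that estimate. `attacked` marks the user whose pilot Eve copies.
    """
    spoof = tau_up * p_ue * beta_eve if attacked else 0.0
    denom = tau_up * p_ut * beta_user + spoof + 1.0  # unit-variance noise
    c = math.sqrt(tau_up * p_ut) * beta_user / denom
    lam = tau_up * p_ut * beta_user ** 2 / denom
    return c, lam


def empirical_estimate_power(tau_up, p_ut, p_ue, beta_user, beta_eve, trials, rng):
    """Monte Carlo mean of |g_hat|^2 for the attacked user (one antenna)."""
    c, _ = mmse_coefficients(tau_up, p_ut, p_ue, beta_user, beta_eve, True)
    total = 0.0
    for _ in range(trials):
        g_user = cn(beta_user, rng)
        g_eve = cn(beta_eve, rng)
        # Projected pilot: user term, spoofed term from Eve, and noise.
        y = (math.sqrt(tau_up * p_ut) * g_user
             + math.sqrt(tau_up * p_ue) * g_eve
             + cn(1.0, rng))
        total += abs(c * y) ** 2
    return total / trials


# Hypothetical parameter values, chosen only for illustration.
params = dict(tau_up=10, p_ut=1.0, p_ue=2.0, beta_user=0.5, beta_eve=0.3)
_, lam_clean = mmse_coefficients(attacked=False, **params)
_, lam_attacked = mmse_coefficients(attacked=True, **params)
lam_empirical = empirical_estimate_power(trials=20000, rng=random.Random(2), **params)
print(lam_clean, lam_attacked, lam_empirical)
```

Under these assumptions, the estimate variance of the attacked user is smaller than that of a user whose pilot is not copied, and the Monte Carlo average agrees with the analytical value.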

Security performance analysis
In this section, we analyze the security performance in terms of the achievable secrecy rate, which is defined as the difference between the information rate of the legitimate user and that of Eve. To this end, the achievable rate of legitimate users and the information rate leaked to Eve are derived first.

Achievable rate of legitimate users
On the basis of information theory, the achievable rate of the kth user can be expressed in terms of $\gamma_k$, the SINR of the kth user. Because the deployment of multi-antenna APs makes channel hardening available, legitimate users can rely on statistical CSI without downlink training. 27,30 The achievable rate of legitimate users can be regarded as a lower bound on the mutual information between $s_{dk}$ and $x_k$. Based on Hoang et al., 21 and considering that the effective noise terms are pairwise uncorrelated, we obtain the lower-bound SINR of the kth user, whose numerator is $|E\{\alpha_{kk}\}|^2$ and whose denominator collects $\mathrm{Var}\{\alpha_{kk}\}$, the interference from the signals intended for other users, and the noise. As the kth user only has statistical CSI, $E\{\alpha_{kk}\}$ is substituted for $\alpha_{kk}$ as the strength of the desired signal $s_{dk}$. The term $\alpha_{kk} - E\{\alpha_{kk}\}$ is the deviation of the desired signal strength caused by the inaccuracy of the statistical CSI, which is the source of $\mathrm{Var}\{\alpha_{kk}\}$ in the denominator.
Theorem 1. When an active eavesdropper conducts a pilot spoofing attack, and all APs are configured with multiple antennas and conjugate beamforming, the closed-form expression of the kth user's SINR is given by equation (14), which is obtained using Lemma 1, a result that Qiu et al. 34 did not treat completely.
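The channel hardening property invoked above can be checked with a small Monte Carlo experiment. This is a minimal sketch, assuming i.i.d. $\mathcal{CN}(0, 1)$ small-scale fading at a single AP; it only illustrates that the normalized channel gain $\|\mathbf{h}\|^2/N$ fluctuates less as N grows and does not reproduce the closed-form SINR of Theorem 1. The function name and trial count are hypothetical.

```python
import math
import random


def normalized_gain_variance(n_antennas, trials, rng):
    """Estimate Var(||h||^2 / N) for h with i.i.d. CN(0, 1) entries."""
    s = math.sqrt(0.5)
    samples = []
    for _ in range(trials):
        # |h_i|^2 is the sum of two squared N(0, 1/2) variables.
        power = sum(rng.gauss(0.0, s) ** 2 + rng.gauss(0.0, s) ** 2
                    for _ in range(n_antennas))
        samples.append(power / n_antennas)
    mean = sum(samples) / trials
    return sum((x - mean) ** 2 for x in samples) / trials


rng = random.Random(1)
hardening = {n: normalized_gain_variance(n, 20000, rng) for n in (1, 5, 8)}
for n, var in hardening.items():
    print(n, round(var, 3))  # each value is close to 1 / n
```

Since each $|h_i|^2$ has unit mean and unit variance, the variance of the normalized gain is $1/N$, so the effective channel becomes more nearly deterministic as antennas are added at the AP.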

Information rate leaked into Eve
In this article, we focus on the worst-case scenario in terms of network security: Eve has knowledge of the instantaneous channel gain and therefore attains a genie-aided rate. Applying Jensen's inequality, we obtain an upper bound on the information rate of the eavesdropper, which is determined by Eve's SINR.
Theorem 2. The closed-form expression of the active eavesdropper's SINR when every AP has multiple antennas and adopts conjugate beamforming is given by equation (16).
Proof: See Appendix 2.

Achievable secrecy rate
The above analysis lays the foundation for the derivation of the lower-bound secrecy rate of the $k'$th user. Following Wu et al. 14 and Jose et al., 16 the achievable secrecy rate can be expressed as

$$R_{sec} = [R_{k'} - R_E]^{+},$$

where $R_{k'}$ is the achievable rate of the $k'$th user, $R_E$ is the information rate leaked to Eve, and $[x]^{+} = \max(x, 0)$. We consider the worst-case secrecy performance of the network, which is of utmost importance. Under any other conditions, such as legitimate users having perfect CSI or Eve being unable to acquire instantaneous CSI, the secrecy performance would naturally improve.
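The secrecy-rate definition above reduces to a few lines of code. The following is a minimal sketch, assuming each rate has the Shannon form $\log_2(1 + \text{SINR})$ and ignoring any pre-log factor for pilot overhead that the original expressions may include; the function names and the SINR values are hypothetical.

```python
import math


def achievable_rate(sinr):
    """Rate in bit/s/Hz for a given linear-scale SINR."""
    return math.log2(1.0 + sinr)


def secrecy_rate(sinr_user, sinr_eve):
    """Secrecy rate [R_user - R_eve]^+ with [x]^+ = max(x, 0)."""
    return max(achievable_rate(sinr_user) - achievable_rate(sinr_eve), 0.0)


rate_secure = secrecy_rate(15.0, 3.0)  # user SINR exceeds Eve's SINR
rate_leaky = secrecy_rate(1.0, 3.0)    # Eve's SINR exceeds the user's SINR
print(rate_secure, rate_leaky)
```

With a user SINR of 15 and an eavesdropper SINR of 3, the difference is $\log_2 16 - \log_2 4 = 2$ bit/s/Hz; when Eve's SINR is the larger one, the $[\cdot]^{+}$ operator clips the secrecy rate to zero.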

Power allocation scheme
In this subsection, we present a distributed power allocation (DPA) scheme, which is simple to implement for cell-free massive MIMO. Most power control algorithms optimize a certain utility metric and depend on the powerful computing capability of the CPU, a huge backhaul capacity, and perfect calibration of the hardware chains. Backhaul overhead is defined as the execution cost of computation tasks among cooperating APs. 35 As distributed antenna networks increase in size and density, it becomes urgent to reduce backhaul overhead. 36 When the backhaul load is heavily limited, an alternative approach is to conduct power control locally at the APs. Assuming that the network does not detect the presence of active eavesdropping, each AP divides its power according to the channel strength of the different users. In this scenario, the influence of the unauthorized access of the active eavesdropper on the network can be shown more intuitively and credibly. In our power allocation scheme, channel strength is measured on the basis of $\boldsymbol{\gamma}_{mk} = E\{|\hat{\mathbf{g}}_{mk}|^2\} = \lambda_{mk}\mathbf{I}_N$, and the downlink transmission power allocated by the mth AP to the kth user is determined from these statistics and from $P_{dt}$, the total transmission power of each AP.
For power control in collocated massive MIMO systems, it is traditional to adopt the equal power allocation (EPA) scheme. Because the antennas at the BS are so close to one another, they are all regarded as being set up in the same place, and the channel quality between each antenna and the users is balanced. In cell-free massive MIMO systems, by contrast, the CPU supports a distributed scenario in which the APs have widely divergent locations. If a certain AP is relatively far from some users, or suffers serious interference, it is sensible to make the corresponding power control factors smaller. In consideration of the above, DPA is more appropriate in light of the traits of the distributed network. It is also easy to implement once each AP has estimated the CSI.
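The contrast between DPA and EPA at a single AP can be sketched as follows. This is a minimal sketch under a loudly stated assumption: the exact DPA formula is not reproduced here, so the code uses a simple proportional rule in which each user's share of the AP power equals its channel-strength statistic divided by the sum over all users. The function names, the statistics in `lam_m`, and the power budget are hypothetical.

```python
def distributed_power_allocation(lams, p_dt):
    """Split the AP power p_dt across users in proportion to lams (assumed rule)."""
    total = sum(lams)
    if total <= 0.0:
        raise ValueError("channel-strength statistics must have a positive sum")
    return [p_dt * lam / total for lam in lams]


def equal_power_allocation(num_users, p_dt):
    """Split the AP power p_dt equally across all users."""
    return [p_dt / num_users] * num_users


lam_m = [0.4, 0.1, 0.5]  # hypothetical channel-strength statistics at one AP
P_dt = 0.2               # hypothetical total transmission power of the AP, in W

dpa = distributed_power_allocation(lam_m, P_dt)
epa = equal_power_allocation(len(lam_m), P_dt)
print(dpa)
print(epa)
```

Both rules spend the same total power; under the proportional rule the user with the strongest estimated channel receives the largest share, and every AP can compute its own allocation from local estimates without involving the CPU.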

Numerical results and discussions
In this section, we present numerical results on the security performance of the multi-antenna cell-free massive MIMO network in the presence of an active eavesdropper. Unless otherwise noted, system parameters are set as shown in Table 1. The large-scale fading coefficients combine path loss and shadow fading, where $z_{mk} \sim \mathcal{N}(0, 1)$ and $z_{mE} \sim \mathcal{N}(0, 1)$ denote the normalized shadow fading, and $PL_{mk}$ and $PL_{mE}$ represent the path loss, which depends on the distance and on

$$L \triangleq 46.3 + 33.9\log_{10}(f) - 13.82\log_{10}(h_{AP}) - (1.1\log_{10}(f) - 0.7)\,h_{UE} + (1.56\log_{10}(f) - 0.8).$$

Here $PL(d)$ is the generalized notation in dB for $PL_{mk}$ or $PL_{mE}$, and $d \equiv d_{mk}$ (or $d \equiv d_{mE}$) represents the distance between the mth AP and the kth user (or Eve). The distance thresholds are $d_0 = 10$ m and $d_1 = 50$ m. Moreover, the noise power is computed using the Boltzmann constant $k_B = 1.38 \times 10^{-23}$ J/K and a noise figure of 9 dB.
Figure 3 shows the cumulative distribution function (CDF) of the achievable secrecy rate of the $k'$th user. We focus on the performance gain brought by multi-antenna AP deployment. The case N = 1 is the special case of single-antenna APs considered in prior studies. Under EPA, considered for simplicity, the achievable secrecy rate is almost zero however many APs are deployed and whether they have a single antenna or multiple antennas. With DPA, the achievable secrecy rate increases monotonically with the number of antennas per AP. This can be explained by the fact that as N rises, the legitimate channels between APs and users become more favorable. The 95%-likely performance 7 is the level that is achieved with 95% probability. Zooming in on the lower segment of the plot in Figure 3, the secrecy rate of the $k'$th user when each AP has 5 or 8 antennas is about 2 bit/s/Hz, which is twice that of the single-antenna case. This also supports the conclusion of Irmer et al. 9 that multi-antenna APs can benefit from channel hardening.
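The large-scale fading and noise-power setup described in this section can be sketched as follows. The piecewise path-loss expression is an assumption: the code uses the three-slope model of Ngo et al. 7, which matches the thresholds of 10 m and 50 m and the expression for L given above, with distances converted to kilometers as in that model. The carrier frequency, antenna heights, shadowing standard deviation, bandwidth, and noise temperature are hypothetical placeholders, since Table 1 is not reproduced here.

```python
import math

BOLTZMANN = 1.38e-23  # J/K, value quoted in the text


def reference_loss_db(f_mhz, h_ap_m, h_ue_m):
    """Constant L (in dB) from the expression given in the text."""
    log_f = math.log10(f_mhz)
    return (46.3 + 33.9 * log_f - 13.82 * math.log10(h_ap_m)
            - (1.1 * log_f - 0.7) * h_ue_m + (1.56 * log_f - 0.8))


def path_loss_db(d_m, loss_db, d0_m=10.0, d1_m=50.0):
    """Assumed three-slope path loss in dB; distances are converted to km."""
    d, d0, d1 = d_m / 1000.0, d0_m / 1000.0, d1_m / 1000.0
    if d > d1:
        return -loss_db - 35.0 * math.log10(d)
    if d > d0:
        return -loss_db - 15.0 * math.log10(d1) - 20.0 * math.log10(d)
    return -loss_db - 15.0 * math.log10(d1) - 20.0 * math.log10(d0)


def large_scale_fading(pl_db, sigma_sh_db, z):
    """Linear-scale coefficient from path loss and log-normal shadowing z ~ N(0, 1)."""
    return 10.0 ** ((pl_db + sigma_sh_db * z) / 10.0)


def noise_power_w(bandwidth_hz, temperature_k, noise_figure_db):
    """Thermal noise power in W: bandwidth * k_B * temperature * noise figure."""
    return bandwidth_hz * BOLTZMANN * temperature_k * 10.0 ** (noise_figure_db / 10.0)


# Hypothetical values: 1900 MHz carrier, 15 m AP height, 1.65 m user height.
L_db = reference_loss_db(1900.0, 15.0, 1.65)
pl_near = path_loss_db(5.0, L_db)
pl_mid = path_loss_db(30.0, L_db)
pl_far = path_loss_db(200.0, L_db)
beta_far = large_scale_fading(pl_far, sigma_sh_db=8.0, z=0.0)
# Hypothetical 20 MHz bandwidth and 290 K; the 9 dB noise figure is from the text.
noise_w = noise_power_w(20e6, 290.0, 9.0)
print(L_db, pl_near, pl_mid, pl_far, beta_far, noise_w)
```

The three branches meet at the two thresholds, so the assumed path loss is continuous in distance, and it stays flat for distances below the smaller threshold.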
Figure 4 shows the direct influence of the number of antennas per AP. As the number of antennas per AP increases from 1 to 10, the achievable secrecy rate with DPA grows accordingly. With EPA, neither the number of antennas per AP nor the number of legitimate users within the supported range, that is, $K \le \tau_{up}$, has a perceptible effect on the achievable secrecy rate. By applying DPA, however, the performance improvement is substantial, even when the number of legitimate users reaches its limit, that is, $K = \tau_{up}$, which corresponds to the maximum interference from other legitimate users. From the growth trend, we can see the large gain in achievable secrecy rate obtained when moving from single-antenna APs to the multi-antenna configuration. This is also the most challenging step in network construction. In theory, random matrix results are used to derive the expected values, which simplifies the analysis. In practical applications, multi-antenna APs increase the multiplexing gain with moderate physical size.
In Figure 5, the achievable secrecy rate is plotted as a function of the spoofing pilot power of Eve. If Eve does not transmit training pilots, that is, $P_{ue} = 0$ W, the attack degenerates into passive eavesdropping. With EPA, the achievable secrecy rate declines significantly from the passive case to the active case. With our proposed DPA scheme, the performance gap is much smaller, which indicates that power control is an effective method to restrain the harm of a pilot spoofing attack. When Eve sends the training pilots with larger power, the pilot spoofing attack becomes more serious, yet the damage to the targeted user is not obvious. The reason is that DPA distributes the transmission power of each AP to the legitimate users based on the channel conditions, with $\boldsymbol{\gamma}_{mk}$ used as the performance metric. For the $k'$th user in particular, $\boldsymbol{\gamma}_{mk'}$ is affected by the uplink power of Eve, $P_{ue}$. If the spoofing pilot power of Eve increases, the channel estimate of the wiretapped user becomes more inaccurate. This leads to a weaker estimated channel $\hat{\mathbf{g}}_{mk'}$ and weaker channel statistics $\boldsymbol{\gamma}_{mk'}$, which makes the downlink transmission power received by Eve smaller. Therefore, DPA can exploit power control against an Eve with strong spoofing pilot power.
Specifically, the achievable secrecy rate with single-antenna APs ranges from 1.769 bit/s/Hz at $P_{ue} = 0.1$ W to 1.675 bit/s/Hz at $P_{ue} = 1$ W. When the number of antennas per AP grows to 5, the achievable secrecy rate is about 3.311 and 3.234 bit/s/Hz as the pilot power of Eve increases tenfold from 0.1 W. In the same case with N = 8, the achievable secrecy rate shows a slight downward trend from 3.996 to 3.862 bit/s/Hz. On the one hand, the introduction of multi-antenna APs can eliminate some negative effects in the worst-case scenario in which Eve has strong power to send pilots, a scenario in which the beamforming vector is inclined further toward the direction of Eve. On the other hand, the power allocation scheme is of great assistance in enhancing security performance. Each AP considers only the channel conditions of legitimate users when distributing its local power, so Eve has no chance to change the received signal strength. During the downlink transmission phase, Eve acts as a passive eavesdropper. The DPA scheme transfers the difficult assignment from the CPU to the AP nodes. It reduces the workload of the cell-free massive MIMO system and weakens Eve's position.
Figure 6 depicts the achievable secrecy rate as a function of M for various numbers of antennas per AP. Under the worst-case security assumption, the network is inherently biased, since Eve has more knowledge of the CSI than the legitimate users. With EPA, that is, without power control, the security performance is poor whatever the number of antennas per AP or the number of APs. With our proposed DPA scheme, we demonstrate that a higher level of security performance can be gained from the services provided by more APs. This also takes advantage of the typical narrow beam orientation of massive MIMO systems. Because the curves are sampled at intervals of five points, they are not very smooth. We can still find that multi-antenna AP deployment is a bigger contributor to security performance improvement. When N = 1, the achievable secrecy rate climbs from 0.950 to 2.143 bit/s/Hz. There is a dramatic rise of the achievable secrecy rate in the cases of N = 5 and N = 8, from 2.323 to 4.113 bit/s/Hz and from 2.897 to 4.630 bit/s/Hz between M = 20 and M = 100, respectively. Although densely and randomly deployed AP nodes in cell-free massive MIMO systems are necessarily geographically closer to Eve, the user-centric concept can lead to significantly better security performance for legitimate users. Besides, in Figures 5 and 6, the simulation results agree with the theoretical values, which verifies the correctness of the secrecy performance analysis.

Conclusion
In this article, we have studied a cell-free massive MIMO network in the presence of an active eavesdropper, a smart attacker that imitates a legitimate user's training pilot and sends the same sequence to all APs. Previous research has shown that channel hardening exists when multiple antennas are configured at the APs. Therefore, it is imperative to consider the scenario of multi-antenna APs in cell-free massive MIMO networks, and this is also the first work to focus on security with an arbitrary finite number of antennas deployed at the APs. Taking the MMSE estimation method and conjugate beamforming into account, we derive general closed-form expressions of the achievable rate at the legitimate user nodes, the information rate leaked to the eavesdropper, and the achievable secrecy rate. Moreover, a DPA scheme is proposed and contrasted with EPA. This power control algorithm could be used as a prototype for the user-centric network. Taking the achievable secrecy rate as the security performance metric in simulations, our results reveal that employing multi-antenna APs is highly effective in enhancing the security of wireless communication networks. This also corresponds with the conditions for channel hardening mentioned above.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Appendix 1
Proof of equation (14)
By substituting the concrete expression of $\alpha_{kk'}$ into equation (14), we can obtain equation (24). Initially, we reduce $\mathbf{v}_{mk}$ to the closed form $\hat{\mathbf{g}}_{mk}^{*}/\sqrt{\mathrm{tr}(\boldsymbol{\gamma}_{mk})}$ through a sequence of steps, where (a) is achieved by $\mathbf{g}_{mk} = \hat{\mathbf{g}}_{mk} + \tilde{\mathbf{g}}_{mk}$, with $\tilde{\mathbf{g}}_{mk}$ being the channel estimation error between the mth AP and the kth user, and (b) is the result of the independence between $\hat{\mathbf{g}}_{mk}$ and $\tilde{\mathbf{g}}_{mk}$, which is a property of the MMSE estimator. It implies that $\tilde{\mathbf{g}}_{mk} \sim \mathcal{CN}(\mathbf{0}, \beta_{mk}\mathbf{I}_N - \boldsymbol{\gamma}_{mk})$.
Furthermore, in order to obtain the explicit expression of the first denominator term in equation (24), we continue the calculation, where step (a) is acquired based on Lemma 1. The second denominator term in equation (24) is calculated for the case $k' \neq k$. Substituting equations (26), (27), and (28) into equation (24), we can obtain equation (14).

Appendix 2
Proof of equation (16)
We first rewrite the SINR of Eve as equation (29). Then, the numerator term of equation (29) is evaluated in equation (30). Third, we address the residual non-closed part of equation (29) in equation (31). Last, using equations (30) and (31), the closed-form expression in equation (16) is readily obtained.