Secure performance analysis and pilot spoofing attack detection in cell-free massive MIMO systems with finite-resolution ADCs

This article investigates secure communication in a cell-free massive multiple-input multiple-output system with low-resolution analog-to-digital converters in the presence of an active eavesdropper. Specifically, it studies how analog-to-digital converter imperfections degrade the accuracy of channel estimation and the secure transmission performance. The additive quantization noise model is used to analyze the impact of the low-resolution analog-to-digital converters. The minimum mean square error channel estimation results show that the coarse analog-to-digital converters cause a nonzero error floor. Closed-form expressions are then derived for both the legitimate user's achievable ergodic rate and the information leakage to the eavesdropper. Moreover, a tight approximate expression for the ergodic secrecy rate is presented in terms of the analog-to-digital converter quantization bits, the number of antennas, the pilot power, and so on. To mitigate the impact of the pilot spoofing attack, an active attack detection approach based on random matrix theory is proposed, which needs to be operated at only one access point. Simulation results corroborate the analytical results, illustrate the impact of various parameters on the system secrecy performance, and confirm the superiority of the proposed active attack detection method.


Introduction
Massive multiple-input multiple-output (MIMO) is regarded as a promising wireless access technology for 5G and beyond-5G (B5G) wireless communication, since it can provide high spectral efficiency and high energy efficiency with simple signal processing. 1,2 Cellular massive MIMO equips a single base station (BS) with a large number of antennas and has been analyzed widely and deeply in various respects, such as performance analysis, beamforming, power control, and secure communication. [2][3][4] However, the performance of such a cellular system is inevitably limited by intercell interference and channel correlation. One practical alternative is to install several antenna sets at distributed physical locations, which is also called distributed massive MIMO. 5,6 As a scalable distributed massive MIMO concept, cell-free massive MIMO has been proved to be a promising wireless network architecture that can provide coherent service through geographically distributed access points (APs) for future wireless communication systems. 7 Since its proposal, various works have investigated performance evaluation and transmission schemes for cell-free massive MIMO. [8][9][10]
Author affiliation: The 63rd Research Institute, National University of Defense Technology, Nanjing, China
Due to the broadcast nature of wireless channels, facilitating secure and reliable wireless transmission is a critical topic for future wireless communication systems. [11][12][13] Despite the robustness of massive MIMO against passive eavesdropping, active eavesdropping attacks are the more relevant threat. 14,15 Currently, most of the related literature focuses merely on secure transmission in co-located massive MIMO systems, [16][17][18] while secrecy provisions for cell-free massive MIMO have not been well explored. Hoang et al. 19 discussed the secrecy performance and power allocation for cell-free massive MIMO with an active eavesdropper. The effect of hardware impairments on a secure cell-free massive MIMO system under an active spoofing attack has been investigated in Zhang et al. 20 However, all the aforementioned works assume high-resolution analog-to-digital converters (ADCs) in massive MIMO systems. To reduce energy consumption and physical hardware costs, implementing finite-resolution ADCs in massive MIMO systems has been regarded as an effective approach. Several researchers have recently devoted attention to the implementation of finite-resolution ADCs in cell-free massive MIMO systems. [21][22][23] Besides, secure co-located massive MIMO with finite-resolution ADCs has been studied in Teeti. 24 Nevertheless, the impact of ADC quality on secrecy performance in cell-free massive MIMO has not been considered in existing research.
In a time-division duplex (TDD) massive MIMO system, the BS acquires the channel state information (CSI) of the legitimate users via uplink training, based on the reciprocity of the downlink and uplink channels. All legitimate users are required to send predesigned and assigned pilot (training) signals to the BS in the uplink training phase. Since all pilot signals are repeatedly used and publicly known by the BS and all users, a smart malicious eavesdropper can easily learn the target user's pilot signal and transmit a pilot signal identical to that of the legitimate target user, which is called a spoofing attack. 25 Through the spoofing attack, the adversary can manipulate the CSI estimate, which becomes a weighted sum of the target legitimate user's channel and the eavesdropper's channel. Consequently, beamforming based on the estimated CSI during data transmission leaks information to the eavesdropper. Existing literature shows that massive MIMO is robust against passive eavesdropping, while its secrecy performance is dramatically degraded by active attacks. 14,15,26 Thus, detecting the active eavesdropper is an effective countermeasure for weakening the influence of active spoofing attacks. Kapetanović et al. 27 proposed a detection scheme based on random training pilots for massive MIMO systems. However, this method requires special pilot sequences to perform the detection. Different schemes for active attack detection, which are operated at different locations and based on different system parameters, have been presented in Kapetanović et al. 28 It is noted that precoder normalization is crucially important for these active eavesdropper detection schemes. Moreover, a simple detection protocol based on the statistical properties of the channel estimate has been provided in Al-Nahari, 15 whose performance depends on the correctness of the channel estimation. Besides, the minimum description length (MDL) source enumeration algorithm has been utilized for active eavesdropper detection in Tugnait, 29,30 which can greatly improve the detection performance using short pilot sequences. According to Kritchman and Nadler 31 and Tugnait, 32 the random matrix theory (RMT) detection approach can achieve more precise results than the MDL algorithm and has been proved to be highly effective in traditional MIMO systems.
Motivated by the aforementioned observations, this article presents the first study of the security problem in cell-free massive MIMO systems with finite-resolution ADCs against an active eavesdropper. Furthermore, a tight asymptotic lower bound on the achievable secrecy rate is provided to reveal the effects of the ADC resolution, the number of APs, the number of antennas at each AP, and so on. Also, an active eavesdropping detection algorithm based on RMT is presented. The proposed detection algorithm can be operated at a single AP and needs only a simple procedure. The specific contributions of this article can be summarized as follows:
To capture the aggregate impact of low-resolution ADCs, the additive quantization noise model (AQNM) is utilized to analyze the performance of the secure cell-free massive MIMO system. Specifically, this article performs channel estimation via uplink training under the minimum mean square error (MMSE) criterion. The results show that low-quality ADCs lead to a nonzero floor on the channel estimation error. Also, a higher pilot power at the eavesdropper causes greater degradation of the channel estimate.
Downlink secure communication is considered with low-resolution ADCs. Based on the AQNM and the imperfect CSI, a closed-form expression for the secrecy rate is derived in terms of key system parameters, including the quantization bits of the ADCs, the total number of antennas, and the pilot and data transmission powers. These results provide an effective tool for quantitatively analyzing the impact of system parameters on the secrecy performance.
To weaken the impact of active eavesdropping, an active attack detection scheme based on pilot self-contamination is proposed, which needs to be operated at only one AP. The detection is performed by determining the rank of the signal subspace, which is estimated by the RMT source enumeration approach. The accuracy and effectiveness of the proposed detection approach are illustrated via simulations.
The remainder of the article is organized as follows. Section "System model" introduces the considered secure cell-free massive MIMO system model. Section "Uplink training" details the procedure of the uplink MMSE channel estimation. Section "Secrecy performance analysis" derives the achievable secrecy rate of the considered system. The RMT-based active eavesdropping detection is discussed in section "RMT-based pilot spoofing attack detection." Simulation results are presented in section "Numerical results." Finally, the article is concluded in section "Conclusion."

Notation
Throughout this article, boldface lowercase and uppercase letters represent vectors and matrices, for example, $\mathbf{y}$ and $\mathbf{Y}$. The superscripts $(\cdot)^{*}$, $(\cdot)^{T}$, and $(\cdot)^{H}$ stand for the conjugate, transpose, and Hermitian transpose operations, respectively. In addition, the Euclidean norm of a vector $\mathbf{g}$ and the absolute value of a scalar $g$ are denoted by $\|\mathbf{g}\|$ and $|g|$, respectively. $\mathbb{E}\{\cdot\}$, $\mathrm{Var}\{\cdot\}$, $\mathrm{rank}\{\cdot\}$, and $\mathrm{Eig}\{\cdot\}$ represent the expectation, variance, rank, and eigenvalue operations, respectively. $\mathbf{x} \sim \mathcal{CN}(\mathbf{m}, \mathbf{A})$ stands for a complex Gaussian random vector $\mathbf{x}$ with mean $\mathbf{m}$ and covariance matrix $\mathbf{A}$. $\mathbf{I}_M$ and $\mathbb{C}^{M \times N}$ denote the $M$-dimensional identity matrix and the $M \times N$-dimensional complex space, respectively. Besides, independent and identically distributed is abbreviated as i.i.d. Finally, $[x]^{+} = \max\{0, x\}$.

System model
As illustrated in Figure 1, this article considers a cell-free massive MIMO system comprising $M$ APs, $K$ single-antenna legitimate users, and an active eavesdropper (Eve). Each AP is equipped with $N$ antennas. All APs are connected to a CPU through backhaul links for sharing information. The Eve is equipped with a single antenna and intends to contaminate the uplink training through a pilot spoofing attack. 19 The system is assumed to operate in TDD mode, where the downlink and uplink channels are assumed to be reciprocal. Rayleigh block-fading channels are assumed between the APs and the different terminals, where the channels remain constant within one time slot and vary independently afterward. Then, the channels between the $m$th AP and the $k$th user (the Eve) can be denoted by [7][8][9]
$$\mathbf{g}_{mk} = \sqrt{\beta_{mk}}\,\mathbf{h}_{mk}, \qquad \mathbf{g}_{mE} = \sqrt{\beta_{mE}}\,\mathbf{h}_{mE},$$
where $\beta_{mk}$ ($\beta_{mE}$) represents the large-scale fading and $\mathbf{h}_{mk}$ ($\mathbf{h}_{mE}$) models the small-scale fading vector, whose elements are independent and identically distributed (i.i.d.) $\mathcal{CN}(0, 1)$ random variables (RVs). Since secure downlink transmission is investigated in this article, each time slot consists of an uplink training phase and a downlink transmission phase.
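As a minimal sketch of the channel model described above, the following Python snippet draws one block-fading realization at a single AP, assuming the common form $\mathbf{g} = \sqrt{\beta}\,\mathbf{h}$ with i.i.d. $\mathcal{CN}(0,1)$ small-scale fading. The function name `draw_channels` and the example values of `beta` are illustrative assumptions and are not taken from the article.

```python
import numpy as np

def draw_channels(beta, n_antennas, rng):
    """Draw g = sqrt(beta) * h for each terminal, with h ~ CN(0, I_N).

    beta: large-scale fading coefficients, one per terminal (hypothetical
          inputs; the users and the Eve can simply be stacked together).
    Returns an (n_antennas, len(beta)) complex matrix, one column per terminal.
    """
    beta = np.asarray(beta, dtype=float)
    shape = (n_antennas, beta.size)
    # CN(0, 1): real and imaginary parts each carry variance 1/2.
    h = (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)
    # Broadcasting scales column j by sqrt(beta[j]).
    return np.sqrt(beta) * h

rng = np.random.default_rng(0)
# Two users and one Eve seen from one AP with N = 4 antennas (example values).
g = draw_channels(beta=[1.0, 0.25, 0.5], n_antennas=4, rng=rng)
```

A new call with the same `beta` gives the next independent block, which mirrors the block-fading assumption.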

Uplink training
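Owing to channel reciprocity, the CSI is acquired solely by uplink training. During this phase, all legitimate users send mutually orthogonal pilot sequences of length $\tau_p$ symbols for channel estimation at the APs, where $\tau_p \ge K$. Let $\sqrt{\tau_p p_p}\,\mathbf{u}_k \in \mathbb{C}^{\tau_p \times 1}$ be the pilot vector sent by the $k$th user, where $\|\mathbf{u}_k\|^2 = 1$ and $p_p$ represents the normalized pilot signal power. Since the pilot signals are reused and publicly predesigned, an intelligent eavesdropper can pretend to be a legitimate user by transmitting the same pilot signal as the target user. Without loss of generality, assuming that the Eve aims to eavesdrop on the confidential information sent to the $k$th user, Eve's pilot sequence can be denoted by $\sqrt{\tau_p p_E}\,\mathbf{u}_k$, where $p_E$ is the normalized transmit power of Eve. The $N \times \tau_p$ received pilot matrix at the $m$th AP is given by
$$\mathbf{Y}_{p,m} = \sqrt{\tau_p p_p}\sum_{l=1}^{K}\mathbf{g}_{ml}\mathbf{u}_l^{H} + \sqrt{\tau_p p_E}\,\mathbf{g}_{mE}\mathbf{u}_k^{H} + \mathbf{W}_{p,m},$$
where $\mathbf{W}_{p,m} \in \mathbb{C}^{N \times \tau_p}$ denotes the additive white Gaussian noise matrix, whose elements are i.i.d. $\mathcal{CN}(0, 1)$ RVs. By utilizing the additive quantization noise model (AQNM), the resulting signal after the finite-resolution ADCs at the $m$th AP can be given by 21,22
$$\tilde{\mathbf{Y}}_{p,m} = \gamma_m \mathbf{Y}_{p,m} + \tilde{\mathbf{W}}_{p,m},$$
where $\gamma_m \le 1$ is the scaling factor determined by the number of ADC quantization bits $b_m$, whose exact values are shown in Table 1. Also, $\tilde{\mathbf{W}}_{p,m}$ is the quantization noise matrix, whose covariance matrix is determined by $\gamma_m$ and the covariance of the received signal. Then, projecting $\tilde{\mathbf{Y}}_{p,m}$ onto $\mathbf{u}_l$, the post-processing signal can be rewritten as
$$\tilde{\mathbf{y}}_{p,m,l} = \tilde{\mathbf{Y}}_{p,m}\mathbf{u}_l = \gamma_m\sqrt{\tau_p p_p}\,\mathbf{g}_{ml} + \gamma_m\sqrt{\tau_p p_E}\,\mathbf{g}_{mE}\,(\mathbf{u}_k^{H}\mathbf{u}_l) + \gamma_m\mathbf{w}_{p,m} + \tilde{\mathbf{w}}_{p,m},$$
where $\mathbf{w}_{p,m} = \mathbf{W}_{p,m}\mathbf{u}_l$ and $\tilde{\mathbf{w}}_{p,m} = \tilde{\mathbf{W}}_{p,m}\mathbf{u}_l$; since the pilots are orthonormal, the Eve term is present only for $l = k$. By utilizing MMSE estimation, the channels can be decomposed as 16,21
$$\mathbf{g}_{ml} = \hat{\mathbf{g}}_{ml} + \tilde{\mathbf{g}}_{ml} \qquad (7)$$
and similarly for $\mathbf{g}_{mE}$, where the estimate $\hat{\mathbf{g}}_{ml}$ ($\hat{\mathbf{g}}_{mE}$) and the estimation error $\tilde{\mathbf{g}}_{ml}$ ($\tilde{\mathbf{g}}_{mE}$) are mutually independent, and their elements are zero-mean i.i.d. RVs with variances $\lambda_{ml}$ ($\lambda_{mE}$) and $\beta_{ml} - \lambda_{ml}$ ($\beta_{mE} - \lambda_{mE}$), respectively. Moreover, the channel estimate $\hat{\mathbf{g}}_{ml}$ is the linear MMSE estimate of $\mathbf{g}_{ml}$ given $\tilde{\mathbf{y}}_{p,m,l}$ (equation (9)). Then, one important theorem can be given as follows.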
Theorem 1. The variances $\lambda_{ml}$ and $\lambda_{mE}$ can be denoted by
Proof. Please see Appendix 1.
Since the downlink beamformer is based on the CSI estimated via uplink training, the accuracy of the channel estimation is crucially important. The normalized mean square error (NMSE) of the target user can be used to evaluate the estimation performance, which is defined as
$$\mathrm{NMSE}_{mk} = \frac{\mathbb{E}\{\|\tilde{\mathbf{g}}_{mk}\|^2\}}{\mathbb{E}\{\|\mathbf{g}_{mk}\|^2\}} = \frac{\beta_{mk} - \lambda_{mk}}{\beta_{mk}}.$$
To gain some valuable insight, the limiting behavior of the NMSE can be considered as follows. As $p_p \to +\infty$, $\lambda_{mk}$ converges to a limit value that remains strictly smaller than $\beta_{mk}$ when the ADCs are imperfect. Thus, there is a nonzero floor on the channel estimation NMSE caused by the ADC imperfection. Besides, this nonzero floor cannot be compensated by increasing the pilot power of the legitimate users.
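The error floor can be illustrated numerically with a simplified per-antenna model. The sketch below is not the article's Theorem 1: it assumes constant-modulus pilot entries, unit noise variance, and the usual AQNM rule that the quantization noise power equals $\gamma(1-\gamma)$ times the per-sample received power. The table `AQNM_RHO` holds distortion factors commonly quoted in the AQNM literature (the article's Table 1 is not reproduced in this text), and all function names are hypothetical.

```python
import numpy as np

# Commonly quoted AQNM distortion factors rho_b (assumption: usual literature
# values, with gamma = 1 - rho_b).
AQNM_RHO = {1: 0.3634, 2: 0.1175, 3: 0.03454, 4: 0.009497, 5: 0.002499}

def adc_gain(bits):
    """AQNM scaling factor gamma for a given number of quantization bits."""
    if bits in AQNM_RHO:
        return 1.0 - AQNM_RHO[bits]
    # High-resolution approximation often used for more than 5 bits.
    return 1.0 - (np.pi * np.sqrt(3.0) / 2.0) * 2.0 ** (-2 * bits)

def nmse_aqnm(p_p, p_e, beta_users, beta_eve, k, gamma, tau_p):
    """NMSE of the linear MMSE estimate of user k's channel at one AP."""
    beta_users = np.asarray(beta_users, dtype=float)
    beta_k = beta_users[k]
    # Per-sample received power before quantization (all users, Eve, noise).
    rx_power = p_p * beta_users.sum() + p_e * beta_eve + 1.0
    # Second moment of one entry of the despread, quantized observation.
    second_moment = (gamma**2 * tau_p * (p_p * beta_k + p_e * beta_eve)
                     + gamma**2 + gamma * (1.0 - gamma) * rx_power)
    lam = gamma**2 * tau_p * p_p * beta_k**2 / second_moment
    return 1.0 - lam / beta_k

def nmse_floor(beta_users, k, gamma, tau_p):
    """Limit of nmse_aqnm as the pilot power p_p grows without bound."""
    beta_users = np.asarray(beta_users, dtype=float)
    beta_k = beta_users[k]
    lam_inf = gamma * tau_p * beta_k**2 / (
        gamma * tau_p * beta_k + (1.0 - gamma) * beta_users.sum())
    return 1.0 - lam_inf / beta_k

betas = [1.0, 0.5]
for bits in (1, 2, 4):
    g = adc_gain(bits)
    print(bits, nmse_aqnm(1e3, 1.0, betas, 0.3, 0, g, 20), nmse_floor(betas, 0, g, 20))
```

Under these assumptions the floor vanishes only for $\gamma = 1$, which matches the qualitative statement that perfect ADCs allow error-free estimation at high pilot power.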

Secrecy performance analysis
In this section, we focus on secrecy performance analysis to evaluate the considered secure communication system. [33][34][35][36][37] All APs perform conjugate beamforming to transmit data simultaneously to all users over the same time-frequency resource. In the signal transmitted by the $m$th AP, $q_l$ denotes the normalized signal intended for the $l$th user with $|q_l| = 1$, and $\eta_{ml}$ is the power control coefficient, which satisfies the transmit power constraint. The signals received at the $k$th user and the Eve are corrupted by the noise terms $w_k \sim \mathcal{CN}(0, 1)$ and $w_E \sim \mathcal{CN}(0, 1)$, respectively. After the low-resolution ADCs at the users, the output signal is modeled by the AQNM, where $\gamma_u$ denotes the scaling factor of the ADCs at the users and $\tilde{w}_k$ is the quantization noise. We consider the realistic case in which the target user cannot obtain its instantaneous CSI. Hence, the $k$th user detects its desired signal $q_k$ by using the statistical CSI. Considering a pessimistic case, we assume that the Eve can obtain perfect instantaneous CSI and is equipped with high-quality hardware. Based on these settings, a lower bound on the target user's achievable rate and an upper bound on the Eve's ergodic capacity can be obtained. Furthermore, the signals at the $k$th user and the Eve can be rewritten as a desired signal term plus the effective noise terms $w_k^{\mathrm{eff}}$ and $w_E^{\mathrm{eff}}$, respectively. In this article, the ergodic secrecy rate is chosen to evaluate the secrecy performance of the considered system. By considering the difference between the channel capacities of the target user and the Eve, the ergodic secrecy rate is given by
$$\left[R_k - R_k^{\mathrm{Eve}}\right]^{+} = \left[\log_2(1 + \zeta_k) - \log_2(1 + \zeta_k^{\mathrm{Eve}})\right]^{+},$$
where $R_k$ ($R_k^{\mathrm{Eve}}$) and $\zeta_k$ ($\zeta_k^{\mathrm{Eve}}$) denote the achievable ergodic rate and the signal-to-interference-plus-noise ratio (SINR) at the $k$th user (the Eve). The SINRs $\zeta_k$ and $\zeta_k^{\mathrm{Eve}}$ are defined in equations (24) and (25), respectively. Subsequently, the closed-form expressions of $R_k$ and $R_k^{\mathrm{Eve}}$ can be given in the following theorem.
Theorem 2. In the considered cell-free massive MIMO system, the SINRs at the $k$th user and the Eve can be given by equations (26) and (27), respectively.
Proof. Please refer to Appendix 2.
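Once the two SINRs are available, the secrecy rate follows from the $[x]^{+}$ rule in the Notation section. The helper below is a small sketch of that final step only; the name `secrecy_rate` and the example SINR values are illustrative assumptions, and the closed-form SINRs of Theorem 2 are not reproduced here.

```python
import numpy as np

def secrecy_rate(sinr_user, sinr_eve):
    """Return [log2(1 + sinr_user) - log2(1 + sinr_eve)]^+ in bit/s/Hz."""
    rate_user = np.log2(1.0 + np.asarray(sinr_user, dtype=float))
    rate_eve = np.log2(1.0 + np.asarray(sinr_eve, dtype=float))
    # Clip at zero: a stronger eavesdropper cannot make the rate negative.
    return np.maximum(0.0, rate_user - rate_eve)

# Hypothetical linear-scale SINR values.
example = secrecy_rate(sinr_user=3.0, sinr_eve=1.0)
```

The function also accepts arrays, so a sweep over the Eve's SINR can be evaluated in one call.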

RMT-based pilot spoofing attack detection
According to existing work, it is reasonable to perform pilot spoofing attack detection based on the uplink training. Our proposed detection scheme performs source enumeration using the RMT algorithm, which relies on the distributions of both the signal eigenvalues and the noise eigenvalues in the presence of noise. To improve the detection accuracy, the uplink training is divided into two phases: uplink training and uplink active attack detection. In the detection phase, only the legitimate target user sends a pilot signal, in which a fraction $0 \le \alpha \le 1$ of the pilot power is allocated to a scalar random sequence $\mathbf{u}_B$. The superimposed random sequence $\mathbf{u}_B$ is normalized ($\|\mathbf{u}_B\|^2 = 1$) and zero-mean, and its elements can be chosen from a finite alphabet such as BPSK or QPSK. Therefore, instead of the pilot signal $\sqrt{\tau_p p_p}\,\mathbf{u}_k$, the pilot signal used for detection at the target user (the $k$th user) is given by
$$\sqrt{\tau_p p_p}\left(\sqrt{1-\alpha}\,\mathbf{u}_k + \sqrt{\alpha}\,\mathbf{u}_B\right).$$
To keep the detection process confidential, the random sequence $\mathbf{u}_B$ is unknown to the APs and to the Eve in advance, as it is generated randomly at the user terminal. The detection problem is a binary statistical hypothesis testing problem, where the null hypothesis $\mathcal{H}_0$ denotes no attack and the alternative hypothesis $\mathcal{H}_1$ denotes the presence of active eavesdropping. The two hypotheses are presented in equation (29), where $\mathbf{V}_m \in \mathbb{C}^{N \times \tau_p}$ is the effective noise component, including the quantization noise matrix and the additive noise matrix. Obviously, the received signals under these two hypotheses can be modeled as Gaussian variables with different expectations. According to the existing literature, 29-32 the deterministic signals can be used for source enumeration to detect active attacks.
We then define the correlation matrix $\mathbf{R}_{y,i}$ of the received signal and the correlation matrix $\mathbf{R}_{s,i}$ of the source signals under hypothesis $\mathcal{H}_i$, $i \in \{0, 1\}$. The matrix $\mathbf{R}_{s,i}$ under the two hypotheses can be represented in terms of the matrices $\mathbf{G}$ and $\mathbf{S}_d$. It is noted that $\mathrm{rank}(\mathbf{R}_{s,0}) = 1$. Since $\mathbf{g}_{mk} \ne \mathbf{g}_{mE}$, we can derive that $\mathrm{rank}(\mathbf{R}_{s,1}) = 2$ for $\alpha > 0$.
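The rank statements above can be checked numerically. The sketch below builds the noise-free received signal under an assumed superimposed pilot $\sqrt{\tau_p p_p}(\sqrt{1-\alpha}\,\mathbf{u}_k + \sqrt{\alpha}\,\mathbf{u}_B)$ and an Eve that replays $\mathbf{u}_k$, then counts the significant singular values. The pilot form, the all-ones choice of $\mathbf{u}_k$, and the name `source_rank` are assumptions made for illustration only.

```python
import numpy as np

def source_rank(alpha, p_p, p_e, tau_p, n_antennas, rng):
    """Numerical rank of the noise-free received signal at one AP."""
    # Known pilot u_k and random BPSK sequence u_B, both with unit norm.
    u_k = np.ones(tau_p) / np.sqrt(tau_p)
    u_b = rng.choice([-1.0, 1.0], size=tau_p) / np.sqrt(tau_p)
    s_d = np.sqrt(tau_p * p_p) * (np.sqrt(1.0 - alpha) * u_k + np.sqrt(alpha) * u_b)
    s_e = np.sqrt(tau_p * p_e) * u_k
    # Columns: channel of the target user and channel of the Eve, CN(0, 1).
    g = (rng.standard_normal((n_antennas, 2))
         + 1j * rng.standard_normal((n_antennas, 2))) / np.sqrt(2.0)
    signal = np.outer(g[:, 0], s_d) + np.outer(g[:, 1], s_e)
    sv = np.linalg.svd(signal, compute_uv=False)
    # rank(R_s) equals rank(signal); use a relative tolerance on singular values.
    return int(np.sum(sv > 1e-9 * sv[0]))

rng = np.random.default_rng(5)
rank_no_attack = source_rank(0.5, 10.0, 0.0, 16, 8, rng)
rank_attack = source_rank(0.5, 10.0, 10.0, 16, 8, rng)
rank_attack_alpha0 = source_rank(0.0, 10.0, 10.0, 16, 8, rng)
```

With $\alpha = 0$ the Eve's pilot is collinear with the user's pilot, so the rank stays at one and the attack cannot be separated by rank alone, which is why the random sequence is needed.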
Moreover, we can derive that $\mathbf{R}_{y,i} = \mathbf{R}_{s,i} + \delta_v^2\,\mathbf{I}_N$, where $\delta_v^2$ is the variance of the elements of the noise matrix $\mathbf{V}_m$. Let $\varsigma_{1,0} > 0$ and $\varsigma_{1,1} > \varsigma_{2,1} > 0$ denote the nonzero eigenvalues of the correlation matrices $\mathbf{R}_{s,0}$ and $\mathbf{R}_{s,1}$. Hence, the eigenvalues of $\mathbf{R}_{y,i}$ consist of the signal eigenvalues $\varsigma_{j,i} + \delta_v^2$ and the remaining noise eigenvalues, which are all equal to $\delta_v^2$. Thus, the two hypotheses can be distinguished by estimating whether the dimension of the signal subspace is $d = 1$ or $d = 2$, which can be done by the RMT estimator. The two hypotheses in equation (29) can therefore be reformulated as $\mathcal{H}_0: d = 1$ versus $\mathcal{H}_1: d = 2$. The RMT approach considers the distribution of the noise eigenvalues of $\mathbf{R}_{y,i}$ in the absence of signals, that is, $\mathbf{Y}_{d,m} = \mathbf{V}_m$. In particular, the asymptotic distribution of the largest eigenvalue of a pure noise matrix is used to decide the matrix rank. We therefore define the noise correlation matrix $\mathbf{R}_v$, whose eigenvalues can be ordered as $\lambda_1 \ge \lambda_2 \ge \cdots \ge \lambda_N$. In the joint limit $N, \tau_p \to +\infty$ with $N/\tau_p \to c \ge 0$, the largest eigenvalue $\lambda_1$, after centering and scaling, converges in distribution to a Tracy-Widom distribution of order 2, where $F_{TW,2}(z)$ represents the Tracy-Widom probability distribution function of order 2. Besides, $\mu_{\tau_p,N}$ and $\delta_{\tau_p,N}$ denote the centering and scaling parameters, which are determined only by $N$ and $\tau_p$. Then, we should consider the behavior of the correlation matrix when signals are added. In this case, the largest noise eigenvalue of the matrix $\mathbf{R}_{y,i}$ also follows a Tracy-Widom distribution. Hence, the signal dimension can be estimated by judging whether an eigenvalue exceeds a deterministic threshold. Motivated by the above analysis, the RMT estimator $\hat{d}_{\mathrm{RMT}}$ of the signal subspace rank tests the ordered eigenvalues sequentially against this threshold, where $k \in \{1, 2, \ldots, \min(N, \tau_p) - 1\}$, $z(P_{fa})$ satisfies $F_{TW,2}(z(P_{fa})) = P_{fa}$, and $P_{fa}$ denotes the false alarm probability.
In addition, under the assumption of $k$ signals, the noise variance can be estimated by the maximum likelihood estimator, that is, by the average of the $N - k$ smallest eigenvalues. The above RMT approach can be used for signal subspace rank estimation by deciding whether $\hat{d}_{\mathrm{RMT}} = 1$ or $\hat{d}_{\mathrm{RMT}} > 1$. Hence, the algorithm needs to be performed only over the small set $k \in \{1, 2\}$. Furthermore, this detection algorithm can be operated at a single AP. To ensure the detection performance, we can choose the AP that is nearest to the user and the Eve for active attack detection. This AP achieves the largest SNR and therefore a larger detection probability.
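The sequential test described above can be sketched as follows. This is a minimal illustration in the spirit of the Kritchman and Nadler estimator, not the article's exact rule: the centering and scaling constants are the common complex-Wishart choice, the noise variance is the plain maximum likelihood estimate, and the Tracy-Widom threshold `z_threshold` is passed in by the caller rather than computed from $P_{fa}$. The superimposed pilot used to generate the test data is likewise an assumption.

```python
import numpy as np

def rmt_rank_estimate(y, z_threshold):
    """Estimate the signal-subspace dimension of an N x tau_p sample matrix."""
    n, tau = y.shape
    # Eigenvalues of the sample correlation matrix, in descending order.
    eig = np.linalg.eigvalsh(y @ y.conj().T / tau)[::-1]
    for k in range(1, min(n, tau)):
        # ML noise variance assuming k signals: mean of the N - k smallest.
        noise_var = eig[k:].mean()
        dim = n - k
        root_sum = np.sqrt(tau) + np.sqrt(dim)
        mu = root_sum**2 / tau
        delta = root_sum * (1.0 / np.sqrt(tau) + 1.0 / np.sqrt(dim)) ** (1.0 / 3.0) / tau
        # Stop at the first eigenvalue that does not stand out from the noise.
        if eig[k - 1] <= noise_var * (mu + z_threshold * delta):
            return k - 1
    return min(n, tau) - 1

def cn(shape, rng):
    """i.i.d. CN(0, 1) samples."""
    return (rng.standard_normal(shape) + 1j * rng.standard_normal(shape)) / np.sqrt(2.0)

rng = np.random.default_rng(7)
n, tau, p_p, p_e, alpha = 16, 64, 10.0, 10.0, 0.5
u_k = np.ones(tau) / np.sqrt(tau)
u_b = rng.choice([-1.0, 1.0], size=tau) / np.sqrt(tau)
s_d = np.sqrt(tau * p_p) * (np.sqrt(1.0 - alpha) * u_k + np.sqrt(alpha) * u_b)
g_k, g_e = cn(n, rng), cn(n, rng)
y_h0 = np.outer(g_k, s_d) + cn((n, tau), rng)           # no attack
y_h1 = y_h0 + np.sqrt(tau * p_e) * np.outer(g_e, u_k)   # Eve replays u_k
d0 = rmt_rank_estimate(y_h0, z_threshold=5.0)
d1 = rmt_rank_estimate(y_h1, z_threshold=5.0)
attack_detected = d1 > 1
```

The threshold of 5.0 is deliberately conservative for this high-SNR example; in practice it would be set from the desired false alarm probability through the Tracy-Widom distribution.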

Numerical results
In the simulations, all APs, users, and the Eve are randomly located within a circular cell of radius 300 m. A simplified path-loss model $\beta = (d_0/d)^{v}$ is used, where $d$, $d_0$, and $v$ represent the geographical distance, the reference distance, and the path-loss exponent, respectively. 38 To illustrate the results quantitatively, the key system parameters are set to $d_0 = 50$ m, $v = 2.4$, $p_p = p_d = 10$ dBW, $\tau_p = K = 20$, and $MN = 200$. Besides, the average power allocation scheme is used in the simulations. All results shown are averaged over 5000 independent Monte Carlo trials. Figure 2 depicts the NMSE of the channel estimation versus the SNR of the legitimate user's pilot signal for different numbers of quantization bits and different pilot powers of the Eve. It is noted that the NMSE increases monotonically as the number of quantization bits $b$ decreases. As expected, increasing the power of the active eavesdropper has a negative effect on the MMSE channel estimation. The dashed horizontal lines in Figure 2 denote the asymptotic limits of the NMSE in the high-power regime. The results illustrate that coarse low-resolution ADCs result in a nonzero floor on the NMSE, which cannot be compensated by increasing the pilot power. On the contrary, there is no such floor when perfect ADCs are used at the transceivers, which illustrates that perfect ADCs allow error-free channel estimation with sufficiently large pilot power.
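The simulation geometry and path-loss model can be sketched as below. The snippet assumes that "randomly located" means uniform over the disc and uses $d_0 = 50$ m and $v = 2.4$ from the text; the function names and the small node counts are illustrative assumptions.

```python
import numpy as np

def path_loss(distance_m, d0_m=50.0, exponent=2.4):
    """Large-scale fading beta = (d0 / d) ** v of the simplified model."""
    d = np.asarray(distance_m, dtype=float)
    return (d0_m / d) ** exponent

def drop_uniform_in_disc(count, radius_m, rng):
    """Drop nodes uniformly at random inside a disc centred at the origin."""
    # The square root makes the density uniform over the area.
    r = radius_m * np.sqrt(rng.uniform(size=count))
    theta = rng.uniform(0.0, 2.0 * np.pi, size=count)
    return np.column_stack((r * np.cos(theta), r * np.sin(theta)))

rng = np.random.default_rng(11)
ap_xy = drop_uniform_in_disc(10, 300.0, rng)     # M = 10 APs (example value)
user_xy = drop_uniform_in_disc(20, 300.0, rng)   # K = 20 users
# Pairwise AP-user distances and the resulting M x K matrix of beta_mk.
dist = np.linalg.norm(ap_xy[:, None, :] - user_xy[None, :, :], axis=2)
beta = path_loss(dist)
```

Distances shorter than $d_0$ give $\beta > 1$ in this simplified model; a practical simulation might clip the distance from below, which is left out here.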
Since the secrecy rate expression derived in Theorem 2 is an appropriate tool for measuring secrecy performance, the tightness of expression (23) should be verified via numerical results. Figure 3 shows the analytical and numerical results versus the number of APs $M$, where the results are obtained by averaging over 5000 channel realizations and random locations. It is observed that the analytical results match the simulated values well, which validates the correctness of the obtained analytical expressions. As can be readily noted, deploying more APs yields better secrecy performance, because both the channel responses and the diversity gains improve as more APs are deployed. From Figure 3, another interesting observation is that the system suffers only a trivial performance loss from low-resolution ADCs at the APs, but cannot tolerate low-quality ADCs at the users. Figure 4 shows the achievable ergodic secrecy rate versus the number of ADC quantization bits with $p_E = 0$ dBW. We can observe that the achievable ergodic secrecy rate increases with the number of ADC quantization bits. This is because low-quality ADCs at the APs and users bring larger channel estimation error and more signal quantization noise, which cause a lower SINR (and thus a lower achievable rate) at the target users. Besides, as the number of bits increases, the secrecy rate increases and converges to a finite rate limit (the dashed horizontal lines in Figure 4), because the quantization distortion factor is very close to 1 when $b \ge 5$. As expected, the considered cell-free massive MIMO system obtains a higher secrecy rate with high-resolution ADCs at the users. Hence, low-resolution ADCs can be widely used at the APs rather than at the users.
Figure 5 depicts the achievable secrecy rate as a function of the pilot power ratio between the Eve and the legitimate user, where $b_m$ and $b_u$ denote the quantization bits of the low-resolution ADCs at the APs and users, and $+\infty$ represents a perfect high-resolution ADC.
Interestingly, the results in Figure 2 further confirm the conclusions in Figure 1. Besides, Figure 5 illustrates that the antenna configuration at the APs has a significant impact on system performance. With a fixed total number of antennas, more APs provide better quality of service, which is consistent with the results in Hu et al. 21 Also, Figure 5 shows that the Eve can overhear the information intended for the target user with a smaller power budget when the system uses low-resolution ADCs. Hence, a trade-off between cost and performance is required in practical system design.
In Figure 6, the active attack detection probability versus the pilot power of the Eve is depicted for various parameters when $p_p = 10$ dBW and $\alpha = 0.5$, where the results are obtained from 10,000 independent runs. The curves in Figure 6 illustrate that the detection performance improves as the number of ADC quantization bits and the number of antennas at each AP increase. Moreover, it can be observed that the proposed RMT detector can detect active attacks at a relatively low power level. Also, we can note that the detection probability is over 95% when $p_E - p_p = -5$ dB for all parameter choices shown in Figure 6. Specifically, with more antennas at each AP ($N \ge 10$), the detection probability can reach 1 when the Eve's pilot power is about 6 dB lower than that of the legitimate user. Figure 7 shows the detection probability of the proposed RMT-based approach versus the power allocation factor $\alpha$ with fixed total power $p_p = 10$ dBW and $p_E = 5$ dBW. First, the results in Figure 7 indicate that the detection probability increases with $\alpha$. This is because the correlation between the two column vectors of the matrix $\mathbf{G}$ defined in equation (47) decreases with increasing $\alpha$, which leads to a relatively larger eigenvalue $\varsigma_{2,1}$ and thus improves the RMT-based detection probability. The regular pilot signal is publicly known and predesigned, while the detection signal is designed by adding a random sequence. Besides, a smart eavesdropper will wiretap and transmit the spoofed pilot sequence. Thus, to keep the detection procedure confidential and imperceptible, the power allocation factor should be as small as possible. For comparison, Figure 7 shows the detection performance under different $N$ and ADC quantization bits $b_m$, which illustrates that, for a fixed detection probability, the power that must be allocated to the random sequence decreases with increasing $N$ or $b_m$. These results are valuable for the design and application of practical detection algorithms.

Conclusion
This article adopted the well-established AQNM distortion model to reveal the secrecy performance of a cell-free massive MIMO system with low-resolution ADCs. Uplink MMSE channel estimation was performed in the presence of an active eavesdropping attack, which showed that the ADC impairments cause a nonzero floor on the channel estimation error. Based on the imperfect CSI, conjugate beamforming was performed for downlink transmission. Then, we derived a closed-form expression for the achievable ergodic secrecy rate. To weaken the influence of active attacks, an effective countermeasure is to detect the presence of the active eavesdropper. This article exploited the RMT-based detection approach to estimate the dimension of the signals embedded in noise. Finally, numerical results were presented to verify the derived results and the effectiveness of the active attack detection scheme.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

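Appendix 1
Proof of Theorem 1
According to the MMSE channel estimation expression (9), we first focus on the term $\mathbb{E}\{\mathbf{g}_{ml}^{H}\tilde{\mathbf{y}}_{p,m,l}\}$. Since $\mathbf{g}_{ml}$, $\mathbf{g}_{mE}$, $\mathbf{w}_{p,m}$, and $\tilde{\mathbf{w}}_{p,m}$ are all zero-mean and mutually independent, this term can be evaluated directly, and the second moment of $\tilde{\mathbf{y}}_{p,m,l}$ can be derived in the same way. Plugging equations (46) and (49) into equation (9), the squared norm of the channel estimate $\hat{\mathbf{g}}_{ml}$ can be derived. Besides, according to the MMSE principle, the channel estimate $\hat{\mathbf{g}}_{ml}$ and the estimation error $\tilde{\mathbf{g}}_{ml}$ are uncorrelated. Hence, with the aid of $\mathbb{E}\{\mathbf{g}_{ml}\hat{\mathbf{g}}_{ml}^{H}\}$, the expressions in Theorem 1 are obtained. This finishes the proof.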

Appendix 2
Proof of Theorem 2
Furthermore, we can derive that Next, we obtain the evaluation of $\mathbb{E}\{|\tilde{w}_k|^2\}$, which can be given by where the term $D_k$ is defined as Substituting equations (52), (53), (55), and (56) into equation (24), we can derive the SINR expression of equation (26).
Then, we discuss the SINR $\zeta_k^{\mathrm{Eve}}$ of the Eve. Due to the mutual independence of the effective signal and the additive noise, the different parts in equation (17) can be derived separately. We first calculate that Here, (a) follows from two properties: (1) the independence of $\tilde{\mathbf{g}}_{mE}$ and $\hat{\mathbf{g}}_{mk}^{*}$ ($\hat{\mathbf{g}}_{mE}$) and (2) the identity $\mathbb{E}\{\|\hat{\mathbf{g}}_{mk}\|^4\} = N(N+1)\lambda_{mk}^2$. 19 Similarly, using the mutual independence of $\mathbf{g}_{mE}$ and $\hat{\mathbf{g}}_{ml}$ ($l \ne k$), we have that Plugging these results into equation (25) yields the expression in equation (27), which completes the proof.