Lattice-based provable data possession in the standard model for cloud-based smart grid data management systems

The smart grid is considered to be the next-generation electric power network. A smart grid must process massive amounts of data, so cloud computing is introduced to form a cloud-based smart grid data management system. However, with data no longer being stored locally, ensuring the integrity of smart grid data stored in the cloud has become an urgent problem. Provable data possession has been proposed to solve this problem. With the development of quantum computer technology, quantum attack-resistant cryptographic schemes are attracting growing attention, and lattice cryptography can resist quantum attacks. In this article, a lattice-based provable data possession scheme is proposed for cloud-based smart grid data management systems. The scheme is proved unforgeable under the small integer solution hardness assumption in the standard model. Compared with the other two efficient lattice-based provable data possession schemes in the standard model, our scheme is also efficient.


Introduction
In recent years, the smart grid has gained increasing attention from many countries, and a great number of smart grid projects have been launched around the world. 1 A smart grid is a new type of modern power grid that integrates advanced sensing and measurement technology, information and communication technology, automatic control technology, analysis and decision technology, and energy and power technology with the grid infrastructure. As the next-generation electric power network, it aims to achieve reliability, security, economy, efficiency, environmental friendliness, and safe use.
A smart grid relies on the Internet of Things (IoT) technology. The core technologies of IoT cover the awareness of physical state, information representation, information transmission, and information processing, ranging from the sensor network to the upper application systems. In the aspects of communication, security, and the upper application of smart grid information system, IoT plays an important role. The sensor network technology can be used in the data and information acquisition of smart meters; the real-time and security communication technology can be used for the transmission of smart grid operation parameters, so that the operational data and power load data of the smart grid can be transmitted in real-time; the data storage and information representation technology can be used to store, manage, query, and organize huge amounts of data of the smart grid; and the distributed data processing and task scheduling technology can be used for the security and stability analysis of the smart grid and the real-time deployment of energy after new energy integration. In a word, IoT has turned the power system from a relatively closed and self-contained control system into one that is part of a digital environment, which has not only improved the stability of the smart grid, but also made new energy sources such as wind power and nuclear power more easily integrated into the smart grid for unified planning and scheduling.
Nowadays, cloud computing provides users with massive computing and storage resources. Users can use it at a low cost without having to buy local equipment, saving a large amount of hardware, software, and maintenance costs. For cloud computing, the National Institute of Standards and Technology (NIST) defines three types of cloud services, that is, Infrastructure as a Service (IaaS), Platform as a Service (PaaS), and Software as a Service (SaaS). IaaS provides virtual machines or other infrastructure services such as storage resources; PaaS provides development platforms including software development kits, documentation, and testing environment; and SaaS provides application software based on cloud infrastructure, which users can use directly through browsers, and so on. At present, the research hotspots of cloud computing include job scheduling, 2 cloud-based service broker routing policy, 3 and so on.
In a smart grid, there are a large number of sensors deployed, exponentially increasing the collected data and thus greatly enlarging the data storage requirement. To deal with this issue, people combine the smart grid with cloud computing to form cloud-based smart grid data management systems. 4 In terms of the security of cloud-based systems, Gou et al. 5 analyzed various security issues and challenges in the cloud computing environment, which involve security policies, user-oriented security, data storage security, application security, and network security. In this article, we mainly focus on the data storage security.
In a cloud-based smart grid data management system, data are no longer stored locally. If these data are changed, intentionally or unintentionally, without being detected in time by users, the operation of the smart grid can suffer serious consequences. Therefore, ensuring the integrity of the data stored in the cloud in the smart grid has become an urgent problem. Ateniese et al. 6 studied integrity checking for the first time. They put forward the concept of provable data possession (PDP) and proposed two concrete schemes. In their model, users can be assured that their data are intact in the cloud without having to download the data from the cloud. It is a lightweight probabilistic integrity-checking model. In their first scheme, if a file is divided into 10,000 blocks, of which 1% are tampered with, a user can detect the change with a probability of 99% by randomly selecting just 460 blocks (4.6%) as the audit request. Following Ateniese et al.'s pioneering work, many other PDP schemes 7,8 were proposed. To address the integrity of cloud-based smart grid data management systems, He et al. 4 proposed a certificateless PDP scheme. However, Zhou 9 later pointed out that He et al.'s scheme is insecure, as a malicious cloud can cheat the users. In addition, He et al.'s scheme relies on the random oracle model 10 and cannot withstand quantum computer attacks. 11

Our contributions
1. We propose a security model of the PDP scheme and summarize its security requirements.
2. We propose a quantum attack-resistant PDP scheme based on lattices for cloud-based smart grid data management systems and prove its unforgeability in the standard model. We also show that the proposed scheme satisfies the other security requirements.
3. We compare our scheme with the other two quantum attack-resistant PDP schemes in the standard model to show that our scheme is also efficient.
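The sampling figure quoted above from Ateniese et al. (460 audited blocks out of 10,000, with 1% of the blocks corrupted) can be checked numerically. The following is a minimal sketch, not taken from any of the cited schemes: the function names are hypothetical, and the audit is assumed to draw block indices uniformly at random. It computes the exact detection probability for sampling without replacement and the simpler with-replacement estimate 1 - (1 - f)^c.

```python
def detection_probability(total_blocks: int, corrupted: int, sampled: int) -> float:
    """Probability that a uniform sample (without replacement) of `sampled`
    blocks contains at least one of the `corrupted` blocks."""
    if not (0 <= corrupted <= total_blocks and 0 <= sampled <= total_blocks):
        raise ValueError("corrupted and sampled must lie in [0, total_blocks]")
    miss = 1.0  # probability that every sampled block so far is intact
    for i in range(sampled):
        intact_left = total_blocks - corrupted - i
        if intact_left <= 0:
            return 1.0  # no intact blocks remain, so a corrupted one is hit
        miss *= intact_left / (total_blocks - i)
    return 1.0 - miss


def detection_probability_approx(fraction_corrupted: float, sampled: int) -> float:
    """With-replacement estimate: 1 - (1 - f)^c."""
    return 1.0 - (1.0 - fraction_corrupted) ** sampled


if __name__ == "__main__":
    exact = detection_probability(10_000, 100, 460)
    approx = detection_probability_approx(0.01, 460)
    print(f"exact (without replacement): {exact:.4f}")
    print(f"estimate (with replacement): {approx:.4f}")
```

Both values come out slightly above 0.99, which is consistent with the 99% figure quoted in the text. The estimate is a lower bound on the exact value, because each draw without replacement is at least as likely to hit a corrupted block as an independent draw.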
The rest of the article is organized as follows. In section "Related works," we review some related work. In section "Preliminaries," we describe some preliminaries about lattices. In section "Definition and security requirements of PDP," we introduce the definition and security requirements of PDP. In section "The scheme," we propose a concrete PDP scheme based on lattices. In section "Analyses of the scheme," we analyze the security and efficiency of the proposed scheme. We conclude the article in section "Conclusion."

Related works
In 2007, Ateniese et al. 6 pioneered the idea of PDP and proposed two concrete schemes. However, their schemes only support private audit, that is, only cloud users can audit the data stored in the cloud, because some private information must be used in the audit process. Later, Wang et al. 12 put forward a public audit scheme. They removed the private information from the audit process, so that anyone can audit the data stored in the cloud. In this way, users can delegate their audit work to a third-party auditor (TPA) to reduce their own burden, and the TPA can prove the data corruption to the court when a dispute occurs. Furthermore, Wang et al. 13 pointed out that the TPA can reveal users' data by solving enough linear equations in scheme [12] and proposed a privacy-preserving public PDP scheme by introducing random numbers. All the above PDP schemes were proposed in the random oracle model, which, however, is only an idealized model. There is no random oracle in reality, and random oracles are replaced by concrete hash functions in the real world. Some studies 10 have shown that schemes secure in the random oracle model may no longer be secure when the random oracles are replaced by concrete hash functions. To deal with this issue, Zhang et al. 14 proposed an efficient identity-based public PDP scheme in the standard model (without relying on the random oracle model).
With the rapid development of quantum computer technology, schemes based on factoring, discrete logarithms, or bilinear pairings would become insecure against quantum computers, whereas lattice-based schemes can resist quantum computer attacks. 15 In 2016, NIST initiated a process to solicit, evaluate, and standardize post-quantum cryptographic algorithms around the world, marking the beginning of the post-quantum era. To deal with this issue, Xu et al. 16 proposed a lattice-based public PDP scheme. However, their scheme relies on the random oracle model. Chen et al. 17 proposed an efficient lattice-based PDP scheme in the standard model, which is based on the ideal lattice hard problem. Recently, Yang et al. 18 also proposed an efficient lattice-based PDP scheme in the standard model, which is based on the Ring Learning with Errors (RLWE) hard problem. However, lattices with special structures, such as ideal lattices and RLWE lattices, are less secure than those without special structures. 19 Therefore, none of the above schemes is suitable for cloud-based smart grid data management systems.
In 2018, He et al. 4 proposed a certificateless PDP scheme for cloud-based smart grid data management systems. Thereafter, Zhou 9 pointed out that He et al.'s scheme 4 is not secure as a malicious cloud can cheat users. Furthermore, He et al.'s scheme is not quantum attack-resistant and is reliant on the random oracle model.

Preliminaries
In this section, we give some definitions, theorems, and a hard problem on lattices. The definitions help readers better understand the concept of a lattice. The theorems show the existence of the algorithms TrapGen and SamplePre, which are used in the design of our scheme. The hard problem is used in the unforgeability proof of our scheme.

Notation
The lower-case letters denote the column vectors and the upper-case letters denote the matrices. A

Hard lattice problem
The security of our lattice-based PDP scheme relies on the hardness assumption of the small integer solution (SIS) problem.
Definition 4. SIS problem: 20 given n, m, q, $\beta$, and a random matrix $A \in \mathbb{Z}_q^{n \times m}$, one must find a non-zero vector $v \in \mathbb{Z}^m$ such that $Av = 0 \bmod q$ and $\|v\| \le \beta$.
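To make the three conditions of the SIS definition concrete, the following minimal sketch checks whether a candidate vector is a valid solution. It is an illustration only: the function name `is_sis_solution` and the toy parameters are hypothetical and far too small to be hard, and the Euclidean norm is assumed.

```python
def is_sis_solution(A: list[list[int]], v: list[int], q: int, bound: float) -> bool:
    """Check the SIS conditions: v is non-zero, ||v|| <= bound, and A v = 0 (mod q)."""
    if any(len(row) != len(v) for row in A):
        raise ValueError("every row of A must have the same length as v")
    if all(x == 0 for x in v):
        return False  # the zero vector is excluded by definition
    if sum(x * x for x in v) > bound * bound:
        return False  # Euclidean norm exceeds the bound
    return all(sum(a * x for a, x in zip(row, v)) % q == 0 for row in A)


if __name__ == "__main__":
    q = 7
    A = [[1, 2, 3],
         [4, 5, 6]]
    print(is_sis_solution(A, [1, -2, 1], q, bound=3))  # short, non-zero, A v = 0
    print(is_sis_solution(A, [7, 0, 0], q, bound=3))   # A v = 0 mod q, but too long
    print(is_sis_solution(A, [0, 0, 0], q, bound=3))   # zero vector is not allowed
```

The second candidate shows why the norm bound matters: multiples of q always satisfy the congruence, so without the bound the problem would be trivial.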

Definition and security requirements of PDP
In this section, we give the constituent algorithms and design goals of PDP.

Definition of PDP
There are three participants in a PDP scheme, that is, Cloud server, Cloud user, and TPA, as illustrated in Figure 1. The process of PDP is as follows.
A user first divides his data into n blocks and uses his private key to produce a tag for each block. Then, he uploads all the data blocks along with the tags to the cloud and deletes them locally. To ensure that the data blocks remain intact in the cloud, the user delegates TPA to audit them periodically. In the audit process, TPA randomly selects a subset of the data blocks and sends the index numbers of the selected blocks, together with some randomly selected values, to the cloud. The cloud produces a proof from the selected data blocks, the received random values, and the corresponding tags, and sends the proof to TPA. Then, TPA checks whether the proof satisfies a predefined verification equation.
A PDP scheme consists of the following four algorithms:
1. Setup($1^k$): given a security parameter $1^k$, it generates a public/private key pair (pk, sk).
2. TagSign(sk, i, $m_i$): on input private key sk, index i of file block $m_i$, and file block $m_i$, it generates a tag $t_i$ of $m_i$. It is run by the cloud user.
3. ProofGen(pk, $m_i$, $t_i$, chal): on input public key pk, file blocks $m_i$, tags $t_i$, and a challenge chal, which contains a random index subset I of the file blocks and a random value set W, it generates a proof of data possession V. It is run by the cloud server.
4. VerifyProof(pk, V, chal): on input public key pk, proof V, and challenge chal, it evaluates whether V is a correct proof of data possession for the blocks determined by chal. For public audit, it is run by TPA.

Design goals of PDP
1. Unforgeability: the PDP scheme must be existentially unforgeable against adaptive chosen message attacks. The formal definition will be given in section "Analyses of the scheme."
2. Public auditability: auditing the cloud does not require any private information. Therefore, users can delegate their audit tasks to TPA to relieve their own burden and to prove the data corruption to the court when a dispute occurs.
3. Privacy preserving: TPA should not be able to deduce the users' data by solving linear equations when it audits the cloud.

The scheme
The scheme includes the following four algorithms. In the Setup algorithm, some common parameters are produced. In the TagSign algorithm, the user produces a tag for each file block. Then, he transmits the file blocks along with the tags to the cloud and deletes them locally. In the ProofGen algorithm, TPA sends an audit request to the cloud, and the cloud responds to TPA with a proof that the data are intact. In the VerifyProof algorithm, TPA verifies the correctness of the proof. If it is correct, TPA can be sure that the data are intact.
Setup: the cloud user runs the TrapGen(q, n, m) algorithm to generate a uniform matrix $A \in \mathbb{Z}_q^{n \times m}$ and a basis $T_A \in \mathbb{Z}^{m \times m}$ of $\Lambda_q^{\perp}(A)$ [...]. Similarly, the cloud server runs the TrapGen(q, n, m) algorithm to generate a uniform matrix $Q \in \mathbb{Z}_q^{n \times m}$ and a basis $T_Q \in \mathbb{Z}^{m \times m}$ of $\Lambda_q^{\perp}(Q)$ so that $\|T_Q\|$ [...].
TagSign: the message file is divided into r blocks $m_i \in \{0, 1\}^l$, $i = 1, 2, \ldots, r$, and $i \in \mathbb{Z}$ is the index of $m_i$. For every block i, the cloud user computes [...].
ProofGen: to confirm that the data in the cloud are intact, TPA selects a subset $J = \{i_1, i_2, \ldots, i_c\}$ of the set $\{1, 2, \ldots, r\}$ and random elements $w_j \in \mathbb{Z}_q^*$, $j = i_1, i_2, \ldots, i_c$. TPA sends $(j, w_j)$, $j = i_1, i_2, \ldots, i_c$, to the cloud server. Then, the cloud server selects $v \in \mathbb{Z}_q^n$ randomly. The cloud server computes b [...] Sample [...]. The cloud server sends (s, t, v) to TPA.
VerifyProof: TPA verifies whether the equation [...] holds true or not.
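The challenge step of ProofGen, in which TPA picks c of the r block indices and a random element of $\mathbb{Z}_q^*$ for each of them, can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `make_challenge` is hypothetical, q is assumed to be prime so that $\mathbb{Z}_q^*$ is simply {1, ..., q - 1}, and the toy modulus in the usage example is not one of the paper's parameters. The tag, proof, and verification computations are not covered.

```python
import secrets


def make_challenge(r: int, c: int, q: int) -> dict[int, int]:
    """Return {j: w_j} for c distinct indices j in 1..r and w_j in 1..q-1.

    Assumes q is prime, so every non-zero residue is a valid w_j.
    """
    if not 1 <= c <= r:
        raise ValueError("need 1 <= c <= r")
    if q < 2:
        raise ValueError("modulus q must be at least 2")
    rng = secrets.SystemRandom()              # OS-backed randomness
    indices = rng.sample(range(1, r + 1), c)  # c distinct block indices
    return {j: 1 + secrets.randbelow(q - 1) for j in sorted(indices)}


if __name__ == "__main__":
    chal = make_challenge(r=10_000, c=460, q=101)
    print(len(chal), "blocks challenged; first few pairs:", list(chal.items())[:3])
```

Drawing a fresh index set and fresh values for every audit keeps the cloud server from precomputing or replaying an earlier proof.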

Security model of PDP
In the following, we give the security model of PDP, which is used in the unforgeability proof of our scheme. The unforgeability of a PDP scheme is defined through an interactive game between an adversary A and a challenger C. The game consists of the following stages:
1. Setup: the challenger C runs the Setup algorithm to generate a public/private key pair (pk, sk). C gives pk to A and keeps sk secret.
2. Query: A can make TagSign queries adaptively: A selects a file block $m_i$ and sends it to C. C runs the TagSign algorithm to generate a tag of block $m_i$ and sends it back to A.
3. Challenge: C generates a challenge (I, W) and sends it to A, where I is a subset of index numbers of file blocks and W is a set of random numbers.
4. Forge: A generates a proof V for (I, W) and returns V to C.
If V passes the VerifyProof algorithm and, for at least one $i \in I$, the query TagSign(sk, i, $m_i$) was never made, A wins the game.

Definition 5. A PDP scheme is secure if, for any probabilistic polynomial-time (PPT) adversary A, the probability that A wins the above game is negligible.

Security theorem
According to the above security model of PDP, we prove that our scheme is unforgeable assuming that the SIS problem is hard.
Theorem 3. In the standard model, if there is an adversary A that can break the unforgeability of the above scheme with probability $\varepsilon$, then the SIS problem can be solved with probability $\varepsilon' = (1 - 2^{-\omega(\log n)})\varepsilon$.
Proof. Given a random instance SIS(n, m, q, 2s) = (A, n, m, s), where $A \in \mathbb{Z}_q^{n \times m}$, the challenger C is asked to find a short vector $e \in \mathbb{Z}^m$ satisfying $Ae = 0 \bmod q$ and $\|e\| \le 2s\sqrt{m}$.
Challenge: C selects a subset $J = \{i_1, i_2, \ldots, i_c\}$ of the set $\{1, 2, \ldots, r\}$ and random elements $w_j \in \mathbb{Z}_q^*$, $j = i_1, i_2, \ldots, i_c$. C sends $(j, w_j)$, $j = i_1, i_2, \ldots, i_c$, to A.
Forge: A generates a forged proof (s, t, v) for $(j, w_j)$, $j \in J$. According to Definition 5, at least one query of TagSign(sk, i, $m_i$) must not have happened, that is, A generates a forged tag $e_i^*$ for file block $m_i$. If the forged tag is valid, then $Ae_i^* = u$, $A(e_i^* - e_i) = 0 \bmod q$, and $\|e_i^* - e_i\| \le 2s\left(\left(1 + \sum_{j=1}^{l} m_i[j]\right)/(l + 1)\right)\sqrt{m} \le 2s\sqrt{m}$, that is, C solves the SIS problem. The min-entropy of SamplePre is $\omega(\log n)$. 21 Therefore, the probability of $e_i^* \ne e_i$ is $1 - 2^{-\omega(\log n)}$.

Other design goals
Some common design goals include public auditability and privacy preserving:
1. Public auditability: the public auditability of our scheme is evident. In our VerifyProof algorithm, TPA does not need any private information in the verification equation, which means anyone can audit the data.
2. Privacy preserving: TPA should not be able to deduce the users' data by solving linear equations when it audits the cloud. In our ProofGen algorithm, the cloud server sends the proof (s, t, v) to TPA. [...] contains the original data blocks. If TPA selects the same index set of data blocks for audit c times, he can get c equations. However, b is unknown to him. The number of unknown variables is always greater than the number of equations. Therefore, he cannot reveal the user's data blocks by solving the linear equations.

Efficiency
In this section, we evaluate the efficiency of our scheme. Only an efficient scheme has practical value. To the best of our knowledge, there have been only two lattice-based PDP schemes in the standard model so far, that is, schemes [17] and [18]. We compare the computation, storage, and communication costs of our scheme with those of the other two. The computation cost of the SamplePre algorithm is equivalent to the cost of about $m^2 + 2mn$ multiplications in $\mathbb{Z}_q$ when the parity check matrix belongs to $\mathbb{Z}_q^{n \times m}$. 22 The comparisons are listed in Tables 1 and 2, where c denotes the number of audited blocks, and l denotes the bit length of a file block.
To provide more direct comparisons, we take n = 391, q = 2.34 × 10^10, m = 71,380 for our scheme (non-ideal lattice) and n = 256, q = 7.21 × 10^16, m = 2204 for schemes [17] and [18] (ideal lattice). 23 With these parameters, the security level is roughly equivalent to that of an 88-bit symmetric-key cryptosystem. In addition, we take l = 160. Therefore, we get Tables 3 and 4. From Table 3, we can get Figure 2 for cloud server's running time and Figure 3 for TPA's running time.
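As a quick sanity check on the scale of these costs, the SamplePre estimate of about $m^2 + 2mn$ multiplications in $\mathbb{Z}_q$ can be evaluated for the two parameter sets given above. This is a minimal sketch: the function name and the dictionary layout are hypothetical, and only the SamplePre cost formula and the (n, m) values stated in the text are used.

```python
def sample_pre_cost(n: int, m: int) -> int:
    """Approximate SamplePre cost in multiplications over Z_q: m^2 + 2mn."""
    return m * m + 2 * m * n


# Parameter sets quoted in the text (q does not enter the multiplication count).
PARAMETER_SETS = {
    "our scheme (non-ideal lattice)": {"n": 391, "m": 71_380},
    "schemes [17], [18] (ideal lattice)": {"n": 256, "m": 2_204},
}

if __name__ == "__main__":
    for name, p in PARAMETER_SETS.items():
        cost = sample_pre_cost(p["n"], p["m"])
        # Report in units of 10^8 multiplications as well as the raw count.
        print(f"{name}: {cost:,} multiplications ({cost / 1e8:.4f} x 10^8)")
```

The non-ideal-lattice parameters give 5,150,923,560 multiplications and the ideal-lattice parameters give 5,986,064, so a single SamplePre call is roughly three orders of magnitude more expensive at the larger dimension m.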
From Table 3, we can see that scheme [17] is the most efficient one in the TagSign stage. From Table 4, we can see that scheme [17] is the most efficient one in the storage of Pk + Sk and that ours is the most efficient one in the storage of Tags and the communication of Tags and Proof. From Figure 2, we can see that ours is the most efficient one in cloud server's running time (ProofGen stage). And according to Figure 3, scheme [17] is the most efficient one in TPA's running time (VerifyProof stage). Therefore, it can be concluded that our scheme is efficient.

Table 1
Schemes | TagSign | ProofGen | VerifyProof
[18] | $m^2 + 2mn + n^2 m^2$ | $2cm(n^2 + n)$ | $n^2 m + cnm$
Our scheme | $m^2 + 3mn$ | $m^2 + 2(n + c)m$ | $(c + 1)(m + 1)n + nm$
Note: n and m are of the same meaning as in the scheme, and c denotes the number of audited blocks.

Table 2
Schemes | Pk + Sk | Tag-Length | Proof-Length
[17] | $mn \log q$ | $nm \log q$ | $2nm \log q$
[18] | $3nm \log q$ | $nm \log q$ | $2nm \log q$
Our scheme | $2(n + m + l + 1)m \log q$ | $m \log q$ | $(2m + n) \log q$
Note: n, m, and q are of the same meaning as in the scheme, and l denotes the bit length of a file block.

Conclusion
To resist quantum computer attacks, a lattice-based PDP scheme is proposed for cloud-based smart grid data management systems to ensure the integrity of cloud data. We prove that the scheme is secure under the SIS assumption in the standard model, and performance analysis shows that the scheme is efficient.

Figure 2. Cloud server's running time.
Note: the abscissa denotes the number of challenged blocks, and the ordinate denotes the number of multiplications in $\mathbb{Z}_q$, in units of $10^8$.