A novel two-dimensional reversible data hiding method with high embedding capacity in H.264/advanced video coding

Reversible data hiding based on video compression standards, such as H.264/advanced video coding, has attracted increasing attention from researchers because it can be used in applications such as error concealment and privacy protection. This has motivated us to propose a novel two-dimensional reversible data hiding method with high embedding capacity in this article. In this method, all selected quantized discrete cosine transform coefficients are first paired two by two. Then, each zero coefficient-pair can embed 3 information bits, and each coefficient-pair containing exactly one zero coefficient can embed 1 information bit. In addition, only one coefficient of each remaining coefficient-pair needs to be changed for reversibility. Therefore, the proposed two-dimensional reversible data hiding method obtains higher embedding capacity than the related work. Moreover, the proposed method leads to less degradation in terms of peak-signal-to-noise ratio and structural similarity index, and has less impact on bit-rate increase.


Introduction
With the introduction of sensor networks, many smart devices are used to collect large amounts of data, including images, videos, speech, and text, for smart homes, health monitoring, traffic control, and so on. This indeed makes people's lives more convenient. However, it may also lead to the leakage of personal information. For example, face and fingerprint information in public videos can be abused. Therefore, it is important to preserve the key content of a video without leaking personal information. Nowadays, distributing digital videos globally has become easier because of the rapid development of high-speed broadband Internet and video encoding standards. For instance, in practice, 1 many social applications, such as Skype, Facebook, WhatsApp, WeChat, Blog, and QQ, can be used to spread digital videos. This has raised many concerns and may lead to many problems, even criminal activities, such as the illegal distribution of digital movies and the leakage of personal privacy information in public videos. Therefore, finding effective ways to solve these problems or prevent them from happening has become necessary.
Currently, encryption and digital watermarking are two techniques commonly applied to digital multimedia, such as images and videos, to address these problems. When encryption is applied to images or videos (also referred to as motion images), the computational complexity may be high. In addition, the encrypted stream may be format-incompatible with a video codec. At the same time, encryption usually makes the content of images or videos unavailable. However, in some cases, such as copyright protection, the content should remain available. Hence, many researchers have started to study digital watermarking in videos, not only to solve these problems but also for other purposes, such as broadcast monitoring, copy or playback control, online location, and content filtering. 1 Digital watermarking is a part of data hiding (DH), 1,2 and it is classified into irreversible and reversible watermarking, corresponding to irreversible DH and reversible data hiding (RDH), respectively. Compared with irreversible data embedding, RDH has attracted much more attention from researchers because it can embed additional information into digital media, such as images and videos, and recover the original media content after extracting the embedded information from the marked digital media. 3,4 In the past two decades, RDH in images has developed rapidly and produced many achievements. 5 For instance, Wu et al. 6 designed an RDH scheme for encrypted palette images, since palette images are widely used in real life. In their RDH scheme, a color partitioning method uses the palette colors to construct a certain number of embeddable color triples for embedding the secret data. By doing this, their scheme provides a relatively high data-embedding payload with low computational complexity. Recently, Yang et al. 7 proposed an adaptive real-time RDH scheme for JPEG images. This RDH scheme is realized by using successive zero coefficients in zig-zag order of discrete cosine transform (DCT) blocks. Their experimental results verified that the scheme can enhance embedding capacity while maintaining image quality. Moreover, Chen and Wang 8 proposed an RDH method with high embedding payload for JPEG images. In this method, each quantized discrete cosine transform (QDCT) coefficient is changed to carry 1 information bit, thus leading to high embedding payloads.
In videos, there are many kinds of coding parameters that can be changed for RDH and for DH in general. Therefore, compared with images, videos leave much more room for developing RDH and DH techniques. Recently, with the development of video compression standards, such as H.264/advanced video coding (AVC) 9 and high-efficiency video coding (HEVC), 10 some video DH methods 2,11-16 and video RDH methods 17-23 have been reported. The DH methods were proposed for improving embedding capacity, 12,14 stopping intra-frame drift, 11,13,16 and reducing bit-rate increase. 24 The RDH methods were proposed for improving error concealment performance, 18 enlarging embedding capacity, 17 protecting privacy, 20,23 and preventing inter-frame distortion drift. 19 However, compared with the development of RDH in images, this is not enough, which has motivated us to continue researching RDH in video. Recently, Xu and Wang 18 proposed a two-dimensional (2D) RDH method, as shown in Figure 1, for error concealment of intra-frames in videos. Compared with one-dimensional RDH, 2D RDH achieves better performance in terms of peak-signal-to-noise ratio (PSNR) and structural similarity index (SSIM). Thus, Xu and Wang's scheme is better than Chung et al.'s scheme. 25 However, Xu and Wang did not make full use of (0,0), and only a part of the coefficient-pairs containing one zero coefficient are mapped for DH. This has motivated us to propose a novel 2D RDH method in this article for improving the embedding capacity. In our experiments, we use the block selection method of Chen et al. 11 and apply both the 2D RDH of Xu and Wang's method 18 and our proposed 2D RDH method in the H.264/AVC reference software JM12.0. 26 In other words, we compare them under identical conditions. Experimental results verify that our proposed method outperforms Xu and Wang's method in terms of embedding capacity. Furthermore, our proposed 2D RDH method causes little degradation in visual quality and has little impact on coding efficiency in terms of bit-rate increase.
The remainder of this article is organized as follows. In section "Proposed method," we present the proposed 2D RDH method. Some experimental results and analysis are given in section "Experimental results and analysis." Finally, we draw some conclusions in section "Conclusion."

Proposed method
In common RDH methods, zero QDCT coefficients are not exploited to embed information in compressed images and videos. Therefore, Chen et al. presented a video RDH scheme that uses zero QDCT coefficient-pairs from high-frequency areas. 17 Based on Chen et al.'s 17 work, we propose a novel 2D RDH method for H.264/AVC videos as follows. The histogram modification is shown in Figure 2.

Data embedding
During data embedding, our proposed 2D RDH method operates on paired QDCT coefficients, so all coefficients should first be paired two by two. In the following, each coefficient-pair is also called a point, A(x, y), denoted as (x, y) for short in this article. All points constitute a set denoted as A, and the points in A are changed for data embedding. In the embedding procedure, b is a binary string. Each coefficient-pair is changed as follows.
2. If A(x, y) = (−1, 0), that is, x = −1 and y = 0, the point A(x, y) is changed by equation (2).
3. If A(x, y) = (1, 0), the point A(x, y) is changed by equation (3).

Data embedding
1. If x < −1 and y = 0, the point A(x, y) is changed by equation (4) for data embedding.
2. If x > 1 and y = 0, the point A(x, y) is changed by equation (5) for data embedding.
3. If x = y = 0, the point A(x, y) is changed by equation (6) for data embedding.

Exploiting equations (1)-(6), information can be embedded into the videos reversibly.
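The pairing step and the case conditions listed above can be illustrated with a short Python sketch. It only decides which case applies to a point; it does not implement the mappings themselves. The function names, the pairing of consecutive coefficients, and the handling of an unpaired trailing coefficient are assumptions made for illustration.

```python
def pair_coefficients(coeffs):
    """Group selected QDCT coefficients into consecutive (x, y) pairs.

    Pairing neighbours in scan order and dropping an unpaired trailing
    coefficient are assumptions of this sketch.
    """
    return [(coeffs[i], coeffs[i + 1]) for i in range(0, len(coeffs) - 1, 2)]


def embedding_case(x, y):
    """Name the case listed in the text that applies to the point (x, y)."""
    if y != 0:
        return "not listed"      # the conditions listed above only cover y = 0
    if x == 0:
        return "equation (6)"    # x = y = 0
    if x < -1:
        return "equation (4)"    # x < -1 and y = 0
    if x > 1:
        return "equation (5)"    # x > 1 and y = 0
    return "(-1, 0) case" if x == -1 else "(1, 0) case"


# Made-up coefficient values; the last one has no partner and is left out.
coeffs = [0, 0, -3, 0, 1, 0, 2, 5, 7]
for point in pair_coefficients(coeffs):
    print(point, "->", embedding_case(*point))
```

Keeping the case selection separate from the mapping makes it easy to check that every point falls into exactly one case before any coefficient is modified.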

Data extraction and video recovery
Corresponding to the procedure of data embedding, the data extraction and the video recovery are addressed as follows.

Data extraction
1. ... at the same time the absolute value of y′ is not greater than 1, the embedded information b is extracted by equation (7).
2. If A′(x′, y′) = (2, y′) and |y| = 1, the embedded information b is extracted by equation (8).
3. For one point A′(x′, y′), if the sum of the absolute values of x′ and y′ is not greater than 1 and x′ and y′ are not −1 at the same time, the embedded information b is extracted by equation (9).

Video recovery
1. For one point A′(x′, y′), if x′ < −2 and at the same time the absolute value of y′ is not greater than 1, this point is restored by equation (10).
2. For one point A′(x′, y′), if x′ > 1 and at the same time the absolute value of y′ is not greater than 1, this point is restored by equation (11).
3. For one point A′(x′, y′), if the sum of the absolute values of x′ and y′ is not greater than 2 and x′ and y′ are not −1 at the same time, this point is restored by equation (12).
4. For one point A′(x′, y′), if the absolute value of y is greater than 1, this point is restored by equation (13).
6. If A′(x′, y′) = (−1, −1), this point is restored by equation (15).

By using equations (10)-(15), the original compressed videos can be restored.

Analysis of embedding capacity and distortion
To analyze the embedding capacity and distortion of our proposed method, we first define three sets, A1, A2, and A3, where A1 and A2 are used for data embedding and A3 is shifted for reversibility. The embedding capacity is then calculated by equation (16), where k_A1 and k_A2 denote the average embedding rate per coefficient-pair for the sets A1 and A2, respectively, and h(A) is defined by equation (17), where # is the cardinal number of a set and A denotes all points in the set A. Thus, equation (16) can be rewritten as equation (18). In our experiments, we count h ... It is very close to the result shown in Table 1, that is, 7343 bits. Moreover, the embedding distortion can be defined by equation (19), where d_A1, d_A2, and d_A3 denote the average modification rate per coefficient-pair for the sets A1, A2, and A3, respectively. Thereby, equation (19) can be rewritten as equation (20). In fact, the embedding distortion cannot be measured by equation (19), since H.264/AVC uses intra-frame and inter-frame prediction. Equation (19) gives the total number of modifications to QDCT coefficients, not the embedding distortion. Finding a reasonable way to calculate the embedding distortion is a big challenge and a direction for our future research; we do not address it further in this article.
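As a rough illustration of a capacity tally of this kind, the following Python sketch counts coefficient-pairs by their number of zero coefficients and multiplies the counts by the nominal payloads stated in the abstract (3 bits for a zero coefficient-pair, 1 bit for a pair with exactly one zero coefficient). Treating these payloads as the per-set embedding rates, and these categories as the embedding sets, is an assumption of the sketch; the function names and sample pairs are hypothetical.

```python
from collections import Counter

# Nominal payloads taken from the abstract; using them as per-set rates
# is an assumption of this sketch.
BITS_ZERO_PAIR = 3       # pair (0, 0)
BITS_ONE_ZERO_PAIR = 1   # pair with exactly one zero coefficient


def pair_category(x, y):
    """Classify a coefficient-pair by how many of its coefficients are zero."""
    zeros = (x == 0) + (y == 0)
    return ("no_zero", "one_zero", "both_zero")[zeros]


def estimate_capacity(pairs):
    """Return (estimated payload in bits, per-category pair counts)."""
    counts = Counter(pair_category(x, y) for x, y in pairs)
    bits = (BITS_ZERO_PAIR * counts["both_zero"]
            + BITS_ONE_ZERO_PAIR * counts["one_zero"])
    return bits, counts


# Made-up pairs: two zero pairs, two one-zero pairs, one pair without zeros.
pairs = [(0, 0), (0, 0), (3, 0), (0, -2), (1, 1)]
bits, counts = estimate_capacity(pairs)
print(bits, dict(counts))
```

Pairs without any zero coefficient contribute no payload in this tally; in the method they are only shifted to keep the mapping reversible.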

Experimental results and analysis
This section contains four subsections, that is, setup, embedding capacity, visual quality, and bit-rate variation.

Setup
To evaluate the performance of the proposed 2D RDH method, we implemented it in the H.264/AVC reference software JM12.0. 26 Twelve standard video sequences, including Akiyo, Claire, Coastguard, Container, Foreman, Miss America, Mobile, Mother-Daughter, News, and Suzie (as shown in Figure 3), downloaded from websites, 27 with a resolution of 176 × 144, are used in our experiments. Moreover, the main configuration parameters of JM12.0 are given in Table 2; QP is discussed in the following subsections, and parameters not mentioned keep their default values. The group of pictures (GOP) structure is IBPBPBPBPBPBPBPB. To fairly compare the performance of the proposed 2D RDH method with the related method proposed by Xu and Wang, 18 we use the block selection method proposed by Chen et al. 11 to select blocks for data embedding. In other words, the two methods are compared under identical conditions. In addition, we use embedding capacity, visual quality, and bit-rate variation to measure the performance of our proposed 2D RDH method. PSNR and SSIM 28 are used for objectively evaluating the visual quality of marked videos. Bit-rate comparisons show the impact of our proposed 2D RDH method on the H.264/AVC encoder in terms of coding efficiency. In the following subsections, the "Original" PSNR, SSIM, and bit-rate values are computed by the original H.264/AVC encoder. The other values are computed by the H.264/AVC encoder with the corresponding DH methods. More analyses are given as follows.

Embedding capacity
Table 1 shows the maximum embedding capacities on the 12 video sequences mentioned in section "Setup" obtained with Xu and Wang's method 18 and our proposed method. In Table 1, QP takes three values, that is, 24, 26, and 28, and it determines the quantization step of the H.264/AVC encoder. 9 According to Table 1, our proposed method has larger maximum embedding capacities on these 12 video sequences than Xu and Wang's method. 18 For example, on Miss America in Table 1, our proposed method obtains 3204, 2628, and 2543 bits for QP = 24, 26, and 28, respectively, whereas Xu and Wang's method 18 obtains 2086, 1720, and 1664 bits. Moreover, our proposed method obtains average maximum embedding capacities of 6803, 6450, and 3598 bits, which are greater than those of Xu and Wang's method, 18 that is, 4339, 4126, and 2348 bits. These results verify that our proposed 2D RDH method has an advantage in embedding capacity over Xu and Wang's method. 18
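To make the size of this advantage concrete, the capacities quoted above can be turned into relative gains. The Python sketch below uses only the Miss America and average figures given in the text; the variable and function names are illustrative.

```python
QPS = [24, 26, 28]

# Maximum embedding capacities in bits, as quoted in the text from Table 1.
proposed = {"Miss America": [3204, 2628, 2543], "Average": [6803, 6450, 3598]}
xu_wang = {"Miss America": [2086, 1720, 1664], "Average": [4339, 4126, 2348]}


def relative_gain(new_bits, ref_bits):
    """Capacity gain of new_bits over ref_bits, in percent."""
    return 100.0 * (new_bits - ref_bits) / ref_bits


gains = {
    name: [relative_gain(p, r) for p, r in zip(proposed[name], xu_wang[name])]
    for name in proposed
}

for name, values in gains.items():
    for qp, gain in zip(QPS, values):
        print(f"{name:>12}  QP={qp}: +{gain:.1f}%")
```

For the quoted figures, the gain is above 50% at every QP value, both for Miss America and on average.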

Visual quality
In this subsection, we measure the visual quality of videos marked by our proposed method in two ways. On one hand, we give ... However, we make use of PSNR and SSIM to objectively evaluate the visual quality of marked videos, and the results are shown in Tables 3 and 4. In Table 3, when QP is fixed, the "Original" PSNR value is greater than the PSNR values produced by Xu and Wang's method 18 and our proposed method. In addition, Xu and Wang's method 18 provides larger PSNR values than our proposed method. For instance, using the H.264/AVC encoder with QP = 24 on Mobile, the three yield 37.83, 37.55, and 37.47 dB, respectively. The average PSNR values follow the same pattern. Note, however, that Tables 3-5 correspond to Table 1, that is, to the maximum embedding capacities, at which our proposed method embeds more bits. Therefore, it is not fair to compare the two 2D RDH methods in this way. To better compare the two methods, we define the PSNR variation (PSNRV), in which PSNR_Ori and PSNR_2DRDH are determined by the H.264/AVC encoder without and with the 2D RDH method, respectively, and EC denotes the embedding capacity.
PSNRV is used to evaluate embedding distortion. Based on this, we give Figure 6, which corresponds to QP = 28. As shown in Figure 6, for these 12 video sequences, the proposed method leads to embedding distortion very close to that of Xu and Wang's method. 18 In particular, although our proposed method produces greater embedding distortion on Suzie than Xu and Wang's method, 18 it produces less embedding distortion on the other 11 video sequences. In other words, data hiding with our proposed method has less impact.
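A per-bit normalization of this kind can be sketched in a few lines of Python. The form used here, PSNR loss divided by embedding capacity, is an assumption consistent with the quantities named in the definition (PSNR without and with 2D RDH, and EC), not necessarily the exact expression used in the article. The PSNR values are the Mobile figures at QP = 24 quoted above; the two EC values are hypothetical placeholders, since the capacities for Mobile are not quoted in the text.

```python
def psnr_variation(psnr_ori, psnr_marked, ec_bits):
    """PSNR loss in dB per embedded bit (assumed form of PSNRV)."""
    if ec_bits <= 0:
        raise ValueError("embedding capacity must be positive")
    return (psnr_ori - psnr_marked) / ec_bits


PSNR_ORI = 37.83         # "Original", Mobile, QP = 24 (from the text)
PSNR_XU_WANG = 37.55     # Xu and Wang's method (from the text)
PSNR_PROPOSED = 37.47    # proposed method (from the text)

EC_XU_WANG = 4000        # hypothetical placeholder, in bits
EC_PROPOSED = 6000       # hypothetical placeholder, in bits

print(psnr_variation(PSNR_ORI, PSNR_XU_WANG, EC_XU_WANG))
print(psnr_variation(PSNR_ORI, PSNR_PROPOSED, EC_PROPOSED))
```

With these placeholder capacities, the method with the lower absolute PSNR still has the smaller loss per embedded bit, which is why a per-bit measure gives a fairer comparison when the two methods embed different amounts of data.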
In Table 4, for "Original," Xu and Wang's method, 18 and our proposed method, the SSIM values decrease as the QP value increases. Although our proposed method gives the lowest SSIM value, 0.9404, on Coastguard when QP = 28, the differences among the three are small. Considering the embedding capacity (shown in Table 1), the performance of our proposed method is acceptable. Overall, our proposed method provides larger embedding capacity with embedding distortion close to that of the related work.

Bit-rate variation
The bit-rate before and after embedding data into video sequences is often used to evaluate the coding efficiency of the H.264/AVC codec. In this subsection, Table 5 shows the bit-rate without and with the DH methods. Likewise, Table 5 corresponds to Table 1. According to Table 5, our proposed method yields coding efficiency close to that of Xu and Wang's method. 18 Compared with "Original," our proposed method has little impact on coding efficiency. In addition, to better compare our proposed method with Xu and Wang's method, 18 we define the bit-rate variation, in which B_2DRDH and B_Ori are generated by the H.264/AVC encoder with and without the RDH method, respectively. Similarly, based on this, we draw Figure 7.
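The bit-rate measure can be sketched in Python in the same spirit. Dividing the bit-rate increase by the embedding capacity is an assumption based on the phrase "bit-rate increase per embedding information bit"; whether the increase is taken in absolute or relative terms is not recoverable from the text, so the sketch offers both. All numbers below are hypothetical placeholders, not values from Table 5.

```python
def bitrate_variation(b_ori, b_marked, ec_bits, relative=False):
    """Bit-rate increase per embedded bit (assumed form).

    With relative=True the increase is first expressed as a percentage
    of the original bit-rate.
    """
    if ec_bits <= 0:
        raise ValueError("embedding capacity must be positive")
    increase = b_marked - b_ori
    if relative:
        increase = 100.0 * increase / b_ori
    return increase / ec_bits


# Hypothetical placeholder values: bit-rates in kbit/s, capacity in bits.
B_ORI, B_2DRDH, EC = 200.0, 203.0, 3000

print(bitrate_variation(B_ORI, B_2DRDH, EC))
print(bitrate_variation(B_ORI, B_2DRDH, EC, relative=True))
```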
Herein, Figure 7 corresponds to Figure 6 and Table 1. As seen from Figure 7, except on Claire, our proposed method leads to less bit-rate increase per embedded information bit than Xu and Wang's method. 18 That is to say, our proposed method outperforms Xu and Wang's method 18 ...

Figure 6. Peak-signal-to-noise-ratio variation comparisons (QP = 28).

Conclusion
... for data embedding. In addition, only one coefficient in other points is changed to vacate room for reversibility. Therefore, the proposed method provides higher embedding capacity than the related method. When comparing the variation of PSNR and SSIM caused by each embedded information bit, the proposed method keeps better visual quality. Moreover, the increase in bit-rate caused by the proposed method is smaller.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this