A novel two-dimensional reversible data hiding scheme based on high-efficiency histogram shifting for JPEG images

In recent years, reversible data hiding technology has been widely used in JPEG images for special purposes such as file management and image authentication. Histogram shifting is one of the most popular techniques for achieving reversible data hiding technology. However, invalid shifting in histogram shifting limits the performance of existing reversible data hiding schemes. Therefore, we propose a two-dimensional histogram shifting-based reversible data hiding scheme in this article to improve the performance of marked JPEG images in terms of visual quality and file size. In the proposed histogram shifting method, only the coefficient pairs containing two non-zero quantized discrete cosine transform coefficients are changed for embedding data. Specifically, the coefficient pairs with at least one quantized discrete cosine transform coefficient valued −1 or +1 are shifted and the rests leave room for embedding data. With our proposed reversible data hiding scheme, the number of invalid shifting pixels is reduced so that it improves the performance of marked JPEG images. The experimental results show that the proposed method achieves a higher peak signal-to-noise ratio and has a lower increase in file size than state-of-art methods.


Introduction
For the last few decades, reversible data hiding (RDH) methods are springing up like mushrooms. The RDH is also called lossless data hiding in many cases. The difference between RDH methods and traditional information hiding methods is that the former can recover the original carrier image without losing any information. [1][2][3][4] And by RDH, not only the hidden data but also the original cover medium can be completely recovered by the authorized users. 5 There are four main categories for existing RDH methods: (1) lossless compression, [6][7][8][9][10][11] (2) difference expansion (DE), [12][13][14] (3) histogram shifting (HS), [15][16][17][18][19] and (4) integer-to-integer transform-based methods. [20][21][22][23] Recently, by applying RDH, the embedding distortions can be eliminated when extracting embedded data. RDH is usually used in some sensitive scenarios with distortions, such as medical images, military images, legal forensics, copyright protection, and other related fields, in which any distortion is not prohibited.
So far, various RDH algorithms have emerged. Among them, the prediction error expansion (PEE) algorithm 17,[24][25][26][27][28][29][30][31][32][33][34][35][36] is widely used. For example, in Qu and Kim, 33 an improved pixel-based pixel value ordering (PPVO) method was proposed to predict each pixel. Compared with previous pixel value ordering (PVO) methods, it was not constrained by blocks and secret data are embedded in some smoother areas. To further reduce distortions and improve embedding capacity, a new scheme was proposed in Long et al., 36 in which an image is divided into three L-shaped blocks with nonoverlapping pixels. After these blocks are scrambled with dromion and stream encryption, the difference is calculated, and the paired prediction errors corresponding to each block are accumulated to form a twodimensional (2D) prediction error histogram to carry data. Then, a halftone image RDH algorithm based on minimizing the visual distortions of pixel flipping and an RDH scheme based on the image texture to reduce invalid pixels shifting were proposed in Yin et al. 37 and Jia et al., 38 respectively. A method proposed by Li et al., 39 to reduce invalid pixel shifting is based on PVO and PEE, which can not only reduce the number of shiftable pixels, but also improve the image visual quality for using PVO in the PEE scheme.
For RDH schemes, PVO is very important and practical. The literature [40][41][42][43] used block-based ordering (neighboring pixel ordering). Sachnev et al. 44 proposed a popular efficient data hiding using a sorting prediction technique. In Weng et al., 45 a method based on improved dynamic pixel value ordering (IPVO) was proposed, which can flexibly modify the number of pixels in a block by classifying the local complexity into multiple levels.
HS is a very successful method of RDH, which was first proposed by Ni et al. 1 In this method, the histogram is generated and some values in the histogram are extended to carry data. Based on the sensitivity of Human Vision System, Chang et al. 46 introduced discrete cosine transform (DCT) technique and discarded the low-frequency components, and directly modified the continuous zero value coefficients in the midfrequency coefficients. Xuan et al. 47 proposed that HS technique can be used in JPEG images to extend multiple pairs of bins of DCT coefficient histograms and modify them to carry secret data. Peng et al. 48 proposed a method in which secret data are first encoded into grouped pixels through some specific pixel transformation operations. Then, multi-segment left and right HS based on threshold operation is used to record the pixel change operation. This method is one of the checkerboard prediction techniques and multiple embedding strategies. Gao et al. 49 proposed an RDH method based on prediction error histogram (PEH), which makes the full use of correlations of coded channels, and designed a quadratic time prediction RDH scheme based on 2D PEH. He et al. 50 proposed an improved RDH scheme for JPEG images based on block sorting and frequency selection. According to the distribution of DCT coefficients, the scheme accurately estimates the distortion caused by the embedded data in each block.
The rest of this article is organized as follows. In section ''Related works,'' the related works are introduced. The proposed scheme is described formally in section ''The proposed method.'' The experimental results are given in section ''Experimental results'' and followed by a conclusion in section ''Conclusion.''

Related works
As presented in section ''Introduction,'' lots of RDH schemes are based on HS. We review three main RDH schemes proposed in Huang et al., 51 Wedaj et al., 52 and Xiao et al., 53 respectively, in this section.

Huang et al.'s scheme
Huang et al. 51 selected quantized alternating current (AC) coefficients values of + 1 and 21 in some specific smooth blocks for extended embedding. The smoothness of the block is defined according to the number of AC coefficients with 0 value in each block. That is, the smoothness of the kth block can be defined by the following formula where # means the cardinality of a set. Note that, for a JPEG image of size M 3 N , the number of blocks k ranges from [1, M 3 N =64]. Based on the smoothness, the sequence of the quantized blocks (T 1 , :::, T M 3 N =64 ) is sorted in descending order to generate a new sequence (T a(1) , :::, T a(M 3 N =64) ), where a : f1, :::, M 3 N =64g ! f1, :::, M 3 N =64g is the unique one-toone mapping such that T a(1) ø :: Finally, for a given p (p indicates the selected blocks number, and is determined as the smallest one that can provide sufficient embedding capacity), the AC coefficients in the first p blocks are modified to embed data, that is, for each k 2 fa(1), :::, a(p)g and i 2 f1, :::, 63g, Z k, i is modified to Z Ã k, i as where m 2 f0, 1g is a to-be-embedded bit. Note that, in the above process, only the first p smooth blocks are used for data embedding, and the rough blocks with indices fp + 1, . . . , M 3 N =64g are kept unchanged.

Wedaj's scheme
The embedding method in Wedaj et al. 52 is the same as that in Huang et al. 51 The difference is that the former optimized the selection of cover blocks. The embedding efficiency of each AC frequency band should be calculated, compared, and descending ranked, and p blocks with maximum embedding efficiency should be selected to carry data. Then, for each AC band i 2 f1, . . . , 63g, the embedding efficiency denoted by R i is calculated as where Q i is the quantization table entry at a position at i, and E (i, n) and S (i, n) are defined as follows After the embedding efficiency is calculated, descending order is performed from high to low. Therefore, the embedding can be done for the corresponding AC coefficients that conform to the embedding rules until all embedding payloads are satisfied.

Li's scheme
Similarly, Xiao et al. 53 also calculated the embedding efficiency of all AC coefficient bands. However, different from Huang et al. 51 and Wedaj et al., 52 they did not embed data in the AC coefficients with value 6 1. Instead, they selected a pair of bins (c i , d i ) with c i \d i for expansion, in which the bins between c i and d i remain unchanged, and the bins smaller than c i or larger than d i are shifted. Specifically, for each k 2 f1, . . . , p, . . . , M 3 N =64g and i 2 f1, :::, l, :::, 63g, where m 2 f0, 1g is the secret bit.

The proposed method
In this section, we will describe the proposed scheme in details. Figure 1 shows the flowchart of the proposed scheme, in which data hiding is applied in quantized discrete cosine transform (QDCT) coefficients of cover images, and it includes two stages, block selection and data embedding. In block selection, every two adjacent QDCT coefficients are paired, and then embedding efficiency is evaluated and sorted in descending order. According to the ordered coefficient pairs, the minimum number of blocks that can meet the requirement of embedding capacity is p, and the secret bits are then embedded.

Coordinate classification based on QDCT coefficients pairs
As shown in Figure 2, JPEG encoder includes three main components, namely, DCT, quantitative, and entropy code. Through JPEG compression, the original image is orthogonally transformed. The specific method is used to perform 2D DCT on the non-overlapping 8 3 8 blocks and transport the DCT coefficients into the quantizer. The quantized DCT coefficients are arranged in a zig-zag scanning mode, and direct current (DC) coefficients are modulated by differential pulse code on one hand. Then, the run-length encoding (RLE) on the AC coefficients is pre-compressed and encoded by Huffman so that we can get the JPEG file.
What is more, in Figure 2, all coefficients are obtained by zig-zag scanning in the encoder, and then every two adjacent coefficients are paired. The subsequent embedding is based on the embedding of coefficient pairs. For example, S = fs 1 , s 2 , . . . , s 64 g of 64 coefficients in an 8 3 8 block is paired as (a, b) = f(s 1 , s 2 ), (s 3 , s 4 ), . . . , (s 63 , s 64 )g. Figure 3 shows a quantized 8 3 8 DCT coefficient block. Since there are 64 coefficients in a block, thus, there are 32 coefficients pairs which are then mapped into a 2D plane, as shown in Figure 4.
In Figure 4, all the coefficient pairs are divided into three categories: embeddable, immutable, and shiftable pairs. Note that the purple coordinates in Figure 4 (i.e. coefficient pairs) contain at least one 0 QDCT coefficient, and they are immutable coefficient pairs. Since modifying the coefficient 0 will produce more distortions and lead to the degradation of image quality, coefficient pairs containing the coefficient 0 are kept unchanged.
As shown in Figure 4, the red arrows of the solid line indicate that the coefficient pairs are embeddable, and the binary secret message bit is ''0'' or ''1'' around these arrows. In addition, the arrows also indicate the corresponding expansion directions of the coefficient pairs. The black dotted arrows indicate the shiftable pairs and the pairs can only be shifted to reserve room. In addition, there are four special circular red arrows, corresponding to four points, (1, 1), (21, 1), (21,21), and (1, 21), when the embedded secret message is ''10,'' coefficients of these four points will be kept unchanged. What is more, if the embedded secret message is ''11,'' then the vertical operation is performed, that is, the coordinate value is modified up or down accordingly, as shown in Figure 4.

Block selection
In the method proposed in Huang et al., 51 blocks are sorted according to the number of coefficients with 0 values in the blocks, and the experimental results also accurately show that in blocks with more zero AC coefficients, there are more AC coefficients of + 1 or 21 generally. Therefore, taking full advantage of this statistical property, selecting the first p blocks sorted blocks,  pairing the coefficients in these blocks, and embedding the data into coefficient pairs with a value of + 1 or 21 can reduce the distortion of the file and the incremental value of storage size.
However, we do not take the above scheme into two accounts here. On one hand, all AC coefficients of each block can be classified into three categories: embeddable, immutable, and shiftable AC coefficients. In some cases, the number of embeddable coefficients at a given spatial frequency is greater than the number of shiftable coefficients. Figure 5 shows the total number of embeddable and shiftable coefficients on each AC coefficient bit of all blocks of the aircraft F-16 image and the boat image, respectively. Figure 5 shows  that the embeddable AC coefficient at position 25 is obviously more suitable for embedding. On the other hand, the modification cost of each location is different, that is, the non-uniform quantization table structure leads to less distortion caused by modifying the AC coefficients near the DC coefficient than those at high frequency. Therefore, we will choose a method to select the position of the embedding coefficient below.
Fortunately, for a JPEG image with size M 3 N, an embedding efficiency metric is introduced, which was proposed in Wedaj et al., 52 that is, equation (3), including the number of embeddable and shiftable AC coefficients in all non-overlapping 8 3 8 blocks to select a position of the AC coefficients to reduce distortions. In addition, in equation (3), for the non-uniform structure of the quantization table, modifying the AC coefficient which is near the DC coefficient may lead to less distortion than modifying the last one.
So based on the idea of Wedaj et al. 52 and according to equation (3), the embedding efficiency R i of 31 AC coefficient pairs is calculated (The first coefficient pair contains DC coefficient, and this coefficient pair will be discarded.). Then, the range of i in equations (3)-(5) is changed to i 2 f1, 2, 3, . . . , 31g. Finally, descending order is performed.

Data embedding
Compared with the three schemes mentioned in section ''Related works,'' our method is different in the following two aspects: 1. In all coefficient pairs, those containing at least one zero element are kept unchanged. 2. Only coefficient pairs including at least one of the AC coefficients of + 1 and 21 are expanded for data embedding.
For convenience, we use (a i , b i ) and m 2 f0, 1g to represent the coefficient pairs and the secret bit, respectively.
The embedding is implemented as follows: Step 1: get the sequence T k of the block as the method in section ''Huang et al.'s scheme.'' As described in section ''Block selection,'' the QDCT coefficient pairs (called coordinates) of each block can be obtained from these blocks. In coordinates, if there is one or two zero elements, acquire a next coordinate. If not, go to Step 2.
Step 3: next, if m i = 0, the embedding operation is as follows, otherwise go to Step 4 Step 4: when m i = 1 Step 5: in this step, all operations are based on (a i , b i ) in which only one of the values belongs to + 1 or 21, otherwise, skip to Step 6. When m i = 0 When m i = 1 where |x| represents the absolute value of x. Otherwise, shifting (a i , b i ) as follows Step 1: this step is to recover all shifted coefficient pairs where (a i , b i ) Ã represents the marked QDCT coefficient pairs and (a i , b i ) 0 represents the recovered QDCT coefficient pair. After that, do inverse quantization and inverse DCT transforms to get the original pixel values.
Step 2: in this step, we do not only need to restore the paired pixel values to their original values but also extract the bit of the embedded secret message whose binary bit is ''0,'' that is, m 0 i = 0.
Step 3: the following formula is used to extract the binary bit ''1,'' that is, m where |x| represents the absolute value of x.
If the marked image is transmitted in a lossless channel, (a i , b i ) 0 = (a, b), then the cover image can be recovered losslessly.

Experimental results
Visual quality and file size increment are two performance evaluation metrics commonly used for evaluating RDH methods. In this section, the proposed method is evaluated against three state-of-the-art methods, that is, Huang et al.'s method, 51 Wedaj et al.'s method, 52 and Li et al.'s method. 53 To better show the performance, we give simulations for four methods, that is, the above three methods and ours, and compared each other in both the visual quality and the file size increment, where the setting is coherent.
Six color images-that is, Lena, Sailboat, F-16, Baboon, Peppers, and Splash-with the size of 512 3 512 are downloaded from the USC-SIPI database for experiments. These color images are first converted into six grayscale images and then into JPEG images with QF = 70 and QF = 80. Related analysis and comparisons are described in the following.

Visual quality
In this section, we take advantage of PSNR (Peak-Signal-to-Noise-Rate) to evaluate the visual quality of marked JPEG images with fixed embedding payloads. The relationship between PSNRs and embedding payloads on the six marked JPEG images is shown in Figure 6, where the horizontal axis represents the embedding capacity and the vertical axis represents the PSNRs. As shown in Figure 6, with the increasing of embedding payload, PSNRs for the four RDH schemes decrease. However, although PSNRs of the four methods decrease, it can still be found that under the same embedding payload condition, PSNRs of the proposed method are higher than those of the other three methods.
It can be clearly concluded as follows: 1. PSNRs in the proposed method are higher than those in other methods with the same embedding capacity. 2. With the increasing of the embedding capacity, PSNRs of the proposed method are closed to those in Xiao et al. 53 The reason is that the proposed method in this article combines every two coefficients in each block into a coefficient pair and maps them to a 2D histogram, which has less embedding capacity than the method in Xiao et al. 53 However, the method in this article uses more blocks, but makes fewer changes to the coefficients within each block.
In order to clearly explain the performance, we list PSNRs of six test images with 2500 bits and 4096 bits embedded by the four schemes in Tables 1 and 2. It can be found that the scheme proposed in this article is with high PSNRs. Comparing Tables 1 and 2, it is found that PSNRs of our scheme are higher than those of the other three schemes with less embedded bits. However, as the number of embedded secret bits increases, PSNRs of the method presented in this article become closer to those of the other three schemes, because when more and more secret data are embedded, more blocks are needed; thus, block selection is equivalent to no selection at all. However, the method of coefficient pairing adopted in this article has less embedding capacity compared with the other three methods.
These experiment results show that the proposed scheme goes well, which is profited from the 2D RDH: on one hand, we select relatively smooth regions for embedding through the pairwise method and block selection on the 2D JPEG domain; on the other hand, we discard all the AC coefficients containing 0. The reason is that whenever a zero AC coefficient is modified to non-zero, an extra symbol is needed to be coded, which will increase the size of the file.
Also, in some images, such as Lena and Aircraft F-16 in Figure 6, PSNRs in Huang et al. 51 are larger than those of the other three schemes, which may be due to the texture features. It is worth mentioning that our proposed method is based on 2D HS method, and only the AC coefficient 6 1 is selected for embedding, which is different from Xiao et al., 53 so the improvement of PSNRs is not significant in some images.

File size increment
For RDH, the file size increment is also an important comparison index. In general, a better RDH scheme of JPEG images should minimize the increase storage size as much as possible while ensuring the embedding capacity. Table 3 lists the size increasing of each test   image compared with the original image at different embedding capacities when QF = 70 and QF = 80. In Table 3, it demonstrates that in the four schemes, all the file sizes increase. It means that embedding secret message necessarily results in increasing the file size of the marked image compared to the original image.
However, although in this article, two values in a coefficient pair may be increased or decreased by 1, respectively, or kept unchanged in each embedding operation, the size of the final file does not increase significantly. Figure 7 shows the average file size increment for the three reference methods and the proposed method for Figure 7. Under the condition of QF = 70, the method proposed in this article is compared with the file size increment corresponding to the three state-of-the-art methods. [51][52][53] different embedding capacities. The horizontal axis represents the size of embedded capacity (in bits), and the vertical axis represents the value of increment of the file.
As shown in Figure 7, the file size increment is compared with that in Huang et al. 51 and Wedaj et al. 52 under the same embedded capacity condition. It shows that the file size increment of the method in this article is smaller and better. However, this advantage is not obvious comparing with that in Xiao et al. 53 Similarly, as shown in Figure 7, the comparison of file size increment between the methods in this article and Xiao et al. 53 under the same embedding capacity is very close or even slightly insufficient. The reason is that data are embedded in coefficient pairs. Each embedding operation is to modify at least one AC coefficient, and in most cases, both two AC coefficients will be modified. Figure 8. Under the condition of QF = 80, the method proposed in this article is compared with the file size increment corresponding to the three state-of-the-art methods. [51][52][53] Xiao et al. 53 did not embed data in all the AC coefficients with value + 1 or 21, but adaptively select AC coefficients. Figure 8 shows the comparison of the file size increment with different payloads when QF = 80. The scheme in this article has smaller file size increment. What is more, combining the results in Figure 7 with Figure 8, it can be found that with different QF conditions, the growth trend of file size increment is same.

Computational complexity
Usually, time complexity is used to illustrate the computational complexity. And the time complexity of schemes in this article and the research studies 51-53 is O(n 2 ). However, it is generally affected by the performance of computer hardware. All experiments in this article have been implemented in Windows 10 64-bit operating system with Intel(R) Core(TM) i5-9500F CPU @ 3.00 GHz and memory size of 8192 MB RAM. Simulation codes have been run in MATLAB R2018b. As shown in Table 4, it is a comparison of the runtime with 2500 bits and 14,400 bits secret message, respectively. The average runtime of six carrier images in three aspects is compared, such as total runtime, embedding process runtime, and extraction process runtime. It can be seen from Table 4 that the average runtime of the method proposed in this article is lower than that in Wedaj et al. 52 and Xiao et al., 53 and higher than that in Huang et al. 51 For the schemes in Wedaj et al. 52 and Xiao et al. 53 and the scheme in this article are all based on the effectiveness of the scheme in Huang et al. 51 Conclusion JPEG image format is becoming a research hotspot because of its universality. In this article, a novel RDH method based on 2D HS is proposed. It is based on non-zero AC coefficient pairs. In general, modifications to JPEG images introduce distortions and file size increment. The experimental results have shown that our proposed RDH scheme can reduce the distortions while ensuring the image quality, which is better than some advanced RDH schemes. Also, the bit rate increase of the marked image is lower than state-of-theart schemes. This proposed method can be applied to the sensitive scenarios with distortions, especially for the embedding distortions. In this article, secret data are only embedded in QDCT coefficients with + 1 and 21 values. However, more embedding locations would be discussed for more payloads and the application of the proposed scheme would be an interesting topic in the future.

Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.

Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This work is supported by the National