Color image steganalysis based on embedding change probabilities in differential channels

Color image steganography makes it possible to leak private or company-sensitive data through the Internet of Things, which poses a potential threat to individuals and companies. Existing rich model features for color image steganalysis do not exploit the fact that content-adaptive steganography changes pixels in complex textured regions with higher probability. This article therefore proposes a variant of the spatial rich model feature based on the embedding change probabilities in differential channels. The proposed feature is extracted from the residuals of the differential channels, which reduces the image content information and significantly enhances the stego signal. The embedding change probability of each element in the differential channels is then added to the corresponding co-occurrence matrix bin, so that residuals in textured regions contribute more to the improved co-occurrence matrix feature. The experimental results show that the proposed feature significantly improves the detection performance for WOW and S-UNIWARD steganography, especially when the payload size is small. For example, when the payload size is 0.05 bpp, concatenating the proposed feature with the color rich model feature CRMQ1 reduces the detection errors by 5.20% and 4.90% for WOW and S-UNIWARD, respectively.


Introduction
With the development of 5G, more and more images will be transmitted over the Internet of Things. This huge volume of images provides abundant covers that criminals can use to secretly transmit stolen private or company-sensitive data by image steganography. In the past decades, many steganalysis algorithms [1][2][3][4] have been proposed for both the traditional steganographic algorithms [5][6][7][8] and the content-adaptive steganographic algorithms developed in recent years [9][10][11][12]. Some of them can even locate or extract the secret messages in some special cases [13]. However, most of these steganalysis algorithms only detect secret messages in grayscale images.
Color images, which consist of multiple color channels, are widely used in daily work and life. The steganalysis algorithms proposed for grayscale images can be applied to each color channel, and the detection results of the channels can then be combined to decide whether the color image contains secret messages. However, compared with grayscale image steganography, color image steganography distributes the messages over multiple color channels [14], which decreases the embedding payload in each color channel and therefore increases the difficulty of steganalysis [15,16]. Research on steganalysis of color image steganography is thus very important for the practical application of steganalysis.
Author affiliations: 1 Zhengzhou Science and Technology Institute, Zhengzhou, China; 2 College of Information and Communication, National University of Defense Technology, Xi'an, China.
In 2014, Denemark et al. [28] incorporated the embedding change probabilities into the calculation of the residual co-occurrence matrices of grayscale images and proposed the selection-channel-aware rich model feature maxSRM, which significantly improves the steganalysis performance for content-adaptive grayscale image steganography. Motivated by this idea, this article proposes a variant of the spatial rich model feature based on embedding change probabilities in differential channels. First, considering the strong correlations between different color channels, the differential channels are computed to suppress the image content and strengthen the stego signal. Second, a method is presented to compute the embedding change probabilities in each differential channel. Finally, the co-occurrence matrix feature is extracted from the residuals in the differential channels by incorporating the embedding change probabilities into the calculation of the co-occurrence matrices. The experimental results show that the proposed feature significantly reduces the detection errors for WOW and S-UNIWARD steganography, especially when the payload size is small.

Review of color image steganalysis
So far, the existing color image steganalysis algorithms can be roughly divided into the following five categories according to the extracted steganalysis feature:
1. Because color image steganography affects the statistical characteristics of multiple channels, some color image steganalysis algorithms combine the features extracted from different channels to improve the detection accuracy. For example, Abdulrahman et al. [17] calculated the co-occurrence matrices from the gradient amplitudes of each channel and their derivatives, and then combined them to realize color image steganalysis. Liao et al. [18] obtained the complex textured regions in all channels and in each channel, respectively, and then calculated the co-occurrence matrices of the residuals in each channel for the two types of regions as features to improve the detection accuracy for content-adaptive steganography.
2. Because color image steganography increases the number of colors or similar color pairs, some color image steganalysis algorithms use statistics related to the number of colors or similar color pairs to distinguish stego color images. For example, Fridrich and Long [19] extracted the ratio of similar color pairs as features. Su et al. [20] embedded random information in the given image at fixed ratios, and then extracted the increases in the numbers of different colors and similar color pairs as features to detect color stego images generated by LSB (least significant bit) steganography.
3. Because color image steganography weakens the strong consistency among the textures of different channels, some color image steganalysis algorithms use features related to the texture consistency between channels to detect stego color images. For example, Abdulrahman et al. [16] described the texture direction consistency of different channels by the cosine and sine of the angles between the gradients of different channels, and extracted a feature based on the gradient direction consistency to improve the detection accuracy for color stego images. Liu et al. [21] measured the correlation coefficients among the LSB planes of different color channels and the correlation coefficients between the prediction errors of each channel to capture the influence of steganography on the correlation among different channels.
4. Because color image steganography increases the errors of predicting one channel from the other channels, some color image steganalysis algorithms extract features from the inter-channel prediction errors. For example, Lyu and Farid [22] calculated logarithmic prediction errors that utilize the correlation among wavelet subband coefficients in different color channels, and then used their mean, variance, skewness, and kurtosis as features to realize blind detection of color image steganography. Li et al. [23] calculated the prediction errors of the channel Y to the other channels, and extracted statistical features to detect color JPEG image steganography.
5. Because color image steganography affects the joint distribution of the noise in different channels, some color image steganalysis algorithms use co-occurrence matrices across the residuals of the three color channels to detect stego color images. For example, Goljan et al. [24] extracted the co-occurrence matrices between the residuals of the three channels, and then merged them with the rich model feature of each channel. Goljan and Fridrich [25] divided the image pixels into blocks according to the CFA (color filter array) characteristics of the camera imaging principle, and then computed the co-occurrence matrices of residuals between different channels of each block for steganalysis. Kang et al. [26] extracted the co-occurrence matrix feature from the gradient amplitude residuals among different color channels and then combined it with the existing color image steganalysis features to improve the steganalysis performance.
However, the above co-occurrence matrix features for color images are extracted from the residuals in each channel separately, so some image content information still remains in the residuals. Therefore, Kang et al. [27] extracted the co-occurrence matrix features from the residuals of the differences between every two color channels.

Spatial rich model feature based on embedding change probabilities for grayscale image
Content-adaptive image steganographic algorithms calculate the embedding costs of image pixels using distortion functions, and then confine the embedding changes to the complex textured regions of the cover image. Therefore, Denemark et al. [28] incorporated the embedding change probabilities into the extraction of the popular spatial rich model feature [2] and proposed the selection-channel-aware rich model feature named maxSRM. This feature enhances the effect of residuals in the complex textured regions on the co-occurrence matrices. The "selection channel" in Denemark et al. [28] represents the embedding change probability. To avoid confusion with the color channels, this feature is referred to in this article as the spatial rich model feature based on embedding change probabilities.
Let $X = \{X_{i,j} \mid 0 \le i < H,\ 0 \le j < W\}$ denote the given image, where $H$ and $W$ are the height and width of the given image, respectively, and $X_{i,j}$ represents the pixel in the $i$th row and $j$th column. Like the rich model feature, the maxSRM feature calculates the residuals and then quantizes and truncates them to obtain the quantized residual image $R = \{R_{i,j}\}$:
$$R_{i,j} = \mathrm{trunc}_T\!\left(\mathrm{round}\!\left(\frac{\hat{X}_{i,j}(N_{i,j}) - cX_{i,j}}{q}\right)\right) \tag{1}$$
where $N_{i,j}$ denotes the pixels adjacent to $X_{i,j}$, with $N_{i,j} \neq X_{i,j}$; $\hat{X}_{i,j}$ is the prediction of $cX_{i,j}$ based on $N_{i,j}$; $q$ is the quantization step; $T$ is the truncation threshold; $\mathrm{round}(x)$ is the rounding function; and $\mathrm{trunc}_T(x)$ is the truncation function shown in equation (2):
$$\mathrm{trunc}_T(x) = \begin{cases} x, & |x| \le T \\ T \cdot \mathrm{sign}(x), & \text{otherwise} \end{cases} \tag{2}$$
Before the co-occurrence matrices are calculated, the embedding change probability of each pixel in the given image is estimated according to the possible embedding payload and the steganography algorithm used to generate the stego image. The estimated embedding change probabilities are then incorporated into the calculation of the co-occurrence matrices.
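The quantization and truncation in equations (1) and (2) can be illustrated with a short Python sketch. It is only a minimal example under stated assumptions: it uses a single horizontal first-order predictor (the right neighbor predicts the current pixel, so c = 1) rather than the full filter bank of the rich model, and the function name `first_order_residual` is hypothetical.

```python
import numpy as np

def first_order_residual(x, q=1.0, T=2):
    """Quantized and truncated horizontal first-order residual (illustrative only).

    Assumption: the prediction of X[i, j] is its right neighbor X[i, j+1],
    so the raw residual is X[i, j+1] - X[i, j]. The rich model uses many
    more predictors than this one.
    """
    x = np.asarray(x, dtype=np.float64)
    raw = x[:, 1:] - x[:, :-1]          # prediction minus pixel value
    # np.rint rounds exact halves to the nearest even integer, which may
    # differ from other rounding conventions when raw / q ends in .5.
    quantized = np.rint(raw / q)
    # np.clip keeps values in [-T, T], which matches trunc_T in equation (2).
    return np.clip(quantized, -T, T).astype(np.int64)

row = np.array([[10, 12, 11, 20]])
residual = first_order_residual(row, q=1.0, T=2)
```

With the example row, the raw differences are 2, -1, and 9, and the last one is truncated to T = 2.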
We take the calculation of the horizontal co-occurrence matrix of a residual image $R$ as an example. First, the embedding change probability $\beta_{i,j}$ of each pixel is estimated according to the distortion function and the payload size. Then, the maximum of the embedding change probabilities at the four horizontally adjacent positions is added to the corresponding co-occurrence bin instead of adding 1:
$$C^{h}_{d_0 d_1 d_2 d_3} = \frac{1}{Z} \sum_{i,j} \max_{k \in \{0,1,2,3\}} \beta_{i,j+k} \, \bigl[ R_{i,j+k} = d_k,\ \forall k = 0, \ldots, 3 \bigr] \tag{3}$$
where $Z$ is a normalization constant that makes the sum of all elements in $C^h$ equal to 1, and
$$[P] = \begin{cases} 1, & P \text{ is true} \\ 0, & \text{otherwise} \end{cases} \tag{4}$$
Denemark et al. used this procedure to obtain 78 co-occurrence matrices, and then merged them in the same manner as the rich model feature [2] to obtain the 34,671-dimensional steganalysis feature maxSRM for grayscale images.
The maxSRM feature takes the maximum embedding change probability across four neighboring positions as the increment of the co-occurrence bin. If any element in the four neighboring positions is changed with a high probability, the co-occurrence matrix will be changed greatly. On the other hand, if all the embedding change probabilities in these four positions are small, the corresponding co-occurrence matrix will be changed less. Therefore, the maxSRM feature is more sensitive to the stego signal and can greatly improve the detection performance.
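The weighting rule described above can be sketched in Python as follows. This is an illustrative sketch, not the reference maxSRM implementation: it assumes that the residual image and the probability map have the same shape and are aligned element by element, and the function name `max_cooccurrence_horizontal` is hypothetical.

```python
import numpy as np

def max_cooccurrence_horizontal(residual, beta, T=2):
    """Horizontal 4D co-occurrence where each bin is incremented by the
    maximum change probability over the four positions (instead of 1)."""
    r = np.asarray(residual, dtype=np.int64)
    b = np.asarray(beta, dtype=np.float64)
    if r.shape != b.shape:
        raise ValueError("residual and beta must have the same shape")
    if r.shape[1] < 4:
        raise ValueError("need at least four columns")
    if np.abs(r).max() > T:
        raise ValueError("residual values must lie in [-T, T]")
    w = r.shape[1]
    # Slices k : w-3+k give the k-th member of every horizontal 4-tuple.
    idx = tuple(r[:, k:w - 3 + k] + T for k in range(4))  # shift to 0..2T
    weight = np.maximum.reduce([b[:, k:w - 3 + k] for k in range(4)])
    c = np.zeros((2 * T + 1,) * 4)
    # np.add.at accumulates correctly when the same bin is hit repeatedly.
    np.add.at(c, idx, weight)
    total = c.sum()
    return c / total if total > 0 else c

r = np.array([[0, 0, 0, 0, 1]])
beta = np.array([[0.1, 0.2, 0.3, 0.4, 0.5]])
c = max_cooccurrence_horizontal(r, beta, T=2)
```

In this toy input there are two 4-tuples, (0, 0, 0, 0) with maximum probability 0.4 and (0, 0, 0, 1) with maximum probability 0.5, so after normalization the two bins hold 0.4/0.9 and 0.5/0.9.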

Steganalysis feature based on embedding change probabilities in differential channels
In Kang et al. [27], the authors reported that features extracted from the pixel differences between two channels can detect color stego images more effectively. As in grayscale images, content-adaptive steganography affects the elements at different positions of the differential channels to different degrees. Inspired by the maxSRM feature, if the embedding change probabilities at different positions of the differential channels are taken into account when extracting features from the differential channels, the detection performance may also be improved. Therefore, this section analyzes the embedding change probabilities in differential channels and then incorporates them into the extraction of the co-occurrence feature from the differential channels.

Embedding change probabilities in differential channels
In their study [29], Sangwine and Horne reported that there are always strong correlations among the three color channels R, G, and B of natural color images, and gave the correlation coefficients between pairs of color channels as 0.78 (red and blue channels), 0.98 (red and green channels), and 0.94 (green and blue channels). Motivated by this work, in Kang et al. [27] we analyzed the superiority of the differential channel feature from the viewpoint of the variance change rate, and concluded that features extracted from the differential channels should be able to detect color stego images more effectively. Therefore, in this article, we first calculate the differences between every two of the three color channels of a color spatial image as follows:
$$D_{RG}(i,j) = X_R(i,j) - X_G(i,j), \quad D_{RB}(i,j) = X_R(i,j) - X_B(i,j), \quad D_{GB}(i,j) = X_G(i,j) - X_B(i,j) \tag{5}$$
where $0 \le i < H$; $0 \le j < W$; $X_R$, $X_G$, and $X_B$ denote the red, green, and blue channels of a color image, respectively; and $D_{RG}$, $D_{RB}$, and $D_{GB}$ denote the differential channel between the red and green channels, the differential channel between the red and blue channels, and the differential channel between the green and blue channels, respectively. A change of either pixel at the same position of two channels may change the difference at the corresponding position of the differential channel. In other words, if any component of a pixel in a color image (e.g., the green component $X_G(i,j)$) has a large embedding change probability, the differences at the corresponding position of the differential channels $D_{RG}$ and $D_{GB}$ may also have large embedding change probabilities. Therefore, the maximum of the two embedding change probabilities at the same position of two color channels is taken as the estimated embedding change probability at that position of the corresponding differential channel.
Let $p_R$, $p_G$, and $p_B$ denote the embedding change probability matrices of the red channel $X_R$, green channel $X_G$, and blue channel $X_B$, respectively. Let $p_{RG}$, $p_{GB}$, and $p_{RB}$ denote the estimated embedding change probability matrices of the differential channels $D_{RG}$, $D_{GB}$, and $D_{RB}$. They are calculated as follows:
$$p_{RG}(i,j) = \max\{p_R(i,j), p_G(i,j)\}, \quad p_{GB}(i,j) = \max\{p_G(i,j), p_B(i,j)\}, \quad p_{RB}(i,j) = \max\{p_R(i,j), p_B(i,j)\} \tag{6}$$
where each element of each embedding change probability matrix denotes the embedding change probability at the corresponding position. Taking the differential channel $D_{RG}$ as an example, let $D^S_{RG}$ denote the differential channel between the red and green channels of the stego image, and $D^C_{RG}$ denote the differential channel between the red and green channels of the cover image. We can compare $D^S_{RG}$ with $D^C_{RG}$ to obtain the embedding change vector $M_{RG}$ of the differential channel as follows:
$$M_{RG}(iW + j) = \begin{cases} 1, & D^S_{RG}(i,j) \neq D^C_{RG}(i,j) \\ 0, & \text{otherwise} \end{cases} \tag{7}$$
It is assumed that all pixels in the three color channels are changed independently. Then, the following three possible probability distribution functions of $M_{RG}$ can be obtained from the embedding change probability matrices $p_R$, $p_G$, and $p_{RG}$:
$$P_{\ast}(M_{RG}) = \prod_{i=0}^{H-1} \prod_{j=0}^{W-1} p_{\ast}(i,j)^{M_{RG}(iW+j)} \bigl(1 - p_{\ast}(i,j)\bigr)^{\bar{M}_{RG}(iW+j)}, \quad \ast \in \{R, G, RG\} \tag{8}$$
where $\bar{M}_{RG}(iW + j) = 1 - M_{RG}(iW + j)$.
For each pair of a stego image and its cover, we can compute the likelihoods that its embedding change vector $M_{RG}$ follows the probability distribution functions $P_R$, $P_G$, and $P_{RG}$ as follows:
$$L_R = P_R(M_{RG}), \quad L_G = P_G(M_{RG}), \quad L_{RG} = P_{RG}(M_{RG}) \tag{9}$$
Then, the likelihood ratios $\lambda_{RG,R} = L_{RG}/L_R$ and $\lambda_{RG,G} = L_{RG}/L_G$ can be used to test the superiority of the probability distribution function $P_{RG}$ over $P_R$ and $P_G$ for the embedding change vector $M_{RG}$. When both $\lambda_{RG,R}$ and $\lambda_{RG,G}$ are larger than 1, the embedding change vector $M_{RG}$ is more likely to follow the probability distribution $P_{RG}$; that is, the differential channel $D_{RG}$ is more likely to be changed according to the embedding change probability matrix $p_{RG}$. Because the likelihood values $L_R$, $L_G$, and $L_{RG}$ may be too small to be stored, the log likelihood ratios $\ln \lambda_{RG,R}$ and $\ln \lambda_{RG,G}$ are used instead. Thus, when both $\ln \lambda_{RG,R}$ and $\ln \lambda_{RG,G}$ are larger than 0, each element of the differential channel $D_{RG}$ is more likely to be changed with the corresponding probability in the embedding change probability matrix $p_{RG}$.
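To see what the sign of the log likelihood ratio measures, it may help to expand it. The following is a sketch that assumes each likelihood is a product of independent per-element Bernoulli terms, which is consistent with the independence assumption stated above, and that all probabilities lie strictly between 0 and 1:

```latex
\ln \lambda_{RG,R} = \ln L_{RG} - \ln L_{R}
  = \sum_{i=0}^{H-1} \sum_{j=0}^{W-1} \left[
      M_{RG}(iW+j) \, \ln \frac{p_{RG}(i,j)}{p_{R}(i,j)}
      + \bigl(1 - M_{RG}(iW+j)\bigr) \, \ln \frac{1 - p_{RG}(i,j)}{1 - p_{R}(i,j)}
    \right]
```

Because the estimated probability in the differential channel is the elementwise maximum of the two channel probabilities, it is never smaller than the red-channel probability. Every changed element therefore contributes a nonnegative term and every unchanged element a nonpositive term, so a positive ratio indicates that the observed changes occur often enough at positions where the two probabilities differ. Writing the ratio as a sum of logarithms also avoids the numerical underflow of the raw likelihood products. The expansion for the ratio with respect to the green channel is analogous.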
Here, the steganography algorithm WOW was used to embed secret messages into each cover color image with a payload size of 0.4 bpp (bits per channel pixel) to test the superiority of the embedding change probability matrix $p_{RG}$ over $p_R$ and $p_G$. The cover color image set contains 10,000 color images in TIFF format with a size of 512 × 512 pixels, which were generated by scaling the 10,000 raw color images downloaded from BOSSbase. The tool used was Advanced Batch Converter 3.8.20 with the typical "bilinear" interpolation filter. Figure 1 shows that the values of the log likelihood ratios $\ln \lambda_{RG,R}$ and $\ln \lambda_{RG,G}$ are much larger than 0 for almost all images. This indicates that it is more reasonable to use formula (6) to calculate the embedding change probability matrices in the differential channels. Therefore, it should be more effective for steganalysis to add the embedding change probability in $p_{RG}$ to the corresponding co-occurrence bin. It should be noted that when the elements at the same position of two channels are changed by equal values at the same time, the element in the corresponding differential channel is not changed. Therefore, the embedding change probability in $p_{RG}$ is only an estimate of the change probability of the element in the differential channel.
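The test described in this subsection can be sketched in Python for one cover/stego pair. This is a minimal illustration, not the code used for Figure 1: the per-channel probability maps are assumed to be supplied by the caller (they would come from the steganographic simulator), the likelihood is assumed to be a product of independent Bernoulli terms, and all function names are hypothetical.

```python
import numpy as np

def differential_channels(rgb):
    """Differences between every two color channels of an H x W x 3 image."""
    x = np.asarray(rgb, dtype=np.int64)  # signed type avoids uint8 wrap-around
    r, g, b = x[..., 0], x[..., 1], x[..., 2]
    return {"RG": r - g, "RB": r - b, "GB": g - b}

def log_likelihood(change_mask, prob, eps=1e-12):
    """Log likelihood of a binary change mask under independent Bernoulli(prob)."""
    m = np.asarray(change_mask, dtype=bool)
    p = np.clip(np.asarray(prob, dtype=np.float64), eps, 1.0 - eps)
    return float(np.sum(np.where(m, np.log(p), np.log1p(-p))))

def log_likelihood_ratios_rg(cover_rgb, stego_rgb, p_r, p_g):
    """Return (ln lambda_RG,R, ln lambda_RG,G) for one cover/stego pair."""
    d_cover = differential_channels(cover_rgb)["RG"]
    d_stego = differential_channels(stego_rgb)["RG"]
    m_rg = d_stego != d_cover            # changed elements of D_RG
    p_rg = np.maximum(p_r, p_g)          # elementwise maximum of the two maps
    ll_rg = log_likelihood(m_rg, p_rg)
    return ll_rg - log_likelihood(m_rg, p_r), ll_rg - log_likelihood(m_rg, p_g)

# Toy 1 x 2 image: only the green component of the first pixel is changed.
cover = np.array([[[10, 10, 10], [20, 20, 20]]], dtype=np.uint8)
stego = cover.copy()
stego[0, 0, 1] = 11
p_r = np.array([[0.1, 0.1]])
p_g = np.array([[0.4, 0.1]])
ratio_r, ratio_g = log_likelihood_ratios_rg(cover, stego, p_r, p_g)
```

In the toy example the change happens where the green channel has the larger probability, so the ratio with respect to the red channel is ln 4 (positive), while the ratio with respect to the green channel is 0 because the two probability maps coincide.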

Spatial rich model features based on embedding change probabilities in differential channels
In their study [24], Goljan et al. used the spatial rich model [2] with truncation threshold 2 and a single quantization step q = 1 to compute a 12,753-dimensional feature for each color channel, and then merged the features of the three channels to obtain the SRMQ1 (SRM with q = 1) feature for color images. This feature achieves good detection performance not only for typical LSB matching but also for content-adaptive steganography. Goljan et al. concatenated the SRMQ1 feature and the 5404-dimensional color rich model feature CRMQ1 to generate the spatio-color rich model feature SCRMQ1, which decreases the detection error significantly.
Figure 1. Log likelihood ratios of the probability distribution function $P_{RG}$ to $P_R$ and $P_G$ for 10,000 color BOSSbase images: (a) $\ln \lambda_{RG,R}$ and (b) $\ln \lambda_{RG,G}$.
However, the SRMQ1 feature for color images considers neither the correlations between different color channels nor the fact that content-adaptive steganography changes the pixels at different positions with different probabilities. Therefore, this section improves the SRMQ1 feature by adding the embedding change probability, instead of 1, to the corresponding co-occurrence bin of the residuals in the differential channels. The extraction procedure of the improved feature is as follows:
Step 1: According to the targeted steganography algorithm and the given payload size, estimate the embedding change probability of each pixel in each original color channel of the analyzed image in the same manner as in Filler et al. [30]
Step 2: Compute the differential channels $D_{RG}$, $D_{RB}$, and $D_{GB}$ of the analyzed image by formula (5).
Step 3: Compute the embedding change probability matrix of each differential channel by formula (6).
Step 4: Use the 31 high-pass filters in Fridrich and Kodovský [2] to filter each differential channel, and quantize and truncate the filtering results with step q = 1 and threshold 2 to obtain 31 residual images for each differential channel.
Step 5: Similarly to Fridrich and Kodovský [2], calculate 78 4D co-occurrence matrices from the 31 residual images of each differential channel. Taking the horizontal co-occurrence of a residual image $R$ computed from the differential channel $D_{RG}$ as an example, calculate the co-occurrence matrix as follows:
$$C^{h}_{d_0 d_1 d_2 d_3} = \frac{1}{Z} \sum_{i,j} \max_{k \in \{0,1,2,3\}} p_{RG}(i, j+k) \, \bigl[ R_{i,j+k} = d_k,\ \forall k = 0, \ldots, 3 \bigr] \tag{10}$$
where $Z$ normalizes the sum of all elements in $C^h$ to 1 and the bracket equals 1 when the enclosed condition holds and 0 otherwise.
Step 6: According to the symmetries of the residuals and co-occurrence matrices given in Fridrich and Kodovský [2], merge the co-occurrence matrices of each differential channel into a 12,753-dimensional feature.
Step 7: Add the corresponding features of the three differential channels to generate a 12,753-dimensional rich model feature based on embedding change probabilities in differential channels, which is referred to as maxDSRMQ1.
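Steps 2 to 5 and Step 7 can be sketched in Python in a strongly reduced form. The sketch below is an assumption-laden illustration and does not reproduce the 12,753-dimensional maxDSRMQ1 feature: it uses a single horizontal first-order residual instead of the 31 filters, a single horizontal co-occurrence instead of 78 matrices, skips the symmetry merging of Step 6, aligns the probability map with the residual by simply dropping the last column, and takes the probability maps of Step 1 as inputs. All function names are hypothetical.

```python
import numpy as np

def weighted_cooccurrence(residual, prob, T=2):
    """Horizontal 4D co-occurrence; each bin grows by the maximum change
    probability over the four positions instead of by 1."""
    w = residual.shape[1]
    idx = tuple(residual[:, k:w - 3 + k] + T for k in range(4))
    weight = np.maximum.reduce([prob[:, k:w - 3 + k] for k in range(4)])
    c = np.zeros((2 * T + 1,) * 4)
    np.add.at(c, idx, weight)           # safe accumulation on repeated bins
    total = c.sum()
    return c / total if total > 0 else c

def max_dsrm_sketch(rgb, p_r, p_g, p_b, q=1.0, T=2):
    """Reduced illustration of the extraction steps for one color image."""
    x = np.asarray(rgb, dtype=np.float64)
    chan = {"R": x[..., 0], "G": x[..., 1], "B": x[..., 2]}
    prob = {"R": np.asarray(p_r, dtype=np.float64),
            "G": np.asarray(p_g, dtype=np.float64),
            "B": np.asarray(p_b, dtype=np.float64)}
    feature = np.zeros((2 * T + 1,) * 4)
    for a, b in (("R", "G"), ("R", "B"), ("G", "B")):
        d = chan[a] - chan[b]                         # Step 2: differential channel
        p = np.maximum(prob[a], prob[b])              # Step 3: estimated probabilities
        raw = d[:, 1:] - d[:, :-1]                    # Step 4: one filter only
        res = np.clip(np.rint(raw / q), -T, T).astype(np.int64)
        # Steps 5 and 7: weighted co-occurrence, summed over the three channels.
        feature += weighted_cooccurrence(res, p[:, :-1], T)
    return feature.ravel()

rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(8, 8, 3))
probs = rng.uniform(0.01, 0.3, size=(3, 8, 8))
feat = max_dsrm_sketch(rgb, probs[0], probs[1], probs[2])
```

Because each of the three co-occurrence matrices is normalized to sum to 1 before the addition, the entries of the resulting 625-dimensional vector sum to 3.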

Steganalysis method based on embedding change probabilities in differential channels
This section uses the above idea to improve the method of Goljan et al. [24] and proposes a steganalysis method based on embedding change probabilities in differential channels. The steganalysis features used in this method are CRMQ1 and maxDSRMQ1. The proposed method consists of the following two algorithms: the stego image detector training algorithm and the stego image detection algorithm.
Algorithm 1: Stego image detector training based on embedding change probabilities in differential channels.
Input: cover training images and stego training images.
Steps:
1. Differential channels calculation. For each training image, calculate the three differential channels $D_{RG}$, $D_{GB}$, and $D_{RB}$ by subtracting the color channels $X_G$ and $X_B$ from the color channels $X_R$ and $X_G$.
2. maxDSRMQ1 feature extraction. Use the method described in subsection "Spatial rich model features based on embedding change probabilities in differential channels" to extract the 12,753-dimensional maxDSRMQ1 feature from the three differential channels, where the quantization step is set to 1 and the truncation threshold of the residuals is set to T = 2.
3. CRMQ1 feature extraction. For each training image, extract the 5404-dimensional CRMQ1 feature by the feature extraction method in Goljan et al. [24]
4. Feature combination. Combine the features extracted in steps 2 and 3 to generate an 18,157-dimensional color image steganalysis feature based on embedding change probabilities in differential channels.
5. Ensemble classifier training. For each training image, set the label to -1 if it is a cover training image and to +1 if it is a stego training image, and use its steganalysis feature and label as a training sample. Then, train an ensemble classifier as the stego image detector.
Algorithm 2: Stego image detection based on embedding change probabilities in differential channels.
Input: the detected color image and the trained stego image detector.
Output: the label of the detected image.
Steps:
1. Extract the 18,157-dimensional steganalysis feature from the input color image using steps 1-4 in Algorithm 1.
2. Feed the extracted steganalysis feature to the trained stego image detector, and return the output label.
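The feature combination, labeling, and detection steps of the two algorithms can be sketched in Python as follows. The classifier here is a deliberately simplified stand-in and should not be read as the FLD ensemble used in the experiments: it assumes random feature subspaces of a fixed size, a ridge-regularized Fisher linear discriminant per subspace, a threshold at the midpoint of the projected class means, and majority voting, with no bootstrap sampling or automatic parameter search. All names and default values are hypothetical.

```python
import numpy as np

def combine_features(crmq1, max_dsrmq1):
    """Concatenate the two feature vectors of each image (feature combination)."""
    return np.hstack([np.atleast_2d(crmq1), np.atleast_2d(max_dsrmq1)])

def train_detector(cover_feats, stego_feats, n_learners=11, d_sub=8,
                   seed=0, ridge=1e-6):
    """Train simplified FLD base learners on random feature subspaces.
    Covers carry label -1 and stego images label +1."""
    rng = np.random.default_rng(seed)
    c = np.asarray(cover_feats, dtype=np.float64)
    s = np.asarray(stego_feats, dtype=np.float64)
    dim = c.shape[1]
    d_sub = min(d_sub, dim)
    learners = []
    for _ in range(n_learners):
        idx = rng.choice(dim, size=d_sub, replace=False)
        mu_c, mu_s = c[:, idx].mean(axis=0), s[:, idx].mean(axis=0)
        cc, ss = c[:, idx] - mu_c, s[:, idx] - mu_s
        # Within-class scatter with a small ridge term for numerical stability.
        s_w = cc.T @ cc + ss.T @ ss + ridge * np.eye(d_sub)
        w = np.linalg.solve(s_w, mu_s - mu_c)   # stego projects above cover
        b = 0.5 * (w @ mu_c + w @ mu_s)         # midpoint threshold
        learners.append((idx, w, b))
    return learners

def detect(learners, feats):
    """Majority vote of the base learners; returns +1 (stego) or -1 (cover).
    Use an odd number of learners so that the vote cannot tie."""
    f = np.atleast_2d(np.asarray(feats, dtype=np.float64))
    votes = np.zeros(f.shape[0])
    for idx, w, b in learners:
        votes += np.where(f[:, idx] @ w > b, 1.0, -1.0)
    return np.where(votes > 0, 1, -1)

# Synthetic stand-in features, only to show the calling sequence.
rng = np.random.default_rng(1)
cover = rng.normal(0.0, 1.0, size=(200, 20))
stego = rng.normal(2.0, 1.0, size=(200, 20))
detector = train_detector(cover, stego)
labels = detect(detector, stego[:5])
```

The synthetic Gaussian features only demonstrate the interface; real inputs would be the 18,157-dimensional vectors returned by `combine_features`.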

Experimental results and analysis
In the experiments, the BOSSbase1.01 image database containing 10,000 color images with a size of 512 × 512 pixels, described in subsection "Embedding change probabilities in differential channels," was used. Two typical content-adaptive steganography algorithms, WOW [10] and S-UNIWARD [11], were used to generate the stego images. For each cover image, a stego image was generated by each of the two steganography algorithms by embedding pseudo-random messages of the same size into the three channels. In total, 2 × 5 = 10 groups containing 100,000 stego images were obtained for five different payload sizes of 0.05, 0.1, 0.2, 0.3, and 0.4 bpp. The maxDSRMQ1 feature and the SRMQ1 feature were extracted from each cover or stego image. We also extracted the maxSRM feature with truncation threshold 2 and quantization step q = 1 from each original color channel, and then merged the features of the three color channels to form a 12,753-dimensional maxSRMQ1 feature. Then, we extracted the CRMQ1 feature [24], the CGC (channel gradient correlation) feature, and the DSRMQ1# + DSGF# (CRMQ1 + SRMQ1 + SGF + DSRMQ1 + DSGF, DSGF: Steerable Gaussian Filters feature extracted from differential channels) feature from each cover or stego image.
The detectors with the different features and some of their concatenations were trained as binary classifiers implemented using the FLD (Fisher linear discriminant) ensemble with default settings. The minimum total classification error probability under equal priors, $P_E = \min_{P_{FA}} (P_{FA} + P_{MD})/2$, was computed for each ensemble, where $P_{FA}$ and $P_{MD}$ are the false-alarm and missed-detection probabilities. For each payload size, the detection performance is evaluated by the median of $P_E$ measured on the test set over ten 5000/5000 database splits, denoted as $\bar{P}_E$. The following experiments were performed under the ideal case in which the steganalyst knows the embedding payload size.
Table 1 shows the average detection errors of the SRMQ1, maxSRMQ1, and maxDSRMQ1 features for WOW and S-UNIWARD steganography. For both content-adaptive steganography algorithms, incorporating the embedding change probabilities into the calculation of the co-occurrence matrix significantly improves the detection performance, especially when the payload size is low. The maxSRMQ1 and maxDSRMQ1 features have lower average detection errors for WOW and S-UNIWARD steganography than the SRMQ1 feature. The proposed maxDSRMQ1 feature achieves the lowest detection errors for the different payload sizes, with a detection error more than 13% lower than that of the original SRMQ1 feature and more than 5% lower than that of the maxSRMQ1 feature.
Tables 2 and 3 show that the feature concatenations CRMQ1 + maxDSRMQ1 and CRMQ1 + maxSRMQ1 both bring significant improvements for WOW and S-UNIWARD steganography, and that CRMQ1 + maxDSRMQ1 achieves the minimum detection errors. When the payload size is 0.05, compared with the feature concatenation CRMQ1 + SRMQ1, CRMQ1 + maxDSRMQ1 reduces the detection error by 13.43% and 8.34% for WOW and S-UNIWARD, respectively. Even when the detection error of CRMQ1 + SRMQ1 is already low at the payload size 0.4 and further improvement is very difficult, CRMQ1 + maxDSRMQ1 still brings a small improvement (1.54% and 0.99%). In addition, compared with the latest works DSRMQ1# + DSGF# and CGC, the proposed steganalysis method with the feature concatenation CRMQ1 + maxDSRMQ1 still detects the stego color images with smaller errors.
Table note: The non-bold numbers in the brackets denote the error reductions of maxSRMQ1 and maxDSRMQ1 relative to SRMQ1, and the bold numbers in the brackets denote the error reductions of maxDSRMQ1 relative to maxSRMQ1.
The ROC (receiver operating characteristic) curves of the above feature concatenations in Figures 2 and 3 show that CRMQ1 + maxDSRMQ1 always has the highest true positive rate at different false positive rates. This further demonstrates the superiority of the proposed maxDSRMQ1 feature. The excellent performance of the proposed maxDSRMQ1 feature and of the concatenation CRMQ1 + maxDSRMQ1 can be attributed to two factors: the maxDSRMQ1 feature is extracted from the differential channels, which further suppresses the image content information in the residuals, and the embedding change probabilities are properly incorporated into the calculation of the co-occurrence matrices.
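The error measure used in this section, the minimum of the average of the false-alarm and missed-detection probabilities over all decision thresholds, can be computed from detector outputs as in the following Python sketch. It assumes that the detector provides a real-valued score for each test image and that larger scores indicate stego; the function name `min_total_error` is hypothetical.

```python
import numpy as np

def min_total_error(cover_scores, stego_scores):
    """Minimum over thresholds of (P_FA + P_MD) / 2 under equal priors.

    An image is declared stego when its score is strictly above the threshold.
    """
    c = np.asarray(cover_scores, dtype=np.float64)
    s = np.asarray(stego_scores, dtype=np.float64)
    # Minus infinity plus every observed score covers all distinct decisions.
    thresholds = np.concatenate(([-np.inf], np.unique(np.concatenate((c, s)))))
    best = 1.0
    for t in thresholds:
        p_fa = np.mean(c > t)     # covers wrongly flagged as stego
        p_md = np.mean(s <= t)    # stego images that are missed
        best = min(best, 0.5 * (p_fa + p_md))
    return float(best)

p_e = min_total_error([0.1, 0.2, 0.3, 0.6], [0.4, 0.7, 0.8, 0.9])
```

In the example, a threshold of 0.3 gives one false alarm out of four covers and no missed detection, so the minimum error is 0.125. Repeating this computation over several training/testing splits and aggregating the results gives the per-payload figure reported in the tables.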

Conclusion
This article proposes an improved steganalysis feature for color images based on embedding change probabilities in differential channels. The proposed feature extraction method exploits the strong correlations among different color channels and calculates the co-occurrence matrices from the differential channels to further reduce the impact of the image content on the feature and to highlight the stego signal. A method is then proposed to compute the embedding change probabilities in the differential channels, which are incorporated into the calculation of the co-occurrence matrices. The experimental results show that for the steganography algorithms WOW and S-UNIWARD, the proposed maxDSRMQ1 feature significantly reduces the detection errors, especially when the payload size is small.
However, the embedding change probabilities used here are estimates rather than precise values. In the future, we will try to estimate the embedding change probabilities more precisely to enhance the performance, and to incorporate the embedding change probabilities into the extraction of other features.