An Extension of the Interscale SURE-LET Approach for Image Denoising

In this paper, an extension of the interscale SURE-LET approach exploiting the interscale and intrascale dependencies of wavelet coefficients is proposed to improve denoising performance. This method incorporates information on neighbouring coefficients into the linear expansion of thresholds (LET) without additional parameters, so as to capture the texture characteristics of the image. The resulting interscale-intrascale wavelet estimator consists of a linear expansion of multivariate thresholding functions, whose parameters are optimized thanks to a multivariate Stein's unbiased risk estimate (SURE). Experimental results are given to demonstrate the strength of the proposed method.


Introduction
During the last decade, image denoising has undergone dramatic improvement; many new methods based on wavelet transforms have emerged for removing Gaussian noise. A standard methodology proceeds by wavelet transforming the image, operating on the transform coefficients with nonlinear estimation functions, and then inverting the wavelet transform to obtain the denoised image. The choice of estimation function is an essential part of the denoising problem. Estimation functions generally take the form of "shrinkage" operators that are applied independently to each transform coefficient (e.g., [1][2][3][4]), or are applied to neighbourhoods of coefficients at adjacent spatial positions and/or from other sub-bands (e.g., [5][6][7][8][9][10][11][12][13][14][15][16]). As demonstrated by several algorithms presented in the above literature, the performance of image-denoising algorithms can be improved significantly by taking into account the statistical dependence between interscale and intrascale coefficients, which is illustrated in Figure 1.
Image denoising can be accomplished by many different approaches, for example by using a prior model for the transform coefficients or a parametric form for the estimation function. Generalized Gaussians [7,11], scale mixture models [13], Bessel K densities [17] and symmetric alpha-stable densities [18] have been used as prior models for the transform coefficients, and may be used to drive a Bayes-optimal estimator such as a MAP or MMSE estimator. Alternatively, one may directly assume a parametric estimation function, as in [6,8,9,19,20], and select its parameters by optimizing performance under certain conditions. In [8,9], Luisier et al. introduced a new SURE [21] approach to image denoising, interscale orthonormal wavelet thresholding, which parameterized the denoising process as a sum of elementary nonlinear processes (LETs) with unknown weights, instead of postulating a statistical prior model for the wavelet coefficients, and then adaptively optimized the parametric estimator by minimizing SURE, which provides an approximation of the mean squared error (MSE) as a function of the observed noisy data. Risk minimization and estimation of the unknown weights ultimately come down to solving a linear system of equations. However, they only took into account interscale dependency, using an interscale prediction model called group delay compensation (GDC), and discarded intrascale dependency. Their experimental results demonstrated that, for most images, the interscale SURE-based approach is competitive with the best available techniques that consider orthonormal wavelet transforms. However, it should be noted that this approach did not perform well for images with substantial textures, such as the Barbara image. The main reason is that some local information (especially the texture of Barbara's trousers) is completely lost at coarser scales. Interscale correlations may be too weak for this image, which indicates that an efficient denoising process may require intrascale information as well. For other denoising methods, the reader is referred to [22,23] and the references cited there.
In this paper, we propose a multivariate SURE-LET approach to orthonormal wavelet image denoising as an extension of Luisier's bivariate approach. This method incorporates information on neighbouring coefficients into the LET without additional parameters, so as to capture the texture characteristics of the image. The resulting interscale-intrascale wavelet estimator consists of a linear expansion of multivariate thresholding functions, whose parameters are optimized thanks to a multivariate SURE. This paper is organized as follows: in Section 2, we explain the multivariate SURE theory for a neighbourhood vector and generalize the corresponding linear parameterization strategy; in Section 3, we show results that are competitive with the best up-to-date algorithms; the conclusion can be found in Section 4.

Multivariate SURE-LET
Let $g_k$, $k\in\mathcal{N}$, be equally spaced samples of a real-valued image, where $\mathcal{N}$ is a set of spatial indexes. Consider the standard nonparametric regression setting in a given high-pass sub-band:

$$y_n = x_n + b_n,$$

where $x_n$ is the unknown noise-free wavelet coefficient, $b_n$ is additive white Gaussian noise of known variance $\sigma^2$, and $\mathbf{u}_n\in\mathbb{R}^d$ is the spatial neighbourhood vector of $y_n$. Specifically, $\mathbf{u}_n$ is defined as all those coefficients within a square-shaped window that is centred at the $n$-th coefficient, as illustrated in Figure 1. Without loss of generality, we can assume that $\mathbf{u}_n=[y_n,\tilde{\mathbf{y}}_n^T]^T$, where $\tilde{\mathbf{y}}_n$ contains the last $d-1$ components of $\mathbf{u}_n$. So far, we have introduced an explicit dependence between $x_n$ and $\tilde{\mathbf{y}}_n$. As we know from [8,9], the MSE in the space domain is a weighted sum of the MSEs of the individual sub-bands, which allows us to apply the denoising function independently in every high-pass sub-band.
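As an illustration, the neighbourhood vectors $\mathbf{u}_n$ described above can be assembled as follows. This is a sketch in NumPy; the reflected borders and the centre-first layout (so that $y_n$ is the first component and $\tilde{\mathbf{y}}_n$ the last $d-1$) are our own implementation choices, not specified by the paper.

```python
import numpy as np

def neighbourhood_vectors(subband, win=3):
    """Collect the d = win*win coefficients of the square window centred at
    each position n into a vector u_n, with y_n stored as the first component."""
    r = win // 2
    padded = np.pad(subband, r, mode="reflect")   # symmetric border handling
    h, w = subband.shape
    centre = (win * win) // 2
    vecs = np.empty((h * w, win * win))
    for i in range(h):
        for j in range(w):
            patch = padded[i:i + win, j:j + win].ravel()
            vecs[i * w + j, 0] = patch[centre]              # y_n itself
            vecs[i * w + j, 1:] = np.delete(patch, centre)  # the last d-1 components
    return vecs
```

For a $3\times 3$ window this yields $d=9$, matching the window of Figure 1.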

Unbiased Estimate of the MSE
It seems impossible to compute the MSE directly, since we do not have access to the clean signal $x$. However, in the case of Gaussian noise, it is possible to apply an extension of Stein's principle [21] to derive an explicit expression for it.
The following Lemma 1 shows how an expression that contains the unknown coefficient $x$ can be replaced by another one with the same expectation, yet containing only the known noisy coefficients $y$.
Lemma 1 and Theorem 1 essentially recap the derivation of SURE, which can be found in [8].
Lemma 1. Let $\theta:\mathbb{R}^d\rightarrow\mathbb{R}$ be a continuous and almost everywhere differentiable function such that $\mathrm{E}[|\theta(\mathbf{u}_n)|^2]<\infty$ and $\mathrm{E}[|\partial\theta(\mathbf{u}_n)/\partial y_n|]<\infty$. Then, under the additive white Gaussian noise assumption $\mathbf{b}_n\sim\mathcal{N}(\mathbf{0},\sigma^2 I_d)$:

$$\mathrm{E}[\theta(\mathbf{u}_n)\,x_n]=\mathrm{E}\Big[\theta(\mathbf{u}_n)\,y_n-\sigma^2\,\frac{\partial\theta(\mathbf{u}_n)}{\partial y_n}\Big],$$

where $I_d$ is the $d\times d$ identity matrix and $\mathrm{E}[\,\cdot\,]$ stands for the mathematical expectation operator. In this multivariate context, Stein's principle [21] is applied along the single component $y_n$ on which $x_n$ depends; $\mathbf{w}_n$, the spatial neighbourhood vector of $x_n$ defined analogously to $\mathbf{u}_n$, plays the role of the clean counterpart of $\mathbf{u}_n$.

Theorem 1. Under the same hypotheses as Lemma 1, the random variable

$$\varepsilon=\frac{1}{N}\sum_{n=1}^{N}\big(\theta(\mathbf{u}_n)-y_n\big)^2+\frac{2\sigma^2}{N}\sum_{n=1}^{N}\frac{\partial\theta(\mathbf{u}_n)}{\partial y_n}-\sigma^2 \qquad (2.11)$$

is an unbiased estimator of the MSE, i.e.:

$$\mathrm{E}[\varepsilon]=\frac{1}{N}\sum_{n=1}^{N}\mathrm{E}\big[(\theta(\mathbf{u}_n)-x_n)^2\big].$$

Proof sketch: since the noise $b_n$ has zero mean and is independent of $x_n$, we can replace $\mathrm{E}[x_n^2]$ by $\mathrm{E}[y_n^2]-\sigma^2$ and, by Lemma 1, the cross term $\mathrm{E}[\theta(\mathbf{u}_n)\,x_n]$ by $\mathrm{E}[\theta(\mathbf{u}_n)\,y_n-\sigma^2\,\partial\theta(\mathbf{u}_n)/\partial y_n]$. A rearrangement of the $y$ terms then provides the result of Theorem 1.
The SURE expression (2.11) may be evaluated on a single observation $\mathbf{y}$ (the vectors $\mathbf{u}_n$ are assembled from overlapping windows of $\mathbf{y}$) to produce an unbiased estimate of the MSE. Although the derivation of this expression is relatively simple, it leads to the somewhat counterintuitive conclusion that the estimator may be optimized without explicit knowledge of the clean coefficients $x$. It must be emphasized that this estimate is close to its expectation, which is the MSE of the denoising procedure, because the standard deviation of $\varepsilon$ is small by the law of large numbers.
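This counterintuitive property can be checked numerically. The sketch below uses the classical pointwise ($d=1$) case with a soft threshold; the Laplacian prior on the clean coefficients, the threshold value and the sample size are illustrative choices, not taken from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
sigma, T, N = 1.0, 1.5, 200_000
x = rng.laplace(scale=2.0, size=N)          # sparse-ish "clean" coefficients
y = x + rng.normal(scale=sigma, size=N)     # AWGN observation

theta = np.sign(y) * np.maximum(np.abs(y) - T, 0.0)   # soft threshold
dtheta = (np.abs(y) > T).astype(float)                # d(theta)/dy

mse = np.mean((theta - x) ** 2)                                      # needs the clean x
sure = np.mean((theta - y) ** 2 + 2 * sigma**2 * dtheta) - sigma**2  # does not

print(mse, sure)   # the two values nearly coincide
```

The gap between the two printed values shrinks as $N$ grows, exactly as the law-of-large-numbers argument above predicts.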

The Multivariate SURE-LET Approach
Similar to the LET of [5,8,9], we build a linearly parameterized multivariate estimation function incorporating information on neighbouring coefficients, of the form:

$$\theta(\mathbf{u}_n)=\sum_{k=1}^{K}a_k\,\theta_k(\mathbf{u}_n), \qquad (2.13)$$

where the $\theta_k$ are elementary thresholding functions and $\mathbf{a}=[a_1,\dots,a_K]^T$ is a $K\times 1$ vector of unknown weights, specified by minimizing the SURE given by (2.11). It should be noted that the new multivariate estimation function does not introduce more parameters than the LET in [8,9], which means that this improvement still maintains computational efficiency. In this formalism, the MSE estimate $\varepsilon$ is quadratic in $\mathbf{a}$:

$$\varepsilon=\mathbf{a}^{T}M\mathbf{a}-2\,\mathbf{c}^{T}\mathbf{a}+\mathrm{const}, \qquad (2.14)$$

where we have defined:

$$M_{k,l}=\frac{1}{N}\sum_{n=1}^{N}\theta_k(\mathbf{u}_n)\,\theta_l(\mathbf{u}_n), \qquad (2.15)$$

$$c_k=\frac{1}{N}\sum_{n=1}^{N}\Big(y_n\,\theta_k(\mathbf{u}_n)-\sigma^2\,\frac{\partial\theta_k(\mathbf{u}_n)}{\partial y_n}\Big). \qquad (2.16)$$

Finally, the minimization of (2.14) with respect to $\mathbf{a}$ boils down to the following linear system of equations:

$$M\mathbf{a}=\mathbf{c}. \qquad (2.17)$$

Note that since the minimum of $\varepsilon$ always exists, it is ensured that there will always be a solution to this system. When $\mathrm{rank}(M)<K$, we can simply take the pseudo-inverse of $M$ to choose any one among the admissible solutions. Of course, it is desirable to keep the number of degrees of freedom $K$ as low as possible in order for the estimate $\varepsilon$ to maintain a small variance.
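The whole SURE-LET machinery (build $M$ and $\mathbf{c}$, solve $M\mathbf{a}=\mathbf{c}$) can be sketched in the pointwise case as follows. The two elementary thresholds used here are an illustrative basis with $K=2$, not the paper's functions that enter (2.20); the $12\sigma^2$ scaling inside the exponential is likewise an assumption for the sketch.

```python
import numpy as np

rng = np.random.default_rng(1)
sigma, N = 1.0, 100_000
x = rng.laplace(scale=1.5, size=N)
y = x + rng.normal(scale=sigma, size=N)

# Illustrative elementary thresholds theta_k(y) and their derivatives wrt y.
g = np.exp(-y**2 / (12 * sigma**2))
theta  = np.stack([y, y * g])                                    # K x N
dtheta = np.stack([np.ones_like(y), g * (1 - y**2 / (6 * sigma**2))])

M = theta @ theta.T / N                                # (2.15): M_kl
c = (theta @ y - sigma**2 * dtheta.sum(axis=1)) / N    # (2.16): c_k

a = np.linalg.solve(M, c)                              # (2.17): minimise SURE
denoised = a @ theta
print(np.mean((denoised - x)**2), np.mean((y - x)**2))
```

Because the identity estimator ($a_1=1$, $a_2=0$) lies in the span of the basis, the SURE-optimal combination can only do better, up to the small estimation error of $\varepsilon$.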

The New Inter-and Intrascale Thresholding Function
To compensate for feature misalignment between child coefficients and parent coefficients, we also use the GDC scheme of [8,9], which builds an interscale predictor out of the low-pass sub-band at the same scale. Let $y_n^p$ denote the value of the GDC output corresponding to the noisy coefficient $y_n$; it can be interpreted as a discriminator between high-SNR and low-SNR wavelet coefficients. As decision function we choose the Gaussian smoother proposed in [8,9], which switches smoothly between the two classes according to the magnitude of $y_n^p$. In order to incorporate information on neighbouring coefficients into the LET without additional parameters, we propose a pointwise radial exponential function of $\mathbf{u}_n$, whose radial profile is exponential in $\|\mathbf{u}_n\|$; here $d$, the dimension of the vector $\mathbf{u}_n$, enters only as a normalization, so that no extra free parameters are introduced.
Combining the interscale predictor with the multivariate SURE-LET approach leads to the general inter- and intrascale thresholding function (2.20), whose weight vector $\mathbf{a}$ can be obtained from (2.17).
We can summarize our denoising algorithm as follows:
1) Perform a $J$-level DWT of the noisy image $f$, i.e., $Y=Wf$.
2) For each high-pass sub-band, compute the GDC interscale predictor $y_n^p$ from the low-pass sub-band at the same scale.
3) Assemble the neighbourhood vectors $\mathbf{u}_n$ and evaluate the elementary thresholding functions.
4) Determine $M$ and $\mathbf{c}$ using (2.15) and (2.16), and then solve the linear system (2.17) to obtain $\mathbf{a}$.
5) Perform sub-band adaptive denoising using (2.20).
6) Reconstruct the denoised image from the processed sub-bands and the low-pass residual.
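A minimal end-to-end sketch of this transform/threshold/reconstruct pipeline follows, with two deliberate simplifications: a one-level orthonormal Haar transform stands in for the sym8 OWT, and a plain soft threshold stands in for the inter- and intrascale function (2.20).

```python
import numpy as np

def haar2d(img):
    """One level of an orthonormal 2-D Haar DWT (a stand-in for sym8)."""
    a = (img[0::2] + img[1::2]) / np.sqrt(2)   # low-pass along rows
    d = (img[0::2] - img[1::2]) / np.sqrt(2)   # high-pass along rows
    ll = (a[:, 0::2] + a[:, 1::2]) / np.sqrt(2)
    lh = (a[:, 0::2] - a[:, 1::2]) / np.sqrt(2)
    hl = (d[:, 0::2] + d[:, 1::2]) / np.sqrt(2)
    hh = (d[:, 0::2] - d[:, 1::2]) / np.sqrt(2)
    return ll, lh, hl, hh

def ihaar2d(ll, lh, hl, hh):
    """Inverse of haar2d (perfect reconstruction, orthonormal)."""
    a = np.empty((ll.shape[0], 2 * ll.shape[1]))
    d = np.empty_like(a)
    a[:, 0::2] = (ll + lh) / np.sqrt(2)
    a[:, 1::2] = (ll - lh) / np.sqrt(2)
    d[:, 0::2] = (hl + hh) / np.sqrt(2)
    d[:, 1::2] = (hl - hh) / np.sqrt(2)
    img = np.empty((2 * a.shape[0], a.shape[1]))
    img[0::2] = (a + d) / np.sqrt(2)
    img[1::2] = (a - d) / np.sqrt(2)
    return img

# The pipeline on a synthetic image, soft thresholding in place of (2.20).
rng = np.random.default_rng(2)
sigma = 0.1
clean = np.outer(np.sin(np.linspace(0, 3, 64)), np.cos(np.linspace(0, 3, 64)))
noisy = clean + sigma * rng.normal(size=clean.shape)

ll, lh, hl, hh = haar2d(noisy)
soft = lambda z, t: np.sign(z) * np.maximum(np.abs(z) - t, 0.0)
t = 3 * sigma                                # illustrative threshold
den = ihaar2d(ll, soft(lh, t), soft(hl, t), soft(hh, t))
```

The low-pass residual is passed through untouched, exactly as in step 6; only the high-pass sub-bands are processed.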

Numerical Experiments
In what follows, we carried out all the experiments on 8-bit greyscale test images of sizes $512\times 512$ and $256\times 256$, as presented in Figure 2. The test images were obtained from the same sources as mentioned in [8,9,11]. We applied our multivariate SURE-LET (abbreviated as MuSURE-LET) algorithm according to expression (2.20) with $K=3$, after four or five decomposition levels (depending on the size of the image) of an orthonormal wavelet transform (OWT) using the standard Daubechies symlets with eight vanishing moments (sym8 in MATLAB).
A good estimator for $\sigma$ is the median of absolute deviation (MAD) of the finest-scale wavelet coefficients [2], as follows:

$$\hat{\sigma}=\frac{\mathrm{median}(|y_{HH}|)}{0.6745}, \qquad (3.1)$$

where sub-band $HH$ is the finest-scale wavelet sub-band in the diagonal direction. The denoising performance is measured in terms of the peak signal-to-noise ratio (PSNR), defined as:

$$\mathrm{PSNR}=10\log_{10}\frac{255^2}{\frac{1}{N}\sum_{n}(f_n-\hat{f}_n)^2}, \qquad (3.2)$$

where $N$ is the total number of pixels and $\hat{f}_n$ is the $n$-th pixel of the denoised image. The window size depends on the abundance of textures in the example images. In our experiments, a $7\times 7$ window yields the best results for images with substantial textures, while a $3\times 3$ window yields the best results for images with less detailed textures. Table 1 shows the error variances of the denoised images, expressed as the PSNR defined in (3.2), at eight different noise power levels. For all the images, there is very little improvement at the lowest noise level. This makes sense, since the "clean" images in fact include quantization errors and have an implicit PSNR of 58.9 dB.
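The MAD noise estimate and the PSNR measure above are straightforward to implement; a minimal sketch:

```python
import numpy as np

def mad_sigma(hh):
    """Robust noise estimate from the finest diagonal sub-band [2]:
    sigma_hat = median(|HH|) / 0.6745."""
    return np.median(np.abs(hh)) / 0.6745

def psnr(clean, denoised, peak=255.0):
    """Peak signal-to-noise ratio in dB for images with the given peak value."""
    mse = np.mean((np.asarray(clean) - np.asarray(denoised)) ** 2)
    return 10 * np.log10(peak**2 / mse)
```

The constant 0.6745 is the median of $|\mathcal{N}(0,1)|$, so $\hat{\sigma}$ is unbiased when the $HH$ sub-band is dominated by the Gaussian noise.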

Comparisons with the Interscale SURE-LET Approach
In order to understand the relative contribution of our method, we first evaluate the improvements brought by the integration of neighbouring coefficients' dependencies. In Figure 3, we compare our multivariate SURE-LET function (2.20) with the bivariate SURE-LET (abbreviated as BiSURE-LET) defined in [8,9]. As can be observed, the integration of neighbouring coefficients' dependencies improves the denoising performance considerably. For images with substantial textures, such as the Barbara image, the denoising gains are up to 0.8 to 1.1 dB when the PSNR of the input noisy image lies in [15, 30] dB, and for images with less detailed textures, such as the Lena image, the gains are up to 0.2 to 0.3 dB over the same input range. Figure 4 provides a visual comparison of an example image (Barbara) between the two methods. Our method is seen to produce fewer artefacts, for example in parts of the forehead and hair of the woman, which means that our method can better suppress noise in the uniform areas.

Comparisons with State-of-the-Art Denoising Schemes
We compared our method with state-of-the-art denoising algorithms for which the code is freely distributed by the authors: BiShrink ($7\times 7$) [15,16], ProbShrink ($3\times 3$) [12], BLS-GSM ($3\times 3$) [13], block-matching and 3D filtering (BM3D) [25], non-local means (NL-Means) [26] and Fields of Experts (FoE) [27,28]. To make the comparison fair, we averaged the output PSNRs over eight noise realizations, applying the same noise realizations to the different algorithms. The denoised Fingerprint and Mandrill images are shown in Figures 5 and 6, respectively.
When looking more closely at the results, we observe the following.
Our method gives better results than Sendur's BiShrink ($7\times 7$), which integrates both the inter- and the intrascale dependencies (an average gain of +0.8 dB).
Our method improves the PSNR by about 0.6 dB on average in comparison with FoE.

Figure 1. Illustration of the parent-child relation and the $3\times 3$ neighbourhood window used to form the $d$-dimensional real-valued neighbourhood vector $\mathbf{u}_n$


Here $\mathbf{w}_n$ denotes the spatial neighbourhood vector of $x_n$, defined analogously to $\mathbf{u}_n$; the result of Theorem 1 follows by expanding the expectation of the MSE and applying Lemma 1 term by term.

It is essential to notice that, because of the statistical independence between sub-bands of different iteration depths, $\mathbf{u}_n$ and $y_n^p$ are also statistically independent. Therefore, the partial derivatives of the thresholding function with respect to the component $y_n$ are uncorrelated with $y_n^p$, so Theorem 1 remains true, and the linear parameter vector $\mathbf{a}$ is obtained by minimizing the MSE estimate $\varepsilon$ defined in Theorem 1.

Figure 2. The test images used in the experiments, referred to as 'Lena', 'Barbara', 'Boat', 'Mandrill', 'Fingerprint' and 'Bridge' (numbered from left to right and top to bottom)

Figure 3. PSNR improvements brought by our multivariate SURE-LET strategy compared to bivariate SURE-LET: (A) Lena image; (B) Barbara image

Figure 5. Comparison of the denoising results on the Fingerprint image (cropped to $200\times 200$ to show the artefacts): (A) Part of the

Table 1. Comparison of Some of the Most Efficient Denoising Methods (sym8)