Quantization for Robust Distributed Coding

A distributed source coding approach is proposed for robust data communications in sensor networks. When sensor measurements are quantized, correlations between the measurements can be exploited to reduce the overall rate of communication required to report them. Robust distributed source coding (RDSC) approaches differ from other work in that the reconstruction error of all sources will not exceed a given upper bound, even if only a subset of the multiple descriptions of the distributed source code is received. We deal with practical aspects of RDSC in the context of scalar quantization of two correlated sources. As a benchmark for evaluating the performance of the proposed scheme, we derive theoretically achievable distortion-rate performances of RDSC for two jointly Gaussian sources by applying known results on classical multiple description source coding.


Introduction and Motivation
Sensor networks are usually collections of remote sensors that gather temporal and spatial information about a source distributed in time and space. Each sensor reports its measurements to a central unit which extracts meaningful information about the distributed source being measured. The spatial and temporal correlations between sensor measurements can be exploited to reduce the amount of communication required between the sensors and the central unit to achieve a desired reconstruction quality of the distributed source. We call this problem the distributed source coding (DSC) problem.
For low-cost sensors, remote communication links to the central server are often unreliable. Therefore, at any given point in time, only a subset of all measurements might be available at the decoding unit. Even though the overall communication rate can be minimized by a distributed coding scheme that completely removes the redundancies between the sensor measurements, the reconstruction quality of the sources when only a subset of the measurements is available might then be unreasonably poor. This motivates robust distributed source coding, a generalization of the well-studied multiple description (MD) coding problem, in which all descriptions describe a single source.
A robust distributed source coding (RDSC) scheme, on the other hand, minimizes the overall communication rate while satisfying a given bound on the maximum reconstruction error when only a subset of the measurements is available. Theoretical aspects of the RDSC problem are addressed in [1, 2]. Practical DSC schemes often consist of a quantizer followed by lossless discrete source coders. The correlations between the quantized values of two sensors are used to reduce the overall rate required to communicate them to the server. This reduction remains possible even if each sensor has access only to its own readings, in accordance with the Slepian-Wolf theorem, as pursued practically in [3]. Moreover, the quantizers can also be designed to optimally exploit this correlation in conjunction with the lossless coders following them. Such schemes are reported in [2, 4-7], mainly for two sources with or without side information. Scalability of distributed coding of correlated sources is considered in [5]. Similarly, [7] demonstrates that, for binary symmetric and additive white Gaussian noise channels, the correlation between sources can be useful in reducing quantization distortion and in protecting data transmitted over noisy channels. A discussion of the distortion performance gain and of quantizer design using intersensor communication is reported in [8].
This work is a practical attempt to address the RDSC problem in the context of optimally designing scalar quantizers. We also introduce a paradigm shift in the problem formulation, which we believe is more appropriate for a distributed measurement context. The goal of conventional DSC and RDSC problems is the reconstruction of a single source X from the encoded versions of a number of noisy measurements of X (one for each sensor). We, on the other hand, assume that each sensor i is measuring a separate random variable X_i, with the X_i's being possibly dependent. The objective is to minimize the expected distortion of the reconstructions of all sources.
The rest of this paper is organized as follows. We formulate the problem in Section 2 and present a graph-theoretic, iterative algorithm to find a locally minimal solution of the problem in Section 3. To provide a benchmark against which the designs of Section 3 can be compared, a theoretically achievable distortion-rate performance of RDSC for two jointly Gaussian sources is derived in Section 4. Experimental results are reported in Section 5, and the paper is concluded in Section 6.

Problem Formulation
Our overall communications model is depicted in Figure 1. In this model, the outputs of two scalar sources X_i, i = 1, 2, are quantized with two quantizers Q_1(·) and Q_2(·). The cell boundaries of quantizer Q_i are denoted by q_{i,0} < q_{i,1} < ⋅⋅⋅ < q_{i,N_i} ∈ ℝ, where ℝ represents the real line. Let U_1 and U_2 be the discrete random variables resulting from quantizing X_1 and X_2.
U_1 and U_2 are then fed into their respective discrete lossless encoders. The encoders might or might not take into account the correlations between the two symbols in order to reduce the overall communication rate. Based on this, a number of different coding cases are possible, as suggested in [4], which determine the communication rates used in the subsequent optimization. As an example, for variable-length coding of the quantized source symbols, if all dependencies between the symbols at the two encoders are ignored, the communication rate is simply R(U_1, U_2) = H(U_1) + H(U_2), the sum of the marginal entropies of the quantized symbols. On the other hand, if the dependencies between the symbols are fully exploited to achieve the Slepian-Wolf bound, the communication rate is R(U_1, U_2) = H(U_1, U_2), the joint entropy. The reconstructions of the input vector source X = (X_1, X_2) given only U_1 or only U_2 are called the marginal reconstructions and are denoted by X̂_1 = (X̂_{1,1}, X̂_{2,1}) and X̂_2 = (X̂_{1,2}, X̂_{2,2}), respectively. The reconstruction using both U_1 and U_2 is denoted by X̂_0 = (X̂_{1,0}, X̂_{2,0}). The marginal and joint distortions are therefore

D_j = E‖X − X̂_j‖², j = 1, 2,  and  D_0 = E‖X − X̂_0‖². (1)
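To make the two coding cases concrete, the following sketch estimates both rate options and the marginal and joint distortions by Monte Carlo simulation for a pair of jointly Gaussian sources. Everything here is illustrative: the correlation value, the uniform eight-cell quantizers, and the conditional-mean reconstructions are our assumptions, not the optimized designs developed later in the paper.

```python
import numpy as np

rng = np.random.default_rng(0)
rho, n = 0.72, 200_000                       # assumed correlation and sample size
x = rng.multivariate_normal([0.0, 0.0], [[1.0, rho], [rho, 1.0]], size=n)

edges = np.linspace(-2.25, 2.25, 7)          # 7 boundaries -> 8 uniform cells
u1 = np.digitize(x[:, 0], edges)             # U1 = Q1(X1)
u2 = np.digitize(x[:, 1], edges)             # U2 = Q2(X2)
K = edges.size + 1                           # number of cells per quantizer

def entropy(counts):
    p = counts[counts > 0] / counts.sum()
    return float(-(p * np.log2(p)).sum())

joint = np.zeros((K, K))
np.add.at(joint, (u1, u2), 1.0)              # joint histogram of (U1, U2)
R_indep = entropy(joint.sum(axis=1)) + entropy(joint.sum(axis=0))  # H(U1)+H(U2)
R_sw = entropy(joint.ravel())                # Slepian-Wolf bound: H(U1, U2)

def distortion(labels):
    """Average MSE over both sources with conditional-mean
    reconstruction of (X1, X2) given the received label(s)."""
    d = 0.0
    for col in (0, 1):
        cnt = np.bincount(labels).astype(float)
        sums = np.bincount(labels, weights=x[:, col])
        means = np.divide(sums, cnt, out=np.zeros_like(sums), where=cnt > 0)
        d += np.mean((x[:, col] - means[labels]) ** 2)
    return d / 2

D1 = distortion(u1)                          # only description 1 received
D2 = distortion(u2)                          # only description 2 received
D0 = distortion(u1 * K + u2)                 # both received: joint cell index
```

For correlated sources, R_sw falls below R_indep by the mutual information I(U_1; U_2), and D0 improves on both marginal distortions; these are sample estimates of the quantities R, D_1, D_2, and D_0 used in the formulation.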

The problem can now be formulated as a feasibility problem, that is, the problem of characterizing all achievable tuples (D_1, D_2, D_0, R). A suitable optimization problem corresponding to this feasibility problem is the following:

min. R(U_1, U_2)  subject to  D_0 ≤ d_0, D_1 ≤ d_1, D_2 ≤ d_2, (2)
for a given triplet (d_0, d_1, d_2), assuming the optimization problem is feasible. This formulation is identical to the multiple description coding (MDC) problem when X_1 = X_2. Optimal quantizer design for MDC has been considered in [9, 10]. A similar approach to that of [9] is adopted in this paper to deal with the more general RDSC problem. This involves the minimization of a Lagrangian similar to those in [4, 9], corresponding to the constrained optimization problem in (2), as given below:

min. L(Q_1, Q_2) = λ_0 D_0 + λ_1 D_1 + λ_2 D_2 + λ R(U_1, U_2), (3)

where the multipliers λ_0, λ_1, λ_2, and λ are chosen to meet the constraints in (2).

An Iterative Design Algorithm
The joint optimization of Q_1 and Q_2 appears computationally prohibitive due to the interdependence of the conditional PDFs of each source given the quantized value of the other. We instead adopt an iterative approach in which each quantizer is optimized assuming the other one is fixed and known. Given two initial quantizers Q_1^0 and Q_2^0, a series of successively improved quantizers Q_1^k, Q_2^k, k = 0, 1, 2, ..., can be constructed as follows:

Q_1^k = arg min_{Q_1} L(Q_1, Q_2^{k−1}),   Q_2^k = arg min_{Q_2} L(Q_1^k, Q_2). (4)

The iteration stops after a predetermined number of steps or when the reduction in the value of L becomes negligible in consecutive iterations. The actual minimization steps in (4) are performed with a generalization of the approach in [9]. The optimization problem is cast into a shortest path problem as follows. First, the domain of X_1 is uniformly prequantized into M cells of interval length Δ each, M being sufficiently large. Next, we construct an acyclic directed graph G, called the quantizer graph, by associating each prequantization cell boundary i, 0 ≤ i ≤ M, with a node of the graph G. We introduce a directed edge from node i to node j, 0 ≤ i < j ≤ M, to represent the interval I_{ij} = [x_0 + iΔ, x_0 + jΔ), where x_0 is the smallest value in the support of X_1 that is taken into account. By including in the edges of G all such pairs of i and j, G becomes a complete directed acyclic graph.
When designing the quantizer Q_1 in the kth iteration with Q_2 fixed as Q_2^{k−1}, we assign to the edge from node i to node j a weight w_{1,k}(i, j), in which p_1 and p_2 represent the transmission loss probabilities of the two descriptions. If the interval I_{ij} is a cell of the quantizer Q_1, then the quantity w_{1,k}(i, j) is the contribution of that cell to the Lagrangian cost function L(Q_1, Q_2^{k−1}). We use the conditional distribution of X_2 given X_1 ∈ I_{ij} to compute the expected marginal and joint distortions and the rate. Figure 2 illustrates how to calculate the statistics of the reconstruction of X_2 for fixed Q_2^{k−1} and given X_1 ∈ I_{ij}. For the squared error distortion measure, the weights are explicitly calculated in the Appendix. One can now easily see that the problem of designing the optimal Q_1 for fixed Q_2^{k−1} is equivalent to finding the shortest path from node 0 to node M of graph G for an appropriate value of λ. Given the value of λ, this shortest path problem can be solved in O(M²) time. The targeted transmission rate can be met through a binary search on λ.
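As an illustration of the shortest-path formulation, the following sketch designs an entropy-constrained quantizer for a single source from training samples. It is a deliberate simplification of the paper's procedure: the edge weight here is only the within-cell squared error plus λ times the cell's contribution to the output entropy, whereas the actual RDSC weights also contain the marginal and joint distortion terms weighted by the loss probabilities. The O(M²) dynamic program over the complete DAG and the role of λ are the same.

```python
import numpy as np

def shortest_path_quantizer(samples, M=64, lam=0.1):
    """Entropy-constrained scalar quantizer via shortest path on the
    quantizer graph: node i = i-th prequantization boundary, edge
    (i, j) = one candidate cell [b_i, b_j).
    Edge weight = within-cell SSE/n + lam * (-p log2 p)."""
    samples = np.sort(samples)
    n = samples.size
    b = np.linspace(samples[0], samples[-1] + 1e-9, M + 1)  # boundaries
    idx = np.searchsorted(samples, b)        # sample index at each boundary
    # prefix sums give O(1) within-cell squared-error queries
    s1 = np.concatenate(([0.0], np.cumsum(samples)))
    s2 = np.concatenate(([0.0], np.cumsum(samples ** 2)))

    def weight(i, j):
        lo, hi = idx[i], idx[j]
        cnt = hi - lo
        if cnt == 0:
            return 0.0                       # empty cell costs nothing
        mean = (s1[hi] - s1[lo]) / cnt
        sse = (s2[hi] - s2[lo]) - cnt * mean * mean
        p = cnt / n                          # cell probability
        return sse / n + lam * (-p * np.log2(p))

    # DP over the complete DAG: cost[j] = shortest path from node 0 to j
    cost = np.full(M + 1, np.inf)
    cost[0] = 0.0
    pred = np.zeros(M + 1, dtype=int)
    for j in range(1, M + 1):
        for i in range(j):
            c = cost[i] + weight(i, j)
            if c < cost[j]:
                cost[j], pred[j] = c, i
    # backtrack to recover the cells of the optimal quantizer
    cells, j = [], M
    while j > 0:
        cells.append((b[pred[j]], b[j]))
        j = pred[j]
    return cost[M], cells[::-1]
```

A target rate is then met by a binary search on λ: larger λ penalizes rate more heavily and yields coarser quantizers.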
It should be noted that this procedure will only produce quantizers with connected cells. For a DSC problem where only the performance of the system with both descriptions available is important, limiting the design to quantizers with connected cells might significantly reduce the performance [4]. On the other hand, when acceptable marginal reconstructions are required, each individual quantizer must also be close to an optimal quantizer designed for its source alone, which necessarily has connected cells.

Achievable RDSC Performance for Jointly Gaussian Sources
The problem of RDSC for symmetric Gaussian sources is considered in [1, 2]. Given a set of sensors that observe noisy versions of a single source X, an achievable distortion-rate performance in the reconstruction of X is derived when any subset of the sensors' data is available. We adopt a simpler theoretical formulation better suited to our framework for studying achievable distortion-rate performance. In the formulations of [1, 2], the goal is the reconstruction of a specific source X when the sensors observe noisy versions of X. In our case, however, the sensors sense different but correlated sources. This necessitates a slight paradigm shift in our formulation. We use known results on classical multiple description source coding of Gaussian sources in [11] to derive achievable distortion-rate performances for a two-dimensional jointly Gaussian RDSC problem. This distortion-rate analysis is useful in evaluating the performance of the scalar quantization results in Section 3. We report our results for the symmetric case only, although a general derivation is possible. Also, these theoretical results apply only to the case where both sensor measurements are known to both encoders.
The region of all achievable distortion-rate pairs for MDC of a Gaussian source is known [11]. For two symmetric descriptions, of rate R each, of a Gaussian source of variance σ², the result of [11] can be written as

D ≥ σ² 2^{−2R}, (5)

D_0 ≥ σ² 2^{−4R} / (1 − ((1 − D/σ²) − √(|(D/σ²)² − 2^{−4R}|⁺))²), (6)

where D is the marginal distortion, D_0 is the joint distortion, and |x|⁺ = x if x > 0 and is zero otherwise. Now, consider two jointly Gaussian random variables X_1 and X_2, each of zero mean and unit variance, with correlation coefficient ρ. They can be written in terms of two uncorrelated, and hence independent, zero-mean Gaussian sources C and A as X_1 = (C + A)/√2 and X_2 = (C − A)/√2, where C and A are the DC and AC components of X_1 and X_2. Also, σ_C² = 1 + ρ and σ_A² = 1 − ρ. To describe X_1 or X_2 with R bits, one can equivalently describe their DC and AC components with rates R_C and R_A, where R_C + R_A = R. Applying the results in [11] to C and A independently, we arrive at the following achievable distortion-rate performance:

D = (D_C + D_A)/2,   D_0 = (D_{0,C} + D_{0,A})/2, (7)

where, for c ∈ {C, A}, the pair (D_c, D_{0,c}) satisfies (5) and (6) with variance σ_c² and rate R_c. Note that D and D_0 are the average distortions of reconstructing both sources X_1 and X_2. It should once again be emphasized that this coding assumes the values of both sources X_1 and X_2 are known at both encoders. A closed-form achievable bound on D_0 can be found at high rates. We derive one such bound by assuming the marginal distortions to be small but still much larger than the smallest marginal distortion promised by their distortion-rate functions. More precisely, we assume 2^{−2R} ≪ D ≪ 1. This assumption is usually satisfied for the practically meaningful designs considered in the next section. Physically, the assumption promises a small distortion given any one of the descriptions alone, while allowing for a large improvement when both descriptions are available. Under this assumption, the minimum achievable joint distortion from (7) is

D_0 ≈ (1 − ρ²) 2^{−2R} / (4D) (8)

for any given marginal rate R and distortion D. The joint distortion D_0 in (7) can then be written as

D_0 ≈ (σ_C⁴ 2^{−4R_C}/D_C + σ_A⁴ 2^{−4R_A}/D_A) / 8. (9)

Minimizing D_0 under the constraints R_C + R_A = R and D_C + D_A = 2D requires D_C = D_A = D and R_C − R_A = 0.5 log₂((1 + ρ)/(1 − ρ)). Equation (8) follows from inserting these relations back into (9).
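The symmetric MD bound of [11] and the high-rate approximation leading to (8) can be checked numerically. The sketch below assumes the standard symmetric form of the two-description Gaussian region and compares the closed-form high-rate expression against it; the function names and parameterization are ours.

```python
import math

def md_joint_bound(sigma2, R, Dm):
    """Smallest joint distortion D0 allowed by the symmetric
    two-description Gaussian MD region for a source of variance
    sigma2, per-description rate R, and marginal distortion Dm."""
    d = Dm / sigma2                          # normalized marginal distortion
    base = 2.0 ** (-4.0 * R)                 # 2^{-2(R1+R2)} with R1 = R2 = R
    delta = max(d * d - base, 0.0)           # |d^2 - 2^{-4R}|^+
    gap = max(1.0 - d, 0.0) - math.sqrt(delta)
    denom = 1.0 - gap * gap if gap > 0.0 else 1.0
    return sigma2 * base / denom

def md_joint_highrate(sigma2, R, Dm):
    """High-rate approximation D0 ~ sigma2^2 2^{-4R} / (4 Dm),
    valid when 2^{-2R} << Dm/sigma2 << 1."""
    return sigma2 * sigma2 * 2.0 ** (-4.0 * R) / (4.0 * Dm)
```

Applying these per component, with variances 1 + ρ and 1 − ρ and rates R_C and R_A, and averaging the results reproduces the two-source performance discussed above.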

Experimental Results
In this section, we investigate the performance of our proposed system and report improvements over a naive scheme. When the two descriptions are equally important, that is, λ_1 = λ_2, the minimization in (3) can be written in the following equivalent form:

min. L(Q_1, Q_2) = (1 − p)² D_0 + p(1 − p)(D_1 + D_2) + λ R(U_1, U_2). (10)
In this new form, the parameter p has a precise meaning, as follows. Assume that the two descriptions are transmitted through two independent channels and that the probability that each of the two transmissions fails is p. Then, up to the rate term, L in (10) is the expected distortion of the reconstructed sources X̂_1 and X̂_2. The relative importance of the marginal and joint distortions is reflected in the parameter p. For relatively large failure rates (poor channels), one would expect the two quantizers Q_1 and Q_2 to be close to an optimal quantizer for a single Gaussian source. The quantizer cells of Q_1 and Q_2 are, however, somewhat interleaved to reduce the distortion when both quantization outputs are available at the decoder. An example of a pair of optimized quantizers for ρ = 0.72, p = 0.30, and a marginal rate of 2.27 is shown in Figure 3. While other choices are possible, as discussed in Section 3, to calculate the overall communication rate throughout this section the Slepian-Wolf bound is assumed to be achieved; that is, we assume R(U_1, U_2) = H(U_1) + H(U_2 | U_1). The joint and marginal distortions for p = 0.30, ρ = 0.72, and different marginal rates are depicted in Figure 4. The distortion of the optimal quantizer for a single Gaussian source at the same rate, obtained from the Lloyd-Max algorithm, is also shown in the same figure for comparison. Evidently, the difference between the two is extremely small. In other words, the penalty in the performance of the marginal reconstruction is negligible when jointly designing the quantizers for two sources versus designing the quantizers for each source independently. The distortion-rate performance of the proposed distributed coding scheme, consisting of scalar quantizers and entropy coders, is also compared against the theoretically achievable performance given by (8). The comparison is carried out in the following manner. For fixed p and ρ, the quantizers are optimally designed for a target marginal distortion.
This design results in a marginal bit rate and a joint distortion; the joint distortion is then plotted as a function of this marginal rate and is compared to what is theoretically achievable for the same marginal rate and distortion. A second benchmark, represented as "Lloyd-joint," is based on independently designing optimal quantizers for X_1 and X_2, having the same numbers of cells as Q_1 and Q_2, using the Lloyd-Max algorithm. Our proposed method exploits the correlation between the sources to improve the expected distortion. As is evident from the figure, the improvements at low rates are remarkable.
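For reference, the Lloyd-Max benchmark used in these comparisons can be implemented as follows; this is the textbook alternation of nearest-codeword partitioning and centroid updates on training samples, with a quantile-based initialization that is our own choice.

```python
import numpy as np

def lloyd_max(samples, levels, iters=100):
    """Classic Lloyd-Max design of a `levels`-point scalar quantizer
    for the empirical distribution of `samples`."""
    # initialize the codebook with quantiles of the data
    q = np.quantile(samples, (np.arange(levels) + 0.5) / levels)
    for _ in range(iters):
        # partition step: boundaries are midpoints between codewords
        edges = (q[1:] + q[:-1]) / 2
        cell = np.digitize(samples, edges)
        # centroid step: move each codeword to its cell's conditional mean
        for k in range(levels):
            sel = samples[cell == k]
            if sel.size:
                q[k] = sel.mean()
    edges = (q[1:] + q[:-1]) / 2
    mse = np.mean((samples - q[np.digitize(samples, edges)]) ** 2)
    return q, mse
```

For a unit-variance Gaussian training set, the four-level design converges near the known Lloyd-Max distortion of roughly 0.12.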

Concluding Remarks
This paper considered the practice of RDSC in the context of optimally designed scalar quantizers. A practically convenient coding scheme, consisting of a pair of quantizers and discrete lossless coders to robustly encode two correlated sensor measurements, was proposed. Once the particular form of the lossless coders is chosen, the devised optimization algorithm designs the quantizers accordingly. It was shown that a proper choice of the quantizers can significantly improve the overall system performance.
Sensors are assumed to be measuring some random field; therefore, aside from the temporal correlations between the measurements of a single sensor, significant correlations can also exist between the measurements of nearby sensors. Such correlations can be exploited to reduce the amount of communication required to report the sensor measurements to a server.