Efficient Data Reduction Techniques for Remote Applications of a Wireless Visual Sensor Network

Abstract A Wireless Visual Sensor Network (WVSN) is formed by deploying many Visual Sensor Nodes (VSNs) in the field. After acquiring an image of the area of interest, the VSN performs local processing on it and transmits the result using an embedded wireless transceiver. Wireless data transmission consumes a great deal of energy, and this consumption depends mainly on the amount of information being transmitted. The image captured by the VSN contains a huge amount of data. For certain applications, segmentation can be performed on the captured images. The amount of information in the segmented images can be reduced by applying efficient bi-level image compression methods. In this way, the communication energy consumption of each of the VSNs can be reduced. However, the data reduction capability of bi-level image compression standards is fixed and is limited by the compression algorithm used. For applications with few changes between adjacent frames, change coding can be applied for further data reduction. Detecting and compressing only the Regions of Interest (ROIs) in the change frame is another possibility for further data reduction. In a communication system where both the sender and the receiver know the employed compression standard, there is a possibility for further data reduction by not including the header information in the compressed bit stream of the sender. This paper summarizes different information reduction techniques such as image coding, change coding and ROI coding. The main contribution is the investigation of the combined effect of all these coding methods and their application to a few representative real life applications. This paper is intended to be a resource for researchers interested in techniques for information reduction in energy constrained embedded applications.


Introduction
The Wireless Visual Sensor Network (WVSN) consists of many Visual Sensor Nodes (VSNs). Typically, a VSN consists of an image sensor for acquiring images of the field of interest, an embedded processing unit for onboard processing, memory for storage and a radio transceiver for communicating the results wirelessly to the server. WVSNs are suitable for applications with a limited energy budget and are applied in remote areas where it is inconvenient to modify the location of the VSN or to frequently change the batteries. A lower energy budget places constraints on the type of the components used for implementing VSNs. Components with low power consumption are preferred for such applications.
In the literature, many authors, e.g., [1][2][3], have focused on different implementation strategies for performing vision processing tasks in either the VSNs or the server. Some authors considered capturing the image of the field of view, compressing it and sending it to the server for further processing [1]. In this case the communication energy consumption is higher because compressed raw images are transmitted to the server. On the other hand, other authors have proposed performing all the vision processing tasks at the VSN and transmitting only the object features to the server as the final results. In this case, the communication energy consumption is low but the computational energy consumption is high because the VSN is performing operations for a longer time.
An example in which all the computation is performed in the VSN is presented in [2], where the authors implemented a distributed vision processing system for human pose interpretation on a wireless smart camera network. They stated that their motivation behind the employment of distributed processing was to process the data in real time and also to provide scalability for developing more complex vision algorithms. By performing local processing on the smart camera they extracted critical joints of the subject in the scene in real time. The results obtained by multiple smart cameras are then transmitted through the wireless channel to a server for the reconstruction of the human pose. SensEye, which is discussed in detail in [3], is another such example: a multi-tier network of heterogeneous wireless nodes and cameras that aims at low power, low latency detection and wakeup. The authors implemented a surveillance application using SensEye, which performs advanced image processing operations such as object detection, recognition and tracking.
Both local processing and wireless communication consume a large portion of the total energy budget of the VSN. Transmitting the results from the VSN without local processing reduces the processing energy consumption but the consequence of this is a higher communication energy consumption due to the transmission of large chunks of raw data. On the other hand, performing all the processing locally at the VSN and transmitting the final results reduces the communication energy consumption, but the disadvantage of this is the higher processing energy consumption because of the increased processing at the VSN. These two extremes of processing are shown graphically in Figure 1.
Previous studies on intelligence partitioning between the VSN and the server in [4,5] have concluded that choosing a suitable intelligence partitioning strategy reduces the total energy consumption of the VSN. Transmitting the uncompressed images wirelessly from the VSN to the server rapidly depletes its total energy. Communication energy consumption is heavily dependent on the amount of information that is being transmitted. Coding the binary image after pre-processing and segmentation is a good alternative in relation to achieving a general architecture for WVSN, which is discussed in [6] and is shown in Figure 2. Figure 2 shows that the remaining tasks, such as binary image processing operations, labelling and features extraction, are performed at the server.
The amount of data after compression (Figure 2) is limited by the compression algorithm used. It is preferable to reduce the data further by applying image processing techniques such as the change coding presented in [11] and the Region of Interest (ROI) coding explained in [12].
In the current manuscript, we propose to postpone the header information of the compression standards from the Visual Sensor Node (VSN) to the server and to analyse its effect on the previously achieved results of image coding [7], change coding [11] and ROI coding [12] for both statistically generated and real captured images. In our analysis, we have used three image compression standards: JBIG2, Gzip and CCITT Group 4. JBIG2 was developed by the Joint Bi-level Image Experts Group and CCITT Group 4 by the International Telegraph and Telephone Consultative Committee. Gzip_pack denotes standard Gzip applied to packed images; from this point onward, Gzip is written to refer to Gzip_pack. The selection of these three compression standards is based on the analysis provided in [7].
The main contribution of the current manuscript is to apply the developed Information Reduction Techniques (IRTs) to real life applications and to evaluate their performance. We also compare our results with previously achieved results from the literature for real life applications, e.g., meter reading [1], in Section 6.
The remainder of the paper is organized as follows. Section 2 presents the related work. Section 3 provides a brief discussion of the Information Reduction Techniques (IRT). The header information of the considered bi-level image compression standards is also discussed in Section 3. This is followed by an analysis of the IRT in Section 4. The results of IRT based on statistical images are presented in Section 5. The results of the IRT for several representative real life applications are presented in Section 6. Finally, Section 7 concludes the paper.

Related Work
Few researchers have compared image compression methods. A comparison of international standards for lossless still image compression was made in [13], whose authors thoroughly investigated the compression ratio of all the well-known compression methods available at that time. However, the comparison in [13] does not cover the more recent JBIG2 standard. Another issue is that consideration is only given to still images.
In [14], the compression ratio and execution time of the compression methods were investigated. However, the authors investigated the efficiency of the compression methods based only on textual data and still images. Textual images are very different from the images in a machine vision scenario.
Another study on the comparison of compression methods was conducted in [15]. The authors applied many compression standards and some compression programs to different medical images and compared both the compression ratio and the execution time of the compression techniques. They pointed out that the compression performance depends on the type of images. The implication is that these results cannot be directly applied to machine vision applications, because the images are of a different type.
An early effort in relation to progressive image encoding was made in [22]. Using progressive encoding, the authors achieved different qualities for different parts of the image. Using this method, an arbitrary ROI in any image can be encoded progressively up to lossless.
A fast and efficient image compression algorithm based on set partitioning in hierarchical trees (SPIHT) was proposed in [23]. This algorithm is based on the principle of partial ordering by magnitude with a set partitioning sorting algorithm, ordered bit plane transmission and the exploitation of self-similarity across different scales of an image's wavelet transform. SPIHT is a powerful image compression algorithm that produces an embedded bit stream from which the image can be reconstructed at various bit rates.
Three mechanisms are available in the well-known JPEG2000 compression standard for compressing different parts of an image with different spatial qualities: tiling, code block selection and coefficient scaling. These three methods are described in detail in [24].
The adaptive SPIHT compression method explained in [19] can be used to compress different parts of an image with different compression algorithms, providing different qualities for the ROIs and the background image. Selective compression is conducted by performing JPEG2000 on the ROIs and SPIHT on the remainder of the image. The compression process becomes energy efficient by performing energy efficient SPIHT on the non-ROI parts of the image.

Information Reduction Techniques
In many applications, such as meter reading, the monitoring of magnetic particles in a hydraulic system, the monitoring of a habitat (e.g., monitoring of birds to prevent them from colliding with windmills) in a specific area, the localization of robots/vehicles etc., the image contains two components: the objects and the background. Hence, in all such applications, the images can be segmented into bi-level images, and by doing this the amount of information in the images is reduced.
Bi-level image compression standards are effective in reducing the amount of information and can be applied to these segmented bi-level images. By implementing a suitable bi-level image coding method, together with other data reduction techniques in the VSN, communication energy consumption can be reduced.
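As an illustration of the segmentation step, the following minimal Python sketch thresholds a small grayscale frame into a bi-level frame and packs it at eight pixels per byte, as a packed-image coder such as Gzip_pack would expect. The fixed threshold of 128, the MSB-first packing order and the function names segment and pack_bits are assumptions made for this example only; the segmentation actually used in [6] and [7] may differ.

```python
def segment(gray, threshold=128):
    """Turn a grayscale frame (rows of 0-255 ints) into a bi-level frame of 0/1.

    The fixed global threshold is a hypothetical choice for illustration.
    """
    return [[1 if px >= threshold else 0 for px in row] for row in gray]


def pack_bits(binary):
    """Pack a bi-level frame row by row, 8 pixels per byte, MSB first."""
    out = bytearray()
    for row in binary:
        for start in range(0, len(row), 8):
            chunk = row[start:start + 8]
            byte = 0
            for bit in chunk:
                byte = (byte << 1) | bit
            byte <<= 8 - len(chunk)  # pad a short final chunk with zeros
            out.append(byte)
    return bytes(out)


# Toy 2 x 8 frame: bright pixels are objects, dark pixels are background.
gray = [[10, 200, 220, 15, 12, 11, 250, 9],
        [12, 14, 13, 240, 245, 16, 10, 8]]
binary = segment(gray)
packed = pack_bits(binary)
print(len(gray) * len(gray[0]), "bytes at 8 bits per pixel ->",
      len(packed), "bytes after segmentation and packing")
```

Even before any compression standard is applied, segmentation followed by packing reduces the frame to one bit per pixel.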

Image Coding
Based on the general architecture in [6], the compression efficiency of various bi-level image-coding methods has been investigated in [7]. The pool of image compression standards is shown in Figure 3. It has been concluded that JBIG2, CCITT Group 4 and Gzip_pack provide good compression efficiency. However, the compression efficiency of each compression standard is limited by the compression algorithm used. For further data reduction, other efficient techniques must be explored.

Change Coding
In all the previously mentioned applications (meter reading, magnetic particles etc.), the adjacent frames are quite similar. In other words, the differences between two adjacent frames in these applications are only marginal. Change coding can be applied for further data reduction (beyond the compression level provided by image coding). The compression efficiency of change coding has been investigated in [11]. The architecture of change coding is shown in

Region of Interest Coding
Lossless Region of Interest (ROI) coding for bi-level images (Figure 5) is another possibility for further data reduction in applications with few changes between adjacent frames. It has been investigated in [12].
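A minimal Python sketch of the idea is given below. It assumes that the ROI is a single bounding box around all set pixels of a change frame and that the box coordinates are sent alongside the cropped pixels. The method in [12] may detect several ROIs and encode their positions differently, so the function names bounding_box, crop and paste are hypothetical and serve only to illustrate why transmitting the ROI is lossless.

```python
def bounding_box(frame):
    """Return (top, left, bottom, right) around all set pixels, or None if empty."""
    rows = [r for r, row in enumerate(frame) if any(row)]
    if not rows:
        return None
    cols = [c for row in frame for c, px in enumerate(row) if px]
    return rows[0], min(cols), rows[-1], max(cols)


def crop(frame, box):
    """VSN side: keep only the pixels inside the bounding box."""
    top, left, bottom, right = box
    return [row[left:right + 1] for row in frame[top:bottom + 1]]


def paste(shape, box, roi):
    """Server side: rebuild the full frame from the ROI and its position."""
    height, width = shape
    top, left, _, _ = box
    frame = [[0] * width for _ in range(height)]
    for r, row in enumerate(roi):
        frame[top + r][left:left + len(row)] = row
    return frame


change_frame = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 1, 1, 0, 0],
    [0, 0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
box = bounding_box(change_frame)
roi = crop(change_frame, box)
restored = paste((4, 6), box, roi)
print("ROI box:", box, "ROI pixels:", sum(len(r) for r in roi), "of", 4 * 6)
```

The position information that accompanies each ROI is an overhead, which becomes relevant when a frame contains many objects.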

Postponing Header Information from VSN to the Server
It has been observed in [12] that even for only one object in the frame, the compressed file size using CCITT Group 4 is very high compared to Gzip and JBIG2. This indicates that the header information of CCITT Group 4 is large. In communication systems where both the sender and the receiver know the employed compression standard, the header information can be inserted into the received bit stream at the receiver's side. In this way the transmitted data is further reduced. Postponement of the header information from the VSN to the server will affect the results in [7], [11] and [12]. The idea of postponing the header information from the VSN to the server is shown in Figure 6.
CCITT Group 4 is a compression type that is supported by the TIFF (Tagged Image File Format) file format. Interested readers are referred to the detailed explanation of the CCITT Group 4 standard in [9]. Due to the versatile header of the TIFF file, an image compressed using CCITT Group 4 contains long header information. By postponing the header information from the VSN to the server, a saving in terms of storage and transmission energy is expected.
The Gzip file format consists of a ten byte header, the payload and an eight byte footer. The header contains a magic number, the compression method, flags and a time stamp. The payload is the compressed data, while the footer contains a Cyclic Redundancy Check (CRC-32) checksum and the length of the original uncompressed data. Thus, the combined header and footer overhead of Gzip is 18 bytes.
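For Gzip, the postponement can be sketched with the Python standard library as follows. The sketch assumes a stream with the minimal ten byte header (no optional file name or comment fields), which is what gzip.compress produces for in-memory data. The function names vsn_encode, server_decode and server_rebuild_gzip are hypothetical, and the all-background test frame is a placeholder rather than data from our experiments.

```python
import gzip
import struct
import zlib

GZIP_HEADER_LEN = 10  # magic number, method, flags, time stamp, extra flags, OS
GZIP_FOOTER_LEN = 8   # CRC-32 and length of the uncompressed data


def vsn_encode(packed_frame):
    """Sender: gzip the frame, then drop the fixed header and footer."""
    stream = gzip.compress(packed_frame, mtime=0)
    return stream[GZIP_HEADER_LEN:-GZIP_FOOTER_LEN]


def server_decode(payload):
    """Receiver: inflate the bare DEFLATE payload (negative wbits = no wrapper)."""
    return zlib.decompress(payload, -zlib.MAX_WBITS)


def server_rebuild_gzip(payload):
    """Receiver: optionally re-create a complete gzip file, e.g., for storage."""
    raw = server_decode(payload)
    # Magic number, DEFLATE method, no flags, zero time stamp, unknown OS.
    header = b"\x1f\x8b\x08\x00" + struct.pack("<I", 0) + b"\x00\xff"
    footer = struct.pack("<II", zlib.crc32(raw), len(raw) & 0xFFFFFFFF)
    return header + payload + footer


frame = bytes(3200)  # placeholder: a packed bi-level frame containing only background
payload = vsn_encode(frame)
full = gzip.compress(frame, mtime=0)
print(len(full), "bytes with header and footer ->", len(payload), "bytes transmitted")
```

Because the footer depends on the uncompressed data, the receiver in this sketch recomputes it after inflating the payload instead of receiving it over the radio link.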
JBIG2 [8] is a lossless image compression standard based on a form of arithmetic coding called the MQ coder, which is explained in [17]. The MQ coder is an adaptive binary arithmetic coder characterized by a multiplication-free approximation, a renormalization-driven update of the probability estimator, and bit stuffing, which was introduced by the Q-coder and is explained in [18].
The header of JBIG2 is 13 bytes long. It consists of an eight byte ID string, one byte of file header flags and four bytes that represent the number of pages. The number-of-pages field may be absent from the JBIG2 header if the number of pages in the compressed file is not known. Since in our experiments we have compressed each image individually, the number-of-pages field is present in the header. In this way, the header overhead in our JBIG2 compressed images is 13 bytes.
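The 13 byte layout described above can be sketched in Python as follows. The ID string bytes and the big-endian page count follow our reading of the JBIG2 file format, and the flag value 0x01 (sequential organisation, page count known) should be treated as an assumption to be checked against the standard [8]. The function names and the three payload bytes in the example are hypothetical placeholders, not real JBIG2 segment data.

```python
import struct

# Eight byte ID string that starts a JBIG2 file.
JBIG2_ID = bytes([0x97, 0x4A, 0x42, 0x32, 0x0D, 0x0A, 0x1A, 0x0A])
HEADER_LEN = 13  # ID string (8) + flags (1) + number of pages (4)


def jbig2_file_header(pages=1, flags=0x01):
    """Build the 13 byte header for a file whose page count is known.

    flags=0x01 is assumed here to mean sequential organisation with a
    known number of pages; verify against the standard before relying on it.
    """
    return JBIG2_ID + bytes([flags]) + struct.pack(">I", pages)


def strip_header(jbig2_file):
    """VSN side: drop the file header before transmission."""
    if not jbig2_file.startswith(JBIG2_ID):
        raise ValueError("not a JBIG2 file")
    return jbig2_file[HEADER_LEN:]


def restore_header(payload, pages=1):
    """Server side: prepend the agreed header to the received payload."""
    return jbig2_file_header(pages) + payload


fake_segments = b"\x00\x01\x02"  # placeholder bytes standing in for the segment data
original = jbig2_file_header() + fake_segments
sent = strip_header(original)
print(len(original) - len(sent), "bytes saved per frame")
```

This only works if the sender and the receiver agree beforehand on the flag value and on one page per file, which matches the setting in which each image is compressed individually.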

Analysis of the Information Reduction Techniques
This section summarizes the results of the IRT, i.e., image coding, change coding and ROI coding. These results have been obtained by postponing the header information from the VSN to the server. The results are presented graphically and the notable points are discussed.
In [7], the effects of various features, such as the shapes, sizes and number of objects in the statistically generated bi-level images, on the compression efficiency of six compression methods have been analysed. The conclusion drawn was that an increase in the size and number of objects in the input images resulted in an increase in the size of the compressed file for all the compression methods. The size of the compressed file from Gzip (for packed images), CCITT Group 4 and JBIG2 is lower when compared to the other compression methods. Based on these facts, the conclusion drawn was that CCITT Group 4, JBIG2 and Gzip are the most suitable bi-level image compression standards for remote applications of WVSN.
The results for image coding both before and after header removal are presented in Figure 7, where BHR and AHR represent before header removal and after header removal respectively. The removal of the header from the compressed bit stream had an enormous effect on the results of CCITT Group 4 as compared to those for the other two methods (Figure 7). It can be observed in Figure 7 that the compressed file size of CCITT Group 4 is the smallest after header removal. CCITT Group 4 thus becomes the best compression method for image coding once the header is removed.
We investigated the effect of change coding on the compression efficiency of the three selected bi-level image compression standards and presented the results in [11]. We determined that change coding in combination with bi-level image coding standards offers better compression efficiency compared to only image coding for a case where many objects appear/disappear in a set of frames. Furthermore, for up to 95% of changes in terms of the number of objects in adjacent frames, change coding provides better compression efficiency compared to image coding. Beyond 95% of changes, in terms of the number of objects in adjacent frames, change coding does not offer any saving (it becomes worse than image coding).
Segmentation errors due to illumination noise may increase the size of objects by a few pixels, which severely affects the compression efficiency of change coding.
We proposed in [11] that the effect of illumination noise may be reduced by applying morphology to the preprocessed image in order to remove unconnected pixels in the frames (unwanted pixels which are generated due to illumination noise). It was concluded that change coding in combination with image coding and morphology is a good strategy for data reduction of bi-level images in wireless visual sensor networks.
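The interplay of change coding and noise removal can be sketched in Python as follows. The sketch assumes that the change frame is the pixel-wise XOR of two adjacent bi-level frames and that the morphological step simply clears set pixels without any set 8-connected neighbour. Both are simplifying assumptions for illustration; the exact operators used in [11] may differ, and the function names are hypothetical.

```python
def change_frame(previous, current):
    """Pixel-wise XOR: 1 where the two bi-level frames differ."""
    return [[a ^ b for a, b in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]


def remove_isolated_pixels(frame):
    """Clear set pixels that have no set 8-connected neighbour."""
    height, width = len(frame), len(frame[0])
    cleaned = [row[:] for row in frame]
    for r in range(height):
        for c in range(width):
            if not frame[r][c]:
                continue
            neighbours = sum(
                frame[rr][cc]
                for rr in range(max(r - 1, 0), min(r + 2, height))
                for cc in range(max(c - 1, 0), min(c + 2, width))
                if (rr, cc) != (r, c)
            )
            if neighbours == 0:
                cleaned[r][c] = 0
    return cleaned


previous = [[0, 0, 0, 0, 0],
            [0, 1, 1, 0, 0],
            [0, 1, 1, 0, 0],
            [0, 0, 0, 0, 0]]
current = [[0, 0, 0, 0, 1],   # isolated noise pixel in the top-right corner
           [0, 1, 1, 0, 0],
           [0, 1, 1, 0, 0],
           [0, 0, 0, 1, 1]]   # a new two-pixel object in the bottom row

raw_change = change_frame(previous, current)
clean_current = remove_isolated_pixels(current)
clean_change = change_frame(previous, clean_current)
# The server recovers the cleaned frame by XOR-ing the change onto its stored copy.
recovered = change_frame(previous, clean_change)
print(sum(map(sum, raw_change)), "changed pixels before cleaning,",
      sum(map(sum, clean_change)), "after")
```

In this toy example the noise pixel would otherwise appear in the change frame as if it were a new object.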
We applied the idea of postponing the header information from the VSN to the server to change coding and analysed its effect on the compression efficiency of the three selected bi-level image compression standards (Figure 8). The header removal affected the results of CCITT Group 4 significantly more than those of JBIG2 and Gzip. Based on the results in Figure 8, we concluded that, in the case of change coding, CCITT Group 4 is the best compression method if the header is not included in the compressed bit stream.
In [12], we analysed the effect of ROI coding on the compression efficiency of the three bi-level image compression standards. ROIs of various shapes and varying numbers of objects in large and small sized frames were considered. It was found that ROI coding offers better compression efficiency for some compression standards than for others.
In [12], it was observed that the header of CCITT Group 4 is larger than that of JBIG2 and Gzip. Thus, the header information of the respective methods was subtracted and the results are presented graphically in Figure 9 for the case in which the number of objects in the frames was increased from 0 to 20 (using the setting of 0-100% standard deviations in the average number of objects).
Figure 9 shows that after header removal, CCITT Group 4 in combination with ROI coding proves to be the best compression method. After the removal of the header information from the compressed images, the curve for CCITT Group 4 is the lowest when it is compared to the other compression methods.
Based on the results of the header removal for image coding, change coding and ROI coding, it was concluded that CCITT Group 4 is the most suitable compression method to be used in energy constrained wireless applications.

Performance Evaluation of IRT for statistical images
In this section we analyse and discuss the average file size of CCITT Group 4, JBIG2 and Gzip for image coding, change coding and ROI coding for both before and after header removal, based on statistically generated images. The average file size has been presented for various application domains. Consideration has been given to frames with too few objects (zero to four objects) and frames with too many objects (16 to 20). The statistical images are uploaded to a webpage and the link to it is provided in [16].
In our simulations, we have two kinds of average compressed file sizes. The first is the average over the 50 compressed frames for each index in Figure 7, 9 and 10; this average is shown on the vertical axis of those figures. The second is the average for each compression standard, which is calculated as follows. The average file size for each index is determined first. The sum of the 11 average file sizes is then calculated and divided by 11, which gives the average file size for one shape. Finally, the sum of the average file sizes of the six shapes is determined and divided by six in order to calculate the final average file size for each standard, which is shown in Table 1 and Table 2.
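The nested averaging can be written compactly, as in the following Python sketch. The function name average_file_size and the nested-list layout sizes[shape][index] are assumptions made for illustration, and the numbers in the example are small placeholders (two shapes, three indices, two frames each) rather than measured file sizes. In our setting there would be six shapes, 11 indices and 50 frames per index.

```python
from statistics import mean


def average_file_size(sizes):
    """Average of averages for one compression standard.

    sizes[shape][index] is the list of compressed file sizes (in bytes)
    of the frames generated for that shape and index.
    """
    per_shape = []
    for shape_runs in sizes:
        # First level: average over the frames of each index.
        per_index = [mean(frames) for frames in shape_runs]
        # Second level: average over the indices of this shape.
        per_shape.append(mean(per_index))
    # Third level: average over the shapes.
    return mean(per_shape)


# Placeholder data only, not results from our experiments.
sizes = [
    [[100, 120], [140, 160], [180, 200]],  # shape A
    [[90, 110], [130, 150], [170, 190]],   # shape B
]
final_average = average_file_size(sizes)
print("final average file size:", final_average, "bytes")
```

Because every index contains the same number of frames and every shape the same number of indices, this average of averages equals the plain mean over all frames.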
Thus, Table 1 shows the average file size for frames with too many or too few objects of various shapes. It can be observed in Table 1 that the average compressed file size of both CCITT Group 4 and Gzip is larger than that of JBIG2 before header removal for all three cases of image coding, change coding and ROI coding. This means that, in terms of compression efficiency, JBIG2 is the best compression method for all the cases of image coding, change coding and ROI coding before header removal. Note that BHR and AHR in both Table 1 and Table 2 represent before header removal and after header removal.
In terms of compression efficiency, JBIG2 is the most effective, while Gzip is the least effective compression standard for the IRTs (image coding, change coding and ROI coding). Header removal from the compressed images has little effect on the average compressed file size of both JBIG2 and Gzip (Table 1).
After the postponement of the header information from the VSN to the server, the results are affected and JBIG2 is no longer the best compression standard. After header removal, CCITT Group 4 is the most effective compression standard for the cases of image coding and change coding, but not for ROI coding.
It can be observed in Table 1 that ROI coding has a significant effect on Gzip while it has a negative effect on both JBIG2 and CCITT Group 4. For JBIG2 and CCITT Group 4, the average compressed file size after ROI coding is higher than that for change coding. The reason for this is the overhead involved due to run length codes. The run length codes are needed for the reconstruction of the original images at the server. This shows that ROI coding is not a good method for applications with many objects in the frames. It also shows that CCITT Group 4 in combination with change coding is a better option for data reduction in such applications (better than JBIG2 because of its low processing complexity). Thus, compressing the ROIs in the change frame using CCITT Group 4 (excluding the header information) at the VSN is a good strategy for reducing the data in the WVSN. Hence, the conclusion is that CCITT Group 4 in combination with change coding and ROI coding is the most suitable compression strategy for applications with few objects in the frames.

Performance Evaluation of the IRT for real applications
We have evaluated the performance of the IRT for various monitoring examples from real life applications.
The selected examples are primarily intended to evaluate and present an approximation of the IRT for various categories of real life monitoring applications. We proposed and evaluated the IRT mainly for the project "A Monitoring System for Failure Detection in Machinery", which is based on the detection of magnetic particles in oil in hydraulic systems. We also applied IRT to other real life applications and evaluated its performance for those applications. The other monitoring applications are meter reading, node localization and human detection. The readers are referred to details of meter reading in [1] and [20], that of node localization applications in [20] and [21] and that of human detection in [10].
The results of the IRT for the considered applications are shown in Figure 10. Parts (a), (b), (c) and (d) in Figure 10 show the average file size of IRT for real life applications of human detection, LED light detection, meter reading and magnetic particle detection respectively. The average compressed file size for image coding for the meter reading application is higher than that of the other applications. The reason for this is that there are many objects (digits) in the images and hence its compressed file size lies in the range of applications having too many objects in the frames (see Table 1). For the other applications, there are few objects in the images and thus the compressed file size is lower and lies in the range of Table 2. These results for real life applications have validated the trends in our statistical analysis of Section 5.
We can see in Table 3 of [1] that, in terms of compression rate, the most efficient compression method is SPIHT; its compression rate for meter reading is 0.18 bits per pixel. In our analysis, we have used frames with 400 rows and 640 columns. Thus, for such frames, the compressed file size using SPIHT is 0.18 x 400 x 640 = 46080 bits = 5760 bytes. It can be observed in Figure 10 (c) that the average compressed file size for meter reading using our proposed method is 268 bytes, which is achieved by applying the techniques of change coding, ROI coding and header removal in combination with CCITT Group 4.
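The comparison can be checked with a few lines of Python. The sketch uses only the numbers quoted in this section, namely the 0.18 bits per pixel reported for SPIHT in [1], the 400 x 640 frame size and the 268 byte average from Figure 10 (c). The variable names are illustrative.

```python
ROWS, COLS = 400, 640
SPIHT_BITS_PER_PIXEL = 0.18  # compression rate for meter reading, Table 3 of [1]
PROPOSED_BYTES = 268         # average file size for meter reading, Figure 10 (c)

# Bits needed by SPIHT for one frame, rounded to absorb floating-point error.
spiht_bits = round(SPIHT_BITS_PER_PIXEL * ROWS * COLS)
spiht_bytes = spiht_bits // 8
ratio = spiht_bytes / PROPOSED_BYTES

print(f"SPIHT: {spiht_bits} bits = {spiht_bytes} bytes per frame")
print(f"Proposed method: {PROPOSED_BYTES} bytes, about {ratio:.1f} times smaller")
```

For these figures, the proposed combination transmits roughly 21 times less data per frame than the SPIHT result.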

Conclusion
This paper provides an analysis of our proposed information reduction techniques, i.e., image coding, change coding and ROI coding, for machine vision applications with varying numbers of objects of various shapes and locations in the frames. The effect of removing the header information from the bit stream of three well-known bi-level image compression standards (CCITT Group 4, JBIG2 and Gzip) on the average file size of all the IRTs has also been investigated. Results based on statistically generated images show that, in terms of compression efficiency without header removal, JBIG2 is the most effective, while Gzip is the least effective compression method for all the cases of image coding, change coding and ROI coding. However, the postponement of the header information from the VSN to the server changed the situation. After postponement of the header information to the server, CCITT Group 4 becomes the best compression method for all the information reduction techniques. The results of the statistical analysis are verified by applying the proposed IRT to various representative real life applications. We conclude that the detection of ROIs in the change frame in combination with CCITT Group 4 (without including its header information at the VSN) is a good strategy for efficiently reducing the data in the bi-level images. In this way the information communication in wireless visual sensor networks can be reduced, which will result in a substantial reduction in communication energy consumption. Reduced communication energy consumption will increase the lifetime of the Visual Sensor Node.

Figure 1. The two extremes of image processing tasks in WVSN.

Figure 3. Selection of bi-level image coding method.

Figure 6. Pictorial view of postponing the header information from VSN to the server.

Figure 7. Image coding results for BHR and AHR.

Figure 9. Effect of header removal on the results of ROI coding.

Figure 10. Average compressed file size for various applications.

Table 1. Average file size for applications with too many objects

Table 2. Average file size for applications with too few objects