Investigation of Image Fusion for Remote Sensing Application



Introduction
Remote sensing techniques have proven to be powerful tools for monitoring the Earth's surface and atmosphere at global, regional, and even local scales, providing coverage, mapping and classification of land cover features such as vegetation, soil, water and forests. The volume of remote sensing imagery continues to grow at an enormous rate owing to advances in sensor technology for both high spatial and high temporal resolution systems. Consequently, an increasing quantity of image data from airborne and satellite sensors has become available, including multi-resolution, multi-temporal, multi-frequency/spectral-band and multi-polarization images. Remote sensing information is convenient and easy to access over large areas at low cost, but because of cloud, aerosol, solar elevation angle and bidirectional reflection effects, the surface energy parameters retrieved from remote sensing data are often missing; the seasonal variation of surface-parameter time series is affected as well. To reduce such effects, a time-composite method is generally adopted. The goal of multiple-sensor data fusion is to integrate complementary and redundant information to provide a composite image that supports a better understanding of the entire scene.

Definition of image fusion
The definition of image fusion varies. For example:
• Image fusion is the combination of two or more different images to form a new image by using a certain algorithm (Genderen and Pohl 1994) [1].
• Image fusion is the process of combining information from two or more images of a scene into a single composite image that is more informative and is more suitable for visual perception or computer processing (Guest editorial of Information Fusion, 2007) [2].
• Image fusion is a process of combining images, obtained by sensors of different wavelengths simultaneously viewing the same scene, to form a composite image. The composite image is formed to improve image content and to make it easier for the user to detect, recognize, and identify targets and to increase situational awareness (2010, http://www.hcltech.com/aerospace-and-defense/enhanced-vision-system/).
Image fusion has proved since the early 1990s to be an effective way to make optimum use of large volumes of imagery from multiple sources. Multiple image fusion seeks to combine information from multiple sources to achieve inferences that are not feasible from a single sensor or source. The aim of image fusion is to integrate different data in order to obtain more information than can be derived from any single sensor's data alone [3].
This chapter focuses on multi-sensor image fusion in remote sensing. The fusion of information from sensors with different physical characteristics enhances the understanding of our surroundings and provides a basis for regional planning, decision-making, urban sprawl monitoring, land use/land cover classification, and similar tasks.

Techniques and application of image fusion
In the past decades, image fusion has been applied to different fields such as pattern recognition, visual enhancement, object detection and area surveillance. In 1997, Hall and Llinas gave a general introduction to multi-sensor data fusion [4]. Another in-depth review paper on multiple-sensor data fusion techniques was published in 1998 [3]. That paper explained the concepts, methods and applications of image fusion as a contribution to multi-sensor integration oriented data processing. Since then, image fusion has received increasing attention. Further scientific papers on image fusion have been published with an emphasis on improving fusion quality and finding more application areas. As a case in point, Simone et al. describe three typical applications of data fusion in remote sensing: obtaining elevation maps from synthetic aperture radar (SAR) interferometers, the fusion of multi-sensor and multi-temporal images, and the fusion of multi-frequency, multi-polarization and multi-resolution SAR images [5]. Quite a few survey papers have been published recently, providing overviews of the history, developments, and current state of the art of image fusion in image-based application fields [6][7][8], but recent developments of multi-sensor data fusion in remote sensing have not been discussed in detail (Table 1).

Categorization of image fusion techniques
During the past two decades, several fusion techniques have been proposed. Most of these techniques are based on a compromise between the desired spatial enhancement and spectral consistency. Among the hundreds of variations of image fusion techniques, the widely used methods include, but are not limited to, intensity-hue-saturation (IHS), high-pass filtering, principal component analysis (PCA), different arithmetic combinations (e.g. Brovey transform), multi-resolution analysis-based methods (e.g. pyramid algorithm, wavelet transform), and Artificial Neural Networks (ANNs). The chapter provides a general introduction to these selected methods with emphasis on new advances in the remote sensing field. In general, all of the above-mentioned approaches can be divided into four different types: signal level, pixel level, feature level, and decision level image fusion [4].

1. Signal level fusion. In signal-based fusion, signals from different sensors are combined to create a new signal with a better signal-to-noise ratio than the original signals.

2. Pixel level fusion. Pixel-based fusion is performed on a pixel-by-pixel basis. It generates a fused image in which the information associated with each pixel is determined from a set of pixels in the source images, to improve the performance of image processing tasks such as segmentation.

3. Feature level fusion. Feature-based fusion requires the extraction of objects recognized in the various data sources. It requires the extraction of salient features that depend on their environment, such as pixel intensities, edges or textures. Similar features from the input images are then fused.

4. Decision level fusion. Decision-level fusion merges information at a higher level of abstraction, combining the results from multiple algorithms to yield a final fused decision. Input images are processed individually for information extraction. The obtained information is then combined by applying decision rules to reinforce common interpretation.
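To make the decision-level case concrete, the following is a minimal sketch in which each source image has already been classified independently and the per-pixel labels are merged by majority vote. Majority voting is only one possible decision rule and is an assumption here, as are the function name `majority_vote` and the integer label encoding; the chapter does not prescribe a specific rule.

```python
import numpy as np

def majority_vote(label_maps):
    """Fuse per-source class-label maps by per-pixel majority vote.

    label_maps: sequence of integer arrays of identical shape (H, W),
    one per independently classified source (hypothetical inputs).
    """
    stack = np.stack(label_maps, axis=0)              # (n_sources, H, W)
    n_classes = int(stack.max()) + 1
    # Count, for every pixel, how many sources voted for each class.
    votes = np.stack([(stack == c).sum(axis=0) for c in range(n_classes)],
                     axis=0)                          # (n_classes, H, W)
    # argmax returns the first maximum, so ties go to the lowest class id.
    return votes.argmax(axis=0)

maps = [np.array([[0, 1], [2, 2]]),
        np.array([[0, 1], [1, 2]]),
        np.array([[1, 1], [2, 0]])]
fused_labels = majority_vote(maps)
```

More elaborate rules (weighted voting, fuzzy rules, or evidence combination) fit the same pattern: independent classification first, rule-based merging second.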

Conventional image fusion methods
The PCA transform converts inter-correlated multispectral (MS) bands into a new set of uncorrelated components. In this approach, the principal components of the MS image bands are computed first. The first principal component, which contains most of the information in the image, is then replaced by the panchromatic image, and the inverse transform is applied. The conventional fusion algorithms (IHS, Brovey transform and PCA) have been widely used as relatively simple and time-efficient fusion schemes. However, several problems must be considered before applying them: 1) These fusion algorithms generate a fused image from a set of pixels in the various sources. Such pixel-level fusion methods are very sensitive to registration accuracy, so co-registration of the input images at sub-pixel level is required; 2) One of the main limitations of the IHS and Brovey transforms is that no more than three multispectral input bands can be used at a time; 3) These image fusion methods are often successful at improving the spatial resolution, but they tend to distort the original spectral signatures to some extent [14,15]. More recently, new techniques such as the wavelet transform appear to reduce the color distortion problem and to keep the statistical parameters unchanged.
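The PCA substitution described above can be sketched in a few lines of NumPy. This is an illustrative sketch only: the function name `pca_fusion`, the array layout, and the mean/standard-deviation matching of the panchromatic band to the first component are assumptions, and the MS cube is assumed to be already co-registered and resampled to the panchromatic grid.

```python
import numpy as np

def pca_fusion(ms, pan):
    """PC1-substitution fusion (sketch).

    ms  : array (bands, H, W), bands >= 2, resampled to the pan grid.
    pan : array (H, W).
    """
    bands, h, w = ms.shape
    x = ms.reshape(bands, -1).astype(float)
    mean = x.mean(axis=1, keepdims=True)
    xc = x - mean

    # Principal components from the band covariance matrix,
    # sorted by decreasing eigenvalue.
    eigvals, eigvecs = np.linalg.eigh(np.cov(xc))
    eigvecs = eigvecs[:, np.argsort(eigvals)[::-1]]
    pcs = eigvecs.T @ xc                      # (bands, H*W)

    p = pan.reshape(-1).astype(float)
    p = p - p.mean()
    # Eigenvector signs are arbitrary: orient PC1 so it correlates
    # positively with the panchromatic band.
    if np.dot(pcs[0], p) < 0:
        eigvecs[:, 0] *= -1
        pcs[0] *= -1

    # Match pan to the mean and standard deviation of PC1, then substitute.
    pcs[0] = p / (p.std() + 1e-12) * pcs[0].std() + pcs[0].mean()

    # Inverse PCA transform back to band space.
    return (eigvecs @ pcs + mean).reshape(bands, h, w)

rng = np.random.default_rng(0)
base = rng.random((4, 4))
ms_demo = np.stack([1.0 * base, 2.0 * base, 0.5 * base])
fused_demo = pca_fusion(ms_demo, base)
```

In the toy call the three bands are scaled copies of one image and the panchromatic band equals that image, so the substitution should leave the MS data essentially unchanged; with real data the spatial detail of the panchromatic band is injected through PC1.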

Multi-resolution analysis-based methods
Multi-resolution or multi-scale methods, such as pyramid transformation, have been adopted for data fusion since the early 1980s [16]. Pyramid-based image fusion methods, including the Laplacian pyramid transform, were all developed from the Gaussian pyramid transform and have been modified and widely used [17,18].
In 1989, Mallat placed the methods of wavelet construction within the framework of functional analysis and described the fast wavelet transform algorithm and a general method of constructing a wavelet orthonormal basis. On this basis, the wavelet transform could be applied in practice to image decomposition and reconstruction [19,20]. Wavelet transforms provide a framework in which an image is decomposed, with each level corresponding to a coarser resolution band. For example, when fusing an MS image with a high-resolution PAN image by wavelet fusion, the PAN image is first decomposed into a set of low-resolution PAN images with corresponding wavelet coefficients (spatial details) for each level. Individual bands of the MS image then replace the low-resolution PAN image at the resolution level of the original MS image. The high-resolution spatial detail is injected into each MS band by performing an inverse wavelet transform on each MS band together with the corresponding wavelet coefficients (Figure 1).

Figure 1. Generic flowchart of wavelet-based image fusion
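The substitution scheme in Figure 1 can be illustrated with a single-level 2-D Haar transform written directly in NumPy. This is a sketch under stated assumptions: a resolution ratio of exactly 2 between PAN and MS, even image dimensions, an averaging (unnormalized) Haar filter so that the approximation band is on the same radiometric scale as the MS band, and hypothetical function names. Practical schemes typically use more levels and other wavelets.

```python
import numpy as np

def haar2d(img):
    """One level of a 2-D averaging Haar transform (even-sized input)."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]
    c, d = img[1::2, 0::2], img[1::2, 1::2]
    ll = (a + b + c + d) / 4.0          # low-resolution approximation
    lh = (a - b + c - d) / 4.0          # detail across columns
    hl = (a + b - c - d) / 4.0          # detail across rows
    hh = (a - b - c + d) / 4.0          # diagonal detail
    return ll, (lh, hl, hh)

def ihaar2d(ll, details):
    """Inverse of haar2d."""
    lh, hl, hh = details
    out = np.empty((ll.shape[0] * 2, ll.shape[1] * 2))
    out[0::2, 0::2] = ll + lh + hl + hh
    out[0::2, 1::2] = ll - lh + hl - hh
    out[1::2, 0::2] = ll + lh - hl - hh
    out[1::2, 1::2] = ll - lh - hl + hh
    return out

def wavelet_fuse_band(ms_band, pan):
    """Replace the PAN approximation by one MS band, keep the PAN details."""
    _, details = haar2d(pan.astype(float))
    return ihaar2d(ms_band.astype(float), details)

rng = np.random.default_rng(1)
pan_demo = rng.random((4, 4))
ms_band_demo = rng.random((2, 2))
fused_band = wavelet_fuse_band(ms_band_demo, pan_demo)
```

With this filter the 2 × 2 block means of the fused band equal the original MS values, which is one way to see why the spectral content is largely preserved while PAN detail is added.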
In wavelet-based fusion schemes, detail information is extracted from the PAN image using wavelet transforms and injected into the MS image. Distortion of the spectral information is minimized compared to the standard methods. To achieve optimum fusion results, many researchers have tested various wavelet-based fusion schemes, and several new concepts and algorithms have been presented and discussed. Candes provided a method for fusing SAR and visible MS images using the Curvelet transformation; the method proved more efficient than the wavelet transformation for detecting edge information and for denoising [21]. Curvelet-based image fusion has also been used to merge Landsat ETM+ panchromatic and multispectral images, simultaneously providing richer information in the spatial and spectral domains [22]. Donoho et al. presented the Contourlet transform, a flexible multi-resolution, local, and directional image expansion using contour segments, to address the inability of the wavelet transform to represent linear and curved singularities efficiently in image processing [23]. The Contourlet transform provides a flexible number of directions and captures the intrinsic geometrical structure of images.
In general, as a typical feature level fusion method, wavelet-based fusion clearly performs better than conventional methods in terms of minimizing color distortion and of denoising. It has been one of the most popular fusion methods in remote sensing in recent years and has become a standard module in many commercial image processing software packages, such as ENVI, PCI and ERDAS. Its problems and limitations include: (1) higher computational complexity than the standard methods; (2) the spectral content of small objects is often lost in the fused images; (3) the user is often required to determine appropriate values for certain parameters (such as thresholds). The development of more sophisticated wavelet-based fusion algorithms (such as the Ridgelet, Curvelet, and Contourlet transformations) could improve performance, but these new schemes may increase the complexity of computation and parameter setting.

Artificial neural network based fusion method
Artificial neural networks (ANNs) have proven to be a more powerful and self-adaptive method of pattern recognition than traditional linear and simple nonlinear analyses [24]. The ANN-based method employs a nonlinear response function that iterates many times in a special network structure in order to learn the complex functional relationship between input and output training data. The general schematic diagram of the ANN-based image fusion method can be seen in Figure 2. The input layer has several neurons, which represent the feature factors extracted and normalized from image A and image B. The function of each neuron is a sigmoid function given by [25]:

$$f(x) = \frac{1}{1 + e^{-x}}$$
In Figure 2, the hidden layer has several neurons and the output layer has one or more neurons. The ith neuron of the input layer connects to the jth neuron of the hidden layer with weight $W_{ij}$, and the weight between the jth neuron of the hidden layer and the tth neuron of the output layer is $V_{jt}$ (in this case t = 1). The weighting function is used to simulate and recognize the response relationship between the features of the fused image and the corresponding features of the original images (image A and image B). The ANN model is given as follows:

$$Y = \frac{1}{1 + \exp\left[-\left(\sum_{j=1}^{q} V_j H_j - c\right)\right]} \qquad (6)$$

In equation (6), $Y$ = pixel value of the fused image exported from the neural network model, $q$ = number of hidden nodes ($q \approx 8$ here), $V_j$ = weight between the jth hidden node and the output node (in this case, there is only one output node), $c$ = threshold of the output node, and $H_j$ = exported value from the jth hidden node:

$$H_j = \frac{1}{1 + \exp\left[-\left(\sum_{i=1}^{n} W_{ij} a_i - h_j\right)\right]}$$

where $W_{ij}$ = weight between the ith input node and the jth hidden node, $a_i$ = value of the ith input factor, $n$ = number of input nodes ($n \approx 5$ here), and $h_j$ = threshold of the jth hidden node.
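The forward pass of this three-layer model can be sketched as follows. The sketch assumes the standard logistic sigmoid with each node's threshold subtracted from its weighted input sum; the function names and the array layout (W as an n × q matrix, V as a length-q vector) are illustrative choices, and training the weights is outside its scope.

```python
import numpy as np

def sigmoid(x):
    """Logistic sigmoid applied element-wise."""
    return 1.0 / (1.0 + np.exp(-x))

def ann_forward(a, W, h, V, c):
    """Forward pass of a single-output, one-hidden-layer network.

    a : (n,)   normalized input feature factors
    W : (n, q) input-to-hidden weights W_ij
    h : (q,)   hidden-node thresholds h_j
    V : (q,)   hidden-to-output weights V_j
    c : float  output-node threshold
    """
    H = sigmoid(a @ W - h)        # hidden-node outputs H_j
    return sigmoid(H @ V - c)     # fused output Y in (0, 1)

n, q = 5, 8                        # sizes quoted in the text
y_zero = ann_forward(np.zeros(n), np.zeros((n, q)), np.zeros(q),
                     np.zeros(q), 0.0)
y_balanced = ann_forward(np.zeros(n), np.zeros((n, q)), np.zeros(q),
                         np.ones(q), 4.0)
```

Because the output is confined to (0, 1), the target pixel values would need to be normalized to that range before training and rescaled afterwards.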
As the first step of ANN-based data fusion, the two registered images are decomposed into several blocks of size M × N (Figure 2). Then, features of the corresponding blocks in the two original images are extracted, and the normalized feature vector fed to the neural network is constructed. The features used here to evaluate the fusion effect are normally spatial frequency, visibility, and edge. The next step is to select some vector samples to train the neural network. An ANN is a universal function approximator that directly adapts to any nonlinear function defined by a representative set of training data. Once trained, the ANN model can remember a functional relationship and be used for further calculations. For these reasons, the ANN concept has been adopted to develop strongly nonlinear models for multiple-sensor data fusion. Thomas et al. discussed the optimal fusion method of TV and infrared images using artificial neural networks [26]. Since then, many neural network models have been proposed for image fusion, such as BP, SOFM, and ARTMAP neural networks. The BP algorithm has been used most often. However, the convergence of BP networks is slow and the global minimum of the error space may not always be reached [27]. As an unsupervised network, the SOFM network clusters input samples through competitive learning, but the number of output neurons must be set before the neural network model is constructed [28]. An RBF neural network can approximate the objective function to any level of precision if enough hidden units are provided. The advantages of RBF network training include no iteration, few training parameters, high training speed, simple processing and memory functions [29]. Hong explored the use of RBF neural networks combined with the nearest neighbor clustering method for clustering, with membership weighting used for fusion. Experiments show that this method achieves a better cluster fusion effect when the width parameter is chosen properly [30].
Gail et al. used Adaptive Resonance Theory (ART) neural networks to form a new framework for self-organizing information fusion. The ARTMAP neural network can act as a self-organizing expert system to derive hierarchical knowledge structures from inconsistent training data [31]. ARTMAP information fusion resolves apparent contradictions in input pixel labels by assigning output classes to levels in a knowledge hierarchy. Wang et al. presented a feature-level image fusion method based on segmentation regions and neural networks. The results indicated that this combined fusion scheme was more efficient than traditional methods [32].
The ANN-based fusion method exploits the pattern recognition capabilities of artificial neural networks, while the learning capability of neural networks makes it feasible to customize the image fusion process. Many applications have indicated that ANN-based fusion methods have advantages over traditional statistical methods, especially when the input multiple-sensor data are incomplete or very noisy. Because of its self-learning character, the ANN often serves as an efficient decision level fusion tool, especially in land use/land cover classification. In addition, its multiple-input, multiple-output framework makes it a possible approach for fusing high-dimensional data, such as long-term time-series data or hyper-spectral data.
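As an illustration of the block-wise feature extraction step, the sketch below computes spatial frequency for each M × N block of an image. The definition used (root of the summed mean squared row and column first differences) is a commonly used one and is an assumption here, since the chapter does not give a formula; `spatial_frequency` and `block_features` are hypothetical helper names.

```python
import numpy as np

def spatial_frequency(block):
    """Spatial frequency of one block: sqrt(RF^2 + CF^2)."""
    f = block.astype(float)
    m, n = f.shape
    rf2 = np.sum((f[:, 1:] - f[:, :-1]) ** 2) / (m * n)   # row frequency^2
    cf2 = np.sum((f[1:, :] - f[:-1, :]) ** 2) / (m * n)   # column frequency^2
    return np.sqrt(rf2 + cf2)

def block_features(img, m, n):
    """Spatial frequency of every non-overlapping m x n block."""
    rows, cols = img.shape[0] // m, img.shape[1] // n
    return np.array([[spatial_frequency(img[i*m:(i+1)*m, j*n:(j+1)*n])
                      for j in range(cols)] for i in range(rows)])

flat = np.ones((4, 4))
stripes = np.tile(np.array([[0.0, 1.0]]), (2, 1))   # [[0, 1], [0, 1]]
sf_flat = spatial_frequency(flat)
sf_stripes = spatial_frequency(stripes)
sf_map = block_features(np.arange(16.0).reshape(4, 4), 2, 2)
```

A sharper (better focused or more detailed) block yields a larger value, which is why such a feature can indicate which source image should dominate in a given block.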

Dempster-Shafer evidence theory based fusion method
Dempster-Shafer decision theory is considered a generalized Bayesian theory, used when the data contributing to the analysis of the images are subject to uncertainty. It allows support to be distributed not only to a proposition itself but also to the union of propositions that include it. Huadong Wu et al. presented a system framework that manages information overlap and resolves conflicts, and that provides generalizable architectural support to facilitate sensor fusion [33].
Compared with Bayesian theory, the Dempster-Shafer theory of evidence is closer to human perception and reasoning processes. Its capability to assign uncertainty or ignorance to propositions is a powerful tool for dealing with a large range of problems that would otherwise seem intractable [33]. The Dempster-Shafer theory of evidence has been applied to image fusion using SPOT/HRV images and NOAA/AVHRR series. The results show unambiguously the major improvement brought by such data fusion, and the performance of the proposed method [34]. H. Borotschnig et al. compared three frameworks for information fusion and view-planning using different uncertainty calculi: probability theory, possibility theory and the Dempster-Shafer theory of evidence [35]. The results indicated that the sensor fusion method based on Dempster-Shafer decision theory achieves a much higher performance improvement, and that it provides estimates of the imprecision and uncertainty of the information derived from different sources.
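The way evidence from two sources is merged can be shown with Dempster's rule of combination. The sketch below is a generic illustration, not the implementation used in the cited studies; the class names "forest" and "urban" and the mass values are made up for the example. Mass assigned to the whole frame expresses ignorance, which is the feature that distinguishes the approach from a purely Bayesian assignment.

```python
from itertools import product

def dempster_combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    m1, m2: dicts mapping frozenset hypotheses to masses summing to 1.
    """
    combined, conflict = {}, 0.0
    for (a, ma), (b, mb) in product(m1.items(), m2.items()):
        inter = a & b
        if inter:
            combined[inter] = combined.get(inter, 0.0) + ma * mb
        else:
            conflict += ma * mb            # mass falling on the empty set
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict")
    # Renormalize by the non-conflicting mass.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

F, U = frozenset({"forest"}), frozenset({"urban"})
FU = F | U                                   # "forest or urban" (ignorance)
sensor_a = {F: 0.6, FU: 0.4}                 # hypothetical masses
sensor_b = {U: 0.5, FU: 0.5}
fused_mass = dempster_combine(sensor_a, sensor_b)
```

In this toy case 0.3 of the joint mass is conflicting and is removed by the normalization; the remaining mass is shared among "forest", "urban" and the undecided set.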

Applications of image fusion
Image fusion has been widely used in many fields of remote sensing, such as object identification, classification, and change detection. The following paragraphs describe recent achievements of image fusion in more detail.

Object identification
The feature enhancement capability of image fusion is visually apparent in VIR/VIR combinations, which often result in images that are superior to the original data. Fused images can yield useful products that maximize the amount of information extracted from satellite image data [3]. An integrated system for automatic road mapping from high-resolution multispectral satellite imagery by information fusion was discussed by Jin et al. in 2005 [36]. Garzelli presented a solution for enhancing the spatial resolution of MS images with high-resolution PAN data. The proposed method exploits the undecimated discrete wavelet transform and the vector multi-scale Kalman filter, which is used to model the injection process of wavelet details. Fusion simulations on spatially degraded data and fusion tests at full scale reveal that the proposed method achieves accurate and reliable PAN-sharpening [37]. A case study that extracted artificial forest and residential areas using high spatial resolution and multispectral images is described below.
Forest classification and mapping provide an important basis for forest monitoring and ecological protection. Methods based on single pixels or only on spectral features cannot effectively distinguish forest types. Here we present an approach for extracting artificial forest areas using the SPOT-5 panchromatic band and multispectral images in Naban River National Nature Reserve, which is located in Jing Hong City, Yunnan province, South China. The resolution of the panchromatic band of the SPOT-5 image is 2.5 m and that of the multispectral bands is 10 m. The pansharpening fusion method is first used to fuse the panchromatic and multispectral SPOT-5 image data. Next, histogram equalization, median filtering and PCA are used for spectral enhancement and denoising of the image, so as to improve the multi-scale image segmentation. Compared with the original spectral data, the image textures of artificial forest after this pre-processing are regularly arranged and their visual texture features are very obvious, while the particle size information of natural forests is significant, so that forest classification can be achieved easily (Figure 3).

Land use and land cover classification
Classification of land use and land cover is one of the key tasks of remote sensing applications. The classification accuracy of remote sensing images improves when image data from multiple sources are introduced into the processing [3]. Images from microwave and optical sensors offer complementary information that helps in discriminating the different classes. As discussed in the work of Wu et al., a multi-sensor decision level image fusion algorithm based on fuzzy theory is used to classify each sensor image, and the classification results are fused by the fusion rule. Interesting results were achieved, mainly in terms of high-speed classification and efficient fusion of complementary information [38]. Land use/land cover classification has been improved using data fusion techniques such as ANN and the Dempster-Shafer theory of evidence. The experimental results show excellent classification performance compared to existing classification techniques [39,40]. Image fusion methods will lead to strong advances in land use/land cover classification by exploiting the complementarity of data offering either high spatial resolution or high temporal repetitiveness. Results indicated that the accuracy of the residential areas of Yiwu city derived from the fused image is much higher than that derived from the CBERS multispectral image (Figure 5).

Change detection
Change detection is the process of identifying differences in the state of an object or phenomenon by observing it at different times. Change detection is an important process in monitoring and managing natural resources and urban development because it provides quantitative analysis of the spatial distribution of the population of interest [41]. Image fusion for change detection takes advantage of the different configurations of the platforms carrying the sensors. Combining temporal images of the same place enhances information on changes that might have occurred in the observed area. Sensor image data with low temporal resolution and high spatial resolution can be fused with high temporal resolution data to enhance the change information of certain ground objects. Madhavan et al. presented a decision level fusion system that automatically fuses information from multispectral, multi-resolution, and multi-temporal high-resolution airborne data for change-detection analysis. Changes are automatically detected in buildings, building structures, roofs, roof color, industrial structures, smaller vehicles, and vegetation [42]. An example of change detection using Landsat ETM+ and MODIS data is presented as follows.
Recent studies indicate that urban expansion can be monitored efficiently using satellite images with multi-temporal and multi-spatial resolution. For example, a Landsat ETM+ panchromatic image (Figure 6 a) of Chongqing City, Southwest China, acquired in 2000 with a spatial resolution of 10 m, was fused with daily-received multispectral bands of MODIS data (spatial resolution: 250 m) (Figure 6 b) acquired in 2006.
The Brovey transformation fusion method was used:

$$DN_{fused,i} = \frac{DN_{bi}}{DN_{b1} + DN_{b2} + DN_{b3}} \times DN_{pan}, \qquad i = 1, 2, 3$$

where $DN_{fused,i}$ is the digital number (DN) of the resulting fused image produced from the input data in the three MODIS multispectral bands ($DN_{b1}$, $DN_{b2}$, $DN_{b3}$) multiplied by the high-resolution Landsat ETM+ Pan band ($DN_{pan}$).
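A minimal NumPy sketch of the Brovey transform follows, assuming its standard form: each multispectral band is divided by the sum of the three bands and multiplied by the panchromatic band. The function name `brovey`, the array layout, and the small `eps` guard against division by zero are illustrative assumptions, and the multispectral bands are assumed to be resampled to the panchromatic grid beforehand.

```python
import numpy as np

def brovey(ms, pan, eps=1e-12):
    """Brovey transform fusion (sketch).

    ms  : array (3, H, W), three multispectral bands on the pan grid.
    pan : array (H, W), high-resolution panchromatic band.
    """
    ms = ms.astype(float)
    total = ms.sum(axis=0)                 # DN_b1 + DN_b2 + DN_b3 per pixel
    # Band ratios keep relative spectral shares; pan supplies spatial detail.
    return ms / (total + eps) * pan

ms_px = np.array([[[1.0]], [[2.0]], [[1.0]]])   # one pixel, three bands
pan_px = np.array([[8.0]])
fused_px = brovey(ms_px, pan_px)
```

For the single example pixel the band shares are 1/4, 1/2 and 1/4, so the fused values sum to the panchromatic value; this also shows why the method changes absolute radiometry and can distort spectral signatures.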
Building areas that remained unchanged from 2000 to 2006 appear in grey-pink, while newly established buildings appear in dark red in the composite image (Figure 7) and can be easily identified. In recent years, object-oriented processing techniques have become more popular than traditional pixel-based image analysis; object-oriented change information is necessary in decision support systems and uncertainty management strategies. An in-depth paper by Ruvimbo et al. introduced the concept and applications of object-oriented change detection for urban areas [43]. In general, owing to the extensive statistical and derived information available with the object-oriented approach, a number of change images can be presented depending on the research objectives. In land use and land cover analysis, this level of precision is valuable because analysis at the object level enables linkage with other GIS databases or derived socio-economic attributes.

Discussion and conclusions
Multi-sensor image fusion seeks to combine information from different images to obtain more inferences than can be derived from a single sensor. It is widely recognized as an efficient tool for improving overall performance in image-based applications. This chapter provides an overview of the state of the art of multi-sensor image fusion in the field of remote sensing. Some emerging challenges and recommendations are given below.

Improvements of fusion algorithms
Among the hundreds of variations of image fusion techniques, the most widely used methods include IHS, PCA, Brovey transform, wavelet transform, and Artificial Neural Network (ANN). For methods like IHS, PCA and Brovey transform, which have lower complexity and faster processing time, the most significant problem is color distortion. Wavelet-based schemes perform better than those methods in terms of minimizing color distortion. The development of more sophisticated wavelet-based fusion algorithms (such as the Ridgelet, Curvelet, and Contourlet transformations) could clearly improve performance, but they often increase the complexity of computation and parameter setting. Another challenge for existing fusion techniques is the ability to process hyper-spectral satellite sensor data. Artificial neural networks seem to be one possible approach for handling the high-dimensional nature of hyper-spectral satellite sensor data.

From image fusion to multiple algorithm fusion
Each fusion method has its own set of advantages and limitations. Combining several different fusion schemes has proved to be a useful strategy that may achieve better-quality results. As a case in point, quite a few researchers have focused on incorporating the traditional IHS method into wavelet transforms, since the IHS fusion method performs well spatially while the wavelet methods perform well spectrally. However, the selection and arrangement of the candidate fusion schemes are quite arbitrary and often depend upon the user's experience. An optimal strategy for combining different fusion algorithms, in other words an 'algorithm fusion' strategy, is thus urgently needed. Further investigations are necessary on the following aspects: 1) Design of a general framework for combining different fusion approaches; 2) Development of new approaches that can combine aspects of pixel/feature/decision level image fusion; 3) Establishment of an automatic quality assessment method for evaluating fusion results.

Establishment of an automatic quality assessment scheme
Automatic quality assessment is highly desirable to evaluate the possible benefits of fusion, to determine an optimal setting of parameters for a certain fusion scheme, and to compare results obtained with different algorithms. Mathematical methods have been used to judge the quality of merged imagery with respect to the improvement of spatial resolution while preserving the spectral content of the data. Statistical indices, such as cross entropy, mean square error and signal-to-noise ratio, have been used for evaluation purposes. Although a few image fusion quality measures have been proposed recently, analytical studies of these measures have been lacking. The work of Chen et al. focused on one popular mutual information-based quality measure and weighted averaging image fusion [44]. Zhao presented a new metric based on image phase congruency to assess the performance of image fusion algorithms [45]. However, in general, no automatic solution has been achieved that consistently produces high quality fusion for different data sets. It is expected that the result of fusing data from multiple independent sensors will offer the potential for better performance than can be achieved by either sensor, and will reduce vulnerability to sensor specific counter-