Illumination invariants based on Markov random fields

We propose textural features that are invariant to the illumination spectrum and extremely robust to the illumination direction. They require only a single training image per texture and no knowledge of the illumination direction or spectrum. Hence, these features are suitable for content-based image retrieval (CBIR) of realistic scenes with colour-textured objects and variable illumination. The illumination invariants are derived from Markov random field based texture representations. Our illumination invariant features compare favourably with features frequently used in this area: the local binary patterns, steerable pyramid, and Gabor textural features. The superiority of the new invariant features is demonstrated in the illumination invariant recognition of the most advanced representation of realistic real-world materials, bidirectional texture function (BTF) textures.


Introduction
Textures are important clues for identifying objects present in a visual scene. Unfortunately, the appearance of natural textures is highly illumination and view angle dependent. As a consequence, most recent realistic texture-based classification or segmentation methods require multiple training images [18] captured under all possible illumination and viewing conditions for each class. Such learning is obviously clumsy, probably expensive, and very often even impossible if the required measurements are not available.
The authors of [2] allow a single training image per class, but they require uniform-albedo surfaces and knowledge of the illumination direction. The normalisation cancelling lighting variations caused by the object geometry [4] completely wipes out rough texture structures with all their valuable discriminative information. It also suffers from instability due to the nonlinear transformations involved. The above-mentioned drawbacks are inevitable, because there is no discriminative function for grey images of objects with Lambertian reflectance that is invariant to illumination direction [1]. The quasi-invariants [19] compromise full invariance to be less noise sensitive. Local Binary Patterns (LBP) [13] are popular illumination invariant features, but are very noise sensitive [17]. Other options are the parameters of the Weibull distribution of image edges [6] (six-stimulus theory) and the logarithm of Gabor filter responses combined with the new Gaussian colour model [5], which is a subclass of the illumination model we use.
We introduce textural features that are robust to illumination direction changes. This property is verified on the University of Bonn BTF texture measurements [12], where the illumination sources span 75% of the possible illumination half-sphere. We require only a single training image per texture and no knowledge of the illumination direction. Our features are also invariant to illumination brightness and spectrum changes, and robust to Gaussian noise degradation [17]. In contrast with the similar test setup of [9], we employ newly derived illumination invariants, which yield a significant improvement even in the more difficult arrangement (more than twice as many textures, half the resolution, and different viewpoints) presented hereafter.

Texture Representation
Let us assume that each multispectral texture (composed of C spectral planes) can either be modeled by a 3-dimensional Markov random field (MRF) model, or its spectral planes can be mutually decorrelated by the Karhunen-Loeve transformation (Principal Component Analysis) and subsequently modeled using a set of C 2-dimensional MRF models. The texture is factorised into K levels of the Gaussian pyramid, so that lower-order MRF factor models can be used.
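As a sketch of this representation pipeline in plain NumPy (the 2×2-averaging pyramid below is a simplified stand-in for a proper Gaussian-kernel pyramid, not the exact construction used in the paper):

```python
import numpy as np

def kl_decorrelate(texture):
    """Decorrelate the C spectral planes of an H x W x C texture by
    projecting pixels onto the eigenvectors of the spectral covariance
    matrix (Karhunen-Loeve transform / PCA over spectral planes)."""
    h, w, c = texture.shape
    x = texture.reshape(-1, c).astype(np.float64)
    x -= x.mean(axis=0)                       # centre each spectral plane
    cov = np.cov(x, rowvar=False)             # C x C spectral covariance
    _, vecs = np.linalg.eigh(cov)             # orthonormal eigenvectors
    return (x @ vecs).reshape(h, w, c)        # mutually decorrelated planes

def gaussian_pyramid(plane, levels=4):
    """Factorise one plane into K pyramid levels by 2x2 averaging and
    downsampling (simplified stand-in for a Gaussian-kernel pyramid)."""
    pyr = [plane]
    for _ in range(levels - 1):
        p = pyr[-1]
        h, w = (p.shape[0] // 2) * 2, (p.shape[1] // 2) * 2
        pyr.append(p[:h, :w].reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3)))
    return pyr
```

Each decorrelated plane of each pyramid level would then be handed to a 2D MRF factor model.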
Let us denote a multiindex r = (r_1, r_2), where r_1 is the row and r_2 the column index, respectively.

MRF Models
A multispectral texture factor (for the k-th Gaussian pyramid level) is represented using either an adaptive 3D / 2D causal autoregressive random (CAR) field model [8] or a 2D Gaussian Markov random field (GMRF) model [7]. All these models can be unified in the following matrix equation form:

Y_r = γ Z_r + ε_r ,   (1)

where Z_r = [Y_{r−s}^T : ∀s ∈ I_r]^T is the Cη × 1 data vector with multiindices r, s, t, and γ = [A_1, . . ., A_η] is the C × Cη unknown parameter matrix with submatrices A_s. In the case of C 2D CAR / GMRF models stacked into the model equation (1), the parameter matrices A_s are diagonal, while for general 3D CAR models they are full matrices. The contextual neighbour index shift set is denoted I_r and η = cardinality(I_r). The GMRF and CAR models mutually differ in the correlation structure of the driving noise ε_r in (1) and in the topology of the contextual neighbourhood I_r (see [7] for details). As a consequence, all CAR model statistics can be estimated analytically and efficiently [8], while the GMRF statistics estimates require either numerical evaluation or some approximation. Given the known CAR process history Y^{(t−1)} = {Y_{t−1}, Y_{t−2}, . . ., Y_1, Z_t, Z_{t−1}, . . ., Z_1}, the parameter estimate γ̂ can be computed using fast, numerically robust and recursive statistics [8]:

γ̂_{t−1}^T = V_{zz(t−1)}^{−1} V_{zy(t−1)} ,   (2)

V_{t−1} = Ṽ_{t−1} + V_0 ,   Ṽ_{t−1} = [ Ṽ_{yy(t−1)}  Ṽ_{zy(t−1)}^T ; Ṽ_{zy(t−1)}  Ṽ_{zz(t−1)} ] ,

where V_0 is a positive definite matrix (see [8]).
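A minimal single-plane sketch of the CAR estimation, using batch least squares on the normal-equation statistics V_zz, V_zy rather than the recursive update (the causal shift set I_r here is an illustrative three-neighbour choice, not the neighbourhood used in the experiments):

```python
import numpy as np

def car_fit(plane, shifts=((0, 1), (1, 0), (1, 1))):
    """Least-squares fit of a single-plane 2D CAR model
    Y_r = gamma Z_r + e_r, where Z_r stacks the pixels at the causal
    shifts s in I_r. Batch form of gamma^T = V_zz^{-1} V_zy."""
    h, w = plane.shape
    m = max(s[0] for s in shifts)
    n = max(abs(s[1]) for s in shifts)   # margin guards all column shifts
    y, z = [], []
    for r1 in range(m, h):
        for r2 in range(n, w - n):
            y.append(plane[r1, r2])
            z.append([plane[r1 - s1, r2 - s2] for s1, s2 in shifts])
    y, z = np.array(y), np.array(z)
    vzz = z.T @ z                        # V_zz statistic
    vzy = z.T @ y                        # V_zy statistic
    gamma = np.linalg.solve(vzz, vzy)    # parameter estimate
    resid = y - z @ gamma
    return gamma, resid.var()            # parameters and noise variance
```

On a synthetic CAR texture with known coefficients, the estimate converges to the generating parameters as the image grows.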

Illumination Invariant Features
We assume fixed positions of the viewpoint and illumination sources, and the illumination sources are supposed to be far enough away to produce uniform illumination. Following [3] and assuming Lambertian surface reflectance, two images Ỹ, Y acquired with different illumination spectra can be transformed to each other:

Ỹ_r = B Y_r ,   (3)

where Ỹ_r, Y_r are multispectral pixel values at position r and B is a C × C transformation matrix. The linear formula (3) is also valid for several illumination sources, provided that the spectra of all sources are the same and the positions of the illumination sources remain unchanged. More importantly, it can be proved that formula (3) is able to model a specular reflectance component (e.g. the dichromatic reflection model [15], which comprises the well-known Phong model).
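A quick numerical check of why MRF parameter statistics yield illumination invariants under the linear model (3): substituting the transformed pixels B Y_r into the CAR equation maps each submatrix A_s to the similar matrix B A_s B^{-1}, whose trace and eigenvalues are unchanged (A_s and B below are arbitrary illustrative matrices):

```python
import numpy as np

rng = np.random.default_rng(0)
a_s = rng.normal(size=(3, 3)) * 0.2          # an example C x C CAR submatrix A_s
b = rng.normal(size=(3, 3)) + 2 * np.eye(3)  # an example illumination change B (eq. 3)

# Substituting Y_r -> B Y_r into Y_r = sum_s A_s Y_{r-s} + e_r gives
# B Y_r = sum_s (B A_s B^{-1}) (B Y_{r-s}) + B e_r, i.e. the re-illuminated
# texture follows a CAR model with submatrices B A_s B^{-1}.
a_new = b @ a_s @ np.linalg.inv(b)

# Similar matrices share trace and eigenvalues -> illumination invariant features.
assert np.isclose(np.trace(a_new), np.trace(a_s))
assert np.allclose(np.sort_complex(np.linalg.eigvals(a_new)),
                   np.sort_complex(np.linalg.eigvals(a_s)))
```

This similarity argument is the mechanism behind trace- and eigenvalue-based invariants; the remaining invariants combine further model statistics.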
With the previous assumptions, the following illumination invariants can be derived from the estimated MRF statistics; the first of them is the trace tr A_s of the parameter submatrices. Invariants 1, 2, 6, and 7 can be derived for GMRF models with centred Y_{r,j}, while invariants 1-5 are CAR invariants. In the case of 2D models, invariants 3-7 are computed for each spectral plane separately.
Feature vectors are formed from these illumination invariants, which are easily evaluated during the MRF parameter estimation process. The distance between the feature vectors of two textures T, S is computed using the Minkowski norms L_1 and L_0.2, or alternatively with the fuzzy contrast [14] in its symmetrical form FC_3, where M is the feature vector size and μ(f_i) and σ(f_i) are the average and standard deviation of feature f_i computed over the whole database, respectively. The sigmoid function τ models the truth value of a fuzzy predicate.
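A sketch of the Minkowski comparison of feature vectors (the fuzzy contrast FC_3 is omitted here; its exact form is given in [14]):

```python
import numpy as np

def minkowski_dist(f_t, f_s, p=1.0):
    """Minkowski dissimilarity between two feature vectors: p = 1 gives
    the L1 norm, p = 0.2 the L0.2 'norm' (not a true metric for p < 1,
    but usable as a dissimilarity measure)."""
    diff = np.abs(np.asarray(f_t, dtype=float) - np.asarray(f_s, dtype=float))
    return float(np.sum(diff ** p) ** (1.0 / p))
```

Exponents p < 1 emphasise many small coordinate differences over a few large ones, which can make the comparison more robust to single outlier features.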

Experiments
We have designed our experiments to test the method's robustness to illumination direction changes, which drastically violate our previous theoretical assumptions. The experiments are performed on three different sets of 1215 BTF texture images. These BTF data come from the University of Bonn database [12] and contain BTF colour measurements of materials such as ceiling, corduroy, two fabrics, walkway, foil, floor tile, pink tile, impalla, proposte, pulli, wallpaper, wool, and two lacquered wood textures. Each BTF material sample (Fig. 1) is measured under 81 illumination angles as an RGB image (C = 3). Each of our three test sets has a fixed viewpoint direction; the declination angles of the viewing direction from the surface normal are 0°, 30°, and 60°, and in-plane rotation is not included. All images were cropped to the same size of 256 × 256 pixels.

Compared Features
Our proposed features are compared with the most popular alternatives: the Gabor features, steerable pyramid features, and Local Binary Patterns (LBP).
The Gabor features [11], which are computed from the responses of Gabor filters, were computed separately for each spectral plane and concatenated into the feature vector. The Opponent Gabor features [10], which are an extension to colour textures, analyse the relations between spectral channels. We have also experimented with brightness normalisation prior to feature computation (denoted as "nm."). Both Gabor feature vectors are compared with the norms suggested by their authors [11,10].
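For illustration, a self-contained NumPy sketch of such Gabor features, taking the mean and standard deviation of filter-response magnitudes over a small filter bank for one spectral plane (the frequencies, orientations, and FFT-based circular convolution are illustrative choices, not the exact setup of [11]):

```python
import numpy as np

def fft_convolve(img, ker):
    """'Same'-size circular convolution via FFT; the edge wrap-around is
    acceptable when only global response statistics are collected."""
    kh, kw = ker.shape
    pad = np.zeros(img.shape, dtype=complex)
    pad[:kh, :kw] = ker
    pad = np.roll(pad, (-(kh // 2), -(kw // 2)), axis=(0, 1))  # centre at origin
    return np.fft.ifft2(np.fft.fft2(img) * np.fft.fft2(pad))

def gabor_kernel(freq, theta, sigma=3.0, size=15):
    """Complex Gabor kernel: Gaussian envelope times a complex sinusoid
    at spatial frequency `freq` and orientation `theta`."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    env = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return env * np.exp(2j * np.pi * freq * xr)

def gabor_features(plane,
                   freqs=(0.1, 0.2, 0.3),
                   thetas=(0, np.pi / 4, np.pi / 2, 3 * np.pi / 4)):
    """Mean and std of response magnitude per (freq, theta) pair; for a
    colour texture this would be run per spectral plane and concatenated."""
    feats = []
    for f in freqs:
        for t in thetas:
            resp = np.abs(fft_convolve(plane, gabor_kernel(f, t)))
            feats += [resp.mean(), resp.std()]
    return np.array(feats)
```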
The steerable pyramid features are statistics of the steerable pyramid decomposition, which was proposed for texture synthesis in [16]. The feature vectors are compared with the same norm as the Gabor features.
LBP [13] are histograms of texture micro-patterns, obtained by thresholding each pixel's neighbourhood at the centre pixel value. The histograms are subsequently compared using the Kullback-Leibler divergence, as suggested in [13]. We have tested the features LBP_{8,1+8,3} and LBP^{u2}_{16,2}, which were reported to perform best under different illuminations (Outex database), as well as the rotation invariant features LBP^{riu2}_{16,2}. The features were computed either on grey-scale images or on each spectral plane separately and concatenated into the feature vector. Brightness normalisation is not necessary.
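A minimal sketch of the basic LBP_{8,1} histogram feature, using nearest-pixel offsets on a radius-1 ring (the published variants use circularly interpolated sampling and uniform-pattern code mappings):

```python
import numpy as np

def lbp_8_1(gray):
    """Basic LBP_{8,1}: threshold each pixel's 8 neighbours at the centre
    value, read the resulting bits as a byte, and histogram the 256 codes."""
    c = gray[1:-1, 1:-1]                              # centre pixels
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]      # radius-1 ring
    code = np.zeros_like(c, dtype=np.uint16)
    for bit, (dy, dx) in enumerate(offsets):
        nb = gray[1 + dy:gray.shape[0] - 1 + dy,
                  1 + dx:gray.shape[1] - 1 + dx]      # shifted neighbour plane
        code |= (nb >= c).astype(np.uint16) << bit    # set bit where nb >= centre
    hist = np.bincount(code.ravel(), minlength=256)
    return hist / hist.sum()                          # normalised histogram feature
```

Because every neighbour is compared only against its local centre value, the codes are unchanged by any monotonic brightness change, which is why no brightness normalisation is needed.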
The proposed MRF features were computed at K = 4 levels of the Gaussian pyramid, using the sixth-order hierarchical neighbourhood, which consists of η = 14 neighbours. The sizes of the feature vectors used are listed in Tab. 1.

Results
In our experiments, a single training image per material was randomly chosen and the remaining images were classified using the nearest neighbour approach. The results in Tab. 2 are averages over 10^5 random choices of training images, and the last column contains the averages of the previous columns. We can see that by far the best performance (90.3%) was achieved with the 2D CAR-KL model and the L_1 distance. The best alternative features were the Opponent Gabor features with an average performance of 77.4%; the best of the LBP features achieved 65.6%. Although the LBP features are invariant to brightness changes, these results demonstrate their inability to handle illumination direction variations. The rotation invariant LBP features are more capable; however, rotating the illumination cannot be modeled as a simple image rotation.
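The evaluation protocol above can be sketched as follows (L_1 nearest neighbour with one randomly chosen training image per class, averaged over trials; feature extraction is abstracted away):

```python
import numpy as np

def nn_accuracy(features, labels, rng, trials=100):
    """Single-training-sample nearest-neighbour test: in each trial, pick
    one random training image per class, classify all remaining images by
    the nearest (L1) feature vector, and average accuracy over trials."""
    features = np.asarray(features, dtype=float)
    labels = np.asarray(labels)
    classes = np.unique(labels)
    acc = []
    for _ in range(trials):
        train = np.array([rng.choice(np.flatnonzero(labels == c))
                          for c in classes])                    # one image per class
        test = np.setdiff1d(np.arange(len(labels)), train)      # classify the rest
        d = np.abs(features[test][:, None, :]
                   - features[train][None, :, :]).sum(axis=2)   # L1 distances
        pred = labels[train][np.argmin(d, axis=1)]              # nearest neighbour
        acc.append(np.mean(pred == labels[test]))
    return float(np.mean(acc))
```

With well-separated per-class feature clusters this protocol yields perfect accuracy; the reported percentages measure how well each feature set separates materials under changing illumination.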