New Parallel Models for Face Recognition

Subspace methods such as principal component analysis (PCA) and linear discriminant analysis (LDA) extract features in the spatial domain, whereas transforms such as the discrete cosine transform (DCT) extract features in the frequency domain. In this paper, we present two parallel models that utilize features extracted from both the frequency and spatial domains of facial images. The two feature sets are combined under a fusion-based scheme. The FERET database is used to evaluate the performance of the proposed method. Simulation results indicate that the proposed method outperforms other traditional methods and enhances the representation of facial images with low-dimensional features.

Feature selection plays a very important role in avoiding overlapping features and information redundancy. We propose a new parallel model for face recognition that utilizes information from the frequency and spatial domains, with both feature sets processed in parallel. It is well known that an image can be analyzed in the spatial and frequency domains, and that the two domains describe the image in very different ways. The frequency domain features are extracted using the DCT, DFT and DWT. By utilizing these two very different kinds of features, better performance can be expected.

The feature fusion method suffers from the problem of high dimensionality because of the combined features, which may also contain redundant and noisy data. To solve this problem, LDA is applied to the features from the frequency and spatial domains to reduce the dimensionality and extract the most discriminant information. However, LDA has a major drawback. If the number of samples is smaller than the dimensionality of the samples, the sample scatter matrix may become singular or close to singular, leading to computational difficulty. This is called the small sample size (SSS) problem. Several variants of LDA have been developed to counter the SSS problem, such as Liu LDA (Liu et al., 1992), Chen LDA (Chen et al., 2000), D-LDA (Hu & Yang, 2001) and modified Chen LDA. These modified LDA techniques are presented and discussed.

Different variants of our parallel face recognition model, with different frequency domain transformation techniques and variants of the LDA algorithm, are proposed. The strategy for integrating the multiple features is also discussed. A weighting function is proposed to ensure that the features from the spatial and frequency domains contribute equal weight at the matching score level. The ORL and FERET face databases were chosen to evaluate the performance of our system. The results showed that our system outperformed most of the conventional methods.

Frequency domain analysis methods
Frequency domain analysis has been widely used in modern image processing. In this section, the DFT, DCT and DWT are presented.

Discrete Fourier transform
The Fourier transform is a classical frequency domain analysis method. For a 1×N input signal f(n), the DFT is defined as

$$F(k) = \sum_{n=0}^{N-1} f(n)\, e^{-j 2\pi k n / N}, \qquad k = 0, 1, \ldots, N-1 \qquad (1)$$

The 2D face image is first converted to a 1D vector f(n) by concatenating its columns, and this vector is then transformed into the frequency domain. Only low-frequency coefficients are selected because most of the signal's energy is located in the low-frequency band. In this chapter, 300 coefficients (from k=1 to k=300) are selected. Moreover, the human visual system is more sensitive to variation in the low-frequency band [10].
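The column stacking and low-frequency selection described above can be sketched in a few lines of NumPy. This is only an illustration: the function name dft_features and the random stand-in image are hypothetical, and taking the magnitude of each complex coefficient is an assumption, since the text does not say how the complex values are turned into real features.

```python
import numpy as np

def dft_features(image, num_coeffs=300):
    """Return the magnitudes of DFT coefficients k = 1 .. num_coeffs."""
    # Stack the columns of the 2D image into one 1D signal f(n).
    f = np.asarray(image, dtype=float).flatten(order="F")
    # 1D DFT of the stacked signal.
    F = np.fft.fft(f)
    # Keep only the low-frequency band k = 1 .. num_coeffs.
    # Using the magnitude is an assumption of this sketch.
    return np.abs(F[1:num_coeffs + 1])

rng = np.random.default_rng(0)
face = rng.random((32, 32))      # stand-in for a 32x32 face image
features = dft_features(face)    # 300 low-frequency features
```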

Discrete cosine transform
The DCT has several useful properties, such as decorrelation, energy compaction, separability, symmetry and orthogonality. Following the JPEG image compression standard, the image is first divided into 8×8 blocks for computational efficiency. The two-dimensional DCT (2D-DCT) is then applied independently to each block.
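As a rough illustration of the block-wise transform, the sketch below applies an orthonormal 8×8 2D-DCT to each block and keeps the DC, AC01, AC10 and AC11 coefficients that this chapter selects. The function names are hypothetical, the DCT is built directly from its cosine definition rather than taken from a library, and the image sides are assumed to be multiples of the block size.

```python
import numpy as np

def dct_matrix(n=8):
    """Orthonormal DCT-II matrix C, so that C @ block @ C.T is the 2D DCT."""
    u = np.arange(n).reshape(-1, 1)    # frequency index
    x = np.arange(n).reshape(1, -1)    # sample index
    C = np.sqrt(2.0 / n) * np.cos((2 * x + 1) * u * np.pi / (2 * n))
    C[0, :] = np.sqrt(1.0 / n)         # alpha(0) = sqrt(1/N)
    return C

def block_dct_features(image, block=8):
    """Concatenate DC, AC01, AC10, AC11 of every block into a 1D vector."""
    image = np.asarray(image, dtype=float)
    C = dct_matrix(block)
    feats = []
    # Assumes both image dimensions are multiples of `block`.
    for r in range(0, image.shape[0], block):
        for c in range(0, image.shape[1], block):
            coeffs = C @ image[r:r + block, c:c + block] @ C.T
            feats.extend([coeffs[0, 0], coeffs[0, 1],
                          coeffs[1, 0], coeffs[1, 1]])
    return np.array(feats)

flat = block_dct_features(np.ones((32, 32)))   # 16 blocks x 4 coefficients
```

For a 32×32 image this yields 16 blocks and therefore 64 coefficients, which matches the feature count reported for the DCT domain in the results.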
The DCT coefficients are scanned in a zigzag manner starting from the top-left corner of each block, as shown in Fig. 1, because DCT coefficients with large magnitude are mainly located at the upper-left corner. The first coefficient is called the DC coefficient; the remaining coefficients are referred to as AC coefficients. The frequency of the coefficients increases from left to right and from top to bottom. The DCT coefficients at the upper-left corner of each 8×8 block are selected and merged into a 1D vector. For an N×N image f(x, y), the 2D DCT is defined as

$$C(u,v) = \alpha(u)\,\alpha(v) \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} f(x,y) \cos\left[\frac{(2x+1)u\pi}{2N}\right] \cos\left[\frac{(2y+1)v\pi}{2N}\right] \qquad (2)$$

for u, v = 0, 1, 2, …, N-1, where α(u) and α(v) are defined as follows:

$$\alpha(u) = \begin{cases} \sqrt{1/N}, & u = 0 \\ \sqrt{2/N}, & u = 1, 2, \ldots, N-1 \end{cases}$$

Based on the work of (Lay and Guan, 1999) and (Tjahyadi et al., 2007), the DC, AC01, AC10 and AC11 coefficients, which are located at the top-left corner of the block, are selected because they give the best result. LDA is then applied to the selected coefficients to extract the most discriminant features for ease of computation and storage.

In the continuous wavelet transform, a is the scale, t is the time and b is the shift. For the DWT, the scale a is restricted to powers of 2 and the position b is restricted to integer multiples of the scale. The DWT is defined as in Eq. (4), where j and k are integers and φ_{j,k} are orthogonal baby wavelets defined as in Eq. (5). The baby wavelets φ_{j,k} have an associated baby scaling function defined as in Eq. (6). The scaling function can be expressed in terms of the low-pass filter coefficients h_0(n) as in Eq. (7), and the wavelet function can be expressed in terms of the high-pass filter coefficients h_1(n) as in Eq. (8). Hence, the signal f(t) can be rewritten as in Eq. (9), where cA_1(k) and cD_1(k) represent the level-1 approximation coefficients and detail coefficients respectively. Similarly, the approximation and detail coefficients can be expressed in terms of the low-pass filter coefficients h_0(n) and the high-pass filter coefficients h_1(n).

Linear discriminant analysis
As mentioned in the previous section, the feature fusion method suffers from the problem of high dimensionality. Our proposed method incorporates LDA to reduce the dimensionality of the features from the frequency and spatial domains. Conventional LDA seeks a set of projection vectors W that simultaneously maximizes the between-class scatter S_b and minimizes the within-class scatter S_w (Chen et al., 2000). The criterion for W is given in Eq. (14).

$$W = \arg\max_{W} \frac{|W^t S_b W|}{|W^t S_w W|} \qquad (14)$$

In Liu LDA, S_w in Eq. (14) is replaced with the total scatter matrix S_t, which is the sum of the within-class scatter matrix and the between-class scatter matrix. The new projection vector set is defined as in Eq. (17). The rank of S_t is defined as in Eq. (16), as shown in (Chen et al., 2000). If the rank of S_t equals n, S_t is nonsingular. Under this circumstance, the LDA criterion will be fulfilled if W^t S_w W = 0 and W^t S_b W ≠ 0. Although KM-1 > K(M-1), this does not guarantee that the rank of S_t always equals n.

In D-LDA, the within-class scatter matrix S_w is then transformed to S_ww, which is defined in Eq. (18). The projection vector that can satisfy the objective of an LDA process is the one that can maximize the between-class scatter matrix. Only the eigenvectors with the smallest eigenvalues, together with those eigenvalues, are chosen to form V_W and D_W respectively. The most discriminant vector set for D-LDA is given by Eq. (19).

Chen LDA uses a different approach to counter the problem. Chen LDA starts by calculating the projection vectors in the null space of S_w. This is done by performing singular value decomposition on S_w. The set of eigenvectors whose corresponding eigenvalues are equal to zero is then chosen to form the projection vector set. The procedure is summarized in the following steps:

Step 1. Perform the singular value decomposition of S_w. Choose the set of eigenvectors whose corresponding eigenvalues are zero to form Q.
Step 2. Compute S_bb, where S_bb = QQ^t S_b (QQ^t)^t and S_b is the between-class scatter matrix.
Step 3. Perform the singular value decomposition of S_bb. Choose the set of eigenvectors whose corresponding eigenvalues are the largest to form U. U is the most discriminant vector set for LDA.
In this chapter, the Chen LDA algorithm is modified. Instead of choosing only the eigenvectors whose corresponding eigenvalues are equal to zero in Step 1, we also include those eigenvectors whose corresponding eigenvalues are close to zero. We deduced that the most discriminant features are located not only in the null space of S_w but also along the eigenvectors whose eigenvalues are close to zero. By selecting more eigenvectors, more of the discriminant information in S_w is preserved.
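Steps 1 to 3, together with the modification just described, might be sketched as follows. Several things here are assumptions rather than the chapter's implementation: the scatter matrices are left unnormalized, np.linalg.eigh is used in place of a singular value decomposition (the two agree for symmetric positive semi-definite matrices), and the tol parameter is a hypothetical knob, where a tiny value keeps only the numerically zero eigenvalues (Chen LDA) and a larger value also keeps the near-zero ones (modified Chen LDA).

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class S_w and between-class S_b; rows of X are samples."""
    mean_all = X.mean(axis=0)
    n = X.shape[1]
    S_w = np.zeros((n, n))
    S_b = np.zeros((n, n))
    for label in np.unique(y):
        Xc = X[y == label]
        mean_c = Xc.mean(axis=0)
        S_w += (Xc - mean_c).T @ (Xc - mean_c)
        d = (mean_c - mean_all).reshape(-1, 1)
        S_b += Xc.shape[0] * (d @ d.T)
    return S_w, S_b

def chen_lda(X, y, num_vectors, tol=1e-10):
    """Null-space LDA sketch; a larger `tol` mimics the modified variant."""
    S_w, S_b = scatter_matrices(X, y)
    # Step 1: eigenvectors of S_w with zero (or near-zero) eigenvalues form Q.
    evals_w, evecs_w = np.linalg.eigh(S_w)
    Q = evecs_w[:, evals_w <= tol * evals_w.max()]
    # Step 2: S_bb = Q Q^t S_b (Q Q^t)^t.
    P = Q @ Q.T
    S_bb = P @ S_b @ P.T
    # Step 3: eigenvectors of S_bb with the largest eigenvalues form U.
    evals_b, evecs_b = np.linalg.eigh(S_bb)
    order = np.argsort(evals_b)[::-1][:num_vectors]
    return evecs_b[:, order]

# Toy small-sample-size case: 3 classes, 2 samples each, 10 dimensions.
rng = np.random.default_rng(0)
X = rng.random((6, 10))
y = np.array([0, 0, 1, 1, 2, 2])
U = chen_lda(X, y, num_vectors=2)
```

In this toy setting the returned vectors lie in the null space of S_w, so the projected within-class scatter U^t S_w U is numerically zero while the projected between-class scatter is not.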

Parallel models for face recognition
As mentioned in the previous section, LDA is applied to the features extracted from the frequency and spatial domains. There are two sets of features: one carries the important information of the face image derived from the spatial domain, and the other carries that derived from the frequency domain. The two feature sets describe the face images in very different ways. Here, both feature sets are assumed to be equally important. To make the features from the spatial and frequency domains contribute equal weight to the total matching score, a weighting function is applied to the feature set from the spatial domain. The weighting function is given in Eq. (20).
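The fusion step can be illustrated with a short sketch. The weight used below is not Eq. (20): balancing_weight is a hypothetical stand-in that scales the spatial features so that their average length matches that of the frequency features, chosen only to show how a single weight ω enters the fused vector.

```python
import numpy as np

def balancing_weight(F_train, S_train):
    """Hypothetical weight (not Eq. (20)): ratio of mean feature norms."""
    f_norm = np.linalg.norm(F_train, axis=1).mean()
    s_norm = np.linalg.norm(S_train, axis=1).mean()
    return f_norm / s_norm

def fuse(f, s, omega):
    """Merge frequency features f and weighted spatial features omega * s."""
    f = np.asarray(f, dtype=float)
    s = np.asarray(s, dtype=float)
    return np.concatenate([f, omega * s])

# Toy training features: rows are samples.
F_train = np.array([[3.0, 4.0], [6.0, 8.0]])      # frequency domain
S_train = np.array([[6.0, 8.0], [12.0, 16.0]])    # spatial domain
omega = balancing_weight(F_train, S_train)
fused = fuse(F_train[0], S_train[0], omega)
```

Because the fused vector is a plain concatenation, the squared Euclidean distance between two fused vectors is the frequency-domain squared distance plus ω² times the spatial-domain squared distance, which is how the weight controls each domain's share of the matching score.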

Simulation results
The Olivetti Research Laboratory (ORL) and FERET databases were chosen to evaluate the performance of our proposed system. The ORL database contains 400 pictures of 40 persons, with 10 different images per person. For each person, 5 pictures are randomly chosen as the training images, and the remaining 5 pictures serve as the test images.

Spatial domain result
The dimensionality of the face image was 32×32. The ORL database is used to evaluate the performance. According to Eq. (15) and Eq. (16), S_w and S_t are singular. Hence, Liu LDA cannot solve the problem. Chen LDA, modified Chen LDA and D-LDA are employed to extract the most discriminant information and further reduce the dimensionality of the feature set from the spatial domain. The PCA result is included for comparison. The performance of each system is shown in Table 1.

Method                Recognition rate (%)
PCA                   89.5
Chen LDA              90.5
D-LDA                 89.5
Modified Chen LDA     91.5

Table 1. Spatial domain result

As shown above, modified Chen LDA gave the best result. We deduce that this is because it preserves more discriminant information of S_w than Chen LDA. Hence, modified Chen LDA will be employed to extract the features when the samples encounter the SSS problem.

Frequency domain result
Since only 4 coefficients were selected from each block, the total number of coefficients was 64. According to (3) and (4), S_w and S_t are non-singular, and LDA can be performed in the DCT domain without difficulty. Liu LDA and D-LDA were employed to extract the most discriminant features. For the DFT and DWT, the numbers of selected features that represent the face image are 300 and 400 respectively, and S_w is singular in both cases. Therefore, Chen LDA, modified Chen LDA and D-LDA are used to extract the most discriminant features.
From Table 2, it can be seen that Liu LDA and D-LDA gave equally good results in the DCT domain, where the samples do not suffer from the SSS problem. Both achieved a 94% recognition rate. For the DFT and DWT, where S_w was singular, modified Chen LDA gave the best result, scoring 96.5% in the DFT domain and 94% in the DWT domain. Among the frequency domain analysis methods, the DFT gave the best result, and DFT + modified Chen LDA was the best combination overall.

Method                Recognition rate (%)

The performance of the proposed parallel models is further evaluated using the fb probe set of the FERET database.

Conclusion
In this paper, a new parallel model for face recognition is proposed. There are three variants of the parallel model, which incorporate different variants of LDA. The models utilize information from the frequency and spatial domains, and both feature sets are processed in parallel. LDA is subsequently applied to the features to counter the high dimensionality problem encountered by the feature fusion method. The high recognition rate achieved by the proposed methods shows that the features of both domains contribute valuable information to the system. Parallel models 1 and 2 gave the best results. Parallel model 2 achieved recognition rates of 99% and 96.7% on the ORL and FERET databases respectively.

Fig. 1. The zigzag scanning pattern in a DCT block

Discrete wavelet transform
The DWT has been widely employed for noise reduction and compression in modern image processing. The DWT operates by convolving a target signal with a wavelet kernel. There are several well-known wavelets, such as coif(3) and Haar. The DWT decomposes a signal into a sum of shifted and scaled wavelets. The continuous wavelet transform between a signal f(t) and a wavelet φ(t) is defined as

$$C(a,b) = \frac{1}{\sqrt{a}} \int_{-\infty}^{\infty} f(t)\, \phi\left(\frac{t-b}{a}\right) dt$$

The two-dimensional DWT is implemented by first computing the one-dimensional DWT along the rows and then along the columns of the image (Meada et al., 2005), as shown in Fig. 2. The features in the LL sub-band correspond to the low-frequency coefficients along the rows and columns, and all of them are selected to represent the face image.
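The rows-then-columns decomposition and the LL sub-band can be illustrated with a one-level Haar transform. This is a sketch under assumptions: the Haar wavelet is only one of the wavelets named above and the text does not state which wavelet or how many levels were used; the function names are hypothetical and the image sides are assumed to be even.

```python
import numpy as np

def haar_dwt_1d(x):
    """One level of the orthonormal Haar DWT along the last axis."""
    x = np.asarray(x, dtype=float)
    even, odd = x[..., 0::2], x[..., 1::2]
    approx = (even + odd) / np.sqrt(2.0)   # low-pass filter + downsample
    detail = (even - odd) / np.sqrt(2.0)   # high-pass filter + downsample
    return approx, detail

def haar_ll(image):
    """LL sub-band: low-pass along the rows, then along the columns."""
    low_rows, _ = haar_dwt_1d(image)       # filter every row
    ll_t, _ = haar_dwt_1d(low_rows.T)      # filter every column
    return ll_t.T

ll = haar_ll(np.ones((4, 4)))              # 2x2 LL sub-band
ll_features = ll.flatten()                 # all LL coefficients as features
```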

Fig. 2. Two-dimensional discrete wavelet transform

Consider a sample set which contains K classes, where each class has M samples and each sample is represented by an n-dimensional vector. The rank of S_w is defined as in Eq. (12). The LDA algorithm has a major drawback, namely the SSS problem. Liu et al., Yang et al. and Chen et al. proposed different approaches to handle the SSS problem. If the rank of S_w ≠ n, then S_w is singular. Liu et al. modified the traditional LDA algorithm by replacing S_w in Eq. (14) with the total scatter matrix S_t.

Yang et al. proposed a solution called D-LDA to solve the small sample size problem. Unlike conventional LDA, D-LDA starts by diagonalizing the between-class scatter matrix S_b. All eigenvectors whose corresponding eigenvalues are equal or close to zero are discarded because they do not carry any discriminative power (Hu and Yang, 2001). The remaining eigenvectors and the corresponding eigenvalues are chosen to form V_b and D_b respectively.

In Chen LDA, the projection vector set projects S_b to another subspace, giving the new between-class scatter matrix S_bb. Singular value decomposition is performed on S_bb, and the set of projection vectors whose corresponding eigenvalues are the largest is chosen. There are now two sets of eigenvectors: one derived from the null space of S_w, and another derived from S_b, whose corresponding eigenvalues are the largest. With both sets of eigenvectors, the objective of LDA is fulfilled. Chen LDA is summarized in Steps 1 to 3.

In the weighting function, s is the feature from the spatial domain and f is the feature from the frequency domain. The sizes of both features are 1×n. The weighting function is applied to the spatial domain features. The feature vectors from both domains are merged into a 1D vector [f1, f2, …, fn, ωs1, ωs2, …, ωsn].

In section 3, the problem of LDA was discussed. Chen LDA, D-LDA and modified Chen LDA are able to counter the SSS problem, but Chen LDA and modified Chen LDA do not perform well when S_w is non-singular. Liu LDA cannot counter the SSS problem when the rank given by Eq. (16) is not equal to n.
D-LDA can perform well regardless of the condition of S_w because D-LDA starts from S_b instead of S_w. Our results in section 5 showed that Liu LDA and D-LDA are equally good when S_w is non-singular, and that modified Chen LDA gave the best result when S_w is singular. Based on the simulation results in section 5, three variants of our parallel face recognition model were developed, as shown in Fig. 3. The selection of the LDA algorithm is based on the choice of feature domain. The set of DCT features selected for the ORL database is small and the corresponding S_w is non-singular. Hence, D-LDA is used to extract the most discriminant features and to further reduce the dimensionality. D-LDA has an advantage over Liu LDA in terms of computation because D-LDA does not involve matrix inversion. For the DWT and DFT, the feature sets are relatively large and S_w is singular. Modified Chen LDA is employed to extract the most discriminant features because it gave the best result when S_w is singular.
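The D-LDA procedure outlined in this chapter (diagonalize S_b first, discard its near-zero eigenvalues, then keep the directions with the smallest within-class scatter) might be sketched as follows. This is a sketch under assumptions: the scatter matrices are left unnormalized, np.linalg.eigh stands in for the decomposition, tol is a hypothetical threshold, and the final scaling in Eq. (19) is not reproduced because that equation is not available here.

```python
import numpy as np

def scatter_matrices(X, y):
    """Within-class S_w and between-class S_b; rows of X are samples."""
    mean_all = X.mean(axis=0)
    n = X.shape[1]
    S_w = np.zeros((n, n))
    S_b = np.zeros((n, n))
    for label in np.unique(y):
        Xc = X[y == label]
        mean_c = Xc.mean(axis=0)
        S_w += (Xc - mean_c).T @ (Xc - mean_c)
        d = (mean_c - mean_all).reshape(-1, 1)
        S_b += Xc.shape[0] * (d @ d.T)
    return S_w, S_b

def d_lda(X, y, num_vectors, tol=1e-10):
    """Direct LDA sketch: work in the range of S_b, then minimize S_w."""
    S_w, S_b = scatter_matrices(X, y)
    # Diagonalize S_b and discard eigenvalues that are zero or close to zero.
    evals_b, evecs_b = np.linalg.eigh(S_b)
    keep = evals_b > tol * evals_b.max()
    V_b, D_b = evecs_b[:, keep], evals_b[keep]
    # Scale so that Z^t S_b Z = I; no matrix inversion is required.
    Z = V_b / np.sqrt(D_b)
    # Transform S_w into the reduced space.
    S_ww = Z.T @ S_w @ Z
    # eigh returns ascending eigenvalues, so the first columns are the
    # directions with the smallest within-class scatter.
    _, evecs_w = np.linalg.eigh(S_ww)
    V_w = evecs_w[:, :num_vectors]
    return Z @ V_w

rng = np.random.default_rng(0)
X = rng.random((6, 10))                  # 3 classes, 2 samples, 10 dimensions
y = np.array([0, 0, 1, 1, 2, 2])
W = d_lda(X, y, num_vectors=2)           # at most K-1 = 2 vectors here
```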

Fig. 3. Parallel models for face recognition

The similarity between two images is measured using the Euclidean distance. A shorter distance implies higher similarity between two face images. The fb probe set from the FERET database was chosen to evaluate the proposed methods. The training set consists of 165 frontal images of 55 people, with 3 different frontal images per person.
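Matching by Euclidean distance can be sketched as a nearest-neighbour search over the training features. Assigning each test image the label of its single closest training image is an assumption of this sketch, since the text states only the distance measure; the function names and toy data are hypothetical.

```python
import numpy as np

def nearest_neighbor_labels(train_feats, train_labels, test_feats):
    """Label each test vector with the label of its closest training vector."""
    train_feats = np.asarray(train_feats, dtype=float)
    test_feats = np.asarray(test_feats, dtype=float)
    # Pairwise Euclidean distances, shape (num_test, num_train).
    diffs = test_feats[:, None, :] - train_feats[None, :, :]
    dists = np.sqrt((diffs ** 2).sum(axis=2))
    # A shorter distance means higher similarity.
    return np.asarray(train_labels)[dists.argmin(axis=1)]

def recognition_rate(predicted, actual):
    """Percentage of test images whose predicted identity is correct."""
    return 100.0 * np.mean(np.asarray(predicted) == np.asarray(actual))

train = [[0.0, 0.0], [10.0, 10.0]]
labels = [0, 1]
test = [[1.0, 1.0], [9.0, 8.0], [0.0, 2.0]]
pred = nearest_neighbor_labels(train, labels, test)
rate = recognition_rate(pred, [0, 1, 1])
```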

Table 3. Comparison of recognition rate of other face recognition methods