Fault Detection Algorithm Based on Filters Bank Derived from Wavelet Packets

The fault detection and isolation (FDI) are of particular importance in industry. In fact, the early fault detection in industrial systems can reduce the personal damages and economical losses. Basically, model-based and data-based methods can be distinguished. Model-based techniques require a sufficiently accurate mathematical model of the process and compare the measured data with the estimations provided by the model in order to detect and isolate the faults that disturb the process. Parity space approach, observers design and parameters estimators are well known examples of model-based methods (Patton et al, 2000), (Zwingelstein, 1995), (Blanke et al., 2003), (Maquin & Ragot, 2000). In contrast, data-based methods require a lot of measurements and can also be divided into signal processing methods and artificial intelligence approaches. Many researchers have performed fault detection by using vibration analysis for mechanical systems, or current and voltage signature analysis for electromechanical systems (Awadallah & Morcos 2003), (Benbouzid et al., 1999). Other researchers use the artificial intelligence (AI) tools for faults diagnosis (Awadallah & Morcos 2003) and the frequency methods for faults detection and isolation (Benbouzid et al., 1999). This study continues our research in frequency domain, concerning fault detection by means of filters bank (Mustapha et al-a, 2007), (Mustapha et al-b, 2007). The aim of this article is to propose a method for the on-line detection of changes applied after a modeling of the original signal. This modeling is based on a filters bank decomposition that is needed to explore the frequency and energy components of the signal. The filters coefficients are derived from the wavelet packets, so the wavelet packets characteristics are approximately conserved and this leads to both filtering and reconstruction of the signal. This work is a continuity of our previous works for deriving a filters-bank from wavelet packets because the wavelet packets offer more flexibility for signal analysis and offer a lot of bases to represent the signal. The main contributions are to derive the filters and to evaluate the error between filters bank and wavelet packets response curves. Filters bank is preferred in comparison with wavelet O pe n A cc es s D at ab as e w w w .ite ch on lin e. co m


Introduction
The fault detection and isolation (FDI) are of particular importance in industry. In fact, the early fault detection in industrial systems can reduce the personal damages and economical losses. Basically, model-based and data-based methods can be distinguished. Model-based techniques require a sufficiently accurate mathematical model of the process and compare the measured data with the estimations provided by the model in order to detect and isolate the faults that disturb the process. Parity space approach, observers design and parameters estimators are well known examples of model-based methods (Patton et al, 2000), (Zwingelstein, 1995), (Blanke et al., 2003), (Maquin & Ragot, 2000). In contrast, data-based methods require a lot of measurements and can also be divided into signal processing methods and artificial intelligence approaches. Many researchers have performed fault detection by using vibration analysis for mechanical systems, or current and voltage signature analysis for electromechanical systems (Awadallah & Morcos 2003), (Benbouzid et al., 1999). Other researchers use the artificial intelligence (AI) tools for faults diagnosis (Awadallah & Morcos 2003) and the frequency methods for faults detection and isolation (Benbouzid et al., 1999). This study continues our research in frequency domain, concerning fault detection by means of filters bank , (Mustapha et al-b, 2007). The aim of this article is to propose a method for the on-line detection of changes applied after a modeling of the original signal. This modeling is based on a filters bank decomposition that is needed to explore the frequency and energy components of the signal. The filters coefficients are derived from the wavelet packets, so the wavelet packets characteristics are approximately conserved and this leads to both filtering and reconstruction of the signal. This work is a continuity of our previous works for deriving a filters-bank from wavelet packets because the wavelet packets offer more flexibility for signal analysis and offer a lot of bases to represent the signal. The main contributions are to derive the filters and to evaluate the error between filters bank and wavelet packets response curves. Filters bank is preferred in comparison with wavelet packets because it could be directly hardware implemented as a real time method. Then, the Dynamic Cumulative Sum detection method (Khalil, Duchêne, 1999) is applied to the filtered signals (sub-signals) in order to detect any change in the signal (figure 1). The association of filters bank decomposition and the DCS detection algorithm will be shown to be of great importance when the change is in the frequency domain. This paper is decomposed as follows. First we will explain the problem and we will present the utility of the decomposition before the detection. Then in section 3, the wavelet transform and the wavelet packets are presented and the derivation of filters bank from a wavelet packets and the problem of curve fitting are introduced, in the same section, the best tree selection based on the entropy of the signal and the filters bank channels corresponding to the suitable scale levels are discussed. In section 4, the Cumulative Sum (CUSUM), the Dynamic Cumulative Sum (DCS) algorithms and the fusion technique are detailed. Finally, the method is applied for the diagnosis of the Tennessee Eastman Challenge Process.

Filters bank for decomposition and detection
The simultaneous detection and isolation of events in a noisy non-stationary signal is a major problem in signal processing. When signal characteristics are known before and after the change, an optimal detector can be used according to the likelihood ratio test (Basseville & Nikiforov 1993). But when the signal to be detected is unknown, the Generalized Likelihood Ratio Test (GLRT) which consists of using the maximum likelihood estimate of the unknown signal will be used. In general, the segmentation depends on the parameters that change with time. These parameters, to be estimated, depend on the choice of the signal modeling. Most authors make use of application-dependent representations, based on AR modeling or on wavelet transform, in order to detect or characterize events or to achieve edge detection in signals (Mallat, 2000). When the change is energetic, many methods exist for detection purposes. But when the change is in frequency contents, special modeling, using a filters bank, is required before the application of the detection methods. After this modeling, the detection algorithm (DCS) will be applied on the decomposed signals instead of applying it to the original signal (see figure 1). The motivation is that the filters bank modeling can filter the signals and transform the frequency change into energy change. Then we choose only the sub-signals which present energy changes after decomposition. Furthermore, the detectability of DCS is improved if the changes are in energy. The subsignals can be used also to classify the detected events and this will be done after extracting the necessary parameters from isolated events and finally this aims to make diagnosis. This work is originated from the analysis and characterization of random signals. In our case, the recorded signals can be described by a random process x(t) as: After the point of change t r t r is the exact time of change. x 1 (t) and x 2 (t) can be considered as random processes where the statistical features are unknown but assumed to be identical for each segment 1 or 2. Therefore we assume that the signals x 1 (t) and x 2 (t) have Gaussian distributions. We will suppose also that the appearance times of the changes are unpredictable. In our case, we suppose that the frequency distribution is an important factor for discriminating between the successive events. In this way, the filters bank decomposition will be useful for classification purposes. For each component m, and at any discrete time t, the sample y (m) (t) is on-line computed in terms of the original signal x(t) using the parameters a(i) (m) and b(i) (m) of the corresponding filter according to the following difference equation: has a probability density function f 1 Figure 2 shows the original signal presenting a change in frequency content at time 1000 time units (TU). We can see that the decomposition enhances the energy changes, and it is more accurate to apply the detection on the sub-signals instead of applying it on the original signal. Fig. 2. a) original simulated signal presenting a frequency change at t r =1000 TU. b,c,d) the decomposition of the signal into three sub-signals using 3 channels filters bank. www.intechopen.com

Wavelet Transform
The Fourier analysis is the most well known mathematical tool used for transforming the signal from time domain to frequency domain. But it has an important drawback represented by the loss of time information when transforming the signal to the frequency domain. To preserve the temporal aspect of the signals when transforming them to frequency domain, one solution is to use the Wavelet Transform (Flandrin, 1993) which analyzes non-stationary signals by mapping them into time-scale and time-frequency representation.
The Wavelet Transform is similar to the Short Time Fourier Transform but provides, in addition, a multi-resolution analysis with dilated and shifted windows. The multi-resolution analysis consists of decomposing the signal x(t) using the wavelet ) (t ψ and its scale function (Papalambros & Wilde, 2000).
where a and b are respectively the dilation and translation parameters. The filter associated to the scale function ) (t φ is a low pass filter and the filter associated to the wavelet ) (t ψ is a band pass filter. This relation can also be written as follows: , then: The relation (5) indicates that the wavelet transform can be obtained by filtering the signal The band width of the band pass filters bank decreases when a increases. In order to decompose a signal into components of equal decreasing frequency intervals, we have to use a discrete time-frequency domain and the dyadic wavelet transform: Response curves of the wavelet filters.
If the signal x(t) is sampled, the discrete wavelet transform will be: Note that all the wavelet frequencies will be between 0 and f s (sampling frequency of the signal). The use of discrete time-frequency domain and the dyadic wavelet transform leads to decompose the signal into components of equal decreasing frequency intervals.

Scaling function and wavelet bases The projection of a function f(t)∈ L 2 (R) on the orthonormal basis {φ(t-k)} is a correlation between the original function f(t) and the scaling function φ(t) sampled at integer intervals.
The approximations resulting from the projection of f(t) on the scaling function basis form The projection of a function on the orthonormal scaling function basis results in loss of some information about the function: To obtain the detail information of the function, we use the projections on the orthonormal wavelet bases. With Discrete Wavelet Transform (DWT), the multi-resolution analysis uses a scaling function and a wavelet to perform successive decompositions of the signal into approximations and details (Chendeb, 2002). www.intechopen.com

Orthogonal and bi-orthogonal analysis with Mallat algorithm
Practically, the Mallat's algorithm is used for orthogonal and bi-orthogonal analysis. This algorithm consists of low-pass filtering the signal with a filter having its impulse response h[n] then down-sampling the result to obtain the lower resolution approximation, and highpass filtering the signal with a filter having its impulse response g[n] then down-sampling the result to obtain the lower resolution detail information.
The numerical sequence h[n] will be considered as the impulse response of a digital filter.
In the filters bank theory, we can find that the wavelet can be modeled by a discrete quadrature mirror filters bank g [n] and h [n] such that (Flandrin, 1993): will be considered as the impulse response of a discrete low pass filter whose coefficients can represent the scale function φ(t). And h[n] will be considered as the impulse response of a discrete high pass filter whose coefficients can represent the wavelet Ψ(t). Figure 5 represents the Mallat's multiresolution recursive analysis. The discrete wavelet transform consists of discrete filtering operation followed by downsampling process. Figure 6 shows the response curves of the quadrature mirror filters used for three levels decomposition.

Base and wavelet selection
The choice of the wavelet is a critical problem. To extract a specific signal event, the choice of the wavelet becomes important such that the wavelet must be adapted to the changes to be detected. On the other hand, the choice of scale levels is also important. After decomposing a signal, the most suitable scale levels are chosen according to their frequency bands, and the application of the detection must be applied to the most suitable scale levels.
Most likely, the choice of the filters bank, after the derivation, depends on the original signal and its frequency band. The number of filters depends on the details that we have to extract from the signal and to the events that must be distinguished.
The result of detection depends on the number of the filters used and the cut off frequency of each channel. In practice, L-channels derived filters are used. These filters are chosen so that the cut off frequencies are between zero Hertz and the half of the sampling frequency (f s /2). We will see later an automatic procedure, based on the entropy of the signal, to select the best filters bank cut off frequencies.

The wavelet packets
The wavelet packet decomposition is a generalization of the wavelet decomposition that offers more flexibility for signal analysis. In wavelet analysis, the approximation of the signal is decomposed into a lower-level approximation and detail. So the dyadic decomposition tree of the wavelet transform represents the signals by means of a fixed basis, and for n-level decomposition tree, we have n+1 possible ways to decompose the signal. In wavelet packet analysis, the details as well as the approximations can be decomposed into a lower-level approximations and details and this yields more than 1 2 2 − n possible ways to decompose the signal. So the wavelet packets offer a lot of bases to represent the signal from which a best decomposition tree can be selected, according to a given criterion, to meet the design objectives.

Wavelet packets decomposition
The wavelet packets are defined by Coifman, Meyer and Wickerhauser (Mallat, 1999) as a generalized relation between the multiresolution approximations and the wavelet. The approximation multiresolution space j V will be decomposed into a lower resolution approximation 1 V j+ and details 1 W j+ . This is done by decomposing the orthogonal basis The difference between the wavelet and the wavelet packet decompositions are shown by t h e b i n a r y t r e e i n d i c a t e d b y f i g u r e 8 . A n d f o r s e e k o f s i m p l i c i t y , i n w a v e l e t p a c k e t decomposition, the vector space V will be replaced by W.   Fig. 9. Binary tree obtained after wavelet packets decomposition.
www.intechopen.com Figure 9 shows clearly that the wavelet packet decomposition offers a lot of bases to represent the signal and we have more than 1 2 2 n− ,instead of n+1, possible ways to decompose the signal, from which a best decomposition tree can be selected for signal analysis.

Wavelet packets filters bank
Each node of the binary tree is indexed by (j,n) where n is the number of the node at level j. For each node there is an associated space n j W , which admits an orthogonal The relations between a node and its children nodes can be seen as follows: The wavelet packets coefficients are calculated by using a filters bank algorithm. Suppose that the projection tree of a signal in the sub-space W is C, we can define a wavelet packets as the projection of the signal in the sub-space n j W , where n j, ψ is an orthogonal basis. For each node (j,n) of the tree, the coefficients of the wavelet packets of a function ) (t f can be calculated as follows: The wavelet packets decomposition is obtained by using pyramidal recursive algorithm as follows: The packet n j C 2 , 1 + (respectively ) is obtained by low-pass (respectively high-pass) filtration of n j C , by using the filter g (resp. h), followed by down sampling by factor of 2 that projects the signal to sub-space n j W 2 , 1 + (resp. ). www.intechopen.com

Best tree selection
The wavelet packet decomposition generates redundant representations. Each decomposition level contains all the signal information. Some combinations of packets contain non-redundant representation of the signal, from which a best tree can be selected. The selection must be based on a criterion according to the problem to be treated. In our case and for detection purposes, the Shannon entropy-based algorithm is used, where the criterion is defined to allow an optimal separation of the frequency components of the signal (in case of frequency change). The best tree selection algorithm offers optimal signal representation (Coifman & Wicherhauser, 1992 . M(X), which is the expectation of information quantity, reflects the concentration of energy in the discrete series X. Based on the entropy criterion, the best tree selection is done as follows: a. For a given node A j,n , the energy is compared to threshold energy (SE) then this node is labeled by 0 if its energy is lower than SE and by 1 in other cases.
b. Indicate the number of components present in each packet. c. Select the single-component packets. In (Hitti. & Eric, 1999), Hitti proposed a method to determine the threshold energy by estimating the sum of energies of packets in each level j ( j EnsSP ) and then the threshold energy can be calculated as follows:

Filters bank parameters estimation (Butterworth filters)
In order to explore the frequency and energy components of the original signal, an important pre-processing step is required before detection, feature extraction and classification. At a discrete time t, the signal is first decomposed by using an L-channels filters bank whose parameters are derived from wavelet packets. Each component m∈ {1,…, L} is the result of filtering the original signal x(t) by the derived filter for a given cut off frequency. The filters coefficients are derived from the wavelet packets in order to use them as wavelet packets so the wavelet packets characteristics are approximately conserved and can be used for both filtering and reconstruction of the signal. Also the wavelet packets derived filters bank can be hardware implemented and can be used for online detection. For a given wavelet, we can use its frequency domain transfer function h wav to derive (extract) the transfer function of a low pass Butterworth filter h filt . After the extraction, the frequency responses of the wavelet packets-based filters bank can be determined as follows: 1. Selection of the wavelet. 2. Extractions of the transfer function of the wavelet h wav . 3. Calculation of the cut off frequency (f c ) of the wavelet filter (at -3db attenuation). 4. Estimation of the order of the filter that minimizes the error between the two transfer functions h wav and h filt . The estimation is done by using the least square method to calculate the optimal filter order N that minimizes the error between the two transfer functions h wav and h filt (figure 11).
where H(jf) is the transfer function, f is the frequency, f c is the cut off frequency and N is the order of the filter. Practically, the coefficients a(i) and b(i) necessary to calculate the equation (19) will be determined from the Butterworth table. Note that m represents the decomposition level.
To have a quadrature mirror filters, we can extract a high pass filter by using the following relation: So we can design a filters bank that behaves as a wavelet packets. This filters bank can be used, instead of a wavelet packets, to decompose a signal into some frequency components in order to explore the signal. The main problem is to solve nonlinear curve-fitting (data-fitting) in the least squares sense. Given input data h wav , and the desired output data h filt , find coefficient N that "best-fit" the two curves is well known in the numerical analysis domain. This problem is named curve-fitting problem (Papalambros & Wilde, 2000), (Rustagi, 1994). This problem consists to find the parameters which permit the best fitting of the analytic model to the real data. If the model is linear, the problem is solved by using linear regression methods (Saporta, 1990), and if the model is nonlinear, the problem is solved by using nonlinear regression methods (Rardin, 1998).
The optimization consists to express the relation between the two variables f and h wav for a function h filt that have non linear dependence with the vector: In order to solve this problem we use the least squares method, we must minimize with respect to N the quadratic error: So the nonlinear curve-fitting problem can be solved by the classical algorithm of Gauss -Newton (Papalambros & Wilde, 2000) that is: knowing the desired output we can find the optimal coefficient N that best fits the two curves. Because F is a non linear function of parameter N, the optimization is done by an iterative method: We start with a given value of N, then we continue the iterations until that the parameter N can't be changed. The figure 12 shows the response curve of the wavelet 'db5' and the response curve of the filter derived from it. The order of the derived filter is 6, and the error corresponding to the minimum error and calculated by using the Euclidian distance: Is e=0.78869.

Sequential algorithms
The CUSUM algorithm is based on a recursive calculation of the logarithm of the likelihood ratios. This method can be considered as a sequence of repeated tests around the point of change t r (Nikiforov, 1986), (Basseville & Nikiforov 1993). Let x 1 ,x 2 ,x 3 ,…,x t be a sequence of observations. Let us assume that the distribution of the process X depends on parameter 0 θ until time t r and depends on parameter 1 θ after the time t r . At each time t we compute the sum of logarithms of the likelihood ratios as follows: where, ƒ θ is the probability density function. The importance of this sum comes from the fact that its sign changes after the point of change.
The CUSUM method is usually applied when we have a priori knowledge about the segments before and after the changes (energy change or probability density distribution before and after the change). The Dynamic Cumulative Sum (DCS) technique, based on the local dynamic cumulative sum, is preferred when the parameters of the signal are unknown (Khalil & Duchêne, 1999). The DCS is a repetitive sequence around the point of change t r . It is based on the local cumulative sum of the likelihood ratios between two local segments estimated at the current time t. These two dynamic segments ) (t a S (after t) and S (before t) are estimated by using two windows of width W (figure 13) before and after the instant t as follows: follows a probability density function   (Khalil, 1999) proves that the DCS function reaches its maximum at the point of change t r . The detection function used to estimate the point of change is: The instant at which the procedure is stopped is t s = min {t : g(m)t ≥ h}, where h is the detection threshold. The point of change is estimated as follows: The application of the DCS method improves the detection when applied after filtration, specially when the signal presents no abrupt change, and the direct application of the DCS algorithm leads us to faulty results and sometimes difficult to interpret for accurate fault www.intechopen.com detection ( figure 14). The use of the filters bank improves considerably the detection ability of the DCS method. And the sub-signal frequency components, constituting the frequencies in defined ranges will be transformed into energy changes.

Fusion of change points
Because the detection algorithm is applied individually to each frequency component, it is important to apply a fusion technique to the resulting times of change in order to get a single value for a given fault in the system. The fusion technique is achieved as follows: -Each point of change at a given level is considered as an interval [tc-a, tc+a], where a is an arbitrary number of points taken before and after the point of change. - The time intervals with common areas are considered to correspond to the same fault. - The resulting point of change t f is calculated as the center of gravity (or mean) of the superimposed intervals. This procedure is shown on figure 16. The signal is a simulated one with two segments (each segment has 1000 points) of different frequency. The point of change of each component is calculated by the detection algorithm, and then the fusion technique is applied to get a single point of change t f . The real point of change is t r = 1000 and, after fusion, the estimated point of change is t f = 1003.   Figure 17 shows the best tree obtained after decomposing the signal by using filters bank derived from the 'Haar' wavelet packet.

Performance evaluation
To study the performance of the detection method, the probability of detection and the probability of false alarm are calculated. Also the detection delay which is the difference between the stopping time and the real time of change is calculated. In order to evaluate the performance of our algorithm, we compare the DCS with filtration and the DCS without filtration in case of a change in variance. The comparison is done by using the Receiver Operational Curve ROC which is a curve that represents the probability of detection as function of the probability of false alarm. The two methods are tested, by using 20 signals randomly generated. The variances of the signals are 1 and 1.3. The comparison between curves in figure 18 shows that, for the same probability of false alarm, the probability of detection, in case of DCS with filtration, is higher than that of the DCS without filtration. However, let us focus on the following remarks: 1. The use of a filters bank to decompose the signal, before the application of the DCS method, causes a delay to the detection because the current output sample is calculated in terms of current and past input samples. In the case of on-line detection, it is necessary to estimate a model on line at the same time as the input data is received, to make the decision on-line, and in order to investigate possible time variation in the signal parameters during the collection of data. Our future works include the study of detection delay in order to deal with real time applications. 2. The use of filters bank is not suitable in case of varying mean signals because the band pass filters, used to decompose the signal, will attenuate all mean variation. In this last part, the influence of the window size is also investigated. The comparison between curves in figure 19 shows that, for the same probability of false alarm, the probability of detection increases if the width W of the window increases. These curves are plotted for W = 25, W = 50 and W = 100 points. It is clear that if the window becomes large, the false alarm probability decreases. But in this case small events will not be detected and the delay detection time will be increased. A trade-off must be found between the detection delay and the length of the window.

Applications to the Tennessee Eastman Challenge Process
In this section, the method based on wavelet packet filters bank decomposition and DCS algorithm is applied to detect disturbances on the Tennessee Eastman Challenge Process (TECP, Figure 20). The TECP (Downs & Vogel, 1993) is a multivariable non-linear, high dimensionality, unstable open-loop chemical reactor that is a simulation of a real chemical plant provided by the Eastman Company. The process results in final products G and H from four reactants A, C, D and E. The plant has 7 operating modes, 41 measured variables and 12 manipulated variables. There are also 20 disturbances IDV1 through IDV20 that could be simulated (Downs & Vogel, 1993), (Singhal, 2000). The sampling period for measurements is 60 seconds. The TECP offers numerous opportunities for control and fault detection and isolation studies. In this work, we use a robust adaptive multivariable (4 inputs and 4 outputs) RTRL neural networks controller (Leclercq et al., 2005), (Zerkaoui et al, 2007) to regulate the temperature (Y1) and pressure (Y2) in reactor, and the levels in separator (Y3) and stripper (Y4). For this purpose, the controller drives the purge valve (U1), the stripper input valve (U2), the condenser CW valve (U3), and reactor CW valve (U4). The controller is presented in figure  20 (full lines represent measurements and dashed line represent actuators updating). This controller compensates all perturbations IDV1 to IDV 20 excepted IDV1, IDV6 and IDV7. Particularly, the controller is robust for perturbation IDV16 that will be used in the following.  (Leclercq et al., 2005), (Zerkaoui et al, 2007).

www.intechopen.com
The figure 21 illustrates the advantage of our method to detect changes for real world FDI applications. Measurements of the stripper level (figure 21 a) are decomposed into 3 components by using filters bank derived from the 'Haar' wavelet packet. From time t r = 600 hours, the perturbation IDV16, that corresponds to a random variation of the A, B, C composition, modifies the dynamical behavior of the system. The detection functions applied on the 3 components ( figure 21 f, g, h) can be compared with the detection function applied directly on measurement of pressure (figure 21 b). After fusion, the point of change is calculated to be t f = 659. Detection results are considerably improved by using the derived filters bank as a preprocessor.

Conclusions and perspectives
The aim of our work is to detect the point of change of statistical parameters in signals collected from complex industrial systems. This method uses a filters bank derived from a wavelet packet and combined with DCS to characterize and classify the parameters of a signal in order to detect any variation of the statistical parameters due to any change in frequency and energy. The main contribution of this paper is to derive the parameters of a filters bank that behaves as a wavelet packet. The proposed algorithm provides also good results for the detection of frequency changes in the signal. The application to the Tennessee Eastman Challenge Process illustrates the interest of the approach for on-line detection and real world applications.
In the future, our algorithm will be tested with more data issued form several systems in order to improve and validate it and to compare it to other methods. We will consider mechanical and electrical machines (Awadallah & Morcos 2003, Benbouzid et al.,1999, and as a consequence our intend is to develop FDI methods for wind turbines and renewable multi-source energy systems (Guérin et al., 2005).