Adaptive-FRESH Filtering



Introduction
Adaptive filters are self-designing systems for information extraction which rely for their operation on a recursive algorithm (Haykin, 2001). They find application in environments where the optimum filter cannot be applied because of a lack of knowledge about the signal characteristics and/or the data to be processed. The a priori knowledge required for the design of the optimum filter is commonly based on stochastic signal models, which traditionally are stationary. However, the parameters of many signals found in communications, radar, telemetry and many other fields can be represented as periodic functions of time (Gardner, 1991). When this occurs, stationary signal models cannot exploit the signal periodicities, whereas cyclostationary models become more suitable since they represent the statistical signal properties more reliably. 1

Firstly, let us review some of the key points of cyclostationary signals, while introducing some definitions that will be used throughout this chapter. We have said that most of the signals used in fields such as communications, radar, or telemetry exhibit statistical parameters which vary periodically with time. The periodicities of these parameters are related to the parameters of the signal modulation, such as the carrier frequency or the chip rate, among others (Gardner, 1986; 1994). Let the second-order 2 auto-correlation function of a zero-mean stochastic signal be defined as:

R_xx(t, λ) = E{x(t) x*(λ)}     (1)

A signal is said to be (second-order) cyclostationary if, and only if, its (second-order) auto-correlation function is a periodic function of time, namely with period T. Therefore, it can be expanded in a Fourier series (Giannakis, 1998), where α_p = p/T are called the cycle frequencies of x(t), and the set of all the cycle frequencies is referred to as the cyclic spectrum.

1 Cyclostationary signal models are a more general class of stochastic processes which comprises the stationary ones. Therefore, they always represent the statistical properties of the signal at least as well as stationary models do.
2 Since the optimality criterion used in this chapter is based on the Mean Squared Error (MSE), only the second-order statistical moments are of interest to us. The first-order statistical moment (i.e. the mean) is zero by assumption. Throughout this chapter, second-order cyclostationarity is exploited; for brevity, cyclostationarity and correlation functions are hereinafter assumed to be of second order, even without explicit mention.

The Fourier coefficients of the expansion in (2) are named

cyclic auto-correlation functions and are computed as:

R_xx^{α_p}(τ) = (1/T) ∫_0^T R_xx(t, t − τ) e^{−j2πα_p t} dt

In practice, the signal periodicities are often incommensurable with each other, so that the auto-correlation function in (1) is not periodic, but an almost-periodic function of time.
In this general case (in the sense that periodic functions are a particular case of almost-periodic ones), the signal is said to be almost-cyclostationary, and its auto-correlation function can be expanded as a generalized Fourier series:

R_xx(t, λ) = Σ_{α ∈ A_xx} R_xx^α(t − λ) e^{j2πα t}

where the set A_xx stands for the cyclic spectrum of x(t), and is generally composed of the sums and differences of integer multiples of the signal periodicities (Gardner, 1987; Gardner et al., 1987; Napolitano & Spooner, 2001). Additionally, the definition of the cyclic auto-correlation functions must be changed accordingly:

R_xx^α(τ) = lim_{Z→∞} (1/Z) ∫_{−Z/2}^{Z/2} R_xx(t, t − τ) e^{−j2πα t} dt

One of the most important properties of almost-cyclostationary signals concerns the existence of correlation between their spectral components. The periodicity of the auto-correlation function turns into spectral correlation when it is transformed to the frequency domain. As a result, almost-cyclostationarity and spectral correlation are related in such a way that a signal exhibits almost-cyclostationary properties if, and only if, it exhibits spectral correlation too.
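As a concrete illustration (not part of the original chapter), the cyclic auto-correlation just defined can be estimated from samples by direct time averaging. The sketch below builds a discrete-time AM signal with a white random amplitude; its carrier at f_c = 0.1 cycles/sample produces a nonzero cyclic auto-correlation at the cycle frequency 2f_c, while an arbitrary non-cycle frequency averages to zero:

```python
import numpy as np

def cyclic_autocorr(x, alpha, tau):
    """Estimate R_xx^alpha(tau) = <x(n + tau) x*(n) exp(-j 2 pi alpha n)>
    by time averaging (alpha expressed in cycles per sample)."""
    n = np.arange(len(x) - tau)
    return np.mean(x[n + tau] * np.conj(x[n]) * np.exp(-2j * np.pi * alpha * n))

rng = np.random.default_rng(0)
N, fc = 200_000, 0.1
a = rng.standard_normal(N)                        # stationary white amplitude
x = a * np.cos(2 * np.pi * fc * np.arange(N))     # AM signal

r_0   = cyclic_autocorr(x, 0.0, 0)                # stationary part: ~0.5
r_2fc = cyclic_autocorr(x, 2 * fc, 0)             # cycle frequency 2 fc: |.| ~0.25
r_off = cyclic_autocorr(x, 2 * fc + 0.013, 0)     # not a cycle frequency: ~0
```

For this signal, r_0 converges to 0.5 (the time-averaged power) and |r_2fc| to 0.25, while r_off vanishes as N grows, exposing the cyclic spectrum {0, ±2f_c}.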
Let X_Δf(t, f) be the spectral component of x(t) around time t and frequency f, with spectral bandwidth Δf:

X_Δf(t, f) = ∫_{t−1/(2Δf)}^{t+1/(2Δf)} x(u) e^{−j2πf u} du

The spectral correlation function of the signal x(t) is defined as:

S_xx^α(f) = lim_{Δf→0} lim_{Z→∞} (Δf/Z) ∫_{−Z/2}^{Z/2} X_Δf(t, f) X*_Δf(t, f − α) dt

which represents the time-averaged correlation between two spectral components centered at frequencies f and f − α, as their bandwidth tends to zero. It can be demonstrated that the spectral correlation function matches the Fourier transform of the cyclic correlation functions (Gardner, 1986), that is:

S_xx^α(f) = ∫_{−∞}^{+∞} R_xx^α(τ) e^{−j2πf τ} dτ

where the inherent relationship between almost-cyclostationarity and spectral correlation is fully revealed. Intuitively, Eq. (8) means that the spectral components of almost-cyclostationary signals are correlated with other spectral components which are spectrally separated at the periodicities of their correlation function, i.e. their cyclic spectrum. For this reason the cycle frequency is also known as spectral separation (Gardner, 1991). Note that, at cycle frequency zero, the cyclic auto-correlation function in (5) represents the stationary (or time-averaged) part of the nonstationary auto-correlation function in (3). Therefore, it is straightforward from (8) that the spectral correlation function matches the Power Spectral Density (PSD) at α = 0, and also represents the auto-correlation of the signal spectral components, which is indicated by (7).
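This spectral reading can be checked numerically (an illustrative sketch; all parameters are chosen here, not taken from the chapter): averaging the cyclic periodogram X(f) X*(f − α) of a white-amplitude AM signal over independent blocks yields a spectral coherence of roughly 0.5 between DFT bins separated by α = 2f_c, and essentially zero for a separation that is not a cycle frequency:

```python
import numpy as np

rng = np.random.default_rng(1)
N, fc_bin, K = 1024, 128, 800          # carrier at DFT bin 128 (fc = 0.125), 800 blocks
n = np.arange(N)
carrier = np.cos(2 * np.pi * fc_bin * n / N)

f1 = 300                               # test bin; its partner sits 2*fc_bin bins below
num = num_off = 0j
p1 = p2 = q = 0.0
for _ in range(K):
    X = np.fft.fft(rng.standard_normal(N) * carrier)    # white-amplitude AM block
    num += X[f1] * np.conj(X[f1 - 2 * fc_bin])          # separation alpha = 2 fc
    num_off += X[f1] * np.conj(X[f1 - 2 * fc_bin - 7])  # not a cycle frequency
    p1 += abs(X[f1]) ** 2
    p2 += abs(X[f1 - 2 * fc_bin]) ** 2
    q += abs(X[f1 - 2 * fc_bin - 7]) ** 2

coh = abs(num) / np.sqrt(p1 * p2)          # ~0.5: bins 2 fc apart are correlated
coh_off = abs(num_off) / np.sqrt(p1 * q)   # ~0: no spectral correlation there
```

The coherence of 0.5 (rather than 1) reflects that each bin of this wideband AM signal also carries an independent contribution from the other sideband.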
The key result from the above is that, since almost-cyclostationary signals exhibit spectral correlation, a single spectral component can be restored not only from itself, but also from other components which are spectrally separated from it as indicated by the cyclic spectrum of the signal. It is clear that a simple Linear Time-Invariant (LTI) filter cannot achieve this spectral restoration, but intuition says that a kind of filter which incorporates frequency shifts within its structure could. Contrary to optimal linear filtering of stationary signals, which results in time-invariant filters, the optimal linear filters of almost-cyclostationary signals are Linear Almost-Periodically Time-Variant (LAPTV) filters (also known as poly-periodic filters (Chevalier & Maurice, 1997)). This optimality can be understood in the sense of Signal-to-Noise Ratio (SNR) maximisation or in the sense of Minimum Mean Squared Error (MMSE, the one used in this chapter), where the optimal LAPTV filter may differ depending on the criterion used. Therefore, LAPTV filters find application in many signal processing areas such as signal estimation, interference rejection, channel equalization, STAP (Space-Time Adaptive Processing), or watermarking, among others (see (Gardner, 1994) and references therein, or more recently (Adlard et al., 1999; Chen & Liang, 2010; Chevalier & Blin, 2007; Chevalier & Maurice, 1997; Chevalier & Pipon, 2006; Gameiro, 2000; Gelli & Verde, 2000; Gonçalves & Gameiro, 2002; Hu et al., 2007; Li & Ouyang, 2009; Martin et al., 2007; Mirbagheri et al., 2006; Ngan et al., 2004; Petrus & Reed, 1995; Whitehead & Takawira, 2004; 2005; Wong & Chambers, 1996; Yeste-Ojeda & Grajal, 2008; Zhang et al., 2006)). This chapter is devoted to the study of LAPTV filters for adaptive filtering. In the next sections, the fundamentals of LAPTV filters are briefly described.
With the aim of incorporating adaptive strategies, the theoretical development is focused on the FRESH (FREquency SHift) implementation of LAPTV filters. FRESH filters are composed of a set of frequency shifters, each followed by an LTI filter, which notably simplifies the analysis and design of adaptive algorithms. After reviewing the theoretical background, an important property of adaptive FRESH filters is analyzed: their capability to operate in the presence of errors in the LAPTV periodicities. This property is important because small errors in these periodicities, which are quite common in practice due to non-ideal effects, can make the use of LAPTV filters unfeasible. Finally, an application example of adaptive FRESH filters is used at the end of this chapter to illustrate their benefit in real applications. In that example, an adaptive FRESH filter constitutes an interference rejection subsystem which forms part of a signal interception system. The goal is to use adaptive FRESH filtering for removing the unwanted signals so that a subsequent subsystem can detect any other signal present in the environment. Therefore, the interference rejection and signal detection problems can be dealt with independently, allowing the use of high-sensitivity detectors with poor interference rejection properties.

Optimum linear estimators for almost cyclostationary processes
This section is devoted to establishing the optimality of LAPTV filters for filtering almost-cyclostationary signals. The theory of LAPTV filters can be seen as a generalization of the classical Wiener theory of optimal LTI filters, where the signals involved are no longer stationary, but almost-cyclostationary. Therefore, optimal LTI filters become a particularization of LAPTV ones, just as stationary signals can be seen as a particularization of almost-cyclostationary ones. The Wiener theory defines the optimal (under the MMSE criterion) LTI filter for the estimation of a desired signal, d(t), given the input (or observed) signal x(t), when both d(t) and x(t) are jointly stationary. In this case, their auto- and cross-correlation functions do not depend on time, but only on the lag, and the estimation error is constant too. Otherwise, the estimation error becomes a function of time. The Wiener filter is still the optimal LTI filter if, and only if, d(t) and x(t) are jointly stationarizable processes (those which can be made jointly stationary by random time shifting) (Gardner, 1978). In this case, the Wiener filter is optimal in the sense of Minimum Time-Averaged MSE (MTAMSE). For instance, jointly almost-cyclostationary processes (those whose auto- and cross-correlation functions are almost-periodic functions of time) are always stationarizable. Nonetheless, if x(t) and d(t) are jointly almost-cyclostationary, it is possible to find an optimal filter which minimizes the MSE at all instants, which becomes a Linear Almost-Periodically Time-Variant (LAPTV) filter (Gardner, 1994). This result arises from the orthogonality principle of optimal linear estimators (Gardner, 1986), which is developed next.
Let d̂(t) be the estimate of d(t) from x(t), obtained through the linear filter h(t, λ):

d̂(t) = ∫_{−∞}^{+∞} h(t, λ) x(λ) dλ

The orthogonality principle establishes that, if h(t, λ) is optimum, then the input signal x(t) and the estimation error, ε(t) = d(t) − d̂(t), are orthogonal to each other:

E{ε(t) x*(λ)} = 0, for all t, λ

which is equivalent to:

R_dx(t, λ) = ∫_{−∞}^{+∞} h_Γ(t, u) R_xx(u, λ) du

where R_uv(t, λ) 3 stands for the cross-correlation function between u(t) and v(t), and h_Γ(t, λ) stands for the optimal LAPTV filter (where the meaning of the subindex Γ will be clarified in the next paragraphs). Since d(t) and x(t) are jointly almost-cyclostationary by assumption, their auto- and cross-correlation functions are almost-periodic functions of time, and therefore they can be expanded as generalized Fourier series (Corduneanu, 1968; Gardner, 1986): where A_dx and B_xx are countable sets consisting of the cycle frequencies of R_dx(t, λ) and R_xx(t, λ), respectively. In addition, the cyclic cross- and auto-correlation functions R_dx^α(τ) and R_xx^α(τ) are computed as generalized Fourier coefficients (Gardner, 1986): Substitution of (13) and (14) in (11) yields the condition: This condition can be satisfied for all t, λ ∈ R if h_Γ(t, λ) is also an almost-periodic function of time and, therefore, can be expanded as a generalized Fourier series (Gardner, 1993; Gardner & Franks, 1975): where Γ is the minimum set containing A_dx and B_xx which is closed under the addition and subtraction operations (Franks, 1994). The Fourier coefficients in (18), h_Γ^{γ_q}(τ), can be computed analogously to Eqs. (15) and (16). Then, the condition in (17) is developed by using the definition in (18), taking the Fourier transform, and augmenting the sets A_dx and B_xx to the set Γ, which they belong to, yielding the following condition: which must be satisfied for all f, λ ∈ R, and where the Fourier transforms of the cyclic cross- and auto-correlation functions, S_dx^α(f) and S_xx^α(f), stand for the spectral cross- and auto-correlation functions, respectively (Gardner, 1986).
Finally, we use the fact that two almost-periodic functions are equal if, and only if, their generalized Fourier coefficients match (Corduneanu, 1968), which allows us to reformulate (19) as the design formula of optimal LAPTV filters: Note that the sets of cycle frequencies A_dx and B_xx are, in general, subsets of Γ. Consequently, the condition in (21) is to be understood with S_dx^α(f) = 0 for α outside A_dx and S_xx^α(f) = 0 for α outside B_xx, in accordance with (15) and (16). Furthermore, (21) is also coherent with the classical Wiener theory. When d(t) and x(t) are jointly stationary, the sets of cycle frequencies A_dx, B_xx, and Γ consist only of cycle frequency zero, which yields that the optimal linear estimator is LTI and follows the well-known expression of the Wiener filter:

H(f) = S_dx(f) / S_xx(f)

Let us use a simple graphical example to provide an overview of the implications of the design formula in (21). Consider the case where the signal to be estimated, s(t), is corrupted by additive stationary white noise, r(t), along with an interfering signal, u(t), to form the observed signal, that is:

x(t) = s(t) + u(t) + r(t)

Assuming that s(t), r(t), and u(t) are statistically independent processes, the design formula becomes: In the following, let us consider the PSDs plotted in Fig. 1 for the signal, the noise and the interference, all of which are flat in their spectral bands, with PSD levels η_s, η_r and η_u, respectively. Let us further simplify the example by assuming that the signal is received with a high Signal-to-Noise Ratio (SNR), but a low Signal-to-Interference Ratio (SIR), so that η_u ≫ η_s ≫ η_r. The Wiener filter can be obtained directly from Fig. 1, and becomes:

H(f) = η_s / (η_s + η_u + η_r) ≈ 0 for f ∈ B_u, H(f) = η_s / (η_s + η_r) ≈ 1 for f ∈ B_s outside B_u, and H(f) = 0 elsewhere,

where B_u and B_s represent the frequency intervals comprised in the spectral bands of the interference and the signal, respectively. Thus, the Wiener filter is not capable of restoring the signal spectral components which are highly corrupted by the interference, since they are almost cancelled at its output.
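The two Wiener gains in this example can be evaluated numerically. The PSD levels below are illustrative values of my own, chosen to satisfy the high-SNR, low-SIR assumption; the gain follows H(f) = S_dx(f)/S_xx(f) with d(t) = s(t) and independent components:

```python
# Illustrative flat PSD levels with high SNR and low SIR (eta_u >> eta_s >> eta_r)
eta_u, eta_s, eta_r = 100.0, 1.0, 0.01

# Wiener gain H(f) = S_dx(f) / S_xx(f); the denominator sums the PSDs present at f
H_interfered = eta_s / (eta_s + eta_u + eta_r)  # f in B_u: ~0, band almost cancelled
H_clean = eta_s / (eta_s + eta_r)               # f in B_s but not B_u: ~1, band passed
```

The interfered band is attenuated by roughly the SIR (here about −20 dB), which is what makes a purely LTI solution unable to recover those components.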
On the contrary, an LAPTV filter could restore the spectral components cancelled by the Wiener filter, depending on the spectral auto-correlation function of the signal and on the availability at the input of correlated spectral components of the signal which are not perturbed by the interference. In our example, consider that the signal s(t) is Amplitude Modulated (AM). Then, the signal exhibits spectral correlation at cycle frequencies α = ±2f_c, in addition to cycle frequency zero, which means that the signal spectral components at positive frequencies are correlated with those components at negative frequencies (Gardner, 1987). 4 The spectral correlation function of the AM signal is represented in Fig. 2. The design formula in (25) states that the Fourier coefficients of the optimal LAPTV filter represent the coefficients of a linear combination in which frequency-shifted versions of S_xx^α(f) are combined in order to obtain S_dx^α(f). For simplicity, let us suppose that the set of cycle frequencies Γ only consists of the signal cyclic spectrum, that is Γ = {−2f_c, 0, 2f_c}, so that the design formula must be solved only for these values of α_k. Suppose also that the cyclic spectra of the noise and the interference reduce to cycle frequency zero. This is coherent with the assumption that the noise is stationary and with Fig. 1, where the bandwidth of the interference is narrower than 2f_c, and therefore none of its spectral components is separated 2f_c in frequency. Fig. 3 graphically represents the conditions imposed by the design formula (25) on the Fourier coefficients of the optimal LAPTV filter, where each column stands for the different equations as α_k takes different values from Γ. The plots in the first three rows stand for the amplitude of the frequency-shifted versions of the spectral auto-correlation functions of the signal, the noise and the interference, while the plots in the last row represent the spectral cross-correlation between the input and the desired signals.
The problem to be solved is to find the filter Fourier coefficients which, multiplied by the spectral correlation functions represented in the first three rows and added together, yield the spectral cross-correlation function in the last row. Firstly, let us pay attention to the spectral band of the interference at positive frequencies. From Fig. 3, the following equation system applies:

4 The definition of the spectral correlation function used herein (see Eq. (7)) differs in the meaning of frequency from the definition used by other authors, as in (Gardner, 1987). Therein, the frequency stands for the mean frequency of the two spectral components whose correlation is computed. Both definitions are related with each other by a simple change of variables, that is, S^α(f) herein corresponds to the spectral correlation function at frequency f − α/2 according to the definition used in (Gardner, 1987).

Fig. 3. Graphical representation of the design formula of LAPTV filters, which has been applied to our example. Each plot corresponds to a different value of α_k in (25).
Fig. 4. Fourier coefficients of the optimal LAPTV filter.
where B_u^+ stands for the range of positive frequencies occupied by the interference. The solution to the linear system in (28) is: The result in (31) is coherent with the fact that there are not any signal spectral components located in B_u^+ after frequency-shifting the input downwards by 2f_c. After using the approximations of high SNR and low SIR, the above results for the other filter coefficients can be approximated by: The underlying meaning of (32) is that the optimal LAPTV filter cancels the spectral components of the signal which are corrupted by the interference. But contrary to the Wiener filter, these spectral components are restored from other components which are separated 2f_c in frequency, which is indicated by (33). By using a similar approach, the Fourier coefficients of the LAPTV filter are computed for the rest of the frequencies (those which do not belong to B_u^+). The result is represented in Fig. 4. We can see in Fig. 4(a) that |H_Γ^0(f)| takes three possible values, i.e. 0, 0.5, and 1. |H_Γ^0(f)| = 0 when f ∈ B_u (as explained above) or f ∉ B_s (out of the signal frequency range). The frequencies at which the Fourier coefficients are plotted with value |H_Γ^α| = 0.5 correspond to those spectral components of the signal which are estimated from themselves and, jointly, from spectral components separated α = 2f_c (or alternatively α = −2f_c), since both are only corrupted by noise. For their part, the frequencies at which |H_Γ^0(f)| = 1 match the spectral components at which the coefficients H_Γ^{±2f_c}(f) are cancelled (see Fig. 4(b) and 4(c)). These frequencies do not correspond to the spectral band of the interference, but to the resulting frequencies after shifting this band by ±2f_c. Consequently, such spectral components are estimated only from themselves.
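The redundancy exploited by the ±2f_c shifts can be verified numerically. In the sketch below (construction and parameters are my own, not the chapter's), a real AM signal with an amplitude band-limited below f_c is synthesized; its DFT bins around +f_c turn out to be exact copies of the bins 2f_c below, which is precisely what allows a corrupted band to be rebuilt from its shifted twin:

```python
import numpy as np

rng = np.random.default_rng(2)
N, fc_bin, B = 1024, 128, 64
n = np.arange(N)

# Band-limit white noise to |f| < B bins (a symmetric mask keeps the signal real)
mask = np.zeros(N)
mask[:B] = 1.0
mask[-B + 1:] = 1.0
a = np.fft.ifft(mask * np.fft.fft(rng.standard_normal(N))).real

x = a * np.cos(2 * np.pi * fc_bin * n / N)     # AM signal, amplitude band < fc
X = np.fft.fft(x)

# The band around +fc equals the band around -fc, i.e. the bins 2*fc_bin below:
k = np.arange(fc_bin - B + 1, fc_bin + B)      # bins around the carrier
restored = X[k - 2 * fc_bin]                   # copy from the band around -fc
err = np.max(np.abs(X[k] - restored)) / np.max(np.abs(X[k]))
```

Because the amplitude spectrum does not overlap after the ±f_c shifts, the two sidebands carry the same information and `err` is at machine-precision level.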
The preceding example has been simplified in order to obtain comprehensible results on how an LAPTV filter performs, and to introduce intuitively the idea of frequency shifting and filtering. This idea will become clearer in Section 4, when describing the FRESH implementation of LAPTV filters. In our example, no reference to the cyclostationary properties of the interference has been made. The optimal LAPTV filter also exploits the interference cyclostationarity in order to suppress it more effectively. However, if we had considered the cyclostationary properties of the interference, the closed-form expressions for the filter Fourier coefficients would have turned out more complex, which would have prevented us from obtaining intuitive results. Theoretically, the set of cycle frequencies Γ consists of an infinite number of them, which makes it very hard to find a closed-form solution to the design formula in (21). This difficulty can be circumvented by forcing the number of cycle frequencies of the linear estimator h(t, λ) to be finite, at the cost of performance (the MSE increases and the filter is no longer optimal). This strategy will be described along with the FRESH implementation of LAPTV filters in Section 4. But firstly, the expression in (22) is generalized for complex signals in the next section.

Extension of the study to complex signals
Complex cyclostationary processes require up to four real LAPTV filters in order to achieve optimality, that is:
1. to estimate the real part of d(t) from the real part of x(t),
2. to estimate the real part of d(t) from the imaginary part of x(t),
3. to estimate the imaginary part of d(t) from the real part of x(t), and
4. to estimate the imaginary part of d(t) from the imaginary part of x(t).
This solution can be reduced to two complex LAPTV filters whose inputs are x(t) and the complex conjugate of x(t), that is, x*(t). As a consequence, the optimal filter is not formally a linear filter, but a Widely-Linear Almost-Periodically Time-Variant (WLAPTV) filter (Chevalier & Maurice, 1997) (also known as Linear-Conjugate Linear, LCL (Brown, 1987; Gardner, 1993)). Actually, the optimal WLAPTV filter reduces to an LAPTV filter when the observations and the desired signal are jointly circular (Picinbono & Chevalier, 1995). 5 The final output of the WLAPTV filter is obtained by adding together the outputs of the two complex LAPTV filters:

d̂(t) = ∫_{−∞}^{+∞} h_Γ(t, λ) x(λ) dλ + ∫_{−∞}^{+∞} g_Γ(t, λ) x*(λ) dλ

Since the orthogonality principle establishes that the estimation error must be uncorrelated with the input, it applies to both x(t) and x*(t), yielding the linear system: Analogously to Section 2, it can be demonstrated that the condition in (36) is satisfied if both h_Γ(t, λ) and g_Γ(t, λ) are almost-periodic functions, and the linear system in (36) can be reformulated as the design formula (Gardner, 1993): where the set of cycle frequencies Γ is defined in this case as the minimum set comprising A_dx, A_dx*, B_xx and B_xx* which is closed under the addition and subtraction operations (with A_dx* and B_xx* being the sets of cycle frequencies of the cross-correlation functions of the complex conjugate of the input, x*(t), with d(t) and x(t), respectively).
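A tiny numeric experiment (illustrative, not from the chapter) shows why the conjugate input matters: for a noncircular observation x = d + jv with real d, the best strictly linear estimate a·x cannot go below an MSE of 1/2, whereas the widely linear combination (x + x*)/2 recovers d exactly:

```python
import numpy as np

rng = np.random.default_rng(3)
N = 100_000
d = rng.standard_normal(N)                 # real desired signal
x = d + 1j * rng.standard_normal(N)        # noncircular observation

# Strictly linear MMSE estimate y = a x, with a = E[d x*] / E|x|^2
a = np.mean(d * np.conj(x)) / np.mean(np.abs(x) ** 2)
mse_linear = np.mean(np.abs(d - a * x) ** 2)     # ~0.5

# Widely linear estimate also uses x*: here (x + x*)/2 = Re(x) = d exactly
y_wl = 0.5 * (x + np.conj(x))
mse_wl = np.mean(np.abs(d - y_wl) ** 2)          # 0
```

The gain comes entirely from the noncircularity of x; for jointly circular signals the conjugate branch contributes nothing, consistent with the reduction to an LAPTV filter noted above.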
As occurred for real signals in Section 2, finding a closed-form solution to the design formula in (37) may prove too complicated when a large number of cycle frequencies compose the set Γ. In the next section a workaround is proposed, based on the FRESH implementation of LAPTV filters and the use of a reduced set of cycle frequencies, Γ_s ⊂ Γ.

Implementation of WLAPTV filters as FRESH (FREquency-SHift) filters
For simplicity reasons, LAPTV filters are often implemented as FREquency SHift (FRESH) filters (Gameiro, 2000; Gardner, 1993; Gardner & Franks, 1975; Gonçalves & Gameiro, 2002; Loeffler & Burrus, 1978; Ngan et al., 2004; Reed & Hsia, 1990; Zadeh, 1950). FRESH filters consist of a bank of LTI filters whose inputs are frequency-shifted versions of the input signal. In general, the optimum FRESH filter would require an infinite number of LTI filters. Because this is not feasible, the FRESH filters used in practice are sub-optimal, in the sense that they are limited to a finite set of frequency shifters. Any WLAPTV filter can be implemented as a FRESH filter. 6 This result emerges from using in (34) the generalized Fourier series expansion of LAPTV filters. Let h(t, λ) and g(t, λ) be two arbitrary LAPTV filters. Then, each of them can be expanded in a generalized Fourier series, yielding:

d̂(t) = Σ_k ∫ h^{α_k}(t − λ) x(λ) e^{j2πα_k λ} dλ + Σ_p ∫ g^{β_p}(t − λ) x*(λ) e^{j2πβ_p λ} dλ

It can be clearly seen that the output of the WLAPTV filter, d̂(t), is the result of adding together the outputs of the LTI filters h^{α_k}(t) and g^{β_p}(t), whose inputs are frequency-shifted versions of the input signal x(t) and its complex conjugate x*(t), respectively. This is precisely the definition of a FRESH filter. The most difficult problem in the design of sub-optimal FRESH filters concerns the choice of the optimal subset of frequency shifts, Γ_s ⊂ Γ, under some design constraints. For instance, a common design constraint is the maximum number of branches of the FRESH filter. The optimum Γ_s becomes highly dependent on the spectral correlation function of the input signal and can change with time in nonstationary environments.
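The structure just described (frequency shifters feeding a bank of LTI filters whose outputs are summed) can be sketched directly in code; the function name, the FIR filters and the shift values below are illustrative assumptions:

```python
import numpy as np

def fresh_filter(x, shifts, filters, conj_shifts, conj_filters, fs=1.0):
    """Sub-optimal FRESH filter: frequency-shift the input (and its complex
    conjugate), filter each branch with an LTI (FIR) filter, sum the outputs."""
    n = np.arange(len(x))
    y = np.zeros(len(x), dtype=complex)
    for alpha, h in zip(shifts, filters):
        y += np.convolve(x * np.exp(2j * np.pi * alpha * n / fs), h, mode="same")
    for beta, g in zip(conj_shifts, conj_filters):
        y += np.convolve(np.conj(x) * np.exp(2j * np.pi * beta * n / fs), g, mode="same")
    return y
```

With a single branch of shift zero and a one-tap filter, the structure reduces to an ordinary LTI filter; adding branches with nonzero shifts (and conjugate branches) implements the frequency-shift-and-filter operations described above.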
However, with the aim of simplifying the FRESH implementation, Γ_s is usually fixed beforehand, and the common approach to determining the frequency shifts consists in choosing those cycle frequencies at which the spectral cross-correlation functions between d(t) and x(t), or its complex conjugate, exhibit maximum levels (Gardner, 1993; Yeste-Ojeda & Grajal, 2008). Once the set of frequency shifts has been fixed, the design of FRESH filters is much simpler than that of WLAPTV filters, because FRESH filters only use LTI filters. Because of the nonstationary nature of the input (it is almost-cyclostationary), the optimality criterion used in the design of the set of LTI filters is the MTAMSE criterion, in contrast to the MMSE criterion (Gardner, 1993). The resulting FRESH filter is a sub-optimum solution, since it is optimum only within the set of FRESH filters using the same set of frequency shifts. Let us formulate the output of the FRESH filter by using vector notation:

d̂(t) = w†(t) x(t)

where w†(t) stands for the conjugate transpose of vector w(t), and each component of the input vector x(t) represents the input of an LTI filter according to the scheme represented in Fig. 5, with the k-th component of x(t) being a frequency-shifted version of x(t) for the first P branches and of x*(t) for the remaining ones,
where P is the number of branches used for filtering the frequency-shifted versions of x(t), (L − P) is the number of branches used for filtering its complex conjugate x*(t), and L is the total number of branches. Note that the first P cycle frequencies can be repeated in the last (L − P) cycle frequencies, since using a frequency shift for the input x(t) does not exclude it from being used for its complex conjugate. With the aim of computing the MTAMSE-optimal set of LTI filters, a stationarized signal model is applied to both the desired signal d(t) and the input vector x(t), which are jointly almost-cyclostationary processes, and therefore stationarizable (Gardner, 1978). The stationary auto- and cross-correlation functions are obtained by taking the stationary part (time-averaging) of the corresponding nonstationary correlation functions. As a result, the orthogonality principle is formulated as follows: where 0 is the null vector, and the cross-correlation (row) vector R_dx(τ) and the auto-correlation matrix R_xx(τ) are computed by time-averaging the corresponding nonstationary correlation functions. For R_dx(τ) this yields: where the cyclic cross-correlation functions were defined in (15). The matrix R_xx(τ) becomes: where the element in the q-th row and k-th column is: where the cyclic auto-correlation functions were defined in (16). Finally, the orthogonality principle leads directly to the multidimensional Wiener filter, and the frequency response of the set of filters is obtained by taking the Fourier transform in (44): where the Hermitian property of matrix S_xx(f) has been used, and the element of the q-th row and k-th column of matrix S_xx(f) is: It is noteworthy that the expression in (37), which defines the optimal WLAPTV filter, is a generalization of (48) for the case where Γ_s = Γ. In addition, we want to emphasize the main differences between optimal and sub-optimal FRESH filters:
1.
Optimal FRESH filters are direct implementations of optimal WLAPTV filters defined by (37), and generally consist of an infinite number of LTI filters. Contrarily, sub-optimal FRESH filters are limited to a finite set of LTI filters.
2. For jointly almost-cyclostationary input and desired signals, optimal FRESH filters are optimal with respect to any other linear estimator. On the contrary, sub-optimal FRESH filters are defined for a given subset Γ_s and, therefore, they are optimal only in comparison to the rest of the FRESH filters using the same frequency shifts.
3. Optimal FRESH filters minimize the MSE at all times (MMSE criterion). However, sub-optimal FRESH filters only minimize the MSE on average (MTAMSE criterion). This means that another FRESH filter, even by making use of the same set of frequency shifts, could exhibit a lower MSE at specific times.
Finally, the inverse of matrix S_xx(f) may not exist at all frequencies, which would invalidate the expression in (49). However, since its main diagonal represents the power spectral densities of the frequency-shifted versions of the input, this formal problem can be ignored by assuming that a white noise component is always present at the input. At this point, the theoretical background concerning FRESH filters is complete. The following sections focus on the applications of adaptive FRESH filters. Firstly, the introduction of an adaptive algorithm into a FRESH filter is reviewed in the next section.
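In numerical practice, that white-noise assumption is often made explicit by diagonal loading: a small identity term is added to the spectral matrix at each frequency before solving for the filters. A minimal sketch (function name and loading level are illustrative assumptions):

```python
import numpy as np

def loaded_wiener(S_xx, S_dx, eps=1e-6):
    """Solve S_xx(f) W(f) = S_dx(f) at one frequency with diagonal loading,
    i.e. assuming a small white-noise floor at every branch input."""
    L = S_xx.shape[0]
    return np.linalg.solve(S_xx + eps * np.eye(L), S_dx)

# A rank-1 (singular) spectral matrix that plain inversion cannot handle:
S_xx = np.array([[1.0, 1.0], [1.0, 1.0]])
S_dx = np.array([1.0, 1.0])
W = loaded_wiener(S_xx, S_dx, eps=1e-3)
```

For this singular example the loaded solution splits the (fully redundant) two branches evenly, converging to [0.5, 0.5] as eps decreases.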

Adaptive FRESH filters
The main drawback of using FRESH filters is that the phase of the LTI filters (or alternatively the phase of the frequency shifters) must be "synchronized" with the desired signal. Otherwise, the contribution of the different branches can be destructive rather than constructive. For example, the phase of the signal carrier must be known if the frequency shifts are related to the carrier frequency, or the symbol synchronism must be known if the symbol rate is involved in the frequency shifts. As a consequence, the knowledge required for the design of sub-optimal 7 FRESH filters is rarely available beforehand, which leads to the use of adaptive algorithms. Since FRESH filters consist of a set of LTI filters, conventional adaptive algorithms can be directly applied, such as Least-Mean-Square (LMS) or Recursive-Least-Squares (RLS). The general scheme of an adaptive FRESH filter is shown in Fig. 6.

7 Hereinafter, FRESH filters are always limited to a finite set of frequency shifts. Therefore, sub-optimal FRESH filters will be referred to as "optimal" for brevity.

In order to simplify the analysis of the adaptive algorithms, the structure of FRESH filters is particularized to the case of discrete-time signals with the set of LTI filters exhibiting a Finite Impulse Response (FIR filters). After filtering the received signal, x(n), the output of the adaptive FRESH filter is compared to a desired (or reference) signal, d(n), and the error, ε(n), is used by an adaptive algorithm in order to update the filter coefficients. Commonly, the desired signal is either a known sequence (as the training sequence for equalizers), or is obtained from the received signal (as for blind equalizers). Let the output of the FRESH filter be defined as the inner product:

d̂(n) = w†(n) x(n)

where the vectors w(n) and x(n) are the concatenation of the filter and input vectors, respectively: w_i(n) and x_i(n) are the filter and input vectors, respectively, of the i-th branch, where M_i is the length of the i-th filter.
Consequently, the total length of the vectors w(n) and x(n) is M = M_1 + M_2 + … + M_L. Using the results in the previous sections, the optimal set of LTI filters is given by multidimensional Wiener filter theory:

w_o = R_x^{-1} p     (56)

where R_x stands for the time-averaged auto-correlation matrix of the input vector,⁸

R_x = ⟨E{x(n) x^†(n)}⟩     (57)

and where ⟨·⟩ denotes the time-average operator:

⟨f(n)⟩ = lim_{N→∞} (1/(2N+1)) Σ_{n=−N}^{N} f(n)     (58)

In Eq. (56), the vector p represents the time-averaged cross-correlation between the input vector x(n) and d(n):

p = ⟨E{x(n) d^*(n)}⟩     (59)

The correlation functions in (57) and (59) are time-averaged in order to force the LTI characteristic of w_o. The expressions in (56), (57), and (59) allow the classical developments of adaptive algorithms to be followed (Haykin, 2001). Despite using classical algorithms, adaptive FRESH filters naturally exploit the cyclostationary properties of the signals, since the inputs of the LTI filters are frequency-shifted versions of the input of the adaptive FRESH filter.
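As a concrete sketch of the scheme above, the following NumPy function implements a discrete-time FRESH filter with FIR branches adapted by LMS. It is illustrative only: the function name and argument layout are ours, all branch lengths are taken equal for simplicity, and the conventional factor of 2 of the Widrow update is folded into mu.

```python
import numpy as np

def lms_fresh(x, d, alphas, conj_flags, M, mu):
    """Adaptive FRESH filter driven by the LMS algorithm (illustrative sketch).

    x          : complex input signal
    d          : desired (reference) signal
    alphas     : frequency shift of each branch (cycles/sample)
    conj_flags : True where a branch filters the complex conjugate of x
    M          : FIR length of every branch (M_i = M for simplicity)
    mu         : step size (factor of 2 of the Widrow update folded in)
    """
    n = np.arange(len(x))
    # branch inputs: frequency-shifted (optionally conjugated) versions of x
    branches = [(np.conj(x) if c else x) * np.exp(2j * np.pi * a * n)
                for a, c in zip(alphas, conj_flags)]
    w = np.zeros(len(branches) * M, dtype=complex)   # concatenated filter vector
    y = np.zeros(len(x), dtype=complex)
    err = np.zeros(len(x), dtype=complex)
    for k in range(M, len(x)):
        # concatenated input vector [x_1(n); ...; x_L(n)]
        xv = np.concatenate([b[k - M + 1:k + 1][::-1] for b in branches])
        y[k] = np.vdot(w, xv)            # y(n) = w^†(n) x(n)
        err[k] = d[k] - y[k]             # error fed to the adaptive algorithm
        w += mu * xv * np.conj(err[k])   # LMS coefficient update
    return y, err, w
```

For instance, with a single branch, a zero frequency shift and one tap, the structure reduces to an ordinary scalar LMS filter, and its weight converges to 1 when the reference equals the input.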

Blind Adaptive FRESH filters (BAFRESH)
One of the main reasons for using adaptive FRESH filters is their inherent ability to operate blindly (Zhang et al., 1999). Commonly, adaptive algorithms require a reference signal, which is not available in many applications of adaptive filtering. When the adaptive filter works without such a reference signal, we say that it is "blind". Blind Adaptive FRESH (BAFRESH) filters can be obtained directly by using as frequency shifts those cycle frequencies which belong to the cyclic spectrum of the input and satisfy condition (60), for the first P branches, or condition (61), for the rest of the branches (those working with x*(t)). Then, the input signal itself can be used as the reference signal, as represented in Fig. 7, while the adaptive algorithm converges to the same solution as if a "clean" reference signal were used.⁹ Intuitively, conditions (60) and (61) mean that a BAFRESH filter must use only those cycle frequencies (among all the cycle frequencies of the input) at which only the part of the input that is correlated with the desired signal exhibits spectral correlation. It is noteworthy that (60) excludes cycle frequency zero from being used in the branches not using the complex conjugate of the input signal.
Fig. 7. Block diagram of a BAFRESH filter.

The reason is that all stochastic processes exhibit spectral correlation at cycle frequency zero, and therefore (60) could only be satisfied if the desired and the input signals were the same, which would eliminate the need for a filter. On the contrary, many stochastic processes, for instance the circular stochastic processes, do not exhibit spectral cross-correlation with their complex conjugate at cycle frequency zero. This allows condition (61) to be satisfied without implying that d(t) = x(t). When adaptive BAFRESH filters use a suitable set of frequency shifts, the correlation between the inputs of the LTI filters, x_i(n), and the reference signal, x(n), is due only to the signal to be estimated. Thus, the adaptive algorithm will converge to a solution by which the desired signal is estimated at the output of the BAFRESH filter, while any other component present at the input is minimized. This is a powerful characteristic of adaptive FRESH filters, which turns them into a good candidate for those applications where no reference signal is available at the receiver.
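A minimal BAFRESH toy in the same vein (all signals and parameters are invented for illustration): the desired signal is a unit-power amplitude-modulated carrier, which exhibits conjugate spectral correlation at twice its carrier frequency, while circular white noise does not. A single conjugate branch shifted by 2 f_c can therefore use the received signal itself as reference, and the scalar weight converges to the Wiener value 1/(1 + σ_r²) = 0.5.

```python
import numpy as np

rng = np.random.default_rng(1)
N, fc, mu = 20000, 0.3, 0.01
n = np.arange(N)
a = rng.choice([-1.0, 1.0], size=N)                 # toy real amplitude (1 sample/symbol)
s = a * np.exp(2j * np.pi * fc * n)                 # SOI: conjugate-cyclostationary at 2*fc
r = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)  # circular noise
x = s + r

# Single conjugate branch: shift x*(n) by 2*fc. The circular noise shows no
# conjugate spectral correlation there, so x(n) itself can serve as reference.
v = np.conj(x) * np.exp(2j * np.pi * 2 * fc * n)
w = 0j
y = np.zeros(N, dtype=complex)
for k in range(N):
    y[k] = np.conj(w) * v[k]     # signal estimate
    e = x[k] - y[k]              # blind error: reference d(n) = x(n)
    w += mu * v[k] * np.conj(e)  # scalar LMS update
```

After convergence the output y(n) is an estimate of the SOI, even though the SOI was never available as a clean reference.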

Adaptive FRESH filters for cycle-frequency error compensation
This section deals with the problem of uncertainties in the periodicities of LAPTV filters.¹⁰ The purpose of this section is to analyze the nonstationary behavior of adaptive FRESH filters in order to mitigate this problem. Since they exploit the spectral correlation of signals at specific values of frequency separation, or cycle frequency, LAPTV filters are severely affected by errors in such cycle frequencies. These errors may be due to imperfect knowledge of the cyclic spectrum, or to natural effects such as the Doppler effect. In such a case, it will be demonstrated that the filter can become totally useless. The reason is that a high enough error (in comparison to the observation time) produces a loss of coherence among the outputs of the FRESH branches. This causes destructive rather than constructive interference when the outputs of the branches are added together. Although this is a well-known problem, it has been exhaustively dealt with only in the field of beamforming and direction finding, mainly regarding the MUSIC (Schmidt, 1986) and SCORE (Agee et al., 1990) algorithms. Besides describing the problem, some solutions have been proposed. Schell and Gardner stated that the degradation in performance increases with the product of the observation time and the error in cycle frequency. As a solution, they proposed two alternatives: using multiple cycle frequencies in order to increase robustness, or adaptively estimating the cycle frequencies by maximizing the magnitude of the FFT of the lag product of the observations. Lee and Lee proposed two procedures for estimating the cycle-frequency offset (Lee & Lee, 1999), both based on the maximization of the largest eigenvalue of a product of sample autocorrelation-related matrices.
In (Lee et al., 2000), cycle frequencies are estimated iteratively through a gradient descent algorithm which maximizes the output power of an adaptive beamformer, assuming that the initial cycle frequencies are close enough to the true ones. A similar approach can be found in (Jianhui et al., 2006), where the conjugate gradient algorithm is used instead of gradient descent, after proper modifications of the cost function. A new approach is described in (Zhang et al., 2004), where the robustness of the beamformer is achieved by minimizing a cost function which is averaged over a range of cycle frequencies, instead of trying to correct the cycle-frequency errors. The approach used in this section is rather different from the works cited above. The purpose of this section is to explore an additional advantage of using adaptive filters for FRESH filtering: their capability of working in the presence of errors in the frequency shifts. This is possible because the adaptive filter is able to track these errors by updating the coefficients of the LTI filters in a cyclic manner. As a result, the adaptive filter at each branch of the FRESH filter behaves as a Linear Periodically Time-Variant (LPTV) filter, rather than converging to an LTI filter. This ability is strongly conditioned by the rate of convergence of the adaptive algorithm. For slow convergence, the outputs of the branches with frequency-shift errors are cancelled. As the convergence accelerates, the adaptive filters of those branches with frequency-shift errors behave as LPTV filters in order to compensate the errors. These subjects shall be dealt with later on, but firstly let us highlight the problem of cycle-frequency errors in FRESH filtering.

Effect of errors in the frequency shifts
The scheme of the adaptive FRESH filter considered in this section has been shown in Fig. 6. The problem with an error in the frequency shifts of the FRESH filter is that the correlation between the desired signal and the corresponding filter input vanishes. For a nonadaptive FRESH filter, the error can be tolerated only during a limited observation time. At each branch, the contribution to the signal estimate is constructive for an interval shorter than the inverse of the error of its frequency shift. Therefore, the observation time tolerated by the whole filter is determined by the inverse of the largest error in the frequency shifts.¹¹ For an infinite observation time, the optimal LTI filter is null for all branches with an error in their frequency shifts, as we shall demonstrate in this section. This is due to the discrete nature of the cyclic spectrum, which means that the cyclic correlation function is different from zero only at a countable set of cycle-frequency values. Thus, an error in a cycle frequency implies that the signal at the input of the corresponding filter is correlated neither with d(n), nor with the inputs of the rest of the filters.
Let us use the definitions introduced in the previous section, specifically (52) to (59). For convenience, let R_x^{ij}, with i ≤ j, stand for the time-averaged auto-correlation matrix of the input vectors from the i-th to the j-th branch:

R_x^{ij} = ⟨E{x^{ij}(n) x^{ij†}(n)}⟩

Similarly, the vector x^{ij}(n) stands for the concatenation of the input vectors from the i-th to the j-th branch:

x^{ij}(n) = [x_i^T(n), …, x_j^T(n)]^T

Let us assume that there is an error in the frequency shift used for the first branch. Then, the time-averaged cross-correlation between x_1(n) and the input vectors of the rest of the branches is zero. Moreover, the time-averaged cross-correlation of x_1(n) with the desired signal, d(n), is also zero. Thus, when there is an error in the frequency shift of the first branch, the optimal set of LTI filters results in

w_o = [ R_x^{11}  0 ; 0  R_x^{2L} ]^{-1} [ 0 ; p^{2L} ] = [ 0 ; (R_x^{2L})^{-1} p^{2L} ]

where p^{2L} = ⟨E{x^{2L}(n) d^*(n)}⟩. Therefore, the optimal LTI filter of the first branch and its output are null. In addition, the optimal set of filters for the other branches is equivalent to not using the first branch. In short, what is happening is that the first frequency shift does not belong to the set Γ. When all branches exhibit errors in their frequency shifts, the optimal filter vector is zero and the FRESH filter becomes useless. It is noteworthy that the cancellation of the branches with an error in their frequency shift is caused by the LTI characteristic of the filters, which has been forced by time-averaging R_x and p. In the next section, it is shown that the errors in the frequency shifts can be compensated by allowing the filters to be time-variant, specifically, LPTV.
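This cancellation mechanism is easy to verify numerically. In the toy below (signal and numbers are ours), the time-averaged cross-correlation between a conjugate branch input and the desired signal is large when the frequency shift is exact, and collapses to zero when a small error is introduced, which is precisely why the time-averaged Wiener solution nulls that branch.

```python
import numpy as np

rng = np.random.default_rng(2)
N, fc = 100000, 0.3
n = np.arange(N)
a = rng.choice([-1.0, 1.0], size=N)
s = a * np.exp(2j * np.pi * fc * n)   # desired signal: conjugate cycle frequency at 2*fc
x = s + (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

def branch_corr(delta):
    """Time-averaged cross-correlation between a conjugate branch input
    (frequency shift 2*fc + delta) and the desired signal."""
    v = np.conj(x) * np.exp(2j * np.pi * (2 * fc + delta) * n)
    return np.mean(v * np.conj(s))
```

Here abs(branch_corr(0.0)) is close to 1, while abs(branch_corr(1e-3)) is close to 0: with the error, the branch input is (in time average) uncorrelated with the desired signal.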

LPTV solution
When there exist errors in the frequency shifts, an adaptive FRESH filter can compensate for them. In order to understand the behavior of the adaptive filter, let us assume that these errors are known. One direct solution to this problem is to substitute the corresponding LTI filters by LPTV filters consisting of a correcting frequency shift followed by an LTI filter, as shown in Fig. 8. Formally, the output of the optimal FRESH filter defined in (56), when there are no errors in the frequency shifts, is

y(n) = Σ_{i=1}^{L} w_oi^† x_i(n)     (68)

where w_oi stands for the optimal LTI filter of the i-th branch, and the x_i(n) are the outputs of the frequency shifters without error. The LPTV solution can be obtained by including the errors in the frequency shifts:

y(n) = w̃_o^†(n) x̃(n)     (69)

where w̃_o(n) is the set of LPTV filters, and x̃(n) is the input vector to these filters (the output of the first frequency shifters in Fig. 8).
By defining the block-diagonal matrix

Φ = diag( e^{j2πΔ_1} I_{M_1}, …, e^{j2πΔ_L} I_{M_L} )     (70)

where I_N denotes the N × N identity matrix and Δ_i is the frequency-shift error of the i-th branch, the vectors w̃_o(n) and x̃(n) can be expressed in the form

w̃_o(n) = Φ^n w_o,   x̃(n) = Φ^n x(n)

Additionally, the optimal set of linear time-variant filters is obtained by minimization of the instantaneous MSE. This optimal set (whose input vector is x̃(n)) is (Van Trees, 1968):

w̃_o(n) = R_x̃^{-1}(n) p_x̃(n)     (71)

Therefore, since (69) is verified, a sufficient condition to obtain (69) from (71) is that

R_x̃(n) = Φ^n R_x Φ^{−n}     (74)
p_x̃(n) = Φ^n p     (75)

where R_x and p have been defined in (57) and (59), respectively. Note that (74) and (75) are not the true auto-correlation matrix and cross-correlation vector between x̃(n) and d(n). On the contrary, they are based on the stationarized signal model of the input vector x(n) and the desired signal d(n), so that the LTI characteristic of the filters after the second frequency shifters in Fig. 8 is forced. An additional motivation for introducing (74) and (75) is that these expressions will serve us to develop the convergence of the LMS algorithm mathematically. We have chosen the Least-Mean-Squares (LMS) adaptive algorithm for the analysis because of its mathematical simplicity, which allows an analytical treatment of the problem. Nonetheless, any adaptive algorithm (for instance, Recursive Least Squares (RLS) (Haykin, 2001)) can be used instead to compensate cycle-frequency errors. The approach herein is the same as in (Widrow et al., 1976), where the MSE (Mean-Squared Error) of the adaptive filter is decomposed into the error due to noise in the gradient estimation, and the error due to lag between the optimal and the adaptive filters. It is straightforward that the LPTV solution in (69) is equivalent to the optimal FRESH filter in the absence of errors. In practice, the errors in the frequency shifts are unknown. Thus, there is no correcting frequency shifter (the second one in Fig. 8), and the filters of each branch must work with the input vector x̃(n).
The advantage of using an adaptive filter is that it can produce the periodic variations of the filters by updating their coefficients. This is only possible for a fast enough rate of convergence. Otherwise, the adaptive filter tends to the LTI optimal solution defined by (56), after substituting the input vector x(n) with x̃(n) in (57) and (59). This solution implies the cancellation of the branches with errors in their frequency shifts. This issue is addressed in the next section.
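The LPTV behavior can be observed directly in a scalar toy example (parameters are ours): with a frequency-shift error delta and a sufficiently fast step size, the single adaptive weight never settles; it rotates at rate delta, implementing the correcting frequency shifter of Fig. 8 by itself.

```python
import numpy as np

rng = np.random.default_rng(3)
N, fc, delta, mu = 50000, 0.3, 1e-4, 0.05
n = np.arange(N)
a = rng.choice([-1.0, 1.0], size=N)
s = a * np.exp(2j * np.pi * fc * n)
x = s + (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)

# conjugate branch whose frequency shift carries an error delta
v = np.conj(x) * np.exp(2j * np.pi * (2 * fc + delta) * n)
w = 0j
w_hist = np.zeros(N, dtype=complex)
for k in range(N):
    e = x[k] - np.conj(w) * v[k]    # blind scheme: the input is the reference
    w += mu * v[k] * np.conj(e)
    w_hist[k] = w
# With this (fast) step size the weight keeps rotating at rate delta instead of
# settling: the adaptive branch behaves as an LPTV filter that undoes the error.
```

Projecting the weight trajectory onto a complex exponential of rate delta recovers a large constant component, while the plain time average of the weight is nearly zero: the solution is periodic, not LTI.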

Excess MSE of the LMS algorithm in the presence of errors in the frequency shifts
In this section, we develop an analytical expression for the time-averaged MSE at steady-state of adaptive FRESH filters using the LMS algorithm (LMS-FRESH). The objective is to obtain an expression that accounts for errors in the frequency shifts of the LMS-FRESH filter. Basically, the LMS algorithm consists in updating the filter weights in the direction of a gradient estimate of the MSE function, and can be thought of as a stochastic version of the steepest descent algorithm (Widrow et al., 1976). Let w̃(n) be the concatenation vector defining the LMS-FRESH filter. The updating algorithm can be formulated as follows:

w̃(n+1) = w̃(n) + 2µ x̃(n) ε^*(n)     (76)

where µ is the convergence factor (also known as the step-size parameter). The LMS-FRESH filter exhibits an excess MSE with respect to the set of filters w̃_o(n) presented in the previous section as the LPTV solution. This excess MSE can be computed as follows (Widrow et al., 1976):

ξ_e(n) = E{ [w̃(n) − w̃_o(n)]^† R_x̃(n) [w̃(n) − w̃_o(n)] }     (77)

Let us separate the excess MSE into two terms: the excess MSE due to gradient noise, and the excess MSE due to lag error. The separation is possible by considering the weight-error vector as the sum of two terms:

w̃(n) − w̃_o(n) = ũ(n) + ṽ(n)     (78)

The first term of the sum in the right-hand side, ũ(n) = w̃(n) − E{w̃(n)}, produces the excess MSE due to gradient noise, which is related to the gradient estimation process. The second term, ṽ(n) = E{w̃(n)} − w̃_o(n), produces the excess MSE due to lag, which quantifies the error due to the fact that the adaptive filter cannot follow the variations of w̃_o(n) with time. By substituting (78) in (77), we obtain:

ξ_e(n) = E{ũ^†(n) R_x̃(n) ũ(n)} + E{ṽ^†(n) R_x̃(n) ṽ(n)} + 2Re{ E{ũ^†(n) R_x̃(n) ṽ(n)} }     (79)

It can be easily shown (from the definitions of ũ(n) and ṽ(n)) that the last term of (79) is zero.
Thus, the total time-averaged MSE at steady-state of the LMS-FRESH filter can be expressed as the sum of three terms:

ξ = ξ_min + ξ_∇ + ξ_lag     (80)

where ξ_min is the minimum time-averaged MSE, attained by the optimal FRESH filter, ξ_∇ is the time-averaged excess MSE due to gradient noise, and ξ_lag is the time-averaged excess MSE due to lag error. The minimum time-averaged MSE results from multivariate Wiener theory (Gardner, 1993):

ξ_min = σ_d^2 − p^† R_x^{-1} p     (81)

where σ_d^2 = E{|d(n)|^2} is the mean power of the reference signal. At each instant, the excess MSE due to gradient noise can be computed as follows (Widrow et al., 1976):

ξ_∇(n) = E{ũ^†(n) R_x̃(n) ũ(n)}     (82)

Assuming that the LMS adaptive filter converges to an LPTV filter at each branch (as shown in Fig. 8), the time-averaged excess MSE at steady-state can be approximated by¹²

ξ_∇ ≈ µ ξ_min tr(R_x)     (83)

where R_x has been defined in (57). From (83), ξ_∇ can be reduced as much as desired by decreasing µ. However, the convergence factor µ controls the rate of convergence of the LMS algorithm, which becomes faster as µ increases. Additionally, it will be shown next that the excess MSE due to lag error increases as µ decreases. The excess MSE due to lag error becomes apparent when the adaptive filter cannot follow the variations of w̃_o(n) (the LPTV solution) with time. This excess MSE is a function of the weight-error vector ṽ(n), which represents the instantaneous difference between the expected value of the adaptive filter and w̃_o(n):

ξ_lag(n) = ṽ^†(n) R_x̃(n) ṽ(n)     (84)

The mathematical development of the time-averaged excess MSE at steady-state due to the lag error can be found in the appendix of (Yeste-Ojeda & Grajal, 2010), and yields

ξ_lag = v^† R_x v     (85)

where v, which does not depend on time, is the weight-error vector at steady-state. Under the same conditions that were assumed for (83) (see footnote 12), it is given by

v = 2µ [Φ − (I_M − 2µ R_x)]^{-1} p − w_o     (86)
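The decomposition (80) can be sanity-checked numerically in the simplest possible setting: a scalar filter with no frequency-shift error, so that the lag term vanishes and the steady-state MSE is the minimum MSE plus a gradient-noise excess that grows with µ. The scenario below is invented for illustration and only the qualitative trend is checked (the sketch folds the factor of 2 of the update into mu, so the constants of (83) do not apply literally).

```python
import numpy as np

def steady_mse(mu, seed=0, N=60000):
    """Steady-state MSE of a scalar LMS filter with no shift error, so only
    xi_min plus gradient noise remains (all values invented for illustration)."""
    rng = np.random.default_rng(seed)
    v = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)   # E|v|^2 = 1
    noise = 0.3 * (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
    d = 0.7 * v + noise            # xi_min = E|noise|^2 = 0.09
    w = 0j
    err2 = np.zeros(N)
    for k in range(N):
        e = d[k] - np.conj(w) * v[k]
        w += mu * v[k] * np.conj(e)   # factor of 2 folded into mu
        err2[k] = abs(e) ** 2
    return err2[N // 2:].mean()       # time-averaged MSE after convergence
```

A small step size yields an MSE close to ξ_min = 0.09; a large one pays a visible gradient-noise penalty.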

Analysis of the results
The study of the previous expressions allows us to infer several conclusions:
1. The lag error, ξ_lag, tends to zero as µ increases. This is easily seen by taking the limit of (86) as µ → ∞:

lim_{µ→∞} v = 0     (87)

This result indicates that, for a high enough convergence factor, the LMS-FRESH filter is fast enough to follow the variations of the LPTV solution, and therefore the excess MSE due to lag error is zero. However, µ cannot be made as large as desired because then the adaptive algorithm may not converge, as we shall discuss next.
2. The lag error is also zero when there are no errors in the frequency shifts. In this case, Φ = I_M, which substituted in (86) gives v = 0. This result is coherent with the concept of lag error: in the absence of errors, the LPTV solution to which the LMS-FRESH filter converges becomes LTI. Consequently, the existence of a lag is not possible.
3. On the contrary, the lag error is maximum when µ tends to zero (and there exist frequency-shift errors). Taking the limit of (86) as µ → 0 yields

lim_{µ→0} v = −w_o     (88)

which implies that the LMS-FRESH filter converges to the null vector in a mean sense. Moreover, the filter converges to the null vector also in a mean-square sense. This result is derived from the fact that the gradient noise tends to zero as µ tends to zero (see (83)), which entails that ũ(n) = 0, and therefore the LMS-FRESH filter vector matches its expected value, i.e. w̃(n) = E{w̃(n)} = 0. As a result, the outputs of the branches with an error in their frequency shift are null. The time-averaged excess MSE due to lag error is obtained by substituting (88) in (85), which yields

ξ_lag = w_o^† R_x w_o     (89)

Thus, the total time-averaged MSE at steady-state for µ tending to zero is

ξ = ξ_min + w_o^† R_x w_o = σ_d^2     (90)

This result could also be obtained by considering that, since the output of the adaptive FRESH filter is null, the error signal is ε(n) = d(n).
4. The optimal convergence factor, which minimizes the time-averaged MSE of the LMS algorithm, is obtained from (80), (83) and (85), which yields

µ_o = arg min_µ { µ ξ_min tr(R_x) + v^†(µ) R_x v(µ) }     (91)

where the vector v is defined by (86).
As regards the convergence of the LMS-FRESH filter, a sufficient condition is that the poles of all the components of the vector v(z) are located inside the unit circle (Yeste-Ojeda & Grajal, 2010). This is equivalent to the condition that the maximum absolute value of the eigenvalues of the matrix

Φ^† (I_M − 2µ R_x)     (92)

is not greater than 1. When a single branch is used for the LMS-FRESH filter, it is straightforward that the eigenvalues of (92) are e^{−j2πΔ}(1 − 2µλ_k), with k = 1, …, M, where Δ is the frequency-shift error of the single branch and the λ_k are the eigenvalues of the matrix R_x. As a result, the condition for the convergence of the LMS algorithm is

0 < µ < 1/λ_max     (93)

where λ_max is the maximum eigenvalue of the matrix R_x. This is the same condition as for stationary environments (Haykin, 2001; Widrow et al., 1976).¹³
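This convergence condition can be exercised with a small deterministic toy (our own construction): QPSK input samples have |x(n)|² = 1, so λ_max = 1 exactly, and with the Widrow-style update w ← w + 2µ x ε* the weight recursion converges for µ < 1/λ_max and diverges beyond it.

```python
import numpy as np

def lms_weight_error(mu, n_iter=400, seed=0):
    """Final weight error of a scalar LMS run; QPSK input gives lambda_max = 1."""
    rng = np.random.default_rng(seed)
    v = np.exp(0.5j * np.pi * rng.integers(0, 4, size=n_iter))   # |v(n)|^2 = 1 exactly
    d = 0.7 * v                                                  # target weight: 0.7
    w = 0j
    for k in range(n_iter):
        e = d[k] - np.conj(w) * v[k]
        w += 2 * mu * v[k] * np.conj(e)   # Widrow-style update: stable iff mu < 1/lambda_max
    return abs(w - 0.7)
```

With |v(n)|² constant, the weight error contracts by exactly (1 − 2µ) per step, so µ = 0.45 converges while µ = 1.2 blows up.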

Application
Finally, let us quantify the performance of adaptive FRESH filters for cycle-frequency error compensation through a case study where the received signal consists of a BPSK signal embedded in stationary white Gaussian noise. The receiver incorporates a FRESH filter with the purpose of extracting the BPSK signal from the noise. Two frequency shifts related to the cyclic spectrum of the BPSK signal are used: the inverse of its symbol interval, α_1 = 1/T_s, and twice its carrier frequency, α_2 = 2 f_c. These two cycle frequencies have been chosen, according to the common approach mentioned in Section 4, because they exhibit the highest values of spectral correlation for a BPSK modulation. The desired signal for the adaptive algorithm is the received signal, d(n) = x(n), following the blind scheme described in Section 5.1 (see Fig. 9). In all cases, the carrier frequency and symbol interval have been set to f_c = 0.3 (normalized to the sampling frequency) and T_s = 32 samples. The noise power is set to σ_r^2 = E{|r(n)|^2} = 1, and the SNR is fixed to SNR = 0 dB, defined as the ratio between the mean power of the signal and that of the noise: SNR = E{|s(n)|^2}/σ_r^2. Both w̃_1(n) and w̃_2(n) are FIR filters with M_i = 64 taps. In order to clarify some concepts, let us first consider the case where the FRESH filter is composed of only the branch associated with the frequency shift α_2 = 2 f_c. The total time-averaged MSE at steady-state (hereinafter referred to simply as "the MSE") is shown in Fig. 10 as a function of the convergence factor, and for different values of the frequency-shift error, Δ_2, which are also normalized to the sampling frequency. The MSE of the LMS-FRESH filter when Δ_2 = 0 is plotted with a dashed thick line as a point of reference. This is the lower bound of the MSE attainable by the LMS-FRESH filter, and converges to the MSE of the optimal FRESH filter, ξ_min, as µ decreases.
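A reduced version of this single-branch case study (α_2 = 2 f_c) can be reproduced in a few lines. This is a sketch, not the chapter's simulator: the FIR length is shortened to M = 8 taps, the observation is shorter, and the factor of 2 of the LMS update is folded into mu.

```python
import numpy as np

def run_case(delta2, mu, seed=0, N=40000, Ts=32, fc=0.3, M=8):
    """Blind single-branch LMS-FRESH run; returns time-averaged |error|^2."""
    rng = np.random.default_rng(seed)
    n = np.arange(N)
    sym = rng.choice([-1.0, 1.0], size=N // Ts + 1)
    s = sym[n // Ts] * np.exp(2j * np.pi * fc * n)     # unit-power BPSK (SNR = 0 dB)
    r = (rng.standard_normal(N) + 1j * rng.standard_normal(N)) / np.sqrt(2)
    x = s + r
    # conjugate branch with (possibly erroneous) shift 2*fc + delta2
    v = np.conj(x) * np.exp(2j * np.pi * (2 * fc + delta2) * n)
    w = np.zeros(M, dtype=complex)
    err2 = np.zeros(N)
    for k in range(M, N):
        xv = v[k - M + 1:k + 1][::-1]
        e = x[k] - np.vdot(w, xv)        # blind scheme: reference is the input itself
        w += mu * xv * np.conj(e)
        err2[k] = abs(e) ** 2
    return err2[N // 2:].mean()          # steady-state time-averaged MSE
```

With these toy numbers, a slow filter facing a shift error drifts toward the blind MSE floor σ_d² ≈ 2, while removing the error, or speeding up the adaptation so that the branch can act as an LPTV filter, lowers the MSE, mirroring the behavior shown in Fig. 10.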
Note that the minimum MSE is always greater than the noise power, i.e. ξ_min > 1, since even if the signal were perfectly estimated (ŝ(n) = s(n)), the error signal would match the noise (ε(n) = r(n)). When there exist errors in the frequency shift, the MSE is always higher than the lower bound. Since the lower bound includes the gradient noise effect, the excess MSE over the lower bound is due to the lag error only, and varies from zero (for high µ) up to w_o^† R_x w_o = σ_d^2 − ξ_min (see (89)) for small µ. Thus, the total MSE tends to σ_d^2 as µ decreases. The curves also show the dependence of the MSE on the error in the frequency shift: at any given µ, the MSE increases with Δ_2. Therefore, in order to obtain a small excess MSE due to lag, a faster rate of convergence (bigger µ) is required as Δ_2 increases. An additional result shown in Fig. 10 is that, for high enough errors in the frequency shifts, the MSE does not reach ξ_min at any value of µ. In other words, it is impossible to simultaneously reduce the MSE terms due to the lag error and the gradient noise. In such a case, the optimal µ locates at an intermediate value, as a result of the trade-off between reducing the lag error and the gradient noise. In practice, the error of the frequency shifts is commonly unknown. Then, the convergence factor should be chosen as a trade-off between the maximum cycle-frequency error which can be compensated by the adaptive filter and the increase of the excess MSE due to gradient noise.

Fig. 11. Analytical time-averaged MSE as a function of the convergence factor, when using only the branch corresponding to frequency shift α = 1/T_s. The thick dashed line corresponds to Δ_1 = 0.

Similar results can be extracted from Fig. 11, which corresponds to the case where only the branch of the FRESH filter with frequency shift α = 1/T_s has been used. The main difference between Figures 10 and 11 is that the minimum MSE, ξ_min, is bigger when only α = 1/T_s is used. The reason is that the spectral correlation level exhibited by a BPSK signal is smaller at α = 1/T_s than at α = 2 f_c. Furthermore, it can be seen that σ_d^2 is not an upper bound for the MSE (as could be thought from Fig. 10), but the limit of the MSE as µ tends to zero. For the case illustrated in Fig. 12, the two branches shown in Fig. 9 are used, but there is uncertainty only in the symbol rate; that is, the error in the frequency shift related to the carrier frequency, Δ_2, is always zero.
The curves show that the improvement in the MSE is not very significant when the error in the symbol rate is compensated. This occurs when the contribution to the signal estimate of one branch is much more significant than that of the other, which is mainly caused by the different spectral correlation level exhibited by a BPSK signal at cycle frequencies α = 1/T_s and α = 2 f_c. As a consequence, the minimum MSE when only the second branch is used (ξ_min,2) and when both branches are used (ξ_min,12) are very similar. Furthermore, as µ tends to zero the MSE tends to ξ_min,2 instead of σ_d^2, since there is no uncertainty in the carrier frequency.

Fig. 13. Time-averaged MSE as a function of the convergence factor, when using the two branches. Δ_1 = 10⁻⁵ in all cases, except for the thick dashed line, which corresponds to Δ_1 = 0 and Δ_2 = 0. Simulation results are represented by cross marks.

Fig. 13 shows the MSE when there exist errors in both frequency shifts. The curves correspond to an error in the symbol rate Δ_1 = 10⁻⁵ and different errors in the carrier frequency. In this case, the minimum MSE is attained at a convergence factor such that the lag error of both branches is compensated. However, compensating the error in the carrier frequency is critical, while compensating the error in the symbol rate only produces a slight improvement. This conclusion can be deduced from the curve for Δ_2 = 10⁻⁷. Since Δ_1 = 10⁻⁵, a convergence factor close to µ = 10⁻⁴ or higher is required in order to compensate the frequency-shift error at the first branch (see Fig. 11). However, the excess MSE due to lag error is small for µ = 10⁻⁶ or higher, where only the frequency-shift error at the second branch can be compensated. The analytical expression for the MSE has been obtained under some assumptions (Gaussianity, small-error conditions, and input uncorrelated over time) which in practice, and also in the case studies presented, are only approximations.
Therefore, in order to check its accuracy, Figures 12 and 13 also include the MSE obtained by simulation, which is represented by the lines with cross marks: 200 realizations have been used for the ensemble averages in order to obtain the instantaneous MSE. Then, the instantaneous MSE has been time-averaged over 200,000 samples after the convergence of the LMS. The agreement between theoretical and simulation results is excellent in all cases, which confirms the validity of the assumptions made.

Table 1. Range of the random variables used in the simulation with a BPSK interference. The INR is defined analogously to the SNR. f_c2 is normalized to the sampling rate. Δ_1 and Δ_2 are the absolute cycle-frequency errors of the branches with cycle frequencies α_1 = 1/T_s and α_2 = 2 f_c, respectively.

Parameter                               Min. value   Max. value
INR (interference-to-noise ratio)       -20 dB       20 dB
T_s2 (interference symbol interval)     1 sample     64 samples
f_c2 (interference carrier frequency)   0            1
Δ_1                                     −10⁻⁵        10⁻⁵
Δ_2                                     −10⁻⁵        10⁻⁵
A last case study is presented with a twofold purpose: 1) to demonstrate that the adaptive algorithm compensates cycle-frequency errors also in the presence of interferences; and 2) to demonstrate that an adaptive algorithm different from the LMS exhibits a similar behavior. For this reason, we shall use the RLS algorithm in this last case study, despite the lack of an analytical expression for the MSE. The scheme for the adaptive FRESH filter presented in Fig. 9 is valid also in this case study, but a BPSK interference with random parameters has been added to the input, along with the BPSK signal and the noise defined for the previous case studies. The errors in the frequency shifts of the FRESH filter are also random. All random variables are constant within each trial, but change from trial to trial according to a uniform distribution within the ranges gathered in Table 1. As regards the RLS algorithm, it is exponentially weighted with a convergence factor λ = 1 − µ, where µ is known as the forgetting rate (Haykin, 2001). Also, in this case study we have used the BPSK signal as the reference signal, so that the MSE does not depend on the random interference power. Fig. 14 shows the MSE obtained by simulation (using 100 trials for computing the ensemble averages and 20,000 samples for the time averages). The results show the capability of the RLS algorithm to compensate errors in the frequency shifts in the presence of the interference. Otherwise, the RLS algorithm would have converged to a FRESH filter which cancels its output, and the obtained MSE would have been equal to σ_d^2. Analogously to the previous case studies using the LMS algorithm, the RLS algorithm cannot compensate errors in the frequency shifts if the forgetting rate is too small (slow convergence). In such a case, the adaptive FRESH filter tends to cancel its output and the error is σ_d^2. For moderate forgetting rates, the cycle-frequency errors are compensated, and the MSE approaches ξ_min.
On the contrary, an excessively high forgetting rate increases the MSE as a consequence of the gradient noise. In summary, the possible existence of cycle-frequency errors in LAPTV filtering is a serious problem that must be managed. When using the non-adaptive FRESH implementation, the minimum time-averaged MSE is obtained when the branches with uncertainties in their frequency shifts are not used (or, equivalently, when their output is cancelled). On the contrary, adaptive FRESH filters can work in the presence of errors in the frequency shifts. In such a case, the adaptive FRESH filter behaves as an LPTV filter in those branches with an error in their frequency shift. In order to be effective, the rate of convergence of the adaptive algorithm must be carefully chosen. The optimal rate of convergence results from the trade-off between decreasing the excess MSE due to gradient noise (slow rate of convergence) and decreasing the excess MSE due to lag error (fast rate of convergence). The analytical expressions in this section allow one to compute the optimal convergence factor and the time-averaged MSE at steady-state of the LMS-FRESH filter.

Adaptive FRESH filters for interference rejection in signal detection
Finally, we end this chapter with the description of a real application example of adaptive FRESH filters. The system developed in this section, previously presented in (Yeste-Ojeda & Grajal, 2008), finds application in fields such as electronic warfare or spectrum management. The increasing saturation of the radio-electric spectrum has led modern radar and communication systems to employ complex signal waveforms capable of operating in the presence of interferences. On the other hand, interception systems face serious problems in detecting unknown or partially known signals if the signal of interest (SOI) is hidden by temporally and spectrally overlapping interferences. Therefore, hostile transmitters can take advantage of this fact and reduce their probability of interception by following this strategy. The problem dealt with in this section consists in detecting an unknown signal hidden by an interference with known parameters when only one sensor is available. The SOI and the interference overlap simultaneously in time and frequency, with the interference being more powerful than the SOI. As the reader will have guessed, the solution adopted to solve this problem is to exploit the cyclostationary properties of the interference in order to extract it from the received signal. This procedure requires knowing, at least, the cyclic spectrum of the interference. On the other hand, much work has been done for the case where only the SOI cyclic spectrum is known, mainly with the aim of robust signal detection and estimation (Gardner, 1993; Mirbagheri et al., 2006; Zhang et al., 1999). The approach in this section is significantly different from those works, since it exploits the cyclostationarity of the signals to be removed, i.e. the interference. Our approach is based on a "divide and conquer" strategy, by which the global problem is split into two sub-problems: a signal separation problem and a signal detection one.
Firstly, the interference is separated by means of a blind adaptive FRESH filter. Once the interference has been removed, the detection is performed on the residual, which is assumed to consist of only the noise and the SOI.

The interference rejection system
This system is mainly based on a blind adaptive FRESH filter, following the scheme represented in Fig. 15. This scheme is the same as the one represented in Fig. 7, but rearranged so that the error signal is the output of the system, that is, the input of the detection system. Let the received signal be composed of the SOI, s(t), the interferences, u(t), and a stationary white noise, r(t):

x(t) = s(t) + u(t) + r(t)
where s(t), u(t), and r(t) are assumed to be independent from one another. If the interferences were perfectly estimated, that is û(t) = u(t), the estimation error would consist only of the SOI plus noise, ε(t) = s(t) + r(t). Thus, the input of the detector must be the estimation error, ε(t), as indicated by Fig. 15. The next step in the design of the FRESH filter is to choose a suitable set of frequency shifts. Since the adaptive FRESH filter is blind, these cycle frequencies must belong uniquely to the cyclic spectrum of the signal to be estimated, i.e. the interference. We shall follow the strategy mentioned in Section 4, consisting in taking those cycle frequencies of the interferences which exhibit the highest spectral correlation level, with the exception of cycle frequency zero for the branches not using the complex conjugate of the input. Since the cyclic spectrum of the SOI is unknown, we can only assume that the chosen cycle frequencies do not belong to its cyclic spectrum (which would be very infrequent in practice, as the interference and the SOI are transmitted by different sources). Given the set of frequency shifts used, Γ_s, the adaptive FRESH filter will converge to the optimum set of LTI filters, whose frequency response has been defined in (49). In our case, the desired signal is the interference, which is the only signal correlated with the inputs of the LTI filters (the scheme for the whole interception system is shown in Fig. 16). As a result, the frequency response of the optimum set of LTI filters becomes:

W_opt(f) = [S_ss(f) + S_uu(f) + η_r I_L]^(-1) S_uu(f)

where η_r is the noise Power Spectral Density (PSD), and I_L is the identity matrix of size L × L.
In addition, the definition of the spectral correlation vector S_uu(f) is analogous to (50), while the spectral autocorrelation matrices S_ss(f) and S_uu(f) are defined analogously to (51).
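To make the branch structure of the interference rejection filter concrete, the following is a minimal sketch (not the authors' implementation) of a FRESH filter producing the estimation error ε(t) = x(t) − û(t): each branch frequency-shifts the input (or its complex conjugate) and filters it with an FIR filter, and the branch outputs are summed to form the interference estimate. The function and argument names are hypothetical:

```python
import cmath

def fresh_error(x, shifts, filters, fs, conj_branches):
    """Sketch of a FRESH interference-rejection filter (hypothetical helper).

    x             : list of complex input samples
    shifts        : list of frequency shifts alpha_l (Hz), one per branch
    filters       : list of FIR coefficient lists, one per branch
    fs            : sampling frequency (Hz)
    conj_branches : list of bools; True if the branch filters conj(x)
    Returns the estimation error eps(t) = x(t) - u_hat(t), which consists of
    the SOI plus noise once the interference estimate u_hat has converged.
    """
    n_samples = len(x)
    u_hat = [0j] * n_samples
    for alpha, h, use_conj in zip(shifts, filters, conj_branches):
        # Frequency-shift the (possibly conjugated) input by alpha
        branch = [
            (x[n].conjugate() if use_conj else x[n])
            * cmath.exp(2j * cmath.pi * alpha * n / fs)
            for n in range(n_samples)
        ]
        # LTI (FIR) filtering of the shifted branch, accumulated into u_hat
        for n in range(n_samples):
            acc = 0j
            for k, hk in enumerate(h):
                if n - k >= 0:
                    acc += hk * branch[n - k]
            u_hat[n] += acc
    # Error signal: received signal minus interference estimate
    return [xn - un for xn, un in zip(x, u_hat)]
```

In the actual system, the FIR coefficients of each branch would be updated by the adaptive algorithm; here they are simply given, which suffices to illustrate the signal path from x(t) to ε(t).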

Intermediate whitening filter and FRESH filter training
Although the separation and detection problems have been considered independently, the noise PSD at the output of the interference rejection system depends on the adaptive FRESH filter and may change with time during the convergence of the adaptive algorithm. As a result, it is advisable to use an intermediate stage in order to effectively decouple the detection problem from the FRESH filter. A whitening filter prior to the detector (as indicated in Fig. 16) can do this task, so that the detector can be designed for a white noise. The optimum detector to be used depends on the specific SOI to be detected. The whitening filter is defined as the inverse of the square root of the noise PSD at the output of the separator. This output consists of the white noise at the input plus the noise leakage at the output of the FRESH filter, which can be expressed as a function of the FRESH filter vector. Therefore, the frequency response of the whitening filter becomes:

H_w(f) = [η_r (1 + w^H(f) w(f))]^(-1/2)

where w(f) denotes the vector of frequency responses of the LTI filters composing the FRESH filter. Moreover, using an adaptive algorithm requires some training time, during which the detector decisions could be wrong. The scheme proposed in Fig. 16 consists of a branch for training the interference rejection system which incorporates the adaptive FRESH filter, and an independent non-adaptive FRESH filter which effectively performs the signal separation task and is the one connected to the detector. The coefficients of the non-adaptive FRESH filter are fixed once the adaptive one has converged. This requires some mechanism for controlling the beginning and ending of a learning interval, in which the detector decisions are considered wrong. Such a mechanism might be based on monitoring the power variations of the interference estimate and of the SOI-plus-noise estimate, updating the non-adaptive FRESH system when both powers stabilize.
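The learning-interval control mentioned above can be sketched as a simple stabilization test on the two monitored powers. The function name, window length and tolerance below are illustrative assumptions, not part of the original system:

```python
def powers_stabilized(p_interf, p_resid, window=8, tol=0.05):
    """Hypothetical test for ending the learning interval of Fig. 16.

    p_interf, p_resid : lists of successive power estimates of the
                        interference estimate u_hat(t) and of the residual
                        (SOI plus noise) estimate, respectively
    window            : number of most recent estimates examined
    tol               : maximum allowed relative spread within the window
    Returns True when both power sequences have settled, i.e. when the
    non-adaptive FRESH filter may copy the adapted coefficients.
    """
    def settled(p):
        if len(p) < window:
            return False
        recent = p[-window:]
        lo, hi = min(recent), max(recent)
        # Settled if the spread over the window is small relative to the level
        return hi > 0 and (hi - lo) / hi <= tol
    return settled(p_interf) and settled(p_resid)
```

In practice the power estimates would be short-term averages computed at the outputs of the adaptive branch; any comparable change-detection rule would serve the same purpose.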

Application
Finally, let us quantify the performance of the described interception system through a case study whose scenario is represented in Fig. 17. In that scenario, the SOI is a Continuous-Wave Linear Frequency Modulated (CW-LFM) radar signal which is being transmitted by a hostile equipment and is, therefore, unknown. This signal is intentionally transmitted in the spectral band of a known Direct-Sequence Binary Phase Shift Keying (DS-BPSK) spread-spectrum communication signal, with the aim of hindering its detection. The SOI sweeps a total bandwidth of BW = 20 MHz with a sweeping time T_p = 0.5 ms, while the DS-BPSK interfering signal employs a chip rate 1/T_c = 10.23 Mcps. The PSDs of the SOI, the interference and the noise are shown in Fig. 18. The structure of the FRESH filter is designed based on a previous study of the performance of the sub-optimal FRESH filters. As a result, the interference rejection system incorporates an adaptive FRESH filter consisting of 5 branches, each one using a FIR filter of 1024 coefficients. The frequency shifts of the FRESH filter correspond to the 5 cycle frequencies of the DS-BPSK interference with the highest spectral correlation level, which are {±1/T_c} for the input x(t), and {2f_c, 2f_c ± 1/T_c} for the complex conjugate of the input x*(t), where f_c is the carrier frequency and T_c is the chip duration of the DS-BPSK interference. The adaptive algorithm used is Fast-Block Least Mean Squares (FB-LMS) (Haykin, 2001), with a convergence factor µ = 1. Next, we present some simulation results on the interception system performance obtained after the training interval has finished. Firstly, the improvement in the SIR obtained at the output of the interference rejection system is represented in Fig. 19, as a function of the input Signal-to-Interference-plus-Noise Ratio (SINR), for several values of the Interference-to-Noise Ratio (INR). Two main results are revealed in Fig. 19:
1.
The SIR improvement tends to zero as the SINR increases, because the SOI is powerful enough to mask the interference, which makes the FRESH filter fail to estimate the interference. This is corroborated by Fig. 20, where the simulated INR at the output of the interference rejection system is shown. For a high enough input SINR, the output INR matches the input INR, indicating that the FRESH filter cannot extract any of the interference power. This allows us to define a "useful region", where the interference rejection system obtains a significant SIR improvement. In our example, the useful region comprises an input SINR ≤ 0 dB.
2. The SIR improvement saturates for high input INR values, which is shown by the fact that the SIR improvement for INR = 40 dB matches the curve obtained for INR = 30 dB. This limitation is due to the adaptive algorithm and does not appear when the optimal FRESH filter is used instead.
3. In addition, although it seems logical that the output INR increases with the input one, Fig. 20 reveals that this is not true for low input SINR and INR. The reason is that the interference becomes masked by noise if the input INR is low enough (e.g. INR = 0 dB). As the input INR increases, the interference rejection system increases its effectiveness, so that the output INR decreases. That is why the output INR is lower for INR = 10 dB and INR = 20 dB than for INR = 0 dB.
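For illustration, the frequency-shift set of the case study above can be generated directly from the interference parameters. The function name is hypothetical, and the carrier frequency used in the test is an arbitrary assumption (the text does not specify f_c):

```python
def dsbpsk_fresh_shifts(fc, tc):
    """Frequency shifts used by the FRESH filter of the case study.

    For a DS-BPSK interference with carrier frequency fc (Hz) and chip
    duration tc (s), the five cycle frequencies with the highest spectral
    correlation level are +/-1/tc for the input x(t), and
    {2 fc, 2 fc +/- 1/tc} for the complex conjugate input x*(t)
    (cycle frequency zero is excluded from the non-conjugate branches).
    Returns (shifts_for_x, shifts_for_x_conj) in Hz.
    """
    chip_rate = 1.0 / tc
    shifts_x = [chip_rate, -chip_rate]
    shifts_x_conj = [2 * fc, 2 * fc + chip_rate, 2 * fc - chip_rate]
    return shifts_x, shifts_x_conj
```

With the chip rate of the case study, 1/T_c = 10.23 Mcps, the non-conjugate branches are shifted by ±10.23 MHz, and the conjugate branches by 2f_c and 2f_c ± 10.23 MHz.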
The output INR can provide an idea about the degradation of the probability of false alarm (P_FA) of the detector (the probability of detection when the SOI is not present). However, each particular detector is affected in a different way. We shall illustrate this fact with two different detectors. The first one is an energy detector (ED), which consists of comparing the total energy to a detection threshold set to attain a desired P_FA. The second one is a detector based on atomic decomposition (AD), such as that proposed in (López Risueño et al., 2003). This detector exhibits an excellent performance for LFM signals, such as the SOI considered in our case study. The AD-based detector can be thought of as a bank of correlators or matched filters, each one matched to a chirplet (a signal with Gaussian envelope and LFM), whose maximum output is compared to a detection threshold. Both detectors process the signal by blocks, making a decision every 1024 samples. Fig. 21 shows the degraded P_FA of the whole interception system in the presence of the interference, when the detection threshold has been determined for an input consisting of only noise. The curves clearly show the different dependence of the P_FA of both detectors on the input INR. The energy detector exhibits a higher sensitivity to the interference than the AD-based one. Thus, the AD-based detector visibly degrades its P_FA only for an input INR = 40 dB and above. On the contrary, ED always exhibits a degraded P_FA in the presence of the interference due to the energy excess, which is proportional to the output INR shown in Fig. 20. Finally, we end this application example by showing the sensitivity improvement of the interception system obtained thanks to the interference rejection system. The sensitivity is defined as the SNR at the input of the interception system required to attain an objective probability of detection (P_D = 90%), for a given probability of false alarm (P_FA = 10^-6).
Thus, the detection threshold takes a different value depending on the input INR, so that the P_FA holds for all the INR values. The simulation results are gathered in Table 2.
Table 2. Sensitivity (SNR, dB) for the CW-LFM signal as a function of the INR. P_FA = 10^-6, P_D = 90%.
The sensitivities obtained both with and without the interference rejection system are shown, together with the sensitivity improvement obtained by using it. As can be seen, the improvement is very significant and proves the benefit of using the interference rejection system. Moreover, the improvement is higher for increasing input INR. However, there is still a sensitivity degradation as the INR increases, due to an increase in the detection threshold and/or the distortion produced by the interference rejection system on the SOI because of the signal leakage at the FRESH output (the latter only applies to AD, since ED is insensitive to the signal waveform). And, as expected, the AD-based detector outperforms ED (López Risueño et al., 2003).
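As a minimal sketch of the baseline energy detector used above (block length 1024 in the case study), the following processes the residual in non-overlapping blocks and compares the block energy to a threshold. The threshold is assumed to have been precomputed offline for the desired P_FA on a noise-only input; the function name is illustrative:

```python
def energy_detector(samples, threshold, block=1024):
    """Block energy detector sketch.

    samples   : list of complex samples at the whitening filter output
    threshold : detection threshold, set offline to attain the desired P_FA
                for a noise-only input
    block     : number of samples per decision (1024 in the case study)
    Returns one boolean decision per complete block: True means the block
    energy exceeds the threshold, i.e. a detection is declared.
    """
    decisions = []
    for start in range(0, len(samples) - block + 1, block):
        # Total energy of the current block
        energy = sum(abs(s) ** 2 for s in samples[start:start + block])
        decisions.append(energy > threshold)
    return decisions
```

The AD-based detector would replace the per-block energy with the maximum output of a bank of chirplet correlators, but the block-wise thresholding structure is the same.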

Summary
This chapter has described the theory of adaptive FRESH filtering. FRESH filters represent a comprehensible implementation of LAPTV filters, which are the optimum filters for estimating or extracting signal information when the signals are modelled as almost-cyclostationary stochastic processes. When dealing with complex signals, both the signal and its complex conjugate must be filtered, resulting in WLAPTV filters. The knowledge required for the design of optimal FRESH filters is rarely available beforehand in practice, which leads to the incorporation of an adaptive scheme. Since FRESH filters consist of a set of LTI filters, classical adaptive algorithms can be applied by simply using the stationarized versions of the inputs of these LTI filters, which are obtained by time-averaging their statistics. Then, the optimal set of LTI filters is given by the multidimensional Wiener filter theory. In addition, thanks to their properties of signal separation in the cycle frequency domain, adaptive FRESH filters can operate blindly, that is, without a reference of the desired signal, by simply using as frequency shifts the cycle frequencies belonging uniquely to the desired signal cyclic spectrum. Furthermore, adaptive FRESH filters have the advantage of being able to compensate for small errors in their frequency shifts, which can be present in practice due to non-ideal effects such as Doppler or the stability of the oscillators. In this case, the convergence rate of the adaptive algorithm must be carefully chosen in order to simultaneously minimize the gradient noise and the lag errors. The chapter is finally closed by an application example, in which an adaptive FRESH filter is used to suppress known interferences in a system for the interception of unknown hidden signals, demonstrating the potential of adaptive FRESH filters in this field of application.