A Gibbs Sampling Algorithm to Estimate the Occurrence of Ozone Exceedances in Mexico City



Introduction
Inhabitants of many cities around the world suffer the effects of high levels of ozone air pollution. When the ozone concentration stays above the threshold of 0.11 parts per million (0.11ppm) for a period of one hour or more, a very sensitive part of the population (e.g. elderly and newborn) in that environment may experience serious health deterioration (Bell et al., 2004, 2005, 2007; Gauderman et al., 2004; Loomis et al., 1996; O'Neill et al., 2004; WHO, 2006). Therefore, being able to predict when exceedances of the 0.11ppm threshold (or any threshold above this value) may occur is a very important issue. Depending on the country, different thresholds may be considered to declare environmental alerts. The US Environmental Protection Agency (US-EPA) has established as its standard that the three-year average of the fourth highest daily maximum 8-hour average ozone concentration measured at each monitor within an area over each year must not exceed 0.075ppm (see EPA, 2008). In the case of Mexico, the standard is 0.11ppm and an individual should not be exposed, on average, for a period of an hour or more (NOM, 2002). In Mexico City, when the ozone concentration surpasses the threshold 0.2ppm an emergency alert is issued and measures are taken to prevent population exposure to the pollutant (see http://www.sma.df.gob.mx). It is possible to find in the literature a large number of works focussing on the study of the behaviour of air pollutants in general. Among the ones related to ozone air pollution we may quote Horowitz (1980), Roberts (1979a, 1979b) and Smith (1989) considering extreme value theory; Flaum et al. (1996), Gouveia & Fletcher (2000), Kumar et al. (2010), Lanfredi & Macchiato (1997) and Pan & Chen (2008) using time series analysis; Álvarez et al. (2005) and Austin & Tran (1999) considering Markov chain models; Achcar et al. (2008b, 2009a, 2009b) using stochastic volatility models; and Huerta & Sansó (2005) with an analysis of the behaviour of the maximum measurements of ozone with an application to the data from Mexico City. Davis (2008) and Zavala et al. (2009) present studies that analyse the impact on air quality of the driving restrictions imposed in Mexico City. When the aim is to estimate the number of times that a given environmental standard is surpassed, Raftery (1989) and Smith (1989) use time homogeneous Poisson processes to model the occurrence of such exceedances.

Some non-homogeneous Poisson models
There are several works that use non-homogeneous Poisson models to study problems in a wide variety of research areas (see for example Achcar, 2001; Achcar et al., 1998; Ramírez-Cid & Achcar, 1999; Wilson & Costello, 2005, among others). As in Achcar et al. (2008a, 2010, 2011), here non-homogeneous Poisson processes are used as models to estimate the probability that an ozone measurement surpasses a given threshold a certain number of times in a time interval of interest. However, here we focus mainly on the behaviour of the mean of the Poisson process. The description of the models considered here is given as follows.
Let N_t ≥ 0 record the number of times that a given threshold for a given pollutant is surpassed in the time interval [0, t), t ≥ 0. We assume that N = {N_t : t ≥ 0} follows a Poisson process which is non-homogeneous in time with rate function λ(t) > 0, t ≥ 0. Therefore, at time t, the random variable N_t has a Poisson distribution with rate function λ(t) and mean function m(t) = ∫_0^t λ(s) ds, i.e., for k = 0, 1, ... and s, t ≥ 0,

P(N_{t+s} − N_s = k) = exp{−[m(t + s) − m(s)]} [m(t + s) − m(s)]^k / k!.     (1)

The function λ(t), t ≥ 0, will depend on some parameters that need to be estimated. Once those parameters are estimated, the resulting form for λ(t) may be substituted in (1) and the corresponding probabilities may be calculated. Additionally, since the rate function is related to the rate at which a surpassing occurs, we are able to obtain information on that as well.
There are several forms that λ(t), t ≥ 0, may assume. Based on the information provided by previous works, we are going to consider three forms for the rate function. They are given by the Weibull (W), Musa-Okumoto (MO) (Musa & Okumoto, 1984), and a generalised form of the Goel-Okumoto (GGO) (Goel & Okumoto, 1978) models, i.e.,

λ^(W)(t) = (α/σ) (t/σ)^(α−1),     (2)

λ^(MO)(t) = β / (α + t),     (3)

λ^(GGO)(t) = α β γ t^(γ−1) exp(−β t^γ),     (4)

respectively. Hence, the vectors of parameters to be estimated are θ = (α, σ) ∈ R²₊ in (2), θ = (α, β) ∈ R²₊ in (3), and θ = (α, β, γ) ∈ R³₊ in (4). The problem is reduced to estimating the vector of parameters that produces the behaviour of λ(t), t ≥ 0, that most adequately fits the situation presented by the ozone measurements of the Mexico City monitoring network. In order to do so, a Bayesian approach will be used. Remark. Note that, even though the rate functions (2), (3) and (4) have been considered in past works, the interest here is to analyse the fitting for more recent data. That is justified by the fact that the year 2000 was the point in time when the last of a series of important measures to control pollution emission was taken by the environmental authorities in Mexico City. Previous preliminary analysis shows that the behaviour of the daily maximum ozone measurements has changed.
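The three rate functions and their corresponding mean functions can be written down directly. The Python sketch below is an illustration (not the chapter's own code); it follows the parameterisations of the Weibull, Musa-Okumoto and generalised Goel-Okumoto models and the mean functions quoted later in the text.

```python
import math

# Weibull: lambda(t) = (alpha/sigma)(t/sigma)^(alpha-1), m(t) = (t/sigma)^alpha
def rate_w(t, alpha, sigma):
    return (alpha / sigma) * (t / sigma) ** (alpha - 1)

def mean_w(t, alpha, sigma):
    return (t / sigma) ** alpha

# Musa-Okumoto: lambda(t) = beta/(alpha + t), m(t) = beta log(1 + t/alpha)
def rate_mo(t, alpha, beta):
    return beta / (alpha + t)

def mean_mo(t, alpha, beta):
    return beta * math.log(1 + t / alpha)

# Generalised Goel-Okumoto: lambda(t) = alpha beta gamma t^(gamma-1) e^(-beta t^gamma),
# m(t) = alpha [1 - e^(-beta t^gamma)]
def rate_ggo(t, alpha, beta, gamma):
    return alpha * beta * gamma * t ** (gamma - 1) * math.exp(-beta * t ** gamma)

def mean_ggo(t, alpha, beta, gamma):
    return alpha * (1 - math.exp(-beta * t ** gamma))
```

In each case m is the integral of the corresponding rate, which can be checked numerically by differencing m over a small interval.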

A Bayesian formulation of the model
In this section a Bayesian formulation of the model is presented, in which the likelihood function follows a Poisson model with a time-dependent rate function. The aim is to estimate the parameters describing this rate function. Therefore, assume that there are K ≥ 1 days in which a given threshold is surpassed by the daily maximum ozone measurement. Let d_1, d_2, ..., d_K be those days and denote by D = {d_1, d_2, ..., d_K} the set of observed data. There is a natural relationship among posterior and prior distributions and the likelihood function of the model, which is given by

P(θ | D) ∝ L(D | θ) P(θ),     (5)

where P(θ | D) and P(θ) are the posterior and the prior distributions of the vector of parameters θ, respectively, and L(D | θ) is the likelihood function of the model. The components of (5) are given as follows.

1. The likelihood function. Since a non-homogeneous Poisson model has been assumed, we have that the likelihood function will take the form (see for instance Cox & Lewis, 1966; Lawless, 1986)

L(D | θ) = [ ∏_{i=1}^{K} λ(d_i) ] exp[−m(d_K)],     (6)

where θ is the vector of parameters. The likelihood function takes different forms depending on the expression for the rate function λ(t), t ≥ 0. Hence, taking that into account we have the following.
In the case of the Weibull function given by (2), we have that m^(W)(t) = (t/σ)^α, and therefore,

L^(W)(D | θ) = (α/σ^α)^K [ ∏_{i=1}^{K} d_i^(α−1) ] exp[−(d_K/σ)^α].     (7)

When considering the Musa-Okumoto function given by (3), we have that m^(MO)(t) = β log(1 + t/α), and then,

L^(MO)(D | θ) = β^K [ ∏_{i=1}^{K} (α + d_i)^(−1) ] (1 + d_K/α)^(−β).     (8)

In the case of the generalised Goel-Okumoto function given by (4), we have that the mean is m^(GGO)(t) = α [1 − exp(−β t^γ)], and therefore,

L^(GGO)(D | θ) = (αβγ)^K [ ∏_{i=1}^{K} d_i^(γ−1) ] exp[ −β ∑_{i=1}^{K} d_i^γ ] exp{ −α [1 − exp(−β d_K^γ)] }.     (9)

2. The prior distributions. Assuming prior independence among the parameters of each model considered, we have the following prior distributions. In the case of the Weibull intensity (2), in a first instance we consider, for all regions and parameters, uniform prior distributions defined on appropriate intervals. In a second instance, we take Beta(a_1, b_1) and Gamma(a_2, b_2) prior distributions for α and σ, respectively. (Here we denote by Beta(a, b) and Gamma(c, d) the Beta and Gamma distributions with means a/(a + b) and c/d, and variances ab/[(a + b)²(a + b + 1)] and c/d², respectively.) When considering the intensity function (3) we take uniform prior distributions, with appropriate hyperparameters, for all parameters and regions. In the case of the intensity function (4) we assume that α and β have uniform prior distributions (with appropriate hyperparameters) for all regions. In the case of the parameter γ we have γ ∼ Gamma(a_3, b_3) for all regions.
The hyperparameters a_i and b_i, for i = 1, 2, 3, are considered to be known and will be specified later. They will be such that either we have non-informative prior distributions or, using prior information from experts, we have informative prior distributions for the parameters of each model.
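As an illustration of the likelihood just described, the log-likelihood under the Weibull rate can be coded directly from sum_i log λ(d_i) − m(d_K). This is a minimal Python sketch, not the chapter's programme, and the exceedance days below are made up.

```python
import math

def loglik_weibull(days, alpha, sigma):
    """Log-likelihood of ordered exceedance days d_1 <= ... <= d_K under the
    NHPP with Weibull rate: sum_i log lambda(d_i) - m(d_K)."""
    d_k = days[-1]
    log_rate = sum(math.log(alpha / sigma) + (alpha - 1) * math.log(d / sigma)
                   for d in days)
    return log_rate - (d_k / sigma) ** alpha

# Toy data: exceedances on hypothetical days 40, 120 and 300
ll = loglik_weibull([40.0, 120.0, 300.0], 0.8, 50.0)
```

For α = 1 the rate is constant at 1/σ and the expression reduces to the familiar homogeneous Poisson log-likelihood, which makes a convenient sanity check.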

Air Quality - Models and Applications, www.intechopen.com

3. The posterior distributions. The posterior distributions for the models considered here have the following forms. In the case of the Weibull intensity (2), we have that

P(θ | D) ∝ P(α) P(σ) (α/σ^α)^K [ ∏_{i=1}^{K} d_i^(α−1) ] exp[−(d_K/σ)^α],     (10)

where P(α) and P(σ) are the prior densities of α and σ. Remark. In the case of uniform prior distributions for α and σ, the posterior distribution is proportional to the likelihood function on the support of the priors. When considering λ(t) given by (3), we have that

P(θ | D) ∝ P(α) P(β) β^K [ ∏_{i=1}^{K} (α + d_i)^(−1) ] (1 + d_K/α)^(−β).     (11)

In the case of λ(t) given by (4) we have

P(θ | D) ∝ P(α) P(β) γ^(a_3−1) exp(−b_3 γ) (αβγ)^K [ ∏_{i=1}^{K} d_i^(γ−1) ] exp[ −β ∑_{i=1}^{K} d_i^γ ] exp{ −α [1 − exp(−β d_K^γ)] }.     (12)

Conditional marginal posterior distributions, a Gibbs sampling and a Metropolis-Hastings algorithm
In order to simulate values from the joint posterior distribution in each model, we use a combination of Gibbs and Metropolis-Hastings type algorithms (see Metropolis et al., 1953; Hastings, 1970; Gelfand & Smith, 1990). The estimation of the parameters in each model will be made through a sample obtained from their complete marginal conditional posterior distributions using a Metropolis-Hastings algorithm. The values obtained are used in the next step of the Gibbs sampling. The complete marginal conditional posterior distributions in each model are given as follows.
1. In the case of the Weibull intensity (2) and complete posterior distribution given by (10), we have that

P(α | σ, D) ∝ P(α) ψ_1(α, σ) and P(σ | α, D) ∝ P(σ) ψ_2(α, σ),

where

ψ_1(α, σ) = (α/σ^α)^K [ ∏_{i=1}^{K} d_i^(α−1) ] exp[−(d_K/σ)^α] and ψ_2(α, σ) = σ^(−Kα) exp[−(d_K/σ)^α].

Remark. In the case of uniform prior distributions for α and σ we have that P(α | σ, D) ∝ ψ_1(α, σ) and P(σ | α, D) ∝ ψ_2(α, σ).

2. When considering the intensity function given by (3) we have that

P(α | β, D) ∝ P(α) ψ_3(α, β) and P(β | α, D) ∝ P(β) ψ_4(α, β),

where

ψ_3(α, β) = [ ∏_{i=1}^{K} (α + d_i)^(−1) ] (1 + d_K/α)^(−β) and ψ_4(α, β) = β^K (1 + d_K/α)^(−β).

3. In the case of the intensity function (4) we have

P(α | β, γ, D) ∝ P(α) ψ_5(α, β, γ), P(β | α, γ, D) ∝ P(β) ψ_6(α, β, γ) and P(γ | α, β, D) ∝ P(γ) ψ_7(α, β, γ),

where

ψ_5(α, β, γ) = α^K exp{ −α [1 − exp(−β d_K^γ)] },

ψ_6(α, β, γ) = β^K exp{ −β ∑_{i=1}^{K} d_i^γ − α [1 − exp(−β d_K^γ)] },

ψ_7(α, β, γ) = γ^K [ ∏_{i=1}^{K} d_i^(γ−1) ] exp{ −β ∑_{i=1}^{K} d_i^γ − α [1 − exp(−β d_K^γ)] }.

Note that from the forms of most of the marginal conditional distributions obtained here it is not straightforward to obtain sampled values of the parameters directly from them. Hence, we are going to obtain a sample by using a Metropolis-Hastings algorithm. We sample the proposed values for the parameters using their respective prior distributions. Therefore, if θ = (θ_1, θ_2, ..., θ_n) is the vector of parameters in the present step of the algorithm, then the acceptance probability in the Metropolis-Hastings algorithm is given by

min{ 1, ψ_l(θ'_j, θ_(−j)) / ψ_l(θ_j, θ_(−j)) },     (13)

where θ' is the proposed vector in which each coordinate was sampled using its prior distribution, θ_(−j) is the vector θ without its jth coordinate, and ψ_l is the appropriate ψ function that appears in the expression for the marginal conditional distributions. (Since each coordinate is proposed from its own prior, the prior and proposal densities cancel and only the ψ ratio remains.) In order to select the model that best explains the behaviour of the data considered here, we use the plots of the Monte Carlo estimates for m(t) based on the simulated samples versus t and the accumulated number of occurrences up to t versus time. If the curves are similar, we have an indication of a good fit of the proposed model to the data. We also consider a Bayesian discrimination method. Hence, from Raftery (1996) we have that the marginal likelihood function of the whole data set D for Model l, l = 1, 2, ..., J, is given by

V_l = ∫ L(D | θ^[l]) P(θ^[l]) dθ^[l],     (14)

where θ^[l] is the vector of parameters for Model l and P(θ^[l]) is the joint prior distribution for θ^[l]. The Bayes factor criterion prefers Model i to Model j if V_j / V_i < 1. A Monte Carlo estimate for the marginal likelihood V_l is given by the harmonic mean

V_l ≈ [ (1/M) ∑_{i=1}^{M} L(D | θ^[l,i])^(−1) ]^(−1),

where M is the size of the simulated Gibbs sample and θ^[l,i], i = 1, 2, ..., M, is the sample obtained when considering Model l, l = 1, 2, ..., J.
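The sampling scheme described above can be sketched in a few lines. The Python code below is an illustration, not the chapter's R programme: it assumes the Weibull model with uniform priors on (0, a_max) and (0, s_max), proposes each coordinate from its prior, and therefore accepts with the ratio of likelihood (ψ) values only.

```python
import math
import random

def loglik_w(days, alpha, sigma):
    """Weibull-rate NHPP log-likelihood: sum_i log lambda(d_i) - m(d_K)."""
    d_k = days[-1]
    return (sum(math.log(alpha / sigma) + (alpha - 1) * math.log(d / sigma)
                for d in days) - (d_k / sigma) ** alpha)

def gibbs_mh_weibull(days, n_iter=2000, a_max=1.0, s_max=100.0, seed=7):
    """Metropolis-within-Gibbs for (alpha, sigma) with uniform priors;
    proposals are drawn from the priors themselves, so the acceptance
    probability reduces to a likelihood ratio."""
    rng = random.Random(seed)
    alpha, sigma = 0.5 * a_max, 0.5 * s_max
    chain = []
    for _ in range(n_iter):
        a_prop = rng.uniform(1e-9, a_max)   # propose alpha from its prior
        if math.log(rng.random()) < loglik_w(days, a_prop, sigma) - loglik_w(days, alpha, sigma):
            alpha = a_prop
        s_prop = rng.uniform(1e-9, s_max)   # propose sigma from its prior
        if math.log(rng.random()) < loglik_w(days, alpha, s_prop) - loglik_w(days, alpha, sigma):
            sigma = s_prop
        chain.append((alpha, sigma))
    return chain

# Hypothetical exceedance days, for illustration only
chain = gibbs_mh_weibull([15.0, 90.0, 210.0, 400.0])
```

Prior proposals keep the acceptance ratio simple, at the cost of potentially low acceptance rates when the prior is diffuse, which is one reason burn-in periods can be long.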

An application to the ozone measurements in Mexico City
In this section we apply the results described in earlier sections to the case of daily maximum ozone measurements from the monitoring network of the Metropolitan Area of Mexico City. The data used in the analysis (obtained from http://www.sma.df.gob.mx/simat/) correspond to seven years (from 01 January 2003 to 31 December 2009) of the daily maximum measurements in each region. The measurements are obtained minute by minute and the averaged hourly result is reported at each station. The daily maximum measurement for a given region is the maximum over all the maximum averaged values recorded hourly during a 24-hour period by each station placed in the region. The seven-year average measurements in regions NE, NW, CE, SE and SW are 0.085, 0.097, 0.098, 0.101 and 0.112ppm, respectively, with standard deviations 0.028, 0.036, 0.036, 0.033 and 0.039ppm. We also have that the threshold 0.17ppm was surpassed on 13, 78, 53, 40 and 143 days in regions NE, NW, CE, SE and SW, respectively. The computational details of the implementation of the computer programme of the algorithm used to estimate the parameters are given as follows. When considering the Weibull rate function, we have used three different initial values for the algorithm generating the values of α and σ. Those initial values were based on the results given by Achcar et al. (2008a) for each specific region. Hence, we have taken one value near the left extreme of the 95% credible interval for the mean, another near the right extreme, and the mean itself. In the case of the rate functions (3) and (4) a similar procedure was considered.
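The data reduction described above can be mimicked in a few lines. This Python sketch is illustrative only, with made-up numbers: it takes the hourly averaged values per station, forms the regional daily maximum, and counts threshold exceedances.

```python
def daily_maximum(hourly_by_station):
    """Regional daily maximum: the maximum over all stations of each
    station's 24 hourly averaged measurements."""
    return max(max(hours) for hours in hourly_by_station)

def count_exceedances(daily_maxima, threshold=0.17):
    """Number of days on which the regional daily maximum surpasses the
    given threshold (0.17ppm in the analysis above)."""
    return sum(1 for x in daily_maxima if x > threshold)

# Hypothetical two-station region over one day (24 hourly values each)
day = [[0.05 + 0.002 * h for h in range(24)],
       [0.06 + 0.003 * h for h in range(24)]]
dmax = daily_maximum(day)
```

Applying count_exceedances to the seven years of regional daily maxima yields the counts K (13, 78, 53, 40 and 143) used later in the estimation.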
Regarding the hyperparameters of the prior distributions when considering the Weibull rate function, we have the following. In the case of the first stage of the sampling procedure (i.e., uniform prior distributions for α and σ), we have that in all regions the parameter α has a U(0,1) prior distribution. The parameter σ has a U(0,10) prior distribution when using data from region CE and a U(0,100) prior distribution for the remaining regions. In the case of non-uniform prior distributions, the values of the hyperparameters may vary from region to region; in particular, the hyperparameters of the Beta prior distribution of α differ among regions. When the GGO rate function is taken into account, we have for all regions that the parameters β and γ have U(0.000001, 0.001) and Gamma(0.5, 1.5) prior distributions, respectively. In the case of the parameter α we have that for regions NE, NW, CE and SE, its prior distribution is U(300, 2000). In the case of region SW we have α with a U(0, 3000) prior distribution. A summary of the estimated quantities of interest is given in Table 3. The values of the Bayes factor for each model and region are given in Table 4. We may see from Table 4 that the best model according to the Bayes factor criterion in each region is the one using the Weibull rate function. In the case of regions NE and CE, the version assuming uniform prior distributions for α and σ is the one with the larger Bayes factor. In the case of the other regions, the version where α has a Beta prior distribution is the one with the larger Bayes factor. Another way of verifying the fitting of a model to the data is to perform graphical analysis. Hence, in Figure 1 we have the plots of the observed and estimated means for each of the models and regions. It is possible to see from Figure 1 that the model that best fits the data varies according to the region. We have that for region NE the estimated mean obtained by using the Weibull rate function with α uniformly distributed is the one that best fits the observed mean. In the case of region NW, we have that the GGO model is the most adequate to represent the behaviour of the observed mean. However, we may notice that towards the end of the observational
period the estimated Weibull mean function (in either formulation) also provides a good approximation. When we look at the observed mean in region CE, we have that the GGO model fits the observed mean quite well until approximately the year 2007 and then it starts to drift away from it. However, the estimated mean using the MO model gives a good approximation from the year 2007 on. The case of region SE is similar to that of region CE. However, the fitting of the estimated mean function using the GGO model ends earlier (around 2006) and the fitting using the MO model is not as good as in the case of region CE. When we consider region SW, we have that the estimated mean function using any of the models provides a very good approximation for the observed mean. The best approximation is the one given by the GGO model, though.
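The Bayes factors reported in Table 4 are ratios of marginal likelihoods of the form (14). One common Monte Carlo estimate from posterior (Gibbs) draws is the harmonic mean of the likelihood values; the Python sketch below (an illustration, computed on the log scale for numerical stability) shows the idea.

```python
import math

def log_marglik_harmonic(logliks):
    """Harmonic-mean estimate of the marginal likelihood V from posterior
    draws: V ~ [ (1/M) sum_i 1/L(D|theta_i) ]^(-1), evaluated on the log
    scale with a log-sum-exp stabilisation."""
    M = len(logliks)
    neg = [-ll for ll in logliks]
    m = max(neg)
    s = sum(math.exp(x - m) for x in neg)
    return -(m + math.log(s / M))

def bayes_factor(logliks_i, logliks_j):
    """Estimated V_i / V_j; values above 1 favour model i."""
    return math.exp(log_marglik_harmonic(logliks_i) - log_marglik_harmonic(logliks_j))
```

The harmonic-mean estimator is simple but can have high variance, which is worth keeping in mind when Bayes factors from different models are close.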

Conclusion
In this work we have considered a Gibbs sampling algorithm to estimate the parameters of the rate function of a non-homogeneous Poisson process. The methodology was applied to some models used to study the behaviour of ozone measurements from the monitoring network of Mexico City. Two criteria were used to select the model that best adjusts to the observed data. One of them was graphical analysis and the other was the Bayes factor. Looking at Table 4 we have that the Weibull rate function is the one producing the largest Bayes factor and therefore, under this criterion, that is the model that should be considered. However, when looking at Figure 1, we have that, depending on the region of the city considered, we might have a better adjustment between the estimated and observed mean number of exceedances if another rate function were considered. We have that the Weibull rate function has the best graphical adjustment only when region NE is considered. In the other regions the best graphical adjustment is obtained either when using the GGO rate function or when using the MO one (see Figure 1, regions NW, CE and SW). There are situations where the GGO rate function adjusts well until a given point in time and afterwards some of the other rate functions considered are the ones that give the best graphical fitting (see Figure 1, regions CE and SE). Looking at Figure 1, we have that perhaps in the case of regions NW, CE and SE one could consider the GGO model with the presence of change-points. Another option is to consider two different parametric forms for the rate function of the Poisson process: one of them explaining the behaviour of the data up to a point in time and another explaining the behaviour afterwards. When comparing the results obtained here with previous works, we have the following. Taking into account the study performed using measurements from 01 January 1998 to 31 December 2004, in all regions the model selected (using the Bayes information criterion and the deviance
information criterion) is the one using the Weibull rate function (see Achcar et al., 2008a). In terms of graphical adjustment, in most of the cases, results were similar to the ones obtained here using the 2003-2009 data. However, when considering regions CE and SE, the fitting of the Weibull rate function is better when using the 1998-2004 data. The values of the parameters α and σ of the Weibull rate function are very different, with the exception of regions NW and SW, where the parameters α are very similar, but the parameter σ presents significant differences. In Achcar et al. (2010), we have that the GGO and MO rate functions were used when considering measurements also taken from 01 January 1998 to 31 December 2004. In that case the regional splitting of the city was not taken into account, but the ozone measurements were the overall measurements for the Metropolitan Area of Mexico City. In that work, the presence of a change-point was assumed and we have that the model selected, using the deviance information criterion, to represent the behaviour of the ozone measurements was the GGO with the presence of a change-point. In terms of graphical fitting we have that the GGO model with a change-point also provides the best fit. When considering the 2003-2009 data and the regional division of the city, we have that the GGO model is not always the one providing the best graphical fitting. Additionally, when using other criteria for selecting the models, the GGO may not be the selected model either, as we can see from Figure 1. In the cases where the GGO rate function is selected as providing the best graphical fitting, we have that, when compared to the results obtained using the 1998-2004 data, the values of the parameters α and β are similar, except in the case of region SW. However, the parameter γ presents very different behaviour (see Achcar et al., 2010). As we can see from Figure 1, sometimes the MO rate function may be used to explain the behaviour of the data in the last
years of the period 2003-2009. This rate function was also considered for the overall measurements during the period 1998-2004. In that case, the graphical fitting was much worse than the fitting obtained when using the 2003-2009 data. We would like to recall that around the year 2000 the last of the major environmental measures taken by the environmental authorities in Mexico was implemented. We also have that from that year on, the daily maximum ozone concentration presents a decreasing tendency. The change in behaviour is captured by the models presented here, where different graphical fittings and values of parameters are detected when considering the 1998-2004 and 2003-2009 data sets. An advantage of using the algorithm, and consequently the programme presented here, is that it may be modified by the user to adapt to each particular case studied. Each step of the algorithm and programme can be monitored and therefore, if there is any problem with the model considered, one can detect precisely where the problem is. Even though in some cases the burn-in period of the algorithm is larger than in previous cases, the overall running time is smaller than when considering some other software packages, such as the ones used by Achcar et al. (2008a, 2010, 2011).

Fig. 1. Accumulated observed and estimated means for all models and all regions. The solid plain line represents the accumulated observed mean. The remaining line types correspond to the accumulated estimated means for the models GGO, MO, Weibull where α has a uniform prior, and Weibull where α has a Beta prior, respectively.

The R functions implementing the Metropolis-Hastings acceptance ratios for the parameters β and γ of the GGO model are given below. (The opening lines of the function for β were truncated in the original listing; its header and first two assignments are reconstructed here following the pattern of the function for γ.)

    aceptar_beta <- function(datos_observados, disum, beta_propuesto,
                             beta_actual, alfa_actual, gamma_actual){
      n <- length(datos_observados)
      dn <- tail(datos_observados, 1)
      term <- disum * (beta_actual - beta_propuesto) +
        alfa_actual * (exp(-beta_propuesto * dn^gamma_actual) -
                       exp(-beta_actual * dn^gamma_actual))
      res <- ((beta_propuesto / beta_actual)^n) * exp(term)
      res
    }

    aceptar_gamma <- function(datos_observados, logsum, gamma_propuesto,
                              gamma_actual, beta_actual, alfa_actual){
      n <- length(datos_observados)
      dn <- tail(datos_observados, 1)
      expsum <- 0
      for(i in 1:n){
        expsum <- expsum + (datos_observados[i]^gamma_actual
                            - datos_observados[i]^gamma_propuesto)
      }
      term <- logsum * (gamma_propuesto - gamma_actual) +
        beta_actual * expsum +
        alfa_actual * (exp(-beta_actual * dn^gamma_propuesto) -
                       exp(-beta_actual * dn^gamma_actual))
      result <- ((gamma_propuesto / gamma_actual)^n) * exp(term)
      result
    }

For all models and regions we consider a burn-in period of 30000 steps. The exceptions are the cases of the Weibull rate function with α having a Beta prior distribution in regions CE, NE and SW, and the case of the GGO rate function in region SW, where we have a burn-in period of 130000 steps. In all cases a sample of size 10500 was used to estimate the parameters of the models. The burn-in period was determined by trace plot analysis as well as by using the Gelman-Rubin test (see Gelman & Rubin, 1992). Remark. In the case of the GGO rate function and region SW, the sampling procedure was as follows: for each iteration producing a value of β we had 20 iterations for α and γ. Hence, we have 20 iterations for α, and the last generated value is used to generate a β. This generated β is used to generate γ, and again we have 20 iterations for γ and the 20th generated value is used to generate α, and so on. This artifice was necessary to obtain convergence of the Gibbs and Metropolis-Hastings algorithms. The estimated mean, standard deviation (indicated by SD) and the 95% credible interval in the case of the Weibull rate function are given in Table 1. Note that in the case of the parameter β its marginal conditional posterior distribution depends on the values of K and d_K, which may vary from region to region. Hence, we have that K = 13, 78, 53, 40 and 143, and d_K = 1927, 2258, 2283, 2282 and 2431, for regions NE, NW, CE, SE and SW, respectively. Table 2 gives the estimated quantities of interest.
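The Gelman-Rubin diagnostic used above to set the burn-in can be computed from parallel chains started at dispersed initial values. A minimal Python sketch of the potential scale reduction factor (an illustration, not the chapter's code) is:

```python
def gelman_rubin(chains):
    """Potential scale reduction factor (Gelman & Rubin, 1992) for several
    equal-length chains of one scalar parameter; values near 1 suggest
    that the burn-in period was long enough."""
    m = len(chains)                # number of chains
    n = len(chains[0])             # length of each chain
    means = [sum(c) / n for c in chains]
    grand = sum(means) / m
    b = n / (m - 1) * sum((mu - grand) ** 2 for mu in means)     # between-chain
    w = sum(sum((x - mu) ** 2 for x in c) / (n - 1)
            for c, mu in zip(chains, means)) / m                 # within-chain
    var_hat = (n - 1) / n * w + b / n
    return (var_hat / w) ** 0.5
```

The three initial values per parameter described earlier would give three chains to feed into this diagnostic.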
Table 1. Posterior mean, standard deviation (indicated by SD) and 95% credible interval of the parameters α and σ when the Weibull rate function is considered. In the case of the MO model, we have that α and β have U(0,700) and U(1,2500) prior distributions, respectively, in all regions.

Table 2. Posterior mean, standard deviation (indicated by SD) and 95% credible interval of the parameters α and β when the MO rate function is considered.

Table 3. Posterior mean, standard deviation (indicated by SD) and 95% credible interval of the parameters α, β and γ when the GGO rate function is considered.

Table 4. Bayes factor for each model and region considered.