A Generalized Mathematical Model in One-Dimension

This research monograph presents a mathematical approach based on stochastic calculus which tackles the "cutting edge" in porous media science and engineering: prediction of dispersivity from the covariance of hydraulic conductivity (velocity). The problem is of great importance for tracer analysis, for enhanced recovery by injection of miscible gases, and for related applications. This book explains a generalised mathematical model and effective numerical methods that may strongly influence stochastic porous media hydrodynamics. The book starts with a general overview of the problem of scale dependence of the dispersion coefficient in porous media. Then a review of the topics from stochastic calculus that are useful for the modelling in the subsequent chapters is succinctly presented. The development of a generalised stochastic solute transport model for any given velocity covariance, without resorting to Fickian assumptions, from laboratory scale to field scale is discussed in detail. The mathematical approaches presented here may be useful for many other problems related to chemical dispersion in porous media.


Introduction
In the previous chapter we derived a stochastic solute transport model (equation (3.2.14)), developed methods to estimate its parameters, and investigated its behaviour numerically. The model shows some promise in characterising solute dispersion at different flow lengths, and there are indications that equation (3.2.14) produces behaviours that would be interpreted as capturing the scale-dependency of dispersivity. However, there are weaknesses in the model, as is evident from Chapter 3. These weaknesses, discussed in the next section, stem from the very assumptions we made in developing the model. One could argue that by relaxing the Fickian assumptions we are complicating the problem quite unnecessarily; but as Chapter 3 and this chapter show, doing so lets us develop new mathematical and computational machinery at a more fundamental level for hydrodynamic dispersion in saturated porous media.
Equation (3.2.14) is based on assuming a covariance kernel for the velocity fluctuations, and its solution depends on solving an integral equation (see equation (3.3.11)). In Chapter 3, the integral equation is solved analytically for the covariance kernel given by equation (3.3.10) to obtain the eigen values and eigen functions, but analytical solutions of integral equations cannot be derived easily for an arbitrary covariance kernel. This limits the flexibility of the SSTM: the choice of a suitable covariance kernel is tied to our ability to solve the corresponding integral equation. Further, we need to solve the SSTM in a much more computationally efficient manner, and estimating dispersivity by always relating it to the deterministic advection-dispersion equation is not quite satisfactory. Therefore, we seek to develop a more general form of equation (3.2.14) in this chapter.

The Development of the Generalized Model
We restate equation (3.2.14) in the differential form (equation (4.2.1)). The number of terms that need to be summed, $m$, has to be determined by considering the contribution each pair of eigen value $\lambda_j$ and corresponding eigen function $f_j$ makes to approximating the covariance.
Instead of solving the integral equation to obtain the eigen functions, we assume the following function to be a generalized eigen function. (We discuss the method to obtain this function for any given kernel in section 4.3.) We define

$$f_j(x) = g_{j0} + g_{j1}\,x + \sum_{k=2}^{p_j+1} g_{jk}\,\exp\!\left(-r_{jk}\,(x - s_{jk})^2\right), \qquad (4.2.3)$$

where $f_j(x)$ is the $j$th eigen function of a given covariance kernel; $g_{j0}$, $g_{j1}$ and $g_{jk}$ are numerical coefficients; $r_{jk}$ and $s_{jk}$ are the coefficients in the exponent, $k$ being an integer index ranging from 2 to $p_j+1$; and $p_j$ is determined by the computational method in section 4.3.
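As an illustration only, here is a minimal sketch (in Python, which we use for all sketches in this chapter; it is not from the original text) of evaluating an eigen function of the form (4.2.3), assuming the Gaussian exponent $\exp(-r_{jk}(x-s_{jk})^2)$ reconstructed above:

```python
import numpy as np

def eigen_function(x, g0, g1, g, r, s):
    """Evaluate f_j(x) = g0 + g1*x + sum_k g[k] * exp(-r[k] * (x - s[k])**2).

    g, r, s are length-p_j arrays holding g_jk, r_jk, s_jk for k = 2 .. p_j + 1.
    """
    x = np.atleast_1d(np.asarray(x, dtype=float))
    # Gaussian basis terms, one row per neuron: exp(-r_k * (x - s_k)^2)
    basis = np.exp(-r[:, None] * (x[None, :] - s[:, None]) ** 2)
    return g0 + g1 * x + g @ basis
```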
The differential operator is defined by equation (4.2.4). Therefore, equation (4.2.5) can be expressed in the form of equation (4.2.9), and we see that, in equation (4.2.9), the coefficients $P_{ij}(x)$ appear. If we recall the premise on which stochastic calculus is discussed in Chapter 2, $C(x,t)$ is a continuous, non-differentiable function, and equation (4.2.1) should be interpreted as an Ito stochastic differential, which essentially means that equation (4.2.1) has to be understood only in its integral form. Therefore, the Ito stochastic product rule is the same as the product rule in standard calculus (Klebaner, 1998), and we employ this fact in the derivations above. Further, we assume that the mean velocity $V(x,t)$ is a continuous, differentiable function, which is a reasonable assumption given that the average velocity in an aquifer situation is based on the hydraulic conductivity, porosity and the pressure gradient across a large enough domain within a much larger total flow length. Therefore, we can write the drift term of equation (4.2.1) in the form of equation (4.2.14). By substituting equations (4.2.14) and (4.2.9) into equation (4.2.1), we obtain the generalized SSTM, equation (4.2.15).

A Computational Approach for Eigen Functions
In this section we discuss the approach to obtain the eigen functions, in the form given by equation (4.2.3), for any given kernel. We calculate the covariance kernel matrix (COV) for any given kernel function; COV can then be decomposed into eigen values and the corresponding eigen vectors using the singular value decomposition method or principal component analysis. This can easily be done using mathematical software. Then we use the eigen vectors to develop eigen functions using neural networks.
Suppose that we have an exponential covariance kernel of the kind discussed in Chapter 3, a decaying function of the separation $|x_1 - x_2|$ with variance parameter $\sigma^2$ and correlation parameter $b$. In equation (4.3.4), COV is the covariance matrix which contains all the variances and covariances; it is a symmetric matrix of size $n \times n$, where $n$ is the number of intervals. The diagonal entries of COV represent variances, where $x_1$ is equal to $x_2$, and the off-diagonal entries represent covariances between any two different discrete points $x_1$ and $x_2$.
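As a minimal sketch (assuming the exponential form $\sigma^2 e^{-b|x_1 - x_2|}$ from Chapter 3; the function name is ours), the discretized covariance matrix can be assembled as follows:

```python
import numpy as np

def covariance_matrix(n, sigma2=1.0, b=0.1, length=1.0):
    """Assemble the n x n covariance matrix COV on a uniform grid over [0, length].

    Entry (i, j) is the kernel q(x_i, x_j) = sigma2 * exp(-b * |x_i - x_j|);
    the diagonal holds the variances (x_1 equal to x_2).
    """
    x = np.linspace(0.0, length, n)
    return sigma2 * np.exp(-b * np.abs(x[:, None] - x[None, :]))
```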
After the covariance matrix is defined, we can transform the COV matrix into a new matrix, COV1, with new scaled variables according to the Karhunen-Loève (KL) theorem. In this new matrix all the variables are independent of each other, each having its own variance; i.e., the covariance between any two new variables is zero. The new matrix can be represented using the Karhunen-Loève theorem as

$$\mathrm{COV1}(x_1, x_2) = \sum_{j=1}^{n} \lambda_j\, \phi_j(x_1)\, \phi_j(x_2),$$

where $n$ is the total number of variables in the new matrix; $\lambda_j$ represents the variance of the $j$th rescaled variable and is also the $j$th eigen value of the COV1 matrix; and $\phi_j(x)$ is the $j$th eigen vector of the COV1 matrix. The number of eigen vectors depends on the number of discrete intervals. This decomposition of the COV1 matrix, which is called the singular value decomposition method, can easily be done using mathematical or statistical software.
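A minimal sketch of this decomposition, continuing the Python example above (for a symmetric positive semi-definite COV, the singular value decomposition coincides with the symmetric eigendecomposition used here):

```python
import numpy as np

COV = covariance_matrix(100)              # from the sketch above
lam, U = np.linalg.eigh(COV)              # eigen values and eigen vectors
lam, U = lam[::-1], U[:, ::-1]            # sort eigen values in descending order
# Karhunen-Loeve reconstruction: COV = sum_j lam[j] * outer(U[:, j], U[:, j])
assert np.allclose(COV, (U * lam) @ U.T)
```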
Once we have the eigen vectors, the next step is to develop suitable neural networks to represent, or mimic, these eigen vectors. In fact, it is not necessary to simulate all the eigen vectors; the number of neural networks is decided by the number of eigen values which are significant in the KL representation. In some situations, for example, we may have 100 eigen values in the KL representation but only 4 significant ones. Then we create just four networks to simulate the four eigen vectors which correspond to the most significant eigen values. The number of significant eigen values is decided based on the following condition:

$$\sum_{i=1}^{k} R[i] \geq Th,$$

where $R[i]$ represents the contribution of the $i$th eigen value in capturing the total original variance, $k$ represents the number of significant eigen values, and $Th$ is the contribution of all significant eigen values in capturing the total original variance. In this chapter, $Th$ is chosen to vary between 0.95 and 1. This means that if the total number of eigen values from the KL expansion is 100 and the contribution of the first 4 eigen values takes up more than 95% of the original variance, only four individual neural networks need to be developed.
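A sketch of this selection rule, taking $R[i] = \lambda_i / \sum_j \lambda_j$ (our assumption for the normalization), with `lam` from the decomposition sketch above:

```python
import numpy as np

def significant_count(lam, Th=0.95):
    """Smallest k such that the first k eigen values capture at least a
    fraction Th of the total original variance."""
    R = lam / lam.sum()                          # contribution of each eigen value
    return int(min(np.searchsorted(np.cumsum(R), Th) + 1, len(lam)))

k = significant_count(lam, Th=0.95)
```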
The main factors that need to be decided in the development of the neural networks are the number of neurons, the structure of the networks and the learning algorithm. The number of neurons is case-dependent, and it is difficult to fix it before the learning stage; in general, the number of neurons is adjusted during training until the network output converges on the actual output based on least-squares error minimization. A neural network with an optimum number of neurons will reach the desired minimum error level more quickly than networks with more complex structures. The proposed neural network is a Radial Basis Function (RBF) network. The approximation function is the Gaussian function

$$h(x) = \exp\!\left(-r\,(x - s)^2\right),$$

where $r$ and $s$ are constants. In this symmetric function, $s$ defines the centre of symmetry and $r$ defines the sharpness of the Gaussian function.
Based on the numerical values of the significant eigen vectors, several RBF networks with one input ($x$) and one output (the eigen vector value) are developed, one to approximate each significant eigen vector. The network function, the form of which was given previously in equation (4.2.3), is

$$\phi(x) = g_0 + g_1 x + \sum_{k=2}^{p+1} g_k\,\exp\!\left(-r_k\,(x - s_k)^2\right), \qquad (4.3.9)$$

where $g_0$ is the bias weight of the network, $g_1$ and $g_k$ are the first and $k$th weights of the network, and $p$ is the total number of neurons in this RBF network (the reason for using $p+1$ as the upper limit of the summation is that $k$ starts at 2). Figure 4.1 displays the architecture of such a network for the case of one-dimensional input, and $\phi(x)$ represents the output of the neural network given by equation (4.3.9).
After we decide the input-output mapping and architecture of the deterministic neural networks, the next step is to choose the learning algorithm, i.e., the method used to update the weights and other parameters of the networks. The backpropagation algorithm is used as the learning algorithm in this work; it minimizes the network's global error between the actual network outputs and their corresponding desired outputs. The backpropagation learning method is based on gradient descent, which updates the weights and other parameters through partial derivatives of the network's global error with respect to those weights and parameters. A stable approach is to change the weights and parameters only after the whole sample has been presented, and to repeat this process iteratively until the desired minimum error level is reached; this is called batch (or epoch-based) learning. Let the actual (desired) output be $T_i$ and the network output be $Z_i = \phi(x_i)$ for the $i$th training pattern. The network's global error between the two is

$$E = \frac{1}{2}\sum_{i=1}^{N} (T_i - Z_i)^2,$$

where $N$ is the total number of training patterns. The multiplication by $1/2$ is a mathematical convenience (Samarasinghe, 2006).
The method of modifying a weight or a parameter is the same for all weights and parameters, so we show the change to an arbitrary weight as an example. The change to a single weight $w_{ij}$ of a connection between neuron $j$ and neuron $i$ in the RBF network, based on batch learning, can be defined as

$$\Delta w_{ij} = -\eta\,\frac{\partial E}{\partial w_{ij}},$$

where $\eta$ is called the learning rate, with a constant value between 0 and 1; it controls the step size and the speed of the weight adjustments, and the gradient of $E$ is accumulated over all the input vectors in the batch. The process that propagates the error information backwards into the network and updates the weights and parameters is repeated until the network minimizes the global error between the actual network outputs and their corresponding desired outputs. In the learning process, the weights and the parameters of the network converge on their optimal values.
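A minimal sketch (Python; ours, not the authors' code) of this batch gradient-descent training of the RBF network (4.3.9) against one target eigen vector:

```python
import numpy as np

def train_rbf(x, target, p=6, eta=1e-3, epochs=20000, seed=0):
    """Fit phi(x) = g0 + g1*x + sum_k g[k]*exp(-r[k]*(x - s[k])**2) to `target`
    by batch gradient descent on E = 0.5 * sum_i (T_i - Z_i)**2."""
    rng = np.random.default_rng(seed)
    g0, g1 = 0.0, 0.0
    g = rng.normal(0.0, 0.1, p)                   # Gaussian weights g_k
    r = np.full(p, 20.0)                          # sharpness parameters r_k
    s = np.linspace(x.min(), x.max(), p)          # centres of symmetry s_k
    for _ in range(epochs):
        basis = np.exp(-r[:, None] * (x[None, :] - s[:, None]) ** 2)   # (p, N)
        z = g0 + g1 * x + g @ basis               # network outputs Z_i
        err = z - target                          # dE/dZ_i = -(T_i - Z_i)
        # Batch updates: gradients accumulated over the whole training sample.
        g0 -= eta * err.sum()
        g1 -= eta * (err * x).sum()
        g  -= eta * (basis * err).sum(axis=1)
        r  -= eta * (-g[:, None] * basis * (x[None, :] - s[:, None]) ** 2 * err).sum(axis=1)
        s  -= eta * (2 * r[:, None] * g[:, None] * basis * (x[None, :] - s[:, None]) * err).sum(axis=1)
    return g0, g1, g, r, s
```

For example, with `x = np.linspace(0, 1, 100)` and `target = U[:, 0]` from the decomposition sketch above, the trained weights give an analytical form, in the sense of equation (4.2.3), for the first eigen function.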
To illustrate the computational approach, we give some examples here. The first covariance kernel is chosen to be the one given by equation (4.3.12). This covariance kernel needs a relatively low number of significant eigen values to capture 95% or more of the total variance; therefore we choose to work with this kernel first, and later we use two other forms of kernels: one discussed previously in Chapter 3, and another that is empirically based. Table 4.1 reports all the eigen values in the KL representation of the covariance matrix. Most of the eigen values are equal to zero; these eigen values do not affect the covariance matrix, and just a few are significant and capture the total variance in the original data.
Therefore, we need to focus on the significant eigen values as well as their corresponding eigen functions. There are 6 significant eigen values, whose contribution takes up 99.9035% of the original variance; Table 4.1 shows the value of each significant eigen value and the proportion of variance captured by it.
Thus, the six eigen vectors corresponding to the significant eigen values are each simulated by an individual RBF network. Although we use a different RBF network to approximate each of these six eigen vectors, the structure of the RBF network is the same; only the weights and parameters inside the individual networks differ. Each individual RBF network was trained to learn its related eigen vector, the weights and all the parameters of the Gaussian functions being updated until each network reached its global minimum error. From the previous two examples, we have seen that the covariance kernel given by equation (4.3.12) requires a relatively small number of eigen functions, and therefore one may say that the kernel given by equation (4.3.12) has fast convergence. This is quite a desirable property to have, especially in terms of the computational efficiency of the algorithms. In the next example, we find the eigen values and the eigen functions of the covariance kernel we use in Chapter 3. In Table 4.5, we give the 32 eigen values which capture up to 94% of the original variance; Table 4.6 gives only the first six eigen functions obtained from the networks, for brevity. Figure 4.7 shows the first eight eigen functions.

The analytical forms of the eigen functions take the form given by equation (4.2.3). We can also use an empirically derived covariance kernel; as an example, let us consider equation (4.3.15). Equation (4.3.15) is depicted in Figure 4.8, and Figure 4.9 shows the corresponding covariance matrix. Table 4.7 gives the most significant eigen values (the first nine values); Table 4.8 shows the functional forms of the eigen functions, and Figure 4.10 shows their graphical forms. We have seen that some covariance kernels provide a relatively small number of significant eigen values for the given domain [0,1], whereas for others, obtaining the significant eigen functions could be tedious. The main point of this exercise is to show that any given covariance kernel can be used to obtain eigen functions of the form given by equation (4.2.3). A corollary to this statement is that if we express the eigen functions in the form given by equation (4.2.3), then we can assume that there is an underlying covariance kernel responsible for them. In deriving the SSTM in the form of equation (4.2.15), we assume that the form of equation (4.2.3) is given; but we now see that any covariance-kernel-driven SSTM can be represented by equation (4.2.15).

Effects of Different Kernels and $h_x$
We have seen in sections 4.2 and 4.3 that the SSTM developed in Chapters 2 and 3 can be recast so that we can employ any given velocity covariance kernel; in fact, we can even use an empirical set of data for the velocity covariance. We can anticipate that the generalized SSTM would behave quite similarly to the one developed in Chapter 2 when the same covariance kernel is used. To compare the differences that arise from using different kernels in the generalized SSTM, we compute the 95% confidence intervals for the concentration breakthrough curves (concentration realizations) at x = 0.5 m when the flow length is 1 m. The mean velocity is kept constant at 0.5 m/day; the covariance kernel given by equation (4.3.14) is used to obtain Figure 4.11, and Figure 4.12 is obtained by employing the second kernel in Table 4.9. First, the confidence intervals shown in Figure 4.11 are very similar to those one could obtain by using the SSTM developed in Chapter 2. Comparing the effects of the kernels on the behaviour of the generalized SSTM, we see that the confidence interval bandwidth in Figure 4.12 is almost non-existent; the reason is that this kernel has faster convergence when decomposed in the eigen vector space. For smaller values of $\sigma^2$, the randomness in the concentration realizations is minimal, but as $\sigma^2$ is increased, we see increased randomness in the realizations. This also allows us to use the kernel of Figure 4.12 for larger-scale computations. We conclude that the choice of the velocity covariance kernel has a significant effect on the behaviour of the generalized SSTM, increasing the flexibility of the SSTM. We also investigate the effects of the kernels on the dispersivity values, computing them using the stochastic inverse method (SIM) discussed in Chapter 3; Table 4.9 shows the results. For all practical purposes they are essentially the same. The mechanics of dispersion is influenced more by $\sigma^2$ for a given $b$ or, if both are allowed to vary, by both $\sigma^2$ and $b$; it can also be assumed to be influenced by the mathematical form of the kernel. Both of these kernels are exponentially decaying functions. Because of the ease of computation, we continue to use kernel 2 in Table 4.9 in most of the work discussed in this book.
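A minimal sketch (Python; the array layout is our assumption) of computing such 95% confidence intervals from Monte Carlo realizations of the breakthrough curve:

```python
import numpy as np

def breakthrough_ci(realizations, alpha=0.05):
    """Empirical 95% confidence band for concentration breakthrough curves.

    realizations: shape (n_realizations, n_times), concentration at a fixed
                  point (e.g. x = 0.5 m) for each Monte Carlo realization.
    """
    lower = np.percentile(realizations, 100 * alpha / 2, axis=0)
    upper = np.percentile(realizations, 100 * (1 - alpha / 2), axis=0)
    return lower, upper
```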
From the derivation in section 4.2, the drift and diffusion coefficients of the multi-dimensional SDEs are the vector $F_x$ and the matrix $G_x$; $G_x$ is independent of $t$, and the drift coefficient $F_x$ can also be assumed to be independent of $t$ in many cases. In obtaining equation (4.2.6), we employ the fact that independent Brownian motions have zero quadratic covariation. The most important use of quadratic covariation is that we can determine the movement of $I_x(t)$ with respect to time using well-known results for Ito diffusions, which are also continuous Markov processes. It can be shown that, for an infinitesimal time increment, we can generate the increments of $I_x(t)$ for a given $x$ using the mean and variance obtained from equations (4.5.7) and (4.5.9). It should be noted that in generating the standard Wiener process increments, we use zero-mean Gaussian increments with variance equal to the time step.
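For instance, a sketch (Python) of generating such standard Wiener increments:

```python
import numpy as np

rng = np.random.default_rng(42)
dt = 1.0e-3                                   # time step
n_steps = 1000
# Standard Wiener increments: zero-mean Gaussians with variance dt.
dB = rng.normal(0.0, np.sqrt(dt), n_steps)
B = np.concatenate(([0.0], np.cumsum(dB)))    # one Brownian sample path
```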
The statistical properties of these realizations are essentially the same as those of the realizations given in Figure 4.13.
$I_x(t)$ enters the solution nonlinearly through $G_x$, but this influence can always be captured by suitable changes in $\sigma^2$. Therefore, we can keep $b$ at a constant value appropriate for the porous medium under study; we found that $b = 0.1$ is suitable for our computational experiments in this chapter as well as in Chapters 6, 7 and 8. Equation (4.5.11) can be written in component form. We produce the three-dimensional graphs of $P_{ij}$ for $i = 0, 1$ and $j = 1, 2, 3, 4, 5$ in Figures 4.18 and 4.19, with $b$ plotted on the y-axis. As one could expect, the function surface of $P_{ij}$ is a smooth, continuous function of $b$, and we can define continuous functions of $x$ and $b$ to represent the $P_{ij}$s.
We need to interpret this equation as follows: the solute concentration at a given point $x$ consists of a combination of three diffusion processes which are solely based on velocity. The spatial influence on the concentration is mediated through the prevailing concentration and its spatial gradients. In keeping with the Ito definition of stochastic integration (Klebaner, 1998), we must use the concentration and its spatial gradients at the previous time.
As a difference equation, equation (4.5.13) can be written with $t_n$ denoting the discretized time; the spatial gradients, evaluated at $t_n$, act as the coefficients of $dI_x(t)$.
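A minimal sketch (Python; the drift and diffusion callables are hypothetical stand-ins for the coefficients of equation (4.5.13)) of advancing such an Ito SDE as a difference equation:

```python
import numpy as np

def euler_maruyama(c0, drift, diffusion, dt, n_steps, seed=0):
    """Advance dC = drift(C, t) dt + diffusion(C, t) dB step by step.

    Ito interpretation: coefficients are evaluated at the previous time step,
    consistent with using the concentration and its gradients at time t_n.
    """
    rng = np.random.default_rng(seed)
    c = np.empty(n_steps + 1)
    c[0] = c0
    for n in range(n_steps):
        t = n * dt
        dB = rng.normal(0.0, np.sqrt(dt))     # Wiener increment, variance dt
        c[n + 1] = c[n] + drift(c[n], t) * dt + diffusion(c[n], t) * dB
    return c
```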
If $F_{x,i}$ and $a_{ii}$ are deterministic functions of $x$, then the moment evolution equations can be simplified to equations (4.5.7) and (4.5.8).

The Evaluation of $C_x(t)$ Diffusions
The SSTM can be written, for given $x$, as equation (4.8.1) (see equation (4.5.13)), where the subscript $x$ refers to the first and second derivatives with respect to $x$. Note that the coefficients are functions of $t$ for given $x$, and the $I_{x,i}$s are Ito diffusions based on the velocity structure. Equation (4.8.1) is a stochastic diffusion, an SDE which displays the interplay between the concentration profile and the velocity structure in the medium. By substituting equations of the form derived above, we arrive at the Fokker-Planck equation (4.8.12), where $\delta$ is the Dirac delta function. Once we solve the Fokker-Planck equation (4.8.12) along with its initial condition, the time evolution of the probability density function can be found; this is also a weak solution to equation (4.8.10). Similar to equations (4.7.6) and (4.7.7), we can write the time evolution of the moments; the derivation of these equations is very similar to those given by Gillespie (1992).
The time evolution of the mean of $C_x(t)$ involves $m$, the number of effective eigen functions. In the numerical solutions, we make use of finite differences for a given dependent variable, say $U$, based on the grid given in Figure 4.20. By using the numerical scheme developed, we obtain realizations of $C_x(t)$ as strong solutions to equation (4.9.1). We can also obtain solutions to the Fokker-Planck equation (4.8.12) using the same finite differences.
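A minimal sketch (Python; a uniform grid is assumed, as in Figure 4.20) of the finite differences used for the spatial gradients of a dependent variable $U$:

```python
import numpy as np

def spatial_gradients(U, dx):
    """Finite differences for dU/dx and d2U/dx2 on a uniform grid.

    Central differences in the interior; one-sided at the boundaries.
    """
    dU = np.gradient(U, dx)                      # first derivative
    d2U = np.empty_like(U)
    d2U[1:-1] = (U[2:] - 2.0 * U[1:-1] + U[:-2]) / dx**2
    d2U[0], d2U[-1] = d2U[1], d2U[-2]            # copy nearest interior value
    return dU, d2U
```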

Remarks for the Chapter
In this chapter, we develop a generalized form of the SSTM that can, in principle, include any arbitrary velocity covariance kernel. We have demonstrated that, for a given kernel, generalized analytical forms of the eigen functions can be obtained by using the computational methods developed. We have also developed a Langevin form of the SSTM for a given $x$, describing the time evolution of the concentration $C_x(t)$ itself and of its first and second derivatives with respect to $x$. This focus on SDEs provides a very convenient and computationally efficient way to solve the stochastic partial differential equation associated with the SSTM.
By deriving a Langevin form of the SSTM, we essentially prove that any time course of the concentration at a given point behaves according to an underlying SDE, which characterizes the nature of the local porous medium and is a statement of mass conservation of the solute.