Drought-Forest Fire Relationships

This study was carried out to determine which methods give the most realistic predictions of the number of fires and the burned area under future climate conditions. Several indices and statistical methods were used for this purpose: in addition to the indices used in estimating the climate, multivariate adaptive regression spline (MARS) models and Machine Learning methods were applied. The results of several studies show that the relationship of drought and fire indices with burned area and the number of fires changes from region to region. In this study, the drought indices and the MARS models built from combinations of these indices gave the better predictions of burned area and the number of fires. The Machine Learning methods (k-Nearest Neighbor (kNN), Recursive Partitioning and Regression Trees (RPART), Support Vector Machine (SVM) and Random Forest (RF)) achieved a success rate of 30–39% in predicting the amount of burned area, and a widely varying rate of 8 to 41% for the number of fires. Of these four algorithms, RPART performed best in fire prediction and kNN performed worst.


Introduction
The numbers of fires that occurred in the Mediterranean Region between 2001 and 2014 are as follows: 3250 in France, 1425 in Greece, 6525 in Italy, 16,000 in Spain, 21,800 in Portugal and 2200 in Turkey, 205 of which were seen in Antalya. These statistics show how important it is, for the sustainable management of forests, to understand the factors that affect burned area and the number of fires, and their consequences [1]. The amount of burned area and the number of fires are affected by numerous variables, such as the size of the fire-sensitive forest area, topography, landscape features (e.g., roads, creeks, lakes and agricultural areas), fuel characteristics, fire season, altitude, firefighting policy and the efficiency of the firefighting organization, the number of concurrent fires and climate conditions [2].
For this reason, the relationship between these factors and climate variables was examined first in order to estimate the number of fires and the amount of burned area, but research in Canada showed that the majority of the variance could not be explained in this way. Second, the relationships were therefore examined through indices such as KBDI (Keetch-Byram Drought Index), SPEI (Standardized Precipitation-Evapotranspiration Index), FWI (Fire Weather Index) and PDSI (Palmer Drought Severity Index). In studies conducted across the world, the explained variance differed with the study area and the variables used, with values such as 12, 35 and 66% [2][3][4][5][6][7][8][9]. In the third phase, descriptive regression models were used. Viegas et al. [10] reported for France that ISI (Initial Spread Index), an FWI component, is successful under extreme fire conditions. Jong et al. [11] state that using the 75th, 90th and 99th percentiles of at least one of the FWI components offers a significant advantage in calibration studies for the UK. Bedia et al. [12] emphasized DSR (Daily Severity Rating) and FWIP90 as FWI components for a better understanding of the spatial and temporal distribution of fires in Spain, and identified an increase in the amount of burned area. Venalainen et al. [13] found a 99% relationship for Central and Southern Europe and a 95% relationship for Eastern Europe between FWI and burned area in the estimation of forest fires in Europe from 1960 to 2012. When Urbieta et al. [14] examined fire activity in Europe (Portugal, Spain, Southern France, Italy and Greece) and on the Pacific coast of the USA (Oregon and California) through FWI components, the models gave R² > 0.70 for Europe and R² > 0.50 for the USA.
New approaches have been developed to investigate the relationships among the variables that affect burned forest area and the number of fires. Among them are the Machine Learning methods (MLM) used in ecological applications, such as kNN (k-Nearest Neighbor), SVM (Support Vector Machine), RPART (Recursive Partitioning and Regression Trees) and RF (Random Forest) [15,16]. Tmax and relative humidity in a linear regression were found to explain fire activity better than the RF algorithm with all relevant climate factors [15]. Moreover, the RPART algorithm has been used to determine the need for saplings after fire [17].
This study was carried out in three phases in the Antalya region. In the first phase, the relationship of the amount of burned area and the number of fires with the variables in Table 1 was investigated. In the second phase, descriptive regression equations of the kind used in countries such as Portugal, Spain, Canada and the USA were fitted in order to obtain the closest estimates of the monitored number of fires and amount of burned area. In the third phase, the suitability of MLM such as kNN, SVM, RPART and RF for estimating burned area and the number of fires was investigated. The results obtained in each phase are discussed in terms of the models' predictive ability and the variables involved. The data sets, consisting of meteorological data and fire statistics, cover the period between 2001 and 2014.

Study area
The Antalya region, which has the largest forest area in Turkey after the Amasya Region, includes a forest area of 1,146,062 ha, a rate of 60.43% [18]. The dominant tree species of the region is Pinus brutia Ten. (65%), followed by Cedrus sp. (16%), Pinus nigra (8%), Abies sp. (5%), Juniperus sp. (4%) and other broadleaved species (2%). These forests lie between 600 and 2100 m, and maquis shrubs are mainly seen up to this altitude. Antalya Regional Directorate of Forestry is located between 36° 00′ 45″ north latitudes and 29° 16′ 15″ east longitudes. The Antalya Region has a hot-summer Mediterranean climate (Csa in the Köppen-Geiger climate classification system), with hot, dry summers and warm, rainy winters (Figure 1). Within 2001-2014, the lowest precipitation and the highest temperature for the five-month period from May to September were recorded in 2007 and 2003, respectively. The regional fire history shows that a total of 24,390 ha burned in 2097 fires between 2001 and 2014. These values are above the national averages (based on the data obtained from 28 Regional Directorates of Forestry) for both burned area (∼4145 ha) and number of fires (∼1097). For the same period, the annual average burned area and number of fires are ∼1742 ha and ∼150, respectively [14]. In Antalya's 14-year fire records, August has the highest monthly average for both the number of fires and burned area (BA). The greatest damage to the region's forests occurred in 2008, when 16,890 ha burned, and the largest number of fires (214) occurred in 2013.
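The annual averages quoted above follow directly from the 14-year totals. The short sketch below reproduces that arithmetic and also expresses the 2008 burned area as a share of the period total; the variable and function names are chosen here for illustration only.

```python
# Totals reported for the Antalya region, 2001-2014.
TOTAL_BURNED_HA = 24_390
TOTAL_FIRES = 2_097
YEARS = 2014 - 2001 + 1  # 14 years, both end years included


def annual_mean(total, years=YEARS):
    """Average per year over the study period."""
    return total / years


mean_burned_ha = annual_mean(TOTAL_BURNED_HA)  # close to the ~1742 ha quoted
mean_fires = annual_mean(TOTAL_FIRES)          # close to the ~150 fires quoted

# Share of the 14-year burned area that is attributed to 2008 alone.
share_2008 = 16_890 / TOTAL_BURNED_HA

print(f"{mean_burned_ha:.0f} ha/year, {mean_fires:.0f} fires/year")
print(f"2008 share of burned area: {share_2008:.0%}")
```

The share computed in the last step indicates how strongly a single year dominates the burned-area record, which matters later when models are fitted to these data.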
Daily, monthly and seasonal fire data and meteorological data covering 2001-2014 for the Antalya region were used. The fire statistics were obtained from the General Directorate of Forestry [18], and the meteorological data from the General Directorate of Meteorology (GDM). The fire season is taken as May 1 to September 30. April and October were not considered because they account for only 0.26 and 4.28% of the yearly burned area, respectively. The meteorological data comprise daily maximum temperature, total precipitation, wind speed and relative humidity. The annual average KBDI, FWI and SPEI values of the study area are shown in Figure 2. Compared with the FWI values for the EUMED region given by Urbieta et al. [14], the FWI values are more stable. The highest and lowest values of the average FWI were each encountered twice between 1995 and 2010. For the period 2001-2014, the annual average number of fires and burned area generally decrease from west to east across the EUMED region. Turkey, with 30,724 fires, changes this distribution to some extent by outnumbering Greece (20,002 fires). When forest area is taken into account, the countries most affected by fires rank as follows: Portugal, Greece, Italy, Spain, France and Turkey.

KBDI, FWI, SPEI
KBDI ranges from 0 to 800, where 800 signifies extreme drought and 0 represents saturated soil. When the daily rainfall that drives changes in the index reaches significant amounts, the KBDI value generally needs to start again from 0 [19]. Such rainfall is frequent in the spring and winter seasons in the Antalya region. KBDI (Q) is calculated from the average annual rainfall (R, in inches) and the daily maximum temperature (T, in degrees Fahrenheit). The rate of change (dQ) in the index is calculated via the following formula. In the formula, dT signifies a temporal change.
The structure of the FWI system is shown in Figure 3. Temperature, relative humidity, wind speed and 24-h rainfall are taken into consideration in calculating the components. The seven standard components of FWI provide numerical ratings of the possibility of fire [20].
SPEI is calculated in a similar way to SPI, but uses the difference between rainfall and evapotranspiration (P − ET0) as input. This climatic water balance compares the available quantity of water (P) with the atmospheric evaporative demand (ET0). The data used in SPEI are therefore more suitable for measuring drought severity than rainfall alone [21]. The data used in calculating SPEI for the study area were obtained from the link [22], at grid-cell resolution, by entering longitude and latitude values.
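The KBDI rate of change described above can be sketched in code. The constants follow the drought-factor equation as it is commonly stated for the Keetch-Byram index, dQ = (800 − Q)(0.968 e^(0.0486 T) − 8.30) dτ × 10⁻³ / (1 + 10.88 e^(−0.0441 R)). The function names, the clamping of negative increments on cool days and the simplified rainfall handling are assumptions made here for illustration, not the procedure used in this study.

```python
import math


def kbdi_drought_factor(q, t_max_f, annual_rain_in, dt_days=1.0):
    """Increment dQ of the index for one time step.

    q              -- current KBDI value (0 = saturated, 800 = extreme drought)
    t_max_f        -- daily maximum temperature in degrees Fahrenheit
    annual_rain_in -- mean annual rainfall in inches
    dt_days        -- length of the time step in days
    """
    if not 0.0 <= q <= 800.0:
        raise ValueError("Q must lie between 0 and 800")
    numerator = (800.0 - q) * (0.968 * math.exp(0.0486 * t_max_f) - 8.30) * dt_days
    denominator = 1.0 + 10.88 * math.exp(-0.0441 * annual_rain_in)
    # Assumption: on cool days the expression turns negative; treat that as no drying.
    return max(0.0, 1e-3 * numerator / denominator)


def update_kbdi(q, t_max_f, annual_rain_in, net_rain_in=0.0):
    """One daily step: wetting by net rainfall first, then drying.

    net_rain_in is assumed to be already reduced by any interception
    threshold; each 0.01 inch of net rain lowers the index by one point.
    """
    q_wet = max(0.0, q - 100.0 * net_rain_in)
    return min(800.0, q_wet + kbdi_drought_factor(q_wet, t_max_f, annual_rain_in))
```

Because the increment is proportional to (800 − Q), drying slows as the index approaches its upper limit, which is why the index saturates at 800 during long dry spells.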

kNN, SVM, RPART, RF
Learning methods based on artificial intelligence have become increasingly useful for modeling complex relationships and interactions without the restrictive assumptions of parametric statistics [23,24]. Machine learning methods generate controllable approaches that try to model these relationships [23]. Some of these methods are artificial neural networks [25], classification and regression trees [26], support vector machines [27], random forest [28], fuzzy logic [29], maximum entropy [30] and kNN [15].
K-nearest neighbors (kNN) is one of the oldest and simplest classification methods in machine learning. kNN labels each unlabeled sample according to its closest neighbors in the data set. Its performance therefore depends on the distance metric (Euclidean distance, Minkowski distance, Mahalanobis distance) used to find those neighbors [31].
Compared with other machine learning methods, SVM offers some theoretical advantages because no local minima exist in the optimization phase of the model. The inputs in SVM are transformed through a non-linear mapping into an m-dimensional feature space. In this way, SVM finds the best linear separating hyperplane in the feature space [32].
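A form consistent with this description, as commonly written for SVM models, expresses the prediction as a linear function of the transformed inputs. The intercept $w_0$ and the weights $w_i$ are notation assumed here for illustration:

```latex
\hat{y} = w_0 + \sum_{i=1}^{m} w_i \, \phi_i(x)
```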
Here, φ_i(x) represents the non-linear transformation according to the kernel function [33].
RPART is a powerful statistical tool for analyzing complex ecological data sets. One of the main reasons is that it offers a useful alternative for modeling non-linear data in which the independent variables interact with each other [29]. Regression trees have been used in numerous ecological applications, such as the relationship between the severity and frequency of forest fires [34].
Breiman [35] suggests, instead of producing a single decision tree, combining multiple decision trees (Θ_k), each of which is trained on a different training data set. RF (Random Forest) is a learning algorithm that generates many classifiers instead of a single classifier and then classifies new data (x) by the votes of their predictions (h(x, Θ_k), k = 1, …) [36].
To evaluate the performance of these four classification algorithms, four factors were used, which were also preferred by Hong et al. [37]. These factors can be defined as follows: Here, TP is the number of true positives, FP is the number of false positives, FN is the number of false negatives, P is the number of real positive cases in the data, and N is the number of real negative cases in the data.
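The counts defined above are sufficient to compute the usual confusion-matrix measures. The sketch below is illustrative only: the choice of sensitivity, specificity, precision and accuracy is an assumption, since these are common measures and not necessarily the four factors used in this study, and the function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, p, n):
    """Common performance measures derived from confusion-matrix counts.

    tp, fp, fn -- true positives, false positives, false negatives
    p, n       -- real positive and real negative cases in the data
    """
    if p != tp + fn:
        raise ValueError("P must equal TP + FN")
    tn = n - fp  # true negatives follow from N = TN + FP
    if tn < 0:
        raise ValueError("FP cannot exceed N")
    predicted_positive = tp + fp
    return {
        "sensitivity": tp / p,  # share of real positives that were detected
        "specificity": tn / n,  # share of real negatives that were rejected
        "precision": tp / predicted_positive if predicted_positive else float("nan"),
        "accuracy": (tp + tn) / (p + n),
    }


# Hypothetical counts, for illustration only.
example = classification_metrics(tp=30, fp=10, fn=20, p=50, n=150)
```

Reporting several measures together is useful for rare events such as large fires, where accuracy alone can look high even when few positive cases are detected.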

Statistical analysis
The statistical program R (x64, version 3.2.3) was used to carry out the statistical analyses in the study. The analyses were used to evaluate the components of the drought indices and the machine learning methods, and to determine the relationship of KBDI, SPEI and FWI with the number of fires and burned area. All data for the drought indices and their components were computed as daily, monthly and seasonal (May 1 to September 30) values. Since most forest fires arise under extreme weather conditions [8], the maximum and 90th-percentile values of the drought indices and their components were calculated.
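The aggregation of daily index values into monthly extremes can be sketched as follows. This is a minimal illustration, not the authors' R code: the function names are hypothetical, and the percentile uses linear interpolation between order statistics, which is assumed here to match the default quantile definition in R.

```python
from collections import defaultdict
from datetime import date


def percentile(values, q):
    """Percentile with linear interpolation between order statistics (q in 0..100)."""
    data = sorted(values)
    if not data:
        raise ValueError("no values supplied")
    pos = (len(data) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(data) - 1)
    return data[lower] + (data[upper] - data[lower]) * (pos - lower)


def monthly_extremes(daily):
    """Map (year, month) to (maximum, 90th percentile) of a daily index.

    daily is an iterable of (date, value) pairs; only days inside the
    fire season (May 1 to September 30) are kept.
    """
    groups = defaultdict(list)
    for day, value in daily:
        if 5 <= day.month <= 9:
            groups[(day.year, day.month)].append(value)
    return {key: (max(vals), percentile(vals, 90)) for key, vals in groups.items()}
```

Seasonal extremes can be obtained the same way by grouping on the year alone.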
Because the raw burned-area data used in the study are unevenly distributed, their natural logarithms were used instead. The correlation between the natural logarithms of the number of fires and burned area and all the components of the drought indices was examined. Each component was later used in the step-wise regression analysis.
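The effect of the logarithmic transformation on a correlation can be illustrated with a small sketch. The numbers below are hypothetical and chosen only to mimic a skewed burned-area series; they are not data from this study, and the function names are illustrative.

```python
import math


def log_transform(values):
    """Natural logarithm of each value; all values must be strictly positive."""
    return [math.log(v) for v in values]


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical monthly index maxima and burned areas (ha), for illustration only.
index_max = [2.0, 4.0, 6.0, 8.0, 10.0]
burned_ha = [3.0, 10.0, 28.0, 95.0, 300.0]

r_raw = pearson_r(index_max, burned_ha)
r_log = pearson_r(index_max, log_transform(burned_ha))
explained_variance_log = r_log ** 2  # share of variance explained by a linear fit
```

When burned area grows roughly exponentially with an index, the correlation with the log-transformed series is higher than with the raw series, which is the motivation for the transformation.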

Results and discussion
In the first phase, no meaningful relationship between the drought indices and either the number of fires or burned area was found with the daily data, for which the explained variance ranged from 2 to 18%. The best correlation results were achieved with the natural logarithms of burned area and the number of fires. In the correlation analyses with monthly data, the maximum DSR values explained 53% of the variance in the natural logarithm of the number of fires and 51% of that in burned area (Figure 3).
When the adjusted R² values were calculated for all the variables in Table 1, the highest value for the number of fires (56%) belonged to the maximum DSR and FWI. For burned area, the highest adjusted R² (50%) again belonged to the maximum DSR. Urbieta et al. [14] stressed the strong relationship between FWI and burned area in EUMED over the last 30 years and reported that 60% of this relationship could be explained through the R² values. In the same study, the R² values explained between 20 and 55% of the variance in the number of fires and the number of large fires. That the R² values are similar for two areas that differ in study period, socio-economic conditions, fire extinguishing activities and fuel accumulation is of high importance for estimating fire season activity.
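The corrected (adjusted) R² penalizes the ordinary R² for the number of predictors, using adjusted R² = 1 − (1 − R²)(n − 1)/(n − p − 1). The sketch below applies this standard formula; the sample size of 70 assumes one observation per fire-season month (14 years × 5 months) and is used for illustration only.

```python
def adjusted_r2(r2, n_obs, n_predictors):
    """Adjusted R-squared for a linear model with an intercept."""
    dof = n_obs - n_predictors - 1
    if dof <= 0:
        raise ValueError("need more observations than predictors plus one")
    return 1.0 - (1.0 - r2) * (n_obs - 1) / dof


# Illustrative case: R² = 0.53 with a single predictor and 70 monthly observations.
example_adj = adjusted_r2(0.53, n_obs=70, n_predictors=1)
```

With many observations and few predictors the correction is small, so adjusted and ordinary R² are close; the difference grows as more variables are added to a model.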
The Canadian FWI system, although originally developed for Canada, is one of the most commonly used fire weather index systems in the world thanks to its flexibility and its success in determining fire risk [38][39][40]. In addition, each of the six components forming the system can successfully estimate fire risk on its own under different conditions. FWI and FFMC in particular stand out among these components in explaining fire activity [41]. The FWI component is used not only as a general indicator of fire severity for many fuel types, but also to describe fire danger [42,43]. The FFMC component is generally used as an indicator of fire outbreaks and of the potential danger of human-induced fire outbreaks [43][44][45][46][47][48][49]. In addition to these two indices, a study carried out in New Zealand also found a close relationship between ISI and DSR and the weather conditions. In Canterbury, New Zealand, DSR, which is obtained from the calculation of FWI, was found together with FFMC and FWI to be among the components most strongly related to high temperature and strong westerly winds. In a different research area of the same study, ISI was also added to these indices under different weather conditions [41]. An increase in DSR is interpreted to mean that extinguishing fires requires more effort and time [50][51][52].
Data on forest fires in EUMED countries in recent years show that forest fires have decreased [14,53]. Although Turkey and the EUMED countries show different tendencies, the relationship between forest fires and climate is clearly of vital importance. The greater the amount and variety of fire and climate data, the clearer the relationship between them will become [14]. Other studies carried out in the Mediterranean region indicate that areas where the long-term averages of the drought indices (FWI, DSR) are low, as in Turkey, are able to tolerate climate change better [54,55]. For that reason, even though the number of fires in the Antalya region is increasing, the amount of burned area and the frequency of large fires are thought to be on a downward trend.
On the assumption that a comprehensive model would be more explanatory and easier to generalize [56,57], a two-sided (forward-backward) step-wise regression analysis was carried out in the second phase of the study using all the variables in Table 1, and the results are presented in Table 2.
Extreme values (90th percentile) of monthly BUIX and FWI, DSR and TX are the components that can be used to estimate both the number of fires and the burned area in the Antalya region. The components in both regression equations are statistically significant (P < 0.01). Examination of the equations shows that, for both NF and BA in the Antalya region, the extreme values are more satisfactory than the non-extreme component values. The predictors shared by our study and that of Amatulli et al. [58] are the maximum values of BUI and DSR. The regression equation found by Amatulli et al. [58] explains about 75% of the variance in the amount of burned area; similarly, 71% of it (Figure 4) is explained in our study by the equation in Table 2. When such results are assessed at the country level, the explained variance is lower because large fires are concentrated in certain regions, as in Turkey. Ertuğrul and Varol [6] reported a strong relationship between the amount of burned area and the FWI, BUI, ISI and SSR values in the Balıkesir region, whereas in the Muğla region a relationship was found only between SSR and the amount of burned area. Similarly, the regression equation found by Balshi et al. [59] for western Canada explains 80% of the variance, whereas the one for eastern Canada explains only 43%. The Antalya region ranks second after the Muğla region in burned area, and the three regions (Antalya, Muğla and Balıkesir) account for approximately 61% of the burned area in Turkey (for the period 1977-2014).
Figure 4. Monthly monitored and estimated values of the burned area and number of fires.
In Figure 4, the estimated and monitored values for both BA and NF are seen to be distributed evenly around the trend line. The high and low estimates are considered to be caused by climate and index components that are responsible for the unexplained part of the variance. Vilar et al. [57] also reported a non-linear relationship between fires and the independent variables. Therefore, Machine Learning algorithms were used in the third phase of the study.

These four algorithms (kNN, SVM, RPART and RF) predicted the amount of burned area with a success rate of 30-39%. For the number of fires, their success ranged widely, from 8 to 41%. Although the algorithms were unsuccessful at predicting both the number of fires and the amount of burned area, they were very successful at predicting the probability that an area greater than 300 ha would be destroyed. The prediction results for the probability of fire outbreaks according to drought value are also given in Table 3.
In the study carried out by Cortez and Morais [32] in the Montesinho National Park in Portugal, the most successful results were obtained with the SVM model, whereas in Antalya the performance of the kNN and SVM models (79.21% and 30.81%) was relatively lower than that of the other models. The difference is thought to arise because the study in Portugal was carried out in a national park and used direct weather inputs rather than drought indices. The drought indices take into account not only weather inputs but also factors such as fine fuel moisture and duff moisture. For all models, the prediction accuracy for fire occurrence is poor compared with that for large fires. RPART performs best in predicting both fire occurrence and large fires. kNN, on the other hand, performs worst in both classifications.
The predictions aimed at understanding the relationship between fires and droughts qualitatively were tested through analyses in three phases. The results show that MARS models predict both the amount of burned area and the number of fires better than individual drought indices do. Because much of the area burned occurs during extreme fire weather conditions [8], the extreme values (maximum and 90th percentile) of the drought indices enter the MARS models. MARS and Machine Learning methods are said to be more successful because they take the complex relationships in the data into account, and MARS models indeed perform better both in the article by Amatulli et al. [58] and in our study. MARS models are reported to be successful because they build functions in each hyper-region, each with its own regression coefficients [61]. Contrary to expectations, Machine Learning methods such as kNN, SVM, RPART and RF did not perform as well as MARS and linear regression models in predicting the amount of burned area and the number of fires. The performance of Machine Learning methods varies from one geographical region to another, and these methods perform better in larger areas (like EUMED) than in smaller areas [62]. Trigo et al. [63] emphasized that Machine Learning methods gave poor results in Portugal, where burned area peaked in 2003 and large fires were concentrated in a small region. The single fire that destroyed an area of 16,890 ha in 2008 in the Antalya region is considered to have had a similar effect in this study. Similarly, Prasad et al. [64] stated that the same variables would behave differently depending on the spatial and analysis scale, because environmental and social conditions differ from region to region.
Although SVM and RF generally stand out among Machine Learning methods in terms of prediction performance [65], the RPART algorithm produced more successful results in our study. In line with the statements of both Trigo et al. [63] and Prasad et al. [64], the RPART algorithm is thought to have performed better for the Antalya region because, in a similar way to MARS, it uses a decision tree mechanism along with regression equations.
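The way MARS fits a separate linear piece in each region of the predictor space can be illustrated with hinge basis functions of the form max(0, x − c) and max(0, c − x), where c is a knot. The sketch below evaluates such a model for one predictor; the knot, the coefficients and the function names are hypothetical and are not taken from the equations of this study.

```python
def hinge_pos(x, knot):
    """Basis function that is active above the knot: max(0, x - knot)."""
    return max(0.0, x - knot)


def hinge_neg(x, knot):
    """Mirror basis function that is active below the knot: max(0, knot - x)."""
    return max(0.0, knot - x)


def mars_predict(x, intercept, terms):
    """Evaluate an additive MARS-style model for a single predictor.

    terms is a list of (coefficient, knot, direction) tuples, where
    direction is +1 for hinge_pos and -1 for hinge_neg.
    """
    total = intercept
    for coef, knot, direction in terms:
        basis = hinge_pos(x, knot) if direction > 0 else hinge_neg(x, knot)
        total += coef * basis
    return total


# Hypothetical model: the response rises steeply above an index value of 10
# and changes only slightly below it.
example_terms = [(0.30, 10.0, +1), (-0.05, 10.0, -1)]
```

Because each hinge is zero on one side of its knot, the fitted slope can differ between regions, which is how such a model captures the stronger response of fire activity under extreme index values.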

Conclusion
The relationship of meteorological factors, FWI System components, KBDI and SPEI with both the amount of burned area and the number of fires was investigated for the Antalya region. The results show that the extreme values of these factors have a stronger effect on the amount of burned area and the number of fires. DSRX explains the majority of the variance in the number of fires and the amount of burned area. Neither normal nor extreme values of KBDI and SPEI entered the equations obtained from the step-wise regression. In a study carried out in Antalya by Varol and Ertuğrul [6], a significant relationship between KBDI values and burned area was found only in the years when big fires arose, and no relationship was found between KBDI values and the number of fires. In the step-wise regression analysis, the extreme values of the FWI, DSR and T components were selected as significant variables for both burned area and the number of fires. With the resulting equations, approximately 65% of the variance in the number of fires and 62% of that in burned area could be explained through the 90th percentiles of these components.
This study can serve as a baseline for estimating the number of fires and the amount of burned area in the Antalya region under future climate scenarios. Moreover, if the study is extended to cover other fire-sensitive forests, the differences and similarities between regions can be identified and a general view for Turkey can be reached.
In a study carried out in Canada by Harrington et al. [3], the mean and extreme values of the FWI components were compared, and the extreme values explained the variance better in six out of nine regions. In another study, Flannigan and Harrington [2] found that meteorological data used as variables could explain 30% of the variance in Canada, a result similar to that obtained with FWI components. It was pointed out that maximum temperature and low humidity have a strong relationship with dry conditions and are therefore frequently preferred variables. In our study, the presence of maximum temperature in the step-wise regression equations for both the number of fires and burned area supports this assertion.
Urbieta et al. [14] stated that SPEI8 produced better results in the period from autumn to spring. In our study, however, SPEI proved ineffective compared with the other indices, and SPEI1 produced the better results for forest fires in the Antalya region. It is thought that analyses of the relationship between SPEI and forest fires should cover longer periods, and that studies such as this one for the Antalya subregion should be extended worldwide, which would yield better results [14].
The factors used in calculating drought indices are certainly not the only ones that determine fire statistics. Nevertheless, meteorological data remain a crucial data source even where no definite relationship with fire statistics is found. Detailed and reliable records of both fire statistics and meteorological data are therefore needed. For forest fire statistics, much more reliable data exist for the period after 2001, while there are no data for earlier periods. Similarly, the lack of daily meteorological data led to the choice of 2001-2014 as the study period for the Antalya region.
Many researchers also state that social factors such as unemployment or arson influence forest fires [66]. Moreover, some studies point out that socio-economic and landscape factors may have a greater effect on burned area than climate conditions at the local scale [67]. To predict the number of fires and the amount of burned area more successfully, it will therefore be necessary to take into account social factors such as unemployment and population, as well as road density [15]. MARS is thought to become more effective once such variables have been added. Simple and multiple regression techniques are not productive for investigating complex relationships in high-dimensional data sets, whereas MARS can overcome these obstacles. Moreover, it is usually simpler to compare models developed using the MARS approach with other modeling and mathematical techniques. The MARS approach is considered more suitable than other approaches for representing the temporal variability in factors such as temperature, precipitation, relative humidity and wind. The effects of the predictors on the distribution, the model, the intervals, the explained variance and the extrapolation skill [58] also support these statements. It should not be forgotten that even the best model maximizes the variance explained only within the observation period, and that applying it to future climate conditions means going beyond this period.
Determining the factors that cause fires is of high importance for deciding the type, quality and intensity of technical intervention in natural and artificial forest stands formed by fire-sensitive plant species, especially in regions whose ecological conditions are also fire sensitive. Silvicultural activities closely affect the amount of fuel [68]. In plantation and restoration activities that aim at establishing artificial forests, it is necessary to form fire-resistant and especially fire-retardant forests in the Mediterranean climate zone, which is fire sensitive particularly in terms of its ecological conditions. It is therefore important to choose fire-resistant, indigenous plant species and to select suitable origins and clones of those species [69].
These results show that climate conditions and stand structure have important effects on the outbreak of forest fires in the Antalya region. Similarly, studies of forest fires, stand dynamics and silvicultural techniques in the Antalya climate zone found that the majority of fires occurred in Calabrian pine and cedar forests with high canopy closure (71-100%) and medium-sized and thick trees. The current fire statistics likewise show that most forest fires occur in Calabrian pine and cedar forests with high canopy closure and medium-sized and thick trees, especially near residential areas, recreation areas and main arterial roads. Taking the current ecological conditions into account in setting silvicultural objectives, it is also important to enable suitable mesophytic species to become established in order to reduce fire danger in pure coniferous Calabrian pine and cedar plantations, where important silvicultural interventions, such as opening the necessary maintenance paths, are compulsory [70].