Sensitivity and feature importance of climate factors for predicting fire hotspots using machine learning methods

Authors: EH Nugrahani, S Nurdiati, F Bukhari, MK Najib, DM Sebastian, PAN Fallahi. 

Abstract: Forest fires cause a national crisis in Indonesia every year, with enormous impacts and losses. Hotspots, which indicate forest fires and allow large areas to be monitored quickly, are often predicted with various machine learning methods. However, few studies analyze the sensitivity and feature importance of each predictor in a machine learning prediction model. This study evaluates and compares machine learning methods for predicting hotspots in Kalimantan from local and global climate factors over 2001-2020. Using the most accurate model, each climate factor used as a predictor is analyzed for its sensitivity and feature importance. The four methods are random forest, gradient boosting, Bayesian regression, and artificial neural networks. The measures used are variance-, density-, and distribution-based sensitivity indices, together with permutation and Shapley feature importance. The evaluation shows that the Bayesian linear regression model outperforms the other models, with an RMSE of 750 hotspots and an explained variance score of 68.96% on testing data. The tree-based models, gradient boosting and random forest, show signs of overfitting. Based on the sensitivity analysis and feature importance of the Bayesian linear regression model, the number of dry days is the most important feature for predicting fire hotspots in Kalimantan.

Keywords: Bayesian regression; Feature importance; Machine learning; Sensitivity analysis; Wildfire
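
To make the permutation feature importance and the two evaluation scores named in the abstract (RMSE and explained variance) more concrete, here is a minimal sketch in plain Python. It is not the authors' code or data: the feature names `dry_days`, `precipitation`, and `enso_index`, the coefficients in `toy_predict`, and the synthetic samples are all assumptions made for illustration, and a fixed linear function stands in for a fitted regression model.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error between observations and predictions."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


def explained_variance(y_true, y_pred):
    """Explained variance score: 1 - Var(residuals) / Var(observations)."""
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1.0 - variance(residuals) / variance(y_true)


def permutation_importance(predict, columns, y_true, n_repeats=10, seed=0):
    """Mean increase in RMSE when one feature column is shuffled at a time."""
    rng = random.Random(seed)
    baseline = rmse(y_true, predict(columns))
    importances = {}
    for name in columns:
        increases = []
        for _ in range(n_repeats):
            shuffled = dict(columns)  # shallow copy; only one column is replaced
            shuffled[name] = rng.sample(columns[name], len(columns[name]))
            increases.append(rmse(y_true, predict(shuffled)) - baseline)
        importances[name] = sum(increases) / n_repeats
    return baseline, importances


def toy_predict(columns):
    """Hypothetical stand-in for a fitted linear model (coefficients are made up)."""
    return [
        100.0 * d - 1.0 * p + 50.0 * e
        for d, p, e in zip(
            columns["dry_days"], columns["precipitation"], columns["enso_index"]
        )
    ]


# Synthetic data only: these values do not come from the study.
data_rng = random.Random(42)
n_samples = 200
columns = {
    "dry_days": [data_rng.uniform(0, 30) for _ in range(n_samples)],
    "precipitation": [data_rng.uniform(0, 400) for _ in range(n_samples)],
    "enso_index": [data_rng.gauss(0, 1) for _ in range(n_samples)],
}
# Observations are the toy model output plus Gaussian noise.
y = [pred + data_rng.gauss(0, 100) for pred in toy_predict(columns)]

baseline_rmse, importances = permutation_importance(toy_predict, columns, y)
evs = explained_variance(y, toy_predict(columns))
ranking = sorted(importances, key=importances.get, reverse=True)

print(f"baseline RMSE: {baseline_rmse:.1f}, explained variance: {evs:.3f}")
for name in ranking:
    print(f"{name}: mean RMSE increase {importances[name]:.1f}")
```

A feature whose shuffling raises the error the most is the one the model relies on most. In this toy setup that is `dry_days` by construction, so the ranking only illustrates the mechanics of the measure and says nothing about the study's actual results.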


Published in IAES International Journal of Artificial Intelligence (IJ-AI), vol. 13(2): 2212-2225.
