TY - JOUR
T1 - Geographical random forests
T2 - a spatial extension of the random forest algorithm to address spatial heterogeneity in remote sensing and population modelling
AU - Georganos, Stefanos
AU - Grippa, Tais
AU - Niang Gadiaga, Assane
AU - Linard, Catherine
AU - Lennert, Moritz
AU - Vanhuysse, Sabine
AU - Mboga, Nicholus
AU - Wolff, Eléonore
AU - Kalogirou, Stamatis
PY - 2019/1/1
Y1 - 2019/1/1
N2 - Machine learning algorithms such as Random Forest (RF) are being increasingly applied to traditionally geographical topics such as population estimation. Even though RF is a well-performing and generalizable algorithm, the vast majority of its implementations are still ‘aspatial’ and may not address spatially heterogeneous processes. At the same time, remote sensing (RS) data, which are commonly used to model population, can be highly spatially heterogeneous. In this context, we present a novel geographical implementation of RF, named Geographical Random Forest (GRF), as both a predictive and exploratory tool to model population as a function of RS covariates. GRF is a disaggregation of RF into geographical space in the form of local sub-models. From the first empirical results, we conclude that GRF can be more predictive when an appropriate spatial scale is selected to model the data, with reduced residual autocorrelation and lower Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) values. Finally, and of equal importance, GRF can be used as an effective exploratory tool to visualize the relationship between dependent and independent variables, highlighting interesting local variations and allowing for a better understanding of the processes that may be causing the observed spatial heterogeneity.
AB - Machine learning algorithms such as Random Forest (RF) are being increasingly applied to traditionally geographical topics such as population estimation. Even though RF is a well-performing and generalizable algorithm, the vast majority of its implementations are still ‘aspatial’ and may not address spatially heterogeneous processes. At the same time, remote sensing (RS) data, which are commonly used to model population, can be highly spatially heterogeneous. In this context, we present a novel geographical implementation of RF, named Geographical Random Forest (GRF), as both a predictive and exploratory tool to model population as a function of RS covariates. GRF is a disaggregation of RF into geographical space in the form of local sub-models. From the first empirical results, we conclude that GRF can be more predictive when an appropriate spatial scale is selected to model the data, with reduced residual autocorrelation and lower Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) values. Finally, and of equal importance, GRF can be used as an effective exploratory tool to visualize the relationship between dependent and independent variables, highlighting interesting local variations and allowing for a better understanding of the processes that may be causing the observed spatial heterogeneity.
KW - population estimation
KW - Random forest
KW - spatial analysis
UR - http://www.scopus.com/inward/record.url?scp=85067550834&partnerID=8YFLogxK
U2 - 10.1080/10106049.2019.1595177
DO - 10.1080/10106049.2019.1595177
M3 - Article
AN - SCOPUS:85067550834
SN - 1010-6049
JO - Geocarto International
JF - Geocarto International
ER -