Geographical random forests: a spatial extension of the random forest algorithm to address spatial heterogeneity in remote sensing and population modelling

Stefanos Georganos, Tais Grippa, Assane Niang Gadiaga, Catherine Linard, Moritz Lennert, Sabine Vanhuysse, Nicholus Mboga, Eléonore Wolff, Stamatis Kalogirou

Research output: Contribution to journal (Article)

Abstract

Machine learning algorithms such as Random Forest (RF) are increasingly applied to traditionally geographical topics such as population estimation. Even though RF is a well-performing and generalizable algorithm, the vast majority of its implementations are still ‘aspatial’ and may not address spatially heterogeneous processes. At the same time, remote sensing (RS) data, which are commonly used to model population, can be highly spatially heterogeneous. In this context, we present a novel geographical implementation of RF, named Geographical Random Forest (GRF), as both a predictive and an exploratory tool to model population as a function of RS covariates. GRF is a disaggregation of RF into geographical space in the form of local sub-models. From the first empirical results, we conclude that GRF can be more predictive when an appropriate spatial scale is selected to model the data, with reduced residual autocorrelation and lower Root Mean Squared Error (RMSE) and Mean Absolute Error (MAE) values. Finally, and of equal importance, GRF can be used as an effective exploratory tool to visualize the relationship between dependent and independent variables, highlighting interesting local variations and allowing for a better understanding of the processes that may be causing the observed spatial heterogeneity.
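The idea of local sub-models and the two error measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how a neighbourhood is defined, so the k-nearest-neighbour rule, the toy data, and the names `nearest_indices`, `fit_mean` and `local_predict` are all assumptions made for illustration. The placeholder learner simply predicts the local mean and stands in for a random forest.

```python
import math
from statistics import fmean


def rmse(observed, predicted):
    """Root Mean Squared Error: square root of the mean squared residual."""
    return math.sqrt(fmean((o - p) ** 2 for o, p in zip(observed, predicted)))


def mae(observed, predicted):
    """Mean Absolute Error: mean of the absolute residuals."""
    return fmean(abs(o - p) for o, p in zip(observed, predicted))


def nearest_indices(coords, target, k):
    """Indices of the k training locations closest to target (Euclidean).

    Assumption: the 'spatial scale' is expressed as a neighbour count k.
    """
    order = sorted(range(len(coords)), key=lambda i: math.dist(coords[i], target))
    return order[:k]


def fit_mean(X, y):
    """Hypothetical placeholder learner: always predicts the mean of y.

    A real local sub-model would be a random forest fitted on (X, y).
    """
    mu = fmean(y)
    return lambda x: mu


def local_predict(coords, X, y, target_coord, target_x, k, fit=fit_mean):
    """Fit a sub-model on the k nearest observations and predict at the target."""
    idx = nearest_indices(coords, target_coord, k)
    model = fit([X[i] for i in idx], [y[i] for i in idx])
    return model(target_x)


# Toy data (invented): a low-population cluster in the west and a
# high-population cluster in the east, with one covariate per observation.
coords = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
X = [[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]]
y = [10, 12, 14, 100, 110, 120]

# k = 3 uses only the western cluster; k = 6 uses every observation,
# which is equivalent to a single global model.
local_west = local_predict(coords, X, y, (0.5, 0.5), [0.2], k=3)
global_west = local_predict(coords, X, y, (0.5, 0.5), [0.2], k=6)
```

With this toy data the local sub-model stays close to the western observations, while the global fit is pulled towards the eastern cluster. Passing a different `fit` function is where an actual random forest learner would be plugged in.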

Original language: English
Number of pages: 17
Journal: Geocarto International
DOI: 10.1080/10106049.2019.1595177
Publication status: Published - 1 Jan 2019

Fingerprint

population modeling
remote sensing
population estimation
disaggregation
autocorrelation

Keywords

  • population estimation
  • Random forest
  • spatial analysis

Cite this

Georganos, Stefanos; Grippa, Tais; Niang Gadiaga, Assane; Linard, Catherine; Lennert, Moritz; Vanhuysse, Sabine; Mboga, Nicholus; Wolff, Eléonore; Kalogirou, Stamatis. Geographical random forests: a spatial extension of the random forest algorithm to address spatial heterogeneity in remote sensing and population modelling. In: Geocarto International. 2019.
@article{df744f577a7645789970900e80e5081b,
title = "Geographical random forests: a spatial extension of the random forest algorithm to address spatial heterogeneity in remote sensing and population modelling",
keywords = "population estimation, Random forest, spatial analysis",
author = "Stefanos Georganos and Tais Grippa and {Niang Gadiaga}, Assane and Catherine Linard and Moritz Lennert and Sabine Vanhuysse and Nicholus Mboga and El{\'e}onore Wolff and Stamatis Kalogirou",
year = "2019",
month = "1",
day = "1",
doi = "10.1080/10106049.2019.1595177",
language = "English",
journal = "Geocarto International",
issn = "1010-6049",
publisher = "Taylor & Francis",

}


TY - JOUR

T1 - Geographical random forests

T2 - a spatial extension of the random forest algorithm to address spatial heterogeneity in remote sensing and population modelling

AU - Georganos, Stefanos

AU - Grippa, Tais

AU - Niang Gadiaga, Assane

AU - Linard, Catherine

AU - Lennert, Moritz

AU - Vanhuysse, Sabine

AU - Mboga, Nicholus

AU - Wolff, Eléonore

AU - Kalogirou, Stamatis

PY - 2019/1/1

Y1 - 2019/1/1

KW - population estimation

KW - Random forest

KW - spatial analysis

UR - http://www.scopus.com/inward/record.url?scp=85067550834&partnerID=8YFLogxK

U2 - 10.1080/10106049.2019.1595177

DO - 10.1080/10106049.2019.1595177

M3 - Article

AN - SCOPUS:85067550834

JO - Geocarto International

JF - Geocarto International

SN - 1010-6049

ER -