Estimation of Poverty Using Random Forest Regression with Multi-Source Data: A Case Study in Bangladesh
Article
Article Title | Estimation of Poverty Using Random Forest Regression with Multi-Source Data: A Case Study in Bangladesh |
---|---|
ERA Journal ID | 201448 |
Article Category | Article |
Authors | Zhao, Xizhi; Yu, Bailang; Liu, Yan; Chen, Zuoqi; Li, Qiaoxuan; Wang, Congxiao and Wu, Jianping |
Journal Title | Remote Sensing |
Journal Citation | 11 (4), pp. 1-18 |
Article Number | 375 |
Number of Pages | 18 |
Publication Date | 02 Feb 2019 |
Publisher | MDPI AG |
Place of Publication | Switzerland |
ISSN | 2072-4292 |
Digital Object Identifier (DOI) | https://doi.org/10.3390/rs11040375 |
Web Address (URL) | https://www.mdpi.com/2072-4292/11/4/375 |
Abstract | Spatially explicit and reliable data on poverty are critical for both policy makers and researchers. However, such data remain scarce, particularly in developing countries. Current research is limited to using environmental data from different sources in isolation to estimate poverty, even though poverty is a complex phenomenon that cannot be quantified, either theoretically or practically, by a single data type. This study proposes a random forest regression (RFR) model to estimate poverty at 10 km × 10 km spatial resolution by combining features extracted from multiple data sources, including the National Polar-orbiting Partnership Visible Infrared Imaging Radiometer Suite (NPP-VIIRS) Day/Night Band (DNB) nighttime light (NTL) data, Google satellite imagery, a land cover map, a road map, and division headquarter location data. The household wealth index (WI) drawn from the Demographic and Health Surveys (DHS) program was used to reflect poverty level. We trained the RFR model using data from Bangladesh and applied the model to both Bangladesh and Nepal to evaluate its accuracy. The results show that the R² between the actual and estimated WI in Bangladesh is 0.70, indicating good predictive power of our model in WI estimation. The R² of 0.61 between the actual and estimated WI in Nepal also indicates good generalization ability of the model. Furthermore, a negative correlation is observed between the district average WI and the poverty head count ratio (HCR) in Bangladesh, with a Pearson correlation coefficient of -0.6. Using Gini importance, we identify proximity to urban areas as the most important variable for explaining poverty, contributing 37.9% of the explanatory power. Compared with the study that used NTL and Google satellite imagery in isolation to estimate poverty, our method increases the accuracy of estimation. Given that the data we use are globally and publicly available, the methodology reported in this study would also be applicable in other countries or regions to estimate the extent of poverty. |
Keywords | poverty; random forest regression; Bangladesh; nighttime light; Google satellite imagery |
Byline Affiliations | East China Normal University, China; University of Queensland |
Library Services |
https://research.usq.edu.au/item/wq53x/estimation-of-poverty-using-random-forest-regression-with-multi-source-data-a-case-study-in-bangladesh
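
The abstract reports two agreement measures: an R² between actual and estimated WI, and a Pearson correlation coefficient between district average WI and HCR. As a reading aid, here is a minimal Python sketch of how such measures are commonly computed. It is not the authors' code. The record does not say which definition of R² was used, so the coefficient of determination (1 - SS_res / SS_tot) is assumed here. The function names and the toy numbers are hypothetical and are not data from the study.

```python
import math


def r_squared(actual, predicted):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed definition)."""
    mean_actual = sum(actual) / len(actual)
    # Residual sum of squares: squared gaps between observed and estimated values.
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    # Total sum of squares: squared gaps between observed values and their mean.
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


if __name__ == "__main__":
    # Toy values for illustration only; NOT data from the study.
    toy_actual_wi = [1.0, 2.0, 3.0, 4.0]
    toy_estimated_wi = [1.5, 2.0, 3.0, 3.5]
    print("R^2:", round(r_squared(toy_actual_wi, toy_estimated_wi), 3))

    toy_district_wi = [1.0, 2.0, 3.0, 4.0]
    toy_district_hcr = [8.0, 6.0, 4.0, 2.0]
    print("Pearson r:", round(pearson_r(toy_district_wi, toy_district_hcr), 3))
```

Under this definition, R² equals 1 for a perfect estimate. A negative Pearson coefficient, as reported for WI against HCR, means that districts with a higher average wealth index tend to have a lower poverty head count ratio.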