Creating a well-being data layer using machine learning, satellite imagery and ground-truth data

World Resources Institute


Conducting economic surveys requires huge resources, so acquiring this information from publicly available data with open source technologies opens up the possibility of replacing current processes. Satellite images can act as a proxy for existing data collection techniques, such as surveys and censuses, to predict the economic well-being of a region. The aim of the project is to build on a prototype that was created using Census data and LandSat data for India. The next iteration will explore opportunities to use Demographic Health Surveys, Open Street Map, Sentinel and nightlight data. The initial prototype produced a model with an accuracy of almost 70 percent. The goal is a model for India that can be adapted and scaled to other countries.

Health, Environment, Social Services, International development, Economic Development
This project is completed

Project scope

Project goal(s)

Create an algorithm based on landscape features, combining the Normalized Difference Vegetation Index (NDVI), Normalized Difference Built-up Index (NDBI), and Normalized Difference Water Index (NDWI) of each image, and matched with ground truth data (Census, Demographic Health Surveys, Open Street Map and others).

Assess the opportunities and challenges of using different data sources as a proxy for well-being (LandSat, Sentinel, DHS, OSM, and others such as nightlights).

Present recommendations for scaling beyond India
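
The landscape features named in the first goal can be sketched as follows. This is a minimal illustration, not the prototype's actual code: it assumes the bands are already loaded as NumPy reflectance arrays, it uses the common green/NIR form of NDWI (a NIR/SWIR form also exists), and the function names and the choice of per-image means as features are hypothetical.

```python
import numpy as np


def normalized_difference(a, b):
    """Return (a - b) / (a + b) per pixel, with 0 where the denominator is 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out


def landscape_features(green, red, nir, swir):
    """Summarize one image tile as [mean NDVI, mean NDBI, mean NDWI].

    Assumption: inputs are same-shaped reflectance arrays for one tile.
    """
    ndvi = normalized_difference(nir, red)    # vegetation: (NIR - Red) / (NIR + Red)
    ndbi = normalized_difference(swir, nir)   # built-up: (SWIR - NIR) / (SWIR + NIR)
    ndwi = normalized_difference(green, nir)  # water: (Green - NIR) / (Green + NIR)
    return np.array([ndvi.mean(), ndbi.mean(), ndwi.mean()])
```

Each tile's feature vector could then be paired with a ground truth label for the same area, which is the matching step the goal describes.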


Planet recently made high resolution Sentinel data available at - Census data is available for 2011 and Demographic Health Surveys are available for 2015. The aim is to have a time series, so it is important to train models on satellite and ground truth data from matching periods. Some Open Street Map data is available in certain areas. Creativity with available data sources is encouraged when building the model, and alternative data sources are therefore welcome.
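
One way to think about matching satellite and ground truth data in time is to pair each ground truth source with the closest available imagery year. The sketch below is only an illustration: the ground truth years come from the text above, while the imagery years, the `max_gap` tolerance, and the function name are hypothetical.

```python
def pair_by_year(ground_truth_years, imagery_years, max_gap=1):
    """Map each ground truth source to the closest imagery year.

    A source maps to None when no imagery falls within max_gap years.
    Ties are broken in favor of the earlier imagery year.
    """
    pairs = {}
    for source, year in ground_truth_years.items():
        closest = min(imagery_years, key=lambda y: (abs(y - year), y))
        pairs[source] = closest if abs(closest - year) <= max_gap else None
    return pairs


# Ground truth years as stated in the project scope.
ground_truth_years = {"census": 2011, "dhs": 2015}
# Hypothetical imagery acquisition years, for illustration only.
imagery_years = [2010, 2013, 2015, 2016]

pairs = pair_by_year(ground_truth_years, imagery_years)
```

Sources that map to `None` have no imagery close enough in time, which flags where a training pair would mix conditions from different periods.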

Scope version notes