Identifying economic and financial incentives for forest and landscape restoration in Latin America using Natural Language Processing

World Resources Institute


Forest and landscape restoration is a cross-cutting agenda that traverses sectors such as agriculture, forestry, water and natural resources. While this cross-cutting nature makes restoration an attractive policy measure for carbon sequestration, mitigation, and adaptation, it complicates policy analysis. The sheer volume of policy text prevents researchers and decision makers from identifying misalignment and monitoring policy and agenda shifts as they evolve. Working through such a large corpus also undermines the transparency, objectivity, accessibility, and scalability of policy analysis. We propose to standardize and scale policy analysis, alignment, and agenda setting with natural language processing (NLP). A proof of concept we developed previously demonstrated that NLP can quickly summarize agenda-specific information from policies. This project aims to identify financial and economic incentives that support enabling conditions for Nature Based Solutions.

Environment International development
Project stages: New, Scoping, Scoping QA, Staffing, In progress, Final QA, Completed
Volunteers are working on this project

Project scope

Project goal(s)

The ultimate goal of this project is to create a tool that helps researchers and policy makers identify relevant policy documents, reducing the time and labor needed to sort through the mass of policy documents online.

The goal for the first phase of this project is to assist the World Resources Institute (WRI) in setting up data infrastructure and generating initial classification models. This phase consists of the following steps and deliverables:

-Assisting WRI in retrieving policy documents from government websites, resulting in a larger dataset and documentation on data retrieval procedures that will be used in future phases of this project.

-Designing and building a data repository for WRI that will be used in future phases of this project.

-Creating an initial supervised NLP classification model that accurately classifies policies relevant to forest and landscape restoration. WRI has identified 7+ classes that it would ultimately like the model to predict. This first step will build a model that only classifies policies as relevant or non-relevant. Future phases of this project will predict all classes, but this initial phase only has enough data to predict relevant/non-relevant.
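To make the relevant/non-relevant step concrete, here is a minimal sketch of one possible baseline: a multinomial Naive Bayes classifier over word counts, written with the Python standard library only. It is not the model built for WRI. The class name `NaiveBayes`, the `tokenize` helper, and the four Spanish snippets are all hypothetical and stand in for real policy documents.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and keep runs of letters, including Spanish accents."""
    return re.findall(r"[a-záéíóúüñ]+", text.lower())


class NaiveBayes:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.classes = sorted(set(labels))
        self.doc_counts = Counter(labels)  # documents per class
        self.word_counts = {c: Counter() for c in self.classes}
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        # Vocabulary is every word seen in any training document.
        self.vocab = set().union(*self.word_counts.values())
        return self

    def predict(self, text):
        # Words never seen in training carry no evidence, so drop them.
        tokens = [t for t in tokenize(text) if t in self.vocab]
        n_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for c in self.classes:
            total_words = sum(self.word_counts[c].values())
            score = math.log(self.doc_counts[c] / n_docs)  # log prior
            for t in tokens:
                count = self.word_counts[c][t] + 1  # add-one smoothing
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = c, score
        return best_label


# Hypothetical toy examples, not taken from the WRI dataset.
train_texts = [
    "programa de restauración de bosques y paisajes degradados",
    "incentivos para la reforestación y conservación de suelos",
    "reglamento de tránsito y transporte urbano",
    "ley de presupuesto para educación superior",
]
train_labels = ["relevant", "relevant", "non-relevant", "non-relevant"]

model = NaiveBayes().fit(train_texts, train_labels)
forest_prediction = model.predict("plan nacional de restauración de bosques")
transit_prediction = model.predict("reglamento de transporte urbano")
print(forest_prediction, transit_prediction)
```

A word-count baseline like this is cheap to train on a small labeled set, which suits a first phase with limited data. A production model would more likely rely on an established NLP library and a held-out test set.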

Future Project Goals:
• Identify financial and economic incentives in policies
• Identify disincentives, misalignment and conflicts in policies (if possible)
• Create a heat map which determines the relevance of policies to forest and landscape restoration

Interventions and Actions

This project will help those making or studying environmental policy work more quickly and effectively. The final models will help policy makers identify countries with many or few environmental policies, so they can direct their efforts toward countries whose policies are more problematic.


The data are Spanish-language policy documents from several countries across Central and South America. The data are collected and compiled by WRI from publicly available sources. WRI has approximately 60 cases of policy relevant to forest and landscape restoration and 100+ cases not relevant to it. Nine other classes are coded by WRI for the next phase of the project. WRI is in the process of expanding the number of cases in the dataset to support more complex models in future phases.

Analysis Needed

This phase of the project:
• Identify which policies relate to forest and landscape restoration

Future analysis needed:
• Identify which policies identify incentives (in particular financial and economic incentives) for forest and landscape restoration
• Identify which policies identify disincentives for forest and landscape restoration
• Create a heat map to understand related policies

Validation Methodology

The team will provide guidance on which incentives we seek to understand. We will manually analyze policies for certain countries, and this manual analysis can then be compared with the machine learning results.

The quality of the machine learning model will be judged by its ability to accurately classify relevant policies.
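One way to compare manual labels with model output is to compute precision, recall, and F1 for the relevant class. The sketch below assumes the manual analysis and the model each produce one label per policy. The function name `precision_recall_f1` and the eight example labels are hypothetical, and the choice of these metrics is an assumption rather than a stated project requirement.

```python
def precision_recall_f1(true_labels, predicted_labels, positive="relevant"):
    """Score predictions for one class against manually assigned labels."""
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denominator = precision + recall
    f1 = 2 * precision * recall / denominator if denominator else 0.0
    return precision, recall, f1


# Hypothetical labels for eight policies: manual review vs. model output.
manual_labels = ["relevant", "relevant", "relevant", "non-relevant",
                 "non-relevant", "non-relevant", "non-relevant", "non-relevant"]
model_labels = ["relevant", "relevant", "non-relevant", "relevant",
                "non-relevant", "non-relevant", "non-relevant", "non-relevant"]

precision, recall, f1 = precision_recall_f1(manual_labels, model_labels)
accuracy = sum(m == p for m, p in zip(manual_labels, model_labels)) / len(manual_labels)
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f} accuracy={accuracy:.2f}")
```

Because the dataset holds fewer relevant than non-relevant cases, reporting precision and recall for the relevant class alongside overall accuracy gives a clearer picture of how many relevant policies the model misses.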


The next phase of this project will work with an expanded dataset to create a model that not only correctly identifies relevant policies but also classifies policies into all 9 categories identified by WRI.

This work will contribute to the policy accelerator which seeks to work with governments to identify tangible policy recommendations. If successful, it will act as a prototype for scaling in other languages.

The aim would be to create a tool that could be used to understand how new policy changes support or conflict with Nature Based Solutions (such as forest and landscape restoration). This would enable governments to easily create supporting policies, help environmentalists advocate for positive change, and bring greater transparency to the creation of enabling conditions for Nature Based Solutions at national and jurisdictional levels.
