All pdfs

Reason: No license to distribute

Main Corpus on OECD reports on water between 2009 and 2021

dataset
posted on 2022-12-08, 14:23 authored by Farhad Mukhtarov

This is a corpus of 55 reports in PDF format published in the OECD Studies on Water series between 2009 and 2021. It forms the basis of a project funded by the e-Science center and the ODISSEI consortium, awarded to ESSB at EUR in 2022.

The code produced is available on GitHub at the following links, with the methodology, sampling, and results documented there.

Structural Topic Modelling (STM) --

GitHub link: https://github.com/disaster-capitalism/topic-modelling

Word frequencies of executive summaries vis-à-vis main texts --

GitHub link: https://github.com/disaster-capitalism/topic-modelling

Named Entity Recognition and Semantic Label Mapping --  https://github.com/disaster-capitalism/named-entities 


The data are licensed by the OECD and can be obtained from the OECD on request. A description of the data can be found here -- https://www.oecd-ilibrary.org/environment/oecd-studies-on-water_22245081

Funding

ODISSEI- Markets for Resilience or Disaster Capitalism

History