Main Corpus of OECD reports on water published between 2009 and 2021
This is a corpus of 55 PDF reports published in the OECD Studies on Water series between 2009 and 2021. It forms the basis of a project funded by the e-Science center and the ODISSEI consortium, awarded to ESSB at EUR in 2022.
The code produced is available on GitHub at the following links, where the methodology, sampling and results are also documented.
Structural Topic Modelling (STM) --
GitHub link: https://github.com/disaster-capitalism/topic-modelling
Word frequencies of executive summaries vis-à-vis main texts --
GitHub link: https://github.com/disaster-capitalism/topic-modelling
Named Entity Recognition and Semantic Label Mapping -- https://github.com/disaster-capitalism/named-entities
The data is licensed by the OECD and can be obtained from the OECD on request. A description of the data can be found here -- https://www.oecd-ilibrary.org/environment/oecd-studies-on-water_22245081