Research Master Pre-Defense

Retrieving ESG Exclusion Level Information from ETF Methodology Books


  • Series
    Research Master Defense
  • Speaker
    Chao Liang
  • Location
    Tinbergen Institute Amsterdam, room 1.02
    Amsterdam
  • Date and time

    August 31, 2022
    10:00 - 11:00

No area of our planet is invulnerable to the catastrophic effects of natural hazards. With environmental degradation, economic disruption and social inequity occurring alongside them, achieving the sustainable development goals is urgent. In business, evaluating environmental, social and governance (ESG) indicators supports these goals and contributes to a more resilient society.

To ease the incorporation of these factors into the management of investment portfolios, it is essential to efficiently disclose the relevant exclusion criteria of exchange-traded funds (ETFs). Yet little research has been conducted on this challenging task, and no well-established solution for automatic exclusion level identification has been developed. Initial research applied a rule-based extraction method to benchmark methodology books. However, a more flexible and robust system is desired.

Thanks to recent advances in artificial intelligence, especially deep learning for natural language processing (NLP), large language models can capture rich syntactic and semantic information from text. In this thesis, we are therefore inspired to explore solutions for retrieving critical information from domain-specific documents using unconventional NLP methods.

Specifically, based on transfer learning paradigms, we designed three solution strategies. Pre-trained language models were applied both as feature extractors and as backbone networks fine-tuned for customised downstream tasks. The three strategies implement document classification, sequence classification, and extractive question answering, respectively. The experimental results show promising accuracy and address the proposed research questions. Further research on generative question answering and cross-modal learning is possible. Finally, our study helps combine information from different document sources, which, in turn, may help investors concerned with green economic factors manage their portfolios effectively.
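
As a rough illustration of the extractive question answering strategy mentioned above, the sketch below shows only the final span-selection step that such models commonly use: each token receives a start score and an end score, and the answer is the span with the highest combined score. It is not the thesis implementation. The function name `best_span`, the example sentence, and all scores are hypothetical and invented for illustration; in practice the scores would come from a fine-tuned language model.

```python
from typing import List, Tuple


def best_span(start_scores: List[float],
              end_scores: List[float],
              max_len: int = 5) -> Tuple[int, int]:
    """Return (start, end) token indices that maximise
    start_scores[start] + end_scores[end], subject to
    start <= end and a span of at most max_len tokens."""
    best = (0, 0)
    best_score = float("-inf")
    for s, s_score in enumerate(start_scores):
        # Only consider end positions within max_len tokens of the start.
        for e in range(s, min(s + max_len, len(end_scores))):
            score = s_score + end_scores[e]
            if score > best_score:
                best_score = score
                best = (s, e)
    return best


# Hypothetical tokens and scores (assumptions, not real model output).
tokens = ["The", "index", "excludes", "companies", "with",
          "thermal", "coal", "revenue", "above", "5%"]
start_scores = [0.1, 0.0, 0.2, 0.3, 0.1, 2.5, 0.4, 0.2, 0.9, 0.3]
end_scores = [0.0, 0.1, 0.1, 0.2, 0.0, 0.3, 1.0, 0.8, 0.2, 2.8]

s, e = best_span(start_scores, end_scores, max_len=5)
answer = " ".join(tokens[s:e + 1])
print(answer)  # thermal coal revenue above 5%
```

Capping the span length keeps the search cheap and prevents the model from returning an entire paragraph as the answer, which matters when the target is a short exclusion threshold inside a long methodology document.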