
HDRs 2023

Analysing and indexing scientific texts
Author : Florian Boudin

Keywords : Information retrieval, Natural language processing, Keyword indexing, Scientific texts, Graph-based methods, Evaluation, Scientific writing assistance

The work presented in this "Habilitation à Diriger des Recherches" (Accreditation to Supervise Research) focuses on the analysis and indexing of scientific texts. It lies at the intersection of two research themes: Natural Language Processing (NLP), which involves the analysis, understanding, and generation of natural language, and Information Retrieval (IR), which studies ways to retrieve information from a collection of documents. We are interested in scholarly document retrieval, that is, searching the scientific literature (e.g., articles, books, theses) for documents related to a specific subject of study. More specifically, our research aims to enrich the metadata associated with documents in order to improve their accessibility and dissemination. Our work focuses on the development of automated methods for keyword generation, characterized by their use of graph-based techniques and node ranking algorithms. We examine the indirect evaluation of automatically generated keywords through application-specific tasks and their use in search engines and academic recommendation systems. We present our efforts in constructing linguistic resources and developing software tools, and in disseminating them within the scientific community. Finally, we conclude with some perspectives on keyword indexing and, more broadly, on emerging research at the intersection of NLP and IR.

Defense date : 20-06-2023
Jury president : Aurélie Névéol
Jury :
  • Aurélie Névéol
  • Antoine Doucet
  • Jacques Savoy
  • Béatrice Daille
  • Richard Dufour

Copyright : LS2N 2017