A Learner-Oriented Annotated Resource of French Multiword Expressions for Text Adaptation in Foreign Language Reading
Abstract
This article presents a learner-oriented annotated lexical resource of French multiword expressions (MWEs)
designed to support text adaptation in foreign language reading. MWEs, including idioms and collocations, pose
major comprehension challenges for learners because their meaning is often non-compositional or depends on
conventional lexical constraints. To address this issue, the study extends an existing verbal MWE database by
integrating nominal and verbal MWEs annotated according to a linguistically grounded typology distinguishing
idioms, opaque collocations, and transparent collocations. The resource was developed through a multi-step
methodology combining automatic extraction from pedagogical corpora, manual annotation using decision-tree-based
guidelines, and CEFR level assignment based on corpus distribution. The resulting dataset includes
approximately 2,700 expressions enriched with detailed linguistic and learner-relevant metadata. Annotation
campaigns involving native and non-native annotators showed moderate agreement, reflecting the gradient nature
of phraseological opacity. By linking phraseological complexity with learner proficiency, this resource provides a
reproducible framework for modeling MWE difficulty. It offers valuable support for text adaptation, readability
assessment, and the development of NLP-based educational tools, contributing to improved accessibility of French
texts for language learners.
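As an illustration of CEFR level assignment based on corpus distribution, the following minimal sketch assigns each expression the lowest CEFR level at which it is attested in a level-graded corpus. This "first attestation" rule, the `min_count` threshold, the function name, and the toy occurrence data are all assumptions made for illustration; the abstract does not specify the exact rule used in the resource.

```python
from collections import Counter

# CEFR levels in ascending order of proficiency.
CEFR_ORDER = ["A1", "A2", "B1", "B2", "C1", "C2"]


def assign_cefr_level(occurrence_levels, min_count=1):
    """Return the lowest CEFR level at which an expression is attested.

    occurrence_levels: one CEFR label per occurrence of the expression,
        taken from the level of the text in which it was found.
    min_count: hypothetical threshold; a level only counts if the
        expression occurs at least this many times at that level.
    Returns None when no level reaches the threshold.
    """
    counts = Counter(occurrence_levels)
    for level in CEFR_ORDER:
        if counts[level] >= min_count:
            return level
    return None


# Toy data: the level labels below are invented for the example and are
# not taken from the resource described in the article.
toy_corpus_hits = {
    "prendre une décision": ["A2", "B1", "B1", "B2"],
    "poser un lapin": ["B2", "C1"],
}

for mwe, hits in toy_corpus_hits.items():
    print(mwe, "->", assign_cefr_level(hits))
```

Raising `min_count` makes the assignment more conservative, since a single early occurrence no longer fixes the level of an expression.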
Keywords
multiword expressions; foreign language reading; readability; learner-oriented lexical resource;
decision-tree-based annotation
Authors
- Anna Kalinina, University of Strasbourg, UR 1339/LiLPa & ITI LiRiC
- Thomas François, UC Louvain, CENTAL
- Hélène Vassiliadou, University of Strasbourg, UR 1339/LiLPa & ITI LiRiC
- Amalia Todirascu, University of Strasbourg, UR 1339/LiLPa & ITI LiRiC
Cite as
KALININA, A., FRANÇOIS, T., VASSILIADOU, H. & TODIRASCU, A. (2026), “A Learner-Oriented Annotated Resource of French Multiword Expressions for Text Adaptation in Foreign Language Reading”, Proceedings of the Joint Workshop on Readability and Text Simplification (READIxTSAR) @ LREC 2026, ©ELRA Language Resources Association, Palma de Mallorca, Spain, 11 May 2026, pp. 181-192