This project addresses the digital divide by focusing on text generation for people with cognitive disabilities. It introduces the ETR-fr dataset, which contains 523 pairs of French texts aligned with the European Easy-to-Read (ETR) standard. ETR texts use simple vocabulary, a clear structure, concrete examples, and contextual summaries to meet the needs of this population. The project also presents a generative model obtained by fine-tuning the mBARThez backbone with LoRA. Compared with full fine-tuning and Prefix-Tuning on BARThez, Mistral-7B, and mBART, this model performs strongly at generating ETR content. The findings highlight the effectiveness of computer-aided text generation for this purpose.
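
To illustrate why LoRA is a lightweight alternative to full fine-tuning, the sketch below counts trainable parameters for a single weight matrix. LoRA freezes the original matrix of shape `d_out x d_in` and trains two low-rank factors of shapes `d_out x r` and `r x d_in`. The dimensions and rank used here are illustrative assumptions, not the configuration used in this project.

```python
def full_param_count(d_out: int, d_in: int) -> int:
    """Trainable parameters when the whole weight matrix is updated."""
    return d_out * d_in


def lora_param_count(d_out: int, d_in: int, rank: int) -> int:
    """Trainable parameters for the two LoRA factors B (d_out x r) and A (r x d_in)."""
    return rank * (d_out + d_in)


if __name__ == "__main__":
    # Hypothetical values: a 1024 x 1024 projection and rank 8.
    d_out, d_in, rank = 1024, 1024, 8
    full = full_param_count(d_out, d_in)
    lora = lora_param_count(d_out, d_in, rank)
    print(f"full: {full}, LoRA: {lora}, ratio: {lora / full:.4%}")
```
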

Checkpoints for each model and fine-tuning method, with file sizes:

| Model | LoRA | Prefix-Tuning | Full Fine-Tuning |
|---|---|---|---|
| mBART | Link (104.8 MB) | Link (147.6 MB) | Link (2.1 GB) |
| mBARThez | Link (105.0 MB) | Link (577.4 MB) | Link (1.7 GB) |
| mT5 | Link (78.8 MB) | Link (218.4 MB) | Link (1.8 GB) |
| Mistral-7B | Link (278.9 MB) | Link (2.0 GB) | - |
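
As a rough reading aid for the table above, the following sketch computes how large each LoRA checkpoint is relative to the corresponding fully fine-tuned checkpoint. The sizes are copied from the table; the conversion 1 GB = 1000 MB and the names `SIZES_MB` and `lora_fraction` are assumptions made for this example.

```python
# Checkpoint sizes in MB, taken from the table (GB converted assuming 1 GB = 1000 MB).
SIZES_MB = {
    "mBART": {"lora": 104.8, "prefix": 147.6, "full": 2100.0},
    "mBARThez": {"lora": 105.0, "prefix": 577.4, "full": 1700.0},
    "mT5": {"lora": 78.8, "prefix": 218.4, "full": 1800.0},
    "Mistral-7B": {"lora": 278.9, "prefix": 2000.0, "full": None},  # no full fine-tuning listed
}


def lora_fraction(model: str):
    """Return LoRA size divided by full fine-tuning size, or None if unavailable."""
    sizes = SIZES_MB[model]
    if sizes["full"] is None:
        return None
    return sizes["lora"] / sizes["full"]


if __name__ == "__main__":
    for name in SIZES_MB:
        fraction = lora_fraction(name)
        if fraction is None:
            print(f"{name}: no full fine-tuning checkpoint listed")
        else:
            print(f"{name}: LoRA checkpoint is {fraction:.1%} of the full checkpoint")
```

For the three models with a full fine-tuning checkpoint, the LoRA files come out at well under a tenth of the full size.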