NLP CS 2025 Kaggle Challenge

Competition Link

You can access the competition via the following link: Kaggle. Our team is named meow.

Report

The report is available in the file NLP_report.pdf, accessible via this link.

Team Members

  • Rayane Bouaita
  • Erwan David
  • Pierre El Anati
  • Guillaume Faynot
  • Gabriel Trier

Description

Text classification with sparsely represented training data is not a trivial task. This repository presents our solution, which uses large language models (LLMs) to classify texts written in almost 390 different languages. After studying the provided data, we tried several approaches based on pretrained transformer models (XLM-RoBERTa and BERT). Our final model achieved an accuracy of 88.0%, placing our team in the top 10 of the ranking.
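For reference, accuracy here is the fraction of texts whose predicted label matches the true label. The snippet below is a minimal sketch of that metric, not the competition's scoring code: the `accuracy` function and the example labels are hypothetical and are not taken from this repository.

```python
from typing import Sequence


def accuracy(y_true: Sequence[str], y_pred: Sequence[str]) -> float:
    """Return the fraction of predictions that exactly match the gold labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if not y_true:
        raise ValueError("cannot compute accuracy on an empty set")
    # Count positions where the predicted label equals the gold label.
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


# Hypothetical language labels, for illustration only.
gold = ["fra", "eng", "deu", "eng"]
pred = ["fra", "eng", "eng", "eng"]
score = accuracy(gold, pred)
print(f"accuracy = {score:.1%}")
```

With many sparsely represented languages, a single accuracy figure can hide weak performance on rare classes, so it may be worth inspecting per-language results as well.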

Installation

To install the required packages, you can run the following command:

pip install -r requirements.txt

Usage

To train the model, you can run the following command from the root directory:

python models/roberta.py

You can also use the model.ipynb notebook to train the model.