List of resources and tools developed with a focus on Portuguese.
Updated Mar 20, 2024
End-to-end Python implementation of Muço’s (2025) corruption measurement framework. Combines an NLP pipeline (regex extraction, Porter stemming, TF-IDF), PCA-based dimensionality reduction, and fixed-effects OLS to quantify institutional quality from Brazilian audit reports. Includes supervised-learning robustness checks and leave-one-out (LOO) sensitivity analysis.
This repository contains the official Python code and resources for the research paper: "Portuguese Automated Fact-checking: Information Retrieval with Claim extraction".
Reproducibility materials for the article "LexIris-pt and LexBert-pt: Specialized Sentence Embeddings for Legal Similarity in Brazilian Portuguese".
Portuguese split of the MQA dataset.
Comparative study of 23 LLMs for Brazilian Portuguese sentiment analysis via in-context learning. Evaluates multilingual vs Portuguese-specialized models across 12 datasets. Code and data included.
CurupiraIA: A Brazilian Portuguese hate speech detection model using BERT fine-tuning, inspired by folklore guardianship principles to protect digital communities from toxic content.
Reproducibility materials for the article "Classification of the Conciliation Profile in Initial Petitions in the Brazilian Judiciary".
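
As a rough illustration of the TF-IDF step named in the corruption-measurement entry above, the following minimal sketch weights terms across a few short texts using only the Python standard library. It is not code from that repository: the function names `tokenize` and `tf_idf`, the regex tokenizer, the unsmoothed `log(N / df)` weighting, and the sample sentences are all assumptions made for illustration, and the stemming and PCA steps are omitted.

```python
import math
import re
from collections import Counter


def tokenize(text):
    # Lowercase the text and keep runs of letters, including the
    # accented characters common in Portuguese (hypothetical tokenizer).
    return re.findall(r"[a-záàâãéêíóôõúç]+", text.lower())


def tf_idf(docs):
    # Return one {term: weight} dict per document, using
    # tf = count / document length and idf = log(N / document frequency).
    tokenized = [tokenize(doc) for doc in docs]
    n_docs = len(tokenized)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens)
        weights.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


if __name__ == "__main__":
    # Made-up sample sentences, not taken from any audit report.
    sample = ["irregularidade na licitação", "licitação regular"]
    for doc_weights in tf_idf(sample):
        print(doc_weights)
```

With this unsmoothed weighting, a term that occurs in every document receives a weight of zero, so only terms that distinguish one text from another contribute to later steps such as dimensionality reduction.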