Alignment-Based Approach for automatic modernization of French texts from the 17th to the 18th century
Online demo at https://igm.univ-mlv.fr/~gambette/text-processing/aba/
- With make
    make
- Without make
Add an extra line containing ASR_metrics at the end of requirements.txt if you want to use the evaluation metrics.
    pip install -r requirements.txt
- Download PARALLEL17 and put it into the download folder, or run the script
    python -m aba.download_git 'https://github.com/PhilippeGambette/PARALLEL17.git'
- Align PARALLEL17 by words
    python -m aba.align_words
- Extract dictionaries from PARALLEL17
    python -m aba.analyze
- Download Morphalou
- Copy morphalou/4/Morphalou3.1_formatCSV_toutEnUn/Morphalou3.1_CSV.csv to the download folder
- Run script
    python -m aba.extract_dic_morphalou

Extract an old French → modern French dictionary from Wikisource.
    python -m aba.extract_dic_wikisource

Extract a dictionary from multiple .dic files located in the resources folder.
    python -m aba.extract_dic_resources

    python -m aba.modernize_corpus

Modernize a text in old French. [1]
    python -m aba.modernize [-h] text_old_path

Modernize a text in old French and evaluate the result by comparing it with a reference version stored in the file TEXT_NEW_PATH.
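
This README does not say which metrics the evaluation reports. As one common way to compare a modernized text with a reference, here is a minimal sketch of a word error rate: the word-level edit distance divided by the number of reference words. The function name `word_error_rate` is hypothetical and the code is not taken from this package or from ASR_metrics.

```python
def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        # Assumption: an empty reference scores 0 only against an empty hypothesis.
        return 0.0 if not hyp else float("inf")
    # previous[j] is the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    previous = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        current = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = previous[j - 1] + (ref_word != hyp_word)
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            current.append(min(substitution, deletion, insertion))
        previous = current
    return previous[-1] / len(ref)
```

A lower value means the modernized text is closer to the reference. The project's own evaluation is run with:
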
    python -m aba.modernize_and_evaluate [-h] -n TEXT_NEW_PATH text_old_path

Open a labeled dictionary and display an interactive Plotly pie chart showing the frequency of modernization rules. A copy of the chart is saved in data/rules_chart.html.
    python -m aba.rules_chart

Search the 2-column .tsv files in a given directory for two corresponding strings, old and new.
Print the files, rows and lines where both strings appear.
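
To make the search concrete, here is a minimal sketch of the idea, not the code of `aba.find_strings`. It assumes that the old form is in the first column and the new form in the second, that a substring match is enough, and that the files are UTF-8; the function name `find_pairs` is hypothetical.

```python
import csv
from pathlib import Path


def find_pairs(directory, old, new):
    """Yield (file name, row number, row) for each row whose first column
    contains `old` and whose second column contains `new`."""
    for path in sorted(Path(directory).glob("*.tsv")):
        with path.open(encoding="utf-8", newline="") as handle:
            # QUOTE_NONE keeps quote characters in the text as ordinary characters.
            reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
            for row_number, row in enumerate(reader, start=1):
                if len(row) >= 2 and old in row[0] and new in row[1]:
                    yield path.name, row_number, row
```

The command itself is invoked as follows:
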
    python -m aba.find_strings [-h] [-d DIRECTORY] old new

Run the tests.

    py.test

Footnotes
[1] Paths must be written with forward slashes (/).
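
If a path comes from a Windows shell with backslashes, one way to rewrite it before passing it to the commands is the standard `pathlib` module. This is a small sketch; the helper name `to_forward_slashes` and the example path are hypothetical.

```python
from pathlib import PureWindowsPath


def to_forward_slashes(path_text):
    """Return `path_text` with backslash separators rewritten as forward slashes."""
    # PureWindowsPath accepts both separators and does not touch the file system.
    return PureWindowsPath(path_text).as_posix()


print(to_forward_slashes(r"corpus\old\text.txt"))
```

Paths that already use forward slashes are returned unchanged.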