Our code is developed based on the open-source project MatchZoo.
We use Python 3.7, and the main dependencies are listed in requirements.txt.
conda env create -f environment.yml
Some other requirements need to be installed by hand.
ICU Tokenizer
# Install the ICU library with conda
conda install icu pkg-config
# Or if you wish to use the latest version of the ICU library,
# the conda-forge channel typically contains a more up to date version.
conda install -c conda-forge icu
# macOS
CFLAGS="-std=c++11" PATH="/usr/local/opt/icu4c/bin:$PATH" \
pip install ICU-Tokenizer
# Ubuntu
CFLAGS="-std=c++11" pip install ICU-Tokenizer
Emoji Translation
git clone [email protected]:jhliu17/emoji.git
cd emoji
python setup.py install
You can download the data from this Google Drive storage. To process text and image data, please read the details here.
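Before training, it can help to confirm that the manually installed packages above are importable. The following is a minimal sketch and not part of the repository. It assumes that ICU-Tokenizer exposes the import name `icu_tokenizer` and that the emoji package installs as `emoji`; neither name is confirmed here, so adjust them if your installation differs. `missing_modules` is a hypothetical helper introduced only for this check.

```python
import importlib.util
import sys

# Assumed import names for the manually installed packages.
# These are guesses, not confirmed by this README; edit as needed.
ASSUMED_MODULES = ["icu_tokenizer", "emoji"]


def missing_modules(names):
    """Return the top-level module names that cannot be located."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Report the interpreter version, since the project targets Python 3.7.
    print("Python", sys.version.split()[0])
    missing = missing_modules(ASSUMED_MODULES)
    if missing:
        print("Not importable:", ", ".join(missing))
    else:
        print("All manually installed packages were found.")
```

The check only looks the modules up without importing them, so it will not catch a package that is installed but fails at import time because of a broken ICU build.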
Training
The experiment configuration settings are listed in the config folder.
# Single GPU or DataParallel
# [ckpt] is optional, for continuing training from a checkpoint
sh scripts/train.sh device_ids config_file [ckpt]
# Or distributed training
sh scripts/train_dist.sh device_ids n_procs config_file [ckpt]
Evaluation
sh scripts/eval.sh device_ids config_file ckpt
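
The three scripts take their positional arguments in slightly different orders, which is easy to get wrong when launching many runs. The sketch below assembles the argument lists shown above from Python. It is only an illustration: `build_command` is a hypothetical helper, and the example device id, config path, and checkpoint path are placeholders rather than files shipped with the repository. The expected format of `device_ids` is not specified in this README, so the value is passed through unchanged.

```python
import shlex

# Script paths as they appear in the commands above.
SCRIPTS = {
    "train": "scripts/train.sh",
    "train_dist": "scripts/train_dist.sh",
    "eval": "scripts/eval.sh",
}


def build_command(mode, device_ids, config_file, ckpt=None, n_procs=None):
    """Build the argument list for one of the train/eval scripts."""
    if mode not in SCRIPTS:
        raise ValueError("unknown mode: %r" % mode)
    args = ["sh", SCRIPTS[mode], str(device_ids)]
    if mode == "train_dist":
        # Distributed training takes n_procs before the config file.
        if n_procs is None:
            raise ValueError("train_dist requires n_procs")
        args.append(str(n_procs))
    args.append(config_file)
    if mode == "eval" and ckpt is None:
        # Evaluation lists ckpt as a required argument.
        raise ValueError("eval requires ckpt")
    if ckpt is not None:
        args.append(ckpt)
    return args


if __name__ == "__main__":
    # Placeholder values; replace them with your own device ids and paths.
    cmd = build_command("train", "0", "config/example.yaml")
    print(" ".join(shlex.quote(part) for part in cmd))
```

The returned list can be passed to `subprocess.run(cmd, check=True)` from the repository root to launch the run.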