
# Natural Language Inference

The dataset is the Stanford Natural Language Inference (SNLI) corpus, which we treat as a three-way classification task. We use an encoder-attention-decoder architecture and stack two additional BiRNN layers on top of the final sequence representation. Both GloVe word embeddings and character embeddings are used for the word-level representation. The main experimental results are summarized below.

| Model | #Params | Base ACC | Time | +LN ACC | Time | +BERT ACC | Time | +LN+BERT ACC | Time |
|-------|---------|----------|------|---------|------|-----------|------|--------------|------|
| Rocktaschel et al. (2016) | 250K | 83.50 | - | - | - | - | - | - | - |
| This work: LSTM | 8.36M | 84.27 | 0.262 | 86.03 | 0.432 | 89.95 | 0.544 | 90.49 | 0.696 |
| This work: GRU | 6.41M | 85.71 | 0.245 | 86.05 | 0.419 | 90.29 | 0.529 | 90.10 | 0.695 |
| This work: ATR | 2.87M | 84.88 | 0.210 | 85.81 | 0.307 | 90.00 | 0.494 | 90.28 | 0.580 |
| This work: SRU | 5.48M | 84.28 | 0.258 | 85.32 | 0.283 | 89.98 | 0.543 | 90.09 | 0.555 |
| This work: LRN | 4.25M | 84.88 | 0.209 | 85.06 | 0.223 | 89.98 | 0.488 | 89.93 | 0.506 |

LN: layer normalization; Time: seconds per training batch, measured over 1k training steps.
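
As a rough illustration of the word-level representation described above, the sketch below concatenates a GloVe-style word vector with a pooled character embedding. All names, dimensions, and the max-pooling choice are illustrative assumptions, not the repository's actual code:

```python
import numpy as np

# Toy dimensions; in the real model the word table comes from GloVe and
# the character table is learned. Values here are random placeholders.
word_dim, char_dim, vocab_size, char_vocab_size = 4, 3, 10, 8
rng = np.random.default_rng(0)
word_emb = rng.normal(size=(vocab_size, word_dim))
char_emb = rng.normal(size=(char_vocab_size, char_dim))

def represent(word_ids, char_ids):
    """Concatenate each word vector with a max-pooled character vector."""
    w = word_emb[word_ids]              # [seq_len, word_dim]
    c = char_emb[char_ids].max(axis=1)  # pool over chars -> [seq_len, char_dim]
    return np.concatenate([w, c], axis=-1)

seq = represent(np.array([1, 2, 3]), np.array([[0, 1], [2, 3], [4, 5]]))
print(seq.shape)  # (3, 7)
```

The resulting per-token vectors are what the encoder and the stacked BiRNN layers consume.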

## Requirements

tensorflow >= 1.8.1

## How to Run?

  • download and preprocess dataset

    • The dataset link: https://nlp.stanford.edu/projects/snli/

    • Prepare separate data files:

      We provide a simple processing script convert_to_plain.py in the scripts folder. By calling:

      python convert_to_plain.py snli_1.0/[ds].txt
      

      you can get the *.p, *.q and *.l files referenced in config.py. [ds] stands for snli_1.0_train.txt, snli_1.0_dev.txt, or snli_1.0_test.txt. Only 'entailment', 'neutral' and 'contradiction' instances are preserved; all others are dropped.
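
For reference, here is a hedged sketch of the kind of filtering this conversion performs. Column names follow the official SNLI .txt header; the exact behaviour of convert_to_plain.py may differ:

```python
# Keep only the three gold labels; '-' (no annotator consensus) is dropped.
KEEP = {"entailment", "neutral", "contradiction"}

def convert(lines):
    """Turn SNLI tab-separated lines into (premise, hypothesis, label) triples."""
    header = lines[0].rstrip("\n").split("\t")
    i_lab = header.index("gold_label")
    i_p, i_q = header.index("sentence1"), header.index("sentence2")
    out = []
    for line in lines[1:]:
        cols = line.rstrip("\n").split("\t")
        if cols[i_lab] in KEEP:
            out.append((cols[i_p], cols[i_q], cols[i_lab]))
    return out

sample = [
    "gold_label\tsentence1\tsentence2\n",
    "entailment\tA man eats.\tSomeone eats.\n",
    "-\tNo agreement.\tNo agreement.\n",
]
print(convert(sample))  # [('A man eats.', 'Someone eats.', 'entailment')]
```

The triples would then be written out as the *.p, *.q and *.l files.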

    • Prepare embedding and vocabulary

      Download the pre-trained GloVe embeddings, then prepare the character and word vocabularies using vocab.py as follows:

      # word embedding & vocabulary
      python vocab.py --embeddings [path-to-glove-embedding] train.p,train.q,dev.p,dev.q,test.p,test.q word_vocab
      # char embedding
      python vocab.py --char train.p,train.q,dev.p,dev.q,test.p,test.q char_vocab
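
A minimal sketch of how such a word vocabulary can be built (hypothetical logic; the actual behaviour is defined in vocab.py): count corpus tokens, keep those covered by the pretrained GloVe vectors, and reserve ids for special symbols.

```python
from collections import Counter

def build_vocab(sentences, glove_words, specials=("<pad>", "<unk>")):
    """Map special symbols and GloVe-covered corpus tokens to integer ids,
    most frequent tokens first."""
    counts = Counter(tok for s in sentences for tok in s.split())
    kept = [w for w, _ in counts.most_common() if w in glove_words]
    return {w: i for i, w in enumerate(list(specials) + kept)}

vocab = build_vocab(["a man eats", "a dog runs"], {"a", "man", "dog"})
print(vocab)  # {'<pad>': 0, '<unk>': 1, 'a': 2, 'man': 3, 'dog': 4}
```

A character vocabulary follows the same pattern, iterating over characters instead of whitespace tokens.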
      
    • Download BERT pre-trained embedding (if you plan to work with BERT)

  • training and evaluation

    • Train the model as follows:
    # configure your CUDA library if necessary
    export CUDA_ROOT=XXX
    export PATH=$CUDA_ROOT/bin:$PATH
    export LD_LIBRARY_PATH=$CUDA_ROOT/lib64:$LD_LIBRARY_PATH
    
    # LRN
    python code/run.py --mode train --config config.py --parameters=gpus=[0],cell="lrn",layer_norm=False,output_dir="train_no_ln" >& log.noln
    # LRN + LN
    python code/run.py --mode train --config config.py --parameters=gpus=[0],cell="lrn",layer_norm=True,output_dir="train_ln" >& log.ln
    # LRN + BERT
    python code/run.py --mode train --config config_bert.py --parameters=gpus=[0],cell="lrn",layer_norm=False,output_dir="train_no_ln_bert" >& log.noln.bert
    # LRN + LN + BERT
    python code/run.py --mode train --config config_bert.py --parameters=gpus=[0],cell="lrn",layer_norm=True,output_dir="train_ln_bert" >& log.ln.bert
    

    Other hyperparameter settings are available in the given config.py.
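
The --parameters string packs comma-separated key=value overrides on top of config.py. A minimal sketch of how such a string can be parsed (illustrative only; the real parser lives in the zero codebase):

```python
import ast
import re

def parse_overrides(s):
    """Parse 'gpus=[0],cell="lrn",layer_norm=False' into a dict of
    Python values."""
    # split on commas that are not inside [...] so list values survive
    parts = re.split(r",(?![^\[]*\])", s)
    out = {}
    for part in parts:
        key, val = part.split("=", 1)
        out[key] = ast.literal_eval(val)
    return out

print(parse_overrides('gpus=[0],cell="lrn",layer_norm=False'))
# {'gpus': [0], 'cell': 'lrn', 'layer_norm': False}
```

Any key parsed this way would override the value of the same name in the config file.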

    • Test the model as follows:
    # LRN
    python code/run.py --mode test --config config.py --parameters=gpus=[0],cell="lrn",layer_norm=False,output_dir="train_no_ln/best",test_output="out.noln" >& log.noln.test
    # LRN + LN
    python code/run.py --mode test --config config.py --parameters=gpus=[0],cell="lrn",layer_norm=True,output_dir="train_ln/best",test_output="out.ln" >& log.ln.test
    # LRN + BERT
    python code/run.py --mode test --config config_bert.py --parameters=gpus=[0],cell="lrn",layer_norm=False,output_dir="train_no_ln_bert/best",test_output="out.noln.bert" >& log.noln.bert.test
    # LRN + LN + BERT
    python code/run.py --mode test --config config_bert.py --parameters=gpus=[0],cell="lrn",layer_norm=True,output_dir="train_ln_bert/best",test_output="out.ln.bert" >& log.ln.bert.test
    

## Credits

Source code structure is adapted from zero.