Finetuning LLaMA on commonsense reasoning tasks using DoRA/NoRA

Note: Our implementation is based on the DoRA repository, and this is its original README. Step-by-step instructions for NoRA will be uploaded soon. The launch scripts can be found in ./scripts.

This directory includes the DoRA implementation and guidelines for reproducing the results in our paper.

Setup

Install dependencies

conda create -n dora_llama python=3.10
conda activate dora_llama
pip install -r requirements.txt

Datasets

Download the complete commonsense datasets from here and the commonsense 170k finetuning dataset from here, then organize the data as follows:

# Store the complete commonsense datasets
./dataset
# rest of the files
./experiment
./peft
# Finetuning commonsense dataset
./commonsense_170k.json
...
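Before launching a run, it can help to confirm that the layout above is in place. The following is a minimal sketch, assuming the four entries listed above are the ones that matter; the helper name missing_entries is hypothetical and not part of the repository.

```python
from pathlib import Path

# Entries taken from the layout shown above; anything else in the
# repository is not checked by this sketch.
EXPECTED = ["dataset", "experiment", "peft", "commonsense_170k.json"]


def missing_entries(root="."):
    """Return the expected files/directories that are absent under root."""
    root = Path(root)
    return [name for name in EXPECTED if not (root / name).exists()]


if __name__ == "__main__":
    missing = missing_entries(".")
    if missing:
        print("Missing:", ", ".join(missing))
    else:
        print("Data layout looks complete.")
```

Run it from the directory that contains finetune.py; an empty result means all four entries were found.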

Code Structure

Refer to ./peft/src/peft/tuners/dora.py for the implementation of DoRA.

Refer to ./finetune.py for finetuning LLaMA using DoRA.

Refer to ./commonsense_evaluate.py for the evaluation of the finetuned model.

Finetuning and Evaluation

Finetuning (`./llama_7B_Dora.sh`)

This script contains the code to finetune LLaMA-7B using DoRA. You can specify different DoRA configurations for finetuning: the first argument is the rank r, the second is the corresponding alpha, the third is the destination for saving the finetuned model, and the last is the GPU to use.

An example could be:

sh llama_7B_Dora.sh 32 64 ./finetuned_result/dora_r32 0
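To run several configurations in a row, the argument order described above can be wrapped in a small helper. This is only a sketch: the function names are hypothetical, the output directory pattern copies the example above, and setting alpha to twice the rank is an assumption taken from the 32/64 example rather than a rule stated in this README.

```python
import shlex


def finetune_command(rank, alpha, output_dir, gpu, script="llama_7B_Dora.sh"):
    """Build the argv list for one run: rank, alpha, output directory, GPU."""
    return ["sh", script, str(rank), str(alpha), output_dir, str(gpu)]


def sweep_commands(ranks, gpu=0, root="./finetuned_result"):
    """One command per rank. Assumption: alpha = 2 * rank, as in the example."""
    return [finetune_command(r, 2 * r, f"{root}/dora_r{r}", gpu) for r in ranks]


if __name__ == "__main__":
    # Dry run: print the commands instead of executing them.
    for argv in sweep_commands([4, 8, 16, 32, 64]):
        print(shlex.join(argv))
```

Printing the commands first makes it easy to check them before passing each argv list to subprocess.run.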

Finetuning (`./llama_7B_Dora_qkv.sh`)

This script finetunes LLaMA-7B using DoRA with more customizability: by changing --Wdecompose_target_modules, you can additionally specify the modules in which only the magnitude component of DoRA is finetuned. Please refer to Sec. 5.6 of the paper for more details.

An example could be:

sh llama_7B_Dora_qkv.sh 32 64 ./finetuned_result/dora_qkv_r32 0
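For readers unfamiliar with the term, the "magnitude component" refers to DoRA's decomposition of a weight matrix into a per-column magnitude and a direction. The sketch below illustrates that idea in plain Python, following the formulation W' = m * (W0 + dW) / ||W0 + dW||_c with a column-wise norm. It is not the repository's code: the function names are hypothetical, dW stands in for the low-rank update, and the axis convention in ./peft/src/peft/tuners/dora.py may differ.

```python
import math


def column_norms(matrix):
    """L2 norm of each column of a row-major list-of-lists matrix."""
    n_cols = len(matrix[0])
    return [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]


def dora_weight(w0, delta, magnitude):
    """Recompose W' = magnitude * (W0 + delta) / ||W0 + delta||, per column."""
    v = [[a + b for a, b in zip(r0, rd)] for r0, rd in zip(w0, delta)]
    norms = column_norms(v)
    return [[magnitude[j] * row[j] / norms[j] for j in range(len(row))] for row in v]


w0 = [[3.0, 0.0], [4.0, 2.0]]
zero_delta = [[0.0, 0.0], [0.0, 0.0]]

# Magnitude initialised from the pretrained weight: column norms 5.0 and 2.0.
m = column_norms(w0)

# With no directional update and the initial magnitude, W0 is recovered.
unchanged = dora_weight(w0, zero_delta, m)

# Magnitude-only tuning: the first column is rescaled, its direction is kept.
rescaled = dora_weight(w0, zero_delta, [10.0, 2.0])
```

In the magnitude-only case the direction of each column stays fixed and only its length is trained, which is what the modules listed in --Wdecompose_target_modules are restricted to.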

Evaluation and DoRA weights

You can directly download the finetuned DoRA weights from Google Drive and evaluate them with llama_7B_Dora_eval.sh, as described below, to reproduce the results reported in the paper.

This script contains the code to evaluate LLaMA-7B finetuned with DoRA on the eight commonsense reasoning tasks. The first argument is the path to the DoRA weights, the second argument specifies where you would like to save the evaluation results, and the last argument determines which GPU to use.

An example could be:

sh llama_7B_Dora_eval.sh ./finetuned_result/dora_r32 0

Finetuning and Evaluating LLaMA2-7B & LLaMA3-8B

The scripts llama2_7B_DoRA_r.sh and llama3_8B_DoRA_r.sh contain the code to finetune LLaMA2-7B and LLaMA3-8B using DoRA. You can specify different DoRA configurations for finetuning: the first argument is the rank r, the second is the corresponding alpha, the third is the destination for saving the finetuned model, and the last is the GPU to use. An example could be:

sh llama2_7B_DoRA_r.sh 32 64 ./finetuned_result/r32_lr2e-4 0
sh llama3_8B_DoRA_r.sh 32 64 ./finetuned_result/r32_lr1e-4 0

You can also directly download the finetuned DoRA weights from Google Drive and evaluate them with llama2_7B_Dora_eval.sh and llama3_8B_Dora_eval.sh to reproduce the results reported in the paper.

Accuracy comparison of LoRA and DoRA with varying ranks for LLaMA-7B on the commonsense reasoning tasks

Model	r	lr	BoolQ	PIQA	SIQA	HellaSwag	WinoGrande	ARC-e	ARC-c	OBQA	Average
LLaMA-7B-LoRA	4	3e-4	2.3	46.1	18.3	19.7	55.2	65.4	51.9	57.0	39.5
LLaMA-7B-LoRA	8	3e-4	31.3	57.0	44.0	11.8	43.3	45.7	39.2	53.8	40.7
LLaMA-7B-LoRA	16	3e-4	69.9	77.8	75.1	72.1	55.8	77.1	62.2	78.0	70.9
LLaMA-7B-LoRA	32	3e-4	67.5	80.8	78.2	83.4	80.4	78.0	62.6	79.1	76.3
LLaMA-7B-LoRA	64	3e-4	66.7	79.1	75.7	17.6	78.8	73.3	59.6	75.2	65.8
LLaMA-7B-DoRA	4	2e-4	51.3	42.2	77.8	25.4	78.8	78.7	62.5	78.6	61.9
LLaMA-7B-DoRA	8	2e-4	69.9	81.8	79.7	85.2	80.1	81.5	65.7	79.8	77.9
LLaMA-7B-DoRA	16	2e-4	70.0	82.6	79.7	83.2	80.6	80.6	65.4	77.6	77.5
LLaMA-7B-DoRA	32	1e-4	69.7	83.4	78.6	87.2	81.0	81.9	66.2	79.2	78.4
LLaMA-7B-DoRA	64	2e-4	70.1	82.0	75.6	85.9	79.7	79.1	63.7	78.4	76.8

Accuracy comparison of LoRA and DoRA for LLaMA2-7B on the commonsense reasoning tasks

Model	r	lr	BoolQ	PIQA	SIQA	HellaSwag	WinoGrande	ARC-e	ARC-c	OBQA	Average
LLaMA2-7B-LoRA	32	3e-4	69.8	79.9	79.5	83.6	82.6	79.8	64.7	81.0	77.6
LLaMA2-7B-DoRA	16	2e-4	72.0	83.1	79.9	89.1	83.0	84.5	71.0	81.2	80.5
LLaMA2-7B-DoRA	32	2e-4	71.8	83.7	76.0	89.1	82.6	83.7	68.2	82.4	79.7

Accuracy comparison of LoRA and DoRA for LLaMA3-8B on the commonsense reasoning tasks

Model	r	lr	BoolQ	PIQA	SIQA	HellaSwag	WinoGrande	ARC-e	ARC-c	OBQA	Average
LLaMA3-8B-LoRA	32	3e-4	70.8	85.2	79.9	91.7	84.3	84.2	71.2	79.0	80.8
LLaMA3-8B-DoRA	16	1e-4	74.5	88.8	80.3	95.5	84.7	90.1	79.1	87.2	85.0
LLaMA3-8B-DoRA	32	1e-4	74.6	89.3	79.9	95.5	85.6	90.5	80.4	85.8	85.2
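
The Average column in the tables above appears to be the unweighted mean of the eight task accuracies, rounded to one decimal place. That is an assumption rather than something this README states, but it matches the rows checked below. The helper name average_accuracy is hypothetical; the scores are copied from the r=32 DoRA rows for LLaMA-7B and LLaMA3-8B.

```python
TASKS = ["BoolQ", "PIQA", "SIQA", "HellaSwag", "WinoGrande", "ARC-e", "ARC-c", "OBQA"]


def average_accuracy(scores):
    """Unweighted mean over the eight tasks, rounded to one decimal place."""
    if len(scores) != len(TASKS):
        raise ValueError(f"expected {len(TASKS)} scores, got {len(scores)}")
    return round(sum(scores) / len(scores), 1)


# Rows copied from the tables (model, r=32), in the order given by TASKS.
llama_7b_dora_r32 = [69.7, 83.4, 78.6, 87.2, 81.0, 81.9, 66.2, 79.2]
llama3_8b_dora_r32 = [74.6, 89.3, 79.9, 95.5, 85.6, 90.5, 80.4, 85.8]

if __name__ == "__main__":
    print(average_accuracy(llama_7b_dora_r32))   # 78.4, as in the table
    print(average_accuracy(llama3_8b_dora_r32))  # 85.2, as in the table
```

The same function can be applied to your own evaluation results to compare them with the reported averages.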

Acknowledgement

We greatly appreciate the contributions of two remarkable repositories, LLM-Adapter and PEFT. These projects have significantly benefited our work.
