
SAPT

The official implementation for the ACL 2024 paper SAPT: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models.


Requirements

  • Python 3.10.12
  • PyTorch 2.1.0
  • Transformers 4.30.2
  • CUDA 12.2
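
If you are setting up a fresh environment, the pinned Python packages above can be installed with pip. The command below is a minimal sketch rather than the authors' official setup; note that the CUDA 12.2 toolkit and driver come from your system, not from pip:

pip install torch==2.1.0 transformers==4.30.2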

Preparation

The train/dev/test data from SuperNI and the Long Sequence Benchmark are placed in /CL_Benchmark.

The generated pseudo data points are in /generated_data.

Training

First, run gen_script_{benchmark}_{model}.py to generate the training script.

For example, to train the T5 model on the SuperNI benchmark:

python gen_script_superni_t5.py

Then run the resulting script to start the training process.
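
Concretely, a SuperNI/T5 run consists of two commands. The script name in the second line is only a placeholder; substitute whichever file gen_script_superni_t5.py actually writes out:

python gen_script_superni_t5.py
bash <generated_training_script>.sh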

Evaluation

To compute the Average Performance (AP), Forgetting Rate (F.Ra), Forward Transfer (FWT) and Backward Transfer (BWT) metrics:

python score.py your_result_path single_result_path 
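
For reference, these are the standard continual-learning statistics computed over a task-by-task evaluation matrix. The sketch below is not the repository's score.py; it only illustrates the common definitions, assuming R[i][j] is the test score on task j after training sequentially on tasks 0..i, and an optional per-task single-task baseline for FWT:

import numpy as np

def cl_metrics(R, baseline=None):
    """Illustrative continual-learning metrics from a T x T score matrix R,
    where R[i, j] is the score on task j after training on tasks 0..i.
    This is NOT the repository's score.py; the definitions follow common
    conventions and may differ in detail from the paper."""
    R = np.asarray(R, dtype=float)
    T = R.shape[0]

    # Average Performance (AP): mean score over all tasks after the last task.
    ap = float(R[-1].mean())

    # Forgetting Rate (F.Ra): average drop of each earlier task from its best score.
    f_ra = float(np.mean([R[:, j].max() - R[-1, j] for j in range(T - 1)]))

    # Backward Transfer (BWT): change on earlier tasks vs. right after learning them.
    bwt = float(np.mean([R[-1, j] - R[j, j] for j in range(T - 1)]))

    # Forward Transfer (FWT): score on each task before training on it, relative
    # to an optional per-task reference (e.g. single-task training); zeros if absent.
    b = np.zeros(T) if baseline is None else np.asarray(baseline, dtype=float)
    fwt = float(np.mean([R[j - 1, j] - b[j] for j in range(1, T)]))

    return {"AP": ap, "F.Ra": f_ra, "BWT": bwt, "FWT": fwt}

if __name__ == "__main__":
    # Toy 3-task example: row i = after training task i, column j = evaluated task j.
    R = [[60.0, 10.0, 5.0],
         [55.0, 62.0, 12.0],
         [52.0, 58.0, 64.0]]
    print(cl_metrics(R))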

Citation

If you find our work useful for your research, please kindly cite our paper as follows:

@inproceedings{zhao2024sapt,
  title={{SAPT}: A Shared Attention Framework for Parameter-Efficient Continual Learning of Large Language Models},
  author={Zhao, Weixiang and Wang, Shilong and Hu, Yulin and Zhao, Yanyan and Qin, Bing and Zhang, Xuanyu and Yang, Qing and Xu, Dongliang and Che, Wanxiang},
  booktitle={Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers)},
  pages={11641--11661},
  year={2024}
}

Credits

The code in this repository partly builds on O-LoRA, and we would like to express our sincere gratitude to its authors.
