2024-02-06 We release a preliminary tool, EasyDetect, for LLM hallucination detection, with a demo.
2024-01-24 EasyEdit now supports editing Mistral-7B (manually update to transformers==4.34.0). We have also fixed some bugs in evaluating MEND (which slightly influence the performance).
2024-01-16 EasyEdit now supports the precise model editing method PMET (AAAI 2024).
2023-07-12 We release version 0.0.1, supporting several knowledge editing techniques for LLMs. EasyEdit helps to better align LLMs with the changing needs and values of users.
There is a demonstration of editing. The GIF file is created by Terminalizer.
Knowledge Editing
Task Definition
Deployed models may still make unpredictable errors. For example, Large Language Models (LLMs) notoriously hallucinate, perpetuate bias, and suffer from factual decay, so we should be able to adjust specific behaviors of pre-trained models.
Knowledge editing aims to efficiently adjust an initial base model's ($f_\theta$) behavior ($x_e \rightarrow y_e$) on a particular edit descriptor $[x_e, y_e]$. It usually takes three forms:
Knowledge insert
Inject knowledge that LLMs have not seen before, such as:
How many times has Messi won the World Cup? 0 $\rightarrow$ 1:
$x_e$: How many times has Messi won the World Cup? $\quad$ $y_e$: 1
Knowledge update
LLMs often suffer from the knowledge cutoff issue; EasyEdit can update outdated knowledge, such as:
The president of the USA: Donald Trump $\rightarrow$ Joe Biden:
$x_e$: Who is the president of the US? $\quad$ $y_e$: Joe Biden
Knowledge erase
EasyEdit can erase sensitive information, such as:
The phone number of someone is XXXX $\rightarrow$ __
$x_e$: The phone number of someone is $\quad$ $y_e$: __
The ultimate goal is to create an edited model ($f_{\theta'}$) without influencing the model's behavior on unrelated samples.
Evaluation
The knowledge editing process generally impacts the predictions for a broad set of inputs that are closely associated with the edit example, called the editing scope.
A successful edit should adjust the model's behavior within the editing scope while leaving unrelated inputs unchanged, as in the formula below:

$$
f_{\theta_{e}}(x) =
\begin{cases}
y_e & \text{if } x \in I(x_e, y_e) \\
f_{\theta}(x) & \text{if } x \in O(x_e, y_e)
\end{cases}
$$
In addition to this, the performance of knowledge editing should be measured from multiple dimensions:
Reliability: the success rate of editing on a given edit descriptor
Generalization: the success rate of editing within the editing scope
Locality: whether the model's output changes after editing for unrelated inputs
Portability: the success rate of editing for factual reasoning (one-hop, synonym, one-to-one relation)
Efficiency: time and memory consumption required during the editing process
🌟Overview
EasyEdit is a Python package for editing Large Language Models (LLMs) such as GPT-J, Llama, GPT-NEO, GPT2, and T5 (supporting models from 1B to 65B), the objective of which is to alter the behavior of LLMs efficiently within a specific domain without negatively impacting performance across other inputs. It is designed to be easy to use and easy to extend.
EasyEdit contains a unified framework for Editor, Method and Evaluate, respectively representing the editing scenario, editing technique, and evaluation method.
Each knowledge editing scenario comprises three components:
Editor: such as BaseEditor (factual knowledge and generation editing) for LMs and MultimodalEditor (multimodal knowledge).
Method: the specific knowledge editing technique used (such as ROME, MEND, etc.).
Evaluate: Metrics for evaluating knowledge editing performance.
Due to the limited compatibility of this toolkit and constraints imposed by the transformers version, some knowledge editing methods are not supported. You can find the relevant editing methods in the following links.
❗️❗️ EasyEdit supports editing ChatGPT with FT. An edit for gpt-3.5-turbo returns a model_name (for example, ft:gpt-3.5-turbo-0613:personal::7tWZkLzq) instead of model weights.
❗️❗️ If you intend to use Mistral, please update the transformers library to version 4.34.0 manually. You can use the following command: pip install transformers==4.34.0.
We provide detailed scripts for users to easily work with KnowEdit; please refer to examples.
dataset description
ZsRE: a context-free question-answering task. Given a question based on the subject and relation, the model is expected to provide the correct object as the answer.
Wikirecent: This dataset specifically focuses on triplets that have been recently inserted into WikiData after July 2022.
WikiBio: The original dataset was created by prompting GPT-3 to generate 238 Wikipedia-style biographies using subjects from the WikiBio dataset.
WikiDatacounterfact: Since tail entities are often not captured by models, and therefore are not suitable for testing modification edits, RippleEdit collects triplets about popular entities, where the subject corresponds to one of the top-viewed pages in Wikipedia.
Convsent: This is a sentiment editing task that assesses the model’s ability to modify a dialog agent’s sentiment on a specific topic without affecting its responses to other topics.
Sanitation: This dataset specifically addresses privacy concerns associated with learned language models.
Datasets for Factual Knowledge
| Dataset | Google Drive | BaiduNetDisk | Description |
| :--: | :--: | :--: | :--: |
| ZsRE plus | [Google Drive] | [BaiduNetDisk] | Question-answering dataset using question rephrasings |
| Counterfact plus | [Google Drive] | [BaiduNetDisk] | Counterfact dataset using entity replacement |
We provide the zsre and counterfact datasets to verify the effectiveness of knowledge editing. You can download them here: [Google Drive], [BaiduNetDisk].
For locality, in addition to testing unrelated instances, we also provide tests on distracting prompts (reference: Detecting Edit Failures…), other attributes, and other downstream tasks (such as commonsense reasoning).
For portability, it tests whether the edited model can apply the edited instances for inference. We provide evaluations for one-hop reasoning, subject aliases, and inverse relations (e.g., a one-to-one relationship between spouses should be edited bidirectionally).
If you want to build the Docker image locally, you can clone the project to your local machine and build the Docker image:
git clone https://github.com/zjunlp/EasyEdit.git
cd EasyEdit
docker build -t your-image-name .
Then run the Docker image as a container:
docker run -p 8080:80 your-image-name
Editing GPU memory usage
Our results are all based on the default configuration.

| Method | llama-2-7B | chatglm2 | gpt-j-6b | gpt-xl |
| :--: | :--: | :--: | :--: | :--: |
| FT | 60GB | 58GB | 55GB | 7GB |
| SERAC | 42GB | 32GB | 31GB | 10GB |
| IKE | 52GB | 38GB | 38GB | 10GB |
| MEND | 46GB | 37GB | 37GB | 13GB |
| KN | 42GB | 39GB | 40GB | 12GB |
| ROME | 31GB | 29GB | 27GB | 10GB |
| MEMIT | 33GB | 31GB | 31GB | 11GB |
| AdaLoRA | 29GB | 24GB | 25GB | 8GB |
| GRACE | 27GB | - | 23GB | 6GB |
📌Use EasyEdit
Edit large language models (LLMs) in around 5 seconds.
The following example shows you how to perform editing with EasyEdit. More examples and tutorials can be found in examples.
BaseEditor
BaseEditor is the class for language-modality knowledge editing. You can choose the appropriate editing method based on your specific needs.
Due to different transformer versions and different GPU models, the editing results may fluctuate slightly.
Introduction by a Simple Example
With the modularity and flexibility of EasyEdit, you can easily use it to edit models.
Step1: Define a PLM as the object to be edited.
Choose the PLM to be edited. EasyEdit supports partial models (T5, GPT-J, GPT-NEO, LlaMA so far) retrievable from HuggingFace. The corresponding configuration file directory is hparams/YOUR_METHOD/YOUR_MODEL.YAML, such as hparams/MEND/gpt2-xl.yaml; set the corresponding model_name to select the object for knowledge editing.
Step2: Choose the appropriate Knowledge Editing Method
The selection of editing methods is a crucial step, as different methods have their own strengths and weaknesses. Users need to consider the trade-off between editing success rate, generalization, and maintaining unrelated performance. For specific performance details of each method, please refer to the paper: Editing Large Language Models: Problems, Methods, and Opportunities.
## In this case, we use MEND method, so you should import `MENDHyperParams`
from easyeditor import MENDHyperParams
## Loading config from hparams/MEND/gpt2-xl.yaml
hparams = MENDHyperParams.from_hparams('./hparams/MEND/gpt2-xl')
Step3: Provide the edit descriptor and edit target
## edit descriptor: prompt that you want to edit
prompts = [
'What university did Watts Humphrey attend?',
'Which family does Ramalinaceae belong to',
'What role does Denny Herzig play in football?'
]
## You can set `ground_truth` to None (or set it to the original output)
ground_truth = ['Illinois Institute of Technology', 'Lecanorales', 'defender']
## edit target: expected output
target_new = ['University of Michigan', 'Lamiinae', 'winger']
Step4: Combine them into a BaseEditor. EasyEdit provides a simple and unified way to initialize the Editor, like huggingface: from_hparams.
from easyeditor import BaseEditor
## Construct Language Model Editor
editor = BaseEditor.from_hparams(hparams)
Step5: Provide the data for evaluation
Note that the data for portability and locality are both optional (set them to None for basic editing success rate evaluation only). The data format for both is a dict; for each measurement dimension, you need to provide the corresponding prompt and its ground truth. Here is an example of the data:
locality_inputs = {
'neighborhood':{
'prompt': ['Joseph Fischhof, the', 'Larry Bird is a professional', 'In Forssa, they understand'],
'ground_truth': ['piano', 'basketball', 'Finnish']
},
'distracting': {
'prompt': ['Ray Charles, the violin Hauschka plays the instrument', 'Grant Hill is a professional soccer Magic Johnson is a professional', 'The law in Ikaalinen declares the language Swedish In Loviisa, the language spoken is'],
'ground_truth': ['piano', 'basketball', 'Finnish']
}
}
In the above example, we evaluate the locality of the editing methods on the "neighborhood" and "distracting" dimensions.
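Portability data follows the same dictionary structure as locality data. The sketch below is only an illustration: the 'one_hop' key name and the example prompt/answer are hypothetical placeholders for your own evaluation data (the answer assumes the edit "Watts Humphrey attended the University of Michigan" from Step 3, whose campus is in Ann Arbor).

portability_inputs = {
    'one_hop': {
        ## question that requires reasoning over the edited fact
        'prompt': ['In which city is the university that Watts Humphrey attended located?'],
        ## expected answer consistent with the new edit (University of Michigan -> Ann Arbor)
        'ground_truth': ['Ann Arbor']
    }
}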
Step6: Edit and Evaluation
Done! We can conduct Edit and Evaluation for your model to be edited. The edit function will return a series of metrics related to the editing process as well as the modified model weights.
We specify the return metrics in dict format, including model prediction evaluations before and after editing. For each edit, it will include the following metrics:
rewrite_acc $\rightarrow$ Reliability
rephrase_acc $\rightarrow$ Generalization
locality $\rightarrow$ Locality
portability $\rightarrow$ Portability
For the evaluation of Reliability, you only need to provide the corresponding editing prompts and editing target_new.
For the evaluation of Generalization, rephrase_prompts are required.
For the evaluation of Locality and Portability, you need to define the name of the corresponding metric, as well as prompts and ground_truth.
Note: their length needs to be equal to that of the edit prompts.
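Putting the pieces from the previous steps together, a minimal sketch of the edit call looks like the following. It reuses the variables defined above; the three-value return unpacking reflects the metrics and edited weights mentioned earlier, and the exact signature may vary slightly across EasyEdit versions.

metrics, edited_model, _ = editor.edit(
    prompts=prompts,
    ground_truth=ground_truth,
    target_new=target_new,
    locality_inputs=locality_inputs
)
## metrics holds the pre-/post-edit evaluation results described above
print(metrics)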
Trainer
meta-learning based: MEND
memory-based routing: SERAC
For the above editing methods, pre-training of the corresponding meta-networks or classifiers is required. Therefore, EasyEdit provides a unified framework for pretraining the relevant network structures. Take training MEND as an example:
Step 1 and Step 2 are the same as the example above, which involves selecting the appropriate editing model and editing method.
Step3: Provide the edit training set
The currently supported and available datasets are: zsre and counterfact (Google Drive). Please place them in the "data" directory and initialize the dataset_class (ZsreDataset for zsre and CounterFactDataset for counterfact) to load the corresponding training set.
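Step4: Combine them into a Trainer. The snippet below is only a sketch of constructing the trainer: the MENDTrainingHparams and EditTrainer class names and the data file paths are assumptions based on typical EasyEdit usage and may differ in your setup.

from easyeditor import EditTrainer, MENDTrainingHparams, ZsreDataset

## Load the training hyperparameters (path is illustrative)
training_hparams = MENDTrainingHparams.from_hparams('./hparams/TRAINING/MEND/gpt2-xl.yaml')
## Build the train/eval sets from the zsre files placed in the "data" directory
train_ds = ZsreDataset('./data/zsre_mend_train.json', config=training_hparams)
eval_ds = ZsreDataset('./data/zsre_mend_eval.json', config=training_hparams)
trainer = EditTrainer(
    config=training_hparams,
    train_set=train_ds,
    val_set=eval_ds
)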
Step5: Run and Edit
Done! We can conduct Run and Evaluation.
trainer.run()
Run: The CHECKPOINT will be saved to the path results_dir.
Edit: Set the archive field in the hparams file to CHECKPOINT. EasyEdit will automatically load the corresponding pre-trained weights during the editing process (go to editing).
MultimodalEditor is the class for Multi-Modality Editing. You can choose the appropriate editing method based on your specific needs.
Due to different transformer versions and different GPU models, the editing results may fluctuate slightly.
M-Generality Results
| VQA | KE | IKE | SERAC | MEND |
| :--: | :--: | :--: | :--: | :--: |
| MiniGPT-4 | 88.60 | 99.95 | 88.10 | 99.60 |
| BLIP2 | 74.60 | 99.79 | 99.20 | 99.40 |

| Caption | KE | IKE | SERAC | MEND |
| :--: | :--: | :--: | :--: | :--: |
| MiniGPT-4 | 13.60 | 91.00 | 91.47 | 93.35 |
| BLIP2 | 1.60 | 96.55 | 99.72 | 93.48 |
Introduction by a Simple Example
With the modularity and flexibility of EasyEdit, you can easily use it to edit models.
Step1: Define a MLLM as the object to be edited.
Choose the MLLM to be edited. EasyEdit supports partial models (MiniGPT-4, Blip2 so far) retrievable from HuggingFace. The corresponding configuration file directory is hparams/YOUR_METHOD/YOUR_MODEL.YAML, such as hparams/MEND/minigpt4.yaml; set the corresponding model_name to select the object for editing.
Step2: Choose the appropriate Editing Method
The selection of editing methods is a crucial step, as different methods have their own strengths and weaknesses. Users need to consider the trade-off between editing success rate, generalization, and maintaining unrelated performance.
## In this case, we use MEND method, so you should import `MENDMultimodalHparams`
from easyeditor import MENDMultimodalHparams
## Loading config from hparams/MEND/minigpt4.yaml
hparams = MENDMultimodalHparams.from_hparams('./hparams/MEND/minigpt4')
Step3: Provide the edit descriptor and edit target
## edit descriptor: prompt that you want to edit
prompts = [
"How many tennis balls are in the picture?",
"What is the red food?"
]
## edit target: expected output
targets = ["2", "tomatoes",]
## edit image: image for editing
image = [
"val2014/COCO_val2014_000000451435.jpg",
"val2014/COCO_val2014_000000189446.jpg"
]
Step4: Combine them into a MultimodalEditor. EasyEdit provides a simple and unified way to initialize the Editor, like huggingface: from_hparams.
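A minimal construction sketch, mirroring the BaseEditor usage above (we assume MultimodalEditor is imported from easyeditor):

from easyeditor import MultimodalEditor
## Construct Multimodal Editor
editor = MultimodalEditor.from_hparams(hparams)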
Step5: Provide the data for evaluation
Note that the data for locality and multimodal locality are both optional (set them to None for basic editing success rate evaluation only). The data format for both is a dict; for each measurement dimension, you need to provide the corresponding prompt and its ground truth. Here is an example of the data:
locality_inputs = {
'text': {
'prompt': [
"nq question: what purpose did seasonal monsoon winds have on trade"
],
'ground_truth': [
"enabled European empire expansion into the Americas and trade \
routes to become established across the Atlantic and Pacific oceans"
]
},
'vision': {
'prompt': ["What sport can you use this for?"],
'ground_truth': ["riding"],
'image': ["val2014/COCO_val2014_000000297147.jpg"],
}
}
In the above example, we evaluate the editing methods on text locality and multimodal (vision) locality.
Step6: Edit and Evaluation
Done! We can conduct Edit and Evaluation for your model to be edited. The edit function will return a series of metrics related to the editing process as well as the modified model weights.
We specify the return metrics as dict format, including model prediction evaluations before and after editing. For each edit, it will include the following metrics:
rewrite_acc $\rightarrow$ Reliability
rephrase_acc $\rightarrow$ Generalization
image_rephrase_acc $\rightarrow$ Generalization for Multimodal
locality_acc $\rightarrow$ Locality
multimodal_locality_acc $\rightarrow$ Locality for Multimodal
For the evaluation of Reliability, you only need to provide the corresponding editing prompts and editing target_new.
For the evaluation of Generalization, rephrase_prompts are required.
For the evaluation of Generalization for Multimodal, rephrase_image is required.
For the evaluation of Locality and M-Locality, you need to define the name of the corresponding metric, as well as the text and vision formats.
Note: their length needs to be equal to that of the edit prompts.
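For reference, a hedged sketch of the multimodal edit call, mirroring the text-only example. The keyword names (target_new, image, locality_inputs) follow the variables defined above and are assumptions that may differ across EasyEdit versions.

metrics, edited_model, _ = editor.edit(
    prompts=prompts,
    target_new=targets,
    image=image,
    locality_inputs=locality_inputs
)
## metrics includes rewrite_acc, rephrase_acc, locality_acc, etc., as listed above
print(metrics)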
Trainer
meta-learning based: MEND
memory-based routing: SERAC
For the above editing methods, pre-training of the corresponding meta-networks or classifiers is required. Therefore, EasyEdit provides a unified framework for pretraining the relevant network structures. Take training SERAC as an example:
Step 1 and Step 2 are the same as the example above, which involves selecting the appropriate editing model and editing method.
Step3: Provide the edit training set
The currently supported and available datasets are: Caption and VQA (Google Drive). Please place them in the "data" directory and initialize the dataset_class (CaptionDataset for Caption and VQADataset for VQA) to load the corresponding training set.
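Step4: Combine them into a Trainer. As with the language-only trainer, the sketch below assumes class names (SERACMultimodalTrainingHparams, MultimodalTrainer) and file paths that may differ in your EasyEdit version.

from easyeditor import MultimodalTrainer, SERACMultimodalTrainingHparams, CaptionDataset

## Load the training hyperparameters (path is illustrative)
training_hparams = SERACMultimodalTrainingHparams.from_hparams('./hparams/TRAINING/SERAC/minigpt4.yaml')
## Build the train/eval sets from the Caption data placed in the "data" directory
train_ds = CaptionDataset('./data/caption_train_edit.json', config=training_hparams)
eval_ds = CaptionDataset('./data/caption_eval_edit.json', config=training_hparams)
trainer = MultimodalTrainer(
    config=training_hparams,
    train_set=train_ds,
    val_set=eval_ds
)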
Step5: Run and Edit
Done! We can conduct Run and Evaluation.
trainer.run()
Run: The CHECKPOINT will be saved to the path results_dir.
Edit: Set the archive field in the hparams file to CHECKPOINT. EasyEdit will automatically load the corresponding pre-trained weights during the editing process (go to editing).
We provide detailed scripts for users to easily work with KnowEdit; please refer to examples.
Editing Performance
We present editing results of the four metrics on LlaMA-2-7B using EasyEdit. We adopt ZsRE as the test dataset.
❗️❗️ Editing llama-2-7B requires 40GB+ of VRAM on the GPU. (OOM solution)
| Method | Reliability | Generalization | Locality | Portability |
| :--: | :--: | :--: | :--: | :--: |
| FT | 56.94 | 52.02 | 96.32 | 0.07 |
| SERAC | 99.49 | 99.13 | 100.00 | 0.13 |
| IKE | 100.00 | 99.98 | 69.19 | 67.56 |
| MEND | 94.24 | 90.27 | 97.04 | 0.14 |
| KN | 28.95 | 28.43 | 65.43 | 0.07 |
| ROME | 92.45 | 87.04 | 99.63 | 10.46 |
| MEMIT | 92.94 | 85.97 | 99.49 | 6.03 |
We also present editing results of KnowEdit on LlaMA-2-7B using EasyEdit.
| DataSet | Metric | SERAC | ICE | AdaLoRA | MEND | ROME | MEMIT | FT-L | FT |
| :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: | :--: |
| WikiData_recent | Edit Succ. | 98.68 | 60.74 | 65.61 | 76.88 | 85.08 | 85.32 | 71.18 | 31.24 |
|  | Portability | 63.52 | 36.93 | 47.22 | 50.11 | 37.45 | 37.94 | 48.71 | 15.91 |
|  | Locality | 100.00 | 33.34 | 55.78 | 92.87 | 66.2 | 64.78 | 63.7 | 3.65 |
|  | Fluency | 553.19 | 531.01 | 537.51 | 586.34 | 574.28 | 566.66 | 549.35 | 428.67 |
| ZsRE | Edit Succ. | 99.67 | 66.01 | 69.86 | 96.74 | 96.57 | 83.07 | 54.65 | 36.88 |
|  | Portability | 56.48 | 63.94 | 52.95 | 60.41 | 52.20 | 51.43 | 45.02 | 8.72 |
|  | Locality | 30.23 | 23.14 | 72.21 | 92.79 | 27.14 | 25.46 | 71.12 | 0.31 |
|  | Fluency | 410.89 | 541.14 | 532.82 | 524.33 | 570.47 | 559.72 | 474.18 | 471.29 |
| WikiBio | Edit Succ. | 99.69 | 95.53 | 97.02 | 93.66 | 95.05 | 94.29 | 66.27 | 95.64 |
|  | Locality | 69.79 | 47.90 | 57.87 | 69.51 | 46.96 | 51.56 | 60.14 | 13.38 |
|  | Fluency | 606.95 | 632.92 | 615.86 | 609.39 | 617.25 | 616.65 | 604.00 | 589.22 |
| WikiData_counterfact | Edit Succ. | 99.99 | 69.83 | 72.14 | 78.82 | 83.21 | 83.41 | 51.12 | 26.78 |
|  | Portability | 76.07 | 45.32 | 55.17 | 57.53 | 38.69 | 40.09 | 39.07 | 16.94 |
|  | Locality | 98.96 | 32.38 | 66.78 | 94.16 | 65.4 | 63.68 | 62.51 | 0.29 |
|  | Fluency | 549.91 | 547.22 | 553.85 | 588.94 | 578.84 | 568.58 | 544.80 | 483.71 |
| ConvSent | Edit Succ. | 62.75 | 52.78 | 44.89 | 50.76 | 45.79 | 44.75 | 49.50 | 61.93 |
|  | Locality | 0.26 | 49.73 | 0.18 | 3.42 | 0.00 | 0.00 | 0.00 | 0.00 |
|  | Fluency | 458.21 | 621.45 | 606.42 | 379.43 | 606.32 | 602.62 | 607.86 | 546.24 |
| Sanitation | Edit Succ. | 0.00 | 72.50 | 2.50 | 0.00 | 85.00 | 48.75 | 0.00 | 60.00 |
|  | Locality | 100.00 | 56.58 | 65.50 | 5.29 | 50.31 | 67.47 | 14.78 | 42.61 |
|  | Fluency | 416.29 | 794.15 | 330.44 | 407.18 | 465.12 | 466.10 | 439.10 | 351.39 |
Citation
Please cite our paper if you use EasyEdit in your work.
@article{zhang2024comprehensive,
title={A Comprehensive Study of Knowledge Editing for Large Language Models},
author={Zhang, Ningyu and Yao, Yunzhi and Tian, Bozhong and Wang, Peng and Deng, Shumin and Wang, Mengru and Xi, Zekun and Mao, Shengyu and Zhang, Jintian and Ni, Yuansheng and others},
journal={arXiv preprint arXiv:2401.01286},
year={2024}
}
@article{wang2023easyedit,
title={Easyedit: An easy-to-use knowledge editing framework for large language models},
author={Wang, Peng and Zhang, Ningyu and Xie, Xin and Yao, Yunzhi and Tian, Bozhong and Wang, Mengru and Xi, Zekun and Cheng, Siyuan and Liu, Kangwei and Zheng, Guozhou and others},
journal={arXiv preprint arXiv:2308.07269},
year={2023}
}
@article{yao2023editing,
title={Editing Large Language Models: Problems, Methods, and Opportunities},
author={Yao, Yunzhi and Wang, Peng and Tian, Bozhong and Cheng, Siyuan and Li, Zhoubo and Deng, Shumin and Chen, Huajun and Zhang, Ningyu},
journal={arXiv preprint arXiv:2305.13172},
year={2023}
}
@article{cheng2023edit,
title={Can We Edit Multimodal Large Language Models?},
author={Cheng, Siyuan and Tian, Bozhong and Liu, Qingbin and Chen, Xi and Wang, Yongheng and Chen, Huajun and Zhang, Ningyu},
journal={arXiv preprint arXiv:2310.08475},
year={2023}
}
@article{mao2023editing,
title={Editing personality for llms},
author={Mao, Shengyu and Zhang, Ningyu and Wang, Xiaohan and Wang, Mengru and Yao, Yunzhi and Jiang, Yong and Xie, Pengjun and Huang, Fei and Chen, Huajun},
journal={arXiv preprint arXiv:2310.02168},
year={2023}
}
@misc{knowlm,
author = {Ningyu Zhang and Jintian Zhang and Xiaohan Wang and Honghao Gui and Kangwei Liu and Yinuo Jiang and Xiang Chen and Shengyu Mao and Shuofei Qiao and Yuqi Zhu and Zhen Bi and Jing Chen and Xiaozhuan Liang and Yixin Ou and Runnan Fang and Zekun Xi and Xin Xu and Lei Li and Peng Wang and Mengru Wang and Yunzhi Yao and Bozhong Tian and Yin Fang and Guozhou Zheng and Huajun Chen},
title = {KnowLM Technical Report},
year = {2023},
url = {http://knowlm.zjukg.cn/},
}
🎉Contributors
We thank all the contributors to this project, more contributors are welcome!
🙌 We would like to express our heartfelt gratitude for the contribution of ROME to our project, as we have utilized portions of their source code in our project.