This repository contains an upgraded version of the source code of MPC-BERT: A Pre-Trained Language Model for Multi-Party Conversation Understanding (https://github.com/JasonForJoy/MPC-BERT). MPC-BERT-2.0 supports TensorFlow 2.
Python 3.8
TensorFlow 2.10.0
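A minimal environment setup sketch (the virtual environment name below is only an example; any Python 3.8 environment with TensorFlow 2.10.0 works):
python3.8 -m venv mpcbert-env
source mpcbert-env/bin/activate
pip install tensorflow==2.10.0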
- Download the BERT model released by Google Research and move it to the path ./uncased_L-12_H-768_A-12 (see the example commands after this list).
- The pre-trained MPC-BERT model is also released; download it and move it to the path ./uncased_L-12_H-768_A-12_MPCBERT. You only need to fine-tune it to reproduce the original results.
- Download the Hu et al. (2019) dataset used in the original paper and move it to the path ./data/ijcai2019/.
- Download the Ouchi and Tsuboi (2016) dataset used in the original paper and move it to the path ./data/emnlp2016/. Unzip the dataset and run the following commands:
cd data/emnlp2016/
python data_preprocess.py
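As an example of the first download step, the standard Google BERT-Base Uncased release can be fetched as follows (verify the URL against Google's BERT repository before use):
wget https://storage.googleapis.com/bert_models/2018_10_18/uncased_L-12_H-768_A-12.zip
unzip uncased_L-12_H-768_A-12.zip   # extracts to ./uncased_L-12_H-768_A-12/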
Create the pre-training data.
python create_pretraining_data.py
Run the pre-training process.
cd scripts/
bash run_pretraining.sh
The pre-trained model will be saved to the path ./uncased_L-12_H-768_A-12_MPCBERT. Modify the filenames in this folder so that they match those in Google's BERT release.
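A minimal renaming sketch, assuming the last pre-training checkpoint is model.ckpt-10000 (the step number and file suffixes are assumptions; adjust them to the checkpoint actually produced):
cd uncased_L-12_H-768_A-12_MPCBERT/
mv model.ckpt-10000.index bert_model.ckpt.index
mv model.ckpt-10000.data-00000-of-00001 bert_model.ckpt.data-00000-of-00001
# copy vocab.txt and bert_config.json from ./uncased_L-12_H-768_A-12 if they are not already present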
Take the task of addressee recognition as an example.
Create the fine-tuning data.
python create_finetuning_data_ar.py
Run the fine-tuning process.
cd scripts/
bash run_finetuning.sh
Modify the variable restore_model_dir in run_testing.sh.
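For example, in run_testing.sh (the path below is hypothetical; point it at the directory where your fine-tuned checkpoint was saved):
restore_model_dir=../output/addressee_recognition_finetuning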
Run the testing process.
cd scripts/
bash run_testing.sh
Replace these scripts and their corresponding data when evaluating on other downstream tasks.
create_finetuning_data_{ar, si, rs}.py
run_finetuning_{ar, si, rs}.py
run_testing_{ar, si, rs}.py
Specifically, for the task of response selection, an output_test.txt file recording the scores of each context-response pair will be saved to the path of restore_model_dir after testing.
Modify the variable test_out_filename in compute_metrics.py and then run python compute_metrics.py; various metrics will be shown.
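A minimal sketch of this step (the sed pattern assumes the assignment starts a line in compute_metrics.py, and the path is hypothetical; editing the file by hand works just as well):
sed -i 's|^test_out_filename.*|test_out_filename = "PATH_TO_RESTORE_MODEL_DIR/output_test.txt"|' compute_metrics.py
python compute_metrics.py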
If you use the code, please cite the following paper: "MPC-BERT: A Pre-Trained Language Model for Multi-Party Conversation Understanding" Jia-Chen Gu, Chongyang Tao, Zhen-Hua Ling, Can Xu, Xiubo Geng, Daxin Jiang. ACL (2021)
@inproceedings{gu-etal-2021-mpc,
title = "{MPC}-{BERT}: A Pre-Trained Language Model for Multi-Party Conversation Understanding",
author = "Gu, Jia-Chen and
Tao, Chongyang and
Ling, Zhen-Hua and
Xu, Can and
Geng, Xiubo and
Jiang, Daxin",
booktitle = "Proceedings of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing (Volume 1: Long Papers)",
month = aug,
year = "2021",
address = "Online",
publisher = "Association for Computational Linguistics",
url = "https://aclanthology.org/2021.acl-long.285",
pages = "3682--3692",
}
Thanks to Wenpeng Hu and Zhangming Chan for providing the processed Hu et al. (2019) dataset used in their paper.
Thanks to Ran Le for providing the processed Ouchi and Tsuboi (2016) dataset used in their paper.