ICON: Improving Inter-Report Consistency in Radiology Report Generation via Lesion-aware Mixup Augmentation

This repository is the implementation of ICON: Improving Inter-Report Consistency in Radiology Report Generation via Lesion-aware Mixup Augmentation. Before running the code, please install the prerequisite libraries, and follow our instructions to replicate the experiments.

Overview

Previous research on radiology report generation has made significant progress in terms of increasing the clinical accuracy of generated reports. In this paper, we emphasize another crucial quality that it should possess, i.e., inter-report consistency, which refers to the capability of generating consistent reports for semantically equivalent radiographs. This quality is even of greater significance than the overall report accuracy in terms of ensuring the system's credibility, as a system prone to providing conflicting results would severely erode users' trust. Regrettably, existing approaches struggle to maintain inter-report consistency, exhibiting biases towards common patterns and susceptibility to lesion variants. To address this issue, we propose ICON, which improves the inter-report consistency of radiology report generation. Aiming to enhance the system's ability to capture similarities in semantically equivalent lesions, our approach first involves extracting lesions from input images and examining their characteristics. Then, we introduce a lesion-aware mixup technique to ensure that the representations of the semantically equivalent lesions align with the same attributes, achieved through a linear combination during the training phase. Extensive experiments on three publicly available chest X-ray datasets verify the effectiveness of our approach, both in terms of improving the consistency and accuracy of the generated reports.

Requirements

Basic Requirements

python>=3.9.0
torch==2.1.0
transformers==4.36.2

Other Requirements

Please install dependencies by using the following command:

conda env create -f environment.yml # Untested
conda activate icon

Data Preparation and Preprocessing

Observation Annotation

Please download the three datasets: IU X-ray, MIMIC-ABN and MIMIC-CXR, and put the annotation files into the data folder.

For observation preprocessing, we use CheXbert to extract relevant observation information. Please follow the instruction to extract the observation tags. Note that both report-level and sentence-level annotations are required.
For CE evaluation, please clone CheXbert into the folder and download the checkpoint chexbert.pth into CheXbert:

git clone https://github.com/stanfordmlgroup/CheXbert.git

Attribute Annotation

Attribute annotation is built upon RadGraph. We adopt the same attributes released by Recap.

Semantic Equivalence Retrieval

Semantic equivalances are built based on report similarity. Run the following code to retrieve similar reports:

./script_retrieval/run_mimic_cxr.sh

Stage 1: Lesion Extraction

Stage 1.1 Training Zoomer

Two parameters are required to run the code of the Zoomer:

debug: whether debugging the code (0 for debugging and 1 for running)
date: date of running the code (checkpoint identifier)

./script_stage1/run_mimic_cxr.sh debug date

Checkpoints are saved into ./tmp_stage1/

Example: ./script_stage1/run_mimic_cxr.sh 1 20240101

Stage 1.2 Extracting Lesions

Specify the checkpoint position of Zoomer in the script, and run:

./script_xai/run_mimic_cxr.sh

Stage 2: Report Generation

Two parameters are required to run the code of report generation:

debug: whether debugging the code (0 for debugging and 1 for running)
date: date of running the code (checkpoint identifier)

./script_stage2/run_mimic_cxr.sh debug date

Checkpoints are saved into ./tmp_stage2/

Example: ./script_stage2/run_mimic_cxr.sh 1 20240101

Consistency Evaluation

Observation and attribute annotation are required for consistency evaluation. Specify the positions of the outputs a report generation system, and run the following code:

output_file: the output file of the system

python eval_consistency.py output_file

Example: python eval_consistency.py ./tmp_stage2/mimic_cxr/eval_results.json

Model Checkpoints

Model checkpoints of two datasets are available at:

Dataset	Stage 1	Stage 2
MIMIC-ABN	Google Drive	Google Drive
MIMIC-CXR	Google Drive	Google Drive

Citation

If you use the ICon, please cite our paper:

@inproceedings{hou-etal-2024-icon,
    title = "{ICON}: Improving Inter-Report Consistency in Radiology Report Generation via Lesion-aware Mixup Augmentation",
    author = "Hou, Wenjun and Cheng, Yi and Xu, Kaishuai and Hu, Yan and Li, Wenjie and Liu, Jiang",
    editor = "Al-Onaizan, Yaser and Bansal, Mohit and Chen, Yun-Nung",
    booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2024",
    month = nov,
    year = "2024",
    address = "Miami, Florida, USA",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2024.findings-emnlp.528",
    pages = "9043--9056",
}

Name		Name	Last commit message	Last commit date
Latest commit History 28 Commits
data		data
eval		eval
figure		figure
script_retrieval		script_retrieval
script_stage1		script_stage1
script_stage2		script_stage2
script_xai		script_xai
src_retrieval		src_retrieval
src_stage1		src_stage1
src_stage2		src_stage2
src_xai		src_xai
LICENSE		LICENSE
README.md		README.md
environment.yml		environment.yml

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ICON: Improving Inter-Report Consistency in Radiology Report Generation via Lesion-aware Mixup Augmentation

Overview

Requirements

Basic Requirements

Other Requirements

Data Preparation and Preprocessing

Observation Annotation

Attribute Annotation

Semantic Equivalence Retrieval

Stage 1: Lesion Extraction

Stage 1.1 Training Zoomer

Stage 1.2 Extracting Lesions

Stage 2: Report Generation

Consistency Evaluation

Model Checkpoints

Citation

About

Releases

Packages

Languages

License

wjhou/ICon

Folders and files

Latest commit

History

Repository files navigation

ICON: Improving Inter-Report Consistency in Radiology Report Generation via Lesion-aware Mixup Augmentation

Overview

Requirements

Basic Requirements

Other Requirements

Data Preparation and Preprocessing

Observation Annotation

Attribute Annotation

Semantic Equivalence Retrieval

Stage 1: Lesion Extraction

Stage 1.1 Training Zoomer

Stage 1.2 Extracting Lesions

Stage 2: Report Generation

Consistency Evaluation

Model Checkpoints

Citation

About

Topics

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages