Extract the findings and impression section of the radiology reports in the MIMIC-CXR-Report and OpenI datasets.

MoMarky/radiology-report-extraction

radiology-report-extraction

This code implements the radiology report extraction used in our paper:

Chong Ma, Zihao Wu, Jiaqi Wang, Shaochen Xu, Yaonai Wei, Zhengliang Liu, Xi Jiang, Lei Guo, Xiaoyan Cai, Shu Zhang, Tuo Zhang, Dajiang Zhu, Dinggang Shen, Tianming Liu, Xiang Li

ImpressionGPT: An Iterative Optimizing Framework for Radiology Report Summarization with ChatGPT.

OpenI-Report dataset

1. Download.

Download at OpenI-Reports.

2. Extract.

Open extract_openI.py and run the following functions in order:

1). Run "extract_findings_and_impression_openi()" first to extract the findings and impression sections from the .xml files.

2). Run "gen_openi_data()" to generate a random train/test split.

3). Run "gen_report_and_label()" to generate the final data used in ImpressionGPT. The openi_findings_label.csv file can be found at /res/; it was generated by the CheXpert report labeler.

Alternatively, you can use the post-processed data I generated at /Results/openI test data/.
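
The first two steps can be sketched roughly as follows. This is a minimal illustration, not the code in extract_openI.py: it assumes each OpenI report stores its sections in `AbstractText` elements with a `Label` attribute such as `FINDINGS` or `IMPRESSION`, and the helper names, the 20% test fraction, and the seed are all hypothetical.

```python
import random
import xml.etree.ElementTree as ET


def extract_findings_and_impression(xml_text):
    """Return (findings, impression) for one report; a missing section becomes ''."""
    root = ET.fromstring(xml_text)
    sections = {}
    # Assumption: sections are <AbstractText Label="..."> elements.
    for node in root.iter("AbstractText"):
        label = (node.get("Label") or "").upper()
        sections[label] = (node.text or "").strip()
    return sections.get("FINDINGS", ""), sections.get("IMPRESSION", "")


def random_split(report_ids, test_fraction=0.2, seed=0):
    """Shuffle the ids with a fixed seed and return (train_ids, test_ids)."""
    ids = sorted(report_ids)
    random.Random(seed).shuffle(ids)
    n_test = int(len(ids) * test_fraction)
    return ids[n_test:], ids[:n_test]


# Made-up report used only to exercise the helpers above.
sample = """<eCitation><MedlineCitation><Article><Abstract>
<AbstractText Label="FINDINGS">The heart is normal in size.</AbstractText>
<AbstractText Label="IMPRESSION">No acute disease.</AbstractText>
</Abstract></Article></MedlineCitation></eCitation>"""

findings, impression = extract_findings_and_impression(sample)
train_ids, test_ids = random_split(["1", "2", "3", "4", "5"])
```

Fixing the seed keeps the split reproducible across runs, which matters if the test set is later compared against published results.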

MIMIC-CXR-Report dataset

1. Download.

First obtain the license at PhysioNet. Then you can download the original reports at MIMIC-CXR-2.0. If you don't want to do that, you can use the post-processed data I generated at /Results/mimic test data/.

2. Extract.

Open extract_mimic.py and run the following functions in order:

1). Run "extract_sections()" first to extract the findings and impression sections from the .txt files. This part is implemented by MIT-LCP.

2). Run "select_test_data_from_sections()" to select the test data of the official split from all reports. The all_test_ids.csv file can be found at /res/.

3). Run "data_clean_for_mimic()" to remove uninformative sentences from the test reports, such as "The diagnosis results was communicated with Dr.__ at __."
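
The three steps above can be sketched in simplified form as follows. This is not the MIT-LCP section parser or the code in extract_mimic.py: it assumes section headers are upper-case words followed by a colon at the start of a line, and the helper names, the report ids, and the "communicated with" pattern are hypothetical.

```python
import re

# Assumption: a header is an upper-case phrase ending in ':' at the start of a line.
HEADER_RE = re.compile(r"^\s*([A-Z][A-Z /]*[A-Z]):", re.MULTILINE)
# Hypothetical pattern for sentences that only record who was notified.
COMMUNICATION_RE = re.compile(r"\bcommunicated (with|to)\b", re.IGNORECASE)
# Split after sentence-ending punctuation, but not after the abbreviation "Dr.".
SENTENCE_SPLIT_RE = re.compile(r"(?<!Dr\.)(?<=[.!?])\s+")


def split_sections(report_text):
    """Map lower-cased header names to their whitespace-normalized bodies."""
    sections = {}
    matches = list(HEADER_RE.finditer(report_text))
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(report_text)
        sections[m.group(1).lower()] = " ".join(report_text[m.end():end].split())
    return sections


def select_test_reports(sections_by_id, test_ids):
    """Keep only the reports whose id appears in the official test id list."""
    wanted = set(test_ids)
    return {rid: sec for rid, sec in sections_by_id.items() if rid in wanted}


def drop_communication_sentences(text):
    """Remove sentences matching COMMUNICATION_RE and rejoin the rest."""
    sentences = SENTENCE_SPLIT_RE.split(text)
    return " ".join(s for s in sentences if s and not COMMUNICATION_RE.search(s))


# Made-up report used only to exercise the helpers above.
report = (
    "                                 FINAL REPORT\n"
    " EXAMINATION:  CHEST (PA AND LAT)\n \n"
    " FINDINGS: \n The lungs are clear.  No pleural effusion.\n \n"
    " IMPRESSION: \n No acute cardiopulmonary process. "
    "Findings were communicated with Dr. ___ at ___.\n"
)
all_sections = {"s1": split_sections(report), "s2": {"findings": "", "impression": ""}}
test_sections = select_test_reports(all_sections, ["s1"])
clean_impression = drop_communication_sentences(test_sections["s1"]["impression"])
```

The sentence splitter deliberately refuses to break after "Dr.", because otherwise the tail of a communication sentence ("___ at ___.") would survive the filter as a fragment.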

Citation

If you use this code, or otherwise find our work valuable, please cite:

@article{ma2023impressiongpt,
  title={ImpressionGPT: an iterative optimizing framework for radiology report summarization with chatGPT},
  author={Ma, Chong and Wu, Zihao and Wang, Jiaqi and Xu, Shaochen and Wei, Yaonai and Liu, Zhengliang and Guo, Lei and Cai, Xiaoyan and Zhang, Shu and Zhang, Tuo and others},
  journal={arXiv preprint arXiv:2304.08448},
  year={2023}
}
