English

DupPR Dataset

About the dataset

This dataset contains duplicate pull requests that developers submitted unintentionally on GitHub. See dup_prs.md for the data.

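Help us

Contributions that improve this dataset are very welcome. You can open an issue or a pull request to:

  • Add duplicate pull requests you have newly discovered
  • Point out incorrect entries in this dataset

Note: please do not submit duplicate issues or pull requests :)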

Citing this dataset

@inproceedings{yu2018dataset,
  title={A dataset of duplicate pull-requests in github},
  author={Yu, Yue and Li, Zhixing and Yin, Gang and Wang, Tao and Wang, Huaimin},
  booktitle={Proceedings of the 15th International Conference on Mining Software Repositories},
  pages={22--25},
  year={2018}
}

Research using this dataset

  • Li, Z., Yu, Y., Zhou, M., Wang, T., Yin, G., Lan, L., & Wang, H. (2020). Redundancy, Context, and Preference: An Empirical Study of Duplicate Pull Requests in OSS Projects. IEEE Transactions on Software Engineering (TSE).

  • Wang, Q., Xu, B., Xia, X., Wang, T., & Li, S. (2019, October). Duplicate Pull Request Detection: When Time Matters. In Proceedings of the 11th Asia-Pacific Symposium on Internetware (pp. 1-10).

  • Zhou, S., Vasilescu, B., & Kästner, C. (2019, August). What the fork: a study of inefficient and efficient forking practices in social coding. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) (pp. 350-361).

  • Ren, L., Zhou, S., Kästner, C., & Wąsowski, A. (2019, February). Identifying redundancies in fork-based development. In Proceedings of the 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER) (pp. 230-241). IEEE.

  • Li, Z., Yu, Y., Wang, T., Yin, G., Mao, X., & Wang, H. (2019). Detecting Duplicate Contributions in Pull-based Model Combining Textual and Change Similarities. Journal of Computer Science and Technology.

  • Li, Z., Yin, G., Yu, Y., Wang, T., & Wang, H. (2017, September). Detecting duplicate pull-requests in github. In Proceedings of the 9th Asia-Pacific Symposium on Internetware (pp. 1-6).