plan-ai/MSR2018-DupPR

 
 

Chinese version

The DupPR dataset

About this dataset

This dataset is a list of accidental duplicate pull requests collected from GitHub. The list is in dup_prs.md.

How can I help?

We would appreciate it if you opened an issue or pull request to

  • add new duplicates you have found
  • point out the errors in the dataset

Attention: please do not submit duplicate issues or pull requests :)

How can I cite this work?

@inproceedings{yu2018dataset,
  title={A dataset of duplicate pull-requests in github},
  author={Yu, Yue and Li, Zhixing and Yin, Gang and Wang, Tao and Wang, Huaimin},
  booktitle={Proceedings of the 15th International Conference on Mining Software Repositories},
  pages={22--25},
  year={2018}
}

Papers using this dataset

  • Li, Z., Yu, Y., Zhou, M., Wang, T., Yin, G., Lan, L., & Wang, H. (2020). Redundancy, Context, and Preference: An Empirical Study of Duplicate Pull Requests in OSS Projects. IEEE Transactions on Software Engineering (TSE).

  • Wang, Q., Xu, B., Xia, X., Wang, T., & Li, S. (2019, October). Duplicate Pull Request Detection: When Time Matters. In Proceedings of the 11th Asia-Pacific Symposium on Internetware (pp. 1-10).

  • Zhou, S., Vasilescu, B., & Kästner, C. (2019, August). What the fork: a study of inefficient and efficient forking practices in social coding. In Proceedings of the 2019 27th ACM Joint Meeting on European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE) (pp. 350-361).

  • Ren, L., Zhou, S., Kästner, C., & Wąsowski, A. (2019, February). Identifying redundancies in fork-based development. In Proceedings 2019 IEEE 26th International Conference on Software Analysis, Evolution and Reengineering (SANER) (pp. 230-241). IEEE.

  • Li, Z., Yu, Y., Wang, T., Yin, G., Mao, X., & Wang, H. (2019). Detecting Duplicate Contributions in Pull-based Model Combining Textual and Change Similarities. Journal of Computer Science and Technology.

  • Li, Z., Yin, G., Yu, Y., Wang, T., & Wang, H. (2017, September). Detecting duplicate pull-requests in github. In Proceedings of the 9th Asia-Pacific Symposium on Internetware (pp. 1-6).
