Dupe Guru not detecting duplicate files that have same MD5 hash #1232

Open
matttrv opened this issue Jun 6, 2024 · 4 comments
Labels
bug Bug reports.

Comments

matttrv commented Jun 6, 2024

Describe the bug
I have used DupeGuru for years and have come to depend on its reliability. Recently, I ran it as a double check on some known duplicate files and was surprised when only 80-90% of the dupes were found.

I checked the hashes (MD5, as well as SHA-1) and they were definitely dupes.

I made some sample files to do a controlled test and experienced the same problem -- the dupes are not identified. I even loosened the criteria to search just on filename, and still no dupes were found.

To Reproduce
Steps to reproduce the behavior:

  1. Unzip the attached files; they will extract into \banger\ and \hanger\, with a text file and a .url file in each folder. I used .txt and .url files because they can be viewed as plain text, so you know they are not malicious. (I'll attach the raw files as well.)

  2. Add the folders to DupeGuru as scan targets. I scanned with both set to "Normal" to be as inclusive as possible about what could be identified as dupes.
    image

  3. Check if dupes were found. For me, none were found.
    image

  4. Run certutil (with MD5) against the files and see that they are indeed dupes.
    image
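
The hash check in step 4 can also be scripted for anyone who wants to reproduce it without certutil. This is a minimal sketch and not part of the original report: it assumes the two extracted folders are reachable as `banger` and `hanger` relative to the working directory, and `md5_of` and `find_identical` are illustrative names. It groups files by MD5 digest and keeps only the groups with more than one file.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_identical(folders):
    """Group files under the given folders by MD5; keep groups of 2+."""
    groups = defaultdict(list)
    for folder in folders:
        for path in sorted(Path(folder).rglob("*")):
            if path.is_file():
                groups[md5_of(path)].append(path)
    return {h: paths for h, paths in groups.items() if len(paths) > 1}


if __name__ == "__main__":
    # Folder names taken from the report; adjust to where the zip was extracted.
    for digest, paths in find_identical(["banger", "hanger"]).items():
        print(digest, *paths, sep="\n  ")
```

Any digest printed with two or more paths under it marks files whose contents are byte-for-byte candidates for being identical, which is the same evidence the certutil screenshot provides.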

Expected behavior
Should have been ID'd as dupes. Tried on two different Windows 11 machines, same result, NOT identified as dupes.
Painful because if the dupes are not comprehensively identified, I'm not sure I can continue to use DupeGuru.
I guess it also makes me worry that things identified as dupes are not in fact dupes.

Desktop (please complete the following information):

  • OS: Windows 11
  • Version: DupeGuru 4.3.1 (tested back to 4.04, same issue)

image

dupegurutest.zip

toby.txt
Alliance Stage Company.url.txt

EDIT: Attached the raw files as:

  • toby.txt
  • Alliance.Stage.Company.url.txt (dot TXT added because GitHub will not allow a dot URL to be uploaded)
@matttrv matttrv added the bug Bug reports. label Jun 6, 2024
@matttrv matttrv changed the title from "Dupe Guru not detecting duplicate files that have same MD5 hahs" to "Dupe Guru not detecting duplicate files that have same MD5 hash" Jun 6, 2024

efanibi25 commented Jun 9, 2024

Did you remove the size limit?

image

@AndroYD84

Screenshot 2024-06-21 093128
I'm pretty confident you forgot to change this default option.

@efanibi25

> Screenshot 2024-06-21 093128
> I'm pretty confident you forgot to change this default option.

Yes, I was able to find the dupe files with that setting on, and with the files given.
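
For readers hitting the same symptom: the exchange above suggests that a size limit was excluding the small test files before their contents were ever compared, so matching hashes never came into play. A minimal sketch of that effect follows. The 10 KB threshold, the function name `candidates`, the file sizes, and the filtering logic are all assumptions for illustration, not dupeGuru's actual code or a confirmed default.

```python
def candidates(files, min_size=10 * 1024):
    """Return the names of files large enough to be considered.

    `files` maps a file name to its size in bytes. `min_size` is a
    hypothetical threshold; pass 0 to disable the filter.
    """
    return [name for name, size in files.items() if size >= min_size]


# Small text files like the ones in the report (sizes are made up).
sample = {"banger/toby.txt": 120, "hanger/toby.txt": 120}

print(candidates(sample))              # filter on: nothing left to compare
print(candidates(sample, min_size=0))  # filter off: both files are compared
```

With the filter active, the candidate list is empty, so no duplicate groups can be reported regardless of how the remaining comparison works.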

Contributor

Dobatymo commented Dec 6, 2024

@efanibi25 this can be closed then, right?
