Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: improve text search normalization #149

Merged
merged 1 commit into from
Jan 15, 2025
Merged

Conversation

camshaft
Copy link
Member

@camshaft camshaft commented Jan 9, 2025

Description of changes:

This change fixes a bug in the text::find module that doesn't correctly map characters to the original positions. The new version is more accurate. It also includes a fuzz test that has some checks around the normalized string having the specified properties.

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

@camshaft camshaft force-pushed the camshaft/text-find-fixes branch from c5b5fad to b9beea2 Compare January 15, 2025 00:12
@camshaft camshaft force-pushed the camshaft/text-find-fixes branch from b9beea2 to c7539d3 Compare January 15, 2025 00:31
@camshaft camshaft marked this pull request as ready for review January 15, 2025 00:31
@camshaft camshaft requested a review from a team as a code owner January 15, 2025 00:31
@@ -32,58 +52,214 @@ fn fast_find(needle: &str, haystack: &str) -> Option<Range<usize>> {

/// TODO we should probaly deprecate this - it's better to enforce strict matching
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

still believe this?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yep. that's why i've added the ability to warn on it 😄. still need to wire up the warnings though.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

oh I read the TODO as suggesting not allowing fuzzy finds at all, so it wouldn't be a warning, it would just not match

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's what I want to get to. But removing the fuzzy matching right now would be a breaking change so I'd like to give some buffer for codebases to fix the matches and then we can deprecate some time in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ah I get it, so the warning would be something like "this is a fuzzy match, which is deprecated and will no longer be supported"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah that's exactly what I had in mind

@camshaft camshaft merged commit 97e0bed into main Jan 15, 2025
13 checks passed
@camshaft camshaft deleted the camshaft/text-find-fixes branch January 15, 2025 00:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants