String matching algorithms #162

Closed Answered by maxbachmann
SergeyMalashenko asked this question in Q&A

By default, extractOne uses fuzz.WRatio, which combines multiple ratios (I will add an explanation of this to the documentation for 1.0.0):

documentation fuzz.WRatio

Here is the documentation from FuzzyWuzzy on fuzz.WRatio:

#. Run full_process from utils on both strings
#. Short circuit if this makes either string empty
#. Take the ratio of the two processed strings (fuzz.ratio)
#. Run checks to compare the length of the strings
   * If one of the strings is more than 1.5 times as long as the other
     use partial_ratio comparisons - scale partial results by 0.9
     (this makes sure only full results can return 100)
   * If one of the strings is over 8 times as long as the other
     instead scale by 0.6
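
The steps listed above can be sketched roughly as follows. This is only an illustration, not the FuzzyWuzzy or RapidFuzz implementation: it uses the standard library's difflib as a stand-in for fuzz.ratio, a naive sliding-window partial_ratio, and a simplified full_process, and it leaves out any other ratios WRatio may combine. The name wratio_sketch and all helper bodies are assumptions made for this example.

```python
from difflib import SequenceMatcher


def full_process(s: str) -> str:
    """Simplified stand-in for preprocessing: keep letters/digits, lowercase, trim."""
    cleaned = "".join(ch if ch.isalnum() else " " for ch in s)
    return cleaned.lower().strip()


def ratio(a: str, b: str) -> float:
    """Similarity in [0, 100]; difflib stands in for fuzz.ratio here."""
    return 100.0 * SequenceMatcher(None, a, b).ratio()


def partial_ratio(a: str, b: str) -> float:
    """Naive partial ratio: best score of the shorter string against
    every same-length window of the longer string."""
    short, long_ = (a, b) if len(a) <= len(b) else (b, a)
    return max(
        ratio(short, long_[i:i + len(short)])
        for i in range(len(long_) - len(short) + 1)
    )


def wratio_sketch(s1: str, s2: str) -> float:
    # Step 1: preprocess both strings.
    p1, p2 = full_process(s1), full_process(s2)
    # Step 2: short circuit if either string is empty after preprocessing.
    if not p1 or not p2:
        return 0.0
    # Step 3: plain ratio of the processed strings.
    base = ratio(p1, p2)
    # Step 4: compare lengths to decide whether partial matching applies.
    len_ratio = max(len(p1), len(p2)) / min(len(p1), len(p2))
    if len_ratio <= 1.5:
        return base
    # Partial results are scaled by 0.9, or by 0.6 for very uneven lengths,
    # so that only a full match can reach 100.
    scale = 0.6 if len_ratio > 8 else 0.9
    return max(base, partial_ratio(p1, p2) * scale)


if __name__ == "__main__":
    print(wratio_sketch("New York", "new york"))
    print(wratio_sketch("new york", "new york city is big"))
```

In this sketch, a short query that appears verbatim inside a much longer string is capped by the scale factor, which is why a partial match ranks below an exact full match.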

Answer selected by maxbachmann
This discussion was converted from issue #71 on November 21, 2021 09:19.