[de] improve germanSpeller #9821

Open
wants to merge 4 commits into
base: master

Conversation

@AgnesKleinhans (Contributor) commented Dec 4, 2023

@p-goulart Could you please review my PR? Thank you :-)

@Luisa-LT @St-ac-y

@p-goulart (Collaborator) left a comment

Working out the logic here took some effort, so let me check that I have it right:

  1. if the German tokeniser refuses to split word into multiple parts, try again with a non-strict tokeniser (that part was already there);
  2. iterate over all the non-Fugen-s words (wordsWithoutInfixS);
  3. if word starts with one of them, then...
  4. set part2 to whatever is left once you remove that prefix;
  5. strip a leading hyphen;
  6. otherwise, check the lengths (where does the 3 come from?) and capitalise part2;
  7. perform your regular misspelling checks on part2.

Right?
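
To make sure we mean the same thing, here is a minimal, self-contained sketch of how I read steps 2 to 7. It is not the PR's code: the class name, the tiny word lists, the stubbed helpers and the constant MIN_PREFIX_LENGTH are all hypothetical stand-ins, and the final `return false` is an assumption, because the excerpt ends before the loop closes.

```java
import java.util.List;
import java.util.Set;

// Hypothetical stand-in for the speller; none of the data below is real LanguageTool data.
class CompoundCheckSketch {

    // Made-up substitute for wordsWithoutInfixS.
    static final List<String> WORDS_WITHOUT_INFIX_S = List.of("Haus", "Rad");
    // Made-up dictionary: every entry counts as a correctly spelled noun.
    static final Set<String> KNOWN_NOUNS = Set.of("Dach", "Boot", "Weg");
    // The unexplained 3 from the diff, given a (hypothetical) name.
    static final int MIN_PREFIX_LENGTH = 3;

    // Stubs for the helpers the diff calls; the real ones consult the dictionary and tagger.
    static boolean isMisspelled(String w) { return !KNOWN_NOUNS.contains(w); }
    static boolean ignorePotentiallyMisspelledWord(String w) { return false; }
    static boolean isNoun(String w) { return KNOWN_NOUNS.contains(w); }

    static String uppercaseFirstChar(String s) {
        return s.isEmpty() ? s : Character.toUpperCase(s.charAt(0)) + s.substring(1);
    }

    static boolean isAcceptedCompound(String word) {
        for (String w : WORDS_WITHOUT_INFIX_S) {            // step 2
            if (!word.startsWith(w)) {                      // step 3
                continue;
            }
            String part2 = word.substring(w.length());      // step 4
            if (part2.startsWith("-")) {
                part2 = part2.substring(1);                 // step 5
            } else if (word.length() > w.length() && w.length() > MIN_PREFIX_LENGTH) {
                part2 = uppercaseFirstChar(part2);          // step 6
            }
            // step 7: the first matching prefix decides the result
            return (!isMisspelled(part2) || ignorePotentiallyMisspelledWord(part2)) && isNoun(part2);
        }
        return false; // assumption: the excerpt does not show what follows the loop
    }

    public static void main(String[] args) {
        System.out.println(isAcceptedCompound("Hausdach"));  // prefix longer than 3: part2 is capitalised
        System.out.println(isAcceptedCompound("Haus-Boot")); // hyphen branch: part2 kept as written
        System.out.println(isAcceptedCompound("Radweg"));    // prefix of length 3: part2 stays lowercase
    }
}
```

With these stubs, "Radweg" is rejected only because "Rad" is not longer than 3 characters, so "weg" is never capitalised before the noun check. That is the behaviour I would like to see explained or covered by a test.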

Comment on lines +2220 to +2230
String part2 = " ";
for (String w : wordsWithoutInfixS) { // wordsWithHyphen
  if (word.startsWith(w)) {
    part2 = word.substring(w.length());
    if (part2.startsWith("-")) {
      part2 = part2.substring(1);
    } else if (word.length() > w.length() && w.length() > 3) {
      part2 = uppercaseFirstChar(part2.substring(0));
    }
    return (!isMisspelled(part2) || ignorePotentiallyMisspelledWord(part2)) && isNoun(part2);
  }

This stretch looks over-indented 🙃

3 participants