Fix phonetics for 誼 #531

Merged (1 commit) on Oct 18, 2024

xatier (Contributor) commented Sep 26, 2024

The concise dictionary prefers ㄧˋ; we provide both readings for fault tolerance.

https://dict.concised.moe.edu.tw/search.jsp?md=1&word=%E8%AA%BC#searchL

lukhnos (Contributor) commented Oct 16, 2024

This is a problematic entry. I suspect no one reads 交誼 (and its derivative terms such as 交誼廳) as ㄐㄧㄠ ㄧˋ because of our natural tendency to disambiguate very common terms (in this case 交易, which occurs much more commonly). I feel that this one borders on following the dictionary for the sake of following the dictionary.

xatier (Contributor, Author) commented Oct 16, 2024

No worries, feel free to close this one if you don't find it suitable for the dictionary.

Side note: I wonder whether, if we had the capability to support dual/multiple frequencies for these cases, we could adjust the frequency for 交誼 ㄐㄧㄠ ㄧˋ to be significantly lower than that of 交易 ㄐㄧㄠ ㄧˋ. This would give more flexibility in handling these cases. It is a similar idea to my proposal in #498 for the user dictionary.

xatier (Contributor, Author) commented Oct 16, 2024

My planned approach is to keep sharing these PRs when I get the chance, but I would like the team to decide whether 小麥 wants to accept the changes or not. We merge only the PRs that the team feels are valuable for users. Would that work for you?

lukhnos (Contributor) commented Oct 17, 2024

> My planned approach is to keep sharing these PRs when I get the chance, but I would like the team to decide whether 小麥 wants to accept the changes or not. We merge only the PRs that the team feels are valuable for users. Would that work for you?

It works, and it's much appreciated. I want to reiterate that we're very grateful that we receive PRs like the ones you have contributed.

You pointed out the inconsistency in McBopomofo's own data. I agree with you. I believe we are all aware of the low-hanging fruit in any attempt to fix something. Many recent PRs in this category have addressed those without any controversy (and that's considering the fact that my fellow reviewers have a high bar for those), but we now have some cases that we are reluctant about, and I'm afraid that we have to make a subjective decision on those.

I wish I could be more academic in giving my own rationales, but I don't have the resources to make those arguments (such as using corpora and audio sources to show that one pronunciation is more prevalent in one variant of Mandarin than the other; that would be the job for MOE's dictionary editors!). There is also the fluid nature of spoken language. 亞 is a good example. While it's very telling which variant you're speaking when you say 亞洲 (this is especially so in formal speech such as news announcements), I'd venture to say the reading for 波希米亞 (third vs. fourth tone) is likely equally split among Taiwanese Mandarin speakers. So 波希米亞 ㄅㄛ ㄒㄧ ㄇㄧˇ ㄧㄚˋ seems a good addition, but 泛亞 ㄈㄢˋ ㄧㄚˋ is unlikely (it's also not what the now-defunct telecom company used to call itself, for example).

All this is to say that there are times when an effort to keep things consistent runs counter to a maintainer's intuition about the language, and unfortunately the rationales will be incomplete or subjective. If I'm doing the review, though, I'll do my best to make a case-by-case argument for why I need to make such a subjective call. How does that sound to you?

> Side note: I wonder whether, if we had the capability to support dual/multiple frequencies for these cases, we could adjust the frequency for 交誼 ㄐㄧㄠ ㄧˋ to be significantly lower than that of 交易 ㄐㄧㄠ ㄧˋ. This would give more flexibility in handling these cases. It is a similar idea to my proposal in #498 for the user dictionary.

In the case of this PR, I checked the built data and 交易 would still have a higher score, but I agree with you in general that having something as suggested in #498 is helpful.
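For readers following along, here is a minimal sketch of the per-reading frequency idea discussed above. It assumes a hypothetical in-memory table of (phrase, reading, score) entries; it is not McBopomofo's actual data format or build pipeline, and the scores are invented purely for illustration. The names `ENTRIES`, `build_index`, and `candidates_for` are made up for this example.

```python
from collections import defaultdict

# Hypothetical entries: (phrase, reading, score). Higher score = more likely.
# The numbers are invented for illustration, not taken from any real data file.
ENTRIES = [
    ("交易", "ㄐㄧㄠ ㄧˋ", -4.0),
    ("交誼", "ㄐㄧㄠ ㄧˊ", -5.5),
    ("交誼", "ㄐㄧㄠ ㄧˋ", -8.0),  # alternate reading, deliberately scored lower
]


def build_index(entries):
    """Group phrases by reading and sort each group by descending score."""
    index = defaultdict(list)
    for phrase, reading, score in entries:
        index[reading].append((phrase, score))
    for candidates in index.values():
        candidates.sort(key=lambda item: item[1], reverse=True)
    return dict(index)


def candidates_for(index, reading):
    """Return the phrases for a reading, best candidate first."""
    return [phrase for phrase, _ in index.get(reading, [])]


INDEX = build_index(ENTRIES)
print(candidates_for(INDEX, "ㄐㄧㄠ ㄧˋ"))
```

Because each (phrase, reading) pair carries its own score in this sketch, the alternate reading of 交誼 can stay available for fault tolerance while 交易 remains the first candidate for ㄐㄧㄠ ㄧˋ.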

xatier (Contributor, Author) commented Oct 17, 2024

@lukhnos I see we share the same view regarding how we'd like to update the dictionary files here. I'm also relying on my experience and intuition as a native speaker when crafting these PRs. I have no issue if we need to make some subjective calls on certain controversial terms; that's totally fine!

Just kindly let me know and I can help update the PRs. Let's work together to improve the 小麥 dictionary.

xatier (Contributor, Author) commented Oct 17, 2024

I have removed the 交誼 ㄐㄧㄠ ㄧˋ terms; kindly let me know if you would like any other changes. Thanks again for reviewing this and sharing your opinions!

lukhnos (Contributor) left a comment

> I have removed the 交誼 ㄐㄧㄠ ㄧˋ terms; kindly let me know if you would like any other changes. Thanks again for reviewing this and sharing your opinions!

Much appreciated, and thanks for the feedback on my comment. I suggested removing 友誼 but keeping the others so as to maintain the PR's purpose as an attempt to follow the official dictionary as much as possible, while also maintaining that the changes are subject to our discretion.

Source/Data/BPMFMappings.txt (outdated review thread, resolved)
lukhnos (Contributor) left a comment

Thanks!

lukhnos merged commit 1a47569 into openvanilla:master on Oct 18, 2024
1 check passed
xatier deleted the yi branch on October 21, 2024 22:39