Tests for Marathi diphthong grapheme -> long vowel phoneme rules.
PiperOrigin-RevId: 706689539
isingoo authored and copybara-github committed Dec 16, 2024
1 parent dee13ef commit e674ca3
Showing 2 changed files with 14 additions and 0 deletions.
4 changes: 4 additions & 0 deletions nisaba/scripts/natural_translit/brahmic/g2p.py
@@ -39,6 +39,10 @@ def iso_to_txn() -> pyn.Fst:

# Vowels

# TODO: Convert this constant to a function where duration and diphthong
# context are passed as arguments, and remove the recovery rule.
# The current rule rewrites all /a/ to /ə/ including the diphthongs like /ai/,
# and recovers long a with /ə:/ -> /a:/ but not the diphthongs.
A_TO_EC = (
rw.rewrite(ph.A, ph.EC) @
rw.rewrite(ph.EC + ph.DURH, ph.A + ph.DURH)
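The rewrite-then-recover behaviour described in the TODO can be illustrated with a rough sketch on plain strings. This is not the pynini grammar from `g2p.py`: the names `a_to_ec_with_recovery` and `a_to_ec_context_aware`, the string encoding of phonemes, and the choice of `i` and `u` as diphthong glides are all assumptions made for illustration, and how the real grammar handles diphthongs downstream is not shown in this hunk.

```python
import re

# Plain-string stand-ins for ph.A, ph.EC and ph.DURH (assumed encoding).
A, EC, DURH = "a", "ə", "ː"


def a_to_ec_with_recovery(phonemes: str) -> str:
    """Mimic the two-step rule: rewrite every /a/ to /ə/, then recover /əː/ to /aː/."""
    rewritten = phonemes.replace(A, EC)
    return rewritten.replace(EC + DURH, A + DURH)


def a_to_ec_context_aware(phonemes: str, glides: str = "iu") -> str:
    """Hypothetical single pass: keep /a/ when it is long or starts a diphthong."""
    # Negative lookahead: /a/ is rewritten only if no duration mark or glide follows.
    pattern = re.compile(f"{A}(?![{DURH}{glides}])")
    return pattern.sub(EC, phonemes)


for example in ["kamal", "kaːm", "aisaː"]:
    print(example, a_to_ec_with_recovery(example), a_to_ec_context_aware(example))
```

In this sketch the two-step version restores the long vowel in `kaːm` but leaves the diphthong in `aisaː` as `əi`, which is the gap the TODO points out. The context-aware variant needs no recovery step because it never touches those vowels in the first place.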
10 changes: 10 additions & 0 deletions nisaba/scripts/natural_translit/g2p/testdata/mr_iso_ipa.textproto
@@ -39,3 +39,13 @@ rewrite {
input: "siddʰēgavhāṇa"
output: "sid̪d̪ʰeːɡəʋʰaːɳ"
}
rewrite {
rule: "ISO_TO_IPA"
input: "aisā"
output: "ɛːsaː"
}
rewrite {
rule: "ISO_TO_IPA"
input: "kannauja"
output: "kən̪n̪ɔːd͡ʒ"
}
