Alignment outputs are not as expected #32
This looks quite strange. I’ll take a look and get back to you.
Can you let me know what version or revision you are working with?
Thanks!
Joe
|
Can you also share the version of the cmudict that you are using, or a link to the revision in their corresponding repo? I cannot find the example word you shared in any recent revision I have handy.

I tried to reproduce similar behavior with the latest version of the aligner in master, and the latest version of the cmudict:

$ wget https://raw.githubusercontent.com/cmusphinx/cmudict/master/cmudict.dict
$ cat cmudict.dict | perl -pe 's/\([0-9]+\)//;
  s/\s+/ /g; s/^\s+//;
  s/\s+$//; @_ = split (/\s+/);
  $w = shift (@_);
  $_ = $w."\t".join (" ", @_)."\n";' > cmudict.formatted.dict
$ phonetisaurus-train --lexicon cmudict.formatted.dict --seq2_del

I wrote the following script which I think performs the comparison you described:

#!/usr/bin/env python
import re, sys, os
from collections import defaultdict

def ProcessAligned (corpusfile, lexicon) :
    with open (corpusfile, "r") as ifp :
        for line in ifp :
            graphs = []; phones = []
            tokens = re.split (ur"\s+", line.decode ("utf8").strip ())
            for token in tokens :
                g,p = re.split (ur"\}", token)
                graphs.extend (re.split (ur"\|", g))
                phones.extend (re.split (ur"\|", p))
            word = u"".join ([g for g in graphs if not g == u"_"])
            pron = u" ".join ([p for p in phones if not p == u"_"])
            prons = lexicon [word]
            if not pron in prons :
                entry = u"{0}\t{1}".format (word, pron)
                print entry.encode ("utf8")
    return

def LoadLexicon (lexiconfile) :
    lexicon = defaultdict (list)
    with open (lexiconfile, "r") as ifp :
        for entry in ifp :
            word, pron = re.split (ur"\t", entry.decode ("utf8").strip ())
            lexicon [word].append (pron)
    return lexicon

if __name__ == "__main__" :
    import argparse
    lexicon = LoadLexicon (sys.argv [1])
    ProcessAligned (sys.argv [2], lexicon)

When I run it against the reference lexicon and resulting aligned corpus, all pronunciations from the original are found. This again makes me think that it may be an issue related to spaces in the read-in lexicon. Lemme know!
|
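For readers who prefer to avoid the perl one-liner, the lexicon formatting step in the comment above can be sketched in Python 3. This is a rough equivalent, not part of Phonetisaurus: the function name `format_entry` is made up for illustration, and unlike the one-liner it returns `None` for blank lines instead of emitting an empty entry.

```python
import re


def format_entry(line):
    """Turn one cmudict line into 'word<TAB>phoneme phoneme ...'.

    Mirrors the perl one-liner: drop the first variant marker such as
    "(2)", collapse whitespace, and join the word and its pronunciation
    with a single tab. (Hypothetical helper, not a Phonetisaurus API.)
    """
    # Remove only the first "(N)" marker, as the perl s/// without /g does.
    line = re.sub(r"\([0-9]+\)", "", line, count=1)
    tokens = line.split()  # split() also trims leading/trailing whitespace
    if not tokens:
        return None  # blank line: nothing to emit
    return tokens[0] + "\t" + " ".join(tokens[1:])


if __name__ == "__main__":
    print(format_entry("abandoned(2)  AH0 B AE1 N D AH0 N D\n"))
```

A stray space or a missing tab in the formatted lexicon is exactly the kind of problem suspected above, so checking a few formatted lines by eye before training may be worthwhile.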
I am using git revision 195f31-dirty.
The word OVERAWE is not in CMUDict. I computed its transcription using Phonetisaurus' G2P model trained on CMUDict, and then I tried to align graphemes to phonemes, unsuccessfully, as you can see.
Thanks!
|
Ah OK, I did not quite understand at first. Can you use the python bindings or the script interface directly? This will actually provide back the original alignment from the decoding step, and will also retain the arc weights from the joint sequence LM, including backoff epsilon arcs:
https://github.com/AdolfVonKleist/Phonetisaurus/blob/master/src/include/PhonetisaurusScript.h#L119

The python bindings/script interface provide back the following result in my case:

$ ./script/phoneticize.py --model /tmp/experiment/train/model.fst --word overawe
0.00 OW1 V ER0 AA1
-------
o:OW1:5.37
v:V:0.84
e|r:ER0:0.06
<eps>:<eps>:1.85
<eps>:<eps>:0.49
a:AA1:5.51
<eps>:<eps>:0.29
w:_:4.21
<eps>:<eps>:0.29
e:_:2.63
<eps>:<eps>:1.06 |
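The per-arc output above can be turned back into the `graphemes}phonemes` form used by the aligner. The following Python 3 sketch assumes the line format is exactly what the sample shows (`graphemes:phonemes:weight`, with `<eps>:<eps>` marking backoff arcs); that format is inferred from this thread, not from a documented specification, and the function names are made up for illustration.

```python
def parse_arcs(lines, eps="<eps>"):
    """Collect (graphemes, phonemes, weight) tuples from phoneticize.py output.

    Assumes each arc line looks like "e|r:ER0:0.06"; header, separator and
    blank lines are skipped, as are pure backoff arcs ("<eps>:<eps>:w").
    """
    arcs = []
    for line in lines:
        parts = line.strip().rsplit(":", 2)
        if len(parts) != 3:
            continue  # e.g. "0.00 OW1 V ER0 AA1" or "-------"
        graph, phone, weight = parts
        if graph == eps and phone == eps:
            continue  # backoff epsilon arc, carries no alignment
        arcs.append((graph, phone, float(weight)))
    return arcs


def to_aligner_format(arcs):
    """Render arcs as space-separated 'graphemes}phonemes' tokens."""
    return " ".join(graph + "}" + phone for graph, phone, _ in arcs)


if __name__ == "__main__":
    sample = [
        "0.00 OW1 V ER0 AA1", "-------",
        "o:OW1:5.37", "v:V:0.84", "e|r:ER0:0.06",
        "<eps>:<eps>:1.85", "<eps>:<eps>:0.49",
        "a:AA1:5.51", "<eps>:<eps>:0.29",
        "w:_:4.21", "<eps>:<eps>:0.29",
        "e:_:2.63", "<eps>:<eps>:1.06",
    ]
    print(to_aligner_format(parse_arcs(sample)))
```

Splitting from the right keeps the weight and phoneme fields intact even if a grapheme token ever contained a colon.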
Hi,
I am not interested in getting transcriptions for some of the words, as some are entered manually by the user, although I do need to have alignments for all of them.
Why would the python wrapper behave differently from the executable?
In any case, I wrote a simple script to detect when there are alignment issues, and right now I am discarding them so that they do not break my pipeline.
Thanks
X.
|
Hi,
I am using phonetisaurus to align a grapheme input to its phonetic transcription.
For this I use the phonetisaurus-align tool with alignment models trained on CMUDict.
In a few cases I see that the output does not match the input, for example:
Input to the aligner:
OVERAWE OW1 V ER0 AA2
Output from the aligner:
O}OW1 V}V E}_ R}_ A}_ E}_
I had to work around it by counting the phonemes and graphemes in the input and output and doing something else when they do not match, but I was wondering whether phonetisaurus could raise an error/warning in these cases. Currently it exits normally, without any sign that an issue occurred.
Thanks!
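The workaround described in this post (comparing what went into the aligner with what came out) can be sketched in Python 3 as follows. It assumes the default separators visible in the example output (`}` between graphemes and phonemes, `|` inside multi-symbol chunks, `_` for a skip); the function names are hypothetical and this is not a check that phonetisaurus-align performs itself.

```python
def parse_alignment(line, seq_sep="|", pair_sep="}", skip="_"):
    """Split an aligner output line into flat grapheme and phoneme lists."""
    graphemes, phonemes = [], []
    for token in line.split():
        g, p = token.split(pair_sep, 1)
        graphemes.extend(x for x in g.split(seq_sep) if x != skip)
        phonemes.extend(x for x in p.split(seq_sep) if x != skip)
    return graphemes, phonemes


def alignment_matches(word, pron, aligned_line):
    """True if the alignment reproduces both the word and the pronunciation."""
    graphemes, phonemes = parse_alignment(aligned_line)
    return "".join(graphemes) == word and phonemes == pron.split()


if __name__ == "__main__":
    # The case reported in this issue: the W and two phonemes are missing.
    print(alignment_matches("OVERAWE", "OW1 V ER0 AA2",
                            "O}OW1 V}V E}_ R}_ A}_ E}_"))
```

Comparing the reconstructed strings rather than only the symbol counts also catches alignments that have the right length but the wrong symbols.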