fix: don't pass quotes to espeak

Previously, the text was wrapped in an additional pair of quotes that was passed
to espeak. This could change the phonemization in certain edge cases and
inserted an initial separator "_" that had to be removed afterwards.
Compare:
$ espeak-ng -q -b 1 -v en-us --ipa=1 '"A"'
_ˈɐ
$ espeak-ng -q -b 1 -v en-us --ipa=1 'A'
ˈeɪ
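
The effect is easy to reproduce without espeak. The following is a minimal sketch, assuming the arguments are handed to the child process as a list with no shell in between (as `subprocess` does by default); `echo_argument` is a hypothetical helper that is not part of this repository. It shows that manually added quotes are not stripped and reach the program as part of the text:

```python
import subprocess
import sys


def echo_argument(arg: str) -> str:
    """Return `arg` exactly as a child process receives it (no shell involved)."""
    # The child simply writes its first argument back to stdout.
    child_code = "import sys; sys.stdout.write(sys.argv[1])"
    result = subprocess.run(
        [sys.executable, "-c", child_code, arg],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


text = "A"
quoted = echo_argument('"' + text + '"')  # old behaviour: quotes become part of the input
plain = echo_argument(text)               # new behaviour: only the text is passed

print(repr(quoted))  # '"A"'
print(repr(plain))   # 'A'
```

Quoting is only needed when a shell parses the command line; with an argument list, the quotes are just two more characters for the program to phonemize.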

Fixes #2619
Enno Hermann 2023-11-22 15:14:40 +01:00
parent 29dede20d3
commit 52981e3c53
1 changed files with 4 additions and 8 deletions


@@ -185,20 +185,16 @@ class ESpeak(BasePhonemizer):
         if tie:
             args.append("--tie=%s" % tie)
 
-        args.append('"' + text + '"')
+        args.append(text)
         # compute phonemes
         phonemes = ""
         for line in _espeak_exe(self._ESPEAK_LIB, args, sync=True):
             logging.debug("line: %s", repr(line))
             ph_decoded = line.decode("utf8").strip()
-            # espeak need to skip first two characters of the retuned text:
-            #   version 1.48.03: "_ p_ɹ_ˈaɪ_ɚ t_ə n_oʊ_v_ˈɛ_m_b_ɚ t_w_ˈɛ_n_t_i t_ˈuː\n"
+            # espeak:
             #   version 1.48.15: " p_ɹ_ˈaɪ_ɚ t_ə n_oʊ_v_ˈɛ_m_b_ɚ t_w_ˈɛ_n_t_i t_ˈuː\n"
-            # espeak-ng need to skip the first character of the retuned text:
-            #   "_p_ɹ_ˈaɪ_ɚ t_ə n_oʊ_v_ˈɛ_m_b_ɚ t_w_ˈɛ_n_t_i t_ˈuː\n"
-
-            # dealing with the conditions descrived above
-            ph_decoded = ph_decoded[:1].replace("_", "") + ph_decoded[1:]
+            # espeak-ng:
+            #   "p_ɹ_ˈaɪ_ɚ t_ə n_oʊ_v_ˈɛ_m_b_ɚ t_w_ˈɛ_n_t_i t_ˈuː\n"
 
             # espeak-ng backend can add language flags that need to be removed:
             #   "sɛʁtˈɛ̃ mˈo kɔm (en)fˈʊtbɔːl(fr) ʒenˈɛʁ de- flˈaɡ də- lˈɑ̃ɡ."
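
For illustration, the two clean-up steps mentioned in this hunk can be sketched in isolation. This is a hedged sketch, not the repository's code: `strip_leading_separator` mirrors the removed workaround line, while `strip_language_flags` and its regular expression are assumptions about one way to drop flags such as `(en)` and `(fr)`.

```python
import re


def strip_leading_separator(ph_decoded: str) -> str:
    """Old workaround: drop a "_" only if it is the very first character."""
    return ph_decoded[:1].replace("_", "") + ph_decoded[1:]


def strip_language_flags(ph_decoded: str) -> str:
    """Remove parenthesised language flags such as "(en)" (assumed pattern)."""
    return re.sub(r"\(.+?\)", "", ph_decoded)


# With quoted input, espeak-ng emitted a leading separator that had to be removed.
print(strip_leading_separator("_p_ɹ_ˈaɪ_ɚ"))  # p_ɹ_ˈaɪ_ɚ
# Without the extra quotes the output is already clean, so the step is a no-op.
print(strip_leading_separator("p_ɹ_ˈaɪ_ɚ"))   # p_ɹ_ˈaɪ_ɚ
# Language flags are unrelated to the quoting issue and still need removing.
print(strip_language_flags("kɔm (en)fˈʊtbɔːl(fr) ʒenˈɛʁ"))  # kɔm fˈʊtbɔːl ʒenˈɛʁ
```

Because the leading separator no longer appears once the quotes are gone, the first helper becomes unnecessary, which is why the commit can delete that line rather than keep it as a guard.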