Text tokenizer paragraph support #397
JacobDaneh asked this question in Q&A
Unfortunately, the native ICU text segmentation does not support paragraphs. You could try implementing your own text tokenizer that splits by paragraph. Alternatively, if you don't use a text tokenizer at all (or use an empty one that doesn't split the input), you will get something similar to paragraph tokens, because the HTML content iterator returns block elements by default. Let us know if you find a satisfying solution.
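Both suggestions can be sketched in plain Kotlin. This is only an illustration, not kotlin-toolkit code: paragraphRanges and wholeTextRange are hypothetical helper names, and the sketch assumes that a text tokenizer boils down to mapping a String to a list of character ranges and that paragraphs are separated by blank lines. Check the actual tokenizer interface in your toolkit version before wiring it in.

```kotlin
// Hypothetical helpers for illustration; none of these names come from kotlin-toolkit.

// Returns the character range of each paragraph in `text`.
// Assumption: paragraphs are separated by one or more blank lines.
fun paragraphRanges(text: String): List<IntRange> {
    val ranges = mutableListOf<IntRange>()
    // A separator is a line break followed by at least one more line break,
    // with optional spaces or tabs in between.
    val separator = Regex("""\r?\n[ \t]*(?:\r?\n[ \t]*)+""")
    var start = 0
    for (match in separator.findAll(text)) {
        addTrimmed(text, start, match.range.first, ranges)
        start = match.range.last + 1
    }
    // Whatever follows the last separator is the final paragraph.
    addTrimmed(text, start, text.length, ranges)
    return ranges
}

// "Empty" tokenizer: does not split at all and returns the whole input as one range.
fun wholeTextRange(text: String): List<IntRange> =
    if (text.isEmpty()) emptyList() else listOf(text.indices)

// Adds the span [from, until) to `out` after trimming surrounding whitespace.
// Spans that contain only whitespace are skipped.
private fun addTrimmed(text: String, from: Int, until: Int, out: MutableList<IntRange>) {
    var s = from
    var e = until
    while (s < e && text[s].isWhitespace()) s++
    while (e > s && text[e - 1].isWhitespace()) e--
    if (s < e) out.add(s until e)
}
```

If the content iterator already yields one block element per text chunk, wholeTextRange may be all that is needed. paragraphRanges only matters when a single chunk can contain several blank-line-separated paragraphs.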
Hi,
I have a customized TTS engine that performs better with paragraphs than sentences.
In swift-toolkit, the PublicationSpeechSynthesizer class lets us set the text tokenizer's unit parameter to paragraph, but kotlin-toolkit unfortunately supports only word and sentence.
How could I add paragraph as a text tokenizer unit?
Thank you for your help.