
Optispeech Text Processor

This project implements a multi-language text processor for Bookbot Optispeech TTS; it handles English, Swahili, and Indonesian input. It uses the TextProcessor class to process text and generate input IDs for various language models.

Installation

To install the package, run: pip install git+https://github.com/bookbot-hive/bookbot-tts-text-processor.git

To install with a specific version, run: pip install git+https://github.com/bookbot-hive/bookbot-tts-text-processor.git@<version>

After running the above command, make sure that gruut[sw] is installed. If it's not, install it manually by running: pip install -f 'https://synesthesiam.github.io/prebuilt-apps/' gruut[sw]

Building the Package

If you want to build the package from source, follow these steps:

  1. Clone the repository:

    git clone https://github.com/bookbot-hive/bookbot-tts-text-processor.git
    cd bookbot-tts-text-processor
    
  2. Install the package:

    pip install .
    

    Or, if you want to install it in editable mode for development:

    pip install -e .
    

Usage

Here's a basic example of how to use the TextProcessor:

import os

from text_processor import TextProcessor

model_dirs = {
   "en": "bookbot/roberta-base-emphasis-onnx-quantized",
   "sw": "",  # Swahili emphasis model not yet available (see To Do)
   "id": ""   # Indonesian emphasis model not yet available (see To Do)
}
db_paths = {
   "en": "/home/s44504/3b01c699-3670-469b-801f-13880b9cac56/Emphasizer/data/words_emphasis_lookup_mixed.json",
   "sw": "",  # Swahili lookup not yet available (see To Do)
   "id": ""   # Indonesian lookup not yet available (see To Do)
}

cosmos_config = {
   "url": os.getenv("COSMOS_DB_URL"),
   "key": os.getenv("COSMOS_DB_KEY"),
   "database_name": "Bookbot"
}

# English
model = TextProcessor(model_dirs["en"], db_paths["en"], language="en", use_cosmos=False, cosmos_config=cosmos_config, emphasize_text='Claude')
result = model.get_input_ids("Hello <wave> world <listen> how are you? <headLean>", phonemes=False, return_phonemes=True, push_oov_to_cosmos=True, add_blank_token=True)
print(f"Result: {result}")

# English Phoneme input
phoneme = "hɛlˈoʊ mˈaɪ nˈeɪm ˈɪz"
result = model.get_input_ids(phoneme, phonemes=True, return_phonemes=True, push_oov_to_cosmos=False, add_blank_token=True)
print(f"Result: {result}")

# Swahili Word input
model = TextProcessor(model_dirs["sw"], db_paths["sw"], language="sw", use_cosmos=False, cosmos_config=cosmos_config)
result = model.get_input_ids("Jana nilitembelea mji wa [Nairobi]. Niliona majengo [marefu] na magari mengi.", phonemes=False, return_phonemes=True, push_oov_to_cosmos=False, add_blank_token=True)
print(f"Result: {result}")

# Indonesian Word input
model = TextProcessor(model_dirs["id"], db_paths["id"], language="id", use_cosmos=False, cosmos_config=cosmos_config)
result = model.get_input_ids("Halo nama saya Budi. Siapa [nama] kamu?", phonemes=False, return_phonemes=True, push_oov_to_cosmos=False, add_blank_token=True)
print(f"Result: {result}")

Output:

{'phonemes': 'hɛlˈoʊ! mˈaɪ nˈeɪm ˈɪz "lˈæd"ɪdə....!', 'input_ids': [23, 47, 27, 59, 4, 3, 28, 55, 3, 29, 57, 28, 3, 67, 38, 3, 5, 27, 61, 19, 5, 49, 19, 45, 12, 12, 12, 12, 4, 3]}
{'phonemes': 'hɛlˈoʊ mˈaɪ nˈeɪm ˈɪz', 'input_ids': [23, 47, 27, 59, 3, 28, 55, 3, 29, 57, 28, 3, 67, 38, 3]}
{'phonemes': 'ʄɑnɑ nilitɛᵐɓɛlɛɑ mʄi wɑ "nɑiɾɔɓi". niliɔnɑ mɑʄɛᵑgɔ "mɑɾɛfu" nɑ mɑɠɑɾi mɛᵑgi.', 'input_ids': [44, 35, 23, 35, 3, 23, 18, 21, 18, 26, 39, 46, 39, 21, 39, 35, 3, 22, 44, 18, 3, 30, 35, 3, 5, 23, 35, 18, 42, 37, 36, 18, 5, 12, 3, 23, 18, 21, 18, 37, 23, 35, 3, 22, 35, 44, 39, 47, 37, 3, 5, 22, 35, 42, 39, 16, 28, 5, 3, 23, 35, 3, 22, 35, 40, 35, 42, 18, 3, 22, 39, 47, 18, 12, 3]}
{'phonemes': 'halo nama saja budi. siapa "nama" kamu?', 'input_ids': [23, 16, 27, 30, 3, 29, 16, 28, 16, 3, 33, 16, 38, 16, 3, 17, 35, 19, 24, 12, 3, 33, 24, 16, 31, 16, 3, 5, 29, 16, 28, 16, 5, 3, 26, 16, 28, 35, 15, 3]}

Be aware that setting use_cosmos=True makes Cosmos DB your lookup table; if you turn it off, only the local JSON file is used.

Parameters

TextProcessor Initialization

  • emphasis_model_path: Path to the emphasis model directory; this model handles phoneme emphasis.
  • db_path: Path to the database file for word emphasis lookup.
  • language: The language to use (default is "en" for English).
  • use_cosmos: Boolean flag to use emphasisIPA from Azure Cosmos DB (default is False).
  • cosmos_config: Configuration dictionary for Azure Cosmos DB connection.
  • animation_tags_path: Path to the animation tags CSV file.
  • emphasize_text: The LLM used to add emphasis to the text; can be 'Claude' or 'GPT' (default is None). A minimal initialization sketch follows this list.
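
A minimal initialization sketch covering the parameters above; the shortened db_path and the animation_tags_path value are hypothetical placeholders, and cosmos_config is assumed to be optional when use_cosmos=False:

from text_processor import TextProcessor

processor = TextProcessor(
    "bookbot/roberta-base-emphasis-onnx-quantized",  # emphasis_model_path
    "words_emphasis_lookup_mixed.json",              # db_path (placeholder)
    language="en",
    use_cosmos=False,
    cosmos_config=None,                        # only required when use_cosmos=True
    animation_tags_path="animation_tags.csv",  # hypothetical animation tags CSV
    emphasize_text=None,                       # or 'Claude' / 'GPT'
)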

get_input_ids Method

  • text: The input text to process.
  • phonemes: Boolean flag indicating if the input is phonemes (default is False).
  • return_phonemes: Boolean flag to return phonemes (default is True).
  • push_oov_to_cosmos: Boolean flag to push out-of-vocabulary (OOV) emphasis to Cosmos DB (default is False). To enable this, you must set use_cosmos=True during initialization.
  • add_blank_token: Boolean flag to add blank tokens at the end of the input_ids (default is True).
  • normalize: Boolean flag to normalize the input text (default is False). A call sketch follows this list.
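
A sketch of a call exercising the flags above, reusing the processor from the initialization sketch (the input text is arbitrary):

result = processor.get_input_ids(
    "Hello world, how are you?",
    phonemes=False,            # input is plain text, not IPA phonemes
    return_phonemes=True,      # include the phoneme string in the result
    push_oov_to_cosmos=False,  # requires use_cosmos=True at initialization
    add_blank_token=True,      # append the trailing blank token to input_ids
    normalize=True,            # normalize the raw text before phonemization
)
print(result["phonemes"], result["input_ids"])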

Supported Languages

  • English (en)
  • Swahili (sw)
  • Indonesian (id)

Azure Cosmos DB Integration

This project can optionally integrate with Azure Cosmos DB. To use this feature, set use_cosmos=True and provide the necessary configuration in cosmos_config.
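
A sketch of a Cosmos-backed setup; the shortened db_path is a placeholder, and the configuration dictionary mirrors the one in the Usage section:

import os

from text_processor import TextProcessor

cosmos_config = {
    "url": os.getenv("COSMOS_DB_URL"),
    "key": os.getenv("COSMOS_DB_KEY"),
    "database_name": "Bookbot"
}

model = TextProcessor(
    "bookbot/roberta-base-emphasis-onnx-quantized",
    "words_emphasis_lookup_mixed.json",  # placeholder db_path
    language="en",
    use_cosmos=True,                     # use Cosmos DB as the emphasis lookup table
    cosmos_config=cosmos_config,
)

# With use_cosmos=True, out-of-vocabulary emphasis can also be pushed back to Cosmos DB:
result = model.get_input_ids("Hello world", push_oov_to_cosmos=True)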

Environment Variables

The following environment variables are used:

  • COSMOS_DB_URL: The URL for your Azure Cosmos DB instance.
  • COSMOS_DB_KEY: The access key for your Azure Cosmos DB instance.

To Do

  • Add Swahili and Indonesian emphasis models to the project
  • Add Swahili and Indonesian word-to-phoneme emphasis lookups
  • Support body language tag handling for Swahili and Indonesian

License

This project is licensed under the MIT License.
