Watson STT invocation

Related blog post: Watson Speech to Text Language Model Customization.

This project contains a bash script automation example for the IBM Cloud Watson Speech to Text service.

The automation contains two flows:

Basic usage: extract the text from an audio file saved in FLAC format using a base language model.
Customization of an existing language model for a specific domain; in this example, drums ;-)
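Both flows end in a recognize request against the Speech to Text service; the customization flow additionally passes the ID of the trained custom model. The following is a minimal sketch of that request, not the code of the project's script. The function names and the service URL argument are hypothetical; the `/v1/recognize` endpoint, the `model` and `language_customization_id` query parameters, and the `apikey:` basic authentication are taken from the public Speech to Text v1 API.

```shell
# Build the recognize URL (hypothetical helper, not part of the project).
# $1 = service URL from the service credentials
# $2 = base model name, for example en-US_BroadbandModel
# $3 = optional customization_id of a trained custom language model
recognize_url() {
  url="$1/v1/recognize?model=$2"
  if [ -n "${3:-}" ]; then
    url="$url&language_customization_id=$3"
  fi
  printf '%s\n' "$url"
}

# Send a FLAC file to the service and print the JSON response.
# $1 = IBM Cloud API key, $2 = recognize URL, $3 = path to the FLAC file
transcribe_flac() {
  curl -s -X POST -u "apikey:$1" \
    --header "Content-Type: audio/flac" \
    --data-binary "@$3" \
    "$2"
}
```

For the basic flow, call `recognize_url` with two arguments; for the customization flow, add the `customization_id` as the third argument and pass the resulting URL to `transcribe_flac`.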

Note: If you record your own voice, for example in M4A format, you can convert M4A to FLAC for free with Converio.

Prerequisites

The IBM Cloud CLI is installed.
A Watson Speech to Text service instance with a Plus plan has been created.
The cURL command-line tool is installed on the local computer.
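Before running the example, it can help to confirm that the required command-line tools are on the `PATH`. The helper below is a small sketch; the function name `check_tools` is hypothetical and not part of the project.

```shell
# Report whether each named command-line tool can be found on the PATH.
# Returns 0 if all tools are present, 1 if at least one is missing.
check_tools() {
  missing=0
  for tool in "$@"; do
    if command -v "$tool" >/dev/null 2>&1; then
      echo "found: $tool"
    else
      echo "missing: $tool"
      missing=1
    fi
  done
  return "$missing"
}
```

Usage: `check_tools ibmcloud curl` prints one line per tool and returns a non-zero status if either one is not installed.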

Execute the following steps to run the example.

Step 1: Clone the project

git clone https://github.com/thomassuedbroecker/watson-stt-invocation.git
cd watson-stt-invocation

Step 2: Configure the `.env` file

cp ./code/.env-template ./code/.env

Step 3: Set the correct values in the `.env` file

Create an IBM Cloud API key, then set the following values:

ROOTFOLDER="YOUR_PATH"
RESOURCE_GROUP="default"
REGION="us-south"
APIKEY="YOUR_IBMCLOUD_APIKEY"
S2T_SERVICE_INSTANCE_NAME="YOUR_S2T_SERVICE_NAME"
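A quick check can catch values that are still empty or still contain a `YOUR_...` placeholder from the template. The sketch below assumes the `.env` file uses plain `KEY="value"` lines as shown above; the function names `load_env` and `check_env_values` are hypothetical and may differ from how the project's script reads the file.

```shell
# Load KEY="value" pairs from an env file into the current shell.
load_env() {
  if [ ! -f "$1" ]; then
    echo "env file not found: $1" >&2
    return 1
  fi
  set -a        # export every variable assigned while sourcing
  . "$1"
  set +a
}

# Fail if any named variable is empty or still holds a YOUR_... placeholder.
check_env_values() {
  rc=0
  for name in "$@"; do
    eval "value=\${$name:-}"   # indirect lookup that also works in plain sh
    case "$value" in
      ""|YOUR_*)
        echo "please set $name in the .env file" >&2
        rc=1
        ;;
    esac
  done
  return "$rc"
}
```

Usage: `load_env ./code/.env && check_env_values ROOTFOLDER RESOURCE_GROUP REGION APIKEY S2T_SERVICE_INSTANCE_NAME`.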

Step 4: Invoke the bash automation

sh code/use-speech-to-text.sh

Example output

#*******************
# Customization flow
#*******************
#------------------
# Create and train a Custom Language Model
#------------------
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   160  100    61  100    99    170    277 --:--:-- --:--:-- --:--:--   458

customization_id: {"customization_id": "7868e363-4afa-4d64-96fd-c506774eebca"}
{"customizations": [{
   "owner": "d3443a47-877c-496d-95b9-f62bce50bb38",
   "base_model_name": "en-US_BroadbandModel",
   "customization_id": "7868e363-4afa-4d64-96fd-c506774eebca",
   "dialect": "en-US",
   "versions": ["en-US_BroadbandModel.v2020-01-16"],
   "created": "2022-11-18T13:32:44.945Z",
   "name": "MyDrums-1",
   "description": "MyDrums-demo",
   "progress": 0,
   "language": "en-US",
   "updated": "2022-11-18T13:32:44.945Z",
   "status": "pending"
}]}
{}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   104  100   104    0     0    319      0 --:--:-- --:--:-- --:--:--   330
Response: {
   "out_of_vocabulary_words": 1,
   "total_words": 43,
   "name": "drums1",
   "status": "analyzed"
}
Status: %-15s ( %d )
 analyzed 10
{"corpora": [{
   "out_of_vocabulary_words": 1,
   "total_words": 43,
   "name": "drums1",
   "status": "analyzed"
}]}

Train ...
{}
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   449  100   449    0     0   1537      0 --:--:-- --:--:-- --:--:--  1586
Response: {
   "owner": "d3443a47-XXX-XXXX-95b9-f62bce50bb38",
   "base_model_name": "en-US_BroadbandModel",
   "customization_id": "7868e363-XXX-XXXX-96fd-c506774eebca",
   "dialect": "en-US",
   "versions": ["en-US_BroadbandModel.v2020-01-16"],
   "created": "2022-11-XXX-XXXX",
   "name": "MyDrums-1",
   "description": "MyDrums-demo",
   "progress": 0,
   "language": "en-US",
   "updated": "2022-11-XXX-XXXX",
   "status": "training"
}
Status (training)
Status: %-15s ( %d )
 training 10
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   452  100   452    0     0   1293      0 --:--:-- --:--:-- --:--:--  1333
Response: {
   "owner": "d3443a47-XXX-XXXX-95b9-f62bce50bb38",
   "base_model_name": "en-US_BroadbandModel",
   "customization_id": "7868e363-XXX-XXXX-96fd-c506774eebca",
   "dialect": "en-US",
   "versions": ["en-US_BroadbandModel.v2020-01-16"],
   "created": "2022-11-XXX-XXXX",
   "name": "MyDrums-1",
   "description": "MyDrums-demo",
   "progress": 100,
   "language": "en-US",
   "updated": "2022-11-XXX-XXXX",
   "status": "available"
}
Status (available)
Status: %-15s ( %d )
 available 20
{"words": [{
   "display_as": "paradiddles",
   "sounds_like": ["paradiddles"],
   "count": 1,
   "source": ["drums1"],
   "word": "paradiddles"
}]}
{
   "owner": "d3443a47-XXX-XXXX-95b9-f62bce50bb38",
   "base_model_name": "en-US_BroadbandModel",
   "customization_id": "7868e363-XXX-XXXX-96fd-c506774eebca",
   "dialect": "en-US",
   "versions": ["en-US_BroadbandModel.v2020-01-16"],
   "created": "2022-11-XXX-XXXX",
   "name": "MyDrums-1",
   "description": "MyDrums-demo",
   "progress": 100,
   "language": "en-US",
   "updated": "2022-11-XXX-XXXX",
   "status": "available"
}
#------------------
# Verify a trained model by using an audio
#------------------
customization_id: 7868e363-XXX-XXXX-96fd-c506774eebca
basic_model: en-US_BroadbandModel

Test audio ...
 {
   "result_index": 0,
   "results": [
      {
         "final": true,
         "alternatives": [
            {
               "transcript": "it's great to play the drums The hi hat is something very special ",
               "confidence": 0.98
            }
         ]
      },
      {
         "final": true,
         "alternatives": [
            {
               "transcript": "it forms the basis for many rhythms syncopations are sometimes distributed with paradiddles and they are creating a fantastic rhythm together with the snare and the bass drum and a splash ",
               "confidence": 0.94
            }
         ]
      }
   ]
}
#*******************
# Basic flow
#*******************
{
   "result_index": 0,
   "results": [
      {
         "final": true,
         "alternatives": [
            {
               "transcript": "hi this is my test for Watson ",
               "confidence": 0.94
            }
         ]
      },
      {
         "final": true,
         "alternatives": [
            {
               "transcript": "speech to text ",
               "confidence": 0.99
            }
         ]
      },
      {
         "final": true,
         "alternatives": [
            {
               "transcript": "check it out ",
               "confidence": 0.99
            }
         ]
      }
   ]
} 
...
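The recognize responses in the output above carry the recognized text in the `transcript` fields. As a rough sketch, those lines can be filtered out with `sed`. This assumes the pretty-printed layout shown above, with one `"transcript"` entry per line and no escaped quotes inside the text; a JSON-aware tool would be more robust. The function name `extract_transcripts` is hypothetical.

```shell
# Print one transcript per line from a pretty-printed recognize response
# read on stdin. Trailing whitespace inside the transcript is dropped.
extract_transcripts() {
  sed -n 's/^[[:space:]]*"transcript":[[:space:]]*"\(.*[^[:space:]]\)[[:space:]]*",*[[:space:]]*$/\1/p'
}
```

Usage: `sh code/use-speech-to-text.sh | extract_transcripts` prints only the transcript texts from the script output.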

Additional information

List of used API calls:
