NAO/Zora Architecture for Google Assistant
This repository contains the source of a Choregraphe project that allows robots by SoftBank Robotics to behave like a Google Assistant (referred to as GA from now on). The robot responds to ordinary voice commands/questions (e.g. "Who is Obama?", "What time is it?", "Turn off the light") just like a Google Home would.
The script provides visual feedback to the user: the eyes change color according to the robot's state. Red indicates an error, blue indicates that it is listening, and white indicates that it is idle. This was tested on a real robot, named Zora, which is based on the NAO robot by SoftBank.
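For reference, this kind of eye-color feedback can be reproduced with the NAOqi ALLeds module. The sketch below is only an illustration of the idea, assuming the NAOqi Python SDK is installed; the robot address and the exact color values are placeholders, not the ones used by the box.

```python
# Minimal sketch of the eye-color feedback described above.
# Assumes the NAOqi Python SDK; IP and colors are placeholder assumptions.
from naoqi import ALProxy

ROBOT_IP = "192.168.1.10"                  # hypothetical robot address
leds = ALProxy("ALLeds", ROBOT_IP, 9559)

leds.fadeRGB("FaceLeds", 0x0000FF, 0.3)    # blue: listening
leds.fadeRGB("FaceLeds", 0xFF0000, 0.3)    # red: error
leds.fadeRGB("FaceLeds", 0xFFFFFF, 0.3)    # white: idle
```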
The robot should have ALSA installed; note that on Zora it is already installed.
Beyond the default behavior of a GA device, with this project the robot can also run custom Choregraphe projects by saying something like "execute object recognition". The project has five sub-projects: Action Commands, Object Detection, Sentiment Analysis, Chatbot and Bingo.
All of the following steps are mandatory to use this project:
- Go to the folder where you want to clone the project
- Clone the project:
git clone https://github.com/conema/ZAGA.git
- Enter the cloned folder
- Automatic procedure
- Manual procedure
- Go to the Dockerfile folder and build the image:
docker build --tag="build1:ZAGA" .
- Wait for the build to finish (it can take some time)
- Create/open a project in the Actions Console
- Register a device model
- Download the credentials
- Use docker run -P -it build1:ZAGA to start the container
- Move the JSON file with the credentials into the container
- Use the google-oauthlib-tool to generate credentials:
google-oauthlib-tool --scope https://www.googleapis.com/auth/assistant-sdk-prototype --save --headless --client-secrets credentials.json
- To start a box, follow the corresponding section below.
Note: remember that with Docker you should use the published port shown by docker port <containerId>. You can get the container ID with docker ps.
[1] GA Server (By conema)
GA-Server is a simple script that works as a server: it receives audio chunks from a client and forwards them to Google Assistant. This script can be used (together with a client) when you want to integrate GA into a device that is not powerful enough, or into a device where the SDK cannot be installed.
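As a rough illustration of the client side, the sketch below streams a pre-recorded WAV file to GA-Server over TCP, assuming the default settings described later (port 4000, 16000 Hz audio). The server address, the file name and the raw-PCM chunking are assumptions for illustration, not a protocol documented by GA-Server.

```python
# Sketch of a client streaming audio chunks to GA-Server over TCP.
# Assumes the server listens on port 4000 and the WAV file is 16 kHz, 16-bit mono.
import socket
import wave

SERVER = ("192.168.1.20", 4000)   # hypothetical address of the GA-Server machine
FRAMES_PER_CHUNK = 1600           # 100 ms of audio at 16 kHz

with wave.open("question.wav", "rb") as wav, socket.create_connection(SERVER) as sock:
    chunk = wav.readframes(FRAMES_PER_CHUNK)
    while chunk:
        sock.sendall(chunk)       # forward raw PCM chunks to the server
        chunk = wav.readframes(FRAMES_PER_CHUNK)
```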
Note: Node.js >= 11 is needed; you can install it by following the guide on the Node.js website
Steps 1 to 5 are needed only if you don't have a registered project or the Google Assistant SDK credentials
- Create/open a project in the Actions Console
- Register a device model
- Download credentials.json
- Install the google-oauthlib-tool in a Python 3 virtual environment:
python3 -m venv env && \
env/bin/python -m pip install --upgrade pip setuptools && \
env/bin/pip install --upgrade "google-auth-oauthlib[tool]"
- Use the google-oauthlib-tool to generate credentials:
env/bin/google-oauthlib-tool --scope https://www.googleapis.com/auth/assistant-sdk-prototype --save --headless --client-secrets /path/to/credentials.json
git clone https://github.com/conema/GA-Server.git
cd GA-Server/
- Run npm install
- Open index.js and edit the host, port and input/output sample rate if needed (default settings: accept connections from all IPv4 addresses of the local machine, port 4000, audio with a 16000 Hz sample rate)
- Register Custom Device Actions
- Download the gactions-cli and move the executable into ../CDA/
- Update the custom actions to GA and set the testing mode. Change <project_id> with the ID of the Google Assistant model (the one created in point 2):
cd ../CDA/ && \
./gactions update --action_package actions.json --project <project_id> && \
./gactions test --action_package actions.json --project <project_id>
- Go into the GA-Server folder
- Run node index.js
If everything is working, a message like "TCP server listen on address: x.w.y.z:p" should appear. This means that the server is ready to receive audio chunks from a client.
- Open the Choregraphe project, right click on the GA box, click Set parameter and set the IP of the computer as IP and 4000 as port.
- Start the behavior and, after saying "Hey Zora", wait for the beep and for the eyes to become blue, then say "What time is it?"
[2] Humanoid robot action commands with an ontology (By fspiga13)
This engine allows the NAO/Zora robot to execute natural language commands spoken by the user. To provide the robot with knowledge, we have defined a robot action ontology. The ontology is fed to an NLP engine that performs a machine reading of the input text (in natural language) given by the user and tries to identify action commands for the robot to execute.
A video showing the proposed system and the knowledge that is possible to extract from the human interaction with Zora is available here.
More information about this project can be found here.
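To give a flavor of what the text-to-RDF step produces, the toy sketch below builds a tiny RDF description of an action command with rdflib (installed below). The namespace and property names are invented for illustration and are not the project's actual ontology.

```python
# Toy example: representing the command "Zora, raise your right arm" as RDF.
# Namespace and predicates are made up; the real ontology differs.
from rdflib import Graph, Literal, Namespace, URIRef

ACT = Namespace("http://example.org/zora/actions#")   # hypothetical namespace
g = Graph()

cmd = URIRef("http://example.org/zora/commands#cmd1")
g.add((cmd, ACT.action, Literal("raise")))
g.add((cmd, ACT.bodyPart, Literal("right arm")))

print(g.serialize(format="turtle"))
```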
The project should work with Python 2.7 and Python 3, but it has been tested only with Python 3. Java 8 is also needed.
Note: this project uses the python command to start scripts, so a python executable should be present in the environment.
- Install required modules
pip3 install requests &&\
pip3 install hdfs &&\
pip3 install bs4 &&\
pip3 install rdflib &&\
pip3 install nltk &&\
pip3 install graphviz &&\
pip3 install stanfordcorenlp &&\
pip3 install networkx &&\
pip3 install matplotlib
- Download Stanford CoreNLP and move the unzipped folder into the textToRdf folder
- Run the CoreNLP server
cd ZoraAC/textToRdf/stanford-corenlp-full-2018-10-05/ && \
java -mx6g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
Note: Using Stanford locally requires about 4GB of free RAM
- Test the RDF creation tool; it should return an RDF graph. Replace <sentence> with an input sentence, like "Zora, raise your right arm". Note that this can be very slow.
cd ../src/ && \
python xproject.py -t "<sentence>"
- Start ZoraNlpReasoner.jar and use it to test actions without NAO/Zora
cd ../../ && \
java -jar ZoraNlpReasoner.jar
- Write commands, like "Hey Zora, raise your right arm" or "Hey Zora, move your head to the left"
- Run the CoreNLP server
cd ZoraAC/textToRdf/stanford-corenlp-full-2018-10-05/ && \
java -mx6g -cp "*" edu.stanford.nlp.pipeline.StanfordCoreNLPServer -port 9000 -timeout 15000
Note: Using Stanford locally requires about 4GB of free RAM
- Start ZoraNlpReasoner as server
cd ../../ && \
java -jar ZoraNlpReasoner.jar use zora
- Open the Choregraphe project, right click on the AC box, click Set parameter and set as URL the URL of the previous server (something like http://<IP>:5003, where IP is the address of the computer where ZoraNlpReasoner.jar is running)
- Start the behavior and, after saying "Hey Zora", wait for the beep and for the eyes to become blue, then say "execute ontology"
- Say or write the command in the dialog console, like "Hey Zora, raise your right arm" or "Hey Zora, move your head to the left"
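For a quick check without Choregraphe, a request along the lines of the sketch below could be sent to the running ZoraNlpReasoner server. Only the http://<IP>:5003 address comes from the steps above; the HTTP method, path and parameter name are assumptions.

```python
# Hypothetical test request to the ZoraNlpReasoner server started above.
# Only the host/port are documented; route and payload shape are guesses.
import requests

REASONER_URL = "http://192.168.1.20:5003/"   # machine running ZoraNlpReasoner.jar

resp = requests.post(REASONER_URL, data={"command": "Hey Zora, raise your right arm"})
print(resp.status_code, resp.text)
```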
[3] Object recognition (By fabseulo)
This application is a server that performs object detection with NAO/Zora, using models loaded in TensorFlow ModelServer. The robot takes a picture and sends it to the server, which answers with the recognized object. The robot asks the user whether its guess is right: if so, it makes happy gestures, otherwise it makes sad ones. The box stops when the user says "stop".
Note: the script found in this repository is a modified version of fabseulo's; the original version will not work with this project.
Python 3 is required to run this script. Tested on Ubuntu.
- Install required modules
pip3 install tensorflow && \
pip3 install grpcio && \
pip3 install numpy && \
pip3 install scipy && \
pip3 install object-detection && \
pip3 install hdfs && \
pip3 install tensorflow-serving-api && \
pip3 install flask && \
pip3 install flask_restful && \
pip3 install imageio
- Install TFX via APT (more installation alternatives here)
- Add TensorFlow Serving distribution URI as a package source
echo "deb [arch=amd64] http://storage.googleapis.com/tensorflow-serving-apt stable tensorflow-model-server tensorflow-model-server-universal" | tee /etc/apt/sources.list.d/tensorflow-serving.list && \ curl https://storage.googleapis.com/tensorflow-serving-apt/tensorflow-serving.release.pub.gpg | apt-key add
- Install and update TensorFlow ModelServer
apt-get update && apt-get install tensorflow-model-server
- Compile the protobuf files
- Install protobuf compiler (more builds here)
wget https://github.com/google/protobuf/releases/download/v3.3.0/protoc-3.3.0-linux-x86_64.zip && \
unzip protoc-3.3.0-linux-x86_64.zip
- Go to the folder where you want to download the protobuf files and clone them:
git clone https://github.com/tensorflow/models.git
- Enter the folder where the files have been cloned and compile them. Change <PROTOC_FOLDER> with the absolute path where you unzipped protoc-3.3.0-linux-x86_64.zip:
cd models/research/ && \
<PROTOC_FOLDER>/bin/protoc object_detection/protos/*.proto --python_out=.
- Add the compiled protos to PYTHONPATH. Change <PROTOS_COMPILED> with the absolute path of the folder where you put the TensorFlow models:
echo "export PYTHONPATH=$PYTHONPATH:<PROTOS_COMPILED>/research:<PROTOS_COMPILED>/research/slim" >> ~/.bashrc && \
source ~/.bashrc
- Download the models:
cd ../../ZoraOD/
mkdir -p Models/coco_model/1/
mkdir -p Models/people_model/1/
mkdir -p Models/pets_model/1/
wget https://github.com/hri-unica/Zora-Object-Detection/raw/master/Models/coco_model/1/saved_model.pb -O Models/coco_model/1/saved_model.pb
wget https://github.com/hri-unica/Zora-Object-Detection/raw/master/Models/people_model/1/saved_model.pb -O Models/people_model/1/saved_model.pb
wget https://github.com/hri-unica/Zora-Object-Detection/raw/master/Models/pets_model/1/saved_model.pb -O Models/pets_model/1/saved_model.pb
- Go to the ZoraOD/ folder
- Edit model_server.config: in rows 4, 9 and 14 you'll need to change <PATH_TO_PROJECT> with the absolute path of the folder where you downloaded/cloned this project
- Start TFX, remembering to change <PATH_TO_PROJECT>:
tensorflow_model_server --port=4001 --model_config_file='<PATH_TO_PROJECT>/ZoraOD/model_server.config'
- Run test.py to check that everything is working:
python3 test.py --image_path=Images_test/harry_meghan.jpg
If the execution finishes without errors, the script should return a predicted string and the image with bounding boxes should be created in the Images_bbx folder.
If Hadoop is running, the application saves a log file and the predicted images into HDFS.
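The snippet below is a minimal sketch of how such a save could be done with the hdfs package installed above; the WebHDFS URL, the user and the HDFS paths are placeholders, not the ones the application actually uses.

```python
# Sketch of saving a predicted image and a log line to HDFS via WebHDFS.
# The namenode URL, user and paths are placeholder assumptions.
from hdfs import InsecureClient

client = InsecureClient("http://namenode:9870", user="hadoop")
client.upload("/zora/od/images/harry_meghan.jpg", "Images_bbx/harry_meghan.jpg")
client.write("/zora/od/log.txt", data="harry_meghan.jpg -> person\n",
             encoding="utf-8", overwrite=True)
```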
- Go into the ZoraOD/ folder
- Edit model_server.config: in rows 4, 9 and 14 you'll need to change <PATH_TO_PROJECT> with the absolute path of the folder where you downloaded/cloned this project
- Start TFX, remembering to change <PATH_TO_PROJECT>:
tensorflow_model_server --port=4001 --model_config_file='<PATH_TO_PROJECT>/ZoraOD/model_server.config'
- Start the image receiver server located in the Object Detection folder:
python3 image_receiver.py
- Open the Choregraphe project, right click on the OD box, click Set parameter and set as URL the URL of the previous server (something like http://<IP>:4002, where IP is the address of the computer where image_receiver.py is running)
- Start the behavior and, after saying "Hey Zora", wait for the beep and for the eyes to become blue, then say "execute object detection"
- Say or write the text in the dialog box and follow the NAO/Zora instructions.
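As a rough way to test the image receiver without the robot, the sketch below posts a sample picture to it. Only the port 4002 address comes from the configuration above; the endpoint path and the form field name are assumptions.

```python
# Hypothetical test client for image_receiver.py (route and field name are guesses).
import requests

RECEIVER_URL = "http://192.168.1.20:4002/"   # machine where image_receiver.py runs

with open("Images_test/harry_meghan.jpg", "rb") as img:
    resp = requests.post(RECEIVER_URL, files={"image": img})
print(resp.status_code, resp.text)   # should contain the recognized object, if the assumptions hold
```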
[4] Sentiment analysis
NAO/Zora can automatically understand the polarity of what the user says. Based on the sentiment, the robot makes a neutral, positive or negative animation.
- Apache Maven needs to be installed.
- Download glove.6B and place all the text files into ZoraSA/WordVectors/glove.6B/
- Download SentiWordNet_3.0
cd ZoraSA/ &&\
wget -O SentiWordNet_3.0.txt https://github.com/aesuli/SentiWordNet/raw/master/data/SentiWordNet_3.0.0.txt
- Download RNN models
cd BUPPolarityDetection &&\
wget https://github.com/hri-unica/Nao-Zora-Polarity/raw/master/BUPPolarityDetection/en-rnn.zip &&\
wget https://github.com/hri-unica/Nao-Zora-Polarity/raw/master/BUPPolarityDetection/it-rnn.zip &&\
wget https://github.com/hri-unica/Nao-Zora-Polarity/raw/master/BUPPolarityDetection/subj-rnn.zip
- Download opennlp files
cd opennlp &&\
wget http://opennlp.sourceforge.net/models-1.5/en-ner-date.bin &&\
wget http://opennlp.sourceforge.net/models-1.5/en-ner-location.bin &&\
wget http://opennlp.sourceforge.net/models-1.5/en-ner-organization.bin &&\
wget http://opennlp.sourceforge.net/models-1.5/en-parser-chunking.bin &&\
wget http://opennlp.sourceforge.net/models-1.5/en-pos-maxent.bin &&\
wget http://opennlp.sourceforge.net/models-1.5/en-pos-perceptron.bin &&\
wget http://opennlp.sourceforge.net/models-1.5/en-sent.bin &&\
wget http://opennlp.sourceforge.net/models-1.5/en-token.bin &&\
wget https://github.com/aciapetti/opennlp-italian-models/raw/master/models/it/it-pos-maxent.bin &&\
wget https://github.com/aciapetti/opennlp-italian-models/raw/master/models/it/it-pos_perceptron.bin &&\
wget https://github.com/aciapetti/opennlp-italian-models/raw/master/models/it/it-sent.bin &&\
wget https://github.com/aciapetti/opennlp-italian-models/raw/master/models/it/it-token.bin
- Export Maven opts
echo 'export MAVEN_OPTS="-Xmx2G -Dorg.bytedeco.javacpp.maxbytes=10G -Dorg.bytedeco.javacpp.maxphysicalbytes=10G"' >> ~/.bashrc &&\
source ~/.bashrc
- Go into the ZoraSA folder
- Start the polarity detection service. <PATH_TO_MAVEN> should be changed to the directory of the locally installed Maven:
cd BUPPolarityDetection/ && \
<PATH_TO_MAVEN>/bin/mvn jetty:run -Djetty.http.port=8080
Note: if you're using the Docker image, <PATH_TO_MAVEN> is /root/apache-maven-3.6.1/
- Open the Choregraphe project, right click on the SA box, click Set parameter and set as URL the URL of the previous server (something like http://<IP>:8080/sa/service, where IP is the address of the computer where Jetty is running)
- Start the behavior and, after saying "Hey Zora", wait for the beep and for the eyes to become blue, then say "execute sentiment analysis"
- Say or write the text in the dialog box
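To check the service without the robot, a request along these lines could be sent to the /sa/service URL used by the SA box; the parameter name and HTTP method are assumptions.

```python
# Hypothetical test request to the polarity detection service (parameter name is a guess).
import requests

SA_URL = "http://192.168.1.20:8080/sa/service"   # machine where Jetty is running

resp = requests.get(SA_URL, params={"text": "I really love this robot"})
print(resp.status_code, resp.text)   # expected: a polarity label, if the assumptions hold
```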
[5] Chatbot
Zora/NAO interprets and responds to statements made by users in ordinary natural language, using a seq2seq model.
- Install modules
apt-get install graphviz &&\
pip3 install flask_jsonpify &&\
pip3 install keras &&\
pip3 install theano &&\
pip3 install pydot
- Download the nltk data (from bash):
python << END
import nltk
nltk.download('punkt')
END
- Download the models
- Go to the ZoraCB folder
- Download:
wget https://github.com/hri-unica/Nao-Zora-conversational-agent/raw/master/my_model_weights.zip.001 && \
wget https://github.com/hri-unica/Nao-Zora-conversational-agent/raw/master/my_model_weights.zip.002 && \
wget https://github.com/hri-unica/Nao-Zora-conversational-agent/raw/master/my_model_weights20.zip.001 && \
wget https://github.com/hri-unica/Nao-Zora-conversational-agent/raw/master/my_model_weights20.zip.002 && \
wget https://github.com/hri-unica/Nao-Zora-conversational-agent/raw/master/my_model_weights_bot.zip.001 && \
wget https://github.com/hri-unica/Nao-Zora-conversational-agent/raw/master/my_model_weights_bot.zip.002 && \
wget https://github.com/hri-unica/Nao-Zora-conversational-agent/raw/master/my_model_weights_bot.zip.003
- Merge and extract:
cat my_model_weights.zip* > my_model_weights.zip && \
cat my_model_weights_bot.zip* > my_model_weights_bot.zip && \
cat my_model_weights20.zip* > my_model_weights20.zip && \
unzip my_model_weights.zip && \
unzip my_model_weights_bot.zip && \
unzip my_model_weights20.zip
- Go into the ZoraCB folder
- Start the webapp:
python3 webapp.py
- Open the Choregraphe project, right click on the Chatbot box, click Set parameter and set as URL the URL of the previous server (something like http://<IP>:4003/chatbot, where IP is the address of the computer where webapp.py is running)
- Start the behavior and, after saying "Hey Zora", wait for the beep and for the eyes to become blue, then say "execute chatbot"
- Say or write the text in the dialog box
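The chatbot webapp can be probed in a similar way. In the sketch below only the http://<IP>:4003/chatbot address comes from the configuration above; the parameter name and method are assumptions.

```python
# Hypothetical test request to the chatbot webapp (parameter name is a guess).
import requests

CHATBOT_URL = "http://192.168.1.20:4003/chatbot"   # machine where webapp.py runs

resp = requests.get(CHATBOT_URL, params={"sentence": "How are you today?"})
print(resp.status_code, resp.text)   # expected: the seq2seq reply, if the assumptions hold
```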
[6] Bingo
Bingo is a self-contained package that works out of the box without any configuration and plays Bingo with the user. It behaves as follows (a sketch of this game logic is given after the list):
- NAO/Zora explains the rules to play with her;
- She starts to say random bingo numbers until she hears "bingo", "line", "stop" or "repeat":
- If "bingo" or "line" is said, NAO/Zora asks the user to dictate their numbers; after 5 (for line) or 15 (for bingo) have been said, she stops the user and repeats, for confirmation, all the numbers the user said. If the user confirms the numbers, NAO/Zora checks whether they are among the extracted ones: if so, the user wins, otherwise NAO/Zora starts saying numbers again. If the user does not confirm the numbers, NAO/Zora asks them to dictate the numbers again;
- If stop is said, the game stops;
- If repeat is said, NAO/Zora repeats the last number that she said and the game continues.
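The sketch below reproduces the core of this game logic in plain Python. It is only an illustration of the rules above, not the code of the box; speech input/output is replaced by the console.

```python
# Illustration of the Bingo rules described above: the robot draws numbers until
# the user claims "line" (5 numbers) or "bingo" (15 numbers), then verifies them.
import random

def play_bingo():
    pool = list(range(1, 91))          # assumed 1-90 number pool
    random.shuffle(pool)
    drawn = []
    while pool:
        number = pool.pop()
        drawn.append(number)
        print("Number:", number)
        claim = input("Say 'bingo', 'line', 'repeat', 'stop' or press Enter: ").strip().lower()
        if claim == "stop":
            return
        if claim == "repeat":
            print("Last number was:", number)
            continue
        if claim in ("line", "bingo"):
            needed = 5 if claim == "line" else 15
            while True:
                user_numbers = [int(input("Dictate a number: ")) for _ in range(needed)]
                print("For confirmation, you said:", user_numbers)
                if input("Confirm? (y/n): ").strip().lower() == "y":
                    break
                # not confirmed: the user dictates the numbers again
            if all(n in drawn for n in user_numbers):
                print("You win!")
                return
            print("Some numbers were not drawn; the game continues.")

play_bingo()
```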
- Start the behavior and, after saying "Hey Zora", wait for the beep and for the eyes to become blue, then say "execute bingo"
- Say or write the text in the dialog box