GitHub - saagar-parikh/soundsense: Audio-Visual Robot Learning

Directory structure:

models
    baselines
        cnnlstm
            model.py
            ...
        mulsa
            model.py
            ...
test
    test_*.py
model_inference.py

Keyboard teleop

w - move up
s - move down
a - move left
d - move right

i - extend arm
j - retract arm

l - roll right
j - roll left

m - close gripper
n - open gripper
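As a rough illustration, the bindings listed above could be held in a small lookup table. This is only a sketch and not the code in teleop_collector.py: `KEY_BINDINGS` and `actions_for` are hypothetical names. The bindings are copied exactly as listed, including `j`, which appears twice, so the helper returns a list rather than guessing which entry is intended.

```python
# Key bindings copied from the list above, in the order the README gives them.
# Stored as (key, action) pairs rather than a dict because `j` is listed twice;
# a dict would silently drop one of the two entries.
KEY_BINDINGS = [
    ("w", "move up"),
    ("s", "move down"),
    ("a", "move left"),
    ("d", "move right"),
    ("i", "extend arm"),
    ("j", "retract arm"),
    ("l", "roll right"),
    ("j", "roll left"),
    ("m", "close gripper"),
    ("n", "open gripper"),
]


def actions_for(key):
    """Return every action listed for `key` (hypothetical helper, not from the repo)."""
    return [action for k, action in KEY_BINDINGS if k == key.lower()]


if __name__ == "__main__":
    # Print the table in a compact form, one key per line.
    for key in "wsadijlmn":
        print(key, "->", ", ".join(actions_for(key)))
```

Keeping the table as data makes it easy to print a help message or to spot keys that are bound more than once.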

Instructions

You will need 3 terminals:

Terminal 1 - roslaunch stretch_core stretch_driver.launch (background terminal)

Split a terminal into two panels:

Terminal 2 - grab_audio.py

Terminal 3 - teleop_collector.py (control the robot from this one)

After collecting one data point, check that the audio was recorded correctly (for example, play music near the camera and confirm that it is audible in the recording). Once everything looks okay, you can start collecting the remaining data points.
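The audio check described above can be partly automated. The following is a minimal sketch, assuming the recording has been saved or exported as a 16-bit PCM WAV file; `summarize_wav`, `looks_silent`, and the threshold value are hypothetical and not part of the repository.

```python
import math
import struct
import wave


def summarize_wav(path):
    """Return duration, channel count, sample rate, and RMS level of a 16-bit WAV file."""
    with wave.open(path, "rb") as wav:
        if wav.getsampwidth() != 2:
            raise ValueError("expected 16-bit samples")
        channels = wav.getnchannels()
        rate = wav.getframerate()
        raw = wav.readframes(wav.getnframes())

    # WAV PCM data is little-endian; "<h" decodes signed 16-bit samples.
    count = len(raw) // 2
    samples = struct.unpack("<%dh" % count, raw[: count * 2])
    rms = math.sqrt(sum(s * s for s in samples) / count) if count else 0.0
    frames = count / channels if channels else 0
    return {
        "duration_s": frames / rate if rate else 0.0,
        "channels": channels,
        "sample_rate": rate,
        "rms": rms,
    }


def looks_silent(summary, rms_threshold=50.0):
    # rms_threshold is an arbitrary illustrative value, not taken from the repository.
    return summary["rms"] < rms_threshold
```

A recording made while music is playing should report a plausible duration and an RMS level well above that of a quiet room. Listening to the file remains the more reliable check.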

Audio README

roslaunch audio_capture capture.launch format:="wave"

Audio format: S16LE (signed 16-bit integers (int16), little-endian)
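To make the S16LE layout concrete, here is a small sketch that decodes raw bytes in this format using only the Python standard library. The function names are hypothetical, and the interleaved left/right ordering in `split_stereo` is an assumption about how the stereo stream is packed, not something stated in this README.

```python
import struct


def decode_s16le(raw):
    """Decode raw S16LE bytes into a list of ints in [-32768, 32767]."""
    count = len(raw) // 2  # two bytes per sample; a trailing odd byte is ignored
    return list(struct.unpack("<%dh" % count, raw[: count * 2]))


def to_float(samples):
    """Scale int16 samples to floats in [-1.0, 1.0)."""
    return [s / 32768.0 for s in samples]


def split_stereo(samples):
    """Split interleaved samples into (left, right); assumes L, R, L, R ordering."""
    return samples[0::2], samples[1::2]
```

For example, `decode_s16le(b"\x00\x00\xff\x7f\x00\x80")` yields `[0, 32767, -32768]`: the low byte of each sample comes first, and the high bit of the second byte carries the sign.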

Before running inference, run pavucontrol and ensure that the recording device is the stereo Osmo Action 3 camera.
Also ensure that the input device in settings is the Osmo Action 3 camera and that the output device is the internal headphones.

Folders and files

analysis
bash
collecting_datum
config
deprecated
models
test
.gitignore
Models.md
README.md
docker_commands.md
dockerfile
robot_node.py
robot_node_dagger.py
robot_node_separate.py
robot_node_test_with_gt.py
robot_node_vis.py
temp.png
test_collect_dagger.py
test_model.py
test_model_gt.py
test_model_ros.py
visualize_gt_inference.py
