An implementation of SSI
SSI is an implementation of
"Semi-Supervised Policy Initialization for Playing Games with Language Hints"
Tsu-Jui Fu and William Yang Wang
in North American Chapter of the Association for Computational Linguistics (NAACL) 2021 (Short)
First, the hint module H generates possible hints l for random states s. Given s, the policy module P rolls out and takes actions a. The reward module R then updates P based on the relevance between a and l. By seeing different s, P has the opportunity to learn from a variety of possible hints, and it ultimately serves as a better-initialized policy.
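The loop described above can be sketched with a small tabular toy. This is only an illustration under stated assumptions, not the code in `rl/ssi.py`: the state and action spaces, the names `hint_module`, `reward_module`, and `ssi_pretrain`, the exact-match relevance score, and the REINFORCE-style update are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import math
import random

NUM_STATES = 4    # toy state space (assumption, not from the paper)
NUM_ACTIONS = 3   # toy action space (assumption, not from the paper)


def hint_module(state):
    """Toy stand-in for H: return a hint l for state s.

    The hint is simply the index of the action it describes.
    """
    return state % NUM_ACTIONS


def reward_module(action, hint):
    """Toy stand-in for R: relevance between action a and hint l."""
    return 1.0 if action == hint else 0.0


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sample(probs, rng):
    """Draw an index from a categorical distribution."""
    u = rng.random()
    cum = 0.0
    for k, p in enumerate(probs):
        cum += p
        if u < cum:
            return k
    return len(probs) - 1


def ssi_pretrain(steps=2000, lr=0.5, seed=0):
    """Initialize a tabular softmax policy P from hints alone."""
    rng = random.Random(seed)
    logits = [[0.0] * NUM_ACTIONS for _ in range(NUM_STATES)]
    for _ in range(steps):
        s = rng.randrange(NUM_STATES)   # random state s
        l = hint_module(s)              # H proposes a hint l for s
        probs = softmax(logits[s])
        a = sample(probs, rng)          # P takes an action a
        r = reward_module(a, l)         # R scores relevance of a to l
        # REINFORCE-style step: r * grad of log pi(a|s) w.r.t. the logits.
        for k in range(NUM_ACTIONS):
            grad = (1.0 if k == a else 0.0) - probs[k]
            logits[s][k] += lr * r * grad
    return logits


def greedy_action(logits, state):
    """Action with the highest logit in the given state."""
    row = logits[state]
    return row.index(max(row))


if __name__ == "__main__":
    policy = ssi_pretrain()
    for s in range(NUM_STATES):
        print(s, hint_module(s), greedy_action(policy, s))
```

In this toy setting no environment reward is used at all: the policy is shaped only by how well its actions match the generated hints, which is the sense in which it is "initialized" before task training.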
This code is implemented with Python 2, PyTorch, and TensorFlow.
The following libraries are also required:
- Semi-Supervised Initialization (SSI)
python rl/ssi.py --lang_coeff=1.0 --lang_enc=onehot --model_dir=./learn_model
- Task Training
wget http://www.cs.utexas.edu/~pgoyal/ijcai19/train_lang_data.pkl -O ./data/train_lang_data.pkl
wget http://www.cs.utexas.edu/~pgoyal/ijcai19/test_lang_data.pkl -O ./data/test_lang_data.pkl
python rl/main.py --expt_id=ID_EXPT --descr_id=ID_DESCR --lang_coeff=1.0 --lang_enc=onehot --model_dir=./learn_model
@inproceedings{fu2021ssi,
author = {Tsu-Jui Fu and William Yang Wang},
title = {{Semi-Supervised Policy Initialization for Playing Games with Language Hints}},
booktitle = {North American Chapter of the Association for Computational Linguistics (NAACL)},
year = {2021}
}