This repository is a modularized rewrite of Stanford CS224N Assignment 5: the character-level LSTM decoder for neural machine translation. I rewrote it to understand the model's components and, eventually, the tricks used to train it; a few notebooks are lightly annotated for my own reference.
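For orientation, the core idea is a small LSTM that is seeded from the word-level decoder's state and emits a word one character at a time. Below is a minimal PyTorch sketch of such a character decoder; the class and parameter names are illustrative, not this repo's actual API.

```python
import torch
import torch.nn as nn

class CharDecoder(nn.Module):
    """Minimal character-level LSTM decoder in the spirit of CS224N Assignment 5."""

    def __init__(self, hidden_size, char_embed_size, char_vocab_size, padding_idx=0):
        super().__init__()
        self.char_embedding = nn.Embedding(char_vocab_size, char_embed_size,
                                           padding_idx=padding_idx)
        self.lstm = nn.LSTM(char_embed_size, hidden_size)
        self.char_output_projection = nn.Linear(hidden_size, char_vocab_size)

    def forward(self, input_chars, dec_state=None):
        # input_chars: (seq_len, batch) of character indices; dec_state is
        # typically initialized from the word-level decoder's hidden state.
        x = self.char_embedding(input_chars)        # (seq_len, batch, embed)
        out, dec_state = self.lstm(x, dec_state)    # (seq_len, batch, hidden)
        scores = self.char_output_projection(out)   # (seq_len, batch, vocab)
        return scores, dec_state

# Tiny smoke test with made-up sizes.
decoder = CharDecoder(hidden_size=256, char_embed_size=50, char_vocab_size=96)
scores, state = decoder(torch.randint(1, 96, (7, 4)))  # 7 chars, batch of 4
print(scores.shape)  # torch.Size([7, 4, 96])
```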
Docker support is coming soon. Meanwhile:

- Clone the repository.
- Install the requirements with `pipenv install` if you only want to run the train and test tasks. If you also want to browse the notebooks, use `pipenv install --dev`.
- Download the Assignment 5 data from Stanford CS224N and place it in `nmt/datasets/data/`.
- Run tasks with `pipenv run sh tasks/<task-name>.sh`.
Possible tasks:

- `train_local.sh`: training on a small sample (equivalent to train-local-q2 from the assignment)
- `test_local.sh`: testing on a small sample (equivalent to test-local-q2 from the assignment); should produce a BLEU score of ~99.27
- `train.sh`: training on all data, on a GPU
- `test.sh`: testing on all data; should produce a BLEU score of ~29.40
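If you want to double-check the reported BLEU numbers against your own decoded outputs, a quick sanity check with NLTK's `corpus_bleu` might look like the following. The file paths are placeholders, not paths the tasks necessarily produce, and the repo's own scoring may be computed slightly differently.

```python
from nltk.translate.bleu_score import corpus_bleu

# Placeholder paths: substitute the hypothesis file your test task wrote
# and the matching reference file from the Assignment 5 data.
with open("outputs/test_outputs.txt") as hyp_file:
    hypotheses = [line.split() for line in hyp_file]
with open("nmt/datasets/data/test.en") as ref_file:
    references = [[line.split()] for line in ref_file]  # one reference each

print(f"BLEU: {corpus_bleu(references, hypotheses) * 100:.2f}")
```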
Requirements:

- Python 3.6 (if using the Pipfile; 3.6+ if using requirements.txt)
- Pipenv
References:

- Stanford CS224N
- Assignment 5 Handout
- Achieving Open Vocabulary Neural Machine Translation with Hybrid Word-Character Models - Minh-Thang Luong and Christopher Manning (2016)
- Character-Aware Neural Language Models - Kim et al. (2016)
- Attention? Attention! - Lilian Weng
- Dive into Deep Learning (d2l.ai) - Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola