A TensorFlow implementation of the paper Convolutional Neural Networks for Sentence Classification.
The code is heavily based on:
- Yoon Kim's Theano implementation
- Denny Britz's TensorFlow implementation
- abhaikollara's TensorFlow implementation
This model is slightly different from the previous ones. It is possible to use several sets of pretrained word vectors, load a few different datasets, and choose between using the same filter during convolutions for all word embedding channels or a separate filter for each channel. Finally, it provides cross-validation splits for the datasets used (although cross-validation itself is not explicitly implemented). At the moment, when evaluating on a dev/test set, all words that did not appear in the training set are treated as unknown, even when pretrained vectors are used.
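As a rough illustration of the filter choice described above, the following NumPy sketch (not the repository's TensorFlow code) computes one feature map either with a single filter shared by all embedding channels or with a separate filter per channel. Summing the channel responses follows the multichannel description in the paper; the function name conv_feature_map, the shapes, and the random data are assumptions made for this example.

```python
import numpy as np


def conv_feature_map(channels, filters):
    """Slide a filter bank over a sentence and sum the channel responses.

    channels: array of shape (C, n, k) -> C embedding channels, n words, k dims
    filters:  array of shape (C, h, k) -> one (h, k) filter per channel
    Returns a feature map of length n - h + 1 (no padding, no bias, no activation).
    """
    num_channels, n, k = channels.shape
    _, h, _ = filters.shape
    out = np.zeros(n - h + 1)
    for i in range(n - h + 1):
        window = channels[:, i:i + h, :]      # (C, h, k) slice of h consecutive words
        out[i] = np.sum(window * filters)     # sum over channels, words and dims
    return out


# Illustrative data: 2 embedding channels, 7 words, 4-dimensional vectors.
rng = np.random.default_rng(0)
channels = rng.normal(size=(2, 7, 4))

# Option 1: the same (3, 4) filter is reused for every channel.
shared = rng.normal(size=(3, 4))
shared_filters = np.broadcast_to(shared, (2, 3, 4))
shared_map = conv_feature_map(channels, shared_filters)

# Option 2: each channel gets its own independently initialised filter.
separate_filters = rng.normal(size=(2, 3, 4))
separate_map = conv_feature_map(channels, separate_filters)

print(shared_map.shape, separate_map.shape)
```

With a shared filter the result equals applying that one filter to the sum of the channels, so the separate-filter variant is the one that lets the model weight each set of vectors differently.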
- Python 3
- TensorFlow 1.0
- numpy
- Change the arguments in conf.py according to your needs.
- Replace or change the code in main.py according to your preference.
- Run the code using
$ python main.py
There are several hyperparameters that need to be set prior to running, or else the program will crash. Currently the supported datasets and word vectors cannot be downloaded automatically, so you must download them yourself and point to their locations.
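Since a wrong path only shows up as a crash, it can help to check the locations before starting a run. The snippet below is a minimal sketch of such a check, not part of the repository; the function missing_paths and the dictionary keys and paths are hypothetical and should be replaced with whatever you actually set in your configuration.

```python
import os


def missing_paths(paths):
    """Return the (sorted) names whose path does not exist on disk."""
    return sorted(name for name, path in paths.items()
                  if not os.path.exists(path))


# Hypothetical names and locations, used here only for illustration.
required = {
    "dataset_dir": ".",                             # exists: current directory
    "word_vectors": "./no/such/dir/vectors.bin",    # assumed not to exist
}

missing = missing_paths(required)
if missing:
    print("Download these first or fix their paths:", ", ".join(missing))
```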
Variables and embeddings can also be inspected in TensorBoard (although this is not so well implemented). Just run
$ tensorboard --logdir ./runs/
if your current directory is the code folder.
- Stanford Sentiment Treebank (both binary and fine-grained)
- Movie Review Data (MR)
- Large Movie Review Dataset (IMDB)