We present the first attempt at using sequence-to-sequence neural networks to model text simplification (TS). Unlike previously proposed automated methods, our neural text simplification (NTS) systems can perform lexical simplification and content reduction simultaneously. An extensive human evaluation of the output shows that NTS systems achieve good grammaticality and meaning preservation of the output sentences, and a higher level of simplification than state-of-the-art automated TS systems. We train our models on the aligned Wikipedia corpus, using the 'good' and 'good partial' alignments.
@InProceedings{neural-text-simplification,
author = {Sergiu Nisioi and Sanja Štajner and Simone Paolo Ponzetto and Liviu P. Dinu},
title = {Exploring Neural Text Simplification Models},
booktitle = {{ACL} {(2)}},
publisher = {The Association for Computational Linguistics},
year = {2017}
}
- OpenNMT dependencies:
  - Install Torch
  - Install additional packages:
    luarocks install tds
- Check out this repository, including the submodules:
git clone --recursive https://github.com/senisioi/NeuralTextSimplification.git
- Download the released pre-trained models NTS and NTS-w2v (NOTE: due to recent changes in third-party software, the output of the released pre-trained models might not be identical to the results reported in the paper):
python src/download_models.py ./models
- Run translate.sh from the scripts dir:
cd src/scripts
./translate.sh
- Check the predictions in the results directory:
cd ../../results_NTS
- Run the automatic evaluation metrics:
  - Install the Python requirements (only nltk is needed):
    pip install -r src/requirements.txt
  - Run the evaluate script:
    python src/evaluate.py ./data/test.en ./data/references/references.tsv ./predictions/
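As a rough illustration of what the evaluation step computes, below is a minimal corpus-level BLEU sketch using nltk. The layout of references.tsv (one tab-separated group of references per source sentence), the prediction file name, and the whitespace tokenization are assumptions made for the example; src/evaluate.py remains the authoritative implementation.

```python
# Minimal BLEU sketch, approximating part of what src/evaluate.py computes.
# Requires nltk (installed via src/requirements.txt above).
from nltk.translate.bleu_score import corpus_bleu

# Assumption: one tab-separated group of reference sentences per line.
with open('./data/references/references.tsv', encoding='utf-8') as f:
    references = [[ref.split() for ref in line.rstrip('\n').split('\t')]
                  for line in f]

# Assumption: one predicted sentence per line; the file name is illustrative.
with open('./predictions/NTS_default_b5_h1', encoding='utf-8') as f:
    hypotheses = [line.split() for line in f]

print('BLEU: %.4f' % corpus_bleu(references, hypotheses))
```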
The src directory contains:
- download_models.py - a script to download the pre-trained models. The models are released so that they are usable on machines with or without GPUs, but they cannot be used to continue a training session. In case the download script fails, you may use the direct links for NTS and NTS-w2v.
- train_word2vec.py - a script that trains a word2vec model on a local corpus using gensim (see the sketch after this list)
- SARI.py - a copy of the SARI implementation. SARI scores the output against both the source and the reference simplifications, averaging n-gram scores for the add, keep, and delete operations.
- evaluate.py - computes BLEU and SARI scores, given a source file, a reference file in TSV format, and a directory of predictions
- ./scripts - the scripts we used to preprocess the data, generate translations, and create the concatenated embeddings
- ./patch - a patch with the changes that need to be applied in case you want to use the latest checkout of OpenNMT; alternatively, you may use our forked code, which comes directly as a submodule
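As a sketch of the word2vec step, here is a minimal gensim example in the spirit of train_word2vec.py; the corpus path, output path, and hyperparameters are illustrative assumptions, not the exact settings used for the NTS-w2v embeddings.

```python
# Minimal word2vec training sketch with gensim.
# Note: in gensim < 4.0 the dimension parameter is named `size`, not `vector_size`.
from gensim.models import Word2Vec
from gensim.models.word2vec import LineSentence

corpus = LineSentence('corpus.txt')  # one pre-tokenized sentence per line
model = Word2Vec(corpus, vector_size=300, window=5, min_count=5, workers=4)
model.save('word2vec.model')
```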
The configs directory contains the OpenNMT config file. To train, update the config file with the appropriate paths on your local system and run:
th train -config $PATH_TO_THIS_DIR/configs/NTS.cfg
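For orientation, OpenNMT's -config files are plain text with one option = value pair per line. The snippet below is a hypothetical minimal sketch, not the shipped NTS.cfg: data and save_model are standard OpenNMT training options, and both paths are placeholders to be replaced with your local ones.

```
data = /path/to/preprocessed-train.t7
save_model = /path/to/models/NTS
```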
The predictions directory contains the predictions of the previous systems of Wubben et al. (2012), Glavaš and Štajner (2015), and Xu et al. (2016), together with the predictions of the NTS models reported in the paper:
- NTS_default_b5_h1 - the default NTS model, beam size 5, hypothesis 1
- NTS_BLEU_b12_h1 - the NTS model best ranked by BLEU, beam size 12, hypothesis 1
- NTS_SARI_b5_h2 - the NTS model best ranked by SARI, beam size 5, hypothesis 2
- NTS-w2v_default_b5_h1 - the default NTS-w2v model, beam size 5, hypothesis 1
- NTS-w2v_BLEU_b12_h1 - the NTS-w2v model best ranked by BLEU, beam size 12, hypothesis 1
- NTS-w2v_SARI_b12_h2 - the NTS-w2v model best ranked by SARI, beam size 12, hypothesis 2
The data directory contains the training, testing, and reference sentences used to train and evaluate our models.