By Jean-Philippe Corbeil and Hadi Abdi Ghadivel
This script was programmed for data augmentation of NLP copora for the paraphrase identification task. It can easily be adapted for any other NLP task. We use Google API. Thus, you need to provide your own Google API token in your own .env file and activate Google Translate API.
Figure 1. Data augmentation in NLP with backtranslation procedure.
No filter is applied in this part of the code. We leave it to further processing steps.
Simply use the requirements.txt file (better in virtual environment):
pip install -r requirements.txt
Last update: June 8th, 2020.