A Prompt Perturbation Toolkit for Prompt Robustness Analysis
- Installation
- Character Editing
- Word Manipulation
- Sentence Paraphrasing
- Parallel Processing
- Structure of the Code
- Citation
- Acknowledgement
Installation
pip install promptcraft
Character-level Prompt Perturbation
The CharacterPerturb class manipulates characters in a sentence.
from promptcraft import character
sentence = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May."
level = 0.25 # Percentage of characters that will be edited
character_tool = character.CharacterPerturb(sentence=sentence, level=level)
Randomly replace a proportion of the characters in the sentence, given by level.
char_replace = character_tool.character_replacement()
Randomly delete a proportion of the characters in the sentence, given by level.
char_delete = character_tool.character_deletion()
Randomly insert characters into the sentence; level gives the number inserted as a proportion of the sentence length.
char_insert = character_tool.character_insertion()
Randomly swap a proportion of the characters in the sentence, given by level.
NOTE: this includes self-swaps, where a character is swapped with itself.
char_swap = character_tool.character_swap()
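The four edits above follow one pattern: choose a set of positions sized by level, then edit at those positions. The sketch below is a minimal pure-Python illustration of that pattern, not the toolkit's implementation. The helper names, the use of round() to size the edit, and the choice of ASCII letters as replacement characters are all assumptions.

```python
import random
import string


def pick_positions(text, level, rng):
    """Choose round(level * len(text)) distinct positions (assumed rule)."""
    count = round(level * len(text))
    return sorted(rng.sample(range(len(text)), count))


def replace_chars(text, level, rng):
    chars = list(text)
    for i in pick_positions(text, level, rng):
        chars[i] = rng.choice(string.ascii_letters)
    return "".join(chars)


def delete_chars(text, level, rng):
    drop = set(pick_positions(text, level, rng))
    return "".join(c for i, c in enumerate(text) if i not in drop)


def insert_chars(text, level, rng):
    chars = list(text)
    # Walk from right to left so that earlier indices stay valid.
    for i in reversed(pick_positions(text, level, rng)):
        chars.insert(i, rng.choice(string.ascii_letters))
    return "".join(chars)


def swap_chars(text, level, rng):
    chars = list(text)
    for i in pick_positions(text, level, rng):
        j = rng.randrange(len(chars))  # j may equal i, i.e. a self-swap
        chars[i], chars[j] = chars[j], chars[i]
    return "".join(chars)
```

Passing an explicit random.Random instance keeps each perturbation reproducible, which helps when comparing model outputs across runs.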
Randomly substitute a proportion of the characters in the sentence, given by level, with a randomly chosen character that is adjacent to the original on the keyboard (US full-size layout).
NOTE:
(1) We applied keyboard_distance=1, i.e., the nearest character, number, or symbol.
(2) If the substitute is a letter, its case (lowercase or uppercase) is chosen at random.
char_keyboard = character_tool.keyboard_typos()
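A keyboard typo at distance 1 can be sketched with a neighbour map. The map below is hypothetical and deliberately partial (six keys of a US QWERTY layout); the toolkit's own map and selection rule may differ. The function is given a distinct name, keyboard_typos_sketch, to avoid suggesting it is the library method.

```python
import random

# Hypothetical, partial neighbour map for a US QWERTY layout (distance 1).
NEIGHBOURS = {
    "a": "qwsz",
    "s": "awedxz",
    "d": "serfcx",
    "e": "w34rds",
    "o": "i90plk",
    "l": "kop;.,",
}


def keyboard_typos_sketch(text, level, rng):
    """Replace about level * len(text) characters with an adjacent key."""
    chars = list(text)
    # Only characters present in the map can be substituted.
    candidates = [i for i, c in enumerate(chars) if c.lower() in NEIGHBOURS]
    count = min(len(candidates), round(level * len(chars)))
    for i in rng.sample(candidates, count):
        new = rng.choice(NEIGHBOURS[chars[i].lower()])
        # Mirror note (2): a letter gets a randomly chosen case.
        if new.isalpha() and rng.random() < 0.5:
            new = new.upper()
        chars[i] = new
    return "".join(chars)
```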
Randomly substitute a proportion of the characters in the sentence, given by level, using a map of common OCR errors.
char_ocr = character_tool.optical_character_recognition()
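OCR noise differs from keyboard noise in that each character has a fixed look-alike. The sketch below assumes a small, hypothetical confusion map; the map shipped with the toolkit is not shown in this README and may contain different pairs.

```python
import random

# Hypothetical look-alike pairs of the kind OCR engines tend to confuse.
OCR_MAP = {"l": "1", "o": "0", "O": "0", "S": "5", "B": "8", "m": "rn"}


def ocr_errors_sketch(text, level, rng):
    """Apply the look-alike map to about level * len(text) characters."""
    chars = list(text)
    candidates = [i for i, c in enumerate(chars) if c in OCR_MAP]
    count = min(len(candidates), round(level * len(chars)))
    for i in rng.sample(candidates, count):
        chars[i] = OCR_MAP[chars[i]]
    return "".join(chars)
```

Because some entries map one character to two (for example m to rn), the output can be longer than the input.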
Word-level Prompt Perturbation
The WordPerturb class manipulates words in a sentence.
NOTE: the number of words in a sentence counts only valid words; spaces, special symbols, and punctuation are not counted.
from promptcraft import word
sentence = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May."
level = 0.25 # Percentage of words that will be manipulated
word_tool = word.WordPerturb(sentence=sentence, level=level)
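To make the note on valid words concrete, the sketch below counts the words of the example sentence while ignoring spaces and punctuation, then derives how many words a level of 0.25 would touch. The regular expression, the decision to count the number 48 as a word, and the use of round() are assumptions; the toolkit may tokenize differently.

```python
import re


def valid_words(sentence):
    """Return alphanumeric tokens, skipping spaces and punctuation."""
    return re.findall(r"[A-Za-z0-9]+(?:'[A-Za-z]+)?", sentence)


sentence = (
    "Natalia sold clips to 48 of her friends in April, "
    "and then she sold half as many clips in May."
)
words = valid_words(sentence)
budget = round(0.25 * len(words))  # words to manipulate at level = 0.25
print(len(words), budget)  # prints "20 5"
```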
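Randomly choose words in the sentence; level sets the proportion chosen.
Replace each of these words with one of its synonyms chosen at random.
Problem 1: a chosen word may have no synonyms.
Problem 2: the sentence may have fewer eligible positions than the number required.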
word_synonym = word_tool.synonym_replacement()
Find a random synonym of a random word in the sentence that is not a stop word.
Insert that synonym into a random position in the sentence.
Repeat this a number of times set by level.
word_insert = word_tool.word_insertion()
Randomly choose two words in the sentence and swap their positions.
Repeat this a number of times set by level.
word_swap = word_tool.word_swap()
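A repeated random swap can be sketched in a few lines. The repeat count n is a hypothetical parameter introduced here for illustration; how the toolkit derives it from level is not stated in this README. Splitting on whitespace is also a simplification.

```python
import random


def word_swap_sketch(sentence, n, rng):
    """Swap two randomly chosen words, n times."""
    words = sentence.split()
    if len(words) < 2:
        return sentence  # nothing to swap
    for _ in range(n):
        i, j = rng.sample(range(len(words)), 2)  # two distinct positions
        words[i], words[j] = words[j], words[i]
    return " ".join(words)
```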
Each word in the sentence is removed at random with a probability set by level.
word_delete = word_tool.word_deletion()
Randomly insert punctuation marks into the sentence with a probability set by level.
word_punctuation = word_tool.insert_punctuation()
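The two probability-based edits above can be sketched as independent coin flips per word. The parameter p, the punctuation set, the whitespace tokenization, and the fallback that keeps one word when everything would be deleted are assumptions made for this illustration, not details taken from the toolkit.

```python
import random

PUNCTUATION = [".", ",", "!", "?", ";", ":"]  # assumed set


def word_deletion_sketch(sentence, p, rng):
    """Drop each word independently with probability p."""
    words = sentence.split()
    if len(words) <= 1:
        return sentence
    kept = [w for w in words if rng.random() >= p]
    if not kept:
        # Avoid returning an empty sentence: keep one random word.
        kept = [rng.choice(words)]
    return " ".join(kept)


def insert_punctuation_sketch(sentence, p, rng):
    """Before each word, insert a punctuation mark with probability p."""
    out = []
    for w in sentence.split():
        if rng.random() < p:
            out.append(rng.choice(PUNCTUATION))
        out.append(w)
    return " ".join(out)
```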
Randomly split a word into two tokens.
word_split = word_tool.word_split()
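A word split picks one word and one cut point inside it. The sketch below splits a single word and tokenizes on whitespace; both choices are assumptions, and the toolkit may split several words depending on level.

```python
import random


def word_split_sketch(sentence, rng):
    """Split one randomly chosen word into two non-empty tokens."""
    words = sentence.split()
    # Only words with at least two characters can be split.
    candidates = [i for i, w in enumerate(words) if len(w) >= 2]
    if not candidates:
        return sentence
    i = rng.choice(candidates)
    cut = rng.randrange(1, len(words[i]))  # both halves stay non-empty
    words[i] = words[i][:cut] + " " + words[i][cut:]
    return " ".join(words)
```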
Sentence-level Prompt Perturbation
The SentencePerturb class manipulates a sentence directly.
from promptcraft import sentence
sen = "Natalia sold clips to 48 of her friends in April, and then she sold half as many clips in May."
sentence_tool = sentence.SentencePerturb(sentence=sen)
Back-translate the sentence (English to an intermediate language and back to English) using Hugging Face models.
back_trans_hf = sentence_tool.back_translation_hugging_face()
Back-translate the sentence (English to an intermediate language and back to English) using Google translation.
back_trans_google = sentence_tool.back_translation_google()
Paraphrase the sentence via Parrot Paraphraser, considering:
(1) Adequacy: Is the meaning preserved adequately?
(2) Fluency: Is the paraphrase fluent English?
(3) Diversity (lexical / phrasal / syntactical): How much has the paraphrase changed the original sentence?
sen_paraphrase = sentence_tool.paraphrase()
Transform the sentence style to Formal
sen_formal = sentence_tool.formal()
Transform the sentence style to Casual
sen_casual = sentence_tool.casual()
Transform the sentence style to Passive
sen_passive = sentence_tool.passive()
Transform the sentence style to Active
sen_active = sentence_tool.active()
Parallel Processing
Since all the methods are executed on the CPU, they can be performed in parallel using the multiprocessing package.
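A minimal sketch of that idea with multiprocessing.Pool follows. The worker uses a pure-Python character deletion as a stand-in so the example runs without the toolkit installed; in practice the worker body would call a toolkit method such as character.CharacterPerturb(sentence=..., level=...).character_deletion(). The function names, the per-task seeding, and the process count are assumptions.

```python
import multiprocessing as mp
import random


def perturb_one(task):
    """Worker: perturb one sentence (stand-in for a toolkit call)."""
    sentence, level, seed = task
    rng = random.Random(seed)  # per-task seed keeps runs reproducible
    count = round(level * len(sentence))
    drop = set(rng.sample(range(len(sentence)), count))
    return "".join(c for i, c in enumerate(sentence) if i not in drop)


def perturb_all(sentences, level=0.25, processes=2):
    """Perturb many sentences in parallel; results keep the input order."""
    tasks = [(s, level, seed) for seed, s in enumerate(sentences)]
    with mp.Pool(processes=processes) as pool:
        return pool.map(perturb_one, tasks)


if __name__ == "__main__":
    prompts = [
        "Natalia sold clips to 48 of her friends in April.",
        "Then she sold half as many clips in May.",
    ]
    for original, perturbed in zip(prompts, perturb_all(prompts)):
        print(original, "->", perturbed)
```

The worker is defined at module level so it can be pickled, and the pool is created under the __main__ guard, which multiprocessing requires on platforms that spawn rather than fork worker processes.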
Structure of the Code
At the root of the project, you will see:
.
├── LICENSE
├── README.md
├── promptcraft
│   ├── __init__.py
│   ├── character.py
│   ├── parrot.py
│   ├── sentence.py
│   ├── styleformer.py
│   └── word.py
├── setup.cfg
└── setup.py
Citation
If you find our toolkit useful, please consider citing our repo and toolkit in your publications. We provide BibTeX entries below.
@misc{JiaPromptCraft23,
author = {Jia, Shuyue},
title = {{PromptCraft}: A Prompt Perturbation Toolkit},
year = {2023},
publisher = {GitHub},
journal = {GitHub Repository},
howpublished = {\url{https://github.com/SuperBruceJia/promptcraft}},
}
@misc{JiaAwesomeLLM23,
author = {Jia, Shuyue},
title = {Awesome {LLM} Self-Consistency},
year = {2023},
publisher = {GitHub},
journal = {GitHub Repository},
howpublished = {\url{https://github.com/SuperBruceJia/Awesome-LLM-Self-Consistency}},
}
@misc{JiaAwesomeSTS23,
author = {Jia, Shuyue},
title = {Awesome Semantic Textual Similarity},
year = {2023},
publisher = {GitHub},
journal = {GitHub Repository},
howpublished = {\url{https://github.com/SuperBruceJia/Awesome-Semantic-Textual-Similarity}},
}
Acknowledgement
This work was completed during my fall 2023 semester research rotation at the Department of Electrical and Computer Engineering, Boston University.