New Phonetic attacks

Use a many-to-many aligner to align the letters and phones of cmudict, then calculate statistics from the aligned cmudict: which letters does each phone most likely correspond to? Weight these statistics by word frequency. To attack a word, transcribe it to its phonetic representation via cmudict and then use the statistics to transform the phonetic representation back to letters. For higher attack levels, choose statistically less likely letters. Cherry-picked example: Heavy rain raises threat of Christmas Day flooding => Heavie rayn razes threit of Cristmise Day flooding. List of non-cherry-picked examples.
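
A minimal sketch of the back-transcription step, assuming the alignment statistics have already been computed; the phone-to-letter table below is a toy stand-in with made-up probabilities:

```python
# Toy stand-in for the frequency-weighted phone-to-letter statistics
# obtained from the many-to-many alignment of cmudict. The probabilities
# here are invented for illustration; entries are sorted by probability.
phone_to_letters = {
    "HH": [("h", 0.98), ("wh", 0.02)],
    "EH": [("e", 0.71), ("ea", 0.12), ("ai", 0.06)],
    "V":  [("v", 0.94), ("vv", 0.03), ("f", 0.03)],
    "IY": [("y", 0.35), ("e", 0.25), ("ie", 0.13)],
}

def attack_word(phones, level=0):
    """Transcribe a phone sequence back to letters; higher attack levels
    pick statistically less likely spellings."""
    out = []
    for phone in phones:
        candidates = phone_to_letters[phone]
        idx = min(level, len(candidates) - 1)  # level 0 = most likely spelling
        out.append(candidates[idx][0])
    return "".join(out)

# cmudict transcribes "heavy" as HH EH V IY:
print(attack_word(["HH", "EH", "V", "IY"], level=0))  # -> "hevy"
print(attack_word(["HH", "EH", "V", "IY"], level=1))  # -> "wheavve"
```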

Substring Levenshtein Distance

Inspired by http://ginstrom.com/scribbles/2007/12/01/fuzzy-substring-matching-with-levenshtein-distance-in-python/. The idea is to perform fuzzy substring matching to get back the source-word indices between which the target word fits best. Multiple such matches can then be used to puzzle together long source strings that lack word segmentation.

  • len source word := n, len target word := m
  • Initialize distance matrix D of size (m+1)x(n+1).
  • Fill the top row with zeros instead of with increasing values.
  • Fill the first column with increasing values based on the insertion cost.
  • Initialize a start matrix S of size (m+1)x(n+1) that keeps track of the start points from which each point in the distance matrix can be reached at shortest distance.
  • Initialize the entries of the top row of S with sets that only contain their respective column index. Every other entry is initialized with the empty set.
  • Iterate over m and n:
    • For each entry, calculate the cost of getting to that entry by inserting, deleting, swapping, substituting or, if possible, by doing nothing (if source and target word match up at that point).
    • Enter the lowest cost into the distance matrix.
    • Update S by merging into the set of the current entry the sets of the entries that led to this entry at lowest cost.
  • Check the bottom row of D for the lowest cost C. For all bottom-row entries of D equal to C: store the respective column index together with the set entry of S at that location as a 2-tuple in a list L.
  • Return C, L.
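
A minimal, runnable sketch of this procedure (the swap/transposition move is omitted for brevity):

```python
def substring_levenshtein(source, target, ins_cost=1, del_cost=1, sub_cost=1):
    """Fuzzy substring matching as described above. Returns (C, L), where C
    is the lowest cost in the bottom row and L is a list of
    (end_index, start_set) tuples."""
    n, m = len(source), len(target)
    # D[i][j]: cost of matching target[:i] against source ending at position j.
    # The top row stays zero so a match may start at any source position.
    D = [[0] * (n + 1) for _ in range(m + 1)]
    S = [[set() for _ in range(n + 1)] for _ in range(m + 1)]
    for j in range(n + 1):
        S[0][j] = {j}                      # a match ending here started here
    for i in range(1, m + 1):
        D[i][0] = D[i - 1][0] + ins_cost   # first column: pure insertions
        S[i][0] = {0}
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            sub = 0 if source[j - 1] == target[i - 1] else sub_cost
            moves = [
                (D[i - 1][j] + ins_cost, (i - 1, j)),     # insert
                (D[i][j - 1] + del_cost, (i, j - 1)),     # delete
                (D[i - 1][j - 1] + sub, (i - 1, j - 1)),  # substitute / match
            ]
            best = min(cost for cost, _ in moves)
            D[i][j] = best
            # propagate start points from every predecessor reachable at best cost
            for cost, (pi, pj) in moves:
                if cost == best:
                    S[i][j] |= S[pi][pj]
    C = min(D[m])
    L = [(j, S[m][j]) for j in range(n + 1) if D[m][j] == C]
    return C, L

# target "white" fits source[0:5] of "whitecat" exactly:
print(substring_levenshtein("whitecat", "white"))  # -> (0, [(5, {0})])
```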

Dictionaries

  • Full-word dict: BERT's default WordPiece vocabulary, but only the tokens that are real English words on their own.
  • Morpheme dict: morphemes from BERT's WordPiece vocabulary; they are marked for BERT with '##' at the beginning.
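
One way to build the two dictionaries, as a sketch assuming the HuggingFace transformers tokenizer and the NLTK word list as the English reference (the exact word list we used may differ):

```python
from nltk.corpus import words  # requires nltk.download("words")
from transformers import BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
english_words = set(w.lower() for w in words.words())

vocab = tokenizer.get_vocab()
# morphemes carry the '##' continuation marker; full words must be real English words
morpheme_dict = {t for t in vocab if t.startswith("##")}
full_word_dict = {t for t in vocab if not t.startswith("##") and t in english_words}
```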

Sentence cleanup pipeline

  1. "Soft" tokenization (Tokenize, but don't split up /,$, or sth.)
  2. For each source word S in the Sentence:
    • Create combination dictionary C
    • Create single-word distance Matrix D
    • For each target word T from Full-word dict + Morpheme dict:
      • Calculate Substring Levenshtein distance to obtain distance D and combinations M.
      • Calculate Anagram matching distance A, by evaluating letter frequencies by comparing character frequencies in source and target word.
      • If T in Full-word dict, store min(A,D) in D, while accounting for fillup costs for substring matches.
      • For each U in M: Store T,D in C[U]
    • Puzzle together S from combinations in C, so that the start and end indices match up the full source word S. Assume lowest distance from the target words of each combinations key in C. Increase costs for using many word parts. Keep N lowest cost hypothesis in list H.
    • Keep the entries of C at the selected combinations for each hypothesis as dictionaries for later use. (The lowest distance word part will not necessarily be the final result after using BERT)
    • Insert D into H.
  3. Use the cross product of all word hypothesis to form sentence hypothesis. Only keep the N sentence Hypothesis with the lowest distances in a list S.
  4. Use softmax to convert sentence Hypothesis distances to probabilities. Use softmax to convert word part distances to probabilities for each word part in each word in each sentence hypothesis.
  5. Return S, consisting of sentence hypothesis with probabilities associated, that in each consist of words out of word parts. Each word part has an own dictionary as well as a probability vector associated with it.
  6. For each Hypothesis H in S:
    • Select most uncertain word part U in H based on difference of highest 2 probabilities.
    • Perform weighted average of the word embeddings top N words (based on probability) for each other word part in H.
    • Mask U and put it, together with the avg. word embeddings into bert, to obtain a probability distribution P over words for the masked word part U.
    • Only select the words in P that are in the dictionary associated with U and then use softmax to call it a likelihood L.
    • Obtain Posterior P from prior probabilitys for U and L.
    • Repeat until each word has been covered once (or even for more iterations).
  7. Select the Maximum-a-posteriori (MAP) for each word-part in each sentence hypothesis
  8. Call the probabilities associated with each sentence hypothesis prior P
  9. Use OpenAI's GPT on each Sentence to calculate a sentence likelihood L.
  10. Calculate Sentence Posterior P for the hypothesis from L and P.
  11. Select Maximum-a-posteriori to get a final cleaned up sentence.
  12. Puzzle segmentations back together using self implemented detokenization.
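
A minimal sketch of step 3, the cross-product formation of sentence hypotheses (the real pipeline additionally carries the per-word-part dictionaries along with each hypothesis):

```python
import heapq
import itertools

def sentence_hypotheses(word_hypotheses, N):
    """word_hypotheses holds, per source word, a list of (cost, text)
    word hypotheses. Form sentence hypotheses via the cross product and
    keep only the N lowest-cost ones."""
    combos = itertools.product(*word_hypotheses)
    scored = ((sum(cost for cost, _ in combo),
               " ".join(text for _, text in combo)) for combo in combos)
    return heapq.nsmallest(N, scored)

word_hypotheses = [[(0.0, "a")],
                   [(1.0, "small"), (2.0, "smell")],
                   [(1.5, "cat"), (1.5, "cats")]]
print(sentence_hypotheses(word_hypotheses, 2))
# -> [(2.5, 'a small cat'), (2.5, 'a small cats')]
```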

Cleanup example:

The sentence A sma|ll w#hiteca:t witg,l,o,win,g e[h[e[sstajdinginderneath ac*h*a*ir. has been attacked with 3 adversarial attacks. The attack parameters were [('keyboard-typo',0.2),('intrude',0.5),('segmentation',0.6)]. After tokenizing the sentence, it looks like this: ['a', 'sma|ll', 'w#hiteca:t', 'witg,l,o,win,g', 'e[h[e[sstajdinginderneath', 'ac*h*a*ir', '.'] When we now calculate the substring Levenshtein distances for each word, we can form the following prior sentence hypotheses:

Hypothesis 1, Probability: 0.5591098924966913, Prior: a small white cat wit glowing eyes standing underneath chair.
Hypothesis 2, Probability: 0.4300367473288713, Prior: a small white cat wit glowing eyes standing underneath a chair.

Note that here only the most probable words according to the prior are shown; actually, each word index has a whole dictionary with probabilities associated with it. We see that we have two hypotheses: one says the a in front of chair does not belong there and the actual word is just chair (it could be a typo or something), the other says a is its own word that belongs in front of chair. Let's now put our hypotheses through the BERT posterior and see how that changes our sentences.

Hypothesis 1, Posterior: a small white cat with glowing eyes standing underneath chair.
Hypothesis 2, Posterior: a small white cat with glowing eyes standing underneath a chair.

In both cases, BERT fixed the wit-with error, as wit would not make any sense grammatically.
To figure out which of the two hypotheses we should finally choose, let's put the sentences into GPT to calculate sentence likelihoods.

Hypothesis 1, Prior prob: 0.5591098924966913, Sentence: a small white cat with glowing eyes standing underneath chair. GPT Likelihood: 0.31197451, Posterior prob: 0.3708833644581104
Hypothesis 2, Prior prob: 0.4300367473288713, Sentence: a small white cat with glowing eyes standing underneath a chair. GPT Likelihood: 0.68802549, Posterior prob: 0.6291166355418897

GPT decided that the sentence ending with a chair is much more likely than the one without the a. After combining the likelihood with the prior into the posterior, we only have to choose the MAP to get our final cleaned output sentence: a small white cat with glowing eyes standing underneath a chair.
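
The prior/likelihood combination above is a plain Bayesian update; a few lines suffice to reproduce the posterior numbers from the example:

```python
import numpy as np

def posterior(prior, likelihood):
    """Bayesian update: the posterior is proportional to prior times likelihood."""
    p = np.asarray(prior) * np.asarray(likelihood)
    return p / p.sum()

prior = [0.5591098924966913, 0.4300367473288713]  # prior probs from above
gpt_likelihood = [0.31197451, 0.68802549]         # GPT likelihoods from above
print(posterior(prior, gpt_likelihood))
# -> [0.37088336 0.62911664], the posterior probabilities from the example
```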

Visual similarity

To assess the similarity of the first 30000 Unicode glyphs to English letters and digits, we create a 30000x36 similarity matrix. Each glyph is drawn via PIL in 20pt using a fitting font from the Noto font collection. The bitmap is then cropped to contain only the glyph, and the image is resized and padded on the right and bottom to a size of (30px, 30px). When comparing the bitmap of a Unicode glyph with a letter/digit glyph, multiple versions of the letter/digit bitmap are created. For letters, both the lowercase and the uppercase version are used. Each bitmap gets downsized to 5 different sizes (from (30px, 30px) down to (15px, 15px)), rotated and flipped in all 8 unique ways, and then padded back to (30px, 30px), such that the glyph is placed either at the top-left or the bottom-left. The percentage of matching black pixels between bitmaps is converted into a similarity score, and the highest similarity over all versions is chosen as the final result.
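
A minimal sketch of the glyph rendering and scoring using PIL; the canvas size, drawing offset, and exact scoring formula are illustrative assumptions (the downsized/rotated/flipped letter variants described above would be generated around these helpers):

```python
from PIL import Image, ImageDraw, ImageFont

def glyph_bitmap(char, font_path, pt=20, size=(30, 30), canvas=(30, 30)):
    """Draw a glyph, crop it to its bounding box, resize it, and pad it
    on the right and bottom to the canvas size."""
    img = Image.new("1", (64, 64), 0)
    font = ImageFont.truetype(font_path, pt)
    ImageDraw.Draw(img).text((8, 8), char, fill=1, font=font)
    box = img.getbbox()
    if box is None:                       # nothing was drawn
        return Image.new("1", canvas, 0)
    glyph = img.crop(box).resize(size)
    padded = Image.new("1", canvas, 0)
    padded.paste(glyph, (0, 0))           # top-left placement pads right/bottom
    return padded

def similarity(a, b):
    """Score two equally sized 1-bit bitmaps by their shared 'on' pixels
    (one plausible scoring choice)."""
    pa = [p != 0 for p in a.getdata()]
    pb = [p != 0 for p in b.getdata()]
    shared = sum(1 for x, y in zip(pa, pb) if x and y)
    return shared / max(sum(pa), sum(pb), 1)
```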

Robustness and Generalization

Our program performs character-level adversarial shielding. Although our BERT and GPT stages can fix many mistakes in the prior, the prior has to be at least somewhat close to the truth for our method to work. The basic version of our prior (bp), which only uses a Levenshtein distance and does not alter the replacement costs, is already fairly general. It performs well against keyboard typos, natural typos, truncation and segmentation. Against intruders, disemvoweling, phonetic attacks and a few visual attacks, a Levenshtein distance also gives at least somewhat good results. In fact, all attacks that operate on a local character level are covered to some degree by a Levenshtein distance, which can be enough to get good results after using context via BERT. Only attacks that are non-local, for example shuffling the whole word (like full-swap), are not accounted for by the edit distance. The prior can be specifically extended to cover more attacks and easily integrated into the pipeline. To demonstrate this, we created an improved version of the prior, called the full prior (fp), with the following changes:

  1. Reduce the edit-distance replacement cost for visually similar characters.
  2. Reduce the deletion cost for characters that appear often in the source word.
  3. Reduce the insertion cost for vowels in words that contain no vowels.
  4. Also calculate an anagram distance: how close is the source to being an anagram of the target? Use the minimum of the two distances as the final distance (a sketch follows below).
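
The writeup does not pin down the exact anagram-distance formula; one plausible minimal definition based on character counts:

```python
from collections import Counter

def anagram_distance(source, target):
    """Total character surplus/deficit between the two words: 0 means
    `source` is a perfect anagram of `target`."""
    diff = Counter(source)
    diff.subtract(Counter(target))
    return sum(abs(count) for count in diff.values())

print(anagram_distance("tac", "cat"))     # -> 0, a perfect anagram
print(anagram_distance("chair", "char"))  # -> 1, one surplus 'i'
```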

Evaluation

For plots, see /evaluation/plots.

ours bp = our method using the basic prior (see Robustness and Generalization)
ours fp = our method using the full prior
For the "only priors" versions, we took, for each word of the most likely hypothesis according to the prior, the maximum-probability word to form a sentence.

Abbreviation map

| attack | visual | phonetic | full-swap | inner-swap | disemvowel | truncate | keyboard-typo | natural-typo | intrude | segmentation | rand |
|---|---|---|---|---|---|---|---|---|---|---|---|
| abbreviation | vi | ph | fs | is | dv | tr | kt | nt | in | sg | rd |

mover

| method | vi:0.3 | dv:0.3 | in:0.3 | tr:0.3 | sg:0.3 | ph:0.3 | kt:0.2,nt:0.2,tr:0.2 | fs:0.3 | is:0.3 | nt:0.3 | kt:0.3 | sg:0.5,kt:0.3 | rd:0.3 | vi:0.3,in:0.3 | rd:0.3,rd:0.3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| no cleaning | 0.226 | 0.317 | 0.151 | 0.435 | 0.499 | 0.265 | 0.19 | 0.291 | 0.307 | 0.414 | 0.345 | 0.154 | 0.321 | 0.012 | 0.162 |
| ours fp (only priors) | 0.687 | 0.657 | 0.874 | 0.499 | 0.78 | 0.419 | 0.307 | 0.689 | 0.843 | 0.513 | 0.562 | 0.538 | 0.651 | 0.68 | 0.451 |
| ours fp (full pipeline) | 0.83 | 0.795 | 0.862 | 0.767 | 0.809 | 0.588 | 0.577 | 0.822 | 0.831 | 0.767 | 0.808 | 0.687 | 0.784 | 0.816 | 0.643 |
| ours bp (full pipeline) | 0.696 | 0.574 | 0.845 | 0.778 | 0.82 | 0.57 | 0.588 | 0.4 | 0.539 | 0.765 | 0.832 | 0.701 | 0.677 | 0.628 | 0.501 |
| ours bp (only priors) | 0.415 | 0.317 | 0.875 | 0.524 | 0.797 | 0.428 | 0.341 | 0.292 | 0.379 | 0.527 | 0.638 | 0.572 | 0.512 | 0.402 | 0.315 |
| pyspellchecker | 0.388 | 0.336 | 0.588 | 0.606 | 0.459 | 0.397 | 0.355 | 0.311 | 0.441 | 0.505 | 0.596 | 0.23 | 0.441 | 0.173 | 0.27 |
| danishpruthi | 0.257 | 0.38 | 0.446 | 0.387 | 0.4 | 0.303 | 0.239 | 0.277 | 0.52 | 0.423 | 0.416 | 0.159 | 0.376 | 0.14 | 0.233 |

sts-b

| method | vi:0.3 | dv:0.3 | in:0.3 | tr:0.3 | sg:0.3 | ph:0.3 | kt:0.2,nt:0.2,tr:0.2 | fs:0.3 | is:0.3 | nt:0.3 | kt:0.3 | sg:0.5,kt:0.3 | rd:0.3 | vi:0.3,in:0.3 | rd:0.3,rd:0.3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| no cleaning | 0.402 | 0.473 | 0.664 | 0.827 | 0.867 | 0.588 | 0.581 | 0.402 | 0.574 | 0.762 | 0.593 | 0.573 | 0.611 | 0.33 | 0.322 |
| ours fp (only priors) | 0.775 | 0.788 | 0.888 | 0.78 | 0.886 | 0.556 | 0.52 | 0.744 | 0.872 | 0.758 | 0.759 | 0.698 | 0.714 | 0.715 | 0.53 |
| ours fp (full pipeline) | 0.83 | 0.776 | 0.859 | 0.793 | 0.834 | 0.654 | 0.587 | 0.802 | 0.818 | 0.773 | 0.817 | 0.63 | 0.779 | 0.799 | 0.654 |
| ours bp (full pipeline) | 0.652 | 0.545 | 0.847 | 0.815 | 0.863 | 0.642 | 0.534 | 0.428 | 0.605 | 0.778 | 0.841 | 0.671 | 0.638 | 0.571 | 0.461 |
| ours bp (only priors) | 0.58 | 0.483 | 0.902 | 0.764 | 0.893 | 0.613 | 0.525 | 0.397 | 0.546 | 0.777 | 0.783 | 0.735 | 0.56 | 0.621 | 0.425 |
| pyspellchecker | 0.59 | 0.537 | 0.833 | 0.87 | 0.834 | 0.633 | 0.645 | 0.374 | 0.627 | 0.815 | 0.816 | 0.661 | 0.598 | 0.459 | 0.475 |
| danishpruthi | 0.427 | 0.62 | 0.712 | 0.677 | 0.748 | 0.517 | 0.543 | 0.379 | 0.744 | 0.665 | 0.566 | 0.436 | 0.552 | 0.355 | 0.384 |

bleu

| method | vi:0.3 | dv:0.3 | in:0.3 | tr:0.3 | sg:0.3 | ph:0.3 | kt:0.2,nt:0.2,tr:0.2 | fs:0.3 | is:0.3 | nt:0.3 | kt:0.3 | sg:0.5,kt:0.3 | rd:0.3 | vi:0.3,in:0.3 | rd:0.3,rd:0.3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| no cleaning | 0.025 | 0.149 | 0.087 | 0.153 | 0.284 | 0.098 | 0.016 | 0.155 | 0.154 | 0.141 | 0.135 | 0.008 | 0.137 | 0.002 | 0.016 |
| ours fp (only priors) | 0.41 | 0.392 | 0.567 | 0.198 | 0.482 | 0.173 | 0.044 | 0.434 | 0.568 | 0.183 | 0.249 | 0.238 | 0.38 | 0.384 | 0.179 |
| ours fp (full pipeline) | 0.547 | 0.515 | 0.568 | 0.467 | 0.516 | 0.339 | 0.293 | 0.54 | 0.55 | 0.453 | 0.504 | 0.398 | 0.507 | 0.531 | 0.365 |
| ours bp (full pipeline) | 0.452 | 0.32 | 0.541 | 0.47 | 0.524 | 0.333 | 0.29 | 0.201 | 0.305 | 0.448 | 0.526 | 0.407 | 0.4 | 0.371 | 0.259 |
| ours bp (only priors) | 0.155 | 0.089 | 0.566 | 0.217 | 0.487 | 0.187 | 0.06 | 0.098 | 0.151 | 0.194 | 0.347 | 0.27 | 0.229 | 0.149 | 0.082 |
| pyspellchecker | 0.151 | 0.098 | 0.402 | 0.296 | 0.123 | 0.134 | 0.066 | 0.086 | 0.186 | 0.152 | 0.259 | 0.021 | 0.17 | 0.074 | 0.052 |
| danishpruthi | 0.05 | 0.148 | 0.294 | 0.076 | 0.124 | 0.091 | 0.02 | 0.081 | 0.306 | 0.107 | 0.136 | 0.008 | 0.147 | 0.033 | 0.041 |

rouge-1

| method | vi:0.3 | dv:0.3 | in:0.3 | tr:0.3 | sg:0.3 | ph:0.3 | kt:0.2,nt:0.2,tr:0.2 | fs:0.3 | is:0.3 | nt:0.3 | kt:0.3 | sg:0.5,kt:0.3 | rd:0.3 | vi:0.3,in:0.3 | rd:0.3,rd:0.3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| no cleaning | 0.254 | 0.603 | 0.393 | 0.721 | 0.704 | 0.514 | 0.404 | 0.6 | 0.623 | 0.644 | 0.601 | 0.322 | 0.566 | 0.144 | 0.357 |
| ours fp (only priors) | 0.849 | 0.829 | 0.944 | 0.745 | 0.917 | 0.648 | 0.538 | 0.846 | 0.928 | 0.72 | 0.769 | 0.761 | 0.813 | 0.835 | 0.675 |
| ours fp (full pipeline) | 0.917 | 0.9 | 0.938 | 0.885 | 0.925 | 0.748 | 0.733 | 0.902 | 0.913 | 0.872 | 0.902 | 0.843 | 0.882 | 0.91 | 0.781 |
| ours bp (full pipeline) | 0.834 | 0.751 | 0.923 | 0.892 | 0.93 | 0.722 | 0.752 | 0.572 | 0.698 | 0.872 | 0.92 | 0.847 | 0.806 | 0.778 | 0.662 |
| ours bp (only priors) | 0.639 | 0.587 | 0.944 | 0.762 | 0.927 | 0.656 | 0.576 | 0.549 | 0.636 | 0.733 | 0.823 | 0.791 | 0.719 | 0.655 | 0.549 |
| pyspellchecker | 0.572 | 0.62 | 0.791 | 0.87 | 0.76 | 0.666 | 0.627 | 0.612 | 0.732 | 0.757 | 0.836 | 0.499 | 0.701 | 0.421 | 0.52 |
| danishpruthi | 0.389 | 0.629 | 0.659 | 0.664 | 0.627 | 0.536 | 0.45 | 0.556 | 0.76 | 0.646 | 0.644 | 0.328 | 0.611 | 0.334 | 0.441 |

editdistance

| method | vi:0.3 | dv:0.3 | in:0.3 | tr:0.3 | sg:0.3 | ph:0.3 | kt:0.2,nt:0.2,tr:0.2 | fs:0.3 | is:0.3 | nt:0.3 | kt:0.3 | sg:0.5,kt:0.3 | rd:0.3 | vi:0.3,in:0.3 | rd:0.3,rd:0.3 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| no cleaning | 12.555 | 7.615 | 5.923 | 3.645 | 2.667 | 9.5 | 8.305 | 14.35 | 10.775 | 4.99 | 3.833 | 6.615 | 7.718 | 18.505 | 13.182 |
| ours fp (only priors) | 2.408 | 3.635 | 1.11 | 5.593 | 2.005 | 8.873 | 9.695 | 4.178 | 2.158 | 5.407 | 4.072 | 5.0 | 3.945 | 2.902 | 7.768 |
| ours fp (full pipeline) | 2.268 | 2.995 | 1.597 | 3.14 | 2.22 | 7.55 | 7.19 | 3.228 | 2.768 | 3.322 | 2.36 | 4.47 | 3.485 | 2.467 | 6.657 |
| ours bp (full pipeline) | 5.04 | 9.27 | 1.9 | 2.913 | 2.02 | 8.777 | 6.982 | 14.78 | 9.918 | 3.38 | 1.962 | 4.29 | 6.485 | 6.862 | 11.49 |
| ours bp (only priors) | 7.28 | 11.685 | 1.167 | 4.942 | 1.762 | 8.685 | 8.39 | 14.675 | 10.303 | 4.99 | 2.7 | 4.322 | 7.095 | 8.38 | 11.995 |
| pyspellchecker | 8.543 | 9.533 | 3.147 | 3.34 | 4.52 | 8.29 | 7.235 | 14.418 | 9.265 | 4.91 | 2.803 | 7.072 | 7.253 | 13.73 | 12.043 |
| danishpruthi | 12.547 | 9.488 | 4.92 | 7.25 | 6.192 | 11.265 | 11.412 | 16.122 | 7.045 | 7.595 | 5.897 | 10.258 | 8.915 | 15.755 | 13.922 |

Runtime and Memory requirements

Runtime was evaluated on an Intel i7 processor (4 cores, 8 threads) and an Nvidia 1070 Ti graphics card.
Using some efficient batching and multiprocessing, we got the time to clean a document of 400 sentences down to around 70 minutes for the Levenshtein distances, ca. 49 minutes for the context-BERT cleaning, and one minute for the GPT hypothesis choice. Summed up, we thus take ca. 2 hours to clean a 400-sentence document (18 seconds per sentence). Note, however, that cleaning a single sentence may take significantly longer than 18 seconds, as we can't make use of efficient batching and multiprocessing in that case.
The memory requirements depend on whether we load GPT and BERT into memory at the same time or separately. Loaded at the same time, they take around 6 GB of memory; otherwise around 4 GB.

Picked cherries

The sentence Å ğŕơưₚ ȭḟ ᶣoɍßěᵴ ɡɻàʑïƞɢ ĩߒ â ᶂȉɇᶪɗ., attacked with (visual, 1.0), gets cleaned perfectly to a group of horses grazing in a field., although the maximum of the prior looks like this: a group dr korea grazing in a lila.

Comparison to other work

We compared to the following other work: https://github.com/danishpruthi/Adversarial-Misspellings. This implementation only targets defending against swaps, additions, drops and keyboard typos, but even for these kinds of errors it only barely works. For example, their demonstration example is nicset atcing I have ever witsesed => nicest acting i have ever witnessed, but already for something slightly different, generated by our natural-typo adversarial attack (strength: 0.2), nicest acting i have evere withnessed, it fails and produces nicest acting i have every witnessed. For complicated input like A sma|ll w#hiteca:t witg,l,o,win,g e[h[e[sstajdinginderneath ac*h*a*ir., which, to be fair, it was not built for, it does not work at all: a small w#hiteca:t witg,l,o,win,g e[h[e[sstajdinginderneath ac*h*a*ir..

Human similarity clues (how humans handle adversarial attacks)

First of all, you have to distinguish between two basic ways of dealing with words.

  1. Word recognition: an instantaneous, automatic cognitive process to recognize a word, used for very familiar words (often function words). It cannot be used for adversarially attacked words.
  2. Word identification: needs a specific strategy to recognize the word, because it is an unknown or rarely seen word. The literature discusses this case almost exclusively in the context of children learning to read. Nevertheless, a brief summary of the basic strategies:
    • context clues (implemented with BERT)
    • word order and grammar (implemented with BERT → could perhaps use grammar better)
    • word parts → splitting compound words into individual parts
    • morphemic analysis → breaking a word into morphemes to analyze its meaning

Typoglycemia

Typoglycemia is a neologism describing the effect that a text remains perfectly readable even though the letters inside its words have been shuffled, as long as the first and last letter of each word remain the same. According to an Internet meme, this was scientifically confirmed by the University of Cambridge. However, this is not the case. Nevertheless, Matt Davis from the University of Cambridge has written an interesting article about it (https://www.mrc-cbu.cam.ac.uk/people/matt.davis/cmabridge/). Important points from the article for our paper:

  • Text reading speed decreases by 11% when internal letters are reordered, so the order of letters is important.
  • Transpositions of adjacent letters are easy to read (we model this with our Levenshtein distance).
  • Transpositions are easy to read if they do not change the sound of a word.
  • Context is very important; it is provided by function words that do not change, e.g. the/and.
  • The shape of a word plays a big role in recognizing it, hence the idea that only the first and last letter matter. Nevertheless, these alone do not determine the shape of a word.

The meme: Aoccdrnig to a rscheearch at Cmabrigde Uinervtisy, it deosn't mttaer in waht oredr the ltteers in a wrod are, the olny iprmoetnt tihng is taht the frist and lsat ltteer be at the rghit pclae. The rset can be a toatl mses and you can sitll raed it wouthit porbelm. Tihs is bcuseae the huamn mnid deos not raed ervey lteter by istlef, but the wrod as a wlohe.