Update README.md

fani-lab · Jul 18, 2024 · 66c85b9
1 parent b806147
Showing 1 changed file with 1 addition and 1 deletion.
diff --git a/README.md b/README.md
@@ -130,7 +130,7 @@ For additional details, please refer to this [document](./misc/Backtranslation.p
 To evaluate the quality of the refined queries, we use three metrics: bleu, rouge, and semsim. The bleu score measures the similarity between the backtranslated and original query through n-gram precision, while the rouge score measures n-gram overlap with an emphasis on recall, to capture essential content. Both metrics are simple, effective, and widely used in machine translation tasks. However, because they rely heavily on n-grams, they may not capture the overall meaning or fluency of the translated text. To detect topic drift and measure the semantic similarity between the original and refined queries, we additionally embed the queries with [declutr](https://aclanthology.org/2021.acl-long.72/) and compute the cosine similarity of the embeddings. Declutr is a self-supervised technique that requires no labeled data; it extends the pretraining of transformer-based language models and narrows the performance gap between unsupervised and supervised pretraining for universal sentence encoders. Because the semsim metric relies on the cosine similarity of embeddings, it can capture semantic nuances that n-gram metrics miss, which makes it a useful measure of the quality of backtranslated queries.
 
 The images below show the average token count of the original English queries and of their backtranslated versions across various languages, along with the average pairwise similarities measured using 'rouge' and 'declutr'. All languages introduced new terms into the backtranslated queries while maintaining semantic coherence.
-![image](misc/similarity.jpg)
+![image](misc/similarity.png)
 
 ## Example
 These samples are taken from the ANTIQUE dataset, refined using the backtranslation refiner with German.
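
The README paragraph in this hunk describes two kinds of metrics: n-gram overlap (bleu, rouge) and cosine similarity over query embeddings (semsim). The following is a minimal sketch of both ideas, not the project's actual implementation. The function names `ngram_precision` and `cosine_similarity` are hypothetical, the precision function is only the clipped n-gram building block of bleu rather than the full score, and the vectors passed to `cosine_similarity` stand in for embeddings that would come from a sentence encoder such as declutr.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return the list of n-grams (as tuples) in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def ngram_precision(candidate, reference, n=1):
    """Clipped n-gram precision of candidate against reference.

    This is the building block of bleu, not the full bleu score
    (no brevity penalty, no averaging over several n-gram orders).
    Tokenization is a plain whitespace split, which is an assumption.
    """
    cand = Counter(ngrams(candidate.lower().split(), n))
    ref = Counter(ngrams(reference.lower().split(), n))
    total = sum(cand.values())
    if total == 0:
        return 0.0
    # Each candidate n-gram counts at most as often as it occurs in the reference.
    overlap = sum(min(count, ref[gram]) for gram, count in cand.items())
    return overlap / total


def cosine_similarity(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)
```

A backtranslated query that rephrases the original with new terms would score low on `ngram_precision` yet could still score high on `cosine_similarity` if its embedding stays close to the original's, which is the motivation the README gives for adding semsim.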