Tools.md


Title: Linguistic diversity
Authors: Daniel Nettle
Type: publication
Brief description: The author models language diversity in terms of richness (the number of different languages in a given geographical area), phylogenetic diversity (the number of different lineages in the phylogenetic tree of languages), and structural diversity (variation among structures within languages). This is about inter-linguistic diversity.
Provided by: Agata Savary

Title: The index of linguistic diversity: A new quantitative measure of trends in the status of the world’s languages
Authors: David Harmon, Jonathan Loh
Type: publication
Brief description: The authors propose indices to track the number of the world's active languages, the distribution of mother-tongue speakers among them, and the rate of language extinction. The measures are based on a database of time-series data on language demographics. This is about inter-linguistic diversity.
Provided by: Agata Savary

Title: Measuring diversity in multilingual communication
Authors: Michele Gazzola, Torsten Templin, Lisa J. McEntee-Atalianis
Type: publication
Brief description: Socio-linguistic diversity is measured in terms of the probability of using more than one common language in multilingual communication, as well as the degree of diversity of language policies. This is about inter-linguistic diversity.
Provided by: Agata Savary

Title: Diversity in Spectral Learning for Natural Language Parsing
Authors: Shashi Narayan and Shay B. Cohen
Type: publication
Brief description: The authors study the need for diversity in training data and its impact on the performance of NLP parsing tools.
Provided by: Agata Savary

Title: HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
Authors: Zhilin Yang, Peng Qi, Saizheng Zhang, Yoshua Bengio, William Cohen, Ruslan Salakhutdinov, Christopher D. Manning
Type: publication
Brief description: The authors present a dataset dedicated to training question answering systems, tuned for diversity.
Provided by: Agata Savary

Title: A Diversity-Promoting Objective Function for Neural Conversation Models
Authors: Jiwei Li, Michel Galley, Chris Brockett, Jianfeng Gao, and Bill Dolan
Type: publication
Brief description: The authors highlight the lack of diversity in the outputs of neural conversational systems and propose a diversity-driven objective function that addresses this problem. Diversity is measured in terms of the number of distinct unigrams and bigrams in the generated text.
Provided by: Agata Savary
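
The distinct unigram and bigram counts mentioned above can be sketched as follows. This is a minimal illustration, not the authors' code: the function name `distinct_n`, the whitespace tokenisation and the toy responses are assumptions, and the ratio to the total number of n-grams is returned alongside the raw count as one common way to normalise it.

```python
def distinct_n(utterances, n):
    """Return (number of distinct n-grams, distinct / total n-grams).

    `utterances` is a list of token lists. Both the name and the
    normalisation by the total n-gram count are illustrative assumptions.
    """
    distinct = set()
    total = 0
    for tokens in utterances:
        ngrams = [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]
        distinct.update(ngrams)
        total += len(ngrams)
    ratio = len(distinct) / total if total else 0.0
    return len(distinct), ratio


# Toy system outputs (hypothetical): two identical generic replies and one specific reply.
responses = [
    "i do not know".split(),
    "i do not know".split(),
    "see you at the station".split(),
]

for n in (1, 2):
    count, ratio = distinct_n(responses, n)
    print(f"distinct-{n}: {count} n-grams, ratio {ratio:.2f}")
```

Repeated generic replies add to the total but not to the distinct set, so they lower the ratio.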

Title: Trading Off Diversity and Quality in Natural Language Generation
Authors: Hugh Zhang, Daniel Duckworth, Daphne Ippolito, Arvind Neelakantan
Type: publication
Brief description: The authors address natural language generation and study the tradeoff between generation quality and diversity. In their experiments they use Shannon's entropy, a measure that reflects both richness and balance.
Provided by: Agata Savary
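
As a rough illustration of how Shannon's entropy responds to richness and balance, here is a small sketch over token frequencies. The function name `shannon_entropy` and the toy samples are assumptions; the paper's exact experimental setup is not reproduced here.

```python
import math
from collections import Counter


def shannon_entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of `items`."""
    counts = Counter(items)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


# Hypothetical samples: same size, different richness and balance.
balanced_rich = ["a", "b", "c", "d"]   # 4 types, evenly distributed
balanced_poor = ["a", "a", "b", "b"]   # 2 types, evenly distributed
skewed = ["a", "a", "a", "b"]          # 2 types, unevenly distributed

for name, sample in [("balanced_rich", balanced_rich),
                     ("balanced_poor", balanced_poor),
                     ("skewed", skewed)]:
    print(name, round(shannon_entropy(sample), 3))
```

Entropy grows with the number of types (richness) and, for a fixed number of types, is highest when they are evenly distributed (balance).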

Title: SemEval-2016 Task 1: Semantic Textual Similarity, Monolingual and Cross-Lingual Evaluation
Authors: Eneko Agirre, Carmen Banea, Daniel Cer, Mona Diab, Aitor Gonzalez-Agirre, Rada Mihalcea, German Rigau, and Janyce Wiebe
Type: publication
Brief description: The authors describe a SemEval shared task on estimating the degree of semantic equivalence between two snippets of text. To select datasets in which the range of topics would be diverse enough, they use the measure of word embedding similarity, i.e. the average cosine distance between utterance embeddings.
Provided by: Agata Savary
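
The average cosine distance between utterance embeddings can be sketched as below. This is only an illustration under stated assumptions: the embeddings are hypothetical two-dimensional vectors, the average is taken over all unordered pairs, and how the task organisers obtained and aggregated their embeddings is not specified here.

```python
import math
from itertools import combinations


def cosine_distance(u, v):
    """1 - cosine similarity of two non-zero vectors given as sequences of floats."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def average_pairwise_cosine_distance(vectors):
    """Mean cosine distance over all unordered pairs (0.0 if fewer than two vectors)."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine_distance(u, v) for u, v in pairs) / len(pairs)


# Hypothetical utterance embeddings: two identical directions and one orthogonal.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
print(round(average_pairwise_cosine_distance(embeddings), 3))
```

A larger average distance indicates that the utterances are spread more widely in the embedding space.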

Title: Texygen: A Benchmarking Platform for Text Generation Models
Authors: Yaoming Zhu, Sidi Lu, Lei Zheng, Jiaxian Guo, Weinan Zhang, Jun Wang, and Yong Yu
Type: publication
Brief description: The authors describe Texygen, a benchmarking platform to support open-domain text generation. It covers a set of metrics that evaluate the diversity, quality and consistency of the generated texts. To measure diversity they use Self-BLEU, i.e. the BLEU measure computed for each generated utterance against the other generated utterances rather than against a reference; a lower score indicates higher diversity.
Provided by: Agata Savary
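
The idea behind Self-BLEU can be sketched with a simplified sentence-level BLEU. This is not Texygen's implementation: the helper names, the restriction to unigrams and bigrams, the uniform weights and the absence of smoothing are all assumptions made to keep the example short.

```python
import math
from collections import Counter


def ngram_counts(tokens, n):
    """Multiset of the n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def simple_bleu(hypothesis, references, max_n=2):
    """Simplified BLEU: clipped n-gram precisions, uniform weights, no smoothing."""
    log_precisions = []
    for n in range(1, max_n + 1):
        hyp = ngram_counts(hypothesis, n)
        if not hyp:
            return 0.0
        max_ref = Counter()
        for ref in references:
            max_ref |= ngram_counts(ref, n)      # element-wise maximum over references
        clipped = sum((hyp & max_ref).values())  # element-wise minimum = clipped matches
        if clipped == 0:
            return 0.0
        log_precisions.append(math.log(clipped / sum(hyp.values())))
    # Brevity penalty against the reference length closest to the hypothesis length.
    hyp_len = len(hypothesis)
    ref_len = min((len(r) for r in references), key=lambda l: (abs(l - hyp_len), l))
    penalty = 1.0 if hyp_len >= ref_len else math.exp(1 - ref_len / hyp_len)
    return penalty * math.exp(sum(log_precisions) / max_n)


def self_bleu(utterances, max_n=2):
    """Average BLEU of each utterance scored against all the other utterances."""
    if len(utterances) < 2:
        return 0.0
    scores = [
        simple_bleu(hyp, utterances[:i] + utterances[i + 1:], max_n)
        for i, hyp in enumerate(utterances)
    ]
    return sum(scores) / len(scores)


# Hypothetical generated samples.
repetitive = ["the cat sat".split()] * 3
varied = ["the cat sat".split(), "dogs bark loudly".split(), "rain fell today".split()]
print(self_bleu(repetitive), self_bleu(varied))
```

Identical outputs score at the maximum, while outputs that share no n-grams score zero.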

Title: Semantic Diversity for Natural Language Understanding Evaluation in Dialog Systems
Authors: Enrico Palumbo, Andrea Mezzalira, Cristina Marco, Alessandro Manzotti, and Daniele Amberti
Type: publication
Brief description: The authors address Natural Language Understanding in dialog systems and tackle the problem of creating a test set of utterances that covers a diversity of possible customer requests. The diversity measures are: Self-BLEU (as in Zhu et al. 2018), Jaccard (average word overlap across test utterances) and Word Embedding Diversity (average cosine distance between embeddings of utterances in the test set).
Provided by: Agata Savary
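
The Jaccard measure described above can be sketched as an average pairwise word-set overlap. This is a minimal illustration: the function names, the whitespace tokenisation and the toy test set are assumptions, and the paper's exact formulation may differ.

```python
from itertools import combinations


def jaccard(tokens_a, tokens_b):
    """Jaccard similarity of the word sets of two tokenised utterances."""
    set_a, set_b = set(tokens_a), set(tokens_b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def average_pairwise_jaccard(utterances):
    """Mean Jaccard similarity over all unordered pairs of utterances."""
    pairs = list(combinations(utterances, 2))
    if not pairs:
        return 0.0
    return sum(jaccard(a, b) for a, b in pairs) / len(pairs)


# Hypothetical test set of customer requests.
test_set = [
    "book a flight".split(),
    "book a hotel".split(),
    "cancel my order".split(),
]
print(round(average_pairwise_jaccard(test_set), 3))
```

Because this value measures overlap, a lower average corresponds to a more diverse test set.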

Title: TextBox: A unified, modularized, and extensible framework for text generation
Authors: Junyi Li, Tianyi Tang, Gaole He, Jinhao Jiang, Xiaoxuan Hu, Puzhao Xie, Zhipeng Chen, Zhuohao Yu, Wayne Xin Zhao, and Ji-Rong Wen
Type: publication
Brief description: The authors introduce a text generation toolbox. It contains an evaluation module with various evaluation measures implemented, including some diversity measures.
Provided by: Agata Savary

Title: Proceedings of the Workshop on Computational Linguistics for Linguistic Complexity (CL4LC)
Authors: Dominique Brunato, Felice Dell’Orletta, Giulia Venturi, Thomas François, Philippe Blache
Type: publication
Brief description: Workshop on linguistic complexity, a notion somewhat similar to diversity but addressed with different objectives (e.g. language learning or text simplification).
Provided by: Agata Savary