diff --git a/data/xml/2020.emnlp.xml b/data/xml/2020.emnlp.xml
index 367a25d63f..bc887f33ae 100644
--- a/data/xml/2020.emnlp.xml
+++ b/data/xml/2020.emnlp.xml
@@ -3817,11 +3817,13 @@
RyanCotterell
3138–3153
The question of how to probe contextual word representations in a way that is principled and useful has seen significant recent attention. In our contribution to this discussion, we argue, first, for a probe metric that reflects the trade-off between probe complexity and performance: the Pareto hypervolume. To measure complexity, we present a number of parametric and non-parametric metrics. Our experiments with such metrics show that probes’ performance curves often fail to align with widely accepted rankings between language representations (with, e.g., non-contextual representations outperforming contextual ones). These results lead us to argue, second, that common simplistic probe tasks, such as POS labeling and dependency arc labeling, are inadequate to evaluate the properties encoded in contextual word representations. We propose full dependency parsing as an example probe task, and demonstrate it with the Pareto hypervolume. In support of our arguments, the results of this illustrative experiment conform more closely to accepted rankings among contextual word representations.
- 2020.emnlp-main.254
+ 2020.emnlp-main.254
10.18653/v1/2020.emnlp-main.254
pimentel-etal-2020-pareto
rycolab/pareto-probing
+
+ Updated appendix.
Interpretation of NLP models through input marginalization
diff --git a/data/xml/2022.acl.xml b/data/xml/2022.acl.xml
index 76fe144b0c..119e902fe8 100644
--- a/data/xml/2022.acl.xml
+++ b/data/xml/2022.acl.xml
@@ -7691,11 +7691,11 @@ in the Case of Unambiguous Gender
From Simultaneous to Streaming Machine Translation by Leveraging Streaming History
- JavierIranzo Sanchez
+ JavierIranzo-Sánchez
JorgeCivera
- AlfonsJuan-Císcar
+ AlfonsJuan
6972-6985
- Simultaneous machine translation has recently gained traction thanks to significant quality improvements and the advent of streaming applications. Simultaneous translation systems need to find a trade-off between translation quality and response time, and with this purpose multiple latency measures have been proposed. However, latency evaluations for simultaneous translation are estimated at the sentence level, not taking into account the sequential nature of a streaming scenario. Indeed, these sentence-level latency measures are not well suited for continuous stream translation, resulting in figures that are not coherent with the simultaneous translation policy of the system being assessed. This work proposes a stream-level adaptation of the current latency measures based on a re-segmentation approach applied to the output translation, that is successfully evaluated on streaming conditions for a reference IWSLT task
+ Simultaneous Machine Translation is the task of incrementally translating an input sentence before it is fully available. Currently, simultaneous translation is carried out by translating each sentence independently of the previously translated text. More generally, Streaming MT can be understood as an extension of Simultaneous MT to the incremental translation of a continuous input text stream. In this work, a state-of-the-art simultaneous sentence-level MT system is extended to the streaming setup by leveraging the streaming history. Extensive empirical results are reported on IWSLT Translation Tasks, showing that leveraging the streaming history leads to significant quality gains. In particular, the proposed system proves to compare favorably to the best performing systems.
2022.acl-long.480
2022.acl-long.480.software.zip
iranzo-sanchez-etal-2022-simultaneous
diff --git a/data/xml/2022.emnlp.xml b/data/xml/2022.emnlp.xml
index 4faab06072..a301634db9 100644
--- a/data/xml/2022.emnlp.xml
+++ b/data/xml/2022.emnlp.xml
@@ -9701,7 +9701,7 @@
KOLD: Korean Offensive Language Dataset
- YounghoonJeongKAIST (Korea Advanced Institute of Science and Technology)
+ YounghunJeongKAIST (Korea Advanced Institute of Science and Technology)
JuhyunOhIndependent Researcher
JongwonLeeSamsung Research
JaimeenAhnIndependent Researcher
@@ -10260,13 +10260,13 @@
10.18653/v1/2022.emnlp-main.787
- Attentional Probe: Estimating a Module’s Functional Potential
+ The Architectural Bottleneck Principle
TiagoPimentelUniversity of Cambridge
JosefValvodaUniversity of Cambridge
NiklasStoehrETH Zurich
RyanCotterellETH Zürich
11459-11472
-
+ In this paper, we seek to measure how much information a component in a neural network could extract from the representations fed into it. Our work stands in contrast to prior probing work, most of which investigates how much information a model's representations contain. This shift in perspective leads us to propose a new principle for probing, the architectural bottleneck principle: In order to estimate how much information a given component could extract, a probe should look exactly like the component. Relying on this principle, we estimate how much syntactic information is available to transformers through our attentional probe, a probe that exactly resembles a transformer's self-attention head. Experimentally, we find that, in three models (BERT, ALBERT, and RoBERTa), a sentence's syntax tree is mostly extractable by our probe, suggesting these models have access to syntactic information while composing their contextual representations. Whether this information is actually used by these models, however, remains an open question.
2022.emnlp-main.788
pimentel-etal-2022-attentional
10.18653/v1/2022.emnlp-main.788
diff --git a/data/xml/2022.inlg.xml b/data/xml/2022.inlg.xml
index 61792352a7..3c7239ed3a 100644
--- a/data/xml/2022.inlg.xml
+++ b/data/xml/2022.inlg.xml
@@ -578,7 +578,7 @@
DialogSum Challenge: Results of the Dialogue Summarization Shared Task
YulongChen
NaihaoDeng
- YangLiu
+ YangLiu
YueZhang
94-103
We report the results of DialogSum Challenge, the shared task on summarizing real-life scenario dialogues at INLG 2022. Four teams participate in this shared task and three submit their system reports, exploring different methods to improve the performance of dialogue summarization. Although there is a great improvement over the baseline models regarding automatic evaluation metrics, such as ROUGE scores, we find that there is a salient gap between model generated outputs and human annotated summaries by human evaluation from multiple aspects. These findings demonstrate the difficulty of dialogue summarization and suggest that more fine-grained evaluation metrics are in need.
diff --git a/data/xml/2023.acl.xml b/data/xml/2023.acl.xml
index 5498b6da30..e69acb4d56 100644
--- a/data/xml/2023.acl.xml
+++ b/data/xml/2023.acl.xml
@@ -3657,7 +3657,7 @@
PengchengHeMicrosoft
BaolinPengTencent AI Lab
SongWangMicrosoft Azure AI
- YangLiuMicrosoft
+ YangLiuMicrosoft
RuochenXuMicrosoft
HanyHassanMicrosoft
YuShiMicrosoft
@@ -6827,9 +6827,11 @@
NanyunPengUniversity of California, Los Angeles
9235-9254
Automatic melody-to-lyric generation is a task in which song lyrics are generated to go with a given melody. It is of significant practical interest and more challenging than unconstrained lyric generation as the music imposes additional constraints onto the lyrics. The training data is limited as most songs are copyrighted, resulting in models that underfit the complicated cross-modal relationship between melody and lyrics. In this work, we propose a method for generating high-quality lyrics without training on any aligned melody-lyric data. Specifically, we design a hierarchical lyric generation framework that first generates a song outline and second the complete lyrics. The framework enables disentanglement of training (based purely on text) from inference (melody-guided text generation) to circumvent the shortage of parallel data. We leverage the segmentation and rhythm alignment between melody and lyrics to compile the given melody into decoding constraints as guidance during inference. The two-step hierarchical design also enables content control via the lyric outline, a much-desired feature for democratizing collaborative song creation. Experimental results show that our model can generate high-quality lyrics that are more on-topic, singable, intelligible, and coherent than strong baselines, for example SongMASS, a SOTA model trained on a parallel dataset, with a 24% relative overall quality improvement based on human ratings. Our code is available at https://github.com/amazon-science/unsupervised-melody-to-lyrics-generation.
- 2023.acl-long.513
+ 2023.acl-long.513
tian-etal-2023-unsupervised
10.18653/v1/2023.acl-long.513
+
+ Added description of authors contributions.
Causality-aware Concept Extraction based on Knowledge-guided Prompting
@@ -6917,9 +6919,11 @@
YueZhangWestlake University
9332-9351
Most existing cross-lingual summarization (CLS) work constructs CLS corpora by simply and directly translating pre-annotated summaries from one language to another, which can contain errors from both summarization and translation processes. To address this issue, we propose ConvSumX, a cross-lingual conversation summarization benchmark, through a new annotation schema that explicitly considers source input context. ConvSumX consists of 2 sub-tasks under different real-world scenarios, with each covering 3 language directions. We conduct thorough analysis on ConvSumX and 3 widely-used manually annotated CLS corpora and empirically find that ConvSumX is more faithful towards input text. Additionally, based on the same intuition, we propose a 2-Step method, which takes both conversation and summary as input to simulate human annotation process. Experimental results show that 2-Step method surpasses strong baselines on ConvSumX under both automatic and human evaluation. Analysis shows that both source input text and summary are crucial for modeling cross-lingual summaries.
- 2023.acl-long.519
+ 2023.acl-long.519
chen-etal-2023-revisiting
10.18653/v1/2023.acl-long.519
+
+ Correct acknowledgement.
Learning Dynamic Contextualised Word Embeddings via Template-based Temporal Adaptation
@@ -9612,7 +9616,7 @@
UniSumm and SummZoo: Unified Model and Diverse Benchmark for Few-Shot Summarization
YulongChenZhejiang University, Westlake University
- YangLiuMicrosoft
+ YangLiuMicrosoft
RuochenXuMicrosoft
ZiyiYangMicrosoft Research
ChenguangZhuMicrosoft Cognitive Services Research Group
diff --git a/data/xml/2023.arabicnlp.xml b/data/xml/2023.arabicnlp.xml
index 697f5303bc..5cc80a5947 100644
--- a/data/xml/2023.arabicnlp.xml
+++ b/data/xml/2023.arabicnlp.xml
@@ -363,7 +363,7 @@
In-Context Meta-Learning vs. Semantic Score-Based Similarity: A Comparative Study in Arabic Short Answer Grading
MennaFateen
- TsunenoriMina
+ TsunenoriMine
350-358
Delegating short answer grading to automated systems enhances efficiency, giving teachers more time for vital human-centered aspects of education. Studies in automatic short answer grading (ASAG) approach the problem from instance-based or reference-based perspectives. Recent studies have favored instance-based methods, but they demand substantial data for training, which is often scarce in classroom settings. This study compares both approaches using an Arabic ASAG dataset. We employ in-context meta-learning for instance-based and semantic score-based similarity for reference-based grading. Results show both methods outperform a baseline and occasionally even surpass human raters when grading unseen answers. Notably, the semantic score-based similarity approach excels in zero-shot settings, outperforming in-context meta-learning. Our work contributes insights to Arabic ASAG and introduces a prompt category classification model, leveraging GPT3.5 to augment Arabic data for improved performance.
2023.arabicnlp-1.28
@@ -662,7 +662,7 @@
Itri Amigos at ArAIEval Shared Task: Transformer vs. Compression-Based Models for Persuasion Techniques and Disinformation Detection
JehadOumer
NoumanAhmed
- NataliaManrique
+ NataliaFlechas Manrique
543-548
Social media has significantly amplified the dissemination of misinformation. Researchers have employed natural language processing and machine learning techniques to identify and categorize false information on these platforms. While there is a well-established body of research on detecting fake news in English and Latin languages, the study of Arabic fake news detection remains limited. This paper describes the methods used to tackle the challenges of the ArAIEval shared Task 2023. We conducted experiments with both monolingual Arabic and multi-lingual pre-trained Language Models (LM). We found that the monolingual Arabic models outperformed in all four subtasks. Additionally, we explored a novel lossless compression method, which, while not surpassing pretrained LM performance, presents an intriguing avenue for future experimentation to achieve comparable results in a more efficient and rapid manner.
2023.arabicnlp-1.53
@@ -682,7 +682,7 @@
UL & UM6P at ArAIEval Shared Task: Transformer-based model for Persuasion Techniques and Disinformation detection in Arabic
SalimaLamsiyah
- AbdelkaderMahdaouy
+ AbdelkaderEl Mahdaouy
HamzaAlami
IsmailBerrada
ChristophSchommer
@@ -1050,7 +1050,7 @@
UM6P & UL at WojoodNER shared task: Improving Multi-Task Learning for Flat and Nested Arabic Named Entity Recognition
- AbdelkaderMahdaouy
+ AbdelkaderEl Mahdaouy
SalimaLamsiyah
HamzaAlami
ChristophSchommer
diff --git a/data/xml/2023.banglalp.xml b/data/xml/2023.banglalp.xml
index 2b1c2a9471..962d2cb579 100644
--- a/data/xml/2023.banglalp.xml
+++ b/data/xml/2023.banglalp.xml
@@ -76,8 +76,8 @@
SourabrataMukherjee
AkankshaBansal
PrithaMajumdar
- AtulOjhaUniversity of Galway, Ireland, Insight SFI Research Centre for Data Analytics, DSI, University of Galway, Ireland and Panlingua Languague Processing LLP, India
- OndrejDusekCharles University, Prague
+ Atul Kr.OjhaUniversity of Galway, Ireland, Insight SFI Research Centre for Data Analytics, DSI, University of Galway, Ireland and Panlingua Language Processing LLP, India
+ OndřejDušekCharles University, Prague
34-47
Text style transfer (TST) involves modifying the linguistic style of a given text while retaining its core content. This paper addresses the challenging task of text style transfer in the Bangla language, which is low-resourced in this area. We present a novel Bangla dataset that facilitates text sentiment transfer, a subtask of TST, enabling the transformation of positive sentiment sentences to negative and vice versa. To establish a high-quality base for further research, we refined and corrected an existing English dataset of 1,000 sentences for sentiment transfer based on Yelp reviews, and we introduce a new human-translated Bangla dataset that parallels its English counterpart. Furthermore, we offer multiple benchmark models that serve as a validation of the dataset and baseline for further research.
2023.banglalp-1.5
@@ -356,8 +356,8 @@
UFAL-ULD at BLP-2023 Task 1: Violence Detection in Bangla Text
SourabrataMukherjee
- AtulOjhaUniversity of Galway, Ireland, Insight SFI Research Centre for Data Analytics, DSI, University of Galway, Ireland and Panlingua Languague Processing LLP, India
- OndrejDusekCharles University, Prague
+ Atul Kr.OjhaUniversity of Galway, Ireland, Insight SFI Research Centre for Data Analytics, DSI, University of Galway, Ireland and Panlingua Language Processing LLP, India
+ OndřejDušekCharles University, Prague
220-224
In this paper, we present UFAL-ULD team’s system, designed as a part of the BLP Shared Task 1: Violence Inciting Text Detection (VITD). This task aims to classify text, with a particular challenge of identifying incitement to violence into Direct, Indirect or Non-violence levels. We experimented with several pre-trained sequence classification models, including XLM-RoBERTa, BanglaBERT, Bangla BERT Base, and Multilingual BERT. Our best-performing model was based on the XLM-RoBERTa-base architecture, which outperformed the baseline models. Our system was ranked 20th among the 27 teams that participated in the task.
2023.banglalp-1.27
@@ -571,8 +571,8 @@
UFAL-ULD at BLP-2023 Task 2 Sentiment Classification in Bangla Text
SourabrataMukherjee
- AtulOjhaUniversity of Galway, Ireland, Insight SFI Research Centre for Data Analytics, DSI, University of Galway, Ireland and Panlingua Languague Processing LLP, India
- OndrejDusekCharles University, Prague
+ Atul Kr.OjhaUniversity of Galway, Ireland, Insight SFI Research Centre for Data Analytics, DSI, University of Galway, Ireland and Panlingua Language Processing LLP, India
+ OndřejDušekCharles University, Prague
336-339
In this paper, we present the UFAL-ULD team’s system for the BLP Shared Task 2: Sentiment Analysis of Bangla Social Media Posts. The Task 2 involves classifying text into Positive, Negative, or Neutral sentiments. As a part of this task, we conducted a series of experiments with several pre-trained sequence classification models – XLM-RoBERTa, BanglaBERT, Bangla BERT Base and Multilingual BERT. Among these, our best-performing model was based on the XLM-RoBERTa-base architecture, which outperforms baseline models. Our system was ranked 19th among the 30 teams that participated in the task.
2023.banglalp-1.45
diff --git a/data/xml/2023.bigpicture.xml b/data/xml/2023.bigpicture.xml
index d72d7d1659..5425efd52c 100644
--- a/data/xml/2023.bigpicture.xml
+++ b/data/xml/2023.bigpicture.xml
@@ -9,7 +9,7 @@
SebastianRuder
NoahA. Smith
Association for Computational Linguistics
- Singapore, Singapore
+ Singapore
December
2023
2023.bigpicture-1
diff --git a/data/xml/2023.ccl.xml b/data/xml/2023.ccl.xml
index b4b578d3ff..6a35a6ac1c 100644
--- a/data/xml/2023.ccl.xml
+++ b/data/xml/2023.ccl.xml
@@ -368,7 +368,8 @@
基于动态常识推理与多维语义特征的幽默识别(Humor Recognition based on Dynamically Commonsense Reasoning and Multi-Dimensional Semantic Features)
TuerxunTunike
- DongyuLin, Hongfei anf Zhang
+ HongfeiLin
+ DongyuZhang
LiangYang
ChangrongMin
吐尔逊吐妮可
diff --git a/data/xml/2023.conll.xml b/data/xml/2023.conll.xml
index 192fea3a51..63e223bfb4 100644
--- a/data/xml/2023.conll.xml
+++ b/data/xml/2023.conll.xml
@@ -298,7 +298,7 @@
TomKouwenhoven
Wernerde Valk
MarcoSpruit
- PetervanderPutten
+ Petervan der Putten
389–402
To what degree should we ascribe cognitive capacities to Large Language Models (LLMs), such as the ability to reason about intentions and beliefs known as Theory of Mind (ToM)? Here we add to this emerging debate by (i) testing 11 base- and instruction-tuned LLMs on capabilities relevant to ToM beyond the dominant false-belief paradigm, including non-literal language usage and recursive intentionality; (ii) using newly rewritten versions of standardized tests to gauge LLMs’ robustness; (iii) prompting and scoring for open besides closed questions; and (iv) benchmarking LLM performance against that of children aged 7-10 on the same tasks. We find that instruction-tuned LLMs from the GPT family outperform other models, and often also children. Base-LLMs are mostly unable to solve ToM tasks, even with specialized prompting. We suggest that the interlinked evolution and development of language and ToM may help explain what instruction-tuning adds: rewarding cooperative communication that takes into account interlocutor and context. We conclude by arguing for a nuanced perspective on ToM in LLMs.
2023.conll-1.25
diff --git a/data/xml/2023.eamt.xml b/data/xml/2023.eamt.xml
index f872fef12c..e9f324d5d1 100644
--- a/data/xml/2023.eamt.xml
+++ b/data/xml/2023.eamt.xml
@@ -637,7 +637,7 @@
MaCoCu: Massive collection and curation of monolingual and bilingual data: focus on under-resourced languages
- Marta BaNón
+ MartaBañón
MălinaChichirău
MiquelEsplà-Gomis
MikelForcada
diff --git a/data/xml/2023.emnlp.xml b/data/xml/2023.emnlp.xml
index 24406ea2db..74a3d8724b 100644
--- a/data/xml/2023.emnlp.xml
+++ b/data/xml/2023.emnlp.xml
@@ -958,9 +958,10 @@
XipengQiu
1144-1156
Widely applied large language models (LLMs) can generate human-like content, raising concerns about the abuse of LLMs. Therefore, it is important to build strong AI-generated text (AIGT) detectors. Current works only consider document-level AIGT detection; therefore, in this paper, we first introduce a sentence-level detection challenge by synthesizing a dataset that contains documents that are polished with LLMs, that is, the documents contain sentences written by humans and sentences modified by LLMs. Then we propose Sequence X (Check) GPT, a novel method that utilizes log probability lists from white-box LLMs as features for sentence-level AIGT detection. These features are composed like waves in speech processing and cannot be studied by LLMs. Therefore, we build SeqXGPT based on convolution and self-attention networks. We test it in both sentence and document-level detection challenges. Experimental results show that previous methods struggle in solving sentence-level AIGT detection, while our method not only significantly surpasses baseline methods in both sentence and document-level detection challenges but also exhibits strong generalization capabilities.
- 2023.emnlp-main.73
+ 2023.emnlp-main.73
wang-etal-2023-seqxgpt
- 10.18653/v1/2023.emnlp-main.73
+
+ Fixed footnote on page 1.
QTSumm: Query-Focused Summarization over Tabular Data
@@ -1675,7 +1676,7 @@
VivesDebate-Speech: A Corpus of Spoken Argumentation to Leverage Audio Features for Argument Mining
RamonRuiz-Dolz
- JavierSanchez
+ JavierIranzo-Sánchez
2071-2077
In this paper, we describe VivesDebate-Speech, a corpus of spoken argumentation created to leverage audio features for argument mining tasks. The creation of this corpus represents an important contribution to the intersection of speech processing and argument mining communities, and one of the most complete publicly available resources in this topic. Moreover, we have performed a set of first-of-their-kind experiments which show an improvement when integrating audio features into the argument mining pipeline. The provided results can be used as a baseline for future research.
2023.emnlp-main.128
@@ -1908,7 +1909,7 @@
The Shifted and The Overlooked: A Task-oriented Investigation of User-GPT Interactions
SiruOuyang
ShuohangWang
- YangLiu
+ YangLiu
MingZhong
YizhuJiao
DanIter
@@ -2007,7 +2008,7 @@
G-Eval: NLG Evaluation using Gpt-4 with Better Human Alignment
- YangLiu
+ YangLiu
DanIter
YichongXu
ShuohangWang
@@ -2186,7 +2187,7 @@
Indicative Summarization of Long Discussions
ShahbazSyed
DominikSchwabe
- KhalidKhatib
+ KhalidAl-Khatib
MartinPotthast
2752-2788
Online forums encourage the exchange and discussion of different stances on many topics. Not only do they provide an opportunity to present one’s own arguments, but may also gather a broad cross-section of others’ arguments. However, the resulting long discussions are difficult to overview. This paper presents a novel unsupervised approach using large language models (LLMs) to generate indicative summaries for long discussions that basically serve as tables of contents. Our approach first clusters argument sentences, generates cluster labels as abstractive summaries, and classifies the generated cluster labels into argumentation frames resulting in a two-level summary. Based on an extensively optimized prompt engineering approach, we evaluate 19 LLMs for generative cluster labeling and frame classification. To evaluate the usefulness of our indicative summaries, we conduct a purpose-driven user study via a new visual interface called **Discussion Explorer**: It shows that our proposed indicative summaries serve as a convenient navigation tool to explore long discussions.
@@ -2203,9 +2204,11 @@
HarksooKim
2789-2799
Most research on multimodal open-domain dialogue agents has focused on pretraining and multi-task learning using additional rich datasets beyond a given target dataset. However, methods for exploiting these additional datasets can be quite limited in real-world settings, creating a need for more efficient methods for constructing agents based solely on the target dataset. To address these issues, we present a new learning strategy called vision-language warm-up tasks for multimodal dialogue models (VLAW-MDM). This strategy does not require the use of large pretraining or multi-task datasets but rather relies solely on learning from target data. Moreover, our proposed approach automatically generates captions for images and incorporates them into the model’s input to improve the contextualization of visual information. Using this novel approach, we empirically demonstrate that our learning strategy is effective for limited data and relatively small models. The results show that our method achieved comparable and in some cases superior performance compared to existing state-of-the-art models on various evaluation metrics.
- 2023.emnlp-main.167
+ 2023.emnlp-main.167
lee-etal-2023-framework
- 10.18653/v1/2023.emnlp-main.167
+
+ Fixed the sponsor in the Acknowledgments section.
Once is Enough: A Light-Weight Cross-Attention for Fast Sentence Pair Modeling
@@ -3113,7 +3116,7 @@
COVID-19 Vaccine Misinformation in Middle Income Countries
JonginKim
- ByeoBak
+ Byeo RheeBak
AdityaAgrawal
JiaxiWu
VeronikaWirtz
@@ -5075,7 +5078,7 @@
BasahaCorpus: An Expanded Linguistic Resource for Readability Assessment in Central Philippine Languages
- JosephImperial
+ Joseph MarvinImperial
EkaterinaKochmar
6302-6309
Current research on automatic readability assessment (ARA) has focused on improving the performance of models in high-resource languages such as English. In this work, we introduce and release BasahaCorpus as part of an initiative aimed at expanding available corpora and baseline models for readability assessment in lower resource languages in the Philippines. We compiled a corpus of short fictional narratives written in Hiligaynon, Minasbate, Karay-a, and Rinconada—languages belonging to the Central Philippine family tree subgroup—to train ARA models using surface-level, syllable-pattern, and n-gram overlap features. We also propose a new hierarchical cross-lingual modeling approach that takes advantage of a language’s placement in the family tree to increase the amount of available training data. Our study yields encouraging results that support previous work showcasing the efficacy of cross-lingual models in low-resource settings, as well as similarities in highly informative linguistic features for mutually intelligible languages.
@@ -5612,9 +5615,10 @@
YuanjunLaili
6932-6953
Large Language Models (LLMs) have achieved remarkable success in many formal language oriented tasks, such as structural data-to-text and semantic parsing. However, current benchmarks mostly follow the data distribution of the pre-training data of LLMs. Therefore, a natural question arises: do LLMs really understand the structured semantics of formal languages? In this paper, we investigate this problem on a special case, converse binary relation. We introduce a new benchmark ConvRe focusing on converse relations, which contains 17 relations and 1240 triples extracted from popular knowledge graph completion datasets. Our ConvRe features two tasks, Re2Text and Text2Re, which are formulated as multi-choice question answering to evaluate LLMs’ ability to determine the matching between relations and associated text. For the evaluation protocol, apart from different prompting methods, we further introduce variants to the test text and few-shot example text. We conduct experiments on three popular LLM families and have observed various scaling trends. The results suggest that LLMs often resort to shortcut learning and still face challenges on our proposed benchmark.
- 2023.emnlp-main.429
+ 2023.emnlp-main.429
qi-etal-2023-investigation
- 10.18653/v1/2023.emnlp-main.429
+
+ This revision corrects the copying error in Table 7.
Towards Low-Resource Automatic Program Repair with Meta-Learning and Pretrained Language Models
@@ -5675,9 +5679,11 @@
RyanCotterell
7011-7034
This work investigates the computational expressivity of language models (LMs) based on recurrent neural networks (RNNs). Siegelmann and Sontag (1992) famously showed that RNNs with rational weights and hidden states and unbounded computation time are Turing complete. However, LMs define weightings over strings in addition to just (unweighted) language membership and the analysis of the computational power of RNN LMs (RLMs) should reflect this. We extend the Turing completeness result to the probabilistic case, showing how a rationally weighted RLM with unbounded computation time can simulate any deterministic probabilistic Turing machine (PTM) with rationally weighted transitions. Since, in practice, RLMs work in real-time, processing a symbol at every time step, we treat the above result as an upper bound on the expressivity of RLMs. We also provide a lower bound by showing that under the restriction to real-time computation, such models can simulate deterministic real-time rational PTMs.
- 2023.emnlp-main.434
+ 2023.emnlp-main.434
nowak-etal-2023-representational
10.18653/v1/2023.emnlp-main.434
+
+ Required that the weighting functions in definition are non-negative.
A Mechanistic Interpretation of Arithmetic Reasoning in Language Models using Causal Mediation Analysis
@@ -6125,7 +6131,7 @@
WiCE: Real-World Entailment for Claims in Wikipedia
RyoKamoi
TanyaGoyal
- JuanRodriguez
+ JuanDiego Rodriguez
GregDurrett
7561-7583
Textual entailment models are increasingly applied in settings like fact-checking, presupposition verification in question answering, or summary evaluation. However, these represent a significant domain shift from existing entailment datasets, and models underperform as a result. We propose WiCE, a new fine-grained textual entailment dataset built on natural claim and evidence pairs extracted from Wikipedia. In addition to standard claim-level entailment, WiCE provides entailment judgments over sub-sentence units of the claim, and a minimal subset of evidence sentences that support each subclaim. To support this, we propose an automatic claim decomposition strategy using GPT-3.5 which we show is also effective at improving entailment models’ performance on multiple datasets at test time. Finally, we show that real claims in our dataset involve challenging verification and retrieval problems that existing models fail to address.
@@ -7912,14 +7918,14 @@
10.18653/v1/2023.emnlp-main.607
- A Video Is Worth 4096 Tokens: Verbalize Story Videos To Understand Them In Zero Shot
+ A Video Is Worth 4096 Tokens: Verbalize Videos To Understand Them In Zero Shot
AanishaBhattacharyya
- YamanSingla
+ Yaman KSingla
BalajiKrishnamurthy
- RajivShah
+ Rajiv RatnShah
ChangyouChen
9822-9839
- Multimedia content, such as advertisements and story videos, exhibit a rich blend of creativity and multiple modalities. They incorporate elements like text, visuals, audio, and storytelling techniques, employing devices like emotions, symbolism, and slogans to convey meaning. There is a dearth of large annotated training datasets in the multimedia domain hindering the development of supervised learn-ing models with satisfactory performance for real-world applications. On the other hand, the rise of large language models (LLMs) has witnessed remarkable zero-shot performance in various natural language processing (NLP) tasks, such as emotion classification, question-answering, and topic classification. To leverage such advanced techniques to bridge this performance gap in multimedia understanding, we propose verbalizing long videos to generate their descriptions in natural language, followed by performing video-understanding tasks on the generated story as opposed to the original video. Through extensive experiments on fifteen video-understanding tasks, we demonstrate that our method, despite being zero-shot, achieves significantly better results than supervised baselines for video understanding. Furthermore, to alleviate a lack of story understanding benchmarks, we publicly release the first dataset on a crucial task in computational social science on persuasion strategy identification.
+ Multimedia content, such as advertisements and story videos, exhibit a rich blend of creativity and multiple modalities. They incorporate elements like text, visuals, audio, and storytelling techniques, employing devices like emotions, symbolism, and slogans to convey meaning. There is a dearth of large annotated training datasets in the multimedia domain hindering the development of supervised learning models with satisfactory performance for real-world applications. On the other hand, the rise of large language models (LLMs) has witnessed remarkable zero-shot performance in various natural language processing (NLP) tasks, such as emotion classification, question answering, and topic classification. To leverage such advanced techniques to bridge this performance gap in multimedia understanding, we propose verbalizing long videos to generate their descriptions in natural language, followed by performing video-understanding tasks on the generated story as opposed to the original video. Through extensive experiments on fifteen video-understanding tasks, we demonstrate that our method, despite being zero-shot, achieves significantly better results than supervised baselines for video understanding. Furthermore, to alleviate a lack of story understanding benchmarks, we publicly release the first dataset on a crucial task in computational social science on persuasion strategy identification.
2023.emnlp-main.608
bhattacharyya-etal-2023-video
10.18653/v1/2023.emnlp-main.608
@@ -8251,9 +8257,10 @@
YuzhongQu
10241-10259
While question answering over knowledge bases (KBQA) has shown progress in addressing factoid questions, KBQA with numerical reasoning remains relatively unexplored. In this paper, we focus on the complex numerical reasoning in KBQA, and propose a new task, NR-KBQA, which necessitates the ability to perform both multi-hop reasoning and numerical reasoning. We also design a logic form in Python format called PyQL to represent the reasoning process of numerical reasoning questions. To facilitate the development of NR-KBQA, we present a large NR-KBQA dataset called MarkQA, which is automatically constructed by a small set of seeds. Each question in MarkQA is annotated with its corresponding SPARQL query, alongside the step-by-step reasoning path in the QDMR format and PyQL program. Experimental results of some state-of-the-art QA methods performed on the MarkQA dataset show that complex numerical reasoning in KBQA faces great challenges.
- 2023.emnlp-main.633
+ 2023.emnlp-main.633
huang-etal-2023-markqa
- 10.18653/v1/2023.emnlp-main.633
+
+ Various fixes.
Comparing Biases and the Impact of Multilingual Training across Multiple Languages
@@ -8895,7 +8902,7 @@
Argument-based Detection and Classification of Fallacies in Political Debates
PierpaoloGoffredo
- MarianaEspinoza
+ MarianaChaves
SerenaVillata
ElenaCabrio
11101-11112
@@ -9299,7 +9306,7 @@
Generating Summaries with Controllable Readability Levels
- LeonardoRibeiro
+ Leonardo F. R.Ribeiro
MohitBansal
MarkusDreyer
11669-11687
@@ -9427,7 +9434,7 @@
Back Transcription as a Method for Evaluating Robustness of Natural Language Understanding Models to Speech Recognition Errors
MarekKubis
PawełSkórzewski
- MarcinSowańnski
+ MarcinSowański
TomaszZietkiewicz
11824-11835
In a spoken dialogue system, an NLU model is preceded by a speech recognition system that can deteriorate the performance of natural language understanding. This paper proposes a method for investigating the impact of speech recognition errors on the performance of natural language understanding models. The proposed method combines the back transcription procedure with a fine-grained technique for categorizing the errors that affect the performance of NLU models. The method relies on the usage of synthesized speech for NLU evaluation. We show that the use of synthesized speech in place of audio recording does not change the outcomes of the presented technique in a significant way.
@@ -10017,9 +10024,10 @@
PijiLi
12522-12537
Open-domain multi-turn dialogue generation encounters the significant challenge of lacking various types of knowledge from diverse sources. Existing models typically focus on identifying specific types of dialogue knowledge and utilize corresponding datasets for training. However, this approach often leads to limited generalization capabilities and increased computational resource requirements. Recently, large language models (LLMs) have shown impressive performance on natural language processing tasks. To harness the knowledge storage of LLMs, we propose a framework named KnowEE that explores multi-source multi-type knowledge from LLMs by leveraging diverse datasets and then exploits the obtained knowledge for response generation. Our framework comprises two phases: First, we leverage five external datasets encompassing various types of knowledge to extract the samples most relevant to the dialogue context, which serve as prompts to generate the corresponding type of knowledge; Second, we inject the acquired knowledge into the ongoing dialogue context in fine-grained and coarse-grained manners, which is then fed into LLMs to generate the final dialogue response. Both automatic and manual evaluation results validate the effectiveness of our framework in exploring and exploiting multi-source multi-type knowledge to generate coherent, informative, and fluent responses.
- 2023.emnlp-main.771
+ 2023.emnlp-main.771
ni-etal-2023-multi
- 10.18653/v1/2023.emnlp-main.771
+
+ Typo fix.
Focus Your Attention (with Adaptive IIR Filters)
@@ -10154,7 +10162,7 @@
Explaining Interactions Between Text Spans
- SagnikChoudhury
+ SagnikRay Choudhury
PepaAtanasova
IsabelleAugenstein
12709-12730
@@ -10548,9 +10556,10 @@
XipengQiu
13153-13187
Large language models (LLMs) can be used to serve as agents to simulate human behaviors, given the powerful ability to understand human instructions and provide high-quality generated texts. Such ability stimulates us to wonder whether LLMs can simulate a person in a higher form than simple human behaviors. Therefore, we aim to train an agent with the profile, experience, and emotional states of a specific person instead of using limited prompts to instruct the ChatGPT API. In this work, we introduce Character-LLM, which teaches LLMs to act as specific people such as Beethoven, Queen Cleopatra, Julius Caesar, etc. Our method focuses on editing profiles as experiences of a certain character and training models to be personal simulacra with these experiences. To assess the effectiveness of our approach, we build a test playground that interviews trained agents and evaluates whether the agents memorize their characters and experiences. Experimental results show interesting observations that help build future simulacra of humankind.
- 2023.emnlp-main.814
+ 2023.emnlp-main.814
shao-etal-2023-character
- 10.18653/v1/2023.emnlp-main.814
+
+ This revision corrects the footnote about the author on page 1.
Natural Language Decompositions of Implicit Content Enable Better Text Representations
@@ -11687,7 +11696,7 @@
Bridging Continuous and Discrete Spaces: Interpretable Sentence Representation Learning via Compositional Operations
- JamesHuang
+ James Y.Huang
WenlinYao
KaiqiangSong
HongmingZhang
@@ -11764,7 +11773,7 @@
UniChart: A Universal Vision-language Pretrained Model for Chart Comprehension and Reasoning
AhmedMasry
ParsaKavehzadeh
- DoLong
+ Xuan LongDo
EnamulHoque
ShafiqJoty
14662-14684
@@ -11954,7 +11963,7 @@
RidwanMahbub
IfradKhan
SamihaAnuva
- MdShahriar
+ Md ShihabShahriar
Md Tahmid RahmanLaskar
SabbirAhmed
14878-14886
@@ -12404,11 +12413,11 @@
Automatic Transcription of Handwritten Old Occitan Language
- EstebanArias
+ EstebanGarces Arias
VallariPai
MatthiasSchöffel
ChristianHeumann
- MatthiasAenmacher
+ MatthiasAßenmacher
15416-15439
While existing neural network-based approaches have shown promising results in Handwritten Text Recognition (HTR) for high-resource languages and standardized/machine-written text, their application to low-resource languages often presents challenges, resulting in reduced effectiveness. In this paper, we propose an innovative HTR approach that leverages the Transformer architecture for recognizing handwritten Old Occitan language. Given the limited availability of data, which comprises only word pairs of graphical variants and lemmas, we develop and rely on elaborate data augmentation techniques for both text and image data. Our model combines a custom-trained Swin image encoder with a BERT text decoder, which we pre-train using a large-scale augmented synthetic data set and fine-tune on the small human-labeled data set. Experimental results reveal that our approach surpasses the performance of current state-of-the-art models for Old Occitan HTR, including open-source Transformer-based models such as a fine-tuned TrOCR and commercial applications like Google Cloud Vision. To nurture further research and development, we make our models, data sets, and code publicly available.
2023.emnlp-main.953
@@ -13039,7 +13048,7 @@
ZhengZhang
ZhengNing
TobyLi
- JonathanKummerfeld
+ Jonathan K.Kummerfeld
TianyiZhang
16149-16166
Relational databases play an important role in business, science, and more. However, many users cannot fully unleash the analytical power of relational databases, because they are not familiar with database languages such as SQL. Many techniques have been proposed to automatically generate SQL from natural language, but they suffer from two issues: (1) they still make many mistakes, particularly for complex queries, and (2) they do not provide a flexible way for non-expert users to validate and refine incorrect queries. To address these issues, we introduce a new interaction mechanism that allows users to directly edit a step-by-step explanation of a query to fix errors. Our experiments on multiple datasets, as well as a user study with 24 participants, demonstrate that our approach can achieve better performance than multiple SOTA approaches. Our code and datasets are available at https://github.com/magic-YuanTian/STEPS.
@@ -13923,7 +13932,7 @@
SukannyaPurkayasthaTU Darmstadt
LeonEngländerTechnical University of Darmstadt
TimoImhofTechnical University of Darmstadt
- IvanVuliUniversity of Cambridge
+ IvanVulićUniversity of Cambridge
SebastianRuderGoogle
IrynaGurevychUKP Lab, Technische Universität Darmstadt
JonasPfeifferGoogle
diff --git a/data/xml/2023.findings.xml b/data/xml/2023.findings.xml
index 362f463572..8075b8b4bf 100644
--- a/data/xml/2023.findings.xml
+++ b/data/xml/2023.findings.xml
@@ -10372,7 +10372,7 @@
Progressive Translation: Improving Domain Robustness of Neural Machine Translation with Intermediate Sequences
ChaojunWangThe Chinese University of Hong Kong
- YangLiuMicrosoft
+ YangLiuMicrosoft
WaiLamThe Chinese University of Hong Kong
9425-9439
Previous studies show that intermediate supervision signals benefit various Natural Language Processing tasks. However, it is not clear whether there exist intermediate signals that benefit Neural Machine Translation (NMT). Borrowing techniques from Statistical Machine Translation, we propose intermediate signals which are intermediate sequences from the “source-like” structure to the “target-like” structure. Such intermediate sequences introduce an inductive bias that reflects a domain-agnostic principle of translation, which reduces spurious correlations that are harmful to out-of-domain generalisation. Furthermore, we introduce a full-permutation multi-task learning to alleviate the spurious causal relations from intermediate sequences to the target, which results from exposure bias. The Minimum Bayes Risk decoding algorithm is used to pick the best candidate translation from all permutations to further improve the performance. Experiments show that the introduced intermediate signals can effectively improve the domain robustness of NMT and reduce the amount of hallucinations on out-of-domain translation. Further analysis shows that our methods are especially promising in low-resource scenarios.
@@ -15379,7 +15379,7 @@
ReidPryzant
RuochenXu
ShuohangWang
- YangLiu
+ YangLiu
YichongXu
ChenguangZhu
1150-1162
@@ -17340,7 +17340,7 @@
10.18653/v1/2023.findings-emnlp.230
- A Word Sense Distribution-based approach for Semantic Change Prediction
+ Can Word Sense Distribution Detect Semantic Changes of Words?
XiaohangTang
YiZhou
TaichiAida
@@ -17987,9 +17987,10 @@
LluísPadró
4234-4240
In this paper, we investigate the impact of objects on gender bias in image captioning systems. Our results show that only gender-specific objects have a strong gender bias (e.g., women-lipstick). In addition, we propose a visual semantic-based gender score that measures the degree of bias and can be used as a plug-in for any image captioning system. Our experiments demonstrate the utility of the gender score, since we observe that our score can measure the bias relation between a caption and its related gender; therefore, our score can be used as an additional metric to the existing Object Gender Co-Occ approach.
- 2023.findings-emnlp.279
+ 2023.findings-emnlp.279
sabir-padro-2023-women
- 10.18653/v1/2023.findings-emnlp.279
+
+ Fixed figure 1.
FREDSum: A Dialogue Summarization Corpus for French Political Debates
@@ -18416,7 +18417,7 @@
HailinChen
WeishiWang
FangkaiJiao
- DoLong
+ Xuan LongDo
ChengweiQin
BoshengDing
XiaobaoGuo
@@ -18562,7 +18563,7 @@
Controllable Chest X-Ray Report Generation from Longitudinal Representations
- FrancescoSerra
+ FrancescoDalla Serra
ChaoyangWang
FaniDeligianni
JeffDalton
@@ -19300,7 +19301,7 @@
Representativeness as a Forgotten Lesson for Multilingual and Code-switched Data Collection and Preparation
A. SezaDoğruöz
SunayanaSitaram
- ZhengYong
+ Zheng XinYong
5751-5767
Multilingualism is widespread around the world and code-switching (CSW) is a common practice among different language pairs/tuples across locations and regions. However, there is still not much progress in building successful CSW systems, despite the recent advances in Massive Multilingual Language Models (MMLMs). We investigate the reasons behind this setback through a critical study about the existing CSW data sets (68) across language pairs in terms of the collection and preparation (e.g. transcription and annotation) stages. This in-depth analysis reveals that a) most CSW data involves English, ignoring other language pairs/tuples; b) there are flaws in terms of representativeness in data collection and preparation stages due to ignoring the location-based, socio-demographic and register variation in CSW. In addition, lack of clarity on the data selection and filtering stages shadows the representativeness of CSW data sets. We conclude by providing a short check-list to improve the representativeness for forthcoming studies involving CSW data collection and preparation.
2023.findings-emnlp.382
@@ -19683,10 +19684,10 @@
Responsible AI Considerations in Text Summarization Research: A Review of Current Practices
- YuLiu
+ Yu LuLiu
MengCao
- SuBlodgett
- JackieCheung
+ Su LinBlodgett
+ Jackie Chi KitCheung
AlexandraOlteanu
AdamTrischler
6246-6261
@@ -20278,9 +20279,10 @@
WenjunKe
6877-6892
Relation extraction (RE) consistently involves a certain degree of labeled or unlabeled data even if under zero-shot setting. Recent studies have shown that large language models (LLMs) transfer well to new tasks out-of-the-box simply given a natural language prompt, which provides the possibility of extracting relations from text without any data and parameter tuning. This work focuses on the study of exploring LLMs, such as ChatGPT, as zero-shot relation extractors. On the one hand, we analyze the drawbacks of existing RE prompts and attempt to incorporate recent prompt techniques such as chain-of-thought (CoT) to improve zero-shot RE. We propose the summarize-and-ask (SumAsk) prompting, a simple prompt recursively using LLMs to transform RE inputs to the effective question answering (QA) format. On the other hand, we conduct comprehensive experiments on various benchmarks and settings to investigate the capabilities of LLMs on zero-shot RE. Specifically, we have the following findings: (i) SumAsk consistently and significantly improves LLMs performance on different model sizes, benchmarks and settings; (ii) Zero-shot prompting with ChatGPT achieves competitive or superior results compared with zero-shot and fully supervised methods; (iii) LLMs deliver promising performance in extracting overlapping relations; (iv) The performance varies greatly regarding different relations. Different from small language models, LLMs are effective in handling the challenging none-of-the-above (NoTA) relation.
- 2023.findings-emnlp.459
+ 2023.findings-emnlp.459
li-etal-2023-revisiting-large
- 10.18653/v1/2023.findings-emnlp.459
+
+ Updated Acknowledgement.
Multi-Stage Pre-training Enhanced by ChatGPT for Multi-Scenario Multi-Domain Dialogue Summarization
@@ -20403,7 +20405,7 @@
Visually Grounded Continual Language Learning with Selective Specialization
KyraAhrens
LennartBengtson
- JaeLee
+ JaeHee Lee
StefanWermter
7037-7054
A desirable trait of an artificial agent acting in the visual world is to continually learn a sequence of language-informed tasks while striking a balance between sufficiently specializing in each task and building a generalized knowledge for transfer. Selective specialization, i.e., a careful selection of model components to specialize in each task, is a strategy to provide control over this trade-off. However, the design of selection strategies requires insights on the role of each model component in learning rather specialized or generalizable representations, which poses a gap in current research. Thus, our aim with this work is to provide an extensive analysis of selection strategies for visually grounded continual language learning. Due to the lack of suitable benchmarks for this purpose, we introduce two novel diagnostic datasets that provide enough control and flexibility for a thorough model analysis. We assess various heuristics for module specialization strategies as well as quantifiable measures for two different types of model architectures. Finally, we design conceptually simple approaches based on our analysis that outperform common continual learning baselines. Our results demonstrate the need for further efforts towards better aligning continual learning algorithms with the learning behaviors of individual model parts.
@@ -22808,7 +22810,7 @@
YichongXu
DanIter
QingkaiZeng
- YangLiu
+ YangLiu
ChenguangZhu
MengJiang
9850-9867
@@ -23440,9 +23442,11 @@
YulanHe
10559-10571
Adjusting for latent covariates is crucial for estimating causal effects from observational textual data. Most existing methods only account for confounding covariates that affect both treatment and outcome, potentially leading to biased causal effects. This bias arises from insufficient consideration of non-confounding covariates, which are relevant only to either the treatment or the outcome. In this work, we aim to mitigate the bias by unveiling interactions between different variables to disentangle the non-confounding covariates when estimating causal effects from text. The disentangling process ensures covariates only contribute to their respective objectives, enabling independence between variables. Additionally, we impose a constraint to balance representations from the treated group and control group to alleviate selection bias. We conduct experiments on two different treatment factors under various scenarios, and the proposed model significantly outperforms recent strong baselines. Furthermore, our thorough analysis on earnings call transcripts demonstrates that our model can effectively disentangle the variables, and further investigations into real-world scenarios provide guidance for investors to make informed decisions.
- 2023.findings-emnlp.709
+ 2023.findings-emnlp.709
zhou-he-2023-causal
10.18653/v1/2023.findings-emnlp.709
+
+ Corrects the tick mark typo in Table 2.
Large Language Model Is Not a Good Few-shot Information Extractor, but a Good Reranker for Hard Samples!
@@ -24694,8 +24698,8 @@
Uniform Complexity for Text Generation
- JosephImperial
- HarishMadabushi
+ Joseph MarvinImperial
+ Harish TayyarMadabushi
12025-12046
Large language models (LLMs) have shown promising results in a wide array of generative NLP tasks, such as summarization and machine translation. In the context of narrative generation, however, existing models still do not capture factors that contribute to producing consistent text. For instance, it is logical that a piece of text or a story should be uniformly readable throughout and that this form of complexity should be controllable. As such, if the complexity of an input text prompt is rated first-grade reading level in the Flesch Reading Ease test, then the generated text continuing the plot should also be within this range of complexity. With this in mind, we introduce Uniform Complexity for Text Generation (UCTG), a new benchmark test which raises the challenge of making generative models observe uniform linguistic properties with respect to prompts. We experiment with over 150 linguistically and cognitively motivated features for evaluating text complexity in humans and generative models. From our results, we find that models such as GPT-2 struggle to preserve the complexity of input prompts used in their generations, even when fine-tuned with professionally written texts.
2023.findings-emnlp.805
@@ -26269,7 +26273,7 @@
YichongXu
RuochenXu
DanIter
- YangLiu
+ YangLiu
ShuohangWang
ChenguangZhu
MichaelZeng
@@ -26574,9 +26578,10 @@
AnnemarieFriedrich
14229-14241
Generative language models have recently shown remarkable success in generating answers to questions in a given textual context. However, these answers may suffer from hallucination, wrongly cite evidence, and spread misleading information. In this work, we address this problem by employing ChatGPT, a state-of-the-art generative model, as a machine-reading system. We ask it to retrieve answers to lexically varied and open-ended questions from trustworthy instructive texts. We introduce WHERE (WikiHow Evidence REtrieval), a new high-quality evaluation benchmark of a set of WikiHow articles exhaustively annotated with evidence sentences to questions that comes with a special challenge: All questions are about the article’s topic, but not all can be answered using the provided context. We interestingly find that when using a regular question-answering prompt, ChatGPT neglects to detect the unanswerable cases. When provided with a few examples, it learns to better judge whether a text provides answer evidence or not. Alongside this important finding, our dataset defines a new benchmark for evidence retrieval in question answering, which we argue is one of the necessary next steps for making large language models more trustworthy.
- 2023.findings-emnlp.949
+ 2023.findings-emnlp.949
henning-etal-2023-answer
- 10.18653/v1/2023.findings-emnlp.949
+
+ This revision provides a corrected version of Figure 1.
PaRaDe: Passage Ranking using Demonstrations with LLMs
@@ -27005,7 +27010,7 @@
LMGQS: A Large-scale Dataset for Query-focused Summarization
RuochenXu
SongWang
- YangLiu
+ YangLiu
ShuohangWang
YichongXu
DanIter
@@ -27126,7 +27131,7 @@
Qualitative Code Suggestion: A Human-Centric Approach to Qualitative Coding
- CesarePiano
+ CesareSpinoso-Di Piano
SamiraRahimi
JackieCheung
14887-14909
diff --git a/data/xml/2023.genbench.xml b/data/xml/2023.genbench.xml
index 0bb9afcf85..4657ce4b6b 100644
--- a/data/xml/2023.genbench.xml
+++ b/data/xml/2023.genbench.xml
@@ -65,7 +65,7 @@
Evaluating Neural Language Models as Cognitive Models of Language Acquisition
- HectorJavier Vazquez MartinezUniversity of Pennsylvania
+ HéctorVázquez MartínezUniversity of Pennsylvania
AnnikaLea HeuserUniversity of Pennsylvania
CharlesYangUniversity of Pennsylvania
JordanKodnerStony Brook University
diff --git a/data/xml/2023.inlg.xml b/data/xml/2023.inlg.xml
index 047d41f619..d5bdf65766 100644
--- a/data/xml/2023.inlg.xml
+++ b/data/xml/2023.inlg.xml
@@ -274,9 +274,11 @@
AlbertGatt
293–312
Current captioning datasets focus on object-centric captions, describing the visible objects in the image, often ending up stating the obvious (for humans), e.g. “people eating food in a park”. Although these datasets are useful to evaluate the ability of Vision & Language models to recognize and describe visual content, they do not support controlled experiments involving model testing or fine-tuning, with more high-level captions, which humans find easy and natural to produce. For example, people often describe images based on the type of scene they depict (“people at a holiday resort”) and the actions they perform (“people having a picnic”). Such concepts are based on personal experience and contribute to forming common sense assumptions. We present the High-Level Dataset, a dataset extending 14997 images from the COCO dataset, aligned with a new set of 134,973 human-annotated (high-level) captions collected along three axes: scenes, actions and rationales. We further extend this dataset with confidence scores collected from an independent set of readers, as well as a set of narrative captions generated synthetically, by combining each of the three axes. We describe this dataset and analyse it extensively. We also present baseline results for the High-Level Captioning task.
- 2023.inlg-main.21
+ 2023.inlg-main.21
cafagna-etal-2023-hl
10.18653/v1/2023.inlg-main.21
+
+ Updated Acknowledgments.
Validating Predictive Models Of Evaluative Language For Controllable Data2Text Generation
diff --git a/data/xml/2023.insights.xml b/data/xml/2023.insights.xml
index 5934a7bdf1..7227af97b4 100644
--- a/data/xml/2023.insights.xml
+++ b/data/xml/2023.insights.xml
@@ -25,12 +25,53 @@
AnyaBelzADAPT Research Centre, Dublin City University
CraigThomsonUniversity of Aberdeen
EhudReiterUniversity of Aberdeen
+ GavinAbercrombieHeriot-Watt University
+ Jose M.Alonso-MoralUniversidade de Santiago de Compostela
+ MohammadArvanUniversity of Illinois Chicago
+ AnouckBraggaarTilburg University
+ MarkCieliebakZurich University of Applied Sciences
+ ElizabethClarkGoogle Research
+ Keesvan DeemterUtrecht University
+ TanviDinkarHeriot-Watt University
+ OndřejDušekCharles University Prague
+ SteffenEgerBielefeld University
+ QixiangFangUtrecht University
+ MingqiGaoPeking University
+ AlbertGattUtrecht University
+ DimitraGkatziaEdinburgh Napier University
+ JavierGonzález-CorbelleUniversidade de Santiago de Compostela
+ DirkHovyBocconi University
+ ManuelaHürlimannZurich University of Applied Sciences
+ TakumiItoTohoku University
+ John D.KelleherTechnological University Dublin
+ FilipKlubickaTechnological University Dublin
+ EmielKrahmerTilburg University
+ HuiyuanLaiGroningen University
+ Chrisvan der LeeTilburg University
+ YiruLiGroningen University
+ SaadMahamoodtrivago
+ MargotMieskesUniversity of Applied Sciences Darmstadt
+ Emielvan MiltenburgTilburg University
+ PabloMosteiroUtrecht University
+ MalvinaNissimGroningen University
+ NataliePardeUniversity of Illinois Chicago
+ OndřejPlátekCharles University Prague
+ VerenaRieserHeriot-Watt University
+ JieRuanPeking University
+ JoelTetreaultDataminr
+ AntonioToralGroningen University
+ XiaojunWanPeking University
+ LeoWannerUniversitat Pompeu Fabra
+ LewisWatsonEdinburgh Napier University
+ DiyiYangGeorgia Tech
1-10
We report our efforts in identifying a set of previous human evaluations in NLP that would be suitable for a coordinated study examining what makes human evaluations in NLP more/less reproducible. We present our results and findings, which include that just 13% of papers had (i) sufficiently low barriers to reproduction, and (ii) enough obtainable information, to be considered for reproduction, and that all but one of the experiments we selected for reproduction were discovered to have flaws that made the meaningfulness of conducting a reproduction questionable. As a result, we had to change our coordinated study design from a reproduce approach to a standardise-then-reproduce-twice approach. Our overall (negative) finding that the great majority of human evaluations in NLP are not repeatable and/or not reproducible and/or too flawed to justify reproduction, paints a dire picture, but presents an opportunity for a rethink about how to design and report human evaluations in NLP.
- 2023.insights-1.1
+ 2023.insights-1.1
belz-etal-2023-missing
10.18653/v1/2023.insights-1.1
+
Author list change.
ERATE: Efficient Retrieval Augmented Text Embeddings
diff --git a/data/xml/2023.iwcs.xml b/data/xml/2023.iwcs.xml
index cbe88ea29c..6e912e842a 100644
--- a/data/xml/2023.iwcs.xml
+++ b/data/xml/2023.iwcs.xml
@@ -288,7 +288,7 @@
opitz-etal-2023-smaragd
- AMR4NLI: Interpretable and robust NLI measures from semantic graph
+ AMR4NLI: Interpretable and robust NLI measures from semantic graphs
JuriOpitz
ShiraWein
JuliusSteen
diff --git a/data/xml/2023.jeptalnrecital.xml b/data/xml/2023.jeptalnrecital.xml
index 7cd53adf60..56e12c1a35 100644
--- a/data/xml/2023.jeptalnrecital.xml
+++ b/data/xml/2023.jeptalnrecital.xml
@@ -364,9 +364,11 @@
MathieuConstant
23–36
Au début du XXIe siècle, le français faisait encore partie des langues peu dotées. Grâce aux efforts de la communauté française du traitement automatique des langues (TAL), de nombreuses ressources librement disponibles ont été produites, dont des lexiques du français. À travers cet article, nous nous intéressons à leur devenir dans la communauté par le prisme des actes de la conférence TALN sur une période de 20 ans.
- 2023.jeptalnrecital-short.3
+ 2023.jeptalnrecital-short.3
fra
choi-etal-2023-des
+
+ This version corrects a typo in the English abstract (ill-formed translation from the original abstract in French).
Attention sur les spans pour l’analyse syntaxique en constituants
diff --git a/data/xml/2023.ldk.xml b/data/xml/2023.ldk.xml
index a9b523cc09..0ba1d971b3 100644
--- a/data/xml/2023.ldk.xml
+++ b/data/xml/2023.ldk.xml
@@ -451,6 +451,7 @@
PurificaçãoSilvano
DimitarTrajanov
Ciprian-OctavianTruica
+ Elena-SimonaApostol
ChristianChiarcos
AnnaBaczkowska
434-439
diff --git a/data/xml/2023.newsum.xml b/data/xml/2023.newsum.xml
index e1daa0fb01..11fe736c75 100644
--- a/data/xml/2023.newsum.xml
+++ b/data/xml/2023.newsum.xml
@@ -9,7 +9,7 @@
FeiLiu
GiuseppeCarenini
Association for Computational Linguistics
- Hybrid
+ Singapore
December
2023
2023.newsum-1
diff --git a/data/xml/2023.nllp.xml b/data/xml/2023.nllp.xml
index b9b6fd092d..55d86ace46 100644
--- a/data/xml/2023.nllp.xml
+++ b/data/xml/2023.nllp.xml
@@ -136,9 +136,10 @@
LakshminarayananSubramanianNew York University
85-98
This paper formulates a new task of extracting privacy parameters from a privacy policy, through the lens of Contextual Integrity (CI), an established social theory framework for reasoning about privacy norms. Through extensive experiments, we further show that incorporating CI-based domain-specific knowledge into a BERT-based SRL model results in the highest precision and recall, achieving an F1 score of 84%. With our work, we would like to motivate new research in building NLP applications for the privacy domain.
- 2023.nllp-1.10
+ 2023.nllp-1.10
shvartzshanider-etal-2023-beyond
- 10.18653/v1/2023.nllp-1.10
+
+ Typo correction in the CI labels in Section 5.4.
Towards Mitigating Perceived Unfairness in Contracts from a Non-Legal Stakeholder’s Perspective
diff --git a/data/xml/2023.nlposs.xml b/data/xml/2023.nlposs.xml
index f47452aba8..04a8bf681c 100644
--- a/data/xml/2023.nlposs.xml
+++ b/data/xml/2023.nlposs.xml
@@ -8,8 +8,8 @@
GeetickaChauhan
JeremyGwinnup
ElijahRippeth
- Empirical Methods in Natural Language Processing
- Singapore, Singapore
+ Association for Computational Linguistics
+ Singapore
December
2023
2023.nlposs-1
diff --git a/data/xml/2023.semeval.xml b/data/xml/2023.semeval.xml
index cb52317cd2..cecd51d7ef 100644
--- a/data/xml/2023.semeval.xml
+++ b/data/xml/2023.semeval.xml
@@ -2065,7 +2065,7 @@
MDC at SemEval-2023 Task 7: Fine-tuning Transformers for Textual Entailment Prediction and Evidence Retrieval in Clinical Trials
RobertBevanMedicines Discovery Catapult
- OisnTurbittMedicines Discovery Catapult
+ OisínTurbittMedicines Discovery Catapult
MouhamadAboshokorMedicines Discovery Catapult
1287-1292
We present our entry to the Multi-evidence Natural Language Inference for Clinical Trial Data task at SemEval 2023. We submitted entries for both the evidence retrieval and textual entailment sub-tasks. For the evidence retrieval task, we fine-tuned the PubMedBERT transformer model to extract relevant evidence from clinical trial data given a hypothesis concerning either a single clinical trial or pair of clinical trials. Our best performing model achieved an F1 score of 0.804. For the textual entailment task, in which systems had to predict whether a hypothesis about either a single clinical trial or pair of clinical trials is true or false, we fine-tuned the BioLinkBERT transformer model. We passed our evidence retrieval model’s output into our textual entailment model and submitted its output for the evaluation. Our best performing model achieved an F1 score of 0.695.
diff --git a/data/xml/2023.sigdial.xml b/data/xml/2023.sigdial.xml
index cb81fe1c3e..ad0c335b1b 100644
--- a/data/xml/2023.sigdial.xml
+++ b/data/xml/2023.sigdial.xml
@@ -333,6 +333,7 @@
Syndicom: Improving Conversational Commonsense with Error-Injection and Natural Language Feedback
ChristopherRichardson
+ AnirudhSundar
LarryHeck
297–308
Commonsense reasoning is a critical aspect of human communication. Despite recent advances in conversational AI driven by large language models, commonsense reasoning remains a challenging task. In this work, we introduce Syndicom - a method for improving commonsense in dialogue response generation. Syndicom consists of two components. The first component is a dataset composed of commonsense dialogues created from a knowledge graph and synthesized into natural language. This dataset includes both valid and invalid responses to dialogue contexts, along with natural language feedback (NLF) for the invalid responses. The second contribution is a two-step procedure: training a model to predict natural language feedback (NLF) for invalid responses, and then training a response generation model conditioned on the predicted NLF, the invalid response, and the dialogue. Syndicom is scalable and does not require reinforcement learning. Empirical results on three tasks are evaluated using a broad range of metrics. Syndicom achieves a relative improvement of 53% over ChatGPT on ROUGE-1, and human evaluators prefer Syndicom over ChatGPT 57% of the time. We will publicly release the code and the full dataset.
diff --git a/data/xml/2023.wsc.xml b/data/xml/2023.wsc.xml
index 4be1765d36..8706fec7c8 100644
--- a/data/xml/2023.wsc.xml
+++ b/data/xml/2023.wsc.xml
@@ -67,7 +67,7 @@
Creation of a Digital Rig Vedic Index (Anukramani) for Computational Linguistic Tasks
- A V S D SMahesh
+ V.S.D.S.MaheshAkavarapu
ArnabBhattacharya
89–96
2023.wsc-csdh.6
diff --git a/data/xml/D19.xml b/data/xml/D19.xml
index 87b110c611..3e4f6997bf 100644
--- a/data/xml/D19.xml
+++ b/data/xml/D19.xml
@@ -5348,7 +5348,7 @@
Text Summarization with Pretrained Encoders
- YangLiu
+ YangLiu
MirellaLapata
3730–3740
Bidirectional Encoder Representations from Transformers (BERT) represents the latest incarnation of pretrained language models which have recently advanced a wide range of natural language processing tasks. In this paper, we showcase how BERT can be usefully applied in text summarization and propose a general framework for both extractive and abstractive models. We introduce a novel document-level encoder based on BERT which is able to express the semantics of a document and obtain representations for its sentences. Our extractive model is built on top of this encoder by stacking several inter-sentence Transformer layers. For abstractive summarization, we propose a new fine-tuning schedule which adopts different optimizers for the encoder and the decoder as a means of alleviating the mismatch between the two (the former is pretrained while the latter is not). We also demonstrate that a two-staged fine-tuning approach can further boost the quality of the generated summaries. Experiments on three datasets show that our model achieves state-of-the-art results across the board in both extractive and abstractive settings.
diff --git a/data/xml/P17.xml b/data/xml/P17.xml
index f01ac0e69c..0162f899e2 100644
--- a/data/xml/P17.xml
+++ b/data/xml/P17.xml
@@ -2210,7 +2210,7 @@ two word-vectors results in a vector that is only a small angle away from the ve
Cora
- Universal Dependencies Parsing for Colloquial Singaporean English
+ Universal Dependencies Parsing for Colloquial Singaporean English
HongminWang
YueZhang
GuangYong LeonardChan
diff --git a/python/acl_anthology/__init__.py b/python/acl_anthology/__init__.py
index 382bf090d8..5d1c5dd731 100644
--- a/python/acl_anthology/__init__.py
+++ b/python/acl_anthology/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/anthology.py b/python/acl_anthology/anthology.py
index ebacdc3602..0761e017c7 100644
--- a/python/acl_anthology/anthology.py
+++ b/python/acl_anthology/anthology.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/collections/__init__.py b/python/acl_anthology/collections/__init__.py
index df003b3bc1..a8d8b30f21 100644
--- a/python/acl_anthology/collections/__init__.py
+++ b/python/acl_anthology/collections/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/collections/collection.py b/python/acl_anthology/collections/collection.py
index d99dda23b5..daa7c378ca 100644
--- a/python/acl_anthology/collections/collection.py
+++ b/python/acl_anthology/collections/collection.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/collections/event.py b/python/acl_anthology/collections/event.py
index 750bc8f747..b6c7ae1fc5 100644
--- a/python/acl_anthology/collections/event.py
+++ b/python/acl_anthology/collections/event.py
@@ -1,5 +1,5 @@
# Copyright 2022 Matt Post
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/collections/eventindex.py b/python/acl_anthology/collections/eventindex.py
index ca701f9a88..43140dfd28 100644
--- a/python/acl_anthology/collections/eventindex.py
+++ b/python/acl_anthology/collections/eventindex.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/collections/index.py b/python/acl_anthology/collections/index.py
index 1bf257523e..85904f2c97 100644
--- a/python/acl_anthology/collections/index.py
+++ b/python/acl_anthology/collections/index.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/collections/paper.py b/python/acl_anthology/collections/paper.py
index 4ac25483b0..acbfb3448b 100644
--- a/python/acl_anthology/collections/paper.py
+++ b/python/acl_anthology/collections/paper.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/collections/types.py b/python/acl_anthology/collections/types.py
index ef26af6639..e7ff137832 100644
--- a/python/acl_anthology/collections/types.py
+++ b/python/acl_anthology/collections/types.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/collections/volume.py b/python/acl_anthology/collections/volume.py
index c2d65c0928..43735fb646 100644
--- a/python/acl_anthology/collections/volume.py
+++ b/python/acl_anthology/collections/volume.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/config.py b/python/acl_anthology/config.py
index 4d9f0cbb0c..5d371e05e8 100644
--- a/python/acl_anthology/config.py
+++ b/python/acl_anthology/config.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/constants.py b/python/acl_anthology/constants.py
index 42d0800088..67fe017a16 100644
--- a/python/acl_anthology/constants.py
+++ b/python/acl_anthology/constants.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/containers.py b/python/acl_anthology/containers.py
index 58d1b3e92a..98a67decdd 100644
--- a/python/acl_anthology/containers.py
+++ b/python/acl_anthology/containers.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/exceptions.py b/python/acl_anthology/exceptions.py
index 0d16951c66..74830b36ef 100644
--- a/python/acl_anthology/exceptions.py
+++ b/python/acl_anthology/exceptions.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/files.py b/python/acl_anthology/files.py
index 0347356f9b..f8fded4fcc 100644
--- a/python/acl_anthology/files.py
+++ b/python/acl_anthology/files.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/people/__init__.py b/python/acl_anthology/people/__init__.py
index ba7e72f628..49063fa6de 100644
--- a/python/acl_anthology/people/__init__.py
+++ b/python/acl_anthology/people/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/people/index.py b/python/acl_anthology/people/index.py
index fb14469bc3..41b25cb993 100644
--- a/python/acl_anthology/people/index.py
+++ b/python/acl_anthology/people/index.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/people/name.py b/python/acl_anthology/people/name.py
index 4876dd0326..116dd17240 100644
--- a/python/acl_anthology/people/name.py
+++ b/python/acl_anthology/people/name.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/people/person.py b/python/acl_anthology/people/person.py
index 1f123985eb..075086c476 100644
--- a/python/acl_anthology/people/person.py
+++ b/python/acl_anthology/people/person.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/sigs.py b/python/acl_anthology/sigs.py
index 3475ca812b..9a835d2d43 100644
--- a/python/acl_anthology/sigs.py
+++ b/python/acl_anthology/sigs.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/text/__init__.py b/python/acl_anthology/text/__init__.py
index 6aa2046421..9622479a9d 100644
--- a/python/acl_anthology/text/__init__.py
+++ b/python/acl_anthology/text/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/text/markuptext.py b/python/acl_anthology/text/markuptext.py
index 2d81f67fd4..facc0e1c2e 100644
--- a/python/acl_anthology/text/markuptext.py
+++ b/python/acl_anthology/text/markuptext.py
@@ -1,4 +1,4 @@
-# Copyright 2019-2023 Marcel Bollmann
+# Copyright 2019-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/text/texmath.py b/python/acl_anthology/text/texmath.py
index 2dee8fbec1..4af8bb5cb9 100644
--- a/python/acl_anthology/text/texmath.py
+++ b/python/acl_anthology/text/texmath.py
@@ -1,4 +1,4 @@
-# Copyright 2019-2023 Marcel Bollmann
+# Copyright 2019-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/utils/__init__.py b/python/acl_anthology/utils/__init__.py
index 5d4173b990..fcf9e8520d 100644
--- a/python/acl_anthology/utils/__init__.py
+++ b/python/acl_anthology/utils/__init__.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/utils/git.py b/python/acl_anthology/utils/git.py
index f8891ee2c1..f203c75eca 100644
--- a/python/acl_anthology/utils/git.py
+++ b/python/acl_anthology/utils/git.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/utils/ids.py b/python/acl_anthology/utils/ids.py
index 6e4434fe04..b70519e9a5 100644
--- a/python/acl_anthology/utils/ids.py
+++ b/python/acl_anthology/utils/ids.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/utils/latex.py b/python/acl_anthology/utils/latex.py
index b1821357d7..a3c492f4de 100644
--- a/python/acl_anthology/utils/latex.py
+++ b/python/acl_anthology/utils/latex.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/utils/logging.py b/python/acl_anthology/utils/logging.py
index e6119ce213..ae1fabaabe 100644
--- a/python/acl_anthology/utils/logging.py
+++ b/python/acl_anthology/utils/logging.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/utils/text.py b/python/acl_anthology/utils/text.py
index d030a44a11..c4abda31ac 100644
--- a/python/acl_anthology/utils/text.py
+++ b/python/acl_anthology/utils/text.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/utils/xml.py b/python/acl_anthology/utils/xml.py
index 87942cb7da..f52f512afd 100644
--- a/python/acl_anthology/utils/xml.py
+++ b/python/acl_anthology/utils/xml.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/acl_anthology/venues.py b/python/acl_anthology/venues.py
index 16b05e822a..6043b4f465 100644
--- a/python/acl_anthology/venues.py
+++ b/python/acl_anthology/venues.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/benchmarks/bench_attrs.py b/python/benchmarks/bench_attrs.py
index dd2c4121a4..194224afce 100644
--- a/python/benchmarks/bench_attrs.py
+++ b/python/benchmarks/bench_attrs.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/benchmarks/bench_sanitycheck.py b/python/benchmarks/bench_sanitycheck.py
index 1659bc51f6..a22d91fc9e 100644
--- a/python/benchmarks/bench_sanitycheck.py
+++ b/python/benchmarks/bench_sanitycheck.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/benchmarks/bench_utils.py b/python/benchmarks/bench_utils.py
index 1953223505..e7a69861fe 100644
--- a/python/benchmarks/bench_utils.py
+++ b/python/benchmarks/bench_utils.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/benchmarks/bench_xml_markup.py b/python/benchmarks/bench_xml_markup.py
index a3b4873667..47185ff074 100644
--- a/python/benchmarks/bench_xml_markup.py
+++ b/python/benchmarks/bench_xml_markup.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/benchmarks/bench_xml_names.py b/python/benchmarks/bench_xml_names.py
index 401f94fbe1..d729eba94c 100644
--- a/python/benchmarks/bench_xml_names.py
+++ b/python/benchmarks/bench_xml_names.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/benchmarks/bench_xml_parsing.py b/python/benchmarks/bench_xml_parsing.py
index 219a425bfb..c1c9b3a697 100644
--- a/python/benchmarks/bench_xml_parsing.py
+++ b/python/benchmarks/bench_xml_parsing.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/anthology_integration_test.py b/python/tests/anthology_integration_test.py
index 89d1371ceb..225ade4dc7 100644
--- a/python/tests/anthology_integration_test.py
+++ b/python/tests/anthology_integration_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/anthology_test.py b/python/tests/anthology_test.py
index c22ec60450..fa913fb168 100644
--- a/python/tests/anthology_test.py
+++ b/python/tests/anthology_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/collections/collection_test.py b/python/tests/collections/collection_test.py
index 22246e61f9..f4509648da 100644
--- a/python/tests/collections/collection_test.py
+++ b/python/tests/collections/collection_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/collections/collectionindex_test.py b/python/tests/collections/collectionindex_test.py
index 06532cc62f..c7305df5d5 100644
--- a/python/tests/collections/collectionindex_test.py
+++ b/python/tests/collections/collectionindex_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/collections/event_test.py b/python/tests/collections/event_test.py
index 9ee4bf1980..d311857e2a 100644
--- a/python/tests/collections/event_test.py
+++ b/python/tests/collections/event_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/collections/eventindex_test.py b/python/tests/collections/eventindex_test.py
index 4433a7b3b3..7bb87cb841 100644
--- a/python/tests/collections/eventindex_test.py
+++ b/python/tests/collections/eventindex_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/collections/paper_test.py b/python/tests/collections/paper_test.py
index 77feaf087b..f21f2a032e 100644
--- a/python/tests/collections/paper_test.py
+++ b/python/tests/collections/paper_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/collections/volume_test.py b/python/tests/collections/volume_test.py
index 23914554e2..d111b94eb3 100644
--- a/python/tests/collections/volume_test.py
+++ b/python/tests/collections/volume_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/conftest.py b/python/tests/conftest.py
index d7c342bd6d..c09ab19516 100644
--- a/python/tests/conftest.py
+++ b/python/tests/conftest.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/containers_test.py b/python/tests/containers_test.py
index 342ee346ed..d0bdf30dae 100644
--- a/python/tests/containers_test.py
+++ b/python/tests/containers_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/files_test.py b/python/tests/files_test.py
index f815332f9e..b08fbab9be 100644
--- a/python/tests/files_test.py
+++ b/python/tests/files_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/people/name_test.py b/python/tests/people/name_test.py
index a33dfbed78..1bb1d4c9d9 100644
--- a/python/tests/people/name_test.py
+++ b/python/tests/people/name_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/people/person_test.py b/python/tests/people/person_test.py
index 36559610ff..f99ba3b885 100644
--- a/python/tests/people/person_test.py
+++ b/python/tests/people/person_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/people/personindex_test.py b/python/tests/people/personindex_test.py
index 2a9a58db4b..a653189dd3 100644
--- a/python/tests/people/personindex_test.py
+++ b/python/tests/people/personindex_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/sigs_test.py b/python/tests/sigs_test.py
index 0882203b22..aef1817289 100644
--- a/python/tests/sigs_test.py
+++ b/python/tests/sigs_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/text/markuptext_test.py b/python/tests/text/markuptext_test.py
index 78a4a2f400..b53f63dccd 100644
--- a/python/tests/text/markuptext_test.py
+++ b/python/tests/text/markuptext_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/text/texmath_test.py b/python/tests/text/texmath_test.py
index c5bc9779fb..2ad78493f6 100644
--- a/python/tests/text/texmath_test.py
+++ b/python/tests/text/texmath_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/utils/ids_test.py b/python/tests/utils/ids_test.py
index c87e962c27..0b77937e62 100644
--- a/python/tests/utils/ids_test.py
+++ b/python/tests/utils/ids_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/utils/latex_test.py b/python/tests/utils/latex_test.py
index 29a436271f..9f9174d67b 100644
--- a/python/tests/utils/latex_test.py
+++ b/python/tests/utils/latex_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/utils/logging_test.py b/python/tests/utils/logging_test.py
index bf5c773b20..f9939c85ce 100644
--- a/python/tests/utils/logging_test.py
+++ b/python/tests/utils/logging_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/utils/text_test.py b/python/tests/utils/text_test.py
index 32c7b7e5a1..754b0d6dbf 100644
--- a/python/tests/utils/text_test.py
+++ b/python/tests/utils/text_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/utils/xml_test.py b/python/tests/utils/xml_test.py
index 88acbd0c6e..f19d0d619b 100644
--- a/python/tests/utils/xml_test.py
+++ b/python/tests/utils/xml_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
diff --git a/python/tests/venues_test.py b/python/tests/venues_test.py
index a2e2266118..d918ba593e 100644
--- a/python/tests/venues_test.py
+++ b/python/tests/venues_test.py
@@ -1,4 +1,4 @@
-# Copyright 2023 Marcel Bollmann
+# Copyright 2023-2024 Marcel Bollmann
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.