Commit

Name variant: Alham Fikri Aji (#3393)
afaji authored Jun 12, 2024
1 parent 2f49b30 commit 179e756
Showing 4 changed files with 7 additions and 7 deletions.
2 changes: 1 addition & 1 deletion data/xml/2023.calcs.xml
@@ -79,7 +79,7 @@
<author><first>Long</first><last>Phan</last></author>
<author><first>Rowena</first><last>Garcia</last></author>
<author><first>Thamar</first><last>Solorio</last></author>
-<author><first>Alham</first><last>Aji</last></author>
+<author><first>Alham Fikri</first><last>Aji</last></author>
<pages>43-63</pages>
<abstract>The differences in decision making between behavioural models of voice interfaces are hard to capture using existing measures for the absolute performance of such models. For instance, two models may have a similar task success rate, but very different ways of getting there. In this paper, we propose a general methodology to compute the similarity of two dialogue behaviour models and investigate different ways of computing scores on both the semantic and the textual level. Complementing absolute measures of performance, we test our scores on three different tasks and show the practical usability of the measures.</abstract>
<url hash="85582681">2023.calcs-1.5</url>
6 changes: 3 additions & 3 deletions data/xml/2023.emnlp.xml
@@ -603,7 +603,7 @@
<title><fixed-case>LLM</fixed-case>-powered Data Augmentation for Enhanced Cross-lingual Performance</title>
<author><first>Chenxi</first><last>Whitehouse</last></author>
<author><first>Monojit</first><last>Choudhury</last></author>
-<author><first>Alham</first><last>Aji</last></author>
+<author><first>Alham Fikri</first><last>Aji</last></author>
<pages>671-686</pages>
<abstract>This paper explores the potential of leveraging Large Language Models (LLMs) for data augmentation in multilingual commonsense reasoning datasets where the available training data is extremely limited. To achieve this, we utilise several LLMs, namely Dolly-v2, StableVicuna, ChatGPT, and GPT-4, to augment three datasets: XCOPA, XWinograd, and XStoryCloze. Subsequently, we evaluate the effectiveness of fine-tuning smaller multilingual models, mBERT and XLMR, using the synthesised data. We compare the performance of training with data generated in English and target languages, as well as translated English-generated data, revealing the overall advantages of incorporating data generated by LLMs, e.g. a notable 13.4 accuracy score improvement for the best case. Furthermore, we conduct a human evaluation by asking native speakers to assess the naturalness and logical coherence of the generated examples across different languages. The results of the evaluation indicate that LLMs such as ChatGPT and GPT-4 excel at producing natural and coherent text in most languages, however, they struggle to generate meaningful text in certain languages like Tamil. We also observe that ChatGPT falls short in generating plausible alternatives compared to the original dataset, whereas examples from GPT-4 exhibit competitive logical consistency.</abstract>
<url hash="71f6b198">2023.emnlp-main.44</url>
@@ -10794,7 +10794,7 @@
<author><first>Samuel</first><last>Cahyawijaya</last></author>
<author><first>Jan Christian Blaise</first><last>Cruz</last></author>
<author><first>Genta</first><last>Winata</last></author>
-<author><first>Alham</first><last>Aji</last></author>
+<author><first>Alham Fikri</first><last>Aji</last></author>
<pages>12567-12582</pages>
<abstract>Multilingual Large Language Models (LLMs) have recently shown great capabilities in a wide range of tasks, exhibiting state-of-the-art performance through zero-shot or few-shot prompting methods. While there have been extensive studies on their abilities in monolingual tasks, the investigation of their potential in the context of code-switching (CSW), the practice of alternating languages within an utterance, remains relatively uncharted. In this paper, we provide a comprehensive empirical analysis of various multilingual LLMs, benchmarking their performance across four tasks: sentiment analysis, machine translation, summarization and word-level language identification. Our results indicate that despite multilingual LLMs exhibiting promising outcomes in certain tasks using zero or few-shot prompting, they still underperform in comparison to fine-tuned models of much smaller scales. We argue that current “multilingualism” in LLMs does not inherently imply proficiency with code-switching texts, calling for future research to bridge this discrepancy.</abstract>
<url hash="a6c294da">2023.emnlp-main.774</url>
@@ -12201,7 +12201,7 @@
<author><first>Fahim</first><last>Faisal</last></author>
<author><first>Alissa</first><last>Ostapenko</last></author>
<author><first>Genta</first><last>Winata</last></author>
-<author><first>Alham</first><last>Aji</last></author>
+<author><first>Alham Fikri</first><last>Aji</last></author>
<author><first>Samuel</first><last>Cahyawijaya</last></author>
<author><first>Yulia</first><last>Tsvetkov</last></author>
<author><first>Antonios</first><last>Anastasopoulos</last></author>
2 changes: 1 addition & 1 deletion data/xml/2023.matching.xml
@@ -93,7 +93,7 @@
<paper id="7">
<title>Knowledge-Augmented Language Model Prompting for Zero-Shot Knowledge Graph Question Answering</title>
<author><first>Jinheon</first><last>Baek</last></author>
-<author><first>Alham</first><last>Aji</last></author>
+<author><first>Alham Fikri</first><last>Aji</last></author>
<author><first>Amir</first><last>Saffari</last></author>
<pages>70-98</pages>
<abstract>Large Language Models (LLMs) are capable of performing zero-shot closed-book question answering tasks, based on their internal knowledge stored in parameters during pre-training. However, such internalized knowledge might be insufficient and incorrect, which could lead LLMs to generate factually wrong answers. Furthermore, fine-tuning LLMs to update their knowledge is expensive. To this end, we propose to augment the knowledge directly in the input of LLMs. Specifically, we first retrieve the relevant facts to the input question from the knowledge graph based on semantic similarities between the question and its associated facts. After that, we prepend the retrieved facts to the input question in the form of the prompt, which is then forwarded to LLMs to generate the answer. Our framework, Knowledge-Augmented language model PromptING (KAPING), requires no model training, thus completely zero-shot. We validate the performance of our KAPING framework on the knowledge graph question answering task, that aims to answer the user’s question based on facts over a knowledge graph, on which ours outperforms relevant zero-shot baselines by up to 48% in average, across multiple LLMs of various sizes.</abstract>
4 changes: 2 additions & 2 deletions data/xml/2024.eacl.xml
@@ -733,7 +733,7 @@
<author><first>Abdul</first><last>Waheed</last><affiliation>Mohamed bin Zayed University of Artificial Intelligence</affiliation></author>
<author><first>Chiyu</first><last>Zhang</last><affiliation>University of British Columbia</affiliation></author>
<author><first>Muhammad</first><last>Abdul-Mageed</last><affiliation>University of British Columbia</affiliation></author>
-<author><first>Alham</first><last>Aji</last><affiliation>Mohamed bin Zayed University of Artificial Intelligence and Amazon</affiliation></author>
+<author><first>Alham Fikri</first><last>Aji</last><affiliation>Mohamed bin Zayed University of Artificial Intelligence and Amazon</affiliation></author>
<pages>944-964</pages>
<abstract>Large language models (LLMs) with instruction fine-tuning demonstrate superior generative capabilities. However, these models are resource-intensive. To alleviate this issue, we explore distilling knowledge from instruction-tuned LLMs into much smaller ones. While other similar works have been done, they are often conducted on a limited set of (usually still large) models and are not accompanied by proper evaluations. To this end, we carefully develop a large set of 2.58M instructions based on both existing and newly-generated instructions. In addition to being sizable, we design our instructions to cover a broad set of topics to ensure diversity. Extensive analysis of our instruction dataset confirms its diversity, and we generate responses for these instructions using gpt-3.5-turbo. Leveraging these instructions, we fine-tune a diverse herd of models, collectively referred to as LaMini-LM, which includes models from both the encoder-decoder and decoder-only families, with varying sizes. We evaluate the performance of our models using automatic metrics on 15 different natural language processing (NLP) benchmarks, as well as through human assessment. We also assess the model for hallucination and toxicity, and for the former, we introduce a new benchmark dataset for hallucination-inducing QA. The results demonstrate that our proposed LaMini-LM models are comparable to strong baselines while being much smaller in size.</abstract>
<url hash="30928b55">2024.eacl-long.57</url>
@@ -1070,7 +1070,7 @@
<author><first>Tarek</first><last>Mahmoud</last></author>
<author><first>Toru</first><last>Sasaki</last><affiliation>Technische Universität Darmstadt</affiliation></author>
<author><first>Thomas</first><last>Arnold</last><affiliation>Technische Universität Darmstadt</affiliation></author>
-<author><first>Alham</first><last>Aji</last><affiliation>Mohamed bin Zayed University of Artificial Intelligence and Amazon</affiliation></author>
+<author><first>Alham Fikri</first><last>Aji</last><affiliation>Mohamed bin Zayed University of Artificial Intelligence and Amazon</affiliation></author>
<author><first>Nizar</first><last>Habash</last><affiliation>New York University Abu Dhabi</affiliation></author>
<author><first>Iryna</first><last>Gurevych</last><affiliation>Mohamed bin Zayed University of Artificial Intelligence and Technical University of Darmstadt</affiliation></author>
<author><first>Preslav</first><last>Nakov</last></author>
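The edit above is mechanical: the same one-line substitution is repeated across four XML files. A change like this can be applied uniformly with a small script. Below is a minimal sketch, not the Anthology's actual tooling; the helper name `fix_name_variant` and the sample XML are assumptions for illustration.

```python
import xml.etree.ElementTree as ET

def fix_name_variant(root, old_first, last, new_first):
    """Rewrite <first> for every <author> matching the old name variant."""
    changed = 0
    for author in root.iter("author"):
        first_el = author.find("first")
        last_el = author.find("last")
        if (first_el is not None and last_el is not None
                and first_el.text == old_first and last_el.text == last):
            first_el.text = new_first
            changed += 1
    return changed

# Hypothetical minimal Anthology-style fragment for demonstration.
sample = """<paper id="7">
  <author><first>Jinheon</first><last>Baek</last></author>
  <author><first>Alham</first><last>Aji</last></author>
</paper>"""

root = ET.fromstring(sample)
n = fix_name_variant(root, "Alham", "Aji", "Alham Fikri")
print(n, ET.tostring(root, encoding="unicode"))
```

In practice one would loop over the `data/xml/*.xml` files and write each tree back out; the helper only counts and rewrites matches so a run can be sanity-checked against the expected number of changed lines.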
