This repository contains the details of the project: Argument Mining with Fine-Tuned Large Language Models. Fine-tuning involves further training a pre-trained model on a downstream dataset; it complements general-purpose LLM pre-training with task-specific supervised training.
This repository is organized as follows:
- abstRCT: this directory contains the material for experiments on the Abstracts of Randomized Controlled Trials (AbstRCT) dataset.
- cdcp: this directory contains the material for experiments on the Cornell eRulemaking Corpus (CDCP) dataset.
- mega: this directory contains the material for a combined dataset built from the AbstRCT, CDCP, and PE datasets.
- pe: this directory contains the material for experiments on the Persuasive Essays (PE) dataset.
```
.
├── abstRCT
├── cdcp
├── mega
└── pe
```
We experiment with the following models:
- LLaMA-3-8B-Instruct -- Meta AI
- LLaMA-3-70B-Instruct -- Meta AI
- LLaMA-3.1-8B-Instruct -- Meta AI
- Gemma-2-9B-it -- Google
- Qwen-2-7B-Instruct -- Qwen
- Mistral-7B-Instruct -- Mistral AI
- Phi-3-mini-instruct -- Microsoft
We experiment on the three tasks of an Argument Mining (AM) pipeline:
- Argument Component Classification (ACC): ACC involves classifying an argument component as a Major Claim, Claim, or Premise.
- Argument Relation Identification (ARI): ARI involves classifying pairs of argument components as either Related or Non-related.
- Argument Relation Classification (ARC): ARC involves classifying an argument relation as either Support or Attack.
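As an illustrative sketch (not the repository's actual prompt format), the three tasks above can be cast as instruction prompts for a fine-tuned LLM. The label sets follow the task definitions; the template wording and function names are assumptions.

```python
# Hypothetical prompt templates for the three AM tasks.
# Label sets come from the task definitions; wording is illustrative.

ACC_LABELS = ["Major Claim", "Claim", "Premise"]
ARI_LABELS = ["Related", "Non-related"]
ARC_LABELS = ["Support", "Attack"]


def acc_prompt(component: str) -> str:
    """Argument Component Classification: one component -> one label."""
    return (
        f"Classify the following argument component as {', '.join(ACC_LABELS)}.\n"
        f"Component: {component}\n"
        "Label:"
    )


def ari_prompt(source: str, target: str) -> str:
    """Argument Relation Identification: component pair -> Related / Non-related."""
    return (
        f"Decide whether the first component relates to the second ({' or '.join(ARI_LABELS)}).\n"
        f"Source: {source}\n"
        f"Target: {target}\n"
        "Label:"
    )


def arc_prompt(source: str, target: str) -> str:
    """Argument Relation Classification: related pair -> Support / Attack."""
    return (
        f"Classify the relation between the two components as {' or '.join(ARC_LABELS)}.\n"
        f"Source: {source}\n"
        f"Target: {target}\n"
        "Label:"
    )


if __name__ == "__main__":
    print(acc_prompt("Smoking should be banned in public places."))
```

In a supervised fine-tuning setup, each prompt is paired with its gold label to form an instruction-response training example.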
We use the following package versions:

```
torch==2.4.0
gradio==4.43.0
pydantic==2.9.0
LLaMA-Factory==0.9.0
transformers==4.44.2
bitsandbytes==0.43.1
```
- For fine-tuning LLMs, we use LLaMA-Factory.
- For model checkpoints, we use Unsloth.
- We also rely on the Hugging Face ecosystem (e.g., the transformers library listed above).
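For reference, a minimal, illustrative LoRA fine-tuning configuration in LLaMA-Factory's YAML format is sketched below. The checkpoint name, dataset name, and all hyperparameters are placeholders, not the values used in the experiments.

```yaml
### model (an Unsloth 4-bit checkpoint; the exact name is illustrative)
model_name_or_path: unsloth/llama-3-8b-Instruct-bnb-4bit
quantization_bit: 4            # 4-bit loading via bitsandbytes

### method
stage: sft                     # supervised fine-tuning
do_train: true
finetuning_type: lora
lora_target: all

### dataset (placeholder name; the data must be registered in dataset_info.json)
dataset: pe_acc
template: llama3
cutoff_len: 2048

### output and training (placeholder values)
output_dir: saves/llama3-8b/lora/pe_acc
per_device_train_batch_size: 2
gradient_accumulation_steps: 8
learning_rate: 1.0e-4
num_train_epochs: 3.0
```

A config like this is passed to the LLaMA-Factory CLI (e.g., `llamafactory-cli train config.yaml`).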
All experiments have been performed on the High Performance Cluster at La Rochelle Université.