Releases: tingofurro/headline_grouping
Initial Release of HLGD
We release the HLGD dataset and trained models.
Dataset:
hlgd_original_annotations.json: a JSON file containing the 10 timelines, each with the original five annotations and the global group (the aggregate of the five annotations).
hlgd_classification_0.1.zip: a Zip file containing three files, train.json, dev.json, and test.json, each containing a split of the final classification dataset described in the paper. This file is compatible with HuggingFace's datasets library.
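For readers who want to inspect the splits without extra dependencies, here is a minimal sketch using only the Python standard library. It assumes the archive contains files whose base names are train.json, dev.json, and test.json, as listed above; the internal layout of each file is not documented here, so the hypothetical helper `parse_json_records` accepts either a single JSON document or one JSON object per line. The function name `load_hlgd_splits` is illustrative and not part of the repository.

```python
import json
import zipfile


def parse_json_records(text):
    """Parse either a single JSON document or JSON Lines into Python objects."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # Assumption: fall back to one JSON object per line (JSON Lines).
        return [json.loads(line) for line in text.splitlines() if line.strip()]


def load_hlgd_splits(zip_path, split_names=("train", "dev", "test")):
    """Return {split_name: parsed_content} for each <split>.json in the archive."""
    splits = {}
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            # Compare base names so an optional top-level folder is tolerated.
            base = member.rsplit("/", 1)[-1]
            for name in split_names:
                if base == f"{name}.json":
                    raw = archive.read(member).decode("utf-8")
                    splits[name] = parse_json_records(raw)
    return splits
```

Calling `load_hlgd_splits("hlgd_classification_0.1.zip")` on the downloaded archive should return a dictionary keyed by split name; the field names inside each record are best checked against the paper or the repository scripts.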
Models:
cls_elec_base_hlgd_0.74f1.bin: this model corresponds to "Electra Finetune on HLGD + Time" in the paper. An example use of the model is provided in model_classifier.py.
gpt2med_headline_gen_1.645.bin: this model corresponds to the headline generator used for the "Headline Generator Swap" results. An example use of the model is provided in model_generator_swap.py.
Analysis:
We release the underlying annotations for the Analysis section (Section 3.6): headline_grouping_typology_negatives.csv and headline_grouping_typology_positives.csv.