FUJI: Fuzzy Jaccard Index: A robust comparison of ordered lists

This repository contains the code (and many feature rankings computed on over twenty real-life benchmark data sets) from the paper Fuzzy Jaccard Index: A robust comparison of ordered lists.

This code is distributed under the Creative Commons Attribution license (CC BY 4.0), so the authors would greatly appreciate it if you acknowledged its use by citing the paper above (the corresponding BibTeX entry is shown below).

@misc{fuji,
    title={Fuzzy Jaccard Index: A robust comparison of ordered lists},
    author={Matej Petkovi\'{c} and Bla\v{z} \v{S}krlj and Dragi Kocev and Nikola Simidjievski},
    year={2020},
    eprint={2008.02216},
    archivePrefix={arXiv},
    primaryClass={cs.LG}
}

Example

The code is easy to use and implements the FUJI score (fuzzy_jaccard), as well as all the baselines that we compare to (jaccard, hamming, pog, npog, kuncheva, wald, lustgarten, krizek, cwrel, pearson, correlation, fuzzy_gamma).

For example, once we obtain the rankings r and s, e.g.,

r = [1.0, 0.9, 0.3, 0.14, 0.1]
s = [0.8, 0.9, 0.3, 0.14, 0.1]

(where r[i] and s[i] give the importance of the i-th feature), FUJI can be computed as

curve, auc = compute_similarity(r, s, "fuzzy_jaccard")

Here, curve is a list containing the FUJI values at each point, and auc is the area under this curve. For further examples, see main.py.
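To make the curve-and-area idea concrete, here is a minimal sketch for the plain (non-fuzzy) Jaccard baseline. It is not the repository's implementation: the helper names top_k, jaccard_curve and trapezoid_area are hypothetical, and it assumes that "each point" means each top-k cutoff, that ties are broken by feature index, and that the area is taken with the trapezoidal rule at unit spacing without normalization. The actual code may differ on any of these.

```python
def top_k(scores, k):
    """Return the indices of the k highest-scoring features.

    Ties are broken by feature index (an assumption of this sketch).
    """
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    return set(order[:k])


def jaccard_curve(r, s):
    """Plain Jaccard index of the top-k feature sets for k = 1, ..., n."""
    if len(r) != len(s):
        raise ValueError("both rankings must score the same features")
    curve = []
    for k in range(1, len(r) + 1):
        a, b = top_k(r, k), top_k(s, k)
        curve.append(len(a & b) / len(a | b))
    return curve


def trapezoid_area(curve):
    """Area under the curve by the trapezoidal rule with unit spacing."""
    return sum((curve[i] + curve[i + 1]) / 2 for i in range(len(curve) - 1))


r = [1.0, 0.9, 0.3, 0.14, 0.1]
s = [0.8, 0.9, 0.3, 0.14, 0.1]
curve = jaccard_curve(r, s)
print(curve, trapezoid_area(curve))
```

With the rankings above, the two top-1 sets are disjoint (feature 0 versus feature 1), so the sketch yields 0 at the first point, while every larger cutoff selects identical sets. This abrupt jump after a small swap at the top of the list is typical of set-based scores.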

Dependencies

The code implements many similarity scores, some of which require numpy or scipy. To show progress, tqdm can be used.

.fimp files

The structure of the files is the following:

<meta data (if available)>
<fimp table>

<fimp table> consists of four columns:

index of the feature in the dataset
name of the feature
rank of the feature (>= 1)
feature relevance score

The values are tab-separated.
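The table layout described above can be read with a few lines of Python. The following is a minimal sketch under stated assumptions, not the loader used in this repository: FimpRow and parse_fimp_table are hypothetical names, the rank and relevance fields are assumed to be plain numbers, and any line that does not parse as four tab-separated fields of the expected types (meta data, a header row, blank lines) is skipped. The sample text is made up for illustration.

```python
from typing import List, NamedTuple


class FimpRow(NamedTuple):
    index: int        # index of the feature in the dataset
    name: str         # name of the feature
    rank: int         # rank of the feature (>= 1)
    relevance: float  # feature relevance score


def parse_fimp_table(text: str) -> List[FimpRow]:
    """Collect the rows of the fimp table, skipping all other lines."""
    rows = []
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) != 4:
            continue  # meta data or blank line
        try:
            row = FimpRow(int(fields[0]), fields[1],
                          int(fields[2]), float(fields[3]))
        except ValueError:
            continue  # e.g. a header row with non-numeric fields
        rows.append(row)
    return rows


# Made-up sample content, for illustration only.
sample = (
    "some meta data line\n"
    "1\theight\t2\t0.9\n"
    "2\twidth\t1\t1.0\n"
)
rows = parse_fimp_table(sample)
# Relevance scores ordered by feature index, as in the r and s lists above.
scores = [row.relevance for row in sorted(rows, key=lambda row: row.index)]
print(scores)
```

If the rank or relevance fields of a real file use a different notation, the two conversions inside the try block are the only place that needs to change.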
