PR-Rank

Requirements

Please download the following resources:

AOL-IA documents: Follow the instructions on the ir-datasets website
FastText model for filtering the AOL-IA dataset
DMOZ dataset for training the document topic estimator
RankLib for training the Learning-to-Rank model

Getting started

Install the required dependencies as follows:

conda env create -f env.yml
conda activate PR-Rank
pip install ./PR-Rank

Running Experiments

To run the full series of experiments, execute the following commands in the order shown:

# Estimate qrel
python PR-Rank/aolia_qrel/main.py

# Extract features
python -m spacy download en_core_web_sm
python PR-Rank/features_extraction/main.py

# Divide dataset
python PR-Rank/dataset_division/main.py

# Train & evaluate PR-Rank (Parameter regression model)
python PR-Rank/parameter_regression/main.py
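
The commands above can also be driven from a single script. The following is a minimal sketch, not part of the repository: the names STAGES, build_commands, and run_all are hypothetical, and the sketch assumes it is launched from the directory that contains the PR-Rank folder, with the conda environment already activated.

```python
import subprocess
import sys

# Stage commands copied from this README, in the order they are listed.
# The labels are only used for log output.
STAGES = [
    ("Estimate qrel", ["PR-Rank/aolia_qrel/main.py"]),
    ("Download spaCy model", ["-m", "spacy", "download", "en_core_web_sm"]),
    ("Extract features", ["PR-Rank/features_extraction/main.py"]),
    ("Divide dataset", ["PR-Rank/dataset_division/main.py"]),
    ("Train & evaluate PR-Rank", ["PR-Rank/parameter_regression/main.py"]),
]


def build_commands(python=sys.executable):
    """Return (label, argv) pairs for every stage, in execution order."""
    return [(label, [python, *args]) for label, args in STAGES]


def run_all(dry_run=False):
    """Run each stage in order and stop at the first failing stage."""
    for label, argv in build_commands():
        print(f"==> {label}: {' '.join(argv)}")
        if not dry_run:
            # check=True raises CalledProcessError on a non-zero exit code,
            # so later stages never run on top of a failed one.
            subprocess.run(argv, check=True)
```

Calling run_all(dry_run=True) prints the commands without executing them, which is a cheap way to confirm the paths before starting a long run.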

To modify experimental settings, edit the following configuration files:

PR-Rank/aolia_qrel/config/config.yaml
PR-Rank/features_extraction/config/config.yaml
PR-Rank/dataset_division/config/config.yaml
PR-Rank/parameter_regression/config/config.yaml
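
Before editing, it can help to confirm that all four configuration files are where this README says they are. The sketch below only checks for file presence; config_paths and missing_configs are hypothetical helper names, and repo_root is assumed to be the directory that contains the PR-Rank folder.

```python
from pathlib import Path

# Stage directories named in the configuration file list above.
STAGE_NAMES = [
    "aolia_qrel",
    "features_extraction",
    "dataset_division",
    "parameter_regression",
]


def config_paths(repo_root="."):
    """Map each stage name to the location of its config.yaml."""
    root = Path(repo_root)
    return {
        stage: root / "PR-Rank" / stage / "config" / "config.yaml"
        for stage in STAGE_NAMES
    }


def missing_configs(repo_root="."):
    """Return the stages whose config.yaml is not present on disk."""
    return [
        stage
        for stage, path in config_paths(repo_root).items()
        if not path.is_file()
    ]
```

An empty list from missing_configs() means every stage has a configuration file to edit.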

Usage

PR-Rank involves two main experimental stages, each with its own configuration:

Dataset Division
PR-Rank Parameter Regression

Changing Feature Sets

You can independently select feature sets for each experimental stage:

Dataset Division Feature Sets

In the dataset division configuration, modify the feature_sets parameter:

# Use only query features for dataset division
feature_sets:
  - Q

PR-Rank Domain Feature Sets

In the PR-Rank configuration, modify the domain_feature_sets parameter:

# Use all feature sets for PR-Rank
domain_feature_sets:
  - Q
  - D
  - Q-D

Both stages accept Q (Query), D (Document), Q-D (Query-Document pair), or any combination of these.
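
A selection can be checked against these three codes before it is written into a configuration file. This is a minimal sketch: validate_feature_sets is a hypothetical helper, not a function provided by PR-Rank, and rejecting empty or duplicated selections is an assumption rather than documented behavior.

```python
# Feature-set codes and their meanings, as listed in this README.
AVAILABLE_FEATURE_SETS = {
    "Q": "Query",
    "D": "Document",
    "Q-D": "Query-Document pair",
}


def validate_feature_sets(selected):
    """Return the selection in the README's order, or raise ValueError."""
    if not selected:
        # Assumption: a stage needs at least one feature set.
        raise ValueError("select at least one feature set")
    unknown = [name for name in selected if name not in AVAILABLE_FEATURE_SETS]
    if unknown:
        raise ValueError(f"unknown feature sets: {unknown}")
    if len(set(selected)) != len(selected):
        # Assumption: listing the same set twice is a mistake.
        raise ValueError("duplicate feature sets")
    # Dicts preserve insertion order, so this yields Q, D, Q-D order.
    return [name for name in AVAILABLE_FEATURE_SETS if name in selected]
```

For example, validate_feature_sets(["Q-D", "Q"]) returns ["Q", "Q-D"], while a misspelled code such as "QD" raises ValueError.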

Experiment Naming Convention

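Use a descriptive name for each experimental stage to keep your runs organized. In the examples below, the experiment name is the directory component that follows experiment/ in each path.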

Dataset Division Experiment Name

In PR-Rank/dataset_division/config/config.yaml:

# experiment_name: q
domains_dir_path: PR-Rank/dataset_division/experiment/q/data/domains
ltr_datasets_dir_path: PR-Rank/dataset_division/experiment/q/data/ltr_datasets
...

PR-Rank Experiment Name

In PR-Rank/parameter_regression/config/config.yaml:

# experiment_name: all
domain_features_dir_path: PR-Rank/parameter_regression/experiment/all/data/domain_features
model_parameters_dir_path: PR-Rank/parameter_regression/experiment/all/data/model_parameters
...
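
Because the experiment name is repeated in every path, it is easy to change it in one place and miss another. The sketch below derives the paths shown in the two excerpts above from a single name. The function names are hypothetical, and only the keys visible in this README are covered; the settings elided with "..." are deliberately left out.

```python
from pathlib import PurePosixPath


def dataset_division_paths(experiment_name):
    """Paths for PR-Rank/dataset_division/config/config.yaml."""
    base = (
        PurePosixPath("PR-Rank/dataset_division/experiment")
        / experiment_name
        / "data"
    )
    return {
        "domains_dir_path": str(base / "domains"),
        "ltr_datasets_dir_path": str(base / "ltr_datasets"),
    }


def parameter_regression_paths(experiment_name):
    """Paths for PR-Rank/parameter_regression/config/config.yaml."""
    base = (
        PurePosixPath("PR-Rank/parameter_regression/experiment")
        / experiment_name
        / "data"
    )
    return {
        "domain_features_dir_path": str(base / "domain_features"),
        "model_parameters_dir_path": str(base / "model_parameters"),
    }
```

Calling dataset_division_paths("q") and parameter_regression_paths("all") reproduces the values in the excerpts, so a new run only needs a new name.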
