A context-aware group recommender system based on collaborative filtering, with benchmarking experiments.
It has been tested in the music domain (song recommendation), both through experiments (section 2) and through a web application (built with flask_backend.py and Firebase).
During the development of this recommender system/benchmark, a custom dataset was created and will be published soon. For more information, see section 4.
Download and dependencies installation:
git clone https://github.com/adrixo/cf-group-recommender-system
cd cf-group-recommender-system
pip install -r requirements.txt
Execution of a simple recommendation:
cd recommender
python3 recommender_example.py
Execution of tests:
cd recommender/tests/
python3 test_name.py
Execution of automated experiments (may be computationally expensive with large datasets):
cd recommender
python3 experiments.py
The following experiment is proposed to compare the recommendation results across different configurations:
- For each context in [none, chill, fitness, party]
- For each algorithm in [baseline, svd, coclustering, nmf, knn]
- For each group in [custom groups]
- For each aggregation method in [avg, add, app, avm, lms, maj, mpl, mul]
- For each top-n length from 5 to 100 -> Calculation of precision, recall and nDCG metrics.
This experiment is described in recommender-system/experiments.py
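As an illustrative sketch, the experiment grid can be thought of as follows. The names run_recommendation and CUSTOM_GROUPS, as well as the step of 5 between top-n lengths, are hypothetical stand-ins, not the actual identifiers in experiments.py:

CONTEXTS = ["none", "chill", "fitness", "party"]
ALGORITHMS = ["baseline", "svd", "coclustering", "nmf", "knn"]
AGGREGATIONS = ["avg", "add", "app", "avm", "lms", "maj", "mpl", "mul"]
CUSTOM_GROUPS = ["group_1", "group_2"]  # placeholder group identifiers

def run_recommendation(context, algorithm, group, aggregation, top_n):
    """Hypothetical wrapper: fit, aggregate and evaluate one configuration."""
    ...  # would return (precision, recall, ndcg)

results = []
for context in CONTEXTS:
    for algorithm in ALGORITHMS:
        for group in CUSTOM_GROUPS:
            for aggregation in AGGREGATIONS:
                for n in range(5, 101, 5):  # step of 5 is assumed
                    metrics = run_recommendation(
                        context, algorithm, group, aggregation, top_n=n)
                    results.append(
                        (context, algorithm, group, aggregation, n, metrics))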
Where the contexts are induced from Spotify songs' features:
- Chill: Song's energy below 30%
- Party: Song's valence over 85%
- Fitness: Song's energy over 90%
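Expressed in the pre-filtering structure described in the Contextual pre-filtering section below, these contexts could look roughly like this (a sketch; the actual definitions live in the project configuration):

contexts = [
    {"name": "chill",
     "context": [{"column": "energy", "mode": "value",
                  "threshold": 0.30, "direction": "below"}]},
    {"name": "party",
     "context": [{"column": "valence", "mode": "value",
                  "threshold": 0.85, "direction": "above"}]},
    {"name": "fitness",
     "context": [{"column": "energy", "mode": "value",
                  "threshold": 0.90, "direction": "above"}]},
]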
After the execution, recommender-system/results_file.csv is created; it can be processed through recommender-system/process_results.ipynb to create result graphs.
Tests have been developed for each of the classes; they can be found in recommender-system/recommender/tests/ under the name classname_tests.py.
The recommender system is based on aggregated predictions. The individual predictions are obtained using the collaborative filtering approach and then aggregated through one of the 9 implemented aggregation methods. The context is taken into account through a pre-filtering strategy. Finally, the evaluation is performed on the top-n lists, based on which recommended items are relevant to the different group members.
The following figure shows the recommendation process:
The recommender consists of five classes, each one corresponding to a step of the recommendation process, except for the contextual pre-filtering, which is performed during dataset loading for better performance. The classes are connected as follows:
The Algorithm class uses the facade design pattern so that, through the algorithm_base.py interface, different libraries or implementations can be used. This interface is detailed in the Algorithms subsection.
The following example shows a simplified execution:
# 1. Dataset load + contextual prefiltering
ds = Dataset(
    dataset_file, sep=sep, cols=columns,
    rating_scale=rating_scale, split_dataset=True,
    lib=lib, line_format=line_format)
# 2. Model load and model fitting
alg = Algorithm(ds, lib, algorithm)
alg.fit_model()
# 3. Group creation
group = Group(group_name='group name')
group.add_list_of_users(custom_group_members)
# 4. Aggregation of group recommendation prediction
agg = Aggregator(ds, alg, group)
rec = agg.perform_group_recommendation()
agg.print_group_recommendation()
# 5. Group recommendation evaluation
agg.evaluate()
Six algorithms have been implemented using the Surprise library (GitHub project); for more information and details about each algorithm, see the Surprise documentation.
1. Surprise:
   - Baseline
   - SVD
   - KNN
   - CoClustering
   - NMF
   - SlopeOne
2. Other libraries/implementations:
   - To be implemented *
* Any other library or developer implementation can be integrated by following the algorithm_base.py interface, implementing the following methods (a sketch follows the list):
start_model()
fit()
predict()
get_top_n()
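As a sketch, a new backend could then be plugged in with a class along these lines; the method names come from the list above, while the signatures and the underlying model calls are assumptions:

class MyLibraryAlgorithm:
    """Hypothetical adapter for an external library; signatures are assumed."""

    def start_model(self):
        # Instantiate the underlying model (placeholder)
        self.model = ...

    def fit(self, trainset):
        # Train the model on the (possibly pre-filtered) training set
        self.model.train(trainset)

    def predict(self, user_id, item_id):
        # Return the estimated rating for a (user, item) pair
        return self.model.estimate(user_id, item_id)

    def get_top_n(self, user_id, n=10):
        # Return the n highest-predicted unseen items for this user
        return self.model.rank_items(user_id)[:n]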
9 aggregation methods have been implemented from Felfernig et al., divided into three groups (a minimal sketch of two of them follows the list):
1. Consensus based:
- Average
- Additive utilitarian
- Multiplicative
- Average without misery
- Fairness
2. Borderline based:
- Least misery
- Majority voting
- Most pleasure
- Most respected person
3. Majority based:
- Approval voting
- Plurality voting
- Copeland rule
- Borda count
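To make the idea concrete, here is a minimal sketch of two of these methods operating on individual predictions, given as a dict mapping each group member to their predicted item ratings; this is illustrative, not the project's Aggregator code:

def average(predictions):
    """Consensus-based: score each item with the mean of the members' predictions."""
    items = set().union(*(p.keys() for p in predictions.values()))
    return {i: sum(p[i] for p in predictions.values() if i in p)
               / sum(1 for p in predictions.values() if i in p)
            for i in items}

def least_misery(predictions):
    """Borderline-based: score each item with the minimum rating among members."""
    items = set.intersection(*(set(p) for p in predictions.values()))
    return {i: min(p[i] for p in predictions.values()) for i in items}

For example, with preds = {"alice": {"song_a": 4.5, "song_b": 2.0}, "bob": {"song_a": 3.5, "song_b": 4.0}}, average yields {"song_a": 4.0, "song_b": 3.0}, while least_misery yields {"song_a": 3.5, "song_b": 2.0}.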
Contextual pre-filtering is performed during dataset loading, using a list of dictionaries structured in the following way:
[
    {
        "name": "context_name",
        "context": [{
            "column": "column_1",
            "mode": "value",
            "threshold": 0.85,
            "direction": "above"
        }]
    },
    {"name": "context_2", "context": ...}
]
Where column is the column to filter on, mode is 'value' if the filtered variable is numeric or 'nominal' if it is a tag, and direction ('above' or 'below') indicates the side of the threshold to keep.
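As an illustration of how such a specification could be applied during loading, assuming the ratings and song features are held in a pandas DataFrame (the project's actual loading code may differ):

import pandas as pd

def apply_context(df: pd.DataFrame, context: dict) -> pd.DataFrame:
    """Keep only the rows satisfying every condition of one context entry."""
    for cond in context["context"]:
        col = cond["column"]
        if cond["mode"] == "value":  # numeric feature, compared to the threshold
            if cond["direction"] == "above":
                df = df[df[col] > cond["threshold"]]
            else:  # "below"
                df = df[df[col] < cond["threshold"]]
        else:  # "nominal": assumed here to mean an exact tag match
            df = df[df[col] == cond["threshold"]]
    return df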
The evaluation metrics are based on the analysis of the top-n recommended lists, according to whether the ratings of the recommended items are relevant or not (above or below a certain threshold).
The currently implemented functions are:
- Precision
- Recall
- DCG and nDCG
- MAE and RMSE
The MAE and RMSE metrics are not valid for aggregated predictions; their implementation is limited to studying the performance of the individual predictions. Serendipity, coverage, consensus and fairness implementations could be studied in future work.
The evaluation of top-n lists is strongly influenced by their length (n-value).
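A minimal sketch of these top-n metrics with threshold-based relevance, as described above (not the project's exact implementation):

import math

def precision_recall_ndcg(top_n, user_ratings, threshold=4.0):
    """top_n: ranked list of item ids; user_ratings: dict item_id -> true rating."""
    relevant = {i for i, r in user_ratings.items() if r >= threshold}
    hits = [i for i in top_n if i in relevant]
    precision = len(hits) / len(top_n) if top_n else 0.0
    recall = len(hits) / len(relevant) if relevant else 0.0
    # Binary-relevance DCG over the recommended list ...
    dcg = sum(1 / math.log2(rank + 2)
              for rank, i in enumerate(top_n) if i in relevant)
    # ... normalised by the ideal DCG (all relevant items ranked first)
    idcg = sum(1 / math.log2(rank + 2)
               for rank in range(min(len(relevant), len(top_n))))
    ndcg = dcg / idcg if idcg else 0.0
    return precision, recall, ndcg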
In order to simplify executions and experiments, the file configuration.py centralizes the settings: the dataset used and its metadata, the group details, the algorithms employed, etc.
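Since the simplified execution example above reads names such as dataset_file, columns, rating_scale, lib and algorithm, configuration.py presumably contains entries along these lines (a hypothetical excerpt; the real file defines its own values):

# Hypothetical excerpt; configuration.py defines the actual values.
dataset_file = "path/to/dataset.csv"
sep = ","
columns = ["user", "item", "rating"]
line_format = "user item rating"
rating_scale = (1, 5)
lib = "surprise"
algorithm = "svd"
custom_group_members = ["user_1", "user_2", "user_3"]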
Several scripts have been created to automate the experiments and data processing: the preproc data folder contains several files for generating and processing the dataset files.
The dataset has been created from the Last.fm 1K users dataset and enriched with Spotify song features using the Spotify API.
This dataset will be published soon.