Recommendations of content tied to a user.
- NumPy (1.14.3+)
- SciPy (1.1.0+)
- sh (1.12.14+)
- TensorFlow (1.8.0+)
- Pandas (0.22.0+)
- Scikit-Learn (0.19.1+)
- Matplotlib (2.2.2+)
Additionally, to run the API you need Flask and Redis (installed in the setup steps below).
There are essentially three steps involved:

- Preprocessing: `preprocess.py` accepts either a JSON or a CSV file as input and outputs two SciPy sparse matrices: one for training and one for testing.
- Training: `task.py` accepts the SciPy sparse matrices and trains the model on them.
- Prediction: `predict.py` outputs the top-n ratings for a specified user id.

You can run any of these scripts with the `-h` or `--help` option for more information on supported options.
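As a rough illustration of what the preprocessing step produces, a table of (user, item, rating) triples can be packed into SciPy sparse matrices and split into training and testing sets. This is only a sketch under assumed data and a stand-in split strategy, not the actual `preprocess.py` logic:

```python
# Sketch: turning (user, item, rating) triples into sparse matrices and
# holding some entries out for testing. Not the actual preprocess.py code.
import numpy as np
from scipy.sparse import csr_matrix

# Toy ratings: user index, item index, rating value ("overall").
users = np.array([0, 0, 1, 2, 2])
items = np.array([1, 2, 0, 0, 2])
ratings = np.array([5.0, 3.0, 4.0, 2.0, 1.0])

# Hold every third rating out for testing (a stand-in for a real split).
test_mask = np.arange(len(ratings)) % 3 == 0
train = csr_matrix((ratings[~test_mask], (users[~test_mask], items[~test_mask])),
                   shape=(3, 3))
test = csr_matrix((ratings[test_mask], (users[test_mask], items[test_mask])),
                  shape=(3, 3))
print(train.nnz, test.nnz)  # 3 2
```

The real script additionally maps raw ids (e.g. `reviewerID`, `asin`) to row and column indices and saves the matrices as `.npz` files.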
First of all, you need a dataset. You can use any that catches your fancy. We will be working with the 5-core Amazon music dataset.

Now, we need to take this JSON and transform it into a training matrix and a testing matrix.
```
$ python preprocess.py --data Digital_Music_5.json --format json --col-order reviewerID asin overall --lines True
```
There should now be two files in your directory: `train.npz` and `test.npz`. We now train the model.

```
$ python task.py --train-data train.npz --test-data test.npz
```

Now there should be a new directory called `model` with two files: `row.npy` and `col.npy`. To get predictions, run

```
$ python predict.py --u model/row.npy --v model/col.npy --user-id 12
[ 990 1973 1255 2268  644]
```
`12` here is the row index of the user in our matrix and `[ 990 1973 1255 2268 644]` are the column indices of the recommended music in our matrix. To recover the actual user id and music id, you need to set `--save-map` in `preprocess.py` and supply the two maps to `predict.py`. See help for more details (accessible by running `python preprocess.py --help` and `python task.py --help`).
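While the internals of `predict.py` are not shown here, a common way such top-n predictions are produced in factorisation-based recommenders is a dot product between user and item factors. The following is a hedged sketch under the assumption that `row.npy` and `col.npy` hold user and item latent-factor matrices:

```python
# Sketch of how predict.py's output could be produced, assuming row.npy and
# col.npy hold user and item latent-factor matrices (an assumption, not
# taken from the project's source).
import numpy as np

def top_n(user_factors, item_factors, user_row, n=5):
    """Return the column indices of the n highest-scored items for one user."""
    scores = user_factors[user_row] @ item_factors.T  # predicted ratings
    return np.argsort(scores)[::-1][:n]             # best first

# Toy factors: 4 users x 2 factors, 6 items x 2 factors.
U = np.array([[1.0, 0.0], [0.0, 1.0], [0.5, 0.5], [1.0, 1.0]])
V = np.array([[0.9, 0.1], [0.1, 0.9], [0.5, 0.5],
              [0.2, 0.8], [0.8, 0.2], [0.4, 0.6]])
print(top_n(U, V, user_row=0, n=3))  # [0 4 2]
```

With the real `row.npy`/`col.npy` loaded via `np.load`, the same dot-product-and-sort would yield column indices like the `[ 990 1973 1255 2268 644]` output above.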
At present, the API is tightly coupled with the Collaborative Communities project. It is only useful for making recommendations based on a CC user's viewing history. To use the API, you will need to have the event logging module installed.
To get recommendations, make a `GET` request to the server with the user id and (optionally) the number of recommendations needed, e.g.:

```
http://localhost:3445/rec?user=12&nrecs=3
```
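The same request can be made from Python with only the standard library. This is a minimal sketch assuming the server from this README is running on `localhost:3445`:

```python
# Hedged sketch: building (and optionally sending) the GET request with the
# standard library, assuming the API server is on localhost:3445.
from urllib.parse import urlencode
from urllib.request import urlopen

def rec_url(base, user, nrecs=None):
    """Build the /rec query URL; nrecs is optional, as in the README."""
    params = {"user": user}
    if nrecs is not None:
        params["nrecs"] = nrecs
    return f"{base}/rec?{urlencode(params)}"

url = rec_url("http://localhost:3445", user=12, nrecs=3)
print(url)  # http://localhost:3445/rec?user=12&nrecs=3
# recs = urlopen(url).read()  # uncomment once the server is running
```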
Before the API is able to generate recommendations, it must be trained. To train the API, make a `POST` request to the server specifying the URI of the logs and, optionally, the parameters for preprocessing and training, e.g.:

```
curl -i -X POST -H 'Content-Type: application/json' -d '{"article-view": "http://localhost:8000/logapi/event/article/view/?after=1970-01-01T00:00:00"}' http://localhost:3445/train
```
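The same training call can be sketched with Python's standard library instead of curl; the log URI below is the example value from this README, not a required endpoint:

```python
# Hedged sketch of the training call from Python rather than curl.
import json
from urllib.request import Request, urlopen

def train_request(base, logs):
    """Build a POST /train request carrying the log URIs as JSON."""
    return Request(f"{base}/train",
                   data=json.dumps(logs).encode(),
                   headers={"Content-Type": "application/json"})

req = train_request(
    "http://localhost:3445",
    {"article-view": "http://localhost:8000/logapi/event/article/view/"
                     "?after=1970-01-01T00:00:00"})
print(req.get_method())  # POST
# urlopen(req)  # uncomment once the server is running
```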
To visualise recommendations, make a `GET` request to the server. Optionally, you may specify the user id and the percentage of items to display, e.g.:

```
http://localhost:3445/visual?user=1&r=3
```
- Install Redis: `sudo apt-get install redis-server`
- Create a virtual environment: `virtualenv --system-site-packages -p python3 rec_api`
- Activate the virtual environment: `source ~/rec_api/bin/activate`
- Clone this repo: `git clone https://github.com/fresearchgroup/Community-Recommendation.git`
- Change into the directory: `cd Community-Recommendation`
- Install dependencies: `pip3 install -r requirements.txt`
- Set up Flask: `export FLASK_APP=flask_api.py` and, optionally, `export FLASK_ENV=development`
- Set the token for event logs: `export LOG_AUTH_TOKEN=Your_Token_Here`
- Run the server (e.g. `flask run --host 0.0.0.0 --port 3445`)
- Install Docker and Docker-Compose
- Clone this repo: `git clone https://github.com/woodsy-sounding-wilful-sapwood/Community-Recommendation.git`
- Change into the directory: `cd Community-Recommendation`
- Add the event logs token: `echo "Your Token Here" >> .env`
- Build the system: `sudo docker-compose build`
- Run the system: `sudo docker-compose up`