The objective of this project is to assign a rank to each Wikipedia page using Spark. The project first preprocesses the data and then applies the PageRank algorithm. I recommend running the code on Databricks with a small sample of the Wikipedia pages.
This project is developed with Python and Spark.
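
To give a feel for the ranking step, here is a minimal plain-Python sketch of the PageRank iteration on a tiny hand-made link graph. It is not the project's Spark code: the function name `pagerank`, the damping factor of 0.85, the iteration count, and the `sample_links` data are all assumptions chosen for illustration. In Spark the same per-iteration logic would typically be expressed as distributed transformations over (page, links) pairs.

```python
def pagerank(links, damping=0.85, iterations=50):
    """Return a dict page -> rank.

    links: dict mapping each page to the list of pages it links to
    (hypothetical input format, assumed for this sketch).
    """
    # Collect every page that appears as a source or as a link target.
    pages = set(links)
    for targets in links.values():
        pages.update(targets)
    n = len(pages)

    # Start from a uniform distribution.
    ranks = {page: 1.0 / n for page in pages}

    for _ in range(iterations):
        # Rank held by pages without outgoing links is spread evenly.
        dangling = sum(ranks[p] for p in pages if not links.get(p))
        base = (1.0 - damping) / n + damping * dangling / n
        new_ranks = {page: base for page in pages}

        # Each page passes an equal share of its rank to every link target.
        for page, targets in links.items():
            if not targets:
                continue
            share = damping * ranks[page] / len(targets)
            for target in targets:
                new_ranks[target] += share

        ranks = new_ranks
    return ranks


# Tiny made-up graph standing in for a sample of Wikipedia pages.
sample_links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

ranks = pagerank(sample_links)
for page, rank in sorted(ranks.items(), key=lambda kv: -kv[1]):
    print(page, round(rank, 4))
```

The ranks sum to 1, so each value can be read as the share of attention a page receives. Pages with many incoming links from well-ranked pages end up near the top of the printed list.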