Commit

Update README.md
Signed-off-by: pelinkeskin <86728061+pelinkeskin@users.noreply.github.com>
pelinkeskin authored Nov 23, 2023
1 parent d27ba56 commit 97038ec
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions DatawareHousing_AssociationRule_Mining/README.md
@@ -1,2 +1,2 @@
### Analysing Books with Hadoop & Data Exploration with PySpark
This is a combination of two projects I completed for one of my Master's modules to gain practical experience with the big data tools Hadoop and Spark. For the Hadoop tasks, I loaded three books into HDFS and wrote MapReduce jobs in Python to analyze them. For the PySpark tasks, I analyzed a Wordle game dataset extracted from Twitter using Spark's parallel processing framework with RDDs and DataFrames.
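
As a rough illustration of the map and reduce steps mentioned above, here is a minimal word-count sketch in plain Python. It is an assumption that word counting resembles the analysis performed. The function names `mapper`, `reducer`, and `word_count` are hypothetical, and the sort stands in for Hadoop's shuffle phase. This is not the code from the project.

```python
import re
from collections import defaultdict
from typing import Dict, Iterable, Iterator, Tuple


def mapper(lines: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Map step: emit a (word, 1) pair for every word in every line."""
    for line in lines:
        for word in re.findall(r"[a-z']+", line.lower()):
            yield word, 1


def reducer(pairs: Iterable[Tuple[str, int]]) -> Dict[str, int]:
    """Reduce step: sum the counts emitted for each word."""
    counts: Dict[str, int] = defaultdict(int)
    for word, n in pairs:
        counts[word] += n
    return dict(counts)


def word_count(lines: Iterable[str]) -> Dict[str, int]:
    """Run map, then sort (a stand-in for the shuffle), then reduce."""
    return reducer(sorted(mapper(lines)))


if __name__ == "__main__":
    # Hypothetical sample text, not taken from the books used in the project.
    print(word_count(["The cat sat", "the dog sat"]))
```

With Hadoop Streaming, the same two steps would typically live in separate scripts that read from standard input and write tab-separated key/value lines to standard output.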
### Designing a Data Warehouse & Association Rule Mining using SQL and Python
I completed these tasks as part of an assignment for one of my Master's modules. This folder contains two notebooks. In Data_Warehousing_practice.ipynb, I created a database and wrote a set of OLAP queries to explore the data. I then defined a fact constellation schema diagram for the data warehouse, with facts, dimensions, and measures, and outlined the OLAP operations needed to answer a particular query. Finally, I used PostgreSQL with Python and its libraries to define a set of functions for operating the data warehouse. In Association_Rule_Mining_Practice.ipynb, I cleaned and transformed an online retail dataset for association rule mining, then mined association rules using the Apriori and FP-growth algorithms.
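
For readers unfamiliar with association rule mining, the following is a minimal pure-Python sketch of the Apriori idea: grow frequent itemsets level by level, then derive rules that meet a confidence threshold. The function names, thresholds, and sample baskets are hypothetical and chosen for illustration. The notebook itself may rely on a library rather than code like this.

```python
from itertools import combinations
from typing import Dict, FrozenSet, Iterable, List, Tuple

Itemset = FrozenSet[str]


def frequent_itemsets(transactions: Iterable[Iterable[str]],
                      min_support: float) -> Dict[Itemset, float]:
    """Return every itemset whose support is at least min_support."""
    baskets = [frozenset(t) for t in transactions]
    n = len(baskets)
    # Level 1 candidates: every individual item.
    current = {frozenset([item]) for basket in baskets for item in basket}
    frequent: Dict[Itemset, float] = {}
    k = 1
    while current:
        # Count how many baskets contain each candidate itemset.
        counts = {c: sum(1 for b in baskets if c <= b) for c in current}
        level = {c: cnt / n for c, cnt in counts.items() if cnt / n >= min_support}
        frequent.update(level)
        k += 1
        # Join frequent itemsets into size-k candidates.
        joined = {a | b for a, b in combinations(list(level), 2) if len(a | b) == k}
        # Prune candidates that have an infrequent (k-1)-subset.
        current = {c for c in joined
                   if all(frozenset(s) in level for s in combinations(c, k - 1))}
    return frequent


def association_rules(frequent: Dict[Itemset, float],
                      min_confidence: float
                      ) -> List[Tuple[Itemset, Itemset, float, float]]:
    """Return (antecedent, consequent, support, confidence) tuples."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for subset in combinations(itemset, r):
                antecedent = frozenset(subset)
                # Every subset of a frequent itemset is itself frequent.
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, support, confidence))
    return rules


if __name__ == "__main__":
    # Hypothetical baskets, not drawn from the online retail dataset.
    baskets = [{"bread", "milk"}, {"bread", "butter"},
               {"bread", "milk", "butter"}, {"milk"}]
    itemsets = frequent_itemsets(baskets, min_support=0.5)
    for rule in association_rules(itemsets, min_confidence=0.9):
        print(rule)
```

FP-growth finds the same frequent itemsets but avoids repeated candidate generation by compressing the transactions into a prefix tree, which usually makes it faster on larger datasets.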
