Commit
This commit does not belong to any branch on this repository, and may belong to a fork outside of the repository.
Signed-off-by: pelinkeskin <86728061+pelinkeskin@users.noreply.github.com>
1 parent d27ba56, commit 97038ec
Showing 1 changed file with 2 additions and 2 deletions.
@@ -1,2 +1,2 @@
### Analysing Books with Hadoop & Data Exploration with PySpark
This combines two projects I completed for a module during my Master's to gain practical experience with two big data management tools, Hadoop and Spark. For the Hadoop tasks, I loaded three books into HDFS and wrote MapReduce jobs in Python to analyze them. For the PySpark tasks, I analyzed a Wordle dataset extracted from Twitter using Spark's parallel processing framework with RDDs and DataFrames.
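As a rough illustration of the MapReduce pattern described above, here is a minimal word-count sketch in plain Python. It is not the code from the project: the function names `mapper`, `reducer`, and `word_count` are hypothetical, the sample lines are invented, and a local `sorted` call stands in for Hadoop's shuffle-and-sort phase so the example runs without a cluster.

```python
import re
from itertools import groupby
from operator import itemgetter


def mapper(lines):
    """Emit a (word, 1) pair for every word in the input lines."""
    for line in lines:
        for word in re.findall(r"[a-z']+", line.lower()):
            yield word, 1


def reducer(pairs):
    """Sum the counts for each word; expects pairs sorted by word."""
    for word, group in groupby(pairs, key=itemgetter(0)):
        yield word, sum(count for _, count in group)


def word_count(lines):
    """Run map, a local stand-in for shuffle-and-sort, then reduce."""
    shuffled = sorted(mapper(lines), key=itemgetter(0))
    return dict(reducer(shuffled))


if __name__ == "__main__":
    sample = ["It was the best of times,", "it was the worst of times."]
    for word, count in sorted(word_count(sample).items()):
        print(f"{word}\t{count}")
```

With Hadoop Streaming, the mapper and reducer would typically be separate scripts that read from standard input and write tab-separated key/value lines to standard output; the in-process version here only mirrors that data flow.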
### Designing a Data Warehouse & Association Rule Mining using SQL and Python
I completed these tasks as part of an assignment for a module during my Master's. This folder contains two notebooks. In Data_Warehousing_practice.ipynb, I created a database and wrote a set of OLAP queries to explore the data. I then defined a fact constellation schema diagram for the data warehouse, with facts, dimensions, and measures, and described the OLAP operations needed to answer a particular query. Finally, I used PostgreSQL with Python and its libraries to define a set of functions for operating the data warehouse. In Association_Rule_Mining_Practice.ipynb, I cleaned and transformed an online retail dataset for association rule mining, then mined association rules using the Apriori and FP-growth algorithms.
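To make the association rule mining step more concrete, here is a small pure-Python sketch of the Apriori idea: find frequent itemsets level by level, then derive rules that meet a confidence threshold. It is an assumption-laden illustration, not the notebook's code (which may well rely on a library). The function names `apriori` and `association_rules`, the thresholds, and the toy transactions are all hypothetical, and FP-growth is not shown.

```python
from itertools import combinations


def apriori(transactions, min_support):
    """Return {itemset: support} for every itemset meeting min_support."""
    transactions = [frozenset(t) for t in transactions]
    n = len(transactions)

    def support(itemset):
        # Fraction of transactions that contain the whole itemset.
        return sum(itemset <= t for t in transactions) / n

    # Level 1 candidates: every single item seen in the data.
    current = {frozenset([item]) for t in transactions for item in t}
    frequent = {}
    k = 1
    while current:
        level = {}
        for candidate in current:
            s = support(candidate)
            if s >= min_support:
                level[candidate] = s
        frequent.update(level)
        k += 1
        # Join step: merge frequent (k-1)-itemsets into k-itemsets.
        candidates = {a | b for a in level for b in level if len(a | b) == k}
        # Prune step: every (k-1)-subset must itself be frequent.
        current = {
            c for c in candidates
            if all(frozenset(sub) in level for sub in combinations(c, k - 1))
        }
    return frequent


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples."""
    found = []
    for itemset, s in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), size):
                antecedent = frozenset(antecedent)
                confidence = s / frequent[antecedent]
                if confidence >= min_confidence:
                    found.append((antecedent, itemset - antecedent, confidence))
    return found


if __name__ == "__main__":
    # Invented toy baskets, not the online retail dataset.
    baskets = [
        {"bread", "milk"},
        {"bread", "butter"},
        {"bread", "milk", "butter"},
        {"milk"},
    ]
    itemsets = apriori(baskets, min_support=0.5)
    for antecedent, consequent, confidence in association_rules(itemsets, 0.9):
        print(set(antecedent), "->", set(consequent), confidence)
```

The lookup `frequent[antecedent]` is safe because any subset of a frequent itemset is itself frequent, which is the same property the prune step relies on.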