Fancy seeing you here!
I am Maria Balos, a data scientist and user-centric designer based in Cambridge, UK. You can find me most of the time behind a screen or next to a coffee. Welcome to this small corner of my work!
GitHub Stats
Right now I am involved in:
- Completing daily coding problems on LeetCode; please check my LeetCode profile.
- Master's in Deep Learning and Generative AI by DATAMECUM (October 2024 - April 2025)
  - Convolutional Neural Networks (CNN): applying class concepts by creating my first CNN architecture and training a model for emotion detection
  - Recurrent Neural Networks (RNN)
- Working on the Ryanair Radar project
- Finishing the last 16 of the 100 projects in the course 100 Days Of Code in Python by Angela Yu on Udemy.
Latest achievements:
- 1st of November: Reached 200 solved LeetCode problems. Check it out on my LeetCode profile.
- 30th of October: Ryanair time-capsule, reverse-engineering the Ryanair API to collect daily flight prices and train machine learning models to forecast price changes.
- 17th of October: Completed the Hugging Face NLP course.
- 28th of May: Completed the first part of the "Practical Deep Learning" course by fast.ai.
- 17th of May: Completed the "Advanced Learning Algorithms" course by Andrew Ng on Coursera.
- 9th of May: Winner of the DATAMECUM Datathon competition for the 3rd cohort.
Latest Medium post:
Please check out the next sections to see these skills applied in projects.
- Exploratory Data Analysis: understanding the data, identifying missing values, handling duplicated and ambiguous values, identifying outliers and anomalies, and detecting correlations.
- Unsupervised Machine Learning
  - K-Means: Weever Watermark project
  - DBSCAN
  - PCA
  - Outlier and anomaly detection
  - SOM (Self-Organizing Maps)
- Supervised Machine Learning
  - Generalised Linear Models
  - Support Vector Machines
  - K-Nearest Neighbors
  - Decision Stumps: Kaggle notebook
  - Decision Trees
  - Random Forest: Datamecum Datathon
  - XGBoost: Datamecum Datathon
  - Ensemble models: winner of the Datamecum Datathon capstone project competition with an ensemble of the Random Forest and XGBoost predictions. Please check out the presentation video.
- Python libraries for data science
  - Data processing: Pandas, NumPy
  - ML & stats: Scikit-Learn, Statsmodels
  - Data visualisation: Matplotlib, Seaborn, Plotly
- Space Mission Analysis is a data exploration and data visualisation project where I applied most of the data visualisation libraries.
- Mohs Hardness Exploratory Data Analysis: a decision stump (a one-level decision tree) for a Kaggle competition, which placed me 598th out of 1632 at the end of the competition. I also created a decision stump presentation to introduce Datamecum students to decision stumps.
- Datamecum Datathon: capstone project competition between the third-cohort students of the Intensive Program in Data Science by DATAMECUM, consisting of building a supervised model to predict a binary class. The exploratory data analysis consisted of:
  - checking for missing values.
  - handling duplicated values and ambiguous data.
  - exploring the relation between missing values and the target variable.
  - using Self-Organizing Maps and a correlation matrix for correlation checks.
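As a rough illustration of the first three checks listed above (this is not the code used in the Datathon), a minimal pandas sketch could look like the following. The toy data and the column names `feature_a`, `feature_b` and `target` are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical toy data; the column names are illustrative only.
df = pd.DataFrame({
    "feature_a": [1.0, np.nan, 3.0, 3.0, np.nan, 6.0],
    "feature_b": ["x", "y", "z", "z", "w", "x"],
    "target":    [0, 1, 0, 0, 1, 0],
})

# 1. Missing values: count the NaNs in each column.
missing_counts = df.isna().sum()

# 2. Duplicated values: count fully repeated rows, then drop them.
n_duplicates = int(df.duplicated().sum())
deduplicated = df.drop_duplicates()

# 3. Missingness vs target: compare the mean of the binary target
#    between rows where feature_a is missing and rows where it is present.
missing_flag = df["feature_a"].isna().map({True: "missing", False: "present"})
target_rate_by_missing = df.groupby(missing_flag)["target"].mean()

print(missing_counts)
print(f"Duplicated rows: {n_duplicates}")
print(target_rate_by_missing)
```

A large gap between the two target rates suggests that the missingness itself carries signal, in which case a "was missing" indicator column may be worth keeping as a feature.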
- Weever Watermark: Applying K-Means and array transformations to group the colours of a provided image, then using the generated centroids to build a 10-colour palette shown in a GUI created with Flask. Please have a look at the Weever Watermark DEMO or at the article "Machine learning applied to the design industry: K-Means for image palette generation", where I explain the project.
- Datamecum Datathon: capstone project competition between the third-cohort students of the Intensive Program in Data Science by DATAMECUM, consisting of building a supervised model to predict a binary class.
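To give a feel for the palette idea behind Weever Watermark, here is a dependency-free sketch of K-Means applied to RGB pixels. It is not the project's actual code: the function names `kmeans_palette` and `to_hex` are hypothetical, the pixels are randomly generated instead of being read from an image, and in practice a library implementation such as Scikit-Learn's `KMeans` would be the usual choice.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance between two RGB colours."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans_palette(pixels, k=10, iterations=20, seed=0):
    """Cluster RGB pixels with plain K-Means and return k centroid colours."""
    rng = random.Random(seed)
    # Start from k distinct pixels so every cluster begins non-empty.
    centroids = rng.sample(sorted(set(pixels)), k)
    for _ in range(iterations):
        # Assignment step: attach each pixel to its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in pixels:
            nearest = min(range(k), key=lambda i: squared_distance(p, centroids[i]))
            clusters[nearest].append(p)
        # Update step: move each centroid to the mean colour of its cluster.
        new_centroids = []
        for i, members in enumerate(clusters):
            if members:
                mean = tuple(sum(channel) / len(members) for channel in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[i])  # keep the centroid of an empty cluster
        if new_centroids == centroids:
            break  # converged: no centroid moved
        centroids = new_centroids
    # Round to integer RGB values so the colours can be displayed.
    return [tuple(round(c) for c in centroid) for centroid in centroids]


def to_hex(colour):
    """Format an (R, G, B) tuple as a CSS-style hex string."""
    return "#{:02x}{:02x}{:02x}".format(*colour)


# Synthetic stand-in for an image: a flat list of (R, G, B) tuples.
rng = random.Random(42)
pixels = [(rng.randrange(256), rng.randrange(256), rng.randrange(256)) for _ in range(500)]
palette = kmeans_palette(pixels, k=10)
print([to_hex(colour) for colour in palette])
```

With a real image, the same function would receive the image's pixels flattened into one list of RGB tuples, and the ten centroids would be the palette.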
- Typing Thunder: a speed-typing GUI app created to measure how fast the user types in one minute. Typing Thunder DEMO
- Morse Code: a command-line program that applies Python dictionaries, loops, strings and functions. Morse Converter DEMO
- MochaMaps: a website that displays coffee shops and their facilities from a database, built with SQLite, SQLAlchemy, Jinja2, a REST API, Flask and Bootstrap-Flask. MochaMaps DEMO
- LinkedIn Toggler: Using Selenium for Python to automate recurring LinkedIn tasks. Please have a look at the LinkedIn Toggler repository if you want to know more about this project.
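For readers curious about the Morse Code project, the core idea of a dictionary-based converter can be sketched as below. This is an illustrative example and not the repository's code: the names `encode` and `decode` are hypothetical, and the conventions of one space between letters and " / " between words are assumptions.

```python
# International Morse code for letters and digits.
MORSE = {
    "A": ".-", "B": "-...", "C": "-.-.", "D": "-..", "E": ".", "F": "..-.",
    "G": "--.", "H": "....", "I": "..", "J": ".---", "K": "-.-", "L": ".-..",
    "M": "--", "N": "-.", "O": "---", "P": ".--.", "Q": "--.-", "R": ".-.",
    "S": "...", "T": "-", "U": "..-", "V": "...-", "W": ".--", "X": "-..-",
    "Y": "-.--", "Z": "--..",
    "0": "-----", "1": ".----", "2": "..---", "3": "...--", "4": "....-",
    "5": ".....", "6": "-....", "7": "--...", "8": "---..", "9": "----.",
}
# Reverse lookup table used for decoding.
TEXT = {code: char for char, code in MORSE.items()}


def encode(message):
    """Convert text to Morse; characters without a Morse code are skipped."""
    words = message.upper().split()
    return " / ".join(
        " ".join(MORSE[char] for char in word if char in MORSE) for word in words
    )


def decode(morse):
    """Convert Morse produced by encode() back to upper-case text."""
    words = morse.split(" / ")
    return " ".join(
        "".join(TEXT[code] for code in word.split()) for word in words
    )


print(encode("Hi there"))
print(decode(encode("Hi there")))
```

A command-line version would wrap these two functions in a loop that reads the user's input and asks whether to encode or decode.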
Thank you for visiting my GitHub! Feel free to take a deeper look at my repositories to find more specific projects. Please share any feedback, suggestions, or tips that you believe could help me grow and improve!
I am always happy to meet for a coffee, have a chit-chat or discuss any possible collaboration. Please drop me an email at mariabalos16@gmail.com or send me a message through my LinkedIn if you fancy any of those.