Predictive analytics is the use of data, statistical algorithms, and machine learning techniques to estimate the likelihood of future outcomes based on historical data. The goal is to go beyond knowing what has happened and provide the best possible assessment of what will happen in the future.
At a high level, the predictive analytics pipeline consists of the following steps:
- Define objective
- Obtain data
- Explore and visualize data
- Preprocess data (clean, scale, etc.)
- Select and implement models/algorithms
- Evaluate the performance of the models and select the best one
- Deploy model
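
The steps above can be sketched end to end in a few dozen lines. The example below is only an illustration under stated assumptions, not code from this repository: the data is synthetic, the two candidate models (a mean baseline and a one-variable least-squares regression) are stand-ins for whatever models a real project would use, and every function name (`make_data`, `fit_scaler`, `run_pipeline`, and so on) is hypothetical. It uses only the Python standard library so it runs without extra dependencies.

```python
import random
from statistics import mean

# Objective (assumed for illustration): predict a numeric target y from one feature x.

def make_data(n=200, seed=0):
    """Obtain data: synthetic (x, y) pairs standing in for a real dataset."""
    rng = random.Random(seed)
    xs = [rng.uniform(0, 10) for _ in range(n)]
    ys = [3.0 * x + 2.0 + rng.gauss(0, 1) for x in xs]
    return xs, ys

def describe(values):
    """Explore data: a few summary statistics in place of plots."""
    return {"min": min(values), "max": max(values), "mean": mean(values)}

def split(xs, ys, test_fraction=0.25):
    """Hold out the last part of the data for evaluation."""
    cut = int(len(xs) * (1 - test_fraction))
    return (xs[:cut], ys[:cut]), (xs[cut:], ys[cut:])

def fit_scaler(xs):
    """Preprocess: min-max scaling fitted on the training data only."""
    lo, hi = min(xs), max(xs)
    span = (hi - lo) or 1.0  # avoid division by zero for constant features
    return lambda x: (x - lo) / span

def fit_mean_baseline(xs, ys):
    """Candidate model 1: always predict the training mean."""
    m = mean(ys)
    return lambda x: m

def fit_linear(xs, ys):
    """Candidate model 2: one-variable ordinary least squares."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def mse(model, xs, ys):
    """Evaluate: mean squared error on held-out data."""
    return mean([(model(x) - y) ** 2 for x, y in zip(xs, ys)])

def run_pipeline():
    xs, ys = make_data()
    summary = describe(ys)
    (train_x, train_y), (test_x, test_y) = split(xs, ys)

    scale = fit_scaler(train_x)
    train_x = [scale(x) for x in train_x]
    test_x = [scale(x) for x in test_x]

    candidates = {"mean_baseline": fit_mean_baseline,
                  "linear_regression": fit_linear}
    models = {name: fit(train_x, train_y) for name, fit in candidates.items()}
    scores = {name: mse(model, test_x, test_y) for name, model in models.items()}

    # Select the model with the lowest held-out error.
    best_name = min(scores, key=scores.get)

    # "Deploy": wrap scaling and the chosen model into one prediction function.
    def predict(raw_x):
        return models[best_name](scale(raw_x))

    return best_name, scores, summary, predict

best_name, scores, summary, predict = run_pipeline()
print("target summary:", summary)
print("held-out MSE per model:", scores)
print("selected model:", best_name)
```

Fitting the scaler on the training split alone keeps information from the held-out data out of preprocessing, so the evaluation step gives a fairer estimate of how the selected model would behave on new data.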
This repository aims to perform predictive analytics on real-world datasets from multiple sources, using a combination of supervised and unsupervised models.