To Dos
Chris Kennedy edited this page Aug 11, 2017
·
18 revisions
HAL todo list
- Move duplicationCheck into its own function
  - Easier to test
  - Easier to try different versions
  - No real performance impact
- Alternative makeSparseMat implementations
  - C++
    - Armadillo sparseMat
    - Eigen?
    - Eventually: OpenMP and OpenACC (GPU) support
  - R
    - dplyr
    - data.table
- Alternative duplicationCheck implementations
  - C++
    - Armadillo sparseMat
    - Eigen?
- Early stopping on interactions
  - So we can avoid very high-order interactions
- Interaction restarting
  - Resume adding interactions after an initial partial fit to see if it improves performance
- More extensive performance profiling (RAM and CPU)
- Clear written description of all parts of the algorithm
  - Solid examples in Cheng Ju et al.’s “On Adaptive Propensity Score Truncation in Causal Inference”
- Alternative lasso implementations
  - h2o
  - C++
    - MLPACK
- Alternative prediction implementations
  - R
    - dplyr
  - C++
    - Armadillo / MLPACK
- Larger algorithm re-implementation
  - Save the indicator functions in a list with two vectors: variables used (e.g. x1, x3) and cutoffs (e.g. 1.5, 10.2)
- Wider R ML framework support
  - mlr wrapper
  - caret wrapper
- Python implementation based on a C++ core (à la xgboost, arborist, etc.)
  - Then a scikit-learn wrapper
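
As a rough illustration of the first item (duplicationCheck as its own, separately testable function), here is a minimal C++ sketch. It is not the package's current code. The function name `find_first_duplicates` and the input layout are assumptions made for the example: each column of a binary indicator matrix is given as a sorted list of the row indices that hold a 1.

```cpp
#include <cstddef>
#include <map>
#include <vector>

// Hypothetical stand-alone duplicate check (name and input layout are assumptions).
// Each column of a binary indicator matrix is passed as the sorted list of
// row indices that contain a 1.
// Returns, for every column j, the index of the first column whose contents
// are identical to column j (j itself if it is the first of its kind).
std::vector<std::size_t> find_first_duplicates(
        const std::vector<std::vector<std::size_t>>& columns) {
    std::map<std::vector<std::size_t>, std::size_t> first_seen;
    std::vector<std::size_t> representative(columns.size());
    for (std::size_t j = 0; j < columns.size(); ++j) {
        // emplace leaves the map unchanged when this column was already seen,
        // so the stored index is always that of the first occurrence.
        auto result = first_seen.emplace(columns[j], j);
        representative[j] = result.first->second;
    }
    return representative;
}
```

Because the check takes plain vectors and returns a plain vector, it can be unit tested without building a full model, and an Armadillo or Eigen version could be swapped in behind the same signature.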
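
The re-implementation item about storing each indicator function as two vectors could look roughly like the C++ sketch below. The struct name `IndicatorBasis`, the function `evaluate`, the zero-based variable indices, and the `>=` direction of the comparison are all assumptions made for illustration. The list above only specifies "variables used" and "cutoffs".

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical compact form of one indicator basis function:
// the product over the listed variables of 1(x[variable] >= cutoff).
struct IndicatorBasis {
    std::vector<std::size_t> variables;  // zero-based columns, e.g. {0, 2} for x1, x3
    std::vector<double> cutoffs;         // one cutoff per variable, e.g. {1.5, 10.2}
};

// Returns true when every listed variable meets or exceeds its cutoff.
// The >= direction is an assumption of this sketch.
bool evaluate(const IndicatorBasis& basis, const std::vector<double>& x) {
    assert(basis.variables.size() == basis.cutoffs.size());
    for (std::size_t k = 0; k < basis.variables.size(); ++k) {
        if (x[basis.variables[k]] < basis.cutoffs[k]) {
            return false;
        }
    }
    return true;
}
```

For example, with variables `{0, 2}` and cutoffs `{1.5, 10.2}`, the observation `(2.0, 0.0, 11.0)` evaluates to true and `(2.0, 0.0, 9.0)` to false. With this representation, prediction only needs the fitted coefficients and the list of such pairs, not the full training basis matrix.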