MADlib v1.7.1

@iyerr3 iyerr3 released this 21 Mar 22:52

Release Date: 2015-March-18

New features:

  • Random Forest Performance Improvement
    • Function forest_train() is 1.5X to 4X faster without variable importance,
      and up to 100X faster with variable importance
    • Function forest_predict() is up to 10X faster when type='response'
    • Allow user-specified sample ratio to train with a small subsample
  • Gaussian Naive Bayes: Allow continuous variables
  • K-Means: Allow user-specified sample ratio for K-means++ seeding
  • Miscellaneous
    • Array functions: array_square() for element-wise square, madlib.sum()
      for array element-wise aggregation
    • Madpack does not require a password when one is not necessary (MADLIB-357)
    • Platform support for PostgreSQL 9.4 and HAWQ 1.3
    • Allow views and materialized views for training functions
    • Support quantile computation in summary functions for HAWQ and PG 9.4
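
The two array helpers listed under Miscellaneous can be read as follows. This is a minimal Python sketch of the semantics only, not MADlib code: `array_square` here is a Python stand-in for the SQL function of the same name, `elementwise_sum` is a hypothetical name standing in for the madlib.sum() aggregate, and rejecting arrays of unequal length is an assumption rather than documented behavior.

```python
from typing import List, Sequence


def array_square(values: Sequence[float]) -> List[float]:
    """Square each element of one array independently."""
    return [v * v for v in values]


def elementwise_sum(rows: Sequence[Sequence[float]]) -> List[float]:
    """Sum arrays position by position, as an aggregate over rows would."""
    if not rows:
        return []
    length = len(rows[0])
    # Assumption: every array must have the same length.
    if any(len(row) != length for row in rows):
        raise ValueError("all arrays must have the same length")
    return [sum(column) for column in zip(*rows)]


print(array_square([1.0, 2.0, 3.0]))
print(elementwise_sum([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]))
```

In the sketch, each input list plays the role of one row's array column, so the aggregate returns a single array whose i-th entry is the sum of the i-th entries across all rows.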

Bug fixes:

  • Fixed support for multiple parameter values and NULL in general
    cross-validation (MADLIB-898, MADLIB-896)
  • Fixed infinite loop when detecting recursive view-to-view dependencies for
    upgrading (MADLIB-901)
  • Allow user-specified column names in PCA and multinom_predict()

Known issues:

  • Performance for decision tree with cross-validation is poor on a HAWQ
    multi-node system.