MADlib: Big Data Machine Learning in SQL for Data Scientists

  • Open Source, commercially usable BSD license
  • Supports Postgres, Pivotal Greenplum Database, and Pivotal HAWQ
  • Powerful analytics for Big Data

Latest News

MADlib v1.7 Release Announcement

MADlib v1.7 is released and available for download.

New features include:

  1. A new GLM (generalized linear models) module that supports a range of regression and classification methods beyond basic linear and logistic regression.
  2. Revamped Decision Tree and Random Forest modules that provide better features, easier-to-use interfaces and better performance.
  3. Enhanced PMML support for exporting GLM output and tree/forest models.

For a more detailed list of changes, see the MADlib v1.7 Release Notes.

Access the binaries on the MADlib Download Page. As always, the MADlib user forum is open for questions.

MADlib v1.6 Release Announcement

MADlib v1.6 is released and available for download.

New features include:

  1. A new unified ‘margins’ function that computes marginal effects for linear, logistic, multilogistic, and Cox proportional hazards regression.
  2. A new helper function that converts categorical variables to indicator variables using dummy encoding, so that they can be used directly in regression methods.
  3. Multi-fold performance improvements for Cox proportional hazards and ARIMA.
  4. New functionality to export linear and logistic regression models as PMML objects, using the PyXB Python library.
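
To illustrate what dummy encoding (item 2) means, here is a minimal Python sketch. It is not MADlib's implementation or its SQL interface; the function name `dummy_encode` and the sample rows are hypothetical and chosen only for illustration.

```python
def dummy_encode(rows, column):
    """Replace a categorical column with one 0/1 indicator column per level.

    `rows` is a list of dicts; `column` names the categorical field.
    Illustrative only: this is not MADlib's API.
    """
    # Collect the distinct category levels in a stable (sorted) order.
    levels = sorted({row[column] for row in rows})
    encoded = []
    for row in rows:
        # Copy every field except the categorical one.
        out = {key: value for key, value in row.items() if key != column}
        # Add one indicator per level: 1 if the row has that level, else 0.
        for level in levels:
            out[f"{column}_{level}"] = 1 if row[column] == level else 0
        encoded.append(out)
    return encoded


# Hypothetical sample data.
rows = [
    {"id": 1, "color": "red"},
    {"id": 2, "color": "blue"},
    {"id": 3, "color": "red"},
]
encoded = dummy_encode(rows, "color")
```

When the indicators feed a regression with an intercept, one level is usually dropped as the reference category to avoid perfect collinearity; the sketch keeps all levels for simplicity.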

Various bug fixes include:

  1. A check in K-Means to ensure that all data points have the same dimensionality and that it matches the dimensionality of any provided initial centroids.
  2. A check in multinomial regression to quit early and cleanly if the model size exceeds the maximum permissible memory.
  3. An error is now raised when a grouping column has the same name as one of the output table columns.
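
The K-Means check in item 1 amounts to a simple input validation. The Python sketch below shows the idea under the assumption that points and centroids are plain lists of numbers; the function name `check_dimensions` is hypothetical, and this is not MADlib's actual code.

```python
def check_dimensions(points, centroids=None):
    """Return the shared dimensionality, or raise ValueError on a mismatch.

    Illustrative only: this is not MADlib's implementation.
    """
    if not points:
        raise ValueError("no data points provided")
    # The first point defines the expected dimensionality.
    dim = len(points[0])
    for i, point in enumerate(points):
        if len(point) != dim:
            raise ValueError(
                f"point {i} has dimension {len(point)}, expected {dim}"
            )
    # Initial centroids are optional, but must match the data if given.
    for j, centroid in enumerate(centroids or []):
        if len(centroid) != dim:
            raise ValueError(
                f"centroid {j} has dimension {len(centroid)}, expected {dim}"
            )
    return dim
```

Running this kind of check before clustering starts lets a malformed input fail with a clear message instead of producing an error partway through the iterations.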

For a more detailed list of changes, see the MADlib v1.6 Release Notes.

Access the binaries on the MADlib Download Page. As always, the MADlib user forum is open for questions.

Older News