MADlib: Big Data Machine Learning in SQL for Data Scientists

  • Open Source, commercially usable BSD license
  • Supports Postgres, Pivotal Greenplum Database, and Pivotal HAWQ
  • Powerful analytics for Big Data

Latest News

MADlib v1.5 Release Announcement

MADlib v1.5 is released and available for download.

New features include:

  1. Support for the Pivotal Distribution of Hadoop (PHD) via HAWQ.
  2. Updated design and improved usability for Conditional Random Fields (CRFs).
  3. Performance improvements for linear and logistic prediction functions.

Bug fixes include:

  1. Fixed elastic net prediction to use all features rather than only the selected ones, which avoids an error when the trained model selects no feature as relevant.
  2. Fixed the quantile values of the Bernoulli and binomial distributions at the corner probability values p=0 and p=1: they are now 0 and num_of_trials (=1 in the case of Bernoulli) respectively, independent of the probability of success.
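
The corner-value convention in the second fix can be illustrated with a minimal Python sketch. This is not MADlib's implementation: the function names `binomial_quantile` and `bernoulli_quantile` are hypothetical, and the interior case simply assumes the usual definition of a discrete quantile as the smallest k whose cumulative probability reaches p.

```python
from math import comb


def binomial_quantile(p, num_of_trials, prob_success):
    """Quantile of Binomial(num_of_trials, prob_success) at probability p.

    Hypothetical helper for illustration only, not a MADlib function.
    """
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must lie in [0, 1]")
    if not 0.0 <= prob_success <= 1.0:
        raise ValueError("prob_success must lie in [0, 1]")

    # Corner values are handled explicitly, independent of prob_success.
    if p == 0.0:
        return 0
    if p == 1.0:
        return num_of_trials

    # Interior case (assumed definition): smallest k with P(X <= k) >= p.
    cdf = 0.0
    for k in range(num_of_trials + 1):
        cdf += (comb(num_of_trials, k)
                * prob_success ** k
                * (1.0 - prob_success) ** (num_of_trials - k))
        if cdf >= p:
            return k
    # Guard against floating-point round-off in the accumulated sum.
    return num_of_trials


def bernoulli_quantile(p, prob_success):
    """A Bernoulli variable is a binomial with a single trial."""
    return binomial_quantile(p, 1, prob_success)
```

Handling p=0 and p=1 before the search loop is what makes the result independent of the success probability; without the special case, a success probability of 0 or 1 would change the answer at the corners.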

For a more detailed list of changes see the MADlib v1.5 Release Notes.

Access the binaries on the MADlib Download Page. As always, the MADlib user forum is open for questions.

MADlib v1.4 Release Announcement

MADlib v1.4 is released and available for download.

New features include:

  1. Improved interface for Multinomial logistic regression.
  2. Robust variance and clustered variance estimators for Cox Proportional Hazards.
  3. NULL handling for various regression methods.

Deprecated functionality includes:

  1. The old mlogregr() function has been deprecated in favor of the new mlogregr_train() function.
  2. Optimizer parameters for robust variance functions have been gathered into a single parameter instead of three separate parameters. See the documentation for details.

For a more detailed list of changes see the MADlib v1.4 Release Notes.

Downloads are available on the MADlib Download Page.

As always, the MADlib user forum is open for questions.

MADlib v1.3 Release Announcement

Announcing the availability of MADlib v1.3, which adds stratification support for Cox Proportional Hazards and improves NULL handling.

Binary packages are available for CentOS/RedHat and for Mac OS X. On other platforms, MADlib can be built from source. Our Wiki provides detailed instructions for deploying MADlib on PostgreSQL and Greenplum installations. For a list of new features, bug fixes, and known issues, please refer to the Release Notes.

As always, the MADlib forum is open for questions and discussions. Try it out and send us your feedback!
