Poohowy edited this page Dec 15, 2017 · 33 revisions

Installing MADlib®

Prerequisites

  1. An installed version of Pivotal HAWQ, Pivotal Greenplum Database 4.2+, or PostgreSQL 9.2+ (64-bit), with plpython support enabled. Note: plpython may not be enabled in PostgreSQL by default.
  • For Pivotal HAWQ and Greenplum Database, contact your Pivotal account representative (http://pivotal.io).

  • For PostgreSQL, here is one installation method for CentOS:

          # Install 64-bit Postgres 9.2, including plpython
          $ sudo rpm -i http://yum.postgresql.org/9.2/redhat/rhel-6-x86_64/pgdg-centos92-9.2-6.noarch.rpm
          $ sudo yum install postgresql92 postgresql92-server postgresql92-contrib postgresql92-devel postgresql92-plpython postgresql92-plperl
    
          # Initialize the Postgres service
          $ sudo service postgresql-9.2 initdb
          $ sudo service postgresql-9.2 start
    
          # Create a user and database
          $ su postgres
          $ psql template1
              create user mdusername superuser password 'urpassword';
              CREATE DATABASE maddb;
              \q
    
          # Edit your PostgreSQL client authentication settings. The following configuration is
          # very permissive; depending on your security needs, you may want to be more conservative.
          # See the PostgreSQL documentation on the pg_hba.conf file for details:
          #     http://www.postgresql.org/docs/9.2/static/auth-pg-hba-conf.html
          $ nano /var/lib/pgsql/9.2/data/pg_hba.conf
    
               # TYPE  DATABASE    USER        CIDR-ADDRESS          METHOD
               # "local" is for Unix domain socket connections only
               local   all         all                               trust
               # IPv4 local connections:
               host    all         all         127.0.0.1/32          md5
               # IPv6 local connections:
               host    all         all         ::1/128               ident
               # Allow all remote connections
               host    all         all         0.0.0.0/0             md5
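
The prerequisites note that plpython may not be enabled in PostgreSQL by default. A minimal sketch of a check follows. The helper name `has_plpython` is hypothetical, and the sketch assumes the language is registered under the name `plpythonu`, as the PostgreSQL 9.2 plpython package does.

```shell
# Reads language names on stdin (one per line) and succeeds
# only if plpythonu is among them.
has_plpython() {
    grep -qx 'plpythonu'
}

# Usage, assuming a running server and the maddb database created above:
#   psql -At -d maddb -c 'SELECT lanname FROM pg_language' | has_plpython \
#       && echo "plpython is enabled" \
#       || echo "plpython is missing; try: CREATE EXTENSION plpythonu;"
```

If the check fails, running `CREATE EXTENSION plpythonu;` as a superuser in the target database is one way to enable the language on PostgreSQL 9.1 and later.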
    

Single Node: HAWQ >= 1.2, Greenplum >= 4.2 or PostgreSQL >= 9.2

Note: For PostgreSQL on Mac OS X, a Homebrew formula is available: https://github.com/Homebrew/homebrew-science/blob/master/madlib.rb

  1. Download the MADlib binary installation package that is appropriate for your version of Greenplum or PostgreSQL. (http://madlib.net/download)

    In the following, we will use $MADLIB_PACKAGE as a placeholder for the downloaded file.

  2. Start the MADlib installation.

    NOTE: If MADlib is already installed on this machine, this step will overwrite the existing MADlib binaries. To install to a different location, read the "Installing Multiple Versions of MADlib" section below.

    • Mac OS X: Double click on the installer package

    • Red Hat / CentOS Linux: Run the following as root:

        yum install $MADLIB_PACKAGE --nogpgcheck
      
  3. Make sure the MADlib in-database registration utility (madpack) will be able to locate your database installation:

    • For Greenplum, run source /path/to/greenplum/greenplum_path.sh
    • For PostgreSQL, make sure that psql is in PATH.
  4. Register MADlib in your database (for example: $DBMS=greenplum, $HOST=localhost:5432, $DATABASE=testdb, $USER=gpadmin):

    /usr/local/madlib/bin/madpack -p $DBMS -c $USER@$HOST/$DATABASE install
    

    For PostgreSQL, use $DBMS=postgres instead.

  5. To test your installation you can run the install check procedure:

    /usr/local/madlib/bin/madpack -p $DBMS -c $USER@$HOST/$DATABASE install-check
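
Steps 3 to 5 can be wrapped in a small script. The following is a sketch, not part of MADlib: the helper names `psql_in_path` and `madpack_cmd` are hypothetical, and the connection values reuse the example user and database from the prerequisites. It only prints the madpack command lines so they can be reviewed before being run.

```shell
MADPACK=/usr/local/madlib/bin/madpack

# Succeeds only if psql can be found in PATH (step 3, PostgreSQL case).
psql_in_path() {
    command -v psql >/dev/null 2>&1
}

# Prints the madpack command line for the given
# platform, user, host[:port], database, and action.
madpack_cmd() {
    dbms=$1 user=$2 host=$3 database=$4 action=$5
    printf '%s -p %s -c %s@%s/%s %s\n' \
        "$MADPACK" "$dbms" "$user" "$host" "$database" "$action"
}

psql_in_path || echo "warning: psql not found in PATH" >&2

# Step 4 (register) and step 5 (verify), printed for review.
madpack_cmd postgres mdusername localhost:5432 maddb install
madpack_cmd postgres mdusername localhost:5432 maddb install-check
```

To execute a printed command instead of only displaying it, pipe a single call to the shell, for example `madpack_cmd postgres mdusername localhost:5432 maddb install | sh`.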
    

Multi-Node Cluster: HAWQ >= 1.2 or Greenplum >= 4.2 on Red Hat / CentOS Linux

Follow the single-node installation, but replace step 2 with the following steps:

  1. Copy the installation package to the segment nodes:

    gpscp -f seg_host_file $MADLIB_PACKAGE "=:$(pwd)"

  2. Run the following as root to perform the installation on all nodes:

    gpssh -f seg_host_file << EOF
    cd "$(pwd)"
    yum -y install madlib-X.Y-Linux.rpm --nogpgcheck
    EOF
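
Both gpscp and gpssh read their target hosts from the file passed with -f, one host name per line. A minimal sketch of producing such a file follows. The helper name `host_list` and the host names sdw1 to sdw3 are hypothetical placeholders for your own segment hosts.

```shell
# Prints each argument on its own line, which is the
# layout expected for a host file passed with -f.
host_list() {
    printf '%s\n' "$@"
}

# Usage: write the host file used by the commands above.
#   host_list sdw1 sdw2 sdw3 > seg_host_file
```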

Other operating systems

Install from source as described in Building MADlib from Source.

Installing Multiple Versions of MADlib

To install another MADlib package (of any version) on a system with an existing MADlib installation, follow these hints:

  • For full control, install from source and adjust the CMAKE_INSTALL_PREFIX build setting, e.g.

    ./configure -DCMAKE_INSTALL_PREFIX=/other/directory

    See Building MADlib from Source for details.

  • You can install the MADlib RPM into a custom directory using the --relocate parameter, e.g.

    rpm -i madlib-X.Y-Linux.rpm --relocate /usr/local=/other/directory

  • When registering a MADlib version in your database (step 4 above), call /other/directory/madlib/bin/madpack instead.
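
When several versions are installed side by side, it can help to derive the madpack path from the install prefix rather than typing it each time. The following is a sketch only: the helper name `madpack_for_prefix` is hypothetical, and it assumes the madlib/bin/madpack layout under the prefix, as in the paths shown above.

```shell
# Prints the madpack path for a given install prefix.
# With no argument, falls back to the default /usr/local.
madpack_for_prefix() {
    prefix=${1:-/usr/local}
    # Strip one trailing slash so the result has no "//".
    printf '%s/madlib/bin/madpack\n' "${prefix%/}"
}

# Usage: register the relocated version in your database.
#   "$(madpack_for_prefix /other/directory)" -p $DBMS -c $USER@$HOST/$DATABASE install
```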