Tutorial | Configure a connection between Dataiku and a SQL database #

Get started #

Data projects must begin with connecting to data sources.

Objectives #

In this tutorial, you will:

  • Create a local SQL connection if you do not already have one available.

  • Create a connection to this source in Dataiku.

Prerequisites #

  • You will need admin permission on a Dataiku instance.

Create a SQL database #

To get started, you need to have a SQL database available. If you already have one, please skip to the next section .

Note

For this example, we’ll use PostgreSQL, but you can follow a similar process for any supported SQL database .

Install PostgreSQL #

If it isn’t already available, install PostgreSQL version 9 or above. Take note of the host on which PostgreSQL is installed.

Tip

On macOS, we recommend using the Postgres App .

Configure your PostgreSQL database #

At a minimum you need a user and a database, as explained in the PostgreSQL guide . As a best practice, we recommend using schemas in order to administer multiple projects within a database.

For example, the sample code below creates the user matthieu , with password Password , and grants this user all privileges (can create and delete tables) in the dku_churn schema in the dku database. Similarly, user dku_tshirt_admin has been granted all privileges on the dku_tshirt schema in the dku database.

psql -h localhost
CREATE DATABASE dku;
\c dku
CREATE SCHEMA dku_churn;
CREATE USER matthieu WITH PASSWORD 'Password';
GRANT ALL PRIVILEGES ON SCHEMA dku_churn TO matthieu;
CREATE SCHEMA dku_tshirt;
CREATE USER dku_tshirt_admin WITH PASSWORD 'Password';
GRANT ALL PRIVILEGES ON SCHEMA dku_tshirt TO dku_tshirt_admin;
\q

Configure a SQL connection in Dataiku #

Now that we have credentials for a SQL database, we need to establish a connection between Dataiku and this database.

Important

This walkthrough is for a self-managed Dataiku instance. If using Dataiku Cloud, see the how-to for adding a new connection .

Create a connection #

  1. Log in as the Dataiku Administrator.

  2. From the Applications menu in the top navigation bar, choose Administration .

    The Administration menu is located in the Admin Tools menu.
  3. Navigate to the Connections tab, and select New Connection > PostgreSQL .

    Many different SQL database connections are available in the New Connection menu.
  4. Fill in the information required for the connection. Don’t forget to give a name to your connection!

    Warning

    You cannot change the connection name afterward.

    Create a SQL connection on the new PostgreSQL connection screen.
  5. Test and create your PostgreSQL connection.

Import datasets into Dataiku #

After creating your connection, if you are connecting to a pre-existing SQL database, you can create a Dataiku dataset for every table in the database.

  1. Click Import tables to datasets .

  2. Select the tables you want to import.

  3. Click Import tables .

  4. Select the project in which the datasets associated with these tables should appear.

You can mass import tables after creating the SQL connection.

What’s next? #

Congratulations! You now have a SQL connection to use underneath your Dataiku Flows.

Learn more about how to take advantage of this type of connection in Tutorial | Data transfer with visual recipes .

Tip

You can find this content (and more) by registering for the Dataiku Academy course, Integration with SQL Databases . When ready, challenge yourself to earn a certification !