# How to Create a Custom Recipe

By writing a custom recipe, you can add a new kind of recipe to Dataiku DSS. The idea is:

* You write the core of the recipe in Python or R code

* You write a JSON descriptor that declares:

  + The kinds of inputs and outputs of the recipe

  + The available configuration parameters

* In the Python or R code of the recipe, you use a specific API to retrieve the inputs, outputs and parameters (i.e., the “instantiation parameters”) of the recipe

To the user, the custom recipe is a visual recipe in which they can enter the declared configuration parameters and run the recipe.

Let’s write a custom recipe that computes pairwise correlations (i.e., correlations between the values in pairs of columns). Such a recipe could be used, for example, to discover that the price of a car has a strong negative correlation with the mileage.
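As a quick illustration of what a pairwise correlation measures, the sketch below uses pandas on a tiny table of car prices and mileages. The values and column names are invented for this example and are not part of the tutorial data.

```python
import pandas as pd

# Hypothetical toy data: the values are invented purely for illustration.
cars = pd.DataFrame({
    "price":   [21000, 18500, 15000, 11000, 7500],
    "mileage": [10000, 35000, 60000, 95000, 140000],
})

# Pearson correlation between the two columns, always in the range [-1, 1].
price_mileage_corr = cars["price"].corr(cars["mileage"])
print(round(price_mileage_corr, 3))
```

A value close to -1 means that one column tends to go down as the other goes up, which is the kind of relationship the recipe in this tutorial is meant to surface.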

We will start by writing a Python recipe in the Flow of the tutorial project, and then make it “reusable”.

## Prerequisites

In this tutorial, we’ll use the plugin we created in the introduction and add a custom recipe component to it.

## Create Your Project

The first step is to create a new Dataiku DSS **Project**.

* From the Dataiku homepage, click **+New Project > DSS Tutorials > Code > Your first plugin (Tutorial)**.

The project includes the example dataset *wine\_quality*.

## Create the Base Recipe

Create a Python recipe with the *wine\_quality* dataset as an input and a new *wine\_correlation* dataset as the output.

The recipe code should look like the following:

§ # -*- coding: utf-8 -*-

§ import dataiku

§ import pandas as pd, numpy as np

§ # Read the input

§ input_dataset = dataiku.Dataset("wine_quality")

§ df = input_dataset.get_dataframe()

§ column_names = df.columns

§ # We'll only compute correlations on numerical columns

§ # So extract all pairs of names of numerical columns

§ pairs = []

§ for i in range(0, len(column_names)):

§     for j in range(i + 1, len(column_names)):

§         col1 = column_names[i]

§         col2 = column_names[j]

§         if df[col1].dtype == "float64" and \

§            df[col2].dtype == "float64":

§             pairs.append((col1, col2))

§ # Compute the correlation for each pair, and write a

§ # row in the output array

§ output = []

§ for pair in pairs:

§     # Off-diagonal cell of the 2x2 correlation matrix

§     corr = df[[pair[0], pair[1]]].corr().iloc[0, 1]

§     output.append({"col0" : pair[0],

§                    "col1" : pair[1],

§                    "corr" : corr})

§ # Write the output to the output dataset

§ output_dataset = dataiku.Dataset("wine_correlation")

§ output_dataset.write_with_schema(pd.DataFrame(output))

Run the recipe and look at the output: a dataset with three columns (col0, col1, corr) and one row per pair of numerical (float64) input columns.
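The nested loops in the recipe can also be written more compactly. The following is a minimal sketch rather than the tutorial's code: it computes the correlation matrix once with `DataFrame.corr()` and then reads each pair from it. The function name `pairwise_correlations` and the sample frame are hypothetical, and the sketch runs outside Dataiku on a plain pandas DataFrame.

```python
from itertools import combinations

import pandas as pd


def pairwise_correlations(df):
    """Return one row per pair of float64 columns with their Pearson correlation."""
    # Keep only float64 columns, mirroring the dtype check in the recipe.
    numeric = df.select_dtypes(include=["float64"])
    # Compute the full correlation matrix once instead of once per pair.
    corr_matrix = numeric.corr()
    rows = [
        {"col0": a, "col1": b, "corr": corr_matrix.loc[a, b]}
        for a, b in combinations(numeric.columns, 2)
    ]
    return pd.DataFrame(rows, columns=["col0", "col1", "corr"])


# Hypothetical sample data standing in for the wine_quality dataset.
sample = pd.DataFrame({
    "x": [1.0, 2.0, 3.0, 4.0],
    "y": [2.0, 4.0, 6.0, 8.0],
    "z": [4.0, 3.0, 2.0, 1.0],
    "label": ["a", "b", "c", "d"],
})
result = pairwise_correlations(sample)
print(result)
```

Here the non-numerical `label` column is ignored, so three numerical columns yield three pairs.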

## Convert It to a Custom Recipe

To make this Python recipe a custom recipe:

* Click **Actions**

* Choose **Convert to plugin**

* Select **Existing dev plugin**

* Choose *first-plugin* as the **Plugin id**

* Type *compute-correlation* as the **New plugin recipe id**

* Click **Convert**

* Dataiku generates the custom recipe files and suggests editing them in the Plugin Developer. Let’s do that.

For the rest of the tutorial, we’ll tweak the generated files.

Note

Dataiku stores the generated files under the **Plugin id**. You can always find the plugins you developed by visiting the application menu and selecting **Plugins** > **Development**.

### Edit Definitions in recipe.json

First, let’s have a look at the *recipe.json* file. The most important things to change are the `inputRoles` and `outputRoles` arrays. Roles allow you to associate one or more datasets with each kind of input and output of the recipe.

Our recipe is a simple one: it has one input role with exactly 1 dataset, and one output role with exactly 1 dataset. Edit your JSON to look like:

§ "inputRoles" : [

§     {

§         "name": "input",

§         "label": "Input dataset",

§         "description": "The dataset containing the raw data from which we'll compute correlations.",

§         "arity": "UNARY",

§         "required": true,

§         "acceptsDataset": true

§     }

§ ],

§ "outputRoles" : [

§     {

§         "name": "main_output",

§         "label": "Output dataset",

§         "description": "The dataset containing the correlations.",

§         "arity": "UNARY",

§         "required": true,

§         "acceptsDataset": true

§     }

§ ],

We’d like to allow users of this plugin to be able to focus on “strong” correlations (i.e., values that are closest to +1 or -1).

We can specify a threshold parameter that can be set in the recipe dialog by editing the `params` section of *recipe.json*:

§ "params": [

§     {

§         "name": "threshold",

§         "label": "Threshold for showing a correlation",

§         "type": "DOUBLE",

§         "defaultValue": 0.5,

§         "description": "Correlations below the threshold will not appear in the output dataset",

§         "mandatory": true

§     }

§ ],
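The role and parameter names declared in *recipe.json* are the same strings that *recipe.py* uses to look them up, so a mismatch between the two files is an easy mistake to make. The sketch below is one way to list those names outside Dataiku using only the Python standard library. It is not a Dataiku feature, and `descriptor_text` is a trimmed, hand-copied version of the fragments in this section, wrapped in braces so that it parses on its own.

```python
import json

# A trimmed copy of the descriptor fragments from this section, wrapped in
# braces so that it parses as a standalone JSON object.
descriptor_text = """
{
    "inputRoles": [
        {"name": "input", "arity": "UNARY", "required": true, "acceptsDataset": true}
    ],
    "outputRoles": [
        {"name": "main_output", "arity": "UNARY", "required": true, "acceptsDataset": true}
    ],
    "params": [
        {"name": "threshold", "type": "DOUBLE", "defaultValue": 0.5, "mandatory": true}
    ]
}
"""

descriptor = json.loads(descriptor_text)

# Collect the names that recipe.py will refer to.
input_role_names = [role["name"] for role in descriptor["inputRoles"]]
output_role_names = [role["name"] for role in descriptor["outputRoles"]]
param_defaults = {p["name"]: p.get("defaultValue") for p in descriptor["params"]}

print(input_role_names, output_role_names, param_defaults)
```

If your actual *recipe.json* contains comments, note that the strict `json` module will refuse to parse it, so this check only works on a comment-free copy.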

### Edit Code in recipe.py

Now let’s edit *recipe.py*. The default contents include some generic starter code for referencing roles and parameters, the code from your Python recipe, and some comments that explain how to finish creating your custom recipe. In the end, your *recipe.py* should start with code for retrieving datasets and parameters like:

§ import dataiku

§ from dataiku.customrecipe import get_input_names_for_role, get_output_names_for_role, get_recipe_config

§ # Retrieve array of dataset names from 'input' role, then create datasets

§ input_names = get_input_names_for_role('input')

§ input_datasets = [dataiku.Dataset(name) for name in input_names]

§ # For outputs, the process is the same:

§ output_names = get_output_names_for_role('main_output')

§ output_datasets = [dataiku.Dataset(name) for name in output_names]

§ # Retrieve parameter values from the map of parameters

§ threshold = get_recipe_config()['threshold']
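Since the threshold is typed by the user, it can be worth validating it before using it. The sketch below assumes that the recipe configuration behaves like a Python dictionary, as the "map of parameters" wording suggests. The helper name `read_threshold` is hypothetical, and plain dictionaries stand in for the real configuration so that the example runs outside Dataiku.

```python
def read_threshold(config, default=0.5):
    """Return a validated threshold from a recipe configuration mapping."""
    # Fall back to the default if the key is missing or left empty.
    raw = config.get("threshold")
    threshold = default if raw is None else float(raw)
    # The absolute value of a Pearson correlation always lies in [0, 1].
    if not 0.0 <= threshold <= 1.0:
        raise ValueError(f"threshold must be between 0 and 1, got {threshold}")
    return threshold


# Plain dictionaries stand in for the value returned by get_recipe_config().
print(read_threshold({"threshold": 0.8}))
print(read_threshold({}))
```

Failing early with a clear message is usually easier to diagnose than an output dataset that is silently empty because the threshold was set above 1.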

The portion of your original recipe that reads inputs needs to be updated to refer to the datasets created from the input roles, like:

§ # Read the input

§ input_dataset = input_datasets[0]

§ df = input_dataset.get_dataframe()

§ column_names = df.columns

The portion of your original recipe that computes the correlations should be updated to include the threshold to filter out the weak correlations:

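§ for pair in pairs:

§     # Off-diagonal cell of the 2x2 correlation matrix

§     corr = df[[pair[0], pair[1]]].corr().iloc[0, 1]

§     # Keep only strong correlations, whether positive or negative

§     if np.abs(corr) > threshold:

§         output.append({"col0" : pair[0],

§                        "col1" : pair[1],

§                        "corr" : corr})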
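The same filter can be applied after the fact to a DataFrame of correlations, which also makes it easy to sort the strongest pairs first. This is a sketch under assumptions: the rows in `output` are invented sample values shaped like the recipe's output list, and the sorting step is an addition that the tutorial recipe does not perform.

```python
import pandas as pd

# Hypothetical correlation rows, shaped like the recipe's output list.
output = [
    {"col0": "a", "col1": "b", "corr": 0.92},
    {"col0": "a", "col1": "c", "corr": -0.75},
    {"col0": "b", "col1": "c", "corr": 0.10},
]
threshold = 0.5

all_pairs = pd.DataFrame(output, columns=["col0", "col1", "corr"])
# Keep rows whose correlation is strong in either direction.
strong = all_pairs[all_pairs["corr"].abs() > threshold]
# Sort by strength (absolute value), strongest first.
strong = strong.sort_values("corr", key=lambda s: s.abs(), ascending=False)
print(strong)
```

In both this sketch and the recipe, a correlation that is NaN (for example, when one column is constant) never passes the comparison, so such pairs are left out of the output.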

The portion of your original recipe that writes the output datasets also needs to be updated to refer to the datasets created from the output roles, like:

§ # Write the output to the output dataset

§ output_dataset = output_datasets[0]

§ output_dataset.write_with_schema(pd.DataFrame(output))
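One edge case to keep in mind: with a high threshold, no pair may pass the filter and `output` ends up empty. The sketch below shows, in plain pandas, that a DataFrame built from an empty list has no columns unless you name them. How Dataiku treats a column-less frame when writing is not covered here, so naming the columns explicitly is offered only as a defensive option.

```python
import pandas as pd

# With a high threshold, it is possible that no pair passes the filter.
output = []

# Without explicit columns, an empty list produces a frame with no columns at all.
no_schema = pd.DataFrame(output)
# Naming the columns keeps the three-column layout even when there are no rows.
with_schema = pd.DataFrame(output, columns=["col0", "col1", "corr"])

print(list(no_schema.columns))
print(list(with_schema.columns))
```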

Verify that `wine_quality` and `wine_correlation` no longer appear in your recipe. In general, the rest of *recipe.py* can be left as-is.

## Use Your Custom Recipe in the Flow

Note

After editing *recipe.json* for a custom recipe, you must do the following:

* Click **Reload**

* Reload the Dataiku DSS page in your browser

When modifying the *recipe.py* file, you don’t need to reload anything. Simply run the recipe again.

* Go to the Flow

* Click **+ Recipe** and select your plugin recipe. The usual recipe creation tab appears.

* Select the *wine\_quality* input dataset

* Create a new output dataset

* Run the recipe, editing the default threshold value if you desire

Congratulations, you have created your first custom visual recipe!
