How-to | Reshape data from wide to long format

You can use the Pivot recipe to reshape data from long to wide format. If your data instead starts out in wide format, you can “unpivot” it into long format with the Prepare recipe processor Fold multiple columns (or Fold multiple columns by pattern).

Consider a dataset with the following structure:

Dataiku screenshot of a dataset in wide format.

To reshape this dataset so that the *_total_sum columns are folded into a single total_sum column, with one row per year:

  1. In a Prepare recipe, click + Add a New Step.

  2. Choose Fold multiple columns by pattern.

  3. In the Columns to fold pattern field, supply a regular expression that matches the names of the columns to fold.

  4. In the Column for fold name field, provide a name for the new column that holds the names of the folded columns (in this case, year).

  5. In the Column for fold value field, provide a name for the new column that holds the cell values (in this case, total_sum).

  6. Check the box Remove folded columns to delete the folded columns from the schema of the output dataset.

Dataiku screenshot of a dataset in wide format.
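
For readers who want to see the same reshaping outside the visual recipe, here is a minimal pandas sketch of a wide-to-long fold. It is only an illustration, not the processor's implementation. The sample data, the department column, and the regular expression `(.*)_total_sum` are hypothetical, because the actual column names appear only in the screenshot. The capture group is used here to keep just the year part of each column name. Check the processor's reference documentation for how it handles capture groups.

```python
import re

import pandas as pd

# Hypothetical wide-format data: one *_total_sum column per year.
wide_df = pd.DataFrame(
    {
        "department": ["A", "B"],
        "2016_total_sum": [10, 20],
        "2017_total_sum": [30, 40],
    }
)

# Assumed pattern: matches the columns to fold and captures the year prefix.
pattern = re.compile(r"(.*)_total_sum")

# Columns whose full name matches the pattern are folded; the rest are kept.
fold_cols = [c for c in wide_df.columns if pattern.fullmatch(c)]
keep_cols = [c for c in wide_df.columns if c not in fold_cols]

# melt() turns each folded column into rows: the column name goes into
# "year" and the cell value goes into "total_sum".
long_df = wide_df.melt(
    id_vars=keep_cols,
    value_vars=fold_cols,
    var_name="year",
    value_name="total_sum",
)

# Replace the full column name with the captured part (the year).
long_df["year"] = long_df["year"].map(lambda c: pattern.fullmatch(c).group(1))

print(long_df)
```

Because `melt()` returns only the identifier columns plus the two new columns, the folded columns are dropped from the result, which corresponds to checking Remove folded columns. The row order of this sketch may differ from the order Dataiku produces.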

Note

You can find another example of this processor being used in the reference documentation.