Concept | Date handling in Dataiku


Introduction to date parsing challenges

Working with dates poses a number of data cleaning challenges.

There are many date formats, different time zones, and components such as “day of the week” that can be difficult to extract. A human might recognize that “1/5/19” (read as month/day/year), “2019-01-05”, and “5 January, 2019” all refer to the same date. To a computer, however, these are just three different strings.
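As a rough illustration outside Dataiku, the sketch below uses Python's standard library to show three differently formatted strings collapsing to one date once each is parsed with an explicit format. The sample values and the format assigned to each are assumptions made for this example. Note that "1/5/19" is ambiguous on its own: read as day/month/year, it is a different date.

```python
from datetime import datetime

# Hypothetical raw values, each paired with the format it is assumed to follow.
raw_values = [
    ("1/5/19", "%m/%d/%y"),            # read as month/day/year
    ("2019-01-05", "%Y-%m-%d"),        # ISO 8601
    ("5 January, 2019", "%d %B, %Y"),  # English month name (default C locale)
]

# Parse every string with its own format and keep only the calendar date.
parsed = [datetime.strptime(text, fmt).date() for text, fmt in raw_values]

# Three distinct strings, but only one distinct date after parsing.
distinct_strings = {text for text, _ in raw_values}
distinct_dates = set(parsed)

# The same first string read as day/month/year gives a different date.
day_first = datetime.strptime("1/5/19", "%d/%m/%y").date()
```

This is why the format has to be stated explicitly when parsing: the string alone does not determine the date.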

Parsing dates.

Using the Prepare recipe to solve these challenges

Strings representing dates need to be parsed so that the computer can recognize the true, unambiguous meaning of the date. Dataiku answers this problem with the Prepare recipe.

When a column appears to contain dates, Dataiku is able to recognize it as such. In the example below, the meaning of the first column is an unparsed date.

You can proceed in two ways to parse it:

  • Open the processor library, filter for Dates, and look for a step that fits your situation. Here, we find the Parse date processor.

    Parse date processor.
  • Take advantage of how Dataiku suggests transformation steps based on a column’s meaning. Because Dataiku has identified this column as an unparsed date, it suggests adding the Parse date processor to the script.

    Screenshot of the context menu of a date column.

Both methods achieve the same result.

After you have chosen the correct processor, it takes only a few more clicks to select the correct settings: in this case, the format of the date and the time zone.
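To make the role of these two settings concrete, here is a minimal sketch in plain Python, not Dataiku's processor. The helper `parse_date`, the sample timestamp, and the UTC+1 offset are all hypothetical and chosen only for illustration. The format tells the parser how to read the string, and the time zone fixes which instant that local time refers to.

```python
from datetime import datetime, timedelta, timezone

def parse_date(text, fmt, tz):
    """Parse `text` using `fmt`, then mark the result as local time in `tz`."""
    naive = datetime.strptime(text, fmt)  # no time zone information yet
    return naive.replace(tzinfo=tz)       # now an unambiguous instant

# Hypothetical input: a day/month/year timestamp recorded at UTC+1.
source_tz = timezone(timedelta(hours=1))
parsed = parse_date("05/01/2019 09:30", "%d/%m/%Y %H:%M", source_tz)

# Normalizing to UTC yields a single standard representation.
iso_utc = parsed.astimezone(timezone.utc).isoformat()
print(iso_utc)  # 2019-01-05T08:30:00+00:00
```

Changing either setting changes the result: a different format may swap day and month, and a different time zone shifts the instant.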

Once you’ve added a step, a preview of the output is immediately visible. You can see how the format of the date has changed, and the meaning is now a Date.

Now, with the properly parsed date, you’re on your way! Dataiku will suggest new steps, such as Compute time since, Extract date components, and Filter on date.
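For intuition about what these kinds of steps compute, the following sketch shows rough plain-Python analogues. It does not reproduce how Dataiku implements the processors. The function names, sample dates, and reference date are hypothetical.

```python
from datetime import date

# Hypothetical column of already parsed dates.
order_dates = [date(2019, 1, 5), date(2019, 3, 17), date(2019, 12, 31)]

def date_components(d):
    """Split a date into parts; day_of_week follows ISO (1 = Monday, 7 = Sunday)."""
    return {"year": d.year, "month": d.month, "day": d.day,
            "day_of_week": d.isoweekday()}

def days_since(d, until):
    """Whole days elapsed from `d` up to the reference date `until`."""
    return (until - d).days

def filter_between(dates, start, end):
    """Keep only the dates inside the inclusive range [start, end]."""
    return [d for d in dates if start <= d <= end]

components = date_components(order_dates[0])
elapsed = days_since(order_dates[0], date(2020, 1, 1))
first_half = filter_between(order_dates, date(2019, 1, 1), date(2019, 6, 30))
```

None of these operations are reliable on raw strings, which is why such steps only become available once the column has been parsed.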


What’s next?

In this lesson, you learned how to handle and format dates in Dataiku. Continue getting to know the basics of Dataiku by learning about formulas.