FAQ | Where can I see how many records are in my entire dataset?

The default sample previewed in the Explore tab of a dataset is the first 10,000 records, but your whole dataset may contain many more. Dataiku offers several built-in ways to check the full record count (each is described in more detail below):

  1. From the Flow, select the dataset and directly compute dataset metrics from the Info tab on the far right panel, under the Status header.

  2. With the dataset open in the Explore tab, select the Compute row count icon at the top of the dataset.

  3. With the dataset open, visit the Status tab to compute or review dataset metrics.

  4. If record count is part of a recurring quality check (after a scenario run, for example), you can embed this metric into a Dataiku dashboard and set it to automatically update each time the dataset is rebuilt.

Method 1: From the Flow

With the dataset selected in the Flow, navigate to the Info tab in the far right panel and click Compute under the Status header.

Computing record counts in the Info tab under the Status header.

Configured metrics appear in place in this panel and can be refreshed from here whenever needed.

Window showing the total size and record count of the dataset.

Method 2: Compute row count

In the Explore dataset view, Dataiku displays the number of sampled rows in the top left. For datasets larger than 10,000 rows, Dataiku shows the total record count as “not computed” by default.

To view the record count, select the Compute row count icon, or the arrow icon, next to the Sample badge.

Computing record counts with the Compute Row Count icon in the top left of the dataset.
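The difference between the sampled count and the full count can also be seen from code. The following is a minimal sketch, not a Dataiku feature: the helper `count_rows` is a hypothetical name, and it simply sums the lengths of whatever chunks it is given. The chunks here are plain lists standing in for pandas DataFrames.

```python
def count_rows(chunks):
    """Sum the lengths of an iterable of chunks (for example, pandas DataFrames)."""
    total = 0
    for chunk in chunks:
        total += len(chunk)
    return total


# Stand-in chunks for illustration: three lists in place of DataFrames.
print(count_rows([[1, 2, 3], [4, 5], [6]]))  # prints 6
```

In a Dataiku Python notebook or recipe, the same helper could be fed with `dataiku.Dataset("customers").iter_dataframes(chunksize=50000)`, where `"customers"` is a placeholder dataset name. This reads the entire dataset, so it may be slow on large tables; the Compute row count icon is usually the simpler route.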

Methods 3 and 4: Status tab and metrics

From the Explore dataset view, navigate to the Status tab and click Compute.

Status tab showing the number of columns and records in a dataset.
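The same metric can also be computed from code rather than from the Status tab. The following is a minimal sketch under a few assumptions: the dataset handle behaves like a `dataikuapi` dataset object exposing `compute_metrics` and `get_last_metric_values`, the built-in record count metric has the id `records:COUNT_RECORDS`, and the dataset is not partitioned. The function name `full_record_count` is hypothetical; check the calls against the API reference for your Dataiku version.

```python
# Assumed id of the built-in record count metric.
RECORD_COUNT_METRIC = "records:COUNT_RECORDS"


def full_record_count(dataset, metric_id=RECORD_COUNT_METRIC):
    """Compute the record count metric on a dataset handle and return it as an int.

    `dataset` is assumed to behave like a dataikuapi dataset handle.
    """
    # Ask Dataiku to (re)compute only the requested metric.
    dataset.compute_metrics(metric_ids=[metric_id])
    # Read back the most recently stored value for that metric.
    last_values = dataset.get_last_metric_values()
    return int(last_values.get_global_value(metric_id))
```

Inside a Dataiku notebook, a handle could be obtained with something like `dataiku.api_client().get_default_project().get_dataset("customers")`, where `"customers"` is a placeholder dataset name, and then passed to `full_record_count`.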

The default metrics are column count and record count, but you can add additional dataset metrics in the Edit subtab if desired. Metrics are often used in conjunction with scenarios, but are not strictly dependent on scenarios. For example, tracking the number of records might show you how many new customer records are getting added to the database each day.
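To make the customer-records example concrete, the sketch below turns a series of daily record counts into the number of records added each day. It is plain Python and does not call Dataiku; the function name `new_records_per_day` and the counts are hypothetical and chosen only for illustration.

```python
from datetime import date


def new_records_per_day(history):
    """Return {day: records added since the previous measurement}.

    `history` is a list of (day, record_count) pairs in any order.
    """
    ordered = sorted(history)
    deltas = {}
    # Pair each measurement with the one that follows it.
    for (_, prev_count), (day, count) in zip(ordered, ordered[1:]):
        deltas[day] = count - prev_count
    return deltas


# Hypothetical record counts, for illustration only.
history = [
    (date(2024, 1, 1), 10_000),
    (date(2024, 1, 2), 10_250),
    (date(2024, 1, 3), 10_900),
]
print(new_records_per_day(history))
```

With these made-up values, the result shows 250 records added on the second day and 650 on the third. The first day has no entry because there is no earlier measurement to compare against.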

Metrics can be published to a Dataiku dashboard. To have them update automatically each time the dataset is rebuilt (as might be the case in a recurring automation scenario), set the Auto compute after build option to Yes.

Updating metrics to automatically compute after build.

Note that metrics probes are automatically historized, which is useful for tracking how a dataset's status evolves over time. To review the history of a dataset metric, select History instead of Last value in the Display dropdown menu of the main Metrics page.

View the history of record counts in the dataset.

You can find more information about metrics in the Dataiku reference documentation.