You do not have any connection for {{dataset.type|datasetTypeToName}} yet. You need to create one before you can create datasets on it.
Leave blank to use the Project ID associated with the connection.
Leave blank to use the Snowflake database associated with the connection.
Leave blank to use the MS SQL Server database associated with the connection.
Leave blank to use the MS Fabric Warehouse database associated with the connection.
Leave blank to use the Databricks catalog associated with the connection.
Important note about Hive query datasets

We recommend using Hive query datasets only when absolutely necessary, i.e. when no table or view in your Hive database fits your needs.
Hive query datasets have the following performance-related implications:

  • They cannot be used as inputs of visual recipes with the "Hive" or "Spark" engine. Any visual recipe that takes a Hive query dataset as input will run on the DSS engine, with rows streamed out of Hive.
  • Any access to the dataset causes the query to be executed. If the query is complex, with joins or aggregations, this can make interacting with the dataset very slow.

Note that the "Table" mode also accepts views. We usually recommend creating views in your Hive database.

Important note about SQL query datasets

We recommend using SQL query datasets only when absolutely necessary, i.e. when no table or view in your database fits your needs.
SQL query datasets have the following performance-related implications:

  • They cannot be used as inputs of visual recipes with "in-database (SQL)" processing. Any visual recipe that takes a SQL query dataset as input will run on the DSS engine, with rows streamed out of the database.
  • When used as input of a SQL query recipe, they result in nested sub-queries in the query recipe, which might not always work.
  • Any access to the dataset causes the query to be executed. If the query is complex, with joins or aggregations, this can make interacting with the dataset very slow.

Note that the "Table" mode also accepts views. We usually recommend creating views in your database.
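The difference between a query dataset and a view can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it uses an in-memory SQLite database rather than a real DSS connection, the table, query and view names are hypothetical, and the exact SQL that DSS generates is not shown here. It shows a custom query being wrapped as a nested sub-query by a downstream query, then the same logic exposed as a view that can be read in "Table" mode.

```python
import sqlite3

# In-memory SQLite database standing in for the real database (assumption).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE orders (customer TEXT, amount REAL);
INSERT INTO orders VALUES ('a', 10), ('a', 5), ('b', 7);
""")

# Hypothetical custom query, as a user might type it into a SQL query dataset.
dataset_query = (
    "SELECT customer, SUM(amount) AS total FROM orders GROUP BY customer"
)

# Option 1: a downstream query has to wrap the custom query as a sub-query.
nested_sql = f"SELECT customer FROM ({dataset_query}) AS q WHERE total > 8"

# Option 2: expose the same logic as a view and read it like a table.
conn.execute(f"CREATE VIEW customer_totals AS {dataset_query}")
view_sql = "SELECT customer FROM customer_totals WHERE total > 8"

nested_rows = conn.execute(nested_sql).fetchall()
view_rows = conn.execute(view_sql).fetchall()
print(nested_rows, view_rows)
```

Both forms return the same rows here. The view keeps the downstream SQL flat, whereas the nested form relies on the database accepting the custom query as a sub-query.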

Date & Time handling
SQL "datetime without timezone" values read as DSS "datetime with tz" are assumed to be in the timezone selected below, and are expressed in UTC.
SQL "date" values read as DSS "datetime with tz" are assumed to be 00:00 of that date in the timezone selected below, and are expressed in UTC.
Timezone to assign to SQL "date" and SQL "timestamp without time zone" values, which carry no timezone, when reading them as DSS "datetime with tz".
Teradata uses different time zone names. You need to select the timezone matching the "Assumed Time Zone".
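The conversion described above can be illustrated with a minimal Python sketch. It is not DSS code: the function names are hypothetical, and the assumed time zone is a fixed UTC+02:00 offset chosen so the example has no external dependency. Any `tzinfo` object, for example a named zone from the standard `zoneinfo` module, could be passed instead.

```python
from datetime import date, datetime, time, timedelta, timezone, tzinfo


def naive_to_utc(value: datetime, assumed_tz: tzinfo) -> datetime:
    """Interpret a timezone-less datetime in assumed_tz, then express it in UTC."""
    if value.tzinfo is not None:
        raise ValueError("expected a timezone-less datetime")
    return value.replace(tzinfo=assumed_tz).astimezone(timezone.utc)


def date_to_utc(value: date, assumed_tz: tzinfo) -> datetime:
    """Treat a date as 00:00 of that day in assumed_tz, expressed in UTC."""
    return naive_to_utc(datetime.combine(value, time.min), assumed_tz)


# Hypothetical "assumed time zone": a fixed UTC+02:00 offset.
ASSUMED_TZ = timezone(timedelta(hours=2))

print(naive_to_utc(datetime(2024, 6, 1, 9, 30), ASSUMED_TZ).isoformat())
# 2024-06-01T07:30:00+00:00
print(date_to_utc(date(2024, 6, 1), ASSUMED_TZ).isoformat())
# 2024-05-31T22:00:00+00:00
```

The second result shows why the selected timezone matters for "date" values: midnight in the assumed zone can fall on the previous calendar day once expressed in UTC.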
Preview
{{testResult.previewErrorMsg}}

Advanced BigQuery settings

Changing this is usually required only for views based on partitioned tables.

SQL

SQL Spark integration

Dynamic dataset repeat

Reading this dataset is driven by variables from a secondary parameters dataset. When enabled, the data is read multiple times, using the variable values from one row of the parameters dataset at each iteration. The results of all iterations are then concatenated into a single dataset. This is typically used to read from multiple similar tables.
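The repeat mechanism can be sketched in Python as follows. This is an illustration of the idea only, not how DSS implements it: the function name, the `${year}` placeholder in the table name, the `sales_*` tables and the SQLite database are all assumptions made for the example.

```python
import sqlite3
from string import Template


def read_with_repeat(conn, table_template, parameter_rows):
    """Read once per parameters row and concatenate the results."""
    rows = []
    for params in parameter_rows:
        # Expand the variables of this parameters row into the table name.
        table = Template(table_template).substitute(params)
        if not table.isidentifier():
            raise ValueError(f"unsafe table name: {table!r}")
        query = f'SELECT region, amount FROM "{table}" ORDER BY region'
        rows.extend(conn.execute(query).fetchall())
    return rows


# Two similar tables standing in for the real data (hypothetical names).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales_2023 (region TEXT, amount REAL);
CREATE TABLE sales_2024 (region TEXT, amount REAL);
INSERT INTO sales_2023 VALUES ('north', 1.0), ('south', 2.0);
INSERT INTO sales_2024 VALUES ('north', 3.0);
""")

# Rows of the hypothetical parameters dataset: one variable, "year".
parameters = [{"year": "2023"}, {"year": "2024"}]

result = read_with_repeat(conn, "sales_${year}", parameters)
print(result)
```

Each row of the parameters dataset produces one read, and the rows from all reads end up in a single result, in the order of the parameter rows.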