Instance templates and setup actions
Instance templates represent a common configuration that can be reused across several instances. An instance template is required to launch an instance. Instances stay linked to their instance template for their whole lifecycle.
What is configured through the instance templates includes, but is not limited to:
- Identities able to SSH to the instance
- Cloud credentials for the managed DSS
- Installation of additional dependencies and resources
- Pre-baked and custom configurations for DSS
To create, edit and delete templates, head to Instance templates in the left menu of FM. The rest of this document explains each section of the configuration.
SSH keypair
Use this field to select which AWS EC2 keypair will be deployed on the instance. This is useful for admins to connect to the machine with SSH. This field is optional.
This key will be available on the ec2-user account (centos for DSS instances prior to version 12), i.e. you will be able to log in as ec2-user@DSS.HOST.IP.
AWS credentials
In most cases, your DSS instances will require AWS credentials in order to operate. These credentials will be used notably to integrate with ECR and EKS. They can also be used (optionally) for S3 connectivity.
The recommended way to provide AWS credentials to a DSS instance is to use an IAM instance profile. You can create a role and enter its instance profile ARN in the “Runtime instance profile ARN” field.
Keep “restrict access to metadata server” enabled so that DSS end-users cannot access these credentials.
Atypical options
In some cases, you may want the setup phase to have additional permissions, notably to retrieve secrets from ASM or to perform other tasks that require permissions needed only at startup (see setup actions).
If that’s needed, you can add a “Startup instance profile ARN” that will only be available during startup and that will be replaced by the “Runtime instance profile ARN” once startup is complete.
As an alternative to an IAM instance profile, you can also use a keypair that will be given to the DSS service account.
The AWS Secret Access Key can be stored in ASM (in which case the Startup instance profile ARN must be able to read it) or encrypted and stored by FM (in which case the Startup instance profile ARN must be able to use the CMK to decrypt it).
Setup actions
Setup actions are configuration steps run by the agent. As a user, you create a list of setup actions that you want executed on the machine.
Setup Kubernetes and Spark-on-Kubernetes
This task takes no parameters. It pre-configures DSS so that you can use Kubernetes clusters and the Spark integration with them: it prepares the base images and enables the DSS Spark integration.
Install system packages
This setup action is a convenient way to install additional system packages on the machine should you need them. It takes a list of AlmaLinux package names as its only parameter.
Install a JDBC driver
Instances come pre-configured with drivers for PostgreSQL, MariaDB, Snowflake, AWS Athena and Google BigQuery. If you need another driver, this setup action eases the process. It can download a file over HTTP or HTTPS, from an S3 bucket, or from an ABS container.
Run Ansible tasks
This setup action allows you to run arbitrary Ansible tasks at different points of the startup process.
The Stage parameter specifies at which point of the startup sequence the tasks must be executed. There are three stages:
- Before DSS install: These tasks will be run before the agent installs (if not already installed) or upgrades (if required) DSS.
- After DSS install: These tasks will be run once DSS is installed or upgraded, but not yet started.
- After DSS is started: These tasks will be run once DSS is ready to receive public API calls from the agent.
The Ansible tasks parameter lets you write a YAML list of Ansible tasks as if they were written in a role. Available tasks are base Ansible tasks and Ansible modules for Dataiku DSS. When using Dataiku modules, you do not need to set the connection and authentication options: FM handles them automatically.
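As an illustration of the task-list format, here is a minimal sketch of tasks that could be run at the Before DSS install stage. It uses only base Ansible modules (file and copy). The directory, the file name and its content are hypothetical placeholders, not values defined by FM.

```yaml
---
# Hypothetical example: prepare a directory on the machine before DSS is installed.
- name: Create a directory for custom resources
  file:
    path: /opt/custom-resources        # hypothetical path
    state: directory
    mode: "0755"
  become: true                         # the directory lives outside the agent user's home

# Hypothetical example: drop a small marker file into that directory.
- name: Write a marker file describing the instance
  copy:
    content: "provisioned by a setup action\n"
    dest: /opt/custom-resources/README.txt   # hypothetical file name
    mode: "0644"
  become: true
```

The list is written exactly as it would appear in the tasks file of a role: no play header, no hosts declaration, only the tasks themselves.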
Some additional facts are available:
- dataiku.dss.port
- dataiku.dss.datadir
- dataiku.dss.was_installed: Available only for stages After DSS install and After DSS is started
- dataiku.dss.was_upgraded: Available only for stages After DSS install and After DSS is started
- dataiku.dss.api_key: Available only for stage After DSS is started
Example:
---
- dss_group:
    name: datascienceguys
- dss_user:
    login: dsadmin
    password: verylongbutinsecurepassword
    groups: [datascienceguys]
Ansible is run as the Unix user used by the agent, and can run administrative tasks with become.
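The additional facts and become can be combined in ordinary Ansible conditionals. The following is a minimal sketch, assuming the facts are exposed as nested variables under the dotted names listed above and that the tasks run at a stage where dataiku.dss.was_upgraded is available. The marker file path is a hypothetical placeholder.

```yaml
---
# Print the DSS data directory and port using the facts listed above.
- name: Show where DSS is installed and which port it uses
  debug:
    msg: "DSS data directory: {{ dataiku.dss.datadir }}, port: {{ dataiku.dss.port }}"

# Only act when the agent has just upgraded DSS during this startup.
- name: Record that an upgrade took place
  copy:
    content: "DSS was upgraded during the last startup\n"
    dest: /etc/dss-upgraded.marker     # hypothetical path, requires root
    mode: "0644"
  when: dataiku.dss.was_upgraded | bool
  become: true                         # escalate privileges for this task only
```

Setting become on individual tasks rather than on every task keeps privileged operations limited to the steps that actually need them.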
Install a code environment with a Visual ML preset
This setup action installs a code environment with the Visual Machine Learning and Visual Time series forecasting preset.
Enable Install GPU-based preset to install the GPU-compatible packages. Otherwise, the CPU packages are installed.
Leaving Allow in-place update enabled means that if there is a newer version of the preset the next time the setup action runs, and it is compatible with the previously installed code environment, said code environment is updated in place. Otherwise, a new code environment is created with the updated preset.