⚠️ Required Fields Missing: {{ validationMessage }}

Target

Enables automatic class weighting to handle imbalanced datasets.
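As an illustration only: with scikit-learn (an assumption about the backend; the tool's actual implementation may differ), automatic class weighting typically corresponds to `class_weight="balanced"`:

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

# Illustrative imbalanced data: roughly 90% of samples belong to class 0.
X, y = make_classification(n_samples=1000, weights=[0.9], random_state=0)

# class_weight="balanced" reweights each class inversely to its frequency,
# so mistakes on the rare class cost more during training.
clf = LogisticRegression(class_weight="balanced", max_iter=1000).fit(X, y)
```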

Model output

Alphanumeric and underscores only. No spaces or special characters.

Train / test set for final evaluation

Time ordering

Sampling & Splitting

Approximate proportion of the sample that goes to the train set. The rest goes to the test set.
Using a fixed random seed allows for reproducible splitting.
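A minimal sketch of a reproducible split, using scikit-learn as a stand-in for the tool's internal splitter (an assumption); values are illustrative:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Illustrative data: 50 rows.
X = np.arange(100).reshape(50, 2)
y = np.arange(50)

# train_size=0.8 sends ~80% of rows to the train set; a fixed random_state
# pins the shuffle so the same split is produced on every run.
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=0.8,
                                          random_state=42)
```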

Metrics

Hyperparameter optimization and model evaluation

Features Handling

Feature | Status | Encoding / Rescaling | Missing values | Constant
{{ column.name }}
{{ config.selectedOption1[column.name] || 'No handling' }}

Algorithms

Logistic Regression
Random Forest
XGBoost
LightGBM
Gradient Boosting
Decision Tree

Logistic Regression

A linear model that uses a sigmoid function to estimate the probability of a sample belonging to a certain class. Logistic Regression supports L1 and L2 regularization (controlled by the C parameter) to prevent overfitting.

{{ showFullDesc.logistic_regression ? 'Show less' : 'Show more' }}
C (Regularization)

Inverse of regularization strength. A low value of C applies strong regularization and yields a smoother decision boundary (higher bias), while a high value tries to classify every training example correctly, at the risk of overfitting (high variance).
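The effect of C can be sketched with scikit-learn's LogisticRegression (an assumption about the underlying implementation); dataset and values are illustrative:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression

X, y = make_classification(n_samples=500, n_features=20, random_state=0)

# Small C = strong regularization: coefficients are shrunk toward zero.
strong = LogisticRegression(C=0.01, max_iter=1000).fit(X, y)
# Large C = weak regularization: the model fits the training data harder.
weak = LogisticRegression(C=100.0, max_iter=1000).fit(X, y)
```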

Random Forest

A Random Forest is made of many decision trees. Each tree makes its own prediction for a record and "votes" for the final answer of the forest; the forest chooses the class with the most votes.

A decision tree is a simple model in which each node tests a condition on one of the input features.

When "growing" (i.e., training) the forest:
  • for each tree, a random sample of the training set is used;
  • for each decision point in the tree, a random subset of the input features is considered.
Random Forests generally provide good results, at the expense of "explainability" of the model.
{{ showFullDesc.random_forest_classification ? 'Show less' : 'Show more' }}
Number of trees

Number of trees in the forest.

Maximum depth of tree

Maximum depth of each tree in the forest. Higher values generally increase the quality of the prediction, but can lead to overfitting. High values also increase the training and prediction time.

Minimum samples per leaf

Minimum number of samples required in a tree node for that node to be split. Lower values allow more splits, which can improve the fit, but may lead to overfitting and increase training and prediction time.
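The three parameters above can be sketched with scikit-learn's RandomForestClassifier (an assumption about the underlying implementation); dataset and values are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Illustrative dataset; the real input is the configured training set.
X, y = make_classification(n_samples=500, random_state=0)

# Each tree trains on a bootstrap sample of rows and considers a random
# subset of features at every split; the trees then vote on the class.
forest = RandomForestClassifier(
    n_estimators=100,      # Number of trees
    max_depth=8,           # Maximum depth of tree
    min_samples_leaf=5,    # Minimum samples per leaf
    random_state=0,
).fit(X, y)
```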

XGBoost

XGBoost is an advanced gradient tree boosting algorithm. It has support for parallel processing, regularization and early stopping, which makes it a fast, scalable and accurate algorithm.

For more information on gradient tree boosting, see the "Gradient tree boosting" algorithm.

{{ showFullDesc.xgb_classification ? 'Show less' : 'Show more' }}
Number of trees

XGBoost has an early-stopping mechanism, so the exact number of trees is optimized automatically; this value acts as an upper bound. A high number of trees increases training and prediction time. Typical values: 100 - 10000.

Maximum depth of tree

Maximum depth of each tree. High values can increase the quality of the prediction, but can lead to overfitting. Typical values: 3 - 10.

Min child weight

Minimum sum of instance weights (hessian) required in a node. High values keep the tree from learning overly specific cases and so help prevent overfitting. Smaller values allow leaf nodes to match a small set of rows, which can be relevant for highly imbalanced sets.

Learning rate

Lower values slow down convergence and can make the model more robust. Typical values: 0.01 - 0.3.

LightGBM

LightGBM is a tree-based gradient boosting library designed to be distributed and efficient. This algorithm provides fast training speed, low memory usage, good accuracy and is capable of handling large scale data.

For more information on gradient tree boosting, see the "Gradient tree boosting" algorithm.

{{ showFullDesc.lgbm_classification ? 'Show less' : 'Show more' }}
Number of trees

LightGBM has an early-stopping mechanism, so the exact number of trees is optimized automatically; this value acts as an upper bound. A high number of trees increases training and prediction time. Typical values: 100 - 10000.

Maximum depth of tree

Maximum depth of each tree. High values can increase the quality of the prediction, but can lead to overfitting. Typical values: 3 - 10.

Min child weight

Minimum sum of instance weights (hessian) required in a node. High values keep the tree from learning overly specific cases and so help prevent overfitting. Smaller values allow leaf nodes to match a small set of rows, which can be relevant for highly imbalanced sets.

Learning rate

Lower values slow down convergence and can make the model more robust. Typical values: 0.01 - 0.3.

Gradient Boosting

Gradient boosting is a technique which produces a prediction model in the form of an ensemble of "weak" prediction models (small decision trees).

The concept is to train a set of decision trees (weak learners) that combine into a final strong learner. This is an iterative method: each new tree is fit to the errors of the ensemble built so far, so successive trees focus on the "difficult" examples that the previous trees got wrong.

Gradient Boosted Trees (GBT) generalize boosting to arbitrary differentiable loss functions. GBT is an accurate and effective off-the-shelf procedure that can be used for both regression and classification problems, and such models are applied in a variety of areas including web search ranking and ecology. The advantages of GBT are:
  • Natural handling of data of mixed type (= heterogeneous features)
  • Predictive power
  • Robustness to outliers in output space (via robust loss functions)
Due to the iterative nature of boosting, it is not very parallelizable and is less scalable than other algorithms.
{{ showFullDesc.gb_classification ? 'Show less' : 'Show more' }}
Number of trees

The number of boosting stages to perform. Gradient boosting is fairly robust to overfitting, so a large number usually results in better performance.

Maximum depth of tree

Maximum depth of each tree. High values can increase the quality of the prediction, but can lead to overfitting. Typical values: 3 - 10.

Minimum samples per leaf

Minimum number of samples required in a tree node for that node to be split. Lower values allow more splits, which can improve the fit, but may lead to overfitting and increase training and prediction time.

Learning rate

Lower values slow down convergence and can make the model more robust. Typical values: 0.01 - 0.3.
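The parameters above can be sketched with scikit-learn's GradientBoostingClassifier (an assumption about the underlying implementation); dataset and values are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier

X, y = make_classification(n_samples=500, random_state=0)

# n_estimators boosting stages, each a shallow tree fit to the errors of
# the ensemble built so far; learning_rate scales each tree's contribution.
gb = GradientBoostingClassifier(
    n_estimators=200,      # Number of trees
    max_depth=3,           # Maximum depth of tree
    min_samples_leaf=5,    # Minimum samples per leaf
    learning_rate=0.1,     # Learning rate
    random_state=0,
).fit(X, y)
```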

Decision Tree

Decision Tree is a simple non-parametric algorithm. It creates a model that predicts the value of the target by learning simple decision rules inferred from the data features.

These rules form a tree, with the leaves of the tree carrying the predicted value. Evaluation simply goes down the tree and evaluates the rule at each split.

{{ showFullDesc.decision_tree_classification ? 'Show less' : 'Show more' }}
Maximum depth of tree

The maximum depth of the tree.

Minimum samples per leaf

Minimum number of samples required to be at a leaf node.
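Both parameters can be sketched with scikit-learn's DecisionTreeClassifier (an assumption about the underlying implementation); dataset and values are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=500, random_state=0)

# max_depth caps how many successive splits a path may contain;
# min_samples_leaf forces every leaf to cover at least that many samples.
tree = DecisionTreeClassifier(
    max_depth=4, min_samples_leaf=10, random_state=0
).fit(X, y)
```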

Lasso Regression
Random Forest
XGBoost
LightGBM
Gradient Boosting
Decision Tree

Lasso Regression

Lasso Regression is a linear model that addresses some problems of Ordinary Least Squares by imposing a penalty (regularization term) on the weights. Lasso uses L1 regularization, which selects only some of the input features and completely discards the others (depending on the strength of the regularization term). It therefore generally produces simpler output formulas than other linear models.

{{ showFullDesc.lasso_regression ? 'Show less' : 'Show more' }}
Alpha

Regularization strength. Higher values increase regularization, pushing more coefficients to zero and creating sparser models. Lower values allow the model to fit the training data more closely.
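The feature-selection effect of the L1 penalty can be sketched with scikit-learn's Lasso (an assumption about the underlying implementation); dataset and values are illustrative:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# 50 input features, but only 5 are actually informative.
X, y = make_regression(n_samples=200, n_features=50, n_informative=5,
                       random_state=0)

# The L1 penalty drives many coefficients exactly to zero, so the model
# effectively keeps only a subset of the input features.
lasso = Lasso(alpha=1.0).fit(X, y)
n_selected = int(np.sum(lasso.coef_ != 0))
```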

Random Forest

A Random Forest is made of many decision trees. Each tree makes its own prediction for a record and "votes" for the final answer of the forest; the forest chooses the class with the most votes.

A decision tree is a simple model in which each node tests a condition on one of the input features.

When "growing" (i.e., training) the forest:
  • for each tree, a random sample of the training set is used;
  • for each decision point in the tree, a random subset of the input features is considered.
Random Forests generally provide good results, at the expense of "explainability" of the model.
{{ showFullDesc.random_forest_regression ? 'Show less' : 'Show more' }}
Number of trees

Number of trees in the forest.

Maximum depth of tree

Maximum depth of each tree in the forest. Higher values generally increase the quality of the prediction, but can lead to overfitting. High values also increase the training and prediction time.

Minimum samples per leaf

Minimum number of samples required in a tree node for that node to be split. Lower values allow more splits, which can improve the fit, but may lead to overfitting and increase training and prediction time.

XGBoost

XGBoost is an advanced gradient tree boosting algorithm. It has support for parallel processing, regularization and early stopping, which makes it a fast, scalable and accurate algorithm. For more information on gradient tree boosting, see the "Gradient tree boosting" algorithm.

{{ showFullDesc.xgb_regression ? 'Show less' : 'Show more' }}
Number of trees

XGBoost has an early-stopping mechanism, so the exact number of trees is optimized automatically; this value acts as an upper bound. A high number of trees increases training and prediction time. Typical values: 100 - 10000.

Maximum depth of tree

Maximum depth of each tree. High values can increase the quality of the prediction, but can lead to overfitting. Typical values: 3 - 10.

Min child weight

Minimum sum of instance weights (hessian) required in a node. High values keep the tree from learning overly specific cases and so help prevent overfitting. Smaller values allow leaf nodes to match a small set of rows, which can be relevant for highly imbalanced sets.

Learning rate

Lower values slow down convergence and can make the model more robust. Typical values: 0.01 - 0.3.

LightGBM

LightGBM is a tree-based gradient boosting library designed to be distributed and efficient. This algorithm provides fast training speed, low memory usage, good accuracy and is capable of handling large scale data. For more information on gradient tree boosting, see the "Gradient tree boosting" algorithm.

{{ showFullDesc.lgbm_regression ? 'Show less' : 'Show more' }}
Number of trees

LightGBM has an early-stopping mechanism, so the exact number of trees is optimized automatically; this value acts as an upper bound. A high number of trees increases training and prediction time. Typical values: 100 - 10000.

Maximum depth of tree

Maximum depth of each tree. High values can increase the quality of the prediction, but can lead to overfitting. Typical values: 3 - 10.

Min child weight

Minimum sum of instance weights (hessian) required in a node. High values keep the tree from learning overly specific cases and so help prevent overfitting. Smaller values allow leaf nodes to match a small set of rows, which can be relevant for highly imbalanced sets.

Learning rate

Lower values slow down convergence and can make the model more robust. Typical values: 0.01 - 0.3.

Gradient Boosting

Gradient boosting is a technique which produces a prediction model in the form of an ensemble of "weak" prediction models (small decision trees).

The concept is to train a set of decision trees (weak learners) that combine into a final strong learner. This is an iterative method: each new tree is fit to the errors of the ensemble built so far, so successive trees focus on the "difficult" examples that the previous trees got wrong.

Gradient Boosted Trees (GBT) generalize boosting to arbitrary differentiable loss functions. GBT is an accurate and effective off-the-shelf procedure that can be used for both regression and classification problems, and such models are applied in a variety of areas including web search ranking and ecology. The advantages of GBT are:
  • Natural handling of data of mixed type (= heterogeneous features)
  • Predictive power
  • Robustness to outliers in output space (via robust loss functions)
Due to the iterative nature of boosting, it is not very parallelizable and is less scalable than other algorithms.
{{ showFullDesc.gb_regression ? 'Show less' : 'Show more' }}
Number of trees

The number of boosting stages to perform. Gradient boosting is fairly robust to overfitting, so a large number usually results in better performance.

Maximum depth of tree

Maximum depth of each tree. High values can increase the quality of the prediction, but can lead to overfitting. Typical values: 3 - 10.

Minimum samples per leaf

Minimum number of samples required in a tree node for that node to be split. Lower values allow more splits, which can improve the fit, but may lead to overfitting and increase training and prediction time.

Learning rate

Lower values slow down convergence and can make the model more robust. Typical values: 0.01 - 0.3.

Decision Tree

Decision Tree is a simple non-parametric algorithm. It creates a model that predicts the value of the target by learning simple decision rules inferred from the data features. These rules form a tree, with the leaves of the tree carrying the predicted value. Evaluation simply goes down the tree and evaluates the rule at each split.

{{ showFullDesc.decision_tree_regression ? 'Show less' : 'Show more' }}
Maximum depth of tree

The maximum depth of the tree.

Minimum samples per leaf

Minimum number of samples required to be at a leaf node.

Hyperparameters

Maximum number of hyperparameter combinations to explore.
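A hypothetical sketch of capping the combinations explored, using scikit-learn's randomized search (the actual optimizer used by the tool is an assumption); values are illustrative:

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import RandomizedSearchCV

X, y = make_classification(n_samples=300, random_state=0)

# n_iter caps how many combinations are sampled from the grid (12 possible
# here), no matter how large the full search space is.
search = RandomizedSearchCV(
    RandomForestClassifier(random_state=0),
    param_distributions={
        "n_estimators": [50, 100, 200],
        "max_depth": [3, 5, 8, None],
    },
    n_iter=6,
    cv=3,
    random_state=0,
).fit(X, y)
```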

Runtime environment

Run the ML training job in a Container Runtime or a Snowpark-optimized warehouse.
Must be an existing Snowpark Container Services Compute Pool.
To store ML Job payloads. Must be an existing Snowflake Stage.
Must be a Snowpark-optimized warehouse. Leave empty to use default.
Deploy to Snowflake ML Model Registry (same database/schema as input dataset).