Gradient Boosting works by sequentially building a series of shallow decision trees. Each tree improves on the previous model by correcting its residual errors.
The learning rate (shrinkage) should be between 0 (exclusive) and 1 (inclusive). Lowering the value slows learning and improves generalization, while high values can lead to overfitting. Values between 0.01 and 0.5 are generally used.
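The two ideas above can be sketched in a few lines of plain Python: each round fits a depth-1 "tree" (a stump) to the current residuals, and its contribution is shrunk by the learning rate before being added to the model. This is a minimal illustration of the principle, not H2O's implementation; all names are made up for the example.

```python
def fit_stump(x, residuals):
    """Find the threshold split minimizing squared error on the residuals."""
    best = None
    order = sorted(range(len(x)), key=lambda i: x[i])
    for k in range(1, len(x)):
        thr = (x[order[k - 1]] + x[order[k]]) / 2
        left = [residuals[i] for i in range(len(x)) if x[i] <= thr]
        right = [residuals[i] for i in range(len(x)) if x[i] > thr]
        if not left or not right:
            continue
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((v - lm) ** 2 for v in left) + sum((v - rm) ** 2 for v in right)
        if best is None or sse < best[0]:
            best = (sse, thr, lm, rm)
    _, thr, lm, rm = best
    return lambda v: lm if v <= thr else rm

def boost(x, y, n_trees=50, learning_rate=0.1):
    """Sequentially add shrunken stumps, each correcting the residuals."""
    base = sum(y) / len(y)                      # start from the mean
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_trees):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        pred = [p + learning_rate * stump(v) for p, v in zip(pred, x)]
    return lambda v: base + sum(learning_rate * s(v) for s in stumps)

# Toy 1-D regression: a noisy step function.
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [1.2, 1.0, 1.1, 0.9, 3.0, 3.2, 2.9, 3.1]
model = boost(x, y)
```

With a smaller learning rate the same number of rounds would leave a larger remaining residual, which is why lowering it slows learning.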
Maximum depth of each tree in the sequence.
Minimum number of observations required to create a leaf.
The family of loss functions used to build the trees. It is recommended to leave this set to AUTO and let H2O choose, unless you are familiar with these options.
Tweedie variance power (only used when the Tweedie loss is selected).
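The options above correspond to named parameters in H2O's Python API. A hedged configuration sketch (parameter names from H2O's `H2OGradientBoostingEstimator`; the values are merely illustrative, and training would additionally require a running H2O cluster and data):

```python
from h2o.estimators import H2OGradientBoostingEstimator

gbm = H2OGradientBoostingEstimator(
    learn_rate=0.1,          # learning rate (shrinkage), in (0, 1]
    max_depth=5,             # maximum depth of each tree
    min_rows=10,             # minimum observations per leaf
    distribution="tweedie",  # loss family; "AUTO" lets H2O choose
    tweedie_power=1.5,       # only consulted when distribution="tweedie"
)
```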
For numerical variables, build a histogram with this many bins, then split at the best point. Lower values lead to poorer splits, while higher values lead to longer computation times.
Group the levels of categorical variables into this many bins, then split at the best point. Lower values lead to poorer splits, while higher values lead to longer computation times.
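Histogram-based split finding can be sketched as follows: aggregate the feature into a fixed number of equal-width bins, then score only the bin boundaries as candidate split points, using squared-error variance reduction as the gain. This is an illustrative simplification, not H2O's actual split code, and the function name is hypothetical.

```python
def best_histogram_split(x, y, nbins=20):
    """Bin x into `nbins` equal-width bins and return the best boundary split."""
    lo, hi = min(x), max(x)
    width = (hi - lo) / nbins or 1.0
    counts = [0] * nbins
    sums = [0.0] * nbins
    for xi, yi in zip(x, y):                 # one pass to build the histogram
        b = min(int((xi - lo) / width), nbins - 1)
        counts[b] += 1
        sums[b] += yi
    total_n, total_s = sum(counts), sum(sums)
    best_gain, best_thr = 0.0, None
    left_n = left_s = 0.0
    for b in range(nbins - 1):               # only nbins-1 candidate splits
        left_n += counts[b]
        left_s += sums[b]
        right_n = total_n - left_n
        if left_n == 0 or right_n == 0:
            continue
        right_s = total_s - left_s
        # variance-reduction gain, up to constants: sum^2/n on each side
        gain = left_s**2 / left_n + right_s**2 / right_n - total_s**2 / total_n
        if gain > best_gain:
            best_gain, best_thr = gain, lo + (b + 1) * width
    return best_thr

# A clean step in y at x = 50 should be recovered near that point.
x = list(range(100))
y = [0.0] * 50 + [1.0] * 50
thr = best_histogram_split(x, y)
```

Fewer bins mean the candidate thresholds are coarser (poorer splits), while more bins mean more work per pass, which is the trade-off described above.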
Over- or under-sample the data to balance classes for training (the bias thus introduced is removed after training). May improve predictions in the case of imbalanced classes. Subsampling also leads to faster training.
Maximum size of the dataset after balancing classes, as a proportion of the original dataset size. Values lower than 1 are allowed, thus reducing the dataset size.
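The interplay of these two balancing options can be sketched as follows: cap the total size, give each class an equal share of it, then undersample the classes that exceed their share and oversample (with replacement) the ones that fall short. This is a hypothetical illustration of the idea, not H2O's sampling code.

```python
import random

def balance(rows, labels, max_after_balance_size=1.0, seed=0):
    """Return a class-balanced resample capped at a fraction of the original size."""
    rng = random.Random(seed)
    by_class = {}
    for r, c in zip(rows, labels):
        by_class.setdefault(c, []).append(r)
    target_total = int(len(rows) * max_after_balance_size)
    per_class = target_total // len(by_class)        # equal share per class
    out_rows, out_labels = [], []
    for c, members in by_class.items():
        if len(members) >= per_class:
            sample = rng.sample(members, per_class)  # undersample majority
        else:                                        # oversample minority
            sample = members + rng.choices(members, k=per_class - len(members))
        out_rows.extend(sample)
        out_labels.extend([c] * per_class)
    return out_rows, out_labels

# 90/10 imbalance, capped at the original size: 50/50 after balancing.
rows = list(range(100))
labels = [0] * 90 + [1] * 10
bal_rows, bal_labels = balance(rows, labels)
```

Passing `max_after_balance_size=0.5` in this sketch would yield 25 rows per class, illustrating how values below 1 shrink the dataset.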