# Advanced topics

* Start with weights from a previously trained model

* How is the model trained?

* Advanced training mode

  + Build sequence

  + Fit model

    - Usage of metrics in Callbacks

## Start with weights from a previously trained model

You can initialize model training with the weights of another model, for transfer learning and fine-tuning.

To do so, Keras provides the load\_model and load\_weights methods to retrieve previously saved models or weights.

DSS provides functions to retrieve models or their location, either from an ML task or from a saved model:

§ # in dataiku.doctor.deep\_learning.load\_model

§ get\_keras\_model\_from\_trained\_model(session\_id=None, analysis\_id=None, mltask\_id=None)

§ get\_keras\_model\_location\_from\_trained\_model(session\_id=None, analysis\_id=None, mltask\_id=None)

§ get\_keras\_model\_from\_saved\_model(saved\_model\_id)

§ get\_keras\_model\_location\_from\_saved\_model(saved\_model\_id)
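
As a rough sketch of how these helpers might be combined with Keras for fine-tuning, the function below looks up the location of a saved model and loads its weights into a freshly built architecture. It assumes that the returned location can be passed directly to the Keras load\_weights method. The name init\_from\_saved\_model and its locate parameter are hypothetical; the locator is passed in as an argument only so that the sketch can be exercised outside DSS.

```python
def init_from_saved_model(model, saved_model_id, locate, by_name=False):
    """Load weights from a DSS saved model into `model` (hypothetical helper).

    `locate` is expected to behave like
    dataiku.doctor.deep_learning.load_model.get_keras_model_location_from_saved_model:
    it takes a saved model id and returns where the Keras model is stored.
    Assumption: that location is accepted by Keras `Model.load_weights`.
    """
    location = locate(saved_model_id)
    # by_name=True asks Keras to load weights only into layers whose names
    # match, which can help when the new architecture differs from the old one.
    model.load_weights(location, by_name=by_name)
    return model
```

Inside DSS, the call could look like init\_from\_saved\_model(model, saved\_model\_id, get\_keras\_model\_location\_from\_saved\_model, by\_name=True), where saved\_model\_id is the identifier of your own saved model.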

## How is the model trained?

Whether you use Standard or Advanced mode, DSS trains the model using Sequence objects and the fit method.

The data is preprocessed in batches rather than all at once, which avoids excessive memory use, in particular for text and images, which are memory-intensive.

DSS preprocesses the data and produces two sequences, train and validation (what is usually called the test set is called validation in Keras terminology), built according to the batch size, and then calls fit\_generator. You can customize how this process is done.
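
To illustrate why batch-wise sequences keep memory usage bounded, here is a minimal pure-Python sketch of the indexing protocol that Keras Sequence objects follow: \_\_len\_\_ gives the number of batches and \_\_getitem\_\_ builds one batch on demand. The class LazyBatchSequence and its preprocess argument are illustrative names; this is not the DSS implementation.

```python
import math


class LazyBatchSequence:
    """Illustrative stand-in for a Keras Sequence: batches are built on demand."""

    def __init__(self, rows, batch_size, preprocess):
        self.rows = rows
        self.batch_size = batch_size
        self.preprocess = preprocess  # applied per row, only when a batch is requested

    def __len__(self):
        # Number of batches per epoch; the last batch may be smaller.
        return math.ceil(len(self.rows) / self.batch_size)

    def __getitem__(self, index):
        if not 0 <= index < len(self):
            raise IndexError(index)
        start = index * self.batch_size
        chunk = self.rows[start:start + self.batch_size]
        return [self.preprocess(row) for row in chunk]


sequence = LazyBatchSequence(["a", "bb", "ccc", "dddd", "eeeee"],
                             batch_size=2, preprocess=len)
batches = [sequence[i] for i in range(len(sequence))]
```

Only the rows of the requested batch are preprocessed at any time, so peak memory is governed by the batch size rather than by the size of the dataset.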

## Advanced training mode

The Advanced mode for training (accessible by clicking on the top right of the analysis) allows you to modify the data preprocessed by DSS before it is sent to the model, and to customize the parameters of the call to fit\_generator. The two main use cases for the Advanced mode are:

* data augmentation

* using custom Callbacks

You need to fill in two methods: build\_sequence and fit\_model.

### Build sequence

The method build\_sequence should return the sequences that will be used to train the model. To do so, you have access to the helpers build\_train\_sequence\_with\_batch\_size and build\_validation\_sequence\_with\_batch\_size, which are functions that return a sequence for a given batch\_size.
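
A minimal sketch of such a method follows. It assumes that DSS passes the two helpers to the method as arguments and expects the train and validation sequences back as a pair; the actual signature is defined by the code template shown in the Advanced mode editor, so treat the one used here as an assumption.

```python
def build_sequence(build_train_sequence_with_batch_size,
                   build_validation_sequence_with_batch_size):
    # Assumed signature: the two helpers are received as arguments.
    batch_size = 32  # illustrative value, tune it to your data and hardware

    train_sequence = build_train_sequence_with_batch_size(batch_size)
    validation_sequence = build_validation_sequence_with_batch_size(batch_size)

    # Assumed return convention: (train, validation).
    return train_sequence, validation_sequence
```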

You can then modify these sequences at will before training. In particular, you may want to perform data augmentation. DSS provides a helper for this, used as follows:

§ from dataiku.doctor.deep\_learning.sequences import DataAugmentationSequence

§ from tensorflow.keras.preprocessing.image import ImageDataGenerator

§ original\_batch\_size = 8

§ train\_sequence = build\_train\_sequence\_with\_batch\_size(original\_batch\_size)

§ augmentator = ImageDataGenerator(

§ zoom\_range=0.2,

§ shear\_range=0.5,

§ rotation\_range=20,

§ width\_shift\_range=0.2,

§ height\_shift\_range=0.2,

§ horizontal\_flip=True

§ )

§ augmented\_sequence = DataAugmentationSequence(train\_sequence, "image\_name\_preprocessed", augmentator, n\_augmentation=3)

where:

* image\_name\_preprocessed is the name of the input to augment

* n\_augmentation is the number of times the sequence is augmented

ImageDataGenerator is a helper provided by Keras to perform data augmentation on images.

For custom augmentation, you can provide your own instance of a class implementing a random\_transform method with the following signature:

§ def random\_transform(x, seed=None):

§ # returns a numpy array with the same shape as x
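
As an illustration, a custom augmentor can be as small as the class sketched here, whose instance would be passed in place of the ImageDataGenerator instance. The class name RandomVerticalFlip is hypothetical, and the sketch assumes that x is a single image indexed as (height, width, channels), as is the case for the Keras ImageDataGenerator random\_transform method. Because it is written as a method on a class, it also takes self. It relies only on slicing, so it needs nothing beyond the standard library.

```python
import random


class RandomVerticalFlip:
    """Hypothetical custom augmentor: flips an image upside down half of the time."""

    def random_transform(self, x, seed=None):
        # Assumption: x is one image whose first axis is the height.
        rng = random.Random(seed)  # seed=None gives a different draw on each call
        if rng.random() < 0.5:
            # Reversing the first axis flips the rows and keeps the shape unchanged.
            return x[::-1]
        return x
```

Passing the same seed twice yields the same decision, which makes an augmentation reproducible when needed.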

When you use data augmentation, be aware that the actual batch size of the augmented sequence is original\_batch\_size \* n\_augmentation, so you may want to use a smaller original\_batch\_size.
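
The arithmetic can be sketched with two small helpers. Both function names are hypothetical; the second one picks an original batch size so that the augmented batch stays within a target size that you consider acceptable for your memory budget.

```python
def effective_batch_size(original_batch_size, n_augmentation):
    """Batch size actually seen by the model after augmentation."""
    return original_batch_size * n_augmentation


def original_batch_size_for(target_batch_size, n_augmentation):
    """Largest original batch size whose augmented batch fits the target.

    Never returns less than 1, so the target can be exceeded when it is
    smaller than n_augmentation.
    """
    if n_augmentation < 1:
        raise ValueError("n_augmentation must be at least 1")
    return max(1, target_batch_size // n_augmentation)


augmented = effective_batch_size(8, 3)       # 8 * 3 = 24 samples per batch
suggested = original_batch_size_for(32, 3)   # 32 // 3 = 10 samples before augmentation
```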

### Fit model

The method fit\_model allows you to define custom Keras callbacks.

As per the Keras documentation:

"A callback is a set of functions to be applied at given stages of the training procedure. You can use callbacks to get a view on internal states and statistics of the model during training."

DSS builds a list of base\_callbacks (to compute metrics, interrupt training if requested in the UI …) that must be passed in the call to fit\_generator. You are then free to add any custom callback to this list.
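
A sketch of what such a method could look like is given here. The signature is an assumption (the real one comes from the code template in the Advanced mode editor), and the custom\_callbacks parameter is a hypothetical addition used to keep the example self-contained. The base callbacks are kept first in the list, because DSS computes its metrics in them.

```python
def fit_model(model, train_sequence, validation_sequence, base_callbacks,
              custom_callbacks=()):
    # Assumed signature. validation_sequence is left unused in this sketch.
    epochs = 10  # illustrative value

    # Base callbacks first, custom callbacks after them.
    callbacks = list(base_callbacks) + list(custom_callbacks)

    model.fit_generator(train_sequence,
                        epochs=epochs,
                        callbacks=callbacks)
    return callbacks
```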

#### Usage of metrics in Callbacks

Many built-in (or custom) Callbacks from Keras require a metric to monitor. Their behavior will depend on the value of this metric. For example, the Early Stopping callback will stop the model training prior to completing all planned epochs if the tracked metric is no longer improving.

Usually, you define the metrics you want to track in the metrics parameter of the compile function. Then you can retrieve them via the callbacks. DSS also computes its own metrics through a base callback depending on the prediction type:

Regression:

* ‘EVS’

* ‘MAPE’

* ‘MAE’

* ‘MSE’

* ‘RMSE’

* ‘RMSLE’

* ‘R2 Score’

* ‘Custom Score’

Binary Classification:

* ‘Accuracy’

* ‘Precision’

* ‘Recall’

* ‘F1 Score’

* ‘Cost Matrix Gain’

* ‘Log Loss’

* ‘Cumulative Lift’

* ‘ROC AUC’

* ‘Custom score’

Multiclass Classification:

* ‘Accuracy’

* ‘Precision’

* ‘Recall’

* ‘F1 Score’

* ‘Log Loss’

* ‘ROC AUC’

* ‘Custom score’

As DSS tracks metrics on the ‘Test’ set, you need to prepend ‘Test ’ (including the trailing space) to the name of the metric to get the name to monitor, for example ‘Test ROC AUC’.
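
The naming rule can be captured in a small lookup, sketched here with the metric names listed above. The dictionary keys and the function name monitored\_name are hypothetical; only the metric names and the ‘Test ’ prefix come from this page.

```python
DSS_METRICS = {
    "regression": ["EVS", "MAPE", "MAE", "MSE", "RMSE", "RMSLE",
                   "R2 Score", "Custom Score"],
    "binary": ["Accuracy", "Precision", "Recall", "F1 Score",
               "Cost Matrix Gain", "Log Loss", "Cumulative Lift",
               "ROC AUC", "Custom score"],
    "multiclass": ["Accuracy", "Precision", "Recall", "F1 Score",
                   "Log Loss", "ROC AUC", "Custom score"],
}


def monitored_name(prediction_type, metric):
    """Return the name to give to a callback's `monitor` argument."""
    if metric not in DSS_METRICS[prediction_type]:
        raise ValueError(
            "%r is not listed for %s models" % (metric, prediction_type))
    return "Test " + metric


name = monitored_name("binary", "ROC AUC")
```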

Warning

Because these metrics are computed in a base callback, a custom callback that uses them must be placed after the base\_callbacks provided by DSS in the list that you pass to fit\_generator.

For example, in a binary classification problem, if you want to add an early stopping callback that monitors ROC AUC, you can add the following callback to the list:

§ from tensorflow.keras.callbacks import EarlyStopping

§ early\_stopping\_callback = EarlyStopping(monitor="Test ROC AUC",

§ mode="max",

§ min\_delta=0,

§ patience=2)

DSS also provides a helper to retrieve, in your code, the name of the metric used to optimize the model, along with whether it is a loss (lower is better) or a score (greater is better). You can access these variables with:

§ from dataiku.doctor.deep\_learning.shared\_variables import get\_variable

§ metric\_to\_monitor = get\_variable("DKU\_MODEL\_METRIC")

§ greater\_is\_better = get\_variable("DKU\_MODEL\_METRIC\_GREATER\_IS\_BETTER")

The previous early stopping callback then becomes:

§ from dataiku.doctor.deep\_learning.shared\_variables import get\_variable

§ from tensorflow.keras.callbacks import EarlyStopping

§ metric\_to\_monitor = get\_variable("DKU\_MODEL\_METRIC")

§ greater\_is\_better = get\_variable("DKU\_MODEL\_METRIC\_GREATER\_IS\_BETTER")

§ early\_stopping\_callback = EarlyStopping(monitor=metric\_to\_monitor,

§ mode="max" if greater\_is\_better else "min",

§ min\_delta=0,

§ patience=2)
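
The same pair of values can be reused for any Keras callback that watches a metric. A minimal sketch, with the hypothetical helper name monitor\_kwargs, builds the shared keyword arguments once:

```python
def monitor_kwargs(metric_to_monitor, greater_is_better):
    """Keyword arguments shared by Keras callbacks that monitor a metric."""
    return {
        "monitor": metric_to_monitor,
        # A score should be maximized, a loss minimized.
        "mode": "max" if greater_is_better else "min",
    }


shared = monitor_kwargs("Test ROC AUC", True)
```

The result can then be unpacked into a callback constructor, for example EarlyStopping(patience=2, \*\*shared); ModelCheckpoint and ReduceLROnPlateau accept the same monitor and mode arguments.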
