Model training
The number of epochs to train for. More epochs generally improve convergence, up to a point, but increase training time.
The number of samples to include in each mini-batch.
Whether one epoch should contain all the training data.
The number of batches that make up one training epoch.
Whether the data should be shuffled between epochs (recommended, unless the data is already in random order).
Whether to serialize partially processed data during the first epoch to speed up subsequent epochs.
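The options above interact, so a small illustration may help. The following is a minimal sketch in plain Python, not the actual implementation behind these settings. The function name `iterate_batches` and all of its parameter names are hypothetical, chosen only to mirror the options described above. The serialization option is not modeled here, and wrapping around to the start of the data when more batches are requested than the data provides is an assumption of this sketch.

```python
import math
import random


def iterate_batches(data, epochs, batch_size, use_all_data=True,
                    batches_per_epoch=None, shuffle=True, seed=0):
    """Yield (epoch, batch) pairs. All names here are hypothetical."""
    if not use_all_data and batches_per_epoch is None:
        raise ValueError("batches_per_epoch is required when use_all_data is False")

    rng = random.Random(seed)
    indices = list(range(len(data)))

    # Number of batches needed to see every sample once
    # (the last batch may be smaller than batch_size).
    full_epoch = math.ceil(len(data) / batch_size)
    steps = full_epoch if use_all_data else batches_per_epoch

    for epoch in range(epochs):
        if shuffle:
            # Reorder the samples before each epoch.
            rng.shuffle(indices)
        for step in range(steps):
            # Assumption: wrap around if more batches are requested
            # than one pass over the data provides.
            start = (step % full_epoch) * batch_size
            chunk = indices[start:start + batch_size]
            yield epoch, [data[i] for i in chunk]
```

For example, with 10 samples and a batch size of 4, an epoch that contains all the training data consists of 3 batches (of sizes 4, 4, and 2). If the epoch is instead limited to 2 batches, only 8 of the 10 samples are seen per epoch, which is one reason shuffling between epochs is useful in that setting.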