# DSS 3.0 Release notes

* Migration notes
  + How to upgrade
  + External libraries upgrades
  + From scheduled jobs to scenarios
* Version 3.0.5 - June 24th, 2016
  + Spark
  + Datasets
  + Machine learning
  + API node
  + Webapps
  + Recipes
  + Automation
  + Misc
* Version 3.0.4 - June 16th, 2016
  + Plugins
  + Production
  + SQL Notebook
  + Recipes
  + Machine learning
  + Datasets
  + Data preparation
  + Charts
  + Misc
* Version 3.0.3 - May 30th, 2016
  + Recipes
  + Metrics & Scenarios
  + Misc
* Version 3.0.2 - May 25th, 2016
  + Hadoop & Spark
  + Metrics & Checks
  + Automation node & scenarios
  + Machine learning
  + API Node
  + Data preparation
  + Visual recipes
  + Charts
  + Misc
  + Webapps
* Version 3.0.1 - May 11th 2016
  + Installation
  + Connectivity
  + Metrics & Checks
  + Scenarios
  + Machine Learning
  + Data preparation
  + Misc
* Version 3.0.0 - May 1st 2016
  + New features
  + Other notable enhancements
  + Other changes

## Migration notes

Warning

Migration to DSS 3.0 from a previous DSS 2.X instance requires some attention.

To migrate from DSS 1.X, you must first upgrade to 2.0. See DSS 2.0 Release notes.

Automatic migration from Data Science Studio 2.3.X is supported, with the following restrictions and warnings:

* DSS 3.0 features an improved security model. The migration aims at preserving as much as possible the previously defined permissions, but we strongly encourage you to review the permissions of users and groups after migration.

* DSS 3.0 now enforces the “Reader” / “Data Analyst” / “Data Scientist” roles in the DSS licensing model. You might need to adjust the roles for your users after upgrade.

* DSS now includes the XGBoost library in the visual machine learning interface. If you previously installed an older version of the XGBoost Python library (using pip), the XGBoost algorithm in the visual machine learning interface might not work.

* The usual limitations on retraining models and regenerating API node packages apply (see Upgrading a DSS instance for more information)

* After migration, all previously scheduled jobs are disabled, to ease the “2.X and 3.X in parallel” deployment models. You’ll need to go to the scenarios pages in your projects to re-enable your previously scheduled jobs.

Automatic migration from Data Science Studio 2.0.X, 2.1.X and 2.2.X is supported, with the previous restrictions and warnings, plus the ones outlined in DSS 2.1 Release notes, DSS 2.2 Release notes and DSS 2.3 Release notes.

### How to upgrade

It is strongly recommended that you perform a full backup of your Data Science Studio data directory prior to starting the upgrade procedure.

For automatic upgrade information, see Upgrading a DSS instance

### External libraries upgrades

Several external libraries bundled with DSS have been upgraded to newer versions. Some of these upgrades include *backwards-incompatible* changes. You might need to update your code.

Notable upgrades:

* Pandas 0.16 -> 0.17

* Scikit-learn 0.16 -> 0.17

### From scheduled jobs to scenarios

The 3.0 version introduces Scenarios, which replace Scheduled jobs.

Each scheduled job you had in 2.X, enabled or not, is transformed during the migration process into a simple scenario replicating the functionalities of that scheduled job:

* the scenario contains a single build step to build the datasets that the scheduled job was building

* the scenario contains a single time-based trigger with the same setup as the scheduled job, so that the trigger fires at exactly the same frequency and time as the scheduled job did

If the scheduled job was enabled, the time-based trigger of the corresponding scenario is enabled; if it was disabled, the trigger is disabled. The scenarios themselves are set to inactive, so that none will run after the migration. You need to activate the scenarios (for example from the scenarios list), or take the opportunity to rearrange the work that the scheduled jobs were performing into a smaller number of scenarios: a single scenario can launch multiple builds, waiting for each build to finish before launching the next one.

Since a scenario will execute the build corresponding to a scheduled job only when its trigger is active and the scenario itself is active, the quickest route to get the same scheduled builds as before is to activate all scenarios.
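The activation rule above can be sketched in a few lines of Python. This is only an illustrative model, not DSS code: the `MigratedScenario` class, its field names and the two helper functions are assumptions introduced here to show why activating every scenario reproduces the previous scheduling.

```python
from dataclasses import dataclass


@dataclass
class MigratedScenario:
    """Hypothetical model of a scenario created from a 2.X scheduled job."""
    name: str
    trigger_enabled: bool          # copied from the scheduled job's enabled flag
    scenario_active: bool = False  # migration leaves every scenario inactive


def will_run(scenario: MigratedScenario) -> bool:
    # A build fires only when both the trigger and the scenario are active.
    return scenario.trigger_enabled and scenario.scenario_active


def activate_all(scenarios):
    # Mirrors "activate all scenarios": triggers keep their migrated state.
    for scenario in scenarios:
        scenario.scenario_active = True
    return scenarios


scenarios = [
    MigratedScenario("nightly_build", trigger_enabled=True),
    MigratedScenario("old_disabled_job", trigger_enabled=False),
]
before = [will_run(s) for s in scenarios]
after = [will_run(s) for s in activate_all(scenarios)]
print(before, after)
```

In this model nothing runs right after migration, and once all scenarios are activated only those whose scheduled job was enabled in 2.X run again.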

## Version 3.0.5 - June 24th, 2016

This release fixes a critical bug related to Spark, plus several smaller bug fixes.

### Spark

* Fix MLLib and Data preparation on Spark

### Datasets

* Fix exception in JSON extractor with some specific cases of nested arrays

### Machine learning

* Fix XGBoost regression models when the evaluation metric is MAE, MAPE, EVS or MSE

* Display grid search scores in regression reports

### API node

* Fix various issues with data enrichment in “mapped” mode

### Webapps

* Fix loading data from local/static

### Recipes

* Fix validation of custom expressions in sample recipe

### Automation

* Fix migration of scenarios from DSS 2.3 with partitions

* Better explanations as to why some scenarios are aborted

* Fix layout issues in scenario screens

### Misc

* Fix mass tagging on Hive and Impala notebooks

* Fixes on graph for job preview

## Version 3.0.4 - June 16th, 2016

This release brings a lot of bug fixes and minor features for plugins.

### Plugins

* Add ability to introduce visual separators in settings screen

* Add ability to hide parameters in settings screen

* Add ability to use custom forms in settings screen

### Production

* Add a metric for count of non null values

* Add more metrics in the “data validity” probe

* Expand capabilities for custom SQL aggregations

* Add the ability to have custom checks in plugins

* Use proxy settings for HTTP-based reporters

* Fix and improve settings of the “append to dataset” reporter

### SQL Notebook

* Make the spinner appear immediately after submitting the query

* Fix error reporting issues

* Fix reloading of results in multi-cells mode

* Add support for variable expansion

### Recipes

* Fix visual recipes running on Hive with multiple Hive DBs

* Fix reloading of split and filtering recipe with custom variables

### Machine learning

* Fix display of preparation step groups in model reports

* Fix simple Shuffle-based cross-validation on regression models

* Fix train-test split based on extract from two datasets with filter on test

* Fix deploying “clustering” recipe on connections other than Filesystem

* Add ability to disable XGBoost early stopping on regression

### Datasets

* Fix renaming of datasets in the UI

* Fix the Twitter dataset

* Fix “Import data” modal in editable dataset

* Fix reloading of schema for Redshift and other DBs

### Data preparation

* Improved display of filters for small numerical values

* Fix mass change meaning action

* Add ability to mass revert to default meaning

* Unselect the steps when unselecting a group

* Fix UI issue on Firefox

### Charts

* Add ability to have “external” legend on more charts

* Fix several small bugs

* Fix scale on charts with 2 Y-axes

### Misc

* Fix issue with R installation on Redhat 6

* Fix missing information in diagnostic tool

* Fix import of projects with SQL notebooks from 2.X

* Fix saving of summary info for web apps

* Add dataset listing and schema fetching in web apps API

## Version 3.0.3 - May 30th, 2016

DSS 3.0.3 is a bugfix release. For a summary of new features in DSS 3.0, see below.

### Recipes

* Fix bug leading to unusable join recipe in some specific cases

* Fix performance issue in code recipes with large number of columns

### Metrics & Scenarios

* Fix history charts for points with no value

* Fix possible race condition leading to considering some jobs as failed

### Misc

* Fix various UI issues in read-only mode

* Fix critical login bug

* Fix “Disconnected” overlay on Monitoring page

## Version 3.0.2 - May 25th, 2016

DSS 3.0.2 is a bugfix and minor enhancements release. For a summary of new features in DSS 3.0, see below.

### Hadoop & Spark

* Preserve the “hive.query.string” Hadoop configuration key in Hive notebook

* Clear error message when trying to use Geometry columns in Spark

* Fix S3 support in Spark

### Metrics & Checks

* Better performance for partitions list

* Simplify and rework the way metrics are enabled and configured

### Automation node & scenarios

* Add deletion of bundles

* Remap connections in SQL notebooks

* Fix scenario run URL in mails

### Machine learning

* Fix wrongly computed multiclass metrics

* Much faster multiclass scoring for MLLib

* Fix multiclass AUC when only 2 classes appear in test set

* Fix tooltip issues in the clustering scatter plot

### API Node

* Fix typo in custom HTTP header that could lead to inability to parse the response

* Fix the INSEE enrichment processor

* Fix excessive verbosity

### Data preparation

* Add a new processor to compute distance between geo points

* Fix DateParser in multi-columns mode when some of the columns are empty

* Modifying a step comment now properly unlocks the “Save” button

### Visual recipes

* Fix split recipe on “exotic” boolean values (Yes, No, 1, 0, …)

### Charts

* Add percentage mode on pie/donut chart

### Misc

* Add new error reporting tools

* Enforce hierarchy of files to prevent possible out-of-datadir reads

* Fix support for nginx >= 1.10

* Fix the ability to remove a group permission on a project

### Webapps

* Automatically enable/disable the Save button

* Warn if leaving with unsaved changes

* Add history and explicit commit mode

## Version 3.0.1 - May 11th 2016

DSS 3.0.1 is a bugfix release. For a summary of the major new features in DSS 3.0, see: https://www.dataiku.com/learn/whatsnew

### Installation

* Added support for nginx >= 1.10

### Connectivity

* Fixed “Other SQL databases” connections

### Metrics & Checks

* Fixed ordering of partitions table

* Default probes and metrics will now be enabled on migration from 2.X

### Scenarios

* Improved description of triggers

### Machine Learning

* Removed inapplicable parameter for MLLib

* Improve explanations about target remapping in Jupyter export

### Data preparation

* Fixed migration on groups

* Multiple ColumnRenamer processors will automatically be merged

### Misc

* Fixed display of Git diffs which could break

* Fixed display of logs on Safari

* Fixed tasks lists on projects

* Added user-customized themes

* “Read-only Analysts” can now fully view visual analysis screens

* Added “project-import” and “project-export” commands to dsscli

## Version 3.0.0 - May 1st 2016

DSS 3.0.0 is a major upgrade to DSS with exciting new features.

For a summary of the major new features, see: https://www.dataiku.com/learn/whatsnew

### New features

#### Automation deployment (“bundles”)

Dataiku DSS now comes in three flavors, called node types:

* The Design node (the “classical” DSS), where you mainly design your workflows

* The Automation node, where you run and automate your workflows

* The API node (introduced in DSS 2.2), where you score new records in real-time using a REST API

After designing your data workflow in the design node, you can package it in a consistent artefact, called a “bundle”, which can then be deployed to the automation node.

On the automation node, you can activate, rollback and manage all versions of your bundles.

This new architecture makes it very easy to implement complex deployment use cases, with development, acceptance, preproduction and production environments.

For more information, please see our product page: http://www.dataiku.com/dss/features/deployment/

#### Scenarios

DSS has always been about rebuilding entire dataflows at once, thanks to its smart incremental reconstruction engine.

With the introduction of automation scenarios, you can now automate more complex use cases:

* Building a part of the flow before another one (for partitioning purposes for example)

* Automatically retraining models if they have diverged too much.

Scenarios are made up of:

* Triggers, that decide when the scenario runs

* Steps, the building blocks of your scenarios

* Reporters, to notify the outside world.

You’ll find more details in Automation scenarios, metrics, and checks.

#### Metrics and checks

You can now track various advanced metrics about datasets, recipes, models and managed folders. For example:

* The size of a dataset

* The average of a column in a dataset

* The number of invalid rows for a given meaning in a column

* All performance metrics of a saved model

* The number of files in a managed folder

In addition to these built-in metrics, you can define custom metrics using Python or SQL. Metrics are historized for deep insights into the evolution of your data flow and can be fully accessed through the DSS APIs.

Then, you can define automatic data checks based on these metrics, that act as automatic sanity tests of your data pipeline. For example, automatically fail a job if the average value of a column has drifted by more than 10% since the previous week.
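To make the drift example concrete, here is a minimal sketch of the comparison such a check could perform. It is not the DSS custom check API: the function names, the `max_drift` parameter and the `"OK"` / `"ERROR"` status strings are assumptions chosen for illustration, and the two averages are assumed to come from the metric history.

```python
def relative_drift(current_avg: float, previous_avg: float) -> float:
    """Relative change between this week's and last week's column average."""
    if previous_avg == 0:
        raise ValueError("previous average is zero; relative drift is undefined")
    return abs(current_avg - previous_avg) / abs(previous_avg)


def check_average_drift(current_avg, previous_avg, max_drift=0.10):
    # Hypothetical check outcome: "ERROR" would fail the job, "OK" lets it pass.
    drift = relative_drift(current_avg, previous_avg)
    status = "ERROR" if drift > max_drift else "OK"
    return status, drift


status_big, drift_big = check_average_drift(115.0, 100.0)      # 15% drift
status_small, drift_small = check_average_drift(105.0, 100.0)  # 5% drift
print(status_big, status_small)
```

Using a relative rather than absolute difference keeps the 10% threshold meaningful whatever the scale of the column, at the cost of needing a guard for a zero baseline.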

#### Advanced version control

Git-based version control is now integrated much more tightly in DSS.

* View the history of your project, recipes, scenarios, … from the UI

* Write your own commit messages

* Choose between automatic commit at each edit or manual commit (either by component or by project)

In addition, you can now choose between a single global Git repository and one Git repository per project.

When viewing the history, you can get the diff of each commit, or compare two commits.

#### Team activity dashboards

Monitor the activity of each project thanks to our team activity dashboards.

#### Administrator monitoring dashboards

We’ve added a lot of monitoring dashboards for administrators, especially for large instances with lots of projects:

* Global usage summary

* Data size per connection

* Tasks running on the Hadoop and Spark clusters and per database

* Tasks running in the background on DSS

* Authorization matrix for an overview of all effective authorizations

### Other notable enhancements

#### Project import/export

When exporting a project, you can now export all datasets from all connections (except partitioned datasets), saved models and managed folders. When importing the project in another DSS design node, the data is automatically reloaded.

This allows you to export complete projects, including data.

When importing projects, you can also *remap* connections, removing the need to define connections with exactly the same name as on the source DSS instance.

#### Maintenance tasks

DSS now automatically performs several maintenance and cleanup tasks in the background.

#### Improved security model

We’ve added several new permissions for more fine-grained control. The following permissions can now be granted to each group, independently of the admin permissions:

* Create projects and tutorials

* Write “unsafe” code (that might be used to circumvent the permissions system)

* Manage user-defined meanings

In addition, users can now create personal connections without admin intervention.

The administration UI now includes an authorization matrix for an overview of all effective authorizations

#### API

* The public API includes new methods to interact with scenarios and metrics

* The public API includes new methods for exporting projects

#### Data preparation

* It’s now possible to delete columns based on a name pattern
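As a rough idea of what pattern-based column deletion does, the sketch below drops every column whose name matches a regular expression. It is plain Python over a list of dictionaries, not the DSS processor itself; the `drop_columns_matching` helper and the sample column names are hypothetical.

```python
import re


def drop_columns_matching(rows, pattern):
    """Return copies of the rows without the columns whose name matches pattern."""
    regex = re.compile(pattern)
    return [
        {name: value for name, value in row.items() if not regex.search(name)}
        for row in rows
    ]


rows = [{"id": 1, "tmp_a": 0, "tmp_b": 9, "name": "x"}]
cleaned = drop_columns_matching(rows, r"^tmp_")
print(cleaned)
```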

### Other changes

* DSS does not automatically grant Analyst access to the “first analysts group” when creating a project. After the creation of a project, only its creator (and the DSS administrators) can access it by default.
