The **Process Mining Solution** is delivered as a plug-and-play **Dataiku Application**, enabling users to quickly spin up configured instances for process analysis. Each instance is a self-contained Dataiku project built from a shared project template (`SOL_PROCESS_MINING`).

We recommend **creating one instance per process or subprocess** you want to mine.
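If a single export mixes several processes, it can be split into one file per process before creating the instances. The following is a minimal sketch using only the Python standard library; the `process` column name and the sample rows are assumptions made for illustration and are not part of the solution.

```python
import csv
import io
from collections import defaultdict


def split_by_process(csv_text, process_column="process"):
    """Group event-log rows by process and return one CSV string per process."""
    reader = csv.DictReader(io.StringIO(csv_text))
    fieldnames = reader.fieldnames or []
    if process_column not in fieldnames:
        raise ValueError(f"column {process_column!r} not found in {fieldnames}")

    # Collect rows per distinct value of the (assumed) process column.
    groups = defaultdict(list)
    for row in reader:
        groups[row[process_column]].append(row)

    # Serialize each group back to CSV, keeping the original header.
    outputs = {}
    for process, rows in groups.items():
        buffer = io.StringIO()
        writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        outputs[process] = buffer.getvalue()
    return outputs


# Hypothetical sample data: two processes mixed in one export.
sample = (
    "process,case_id,activity,timestamp\n"
    "loan,1,Submit,2024-01-01T09:00:00\n"
    "claims,7,Open,2024-01-02T10:00:00\n"
    "loan,1,Approve,2024-01-03T11:00:00\n"
)
per_process = split_by_process(sample)
```

Each value in `per_process` can then be saved as its own CSV file and uploaded to a separate instance.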

# Access the Application

From the **Dataiku homepage**, under the **Applications** section (not Projects), click on **Process Mining**.

Here you’ll see:

- A list of all previously created instances
- A **Start using the application** button that creates a new instance

![App instances.png](jtrT9KqWWom1)

Creating a new instance will:

- Duplicate the source project (`SOL_PROCESS_MINING`)
- Open the instance in a dedicated **configuration interface** (instead of the usual project view)

![Dataiku App.png](JPFZAkfLOz1J)

The configuration interface guides you through three key steps:

- [Configure the Input Dataset](#configure-the-input-dataset-1)
- [Map Columns](#map-columns-1)
- [Build the Instance](#build-the-instance-1)

You can always access the full project (Flow, datasets, dashboards, etc.) by clicking **Switch to project view**.

# Configure the Input Dataset

Upload or connect your **prepared process data** (see [Data Model](article:12) for requirements).

## Option 1: File Upload

- Uses the `filesystem_managed` connection by default
- A drag-and-drop widget allows you to upload CSV files
- A sample file (`workflow.csv`) containing loan application data is included

> ⚠️ If your Dataiku instance does not allow the `filesystem_managed` connection, the source project must be reconfigured to use an authorized connection.

After your file(s) appear, click **Configure**.
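Before uploading, it can help to sanity-check a CSV file locally. The sketch below reports missing columns and rows with empty required values. The column names `case_id`, `activity`, and `timestamp` are assumptions typical of event logs; the actual requirements are those listed in the Data Model article.

```python
import csv
import io


def check_event_log(csv_text, required_columns):
    """Return (missing_columns, incomplete_row_count) for a CSV event log."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []
    missing = [name for name in required_columns if name not in header]
    present = [name for name in required_columns if name in header]

    # Count rows where any required column that exists is empty.
    incomplete_rows = 0
    for row in reader:
        if any(not (row[name] or "").strip() for name in present):
            incomplete_rows += 1
    return missing, incomplete_rows


# Assumed column names; replace with the ones from the Data Model article.
REQUIRED = ["case_id", "activity", "timestamp"]

# Hypothetical sample: the second event has no activity.
sample = (
    "case_id,activity,timestamp\n"
    "1,Submit,2024-01-01T09:00:00\n"
    "1,,2024-01-02T09:00:00\n"
)
missing, incomplete = check_event_log(sample, REQUIRED)
```

For a real file, read its contents with `open(path, newline="").read()` and pass the result as `csv_text`.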

## Option 2: Database Connection

If you're working from a SQL backend like PostgreSQL or Snowflake:

1. Choose the connection from the list
2. Click **Configure**
3. Click on `workflow_parsed` to open dataset settings
4. Point it to your table containing the prepared process data

For non-SQL backends (e.g. HDFS), manual setup is required:

1. Go to **Project > Datasets**

   - Delete the `workflow` dataset (this also removes the sync recipe)

2. Select all datasets → click **Change connection**

   - Choose your desired connection
   - Confirm and drop existing data
   - Leave "reuse connection settings" **unticked**

3. A warning like _“Dataset `available_processes` has no predecessor...”_ may appear. This is expected.
4. Go to **Flow > View > Recipes engine**

   - Ensure all recipes use the appropriate engine for your connection

# Map Columns

In this step, match the columns of your input dataset to the columns the solution expects. See [Data Model](article:12) for more information on expected columns.
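As an offline illustration of what a column mapping expresses, the sketch below renames source columns to target roles. The role names (`case_id`, `activity`, `timestamp`) and the source column names are hypothetical. The mapping itself is done in the configuration interface, not in code.

```python
def apply_mapping(rows, mapping):
    """Rename source columns to target roles; `mapping` is {role: source_column}."""
    renamed = []
    for row in rows:
        missing = [src for src in mapping.values() if src not in row]
        if missing:
            raise KeyError(f"source columns not found: {missing}")
        # This sketch keeps only the mapped columns.
        renamed.append({role: row[src] for role, src in mapping.items()})
    return renamed


# Hypothetical mapping from assumed roles to invented source column names.
mapping = {"case_id": "LoanID", "activity": "Step", "timestamp": "EventTime"}
rows = [
    {"LoanID": "42", "Step": "Submit", "EventTime": "2024-01-01T09:00:00", "Owner": "a"},
]
mapped = apply_mapping(rows, mapping)
```

Writing the mapping down this way before opening the interface makes it easier to spot a source column that has no obvious role.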

# Build the Instance

Once column mapping is done, click **Build** to generate the datasets and dashboards that power the solution.

Next steps:

- Open the **Data Quality** dashboard to check that your data was interpreted correctly.
- Explore your process in the **Process Mining** dashboard.
