# New Segmentation Designer


This [webapp](web_app:UcGqC6v) lets you segment and analyze data directly from your Dataiku projects. You start by selecting a dataset, apply filters to focus on specific groups, and then choose between two segmentation methods: Machine Learning (K-Means) or Rule-Based (see article [Data Driven Segmentation](article:19)). The web application guides you through setting parameters, running the segmentation, and visualizing the results.

After creating your segments, you can rename them to make the results more meaningful. You can save the segmented data back to Dataiku or export it as a CSV file. The app provides interactive charts and insights to help you understand how the data is grouped, making it easier to draw conclusions and support your decision-making.

## Filtering Tab
![1.png](VLj0zDPdIHIc)
- Dataset Selection: Users begin by selecting a dataset from the Dataiku flow via a dropdown menu. The application retrieves the dataset and validates it by checking that it contains an `account_id` column.
- Filtering Options: Users can filter the dataset by selecting columns and specifying conditions (e.g., ranges, dates, or specific values). Filters are dynamically generated based on the selected columns.
- Apply Filters: Clicking "Apply" updates the data table in the right panel to show the filtered data. If no columns are selected for filtering, "Apply" displays the original, unfiltered data.

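
The validation and filtering steps above can be sketched with pandas. This is only an illustration, not the webapp's actual code: the function names, the `filters` dictionary format, and the sample data are assumptions made for the example.

```python
import pandas as pd

REQUIRED_COLUMN = "account_id"  # column the app checks for, per the text above


def validate_dataset(df: pd.DataFrame) -> None:
    """Raise if the required identifier column is missing."""
    if REQUIRED_COLUMN not in df.columns:
        raise ValueError(f"Dataset must contain an '{REQUIRED_COLUMN}' column")


def apply_filters(df: pd.DataFrame, filters: dict) -> pd.DataFrame:
    """Apply range or value filters; an empty dict returns the data unchanged.

    `filters` (hypothetical format) maps a column name to either a
    (low, high) tuple for an inclusive range, or a list of accepted values.
    """
    mask = pd.Series(True, index=df.index)
    for column, condition in filters.items():
        if isinstance(condition, tuple):
            low, high = condition
            mask &= df[column].between(low, high)
        else:
            mask &= df[column].isin(condition)
    return df[mask]


# Hypothetical sample data
accounts = pd.DataFrame({
    "account_id": [1, 2, 3, 4],
    "revenue": [100.0, 250.0, 900.0, 40.0],
    "region": ["EU", "US", "EU", "APAC"],
})
validate_dataset(accounts)
filtered = apply_filters(accounts, {"revenue": (50, 1000), "region": ["EU"]})
```

Here `filtered` keeps only the EU accounts whose revenue falls in the requested range, and passing an empty dictionary returns every row, which mirrors the "no columns selected" behaviour.
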
## Building Tab
![2.png](2cx9BYrlQ0i0)
- Method Selection: Users choose between two segmentation methods:
  - Machine Learning Clustering (K-Means): Uses unsupervised machine learning to cluster data.
  - Rule-Based Segmentation: Segments data based on predefined rules and weights.
- Parameter Configuration: Depending on the chosen method, users can set parameters:
  - K-Means: Specify the number of segments and select features.
  - Rule-Based: Specify the number of segments, select numerical features, and define a weight for each feature to control the segmentation.
- Run Segmentation: Once the parameters are set, users click "Run" to apply the segmentation. The app runs the selected method and displays the resulting segments in the right panel. A new `cluster` column is added to the original data.
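
As a rough illustration of the K-Means option, the sketch below clusters the selected features with scikit-learn and writes the labels to a `cluster` column. The function name and sample data are hypothetical, and standardizing the features before clustering is an assumption of this example, not something the app is documented to do.

```python
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler


def run_kmeans(df, features, n_segments, random_state=0):
    """Return a copy of `df` with a 'cluster' column, plus the fitted model."""
    # Assumption: features are standardized so no single scale dominates
    scaled = StandardScaler().fit_transform(df[features])
    model = KMeans(n_clusters=n_segments, n_init=10, random_state=random_state)
    result = df.copy()
    result["cluster"] = model.fit_predict(scaled)
    return result, model


# Hypothetical sample data with two clearly separated groups
accounts = pd.DataFrame({
    "account_id": range(1, 7),
    "revenue": [10.0, 12.0, 11.0, 500.0, 520.0, 510.0],
    "orders": [1, 2, 1, 40, 42, 41],
})
segmented, kmeans_model = run_kmeans(accounts, ["revenue", "orders"], n_segments=2)
```

The numeric labels assigned by K-Means are arbitrary, so only the grouping is meaningful, not which group receives label 0.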

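The exact rules used by the rule-based method are not spelled out here, so the following is only one plausible reading: each numerical feature is min-max normalized, combined into a weighted score, and the score is cut into equal-width bands. The function name, the normalization, and the equal-width bounds are all assumptions of this sketch.

```python
import numpy as np
import pandas as pd


def rule_based_segments(df, weights, n_segments):
    """Assign segments from a weighted score of min-max normalized features.

    `weights` maps feature name -> positive weight (hypothetical format).
    Returns a copy of `df` with a 'cluster' column, plus the score bounds.
    """
    features = list(weights)
    values = df[features].astype(float)
    span = (values.max() - values.min()).replace(0, 1.0)  # avoid dividing by zero
    normalized = (values - values.min()) / span
    total = sum(weights.values())
    score = sum(normalized[f] * (w / total) for f, w in weights.items())
    # Assumption: equal-width bounds on the [0, 1] score
    bounds = np.linspace(0.0, 1.0, n_segments + 1)
    result = df.copy()
    result["cluster"] = np.digitize(score, bounds[1:-1])
    return result, bounds


# Hypothetical sample data
accounts = pd.DataFrame({
    "account_id": [1, 2, 3, 4],
    "revenue": [0.0, 50.0, 100.0, 100.0],
    "orders": [0, 10, 0, 10],
})
segmented, bounds = rule_based_segments(
    accounts, {"revenue": 3, "orders": 1}, n_segments=2
)
```

With these weights, revenue contributes three quarters of the score, so only the account with no revenue and no orders falls in the lower band.
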
## Insights Tab
![3.png](pRuuXgxRCfZw)
![analysis2.png](kUfEfrujdQK5)

- Segmentation Insights: Once segmentation is complete, users can see insights, including a histogram of the number of records per segment and a feature importance chart. For the rule-based method, the chart shows the feature weights; for K-Means, feature importance is derived using a Random Forest classifier (see article [Feature Importance in Segmentation](article:20)).
- Feature Distribution: Users can analyze feature distributions across segments, view average values by segment, and explore mixed graphs to better understand the segmentation patterns.
- Remapping Segment Names: Users can rename the segments in the input boxes to give them meaningful names, which makes the segmentation results easier to interpret.
- Saving and Exporting Results: Users can save the segmented data to the Dataiku flow as a new dataset, with the session recorded in a metadata dataset. Alternatively, they can export the segmented data locally as a CSV file.
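
To make the K-Means feature importance idea concrete, here is a minimal sketch that fits a Random Forest classifier to predict the segment label and reads its `feature_importances_`. Training the classifier on the `cluster` column is an assumption about how the importance is derived; the function name and sample data are hypothetical.

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier


def segment_feature_importance(df, features, label_column="cluster", random_state=0):
    """Rank features by how well they help a classifier recover the segments."""
    forest = RandomForestClassifier(n_estimators=100, random_state=random_state)
    forest.fit(df[features], df[label_column])
    importance = pd.Series(forest.feature_importances_, index=features)
    return importance.sort_values(ascending=False)


# Hypothetical segmented data: 'revenue' separates the segments, 'noise' does not
segmented = pd.DataFrame({
    "revenue": [10.0, 12.0, 11.0, 500.0, 520.0, 510.0],
    "noise": [5, 1, 3, 1, 5, 3],
    "cluster": [0, 0, 0, 1, 1, 1],
})
importance = segment_feature_importance(segmented, ["revenue", "noise"])
sizes = segmented["cluster"].value_counts().sort_index()  # records per segment
```

The `sizes` series corresponds to the records-per-segment histogram, and `importance` to the feature importance chart.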

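Renaming segments and exporting to CSV can be sketched as a simple label mapping followed by `DataFrame.to_csv`. The function name, segment names, and sample data below are hypothetical.

```python
import pandas as pd


def rename_segments(df, names, label_column="cluster"):
    """Replace segment labels with user-supplied names; unmapped labels are kept."""
    result = df.copy()
    result[label_column] = result[label_column].map(
        lambda label: names.get(label, label)
    )
    return result


# Hypothetical segmented data and names
segmented = pd.DataFrame({"account_id": [1, 2, 3], "cluster": [0, 1, 0]})
named = rename_segments(segmented, {0: "Low value", 1: "High value"})
csv_text = named.to_csv(index=False)  # CSV content for a local download
```
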
# Interaction with the Dataiku Flow
## Dataset Integration
The webapp reads datasets directly from the Dataiku flow using the Dataiku API. This gives users access to existing datasets and ensures the selected data is always in sync with the flow.
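
A minimal sketch of reading a flow dataset from a Python backend uses `dataiku.Dataset(...).get_dataframe()`. The helper name and the dataset name in the usage note are hypothetical, and the `dataiku` package is only importable inside a Dataiku DSS environment, which is why it is imported inside the function.

```python
def load_flow_dataset(dataset_name, required_column="account_id"):
    """Read a flow dataset into a pandas DataFrame and check the identifier column."""
    import dataiku  # only available inside a Dataiku DSS environment

    df = dataiku.Dataset(dataset_name).get_dataframe()
    if required_column not in df.columns:
        raise ValueError(f"'{dataset_name}' has no '{required_column}' column")
    return df
```

Inside DSS, a call such as `load_flow_dataset("accounts")` would return the current contents of a dataset named `accounts`, assuming one exists in the project.
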
## Metadata and Saving
The app saves session details, including the method used, the parameters, and the description, as records in the [metadata_dataset](dataset:metadata_dataset) in the Dataiku flow. The segmented data is saved as a new dataset. For K-Means, the fitted model is also stored in the [model_sessions_folder](managed_folder:TdqwWquP); for the rule-based method, the bounds and weights are saved as a new record in [rule_based_specs_dataset](dataset:rule_based_specs_dataset).
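
The saving step might look roughly like the sketch below. The record fields, the pickle file naming, and the read-then-rewrite approach to appending a metadata row are assumptions of this example; it also assumes the output dataset already exists in the flow. `write_with_schema`, `get_dataframe`, and `Folder.upload_stream` are standard calls of the `dataiku` package.

```python
import io
import json
import pickle
from datetime import datetime, timezone

import pandas as pd


def build_session_record(session_id, method, parameters, description):
    """Build one metadata row (field names are hypothetical)."""
    return {
        "session_id": session_id,
        "method": method,
        "parameters": json.dumps(parameters, sort_keys=True),
        "description": description,
        "created_at": datetime.now(timezone.utc).isoformat(),
    }


def save_session(record, segmented_df, output_dataset, model=None):
    """Write the segmented data, append the metadata row, and store the model."""
    import dataiku  # only available inside a Dataiku DSS environment

    # Assumption: `output_dataset` already exists in the flow
    dataiku.Dataset(output_dataset).write_with_schema(segmented_df)

    # Append by reading the existing records and rewriting the dataset
    metadata = dataiku.Dataset("metadata_dataset")
    history = metadata.get_dataframe()
    updated = pd.concat([history, pd.DataFrame([record])], ignore_index=True)
    metadata.write_with_schema(updated)

    if model is not None:  # K-Means sessions only
        folder = dataiku.Folder("TdqwWquP")  # model_sessions_folder
        payload = io.BytesIO(pickle.dumps(model))
        folder.upload_stream(f"{record['session_id']}.pkl", payload)
```

Storing the parameters as a JSON string keeps the metadata dataset flat, so K-Means and rule-based sessions can share the same columns.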






