![flow_zones_macro_view_inputs_preprocessing.png](3xLnVcmhQBJS)

# Flow zone presentation


The [users_analysis_and_selection](flow_zone:vzi5sQD) flow zone selects the users that are candidates for the collaborative filtering process.
Filtering users reduces the cardinality of the recommendation problem *([Get more explanations on this](article:34))*

![flow_zone_users_analysis_and_selection.png](tSRxNWj2KckM)

# Main steps

The flow zone is composed of two main branches: 
- A branch computing information at the ***day x user***  level.
- A branch computing information at the ***user***  level.


### Day x user branch
In this branch: 
- Recipe [compute_user_daily_interactions](recipe:compute_user_daily_interactions) computes the total number of interactions users had for each day they interacted with your items.
- Recipe [compute_user_daily_interactions_windows](recipe:compute_user_daily_interactions_windows) computes, for each date a user interacted with your items, the next date on which they interacted again.
- Recipe [compute_users_since_last_interaction](recipe:compute_users_since_last_interaction) finally computes the number of days separating each user's consecutive interaction dates.

:arrow_forward: This information is then leveraged in the [Project's dashboard](article:21) to better understand the behavior of your user community and tune the application accordingly.
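
The three steps of this branch can be sketched in plain Python as follows. This is a minimal illustration, not the recipes' actual implementation: the sample rows and all names (`interactions`, `daily_counts`, `days_to_next_interaction`, and so on) are assumptions made for the example.

```python
from collections import Counter, defaultdict
from datetime import date

# Hypothetical raw interactions: (user_id, item_id, interaction_date).
interactions = [
    ("u1", "a", date(2024, 1, 1)),
    ("u1", "b", date(2024, 1, 1)),
    ("u1", "a", date(2024, 1, 4)),
    ("u2", "a", date(2024, 1, 2)),
    ("u2", "c", date(2024, 1, 3)),
]

# Step 1: total number of interactions per (user, day).
daily_counts = Counter((user, day) for user, _item, day in interactions)

# Group each user's active days in chronological order.
active_days = defaultdict(list)
for user, day in sorted(daily_counts):
    active_days[user].append(day)

# Steps 2 and 3: pair each active day with the next one,
# then compute the gap between them in days.
daily_rows = []
for user, days in active_days.items():
    next_days = days[1:] + [None]  # the last active day has no successor
    for day, next_day in zip(days, next_days):
        gap = (next_day - day).days if next_day is not None else None
        daily_rows.append(
            {
                "user_id": user,
                "interaction_date": day,
                "nb_interactions": daily_counts[(user, day)],
                "next_interaction_date": next_day,
                "days_to_next_interaction": gap,
            }
        )
```

Each row of `daily_rows` corresponds to one day on which a user was active; the last active day of a user has no next date, so its gap is left empty.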


### User branch
In this branch: 

- Recipe [compute_user_interactions](recipe:compute_user_interactions) computes: 
  - The last date each user had a recorded interaction.
  - Their total number of interactions. 
  - Their total number of *distinct* interactions.
  
- Recipe [compute_user_types](recipe:compute_user_types) flags the users to keep in or reject from the collaborative filtering pipeline, based on the application settings *([See the corresponding article](article:43))*. This is done in two steps:
    - 1: Computing the column **user_type**, which flags 3 types of users:
        - "collaborative_filtering_users": users that interacted with more than 1 distinct item and whose total number of interactions is *ABOVE* or equal to the threshold set in your application.
        - "low_interactions_user": users that interacted with more than 1 distinct item and had more than 1 interaction, *BUT* whose total number of interactions is *BELOW* the threshold set in your application.
        - "cold_start_user": users that had only 1 interaction in the batch set in the application, so they cannot be used in the collaborative filtering pipeline.
    - 2: Flagging the users to keep with the boolean column **is_user_to_keep**: the users to keep are simply those whose **user_type** equals "collaborative_filtering_users".
    
- Recipe [split_user_types](recipe:split_user_types) splits the users into two datasets: 
    - Dataset [users_kept](dataset:users_kept) contains all users that will be candidates for the collaborative filtering pipeline: these are all the users with **is_user_to_keep** equal to "True".
    - Dataset [users_rejected](dataset:users_rejected) contains all the remaining users.
    
- Recipe [compute_hyperactive_users](recipe:compute_hyperactive_users) computes information about the 'hyperactive users'. It generates:
    - Dataset [hyperactive_users](dataset:hyperactive_users): A dataset containing all users flagged as hyperactive.
    - Dataset [user_hyperactivity_threshold](dataset:user_hyperactivity_threshold): A dataset containing the interaction-count threshold used to consider a user hyperactive.
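
The aggregation, typing and split steps of the user branch can be sketched as below. This is an illustrative sketch under stated assumptions, not the recipes' actual code: the sample data, the `MIN_INTERACTIONS` name and value, and the helper functions are hypothetical, and "distinct interactions" is assumed to mean distinct items. The hyperactivity computation is left out because its threshold rule is not described here.

```python
from datetime import date

# Hypothetical raw interactions: (user_id, item_id, interaction_date).
interactions = [
    ("u1", "a", date(2024, 1, 1)),
    ("u1", "b", date(2024, 1, 1)),
    ("u1", "a", date(2024, 1, 4)),
    ("u2", "a", date(2024, 1, 2)),
    ("u2", "c", date(2024, 1, 3)),
    ("u3", "b", date(2024, 1, 5)),
]
MIN_INTERACTIONS = 3  # assumed name and value for the application threshold


def compute_user_interactions(rows):
    """Aggregate the last date, total count and distinct items per user."""
    stats = {}
    for user, item, day in rows:
        s = stats.setdefault(user, {"last_date": day, "total": 0, "items": set()})
        s["last_date"] = max(s["last_date"], day)
        s["total"] += 1
        s["items"].add(item)
    return stats


def user_type(total, distinct, threshold):
    """Assign one of the three user types described above."""
    if distinct > 1 and total >= threshold:
        return "collaborative_filtering_users"
    if distinct > 1 and total > 1:
        return "low_interactions_user"
    # Assumption: every remaining user, including single-interaction users,
    # is treated as cold start; the article only describes that one case.
    return "cold_start_user"


users_kept, users_rejected = [], []
for user, s in compute_user_interactions(interactions).items():
    utype = user_type(s["total"], len(s["items"]), MIN_INTERACTIONS)
    is_user_to_keep = utype == "collaborative_filtering_users"
    row = {"user_id": user, "user_type": utype, "is_user_to_keep": is_user_to_keep}
    (users_kept if is_user_to_keep else users_rejected).append(row)
```

With this sample data and a threshold of 3, only `u1` lands in `users_kept`; `u2` is a low-interactions user and `u3` is a cold-start user, so both go to `users_rejected`.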
