<div class="alert">
  User setup impacting the <b>Automatic Matches</b> selection:
  <ul>
    <li>Matching Distance Thresholds</li>
    <li>Matching Distance Weights</li>
    <li>Automatic Matching Threshold</li>
  </ul>
</div>

Observations that are not classified as perfect matches remain in the process and are used as input for the fuzzy join recipe. 

**Match Score** is calculated as the weighted average distance across all keys, using the user-defined Matching Distance Weights, for observations whose distances are below the user-defined Matching Distance Thresholds.
 
 For a scenario with 5 keys, the match score function is:

![Screenshot 2024-12-04 at 11.56.36.png](YjfjxUY7j8Sx)
 
**Automatic Matches** are matches whose match score exceeds the user-set Automatic Matching Threshold but falls short of a perfect 100%.

![Screenshot 2025-11-20 at 10.41.36.png](naXWmHOiuKCP)

The **Fuzzy Join Recipe** joins two datasets whose join keys do not match exactly. It works by calculating a distance chosen by the user and comparing it to a threshold. The threshold for each key is defined in the Project Setup, and the type of distance calculation depends on the data type of the key.

Distance Metrics by type:
-  **Numeric Values** : Euclidean distance (absolute arithmetic difference in one dimension). For example, the distance between $p$ and $q$ is $|p - q|$.
-  **String Values** : Levenshtein distance (number of edits—insertions, deletions, or replacements—needed to transform one string into another). Example: The Levenshtein distance between “kitten” and “sitting” is 3.
-  **Date Values** : Difference in seconds. For example, 3600 seconds = 1 hour.
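
The three distance types above can be sketched in plain Python. This is only an illustration of the definitions, not the recipe's actual implementation; the function names are hypothetical.

```python
from datetime import datetime


def numeric_distance(p: float, q: float) -> float:
    """Absolute difference between two numbers."""
    return abs(p - q)


def levenshtein_distance(a: str, b: str) -> int:
    """Number of single-character insertions, deletions, or replacements."""
    # previous[j] is the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # replacement (or match)
            ))
        previous = current
    return previous[-1]


def date_distance(d1: datetime, d2: datetime) -> float:
    """Absolute difference between two datetimes, in seconds."""
    return abs((d1 - d2).total_seconds())
```

For instance, `levenshtein_distance("kitten", "sitting")` returns 3, and two timestamps one hour apart have a date distance of 3600 seconds.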

Observations that exceed the threshold distance for any key are set aside as  **No Match** .
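
As a minimal sketch of this rule, assuming the per-key distances and thresholds are available as dictionaries (the function name and the key names `amount` and `name` are hypothetical):

```python
def exceeds_any_threshold(distances: dict[str, float],
                          thresholds: dict[str, float]) -> bool:
    """Return True when at least one key's distance is above its threshold."""
    return any(distances[key] > thresholds[key] for key in thresholds)


# Hypothetical thresholds for two keys.
thresholds = {"amount": 5.0, "name": 2}

# The "name" distance (3) is above its threshold (2), so the pair is a No Match.
label = "No Match" if exceeds_any_threshold({"amount": 1.0, "name": 3}, thresholds) else "Potential Match"
```

A pair whose distances are all at or below their thresholds stays in the process as a potential match.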

The remaining potential matches are then scored. The match score is the weighted average distance across all keys, computed with the user-defined Matching Distance Weights. Matches whose match score exceeds the user-set Automatic Matching Threshold are classified as  **Automatic Matches**.
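
The exact match score function is the one shown in the screenshot earlier on this page; it is not reproduced here. The sketch below only illustrates the general idea under a stated assumption: each key's distance is scaled by its threshold and turned into a similarity (`1 - distance / threshold`), so that a zero distance on every key gives 100%. The function names and this normalization are assumptions, not the recipe's confirmed formula.

```python
def match_score(distances: dict[str, float],
                thresholds: dict[str, float],
                weights: dict[str, float]) -> float:
    """Weighted average of per-key similarities, as a percentage.

    ASSUMPTION: similarity = 1 - distance / threshold for each key.
    """
    total_weight = sum(weights.values())
    weighted_sum = 0.0
    for key, weight in weights.items():
        threshold = thresholds[key]
        # A zero threshold only admits a zero distance, i.e. full similarity.
        similarity = 1.0 if threshold == 0 else 1.0 - distances[key] / threshold
        weighted_sum += weight * similarity
    return 100.0 * weighted_sum / total_weight


def is_automatic_match(score: float, automatic_threshold: float) -> bool:
    """Above the Automatic Matching Threshold but short of a perfect 100%."""
    return automatic_threshold < score < 100.0


score = match_score(
    distances={"amount": 1.0, "name": 1},
    thresholds={"amount": 4.0, "name": 2},
    weights={"amount": 3, "name": 1},
)
```

With these hypothetical inputs the per-key similarities are 0.75 and 0.5, so the score is (3 × 0.75 + 1 × 0.5) / 4 = 68.75%, which would be an Automatic Match for an Automatic Matching Threshold of 60%.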

**Process by reconciliation type:**
-  **1-to-1** : If multiple automatic matches are generated for the same secondary or primary observation, only the one with the highest score is retained.
-  **1-to-many** : If multiple automatic matches are generated for the same secondary observation, only the one with the highest score is retained. If multiple automatic matches are generated for the same primary observation, all are retained. 
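
The retention rules above can be sketched as follows. This is one possible reading, assuming a greedy pass over matches sorted by descending score; the `Match` type and `retain_matches` function are hypothetical names, and the actual recipe may resolve conflicts differently.

```python
from typing import NamedTuple


class Match(NamedTuple):
    primary_id: str
    secondary_id: str
    score: float


def retain_matches(matches: list[Match], one_to_one: bool) -> list[Match]:
    """Keep the highest-scoring match per secondary observation.

    In 1-to-1 mode, also keep at most one match per primary observation.
    """
    used_primary: set[str] = set()
    used_secondary: set[str] = set()
    retained: list[Match] = []
    # Visit the best matches first so lower-scoring duplicates are dropped.
    for match in sorted(matches, key=lambda m: m.score, reverse=True):
        if match.secondary_id in used_secondary:
            continue
        if one_to_one and match.primary_id in used_primary:
            continue
        retained.append(match)
        used_secondary.add(match.secondary_id)
        used_primary.add(match.primary_id)
    return retained
```

In 1-to-many mode a primary observation can appear in several retained matches, while each secondary observation still appears at most once.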

In both cases, only the rows from the secondary dataset that were automatically matched are excluded from subsequent steps in the flow.








