# Flow zone presentation
 
The [subway_stations_preprocessing](flow_zone:rhcQz1q) flow zone processes the [subway_stations dataset](dataset:subway_stations_original).
 
![subway-stations-preprocessing.png](44pPIjXpgSl6)
 
The [subway_stations dataset](dataset:subway_stations_original) contains the details of all stations on the Paris subway lines. We use it to:
- Project the subway stations onto a network graph to compute several centrality metrics for each station. This is done by the plugin recipe [compute_subway_stations_graph_features](recipe:compute_subway_stations_graph_features).
- Get metadata for each Paris station: all the lines the station belongs to, and whether it is a terminus of any of those lines.
- Use the stations' location information to index all this data, so that any geo point can be enriched with the information of the closest surrounding subway stations.
  - Recipe [compute_Nhst9fL7](recipe:compute_Nhst9fL7) is in charge of this. It uses the latitude and longitude of each station to build a **K-d tree** that indexes the station geospatial data. Once built, this **K-d tree** and the stations' metadata are saved in the folder [subway_stations_indexing](managed_folder:Nhst9fL7) ([Why do we use a folder here?](article:42)).
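To illustrate the indexing idea, here is a minimal pure-Python sketch of a K-d tree nearest-station lookup. It is not the code of the recipe itself, which most likely relies on a library implementation. The function names, the `STATIONS` sample and its approximate coordinates are assumptions made for this example. Latitude and longitude are first mapped to 3D points on the unit sphere, so that plain Euclidean distance ranks stations in the same order as great-circle distance.

```python
import math

# Hypothetical sample: approximate (latitude, longitude) of a few stations.
STATIONS = {
    "Bastille": (48.8531, 2.3691),
    "Nation": (48.8484, 2.3959),
    "Concorde": (48.8656, 2.3212),
    "Pigalle": (48.8822, 2.3375),
}

def to_unit_vector(lat_deg, lon_deg):
    """Map a (latitude, longitude) pair in degrees to a point on the unit sphere."""
    lat, lon = math.radians(lat_deg), math.radians(lon_deg)
    return (math.cos(lat) * math.cos(lon),
            math.cos(lat) * math.sin(lon),
            math.sin(lat))

def build_kd_tree(points, depth=0):
    """Recursively build a K-d tree from a list of (vector, payload) pairs."""
    if not points:
        return None
    axis = depth % 3  # cycle through the x, y, z axes
    points = sorted(points, key=lambda p: p[0][axis])
    mid = len(points) // 2  # the median point becomes the splitting node
    return {
        "point": points[mid],
        "axis": axis,
        "left": build_kd_tree(points[:mid], depth + 1),
        "right": build_kd_tree(points[mid + 1:], depth + 1),
    }

def nearest(node, target, best=None):
    """Return (squared_distance, payload) of the stored point closest to target."""
    if node is None:
        return best
    vector, payload = node["point"]
    dist = sum((a - b) ** 2 for a, b in zip(vector, target))
    if best is None or dist < best[0]:
        best = (dist, payload)
    diff = target[node["axis"]] - vector[node["axis"]]
    near, far = ((node["left"], node["right"]) if diff < 0
                 else (node["right"], node["left"]))
    best = nearest(near, target, best)
    # Visit the far side only if the splitting plane is closer than the best match.
    if diff ** 2 < best[0]:
        best = nearest(far, target, best)
    return best

def closest_station(tree, lat, lon):
    """Return the name of the indexed station closest to the given geo point."""
    _, name = nearest(tree, to_unit_vector(lat, lon))
    return name

tree = build_kd_tree([(to_unit_vector(lat, lon), name)
                      for name, (lat, lon) in STATIONS.items()])

# A geo point near Place des Vosges.
print(closest_station(tree, 48.8556, 2.3655))
```

In this sketch the tree is built once and then queried for every geo point to enrich, which is why the project stores the built index rather than recomputing it for each lookup.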
 
This flow zone ends with data ready to enrich the dataset [real_estate_sales_prepared](dataset:real_estate_sales_prepared) with subway station information in the [properties_geospatial_enrichment flow zone](flow_zone:Uam9udZ). [Learn more about this zone](article:28).