## Datasheet for [historic_export](dataset:historic_export)

 ### _Motivations: Describe the motivations for creating the dataset, including funding, any specific tasks the authors had in mind, and who the authors are._ 
 
 This data comes from the marketing department of Dataiku Wireless. It was created by that team's data engineers based on a business request. The business user requested information about all current clients and those that have churned recently. The data was pulled from a CSM tool and the engineers did some basic cleaning before passing it over to the analytics team.
 
 ### _Composition: Describe the composition of the dataset, like what kinds of data are in it, how it was collected, whether labels are associated with the data, and whether the dataset contains sensitive information._

This data contains individual client data, including the demographic information provided by a client at the time of signing up for our service. It also has information on how long the client has used our service, how much they use it, and how much they pay. This information comes from different tables within our datalake and is otherwise not explicitly labelled. Sensitive information includes each client's age, marital status, credit score, and homeownership status.

 ### _Collection Process: Describe the data collection process, like how the data was collected, where or who it was collected from, who was involved in the collection process, and, if people are involved, if consent was given for the data to be collected._

Clients provide their information at the time of service initiation, and this data is stored in our datalake accordingly. The client and the in-store salesperson are involved in the collection process, and our sign-up form includes an opt-in consent to collecting and storing this personal data.

### _Processing: Whether the data was processed or labelled and how it was done._

The data engineers processed the data by joining client demographic information with client usage data. Client usage data was aggregated to get averages and totals for length of service, minutes used, and bill amounts. Age groups were assigned by the data engineers, but the rationale for the specific labels is unknown and undocumented. The credit ratings assigned to clients are from an external agency, with no information provided on what the ratings mean or how they were calculated. Client locales were labelled as rural, urban, town, or suburban according to census definitions, based on the five-digit ZIP code in the raw demographic data.
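The join-and-aggregate step described above can be sketched roughly as follows. This is a minimal illustration only, not the data engineers' actual pipeline, which is not documented here: the function `build_export`, the field names (`client_id`, `minutes_used`, `bill_amount`), and the sample rows are all hypothetical, and it assumes one usage row per client per billing month.

```python
from collections import defaultdict
from statistics import mean


def build_export(demographics, usage):
    """Join per-client demographics with aggregated usage (illustrative only)."""
    # Group the usage rows by client so each client is aggregated once.
    by_client = defaultdict(list)
    for row in usage:
        by_client[row["client_id"]].append(row)

    export = []
    for client in demographics:
        # Clients with no usage rows are kept, like a left join.
        rows = by_client.get(client["client_id"], [])
        minutes = [r["minutes_used"] for r in rows]
        bills = [r["bill_amount"] for r in rows]
        export.append({
            **client,
            "months_of_service": len(rows),
            "total_minutes": sum(minutes),
            "avg_minutes": mean(minutes) if minutes else None,
            "avg_bill": mean(bills) if bills else None,
        })
    return export


# Hypothetical sample data: one demographic row per client,
# one usage row per client per billing month.
demographics = [
    {"client_id": 1, "age": 34},
    {"client_id": 2, "age": 52},
    {"client_id": 3, "age": 27},
]
usage = [
    {"client_id": 1, "minutes_used": 100, "bill_amount": 30.0},
    {"client_id": 1, "minutes_used": 140, "bill_amount": 34.0},
    {"client_id": 2, "minutes_used": 300, "bill_amount": 60.0},
    {"client_id": 2, "minutes_used": 280, "bill_amount": 58.0},
    {"client_id": 2, "minutes_used": 320, "bill_amount": 62.0},
]

historic_export = build_export(demographics, usage)
```

Keeping every demographic row, even when no usage rows match, is a design choice in this sketch; whether the real export kept or dropped such clients is not stated in this datasheet.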

### _Uses: The tasks the dataset is intended to be used for, how it has already been used, and limitations of use._

This dataset is intended for use by the analytics team to look for patterns in active vs churned customers. It has been used to create basic charts for business users on the marketing team. At this time, there are no known limitations on its use.

### _Distribution: How the dataset will be distributed and to whom, and any restrictions on distribution._

The dataset is available in the analytics schema of the datalake, and is accessible to the marketing analytics team. There are no known restrictions on the dataset at this time. 

### _Maintenance: Who will maintain the dataset and how, and if and how others will be able to build on it._

This raw dataset was built once for the purpose of static analytics, and the data engineers have not been assigned maintenance duties on it. Other teams may build on it, provided they understand that it is not regularly updated.