# Streamlit: your first webapp

Prerequisites

To follow the steps in this tutorial, you will need:

* Dataiku >= 11.0

* A Kubernetes cluster properly linked to the Dataiku DSS instance (for more details, see the reference documentation page on Elastic AI Computation)

* Access to a Project with relevant permissions to create Code Studios

* A functioning Code Studio template for Streamlit webapps (for more details, see the reference documentation page on Code Studio templates)

Streamlit is a popular web application framework designed for building rich interactive applications in Python. In this tutorial, you will develop a Streamlit application in Dataiku DSS using the Code Studio feature and then deploy it as a Dataiku DSS webapp.

## Preparing the source data

This tutorial is inspired by one of Snowflake’s demos and mostly reuses the same code and data.

Start by downloading the source data following this link and make it available in your DSS Project, for example by uploading the *.csv.gz* file to it. Name the resulting Dataiku dataset `uber_raw_data_sep14`.

The dataset contains information about Uber pickup dates, times and geographical coordinates (latitude and longitude). To better understand this data, you will build a few data visualizations in the rest of the tutorial, but first, you need to set up the webapp’s editing environment.

## Setting up the editing environment

In Dataiku DSS, Streamlit webapps are built on top of Code Studios, which are also used to provide advanced development environments. This tutorial assumes that you already have access to a functioning Code Studio template for Streamlit, referred to as `streamlit-template`.

From your Project, create a new Code Studio instance:

* In the “Code” menu, go to “Code Studios”

* Click on “Create your first Code Studio”

* In the “New Code Studio” modal, select `streamlit-template`, name your instance `uber-nyc-dev` and click on “Create”.

The `uber-nyc-dev` Code Studio is now created, but not started yet. To start it, click on *Start Code Studio*. Once the start operation has completed, you should see a “Hello World” message: this is the initial rendering of your Streamlit webapp! For now, it doesn’t do much, but you will add more functionality in the next sections.

## Editing the webapp source code

Your first task is to surface the Uber pickup data in the webapp. Access the IDE environment in the “VS Code” tab of the Code Studio, and go to “Workspace > code_studio-versioned/streamlit/app.py”. This will open the source code file of the webapp.

Add the following code:

§ import streamlit as st
§ import dataiku
§ import pandas as pd

§ DATE_TIME_COL = "date/time"

§ #############
§ # Functions #
§ #############

§ @st.experimental_singleton
§ def load_data(nrows):
§     data = dataiku.Dataset("uber_raw_data_sep14") \
§         .get_dataframe(limit=nrows)
§     lowercase = lambda x: str(x).lower()
§     data.rename(lowercase, axis='columns', inplace=True)
§     data[DATE_TIME_COL] = pd.to_datetime(data[DATE_TIME_COL],
§                                          format="%m/%d/%Y %H:%M:%S")
§     return data

§ ##############
§ # App layout #
§ ##############

§ data = load_data(nrows=10000)
§ st.title('Uber pickups in NYC')
§ if st.checkbox('Show raw data'):
§     st.subheader('Raw data')
§     st.write(data)

The code is split into two parts:

* The *Functions* part contains all the functions that govern the **behavior** of the application.

* The *App layout* part lists the visual components that the application is made of, hence defining its **appearance**.

For this initial step, you created a `load_data()` function that retrieves the source data and turns it into a pandas DataFrame that you’ll be able to manipulate later for more advanced operations. The layout is fairly simple: it displays the content of that DataFrame as a table if the “Show raw data” box is ticked.
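The preprocessing performed by `load_data()` can also be tried outside Dataiku. The following is a minimal sketch rather than the tutorial's own code: it assumes the raw file has `Date/Time`, `Lat`, `Lon` and `Base` columns, it replaces the `dataiku.Dataset` call with a hypothetical three-row inline sample, and `prepare()` is an illustrative helper name.

```python
import io

import pandas as pd

DATE_TIME_COL = "date/time"

# Hypothetical sample mimicking the column layout assumed for the Uber CSV.
SAMPLE_CSV = """Date/Time,Lat,Lon,Base
9/1/2014 0:01:00,40.2201,-74.0021,B02512
9/1/2014 17:45:00,40.7500,-74.0027,B02512
9/2/2014 17:05:00,40.7559,-73.9864,B02512
"""


def prepare(raw: pd.DataFrame) -> pd.DataFrame:
    """Lowercase the column names and parse the timestamp column."""
    data = raw.rename(columns=lambda name: str(name).lower())
    data[DATE_TIME_COL] = pd.to_datetime(
        data[DATE_TIME_COL], format="%m/%d/%Y %H:%M:%S"
    )
    return data


sample = prepare(pd.read_csv(io.StringIO(SAMPLE_CSV)))
print(sample.dtypes)
```

Lowercasing the column names up front means the rest of the code can refer to `date/time`, `lat` and `lon` regardless of how the source file capitalizes them.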

### Breaking down rides by hour of the day

Suppose now that you want to further investigate your data and check if there is a particular time of the day when the number of pickups is higher or lower than usual. To do so, you will create a histogram at the hour level and display it in the application. First, add a few more dependencies to import at the beginning of the file:

§ import altair as alt
§ import numpy as np

Then, add the histogram computation function to the *Functions* part:

§ @st.experimental_memo
§ def histdata(df):
§     hist = np.histogram(df[DATE_TIME_COL].dt.hour, bins=24, range=(0, 24))[0]
§     return pd.DataFrame({"hour": range(24), "pickups": hist})
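As a quick check of the binning used by `histdata()`, here is a minimal sketch that runs the same `np.histogram` call on four hypothetical timestamps, without Streamlit or Dataiku. The sample values and the `hourly` variable name are assumptions made for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical pickup timestamps, used only for illustration.
timestamps = pd.Series(
    pd.to_datetime(
        [
            "2014-09-01 00:01:00",
            "2014-09-01 17:45:00",
            "2014-09-02 17:05:00",
            "2014-09-02 23:59:00",
        ]
    )
)

# 24 bins over [0, 24] give one bin per hour of the day.
counts, edges = np.histogram(timestamps.dt.hour, bins=24, range=(0, 24))
hourly = pd.DataFrame({"hour": range(24), "pickups": counts})

# Show only the hours that have at least one pickup.
print(hourly[hourly["pickups"] > 0])
```

With `bins=24` and `range=(0, 24)`, each bin is one hour wide, so the position of a count in the array matches the hour it describes.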

Finally, incorporate the histogram visualization in the application by adding this to the *App layout* section:

§ # Histogram
§ chart_data = histdata(data)
§ st.write("**Breakdown of rides per hour**")
§ st.altair_chart(
§     alt.Chart(chart_data)
§     .mark_area(
§         interpolate="step-after",
§     )
§     .encode(
§         x=alt.X("hour:Q", scale=alt.Scale(nice=False)),
§         y=alt.Y("pickups:Q"),
§         tooltip=["hour", "pickups"],
§     )
§     .configure_mark(opacity=0.2, color="red"),
§     use_container_width=True,
§ )

If you check back on the *Streamlit* tab, you should now see the histogram rendered.

### Drawing a scatter map with pickup locations

For the final item of your application, you will create a map displaying the pickup locations. To make it more interactive, you will also add a slider to filter the data and keep only a specific hour of the day.

No additional computation is needed here, so you can directly add the following code to the *App layout* part:

§ # Map and slider
§ hour_to_filter = st.slider('', 0, 23, 17)
§ filtered_data = data[data[DATE_TIME_COL].dt.hour == hour_to_filter]
§ st.subheader(f"Map of all pickups at {hour_to_filter}:00")
§ st.map(filtered_data)

If you go back once more to the *Streamlit* tab, you will see the newly added map.
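The hour filter applied before drawing the map is a plain pandas boolean mask. The sketch below reproduces it on a hypothetical three-row DataFrame, assuming the data is already in the shape returned by `load_data()`; `pickups_at_hour()` is an illustrative helper name that does not appear in the tutorial code.

```python
import pandas as pd

DATE_TIME_COL = "date/time"

# Hypothetical pickups, already in the shape assumed for load_data() output.
data = pd.DataFrame(
    {
        DATE_TIME_COL: pd.to_datetime(
            ["2014-09-01 08:15:00", "2014-09-01 17:45:00", "2014-09-02 17:05:00"]
        ),
        "lat": [40.7293, 40.7500, 40.7559],
        "lon": [-73.9920, -74.0027, -73.9864],
    }
)


def pickups_at_hour(df: pd.DataFrame, hour: int) -> pd.DataFrame:
    """Keep only the rows whose timestamp falls within the given hour."""
    return df[df[DATE_TIME_COL].dt.hour == hour]


filtered = pickups_at_hour(data, 17)
print(len(filtered), "pickups at 17:00")
```

`st.map` reads coordinates from columns named `lat` and `lon` (among a few accepted spellings), which is presumably why the lowercased column names can be passed to it without further configuration.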

### Putting it all together

Your webapp is now fully functional! Here is the complete code for your application:

§ import streamlit as st
§ import dataiku
§ import numpy as np
§ import pandas as pd
§ import altair as alt

§ DATE_TIME_COL = "date/time"

§ #############
§ # Functions #
§ #############

§ @st.experimental_singleton
§ def load_data(nrows):
§     data = dataiku.Dataset("uber_raw_data_sep14") \
§         .get_dataframe(limit=nrows)
§     lowercase = lambda x: str(x).lower()
§     data.rename(lowercase, axis='columns', inplace=True)
§     data[DATE_TIME_COL] = pd.to_datetime(data[DATE_TIME_COL],
§                                          format="%m/%d/%Y %H:%M:%S")
§     return data

§ @st.experimental_memo
§ def histdata(df):
§     hist = np.histogram(df[DATE_TIME_COL].dt.hour, bins=24, range=(0, 24))[0]
§     return pd.DataFrame({"hour": range(24), "pickups": hist})

§ ##############
§ # App layout #
§ ##############

§ # Load a sample from the source Dataset
§ data = load_data(nrows=10000)
§ st.title('Uber pickups in NYC')
§ if st.checkbox('Show raw data'):
§     st.subheader('Raw data')
§     st.write(data)

§ # Histogram
§ chart_data = histdata(data)
§ st.write("**Breakdown of rides per hour**")
§ st.altair_chart(
§     alt.Chart(chart_data)
§     .mark_area(
§         interpolate="step-after",
§     )
§     .encode(
§         x=alt.X("hour:Q", scale=alt.Scale(nice=False)),
§         y=alt.Y("pickups:Q"),
§         tooltip=["hour", "pickups"],
§     )
§     .configure_mark(opacity=0.2, color="red"),
§     use_container_width=True,
§ )

§ # Map and slider
§ hour_to_filter = st.slider('', 0, 23, 17)
§ filtered_data = data[data[DATE_TIME_COL].dt.hour == hour_to_filter]
§ st.subheader(f"Map of all pickups at {hour_to_filter}:00")
§ st.map(filtered_data)

## Publishing the webapp

Up to this point, your application still lives inside your development environment, namely your Code Studio instance. The final step of this tutorial is to make it available for other Dataiku DSS users to view.

* In the “Code Studios” list screen, select `uber-nyc-dev`

* In the Action panel on the right, select “Publish” and name your webapp (e.g. `Uber NYC App`) then click on “Create”

Congratulations, your Streamlit application is now deployed as a DSS webapp! You can access it in “Code > Webapps > Uber NYC App”.

Note

Once you have deployed a Code Studio application as a Dataiku DSS webapp, any change you make to the source code in the Code Studio editor is directly reflected in the webapp. That is because the webapp itself constantly points to the latest state of the Code Studio.

## Wrapping up

In this tutorial, you saw how to build a simple Streamlit application and deploy it as a DSS webapp, while leveraging the advanced code editing capabilities offered by the Code Studios feature. If you want to experiment with other frameworks like Dash or Bokeh, check out the available tutorials in the Webapp section.
