Give a front-end to your data insights
Pandas is one of the most popular data science libraries for analysing and manipulating data. But what if you want to take the data that you have manipulated with Pandas and present it to other people in a visual manner? With Anvil, you can take a local script or a Jupyter notebook and turn it into an interactive dashboard — all for free and entirely in Python.
In this tutorial, we’re going to build a dashboard app using the Netflix Movies and TV Shows dataset by Shivam Bansal. You can also download the data directly in the link below:
We’ll use Pandas to manipulate our data in a local script or a notebook, Plotly to visualize it, and the Anvil Uplink to connect it to an app and turn it into an interactive dashboard.
These are the steps that we will cover:
- We’ll start by creating our Anvil app.
- Then, we’ll design our dashboard using Anvil components.
- After that, we’ll clean and prepare the data for plotting.
- We will then connect our script or notebook to our app via the Uplink.
- Using Plotly, we will build our plots.
- Finally, we will add some finishing touches to our dashboard.
Let’s get started!
Step 1: Create your Anvil app
Go to the Anvil Editor, click on “Blank App”, and choose “Rally”. We just want an empty app, so we’ll delete the current Form1 and then add a new Blank Panel form:
Now let’s rename our app. Click on the app’s name, on the top left corner of the screen. This takes us to the General Settings page, where we can modify our app’s name.
Step 2: Design your dashboard
We will build our user interface by adding drag-and-drop components from the Toolbox. First, add a ColumnPanel , where we will fit all our other components. Then, let’s add three Plots into the ColumnPanel - we can adjust their sizes by clicking and dragging their edges (use Ctrl+Drag for finer adjustments).
We’ll also drop a Label into our app and place it above the plots. Select your Label and, in the Properties Panel on the right, change the text to “Netflix Content Dashboard”. Set the
font_size to 32 and, if you want, you can customize the font.
Finally, we’ll set the
spacing_above property of all the components to
large (you can find this under the Layout section of the Properties Panel).
Step 3: Get the data into shape
It’s time to clean our data and get it ready to display. Create a file in your favorite development enviroment – open up a new Jupyter notebook, or create a new script in your IDE of choice. All the code included in this tutorial will work with both options.
First, import Pandas at the top. Then, write the
csv_to_df function, which will fetch the data in from our CSV file and convert it into a Pandas DataFrame.
import pandas def csv_to_df(f): return pd.read_csv(f, index_col=0)
Before we can plot our data, we need to clean it and prepare it for plotting. We’ll do all the necessary transformations to our data inside a function that we’ll call
prepare_netflix_data. In it, we’ll first call our
csv_to_df function to fetch our data.
def prepare_netflix_data(): netflix_df = csv_to_df('netflix_titles.csv')
Our figures will only use data from the
date_added columns, so we’ll slice our DataFrame using the
netflix_df = netflix_df.loc[:,['type', 'country', 'date_added']]
There are also some missing values that we’ll have to deal with, so that we don’t include missing data in our plots. Pandas has a few different ways to handle missing values, but in this case we’ll just use
dropna to get rid of them:
netflix_df = netflix_df.dropna(subset=['country'])
country column contains the production country of each movie or TV Show in Netflix. However, some of them have more than one country listed. For simplicity, we’re going to assume the first one mentioned is the most important one, and we’ll ignore the others. We’ll also create a separate DataFrame that only contains the value counts for this column, sorted by country – this will be useful later on when we input the data into our plots.
netflix_df['country'] = [countries for countries in netflix_df['country'].str.split(',')] country_counts = pd.DataFrame( netflix_df['country'].value_counts().rename_axis('countries').reset_index(name='counts')).sort_values(by=['countries'])
date_added variable currently contains only strings, which is not a very easy format to work with when we want to plot something in chronological order. Because of this, we’ll convert it into datetime format using
to_datetime, which will allow us to easily order the data by year later on.
netflix_df['date_added'] = pd.to_datetime(netflix_df['date_added'])
We’ll return our transformed dataframe and the
country_counts variable that we just created. This is how the full function should look by the end:
def prepare_netflix_data(): netflix_df = csv_to_df('netflix') netflix_df = netflix_df.loc[:,['type', 'country', 'date_added']] netflix_df = netflix_df.dropna() netflix_df['country'] = [countries for countries in netflix_df['country'].str.split(',')] country_counts = pd.DataFrame(netflix_df['country'].value_counts().rename_axis('countries').reset_index(name='counts')).sort_values(by=['countries']) netflix_df['date_added'] = pd.to_datetime(netflix_df['date_added']) return netflix_df, country_counts
Finally, we can print the output of
country_counts, to make sure everything looks right.
This is what the output should look like:
Now that we’re happy with how our data looks, we can build our plots but, before that, we’ll have to connect our local script to our Anvil app.
Step 4: Making the connection
Right now, our code is running on our computer or in our Jupyter notebook, but we will want the plots we build to display on our app and run on the web browser. To be able to do this, we will have to connect our local code to our Anvil app and make the functions we’ve written in our local code callable from the app. In this way, the local code acts as the server, while the app runs all the client-side code in the web browser.
To connect our local script to our app, let’s first enable the Anvil Uplink. In the Editor, click the
+ button in the Sidebar Menu, select Uplink and click on
+ Enable server Uplink.
After this, we need to
pip install the Uplink library on our local machine (use
!pip install instead, if working on a Jupyter notebook):
pip install anvil-uplink
Now we’ll add the following lines at the top of our local script (you can copy them from your Uplink window dialog):
import anvil.server anvil.server.connect("<your-uplink-key-here>")
If you’re using a local script, we will also add
anvil.server.wait_forever() at the very bottom of our script, which will keep our script running to allow our app to call functions in it. All the rest of the code we write going forward will go above this line. This is not necessary for Jupyter notebooks.
anvil.server.callablefunctions will only remain accessible while your notebook kernel is active. Many hosted notebook environments, such as Google Colab, will shutdown kernels that have been idle for some time. You may be able to increase this timeout by having a cell continually running -
while True: sleep(1)will do, and this is exactly what
anvil.server.wait_forever()does. Even then, hosted kernels are unlikely to run forever, and for production deployments you should see our Deploy a Google Colab Notebook with Docker guide.
With this, our local script is connected to our app. This means it can now do anything we can do in an Anvil Server Module.
Step 5: Build the plots
We now need to take our processed data and turn it into the plots that we will want to display. Our data is in a
pandas DataFrame right now, which means that we require
pandas to read it. Because of this, we can’t send the data directly to our app because
pandas is not available to the web browser, or client (learn more about why certain packages are not available in the client in our Client vs Server code in Anvil explainer).
However, we can send already built plots instead. Because of this, we’ll first build our figures in our script or notebook, and then we’ll return the plots to our app to display them.
Our dashboard will contain three figures:
- A map showing the number of films per production country
- A content type pie chart
- A line chart of content added through time
Now, we need to create them. We’ll first import
plotly.graph_objects, then write the
create_plots function. Before anything else, we’ll call our
prepare_netflix_data function to fetch our transformed data. After that, we’ll first create and return our map plot, using Plotly’s Scattergeo. Remember to add this code above the
anvil.server.wait_forever() call if working on a local script.
import plotly.graph_objects as go @anvil.server.callable def create_plots(): netflix_df, country_counts = prepare_netflix_data() fig1 = go.Figure( go.Scattergeo( locations=sorted(netflix_df['country'].unique().tolist()), locationmode='country names', text = country_counts['counts'], marker= dict(size= country_counts['counts'], sizemode = 'area'))) return fig1
We made this function available from the client side of our app by decorating it as @anvil.server.callable, which means we can access our function’s output from our Anvil app in order to display our figures. We’ll do this by using
anvil.server.call inside the
__init__ method in our Form1 code to call our function. Then, we’ll asign the output to our plot’s
class Form1(Form1Template): def **init**(self, **properties): # Set Form properties and Data Bindings. self.init_components(**properties) fig1 = anvil.server.call('create_plots') self.plot_1.data = fig1
We can check that everything is working like we want it to by running our local script or notebook and our app. This is how it looks so far:
Let’s now do the same with our other two plots. We’ll add a pie chart and a line chart to our
@anvil.server.callable def create_plots(): netflix_df, country_counts = prepare_netflix_data() fig1 = go.Scattergeo( locations=sorted(netflix_df['country'].unique().tolist()), locationmode='country names', text = country_counts['counts'], marker= dict( size= country_counts['counts'], line_width = 0, sizeref = 2, sizemode = 'area', reversescale = True )) fig2 = go.Pie( labels=netflix_df['type'], values=netflix_df['type'].value_counts(), hole=.4, textposition= 'inside', textinfo='percent+label' ) fig3 = go.Scatter( x=netflix_df['date_added'].dt.year.value_counts().sort_index().index, y=netflix_df['date_added'].dt.year.value_counts().sort_index() ) return fig1, fig2, fig3
And we’ll call them from the client in the same way we did our previous plot:
class Form1(Form1Template): def **init**(self, **properties): # Set Form properties and Data Bindings. self.init_components(**properties) fig1, fig2, fig3 = anvil.server.call('create_plots') self.plot_1.data = fig1 self.plot_2.data = fig2 self.plot_3.data = fig3
With that, we have a functional dashboard:
Step 6: Finishing touches
With everything set up, we can move onto customizing our plots. Plotly supports pre-configured templates that describe plot styling, and Anvil provides pre-built templates to match our Rally and M3 themes. We can apply these through the
For this dashboard, I’m going to choose the
rally plot theme, which will match the colour scheme and styling of our app. I will set it as the default plot theme for the app by adding this line of code to the
class Form1(Form1Template): def **init**(self, **properties): # Set Form properties and Data Bindings. self.init_components(**properties) Plot.templates.default = "rally"
This is how the plots look now:
Finally, we’ll change our plots’ margins and font by using their
layout property. We’ll also add titles to each one:
fig1, fig2, fig3 = anvil.server.call('create_plots') self.plot_1.data = fig1 self.plot_1.layout.margin = dict(t=60, b=30, l=0, r=0) self.plot_1.layout.font.family='Raleway' self.plot_1.layout.title = 'Production countries' self.plot_2.data = fig2 self.plot_2.layout.margin = dict(t=60, b=30, l=10, r=10) self.plot_2.layout.font.family='Raleway' self.plot_2.layout.title = 'Content breakdown by type' self.plot_3.data = fig3 self.plot_3.layout.margin = dict(t=60, b=40, l=50, r=50) self.plot_3.layout.font.family='Raleway' self.plot_3.layout.title = 'Content added over time'
After these changes, our dashboard is ready to be shared! All you need to do now is publish your app to make it available to the world.
Want to turn off your computer?
Now that we’ve finished developing our dashboard, you might want to want to stop relying on your local script. You can upload the data set to Anvil, and run our data processing code in an Anvil Server Module, so our dashboard will still work if we stop our script or turn off our computer.
If you want to learn how to do this, check out this tutorial:
Clone the App
If you want to check out the source code for our Netflix dashboard, click on the links below and clone the app, download the local script and access the Jupyter notebook on Colab:
New to Anvil?
If you’re new here, welcome! Anvil is a platform for building full-stack web apps with nothing but Python. No need to wrestle with JS, HTML, CSS, Python, SQL and all their frameworks – just build it all in Python.
Yes – Python that runs in the browser. Python that runs on the server. Python that builds your UI. A drag-and-drop UI editor. We even have a built-in Python database, in case you don’t have your own.
Why not have a play with the app builder? It’s free! Click here to get started: