Create a PDF with Python

So, you’re doing some data analysis in Python, and you want to generate a PDF report. If you Google around, you’ll find a bunch of jury-rigged ways of doing it, most of which involve generating HTML. Working with HTML and CSS is a pain – wouldn’t it be easier if we could just design our PDFs with a drag-and-drop designer?

We’re going to do just that, using Anvil. Anvil is a platform for building web UIs, but today we’ll just be using it to make PDFs.

In this example, we’re going to take two CSVs, representing sales data from this month and last month, and create a PDF report that looks like this:

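We’ll build up to this in a few stages: preparing our data, designing our first PDF, rendering it from Python, displaying data on our PDF, and plotting graphs.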

Follow along to build the app yourself, or you can open the full example app and script here:

And you can download our sample CSV files here:


Preparing our data

Let’s say we have two CSVs, describing our company’s revenue, by source, for this month and last month. They look like this:

category,revenue
Widgets,1298471
Services,265402.12
Licensing,28000

We can use pandas to load and join our two CSVs (you’ll need this_month.csv and last_month.csv saved in your working directory):

import pandas as pd

this_month = pd.read_csv("this_month.csv")
last_month = pd.read_csv("last_month.csv")

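# Look up each category's row in last_month and add its revenue
# as a new column, revenue_last_month
combined = this_month.join(last_month.set_index("category"),
                           on="category", rsuffix="_last_month")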

print(combined)

That will produce a data frame like this:

    category     revenue  revenue_last_month
0    Widgets  1298471.00           982305.00
1   Services   265402.12           203631.25
2  Licensing    28000.00            39000.00
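If you don’t have the sample files to hand, here is a minimal sketch of the same join using inline data. The last-month figures are reconstructed from the output above rather than taken from the real CSV, and io.StringIO simply stands in for the files on disk:

```python
import io

import pandas as pd

# Inline copies of the two CSVs. The last-month figures are an assumption,
# reconstructed from the data frame printed in this article.
THIS_MONTH_CSV = """category,revenue
Widgets,1298471
Services,265402.12
Licensing,28000
"""

LAST_MONTH_CSV = """category,revenue
Widgets,982305
Services,203631.25
Licensing,39000
"""

this_month = pd.read_csv(io.StringIO(THIS_MONTH_CSV))
last_month = pd.read_csv(io.StringIO(LAST_MONTH_CSV))

# Same join as before: match rows on "category" and add a suffix to the
# clashing "revenue" column that comes from last month
combined = this_month.join(last_month.set_index("category"),
                           on="category", rsuffix="_last_month")

print(combined)
```

Note that this is a left join, so a category that appears only in this month’s data would end up with a missing value (NaN) in the revenue_last_month column.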

Designing our first PDF

To design our PDF, we first open the Anvil cloud editor and create a new app, choosing the ‘Material Design’ theme. We’ll want to create a “Form” – that’s a piece of web UI – which we will then turn into a PDF.

For our PDF, we don’t want any headers, or navigation, so we’ll create a new “Blank Panel” form, and call it ReportForm:

We can use the drag-and-drop editor to put a title on our page. We’ll use a Label component, then adjust its properties to display a centred title with large text:

Rendering it from Python

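Before we go any further, let’s generate that PDF from Python. We’ll use the Uplink to connect our local code to this Anvil app. First we enable the Uplink for our app in the Anvil editor, which gives us an Uplink key and some connection code.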

Then we install the Uplink library:

pip install anvil-uplink

And then we paste that connection code into our script, and add code to create a PDF file:

import anvil.server
import anvil.pdf
import anvil.media

# Connect this script to our Anvil app
anvil.server.connect("[YOUR APP'S UPLINK KEY HERE]")

# Render ReportForm as a PDF and save it to a local file
pdf = anvil.pdf.render_form("ReportForm")
anvil.media.write_to_file(pdf, "report.pdf")

Now we run this code. Anvil will produce a PDF, containing just our title.

You’ve just created a PDF and written it to your local filesystem!
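If you’d rather not paste the Uplink key straight into your script, one option is to read it from an environment variable. This is only a sketch: the variable name ANVIL_UPLINK_KEY and both helper functions are our own inventions for illustration, not something Anvil defines. The Anvil calls themselves are the same three used in the script above:

```python
import os


def get_uplink_key(env_var="ANVIL_UPLINK_KEY"):
    """Read the Uplink key from an environment variable.

    The variable name is our own choice (an assumption), not an Anvil setting.
    """
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


def render_report(output_path="report.pdf"):
    """Hypothetical wrapper around the script from this article."""
    # Imported here so get_uplink_key() can be used without anvil-uplink
    import anvil.server
    import anvil.pdf
    import anvil.media

    anvil.server.connect(get_uplink_key())
    pdf = anvil.pdf.render_form("ReportForm")
    anvil.media.write_to_file(pdf, output_path)
```

Once the variable is set in your shell, calling render_report() should write report.pdf just as before.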

Displaying data on our PDF

Passing data into Anvil

We want to display more than just a title: we want to show our data! The first step is to pass our data into our Form’s code, which runs inside Anvil.

We can’t pass pandas data frames directly to Anvil, so we turn our data into a list of dicts first:

records = combined.to_dict('records')
print(records)

Here’s what that looks like: a list, with a dictionary for each row of the data frame:

[{'category': 'Widgets', 'revenue_last_month': 982305.0, 'revenue': 1298471.0}, {'category': 'Services', 'revenue_last_month': 203631.25, 'revenue': 265402.12}, {'category': 'Licensing', 'revenue_last_month': 39000.0, 'revenue': 28000.0}]
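Before sending the records, it can be worth checking that they really are plain Python values. Here is a minimal sketch of one way to do that, using a JSON round trip as a stand-in test. The helper check_plain_records is hypothetical, and JSON is only a convenient proxy here, not a description of how Anvil actually transfers data:

```python
import json


def check_plain_records(records):
    """Return a round-tripped copy, or raise if records are not plain values.

    Hypothetical helper for illustration; not part of Anvil or pandas.
    """
    # json.dumps raises TypeError for objects it cannot encode, such as a
    # data frame, and allow_nan=False makes it raise ValueError on NaN,
    # which is what a join leaves behind when a category has no match
    encoded = json.dumps(records, allow_nan=False)
    return json.loads(encoded)


# The records shown in this article
records = [
    {'category': 'Widgets', 'revenue': 1298471.0, 'revenue_last_month': 982305.0},
    {'category': 'Services', 'revenue': 265402.12, 'revenue_last_month': 203631.25},
    {'category': 'Licensing', 'revenue': 28000.0, 'revenue_last_month': 39000.0},
]

checked = check_plain_records(records)
print(checked == records)
```

If a category were missing from one of the CSVs, this check would fail with a ValueError, which is a useful prompt to decide how the report should treat that row.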

We can pass this data as an extra argument to render_form(), which will in turn be passed into the __init__ function of our Anvil form.

We edit our script to say:

pdf = anvil.pdf.render_form("ReportForm", records)

Displaying data on our form

Now, we’ll make ReportForm display this data. The first step is to go into the Code view for the form and edit the definition of the __init__ function so that it accepts our data. It now looks like this:

  def __init__(self, records, **properties):

    # ... rest of code as before ...

Displaying a table

We want to display a table, with each category of revenue and its growth/decline since last month. So we drag a Data Grid onto our page, and give it three columns: Category, Revenue, and Change, displaying the dictionary keys category, revenue and change.

Inside this DataGrid is a RepeatingPanel, and we can display rows in our table by setting its items property. Edit our ReportForm’s __init__ method as follows:

  def __init__(self, records, **properties):
    # Set Form properties and Data Bindings.
    self.init_components(**properties)

    # Any code you write here will run when the form opens.
    self.repeating_panel_1.items = [
      {'category': r['category'],
       'revenue': f"${r['revenue']:,.2f}",
       'change': f"{100.0*r['revenue']/r['revenue_last_month'] - 100:+.0f}%"
      }
      for r in records
    ]
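The two format strings in that code do most of the work, so here is a small standalone sketch you can run outside Anvil to see what they produce. The helper names format_revenue and format_change are ours, and the guard against a zero value for last month is an extra assumption that the form code above does not include:

```python
def format_revenue(amount):
    """Format a number as dollars with thousands separators and 2 decimals."""
    return f"${amount:,.2f}"


def format_change(current, previous):
    """Format month-on-month change as a signed whole-number percentage."""
    if previous == 0:
        # Assumption: show a placeholder rather than dividing by zero
        return "n/a"
    return f"{100.0 * current / previous - 100:+.0f}%"


# The records shown in this article
records = [
    {'category': 'Widgets', 'revenue': 1298471.0, 'revenue_last_month': 982305.0},
    {'category': 'Services', 'revenue': 265402.12, 'revenue_last_month': 203631.25},
    {'category': 'Licensing', 'revenue': 28000.0, 'revenue_last_month': 39000.0},
]

rows = [
    {'category': r['category'],
     'revenue': format_revenue(r['revenue']),
     'change': format_change(r['revenue'], r['revenue_last_month'])}
    for r in records
]

for row in rows:
    print(row)
```

In the format specs, the comma adds thousands separators, .2f fixes two decimal places, and the leading + forces a sign on positive changes as well as negative ones.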

Now, when you run your script, it will generate a PDF with a table:

Displaying a total

We want to display a “total” row for this table. So we add a new DataRowPanel to our grid, underneath the automatic rows from the RepeatingPanel, and display our totals there.

We can do this entirely in code, by adding this to our __init__ function:

    # Compute total revenue for this month and last month
    total_rev = sum(r['revenue'] for r in records)
    last_month_rev = sum(r['revenue_last_month'] for r in records)
  
    # Display this data as a new row in the data grid
    total_row = DataRowPanel(bold=True, background="#eee")
    total_row.item = {
      'category': 'Total:',
      'revenue': f"${total_rev:,.2f}",
      'change': f"{100.0*total_rev/last_month_rev - 100:+.0f}%"
    }
    self.data_grid_1.add_component(total_row)
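Notice that the total’s change is computed from the two summed totals, not by averaging the per-category percentages. The following sketch, which runs outside Anvil on the records from this article, shows that the two approaches give quite different answers. The naive_average variable is there purely for comparison:

```python
# The records shown in this article
records = [
    {'category': 'Widgets', 'revenue': 1298471.0, 'revenue_last_month': 982305.0},
    {'category': 'Services', 'revenue': 265402.12, 'revenue_last_month': 203631.25},
    {'category': 'Licensing', 'revenue': 28000.0, 'revenue_last_month': 39000.0},
]

# Sum each month first, then compare the totals (as the form code does)
total_rev = sum(r['revenue'] for r in records)
last_month_rev = sum(r['revenue_last_month'] for r in records)

total_item = {
    'category': 'Total:',
    'revenue': f"${total_rev:,.2f}",
    'change': f"{100.0 * total_rev / last_month_rev - 100:+.0f}%",
}

# For comparison only: the unweighted average of the per-category changes,
# which lets a small category count as much as a large one
per_category = [100.0 * r['revenue'] / r['revenue_last_month'] - 100
                for r in records]
naive_average = sum(per_category) / len(per_category)

print(total_item)
print(f"Naive average of category changes: {naive_average:+.0f}%")
```

Comparing the totals weights each category by its revenue, which is usually what a reader of a sales report expects from a “Total” row.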

Voilà! Our table is looking spiffy. If we run our script, we get this PDF:

Plotting graphs

The final piece is to display a graph! We’ll be summarising our data with two graphs: a pie chart displaying the proportion of revenue from each source, and a bar chart comparing each category’s performance with last month.

First, we go back to Design view on our form, and add two Plot components, next to each other:

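We then use the popular Plotly API to plot our data on these components. The code refers to Plotly’s graph_objects module as go, so the form’s code needs import plotly.graph_objects as go at the top if it isn’t there already. We add the following code to our __init__ function: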

    # Build a pie chart breaking down revenue sources
    self.plot_1.layout.title = "Revenue Sources"
    self.plot_1.data = go.Pie(labels=[r['category'] for r in records],
                              values=[r['revenue'] for r in records])

    # Build a bar chart with last month and this month's revenue
    # for each category
    self.plot_2.layout.title = "Month-On-Month"
    self.plot_2.data = [go.Bar(x=[r['category'] for r in records],
                               y=[r['revenue_last_month'] for r in records],
                               name="Last month"),
                        go.Bar(x=[r['category'] for r in records],
                               y=[r['revenue'] for r in records],
                               name="This month")]

Now, when we run our script, it creates a complete sales report in PDF form!
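If you want to sanity-check the pie chart against the numbers, you can work out each category’s share of revenue by hand. Plotly computes these proportions itself from the raw values we pass in, so this sketch is only a cross-check that runs outside Anvil, not something the form needs:

```python
# The records shown in this article
records = [
    {'category': 'Widgets', 'revenue': 1298471.0, 'revenue_last_month': 982305.0},
    {'category': 'Services', 'revenue': 265402.12, 'revenue_last_month': 203631.25},
    {'category': 'Licensing', 'revenue': 28000.0, 'revenue_last_month': 39000.0},
]

# Each category's share of this month's total revenue
total = sum(r['revenue'] for r in records)
shares = {r['category']: r['revenue'] / total for r in records}

for category, share in shares.items():
    # .1% multiplies by 100 and shows one decimal place
    print(f"{category}: {share:.1%}")
```

The printed percentages should match the slice labels on the pie chart to within rounding.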

That’s all, folks!

You now have a Python script that will generate a beautiful PDF report – and you know how to design more!

Again, you can open the full example app and script here:

And you can download our sample CSV files here: