How to make plots using Pandas
We’re going to look at an example of making plots using Pandas, the hugely popular Python data manipulation library. Pandas is a standard tool in Python for scalably transforming data, and it has also become a popular way to import and export data in CSV and Excel formats.
On top of all that, it also contains a very nice plotting API. This is extremely convenient: you already have your data in a Pandas DataFrame, so why not use the same library to plot it?
This is part of a comparison of many Python plotting libraries. We’re making the same multi-bar plot in each one so we can compare how they work - here’s the full rundown of libraries. The data is UK election results from 1966 to 2020:
Data that plots itself
We’ve seen some impressively simple APIs in this series of articles, but Pandas has to take the crown.
To draw a bar plot with one bar per party, grouped by year along the x-axis, I simply need to do this:
    import matplotlib.pyplot as plt
    from votes import wide as df

    ax = df.plot.bar(x='year')
    plt.show()
Four lines - definitely the most terse multi-bar plot we’ve created in this series.
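The votes module isn't reproduced in this article, so here is a self-contained sketch of the same plot. It assumes a three-row stand-in DataFrame built inline (the values are the first three rows of the election table shown further down), and it uses the non-interactive Agg backend and an in-memory PNG so that it runs without a display. Those choices are mine, not part of the original example.

```python
import io

import matplotlib
matplotlib.use("Agg")  # non-interactive backend: an assumption so this runs headless
import pandas as pd

# Stand-in for `from votes import wide as df` (first three rows of the data only).
df = pd.DataFrame({
    "year": ["1966", "1970", "Feb 1974"],
    "conservative": [253, 330, 297],
    "labour": [364, 287, 301],
    "liberal": [12, 6, 14],
    "others": [1, 7, 18],
})

# Same call as in the article: one group per year, one bar per party column.
ax = df.plot.bar(x="year")

# The legend entries come straight from the column names.
legend_labels = [text.get_text() for text in ax.get_legend().get_texts()]

# Render to an in-memory PNG instead of opening a window with plt.show().
buf = io.BytesIO()
ax.get_figure().savefig(buf, format="png")
```

With an interactive backend you would drop the matplotlib.use("Agg") line and call plt.show() as in the four-line version.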
I’m using my data in wide form, meaning there’s one column per political party:
            year  conservative  labour  liberal  others
    0       1966           253     364       12       1
    1       1970           330     287        6       7
    2   Feb 1974           297     301       14      18
    ..       ...           ...     ...      ...     ...
    12      2015           330     232        8      80
    13      2017           317     262       12      59
    14      2019           365     202       11      72
This means Pandas automatically knows how I want my bars grouped - and if I wanted them grouped differently, Pandas makes it easy to restructure my DataFrame.
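As a rough illustration of that restructuring, here are two common reshapes, sketched on a three-row stand-in for the data (the names by_party and long_form are mine, chosen for illustration). Transposing gives one row per party, and melting gives long form with one row per year and party.

```python
import pandas as pd

# Three-row stand-in for the wide-form election data.
df = pd.DataFrame({
    "year": ["1966", "1970", "Feb 1974"],
    "conservative": [253, 330, 297],
    "labour": [364, 287, 301],
    "liberal": [12, 6, 14],
    "others": [1, 7, 18],
})

# Transpose: each row becomes a party and each column an election.
by_party = df.set_index("year").T

# Melt into long form: one row per (year, party) pair.
long_form = df.melt(id_vars="year", var_name="party", value_name="seats")
```

Calling by_party.plot.bar() on the transposed frame should then draw one group per party with one bar per election, since Pandas groups bars by row and colours them by column.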
As with Seaborn, Pandas’s plotting feature is an abstraction on top of Matplotlib, which is why you call Matplotlib’s plt.show() function to actually produce the plot.
Here’s what it looks like:
Looks great – especially considering how easy it was! Let’s style it to look just like the Matplotlib example.
We can easily tweak the styling by accessing the underlying Matplotlib methods.
Firstly, we can colour our bars by passing a Matplotlib colormap into the plotting function:
    from matplotlib.colors import ListedColormap

    cmap = ListedColormap(['#0343df', '#e50000', '#ffff14', '#929591'])
    ax = df.plot.bar(x='year', colormap=cmap)
And we can set up axis labels and titles using the return value of the plotting function - it’s simply a Matplotlib Axes object:
    ax.set_xlabel(None)
    ax.set_ylabel('Seats')
    ax.set_title('UK election results')
Here’s what it looks like now:
That’s pretty much identical to the Matplotlib version shown above, but in 8 lines of code rather than 16! My inner code golfer is very pleased.
You can copy this example as an Anvil app here:
Abstractions must be escapable
As with Seaborn, the ability to drop down and access Matplotlib APIs to do the detailed tweaking was really helpful. This is a great example of giving an abstraction escape hatches, to make it powerful as well as simple. (This is something we take great care to do with Anvil.)
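To make the escape hatch concrete, here is a small sketch that checks the object Pandas returns really is a Matplotlib Axes and then applies a few ordinary Axes methods to it. The stand-in data, the Agg backend and the particular tweaks (legend title, horizontal grid lines, unrotated tick labels) are my own assumptions for illustration, not part of the styled example above.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend: an assumption so this runs headless
import pandas as pd
from matplotlib.axes import Axes

# Three-row stand-in for the wide-form election data.
df = pd.DataFrame({
    "year": ["1966", "1970", "Feb 1974"],
    "conservative": [253, 330, 297],
    "labour": [364, 287, 301],
    "liberal": [12, 6, 14],
    "others": [1, 7, 18],
})

ax = df.plot.bar(x="year")

# Pandas hands back a plain Matplotlib Axes, so every Axes method is available.
is_axes = isinstance(ax, Axes)

ax.legend(title="Party")                   # redraw the legend with a title
ax.grid(axis="y", alpha=0.3)               # faint horizontal grid lines
ax.tick_params(axis="x", labelrotation=0)  # ask for horizontal year labels
```

None of these calls are Pandas features; they all come from Matplotlib, which is exactly what makes the thin abstraction easy to escape.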
More about plotting in Python
We’re looking at a range of Python plotting libraries and comparing their characteristics by plotting our UK election results multi-bar plot. Check out the overview to see more: