Day 5 of the Anvil Advent Calendar
Build a web app every day until Christmas, with nothing but Python!
Generating Seasonal Cheer with AI
Coming up with messages for your Christmas cards can be a bore. Couldn’t we auto-generate those messages?
One of 2019’s biggest pieces of AI news was GPT-2, a text-generating neural network from OpenAI. It was a massive scientific leap forwards, and yet remarkably easy to have fun with. Here’s the app:
That’s the live app - refresh the page to get another greeting.
How it’s done
The GPT-2 network has already been trained on a huge corpus of text drawn from Reddit, which means it already “understands” lots of rules about language, grammar and punctuation. But it turns out you can “specialise” it by training it just a little bit more with a particular “corpus” of text.
This is made incredibly simple by Max Woolf’s gpt-2-simple package. Here’s what I did:
1. Gather input data
I started out by gathering a list of the schmaltziest, cheesiest Christmas card messages I could find online. I pasted them all into a file called greetings.txt, separated by blank lines.
2. Massage the data

GPT-2 doesn’t understand things like beginnings or ends of documents. But we want it to learn things about the length and structure of Christmas card messages, so we insert fake “start” and “end” tokens, letting the model learn the “grammar” of these new “words”. We can then use the start token as the seed when generating new text, and truncate the output at the first occurrence of the end token.
Here’s the code I used to do that:
```python
# Read the raw greetings, then wrap each one in start/end tokens.
with open('greetings.txt', 'r') as f:
    text = f.read()

messages = text.split("\n\n")
messages = ['<|startoftext|> ' + t + " <|endoftext|>" for t in messages]

with open('greetings-massaged.txt', 'w') as f:
    f.write("\n".join(messages))
```
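At generation time, gpt-2-simple’s `truncate` argument performs the cut at the end token for us. As a rough sketch of the idea (the function name here is illustrative, not part of the library):

```python
def truncate_at_end(text, end_token="<|endoftext|>"):
    # Keep only what comes before the first occurrence of the end token.
    return text.split(end_token, 1)[0].strip()

raw = "Wishing you a season full of light and laughter! <|endoftext|> <|startoftext|> More text..."
print(truncate_at_end(raw))
# → Wishing you a season full of light and laughter!
```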
3. Fine-tune the model

The corpus I could scrounge together is pretty small, so we only need a very small number of training steps (40). Any more than that, and the model starts over-fitting, regurgitating messages word-for-word from our input set. (It does a bit of this anyway!)

The following parameters worked pretty well for me. I ran them in a Google Colab notebook:
```python
import gpt_2_simple as gpt2

# Download the smallest (124M-parameter) GPT-2 model.
gpt2.download_gpt2(model_name="124M")

sess = gpt2.start_tf_sess()
gpt2.finetune(sess,
              dataset="greetings-massaged.txt",
              model_name='124M',
              steps=40,
              restore_from='fresh',
              run_name='christmas-cards',
              print_every=10,
              sample_every=200,
              save_every=500)
```
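Since over-fitting shows up as word-for-word regurgitation, a quick way to spot it is to compare generated messages against the training set. This helper is illustrative, not part of gpt-2-simple:

```python
def is_regurgitated(message, training_messages):
    # True if the generated message appears verbatim in the training data.
    return message.strip() in {m.strip() for m in training_messages}

training = ["Merry Christmas to you and yours!", "Wishing you peace and joy."]
print(is_regurgitated("Merry Christmas to you and yours!", training))   # → True
print(is_regurgitated("May your days be merry and bright.", training))  # → False
```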
4. Loading a saved model
At this point, the model has saved its state in the checkpoint/christmas-cards/ directory. We can save that directory and load it again later with the following code:
```python
# Run this in a fresh environment where we haven't run `gpt2.finetune()`:
sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, run_name='christmas-cards')
```
5. Generate new messages

And now, we can invoke this model to generate a list of Christmas card messages!
```python
import anvil.server

@anvil.server.callable
def get_christmas_card_messages(n_samples=1):
    sess = gpt2.start_tf_sess()
    gpt2.load_gpt2(sess, run_name='christmas-cards')
    # Seed generation with the start token and cut each sample at the end token.
    text = gpt2.generate(sess,
                         run_name="christmas-cards",
                         prefix="<|startoftext|>",
                         truncate="<|endoftext|>",
                         length=80,
                         return_as_list=True,
                         nsamples=n_samples)
    return text
```
6. Publish as a web app
Anvil makes it easy to expose a Jupyter notebook as a web app. So we can use a drag-and-drop designer to put a front end on our code:
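For the web app to reach the function running in Colab, the notebook connects to the Anvil app over the Uplink. This is setup boilerplate; the key below is a placeholder you’d copy from your app’s Uplink settings:

```python
import anvil.server

# Connect this notebook to the Anvil app (key is a placeholder).
anvil.server.connect("YOUR_UPLINK_KEY")

# Keep serving @anvil.server.callable functions until interrupted.
anvil.server.wait_forever()
```

Once connected, the app’s client code can call `anvil.server.call('get_christmas_card_messages')` to fetch a fresh greeting.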
See the source code
Here is the text predictor, as a Google Colab notebook you can just run:
Here is the source code to our Anvil app, which uses this notebook to generate and display seasonal greetings:
Give the Gift of Python
Share this post: