Day 5 of the Anvil Advent Calendar

Build a web app every day until Christmas, with nothing but Python!

Generating Seasonal Cheer with AI

Coming up with messages for your Christmas cards can be a bore. Couldn’t we auto-generate those messages?

One of 2019’s biggest pieces of AI news was GPT-2, a text-generating neural network from OpenAI. It was a massive scientific leap forwards, and yet remarkably easy to have fun with. Here’s the app:

This heartfelt greeting was written by a machine.

That’s the live app - refresh the page to get another greeting.


How it’s done

The GPT-2 network has already been trained on a huge corpus of text drawn from Reddit, which means it already “understands” lots of rules about language, grammar and punctuation. But it turns out you can “specialise” it by training it just a little bit more with a particular “corpus” of text.

This is made incredibly simple by Max Woolf’s gpt-2-simple package. Here’s what I did:

1. Gather input data

I started out by gathering a list of the schmaltziest, cheesiest Christmas card messages I could find online. I pasted them all into a file called greetings.txt, separated by blank lines.
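For illustration, greetings.txt looks something like this (these example messages are placeholders, not from the original corpus):

```
Wishing you a season full of light and laughter.

May your days be merry and bright.

Warmest thoughts and best wishes for a wonderful holiday.
```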

2. Massage

GPT-2 doesn’t understand things like beginnings or ends of documents. But we want it to learn the length and structure of Christmas card messages, so we insert fake “start” and “end” tokens, letting the model learn the “grammar” of these new “words”. We can then use the start token as the seed when generating new text, and truncate the output at the first occurrence of the end token.

Here’s the code I used to do that:

# Read the raw corpus: one greeting per paragraph, separated by blank lines
with open('greetings.txt', 'r') as f:
    text = f.read()

# Wrap each greeting in start/end tokens so the model can learn message boundaries
messages = text.split("\n\n")
messages = ['<|startoftext|> ' + t + " <|endoftext|>" for t in messages]

with open('greetings-massaged.txt', 'w') as f:
    f.write("\n".join(messages))
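As a quick sanity check, here’s what that massaging does to a toy two-message corpus (the greetings here are made-up examples):

```python
# Toy corpus: two greetings separated by a blank line
text = "Merry Christmas to you and yours!\n\nWishing you joy this season."

messages = text.split("\n\n")
messages = ['<|startoftext|> ' + t + " <|endoftext|>" for t in messages]

print(messages[0])
# → <|startoftext|> Merry Christmas to you and yours! <|endoftext|>
```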

3. Train

The corpus I could scrounge together is pretty small, so we only need an extremely small number of training steps (just 40) – any more than that, and the model starts over-fitting and regurgitating messages word-for-word from our input set. (It does a bit of this anyway!)

The following parameters worked pretty well for me. I ran them in a Google Colab notebook:

import gpt_2_simple as gpt2

gpt2.download_gpt2(model_name="124M")

sess = gpt2.start_tf_sess()

gpt2.finetune(sess,
              dataset="greetings-massaged.txt",
              model_name='124M',
              steps=40,
              restore_from='fresh',
              run_name='christmas-cards',
              print_every=10,
              sample_every=200,
              save_every=500
              )

4. Loading a saved model

At this point, the model has saved its state in the checkpoint/christmas-cards/ directory. We can save that directory and just load it later with the following code:

# Run this in a fresh environment where we haven't run `gpt2.finetune()`:
sess = gpt2.start_tf_sess()
gpt2.load_gpt2(sess, run_name='christmas-cards')

5. Generate

And now, we can invoke this model to generate a list of Christmas card messages!

import anvil.server
import gpt_2_simple as gpt2

@anvil.server.callable
def get_christmas_card_messages(n_samples=1):
    sess = gpt2.start_tf_sess()
    gpt2.load_gpt2(sess, run_name='christmas-cards')
    # return_as_list=True gives us a list of generated strings,
    # each seeded with the start token and cut off at the end token
    text = gpt2.generate(sess,
                         run_name="christmas-cards",
                         prefix="<|startoftext|>",
                         truncate="<|endoftext|>",
                         length=80,
                         return_as_list=True,
                         nsamples=n_samples
                         )
    return text

6. Publish as a web app

Anvil makes it easy to expose a Jupyter notebook as a web app. So we can use a drag-and-drop designer to put a front end on our code:
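For context, the notebook talks to the app via Anvil’s Uplink: a few lines register the notebook’s `@anvil.server.callable` functions with the app’s server. A minimal sketch (the key string is a placeholder – the real one comes from your app’s Uplink settings):

```python
import anvil.server

# Placeholder key: copy the real one from your app's Uplink settings
anvil.server.connect("YOUR-UPLINK-KEY")

# Any @anvil.server.callable functions defined in this session
# (like get_christmas_card_messages above) can now be called
# from the app with anvil.server.call(...)
anvil.server.wait_forever()  # keep the connection open
```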


See the source code

Here is the text predictor, as a Google Colab notebook you can just run:

Here is the source code to our Anvil app, which uses this notebook to generate and display seasonal greetings:

