Generating spooky stories

I love Halloween! And I love scary stories! In school, we read a lot of Edgar Allan Poe. (I grew up near Baltimore, and boy did my teachers love a local talent.) I liked everything we read, but the one that stood out to me the most was The Pit and the Pendulum, a dark tale about a prisoner being tortured during the Spanish Inquisition.[1]

This year for Halloween, I decided to write a spooky story in the style of Poe. But then I had a better idea. Why write a story when I can train a text generator to write one for me? I fine-tuned OpenAI’s GPT-2 model on the complete works of Edgar Allan Poe using the Hugging Face Transformers library.

I’ve dubbed my fine-tuned model GPoeT-2, and I used it to generate an eerie tale in time for Halloween.

Creating my dataset

Because all of Poe’s works are in the public domain, I could download everything he ever published from Project Gutenberg to create a dataset. I split his works into a training set and an evaluation set, each of which contains stories and poems in their entirety. The training set is larger and is used to actually train the model, while the evaluation set is used to test how well the model performs on unseen data. You can find my Poe dataset on GitHub.
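Here’s a rough sketch of that kind of split, where every work lands whole in exactly one of the two sets. The split_works helper, the 20% evaluation fraction, and the placeholder titles are all assumptions for illustration; they aren’t the exact procedure or proportions I used.

```python
import random


def split_works(works, eval_fraction=0.1, seed=0):
    """Split a list of complete works into (training, evaluation) lists.

    Each work stays whole: it ends up in exactly one of the two lists.
    """
    if not 0 < eval_fraction < 1:
        raise ValueError("eval_fraction must be between 0 and 1")
    shuffled = list(works)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    # Hold out at least one work, but never all of them.
    n_eval = min(len(shuffled) - 1, max(1, round(len(shuffled) * eval_fraction)))
    return shuffled[n_eval:], shuffled[:n_eval]


# Placeholder strings standing in for the full text of each story or poem.
works = [f"work {i}" for i in range(20)]
train, evaluation = split_works(works, eval_fraction=0.2)
print(len(train), len(evaluation))
```

Splitting by whole work rather than by line means the evaluation set never contains passages from a story the model has already seen during training.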

Fine-tuning GPT-2

Powerful Transformer networks like GPT-2 are trained on a huge amount of data (GPT-2 was trained on text from 8 million webpages), and this is part of what makes them so powerful. Training a large network on this much data takes a long time and a lot of computational power, so we don’t want to retrain these networks from scratch. Instead, we can fine-tune them. This means we train the network on a new dataset or for a new task, but we use the pretrained model as our initial setting. Because the model’s parameters already start from useful values, we don’t need to do as much training to fine-tune the model.

To fine-tune GPT-2 using the Hugging Face Transformers library, you first need to have PyTorch or TensorFlow installed (I use PyTorch). Then, you need to install the Transformers library (available on PyPI as transformers).

To fine-tune GPT-2 on my Poe dataset, I used the run_language_modeling.py script from the Transformers GitHub repository and ran the following command in the terminal:

python run_language_modeling.py \
--output_dir='./tuned_model' \
--model_type=gpt2 \
--model_name_or_path=gpt2 \
--do_train \
--train_data_file='./data/poe_train.txt' \
--do_eval \
--eval_data_file='./data/poe_eval.txt' \
--per_device_train_batch_size=2 \
--per_device_eval_batch_size=2 \
--line_by_line \
--evaluate_during_training \
--num_train_epochs=5

line_by_line means that the model treats each line in the dataset as a separate example. I set this flag because of GPU memory constraints, so that fewer words would be loaded into the model at a time. For the same reason, I set the batch size to 2. (See the docs for a complete list of possible arguments.)
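To make that concrete, here’s a toy sketch of what treating each line as its own example looks like. It is not the script’s actual code: whitespace splitting stands in for the real GPT-2 tokenizer, and the line_examples and batches helpers and the block_size value are names I made up for illustration.

```python
def line_examples(text, block_size=32):
    """Turn raw text into one example per non-blank line.

    Whitespace splitting is a stand-in for the real GPT-2 tokenizer.
    """
    examples = []
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines entirely
        tokens = line.split()
        examples.append(tokens[:block_size])  # truncate overly long lines
    return examples


def batches(examples, batch_size=2):
    """Group examples into consecutive batches of at most batch_size."""
    return [examples[i:i + batch_size] for i in range(0, len(examples), batch_size)]


sample = (
    "Once upon a midnight dreary,\n"
    "\n"
    "while I pondered, weak and weary,\n"
    "Over many a quaint and curious volume\n"
)
examples = line_examples(sample)
grouped = batches(examples)
print(len(examples), len(grouped))
```

Short lines and small batches mean only a handful of tokens sit on the GPU at once, which is the memory saving I was after.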

Even though fine-tuning is a lot faster than pre-training from scratch, transformers are hefty networks, so it’s still a computationally expensive process. I fine-tuned GPoeT-2 on an Nvidia GeForce RTX 2060 for 5 epochs with a batch size of 2, and it took about 5 hours.

…And the burning

To test out my model, I used another Hugging Face script, run_generation.py, and ran the following command:

python run_generation.py \
--model_type gpt2 \
--model_name_or_path './tuned_model' \
--length 200

I then fed in the first part of The Fall of the House of Usher when prompted for input text. The result was certainly Edgar Allan Poe-like, but it…well…read for yourself:

During the whole of a dull, dark, and soundless day in the autumn of the year, when the clouds hung oppressively low in the heavens, I was again thrown headlong into a fire. My imagination ran wild with visions of the wildest of the parched earth–of the burning and the red-hot Egypt–of the dreary, the ghastly moon–of the chilly and the mournful, and the unhallowed, and the pungent Egypt–of the burning and the fissure. I saw the burning and the burning, and the burning of the gutter; and I knew that I had all along been consumed. I had spent the day in the wild, and dreary, and pestilential, and ill-remembered apartment, with the burning and the burning, and the burning of the conjunctures of the burning, and the burning of the monastic fire, I was summoned into a single, low, and solitary, and convulsive, and motionless stupor; and amid the shrill and loathsome murmur of the burning and the burning, I became, in a full, distinct, and motionlessness of the terrible, and the filmy, and the silent, and the muttering of the human soul. And, as I listened to the mutterings of the burning, and the murmurs of the burning, and the muttering, and the groans of the burning, and the lamentable, and the lamentable, and the damned, and the muttering, and the groans of the burning, and the lamentable

So GPoeT-2 likes to repeat itself… This isn’t too surprising, given that neural networks are great at picking up patterns and Poe frequently used repetition as a literary device (see The Tell-Tale Heart, The Raven and The Bells for prime examples). But GPoeT-2’s level of repetition was overkill. With a different prompt, I even got it to repeat the same phrase ad infinitum.

The final ingredient to get GPoeT-2 to be perfect was to penalize it for repetition. After some trial and error, I raised the repetition_penalty argument from its default of 1 to 1.1, which produced the best results.
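As far as I understand it, the penalty works by lowering the score of every token that has already been generated before the next token is picked. Here’s a minimal sketch of that idea on plain Python lists. The apply_repetition_penalty function and the toy scores are my own illustration, not the library’s code.

```python
def apply_repetition_penalty(logits, generated_ids, penalty=1.1):
    """Return a copy of logits with already-generated tokens made less likely.

    Positive scores are divided by the penalty and negative scores are
    multiplied by it, so in both cases the score moves downward.
    """
    if penalty <= 0:
        raise ValueError("penalty must be positive")
    adjusted = list(logits)
    # Each distinct token is penalized once, however often it appeared.
    for token_id in set(generated_ids):
        score = adjusted[token_id]
        adjusted[token_id] = score * penalty if score < 0 else score / penalty
    return adjusted


# Toy vocabulary of three tokens; tokens 0 and 1 were already generated.
logits = [2.2, -1.0, 0.5]
adjusted = apply_repetition_penalty(logits, generated_ids=[0, 1, 1])
print(adjusted)
```

With a penalty of 1 the scores come back unchanged, which is why 1 amounts to no penalty at all. A value as small as 1.1 only nudges repeated tokens down, so the model can still repeat itself when it really wants to.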

Turning my idea into a web app

The next step was to turn my GPoeT-2 generator into a web app with Anvil, so that anyone can generate their own macabre tales.

I used the Anvil Uplink to connect my local run_generation.py script to the Anvil server. Adapting this standalone script to serve a web app was pretty simple.

I added three arguments to the main() function, length, prompt_text and model_name, so that input from the Anvil app could be passed in as arguments to the function. The length and prompt_text arguments allow users of the app to decide how many words the model should generate and to feed it some starting text. model_name allows users to choose whether to use GPT-2 or the fine-tuned GPoeT-2 to generate text. As a final step, I added anvil.server.wait_forever() to the bottom of the local script, so that the connection to Anvil wouldn’t close.

#added to the top of the script to connect to my Anvil app
import anvil.server
anvil.server.connect("<my Uplink key>")
...

#this decorator makes my function callable in Anvil
@anvil.server.callable
def main(length, prompt_text, model_name):

    #if app user chooses GPoeT-2, the argument is changed to the path to the fine-tuned model
    if model_name == 'poe': 
        model_name_or_path = './tuned_model'
    else:
        model_name_or_path = 'gpt2'
...

#add this at the bottom so the connection doesn't close
anvil.server.wait_forever()
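Since these arguments now come from a web form rather than the command line, it can be worth tidying them up before they reach the model. Here’s a small sketch of how that might look. The normalize_request helper, the MODEL_PATHS table and the 500 cap are all hypothetical and not part of my script or of Anvil.

```python
# Hypothetical lookup table mirroring the if/else in main() above.
MODEL_PATHS = {"poe": "./tuned_model", "gpt2": "gpt2"}
MAX_LENGTH = 500  # hypothetical cap so one request can't tie up the GPU


def normalize_request(length, prompt_text, model_name):
    """Tidy app input and return (length, prompt_text, model_name_or_path)."""
    # Form fields may arrive as strings, so coerce and clamp the length.
    length = max(1, min(int(length), MAX_LENGTH))
    # Treat a missing prompt as empty and trim stray whitespace.
    prompt_text = (prompt_text or "").strip()
    # Anything other than 'poe' falls back to stock GPT-2, as in main() above.
    model_name_or_path = MODEL_PATHS.get(model_name, "gpt2")
    return length, prompt_text, model_name_or_path


print(normalize_request("9999", "  Once upon a midnight dreary  ", "poe"))
```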

You can use the app yourself to have GPoeT-2 generate a Poe-like ghost story:

The Pit and the Generated Pendulum

I decided to have GPoeT-2 rewrite The Pit and the Pendulum. Here’s how I did it: I chose a few sentences from the story and input them separately into GPoeT-2, generating 500 words at a time. I then used this generated text to piece together a story as best I could. Only the first sentence, which is in bold, was written by Poe. I did not write any of the story, but I did rearrange passages and make some sentences more grammatical.

Without further ado, here’s The Pit and the Pendulum by GPoeT-2. Enjoy and Happy Halloween!

I WAS sick, sick unto death with that long agony; and when they at length unbound me, and I was permitted to sit, I felt that my senses were leaving me. I writhed with rage, and trembled with pain. I raged and shrieked, and rivalled, and wrung, and screamed and swore, and staggered and reeking, and reared again. In a state of anxiety, I called to my bed, and buried myself in an incubus of some substance which had escaped my body; then, my eyes became lost in an ague. When the first faint and vivid thought entered me, I again reeled, and struggled to utter it. With a shudder, I dwelt upon an even more ghastly horror; and, on the margin of my comfortable bed and lofty chair, there flitted a thousand fancies about me. A lamp swung low on the bed as if on purpose. Its glare was bright red-hot, and, with an air of horror still burning, sat upon my brain, as if from some singular cause.

For a moment, in fact, all was sure, I could hear no more; but a dull, husk or quavering cry; and, half of horror, half of terror, I prostrated as I dared not utter. My soul reeked into abysses; and the world grew louder and louder with every species of scream; and, like the ague, it gushed with odors of an alley-carnival fire. In an instant, a bonny, gentle shudder seized my whole frame, and I fled to the wing and fell victim to a nearly hourly ill-directed horror. The utter darkness was total; and with a gale of the most terrible tempest, and the grey clouds, there rushed to life from the mist, and into all my limbs: but my nerves had been chilled, and my vision fainting, and dizzy, and dizzied; and I staggered and fell; and shrieked out loud and high, and low, and convulsively; and the fever called to me to the living Earth. I shuddered, and sank, and there fell headlong into that ghastly coffin.

I gasped for breath, and moved at once, and trembled in its slumber. I struggled and screamed for aid. Still, a voice, or rather a hideous shriek, was heard. I groaned aloud. It said nothing. The monster came and leaped, and swam, and lurched, and leaped, and fell. I stood, with all my strength, on the bed, and my limbs swung and lashed, and the monster lurked. He was obliged to fall back, and, being nearly unconscious of my fall, he could scarcely move his feet. His head lay flat upon the floor as if from a sudden quivering pain; his arms were on one end of the partition, and his knees were beneath me, while I fell prostrate. I knew that the animal had recovered sufficiently from its fright to be able to make its struggle no longer; its head would have been capable of reaching my body, but not the slightest portion of its tail, nor legs. I shuddered as I touched it, and regarded its form. The fiend rose and fled.

I was now attempting to get up. My head was thrown from extreme pain, and much swollen and dizzy; and blackened with the intense thirst; for some time. For an instant, a memory flew within my brain. The horrors of that frightful night, the terrors of those past, were no more. It was only a feeble hope, but still a solid and happy hope; but it was all–it was all–all at once, and this was a mere dream. The world had ceased. The fever and I must hurry to a final termination. But the soul–the spirit stepped upon the threshold–and I gasped and died. My soul remained as an empty shell, a shadow, and no more. And the spectre died away; and I became nothing more.

P.S. Poe was a much better writer than any AI could ever be. All of his stories are in the public domain, so (re-)read them for yourself! Maybe start with the real The Pit and the Pendulum.


  1. He should have seen it coming, but nobody expects the Spanish Inquisition.