RNN metadata for mimicking individual author style

Teaching a text-generating char-RNN to automatically imitate many different authors by labeling the input text by author; additional experiments include imitating Geocities and retraining GPT-2 on a large Project Gutenberg poetry corpus.
topics: statistics, NN, fiction, shell, R, GPT, tutorial
created: 12 Sep 2015; modified: 21 Mar 2019; status: finished; confidence: likely; importance: 8

Char-RNNs are unsupervised generative models which learn to mimic text sequences. I suggest extending char-RNNs with inline metadata such as genre or author prefixed to each line of input, allowing for better & more efficient training and for more controllable sampling of generated output by feeding in the desired metadata. An experiment using torch-rnn on a set of ~30 Project Gutenberg e-books (1 per author) to train a large char-RNN shows that a char-RNN can learn to remember metadata such as authors, learn associated prose styles, and often generate text visibly similar to that of a specified author.

I further try & fail to train a char-RNN on Geocities HTML for unclear reasons.

More successfully, I experiment with a recently-developed alternative to char-RNNs, the Transformer NN architecture, by finetuning training OpenAI’s GPT-2-small Transformer model on a much larger (117MB) Project Gutenberg poetry corpus using both unlabeled lines & lines with inline metadata (the source book). The poetry generated is of considerably higher quality than my char-RNNs.

A character-level recurrent neural network (“char-RNN”) trained on corpuses like the Linux source code or Shakespeare can produce amusing textual output mimicking them. Music can also be generated by a char-RNN if it is trained on textual scores or transcriptions, and some effective music has been produced this way (I particularly liked Sturm’s).

A char-RNN is simple: during training, it takes a binary blob (its memory or “hidden state”) and a character, tries to predict the next character, and emits an updated binary blob; that blob gets fed into a second copy of the RNN which tries to predict the second character, and this gets fed into a third copy of the RNN and so on (“unrolling through time”). Whether each character is correct is the training error, which gets backpropagated to the previous copies of the RNN; since they are still hanging around in RAM, blame can be assigned appropriately, and eventually the gibberish hopefully evolves into a powerful sequence modeler which learns how to compactly encode relevant memories into the hidden state, and what characters can be predicted from the hidden state. This doesn’t require us to have labels or complex loss functions or a big apparatus—the RNN gets trained character by character.
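To make the unrolling concrete, here is a minimal numpy sketch of one forward pass through a plain tanh char-RNN (all names and sizes are arbitrary illustrations; char-rnn itself uses LSTM cells, and training would backpropagate through this unrolled chain):

```python
import numpy as np

rng = np.random.default_rng(0)
text = "hello world, hello world"
chars = sorted(set(text))
stoi = {c: i for i, c in enumerate(chars)}
V, H = len(chars), 32          # vocab size, hidden-state ("binary blob") size

# parameters of a plain tanh RNN (char-rnn itself uses LSTMs)
Wxh = rng.normal(0, 0.01, (H, V))
Whh = rng.normal(0, 0.01, (H, H))
Why = rng.normal(0, 0.01, (V, H))

def step(h, char_idx):
    """One timestep: update the hidden state from the current character,
    and predict a distribution over the next character."""
    x = np.zeros(V)
    x[char_idx] = 1.0
    h = np.tanh(Wxh @ x + Whh @ h)
    logits = Why @ h
    p = np.exp(logits - logits.max())
    return h, p / p.sum()

# "unrolling through time": feed characters in one at a time; the
# summed per-character log-loss is what backprop would minimize
h, loss = np.zeros(H), 0.0
for cur, nxt in zip(text, text[1:]):
    h, p = step(h, stoi[cur])
    loss -= np.log(p[stoi[nxt]])
print(loss / (len(text) - 1))   # average nats/character; ~log(V) when untrained
```

Since the weights are untrained, the predictions are near-uniform and the average loss sits near log(V); training drives it down from there, character by character.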

Handling multiple corpuses

A problem with this approach is that a char-RNN has to be trained for each corpus: if you want Shakespearean gibberish, you must train it only on Shakespeare, and if you want Irish music, you must train only on Irish. If you don’t, and you create a corpus which is Shakespeare concatenated with the Bible, you will probably get something halfway between the two, which might be somewhat interesting, but is not a step forward to generating better & more interesting gibberish. Or if you have a few hundred songs of Irish music written in ABC format and then a few dozen rock or classical pieces written in MIDI, training an RNN on them all mixed together will simply yield gibberish output, because you will get an ‘average syntax’ of ABC & MIDI and an ‘average music’ of Irish & rock. This is in part because the training is unsupervised in the sense that the char-RNN is only attempting to predict the next character given the previous characters; it has no reason to give you just Shakespeare or just Bible output, so it bounces between them.

However, it seems like it should be possible to do this. An RNN is a powerful neural network, and we can see in examples using Karpathy’s char-rnn that such RNNs have learned ‘sublanguages’: in the Linux C source code examples, the RNN has learned to switch appropriately between comments, source code, and string literals; in the CSS examples, it’s learned to switch between comments, CSS source code, string literals, URLs, and data-URIs. If the RNN can decide on its own while generating C or CSS to switch from “source code mode” to “comment mode”, then it should be able to also learn to switch between Shakespeare and Bible mode, or even more authors.

If we could get the RNN to do such switching on demand, there are several possible benefits. Human-authored textual output is always more similar than different: a text file of Shakespeare is much more similar to a text file of the Bible than it is to an equivalent length of ASCII generated at random such as $M@Spc&kl?,U.(rUB)x9U0gd6G; a baroque classical music score is more similar to a transcription of a traditional Irish music jam than either is to random noise. Since they share such mutual information, an RNN trained to produce both Shakespeare and the Bible will be smaller than the sum of 2 RNNs for Shakespeare & the Bible separately; this makes it easier to share trained RNNs, since you can distribute 1 RNN covering many genres or authors for people to play with, rather than having to train & host a dozen different RNNs. Such an RNN may also generate better output for all cases, since less of the corpuses’ information is spent on learning the basics of English shared by both corpuses and more is available for learning the finer details of each kind of writing; this may help in cases like music, where large datasets of textual transcriptions of a desired genre may not be available (by training on a large corpus of classical music, a smaller corpus of Irish music may go further than it would’ve on its own). More speculatively, the metadata itself may dynamically improve generation: since the RNN keeps a memory of the metadata in its hidden state, it may ‘wander’ less, and output may be more thematically coherent, since the RNN can periodically refer back to the hidden state to remember what it was talking about.

How can we do that? The RNN in the C or CSS examples is able to mode-switch like this because, I think, there are clear transition markers inside the CSS or C which ‘tell’ the RNN that it needs to switch modes now; a comment begins /* ... or a data-URI in CSS begins url('data:image/png;base64,...). In contrast, the most straightforward way of combining music or books and feeding them into a char-RNN is to simply concatenate them; but then the RNN has no syntactic or semantic markers which tell it where ‘Bible’ begins and ‘Shakespeare’ ends. Perhaps we can fix that by providing metadata such as author/genre and turning it into a semi-supervised task, somehow, along the lines of the source code: distinguish the text of one author from another, and then let the RNN learn the distinctions on its own, just like the CSS/C.


There are two approaches to encoding the metadata into the RNN:

  1. in band: systematically encode the metadata into the corpus itself, such as by a prefixed or suffixed string, and hope that the RNN will be able to learn the relevance of the metadata and use it during training to improve its predictions (which it should, as LSTM/GRU units are supposed to help propagate long-term dependencies like this); then specific genres or authors or styles can be elicited during sampling by providing that metadata as a seed.

    So for example, a Shakespeare corpus might be transformed by prefixing each line with a unique string which doesn’t appear in the corpus itself, eg “SHAKESPEARE|To be or not to be,|SHAKESPEARE”. Then during sampling, Shakespearean prose can be triggered with th sample.lua rnn.t7 -primetext "SHAKESPEARE|". (Why the pipe character? Because it’s rarely used in prose but isn’t hard to type or work with.) To add in more metadata, one adds in more prefixes; for example, perhaps the specific work might be thought relevant and so the corpus is transformed to “SHAKESPEARE|HAMLET|To be or not to be,|HAMLET|SHAKESPEARE”. Then one can sample with the specific work, author, or both. For musical generation, relevant metadata might be musical genre, author, tempo, instruments, type of work, tags provided by music listeners (“energetic”, “sad”, “for_running” etc), so one could ask for energetic Irish music for two fiddles.

    This has the advantage of being easy to set up (some regexes to add metadata) and easy to extend (take an existing trained RNN and use it on the modified corpus); the disadvantage is that it may not work as the RNN may be unable to jointly learn to recall and use the metadata—it may instead learn to forget the metadata immediately, or spend all its learning capacity on modeling an ‘average’ input because that yields better log-loss error. This in band approach can also easily be extended to cover classification; in classification, the metadata is put at the end of each line, so instead of learning to predict text conditional on metadata & previous text, the RNN is learning to predict metadata conditional on previous text, and classifications can be extracted by low-temperature sampling with the input as the prime text followed by the separator character and seeing what metadata is predicted (eg th sample.lua classification.t7 -temperature 0.1 -primetext "...text...|" ~> "SHAKESPEARE\n").

    As far as I know, no one has done this except perhaps inadvertently or implicitly.
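Both in-band variants amount to a one-line string transform; a minimal sketch with hypothetical helper names (the experiments later in this article do the same thing with a sed pipeline):

```python
def to_generation_format(lines, author):
    """Prefix each line with the author tag, so sampling can be primed
    with 'AUTHOR|' to elicit that author's style."""
    return [f"{author.upper()}|{line}" for line in lines]

def to_classification_format(lines, author):
    """Metadata at the *end* of each line: the RNN then learns to predict
    the author conditional on the preceding text, giving a classifier."""
    return [f"{line}|{author.upper()}" for line in lines]

sample = ["To be or not to be,", "that is the question:"]
print(to_generation_format(sample, "shakespeare")[0])
# SHAKESPEARE|To be or not to be,
```

Low-temperature sampling from a model trained on the classification format, primed with unlabeled text plus the separator, then reads off the predicted author tag.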

  2. out of band: instead of depending on the RNN to learn the value of the metadata and preserving it in its hidden state, one can change the RNN architecture to inject the metadata at each timestep. So if one has an RNN of 500 neurons, 5 of them will be hardwired at each timestep to the metadata value for the sequence being worked on.

    The downside is that all metadata inputs will require modification of the RNN architecture to map them onto a particular hidden neuron. The advantage is that the metadata value will always be present, there is no need to hope that the RNN will learn to hold onto the metadata, and it only has to learn the associated differences; so it will learn more reliably and faster. Variants of this turn out to have been done before:

    1. Mikolov & Zweig 2012, “Context dependent recurrent neural network language model”: RNN augmented with topic information from LDA, achieving better prediction on the Penn Treebank & WSJ transcription task

    2. Aransa et al 2013/2015, “Improving Continuous Space Language Models using Auxiliary Features”: a feedforward NN given n characters at a time, with the inputs at each sequence including embeddings of the previous lines and, particularly, 6 ‘genres’ (in this case, Egyptian Arabic SMS/chat, modern standard Arabic, Egyptian Arabic forum discussions, Levantine forum discussions, formal MSA from UN translations, Egyptian Arabic telephone calls), hardwired into the input layer; finding that genre particularly helped BLEU scores. (Including metadata like genre to assist training appears to have been used fairly regularly in earlier text topic-modeling work, but not so much in neural networks or for increasing realism of generated text.)

    3. Chen et al 2015, “Recurrent Neural Network Language Model Adaptation for multi-Genre Broadcast Speech Recognition”: an RNN augmented with the text input being fed into standard text topic-modeling algorithms like LDA, partially trained on BBC genres (advice/children/comedy/competition/documentary/drama/events/news), and the total outputs from the topic algorithms hardwired into the input layer along with the text; giving moderate improvements on audio->text transcription.

    4. Sennrich et al 2016, “Controlling Politeness in Neural Machine Translation via Side Constraints”: a standard neural machine translation system using RNNs in the encoder-decoder framework, here for translating English->German movie subtitles, but with the German corpus’s sentences annotated by politeness metadata describing the pronouns/verb conjugations; they obtain both better BLEU scores on translation as well as the ability to control the politeness of the generated German.

    5. This has also been done in Lipton et al 2015 (see also Ficler & Goldberg 2017): they model beer reviews with a character-level RNN which is given metadata (beer types: “American IPA”, “Russian Imperial Stout”, “American Porter”, “Fruit/Vegetable Beer”, and “American Adjunct Lager”) as a hardwired input to the RNN at each timestep, noting that

      It might seem redundant to replicate xaux at each sequence step, but by providing it, we eliminate pressure on the model to memorize it. Instead, all computation can focus on modeling the text and its interaction with the auxiliary input…Such models have successfully produced (short) image captions, but seem impractical for generating full reviews at the character level because signal from xaux must survive for hundreds of sequence steps. We take inspiration from an analogy to human text generation. Consider that given a topic and told to speak at length, a human might be apt to meander and ramble. But given a subject to stare at, it is far easier to remain focused.

      They experienced trouble training their beer char-RNN, and they adopt a strategy of training normally without the hardwired metadata down to a loss of <1.0/character and then training with metadata to a final loss of 0.7-0.8. This is reasonable because at a loss of 1.1 on English text, sampled output has many clear errors, but at <0.9 the output becomes uncanny; it stands to reason that subtle differences of style & vocabulary will only begin to emerge once the RNN has the basics of English down pat (the differences between skilled authors’ Englishes are, unsurprisingly, smaller than the differences between regular English & gibberish).

    Pretraining+metadata works well for Lipton et al 2015, but they don’t compare it to inlined metadata or show that the pretraining is necessary. I am also a little skeptical about the rationale that out of band signaling is useful because it puts less pressure on the hidden state: while it may reduce pressure on the RNN’s LSTMs to memorize the metadata, one is still losing RAM to reinjecting the metadata into the RNN at every timestep. Either way, the metadata must be stored somewhere in RAM and it doesn’t make much difference if it’s 495 effective neurons (with 5 hardwired to metadata) or if it’s 500 effective neurons (of which 5 eventually get trained to hold metadata, yielding 495 effective neurons). Pretraining also won’t work with torch-rnn as the word-embedding it computes is different on each dataset, so it’s currently impossible to train on an unlabeled dataset, change the data to labeled, and resume training.

    1. After my experiments here, DeepMind published a CNN for generating raw audio: “WaveNet: A Generative Model for Raw Audio”, van den Oord et al 2016. They noted similar phenomena: the WaveNet could imitate specific speakers if provided speaker labels along with the raw audio, and specifying metadata like instruments allowed control of generated musical output. Another later Google paper, Johnson et al 2016’s “Google’s Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation”, applies in-band metadata to generalize an RNN translator by specifying the target language in-band and having the RNN learn how to exploit this metadata for better natural language generation and the ability to translate between language pairs with no available corpuses.
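In code, the out-of-band scheme amounts to re-injecting a projection of the metadata one-hot into the input at every timestep; a minimal numpy sketch with arbitrary sizes (not any of the cited papers' actual architectures):

```python
import numpy as np

rng = np.random.default_rng(0)
V, A, H = 50, 5, 16                # vocab size, number of authors, hidden size
Wxh = rng.normal(0, 0.1, (H, V))   # character-input weights
Wah = rng.normal(0, 0.1, (H, A))   # hardwired metadata weights
Whh = rng.normal(0, 0.1, (H, H))   # recurrent weights

def rnn_step_with_aux(h, x_onehot, aux_onehot):
    # the metadata one-hot is re-injected at *every* timestep, so the
    # hidden state never has to spend capacity memorizing it
    return np.tanh(Wxh @ x_onehot + Wah @ aux_onehot + Whh @ h)

h = np.zeros(H)
x = np.eye(V)[3]          # some character
author = np.eye(A)[2]     # eg the 3rd metadata value
h = rnn_step_with_aux(h, x, author)
```

Changing only the aux vector changes every subsequent hidden state, which is exactly the hardwired conditioning Lipton et al rely on.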

Given the attractive simplicity, I am going to try in band metadata.


The easiest kind of data to test with is English prose: I can recognize prose differences easily, and there are countless novels or fictional works which can be converted into labeled prose.

If we just download some complete works off Project Gutenberg (googling ‘Project Gutenberg “complete works of”’), prefix each line with “$AUTHOR|”, concatenate the complete works, and throw them into char-rnn, we should not expect good results: the author metadata will now make up something like 5% of the entire character count (because PG wraps them to short lines) and by training on 5M of exclusively Austen and then 5M of exclusively Churchill, we might run into overfitting problems and due to the lack of proximity of different styles, the RNN might not ‘realize’ that the author metadata isn’t just some easily predicted & then ignored noise but can be used to predict far into the future. We also don’t want the PG headers explaining what PG is, and we need to make sure the files are all converted to ASCII.

So to deal with these 4 issues I’m going to process the PG collected works thusly:

  1. delete the first 80 lines and last ~300 lines, and filter out any line mentioning “Gutenberg”

  2. convert to ASCII

  3. delete all newlines and then rewrap to make lines which are 10000 bytes—long enough to have a great deal of internal structure and form a good batch to learn from, and thus can be randomly sorted with the others.

    But newlines do carry semantic information—think about dialogues—and does deleting them carry a cost? Perhaps we should map newlines to some rare character like tilde, or use the poetry convention of denoting newlines with forward-slashes?

  4. prefix each long line with the author it was sampled from
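The 4 steps above can be sketched in Python (a hypothetical simplification of the shell pipeline actually used: unlike fold --spaces, it splits chunks at exact byte offsets rather than at spaces):

```python
import re
import unicodedata

def preprocess(raw_text, author, line_bytes=10_000):
    """Sketch of the 4 preprocessing steps; illustrative only."""
    lines = raw_text.split("\n")[80:-300]               # 1. drop PG header/footer...
    lines = [l for l in lines if "Gutenberg" not in l]  #    ...and any PG mentions
    text = " ".join(lines)
    text = unicodedata.normalize("NFKD", text)          # 2. convert to ASCII
    text = text.encode("ascii", "ignore").decode("ascii")
    text = re.sub(r"\s+", " ", text).strip()            # 3. delete newlines & rewrap
    chunks = [text[i:i + line_bytes]                    #    into long fixed-size lines
              for i in range(0, len(text), line_bytes)]
    return [f"{author.upper()}|{c}" for c in chunks]    # 4. prefix the author
```

The resulting long labeled lines can then be shuffled together across authors before training.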


As a baseline, a char-RNN with 2x2500 neurons, trained with 50% dropout, batch-size 55, and BPTT length 200, on the PG dataset without any author prefixes or suffixes, converges to a validation loss of ~1.08 after ~20 epochs.

Training with prefixes

Small RNN

For my first try, I grabbed 7 authors, giving a good final dataset of 46M, and fed it into char-rnn, choosing a fairly small 2-layer RNN and using up the rest of my GPU RAM by unrolling far more than the default 50 timesteps to encourage it to learn the long-range dependencies of style:

cd ~/src/char-rnn/data/
mkdir ./styles/ ; cd ./styles/

## "The Complete Project Gutenberg Works of Jane Austen" http://www.gutenberg.org/ebooks/31100
wget 'https://www.gutenberg.org/ebooks/31100.txt.utf-8' -O austen.txt
## "The Complete Works of Josh Billings" https://www.gutenberg.org/ebooks/36556
wget 'https://www.gutenberg.org/files/36556/36556-0.txt' -O billings.txt
## "Project Gutenberg Complete Works of Winston Churchill" http://www.gutenberg.org/ebooks/5400
wget 'https://www.gutenberg.org/ebooks/5400.txt.utf-8' -O churchill.txt
## "The Project Gutenberg Complete Works of Gilbert Parker" https://www.gutenberg.org/ebooks/6300
wget 'https://www.gutenberg.org/ebooks/6300.txt.utf-8' -O parker.txt
## "The Complete Works of William Shakespeare" http://www.gutenberg.org/ebooks/100
wget 'https://www.gutenberg.org/ebooks/100.txt.utf-8' -O shakespeare.txt
## "The Entire Project Gutenberg Works of Mark Twain" http://www.gutenberg.org/ebooks/3200
wget 'https://www.gutenberg.org/ebooks/3200.txt.utf-8' -O twain.txt
## "The Complete Works of Artemus Ward" https://www.gutenberg.org/ebooks/6946
wget 'https://www.gutenberg.org/ebooks/6946.txt.utf-8' -O ward.txt
du -ch *.txt; wc --char *.txt
# 4.2M  austen.txt
# 836K  billings.txt
# 9.0M  churchill.txt
# 34M   input.txt
# 12M   parker.txt
# 5.3M  shakespeare.txt
# 15M   twain.txt
# 12K   ward.txt
# 80M   total
#  4373566 austen.txt
#   849872 billings.txt
#  9350541 churchill.txt
# 34883356 input.txt
# 12288956 parker.txt
#  5465099 shakespeare.txt
# 15711658 twain.txt
#     9694 ward.txt
# 82932742 total
for FILE in *.txt; do
  dos2unix $FILE
  AUTHOR=$(echo $FILE | sed -e 's/\.txt//' | tr '[:lower:]' '[:upper:]')
  cat $FILE | tail -n +80 | grep -v -i 'Gutenberg' | iconv -c -tascii | tr '\n' ' ' | \
   fold --spaces --bytes --width=10000 | sed -e "s/^/$AUTHOR\|/" > $FILE.transformed
done
rm input.txt
cat *.transformed | shuf > input.txt
cd ../../
th train.lua -data_dir data/styles/ -gpuid 0 -rnn_size 747 -num_layers 2 -seq_length 187
# using CUDA on GPU 0...
# loading data files...
# cutting off end of data so that the batches/sequences divide evenly
# reshaping tensor...
# data load done. Number of data batches in train: 4852, val: 256, test: 0
# vocab size: 96
# creating an LSTM with 2 layers
# number of parameters in the model: 7066716
# cloning rnn
# cloning criterion
# 1/242600 (epoch 0.000), train_loss = 4.57489208, grad/param norm = 9.6573e-01, time/batch = 2.03s
# ...
# 15979/242600 (epoch 3.293), train_loss = 1.01393854, grad/param norm = 1.8754e-02, time/batch = 1.40s

This gets us a corpus in which every line specifies its author, with authors switching from line to line, while each line is still long enough to have readable meaning. After about 22 hours of training yielding a validation loss of 1.0402 (with little improvement evident after the first 7 hours), we can try out our best candidate and see if it knows Shakespeare versus Austen:

BEST=`ls cv/*.t7 | sort --field-separator="_" --key=4 --numeric-sort --reverse | tail -1`
th sample.lua $BEST -temperature 0.8 -length 500 -primetext "SHAKESPEARE|"
# SHAKESPEARE|is of no regular complexion.  The action of the plain chatter--"  "Alas, they
# have discovered what was to be afforded since then?"  "We can believe--for the signature of
# the Church."  "So they do, dear lord, do they their home?  Oh, no, to the devil which we
# have not written, the Church is not in the world; but not in this harmless way then to the
# captain of man--therefore while the praise of it was allurious he would not reflect on the
# curious man's hatch deemed that his life should be very con

th sample.lua $BEST -temperature 0.8 -length 500 -primetext "SHAKESPEARE|" -seed 105
# now looked at him a sharp pleasure in passing southward and again in portion of his mother's
# reach of it. Suddenly the thing was said.  "We'll sit down and find out," he inquired, with a
# pity to see Mr. Carvel driving beside the bedroom, which was almost as much as he could bear
# the potion.  "You say you're strong," said Mrs. Holy, indignantly, "you won't have to go
# away, about the doctor. What is it?"  "Why, we are"

th sample.lua $BEST -temperature 0.8 -length 500 -primetext "AUSTEN|"
# AUSTEN|business, and the gout--a constant and foolish figure in which Fellowes' ring is
# nearer to distemper than meek and steady interest and clean iron. The episode for the future
# and the war, and the seedy and effective sun-elogs and the others ventured its remote room,
# whose hair was a suffering man--that the work of the circumstance interested him. It had no
# long served to open the papers to answer up a quiet road, free from the long row of white
# to the lash called No. 14,000 to a sweet conversatio

th sample.lua $BEST -temperature 0.8 -length 500 -primetext "TWAIN|"
# TWAIN|quarrelling with a little book, and so on, considering its sensations as to whether
# it were not possible to eat it.  He thought that the leader of the conference with his own
# death would be recognized as a common expression.  The men that mounted from motive powers,
# how big the calf, commander of the rights of the new economic steamer, the English, a lass
# of manhood, will exhibit no praise or increase out of a sort of meaning in the senses, and
# send them back to such a winter as we can go into t

We can see that while the RNN is producing very English-sounding novelistic prose and produces its usual mix of flawless syntax and hilarious semantics (I particularly like the phrase “Oh, no, to the devil which we have not written, the Church is not in the world”), it has failed to learn the styles I was hoping for. The Austen and Twain samples sound somewhat like themselves, but the Shakespeare samples are totally wrong and sound like a Victorian English novel. And given the lack of improvements on the validation set, it seems unlikely that another 10 epochs will remedy the situation: the RNN should quickly learn how to use the very useful metadata.

Since the style varies so little between the samples, I wonder if mimicking English uses up all the capacity in the RNN? I gave it only 747 neurons, but I could’ve given it much more.

Larger RNN

So to try again:

  • to better preserve the semantics, instead of deleting newlines, replace them with a slash
  • try much shorter lines of 1000 bytes (increasing the relative density of the metadata)
  • back off on the very long backpropagation through time, and instead, devote the GPU RAM to many more neurons.
  • the default setting for the validation set is a bit excessive here and I’d rather use some of that text for training

It errored out of memory early the next day; the validation loss was still pretty meh, and at 1.1705 one can’t expect much, and indeed, the style is not impressive when I check several prefixes:

th sample.lua cv/lm_lstm_epoch0.93_1.1705.t7 -temperature 0.8 -length 500 -primetext "SHAKESPEARE|"
# seeding with SHAKESPEARE|
# --------------------------
# SHAKESPEARE|jung's own,/which is on the house again.  There is no endeavour to be dressed in the midst of the/present of
# Belle, who persuades himself to know to have a condition of/the half, but "The garnal she was necessary, but it was high,
# consecrets, and/excursions of the worst and thing and different honor to flew himself.  But/since the building closed the
# mass of inspiration of the children of French wind,/hurried down--but he was in the second farmer of the Cald endless figures,
# Mary/Maeaches, and t

th sample.lua cv/lm_lstm_epoch0.93_1.1705.t7 -temperature 0.8 -length 500 -primetext "AUSTEN|"
# AUSTEN|mill./And now the good deal now be alone, there is no endeavour to be dreaming./In fact, what was the story of his
# state, must be a steady carriages of pointing out/both till he has walked at a long time, and not convinced that he
# remembers/her in this story of a purpose of this captain in stock. There was/no doubt of interest, that Mr. Crewe's
# mother could not be got the/loss of first poor sister, and who looked warm enough by a/great hay below and making a
# leaver and with laid with a murder to

th sample.lua cv/lm_lstm_epoch0.93_1.1705.t7 -temperature 0.8 -length 500 -primetext "TWAIN|"
# TWAIN|nor contributed/she has filled on behind him.  He had been satisfied by little just as to/deliver that the inclination
# of the possession of a thousand expenses in the group of feeling had destroyed/him to descend.  The physical had he darted
# before him that he was worth a
# PARKER|George Pasha, for instance?"//"Then it is not the marvel of laws upon Sam and the Sellers."  She said/he would ask
# himself to, one day standing from the floor, as he/stood for the capital.  He was no good of conversation

Larger author count

Next, I decided to increase diversity of styles: ramping up to 38 authors, including modern SF/F fiction authors (Robert Jordan’s Wheel of Time, Gene Wolfe, R.A. Lafferty, Ryukishi07’s Umineko no naku koro ni, Kafka), poetry ancient and modern (Iliad, Beowulf, Dante, Keats, Coleridge, Poe, Whitman, Gilbert & Sullivan), ancient fiction (the Bible), miscellaneous nonfiction (Aristotle, Machiavelli, Paine) etc. By adding in many more authors from many different genres and time periods, this may force the RNN to realize that it needs to take seriously the metadata prefix.

wget 'https://dl.dropboxusercontent.com/u/182368464/umineko-compress.tar.xz'
tar xJf umineko-compress.tar.xz && rm umineko-compress.tar.xz
mv umineko/umineko.txt  ryukishi07.txt; mv  umineko/wot.txt jordan.txt; rm -rf ./umineko/

cat /home/gwern/doc-misc/fiction/lafferty/*.txt > lafferty.txt
cat /home/gwern/doc-misc/fiction/wolfe/fiction/*.txt > wolfe.txt

wget 'https://www.gutenberg.org/ebooks/10031.txt.utf-8'  -O poe.txt && sleep 5s ## avoid anti-crawl defenses
wget 'https://www.gutenberg.org/ebooks/11.txt.utf-8'     -O carroll.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/1232.txt.utf-8'   -O machiavelli.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/12699.txt.utf-8'  -O aristotle.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/1322.txt.utf-8'   -O whitman.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/16328.txt.utf-8'  -O beowulf.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/1661.txt.utf-8'   -O doyle.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/23684.txt.utf-8'  -O keats.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/2383.txt.utf-8'   -O chaucer.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/2701.txt.utf-8'   -O melville.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/30.txt.utf-8'     -O bible.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/3090.txt.utf-8'   -O maupassant.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/31270.txt.utf-8'  -O paine.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/3253.txt.utf-8'   -O lincoln.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/345.txt.utf-8'    -O stoker.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/3567.txt.utf-8'   -O bonaparte.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/3600.txt.utf-8'   -O montaigne.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/4200.txt.utf-8'   -O pepys.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/4361.txt.utf-8'   -O sherman.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/4367.txt.utf-8'   -O grant.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/6130.txt.utf-8'   -O homer.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/7849.txt.utf-8'   -O kafka.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/808.txt.utf-8'    -O gilbertsullivan.txt && sleep 5s
wget 'https://www.gutenberg.org/ebooks/8800.txt.utf-8'   -O dante.txt && sleep 5s
wget 'https://www.gutenberg.org/files/28289/28289-0.txt' -O eliot.txt && sleep 5s
wget 'https://www.gutenberg.org/files/29090/29090-0.txt' -O coleridge.txt && sleep 5s
wget 'https://www.gutenberg.org/files/5000/5000-8.txt'   -O davinci.txt && sleep 5s

Due to an OOM crash, I decreased the neuron count. With a much bigger model, it was also necessary to have dropout enabled (with the default of 0, progress seemed to halt around a loss of 3.5 and made no discernible progress for hours).

It did OK but seemed to have difficulty improving past a loss of 1.14, had issues with exploding error (one explosion up to a loss of 59 terminated an overnight training run), and then began erroring out every time I tried to resume, so I began a third try, this time experimenting with deeper layers and increasing the data preprocessing steps to catch various control-characters and copyright/boilerplate which had snuck in:

This one eventually exploded too, having maxed out at a loss of 1.185.

After deleting even more control characters and constantly restarting after explosions (which had become a regular thing as the validation loss began bouncing around a range of 1.09-1.2, the RNN seeming to have severe trouble doing any better) I did some sampling. The results are curious: the RNN has memorized the prefixes, of course, and at higher temperatures will spontaneously end with a newline and begin with a new prefix; many of the prefixes like “BIBLE|” look nothing like the original source, but the “JORDAN|” prefix performs extremely well in mimicking the Wheel of Time, dropping in many character names and WoT neologisms like “Aiel” or (of course) “Aes Sedai”. This isn’t too surprising since the WoT corpus makes up 20M or a sixth of the input; it’s also not too surprising when WoT terms pop up with other prefixes, but they do so at a far lower rate. So at least to some extent, the RNN has learned to use Jordan versus non-Jordan prefixes to decide whether to drop in WoT vocab. The next largest author in the corpus is Mark Twain, and here too we see something similar: when generating Twain text, we see a lot of words that sound like Twain vocabulary (riverboats, “America”, “the Constitution” etc), and while these sometimes pop up in the smaller prefix samples it’s at a much lower rate. So the RNN is learning that different prefixes indicate different vocabularies, but it’s only doing this well on the largest authors.
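The observation that WoT vocabulary appears far more often under the “JORDAN|” prefix can be checked mechanically; a hypothetical helper for measuring author-specific term rates per prefix (the term list and sample lines below are made-up illustrations):

```python
def vocab_rate(sampled_lines, terms):
    """Fraction of words drawn from an author-specific term list,
    grouped by each sampled line's metadata prefix."""
    totals = {}
    for line in sampled_lines:
        prefix, _, body = line.partition("|")
        words = body.split()
        hits = sum(w.strip('.,;"\'') in terms for w in words)
        h, n = totals.get(prefix, (0, 0))
        totals[prefix] = (h + hits, n + max(len(words), 1))
    return {p: h / n for p, (h, n) in totals.items()}

# toy check with a few WoT neologisms:
wot_terms = {"Aiel", "Aes", "Sedai"}
samples = ["JORDAN|Rand saw the Aiel and the Aes Sedai waiting.",
           "TWAIN|The riverboat reached the landing at dawn."]
rates = vocab_rate(samples, wot_terms)
```

Run over a large batch of samples per prefix, a much higher rate under “JORDAN|” than under the other prefixes is evidence the RNN is conditioning vocabulary on the metadata rather than ignoring it.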

Class imbalance fix

Does this reflect that <2M of text from an author is too little to learn from and so the better-learned authors’ material inherently pulls the weaker samples towards them (borrowing strength), that the other authors’ differences are too subtle compared to the distinctly different vocab of Jordan & Twain (so the RNN focuses on the more predictively-valuable differences in neologisms etc), or that the RNN is too small to store the differences between so many authors?

For comparison, a one-layer RNN trained solely on the Robert Jordan corpus (but still formatted with prefixes etc) got down to a loss of 0.9638, and one trained on just the Bible, 0.9420. So both pay a penalty for joint training, with the Bible RNN hurt ~2.7x more; presumably this is because the Jordan corpus is 4.3x larger, so more learning capacity goes to its vocabulary & style, since a bias towards Jordan style will pay off more in reduced loss: a classic class-imbalance problem.

Class-imbalance problems can sometimes be fixed by changing the loss function to better match what one wants (such as by penalizing more errors on the smaller class), reducing the too-big class, or increasing the too-small class (by collecting more data or faking that with data augmentation). I tried balancing the corpuses better by limiting how much was taken from the biggest.
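
Rebalancing by truncation is just a byte-capped `head`; a minimal sketch, with a hypothetical `cap_author` helper (the name is mine, not from the actual script):

```shell
# Truncate one author's corpus file in place to a byte cap, so the
# largest corpora cannot dominate the training mix.
cap_author() {  # usage: cap_author FILE BYTES
  head --bytes="$2" "$1" > "$1.tmp" && mv "$1.tmp" "$1"
}
```

This is the same operation as the `| head --bytes=1M` stage in the pipeline below.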

Also at this time, torch-rnn was released by Justin Johnson, with claims of much greater memory efficiency & better performance compared to char-rnn, so I tried it out. torch-rnn was capable of training larger RNNs, and I experienced many fewer problems with exploding loss or OOM errors, so I switched to using it. The preprocessing step remains much the same, with the exception of a | head --bytes=1M call added to the pipeline to limit each of the 31 authors to 1MB:

rm *.transformed
for FILE in *.txt; do
  dos2unix "$FILE"
  AUTHOR=$(echo "$FILE" | sed -e 's/\.txt//' | tr '[:lower:]' '[:upper:]')
  cat "$FILE" | tail -n +80 | head -n -362 | grep -i -v -e 'Gutenberg' -e 'http' -e 'file://' -e 'COPYRIGHT' -e 'ELECTRONIC VERSION' \
    -e 'ISBN' | tr -d '[:cntrl:]' | iconv -c -tascii | sed -e ':a;N;$!ba;s/\n/ /g' -e 's/  */ /g' -e 's/ \/ \/ //g' | \
    fold --spaces --bytes --width=3000 | head --bytes=1M | sed -e "s/^/$AUTHOR\|/" > "$FILE.transformed"
done
cat *.transformed | shuf > input.txt

## with limiting:
findhog *.transformed
# 8   coleridge.txt.transformed
# 8   dante.txt.transformed
# 8   davinci.txt.transformed
# 8   eliot.txt.transformed
# 8   gilbertsullivan.txt.transformed
# 8   grant.txt.transformed
# 8   homer.txt.transformed
# 8   kafka.txt.transformed
# 8   pepys.txt.transformed
# 8   sherman.txt.transformed
# 152 carroll.txt.transformed
# 240 keats.txt.transformed
# 244 beowulf.txt.transformed
# 284 machiavelli.txt.transformed
# 356 poe.txt.transformed
# 560 doyle.txt.transformed
# 596 aristotle.txt.transformed
# 692 whitman.txt.transformed
# 832 stoker.txt.transformed
# 1028    bible.txt.transformed
# 1028    bonaparte.txt.transformed
# 1028    chaucer.txt.transformed
# 1028    jordan.txt.transformed
# 1028    lafferty.txt.transformed
# 1028    lincoln.txt.transformed
# 1028    maupassant.txt.transformed
# 1028    melville.txt.transformed
# 1028    montaigne.txt.transformed
# 1028    paine.txt.transformed
# 1028    ryukishi07.txt.transformed
# 1028    wolfe.txt.transformed

cd ../../
python scripts/preprocess.py --input_txt data/multi/input.txt --output_h5 multi.h5 --output_json multi.json --val_frac 0.005 --test_frac 0.005
nice th train.lua -input_h5 multi.h5 -input_json multi.json -batch_size 100 -seq_length 70 -dropout 0.5 -rnn_size 2500 -num_layers 2
# ...
# Epoch 28.52 / 50, i = 65000 / 118100, loss = 0.901009
# val_loss =      1.028011712161

This trained to convergence with a loss of ~1.03 after ~30 epochs taking a week or two, yielding 2016-03-27-metadata.t7 (583MB). This is ~0.05 better than the unlabeled baseline.

Did it succeed in learning to use the metadata and mimicking style?


Yes. Sampling 80K characters of text on CPU and setting the temperature high enough that the RNN will periodically emit a newline and jump to a new mode with the invocation th sample.lua -gpu -1 -checkpoint cv/2016-03-27-metadata.t7 -length 80000 -temperature 0.8 -start_text 'JORDAN|', there are 13 transitions:

  1. Jordan: short, but a failure. Mentions “London”, “Jacques”, “Nantucket”, etc

  2. Maupassant: success. Poison, murder, city etc

  3. Lafferty: mixed success. Clubs, girls, Chicago, heavy on dialogue, and American names, but also some vocabulary creeping in from other authors such as “Tar Valon” (Jordan)

  4. Chaucer: success. Clearly old-timey with invocations of Jesus. Sample:

    “…throughout this world, and shall thereby be called in trust, as now O first cause of this world we have no danger; That women were with you and the message, As I loved them they that should pray: No more of this so little wickedness.” When she saw him that there was no wight to see, For in h is cursed peace, his Christe’s hand, And cried his daughter many a long time For he took her out of the world so dear. And she was not holy and more jolly, Had wedded her no sooth and blithe sore; The lady is this marriage and her wife. Come to the priest, what woe we have to do, And thanke him to make a dream, and I can Thomas, with that he saide, may I not stand: And the time went him all out of the town, And with the corpse, and settled him like As Jesus Christ, as he was thought, They would have been a full confused grace.

  5. Whitman: short but success?

    WHITMAN|but lusty, closing the walls, Who are the clauses of cavalry with

  6. Chaucer: success

  7. Lincoln: success. Sample:

    LINCOLN|of his constitutional affairs, is better put down by their own things than above the extent of the majority of the people or of the Republicans of the United States which in the extremes may be said to be one of those who will obtain bad negro as ill-demanded and simple means as they have belonged. r. Pitt in the same manner in Parliament I have not seen him in the other uncommon personal expedition to the British court, and that his thirst was the object, or in which he wrote liberty for supporting him in the present day with an extreme resolution of the sovereignty….

  8. Bible: success. Sample:

    BIBLE|with him two cities which I commanded them; he shall not die: for the LORD is among us. And the LORD was come unto his son that sent him to seek the way to Adon. 02:019:019 And it came to pass at the end of three days after the people of Israel, that they had to touch their voice, and give him a south, and be cut before Pharaoh: 04:030:028 And the LORD spake unto oses, saying, 03:022:002 There shall not a man be found out of the house of the LORD. 03:013:028 And the priest shall have one lot and the length of the bullock, and shall put the blood upon the altar, and put the altar of gold to his feet, and set his finger in water, and shall come into the plain. 03:011:027 And the priest shall take the butler and the head of the servant shall sprinkle it out, and the priest shall burn it into a ring, and cover the fat that is upon the altar, and shall pitch it out. 03:001:004 And he shall put the lamps in water, even a trespass offering, and the hanging for the robe of the burnt offering, and put the altar of shittim wood, and burn the altar of burnt offering unto the LORD.

  9. Stoker: success. Victorian English, mention of cemeteries, disemvoweling, Van Helsing.

  10. Lafferty: mixed success. More Chicago and Lafferty-like vocabulary, but what is “Renfield” doing there?

  11. Ryukishi07: success. Sample:

    RYUKISHI07|of something like that. You can stop too long, a little bit more spinning stuff. You could put away the first side of your way out on the study at the end of the ‘Sea From Battler’. “I see, isn’t it?! Ooooooohhhh…” In other words, if the seagulls had been known to have been over there already, the Shannon wouldn’t have accepted a servant. …And when George-aniki suddenly put his head over and spat on his shoulders, Rand said, showing some relationship to her. He was calm and was jealous of his nearly much image or experience. “………………Hahahahaha……….” Natsuhi noticed that tune from the warm block, and it was quite a small part of it… “I’m not gonna be out of the main way. Where’s the witch?!” Natsuhi oba-san said something about forty… The fork of gold wasn’t like whispering every day. “…You’re still unable to make me. Now if you stay back to the back of the world part of my heart, that’s wrong. …………But I really have here a magazine.” “Ah, ………don’t worry about it. I wouldn’t call a lot one.” “That’s right. …If it was a metal bird, I would also stay here. I’m sorry, but it’s a fantastic person who is still living in your speed… If you couldn’t think of it, that’s right. If you want to call me a bed, I’d be swept by your duty and you may be fine.” “…………………” “……W, ………what are you going to do with the culprit? Did you say something like that…?” Natsuhi returned the rose garden. As the announcement had finished looking over his, he heard the overwhelming sound of the falling hair, on the windows, his eyes slicing around the sound of a pair of hold of holes in one hand. …

  12. Doyle: mixed success. There appears to be infiltration from Lincoln.

  13. Montaigne: mixed success. Discusses France, but also Nantucket.

  14. Maupassant: success.

So of the 14 samples, 8 were definitely in the style of the right author, 5 were mixed successes as they mostly resembled their author but not entirely, and only 1 was a clear failure. With 31 authors to choose from, that’s not an accident.

One Walt Whitman pastiche sample I generated while testing struck me as quite poetic; with line breaks inserted where indicated by capitalization:

And shes my brothers to be put upon me, intense and sound,
All are me. Sounds purified, O sound of the streets!
O landscapes! O still the fierce and the scraping of beauty!
The murderous twinkle of the sky and basement,
How the beasts at first began to bite and the waves near the floor.
The walls of lands discover'd passions,
Earth, sword-ships, enders, storms, pools, limailes, shapes of violent,
Rooters, alarms, the light-starring mail, untold arms, patients, portals, the well-managed number, the bravest farms,
The effect of doubts, the bad ways, the deeds of true signs, the curious things, the sound of the world,
It is of figure and anthem, the common battle rais'd,
The beautiful lips of the world that child in them can chase it
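
The “line breaks inserted where indicated by capitalization” step can be approximated mechanically; a crude sed sketch (my reconstruction, not the exact command used, and it over-splits on mid-line capitals):

```shell
# Insert a newline before each capitalized word, roughly recovering
# verse lines from run-together prose.
printf '%s\n' 'And the waves near the floor. The walls of lands' |
  sed -e 's/ \([A-Z]\)/\n\1/g'
# → And the waves near the floor.
#   The walls of lands
```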

For a more systematic look, I generated samples from all included authors:

The Eliot output was perplexingly bad, consisting mostly of numbers, so I looked at the original. It turned out that in this particular corpus, 10 of the text files had failed to download, and instead Project Gutenberg had served up HTML CAPTCHAs (not cool, guys)! This affected: Coleridge, Dante, Da Vinci, Eliot, Gilbert & Sullivan, Grant, Homer, Kafka, Pepys, & Sherman. (Checking the output, I also noticed that a number of words starting with a capital ‘M’ were missing the ‘M’; I traced this to the tr call intended to strip out control characters, which did not do what I thought it did.) Excluding the corrupted authors, I’d informally rank the output subjectively as:

  • bad: Aristotle, Beowulf, Bible, Chaucer, Jordan, Keats
  • uncertain: Carroll, Wolfe
  • good: Stoker, Paine, Bonaparte, Lafferty, Melville, Doyle, Ryukishi07, Whitman, Machiavelli, Aristotle, Bible

The RNN is somewhat inconsistent: sometimes it’ll generate spot-on prose and other times fail. In this case, good and bad Bible samples were present, and previous Chaucer was fine but the Chaucer in this sample was bad. (This might be due to the high temperature setting, or the messed-up texts.) But overall, it doesn’t change my conclusion that the RNN has indeed learned to use metadata and successfully mimic different authors.

Training with prefixes+suffixes

The RNN seems to learn the connection of the prefix metadata to the vocabulary & style of the following text only at the very end of training, as samples generated before then tend to have disconnected metadata/text. This might be due to the RNN initially learning to forget the metadata to focus on language modeling, and only after developing an implicit model of the different kinds of text, ‘notice’ the connection between the metadata and kinds of text. (Or, to put it another way, it doesn’t learn to remember the metadata immediately, as the metadata tag is too distant from the relevant text and the metadata is only useful for too-subtle distinctions which it hasn’t learned yet.) What if we tried to force the RNN to memorize the metadata into the hidden state, thereby making it easier to draw on it for predictions? One way of forcing the memorization is to force it to predict the metadata later on; a simple way to do this is to append the metadata as well, so the RNN can improve predictions at the end of a sample (predicting poorly if it has forgotten the original context); so text would look something like SHAKESPEARE|...to be or not to be...|SHAKESPEARE.

I modified the data preprocessing script slightly to append the author as well, but otherwise used the same dataset (including the corrupt authors) and training settings.
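
The change amounts to one extra sed substitution appending the author tag; a minimal sketch of the labeling step (illustrative text, not the real pipeline):

```shell
# Label a line with the author as both prefix and suffix, so the RNN is
# pushed to carry the metadata across the whole line to predict the tail.
AUTHOR=SHAKESPEARE
printf '%s\n' 'to be or not to be' |
  sed -e "s/^/$AUTHOR|/" -e "s/$/|$AUTHOR/"
# → SHAKESPEARE|to be or not to be|SHAKESPEARE
```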

My first try at appending resulted in a failure, as it converged to a loss of 1.129 after a week or two of training, much worse than the 1.03 achieved with prefix-only. Sampling text indicated that it had learned to generate random author metadata at the end of each line, and had learned to mimic some different prose styles (eg Biblical prose vs non-Biblical) but it had not learned to memorize the prefix nor even the use of the prefix (!).

A second try with the same settings converged to 1.1227 after 25 epochs, with the same sampling performance.

In a third try, I resumed from that checkpoint but increased the BPTT unrolling seq_length from 50 to 210 to see if that would help it. It converged to 1.114, with suffixes still random. For a fourth try, I reduced dropout from 0.5 to 0.1, which did not make a difference, converging to 1.117 after 8 epochs.

So in this case, training with suffixes did not speed up training, and impeded learning.

While I am not too surprised that suffixes did not speed up training, I am surprised that they prevented learning the prefixes at all, and I don’t know why; this should have been, if anything, an easier task.


I wondered if the same metadata approach could be used to trick the char-RNN into learning classification as well—perhaps if the RNN learns language modeling by trying to predict subsequent characters, it acquires a greater natural language understanding than if it was trained directly on predicting the author?

I fixed the corrupted HTML files and the tr bug, and modified the script to read fold --spaces --bytes --width=3000 (so each line is 3000 characters long) and the author is now placed at the end: sed -e "s/$/\|$AUTHOR/". So the char-RNN is trained to predict each subsequent character, and at the end of 3000 characters, it sees a | and (in theory) will then predict the author. To test the results, one can feed in a short stereotypical piece of text ending in a pipe, and see if it is able to respond by generating the author.
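
Concretely, the fold-then-suffix formatting looks like this at toy scale (width 5 instead of 3000; illustrative text):

```shell
# Fold text into fixed-width lines, then append "|AUTHOR" to each, so
# the author prediction comes only at the very end of a long context.
AUTHOR=BIBLE
printf '%s' 'abcdefghij' |
  fold --bytes --width=5 |
  sed -e "s/$/|$AUTHOR/"
# → abcde|BIBLE
#   fghij|BIBLE
```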

This turned out to be a total failure. After over a week of training, the validation loss had fallen to 1.02, yet when I sampled it, it was unable to classify text.

At best, it sometimes would add random upcased text following the pipe (“|CHAPTER” was common), or random authors (never the right one).

I thought perhaps the penalty for missing the final characters in a line was too small as it represented no more than 0.3% of each line, and so I reduced the line-length down to 500 characters (so the author was now ~2% of each line). This didn’t work either (validation loss of ~1.12, probably due to shorter lines with less context to work with), so I disabled dropout, added batchnorm, and increased the BPTT enough to backpropagate over the entire line.

After another week or two, the validation loss asymptoted at ~1.09, but still no classification performance. (Trained model download.) Here is a sample (adding line-breaks for readability at capitalized words which correspond to linebreaks in the original):

41 Book 40 With patient ones of the seas, the form of the sea which was gained the streets of the moon.
Yet more all contest in the place, See
the stream and constant spirit, that is of a material spirit,
The live of the storm of forms and the first stretch
Of the complexion of the mountains;
The sea fell at the tree, twenty feet wide,
And the taste of a scarlet spot where the captain bears,
She shook the sound the same that was white,
Where the permanent eye of the sea had scarce assembled,
The many such, the beauteous of a subject of such spectacles.
If thou be too sure that thou the second shall not last,
Thou canst not be the exceeding strength of all.
Thou wert as far off as thou goest, the sea Of the bands and the streams of the bloody stars
Of the world are the mountains of the sun,
And so the sun and the sand strike the light,
But each through the sea dead the sun and spire
And the beams of the mountain shed the spirits half so long,
That of the which we throw them all in air.
Think of thy seas, and come thee from that for him,
That thou hast slain in dreams, as they do not see
The horses; but the world beholds me; and behold
The same the dark shadows to the sand,
And stream and slipping of the darkness from the flood.
He that I shall be seen the flying strain,
That pierces with the wind, and the storm of many a thousand rays
Were seen from the act of love to the course.
There was a stream, and all the land and bare
Ereth shall thy spirit be suppos'd
To fall in water, and the wind should go home on all the parts
That stood and meet the world, that with the strong the place
Of thy prayer, or the continual rose,
So that the shape of the brand broke the face,
And to the band of the ring which erewhile
Is turn'd the merchant bride.
I am thine only then such as thou seest,
That the spirits stood in those ancient courses,
And in their spirit to be seen, as in the hard form
Of their laws the people in the land,
That they are between, that thou dost hear a strong shadow,
And then, nor war in all their powers, who purposes hanging to the road,
And to the living sorrow shall make thy days
Behold the strains of the fair streets, and burn,
And the shepherd for the day of the secret tear,
That thou seest so high shall be so many a man.
What can ye see, as sinking on the part
Of this reminiscence of the pursuit?
Behold the martial spirits of men of the rock,
From the flowers of the touch of the land with the sea and the blow
The steamer and the bust of the fair cloud.
The steps behind them still advanc'd, and drew,
As prepared they were alone all now
The sharp stick and all their shapes that winds,
And the trembling streams with silver the showering fires
The same resort; they stood there from the plain,
And shook their arms, sad and strong, and speaks the stars,
Or pointed and his head in the blood,
In light and blue he went, as the contrary came and beat his hands.
The stars, that heard what she approach'd, and drew
The shore, and thus her breast retraced the rushing throng:
"And more with every man the sun
Proclaims the force of future tongues
That this of all the streams are crack'd."
"The thought of me, alas!" said he,
"Now that the thirst of life your country's father sang,
That in the realms of this beast the prince
The victor from the true betray beginnings of the day."

The generated text is semi-interesting, so it’s not that the RNN was broken. It was focused on learning to model the average text.

So it would seem that the classification signal was not strong enough to cause learning of it. The worsened validation score suggests that this approach simply won’t work: the longer the lines, the less incentive there is for classification, but the shorter the lines, the worse it learns to model the regular text.


Can we learn multiple metadata prefixes? Like an author and then a transform of some sort—in music, a useful transform might be time signature or instrument set.

A simple transform we could apply here is upcasing and downcasing every character, so we might have a set of 6 prefixes like Bible+upcase, Bible+downcase, Bible+mix, etc, written as BIBLE|U|, BIBLE|D|, BIBLE|M|, and to help enforce abstraction, also reverse ordering like U|BIBLE|, giving 12 total prefixes (3x2x2). The interesting question here is whether the RNN would be able to factor out the transformations and learn the up/mix/downcase transformation separately from the Bible/Jordan difference in styles. (If it thought that Jordan upcased was a different author, and to be learned differently, from Jordan downcased, then we would have to conclude that it was not seeing two pieces of metadata, Jordan+upcase, but seeing it as one JORDANUPCASE, and a failure of both learning and abstraction.) But if we included each of the 12 prefixes, then we wouldn’t know if it had managed to do this, since it could have learned each of the 12 separately, which might or might not show up as much worse performance. So we should leave out two prefixes: one to test out generalization of casing, and one to test out swapping (dropping 1 from Bible and 1 from Jordan to be fair). At the end, we should get an RNN with a validation loss slightly worse than 0.9763 (the extra transformation & keyword must cost something), and one which will hopefully be able to yield the correct output for the prefixes JORDAN|U| and C|BIBLE|.
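
Generating the tagged case variants for one line is a pair of tr calls; a sketch of the idea (my reconstruction, not the actual preprocessing code):

```shell
# Emit the three case variants of a line, each tagged with author and
# transform (U = upcased, D = downcased, M = mixed/original).
line='In the beginning God created the heaven and the earth.'
printf 'BIBLE|U|%s\n' "$(printf '%s' "$line" | tr '[:lower:]' '[:upper:]')"
printf 'BIBLE|D|%s\n' "$(printf '%s' "$line" | tr '[:upper:]' '[:lower:]')"
printf 'BIBLE|M|%s\n' "$line"
```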

The first version, sans dropout, got to a loss of 0.7969 (!). Contamination or leakage of the validation set? But since the versions in the validation set could only be different-cased versions of training text, wouldn’t the RNN have had to learn the casing transformation for that to help, in which case it’s not really leakage at all? After it hit a limit at 0.79 and started turning in losses of 0.8+ for hours, I tried retraining it with some dropout, but the loss exploded and did not shrink even after training all night, so I restarted with a fresh RNN and some dropout, getting a more stable training result.

Unfortunately, it did not work. Using the unobserved pairs showed it had not learned to generalize.


So some lessons here are:

  1. use a sufficiently large RNN; 500 neurons may be adequate to model a single author like the Bible or Shakespeare, but is too small to learn many authors, despite the savings from sharing parameters across authors
  2. train to convergence; the differences between authors are smaller than the difference between the average of authors & random noise, and the metadata will only show its worth at the end of training, when the loss has reached ~1
  3. keep the data relatively balanced, or the RNN will spend all its effort trying to learn the patterns & vocabulary of the most common kind of input

Further work:

  • multiple metadata: author/genre/work, perhaps. The RNN might learn to disentangle the various factors, so one could generate samples from BIBLE|RELIGION|RAYMOND_CHANDLER|. Music in ABC notation would be another target as ABC supports genre metadata and there might be useful ABC databases.

  • visualize the RNN hidden state to look for ‘grandmother neurons’; could such neurons be used to create the equivalent of DeepDream or Neural Style and ‘transfer’ the style of, say, Biblical prose to hard-boiled detective stories?

    My belief is that a genre/author-classification+unsupervised-prediction char-RNN may be able to do style transfer. This is because such a char-RNN should learn a clean separation between the metadata (style) and the semantics (content).

    In genre/author classification, the hidden state incrementally builds up an inferred genre/author as it processes the text sequence; in unsupervised prediction, the hidden state incrementally builds up a summary of past semantics+syntax as it tries to predict the next character. The hidden state representing the best current guess for classification will be mostly static because it will quickly reach high confidence as to the genre/author and then the neurons encoding that information must be protected long-term from being modified; in contrast, the semantics+syntax hidden state is changing every time-step and if its distributed encoding overlapped with the genre/author distributed encoding, it would quickly forget its original conclusions about genre/author.

    This opposition should yield a trained char-RNN with a few neurons devoted solely to genre/author and the rest devoted to semantics+syntax encoding.

    Given such a clean split, something analogous to the style transfer CNN should be possible. First, figure out which neurons are which; then feed in texts from different genre/authors and extract the hidden state corresponding to each genre/author, eg Bible vs Wheel of Time. To convert a piece of Wheel of Time prose into Biblical prose or vice versa, feed in a desired piece of text to produce the genre/author and semantics+syntax hidden state vectors; now, hardwire the semantics+syntax vector and do gradient ascent on the input text to gradually turn the original genre/author hidden state into the target genre/author hidden state; once the transformed text yields both the target genre/author hidden state but also the same semantics+syntax hidden state, it has been converted. Hypothetically, to the extent that the char-RNN has learned English semantics and prose styles, this would convert text into different styles while preserving the semantics.

    This might not work with a char-RNN doing character-level prediction if the learned semantics+syntax turns out to be weak enough that a converted piece of text only bears a faint resemblance to the original. (Perhaps the semantics don’t add enough predictive power, or the char-RNN is small enough that it must use all its capacity learning vocabulary etc.) If it doesn’t, some other approaches might be to train a classification char-RNN, providing the style metric, and also a sequence-to-sequence autoencoding RNN to provide a semantics encoding; then set the style target to be the desired style, hardwire the autoencoder, and use them jointly as a loss to do gradient descent on. RNNs can also be combined with CNNs, and this may allow a more direct borrowing of the original style transfer algorithm.


Geocities char-RNN

Geocities (1994-2009) was an Internet service for hosting personal webpages which featured a wide range of idiosyncratic and unusual content. Geocities Forever is a website created by Aanand which features text generated by a small CPU-trained 3x512 char-RNN on a small 50MB sample of the raw HTML from the ArchiveTeam Geocities corpus. The generated HTML is amusing but also shows some weaknesses in generating interleaved English/HTML, which I thought was connected to undertraining on a small corpus—based on my earlier experiments with char-RNN models of CSS and multiple English authors, I know that char-RNNs are capable of switching languages smoothly. During October-November 2016, I attempted to train a larger 2x3000 RNN with a 1GB+ sample using torch-rnn, and ran into issues:

  • the larger corpus had quality issues related to some files being present many times, including 1 file which was present in several thousand copies
  • training repeatedly “bounced”: after quickly reaching low training & validation losses and generating high-quality text samples, the error would skyrocket & sample quality plummet (or samples could not be generated at all due to malformed probabilities)

Cleaning and shuffling the corpus reduced the quality issue, and substantially reducing the learning rate helped avoid the bouncing problem, but ultimately the goal of high-quality text samples was not reached before my laptop died and I was forced to stop GPU training. Training a char-RNN on very large text corpuses is more difficult than I thought, perhaps because the variety of content overloads the RNN model capacity and can create catastrophic forgetting unless trained for a very long time at low learning rates for many epochs.

Having downloaded the torrent, the 7zip-compressed files are laid out according to the original Geocities ‘neighborhood’ structure and must be extracted.

Data extraction

The bulk of the torrent is image files and other media content, while we want only the HTML, so we extract just those; and to keep the content easily readable and avoid any possible binary corruption or weird characters, we convert everything to ASCII before writing to disk:
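
The ASCII coercion is the same iconv -c trick as in the Project Gutenberg pipeline; the archive loop below is schematic (paths and 7z arguments are assumptions about the torrent layout), but the iconv stage is live:

```shell
# Schematic extraction loop (assumed layout; not the exact command):
# for ARCHIVE in geocities/*.7z; do 7z x "$ARCHIVE" '*.html'; done
#
# The ASCII-coercion stage: iconv -c silently drops any character
# that cannot be represented in ASCII.
printf 'caf\xc3\xa9 r\xc3\xa9sum\xc3\xa9\n' | iconv -f utf-8 -c -t ascii
# → caf rsum
```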

The total HTML content is ~9GB, more than adequate.

A quick inspection shows that the HTML is exceptionally verbose and repetitive due to injected Geocities HTML and copy-paste. What sort of training loss could we expect from the content? We can look at the bits-per-character performance of a compression utility:

An LZMA/xz baseline: xz manages 1.194bpc; in terms of a negative log loss, xz managed a loss of 0.69.
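
Bits-per-character is simply compressed bits over original characters; a sketch with a hypothetical `bpc` helper:

```shell
# Bits-per-character achieved by xz -9 on a file: compressed bytes * 8
# divided by the original character count.
bpc() {
  local orig comp
  orig=$(wc -c < "$1")
  comp=$(xz -9 --stdout "$1" | wc -c)
  awk -v c="$comp" -v o="$orig" 'BEGIN { printf "%.3f\n", (c * 8) / o }'
}
```

A figure like the 1.194bpc above comes out of exactly this kind of calculation.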

RNNs can model very nonlinear and complicated phenomena, but they also have tiny hidden-state/memories and so suffer in comparison to a compression utility which can store long literals in RAM (xz -9 will use up to 4GB of RAM for context). So if the RNN can reach 0.69, that would be acceptable.

Another way to put it: how many lines are repeated? A comparison of wc --lines and sort --unique | wc --lines shows that a surprisingly low number of lines are unique, suggesting even more repetition in the HTML parts than I expected.
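
That check is a two-liner; a sketch with a hypothetical `dup_report` helper:

```shell
# Compare total vs unique line counts to gauge corpus repetition.
dup_report() {
  printf 'total: %s\nunique: %s\n' \
    "$(wc --lines < "$1")" \
    "$(sort --unique "$1" | wc --lines)"
}
```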

torch-rnn’s preprocess.py script, and its training, store all data in RAM, so using all 9GB turns out to be infeasible; a 1GB sample turns out to use an acceptable ~34% of my laptop’s 16GB RAM for preprocessing & training.


My initial set of training hyperparameters:

Performance was bad: training loss ~3.5, validation loss after 2 days: 4.61/4.69/4.49. Not good! Are 3 layers too unstable? Is a minibatch size of 2 too unstable? (Increasing the minibatch requires decreasing RNN size, because there’s nothing left to cut.) Not enough BPTT? Let’s try switching to 2 layers, which frees up a ton of memory for the minibatch & BPTT:

It trains within 1000 batches to ~0.6 training loss, often below the xz bound, but the validation loss explodes! There’s also odd training-loss behavior: it seems to bounce from the low training-loss regime past 1 to as high as the 3s for long periods.

If it is not overfitting in general, it could be non-stationarity of the input and overfitting on specific parts; preprocess.py doesn’t do any shuffling. We can force shuffling by going back and shuffling the extracted files, or on a line-level basis by re-preprocessing the corpus:
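
The line-level version is just `shuf`, as in the multi-author pipeline; a self-contained sketch (hypothetical filenames):

```shell
# Shuffle the corpus at line granularity so minibatches mix content
# from across the whole crawl instead of long contiguous runs.
seq 1 100 > corpus.txt        # stand-in for the preprocessed lines
shuf corpus.txt > input.txt   # same lines, randomized order
```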

And by increasing BPTT & dropout:

Still we see the same ‘bounce’ from better-than-xz predictive performance to 2-3 training loss. To check if it was size that was the problem, I went back to Aanand’s original 3x512 architecture:

After ~9 hours, it had reached a validation loss of 1.05 and generated output looked pretty good, but then it bounced overnight and the output became garbage again. (For 1GB and a 3x512 RNN, 1 epoch takes somewhat over 1 day.) It is still acting like it’s overfitting. Why?

Data cleaning

I took a closer look at the data and noticed something odd while skimming through it: it’s not just the HTML boilerplate that’s repeated, but many parts of the content as well (eg searching for the word “rude” turns up the same lengthy complaint repeated hundreds of times in the sample). Are the excellent xz compression, the occasional excellent RNN training loss, and then the ‘bounce’ all due to content being repeated many times, leading to severe overfitting and then extremely high error when it finally runs into some of the unrepeated content?

There are several possible sources of repetition: the original find command ran on all the 7z archives in the torrent, including the multipart archives, so possibly some archives got decompressed multiple times (perhaps 7z, given an archive like “archive.7z.8”, goes back and decompresses starting with “archive.7z.1”?). If so, then rerunning it but writing all files to disk will make the duplicates go away (the duplicates will simply get decompressed & overwritten repeatedly). And if the repetition is due to multiple identical files with different names/paths, then there will still be a lot of duplication, but a file-level deduplication tool like fdupes should detect and delete them.

For file-level duplicate deletion and recreating the corpus:
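A coreutils-only sketch of the dedup-and-rebuild step (`fdupes -rdN` does the file-level deduplication more robustly; paths are illustrative):

```shell
# Delete all but the first file of each group of byte-identical files:
dedup_files() {
    find "$1" -type f -print0 | xargs -0 md5sum | sort |
        awk 'seen[$1]++ { print substr($0, 35) }' |  # hash = 32 chars + 2 spaces
        xargs -r -d '\n' rm --
}
# Concatenate the surviving HTML files into a fresh training corpus:
rebuild_corpus() {
    find "$1" -type f -name '*.html' -print0 | xargs -0 cat > "$2"
}
# e.g.: dedup_files ./geocities/ && rebuild_corpus ./geocities/ geocities.txt
```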

After extracting to disk (so that repeated decompressions simply overwrote each other) and checking for & deleting duplicated files, I restarted training. After 20k minibatches, training loss was steady in the 2–3 range, validation loss continued to explode, and I could not even sample because the output was so ill-behaved (the multinomial probability problem). So the problem was still not solved, and a grep for “rude” indicated the redundancy problem was still present.

I went back into the original extracted Geocities HTML files looking for that weird ‘rude’ page which appears thousands of times; an ag search indicated that it shows up ~31k times in two directories:

  • ./geocities/YAHOOIDS/m/i/mitzrah_cl/ (5.2GB, 334595 HTML files)
  • ./geocities/YAHOOIDS/T/o/Tokyo/6140/myself/sailormars/karen/site_features/hints_n_tips/site_features/www_stuff/www_resources.html (0.527GB, 33715 files)

Looking at filenames, there are also many possibly duplicated pages:
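Duplicate-looking filenames can be surfaced with a quick frequency count; a sketch using GNU find:

```shell
# Tally the most common HTML basenames under a directory; heavily repeated
# names like "bio.html" or "myaward.html" float to the top:
count_basenames() {
    find "$1" -type f -name '*.html' -printf '%f\n' | sort | uniq -c | sort -rn | head
}
# e.g.: count_basenames ./geocities/
```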

I could have deleted everything except one random “bio.html” or “myaward.html” etc, but first I tried deleting everything in mitzrah/ and myself/. This made the filenames look much more diverse; spot checks of the files named “sb.html” & “everclear.html” suggest that the remaining duplicated file names represent legitimate, non-repeated content which happens to have similar filenames due to serving similar roles in people’s personal webpages.

Skimming the final corpus also doesn’t show any blatant repetition.

The bounce continues

After this data cleaning, I restarted training from the last checkpoint with the same settings. 100,000 minibatches / 4 epochs later, sampling still fails and validation loss is in the 100s! Restarting with higher dropout (0.8) didn’t help. Restarting with 0 dropout didn’t help either—after 50,000 minibatches, a validation loss of 55.

I thought that the 3x512 might simply lack model capacity, and that Aanand’s original worked because he used a small corpus which was not too diverse.

Trying something intermediate between 3x512 and 1x3000: with a 1x2000 RNN, after 30k minibatches / 0.7 epochs, validation loss is ~0.98 and generated samples look good. So the larger, flatter RNN is handling it better than the smaller, deeper one.

Unfortunately, the bounce is still present—initially a bounce around epoch 0.84, with generated samples much worse. After another 65k minibatches, very high quality samples, but then it bounced in training at a different place in the dataset—epoch 0.04 (after a restart due to a crash). In previous training, the data located at ~4% was perfectly well behaved and easily modeled, so it’s not the data’s fault but the RNN’s, suggesting it’s still overfitting. If so, the learning rate may be too high, so I decreased the learning rate 4x, to 8e-3.

The lower learning rate RNN still bounced, but not quite as badly as usual, with steady validation loss ~3 after a week.

Unfortunately, further progress by the RNN, and the performance of restarting from scratch with a much smaller learning rate, are unknown: on 26 November my Acer laptop died (apparent motherboard failure, possibly due to the stress of the months of GPU training of various char-RNN and other deep learning models), and due to problems with my backups, I lost data back to 14 November, including the training records & latest checkpoints.

Since the Geocities char-RNN wasn’t going anywhere & I worried the training may have contributed to my laptop’s failure, I stopped there. My guess is that good results could be obtained with a smaller corpus (perhaps 500MB) and a large char-RNN like 2x3000 trained with very low learning rates, but it would require at least GPU-weeks on a top-end GPU with more than 4GB RAM (to allow larger minibatches), and isn’t sufficiently amusing as to be worthwhile.

Finetuning the GPT-2-small Transformer for English Poetry Generation

In February 2019, following up on my 2015–2016 text-generation experiments with char-RNNs, I experiment with the cutting-edge Transformer NN architecture for language modeling & text generation. Using OpenAI’s GPT-2-small model pre-trained on a large Internet corpus and nshepperd’s finetuning code, I retrain GPT-2-small on a large (117MB) Project Gutenberg poetry corpus. I demonstrate how to train 2 variants: “GPT-2-poetry”, trained on the poems as a continuous stream of text, and “GPT-2-poetry-prefix”, with each line prefixed with the metadata of the PG book it came from.

With just a few GPU-days on 1080ti GPUs, GPT-2-small finetuning can produce high quality poetry which is more consistent than my char-RNN poems & capable of modeling subtle features like rhyming.

OpenAI announced in February 2019, in “Better Language Models and Their Implications”, their creation of “GPT-2-large”, a Transformer2 neural network 10x larger than before, trained (like a char-RNN, with a predictive loss) by unsupervised learning on 40GB of high-quality text curated by Redditors. GPT-2-large led to large improvements over GPT-1’s natural language generation, and demonstrated high performance on NLP tasks it was never trained for (see the paper for more details: “Language Models are Unsupervised Multitask Learners”, Radford et al 2019). By “large improvements”, I mean that the best samples, like the ones included in the OA announcement, have started to reach an uncanny valley of text, capable of telling entire semi-coherent stories which can almost fool a sloppy reader—certainly, the verisimilitude is better than any char-RNN output I’ve seen. (A dump of many more samples is available on GitHub.) The full GPT-2-large model was not released, but a much smaller one, 1/10th the size, was, which I call “GPT-2-small” to avoid confusion.

GPT-2-small poetry

Naturally, people immediately used GPT-2-small for all sorts of things, and I applied it myself to generate surreal anime plot summaries & dialogue for “This Waifu Does Not Exist”.3 Even more naturally, just as with char-RNNs, GPT-2-small works well for poetry:

The quality of the results is limited by only having access to GPT-2-small; that can’t be fixed (yet). But quality is also reduced by GPT-2-small having been trained on all kinds of text, not just poetry. This means sampling may quickly diverge into prose (as seems to happen particularly easily when given only a single opening line, which presumably makes it hard for the model to infer that it’s supposed to generate poetry rather than the much more common prose), and it may not have learned poetry as well as it could have, since poetry presumably made up a minute fraction of its corpus (Redditors not being particularly fond of so unpopular a genre as poetry these days). Finetuning or retraining the released GPT-2-small model on a large poetry corpus would solve the latter two problems.

The poetry samples above did not exploit finetuning because OpenAI did not provide any code to do so, and declined to provide any when asked. Fortunately, nshepperd wrote a simple finetuning training implementation, which I could use both for adding more interesting samples to my TWDNE and for retraining on poetry corpuses to compare with my previous char-RNN poetry attempts back in 2015–2016 (see the top of this page).

Project Gutenberg Poetry Corpus

For the poetry corpus, Allison Parrish’s public domain “A Gutenberg Poetry Corpus” (“approximately three million lines of poetry extracted from hundreds of books from Project Gutenberg”) will serve admirably. A few other possibilities surface in Google Dataset Search, like “Poems from poetryfoundation.org”, but nothing particularly compelling.

As far as the text formatting goes, GPT-2-small is flexible: you can dump pretty much any text into a text file to use as the corpus, but some text formats are better than others. You want something which is as regular as possible (in both syntax & semantics), which is as close as possible to the kind of text you want generated, and which wastes as few symbols as possible. Regularity makes learning easier, and you don’t want to have to massage the output too much; on the other hand, GPT-2-small has a narrow ‘window’ and no memory whatsoever, so if each line is padded out with a lot of formatting or even just whitespace, one would expect that to considerably damage output coherence—as most of the fixed ‘window’ is wasted on meaningless repetitive whitespace, while other changes like replacing newlines with the poetic convention of ‘ / ’ are worse than nothing (since a newline is 1 character vs 3, and maximally dense).

The PG corpus has a strange format: each line is a separate JSON object, consisting of one line of poetry and a numeric ID for the work it’s from. Fortunately, the file as a whole is in order (if the lines were out of order, training on them would destroy the long-range language modeling which is the Transformer’s raison d’être!), so to turn it into a clean text file for training on, we can simply query it with jq and strip out the remaining formatting. This provides a pretty good format overall: the newlines are meaningful, no symbols are wasted on leading or trailing whitespace, and it looks like what we want. (It is imperfect in that metadata/formatting we would like to have, such as author or poem title, is absent, while things we would prefer to omit, like the prose prefaces of books or annotations, are present, but it is hard to see how to fix those easily.)
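A sketch of the jq conversion, assuming (as in Parrish’s corpus) that each JSON line stores the poem text in a field named `s`; `jq --raw-output` both extracts the field and strips the JSON quoting:

```shell
# Convert the JSON-lines corpus into plain newline-separated poetry text:
to_plain_text() {
    jq --raw-output '.s' "$1"
}
# e.g.: to_plain_text gutenberg-poetry-v001.ndjson > gutenberg-poetry.txt
```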

Setting up the GPT-2-small training environment & obtaining the poetry corpus:
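One possible setup, assuming nshepperd’s fork and Parrish’s hosted corpus (URLs & filenames reflect early-2019 and may have changed since):

```shell
# Clone nshepperd's GPT-2 finetuning fork & install its dependencies:
git clone https://github.com/nshepperd/gpt-2.git && cd gpt-2
pip3 install -r requirements.txt
# Fetch the released GPT-2-small weights:
python3 download_model.py 117M
# Parrish's corpus is distributed as gzipped JSON-lines:
wget http://static.decontextualize.com/gutenberg-poetry-v001.ndjson.gz
gunzip gutenberg-poetry-v001.ndjson.gz
```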

There is an additional step before beginning training. GPT-2-small works with text in a “byte-pair encoding”, which is somewhere in between a character embedding & a word embedding. The point of this BPE encoding is that it is somewhat more efficient than raw characters, because it can chunk more common sub-words or phrases & this gets more complete words or phrases into the Transformer’s fixed ‘window’ of n symbols, but BPE still assigns symbols to individual letters, and thus arbitrary outputs can be generated, unlike word-level NNs which are more compact but trade this off by having a restricted vocabulary of m words seen in the training corpus and must treat everything else as the unknown token <UNK> (especially bad for rare words like proper names or variants of words like pluralization or tenses). The training code will encode the text corpus at startup if necessary, but for 117MB of text this is so slow that it is worth the extra work to run the encoding process in advance & store the results before training on it:
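With nshepperd’s repo, the pre-encoding step looks something like this (script name & arguments may differ across versions):

```shell
# BPE-encode the text corpus once up front, caching the token IDs to .npz:
PYTHONPATH=src python3 encode.py gutenberg-poetry.txt gutenberg-poetry.txt.npz
```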


Then training proper can begin; my Nvidia 1080ti4 can fit a minibatch size of 2 (GPT-2-small is still a large model), and I’d rather not see too much output so I reduce the frequency of checkpointing & random text generation:
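A hypothetical training invocation (flag names per early-2019 versions of nshepperd’s train.py; double-check against its actual argument list, per the “fire” warning below):

```shell
# Finetune GPT-2-small on the pre-encoded poetry corpus; minibatch of 2
# fits in the 1080ti's VRAM, and checkpointing/sampling happen less often:
PYTHONPATH=src python3 train.py --dataset gutenberg-poetry.txt.npz \
    --model_name 117M --batch_size 2 \
    --save_every 10000 --sample_every 10000
```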

WARNING: the Python library “fire” used in the OA GPT-2 code is very treacherous—it will not error out or even warn you if you typo a command-line option! Double or triple-check any new options you set against the available arguments defined by train_main in train.py, and keep this gotcha in mind if setting an option doesn’t appear to be doing anything. (I discovered this while being puzzled why --batchsize 32 did not lead to instant out-of-memory errors for training; similarly, if you make the mistake of sampling with the option --top 40, what you are actually doing is sampling with the default --top_k 0. Oops.) While nshepperd has removed use of “fire” in favor of saner CLI options, watch out for this if you are using the original OA code or other derivatives.

Some hyperparameters could use tweaking:

  1. temperature: In the original nshepperd code release, the default temperature setting for the samples during training, 1.0, is not the usual 0.7 everyone uses for GPT-2 prose sampling. It turns out that for poetry we don’t want 0.7 anyway, as it forces too many repeated lines, and 0.9–1 is much better; so use a temperature in that range when generating samples
  2. learning rate: in nshepperd’s code, the Adam SGD learning rate is left at its TensorFlow default of 0.001, which works initially, but appears to be much too high for this purpose (perhaps because the minibatch is so tiny). After training overnight, the loss was not decreasing below 2.5, so I decayed it manually to 0.0001 & resumed training (editing line 136 of train.py to read tf.train.AdamOptimizer(learning_rate=0.001*0.10)), eventually decaying it again (to 0.001*0.0001) to get it down to a loss of ~1.95. (nshepperd has since added a --learning_rate option.)

GPT-2-poetry Samples

After training for an hour or two, a sample:

Overnight samples during training:

# ...[115999 | 2375.96] loss=1.10 avg=2.31
# ======== SAMPLE 1 ========
# The sun is gone, and the night is late,
# The lights are few, and the chimneys close,
# And yet the moon hangs lonely
# Above the town, where the tides divide,
# On the shores of the silent tide.
# The tide is still, and the tide is at rest,
# And the sands are many, and many a breast
# And breast of the tide is weary,
# And far away the tide's last foam
# Stands silent under the waves;
# But the restless tide creeps slowly,
# And the tide is strong, and the tide is deep,
# And the sea, like a white-faced phantom,
# Floats motionless over the deep.
# The tide is past, and the tide is at rest,
# And the sands are many, and far apart,
# And the sands are many, and far apart,
# And the sea, like a white-faced phantom,
# Floats motionless over the sea.
# 'Tis evening, and he comes to me,
# With scent of flowers, from the village lane;
# And the wind is blowing faint and low,
# And the dew, like tears, on the hill-tops pale,
# Is drifting faint and cold from the eastern sky,
# As he floats adown the western wave;
# And the trees that bend their branches sigh,
# And the winds that waft a sigh,
# Are fading dimly away,
# As the tide goes wandering,
# Tossing waves of sand.
# And the cloud that floats and swells
# Has passed, like the wave on the bay,
# And the moon is slowly sinking low,
# As he sinks adown the western bay.
# There are waters blue and bright
# On the breast of the silent sea;
# And the foam is like a bright
# And lingering pearl of the sea.
# Oh, I would my heart might still
# Ever be where the waves are free.
# Where the foam-flakes fall and swell,
# Falling into the silent sea;
# Where the foam-flakes fall and swell,
# Falling into the silent sea;
# Where the foam-flakes fall and swell
# Falling into the silent sea.
# Oh, I would my heart might still
# Ever be where the waves are free.
# Far away from the sea and the tide,
# In a quiet little bay,
# Where the foam-flakes fall and swell,
# And the rocks are telling of a day
# When I would my heart could stay;
# Where the foam-flakes fall and swell
# Falling into the silent sea;
# Where the foam-flakes fall and swell
# Falling into the silent sea.
# Oh, I would my heart might still
# Ever be where the waves are free.
# With a love that never will,
# With a passion never still,
# With a memory never still
# For a love that never will,
# With a sorrow never still
# For a love that never will,
# With a passion never still,
# With a passion never still,
# With a passionate love that never will,
# With a sorrow never still
# For a love that never will.
# Oh, I would my heart might still
# Ever be where the waves are free.
# With a love that never will,
# With a passion never still,
# With a sorrow never still,
# With a passion never will,
# With a passion never will,
# With a sorrow never still,
# With a love that never will.
# With a love that never will,
# With a passion never will,
# With a sorrow never still,
# With a love that never will,
# With a sorrow never will,
# With a sorrow never still,
# With a love that never will,
# With a grief that never will,
# With a sorrow never will,
# With a love that never will.
# Oh, I would my heart might still
# Ever be where the waves are free.
# With a love that never will,
# With a sorrow never still,
# With a sorrow never still,
# With a sorrow never will,
# With a love that never will,
# With a sorrow never will,
# With a sorrow never will.
# With a love that never will,
# With a sorrow never will,
# With a sorrow never still,
# With a sorrow never will,
# With a sorrow never will,
# With a love that never will.
# With a love that never will,
# With a sorrow never will,
# With a sorrow never still,
# With a sorrow never will,
# With a sorrow never will,
# With a sorrow never will,
# With a sorrow never will,
# With a sorrow never will,
# With a love that never will,
# With a sorrow never will.
# With a love that never will,
# With a sorrow never will,
# With a sorrow never still,
# With a sorrow never will,
# With a sorrow never will,
# With a grief that never
# [116000 | 2385.67] loss=3.33 avg=2.32
# ...
# [133999 | 5919.61] loss=2.89 avg=2.21
# ======== SAMPLE 1 ========
# "When the night is gone and the sun goes out,
# "There goes my bride and my lover:
# "Can she never come to her wedding-ring
# "With the light of the seven candles?
# "Oh! to bed with the bridegroom and bride,
# "Or to lie with my lover!
# "He shall not come to my bed at night
# "For a little while longer.
# "She shall not come to my chamber
# "For a little longer."
# The night is spent, the sun has gone;
# The bridal of the bride hath done;
# The bridal of the bridegroom comes
# To wed the bridegroom's attour.
# I come to see the wedding-feast.
# He cries to me--"Wilt seek the bride,
# "Wilt seek the bridal?"
# "Oh! I have sought her mother's bower
# "And never found her! oh! my flower!
# "Would we should love as brides should do!
# "I shall not find her!"
# The bridegroom at the bridegroom's door
# Gave his bride a ring and a prayer.
# "Now, bridegroom, sing a bridegroom loud!
# "I shall not find her!"
# The bridegroom at the bridegroom's door
# Gave his bride a ring and a prayer.
# What means the bridegroom or the bride?
# The bridegroom's bridegroom waits to ride.
# And looks with wonder at the bride;
# "And does she dream?"
# "Oh! I have dreamed!
# "She dreams of my youth,
# "As one that hears--it cannot be -
# "The story of a marriage vow!"
# And a voice answers:
# "The story of a marriage vow!"
# And the words reach the bridegroom's door
# As the bridegroom at the bridegroom's door
# Kisses the ring with a bridegroom's kiss on his cheek.
# They are wed! They are wedded!
# Each is in his bridegroom's bower;
# Each hath his bride in his bosom now!
# And each hath his bride in his heart.
# She is wedded!
# With each is the bridegroom's bride!
# And love is a bridegroom's bride!
# She is wedded!
# With each is the bridegroom's bride!
# "They are wedded!
# They are wedded!
# They are wedded!"
# A man from a fair tower, where the birds of the air
# Fairer and fainter and fairer the flowers and the trees,
# When the sweet light fades from the gardens of day he found
# Came he in quest of a maiden, whose form was wildly wild,
# So she came at the summons. He came, and stood by her side;
# And he gazed on the dream of her marvellous face, and smiled;
# And he said:  "I came by the river's side, when the day was still
# And I think of my bridegroom--he--whom ye all have a will!
# She is wedded!
# And he loves me not, my bride!
# And he looks at her eyes with a love that seems to him divine,
# Then he pines, and thinks of her eyes with an inward passionate shine
# They are wed! They are wedded!
# By the river's side they are, bridegroom, led!"
# He hath summoned her maiden with the will, and she answers
# That he needs her for his bridegroom still.
# A man from a fair tower, where the birds of the air
# Fairer and fainter and fairer the flowers and the trees,
# What is the meaning of that--no--is it, Lord, where ye bear
# All the wonders of the world in this fair maiden's hair?
# What doth the meaning of that--no--nor what doth she know,
# She is wedded!
# What doth the meaning of that--no--she hath chosen so?
# She is wedded!
# She is wedded!
# She is wedded!
# She is wedded!
# She is wedded!
# She is wedded!
# She is wedded!
# And the will is good,
# In the light of days,
# In the heat and stress,
# It is wise and wise
# To be wedded!
# She is wedded!
# And the love is good,
# In the light of days,
# It is hard to live
# And bear the wreath!
# And he girdeth her to his bosom, and thinks of her
# As she stands, all white, in the open air,
# With the full moon facing her face; and her breath
# Cries and whispers and wails and cries
# ...[134000 | 5929.46] loss=2.12 avg=2.21
# ...[135998 | 6934.07] loss=1.96 avg=2.16
# [135999 | 6934.58] loss=3.06 avg=2.17
# ======== SAMPLE 1 ========
# re
# Their love and beauty are as one in dream,
# A visible sign of the things that were.
# But I have seen these things by all men's eyes,
# Felt them as kindred of man's earthly life,
# And with the instinct of the unseen spirit
# Been caught by likeness of the thing he is.
# My love is but the wind, and I have blown
# From earth to where I am, and I have seen
# The things that no man dreamed of; yet at last
# I know by my soul's sense of a sense of things
# That are not, and may be, but the things that were;
# And yet I know these things are not, but are
# As earth and heaven, if earth and heaven and hell
# Are but the same things that it seems. Yea, then
# I am the wind. God knows the ways of men,
# He knows the insensate secrets of delight,
# And they are mysteries, if there be any to be seen."
# But with that word the wind in wonder strode.
# He heard the rustle of the leaves, and saw
# The shadows move about him, and he leaned
# Against the doorway like a god, and knew
# The inner meanings of the leaves and streams.
# There where the trees lie down at their root-holes,
# There where the wind smells of the blossoming boughs,
# He saw the grass, and felt the green blades come,
# As if it were the buds and boughs upon air,
# And heard the green birds sing. He saw the fields,
# The trees, the rivers, and the flowers within,
# The birds, the grasses, and the living things,
# And the strange river on the shore that rolls
# Through all its quiet marge into the sky.
# There let him live till time should come, and then
# Let love be like the heaven, and we be one
# To love, and not be one, being all in all.
# And if he had not done me the good work
# Had it been well not I. The things that he said
# Should never be fulfilled by simple sense;
# For all must have a meaning in themselves.
# But he that works out of his mind is one
# With whom the things that are and are not are,
# And makes them meet and good. 'T were a good thing
# For him to work and win for me, and so
# If he were not I would have it all.'
# But he that lives and not lives in the world
# Was not more worthy of the hand of Fate,
# And knows life's meaning, and would seek for it
# Through failure, and in death's despite. For him,
# Who hath been stricken with me through the brain,
# Forget to tell me how his brother, he
# Whom he had saved and murdered--so let it be
# By some great memory left.
# But at last,
# As I said this, he saw me, and he said
# To one, whose face was grey with tears in me,
# "What is it? let me tell you who I am.
# Do you see the things that you have seen before?
# What is it?"
# "They are more wise
# Than wise men think of wisdom and good will,"
# Replied the other. "What I deem is good.
# The gods are good to mortals as they are,
# And they know well whereby we are born: but they
# Who have loved God and died to him the most
# Of all the gods are fallen into ill things:
# For God we know is good, and hath not been,
# And therefore must be, so it be, with men
# Who love, and love because we loved them not.
# Alas, I do not think that God alone
# Hath power over the earth to let the gods
# Face to face with the world. I hate at times
# The gods that made them: the gods that knew
# Their names are our own gods, and would not know
# One other reason, for I have the power,
# And all the gods are fallen into ill things."
# Then she said to me, "What may have been
# To have known, before I came into this land
# To find you in some other place and knew you,
# And know you, seeing so many and strange,
# And knowing such a godlike way to go
# Among the gods and suffer such long-sought.
# I can take my crown of gold and wear a garland,
# Take some crown for my sake, and the happy crown
# And let it be for all the years long held
# That I have known, and felt so like a god
# Some few suns live. My heart is all in all
# To live again, my life upon earth dead."
# So I said to the god that loved me well
# And longed to have him come back into my prayers,
# [136000 | 6944.39] loss=3.15 avg=2.18
# ...

The loss here is the usual cross-entropy we often see in architectures like a char-RNN. Typically, the best text generation results come when the model has trained down to a cross-entropy of <1, while 2–4 tends to be incoherent gibberish; in this case, GPT-2-small’s original poetry-modeling capability is not too shabby (as demonstrated by the various prompted samples), and it shows decent poetry samples starting around a loss of ~3.5. Given how large & powerful GPT-2-small is, even with this much poetry to work with, overfitting remains a concern—memorizing poetry is not amusing; we want creative extrapolation or mashups.

For this model & dataset, I trained for 519,407 steps to a final loss of ~2 in 72 GPU-hours; almost all of the learning was achieved in the first ~16 GPU-hours, and training it additional days did not do any apparent good. This suggests that GPT-2-poetry was underfitting the poetry corpus & would benefit from an even larger model size.


Before sampling from any new finetuned version of GPT-2-small, remember to copy encoder.json/hparams.json/vocab.bpe from the original 117M model directory into the new model’s directory. I find higher temperature settings work better for poetry (perhaps because poetry is inherently more repetitive than prose), and top-k appears to work fine at OA’s top-40. So unconditional sampling can be done like this to generate 2 samples:

python src/generate_unconditional_samples.py --top_k 40 --temperature 0.9 --nsamples 2 --seed 0 \
    --model_name 2019-03-06-gwern-gpt2-poetry-projectgutenberg-network519407
# ======================================== SAMPLE 1 ========================================
# --
# And it must be decided.
# It must be decided,
# And it must be decided.
# It must be decided,
# And it must be considered.
# It will be decided,
# Though the hill be steep,
# And the dale and forest
# Hold the land of sheep.
# And it must be decided,
# There's a jolt above,
# And its paths are narrow,
# And its paths are long.
# Yes, it is decided,
# And it is completely.
# All the hills are covered
# With grey snowdrifts,
# Shaded with a shimmer of misty veils,
# And the hills have a shimmer of hills between,
# And the valleys are covered with misty veils,
# And there lie a vast, grey land, like a queen,
# And they are not, in truth, but many and many streams,
# O'er the purple-grey sea whose waves are white
# As the limbs of a child of ten.  And there
# The river stands, like a garden-fair
# In the valleys of the north, the valleys of the west,
# Blue and green in the summer, and runneth softly forth
# To the blue far upland beyond the sea;
# And over the high white upland far away
# Floats a white and tender water, and wearily
# Through the trees the rosiest water-lilies play
# In the sun, and rise and fall--the purple and red
# Of the streams.  The waters are hidden in their bed
# By the stone o'er the darkling hills.  The waters run
# Like a ringlet under the stone.  The water flows
# Through the rocks like a river, and the stream
# Is a ribbon of gold spun by the sun.  It gleams
# Like a gold sunbeam shining through the gleam
# Of a sudden silver, and silently falls
# On the pool, and is lost in the darkling deeps--
# Sink, sink in the shadows, ere it flee
# Into the darkling depths.  And the waters sleep
# In the light of the moon and the silver of dawn,
# And silently float past the mountains of heaven.
# As we gazed the city fades into the clouds
# Of the sky, and we are above the roofs.
# And suddenly as the moon, flurrying,
# Dazzles the sea with her swan-throated song,
# And there is a faint far singing of birds,
# And a sound from the land, as of swarming seas,
# The grey sea, and the land that hideth rest,
# And the sky that hides the lovely green of God.
# So we are caught, like the moving sea,
# That calleth unto its sleeping
# Soft and still, like the moon that calleth
# In the twilight depths vast and hoary--
# Till we see the City changing toward the dark,
# And its changing towers in the distance darken.
# In the city is a calm and quiet street,
# Full of sunlight, and a smell of rain,
# That falls from unseen towers like soft white feet
# On sleeping city's rue and misty pane.
# There is peace, and a vague peace over death,
# And a far-off singing in the city's breath.
# And all fair cities must go to dust,
# And every body be one tomb--
# And all white houses dwindle and grow dull,
# And the city's breath is a dull death-blow.
# But this place is a place of peace and trust,
# And it is but a little street,
# Whose idle heads and sunken faces
# Are bright with light that makes them bright.
# Then it is not alone fair Town that lies,
# With open pillared streets beneath a sun,
# And many a weary world and dusty town,
# And a sunflowers and a great tide onward run
# In the blue of the heavens that are not gray,
# But only blue and pale, like tender wings
# Sailing with wide-spread, languid, luminous eyes.
# This place is the very heart of it,
# Whose quiet hours with its peace throng
# The silent nights and the perpetual sea.
# The City slept with her silent towers,
# A stream that ran in an idle stream,
# And a mist hung at the windows of the tower.
# And it was a street--a sunlit dream,
# A dream of a world that lay
# Open in the summer morning,
# And in its heart a joy all gay.
# For its sunshines and palaces were there,
# Till a wind came softly here.
# And it was a new, new city,
# A city that arose in the early morning;
# That opened its gates on June morning,
# With a sunset and a moonrise sweet.
# The city was a cathedral;
# And out of the sound of the bells and t
# ======================================== SAMPLE 2 ========================================
#  of the world
# The best, that, when once dead, is found again.
# And what is this?  Where can we find a place,
# Save in the solitude, where he may be
# The friend of all beneath the sun, and be
# An unseen presence, if the traveller's eye
# Can follow where he cannot:  there he stands
# Dark in majestic pomp, like those whom owls
# Could once have told down with a lion's maw.
# His form is like his fathers, and the crown
# Of all his race:  the very colours are
# As his to-day, which we must see and bear;
# The only parent is the creature's he.
# His face, where we have marked it, is but veiled
# In twilight, when we see, and he appears
# Himself in all his nature--where, if man
# Can recollect, he saw it in the frame:
# 'Tis clay wherever found--and so is called,
# When nature gives him back her clay.  It means
# That clay was form'd; but clay is form'd elsewhere;
# He needs must feel through all this frame, and, lo,
# The horse he rears, is human in his mind.
# So too, his nature is a thing apart
# From the great Nature, which has made him thus
# A likeness of himself:  and he beholds
# The creatures that he knows, and not intends
# To visit them, and only in their hearts
# Deserts them; and if they come indeed,
# And if the sea doth bring them, then the man
# Is still a child of theirs.  He can recall
# His mother's features and the father's look.
# And often he has said that he foresaw
# The sea, the winds, that he may all at will
# Be sea.  In short, the man is all he sees.
# He fears the sea may hurt him.
# Lashed to the helm,
# The ship was in the sea, and, on its moor
# And the sails furled, in silence sat the maid
# Motionless, like a star; no sound was heard
# Save of the distant ocean's fitful hum;
# The sounds of tempest came to him, his ears
# Mercurially listless, and his heart
# Disturbed like a distempered sea; he stood,
# And gazed from heaven in an unblest thought;
# He had not heard his mother's voice; he gazed;
# The mother's look was of a loftier mood;
# He had not heard his own; he had not heard
# What ever was, where his own heart has been;
# He had not understood the very thought
# Of his own heart, where life could find no shore.
# The sea beats on:  the vessel's bell strikes six:
# Dive down, O death! to earth, to heaven! to heaven!
# And it is sweet thus to be two souls alone:
# Dive down for home, and to the air renounce
# The galling bonds of everlasting life
# In some lone bark, that, dying, to the last
# Are still as death without her:  so to him,
# The mother's voice, still sweeter, spoke of home;
# And as the young man fell upon her breast,
# The mother's oracle, the words of death,
# Even as he spoke, a living death arose:
# He feels his heart rise, and ascend the sky.
# The wreck shall surely reach the sea; he dies,
# A mortal change, as earth, in which it was;
# And God, though dead, had still a dying man.
# But when they parted, he can never die.
# There are thousands, yes, there are thousands who,
# Without a mother, could not die unheard
# Of by a hand unseen:  yet some are sad,
# Lonely and wretched here, without a mate;
# Or if the grave touch, the great hearts' light
# Have no soft touch, even of a brother's grief
# Scarce suffered, they shall each a new life yield;
# And one, once more on earth, to heaven, or God,
# Shall meet his father's face, or bless his grave.
# Not vainly on these mocking thoughts he breathes;
# They sink to nothing when he sinks to rise:
# The tears of fatherly compassion reach
# The mother's eyelids, her, but not her eyes.
# And now a voice was heard by the wild bird,
# With words of comfort from the infant boy.
# Oh, had it stayed the angel's birth, and then
# Those tresses streaming, would have felt the strain
# For the bright star, and for a glorious man.
# It is a noble deed:  and, through the world,
# Doth woman triumph, though she suffer loss
# And poverty and pain, and,

Not bad.


The first version of the PG data for GPT-2-poetry just runs all the lines together, erasing the metadata about what book each line comes from. A good model should nevertheless gradually learn about the transitions between poems & whole books, but that is hard and there may not be enough transitions in the data to learn effectively.

Much like the char-RNN experiments on this page, there is no reason one can’t inject that metadata in a structured way to see if the model can learn to exploit the metadata; even if it cannot, the added metadata shouldn’t hurt that much because it is so regular & repetitive. Inserting the metadata also allows for some degree of control in conditional generation; one should be able to put in the book ID for, say, Homer’s Iliad as a prompt and get out a long block of consistent Homeric pastiche.5

Ideally, there would be unique IDs for every author, poem, and book, and these would appear at the beginning of every poem, with each poem’s end delimited by the <|endoftext|> symbol that OA’s GPT-2 models were trained with; unfortunately, only the book ID is available in this particular dataset. (Project Gutenberg ebooks do not include any metadata or formatting which would cleanly split discrete poems from each other.) As before with authors, the book ID metadata can be formatted as a prefix on every line with a delimiter like the pipe character.
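The transformation itself is trivial; a minimal Python sketch (the book ID and lines here are illustrative, not the actual corpus-preparation script):

```python
# Prefix every non-empty line of a book with its Project Gutenberg book ID
# and a pipe delimiter, matching the "19221|The curfew tolls..." format
# seen in the training samples below.
def prefix_lines(book_id, text):
    return "\n".join(
        f"{book_id}|{line}"
        for line in text.splitlines()
        if line.strip()  # drop blank lines
    )

print(prefix_lines(19221, "The curfew tolls the knell of parting day,\n"
                          "The lowing herd winds slowly o'er the lea,"))
# 19221|The curfew tolls the knell of parting day,
# 19221|The lowing herd winds slowly o'er the lea,
```

Concatenating the prefixed books then yields a single training corpus in which every line carries its provenance.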

Rather than start over with GPT-2-small again, GPT-2-poetry can just be further finetuned on this new prefixed version of the PG corpus to produce what I call “GPT-2-poetry-prefix”:

The loss of GPT-2-poetry-prefix will be much lower than GPT-2-poetry because the prefix is so predictable, but it will hopefully learn interesting things beyond that.
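To see why the prefix mechanically lowers the average loss, a crude back-of-the-envelope sketch (character-level for simplicity; GPT-2 actually operates on BPE tokens, and the assumed line lengths are illustrative):

```python
# If a ~5-character prefix like "1322|" repeats on every ~40-character line,
# then once the model has seen a sample's first line it can predict the
# prefix almost perfectly, so roughly 5/45 of the characters contribute
# near-zero loss, dragging the average down without any improvement in
# modeling the poetry itself.
prefix_len, line_len = 5, 40
free_fraction = prefix_len / (prefix_len + line_len)
print(f"{free_fraction:.0%} of characters are near-free to predict")
# 11% of characters are near-free to predict
```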

In other samples, the generated IDs switch in the first two lines; while that is not much to judge from, GPT-2-poetry-prefix appears to drop keywords from the first line once the ID changes, neither repeating them in the rest of the sample nor attempting to rhyme off them, which is further evidence that it is learning to associate IDs with styles & to mode-switch between them.

Like GPT-2-poetry, GPT-2-poetry-prefix converged quickly to a final loss of ~1.6 after 224,474 steps taking 31 GPU-hours, not improving much after the first ~8 GPU-hours despite decreasing the learning rate. (Diminishing returns appear to set in quickly for finetuning GPT-2-small even if one has a relatively large new corpus.)

GPT-2-poetry-prefix Samples

Training samples

# ...[33999 | 4308.76] loss=2.45 avg=1.86
# ======== SAMPLE 1 ========
# 11|But I have had a day that was gone when I saw you in a dream,
# 1322|A year ago, a year ago.... I'm going back
# 1322|Oh, the years are many and the years begin!
# 1322|You and I have traveled far and wide,
# 1322|You and I have wandered far and wide
# 1322|Through the land I love so well,
# 1322|Where the beautiful land of long ago
# 1322|Died and ebbed and glowed,
# 1322|In the little green land of long ago,
# 1322|In the little green land of long ago,
# 1322|In the little green land of long ago.
# 1322|You and I have traveled far and wide
# 1322|Through the land of long ago;
# 1322|You and we have traveled far and wide
# 1322|O'er the land we love so well,
# 1322|In the little green land of long ago,
# 1322|In the little green land of long ago.
# 1322|You and I have traveled far and wide
# 1322|Through the land of long ago;
# 1322|And the years have slowly swept us on
# 1322|Till the dust has left the sod,
# 1322|And we all must rue the little day
# 1322|When we all are gone from God....
# 1322|And we all must rue the little day
# 1322|When we all are gone from God....
# 1322|I know a little garden very old,
# 1322|Built in a nook near the western sea,
# 1322|With doors and windows closely locked and barred
# 1322|And steps that made no sound but the world outside,
# 1322|And a wall of white-walled flowers all in a row
# 1322|Of violet bloom that just began to blow,
# 1322|And a dial that looked like a golden globe,
# 1322|Cool, square, and green, with the leaves and the grass,
# 1322|And a dial that looked like a needle of bronze;
# 1322|And I wonder if these petals are lost to-day,
# 1322|These petals will be forgotten to-morrow.
# 1322|"A rose would be a rose,"
# 1322|I wonder much at it;
# 1322|It does not matter what or who,
# 1322|So very, very much it matters:
# 1322|It is so nice to think
# 1322|The world is such a place for petals,
# 1322|And so much like a pink.
# 1322|"A purple fox-hawk lived down by the sea
# 1322|And danced and slept in the white-walled flowers,
# 1322|And I thought about a tree,
# 1322|A rose that dropped as an apple falls
# 1322|Into the water, and fell in showers."
# 1322|And I wondered what people said,
# 1322|And wherefore took the trouble,
# 1322|And wherefore came my little red bird
# 1322|To look at me and wave its wings
# 1322|For joy of self and for its red,
# 1322|And why I cannot tell;
# 1322|For it is so nice to think
# 1322|The world is such a place for petals.
# 1322|And so I sat and played,
# 1322|And the old clock set the chimney-top
# 1322|To ring time to itself and stop
# 1322|The time with a slow, frost-ridden drone,
# 1322|And a rustle as of bees.
# 1322|The house was old, and had grown lean
# 1322|As if it were but a leaf,
# 1322|But I found there was only one room
# 1322|And I slept on my stem at the end of the porch
# 1322|Like a child asleep.
# 1322|The door was old, but very young,
# 1322|And very, very young;
# 1322|And a weary smile on its satin lips
# 1322|Hung like a crown upon the night,
# 1322|And the roses crept into the room
# 1322|By the little red door in the wall,
# 1322|And I woke, and felt as a king is,
# 1322|And thought of all the trouble and pain
# 1322|Of a little life and a little love
# 1322|When only a boy is born.
# 1322|For my heart was old, and restless with the years,
# 1322|And my face with no trace of the things I had
# 1322|And no remembrance when I was twenty-five,
# 1322|And
# [34000 | 4318.77] loss=2.15 avg=1.86
# ...
# [47998 | 11504.75] loss=1.91 avg=1.84
# [47999 | 11505.25] loss=2.37 avg=1.85
# ======== SAMPLE 1 ========
# |So when the sun has touched the hill
# 30599|And made the earth a little heave
# 30599|Of gold-dust on the firmament.
# 30599|The sea is a molten pearl
# 30599|Silvering the yellow moon.
# 30599|Gold is the poppies of purple plush,
# 30599|Which make a star within a pool
# 30599|Of purple, blue, and green and blue.
# 30599|The sea is a molten pearl,
# 30599|Whose petals burn as light
# 30599|In the rose-yarn of the sea.
# 30599|The gold star beckons, sparkles, shines.
# 30599|The water gleams and glows.
# 30599|The poppies turn to a golden rim,
# 30599|And a thousand petals, one by one,
# 30599|Come dancing through the purple waves
# 30599|Of purple.
# 30599|My soul goes up to the sun.
# 30599|The sun, all gold, is gone.
# 30599|The sun is gone, the sun
# 30599|Is not more gold than my soul.
# 30599|What is the rainbow, my soul?
# 30599|The rain is falling in the tree
# 30599|In dewdrops falling, that are wet
# 30599|With dewdrops falling.
# 30599|I have heard music in the woods
# 30599|Under a great sky.
# 30599|Their notes, on a hundred harps,
# 30599|Dance by a liquid, falling star.
# 30599|The song of the whole universe
# 30599|Rings like a loon's tune.
# 30599|The rain is falling, my soul,
# 30599|In the tree-tops, the rain is falling,
# 30599|And the rain is not more musical.
# 30599|The trees, like great globed fruit in a garden of heaven,
# 30599|Are as full of little shining blossoms
# 30599|As the face of a child of ten minutes.
# 30599|The wind is playing a soft tune
# 30599|Like the silver notes of a bell.
# 30599|The grass is a dance for a child,
# 30599|And the sun is going down.
# 30599|The rain is playing a soft tune
# 30599|Like the golden notes of a bell.
# 30599|The rain is playing asleep in the meadows
# 30599|Like a sea of dreams,
# 30599|And the wind is playing a soft tune.
# 30599|The leaves are wet with the tears:
# 30599|There is only the wind that comes.
# 30599|The leaves are wet with the tears.
# 30599|The trees have bent to the wind,
# 30599|Like heavy blossoms, and nothing stays.
# 30599|The earth is flooded with tears
# 30599|Like small white stars in the sky.
# 30599|The wind's song is marching by
# 30599|Like a song in a wind from a string.
# 30599|It comes from nothing, it comes not from the years;
# 30599|An atom of dust and a tide,
# 30599|The ceaseless rain falls heavily over the world,
# 30599|The summer flowers are red
# 30599|And one like a gold cup, in a crimson cup,
# 30599|Flows out from nothing, and goes
# 30599|Without speech or motion nor sound.
# 30599|The yellow flowers all slip,
# 30599|All their leaves are wet,
# 30599|And their crimson petals are wet.
# 30599|The rain has driven the sun outside,
# 30599|The wind has driven the rain outside,
# 30599|The moon is going out and in,
# 30599|With the stars on the roof like snow.
# 30599|The rain has driven the rain outside,
# 30599|The moon is driving the clouds in the air
# 30599|Like white, wool flakes in a snow.
# 30599|The trees are wet with the tears,
# 30599|They weep in their night-black tatters,
# 30599|They moan like mortals who lost their way
# 30599|Because they were glad of the rain.
# 30599|The rain is falling, my soul,
# 30599|It has torn away the flowers,
# 30599|They are wet with the tears of the sun.
# 30599|In the black ooze and the snow,
# 30599|The cold winds, shuddering, are blowing,
# 30599|The leaves are blown like blown corn,
# 30599|There is only the wind that comes.
# 30599|Black and long!  The night is
# [48000 | 11515.52] loss=2.14 avg=1.85
# ...
# [166998 | 5752.38] loss=2.17 avg=1.52
# [166999 | 5752.88] loss=2.10 avg=1.53
# ======== SAMPLE 1 ========
# 26|His golden radiance is no more."
# 16452|"I shall not think of men in Argos more
# 16452|Than they are now, who many a bloody deed
# 16452|Wrought on the Greeks, nor yet of Ilium's king
# 16452|In arts and arts like these can speak the rest.
# 16452|But they--their kings--the Trojans and their sons
# 16452|Have fallen. Their deaths, the Grecians and their friends
# 16452|Have fallen in battle, from whom little hope
# 16452|To escape the battle, but the steadfast hearts
# 16452|Of heroes and of Trojans have become
# 16452|Inglorious still. The immemorial strife
# 16452|Shall rise for ever in a glorious day,
# 16452|When wars are waged between us and the Greeks.
# 16452|The battle shall be theirs, the mirth, the song;
# 16452|The mirth which all the Olympian people share,
# 16452|Shall bless the younger warriors with a joy
# 16452|So great, so glorious, and a greater fame,
# 16452|That all the Greeks shall learn, that in the van
# 16452|Ye stand yourselves, and they will praise your deeds.
# 16452|But I beseech you, if indeed by mine
# 16452|Unknown dishonour you be wrested hence,
# 16452|That with your lusts, illustrious and august,
# 16452|All others ye may vanquish. Now, my friend,
# 16452|Behold this prize to crown your father's pride.
# 16452|He said, and shaking both his palms, assent
# 16452|That I should also wish it. Thou art brave;
# 16452|Thou know'st how Menoetiades the swift
# 16452|Was dragged, of Hector and the fierce compeers
# 16452|And Phrygian warriors. So, we will dispatch
# 16452|Your bodies, then, yourselves to burn the ships
# 16452|In sacrifice; with torches and with bells
# 16452|To burn them, and with oxen to replace
# 16452|Your gallant friends for ever. But I wish
# 16452|That no man living has so long endured
# 16452|The onset of his foes, as I have power
# 16452|To burn or storm; for mighty Hector erst
# 16452|Was slain, and now returns his safe return
# 16452|To the Thesprotians, and each other's wives
# 16452|And tender children, and all other babes
# 16452|Assemble round me now, for ye have more
# 16452|To suffer than they know. Go then--the rest
# 16452|Will bear you safely; if ye dare to use
# 16452|The Grecians with your ships, leave not the flames
# 16452|Or fire, or slay yourselves and leave the rest,
# 16452|Ye may, at least, escape, but Hector's hands
# 16452|Will set you free; the vultures on the hills
# 16452|Held forth, and I will pierce those hollow shades
# 16452|Adown the trench, and ye shall find them there
# 16452|Scaped unaware. If, then, the Gods above
# 16452|Should bring thee to the fane of Peleus' son,
# 16452|In his despite, thou wouldst be left alone,
# 16452|Beside the yoke unwearied, there to stand,
# 16452|And from the Trojans scatter all thy dead.
# 16452|So spake the son of Atreus, and retired.
# 16452|Then, leaving Hector's body in the tent
# 16452|Of Periphas, King Agamemnon seized,
# 16452|And in his royal hand his spear adorned
# 16452|With silver, and with gold, studded thick all o'er
# 16452|With glittering tin, the weapon of Mars thus toil'd.
# 16452|Brave as he is, and of unwearied arm,
# 16452|So shall be thy revenge, if thou shalt make
# 16452|Fierce contest for the Danaï. But fly
# 16452|With shout and clamour fill'd throughout the host.
# 16452|So shall ye perish, slain by Hector's hands,
# 16452|Each living soul, and, flying into fire,
# 16452|Sink under Troy's wide gulfs. Be this the work
# 16452|Of your brave sires, and all the parents' hearts
# 16452|Shall prosper, one and all. But ye in haste
# 16452|(For ye are valiant) shall
# [167000 | 5762.79] loss=1.32 avg=1.53
# [167001 | 5763.29] loss=1.60 avg=1.53

One training sample is worth remarking on:

# [49999 | 12526.44] loss=1.83 avg=1.77
# ======== SAMPLE 1 ========
# ._
# 19221|The curfew tolls the knell of parting day,
# 19221|The lowing herd winds slowly o'er the lea,
# 19221|The ploughman homeward plods his weary way,
# 19221|And leaves the world to darkness and to me.
# 19221|Now fades the glimmering landscape on the sight,
# 19221|And all the air a solemn stillness holds,
# 19221|Save where the beetle wheels his droning flight,
# 19221|And drowsy tinklings lull the distant folds.
# 19221|Save that from yonder ivy-mantled tower
# 19221|The moping owl does to the moon complain
# 19221|Of such as, wandering near her secret bower,
# 19221|Molest her ancient solitary reign.
# 19221|Methinks I see her in her blissful dreams:
# 19221|--Or, fancy-like, in some mirage she lies,
# 19221|Majestic yet majestic, and of seems
# 19221|The image of the unconquerable skies.
# 19221|Methinks I see her in her blissful dreams:
# 19221|--Or, fancy-like, in some majestic cell,
# 19221|Where lordly seraphs strew their balmy dreams
# 19221|On the still night, or in their golden shell.
# 19221|There, in the calm of some Platonic dream,
# 19221|Sits she, and views the unclouded moon arise
# 19221|Like a fair lady full of realms divine;
# 19221|And, all at once, a stony face and bright
# 19221|Glittering in moonlight, like the noon-tints of a night.
# 19221|Methinks I see her in her blissful dreams:
# 19221|--Or, fancy-like, among the moonlight pale,
# 19221|A heavenly lark twitters with every gleam
# 19221|Of the rapt rapt beauty that she craves to hail.
# 19221|Or in the moonlight gleams she seems to see
# 19221|A virgin glory, and a power divine,
# 19221|And every motion of the looking-glass
# 19221|Is full of love and divine gentleness.
# 19221|Methinks I hear her in the twilight dim:
# 19221|--Or is it fancy-free, and musical,
# 19221|And full of music? Or some holy hymn
# 19221|Over some water-lily's long-lost lute?
# 19221|O no! it is not fancy-free, I see:
# 19221|Nay! fain would I thy spirit-hearts invest,
# 19221|And hear the paean of these dreamy rhymes,
# 19221|Might I but speak of my departed rhymes,
# 19221|And thou wouldst sing of my empoet-swans.
# 19221|O no! it is not fancy-free, I see:
# 19221|Even Poetry murmurs in a pensive dream,
# 19221|And like a breeze sings the blue mists of night.
# 19221|O no! it is not fancy-free, I see:
# 19221|Even Poetry murmurs in a pensive dream.
# 19221|How soft the zephyrs, how the languid hours,
# 19221|As in the noonday shade their limbs recline!
# 19221|How mellow streams the mossy banks assume!
# 19221|How solemn woods, low-chiselled, sumptuous green,
# 19221|Nod to the cadence of their hoarse refrain,
# 19221|As if the deep-toned muses' melodies
# 19221|With half-averted plaints and half-averted plumes
# 19221|In some majestic temple's quietness
# 19221|Had to the silver twilight slowly come.
# 19221|How solemn woods, low-couched, around thee lie:
# 19221|--Or is it fancy-free, and melody
# 19221|That makes the dull night long in worship held?
# 19221|Methinks I hear the harp's harmonious sound
# 19221|In some dim wood, when the deep shadows fall;
# 19221|And the low wind, like one that listens, makes
# 19221|In the still woods the harmony of all.
# 19221|Or in the moon's pale beam, on some hoar rock,
# 19221|Lonely and spectral, mourns her feeble woe;
# 19221|And as the slow waves roll, and, ebbing, break
# 19221|In music
# [50000 | 12536.67] loss=1.36 avg=1.76

The rhyming in this sample is so good as to be suspicious. It might also sound familiar, because many of these lines are being copied from Thomas Gray’s “Elegy Written in a Country Churchyard”, which opens:

The curfew tolls the knell of parting day,
The lowing herd wind slowly o’er the lea
The ploughman homeward plods his weary way,
And leaves the world to darkness and to me.
Now fades the glimm’ring landscape on the sight,
And all the air a solemn stillness holds,
Save where the beetle wheels his droning flight,
And drowsy tinklings lull the distant folds;
Save that from yonder ivy-mantled tow’r
The moping owl does to the moon complain
Of such, as wand’ring near her secret bow’r,
Molest her ancient solitary reign.

Some spelling differences aside, this opening is copied almost verbatim from Gray’s poem, of which 8 copies appear in the corpus; such extensive copying is not something I spotted in the GPT-2-poetry samples I examined, suggesting that the scaffolding of the metadata did indeed help with learning.

Interestingly, the copying only goes so far. Immediately after the final line about the owl, Gray continues:

…Beneath those rugged elms, that yew-tree’s shade,
Where heaves the turf in many a mould’ring heap,
Each in his narrow cell for ever laid,
The rude Forefathers of the hamlet sleep …

GPT-2-poetry-prefix instead continues:

…Methinks I see her in her blissful dreams:
—Or, fancy-like, in some mirage she lies,
Majestic yet majestic, and of seems
The image of the unconquerable skies.
Methinks I see her in her blissful dreams:
—Or, fancy-like, in some majestic cell,
Where lordly seraphs strew their balmy dreams
On the still night, or in their golden shell.
There, in the calm of some Platonic dream,
Sits she, and views the unclouded moon arise
Like a fair lady full of realms divine;
And, all at once, a stony face and bright
Glittering in moonlight, like the noon-tints of a night. …

That is, it focuses on the female figure of the Moon in a way more ode-like than elegiac. Nor do these lines seem to be extracted from the rest of the Elegy, as words like “bliss” or “mirage” or “dream” or “seraph” or “Platonic” do not appear in it. Some of the phrases like “blissful dreams” do appear in the rest of the corpus, but others like “some mirage” or “mirage she” do not. Nevertheless, the style is consistent throughout the entire sample and the quality is good, suggesting that while GPT-2-poetry-prefix has memorized to a limited extent, it remains fully capable of generating good original text.
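Checks like this can be partially automated; a minimal sketch (the corpus and sample lines here are stand-ins, assuming the pipe-delimited prefix format):

```python
# Strip the "bookID|" prefix from each generated line and check whether the
# bare line occurs verbatim anywhere in the training corpus.
def memorized_lines(sample_lines, corpus_lines):
    corpus = {l.strip() for l in corpus_lines}
    hits = []
    for line in sample_lines:
        bare = line.split("|", 1)[-1].strip()  # drop the "19221|" prefix
        if bare and bare in corpus:
            hits.append(bare)
    return hits

corpus = ["The curfew tolls the knell of parting day,",
          "Hail to thee, blithe Spirit!"]
sample = ["19221|The curfew tolls the knell of parting day,",
          "19221|Methinks I see her in her blissful dreams:"]
print(memorized_lines(sample, corpus))
# ['The curfew tolls the knell of parting day,']
```

Exact line-matching only catches verbatim copying; near-copies with spelling changes would need fuzzy matching, but this suffices for flagging suspicious samples.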

I spotted an additional example of memorization: sample #17 in the 1000 unconditional samples is almost entirely a memorized copy of Percy Bysshe Shelley’s “To a Skylark”:

======================================== SAMPLE 17 ========================================
 of our mortal life, and made it known
32373|To be man's also,--as to make us one;
32373|A thing apart, yet of an individual
32373|In frailty and high elemental own.
32373|Hail to thee, blithe Spirit!
32373|Bird thou never wert,
32373|That from heaven, or near it
32373|Pourest thy full heart
32373|In profuse strains of unpremeditated art.
32373|Higher still and higher
32373|From the earth thou springest
32373|Like a cloud of fire;
32373|The blue deep thou wingest,
32373|And singing still dost soar, and soaring ever singest.
32373|In the golden lightning
32373|Of the sunken sun
32373|O'er which clouds are brightening,
32373|Thou dost float and run,
32373|Like an unbodied joy whose race is just begun.
32373|The pale purple even
32373|Melts around thy flight;
32373|Like a star of heaven,
32373|In the broad daylight
32373|Thou art unseen, but yet I hear the music of thy singing.
32373|Keen as are the arrows
32373|Of that silver sphere,
32373|Whose intense lamp narrows
32373|In the white dawn clear,
32373|Until we hardly see, we feel that it is there.
32373|All the earth and air
32373|With thy voice is loud,
32373|As, when night is bare,
32373|From one lonely cloud
32373|The moon rains out her beams, and heaven is overflowed.
32373|What thou art we know not;
32373|What is most like thee?
32373|From rainbow clouds there flow not
32373|Drops so bright to see,
32373|As from thy presence showers a rain of melody.
32373|Like a poet hidden
32373|In the light of thought,
32373|Singing hymns unbidden,
32373|Till the world is wrought
32373|To sympathy with hopes and fears it heeded not;
32373|Like a high-born maiden
32373|In a palace tower,
32373|Soothing her love-laden
32373|Soul in secret hour
32373|With music sweet as love, which overflows her bower;
32373|Like a glow-worm golden,
32373|In the silver dew,
32373|Scattering unbeholden
32373|Its aerial hue
32373|Among the flowers and grass, which screen it from the view;
32373|Like a rose embowered
32373|In its own green leaves,
32373|By warm winds deflowered,
32373|Till the scent it gives
32373|Makes faint with too much sweet these heavy-winged thieves.
32373|Sound of vernal showers
32373|On the twinkling grass,
32373|Rain-awakened flowers,
32373|All that ever was
32373|Joyous and fresh and clear thy music doth surpass.
32373|Teach us, sprite or bird,
32373|What sweet thoughts are thine:
32373|I have never heard
32373|Praise of love or wine
32373|That panted forth a flood of rapture so divine.
32373|Chorus hymeneal,
32373|Or triumphant chant,
32373|Match'd with thine, would be all
32373|But an empty vaunt--
32373|A thing wherein we feel there is some hidden want.
32373|What objects are the fountains
32373|Of thy happy strain?
32373|What fields, or waves, or mountains?
32373|What shapes of sky or plain?
32373|What love of thine own kind? What ignorance of pain?
32373|With thy clear keen joyance
32373|Languor cannot be:
32373|Shadow of annoyance
32373|Never came near thee:
32373|Thou lovest; but ne'er knew love's sad satiety.
32373|Waking or asleep,
32373|Thou of death must deem
32373|Things more true and deep
32373|Than we mortals dream,
32373|Or how could thy notes flow in such a crystal stream?
32373|We look before and after,
32373|And pine for what

The 87 lines beginning with “Hail to thee, blithe Spirit!” are all Shelley (with perhaps slight spelling differences), far surpassing the memorization of the Gray sample. Considering the stochasticity of top-k/temperature sampling, it’s remarkable that the sample could follow “To a Skylark” so exactly. There do appear to be ~12 copies of the poem in the PG corpus (it’s a popular poem), so in retrospect some degree of memorization is not surprising, but that’s still a lot of memorization; and the 4 lines beforehand don’t appear to be copied from another Shelley poem, which makes it stranger still. It’s a pity the sample did not continue further: one wonders whether it would have repeated the entire poem, and what it would have done when the original ended.

Unconditional samples

For both GPT-2s, I generated 1000 samples as follows:

python src/generate_unconditional_samples.py --top_k 40 --temperature 0.9 --nsamples 1000 --seed 0 \
 --model_name 2019-03-06-gwern-gpt2-poetry-prefix-projectgutenberg-network-224474
# ======================================== SAMPLE 1 ========================================
# |But I shall tell thee of the glorious days
# 1008|Of that old strife, wherein the truth of it
# 1008|Atoned, though at the most points of the pulp
# 1008|Be passed through: and about the torment, which
# 1008|A Hermit in his youth thoughte, yet made
# 1008|Still worse by his words, where he said, 'Peace!
# 1008|Keep silence here; because in any place
# 1008|I heard of charity the while I dwelt,
# 1008|And of her tears made e'en my lot a sting:
# 1008|And, but for those, in truth, to heaven were found
# 1008|The spirit of him offered by the sea,
# 1008|So that, out struggling, he made no denial.
# 1008|But the proud spirit, soon as he was silent,
# 1008|Embraced me, even in bidding me speak therefore,
# 1008|And with such gentle voice and such great front
# 1008|Came up unto my Saviour, that he made
# 1008|Full many a sign, how that all virtue he
# 1008|Would at his own discretion hear and mark;
# 1008|Then saw he the chief flower of love delight
# 1008|Within the living light, and made such pact,
# 1008|That it would make bliss vouchsafe to him.
# 1008|Paradiso: Canto XXIX
# 1008|The day, that most of us had sojourning
# 1008|'Mong the sweet spirits, from the highest shaft
# 1008|Of their kind master had not looked on them,
# 1008|Into the eyes forth from the bark they came;
# 1008|Two palms it bore of one and of the letters,
# 1008|And the other two with wings outspread.
# 1008|Not in beatitude do I behold them,
# 1008|But in and out of hope they uttered a
# 1008|Heaven-toned word of comfort, so that reached
# 1008|The upper light, which in attention hung
# 1008|With the low shores, that to the nether shores
# 1008|Were as the utmost thread 'to the high point.
# 1008|And that unceasing after-effiguring
# 1008|Of the eternal Sire, upon the which
# 1008|One live soul only drew its breath mysteriously,
# 1008|Up to the point, so pointed to the source
# 1008|Of what it had been, I never it knew,
# 1008|And hence never would have recognized.
# 1008|What verdure of an undivided heart
# 1008|Resolved, I say; and he complained e'en thus:
# 1008|"What from thy soul unto the Good I send
# 1008|I ever pray to, and by grace of that
# 1008|Pointing myself I pray thee to the world,
# 1008|To point thee the great mystery of love,
# 1008|From this, the bottom to the source of all
# 1008|Concerning thee; and not by its green leaves
# 1008|Of science so unsullied was the thought,
# 1008|As a small-handled cup, acquired by men.
# 1008|The mind's eye, taking from the mortal world
# 1008|All that it asks of bar or of the gold,
# 1008|With the same fury burns as it was wont;
# 1008|Now it may be by lantern or by shining,
# 1008|Since both thy and my love has made me its."
# 1008|The Almighty Father in his thunder made
# 1008|Resenting, and all round about Him round
# 1008|Went down his smitten steps, so that the air
# 1008|Impregnate came not from his visitations,
# 1008|Setting a day of darkness on all sides.
# 1008|Therefore mine eyes I lifted to the ground,
# 1008|And I beheld a river by the ice
# 1008|Chained up and flowing back along the ice,
# 1008|And suddenly before my feet it melted;
# 1008|And what it now behoves me to retrace
# 1008|The cause I had of it in heart I felt.
# 1008|As the Sicilian bull, that rightfully
# 1008|His cries first echoed in the mountains,
# 1008|Did so rebellow, with the sound of which
# 1008|It made my very blood to quicken well,
# 1008|The dolorous accents which envenom'd me,
# 1008|Forthwith I hasten'd unto where reply
# 1008|Was made: "O Ro! Brunhild"
# ======================================== SAMPLE 2 ========================================
# |Hear the tale that the funeral chant is telling,
# 2491|For the sorrows of other's children that dwell
# 2491|Like sweet flowers upon the wold?
# 2491|'Tis the tale of a life which is fled and gone,
# 2491|And the star of a hope which shone
# 2491|Bright above it, though dark may it be,
# 2491|For the hopes of a brighter day are fled
# 2491|And the joys of a happier lot?
# 2491|'Tis the tale of a life with the weary and sad,
# 2491|Where sorrows begin and rest.
# 2491|For only a song can the widow's soul glad
# 2491|Who sits musing 'mid shadows drear.
# 2491|And only a music, sad with its sighs,
# 2491|Till sad to the soul as death draws near
# 2491|As life on her fragile bark!
# 2491|I hear their voices faint in my slumbrous sleep,
# 2491|The music of lives that seem less real
# 2491|Than phantoms are dream-bound in duty's mystic keep,
# 2491|With music that seems to be more real
# 2491|Than phantoms are dream-bound in duty's mystic keep
# 2491|For souls that sin may not see!
# 2491|All round about us seems, in every place,
# 2491|As far off as the eyes of kith and kin,
# 2491|The ever-tremulous busy world's harmonious race,
# 2491|And I hear the mighty ocean tides,
# 2491|Feeling their strength, their might, their rhythmic din,
# 2491|Are calling me all into one wide choral face,
# 2491|And I hear the infinite singing of the winds,
# 2491|That seem to make me simply live!...
# 2491|The world seems a world that is full of sound and motion;
# 2491|A world of beauty and of music, where it lies;
# 2491|Yet all that is and has for me seems one more treasure
# 2491|Than all the world dreams leave in the skies.
# 2491|I hear the mighty tides of life,
# 2491|They're crying to me,
# 2491|They rise and sink in a restless strife
# 2491|Of endless song.
# 2491|Yet every stroke of sorrow's sword
# 2491|Comes surely from afar,
# 2491|That is true peace which is hard on board
# 2491|Though oceans be dark and terrors war.
# 2491|I hear the myriad singing words
# 2491|Of ocean's depths,
# 2491|They come like a song of broken birds,
# 2491|The music floats on the air and stirs
# 2491|My life to bear its measure in calms
# 2491|Of perfect peace, and it is good,
# 2491|But all is false peace only.
# 2491|When first I heard the autumn rain
# 2491|Sink down the hollows on the plain,
# 2491|I held it very near,
# 2491|And as I spoke to March again
# 2491|I felt the long, slow throbbing rain
# 2491|Creep from the earth in sudden flight
# 2491|Through all the veins of earth again,
# 2491|And in the sunlit, silent night
# 2491|The world grew far forlorn.
# 2491|And April came with rushing rains,
# 2491|And leaves about the naked lanes.
# 2491|I saw again the August noon
# 2491|Roll round the world in blazing heaps.
# 2491|And in the sunlight and the dark
# 2491|A thousand germs their pageant crush.
# 2491|And from the earth the maples bloom
# 2491|In odors of the breath of bloom
# 2491|And from the meadows and the hills
# 2491|The rosy clouds drop down their spilled spilled spilled spilled
# 2491|And drunken with the rain it kills.
# 2491|And soon above the hills shall crash
# 2491|The thunder of rain-wings,
# 2491|And all the naked trees and shrubs
# 2491|Shall lie, like naked, naked blades.
# 2491|Out on the hills there shall be rain,
# 2491|And the maples down the windy lane
# 2491|Shall bleed, and flowers shall weep again
# 2491|Through the weary hours of rain.
# 2491|They shall lie where the maples lie
# 2491|Deep in their bosoms, cold and numb,
# 2491|Each with its wound on either arm,
# ...

Download links again:

Some fun passages I noticed in the first 100 unconditional samples:

======================================== SAMPLE 2 ========================================
|Hear the tale that the funeral chant is telling,
2491|For the sorrows of other's children that dwell
2491|Like sweet flowers upon the wold?
2491|'Tis the tale of a life which is fled and gone,
2491|And the star of a hope which shone
2491|Bright above it, though dark may it be,
2491|For the hopes of a brighter day are fled
2491|And the joys of a happier lot?
2491|'Tis the tale of a life with the weary and sad,
2491|Where sorrows begin and rest.
2491|For only a song can the widow's soul glad
2491|Who sits musing 'mid shadows drear.
2491|And only a music, sad with its sighs,
2491|Till sad to the soul as death draws near
2491|As life on her fragile bark!
## Sample 3:
37804|The white-petalled white fox
37804|Opens himself to coolness
37804|In the late evening.
37804|But when the last child started
37804|The white fox to his feet flew,
37804|And the old fox was master
37804|Of all the magic heathen.
37804|Till when the faint huntsman
37804|Had snuffed the fragrant water
37804|Over his plump ears and skin,
37804|In the old way he knew not
37804|Till morn had almost shone;
37804|And then the fox came slowly
37804|And left the place unguessed;
37804|The white fox was not master,
37804|Although he had been master,
37804|Although he had been servant
37804|And now he could be master
37804|Of all the magic powers
37804|That keep the place enchanted
37804|In the wide earth and water.
## Sample 9:
36661|And the morn breaks, and, all the day,
36661|Red-clover'd birds with silver bill
36661|Flutter from tree to tree in flower,
36661|A quivering dew, a wind that wafts
36661|To haunts among the ancient woods.
36661|The golden-crested ilex, here
36661|Doth vine her purple cup; the deer,
36661|The wild-goose; and, in troops, the sheep,
36661|The goat, the sylvan-haunted elm,
36661|And the green-faced oft-gadding pine
36661|Blossom with purple.
36661|The lark soars up,
36661|And the hare loud answer make!
36661|Doves, willows, dunes, aslant the lake;
36661|Pair after pike sounds warbling;
36661|The reeds a triumph!
## Sample 14:
37452|I had a vision
37452|Of an old and stubborn old man,
37452|His hair was pale, and thin,
37452|His face was all forlorn,
37452|And the moon was full in the air,
37452|And a spirit passed over his brow,
37452|And its face was all for ever.
37452|And he spoke:
37452|'Have we ever a dream?
37452|Have we ever a vision
37452|Of the ghost's ghost?'
37452|The Master gave the word:
37452|'By the breath I know
37452|The meaning of Death:
37452|Can it be 'hush?
37452|Have we ever a dream?'
37452|The spirit said:
37452|'By the breath I know,
37452|The meaning of Death,
37452|You will see a ghost
37452|Stand by the door
37452|And enter.'
37452|And the spirit said:
37452|'By the breath I know,
37452|The meaning of Death
37452|You may understand:
37452|Can it be 'hush?
37452|Have we ever a dream?'
37452|The Spirit said:
37452|'By the breath I know,
37452|The meaning of Death
37452|You can see a ghost
37452|Stretched toward the door,
37452|And see a spectre
37452|Pass by the chamber door.
## Sample 24:
1333|Then, sweet heart, whisper, sweetheart,
1333|"Thou art sweet, but thy love is vain."
1333|I do love thee, my love,
1333|In a word, in a song,
1333|With the heart and the will,
1333|And the power of my heart;
1333|The power of my whole
1333|Of the poet's soul,
1333|And the heart and the soul!
1333|As the winds take the leaves
1333|As the flowers take the flowers,
1333|As the floods take the dew,
1333|As the salt runs in floods,
1333|As the salt runs in floods,
1333|As the snow in the seas,
1333|As the rain in the logs,
1333|As the wind comes and goes,
1333|As the sleet in the coppice,
1333|As the snow in the coppice,
1333|As the snow in the bogland,
1333|As the hail in the river,
1333|As the snow in the river,
1333|As the snow in the county,
1333|As the snow in the county,
1333|As the snow in the county,
1333|As the rain in the vale.
1333|As the stars take the dew,
1333|As the sparks fly from eye,
1333|As the sparks fly,
1333|So the hand of my heart
1333|As the heart of my art
1333|As the tongue of my lips,
1333|As the heart of my heart
1333|As the flame in the eye.
======================================== SAMPLE 39 ========================================
|And as the summer twilight,
34237|When the golden vinewood
34237|Strikes the silent midnight,
34237|Stands mute beside the brook,
34237|With a ghostly sense of the human heart
34237|Forgotten, yearning, sighing.
34237|I do remember how, long years ago,
34237|At the spring by the vistaed stream,
34237|I stood as 'neath the orchard, in the June,
34237|To the sound of the grass and the dream.
34237|I know the moss where the violets
34237|Quested the dew and the sun;
34237|The air above 'mong the orchards
34237|Murmuring ever of bees;
34237|And the heart that was filled with the music
34237|That came to the listening trees,
34237|While the bluebird's notes, as he piped again,
34237|Awoke the robin's golden throat;
34237|And the sound I heard, long years ago,
34237|Came through the wood and the dells,
34237|Bringing the sound of the violets
34237|And the perfume of dying wells.
34237|And the song I heard in the August dusk,
34237|In the August dusk by the lake,
34237|Was sweeter, from the full-leaved orchard,
34237|Than the sound of a happy brook,
34237|When it came to the school of my childhood,
34237|And to the school of the land,
34237|Oh my home of the woods, where the wild-flower
34237|Loses itself and dies!
34237|They give me back the old-time delight,
34237|The pleasant and the calm,
34237|When still the wind was blowing in the woods,
34237|And the children stood in the warm, glad school,
34237|And smiled as the dear lad asked.
34237|They give me back the pleasant book
34237|That gave my heart its fire,
34237|Those childish words, the constant brook,
34237|Those childish words, the tire;
34237|They made my soul to loiter!--Yes,
34237|They do, they make me blest!--
34237|The rest of the household, and the rest
34237|Of the parents whose hearts were filled with care,
34237|And who were sad in their care!
34237|Their voices!--Yes, and they do--
34237|'T was aye! 'T is aye! 'T is aye!
34237|And the dear friends, so dear to me,
34237|They still will live and die!
34237|I have not a moment now
34237|To forget when the morn is gray--
34237|To be happy, and cherish so
34237|The rose that is on her way.
34237|The evening breezes blow,
34237|And the stars shine out to-day--
34237|But I would not live in to-day,
34237|If I were as happy to stay!
34237|I hope that maybe one day,
34237|When all my work is done,
34237|My darling's coming away,
34237|To meet me in the sun;
34237|I hope that maybe I can see
34237|My Peggy's smile upon me.
34237|The evening wears an old, old gray,
34237|Which softly slants upon the way,
34237|Its shadows on the sunny day,
34237|Its shadows on the sunny day.
34237|O'er life, a sad, unwritten scroll,
34237|The words are like the gentle dove,
34237|That sails upon the nightly soul,
34237|Though none may read or hear reproof.
34237|And drooping o'er life's weary way,
34237|God grant the book may never end,
34237|The gentle words that cheer my way,
34237|The gentle words--they come to blend--
34237|The tender words of comfort and of love,
34237|The kindly words--they come to bring me joy.
34237|I know not if my path shall be
34237|Through the world's wild, woeful wild;
34237|But I know that sometimes, in the night,
34237|The dark will come, with wild delights,
======================================== SAMPLE 64 ========================================
2620|And be glad as the lark
2620|When the skies are clear;
2620|And send forth a breeze of love
2620|As of wings to our bark,
2620|And away with a joyous song
2620|As of streams in our ears,
2620|And away with a joyous tune
2620|As of birds in the spheres,
2620|And away with a joyous tune
2620|As of voices in trees,
2620|As of swans in the summer time
2620|When the grass is green
2620|And the air is keen,
2620|And the leaves are young--
2620|Then away with a song of praise
2620|As of flowers in Maytime
2620|All the sunny days!
2620|O beautiful, gentle, and clear,
2620|Illimitable and strong!
======================================== SAMPLE 72 ========================================
, he who had no need to fly;
24869|For in this moment of dismay
24869|The king who held that evil foe
24869|Threw Indra’s son as he drew down
24869|The Lord of Life shaft-headed and bow.
24869|Then Indra, lord of every woe,
24869|The Vánar legions, with a shout,
24869|The Vánar legions met and fought,
24869|And straight they broke the tyrant’s yoke,
24869|And hurled him at the giant, broke
24869|The mighty bow the giant broke,
24869|Which Indra, King of all the Blest,
24869|Had thrown by Rávaṇ’s(924) mighty breast,
24869|The monstrous coil, the brawny hand,
24869|The monstrous mouth, the jaw, the jaw,
24869|The jaw, the jaw and bleeding jaw,
24869|The ungovernable host, the jaw,
24869|And the great bow which never bends,
24869|The arm, the fist, the knee, the ends,
24869|The body laid with mighty stroke,
24869|And the great bow which never bends.
24869|So, when the giants fought, and fell
24869|With murderous strokes, the giant fell,—
24869|So falls the tree with all his trunks
24869|Terrific in its death, that shoots
24869|Wild volley at the mighty trunk,—
24869|So fell the tree with all its boughs
24869|While all the vipers dug and sowed—
24869|So fell the tree with all its boughs.
24869|But Ráma’s heart was sad within:
24869|He wept and mourned his captive’s sin,
24869|For he had wrought a ruin yet
24869|O’er Raghu’s son in his wrath,—
======================================== SAMPLE 78 ========================================
 on the bosom of
11014|King Deshav, son of Bhishma, sat in the shade of the trees,
11014|Humbu, the great, strong, beautiful, fortunate Brahmin,
11014|A king, a keeper of the law, a guide of the realm,
11014|His name unfolded through all time and space,
11014|A ruler of the realm, a keeper of the realm,
11014|And was worshipped, as was meet, by the Great Spirit of God.
11014|And all the days of his life he kept on striving with God
11014|For the union of faith; and at last all-wise he spoke to
11014|"Lord, I am the Brahmin's lord--and I hold thee thine inmost
11014|As I cast my life away from thee, my Lord, to-day!
11014|Therefore I cast mine body away from thee, my lord."
11014|And that, by constant penance, I might win thy favour
11014|So in the spirit's depths he plunged it into the sea,
11014|But, as the wave closed over it, the wandering wind
11014|Caught up the ship's chattels, and bore it with it to the beach.
11014|And Bhimasena seeing there the empty space behind,
11014|The wandering ship rocked in the dark and glowing heat.
11014|He sat upon the bosom of the Mother of God,
11014|He sat upon the emerald seas, meditating death
11014|Of the great sea.  He sat and pondered in his mind
11014|Upon the mystery of the sea, what gods the daring man
11014|Must have to tell of,--and this mystery,--when, in the morning,
11014|As, in the after days, the Lord of life should pass away,
11014|And leave the body alone to ride the ocean's force,
11014|To die in solitude, unknown, untroubled,--and unto him
11014|His world was opened; and as yet no living creature.
11014|And all the night he sat there, gazing in the east,
11014|Until the morning sunlight faded from the hills
11014|And dawn came, bringing darkness and the darkness awful,
11014|And to his soul came holy light from God, to cleanse
11014|All doubt and all resistance, till, in the morning of life,
11014|The coming of the Lord beheld his face.
## Sample 95:
24869|Canto XXI. Lakshman’s Speech.
24869|He ceased: then Raghu’s son repressed
24869|The sovereign of the giant kind,
24869|And thus with soothing words unsoft
24869|To Ráma Rávaṇ spake:
24869|“Come, with thy brother Lakshmaṇ, friend,
24869|And Lakshmaṇ, and the dame agree.
24869|Thou in the woods shalt soon be found
24869|And bathed in pleasant waters clean;
24869|Where thou shalt sit, and rest, and save,
24869|Well clad in coats of bark and hide,
24869|What days, what nights, what hours will pass
24869|That thou in holy heaven mayst see
24869|Thy darling with her night-made tressed
24869|Far from the forest. Thence will spring
24869|Sweet smells of pleasantness and light
24869|And bliss from the celestial string.
24869|Thence on the ground shalt thou be borne
24869|O’er the bare earth, O Queen Mosteer,
24869|And on the fresh bright earth where thou
24869|Shalt sit in state with Queen Sítá,
24869|In glorious heaven the nights and days
24869|Thou wilt be rapt by the great bliss
24869|E’en as the Lord of Gods is hearkening.
24869|The nights and days are thine, O best
24869|Of giant lords, and I, the best
24869|Of all who love the Lord of Lords,
24869|Whose might can turn the firmament,
24869|Whose might can sway the leafy bowers
24869|And turn each flower and leaf and bower
24869|To holy joy and blissful flowers.
24869|Ah me, the languorous days are come,
24869|And not a moment shall I see
24869|The happy days of Ráma’s Queen
24869|Far from the light that round her glows,
24869|And marked with darkening leaves and boughs.
24869|Ah, whither would her steps be turned,
24869|And where the woodman’s art had burned?
24869|Ah, whither would her steps be bent
24869|To turn her toil-worn heart once more,
24869|When all her hours were joy and peace,
24869|And all her hopes were set on store?
24869|Ah, let thy soul be comforted,
24869|Let trembling fancy still excuse
24869|The burden of a weary time
24869|That mars a saintlike life and use.
24869|Ah, if thy love were still the same
24869|That now I watch with toil and pain,
24869|That I could be for aid or flame,
24869|Could not my heart and bitterer gain.”
24869|And Lakshmaṇ to the forest came
24869|And told his tale with welcoming.
24869|He saw the tree where he was set
24869|With burning buds and leaves beset.
24869|He saw the tree where he was brought
24869|By Sítá of the glittering thought,
24869|And when the leaves were fallen, he
24869|Spoke of his lord the tallest be.
24869|“O Lakshmaṇ, I the deer will slay
24869|From thicket, cave, and mountain gray,
24869|Ere long shall I this forest seek,
24869|And Lakshmaṇ in the covert seek.
24869|O’er hill and wood the Vánar bands
24869|And watch the beasts of wood and sands.”
24869|He spoke: and Lakshmaṇ’s love obeyed
24869|Nor did he speak as he was prayed.
## Sample 100:
38475|O Liberty, the patriot's sure defence!
38475|True to the man who fears a tyrant's eye,
38475|Preserve thy rights, and own his glorious cause,
38475|And yield the haughty title to a lie.
38475|No longer now on mean estate depend,
38475|And England owns thy sovereign vital force,
38475|And her best sons succeed to guard her laws,
38475|Or her best sons bestow a deedless course.
38475|Now, from that happy climate freedom's hope had birth,
38475|And made one day a milder country bleed,
38475|To the great cause that gave her aid is given,
38475|And to mankind one sure reward is even,
38475|Whilst I, perhaps, to distant climes must speed.
38475|To the same cause who has the cause to join?
38475|What foes against mankind may rise to arms,
38475|Boldly they fight, in actions of design,
38475|Yet all the same, and every day they charms.
38475|Ah, Washington! who can thy cause design?
38475|What can the nation do, or me, subdue,
38475|But still go on, in humbling folks admire!
38475|That we may praise thy conduct, that we fire,
38475|And for thy conduct many a hero dare,
38475|That we may rise, and cast the tyrants down,
38475|And tyrants fall, and fall the people crown!

Not bad.

These samples represent roughly the top decile of poem samples (~10 picked out of the first 100), at least by my selection.

Scott Alexander & commenters highlighted a few more samples:

Thou know’st how Menoetiades the swift
Was dragged, of Hector and the fierce compeers
And Phrygian warriors. So, we will dispatch
Your bodies, then, yourselves to burn the ships
In sacrifice; with torches and with bells
To burn them, and with oxen to replace
Your gallant friends for ever. But I wish
That no man living has so long endured
The onset of his foes, as I have power
To burn or storm; for mighty Hector erst
Was slain, and now returns his safe return
My heart, why come you here alone?
The wild thing of my heart is grown
To be a thing,
Fairy, and wild, and fair, and whole
And they have seen the last light fail;
By day they kneel and pray;
But, still they turn and gaze upon
The face of God to-day.

And God is touched and weeps anew
For the lost souls around;
And sorrow turns their pale and blue,
And comfort is not found.

They have not mourned in the world of men,
But their hearts beat fast and sore,
And their eyes are filled with grief again,
And they cease to shed no tear.

And the old men stand at the bridge in tears,
And the old men stand and groan,
And the gaunt grey keepers by the cross
And the spent men hold the crown.

And their eyes are filled with tears,
And their staves are full of woe.
And no light brings them any cheer,
For the Lord of all is dead
Fair is the Lake, and bright the wood,
With many a flower-full glamour hung:
Fair are the banks; and soft the flood
With golden laughter of our tongue
How the clouds
Seem to me birds, birds in God’s garden! I dare not!
The clouds are as a breath, the leaves are flakes of fire,
That clash i’ the wind and lift themselves from higher!
In the dark the sun doth gleam,
And in the dark the moon doth seem
But now the evening is begun–
Gone is the sun upon the earth!
The silver moon doth like a cup
Of blood-red wine, and as that cup
Is drained of life, doth quench no drop.
What man will drink such wine?
There is no soul of earth or birth
Which man hath never known of earth.
There is no soul who doth not sit
And sing to it, and cry, “Drink!”
There is no soul whose feet are set
On youth’s eternal paradise;
For all is a solemn harmony,
And all is a perpetual chant,
And all the world is a song of God.
There is no soul so wholly free
There comes a murmur low and sweet
As of far-off streams in a dream,
Or a murmur of many birds,
Or chime of little evening bells,
As of wedding-bells in the dells,
Soft, sweet and slow,
As of wedding belles that come and go.
A little green ribbon of lilies
By the door of my dear one's room,
A kiss on her cheek, and she whispers,
"I am the bride of the loveliest flower."
A moment we stand in the garden
Of dreams and things,
Dreaming of fairyland
And the fairy music there,
Sweet bells and dreams, and the fairy music,
The fairy songs of the air.
How the clouds
Seem to me birds, birds in God’s garden!
I dare not!
The clouds are as a breath, the leaves are flakes of fire,
That clash i’ the wind and lift themselves from higher!
Fair is the Lake, and bright the wood,
With many a flower-full glamour hung:
Fair are the banks; and soft the flood
With golden laughter of our tongue

The top percentile of poems is probably quite good, especially with some light human editing to fix up the more glaring issues. Getting a decent number of top-percentile poems would require a lot of reading, but on the other hand, there is no reason why selecting or ranking poem samples could not itself be treated as a supervised learning task (eg. for retraining GPT-2-small-poetry on): using selected/non-selected as labels, train a model to predict the probability of a given sample being selected, then use that model to prioritize likely-good GPT-2-poetry poems (or any source of poetry) for human review. And, in a form of “active learning”, the results of the manual review can be fed back in as additional data to help discriminate between the best and the merely good samples.
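The selected/non-selected labeling idea can be illustrated with a minimal sketch; this is not my actual pipeline (a real version would finetune GPT-2 itself as a classifier, per the above), but a tiny stdlib Naive Bayes ranker shows the shape of the approach:

```python
# Sketch: treat sample selection as supervised learning. Score candidate
# poems by word-level similarity to previously hand-selected samples,
# using add-one-smoothed Naive Bayes log-odds. (Illustrative only.)
import math
import re
from collections import Counter

def tokenize(text):
    return re.findall(r"[a-z']+", text.lower())

class SelectionRanker:
    """Rank poem samples by estimated odds of being human-selected."""
    def __init__(self, selected, rejected):
        self.sel = Counter(w for s in selected for w in tokenize(s))
        self.rej = Counter(w for s in rejected for w in tokenize(s))
        self.sel_total = sum(self.sel.values())
        self.rej_total = sum(self.rej.values())
        self.vocab = len(set(self.sel) | set(self.rej))

    def score(self, sample):
        # Log-odds of 'selected' vs 'rejected' under add-one smoothing.
        s = 0.0
        for w in tokenize(sample):
            p_sel = (self.sel[w] + 1) / (self.sel_total + self.vocab)
            p_rej = (self.rej[w] + 1) / (self.rej_total + self.vocab)
            s += math.log(p_sel / p_rej)
        return s

    def rank(self, samples):
        # Highest-scoring (most likely to be selected) first, for review.
        return sorted(samples, key=self.score, reverse=True)
```

The active-learning loop then amounts to: review the top-ranked samples, add the verdicts to the selected/rejected sets, and re-fit.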

GPT-2-poetry-prefix completions

Prompted samples can be done like this:

The downside of using the stock OA interactive prompt is that it returns on the first newline, so one must either delete newlines or use a single line. Neither is good: a single line is hardly any context, while smashing many lines into a single super-long line is dangerous because GPT-2 has never seen poems formatted that way (only, perhaps, some prose that snuck in), and newlines have important semantic functions in poetry. So, to avoid either problem, I bypassed the interactive prompt entirely: I modified the Python script to replace input() (which takes 1 line of keyboard input) with reading standard input (import sys; sys.stdin.read()), so I could simply pipe in multiple lines from files or from the copy-paste buffer using xclip -o.
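The modification is tiny; a sketch of the idea (the exact variable names and call site in OpenAI’s script will differ):

```python
import sys

def read_prompt(stream=sys.stdin) -> str:
    """Read an entire multi-line prompt from a stream, instead of input(),
    which returns at the first newline. Substituting this for the
    interactive-prompt call lets one pipe prompts in from files or the
    clipboard, e.g.: xclip -o | python src/interactive_conditional_samples.py
    """
    return stream.read()
```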

The next issue in prompting is the metadata: given that all the training data was labeled with its origin, and learning the meaning/associations of those labels was much of the point, it doesn’t make sense not to exploit this control in generation. If I were using authors, as with my previous char-RNN experiments, the prefix would simply be whatever author one wants completions from; but in this case it’s not quite so simple, since we only have book IDs.

If an author is already represented in the PG corpus, hypothetically one could look them up and see what IDs their poems were included under and use those, but that is a pain, and it doesn’t work for authors outside the corpus like Ginsberg. So one could instead simply ask the model what prefix it thinks a prompt should use, by feeding in the input several times and seeing what prefix it ‘hallucinates’ in the samples, then adding that to the input for the real samples. If GPT-2-poetry-prefix consistently returns a specific prefix, then that is what it has learned and it is useful scaffolding for the inputs; if it can’t do so consistently, then the prefixes aren’t useful for this particular input and it doesn’t matter.

So, to generate samples conditioned on relevant metadata, I pipe in the whole input unmodified several times, look at the generated samples for an ID, and if there is a consistent ID, then prefix it to the input and sample again several times.
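The prefix-hallucination step can be mechanized; a sketch (assuming the generated samples have been collected into strings, with the `NNNN|` line format used throughout):

```python
# Sketch: tally the book-ID prefixes ('NNNN|') the model hallucinates at
# the start of lines across several samples, and return the most common
# one if it dominates; otherwise return None, meaning no single prefix is
# useful scaffolding for this input. (Illustrative; threshold is arbitrary.)
import re
from collections import Counter

def guess_prefix(samples, min_fraction=0.5):
    ids = [m.group(1)
           for sample in samples
           for m in (re.match(r"\s*(\d+)\|", line)
                     for line in sample.splitlines())
           if m]
    if not ids:
        return None
    prefix, count = Counter(ids).most_common(1)[0]
    return prefix if count / len(ids) >= min_fraction else None
```

If `guess_prefix` returns an ID, prepend it to every line of the real prompt and sample again; if it returns None, sample without a prefix.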

Of course, now that everything is trained and I have a good input method, I want to see how GPT-2-poetry-prefix does on the same poems I tried with GPT-2-small before!


First, “Howl”. Given that the Project Gutenberg corpus is entirely old poetry and wouldn’t include much in the vein of “Howl”, I didn’t expect this to be good: the finetuning would presumably wipe out GPT-2-small’s knowledge of free verse.

Finding a good prefix was hard, which was also unsurprising: there is not much like it in the PG corpus! I ultimately had to settle on a “1997” prefix, taken from a relatively free-verse sample, for generating the 3 samples:

## Sample 1:
1997|Mona, gazing, voluptious! exhaling human blood.
1997|Bare, the sun, in a golden mist,
1997|Leans on the meadow; the clouds are gone;
1997|Yet the heaven of green mists around
1997|Cannot but give them a fresher light.
1997|The sun hath a juice that no time can stay--
1997|And a golden night-cap.
1997|How softly the quiet clouds go by:--
1997|One sees a star through the evening sky
1997|Shine in the valley.
1997|There is no sound but the stream that strays--
1997|The very air how soft, and oh!
1997|How the waters of the meadows flow,
1997|In the warm afternoon;
1997|And the gentle wind from the hills above
1997|Is a murmurous tune.
1997|Through the still wood the cicalas sing,
1997|And every blade grows young and wan:
1997|A man is carrying a heavy heart
1997|To save himself from the cold, and then--
1997|And the last of it comes.
1997|He makes his coat, and, as he goes,
1997|His garment of a color dies.
1997|It is the night-wind, blowing over and under,
1997|With its great peasings of thundering wonder and wonder.
1997|It is the night-wind, echoing and shaking,
1997|With its great peasings of wonder and wonder.
1997|The night-wind suddenly blows a rosy bud--
1997|It is the sound of the forest trees breaking out,
1997|And a great rumbling of branches, and hoarse, and heavy and strong,
1997|And it is a shrill voice, and clear;--and the wind that gives answer cometh out.
1997|it is the loudest the voice alone in the cry of the silence of the noises in the darkness and the sky, and the moon:--and the water in the little wind in the trees.
1997|sing of the raving, and the tree with its silence that followeth:
1997|And it too long and a quiet water and they that cry of the
1997|tree-playing sea, and the air and the silence.
1997|flashing of the sky.
## Sample 2:
1997|The thundering guns! the splintering masses! the thunderous mass! the splintering guns! A welcome!
1997|Gitcheers! Welcome! the river that from thee will pass:--the storm of flame burst!
1997|The fire that is thine will be there:
1997|Hear it, my Sun-god, breathe upon me;
1997|Be strong! We ask not one again.
1997|Thou, mother Earth, if this is done,
1997|That thou wouldst be for aye to-morrow,
1997|What is thy name? to what will follow
1997|The sun that to thee sang one breath,
1997|What is thy name? In what will follow
1997|The sun in darkness, breathless, fell?
1997|The sun must rise in a day's breaking:
1997|What is thy name? to what will follow
1997|The sun in the world's last spinning-wheel?
1997|There comes a word of a holy token
1997|From Jove, the cloudless king of rain:
1997|"What canst thou, then, that I should follow
1997|The sun in heaven as on earth again?"
1997|The Lord and the Lord are lords together:--
1997|The sun will rise in a day's faring!
1997|The cloud will darken all earthly matter;
1997|The sun will burst in a day's strong-wing.
1997|The earth will moan through the black night
1997|And utter a little scream behind:
1997|"What is thy name, then, to what wilt thou,
1997|Lulled in the silence by Jove of rain?"
1997|The Sun would sink in his blue heaven
1997|And be a torch toothed in the main;
1997|The earth would be a chamber filled once more
1997|And the voice, and the heavens rise of the sea.
1997|And break a voice of the music of the heavens and the rain.
1997|Erebell of that speaks the wind again, the thunder.
1997|And the day and the flowers of all things that sang, and ocean's breath of the sun,
1997|And thou and the morning, and all that sang thee and thunder, the rain and the rain and the rain.
1997|And thou, and all.
## Sample 3:
1997|The whole cathedral of the church; the human procession, _Lyral._ 4. It had hardly gone
1997|The temple, the Capitol, the nation, was still seated there, and the altar, the
1997|pulpit, the water-mill in which it was used by the Greeks, was a perpetual murmur and in
1997|the old church, or the great, or the small, or rather the less, would have gone on at
1997|The altar of God--the altar, which had not yet been raised to the new, or to the
1997|"Blessed are they, blessed, among those who went down! and now is my will so
1997|And they asked the priests and the virgins of God to give them a prayer, so the
1997|dwellers on the earth, in which they are thinking, "a little space is granted for
1997|ever. If he were his own son he is guilty of a wrong. But he is guilty of the loss of
1997|righteousness! If he were to die before his race, his race and parents have been
1997|And they answered with voices of joy, "Remember righteousness, and remember the
1997|Thescore, the obstinate, bad and bold, at the word of the priest, was crucified.
1997|And the pious fathers and the pious men had gone to tell their sons, and to tell
1997|Then the father replied, "I would ask not the grace of thy God to form the
1997|The son was seized with the thought of death.
1997|The priests, he received the gift with the body, and the priest at the same time threw him
1997|turban and the priest's mantle. This, however, was the ransom for the son received
1997|given, and the father increased family increased his blood. The priest departed; but his body was given, he went away with tears that a widow, he died. The sated the widow gave into the father in a
1997|And it was raised, after his body, for a sowing blood in compassion of him.
1997|inery. The venerable face was brought him into the house in all the funeral and buried.
1997|giveness. But before him. The priest, together with himself.
1997|The shepherd took the widow married is thus, the priest,

While they may be OK on their own and plausible as unconditional samples, they are disappointing as conditional completions, largely ignoring both the vocabulary & style. It would seem that the finetuning wiped out whatever it was GPT-2-small was using to generate its amusing “Howl” completions.


For “Ozymandias”, I fed it in a few times, and the model seemed to like numeric IDs starting with ‘88’, so I used this as the prompt:

8820| I met a traveller from an antique land
8820| Who said: Two vast and trunkless legs of stone
8820| Stand in the desert... near them, on the sand,
8820| Half sunk, a shattered visage lies, whose frown,
8820| And wrinkled lip, and sneer of cold command,
8820| Tell that its sculptor well those passions read
8820| Which yet survive, stamped on these lifeless things,
8820| The hand that mocked them and the heart that fed;
8820| And on the pedestal these words appear:
8820| 'My name is Ozymandias, king of kings;
8820| Look on my works, ye Mighty, and despair!'
8820| Nothing beside remains. Round the decay
8820| Of that colossal wreck, boundless and bare
8820| The lone and level sands stretch far away.

Yielding (3 samples):

## Sample 1:
8820|... the last captive died of his own shall
8820|and the stateliest of towers was the king...
8820|And he entered the city.
8820|He sat ironing in his counting-house,
8820|Flinging his burdens from a hundred kings;
8820|(A queen had one, and her name was Pepban)
8820|And the king was the son of Pepban.
8820|Then he said: 'O son of Hernando!
8820|Not a crown of gold nor iron can he save,
8820|But he shall conquer by his own right hand,
8820|And the palace of the king be free and save,
8820|And the crown of the head be free and save,
8820|And the wealth of the world be free and save,
8820|And the wealth of the world be free and save,
8820|And the king be free and save.'
8820|And the seer took the gift. He received it:
8820|And he wrought a wonder in his brain, and died:
8820|But the seer, like a god, whose history
8820|Is unendowed, and expoundeth unto it,
8820|--So shall one tell the story of his glory....'
8820|And the seer surveyed his old hard case.
8820|But the king was old and rough, he died
8820|Of age and weakness; and the seer replied,
8820|'I will tell them to my tale, and they shall learn
8820|Who sing this song, and live like equal lives
8820|In the kingdom of the king.'
8820|Again the seer said, 'We will go and speak
8820|A gentler story in a palace: it shall keep
8820|These seers in silent wonder still and deep
8820|As sleep within his grave.'
8820|And the seer took the gold:
8820|'I have no doubt that I shall tell it out,
8820|And I knew that for me shall there be no fear,
8820|If I tell a lie or clear.
8820|But tell it forth,--the poet shall be clear,--
8820|For never any hero
## Sample 2:
8820| Alluding to the curious fable of the Aztec king, who, in his
8820| The palace called319iform; the people of the region of Bahr
8820| The palace called31920|--Rua, behold! a new king came to mingle these two
8820|_So that I wept beside the dead,
8820|And buried deep in thought I wept.
8820|I, like a desert in a glade,
8820|Keep wistful vigil all the night--
8820|And on the palace built a pile
8820|Of lamps that, kindled, slowly shed
8820|A ceaseless light upon the dead.
8820|On this I saw the shadows grow;
8820|And, dreaming there, I, too, lay down--
8820|The dead.  I wept.  I wept--upon
8820|A couch and saw the shadows too.
8820|And yet, indeed, I wept.  For such
8820|A cold to-morrow's light will gleam,
8820|Or such a flame toward one like this.
8820|Then shall the spirit's eyes grow dim;
8820|And in thy face she read my heart
8820|I sometimes read a thousand rhymes
8820|Of hopes, and fears, and hopes have part.
8820|A moment as I read thy thoughts,
8820|My heart was moved; I was athirst;
8820|Yet I was well content to be
8820|The slave of some great memory;
8820|Then, in this vision of the night,
8820|My heart grew very sore with grief,
8820|And I fell prone.  Oh, it was sweet,
8820|As I lay trembling near the dead.
8820|And, as the dead-man's face, that slept,
8820|Groping beneath the churchyard yestreen,
8820|I heard the murmur, soft and deep,
8820|Of a sweet spirit, wafted thin,
8820|That made the shadows seem to creep,
8820|And murmur, soft and low and mild--
8820|And watching, through the moonlight, where
## Sample 3:
8820| The ebbing tide bears only the sands
8820|And trees that wave their boughs to the western breeze--
8820|A broken landscape, where faint breezes waft
8820|The lingerers on their way to Indian seas--
8820|A land whose shores are sunset skies,
8820|Where the pale mists of evening slowly sail,
8820|And, as we steal, the rivers murmur strange
8820|To the low-murmured wind at fall of night,
8820|And to the distant rim of the far sky:--
8820|Here, silent as a solitary tomb,
8820|With all that life can purchase and wealth can lend,
8820|With youth's immortal spirit, thou, O Friend!
8820|Art not for me to give thyself to Art?
8820|To thee, who art the latest breath of lands,
8820|Whose fame has swept the world for which men sigh,
8820|I give my heart up with indignant hands,
8820|To thee, who art the only one I seek.
8820|For thee, alone with life and light impearled,
8820|I cast my eyes from out the vast of time
8820|Where nothing dwells but Destiny's brief stage,
8820|Unshrouded, till, in its frail waste of time,
8820|It shall re-risen and grow once more sublime.
8820|And then, within the bounds where all is strife,
8820|And death, and life, and love, are all my own,
8820|I take this heart up with indignant breath,
8820|That seems to harden with the weight of years,
8820|And with the bitter hardihood of tears,
8820|And with the heart that has no beat or fear,
8820|And with the hand that has no strength to bear
8820|Only the world's and with the heart that fears,
8820|And by its sorrows only known to keep
8820|The beautiful and glorious years we keep.
8820|Here are the little walls we meant to build.
8820|Here stands the doorway, opening to the East,
8820|The child-like likeness of a poet's child

Sample #2 is overinfluenced by some prose footnotes/commentary which apparently were in the PG corpus, but the analogy of Ozymandias to the Aztecs is a potentially fruitful one. And sample #3 here is a particularly good extension.

Essay on Criticism

It is not clear exactly what text Scott Alexander used from Alexander Pope’s Essay on Criticism, so I quoted the famous beginning section of Part 2. Three trial samples strongly indicated that Pope-like writing was associated with a prefix starting ‘385’ (if not necessarily any one full prefix), so I used 38511 for the following 3 samples:

## Sample 1:
38511|Pride even in numbers; wit's a kind pretence
38511|To something foreign still, but ne'er to sense;
38511|A constant waste of words, the world produces,
38511|A lazy sap, that nothing that's foreign to expels,
38511; that's foreign still unknown to the soul;
38511|A young, like, but in the world in motion,
38511|Obscending not, which smells all our own the worse than our own our own the soul's, and soul;
3851166|Like sense; which is like, but in all our whole.
38511|Which thus far more like, but in all things that's an excellence; and ne'er unchanged by which is folly's the worse, they give the worse maintained by which
38511|If sick of sense;
38511|Wholubil, or snug. ills, we know our own our first in sense the worse maintained between the worse, soon expired.
38511|Is blind and first blown on both being free from sense;
38511|From sense the worse maintained
38511|Wholuteness seems at once more used to sense the worse maintained by which
38511|Wholublime or enjoying sense; and first made to the worse, will's the worse maintained to sense;
38511|For sense; by which smells now discharged, and kept unseason'd from sense;
38511|Whose.  that's soon revived.  and then past the worse maintained, by birth to sense; by sense the worse, with weightyselves;
38511|Mankind by being all else barren; the worse maintained and last by birth to sense;
38511|Would cast, since built in nature lies from sense; for which smells and last, by repugither to sense;
38511|Whole; for our present, and first in life at all else to sense so long since built o'r by life to life, is soon revived by contact with heav' we know our own the worse maintained the worse it burns first made equal right;
38511|Is free.
3851166|Or dead: thus far more; who survey.
38511|Or wry's profuse and then dead: but what oft the worse maintained and next to life.
38511|From all; and
## Sample 2:
38511|There lies, that write the very best of all;
38511|For the lov'd fool, for those he courts and chokes,
38511|Is but a thorn inborn martyr in grief and sin,
38511|Who would all bawls and rattlantoms.
38511|Some hazels and isle from a thorn, or a starv'd for breaking hearts abh, to hogs;
38511|Or movel sooner writ, when by the starvels, or fombe.
38511|For men of any faultless wox and bribes.
38511|For wagers who should cut apart, and wak'd to make 'gin rights, for stink; for lamb, or chase; for lamb.
38511|Pounders or cast heel or a rope. For, for lamb, for lamb, for lamb or for lamb; for lamb, for lamb or starve.
38511|For no mean; for lamb, for sheep or for lamb, for lamb, lamb, for lamb, for lamb or for lamb, for lam, for mares.
38511; for lambs, a-heats. Suchley, for mares, for mares, for themselves.
38511. (for lambkins.
38511|mells; lam, lamb, lamb; lamb, lambkins; and other's for wares, lambkins; for struts; for sheep, lamb; or pricks, lamb, lambkins; for wer clothes; for mares: for sheep for lambkins; for goats for lamb.  for goats, lamb; for lambkins; and for moo: for hethers, for wark;
38511, lamb: for babes.
38511 unman; and for lamb.
38511; for lambkins; for mares; and the wox: for wheroat.
38511; for other goods, lamb. (let, frothriars, for spoles; a drum); for goats for lamb.
38511d:) for spoons, for spoons; for goats for whelornes, lamb, for mares: for syr trespass, for goats for mares: for lamb; for mares; for goats for sycam;
38511 theirs, for dog; for mawds.
38511; for whel
## Sample 3:
38511|They talk of constancy and faithless love,
38511|A seraph trembles at the specious glove;
38511|Nor in the rich confin'd relief of state,
38511|Find proud pretence, nor in the disdiscoveries of fate.
38511|For when misfortune makes choice remains the conduct's the prize half known,
38511|Can we secret soul without due, they fear of sense of more known.
38511|Some rise where' rights, they make it pays due.
38511861. Sense and judgment, as equal prize seem meanly, the reward the joy, as much possess the prize paid, as well reckon the prize we do not less dare not less keenly wise.
38511861|We see;
38511861|Of happy lovers ought, as well done, like a friend.
38511861|Know they ought, 'tis ev' is the other joy, as well worth a right;
38511861|The joy, as well might, as well may, as well may all is great.
38511861|Nor need of joys not as well maysters, as well as well may they give; but as little store; but as well as well as well may shewn, as much, as well we know, as well as well can be sure might prove, as well may well as well as well as well as well may view;
38511861|The mind: as well as well as well as well as much the fair as well as well as well as well as well as well as well may,
38511861, as thou woe.
38511861, when all.
38511861|Well done by the fair as well might rise, who merit thus prove by 'tis most of all are one;
38511861ly play the few.
38511861|There's well as well as well as well as well as well as well as well as well as well as well as well as well as not less admire;
38511861|Apostorably bright.
38511861|Well done by sight.
38511861, as well as well as by contact;
38511861|For they; the main aim

Alexander described his GPT-2-small sample from Pope:

It understands there should be line breaks, it understands the approximate correct length of a line of iambic pentameter, it understands how to talk like an overeducated 18th-century dandy—but it doesn’t appreciate rhyme or meter. In retrospect this isn’t surprising; GPT has no idea words sound like anything; it would be shocked to learn anyone uses language as anything other than text strings.

GPT-2-poetry-prefix still has the “overeducated 18th-century dandy” down pat, but it manages to improve on the rhyming aspect: there are quite a few rhyming lines in samples #2 & #3, like “pretence”/“sense”, “soul”/“whole”, “love”/“glove”, “state”/“Fate”, “bright”/“sight”, and a number of near-rhymes like “right”/“great”. (Sample #2 seems to have been derailed by a digression into footnotes defining words, with bad sampling then trapping it in repetition.) One wonders if it’s learning by brute force and memorizing specific pairs of rhymes (although could there really be that many rhymes of “state”/“Fate” in even 3m lines of old poetry?), or if it’s doing something more equivalent to inferring the latent phonetics from the co-occurrence of byte-pairs? (That may sound unlikely, but word embeddings do many unlikely-sounding things with no more supervision than co-occurrence.)
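To be clear about what counts as a “rhyme” in the list above: a purely orthographic check suffices for many of these pairs, though it says nothing about what GPT-2 is actually doing internally. A crude sketch (my own illustration, not anything from the training pipeline), treating two words as rhyming when their last 3 letters match:

```python
# Crude orthographic rhyme check: last-3-letters match, case-insensitive.
# Catches spelling-rhymes like "state"/"Fate" or "bright"/"sight", but
# misses true phonetic rhymes spelled differently ("soul"/"whole") --
# which is exactly the distinction at issue for a text-only model.
def crude_rhyme(a, b):
    a, b = a.lower(), b.lower()
    return a != b and a[-3:] == b[-3:]

pairs = [("state", "Fate"), ("bright", "sight"), ("love", "glove"), ("soul", "whole")]
print([crude_rhyme(a, b) for a, b in pairs])  # → [True, True, True, False]
```

The failure on “soul”/“whole” is the interesting case: detecting that pair requires phonetics, not spelling, which is why such rhymes would be evidence for latent-phonetics inference rather than surface memorization.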

More concerningly, the samples are terrible. Pope’s poetry should be straightforward for GPT-2-poetry-prefix, as it follows standard meters and rhyme and relies on a classical vocabulary well-represented in the PG corpus. Why, then, are they so bad? I suspect this may reflect the corpus itself doing Pope a disservice. Pope’s inclusion in the PG corpus appears to consist of the following (grepping for “Alexander Pope”):

32190|The Works of Mr. ALEXANDER POPE. London: Printed by W.
32190|The Works of Mr. ALEXANDER POPE. Volume ii. London: Printed
32190|Letters of Mr. ALEXANDER POPE, and Several of his friends.
32190|The Works of Mr. ALEXANDER POPE, in Prose. Vol. ii. London:
32190|The Works of ALEXANDER POPE, ESQ.; vol. i. with explanatory

Checking the PG entries and looking through the 32190 prefix, the book begins:

32190|INTRODUCTION                                                 xv
32190|The Works of Mr. ALEXANDER POPE. London: Printed by W.
32190|BOWYER for BERNARD LINTOT, between the Temple Gates, 1717.
32190|This volume consists of all the acknowledged poems which Pope had
32190|The Works of Mr. ALEXANDER POPE. Volume ii. London: Printed
32190|by J. WRIGHT, for LAWTON GILLIVER, at Homer's Head in Fleet
32190|Letters of Mr. ALEXANDER POPE, and Several of his friends.
32190|London: Printed by J. WRIGHT for J. KNAPTON in Ludgate
32190|Street, L. GILLIVER in Fleet Street, J. BRINDLEY in New Bond
32190|Street, and R. DODSLEY in Pall-Mall, 1737. 4to and folio.
32190|The Works of Mr. ALEXANDER POPE, in Prose. Vol. ii. London:
32190|Printed for J. and P. KNAPTON, C. BATHURST, and R. DODSLEY,
32190|The Works of ALEXANDER POPE, ESQ.; vol. i. with explanatory
32190|Notes and Additions never before printed. London: Printed
32190|commenced printing his particular section of the octavos when the
32190|Quo desiderio veteres revocamus amores
32190|Atque olim amissas flemus amicitias.
32190|Nutrix mea fidelissima M. Beech, obiit 5 Novem. 1725, aet. 77.
32190|Edwardus Blunt, vir amicissimus obit, Aug. 1726.
32190|Francisc. Atterbury, Roffens Episcopus, vir omni scientia clarus,
32190|The fourth volume contains the Satires, with their Prologue,--the
32190|alterations. --_His Last Will and Testament._--WARBURTON.

This is perhaps not good training material for GPT-2-poetry-prefix and explains the bizarre degeneration: the model has learned to ‘expect’ sudden random irruptions of largely-irrelevant prose such as introductions or footnote-annotations (rendered inline by PG’s text formatting). Other entries in the corpus will be more free of scholarly or prose apparatus. (In retrospect, a preprocessing step like dropping lines longer than ~60 characters might’ve been a good idea.)
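That cleanup step would be trivial. A minimal sketch, assuming the corpus is one `ID|text` line per poem line and that ~60 characters is the right verse/prose threshold (both assumptions, not something I tested):

```python
# Sketch of the suggested preprocessing fix: drop any line whose poem text
# (after the `ID|` metadata prefix) exceeds ~60 characters, on the theory
# that overlong lines are inline prose (introductions, footnotes) rather
# than verse.
def drop_prose_lines(lines, max_len=60):
    kept = []
    for line in lines:
        _, _, text = line.partition('|')   # strip the book-ID prefix
        if len(text.rstrip('\n')) <= max_len:
            kept.append(line)
    return kept

lines = ["32190|Quo desiderio veteres revocamus amores\n",
         "32190|" + "x" * 80 + "\n"]
print(len(drop_prose_lines(lines)))  # → 1
```

This would not remove short prose fragments, but it would eliminate most of the run-on scholarly apparatus visible in the 32190 excerpt above.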

8 Famous First Lines

The prefix trick doesn’t work on the 8 famous first lines nearly as well as it does with the long excerpts from “Howl” etc.; I assume they are simply too short for the model to home in on a relevant prefix. Nevertheless, I tried.
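The “no consensus” judgments in what follows came from eyeballing which book-ID prefixes the sampler emitted; that tallying step can be sketched mechanically (a hypothetical helper; `samples` stands in for raw text from the GPT-2 sampler):

```python
# Sketch (assumed workflow): sample completions of an unprefixed prompt,
# then tally the `NNNN|` book-ID prefixes the model emits on each
# generated line. A dominant ID suggests a usable prefix to prompt with;
# no clear majority is the "no consensus" case.
import re
from collections import Counter

def tally_prefixes(samples):
    """Count numeric book-ID prefixes across all lines of all samples."""
    ids = []
    for text in samples:
        ids.extend(re.findall(r'^(\d+)\|', text, flags=re.M))
    return Counter(ids)

samples = ["8820|I met a traveller\n8820|Who said\n", "8821|Half sunk\n"]
print(tally_prefixes(samples).most_common(1))  # → [('8820', 2)]
```

For the “Ozymandias” and Pope prompts above, this is the kind of tally that suggested the ‘88’ and ‘385’ prefixes; for the short first lines below, the counts scatter across many IDs.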

“It little profits that an idle king,” yielded no consistency in prefixes, so I skipped adding one. 3 samples:

## Sample 1:
2428|And he who wears a crown, may die, but little lives.
2428|It was a pleasant day, when snugly shut,
2428|For many happy moons and many a sun
2428|Had lit the laughing waves with constant scarf,
2428|And heard the pealing waters on the run.
2428|'T was noontide; and with such a sky-lark flight
2428|The fleet white pieces flew in waves of sound;
2428|The silver stars, their fitful little light,
2428|Were beating on the liquid night around.
2428|Down went the cold, and all the dreariest blasts
2428|That ever chanced to blow, came mopping near:
2428|The first and fierce, the youngest, was our statesman here,
2428|Old Man-Autumnus!--An oak-headed oak!--
2428|The oak that in our court has grown so hard!
2428|"_Here am I_," answered he, "when I am dead,
2428|And round these sacred ashes I shall cling;
2428|And all the while my heart, unsatisfied,
2428|Holds but one clue,--and how must I be king?"
2428|How could he hope that a short space of day,
2428|And the warm blush of eve, and a fierce warning felt,
2428|Could prove the son of an old hunter, killed
2428|By the wolf's hounds in the wood! He crouched, and leant
2428|Against our shadowy robes; and the boughs bent low,
2428|Like the snow-flakes that cover the world below,
2428|Bent with a conscious motion. As the breeze
2428|Fluffs away, and the suns shivered and sighed,
2428|A shadowy motion, like a sigh, went by.
2428|It seemed a shadowy grief, and a mysterious
2428|Pestilence, like the voice of one who has sinned.
2428|At length, to his own chamber silently,
2428|We came, and gathering up the golden rim
2428|Of the great golden moon, looked down and wept.
2428|Then in a voice so soft, so
## Sample 2:
36287|Weary of wandering,
36287|Weary of loving,
36287|Can love a tawny cheek,
36287|Or a eye of ruby beaming through?
36287|My heart, why come you here alone?
36287|The wild thing of my heart is grown
36287|To be a thing,
36287|Fairy, and wild, and fair, and whole;
36287|With but the lion-heart to mate,
36287|And yet the heart to feel and know
36287|But in the wildest nightingale;
36287|The tongue you have for praise or blame,
36287|The heart that is more soft and tame
36287|Than the wild flower, this summer day,
36287|Shall never know your own dear way.
36287|Though cold and bleak your cheek I feel,
36287|Though bleak your mouth,
36287|Though cold your hand I would not steal,
36287|Oh, love, a thought beyond all words,
36287|Could I but win that word of birds,
36287|Could I but win that heart of yours!
36287|O sweet and tender is the lute,
36287|And soft the song the scarlet throat,
36287|When song-birds wake its wildwood lyres,
36287|And wild-bees through the glens go by.
36287|To you the wild-flower opens wide,
36287|To you one ripple stirs the tide,
36287|One long sweet longing seems to be
36287|A look of love made yesterday.
36287|To you the wild-flower bows its head,
36287|To you the wave is musical,
36287|Each kiss a kiss--but what of that?
36287|Ah, love, beyond your reach you keep
36287|All joys for you--ah, hear you me!
36287|Above your grave the winds are laid,
36287|Love lulls asleep within your shade;
36287|There is no rest for weary feet,
36287|The years are unfulfilled, I trow,
36287|And 'neath your sad and smiling eyes
36287|The sun tracks every path that lies
36287|Across your
## Sample 3:
1020|Who had too much of common human things,
1020|Being at peace with all he had on earth,
1020|Had something of a high and godly pleasure
1020|To do without him, since he had been
1020|In all the world so utterly,
1020|And that he had not far from his ambition,
1020|Had done no act for any human duty,
1020|Could not have felt his life's inheritance,
1020|If some good friend had lived to teach him thus
1020|His father's story of himself and country,
1020|And to restore his kingly name to men.
1020|O, the days that grow in my lament,
1020|And the nights that grow more still,
1020|When the earth lies awake,
1020|And the heaven it girdles,
1020|And the woods all shake,
1020|And the bright stars glitter
1020|Through the shadowy mist;
1020|When the night comes trailing her star-tinted hair,
1020|And the stars are the gray of our being,
1020|And the night wind is like some old sorrow
1020|Whose sweet touch is as of heaven,
1020|With its shadows and with its stars,
1020|Gathered in one place, is a child's face,
1020|A child's face made of golden sea foam:
1020|And the tears and the prayers that we utter
1020|Will be as those broken tears
1020|That fall upon bosoms, at the hour of their birth,
1020|With a sad and sorrowful melody,
1020|With a last, far-off singing of earth.
1020|Yet the night is a golden sleep,
1020|The day is like gold,
1020|And the stars go out alone
1020|Through the dark leaves and the darkness, alone.
1020|Night above and stars about,
1020|All our wanderings ended,
1020|Time above and darkness about,
1020|And the sun all hidden,--
1020|We are two two; the world is two.
1020|Howsoever the hours clap and clap,
1020|All the earth is happy,

“That is no country for old men.”; no consensus, 3 samples:

## Sample 1:
28591|I would to God that it all came to pass,
28591|That my old line of trouble and poverty
28591|Was left here desolate of my poor discontent,
28591|That the first cause of all my own heart's moan
28591|Was to be born in this sad world--a lone lone lone way?"
28591|I read it in my sonnet; his deeds were so brave,
28591|So kindly a husband, so cunning a guide;
28591|So tender a heart he went with his work,
28591|That his life never stinted its comforts at home.
28591|I read it in my sonnet--and he was so true,
28591|So loving my child, that it ever proved best,
28591|And though I was forced from his tender command,
28591|I felt I was growing up, when his voice, all tender, said,
28591|"If Love be unfaith, then he is the earth's father."
28591|I read in my sonnet--and he was so sweet,
28591|So eloquent was his look and his tones,
28591|That I smiled when I saw him, and ran forth to meet
28591|His father--and asked him in all things to meet.
28591|But some were too haughty, the most malgracious,
28591|To love their loved Father--the more to annoy;
28591|And some were too cold and abandoned, and when
28591|They saw how much reason has melted and flowed,
28591|They cried, "O Love, give us to love, give us fire,
28591|So we may die, as we will; O perfect, as dead,
28591|With the dead silence!" Then each took his own way,
28591|And, in its nature, an exquisite sorrow lay,
28591|Like love in a garden of fresh green;
28591|They were happy--too happy--by nobody knows!--
28591|The world, too, was happy by Roger and John,
28591|Though some at her heart cannot understand,
28591|Yet the whole world was happy on that first day.
28591|O thou who art so beautiful!
28591|What is the world without thee?
## Sample 2:
1365|O, there's a country, there's a town,
1365|And here's the place that I have stroll'd.
1365|See! there's a town!
1365|And here's the place, and here's the goal;
1365|And here's the flower for me and you--
1365|For you and me.
1365|Here's my little girl;
1365|And here's the spot that I call mine--
1365|"Come over here!"
1365|He's under the lilac-tree
1365|Where I loved so often to dwell;
1365|He's near his own tree-trunk,
1365|But here's his knee--
1365|And he's floating away
1365|To the roses in the churchyard,
1365|To the fields where I was born.
1365|Here's the lily-thorn, and here's
1365|"But where's the rose?"
1365|Here's a little lad;
1365|But a marigold he's brought
1365|Here's a ribbon at most:
1365|"Spring comes in the sun," says he,
1365|"And you dream of flowers gay
1365|On the hills of Arcady."
1365|That's the way he's trod;
1365|He shall be a king, and he
1365|Must have to sing--
1365|Ay, a king--a queen--a grave--
1365|Whose name was love, his heir--
1365|For the royal blood he gave
1365|At the royal marriage feast,
1365|And the crimson wine he drank
1365|At the royal marriage feast,
1365|And the crimson wine he drank
1365|At the royal marriage feast.
1365|They love to wake alive,
1365|To gaze at life and die;
1365|But they do not understand
1365|How many thousand voices call'd
1365|Out of the ages by the dead,
1365|Out of the ages by the living's voice.
1365|How many voices, and what else
1365|Did they call the living dead?
1365|All which of them was gone
## Sample 3:
1020|A time will come for tears and cries,
1020|And then I'll know her by their eyes
1020|I had her, but I cannot tell,
1020|And I am lonely in my cell
1020|Since the time the fair one came to me,
1020|I who loved her since my first year,
1020|Though in my youth she had been dead.
1020|I had grown quiet as the night,
1020|And I had won her tender sight;
1020|And the soft notes of her voice
1020|Would haunt me without joy or fear,
1020|Though the day had come.
1020|This breath in the air took the damp,
1020|And was gone to the damp,
1020|And was gone to the black,
1020|And was gone to the black,
1020|And was gone to the black,
1020|And was flown to the white,
1020|And was gone to the white,
1020|And was flown to the black,
1020|And was flown to the black,
1020|And was flown to the black,
1020|And was gone to the black,
1020|And was carried away in the mould
1020|With its all-written countersign.
1020|I thought the words would mean the fight,
1020|But the fighting face was not my sight,
1020|I thought the men would take a share
1020|Of pain and strife and weariness
1020|And leave me on the dusty street,
1020|Where I would meet that sorry man
1020|And tell him tales and talk to him;
1020|He said, "I'm sure I've done my best,
1020|But I'm not the man to meet the west."
1020|He said, "I'm going on our track,
1020|With my best friend here at the back,
1020|And my master, as I understand,
1020|And the rest of the country side,
1020|Where people go and people ride,
1020|And some end never to decide."
1020|"I can't," I ordered, "I can't!"
1020|He said again, "I say

“When, in disgrace with fortune and men’s eyes,”; no consensus, 3 samples:

## Sample 1:
28287|With heart and hand he went--
28287|The boy, like any lad of five,
28287|Was never seen to pair;
28287|For he ne'er met with a taste of grog;
28287|And now, howe'er his foes may say,
28287|I wish the boy would do all that he can,
28287|And so will merely go to bed;
28287|For, ere he goes to bed,
28287|He'll do most things, not more.
28287|He's gone to bed as readily;
28287|Then, before breakfast, he politely shuts
28287|The book, unless he's said,
28287|And talks of how the breeches blows,
28287|And how the shadows fall;
28287|And how a lady steals a cheese;
28287|And what a consternation comes to dress
28287|That baby sleeping there!
28287|He knows it is a drowsy child,
28287|And therefore it is very wild:
28287|For he ne'er goes to bed--'tis true--
28287|But he is there, and he is fast asleep.
28287|My dears, he says, if you can look
28287|In your writing, he'll just show you just
28287|How very fond you were of me,--
28287|And for how long I loved you so!
28287|And 'tis a very different thing
28287|He reads me, as he so loves me,
28287|Just to think and write of me.
28287|The moon's a very different thing
28287|I should prefer to read to him;
28287|The dogs do bark, the parrake buzz,
28287|And so do all upon me.
28287|One day there was a certain man
28287|Who said, "I love you better,"
28287|Because he had four teeth, they say,
28287|And still was most inquisitive.
28287|They sent him to a stylish drive,
28287|And, though they griev'd him greatly,
28287|They gave poor vulgar names--I trow--
28287|The names--and what do you think?
## Sample 2:
1852|He sees his wretched wife's fate, and dies.
1852|"And she, perhaps, may die in future time."
1852|In Moore's "Ballad of Trees and stones"
1852|A friend to Boston, now far known,
1852|In Boston town, "Is Alexis," sung
1852|A strain of triumph, and these words she had:
1852|"The city's honour, name, and wealth have I!
1852|"For this, a man of blood, toil, battle, strife,
1852|Mechanics labor, die, like this of life;
1852|The whole world's fame, the wind of it, the wave
1852|Of some great deep, that rolls to heaven or cave;"
1852|In that same song, by all the prophets sung,
1852|It is the song to tell the mighty throng
1852|Of those who for their country's honour take
1852|Life's journey to that sacred hill of Fame,
1852|When with great souls, that have been, and are still,
1852|The first great men, to do great things for ill;
1852|When, from no other world, the world's heart breaks,
1852|That noble man who, though he singly speaks,
1852|Has less of good to do than such a song,
1852|And lives in little more than we of him.
1852|What cause has he, this loss of friends and kindred?
1852|What hope of fortune? Wherefore has he perished?
1852|Why is he living on one man's estate?
1852|The poet sings. Where goes the soldier?
1852|The soldier dies. One man lives, all is one.
1852|Let him by no means lily stand alone,
1852|For one man is not worthy of it--all
1852|Is to be tasted by a thirsty sword,
1852|Which, while he lives, will drink up all he can.
1852|There's good in war, but better does he die.
1852|No, this, no more--that man is not the man.
1852|He may be used to giving freely what
1852|He gave, that he should dare
## Sample 3:
392|So many a noble man he scorns in fight,
392|Away he wanders, and his heart declines
392|To be among the dead, and live in death;
392|And now he meets with fear and trembling fear;
392|His courage fails, his faith is hardened, dead,
392|His courage fails, he cannot hope to appear
392|Forward he flies, now threatening danger dread,
392|He calls his lords, and charges them and flies;
392|In dust and ashes, as in sands that bee,
392|Now drives his foot, or in the grass that lies.
392|The trembling sailors, in their care to keep,
392|Now of their craft, and now of theirs surprise,
392|As if they wanted courage, speed, and sleep,
392|And all their thoughts on their audacious prize.
392|The watchful Franks at last had seen this feat,
392|Their flags down to their knees out-stretched they lay,
392|Their head all bare, their helmet on their feet,
392|Their bodies all of gold, their locks of gray,
392|Their bodies all so fair that death might deal
392|The blow, and break the shield, the hauberk too,
392|They drew their swords, and on their backs struck through.
392|After the first they flew in fierce career,
392|Like devils monsters come from hell and fire;
392|They changed their hue, their tunes, their forms, their cheer,
392|And drew their weapons; what their foes desire,
392|They turn and fly, and both their bodies spare.
392|The hosts, whose courage, arms, and arms, were fled,
392|Yet no dismay it had, when, with a spell,
392|The king's enemies were forced to stay,
392|That night each knight was used to aid them well.
392|But now the day drew on with dreadful ending,
392|The Franks, in fury, spurred them on their way,
392|And, for that great king's fall, had turned their backs,
392|And fled, before that host, to God and day.
392|To God or mistress, had they made request,
392|Save of his presence to those lords of rest,
392|That they would go, ere

“Out of the night that covers me,”; no consensus, 3 samples:

## Sample 1:
27139|And the stormy winds that blow
27139|All day long across the world
27139|I shall fly from earth below.
27139|"Oh, I am so weak and bowed!"--
27139|And the sun, in dazzling pride,
27139|Shook, and loosened as he heard
27139|All the turmoil of men's rights.
27139|In the town I should not be
27139|Wasted, wounded, and tormented;
27139|Made my spirit drunk with joy
27139|In the wonder of the sun.
27139|Oh, to see the sun shine through,
27139|And to hear the voice I knew,
27139|And to kneel because of old,
27139|And to speak, though it were gold!
27139|How it works, how it glows,
27139|In the sun, in the moonlight,
27139|And on earth, in the stream, in the odorous air,
27139|For the beauty of the moon.
27139|How it shines, how it glows,
27139|In the softness and the glow,
27139|And on earth, in the heaven, in the dusky air,
27139|Is the beauty that we know.
27139|Little, little, little sun,
27139|On your breast like the dawn of day,
27139|How I know about you always,
27139|With my heart and soul astray.
27139|Oh, to shine forever in your light,
27139|To shine forever in your beams,
27139|To shine forever in your brightness,
27139|To be shining, and to be!
27139|I knew the birds were going
27139|Out in the fair,
27139|Singing and working away,
27139|Going and staying and waiting
27139|Just as it might be.
27139|But the song of the children's singing
27139|Came every day to me
27139|Sweeter, and more than the singing
27139|That I may ever see.
27139|It rang so sweet, so clearly
27139|And far away,
27139|Making the angels, after
27139|Their work, but not a
## Sample 2:
38839|Where are the hours when we shall sleep?
38839|The night grows deeper, ruddy now,
38839|And on the east the morning light;
38839|No longer are the tears that weep,
38839|The noon-tide's drowsy murmuring
38839|That rises up through crimson leaves,
38839|As though the soul had fled away,
38839|To hide forever in the day.
38839|How sweet is the valley of Noodles!
38839|How quiet the little brook-heads there!
38839|What little bells by the cottages stood
38839|Tolled 'neath the heavy hours' flare!
38839|What tinkling bells the brook-folds heard
38839|As they crotoned by the millstone there?
38839|How soft the little brook-faces were
38839|Dimpled beside the cottage-door,
38839|And from the windows and out the trees
38839|The long-lost love-notes peeped once more.
38839|He came: the little brook he loved;
38839|And the brook beside his hearth he loved
38839|Where the crook whispered low, and the grasses
38839|That fringed its little patch of moss
38839|Beneath, the mossy water flowed.
38839|He came, and in God's blessed air
38839|He felt new life in that sweet air,
38839|New life and everlasting youth,
38839|And morning freshness, spring and sun.
38839|Weep not for me; our Father sent
38839|His little ones up to be content;
38839|As He had promised our poor ones
38839|We're glad to be thank'd like them;
38839|And let our taker daily say
38839|We're spared for our old love to-day.
38839|A rosy red rose is the rose
38839|And dear to me as children use;
38839|And though we sing it day by day,
38839|We know that mother's waiting here,
38839|And mother is the waiting-maid.
38839|God's plenty here, and He has sent
## Sample 3:
25953|The night that gives its glory,
25953|And the blue skies of the heavens;
25953|Sidney were worthless in that sight,
25953|All in a narrow space apart,
25953|I only had my life preserved,
25953|Left to myself to serve and guard,
25953|To guard and cherish, rest and guard,
25953|Of me the noblest was the king.
25953|Thus has my soul its homeward way,
25953|Thus has my soul its strength obtained,
25953|That what is dark may there appear,
25953|But evil in the light it feared,
25953|And in the light now burns my brain.
25953|With such a life, a wondrous life,
25953|Sounding through many realms and seas,
25953|The soul to millions makes at ease,
25953|And when it leaves the pleasant place,
25953|Binding in union is of grace.
25953|He who the body's life confers,
25953|From aught that is not, never slights,
25953|And when his body's life is spent
25953|The spirit cannot be content.
25953|I asked him what way he went,
25953|How many leagues he had not seen.
25953|The foolish wish I dared to own,
25953|To give the secret now I dare;
25953|I hid my foolish thought away,
25953|And lest the truth should be revealed,
25953|I thus began to test my will,
25953|And thus the answer was expressed:
25953|"I wish, O soul, that thou wouldst keep
25953|At present, in thyself at rest:
25953|But thou from sense hast naught to fear,
25953|And I my truth would not deny.
25953|And now by strength to hold thee fast,
25953|I do thee scorn, and I will give
25953|Henceforth my soul, and seek thy love
25953|As of the race of gods above."
25953|Then quickly I the thought requir'd,
25953|And thereupon this answer grew:
25953|"O soul, what wonder hath our Lord,

“Come, my tan-faced children,”; no consensus, 3 samples:

## Sample 1:
31314|Do you think this summer would be just one?
31314|Did not all the gods in their thunder-storm,
31314|Say, would make the trees, temples, rocks, and tops, . .
31314|And now would they think these trees didn't know,
31314|But they couldn't be built, I think they do,
31314|And would like to live on in a greener place,
31314|And the gods have made it to be a good history;
31314|And we haven't the time, you see, to say.
31314|The woods, the fields, are green,
31314|And the flowers everywhere.
31314|The sun and the grass,
31314|And the birds and the fishing-trees,
31314|Are quiet everywhere.
31314|And there is the sun,
31314|And the flowers everywhere.
31314|The sun and the flowers,
31314|And the flowers everywhere.
31314|The sun and the flowers,
31314|And one after one,
31314|Are quiet all around.
31314|But there is an end.
31314|There is another place.
31314|The end of the road
31314|To which to go.
31314|The end of the road
31314|To which to go.
31314|Ah, why do you go
31314|Into the garden there into the forest,
31314|And into the wood, and into the village,
31314|And into the field?
31314|Why does the rain
31314|Make me uneasy?
31314|Do you see,
31314|Do you see?
31314|Listen to me,
31314|I am in a hurry,
31314|And away.
31314|There is a strange thing:
31314|I have that alone.
31314|It is not at all like I married:
31314|There is a new man,
31314|Who has just been married.
31314|I have a new hat,
31314|That's in my hat.
31314|I wish I could find out another,
31314|But then there is one . . .
31314|Oh, why do you
## Sample 2:
19|Let us sing this song of mine:
19|Where I am, there I'm,
19|Tell the mighty, mighty sounding,
19|Ocean's awful son of old,
19|In the islands of the blessed,
19|In the groves of Arcadian
19|In his cradle, cold and cold.
19|We, the Fairies, we the children,
19|We the islanders, the bold!
19|We are all that has been fashioned
19|In the wondrous dreams of old,--
19|We, the revelers, the giants,
19|We the children, and the bold!
19|O the wondrous song of battle!
19|O the spoils of men of might!
19|O the spoils of conquest, conquest,
19|Where the many are not quite.
19|By the walls of ancient stories,
19|By the marble-mantled wall;
19|By the chains of dread OENEUT,
19|And the marvels of the fall.
19|By the ramparts of the giants,
19|By the caverns of the deep,
19|By the graves of men immortal,
19|By the caverns of the deep,--
19|By the temples of the Morning,
19|By the temples of the Night,
19|Where the warriors and the giants
19|Met in vision met in fight,
19|And the dying captive maidens,
19|Sat in still and stately light.
19|Child of Earth, too fair for sorrow,
19|Mother of the Light of Life,
19|Fairest daughter of bright radiance,
19|We have sung thee a new song!
19|By the ramparts of the giants,
19|By the temples of the night!
19|By the peaks of Alleghany,
19|Where the eagle cleaves the sky,
19|Trojan prince of fire and glory,
19|Comes in might the ivory-headed:
19|By the palaceer Hippolyta,
19|And the brilliant dame who leads him
19|Where the women weep and smite them
19|On the great steed Aristagrus,
19|Sorrowing, binds his loins of golden
## Sample 3:
1165|Bring us back our olden treasures.
1165|In the small white house with the crumbs for food and fire,
1165|How you cursed the stars and drummers!  I hear you drumming.
1165|Are you getting sleepy-hearted?
1165|Are you writing toiling while night was ebbing?
1165|Do you yet look in the face so white?
1165|How your furrows, how you pined away!
1165|Up the ladder--who would fain be strong?
1165|Who would hunger after freedom long?
1165|Are you being hungry, child, and tired?
1165|Mother, mother, come blow me a song.
1165|Out of doors a man came and sat beside me:
1165|He was black, he was white, but I could not see him;
1165|He was robed in a sackbut of very little gold,
1165|And the words that he said were, "Hotel-fever!" . . .
1165|He said:
1165|He has seen Barra's vision.  He is not old;
1165|He is not fit for the desert.  He can find
1165|Sight and freedom, voice and heart and mind,
1165|And he has not learnt to forget . . . yet he
1165|Has a vision of Paradise.
1165|But, alas! the thing is over,
1165|And there is no chance can take me.  That is why
1165|I looked, he said.
1165|The place is filled with flowers,
1165|With curiosities and secret pain;
1165|One has a face like those of flowers,
1165|One has an accent like a bell.
1165|The small sad music of my days
1165|Moves on.  The grassy fields and lawns
1165|Are not more silent than the stones;
1165|But one face moves beneath the stars. . . .
1165|All this is very beautiful, perhaps;
1165|The hills and woods, the fields and meadows,
1165|The clouds and clouds and all the sky,
1165|The sea's sky and the hills' sky.

“Let us go then, you and I,”; no consensus, 3 samples:

## Sample 1:
22142|When the summer comes again,
22142|When the birds on the sunny land
22142|Make the winter to come again,
22142|I may say that the happy hour
22142|When the harvest-time comes again,
22142|When the heart aches for the land of my love
22142|And the day is my heart's desire,
22142|When the harvest home is come
22142|And the days are my heart's desire."
22142|"Oh! what if we both would wander over the sea,
22142|Afar from the home on the lowland, and stray
22142|O'er the hillside and over the dale, as we stray
22142|O'er the hillside and over the moor, through the wood,
22142|By the light of the moon on the hillside and stray
22142|Till the golden mist o'er the landscape is gray.
22142|And often for me the olden log-house is seen,
22142|The cabin'd log and the swinging door,
22142|The house where I lay till the break of day
22142|Till the sun shone out and the shadows drew away,
22142|While the shadows still wandered o'er valley and hill,
22142|Till the heart stood still, till the hush of the hill
22142|Came o'er the meadow and wandered awhile,
22142|In the sweet early gloaming with autumn awhile,
22142|To lie in the light of the long autumn days,
22142|We two.
22142|The gray-haired woman that leaneth on my side
22142|And holdeth me fast in her arms,
22142|Hath bid me lie down 'neath the old oak tree
22142|That so thickly embosoms me.
22142|The greening spring came with its silent voice
22142|When the autumn leaves hang their green,
22142|And the winds from the woods whisper'd a strange
22142|remembrance of many a long vanished year,
22142|Till the gray shoon dropp'd 'neath the gray shoon's shade,
22142|'Mid the gray shoon's shade;
22142|But where is my home
## Sample 2:
30488|Over the plains and slopes, where the wild roses blow,
30488|And the low valleys and the brown hills meet the sea,
30488|And the winds hurry by with a cry, "We want you! we want you!"
30488|And we stand on the roadway, and walk in the rain,
30488|And weep for the dead that have gone from us--we are waiting,
30488|Waiting the word that the wind shatters over the hill.
30488|We have come back from the hills, from the plains where our
30488|blood was shed,
30488|From the fields where the grass was the ground and the
30488|We have come back from the valleys, we have come back
30488|from the hills!
30488|We have come back from the plains, from the plains where
30488|the grass grew green;
30488|And now that the voice of our winds shatters the
30488|world between.
30488|We have come back from the plains, have we come back
30488|to the graves,
30488|And the wind shatters the breath from the hurrying
30488|ship o'er the waves.
30488|We have come back from the plains, have we come back
30488|to the dead years,
30488|And the winds cry aloud, "Go up, and seek
30488|the faces of the gods!"
30488|We have come back from the plains, have we come back
30488|to the years,
30488|And the winds call out, "Go up, and seek the
30488|wilder world appears!"
30488|We have come back from the plains, we have come back
30488|from the days,
30488|And the swift leaves quiver, the bright leaves dance on
30488|our happy way;
30488|We have come back from the plains, we have come back
30488|to the dead years,--
30488|For the voices of men, or the gods' high
30488|heavenly joys!--
30488|We have come back from the plains, and
30488|from the long ago,
30488|Where in the cold and the gloom, on the
## Sample 3:
27494|To meet old friend Angelico's,
27494|And kiss the little one's beard.
27494|We shall have tea in the library;
27494|We shall have tea in town;
27494|By my good friend, Michael dissentund,
27494|I wish all you gentlemen!
27494|The sun has a face as fair as ever,
27494|As fair as ever was seen;
27494|As fair as ever was seen.
27494|There's a smile on his beautiful lips, my boy,
27494|As bright as the eyes of a lark;
27494|As bright as the sun at day-break gray, my boy,
27494|As bright as the eyes of a lark.
27494|There's a smile on his lips, my boy, my boy,
27494|As soft as the kisses he yields;
27494|As bright as the sun at noon-tide's end, my boy,
27494|As bright as the smile of a lark!
27494|Oh, there's something quieter than sleep, my boy,
27494|That twinkles in every eye.
27494|There's a smile on his lips, my boy, my boy,
27494|That shines like a lamp on the earth;
27494|As soft as the sun at noonday gray, my boy,
27494|As bright as the smile of a lark.
27494|A soft, pure love, that endures but ill, my boy,
27494|A soft Persuader that calms not nor charms,
27494|A king of the world-flowers and balm of the sea, my boy,
27494|A bright-hearted lover that seeks and finds, my boy,
27494|A rainbow of Heaven without a stain, my boy,
27494|A rainbow of Heaven without a stain.
27494|A sweet, pure love, that abides in a magical land,
27494|A kingdom-king, and a world-workeen band;
27494|A queen of the wildwood, a maid that is only one name,
27494|The bright-faced one calls her by name.
27494|A sweet, pure love, that all other names name, my boy,

“|To be, or not to be: that is the question:”; some consistency, so prefix “1006”; 3 samples:

## Sample 1:
1006|Shall wretched mortals, for a certainty
1006|Of cowardice, from forth their secret places
1006|Take flight, and turn in view the point askance,
1006|From whence they can, on any object bent,
1006|So that the mirror may not abstain
1006|To 'GOD,' as in the sky grows colourd there
1006|The sunbeam.  If I had, I should prove
1006|The fountain's well-accustom'd form there.
1006|I should not be a word, that is not parting
1006|Betwixt the sun and moon, nor they, for sowing
1006|In opposite direction, doth import
1006|Their night and their moonshine to that shadowy land.
1006|Therefore into dark caves are we brought,
1006|Where underneath is alabaster, save
1006|Of dusky idol, in which shining moors
1006|'Cross-bows do us sometimes view.  Rumm, there,
1006|That consecrates the world, are seen below.
1006|How many stairs, as 't were, to o'erpass the threshold
1006|Of my triumph, when I did before thee,
1006|Enter'd, and were below, though light were not
1006|But in those hours to which this chant is not.
1006|For thereon is my mind alone intent,
1006|And this there be the singing souls, that wait
1006|Their Master's will.  If but e'en now thou choose
1006|To listen to this be melody sweet,
1006|Here may thy heart find rest.  If this thou hear,
1006|Thenceforth what argument there smote thy love;
1006|That, well perceiv'd of me in its ascent,
1006|Thou mayst at last the good from that have learn'd
1006|Of me made perfect, and of now prove o'ercome.
1006|That next I saw, which, hearing, thou shalt take
1006|For guide, arose from out my last embrace,
1006|As a good mother should, that her embrace
1006|Around each form Adams in her son,
1006|The same who
## Sample 2:
1006|Shall they not fear?--let one fear have his throat?
1006|No: for I see, and can perceive it not,
1006|That they, who were before me, were lamenting
1006|For the nature of the world, and in its light
1006|Have pity on their prison-gries, and therefore
1006|Spirits of low estate do not accuse.
1006|As oft along the still-perduing Nile
1006|Along the Indian isle the night-owl thronged,
1006|And with her breath had led them to the rivers,
1006|Thus each along the shore, in company
1006|With the fair consort of a lady, moved
1006|To supper with a working-place of eggs,
1006|Whereon all love was painted; and "O Lord!"
1006|The singer, as he finished singing, struck
1006|On the sea-shore at left and right below,
1006|Stay'd his steps, and thus in accents clear did:
1006|"Behold a wonder! behold a thing
1006|Which, whoso of mortal thing may tell,
1006|Hastes before him, dark and terrible,
1006|Of an eternal weal; and this accursed
1006|Foul thing, which may the doubting furnish try'd
1006|To feebor, I have set before him all,
1006|Who must to nothing turn.  But list: resound
1006|With me the patience, and the pity move."
1006|He then: "Now, sister!  think upon thyself;
1006|For how may I inform thee, if thou hearest,
1006|How soon and how slow time and means he takes?
1006|What world or language may at last he ask,
1006|Who to the world may'st give thee all things fair,
1006|Thou know'st, who hold'st it?"  Compassing that,
1006|And beck'ning the approach of his approach,
1006|I answer'd: "By new stars, and by George, spirit
1006|Of sign all cardinal virtues I'm assured,
1006|That wrong thy sight; but
## Sample 3:
1006|What became of thatch of yours
1006|That you sally forth so suddenly?'
1006|I did not know him; but he felt
1006|As freshening blood the air beneath: and tow'rds me
1006|He moved, and I towards him moved, and fond
1006|As father, and in words, and with the sweet
1006|And with the sweet comply'd; as he had will'd,
1006|I wept, and to his bosom lifting me
1006|Upward, and all the while distill'd my blood.
1006|Proceeding from the river, and I thence
1006|Shelter'd and still, I lifted up mine eyes
1006|Toward the sun upon the matin light,
1006|And looking, ever taking it in view,
1006|Again beheld it tow'rds the other bank,
1006|From those eternal Poubs where first it was.
1006|Unto the chasm precipitous, I found
1006|A narrow path and new to me expos'd,
1006|And to the left hand and the right mine eyes
1006|Directed, and between, on the sharp rocks
1006|Directed, and between, on each side me,
1006|And on the ground, were dark and cavern'd depths.
1006|Then to the right it spake, in semblance such
1006|As sometimes Indian women have beheld
1006|Saw in their dreams the latter realms of Hell:
1006|"Why partest from me with looks enchanted,
1006|"That farthest from thine eyes withdraw thee not?"
1006|I said it somewhat, and turn'd to the right,
1006|And when the chasm was gone, and I had reach'd
1006|The bottom, where the stairs were not beheld,
1006|Backward my eye alone its course directed,
1006|Directed to another care and deep,
1006|That in its bark it led me; and the sage
1006|To my recalling them returnd I not.
1006|"Of all that I have shown thee, since that gate
1006|We enter'd first, whose is the self-"

“Romeo, Romeo! Wherefore art thou Romeo?”; some consistency, with 1006 popping up again as a prefix (Shakespeare perhaps is memorable enough for GPT-2-poetry-prefix); 3 samples:

## Sample 1:
1006|If once thou see, the sport is vanity.
1006|But fix thine eyes beneath: the shameless'st thou.
1006|Ah! yet no better sight hath ever brought us
1006|To the destruction of Eve, from the rape
1006|Of the fair world, where wickedness lay hid.
1006|Ah! now before the good Augustus returns
1006|In daily deeds, let our pray'rs to the Furies
1006|Be changed; for that he longs counterfeited
1006|The world's adornment, which in books for us
1006|His saintly he still wears, who for us did
1006|Lays his beloved, us'd have thereby
1006|Been so enamoured of this roundelay."
1006|As it was spilt by him who severs the dry land
1006|And, each eve earlier than the setting sun,
1006|Wines underneath the rose that Tarquin leaves;
1006|So ravished I the more the fair, the more
1006|The great disdain, the pity, the long-sought,
1006|Toward the ribald nature of this law,
1006|Which the new world much sooner would be war
1006|With my sweet-will, the which the will hath oft
1006|Been usefully perverted.  I beheld
1006|Four mighty shades, and they besought for speech,
1006|But spake but slowly; whereto I re-turned,
1006|They showed me the grievous shame that clustered
1006|In their resplendence, who had all confounded
1006|With Adam, that Adam's fair abode
1006|Long time abode in peace.  "Remember," thus
1006|The teacher cried, "Adam, I the three."
1006|"And is this the true mettled nymph, that shine
1006|Thy twins now in th' ascent, as thou dost now
1006|Below?"  I said: and he, "My son, who know'st
1006|The art of song, and who but hear, may be
1006|Lest thou divide the loaves from five to six."
1006|Thereat the
## Sample 2:
1006|If truly the mad people of Raymond dread thee,
1006|Well hast thou 'scaped yourselves, they are already
1006|Each other's torment; and the wretches, mortals!
1006|Are to the shades of thy polluted ways."
1006|Whence she again replying, after us?"
1006|Purgator to the Red relates this; and Dante,
1006|"That spirit of the just hath spoken and said,
1006|'Purge now the mountain of the unquench'd, and make
1006|The sea of fire purge otherwhere he sees,
1006|Where Hood and other vain adventures 've led;
1006|Let the o'er-shadowing bridge ill-beseem'd
1006|Yoke up the mighty demons while they walk
1006|In different paths, and divers guides disentangle
1006|The tangled ways, so that no stork may turn them back;"
1006|And the good Master to me: "See thou findest
1006|E'en thus, how in the little one the race
1006|Is to be cumber'd with the broken blossom.
1006|That from this blow across theims of brightness
1006|Has not so virtue worried thee, it seems,
1006|That one with good intent in passing 'bove it
1006|Thou seest the harmonies of the mind,
1006|And therefore dost not comprehend them.
1006|That the disturbance which the sound invades
1006|Interpreteth not, maketh thy fancy
1006|Take in good hieroglyphs of everichoon;
1006|Thou seest that not only he who bids it
1006|Be, but that other who behind the hill
1006|Comes from step's shadow, comes in solitude;
1006|So that the evil one, who, in disdain
1006|Of creature freedom, 'bove the other waits,
1006|The same, who hoping in complacency
1006|To aid thee in thy loftier thoughts, like them
1006|I see now aids, which thee, thus overcome,
1006|With light returneth to thy mind from thee."
1006|Ill shapes that course which
## Sample 3:
1006|If thou no longer liv'st in the old age,
1006|To stop the running of the vein thou well;
1006|Thou by thy deeds hast purged away my youth,
1006|Strengthen'd by me in good deeds and old,
1006|And banish'd with my beauty from my song.
1006|O how much better were it I had Ruthven,
1006|With Beatrice and high excellence
1006|Of soul! to have companion'd such, who yields
1006|To her for intermingling of the world!
1006|But I see thee already clasp'd in thine,
1006|Not in thy only Heaven, but in the life
1006|To which thou crav'st it, and dost here renew it.
1006|This good companion, speak from whence it is
1006|That I should fear not, when the hour arrive
1006|That to your sighs a second time shall bring,
1006|Then with her grace, which that affection gave
1006|And your fair looks, which with her will I scan,
1006|Till my changed nature prove itself as yours."
1006|Whence her reply the thoughtful Sage imposed;
1006|"And I am He, whom thou didst long for, dead
1006|Cord de'il me not," he answer'd, "yet for thee
1006| hamm'st me.  Heart of Potentate was I,
1006|And life was in the willow.  That the words
1006|Which now I hear, shall marvel thee, wert thou
1006|A dearth in love.  Ah, how I long, long to
1006|For it am pain and grief!  that knowest well
1006|My soul's fair metes on earth!"  His sovran speech
1006|Bespake the meek thanks of the high Lord.
1006|I do not over 8 oceans turn," he then
1006|Began, "if new device can be of late
1006|Received in thee, tell us, who thou wast, and by
1006|The heels of th' mountain, times such words to me."
1006|"That I frizzano elsewhere is


Upon request, I generated 100 samples of Lewis Carroll’s “Jabberwocky”. Examining preliminary samples, the closest prefix turned out to be #24560, corresponding to The Jingle Book (Wells 1899), an anthology of humorous children’s verse (which makes sense). “Jabberwocky” itself does not appear in the PG corpus, but the “Jabberwock” is mentioned in one of the poems in Wells 1899, the acrostic poem “An Alphabet Zoo”; so, close enough.

Some samples:

24560|He found a foxy in the brake,
24560|A cunning fox of scarlet dye,
24560|And from that foxy followed make
24560|The scrawny fox in glee.
24560|He followed with his dam and horn
24560|To where the river-water runs,
24560|And as his living current on
24560|The river-water likes him up
24560|A mighty rocky heifer heaves,
24560|And in a single field, or twain,
24560|Shows like the yellow corn;
24560|And when the wind doth blow, so too
24560|Low in his bottom lies his head,
24560|And in the grass leaps up again,
24560|In fearful freedom unbetrayed.
24560|And while the sword-strokes gushed and ran
24560|A-gathering like a hurricane,
24560|The wildest war-cries, loud and shrill,
24560|Came surging through the hills.
24560|At times a wild yell rose in the air,
24560|From steep to steep, from moor to moor,
24560|A scream of horror and despair,
24560|Of death approaching neathernock,
24560|Of dying embers in the pore.
24560|“Thy shield, my son! thy sabre bright!
24560|My hands! thy lips! thy lips of flame!
24560|O hasten! I have done the same.”
24560|It made the blood tingle and squall
24560|To hear that wounded warrior’s cry;
24560|It made the life blood quickly fall
24560|On the proud swine that fought and died—
24560|“Say on! say on! thou wounded swine,
24560|Say on! thou hewer of the sanguine!
24560|Say on! thou hewer of the sanguine!
24560|Say on! thou hewer of the sanguine!
24560|Say on! say on! that pierced youth
24560|Who crossed the heath in days of yore,
24560|And when he crossed it, “Say on!”
24560|It made the blood tingle and squall
24560|To hear that wounded warrior’s cry;
24560|It made the life blood quickly quail,
24560|To hear that death did soon draw nigh.
24560|It made the blood tingle and squall,
24560|To hear that death did soon draw nigh.
24560|“Say on! say on! thou wounded swine,
24560|Say on! thou hewer of the sanguine!
24560|Thou art not doomed to fetch a pail,
24560|’Tis not the ebbing blood so drain.”
24560|And then it made the blood tingle,
24560|To hear that agony within:
24560|“Thou liest
24560|“It is morn!” said the clover-bush,
24560|As the sun came up the sky;
24560|“It is morn!” said the cattle-cock,
24560|As it drifted by.
24560|Then loud cried the blackbird, and forth
24560|On the moor the clamorer fell;
24560|“The rusty ruts!” he cried, “is morn!”
24560|And he opened his little cage,
24560|And went to his morning cell.
24560|Two ravening birds sat at his feet,
24560|One flew away and one fell,
24560|The blackbird with a shining face,
24560|Sang these grand welcomes to his race.
24560|“Who is the Blackbird?” they said,
24560|“Singing high, sing low, sing low,
24560|Sings the song for the giver and light?”
24560|“I am the Blackbird—a joy befall
24560|To the young in the woods so fair.
24560|I am the wild-duck, which roams at will
24560|Up and down in the gulphyry rill.
24560|I, the giver of rivers fair.
24560|I, the joy-devoted mole,
24560|I, the joy-devoted mole.
24560|I, the humming-bird, which sings so sweet
24560|I, the giver of rivers fair.
24560|I, the bird-cail’d eggs, with a silver sound,
24560|I, the joy-devoted mole.
24560|I, the giver of goodly things,
24560|I, the swift-wing’d eagle, I,
24560|I, the joy-devoted mole.
24560|From the sunny, sunny south,
24560|From the sunny south,
24560|The swarm departed,
24560|But woe to every wicked wight
24560|That ever them befel!
24560|“O shame to every wicked wight
24560|That ever them beguile!”


Subjectively, the output shows a lot of poetry knowledge, much better than the char-RNN samples. There is rhyming, themes are sustained for shockingly long passages compared to the char-RNN, and there are many passages I feel could inspire a poet or even be cleaned up a little to pass as poems on their own. Adding the metadata did help: GPT-2-poetry is worse than GPT-2-poetry-prefix.
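The inline-metadata scheme responsible for that improvement is trivial string processing: every non-blank line of each book is tagged with its Project Gutenberg ebook ID. A minimal sketch (the function name is mine; the example line & ID are taken from the samples above):

```python
def prefix_corpus(books):
    """Tag every non-blank line of each book with its Project Gutenberg
    ebook ID, separated by '|', so a model trained on the result learns
    to associate each ID prefix with that book's style."""
    out = []
    for book_id, lines in books.items():
        for line in lines:
            line = line.strip()
            if line:  # drop blank lines
                out.append(f"{book_id}|{line}")
    return "\n".join(out)

# e.g. a line from The Jingle Book (PG #24560):
corpus = prefix_corpus({24560: ["He found a foxy in the brake,"]})
# corpus == "24560|He found a foxy in the brake,"
```

At sampling time the process is run in reverse: prepend a desired ID (or nothing, and let the model pick one) to the prompt.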

Is GPT-2-poetry-prefix better than GPT-2-small at poetry completions (since GPT-2-small will probably hardly ever generate poetry without a prompt)? Probably, with exceptions. The “Howl” completion is far worse, but for an understandable reason: the age of the PG corpus; if anyone could assemble an equally large corpus of more recent poetry, I’d expect GPT-2-small finetuning to produce better completions. The Pope samples for -prefix are clearly better. I would argue that the Shelley samples are somewhat better. And the 8 famous line completions are overall of much higher poetic quality (several of the GPT-2-small completions are just prose, unsurprisingly).

So, if one is looking for poetry completions in an old-fashioned vein, it delivers, but at the cost of flexibility in generating more prose-like (and hence more contemporary) poems. This is an expected and fixable problem, and overall, I consider GPT-2-poetry-prefix a success as a poem generator & better than my previous char-RNNs.
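Such quality comparisons are unavoidably subjective. If one wanted even a crude automated proxy for comparing batches of completions, something as simple as an end-rhyme rate could be computed; a toy sketch of my own devising (not used in the comparisons above, and no substitute for reading the samples):

```python
def rhyme_rate(lines, min_suffix=2):
    """Crude end-rhyme proxy: the fraction of consecutive line pairs whose
    final words share a suffix of at least `min_suffix` characters
    (identical words are not counted as rhymes)."""
    words = [line.split()[-1].strip('.,;:!?"\'-').lower()
             for line in lines if line.split()]
    pairs = list(zip(words, words[1:]))
    if not pairs:
        return 0.0
    def rhymes(a, b):
        return a != b and a[-min_suffix:] == b[-min_suffix:]
    return sum(rhymes(a, b) for a, b in pairs) / len(pairs)

rhyme_rate(["The sun and the grass,", "Are quiet everywhere."])  # -> 0.0
rhyme_rate(["He found a foxy in the brake,", "Make the winter to come again, awake"])
```

A real metric would use a pronunciation dictionary rather than orthographic suffixes (English spelling makes “cough”/“bough” a false positive and “blue”/“too” a false negative), but even this would catch gross differences between a prose-generating and a verse-generating model.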


Nor is this near the limit for Transformer-based poetry generation, as there are many possible improvements which could be made, all of which I’d expect to deliver substantial gains:

  • make it bigger:

    • bigger NN models: these results use the publicly-released GPT-2-small, which delivers inferior results on all tasks compared to the unreleased GPT-2-large: the samples generated by OpenAI & associates from GPT-2-large are much better than GPT-2-small samples, indicating that simply scaling up continues to deliver gains. Nor did the various GPT-2 model sizes appear to reach any natural limit with GPT-2-large, indicating that the Transformer NNs can be increased much further before hitting zero marginal gains. (This is consistent with other large-scale NN research, particularly on CNNs where even billions of images can be usefully trained upon.)
    • better NN models (which will probably need to be bigger): adding recurrency like Transformer-XL or more attention heads or more layers or external memory functions or on-the-fly adaptation; there are many possibilities here. (The prefix can be seen as an extremely crude kind of recurrency or memory, and helped a lot; how much more so a real memory?)
    • more & better data: quantity-wise, the PG corpus is barely a tenth of a gigabyte and has enormous omissions: all of modern poetry, for example, not to mention most foreign poetry in translation, or non-English poetry as a whole (why not a multi-lingual GPT-2, if sufficiently large? Neural machine translation approaches improve the more languages they have access to, so why not regular language generation?). There are many places additional poetry could be obtained from, such as WikiSource, the Poetry Foundation, Libgen, or the Internet in general (perhaps write a poetry-detector Transformer to search through a dump like Common Crawl?). Quality-wise, the PG corpus is good but still has a number of flaws: a lot of prose, just enough non-English poetry to screw things up (especially Latin), mostly pre-1923 poetry, & minimal metadata (ideally, poems would be individual units rather than book-length streams, and metadata like author would be available for use in prefixes).
  • generate smarter:

    • using a better sampling strategy than top-k, like beam search (this gave noticeable improvements for char-RNNs compared to greedy Boltzmann sampling)
    • use tree search methods: any deep, thorough search inevitably becomes a tree; tree searches enable ‘backtracking’, ‘revision’, and ‘changing its mind’ about multiple possible variants of a poem, as opposed to the usual sampling approaches, which tend to commit to each word and force all-or-nothing choices
  • train better, by switching to the RL setting:

    • adding global end-to-end losses, which enable training to optimize non-differentiable properties, rather than easy (but partially irrelevant) predictive losses such as cross-entropy on the next word. For example, rules defining acceptable meter or rhyme use, or penalties on total repetition: these cannot be optimized via normal training, because no individual discrete word is responsible, and parameters cannot be smoothly adjusted to decrease/increase a global property like ‘rhymes’, which is the result of all words considered together as a whole. (This sort of RL loss has been employed in other natural language tasks like machine translation, where metrics like predictive loss do not map onto the desired goal of semantically-correct translation, and word-by-word generation of translations yields similar issues as here, but there are metrics like BLEU or ROUGE or grammar checkers which provide a crude measure of global quality. RL approaches have many virtues.)

    • using subjective quality-based losses, like preference learning:

      instead of training a NN to predict individual next-characters as accurately as possible or imitate a text corpus as well as possible, we really just want them to predict good next-characters to write text as well as possible—which is not the same thing at all, any more than accurately predicting a human Go player’s next move on average is the same thing as playing Go superhumanly well.

      This encourages more global coherency, more thematic progression, use of rare words when appropriate, surprising subversions or twists which work well when tried but don’t appear in the original corpus, learning esthetics, and so on. If it works and the new GPT-2-poetry successfully produces poems which consistently get the top score from the critic, with no further improvement happening, then you simply read a batch of its new poems, pick which one in each pair you like, retrain the critic on the expanded dataset to detect the remaining flaws in the ones you disliked, and then keep training GPT-2-poetry to avoid generating poems like those you disliked & generate more like those you liked. Repeat for many cycles, and it should generate excellent poems while avoiding the flaws of crude likelihood training and even cruder top-k sampling which hobble GPT-2-poetry right now. Even better, one could create a website to crowdsource the rankings, keeping it training 24/7 and improving indefinitely.

    • using “expert iteration” architectures like AlphaZero to do much more sophisticated search over possible poems, creating an iterative bootstrap

    • adding creativity losses along the lines of “CAN: Creative Adversarial Networks, Generating ‘Art’ by Learning About Styles and Deviating from Style Norms”, Elgammal et al 2017, where an additional loss rewards the GAN for deviating from known styles, encouraging diversity

      • one could attempt to invent new styles of poetry by taking inspiration from evolutionary methods, such as the “Population-Based Training” variant employed in DeepMind’s AlphaStar League, which created diversity by deliberately scrambling the ‘rules’ for each lineage of agents. The AlphaStar League used a population of NNs, each forced to specialize in using a particular unit or rewarded for achieving particular goals like defeating a specific other NN (rather than winning in general); it was credited with forcing the overall AlphaStar population to explore strategies reliant on particular kinds of units and to figure out counter-strategies to successful ones. Something similar could be done with poetry rules: train many different agents, each given a specific rhyme scheme or meter or vocabulary in its reward function, and, in preference-learning approaches, provide the best poems to human critics for rating & improving the NN critic. Potentially, exciting new combinations would emerge as producing the poems humans rate highest.
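As a toy illustration of the first suggestion (a hypothetical sketch, unrelated to the actual GPT-2 training code; the words, reward, and hyperparameters are all made up), a score-function estimator like REINFORCE can optimize a global, non-differentiable reward such as “the line endings rhyme”, which no per-word differentiable loss can express:

```python
import numpy as np

# Toy REINFORCE sketch (hypothetical; not the real training code):
# "does the chosen line-ending rhyme with 'day'?" is a discrete,
# non-differentiable reward, but weighting the log-likelihood gradient
# of the sampled choice by the reward still shifts the policy correctly.
rng = np.random.default_rng(0)
words = ["way", "dog", "cat", "tree"]   # candidate endings for line 2
first_end = "day"                       # fixed ending of line 1

def rhymes(a, b):
    # crude stand-in for real phonetics: a shared 2-letter suffix
    return a != b and a[-2:] == b[-2:]

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

logits = np.zeros(len(words))           # the 'policy' over endings
lr = 0.5
for _ in range(200):
    p = softmax(logits)
    pick = rng.choice(len(words), p=p)  # sample an ending
    reward = 1.0 if rhymes(first_end, words[pick]) else 0.0
    grad = -p                           # d log p(pick) / d logits ...
    grad[pick] += 1.0                   # ... for a softmax policy
    logits += lr * reward * grad        # REINFORCE update

best = words[int(np.argmax(logits))]
print(best)                             # -> "way", the rhyming ending
```

The same estimator scales up to whole poems: sample a poem, compute any global score of it (meter, rhyme scheme, repetition penalty, a critic’s rating), and weight the log-likelihood gradient of the sampled tokens by that score.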

Given that GPT-2-small is far from the state of the art as of February 2019, and that hardware & generative-NN research are advancing rapidly, it will be exciting to see what sort of poetry can be generated in another 4 years!
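The critic at the heart of the preference-learning loop suggested above can also be sketched in miniature (a toy, assuming nothing about any real implementation: a linear Bradley–Terry model over invented summary features, standing in for a NN reading the text):

```python
import numpy as np

# Toy preference-learning critic (hypothetical): given human picks between
# pairs of poems, fit a scalar quality score s(poem) under the Bradley-Terry
# model, maximizing log sigmoid(s(winner) - s(loser)). Features here are an
# invented 3-dim summary (rhyme density, repetition, length).
rng = np.random.default_rng(0)
w = np.zeros(3)                                   # linear critic weights

def score(feats):
    return feats @ w

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

# Synthetic preferences: this "human" likes rhyme (feature 0) and dislikes
# repetition (feature 1); feature 2 (length) is irrelevant noise.
pairs = []
for _ in range(500):
    a, b = rng.random(3), rng.random(3)
    winner, loser = (a, b) if (a[0] - a[1]) > (b[0] - b[1]) else (b, a)
    pairs.append((winner, loser))

lr = 0.1
for _ in range(50):                               # epochs of SGD
    for winner, loser in pairs:
        p = sigmoid(score(winner) - score(loser)) # P(winner preferred)
        w += lr * (1.0 - p) * (winner - loser)    # log-likelihood gradient

# The trained critic now supplies a reward signal for the generator:
rhymer   = np.array([0.9, 0.1, 0.5])
repeater = np.array([0.1, 0.9, 0.5])
print(score(rhymer) > score(repeater))            # True
```

Each round of human pairwise picks extends `pairs`, the critic is refit, and its score becomes the reward for further generator training, closing the loop described above.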

  1. I couldn’t compare the quality to Aanand’s original 3x512 because he provided neither the final validation score of his model nor the exact 50MB corpus to retrain on.

  2. A Transformer is a considerably different architecture from an RNN, and is not that easy to explain, as it uses stacked layers of “attention” (implemented as simple matrix multiplications) over a large but finite input window, allowing flexible internal control flow without any recurrence or hidden state or LSTM units necessary. For increasingly-technical explanations, see:

  3. Other examples of finetuning include Facebook Messenger logs and nshepperd’s unpublished training on Linux kernel C source code & IRC logs. And, while it doesn’t use GPT-2-small, “Stack Roboflow: This Question Does Not Exist” is too good not to mention.

  4. I have 2 GPUs but nshepperd’s code does not (yet) support multi-GPU training.

  5. One might worry that such inline metadata would be a bad thing: because the Transformer has no hidden state or ‘memory’, the metadata takes up space in the model’s limited context ‘window’ of inputs, pushing real words out of the window, thereby degrading quality and making generation even more incoherent & rambling.

    But on the other hand, if it does learn to associate specific IDs with genres/topics, then repetition of the inline metadata serves as a ‘mnemonic’ for global information which is available to all subsequent iterations of the model, serving as a crude memory itself.

    For example, if Homeric pastiche has ID #16452, then as long as each iteration of the model during sampling overlaps with the first one on just the ID, so that all of them see “16452”, all of them will consistently agree on generating Homeric pastiche rather than some other pastiche: they all see the same ID somewhere in their context window, and that guides their generation.
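That window arithmetic can be made concrete (a toy sketch; the window size and tokenization are invented for illustration):

```python
# Toy illustration (hypothetical ID, window size & tokenization) of the
# inline-metadata 'mnemonic': with a finite context window and no hidden
# state, every sampling step that can still see the ID "16452" agrees on
# the genre for free -- but once generation scrolls past the window, the
# mnemonic is gone unless the ID gets re-emitted.

WINDOW = 16                                  # pretend context size in tokens
tokens = ["16452", "|"] + ["word"] * 40      # ID prefix, then generated text

def context(step):
    """Tokens visible to the model when generating token number `step`."""
    return tokens[max(0, step - WINDOW):step]

print("16452" in context(10))   # True: early steps still see the ID
print("16452" in context(30))   # False: the ID has scrolled out of view
```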