AI/​poetry directory

See Also


“Fully-Connected Neural Nets”, Branwen 2021

FC: “Fully-Connected Neural Nets”⁠, Gwern Branwen (2021-04-24; backlinks):

Bibliography of ML papers related to multi-layer perceptrons (fully-connected neural nets), often showing surprising efficacy despite their reputation for being too general to be usable (representing a possible future Bitter Lesson).

“Weird AI Yankovic: Generating Parody Lyrics”, Riedl 2020

“Weird AI Yankovic: Generating Parody Lyrics”⁠, Mark Riedl (2020-09-25):

Lyrics parody swaps one set of words that accompany a melody with a new set of words, preserving the number of syllables per line and the rhyme scheme. Lyrics parody generation is a challenge for controllable text generation. We show how a specialized sampling procedure, combined with backward text generation with XLNet, can produce parody lyrics that reliably meet the syllable and rhyme-scheme constraints. We introduce the Weird AI Yankovic system and provide a case study evaluation. We conclude with societal implications of neural lyric parody generation.
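[The constraint-checking half of such a sampler can be sketched with naive heuristics; the sketch below uses vowel-letter groups as a stand-in for real syllable counting and rhyme detection (the paper's actual system relies on XLNet and proper phonetics, and all function names here are illustrative):

```python
import re

def count_syllables(word: str) -> int:
    """Naive syllable count: number of contiguous vowel-letter groups.
    (A real system would use a pronunciation dictionary such as CMUdict.)"""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))

def crude_rhyme(a: str, b: str) -> bool:
    """Very crude rhyme test: identical spelling from the last vowel group on."""
    def tail(w: str) -> str:
        m = re.search(r"[aeiouy]+[^aeiouy]*$", w.lower())
        return m.group(0) if m else w.lower()
    return tail(a) == tail(b)

def line_ok(candidate: str, syllable_budget: int, rhyme_word: str) -> bool:
    """Accept a candidate line only if it matches the original line's
    syllable count and its final word rhymes with the target word."""
    words = candidate.split()
    syllables = sum(count_syllables(w) for w in words)
    return syllables == syllable_budget and crude_rhyme(words[-1], rhyme_word)
```

A sampler would generate many candidate lines and keep only those for which `line_ok` holds, which is the rejection-style filtering the constraints imply.]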

“Can GPT-3 Pass a Writer’s Turing Test?”, Elkins & Chun 2020

2020-elkins.pdf: “Can GPT-3 Pass a Writer’s Turing Test?”⁠, Katherine Elkins, Jon Chun (2020-09-14; backlinks; similar):

Until recently the field of natural language generation relied upon formalized grammar systems, small-scale statistical models, and lengthy sets of heuristic rules. This older technology was fairly limited and brittle: it could remix language into word salad poems or chat with humans within narrowly defined topics.

Recently, very large-scale statistical language models have dramatically advanced the field, and GPT-3 is just one example. It can internalize the rules of language without explicit programming or rules. Instead, much like a human child, GPT-3 learns language through repeated exposure, albeit on a much larger scale.

Without explicit rules, it can sometimes fail at the simplest of linguistic tasks, but it can also excel at more difficult ones like imitating an author or waxing philosophical.

“GPT-2 AI Poetry Generation: Writing like Donne”, Case 2020

2020-case.pdf: “GPT-2 AI Poetry Generation: Writing like Donne”⁠, Kaiya Case (2020-06-04; backlinks)

“Rapformer: Conditional Rap Lyrics Generation With Denoising Autoencoders”, Nikolov et al 2020

“Rapformer: Conditional Rap Lyrics Generation with Denoising Autoencoders”⁠, Nikola I. Nikolov, Eric Malmi, Curtis G. Northcutt, Loreto Parisi (2020-04-08):

The ability to combine symbols to generate language is a defining characteristic of human intelligence, particularly in the context of artistic story-telling through lyrics. We develop a method for synthesizing a rap verse based on the content of any text (e.g., a news article), or for augmenting pre-existing rap lyrics. Our method, called Rapformer, is based on training a Transformer-based denoising autoencoder to reconstruct rap lyrics from content words extracted from the lyrics, trying to preserve the essential meaning, while matching the target style. Rapformer features a novel BERT-based paraphrasing scheme for rhyme enhancement which increases the average rhyme density of output lyrics by 10%. Experimental results on three diverse input domains show that Rapformer is capable of generating technically fluent verses that offer a good trade-off between content preservation and style transfer⁠. Furthermore, a Turing-test-like experiment reveals that Rapformer fools human lyrics experts 25% of the time.
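[The rhyme-density metric the abstract cites is, in the rap-lyrics literature, typically the average length of the longest matching vowel-sound suffix between a word and a nearby preceding word. A letter-based approximation (a real implementation would use phonetic transcriptions; the function names and the `window` parameter here are illustrative, not the paper's code):

```python
import re

def vowel_seq(word: str) -> list:
    # Naive assumption: vowel letters stand in for vowel phonemes.
    return re.findall(r"[aeiouy]", word.lower())

def matching_suffix_len(a: str, b: str) -> int:
    """Length of the longest common vowel suffix of two words."""
    va, vb = vowel_seq(a), vowel_seq(b)
    n = 0
    while n < min(len(va), len(vb)) and va[-1 - n] == vb[-1 - n]:
        n += 1
    return n

def rhyme_density(words: list, window: int = 10) -> float:
    """Average, over all words, of the longest vowel-suffix match with
    any of the preceding `window` words."""
    if not words:
        return 0.0
    total = 0
    for i, w in enumerate(words):
        prev = words[max(0, i - window):i]
        total += max((matching_suffix_len(w, p) for p in prev), default=0)
    return total / len(words)
```

On this metric, a paraphrasing step that swaps line-ending words for assonant synonyms raises the score, which is what the BERT-based rhyme-enhancement scheme aims at.]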

“A Hundred Visions and Revisions”, Binder 2020

“A Hundred Visions and Revisions”⁠, Jeff Binder (2020-03-11; backlinks; similar):

“A Hundred Visions and Revisions” is a computer program that alters poems using a neural-network language model. It works by replacing the individual words of the text, one by one, with other words that are more probable according to the BERT language model, while preserving rhyme and meter; in effect, this process banalifies the poem, replacing its linguistic distinctiveness with normativity. The program can also attempt to revise a poem to be about a different topic. As an example, I started with the poem “The Sick Rose” by William Blake:

O Rose thou art sick.
The invisible worm,
That flies in the night
In the howling storm:

Has found out thy bed
Of crimson joy:
And his dark secret love
Does thy life destroy.

Here is the revision:

By God thou art blessed.
The invisible man,
Who walks in the night
In a hooded cloak:

Has found both his source
Of body heat:
And his own power that
Makes his life complete.

…It is also possible to have the program revise a poem to be about a different topic while retaining rhyme, meter, and some other, subtler traces of the original. When I created the finetuned neural network, I included annotations indicating the title and author of each poem. This enables the AI to pick up on patterns in the relation between title and poem. You can then feed in hints about the poem’s title, and the AI will alter the text accordingly.

…All of these revisions retain the rhyme, meter, and punctuation of the original (excepting the slant-rhyme of “eye” and “symmetry”, which the current code cannot detect). If these formal constraints are lifted, the poem will degenerate into prose that bears little relation to the original.

…I also included a feature that enables you to bias the output toward an arbitrary vocabulary. I tested this out using the data from Iain Barr’s analysis of the vocabulary of heavy metal lyrics.

How it works: The BERT model is capable of guessing a word that is “masked”—that is, hidden from the model. To pick an example from the documentation for the implementation I used, one could enter “Who was Jim Henson? Jim Henson was a [MASK]”; the model predicts that the masked word is “puppeteer”. The point of this is to enable the computer to perform question-answering tasks, language modeling standing as a surrogate for more general intelligence. But it is also possible to use the model’s predictions to alter an existing text. To do this, my program tries masking each word in the text and guessing what word should be in that position. For instance, suppose we are looking at this text:

Tyger Tyger, burning bright, in the forests of the night

We try masking each word in order; for instance, at one point we will end up with this:

Tyger Tyger, burning bright, in the [MASK] of the night

The program uses the neural network to predict what word appears in the masked position, subject to various constraints such as rhyme and meter. In this case, the BERT model guesses “middle”, with probability 0.6762. On the other hand, the word that is actually in that position—“forests”—gets probability 0.000076159. We divide the latter by the former to get a score for this potential change: 0.0001126. Since this score happens to be the lowest for any word in the text, the program selects the word “forests” for replacement, giving us this revision:

Tyger Tyger, burning bright, in the middle of the night

The program then repeats this process until there are no more “improvements” to be made.
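[The greedy mask-score-replace loop described above can be sketched directly. The probabilities below are hard-coded from the “Tyger” example; a real implementation would obtain them by querying a masked language model such as BERT at each masked position, and `revise`, `predict`, and `mock_predict` are illustrative names, not Binder’s actual code:

```python
def revise(tokens, predict, min_score=0.001, max_iters=10):
    """Greedy mask-and-replace revision. At each step, mask every position,
    score the existing word as p(existing) / p(model's top guess), and
    replace the lowest-scoring word with the model's guess; stop when no
    score falls below min_score."""
    for _ in range(max_iters):
        best = None  # (score, position, replacement)
        for i in range(len(tokens)):
            top_word, top_p, cur_p = predict(tokens, i)
            score = cur_p / top_p
            if best is None or score < best[0]:
                best = (score, i, top_word)
        if best is None or best[0] >= min_score or best[2] == tokens[best[1]]:
            return tokens  # no "improvement" left to make
        tokens = tokens[:best[1]] + [best[2]] + tokens[best[1] + 1:]
    return tokens

def mock_predict(tokens, i):
    """Stand-in for a masked LM, hard-coded from the Blake example:
    only "forests" is judged improbable in its slot."""
    if tokens[i] == "forests":
        return ("middle", 0.6762, 0.000076159)
    return (tokens[i], 0.5, 0.5)  # model agrees with the existing word

revise(["in", "the", "forests", "of", "the", "night"], mock_predict)
# replaces "forests" (score ≈ 0.0001126) with "middle", then halts
```

A full implementation would also filter the model’s candidate words by the rhyme and meter constraints before scoring, as the text describes.]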

“Writing the Next American Hit: Using GPT-2 to Explore the Possibility of Creating Successful AI-Generated Song Lyrics”, Barrio 2020

2020-barrio.pdf: “Writing the Next American Hit: Using GPT-2 to Explore the Possibility of Creating Successful AI-Generated Song Lyrics”⁠, Barrio (2020; backlinks)

“The Machine As Author”, Gervais 2019

2019-gervais.pdf: “The Machine As Author”⁠, Daniel J. Gervais (2019-03-24; backlinks; similar):

The use of Artificial Intelligence (AI) machines using deep learning neural networks to create material that facially looks like it should be protected by copyright is growing exponentially. From articles in national news media to music, film, poetry and painting, AI machines create material that has economic value and that competes with productions of human authors. The Article reviews both normative and doctrinal arguments for and against the protection by copyright of literary and artistic productions made by AI machines.

The Article finds that the arguments in favor of protection are flawed and unconvincing and that a proper analysis of the history, purpose, and major doctrines of copyright law all lead to the conclusion that productions that do not result from human creative choices belong to the public domain⁠.

The Article proposes a test to determine which productions should be protected, including in case of collaboration between human and machine. Finally, the Article applies the proposed test to three specific fact patterns to illustrate its application.

[Keywords: copyright, author, artificial intelligence, machine learning]

“Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme”, Lau et al 2018

“Deep-speare: A Joint Neural Model of Poetic Language, Meter and Rhyme”⁠, Jey Han Lau, Trevor Cohn, Timothy Baldwin, Julian Brooke, Adam Hammond (2018-07-10; backlinks; similar):

In this paper, we propose a joint architecture that captures language, rhyme and meter for sonnet modelling. We assess the quality of generated poems using crowd and expert judgements. The stress and rhyme models perform very well, as generated poems are largely indistinguishable from human-written poems. Expert evaluation, however, reveals that a vanilla language model captures meter implicitly, and that machine-generated poems still underperform in terms of readability and emotion. Our research shows the importance of expert evaluation for poetry generation, and that future research should look beyond rhyme/​meter and focus on poetic language.

“RNN Metadata for Mimicking Author Style”, Branwen 2015

RNN-metadata: “RNN Metadata for Mimicking Author Style”⁠, Gwern Branwen (2015-09-12; backlinks; similar):

Teaching a text-generating char-RNN to automatically imitate many different authors by labeling the input text by author; additional experiments include imitating Geocities and retraining GPT-2 on a large Project Gutenberg poetry corpus.

Char-RNNs are unsupervised generative models which learn to mimic text sequences. I suggest extending char-RNNs with inline metadata such as genre or author prefixed to each line of input, allowing for better & more efficient metadata, and more controllable sampling of generated output by feeding in desired metadata. A 2015 experiment using torch-rnn on a set of ~30 Project Gutenberg e-books (1 per author) to train a large char-RNN shows that a char-RNN can learn to remember metadata such as authors, learn associated prose styles, and often generate text visibly similar to that of a specified author.

I further try & fail to train a char-RNN on Geocities HTML for unclear reasons.

More successfully, I experiment in 2019 with a recently-developed alternative to char-RNNs⁠, the Transformer NN architecture, by finetuning OpenAI’s GPT-2-117M Transformer model on a much larger (117MB) Project Gutenberg poetry corpus using both unlabeled lines & lines with inline metadata (the source book). The generated poetry is much better. And GPT-3 is better still.
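[The inline-metadata scheme amounts to simple preprocessing: prefix each training line with its label so the model learns the association, then sample by seeding with the desired label. A minimal sketch (the `|` delimiter and function names are illustrative, not the exact format used in the experiments):

```python
def add_metadata(lines, author):
    """Prefix each training line with an inline author label, so a
    char-level model can condition its style on the label."""
    return [f"{author}|{line}" for line in lines]

def strip_metadata(line):
    """Recover (label, text) from a labeled line."""
    label, _, text = line.partition("|")
    return label, text
```

At sampling time, feeding the prompt `"Blake|"` biases generation toward that author’s learned style, which is the "more controllable sampling" the summary describes.]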

“The Unreasonable Effectiveness of Recurrent Neural Networks”, Karpathy 2015

“The Unreasonable Effectiveness of Recurrent Neural Networks”⁠, Andrej Karpathy (2015-05-21; backlinks; similar):

[Exploration of char-RNN neural nets for generating text. Karpathy codes a simple recurrent NN which generates character-by-character, and discovers that it is able to generate remarkably plausible text (at the syntactic level) for Paul Graham⁠, Shakespeare, Wikipedia, LaTeX, Linux C code, and baby names—all using the same generic architecture. Visualizing the internal activity of the char-RNNs, they seem to be genuinely understanding some of the recursive syntactic structure of the text in a way that other text-generation methods like n-grams cannot. Inspired by this post, I began tinkering with char-RNNs for poetry myself; as of 2019, char-RNNs have been largely obsoleted by the new Transformer architecture⁠, but recurrency will make a comeback and Karpathy’s post is still a valuable and fun read.]

There’s something magical about Recurrent Neural Networks (RNNs). I still remember when I trained my first recurrent network for Image Captioning. Within a few dozen minutes of training, my first baby model (with rather arbitrarily-chosen hyperparameters) started to generate very nice looking descriptions of images that were on the edge of making sense. Sometimes the ratio of how simple your model is to the quality of the results you get out of it blows past your expectations, and this was one of those times. What made this result so shocking at the time was that the common wisdom was that RNNs were supposed to be difficult to train (with more experience I’ve in fact reached the opposite conclusion). Fast forward about a year: I’m training RNNs all the time and I’ve witnessed their power and robustness many times, and yet their magical outputs still find ways of amusing me. This post is about sharing some of that magic with you. We’ll train RNNs to generate text character by character and ponder the question “how is that even possible?”

“The First Sally (A), Or, Trurl’s Electronic Bard”, Lem & Kandel 1974

1974-lem-cyberiad-trurlselectronicbard.pdf: “The First Sally (A), or, Trurl’s Electronic Bard”⁠, Stanisław Lem, Michael Kandel (1974-01-01; backlinks)

“Humans Who Are Not Concentrating Are Not General Intelligences” (backlinks)

“AlphaStar: Mastering the Real-Time Strategy Game StarCraft II” (backlinks)