This Waifu Does Not Exist

I describe how I made the website ThisWaifuDoesNotExist.net (TWDNE) for displaying random anime faces generated by StyleGAN neural networks, and how it went viral.
anime, NN, Python, shell, technology, GPT, tutorial
2019-02-19–2020-01-20 · finished · certainty: highly likely · importance: 3


Generating high-quality anime faces has long been a task neural networks struggled with. The invention of StyleGAN in 2018 has effectively solved this task, and I have trained a StyleGAN model which can generate high-quality anime faces at 512px resolution. To show off the recent progress, I made a website, “This Waifu Does Not Exist” (TWDNE), for displaying random StyleGAN 2 faces. TWDNE displays a different neural-net-generated face & plot summary every 15s. The site was popular and went viral online, especially in China. The model can also be used interactively for exploration & editing in the .

TWDNE faces have been used as screensavers, user avatars, character art for game packs or online games, uploaded to Pixiv, given away in streams, and used in a research paper (). TWDNE results also helped inspire Sizigi Studio’s online interactive waifu GAN, , which generates even better anime faces than my StyleGAN results.

In December 2018, StyleGAN (source code/demo video) came out, a stunning followup to their 2017 ProGAN (source/video), which improved the generation of high-resolution (1024px) realistic human faces even further. In my long-running dabbling in generating anime with GANs, ProGAN had by far the best results, but on my 2×1080ti GPUs, reasonable results required >3 weeks, and it was proving difficult to get decent results in acceptable time; StyleGAN excited me because it used a radically different architecture which seemed like it might be able to handle non-photographic images like anime.

The source code & trained mod­els were released 2019-02-04. The wait was ago­niz­ing, but I imme­di­ately applied it to my faces (based on a cor­pus I made by pro­cess­ing ), with aston­ish­ing results: after 1 day, the faces were supe­rior to ProGAN after 3 weeks—so StyleGAN turned out to not just work well on ani­me, but it improved more on ProGAN for anime than it did for pho­tographs!

While I was doing this & shar­ing results on Twit­ter, other peo­ple began set­ting up web­sites to show StyleGAN sam­ples from the pre­trained mod­els or StyleGANs they’d trained them­selves: the quiz “Which Face is Real”, /“This Per­son Does Not Exist”/“This Rental Does Not Exist”/“These Cats Do Not Exist”/“This Car Does Not Exist”/“This Mar­ket­ing Blog Does Not Exist” etc.

Since my StyleGAN anime faces were so good, I thought I’d hop on the bandwagon, and so created, yes, “This Waifu Does Not Exist”: one might say that “waifus” do not exist, but these waifus especially do not exist.

Examples

“I feel like that rat in the exper­i­ment where it can press a but­ton for instant grat­i­fi­ca­tion—I can’t stop refresh­ing”

Anony­mous

A screen­shot of “This Waifu Does Not Exist” (TWDNE) show­ing a ran­dom StyleGAN-generated anime face and a ran­dom GPT-2-117M text sam­ple con­di­tioned on anime keywords/phrases.
“It is so sad to say that this manga has never been seen by any anime fans in the real world and this is an issue that must be addressed. Please make anime movies about me. Please make anime about me. Please make anime about your beau­ti­ful cat. Please make anime movies about me. Please make anime about your cute cat. I wish you the best of luck in your life.
Please make anime about me. Please make anime about my cute cute kit­ten.” —TWDNE #283 (sec­ond screen­shot of TWDNE)
64 TWDNE face sam­ples selected from social media, in an 8×8 grid.
Face sam­ples selected by users of Art­breeder
Time-lapse video of TWDNE, show­ing many differ­ent pairs of faces/texts.

Obor­mot made a ver­sion of TWDNE–dubbed “These Wai­fus Do Not Exist”–which dis­plays a con­stant­ly-up­dat­ing grid of anime faces, with an alter­na­tive ver­sion dis­play­ing an infi­nite mov­ing grid. On a large screen, these are par­tic­u­larly strik­ing as one Baidu forum poster demon­strated:

Pho­to­graph by 大藏游星 of “These Wai­fus Do Not Exist” dis­played max­i­mized on a large com­puter mon­i­tor.

And the slid­ing-grid is hyp­notic (half-hour-long video ver­sion):

Video of “These Wai­fus Do Not Exist” show­ing infi­nite grid scroll of gen­er­ated anime faces.

Implementation

“Once opened I was shown a waifu so lovely and pure that I stared in amaze­ment and awe.
Then the page refreshed auto­mat­i­cal­ly.
Now I am doomed to for­ever refresh to get her back, know­ing it shall never be.”

ter­ratheil­lu­sion­ist

TWDNE is a simple static website serving pre-generated files: it has 100,000 random StyleGAN faces and 100,000 random GPT-2-small text snippets, and it displays a new image/text pair every 15 seconds.

Why sta­tic instead of using a GPU server to gen­er­ate images on the fly? The (al­low­ing for image edit­ing), but was­n’t avail­able in time for TWDNE and -style evo­lu­tion­ary explo­ration has­n’t been imple­mented at all, so there was no point in rent­ing an expen­sive (>$100/month) GPU server to gen­er­ate ran­dom faces on the fly, and it is bet­ter to sim­ply gen­er­ate a large num­ber of ran­dom sam­ples and show those; any­one who wants to look at even more faces can down­load the model and run it them­selves (which would let them con­trol psi or retrain it on new datasets such as faces of a spe­cific char­ac­ter or attribute they are look­ing for). This is also far eas­ier to imple­ment.

There are 3 groups of ran­dom faces gen­er­ated with differ­ent hyper­pa­ra­me­ter set­tings to show the full spec­trum of the trade­off between qual­ity & diver­sity in StyleGAN. The first 70k tex­t-sam­ples are gen­er­ated using OA’s pub­licly-re­leased model GPT-2-117M, given a ran­dom seed 1–70,000 + a long prompt with many ani­me-re­lated words & phrases I picked arbi­trar­ily while play­ing with it; the final 30k were gen­er­ated using a 2-step process, where GPT-2-117M was retrained on an Anime News Net­work dataset of short plot syn­opses to emit plot syn­opses which are then fed into the orig­i­nal GPT-2-117M as a prompt for it to enlarge on. (Un­for­tu­nate­ly, it is not yet pos­si­ble to make the gen­er­ated face & text related in any way, but some of the jux­ta­po­si­tions will, by chance, be amus­ing any­way.) One can see the first set of 40k all dis­played in a video by Scavin.

The static site is a single HTML page, ./index.html, plus 100,000 images at the filenames ./example-{0..99999}.jpg and 100,000 text snippets at ./snippet-{0..99999}.txt. The JS selects a random integer 0–99,999, loads the image with that ID in the background, swaps it in, and loads a new text snippet with the same ID; this repeats every 15s. Additional JS adds buttons for forcing an immediate refresh, stopping the refreshing process (perhaps because the user likes a pair & wants to look at it longer or screenshot/excerpt it), and of course loading Google Analytics to keep an eye on traffic.
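
The selection logic is small enough to sketch. The site itself does this in JS; what follows is only a Python rendering of the same idea, and the names `SITE_SIZE` and `pick_pair` are illustrative rather than taken from the site's source:

```python
import random
from typing import Tuple

SITE_SIZE = 100_000  # number of pre-generated image/text pairs, per the text


def pick_pair(rng: random.Random, size: int = SITE_SIZE) -> Tuple[str, str]:
    """Return the relative paths of one image/snippet pair sharing a random ID."""
    sample_id = rng.randrange(size)  # uniform integer in 0..size-1
    return f"./example-{sample_id}.jpg", f"./snippet-{sample_id}.txt"


# Each "refresh" is one independent draw; nothing is remembered between draws.
rng = random.Random()
image_path, text_path = pick_pair(rng)
print(image_path, text_path)
```

Because the image and the snippet share one ID, a given face is always shown with the same text, even though the pairing itself was arbitrary.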

There is no rewind but­ton or his­to­ry-track­ing, as that would cheapen the expe­ri­ence, elim­i­nat­ing the & the feel­ing of read­ing through an anime ver­sion of the infi­nite —if the user is too slow, a face or story will van­ish, effec­tively for­ever (un­less they want to go through them by hand).

A large pile of responsive CSS (written by Obormot) in the HTML page attempts to make TWDNE usable on all devices from a small smartphone screen to a widescreen 4k display, resizing the faces to fit the screen width and putting the image & text side-by-side on sufficiently-wide displays.

As a sta­tic site, it can be hosted on Ama­zon S3 as a bucket of files, and cached by Cloud­Flare (a fea­ture which turned out to be crit­i­cal when TWDNE went viral). The total TWDNEv1 web­site size is ~6GB. (As of TWDNEv3, with all ver­sions & text snip­pets, it weighs 34G­B.)

The main upfront cost was ~$50 to pre­pay 4 years of DNS for thiswaifudoesnotexist.net (in­fu­ri­at­ing­ly, thiswaifudoesnotexist.com turned out to have been squat­ted just hours before I began work­ing on it, pos­si­bly because of my ear­lier tweet); while Cloud­Flare is free, it does­n’t cache 100% and the non-Cloud­Flare-cached S3 band­width & host­ing cost $98 in Feb­ru­ary 2019.

“There’s this neural net­work that gen­er­ates anime girl­s…­some results look nor­mal and oth­ers are ter­ri­fy­ing. I painted a col­lage of my favorite results.” —Dinosar­den, 2019-02-22
“Real­ity can be what­ever he wants.”Venyes [Thanos meme about “These Wai­fus Do Not Exist”]

Downloads

  • The StyleGAN model used for the TWDNEv1 samples (294MB, .pkl); alternate download via rsync (available for any Unix; alternative implementations are available for ):

    rsync --recursive --times --verbose rsync://78.46.86.149:873/twdne/2019-02-26-stylegan-faces-network-02048-016041.pkl ./

  • all TWDNEv1–3 faces & text snip­pets (34GB) are avail­able for down­load via a pub­lic rsync mir­ror:

    rsync --recursive --times --verbose rsync://78.46.86.149:873/twdne/ ./twdne/

  • all 100,000 text sam­ples (50MB, .txt)

Creating

Training StyleGAN

The process of cre­at­ing the face dataset & train­ing a StyleGAN is too involved to go into here. For a detailed tuto­r­ial & use­ful patches/scripts, see the main arti­cle .

Faces

TWDNEv1

“We’re reach­ing lev­els of smug never thought pos­si­ble”

Anony­mous

I don’t know what hap­pened here.

Gen­er­at­ing the faces is straight­for­ward. The StyleGAN repo pro­vides pretrained_example.py, which down­loads one of the Nvidia mod­els, loads it, and gen­er­ates a sin­gle face with the fixed ran­dom seed 5; to make this more use­ful, I sim­ply replace the remote URL with a local model file, change the ran­dom seed to None so a differ­ent seed is used every time, and loop n times to gen­er­ate n faces:

23,25c23,26
<     url = 'https://drive.google.com/uc?id=1MEGjdvVpUsu1jB4zrXZN7Y4kBBOzizDQ' # karras2019stylegan-ffhq-1024x1024.pkl
<     with dnnlib.util.open_url(url, cache_dir=config.cache_dir) as f:
<         _G, _D, Gs = pickle.load(f)
---
>     _G, _D, Gs = pickle.load(open("results/02046-sgan-faces-2gpu/network-snapshot-011809.pkl", "rb"))
34,35c35,37
<     rnd = np.random.RandomState(5)
<     latents = rnd.randn(1, Gs.input_shape[1])
---
>     for i in range(60000,70000):
>         rnd = np.random.RandomState(None)
...

I ran the pretrained_example.py script on the then-latest face model with psi=0.7 for 60k 512px faces, upscaled each image to 1024px with waifu2x, and used ImageMagick to convert the PNGs to JPGs at quality = 25% to save ~90% space/bandwidth (average image size, 63kb).
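
As a rough sanity check on those numbers, the JPG total lands close to the ~6GB quoted for TWDNEv1. This assumes that “63kb” means kilobytes, that the average holds across all 100,000 images, and that the ~90% saving is exact, none of which is guaranteed:

```python
N_IMAGES = 100_000   # total faces on the site, per the text
AVG_JPG_KB = 63      # average JPG size quoted above (assumed to be kilobytes)
JPG_SAVINGS = 0.90   # "~90%" saving, treated as exact for this estimate

# Total size of the JPGs, in decimal gigabytes (1 GB = 1,000,000 KB).
total_gb = N_IMAGES * AVG_JPG_KB / 1_000_000

# If a JPG is ~10% of the PNG it replaced, the implied PNG figures follow.
implied_png_kb = AVG_JPG_KB / (1 - JPG_SAVINGS)
implied_png_total_gb = N_IMAGES * implied_png_kb / 1_000_000

print(f"JPG total: ~{total_gb:.1f} GB")
print(f"Implied PNG average: ~{implied_png_kb:.0f} KB; total: ~{implied_png_total_gb:.0f} GB")
```

The implied PNG figures are an inference from the quoted saving, not something measured.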

“I’m ter­ri­fied but also intrigued by this cryp­tid ghost waifu.”

tick­-tack­-s­nick­-s­nack

For the next 10,000, because peo­ple online were par­tic­u­larly enjoy­ing look­ing at the weird­est & most bizarre faces (and because the weird sam­ples help rebut the com­mon mis­con­cep­tion that StyleGAN merely mem­o­rizes), I increased the trun­ca­tion hyper­pa­ra­me­ter (line 41) to psi = 1.0. For the final 30k, I took the then-lat­est model and set psi=0.6.

An overview of faces:

100 ran­dom sam­ple images from the StyleGAN anime faces on TWDNE, arranged in a 10×10 grid.

TWDNEv2

“I can’t stop hit­ting reload”

Cory Doc­torow

The first set of 100k faces was generated using all of Danbooru2017’s faces both SFW & NSFW, the default face cropping settings, and 2 additional Holo/Asuka datasets I had. This led to some problems, so I made a new dataset of faces with expanded margins, which I call ‘portraits’, to retrain the anime face StyleGAN on. When training was done, the results were considerably better, so I generated another 100k and replaced the old ones.

TWDNEv3

“Recent­ly, he has begun to craft objects of great pow­er. And many admire him more for this. But within his cre­ations, one finds hints of a twisted humour. Look closely at his mir­a­cles and vic­tims begin to emerge. Take an older project of his, ThisWai­fu­Does­No­tEx­ist.net. A won­der to behold? An amuse­ment but noth­ing more?…How many more young men sit, hunched and enslaved to this mag­ic? What strange pur­pose does this serve?”

Ghen­le­zo, “Beware of the Gwern”

In December 2019, Nvidia released StyleGAN 2 (S2). S2 diagnosed the frustrating ‘blob’ artifacts that dogged StyleGAN 1 (S1) samples, both face & portrait, as stemming from a fundamental flaw in the S1 architecture: the blobs were the best way the neural network could figure out to work around the flaw. It removed the flaw, and made a few other more minor changes (for better latent spaces etc). Aaron Gokaslan trained a S2 model on the portrait dataset, and I used it to generate a fresh batch of 100k faces, with different 𝜓 values (0.6, 0.8, & 1.1) as before:

python3 run_generator.py generate-images --seeds=0-50000 --truncation-psi=0.8 \
    --network=2020-01-11-skylion-stylegan2-animeportraits-networksnapshot-024664.pkl
python3 run_generator.py generate-images --seeds=50001-75000 --truncation-psi=0.6 \
    --network=2020-01-11-skylion-stylegan2-animeportraits-networksnapshot-024664.pkl
python3 run_generator.py generate-images --seeds=75001-100000 --truncation-psi=1.1 \
    --network=2020-01-11-skylion-stylegan2-animeportraits-networksnapshot-024664.pkl
100 ran­dom sam­ple images from the StyleGAN 2 anime por­trait faces in TWDNEv3, arranged in a 10×10 grid.
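
For reference, the three invocations above partition the seeds into 𝜓 batches, which can be written as a small lookup. This is only a sketch: `V3_BATCHES` and `psi_for_seed` are hypothetical names, and it assumes both that the `--seeds` ranges are inclusive and that a face's ID on the site corresponds to the seed it was generated from:

```python
# (seed range, truncation psi) for each run_generator.py invocation listed above.
# Python ranges exclude the upper bound, so 50001 here means "up to seed 50000".
V3_BATCHES = [
    (range(0, 50001), 0.8),
    (range(50001, 75001), 0.6),
    (range(75001, 100001), 1.1),
]


def psi_for_seed(seed: int) -> float:
    """Return the truncation psi of the batch that a given seed belongs to."""
    for seeds, psi in V3_BATCHES:
        if seed in seeds:
            return psi
    raise ValueError(f"seed {seed} is outside the generated ranges")


for example_seed in (0, 50001, 99999):
    print(example_seed, psi_for_seed(example_seed))
```

Lower 𝜓 trades diversity for quality, so under these assumptions the last quarter of the IDs (𝜓=1.1) is where the strangest samples would be expected.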

I did­n’t delete the v1–2 images, but did move them to a sub­di­rec­tory (www.thiswaifudoesnotexist.net/v2/$ID), where they are also rsync-able.

Text

“the f— did i just read”

Anony­mous

I thought it would be funny to include NN-gen­er­ated text about ani­me, the way “This Rental Does Not Exist” included char-RNN gen­er­ated text about hotel rooms. But the accom­pa­ny­ing GPT-2-117M snip­pet gen­er­a­tion turned out to be a lit­tle more tricky than faces.

GPT-2-117M: prompted plot summaries

“It’s a GAN AI, and it just sits there, end­lessly gen­er­at­ing fake anime char­ac­ters along with gib­ber­ish anime sto­ry-plots.”

Bruce Ster­ling

The full GPT-2-1.5b model which gen­er­ates the best text has not been released by Ope­nAI, but they did release a much smaller model which gen­er­ates decent but not great text sam­ples, and that would just have to do.

I used the gpt-2-PyTorch CLI pack­age to run the down­loaded model with­out mess­ing with the orig­i­nal OA code myself. The repo explains how to down­load the GPT-2-117M model and pip install the Python depen­den­cies. Hyper­pa­ra­me­ter-wise, I roughly approx­i­mated the Ope­nAI set­tings by using top_k=30 (OA used top_k=40 but I did­n’t see much of a qual­ity differ­ence and it slowed down an already slow mod­el); the tem­per­a­ture hyper­pa­ra­me­ter I did not change but gpt-2-PyTorch seems to use the equiv­a­lent of OA’s 0.7 tem­per­a­ture.

I found that feed­ing in a long prompt with many ani­me-re­lated phrases & words seemed to improve out­put qual­i­ty, and I noticed it would eas­ily gen­er­ate MAL/Wikipedia-like “plot sum­maries” if I prompted it the right way, so I went with that. (I threw in a par­ody light novel title to see if GPT-2-117M would catch on, and as for “Rac­coon Girl”, that was a lit­tle joke about which I’d seen so many anime memes about in January/February 2019.)

I could­n’t fig­ure out gpt-2-PyTorch’s mini­batch func­tion­al­i­ty, so I set­tled for run­ning it the sim­plest way pos­si­ble, 1 text sam­ple per invo­ca­tion, and par­al­leliz­ing it with parallel as usu­al.

The Bash script I used to gen­er­ate all sam­ples:

function gpt2 {
    BIT="$(($@ % 2))"; # use ID/seed modulo 2 to split GPT-2-117M instances evenly across my 2 GPUs
    CUDA_VISIBLE_DEVICES="$BIT" python main.py --seed "$@" --top_k 30 --text "Anime ai nani arigatou gomen \
        sayonara chigau dame Madoka jigoku kami kanojo mahou magical girl youkai 4koma yonkoma Japan Oreimo baka \
        chibi gakuran schoolgirl school uniform club Funimation Gainax Khara Ghibli Hayao Miyazaki Saber Fate \
        Stay Night Pop Team Epic Japanese Aria Escaflowne Kanon Clannad comedy itai manga Shonen Jump pocky \
        tsundere urusai weeaboo yaoi yuri zettai harem senpai otaku waifu weeb fanfiction doujinshi trope \
        Anime News Network Anime Central Touhou kanji kaiju Neon Genesis Evangelion Spice and Wolf Holo Asuka \
        kawaii bishonen bishojo visual novel light novel video game story plot Fruits Basket Toradora Taiga \
        Aisaka tiger Detective Conan Pokemon Osamu Tezuka cat ears neko romantic comedy little sister character \
        plot drama article nekomimi bunny isekai tanuki catgirl moe manga manga manga anime anime anime review \
        plot summary. An exciting cute kawaii new anime series based on a light novel shoujo manga called  \
        \"I Can't Believe My Alien Visitor Is My Little Sister\" or \"MAVISM\", a sequel to \"Raccoon Girl\", \
        is the latest hit. In the first episode of this anime, " \
    >> /media/gwern/Data/thiswaifudoesnotexist/snippet-"$@".txt;
    cat /media/gwern/Data/thiswaifudoesnotexist/snippet-"$@".txt; }
export -f gpt2
seq 0 70000 | parallel --jobs 12 --progress gpt2
# Computers / CPU cores / Max jobs to run
# 12:local / 32 / 12
#
# Computer:jobs running/jobs completed/%of started jobs/Average seconds to complete
# local:1/0/100%/0.0s           harem anime heroine                 "Raccoon Girl II" was shown at Ani \
# me Funimation's Anime Festival (and it's a very popular anime) and was nominated as a top 100 anime  \
# on the Tokyo Otaku Mode list (I'll tell about that soon, I promise) . The  suitable ending for the f \
# irst episode (after a lot of scenes with a lot of bad guys, which is the end of episode three of the \
#  series) was shown as an early preview of the game which                  can also be downloaded fro \
# m the website.
# In my second review, I went by many titles, which you can find all here to download t \
# he episodes as well, but for the sake of brevity, here is an overview of the games I saw (I won't go \
#  into all the other titles because I think there are too many titles you can download that might be  \
# good, I have some I wouldn't name because I don't want to spoil them)           土安黄满收
# The only one t \
# hat I did not watch the whole series for my liking was "AoT, the End". The first episode             \
#   is about   (Katsuko)          x2 and             x2 was my first love story,         x2 is one whe \
# re it all gets a second time. This anime              x2 is not only  a really great  game and it    \
#             x2,          x3.  But for a few moments of enjoyment            x3 was the ending and I  \
# feel that I am not here to argue that "it wasn't even good  but it was awesome  and it  changed my l \
# ife  it was fun  it made my eyes roll  I felt that it  was  great, it  fixed any  that  I"
# 100%|████████████████████████████████████████████████████████████████| 512/512 [00:14<00:00, 35.54it/s]
# ...
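
The GPU assignment in the script is just the seed modulo the number of GPUs. A minimal Python sketch of the same split (`gpu_for_seed` is an illustrative name, not part of gpt-2-PyTorch) shows that the seeds from `seq 0 70000` divide almost exactly in half:

```python
from collections import Counter

N_GPUS = 2  # the two 1080ti cards mentioned in the text


def gpu_for_seed(seed: int, n_gpus: int = N_GPUS) -> int:
    """Mirror BIT=$((seed % 2)): even seeds go to GPU 0, odd seeds to GPU 1."""
    return seed % n_gpus


# `seq 0 70000` is inclusive at both ends, so it yields 70,001 seeds.
seeds = range(0, 70001)
per_gpu = Counter(gpu_for_seed(seed) for seed in seeds)
print(dict(per_gpu))
```

Since `parallel` hands out seeds in input order, the jobs running at any moment are roughly consecutive seeds, so they should alternate between the two GPUs rather than pile onto one.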

While run­ning the first pass, some GPT-2-117M instances will fail for rea­sons like run­ning out of GPU VRAM (each instance takes almost 1GB of VRAM and my 1080tis only have ~10GB usable VRAM each). These can be fixed by look­ing for empty text files, extract­ing their ID/seed, and try­ing again:

MISSING_TXT=$(find /media/gwern/Data/thiswaifudoesnotexist/ -type f -name "*.txt" -size 0 \
              | cut -d '-' -f 2 | cut -d '.' -f 1 | sort --numeric-sort)
echo "$MISSING_TXT" | parallel --jobs 5 --progress gpt2
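
A rough Python equivalent of the `find | cut | sort` pipeline, for readers who would rather not parse paths with `cut`. It is a sketch, not the script used for TWDNE: `missing_ids` is a hypothetical helper, and unlike `find` it only looks at files named `snippet-<ID>.txt` directly inside the given directory:

```python
import re
import tempfile
from pathlib import Path

SNIPPET_RE = re.compile(r"^snippet-(\d+)\.txt$")


def missing_ids(directory):
    """Return the sorted IDs of zero-byte snippet files in `directory`."""
    ids = []
    for path in Path(directory).glob("snippet-*.txt"):
        match = SNIPPET_RE.match(path.name)
        if match and path.stat().st_size == 0:
            ids.append(int(match.group(1)))
    return sorted(ids)  # numeric order, like `sort --numeric-sort`


# Demo on a throwaway directory: two empty (failed) snippets and one good one.
with tempfile.TemporaryDirectory() as demo_dir:
    for name, body in [("snippet-3.txt", ""), ("snippet-10.txt", ""), ("snippet-7.txt", "ok")]:
        (Path(demo_dir) / name).write_text(body)
    print(missing_ids(demo_dir))  # the IDs that would be retried
```

Converting the IDs to integers before sorting matters: sorted as strings, 10 would come before 3.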

The final generated GPT-2-117M text samples have a major flaw: the model will frequently end its generation of anime text and switch to another topic, as denoted by the token <|endoftext|>. These other topics would be things like sports news or news articles about Donald Trump, and would ruin the mood if included on TWDNE. The problem is that gpt-2-PyTorch does not monitor the model’s output; it just runs the model for an arbitrary number of steps regardless of whether <|endoftext|> has been reached or not. To remove these unwanted topic changes & leave only anime text, I run each text file through sed, which exits/stops processing when the token is reached, thereby deleting the token & everything afterwards:

find /media/gwern/Data/thiswaifudoesnotexist/ -name "snippet-*.txt" -type f \
    -exec sed -i '/<|endoftext|>/{s/<|endoftext|>.*//;q}' {} \;
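
The same truncation is easy to express in Python, which may be clearer than the sed one-liner. A minimal sketch follows (`truncate_at_eot` is an illustrative name); the one difference is that sed still emits a trailing newline on the cut line, while this returns the text exactly up to the token:

```python
EOT = "<|endoftext|>"  # GPT-2's end-of-text marker


def truncate_at_eot(text: str, token: str = EOT) -> str:
    """Keep everything before the first end-of-text token; drop the token and the rest."""
    head, _sep, _tail = text.partition(token)
    return head


sample = "Please make anime about my cute cat.<|endoftext|>In sports news today"
print(truncate_at_eot(sample))
```

Text without the token passes through unchanged, so it is safe to apply to every snippet.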

This leaves 70k clean ani­me-themed gib­ber­ish text sam­ples which can be loaded by the same ran­dom­iza­tion process as the faces—hours of fun for the whole fam­i­ly.

GPT-2-anime plot synopses for GPT-2-117M

“okay my brain lit­er­ally can­not com­pre­hend the fact that these aren’t just real draw­ings and were made by a com­put­er. what”

Ash

I needed to feed GPT-2-117M such a large prompt because it is a gen­eral lan­guage mod­el, which has learned about anime as merely one of count­less top­ics in its giant cor­pus. The pri­mary goal of train­ing such lan­guage mod­els is to then use them on nar­row task-spe­cific cor­puses to ‘fine­tune’ or ‘trans­fer learn’ or pro­vide ‘infor­ma­tive pri­ors’ for a new mod­el, with the global knowl­edge serv­ing to model all the generic lan­guage (if you have a small cor­pus of med­ical clin­i­cal reports, you don’t want to waste pre­cious data learn­ing some­thing like cap­i­tal­iza­tion) and get­ting the model up to speed for the new cor­pus, and tap­ping into hid­den knowl­edge of the orig­i­nal.

By the same reasoning, training GPT-2-117M on an anime text corpus might lead to better, or at least funnier, generated text. nshepperd wrote some GPT-2 training code for finetuning and retrained GPT-2-117M for ~3 epochs on Canggih P Wibowo’s 2016 Anime News Network scrape, using the title & plot synopsis fields (TITLE|PLOT SYNOPSIS). I have also used the finetuning code for Project Gutenberg poetry, using the poetry as a running example.

GPT-2-117M appeared to start overfitting quickly and training was stopped, yielding a final trained checkpoint we call “GPT-2-anime” (441MB). To use it, one can simply unpack it into the GPT-2-117M model directory, overwriting the files, and then run the OA code as usual.

Sam­ple uncon­di­tional out­put:

======================================== SAMPLE 1
========================================
 a boy who works in a magic shop. Her magic power is not despite the power
these animals and the atmosphere grows stronger every night. She can't use
magic at all so she can get her witch powers back.
Shirobako: My Island (TV)|"Abe Eigaku is a rookie high school student who is
about to have a battle dyed by the colors of samurai and a sordid fangirl who
only wished to destroy the hell that befall him. One day, he runs into a
samurai and strikes a deal. The deal: If he can get to, then distract the
soaked Edo sword will be able to change his form. The deal was an attack, and
the young boy is left with a injuries and he is left with bitter injuries in
the hands of the Kumano family. From the experience that inherits the "gods'
ability" in him, he comes into an outdoor fighting ring with the Gices. There
he meets Setsu, a former show comor who has changed his life---like his father,
himself as a ""demon"" of the Gices, and a member of the Gices. At the family's
house, he makes his ultimate weapon, the three-headed demon Gekkou, Kabuto.
Yukino comes to his aid, and their rivalry soon evolves as they head into more
serious and skillful monsters. With his ability, Yukino faces others both
fierce and charming. The series follows their everyday lives, together with
their personal ones and each other, as they go through their daily lives and
try to grow up as close to each other as possible to the one they love.
Berserk: The Golden Age Arc (TV)|Guts is a young man who has been accepted into
a powerful industrial city by a mysterious woman named Null Mother, and is then
summoned to the city of Midland. The first he ever summoned to his world was
the Ghost Ship, a brutal battle that has been held on for thousands of years.
Now he is on the eve of the Bloody War, when the inhabitants of the little
world try to destroy the automated sentry. However, Guts cannot stop them and
blithelyave their fate as the inhabitants of Midland are devourant of the very
world they reside in.
Lupin III: The Legend of the Gold of Babylon (movie)|Deep beneath New York city
are buried tablets that tell the tale of Babylon's gold that was lost during
Babylon's destruction. Lupin is interested in finding this gold, but will have
to deal with two mafia families and Zenigata during his quest in unsolving the
mystery of the tablets. During Lupin's journey he encounters an old woman who
has a connection with this treasure.
Fortune Arterial: Akai Yakusoku (TV)|Kohei Hasekura has lived a live of
transferring schools for boys all his life. He's been the site of a recently
bankrupt well known as Moon River for years and has been looking forward to
getting it back. At his new school, however, he sees the school's most wanted
fight and gets a duel to choose the strongest in the class. The battle is about
to be fought by a monster called Hildegarn either. Hasekura's friend Orca
Buchizouji gets dragged into the fight and loses a majority of key to the duel.
Hasekura fights alone against the bullies and punks from the previous season,
and when the duel is over everyone comes to an end.
Gunslinger Girl: Il Teatrino (TV)|When the Social Welfare Agency investigates
the disappearance of a operative, their inquiry leads them right into the lair
of their rival, the Five Republics. The assassin Triela infiltrates the hostile
organization, but her search is cut short when she finds herself staring down
at the barrel of a gun...
Yuruyuri Nachu Yachumi! (TV)|
Magical Sisters Yoyo & Nene (TV)|Yoyo and Nene are witches living in the
Magical Kingdom who specialize in curse and decusing. They are negotiating with
a woman who wants to find her sister, a witch that disappeared twelve years
ago, when a monstrous tree appears in front of Yoyo and Nene´s house. Embedded
within the tree are unfamiliar buildings which prompt Yoyo to explore these
strange constructions. During her scout, Yoyo unintentionally ends up thrown
into modern day Japan. He is taken away by a doll-cat who falls from a magical
tree on his way back. He is taken back to Japan by Gogyou, a girl who is their
own, who falls from a magical tree. He and his friend Nanami lands on a magical
journey to recover the lost magical powers of the magical stone.
Sakura Taisen: Ecole de Paris (TV)|"Anastasia was supposed to be invaded by
various people. She is so damaged that she can't see anything. But her friendces

There are two imme­di­ately vis­i­ble prob­lems with the GPT-2-anime out­put.

  1. Too Short: prompted/conditional out­put from GPT-2-anime looks… much the same.

    Appar­ently what hap­pened was that dur­ing the fine­tun­ing, GPT-2-117M learned that the for­mat was incor­ri­gi­ble and inter­nal­ized that to such an extent that it would only gen­er­ate title+short­-syn­op­sis-fol­lowed-by-new-pairs. If a prompt is pro­vid­ed, it might influ­ence the first gen­er­ated pair, but then GPT-2-anime knows that the prompt is guar­an­teed to be near-ir­rel­e­vant to the next one (aside from per­haps fran­chise entries alpha­bet­ized togeth­er, like sequels) and can be ignored, and each title+­plot syn­op­sis is only a few sen­tences long, so GPT-2-anime will tran­si­tion quickly to the next one, which will be effec­tively uncon­di­tion­al.

    This is annoy­ing but per­haps not that big of a prob­lem. One can surely fix it by using a dataset in which the title/synopsis is then fol­lowed by a much longer plot descrip­tion (per­haps by merg­ing this ANN scrape with Wikipedia arti­cles). And for gen­er­at­ing ran­dom text snip­pets for TWDNE, we don’t need the con­trol since the uncon­di­tional sam­ples are all on-topic about anime now (which was part of the point of the fine­tun­ing).

  2. Low Qual­ity: some out­puts are thor­oughly unsat­is­fac­tory on their own mer­its.

    The first entry lacks a series title and any intro­duc­tion of the premise. One entry has a title—but no plot. And while most are shorter than I would like, the last one is sim­ply far too short even for a plot syn­op­sis.

    This is a prob­lem. The out­puts are mod­er­ately inter­est­ing (eg “Shi­robako: My Island”) and pro­vide a good plot syn­op­sis which is a nice start­ing point, but as-is, are unac­cept­ably low-qual­ity for TWDNE pur­pos­es, since they are so much shorter & less inter­est­ing on aver­age than the long prompted GPT-2-117M exam­ples were.

How­ev­er, the 2 prob­lems sug­gest their own solu­tion: if the GPT-2-anime plot syn­opses are good premises but GPT-2-anime refuses to con­tinue them with plot/dialogue, while GPT-2-117M gen­er­ates good long plot/dialogue con­tin­u­a­tions but only if it’s given a good prompt to force it into anime mode with use­ful key­words, why not com­bine them? Gen­er­ate a wacky plot syn­op­sis with GPT-2-anime, and then feed it into GPT-2-117M—to get the best of both worlds!

As before, sed will take care of the <|endoftext|> markers (& if one doesn’t like the synopses’ “Source:” end-markers, they can be removed with sed -e 's/ (Source: .*//'), the titles can be italicized, and we can post-process it further for quality: if either the synopsis or the plot summary is too short, drop that text snippet entirely. This should yield long, consistent, high-quality anime-only text samples of the form title/synopsis/coherent plot-summary or dialogue or article about said title+synopsis.

After gen­er­at­ing 12MB of GPT-2-anime plot syn­opses yield­ing ~35k syn­opses (as­sum­ing some per­cent­age will be thrown out for being too short), I fed them into this script for GPT-2-117M sim­i­lar to before, to gen­er­ate tex­t-s­nip­pets #70,000–100,000:

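# Directory holding the site's snippet files; numbering continues after the prompted samples.
TARGET="/media/gwern/Data/thiswaifudoesnotexist"
I=70001
# Keep only 'TITLE|SYNOPSIS' lines and rewrite them as '_TITLE_: SYNOPSIS'.
grep -F '|' /home/gwern/src/gpt-2/samples | sed -e 's/^\(.*\)|/_\1_: /' | \
while IFS= read -r PROMPT; do
    # Skip synopses too short to make a useful prompt.
    if [ ${#PROMPT} -gt 150 ]; then
        # Continue the synopsis with the original GPT-2-117M, cutting at the first end-of-text token.
        PLOT="$(CUDA_VISIBLE_DEVICES=1 python main.py --seed 5 --top_k 40 \
                --text "Japanese anime manga light novel. $PROMPT. Plot \
                        Summary. In the first episode of this anime " | \
                sed -e '/<|endoftext|>/{s/<|endoftext|>.*//;q}')"

        # Drop the pair entirely if the continuation is too short.
        if [ ${#PLOT} -gt 250 ]; then
            echo "$PROMPT In the first episode of this anime $PLOT" \
              >> "$TARGET/snippet-$I.txt"

            echo "$TARGET/snippet-$I.txt"
            cat "$TARGET/snippet-$I.txt"

            # Only advance the ID when a snippet was actually written.
            I=$((I+1))
        fi
    fi
done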
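
The filtering and numbering logic of that loop, separated from the GPT-2 call, can be sketched in Python. This is an illustration under stated assumptions rather than the pipeline that was run: `format_synopsis` and `combine` are hypothetical names, and `generate` stands in for whatever invokes GPT-2-117M and truncates its output:

```python
from typing import Callable, Iterable, Iterator, Tuple

MIN_PROMPT_CHARS = 150   # synopsis must be longer than this to be used
MIN_PLOT_CHARS = 250     # continuation must be longer than this to be kept
LEAD_IN = "In the first episode of this anime"


def format_synopsis(line: str) -> str:
    """Turn 'TITLE|SYNOPSIS' into '_TITLE_: SYNOPSIS', splitting at the last '|'."""
    title, _sep, synopsis = line.rpartition("|")
    return f"_{title}_: {synopsis}"


def combine(prompts: Iterable[str],
            generate: Callable[[str], str],
            start_id: int = 70001) -> Iterator[Tuple[int, str]]:
    """Yield (snippet ID, text) for each prompt whose continuation passes both length checks."""
    next_id = start_id
    for prompt in prompts:
        if len(prompt) <= MIN_PROMPT_CHARS:
            continue  # synopsis too short to prompt with
        plot = generate(prompt)  # stand-in for the GPT-2-117M call
        if len(plot) <= MIN_PLOT_CHARS:
            continue  # continuation too short; drop the pair
        yield next_id, f"{prompt} {LEAD_IN} {plot}"
        next_id += 1  # IDs advance only for kept snippets


print(format_synopsis("Gunslinger Girl: Il Teatrino (TV)|When the Social Welfare Agency investigates"))
```

Advancing the ID only on success keeps the snippet numbering dense, which matters because the site picks IDs uniformly at random and a gap would show up as a missing file.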

Sam­ple com­bined out­put:

/media/gwern/Data/thiswaifudoesnotexist/snippet-71240.txt
_Mayonaka wa Junketsu_: His older brother has been hospitalized for about 5
weeks for about 6 weeks. Because of his unaccidental death, Ryouta's parents
send him to live with his uncle, Yuuichirou. To his surprise, Yuuichirou is
actually the son of his father's lover! Will Ryouta get to know Yuuichirou?
(from B-U) In the first episode of this anime 【The Second Episode】, the
story takes place in Koegi-ji town. After Yuuichirou and his family leave to
live with his uncle, they encounter "Ryouta and the three of them". They learn
about their real relationship, but they're not going to tell this kind of story
for fear of being banned from the show. The characters' father, Kazui, is the
first name his family does not use. The story begins with the trio and Kazui
being together, while their parents are fighting for the survival of their
families, Yuuichirou and Ryouta in the middle of a battle. 【In the Next
Chapter】 I also do a series of interviews with people from the anime.
---------------------------The story is about three girls in the house.
-------------------From that moment on, the plot follows them as their story
goes on. They are all in shock about what has happened, but they believe that
Yuuichirou is the real Ryouta. From that moment on, they feel they have a
chance to change their path for the better after the events at the hospital
which led to Yuuichirou's death to come home and they try to gain friends by
helping Yuuichirou. ---------------There are many other characters besides the
three in the story, so what is the real one? The main character is called
Yuuichirou Kazui ( 招仮猿 名子 ) 【First Character】 【Second
Character】 【Third Character】 ---------------This anime tells a story very
similar to the show, just with characters that make you smile. If you are in a
certain situation and see the person that is your favorite, there is no
problem. However, you need to be mindful of the following things to make sure
you know who you are dealing with. The story is very short, but there are many
things that need to be done in order to make you smile. There are many times
that I would write a scene that just needs to end, the most important part, of
the scene, or I would have to make the scene not end at all. As long as it
doesn't end with a big scene that was in a certain moment, you wouldn't be able
to use your eyes to make out that someone is laughing, or a child being played
or something. Just be mindful of how you are dealing with those moments and
keep yourself a little calm even if they are actually happening

/media/gwern/Data/thiswaifudoesnotexist/snippet-71241.txt
_Shijin no Koibito_: Collection of short stories: (1) Shibatte (2) Love Me Tell
Me (3) I Love You (4) I've Fine (5) Meido (6) Take Of The Class (7) Liar (8) Take Of
The Precious Parade In the first episode of this anime iku wa Hana
kawakukushigai wa (Love me tell me a secret), a student named Ritoye has to
learn to read all of the kanji correctly. While they are doing this, they'll
notice something odd, they can't talk and the class will change. They'll also
notice some other things that happened, and Ritoye will get annoyed with the
school and will leave her class. Once they're through with her class, she'll
explain the problems and the solutions. She'll begin by looking back through
their textbooks, searching for the kanji that says 'I'm on the line' and 'I'm
here to talk a little bit about the story'. At some point in the story Ritoye
finds out that the class used to be 'on the line' with 'I'm just being a
teacher', so if someone is on the line they can start a lesson by asking the
name of that person you're teaching and giving a name of the class 'I-Mitsu'
which you have to pick from a few names on the internet. While the topic gets a
little heated, there'll be a brief period of humor where the class becomes even
funnier. One character shows off some random drawings while Ritoye is sitting
by her desk, so maybe it's not a complete mess of drawings, but I was sure it
would have been better if it was. The manga was originally published in
February 2014. I'll update this article if anything changes, then give up.
------------- RITOYE: A school principal

~ ~ ~

My parents will be around a week or two after school. During that time, the
school's president will take care of the school and it will be my job to pick
up students and take care of the textbooks.

We don't even need to call her. Her name is Ritoye.

She has all the classes they need, but she still says she doesn't like 'those
guys...

I love you everyone. I really do.'

I'm here, to get this in front of class, as I want to make sure she knows I'm
serious.

That's what the head teacher says.

I'll just take everything from you to make sure she knows that I'm serious.

"You, you're the best teacher in class, aren't you?"

It doesn't matter how difficult, we'll give you everything. We'll

/media/gwern/Data/thiswaifudoesnotexist/snippet-71242.txt
_Hitodenashi no Hirusagari_: A mysterious disease. A disease. The disease has
drastically differ from the usual suspects of the type of disease, but it is
basically living in a hospital as a guinea pig for a few weeks. A short story
of a person's persistent determination to save a person's life, and the story
of a person's struggles to find it in this "comedy" story. (Source: MU) In the
first episode of this anime 『Hitodenashi no Hirusagari』 (1962) the
protagonist, Hiroki, begins his life. In the next episode he has to learn that
he cannot leave Kyoto. From this he starts to find out more about himself and
the world, as well as a great deal about the life he has created for himself.
He learns a great deal about himself, as well as about the world. It goes
through the story like the story of a kid having a terrible experience...or the
story of young people taking it upon themselves. What Hiroki learns, however,
is quite difficult to grasp because of the many factors that are involved.


This is the first light novel I read that started off as a light novel, in the
spirit of the first book that came out in English.

When I came through the last arc of this short anime, I was very disappointed.
I was expecting much much more of a series of stories and anime but instead, I
found that it felt more about the plot. As the first episode started, in a
similar manner to the last arc, each arc is different from the previous arc.

From what I can gather, many plot points were missed when read through and how
often it was made a point of focus to tell a story instead of the original.

At the end of a story, you get to make your own sense of it based on how you
read it.

One time there was two main things left out as the main plot that needed me to
know a lot more.

First, I couldn't really get that the stories you saw in the first arcs were
actually the people that you could actually talk to.

This left me feeling more like I had lost some of my sense of immersion in a
story and just didn't really find it there. For example, I started to feel that
the main plot was very different from what you would see in the first arcs. The
two main themes we saw in the first arcs were how to live and how to do a life.

I found the story of how the life of a person is different from one that you
might see in the books.

The story of a person's living and living alone is also different from the
story of the world that you might see in a short play.

And to answer the question that many people have asked me over the last six
months, and these comments, but in order to

/media/gwern/Data/thiswaifudoesnotexist/snippet-71243.txt
_The King Of Debt_: Souta is a rich guy, who is unlucky in love, and punishes
by the rich son of a rich Chinese mafia as he is tossed about debt. But is he
really bad at such a guy? (from B-U) In the first episode of this anime
何月处, Souta is forced to marry a rich kid to buy him a piece of his ass. A
guy who doesn't know how to make friends with strangers, only has a friend for
the rich guy's sake. Souta does this by being the only person in the house that
doesn't have a job but a car, and even a friend of hers can't really call her
sister 'fae'. He also has an idol he can sing on stage, and he is forced to
fight for her (a guy named Katsu) against Katsu himself in the episode entitled
Souta Is Fae. In a final episode, he is forced to run the family business for
$100,000. What's the point in doing it? The kids only get to run the business,
but when they pay out the money they want in their inheritance, Souta starts to
run the shop as well. It's very simple for Souta and the guy he's running his
business for. After the episode has been rated H, the new episode has been
rated H...

favorite favorite favorite favorite ( 5 reviews )

Topics: Drama, TV, Romance, Romance


Community Audio 91 91 Seiken Hamamoto's Hacronyms 2 1 of 1 View 164 of 190

Topics: Drama, TV, Fantasy, Fantasy


Community Audio 88 88 The Man From The Outer Heaven 4 3 of 3 View 164 of 189

Topic: Drama, OVA


Community Audio 87 87 Seiken Hamamoto's Hacronyms 9 8 of 8 View 164 of 190

Topic: Drama, OVA


Community Audio 85 85 The Man From The Outer Heaven 3 2 of 2 View 164 of 185

Topic: Drama, OVA


Community Audio 84 84 Seiken Hamamoto's Hacronyms 2 2 of 2 View 164 of 189

Topic: Drama, OVA


Community Audio 83 83 Seiken Hamamoto's Hacronyms 3 3 of 3 View 164 of 190

Topic: OVA, DVD, Blu-ray, DVD-R

Topics: Drama, TV


Community Audio 82 82 Seiken Hamamoto's Hacronyms 3 3 of 3 View 164 of 190

Topic: Drama, OVA


Community Audio 81 81 Seiken Hamamoto's Hacronyms 2 2 of 2 View 164 of 188

Topic: OVA, DVD,

/media/gwern/Data/thiswaifudoesnotexist/snippet-71244.txt
_Yuuhi Zukan_: A story about a student council president and his student
council president, and how they are going to boot on the plan of how to change
the financial system. Unfortunately, this plan results in both failure and
budget failure. (from B-U) In the first episode of this anime 一日本の橙
and other characters form a conspiracy to gain access to his store, after being
cut in half, they make their way to the store to recruit them using the powers
of a demon. The demon has three characteristics: a strong, sharp eye, a strong
hand; a black eye, with white and black pupils with a big black mark on, and a
silver eye, with a black eye mark. The Demon's ability to create spells is
"demonic mana" that can be activated as if it were "guru aura". After a brief
delay, however, the Demon enters a black space that has a black face, and then
the Demon is attacked by a small black hole in his chest. The Demon attacks the
small black hole and when the black hole dies, the Demon's power to create
spells is restored. The demon returns to his house where is an old man called
Tsubayasu who says that he is going to create a dragon with great power. He has
been searching for a good magic and after his work with Tsubayasu and his
students for about half a decade now, he sees the dragon coming on a distant
night. Once this happens, his plans fall apart. When he discovers it was some
magic wizard he has been using, he leaves Tsubayasu thinking that he can not
use the dragon because of the power-up and his plans have come to an end. He
wakes up his parents and tells them that Tsubayasu had found money and a dragon
he created. (from B-U). 一日京初日 歋未夢の橙面者 (from B-U) A
young girl named Fushikata is assigned to work for the student council
President in the store. Fushikata is an ordinary girl who works as a maid, yet
she is also an extremely dangerous person. She lives with her mother and sister
as the main characters of the series. One day Fushikata's mother goes missing.
When they return to the store, a group of students, with a group of girls,
enter. They are given the key to the store, but after they get home Fushikata
is killed by the two of them. The group discovers that the woman they have lost
is Tsubayasu's old teacher, Hidetoshi. At this time Fushikata starts to think
about how she should treat him at a later date but that she is too busy

The cor­re­spond­ing 30k faces are gen­er­ated & upscaled as before, index.html updated to ran­dom­ized 0–100,000, and so on.

GPT-3

OpenAI released GPT-3 in June 2020 as a SaaS, with the model available only through a closed beta API. I was given access and ran extensive experiments using the web interface. After satisfying myself with that, I thought to update TWDNE with GPT-3 anime plot summaries.

Because GPT-3 is so much more powerful than any of the GPT-2 models, the two-phase process and extensive keywords can be omitted entirely in favor of just a single short prompt; GPT-3 will get the idea immediately. After some experimentation (prompt programming remains a black art), I found that I could get a novel anime title & plot summary by framing it as a review of an anime from the future (eg 2022). A simple prompt that worked nicely was

2022 Anime Fall Sea­son Pre­views and Reviews

Pre­views of the lat­est and hottest upcom­ing Japan­ese anime this fall sea­son!

Below, a review of the themes and plot of the first episode of the wide­ly-an­tic­i­pated 2022 fall orig­i­nal new 1-sea­son anime

GPT-3 API

The API itself is straightforward to use, and can be interacted with via their Python package or just curl. You have to provide an API key, but otherwise, one just needs the temperature and top-p (for nucleus sampling) and a prompt, and one gets back the text completion. Since the prompt is so short, we don’t need to worry about issues like our text tokenizing into awkward BPEs (which was a major issue with literary uses) or hitting the context window (doubled to 2048 BPEs, but still all too painfully narrow). The returned string is JSON like this:

{
  "id": "cmpl-zFdIa6r5oO6AV4iogorDoAXh",
  "object": "text_completion",
  "created": 1593397747,
  "model": "davinci:2020-05-03",
  "choices": [
    {
      "text": "Yutori-chan which is produced by Sunrise:\nThe Kind-Hearted T-back Hurdler \ Yutori-chan\nAt \
      long last we welcome the challenging and heartwarming anime Yutori-chan produced by",
      "index": 0,
      "logprobs": null,
      "finish_reason": "length"
    }
  ]
}
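As a hedged illustration of how such a response might be consumed, the sketch below parses a shortened, made-up response of the same shape with Python's standard `json` module. The `id` and text values are placeholders rather than real API output, and `extract_completion` is a hypothetical helper name.

```python
import json

# Shortened, hypothetical response in the same shape as the example above.
response_body = """
{
  "id": "cmpl-example",
  "object": "text_completion",
  "choices": [
    {"text": "Yutori-chan which is produced by Sunrise:\\nAt long last",
     "index": 0, "logprobs": null, "finish_reason": "length"}
  ]
}
"""

def extract_completion(body):
    """Return (text, finish_reason) of the first choice in a completion response."""
    choice = json.loads(body)["choices"][0]
    return choice["text"], choice["finish_reason"]

text, reason = extract_completion(response_body)
print(text)
print("truncated at max_tokens" if reason == "length" else f"stopped: {reason}")
```

A `finish_reason` of `"length"` indicates the completion was cut off at the token limit rather than ending on its own, which is the usual case for these long reviews.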

GPT-3 Generation

So, to gen­er­ate new anime plot sum­maries, one can just loop through with curl, extract the text field with jq, do a lit­tle refor­mat­ting, and that’s it:

for i in {0..100000};
do
    echo -n "Review of the themes and plot of the first episode of the new anime " > snippet-$i.txt
    ## NOTE: replace XYZ with your own API token. Each JSON string is kept on one line,
    ## since a backslash-newline inside single quotes would be sent to the API literally.
    curl --silent 'https://api.openai.com/v1/engines/davinci/completions' \
        -H 'Content-Type: application/json' -H 'Authorization: Bearer XYZ' \
        -d '{"temperature": 0.95, "top_p": 0.98, "max_tokens": 700,
             "prompt": "2022 Anime Fall Season Previews and Reviews\nPreviews of the latest and hottest upcoming Japanese anime this fall season!\nBelow, a review of the themes and plot of the first episode of the widely-anticipated 2022 fall original new 1-season anime \""}' | \
      ## Select just the text completion:
      jq '.choices[0].text' | \
      ## unescape quotes:
      sed -e 's/\\\"/"/g' | \
      tee --append snippet-$i.txt
    echo -n "…" >> snippet-$i.txt
    sleep 3s
done
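The same request can be sketched with Python's standard library, where `json.dumps` takes care of the quote and newline escaping that is fiddly to get right inside a shell string. This is a sketch under stated assumptions, not the code used for TWDNE: `build_request` and `fetch_completion` are hypothetical names, the endpoint and sampling parameters are copied from the curl command, and the `Bearer` authorization scheme is an assumption about how the API key is passed.

```python
import json
import urllib.request

API_URL = "https://api.openai.com/v1/engines/davinci/completions"  # from the curl command

PROMPT = (
    "2022 Anime Fall Season Previews and Reviews\n"
    "Previews of the latest and hottest upcoming Japanese anime this fall season!\n"
    "Below, a review of the themes and plot of the first episode of the "
    'widely-anticipated 2022 fall original new 1-season anime "'
)

def build_request(api_key, temperature=0.95, top_p=0.98, max_tokens=700):
    """Build (but do not send) a completion request; json.dumps handles all escaping."""
    payload = json.dumps({
        "prompt": PROMPT,
        "temperature": temperature,
        "top_p": top_p,
        "max_tokens": max_tokens,
    }).encode("utf-8")
    headers = {
        "Content-Type": "application/json",
        "Authorization": f"Bearer {api_key}",  # assumption: Bearer-token scheme
    }
    return urllib.request.Request(API_URL, data=payload, headers=headers, method="POST")

def fetch_completion(api_key):
    """Send the request and return the first completion's text (needs network access)."""
    with urllib.request.urlopen(build_request(api_key)) as resp:
        return json.load(resp)["choices"][0]["text"]
```

Separating request construction from sending makes it possible to inspect the exact JSON body before spending any API quota.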

The hyper­pa­ra­me­ters are vanilla GPT-3 set­tings:

  1. best-of (BO = 1): using Meena-style best-of rank­ing like BO = 20 is not worth the expense here as we are not ask­ing tricky ques­tions or assign­ing tasks with a ‘right answer’; for cre­ative writ­ing like anime reviews, reg­u­lar sam­pling is fine
  2. tem­per­a­ture: vary­ing tem­per­a­tures from 0.80 to 0.95 all work fine, so this is not a task that is tem­per­a­ture-sen­si­tive—ap­par­ently there are always rea­son­able com­ple­tions even if a lot of low-prob­a­bil­ity tokens are selected
  3. top-p: I gen­er­ally use nucleus sam­pling to cut off the gib­ber­ish tail of the bot­tom 2–5%; here, it did­n’t make much differ­ence that I could tell, so I left it at my usual 2% (0.98).
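To make the top-p setting concrete, here is a toy sketch of nucleus sampling as it is usually described: sort tokens by probability, keep the smallest prefix whose cumulative mass reaches top-p, renormalize, and sample from that. It illustrates the idea only and is not OpenAI's implementation; the function names and the 5-token distribution are made up for the example.

```python
import random

def nucleus_filter(probs, top_p=0.98):
    """Keep the smallest set of most-probable tokens whose total mass reaches top_p.

    Returns {token_index: renormalized_probability}.
    """
    ranked = sorted(enumerate(probs), key=lambda pair: pair[1], reverse=True)
    kept, mass = [], 0.0
    for index, p in ranked:
        kept.append((index, p))
        mass += p
        if mass >= top_p:
            break
    return {index: p / mass for index, p in kept}

def sample(probs, top_p=0.98, rng=random):
    """Draw one token index from the truncated, renormalized distribution."""
    nucleus = nucleus_filter(probs, top_p)
    return rng.choices(list(nucleus), weights=list(nucleus.values()), k=1)[0]

# Toy distribution over 5 "tokens"; the last one is the low-probability tail.
toy = [0.60, 0.25, 0.10, 0.04, 0.01]
print(nucleus_filter(toy, top_p=0.98))  # the 0.01 token is dropped
```

With top-p at 0.98, only the least likely 1% token is excluded here, which matches the intent of trimming the gibberish tail while leaving the rest of the distribution untouched.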

The use of future years in the prompt doesn’t always succeed in generating novel anime; here is a failure case, a perfectly sensible, coherent, plausibly-human-written anime review which is, however, too close to reviewing the real anime Darling in the Franxx:

Review of the themes and plot of the first episode of the new anime “DARLING in the FRANKXX”, sched­uled to pre­miere on Octo­ber 7th, 2017.

Visu­als: 4.0/5.0 (Av­er­age)

The visu­als of this new anime are noth­ing par­tic­u­larly spec­tac­u­lar, but still pretty good. Char­ac­ter designs are fairly unique, but there’s not a whole lot of col­ors for this episode. It does set up for a darker story and does every­thing pretty effec­tive­ly.

Visu­als are rarely an indi­ca­tor of the ani­me’s qual­i­ty, but it gives you a feel for what it looks like over­all.

Sound: 3.5/5.0

This open­ing song was pretty cool, and the seiyuus did a great job in mak­ing their char­ac­ters come to life.

Music was­n’t spec­tac­u­lar, but the voice actors did a great job in bring­ing their char­ac­ters to life.

Sto­ry: 3.5/5.0

Sur­pris­ingly this seems like it’s going to be a pretty intense and gloomy romance story about hav­ing kids and falling in love. I’ll be defi­nitely check­ing this out when it comes out.

The plot sum­mary for “DARLING in the FRANKXX” is basi­cally the setup of this episode.

Char­ac­ter: 3.5/5.0

This char­ac­ter I did­n’t really like when she first started show­ing up, but by the end of the episode I was pretty much warm­ing up to her. The MC does­n’t really do a whole lot in this episode, but he seems pretty nice. A lot of devel­op­ment is prob­a­bly gonna go on with the other char­ac­ters, so I’ll be watch­ing out for that.

Clos­ing Thoughts: (note: this was writ­ten before this anime came out)

This anime did end up being quite decent, and my com­plaints about it mostly con­cerned the time crunch, so it’s not the ani­me’s fault at all. I had only two other things that would have improved this anime even more: One would be to include more gags involv­ing the flow­ers, which were actu­ally pretty cute. They were a funny way to tell us about who the char­ac­ters were and their dynam­ics with one anoth­er. The other would be to not rush things so much. Any­ways this was a pretty great anime over­all. I hope they do a sec­ond sea­son. A quick review about this ani­me: (writ­ten after it was aired) When you first hear “DARLING in the FRANKXX”, you might not think that a 1920’s themed anime with an odd premise could be done, but believe me when I tell you it can, and it was done really well. A review of the first episode can be found on my pre­vi­ous post. They intro­duced the key ele­ments to the plot well, and had a good plot out­line. The begin­ning setup took a lit­tle get­ting used to, but it became quite cute by the end. Themes of the episode were mostly romance, and they did the themes well. It uses its music and visu­als quite effec­tive­ly. Over­all the story was quite unique and well done. Other notable things would include a rel­a­tively high amount of diver­sity in char­ac­ters, such as hav­ing a “hene­nak” as well as hav­ing a rel­a­tively equal amount of female to male char­ac­ters. The art­style for this anime was solid, although the art team was­n’t really given enough time to come up with a great bud­get for it. There­fore, the few col­ors it did have were used to the best of their abil­i­ties in draw­ing the char­ac­ters. Over­all, it was a pretty fun anime and I hope

But they are still much bet­ter than the GPT-2 ones!

So I gen­er­ated 100,002 final snip­pets from 2020-07-12–2020-09-05, some­times tweak­ing the tem­per­a­ture or prompt for vari­ety’s sake, and resam­pling any com­ple­tions below ~550 char­ac­ters. (To­tal: 330,800 lines; 55,301,038 words; 328,099,060 bytes.)
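The resampling rule can be sketched as a small retry loop. This is a hypothetical helper, not the code actually used: the ~550-character threshold comes from the text, while `max_tries` and the stand-in generator are assumptions added so the example terminates and runs offline.

```python
def sample_until_long_enough(generate, min_chars=550, max_tries=5):
    """Call generate() until it returns at least min_chars characters.

    Returns the first sufficiently long completion, or None after max_tries failures.
    """
    for _ in range(max_tries):
        completion = generate()
        if len(completion) >= min_chars:
            return completion
    return None

# Usage with a stand-in generator that yields one short and then one long completion.
fake_outputs = iter(["too short", "x" * 600])
result = sample_until_long_enough(lambda: next(fake_outputs))
print(len(result))
```

In practice `generate` would wrap a single API call; capping the number of retries keeps one unlucky prompt from consuming unbounded quota.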

GPT-3 Download

The snippets can be seen on TWDNE, of course, but are also available as tarballs for download:

  • local mir­ror (77MB)

  • Mega mir­ror

  • Rsync mir­ror:

    rsync --verbose --recursive rsync://78.46.86.149:873/biggan/portraits/2020-09-05-gwern-twdne-v3.5-gpt3snippets.tar.xz ./

Results

“Anon, please tell me the artist of that pic­ture. The way they draw that hair is amaz­ing.”

Anony­mous

“Some­one know who she is? (for a friend)”

“Prob­a­bly from some weird visual nov­el.”

Red­dit

I set up the first version of TWDNE with faces on Tuesday 2019-02-19; overnight, it went viral after being posted to a Chinese FLOSS/technology website, receiving >700,000 unique visitors over the next few days, with surprisingly long average sessions (I guess looking at all the possible faces is hypnotic, or some people were using it as a screensaver):

Google Ana­lyt­ics traffic sta­tis­tics for TWDNE, 2019-02-19–2019-02-23

This traffic surge was accom­pa­nied by a band­width surge (>3.5T­B):

Cloud­Flare cache band­width usage for TWDNE, 2019-02-19–2019-02-23, and 2019-02-22–2019-02-23

By 2019-04-03, there were >1 million unique visitors. Traffic remained steady at several thousand visitors a day for the next few months (~900GB/month), producing Amazon S3 bills of ~$90/month, so in July 2019 I moved hosting to an nginx server at Hetzner. By 2019-07-20, TWDNE traffic hit 1,161,978 unique users in 1,384,602 sessions; because of the JS refresh, ‘pageviews’ (12,102,431) is not particularly meaningful, but we can infer from the remarkable 1m:48s length of the average session that TWDNE users are looking at >7 images per session.
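The images-per-session estimate follows from the 15-second refresh interval; a quick check of that arithmetic, along with the cost per gigabyte implied by the approximate S3 figures, might look like this (variable names are illustrative only):

```python
refresh_seconds = 15                 # TWDNE shows a new face & snippet every 15s
avg_session_seconds = 1 * 60 + 48    # 1m:48s average session

images_per_session = avg_session_seconds / refresh_seconds
print(f"images per session: {images_per_session:.1f}")

monthly_gb, monthly_usd = 900, 90    # approximate figures quoted above
cost_per_gb = monthly_usd / monthly_gb
print(f"implied S3 cost per GB: ${cost_per_gb:.2f}")
```

The session estimate counts refresh intervals only, so it is a rough lower bound that ignores the image shown on first load.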

As jokes go, this was a good one. Once whole-im­ages are solved, per­haps I can make a “This Booru Does Not Exist” web­site to show off sam­ples from that!


  1. One ben­e­fit is that by load­ing the tar­get files dynam­i­cally instead of pro­vid­ing a brows­able direc­tory or some­thing more machine-read­able, this hides the images/snippets from search engi­nes—which is a good thing since it avoids expos­ing mean­ing­less text to search engine users.↩︎

  2. Originally, an automatic full-page refresh was used for simplicity, but a user pointed out that this led to a less-pleasing experience: when the whole page reloaded, the new image/text would visibly download & display, while there was no reason the next image couldn’t be downloaded in the background and the visible image changed atomically.↩︎

  3. One dispir­it­ing anec­dote about mobile users: the ini­tial ver­sion of TWDNE was essen­tially bro­ken on smart­phones as I for­got to add any CSS which would scale the faces down to fit on the screen com­fort­ably. About 460k unique users, more than half on mobile, vis­ited before I noticed the prob­lem on my own. Appar­ently mobile users, and espe­cially those forced to use Safari on iOS (no­to­ri­ous for sup­port­ing few fea­tures & stan­dard­s), expect web­sites to be bro­ken & don’t bother com­plain­ing.↩︎

  4. This takes several hours because the script is not optimized at all and does one face at a time instead of running minibatches of ~20 faces, which is fine for a one-off dump, but if one were doing many such projects or larger dumps, it would be worth taking the time to fix.↩︎