ThisWaifuDoesNotExist.net (TWDNE) is a static website which uses JS to display random anime faces generated by StyleGAN neural networks, along with GPT-3-generated anime plot summaries. Followups: “This Pony Does Not Exist” (TPDNE)/“This Fursona Does Not Exist” (TFDNE)/“This Anime Does Not Exist” (TADNE).
“Practical aspects of StyleGAN2 training”, (2020-04-28):
I have trained StyleGAN2 from scratch with a dataset of female portraits at 1024px resolution. Sample quality was further improved by tuning the parameters and augmenting the dataset with zoomed-in images, allowing the network to learn more detail and to achieve FID metrics comparable to the results of the original work…I was curious how it would work on the human anatomy, so I decided to try to train SG2 with a dataset of head-and-shoulders portraits. To alleviate capacity issues mentioned in the SG2 paper, I preferred to use portraits without clothes (a substantial contributing factor to dataset variance); furthermore, the dataset was limited to just one gender in order to further reduce the dataset’s complexity.
…I haven’t quite been able to achieve the quality of SG2 trained with the FFHQ dataset. After more than 30,000 kimg, the samples are still not as detailed as desirable. For example, teeth look blurry and pupils are not perfectly round. Considering the size of my dataset relative to FFHQ, the cause is unlikely to be a lack of training data. Continuing the training does not appear to help, as is evident from the plateau in FID.
Overall, my experience with SG2 is well in line with what others are observing. Limiting the dataset to a single domain leads to major quality improvements. SG2 is able to model textures and transitions quite well. At the same time, it struggles as the complexity of the object increases with, for instance, greater diversity in poses. It should be noted that SG2 is much more efficient for single-domain tasks than other architectures, reaching acceptable results much faster.
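The zoomed-in augmentation described above amounts to taking a centered crop at a given zoom factor and resizing it back to the training resolution. A minimal sketch of the crop-box arithmetic (a hypothetical helper, not the author's actual pipeline):

```python
def zoom_crop_box(size: int, zoom: float) -> tuple:
    """Return (left, top, right, bottom) for a centered crop of a
    size x size image, zoomed in by `zoom` (e.g. 2.0 = 200%)."""
    assert zoom >= 1.0
    crop = int(round(size / zoom))   # side length of the cropped region
    offset = (size - crop) // 2      # center the crop in the image
    return (offset, offset, offset + crop, offset + crop)

# A 2x zoom on a 1024px portrait keeps the central 512x512 region,
# which would then be resized back up to 1024px for training.
print(zoom_crop_box(1024, 2.0))  # (256, 256, 768, 768)
```

With an image library such as Pillow, the returned box would be passed to `Image.crop()` followed by a resize back to 1024×1024.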
“Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks”, (2015-06-04):
State-of-the-art object detection networks depend on region proposal algorithms to hypothesize object locations. Advances like SPPnet and Fast R-CNN have reduced the running time of these detection networks, exposing region proposal computation as a bottleneck. In this work, we introduce a Region Proposal Network (RPN) that shares full-image convolutional features with the detection network, thus enabling nearly cost-free region proposals. An RPN is a fully convolutional network that simultaneously predicts object bounds and objectness scores at each position. The RPN is trained end-to-end to generate high-quality region proposals, which are used by Fast R-CNN for detection. We further merge RPN and Fast R-CNN into a single network by sharing their convolutional features—using the recently popular terminology of neural networks with ‘attention’ mechanisms, the RPN component tells the unified network where to look. For the very deep VGG-16 model, our detection system has a frame rate of 5fps (including all steps) on a GPU, while achieving state-of-the-art accuracy on PASCAL VOC 2007, 2012, and MS COCO datasets with only 300 proposals per image. In ILSVRC and COCO 2015 competitions, Faster R-CNN and RPN are the foundations of the 1st-place winning entries in several tracks. Code has been made publicly available.
“YOLOv3: An Incremental Improvement”, (2018-04-08):
We present some updates to YOLO! We made a bunch of little design changes to make it better. We also trained this new network that’s pretty swell. It’s a little bigger than last time but more accurate. It’s still fast though, don’t worry. At 320×320 YOLOv3 runs in 22 ms at 28.2 mAP, as accurate as SSD but three times faster. When we look at the old .5 IOU mAP detection metric YOLOv3 is quite good. It achieves 57.9 mAP@50 in 51 ms on a Titan X, compared to 57.5 mAP@50 in 198 ms by RetinaNet, similar performance but 3.8× faster. As always, all the code is online at https://pjreddie.com/yolo/
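The “.5 IOU” metric mentioned above counts a detection as correct when its intersection-over-union with a ground-truth box is at least 0.5. A small illustrative helper (a sketch of the standard IoU computation, not YOLOv3's code):

```python
def iou(a, b):
    """Intersection-over-union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)  # overlap area (0 if disjoint)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

# Two 10x10 boxes offset by half their width overlap by 50/150 ~= 0.33:
# below the 0.5 threshold, so this detection would not count at mAP@50.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```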
“This Anime Does Not Exist.ai (TADNE)”, (2021-01-19):
[Website demonstrating samples from a modified StyleGAN2 trained on Danbooru2019 using TFRC TPUs for ~5m iterations over ~2 months on a TPUv3-32 pod; this modified ‘StyleGAN2-ext’ removes various regularizations which make StyleGAN2 data-efficient on datasets like FFHQ but hobble its ability to model complicated images, and scales the model up >2×. This is surprisingly effective given StyleGAN’s previous inability to approach BigGAN’s Danbooru2019 samples, and TADNE shows off the entertaining results.
The interface reuses Said Achmiz’s These Waifus Do Not Exist grid UI.
Writeup; see also: Colab notebook to search by CLIP embedding; “This Waifu Does Not Exist” (TWDNE)/“This Fursona Does Not Exist” (TFDNE)/“This Pony Does Not Exist” (TPDNE), TADNE face editing, CLIP-guided ponies]
“This Fursona Does Not Exist (TFDNE)”, (2020-05-07):
A StyleGAN2 showcase: high-quality GAN-generated furry (anthropomorphic animal) faces, trained on n = 55k faces cropped from the e621 image booru. For higher quality, the creator heavily filtered and aligned the faces, and upscaled them using waifu2x. For display, it reuses Obormot’s “These Waifus Do Not Exist” scrolling grid code to display an indefinite number of faces rather than one at a time. (TFDNE is also available on Artbreeder for interactive editing/crossbreeding, and a Google Colab notebook for GANSpace-based editing.)
Model download mirrors:
rsync --verbose rsync://18.104.22.168:873/biggan/2020-05-06-arfafax-stylegan2-tfdne-e621-r-512-3194880.pkl.xz ./
“This Pony Does Not Exist”, (2020-07):
“This Pony Does Not Exist” (TPDNE) is the followup to “This Fursona Does Not Exist”, also by Arfafax. He scraped the Derpibooru My Little Pony: Friendship is Magic image booru, hand-annotated images and trained a pony-face YOLOv3 cropper to create a pony face crop dataset, and trained the TFDNE StyleGAN2 model to convergence on TensorFork TPU pods, with an upgrade to 1024px resolution via transfer learning/model surgery. The interface reuses Said Achmiz’s These Waifus Do Not Exist grid UI.
The StyleGAN2 model snapshot is available for download and I have mirrored it:
rsync rsync://22.214.171.124:873/biggan/2020-07-15-arfafax-stylegan2-thisponydoesnotexist-1024px-iter151552.pkl ./
See also: “This Waifu Does Not Exist” (TWDNE)/“This Anime Does Not Exist” (TADNE)
“E621 Face Dataset”, (2020-02-18):
The total dataset includes ~186k faces. Rather than provide the cropped images, this repo contains CSV files with the bounding boxes of the detected features from my trained network, and a script to download the images from e621 and crop them based on these CSVs.
The CSVs also contain a subset of tags, which could potentially be used as labels to train a conditional GAN.
Files:
- get_faces.py: script for downloading the base e621 files and cropping them based on the coordinates in the CSVs.
- faces_s.csv: CSV containing URLs, bounding boxes, and a subset of the tags for 90k cropped faces with rating = safe from e621.
- features_s.csv: CSV containing the bounding boxes for 389k facial features with rating = safe from e621.
- faces_q.csv: CSV containing URLs, bounding boxes, and a subset of the tags for 96k cropped faces with rating = questionable from e621.
- features_q.csv: CSV containing the bounding boxes for 400k facial features with rating = questionable from e621.
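Under the assumption that each faces CSV row carries a URL plus pixel bounding-box coordinates (the column names and sample URL below are hypothetical; the real CSVs may differ), the cropping step of a script like get_faces.py might be sketched as:

```python
import csv
import io

# Hypothetical single-row CSV standing in for faces_s.csv.
sample = "url,x1,y1,x2,y2\nhttps://example.com/a.png,10,20,110,140\n"

def read_crops(f):
    """Yield (url, (left, top, right, bottom)) crop boxes from a faces CSV."""
    for row in csv.DictReader(f):
        box = tuple(int(row[k]) for k in ("x1", "y1", "x2", "y2"))
        yield row["url"], box

for url, box in read_crops(io.StringIO(sample)):
    # After downloading `url`, Pillow could crop it with:
    # Image.open(path).crop(box).save(out_path)
    print(url, box)
```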