
[–]shipblazer420[S] 131 points (11 children)

Here are some computer-generated Emilia faces. They mostly look good, but the transitions between styles or angles are sometimes messy. The quality could be increased at the cost of diversity of styles and expressions, but I chose not to, both to give a less-biased result and to keep the faces from getting boring.

As source training data I used 500 Emilia faces from the web, auto-cropped the faces, and removed all non-Emilia characters (though, as can be seen at 0:39 at the bottom-right, sometimes half of someone's face was left in the crop, most often Rem's; this was also the case in the network I continued training from).
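The cropping step can be sketched roughly like this. The bounding boxes would come from a face detector (e.g. an anime-face cascade), which is assumed here, so `crop_face` and the margin value are illustrative, not the exact script I used:

```python
import numpy as np

def crop_face(img, bbox, margin=0.2):
    """Crop a roughly square region around a detected face box.

    img: H x W x 3 uint8 array; bbox: (x, y, w, h) as a face
    detector would report it (the detector itself is assumed).
    """
    x, y, w, h = bbox
    side = int(max(w, h) * (1 + margin))   # square side with some margin
    cx, cy = x + w // 2, y + h // 2        # box center
    x0 = max(cx - side // 2, 0)
    y0 = max(cy - side // 2, 0)
    x1 = min(x0 + side, img.shape[1])      # clamp to image bounds
    y1 = min(y0 + side, img.shape[0])
    return img[y0:y1, x0:x1]

# Toy 512x512 "image" with a hypothetical face box at (100, 120, 80, 90).
img = np.zeros((512, 512, 3), dtype=np.uint8)
face = crop_face(img, (100, 120, 80, 90))
print(face.shape)  # (108, 108, 3)
```

A real pipeline would then resize each crop to the network's resolution (512x512 for these models) before training.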

The images are the results of a Generative Adversarial Network (GAN), a machine-learning method, presented in video format as a random walk through some of the data points in the latent space of the Generator.

Basically, during training there are two neural networks: a Discriminator (D), which ranks the images it sees, and a Generator (G), which generates images and tries to make them pass the Discriminator as real. G starts from random noise, as it has no idea what kind of images give good results, and by getting D's evaluation of its output it learns what kinds of images score better. Both networks improve over time: D learns what kind of tricks G uses in its quest for good results, and G tries to find out what D expects, so both need to adapt, and ideally this is always an improvement (sometimes it is not, but there are methods to combat such behavior, even if the system as a whole is still somewhat of a black box). D learns filters for some of the features; for example, we can see that Emilia's main features (elven ears, hair color, eye color) are always present, and these are fairly easy to detect. The main question is the correct combination of all of them.
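The adversarial loop described above can be sketched as a toy in plain NumPy: the "data" is just a number drawn from N(4, 1), the Generator is a linear map of noise, and the Discriminator is a logistic classifier, trained with alternating gradient steps (using the non-saturating G loss). This is my own minimal illustration of the idea, not the StyleGAN2 code:

```python
import numpy as np

rng = np.random.default_rng(0)
sigmoid = lambda u: 1.0 / (1.0 + np.exp(-u))

# Real data: samples from N(4, 1).  Generator: g(z) = a*z + b, z ~ N(0, 1).
# Discriminator: D(x) = sigmoid(w*x + c), a score of how "real" x looks.
a, b = 1.0, 0.0          # generator params (starts by emitting plain noise)
w, c = 0.1, 0.0          # discriminator params
lr, n = 0.05, 64         # learning rate, batch size

for step in range(3000):
    # --- discriminator step: push D(real) up and D(fake) down ---
    real = rng.normal(4.0, 1.0, n)
    z = rng.normal(0.0, 1.0, n)
    fake = a * z + b
    d_real, d_fake = sigmoid(w * real + c), sigmoid(w * fake + c)
    w += lr * (np.mean((1 - d_real) * real) - np.mean(d_fake * fake))
    c += lr * (np.mean(1 - d_real) - np.mean(d_fake))
    # --- generator step: push D(fake) up (non-saturating loss) ---
    z = rng.normal(0.0, 1.0, n)
    fake = a * z + b
    d_fake = sigmoid(w * fake + c)
    a += lr * np.mean((1 - d_fake) * w * z)
    b += lr * np.mean((1 - d_fake) * w)

print(round(b, 2))  # b has moved from 0 toward the data mean of 4
```

The same tug-of-war happens in StyleGAN2, just with deep convolutional networks instead of four scalars, and with images instead of numbers.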

This was done with NVIDIA's StyleGAN2 by transfer learning from Gwern's ThisWaifuDoesNotExist v3 for 200 kimg on an RTX 2080 Ti. Thanks to u/gwern for the pre-trained weights and the helpful guide, which you should read along with this if you are interested in how this works.
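The random walk through latent space that produces the video frames can be sketched as follows. The 512-dimensional latent matches StyleGAN2's default, but the keypoint count, frame counts, and plain linear interpolation are illustrative choices; a real rendering script would feed each vector to the trained generator:

```python
import numpy as np

def latent_walk(n_keys=5, steps_between=30, dim=512, seed=0):
    """Build a smooth path of latent vectors by linearly interpolating
    between random keypoints.  Each vector would be one video frame
    (the actual generator call is out of scope here)."""
    rng = np.random.default_rng(seed)
    keys = rng.standard_normal((n_keys, dim))  # random points in latent space
    frames = []
    for k in range(n_keys - 1):
        for t in np.linspace(0.0, 1.0, steps_between, endpoint=False):
            frames.append((1 - t) * keys[k] + t * keys[k + 1])
    frames.append(keys[-1])                    # land exactly on the last key
    return np.stack(frames)

path = latent_walk()
print(path.shape)  # (121, 512): 4 segments x 30 frames + final keypoint
```

Because neighboring latent vectors decode to similar images, the rendered frames morph smoothly from face to face; the messy moments in the video are where the path crosses poorly-covered regions of the latent space.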

[–]runningtoda 63 points (5 children)

Me to my nonexistent girlfriend:

"See! Anime does help me in the real world!"

[–]franmarsiglione 28 points (3 children)

It's all about motivation!

[–]runningtoda 9 points (2 children)

With motivation, all of us lonely guys can get a girlfriend! Thank you kind stranger! For reinvigorating my motivation!

[–]franmarsiglione 6 points (1 child)

Me, I was mostly referring to the motivation to learn and develop AIs. But yeah, that too...

[–]runningtoda 5 points (0 children)

Well... you also boosted our motivation for developing AIs through your wise words

[–]AI_Tori 10 points (1 child)

This is fascinating. Thanks for posting!

I only have knowledge of AI for text (as I use it for projects and my one big Re:Zero fic), but I imagine it's probably not hugely different. For text, you can just dump in a pile of text (like the novels and side stories) and request information back. I admit I haven't played with machine learning myself, though (I'm too much of a novice for that).

That said, I wonder how it might handle more complex character designs. I imagine giving it Rem might confuse it, because the hairband and clips aren't always present. That, or just make sure all images have them (or none do) before moving on.

[–]shipblazer420[S] 4 points (0 children)

Oh, I have also used NLP in a research group! Indeed, the handling of whole data examples (in images, convolution with filters to consider features; in text, the use of transformers (?) to consider the tokens' relations to each other) can be seen as a similar idea, even if the methods vary wildly.
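To make the convolution half of that comparison concrete, here is a minimal "valid"-mode 2D convolution (strictly a cross-correlation, as most deep-learning frameworks implement it) with a hand-made vertical-edge filter; the image and filter are toy values, not learned ones:

```python
import numpy as np

def conv2d(img, kernel):
    """Valid-mode 2D cross-correlation: slide the filter over the image
    and record how strongly each patch matches it."""
    kh, kw = kernel.shape
    h = img.shape[0] - kh + 1
    w = img.shape[1] - kw + 1
    out = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            out[i, j] = np.sum(img[i:i + kh, j:j + kw] * kernel)
    return out

# A vertical-edge filter responds where brightness jumps left-to-right.
img = np.zeros((5, 5))
img[:, 3:] = 1.0                      # right half of the "image" is bright
edge = np.array([[-1.0, 1.0]] * 3)    # 3x2 vertical-edge detector
resp = conv2d(img, edge)
print(resp.max())  # 3.0, at the column where the edge sits
```

A CNN learns stacks of such filters from data instead of hand-designing them, which is why the features it keys on (ear shape, hair color) can be quite abstract.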

The headband/clip thing is not really a problem. I don't completely understand how CNNs see an image, and the features can be really abstract. The decision of when to add the band and when not to is just up to chance, depending on what the latent space plus noise happen to hold for that example, but there is no need to restrict the training data to only one kind. It would indeed make training a bit easier if the band were always present, since it would become a feature needed to pass at some point, but as long as the other features like hair color, eye color, and hair shape are present, StyleGAN2 has been shown to be powerful enough to learn such optional, additional features. The same goes for clothes, as you can see here when Emilia sometimes has a collar and sometimes not. I could even argue that with the band sometimes lacking, the generator needs to find "harder" features to imitate for its images to pass, since with the pin present, a pink splash in the general area could be enough (exaggerating a bit, but you get the idea!)

[–]franmarsiglione 6 points (1 child)

Pretty cool. GAN learning has had me interested since the moment I learned about it, tbh. I'm a CS student, but ML is not part of the standard curriculum here; I just used some back-propagation this year. Hope I can keep at it, since there's a lot to do in this area!

[–]shipblazer420[S] 5 points (0 children)

I'm also an IT student (with pretty much the same curriculum as the CS students here) doing my Master's. I'm currently doing a GAN course on Coursera; I haven't taken it for long enough yet to be able to recommend it, but it looks promising-ish and you can try it for free for a week. I got interested in ML in general when I saw TWDNE v1 almost 2 years ago, and started very lazily to study some things (probably in the wrong order).