
[–]f112809 69 points70 points  (10 children)

Wow!

Now try colorizing manga!

[–]Sillychina 5 points6 points  (5 children)

Is this a possible task without true AI? How do you know what color the hair is supposed to be unless they say so?

[–]KimonoThief 16 points17 points  (0 children)

You could train it on images from the series in question, no?

[–]f112809 6 points7 points  (0 children)

Generally, protagonists will be colored on the front cover of the manga, or there will be color illustrations if a manga becomes famous. Otherwise style2paints could try any color it likes and then analyse feedback from the audience. Anyway, I believe there's always a way to determine what color it should use.

To me, manga are basically sketches, but with continuous storytelling and speech balloons. So I assume identifying different characters/objects/contexts is essential, and it probably shouldn't be done by style2paints. I was just being naively excited...

[–]confusedX 3 points4 points  (0 children)

GANs are well suited for multimodal problems. It's possible

[–]Colopty 0 points1 point  (0 children)

Since you can pick out colors from the reference and tell the tool to use that color in a certain area, yes. It seems to give worse coloration than if you just let it use the default behavior though.

[–]Phantine 0 points1 point  (0 children)

How do you know what color hair is supposed to be unless they say so

Just have it exclusively color Jojo.

[–]Jonno_FTW 0 points1 point  (3 children)

This sounds much harder to get data for, since most manga is black and white. You could try applying this to manga and see what you get, though.

[–][deleted] 0 points1 point  (2 children)

Covers are colored most of the time though.

[–]Jonno_FTW 1 point2 points  (1 child)

Hardly representative of the majority of manga, though. You could have a cover art colouriser, which would be of use.

[–]NeverQuiteEnough 0 points1 point  (0 children)

Often a series will get some color pages.

[–]programmerChilli 27 points28 points  (2 children)

Could I ask what the significant design differences are between this and version 1?

The results for version 1 were already the most impressive I've seen, and these look even better.

[–]q914847518[S] 34 points35 points  (1 child)

Technique difference:

V2 is fully unsupervised and unconditional, as I mentioned above. In my personal empirical tests, v2 is 100% better than v1.

Commercial difference:

Our major competitor, paintschainer, has released many updated models that seem better than v1, so we also used some new methods in v2 to produce better results lol.

Their site: http://paintschainer.preferred.tech/index_en.html

[–]q914847518[S] 72 points73 points  (16 children)

Edit: more screenshots available at: https://github.com/lllyasviel/style2paints

Hi! We are excited to release version 2.0 of style2paints, a fantastic anime painting tool. We would like to share some of the new features of our service with you.

Part I: Anime Sketch Colorization

When I talk about "colorization", I mean transferring a sketch into a painting. What is critical is that:

  1. We are able to, and prefer to, colorize sketches composed of pure lines. This means artists can, but do not need to, draw shadows or highlights on their sketches. This is challenging. Recently paintschainer has aimed to improve such shading, and we offer a different solution that we are very confident in.

  2. The "colorization" should transfer a sketch into a painting, not into a colorful sketch. The difference between a painting and a colorful sketch lies in the shading and the texture. In a fine anime painting, the girls' eyes should shine like a galaxy, the cheeks should be suffused with a flush, and the delicate skin should be charming. We try our best to achieve these things, instead of only putting some color between the lines.

Contributions:

  1. The Most Accurate

Yes, we have the most accurate neural hint pen for artists. The so-called "neural hint pen" combines a color picker and a simple pen tool. Artists are able to select a color and place pointed hints on the sketch. Nearly all state-of-the-art neural painters have such a tool. Among all current anime colorization tools (Paintschainer Tanpopo, Satsuki, Canna, Deepcolor, AutoPainter (maybe it exists)), our pen achieves the highest accuracy. In the most challenging case, an artist can control the color of a 13×13 area with our 3×3 hint pen on a 1024×2048 illustration; for larger blocks, a single 3×3 pointed hint can even control half of the colors of the whole painting. This is very challenging and is designed for professional use (a toy sketch of how such a pointed hint could be represented follows this list). At the same time, the hint pens of other colorization methods prefer messy hints, and those methods do not care about accuracy.

  2. The Most Natural

When I say "natural", I mean that we do not add any human-defined rules to the training procedure, except the adversarial rule. If you are familiar with pix2pix or CycleGAN, you may know that all these classical methods add some extra rules to ensure convergence. For example, pix2pix (or pix2pixHD) adds an l1 loss (or a deep l1 loss) to the learning objective, and the discriminator receives the pairs [input, training data] and [input, fake output]. Though we also used these classic methods for a short period of time, the majority of our training is purely and fully unsupervised, and even fully unconditional. We do not add rules to force the NN to paint according to the sketch; the NN itself finds that if it obeys the input sketch, it can fool the discriminator better. The final learning objective is exactly the same as the classic DCGAN, with nothing else added, and the discriminator does not receive pairs. This is very difficult to make converge, especially when the NN is so deep.

  3. The Most Harmonious

Painting is very difficult for most of us, and that is why we admire artists. One of the most important skills of a fine artist is selecting harmonious colors for a painting. Most people do not know that there are more than 10 kinds of blue in the field of painting, and though these colors are all called "blue", the differences between them have a huge impact on the final result of a painting. Just imagine: a non-professional user runs a colorization tool, the software shows a huge color panel with 20*20=400 colors, and it asks the user "which color do you want?". I am sure the non-professional user cannot select the best color. But this is not a problem for STYLE2PAINTS, because the user can upload a reference image (also called a style image) and directly select colors on that image, and the NN paints according to the reference image and to hints whose colors are taken from it (see the sketch after this list). The results are harmonious in color style, and this is user-friendly for non-professional users. Among all anime AI painters, our method is the only one with this feature.
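
To make points 1 and 3 above concrete, here is a toy illustration (this is not our real code; the shapes and names are made up): the user clicks a pixel of the reference image, the picked color is written into a sparse hint map, and the sketch is fed to the network together with that hint map and its mask.

```python
import numpy as np

def pick_color(reference_img, x, y):
    """Pick a color from the uploaded reference/style image at a clicked pixel."""
    return reference_img[y, x].astype(np.float32) / 255.0

def add_pointed_hint(hint_map, hint_mask, x, y, color, pen_size=3):
    """Write a pen_size x pen_size pointed hint of `color` at (x, y) on the canvas."""
    r = pen_size // 2
    hint_map[y - r:y + r + 1, x - r:x + r + 1] = color
    hint_mask[y - r:y + r + 1, x - r:x + r + 1] = 1.0

# One 3x3 hint on a 1024x2048 sketch, colored by a pixel picked from the reference.
reference = np.zeros((512, 512, 3), dtype=np.uint8)   # stand-in for the style image
hint_map = np.zeros((1024, 2048, 3), dtype=np.float32)
hint_mask = np.zeros((1024, 2048, 1), dtype=np.float32)
add_pointed_hint(hint_map, hint_mask, x=512, y=300, color=pick_color(reference, 40, 60))
# The sketch, hint_map and hint_mask would then be stacked as the network inputs.
```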

Part II: Anime Style Transfer

Yes, the very Anime Style Transfer! I am not sure whether we are the first, but I am sure that if you need style transfer for anime paintings, you can search everywhere for a very long time and you will finally find that our STYLE2PAINTS is the best choice (in fact the only choice). Many Asian papers claim that they are able to transfer the style of anime paintings, but if you check those papers you will find that their so-called novel method is only a tuned VGG. OK, to show you the facts, I am listing the real situation here:

  1. All transfer methods based on an ImageNet VGG are not good enough on anime paintings.

  2. All transfer methods based on an anime classifier are not good enough, because we do not have an anime ImageNet, and if you run a Gram-matrix optimizer on Illustration2Vec or some other anime classifier, the only thing you will get is a perfect Gaussian blur generator lol, because all current anime classifiers are bad at feature learning. (A minimal sketch of such a Gram-matrix style loss follows this list.)

  3. Because of 1 and 2, currently all methods based on Gram matrices, Markov random fields, matrix norms, or deep-feature PatchMatch are not good enough for anime.

  4. Because of 1, 2, and 3, all fast feed-forward transfer methods are also not good enough for anime.

  5. GANs can do style transfer, but we need one where the user can upload a specific style, instead of selecting Monet/Van Gogh (lol, Monet and Van Gogh do not know anime).
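
For reference, by a "Gram-matrix optimizer" in point 2 I mean the standard Gatys-style loss, minimized over the generated image; a minimal sketch in plain NumPy, with `features` standing for one layer's activations from whatever classifier you use:

```python
import numpy as np

def gram_matrix(features):
    """Gram matrix of one layer's activations: (H, W, C) -> (C, C)."""
    h, w, c = features.shape
    flat = features.reshape(h * w, c)
    return flat.T @ flat / (h * w)

def style_loss(generated_features, style_features):
    """Squared Frobenius distance between the two Gram matrices."""
    diff = gram_matrix(generated_features) - gram_matrix(style_features)
    return float(np.sum(diff ** 2))
```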

But fortunately, I managed to write the current one, and I am confident about it :) You can try it directly in our app :)

Just play with our demo! http://paintstransfer.com/

Source and models if you need: https://github.com/lllyasviel/style2paints

Edit: Oh, I forgot to mention an important thing: some of the preview sketches were not selected by us; we directly used the promotional sketches of paintschainer, and we are showing our results on their sketches.

Edit2: If you cannot get good enough results, maybe you are in the wrong mode or you are not using the pen properly. Check this comment for more:

https://www.reddit.com/r/MachineLearning/comments/7mlwf4/pstyle2paintsii_the_most_accurate_most_natural/drv72cj/

[–]visarga 41 points42 points  (0 children)

I like your confidence.

[–]gwern 10 points11 points  (5 children)

All transfering methods based on Anime Classifier are not good enough because we do not have anime ImageNet

So you think if we trained a tag CNN (much larger than illustration2vec) on a dataset like Danbooru, the final layers would be enough to serve as a useful Gram matrix and then anime style transfer would Just Work without any further changes?

[–]q914847518[S] 11 points12 points  (4 children)

Personally I would like to say "yes", but as a researcher I have no evidence to prove it. The risk is very high because such a dataset could cost a lot of money, and no one knows whether it would work.

[–]gwern 16 points17 points  (3 children)

Hm. All the more reason I should finish packing up a torrent of Danbooru images+tags, then...

[–]Risky_Click_Chance 0 points1 point  (1 child)

Would it be better to have a program pull all the data from the website in a snapshot and sort accordingly? How is data with multiple tags formatted?

[–]gwern 0 points1 point  (0 children)

Would it be better to have a program pull all the data from the website in a snapshot and sort accordingly?

Danbooru has a BigQuery mirror of the SQL database which is updated daily, so I'm combining that with a simple wget iteration over the API. The tags are stored as a text array in BQ. BQ can be dumped as JSON and then converted back to SQL. I'm not an SQL guru, so I'm not sure exactly how that array type maps onto a regular SQL db (apparently it's BQ-specific or something).
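
(The rough plan for the array column is just to normalize it into a join table; a sketch under those assumptions, with made-up field names for the JSON-lines export:)

```python
import json

def flatten_tags(path):
    """Flatten a BigQuery JSON-lines dump whose `tags` column is an array
    into (post_id, tag) rows that an ordinary SQL join table can hold.
    The field names ("id", "tags") are guesses about the export, not checked."""
    rows = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            post = json.loads(line)
            for tag in post.get("tags", []):
                rows.append((post["id"], tag))
    return rows

# The rows can then be bulk-inserted into e.g. a post_tags(post_id, tag) table.
```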

[–]FliesMoreCeilings 1 point2 points  (4 children)

Getting some pretty decent results on your demo after manually selecting which parts to color.

The only thing I'm noticing is that the coloring applied tends to be rather watercolor-esque, and somehow doesn't really capture the mono-colored block style typical of anime very well, despite that seeming intuitively easier. Selecting "render illustration" gives the best results, but it's still pretty far from the original style. Is this just the style you're focusing on right now?

[–]q914847518[S] 1 point2 points  (3 children)

In fact sketch colorization is our main service, and I have not devoted much time to tuning or improving style transfer. Right now we are focusing on how to transfer sketches into paintings, and this is more meaningful for the art industry.

[–]FliesMoreCeilings 0 points1 point  (2 children)

Style is transferred, but it's from your training set to the final output, instead of from the reference image (or sketch) to the output. And unfortunately the style can actually massively impact color too. For example, some yellows tend towards greens, and blacks turn to grays in a lot of areas because of the watercolor effect. You sometimes also see things like 'blushies' appear when these weren't present in either of the two source files. For pure colorization the watercolor style is way too present. Try using this as a reference and copying the hair color: http://i.imgur.com/zY7EiHT.jpg

[–]q914847518[S] 2 points3 points  (1 child)

Yes, you get the point. In fact this is a problem that all feed-forward methods face. If we had a well-trained anime VGG, we could definitely use an optimizer or a matcher to get better results and get rid of all these limitations. But unfortunately we do not have such a model. In this case, our label-free method can fill the gap. We are confident in claiming to be the best because no other method is good enough, and anyway ours works in most cases.

[–]FliesMoreCeilings 0 points1 point  (0 children)

It does seem pretty good compared to the alternatives. Good job!

[–]fkhz 0 points1 point  (1 child)

What do you mean by the discriminator receiving pairs?

[–]q914847518[S] 6 points7 points  (0 children)

What do you mean by the discriminator receiving pairs?

Oh sorry if I did not make it clear:

In classic pix2pix, if the input of G is shaped (a, b, c, d) and the output is shaped (a, b, c, e), then we concatenate them and the input of D is shaped (a, b, c, d+e). This is one of the common practices for making a GAN conditional.

"Does not receive pairs" means that D receives only the output of G, with shape (a, b, c, e).

[–]eauxeau 0 points1 point  (1 child)

Can it do hair colors other than blue?

[–]q914847518[S] 3 points4 points  (0 children)

Sure. Maybe there are too many blue demos lol. I will change the demos every week (maybe).

[–][deleted] 18 points19 points  (0 children)

Finally some good use for all that technology.

[–]solomondg 18 points19 points  (3 children)

Damn. Should we expect a paper any time soon?

[–]q914847518[S] 29 points30 points  (2 children)

Yes, maybe soon. We will be happy if our methods can contribute to the community.

[–]Stepfunction 4 points5 points  (0 children)

Please, it would be greatly appreciated!

[–]Neutran 1 point2 points  (0 children)

Please do write a paper! An arXiv paper would make it much easier for us to cite you. It'd be very awkward to reference your work in my paper by listing Reddit links.

[–]cjsnefncnen 32 points33 points  (0 children)

Imagine redoing a whole anime series with this technique..

Cowboy bebop with no game no life color palette plz

[–]Daell 15 points16 points  (5 children)

I would like to highlight waifu2x

Single-Image Super-Resolution for Anime-Style Art using Deep Convolutional Neural Networks. And it supports photos.

It's insanely good

[–]Mar2ck 6 points7 points  (2 children)

The denoise function is practically flawless; the scale function is great but has problems with artifacting. Overall great software, would also recommend.

Waifu2x-caffe is a newer version which is much faster because it can run on Nvidia's CUDA.

[–]Daell 13 points14 points  (1 child)

I'm gonna tell you a bit of a shameful story...

...it all started with the fact that I really needed a particular stock image:

  1. The original, but it's low quality, unusable... you would think.

  2. So I sent it through waifu2x, twice.

  3. Then in Illustrator I auto-traced it.

[–]Mar2ck 5 points6 points  (0 children)

That's actually an ingenious use. It won't work with detailed images but this is pretty clever (and kind of unethical)

[–]madebyollin 0 points1 point  (0 children)

Yup! Also letsenhance.io (same idea, but not cartoon-oriented).

[–]astrange 0 points1 point  (0 children)

Btw, waifu2x is basically the same as nnedi2 which is more than 8 years old now.

http://forum.doom9.org/showthread.php?t=147695

[–]juhotuho10 6 points7 points  (0 children)

Machine learning finally applied to something useful

[–]columbus8myhw 4 points5 points  (6 children)

Why do you focus on anime? What if you try it on other animation styles?

[–]q914847518[S] 24 points25 points  (4 children)

  1. In the field of style transfer, VGG works well on nearly all kinds of images except anime-style images. Many problems related to anime are very challenging, and researchers like challenges.

  2. This kind of application has a large market, and we have many friends/competitors such as paintschainer.

[–]abyssDweller1700 33 points34 points  (1 child)

Be truthful. You just like those anime tiddies.

[–]q914847518[S] 49 points50 points  (0 children)

OK OK maybe ⁄(⁄ ⁄•⁄ω⁄•⁄ ⁄)⁄

[–]columbus8myhw 1 point2 points  (1 child)

What makes anime different from Western animation such that VGG behaves differently on it?

[–]lucidrage 2 points3 points  (0 children)

Anime has more "plot" compared to Western animation, which makes it difficult to reproduce via VGG.

[–]PervertWhenCorrected 0 points1 point  (0 children)

/u/q914847518 "Omae Wa Mou Shindeiru"

/u/columbus8myhw "Nani?!"

[–]bluelightzero 5 points6 points  (3 children)

With all these style transfers and other AI techniques, I wonder if 60fps anime could be feasible now.

[–]RedditNamesAreShort 1 point2 points  (0 children)

IIRC there was a really good DL frame interpolator. It did result in hilarious artifacts with anime though, since anime usually has some sub-animation running at a lower fps than the video is encoded in. So after interpolation, characters moved smoothly for 4 frames and then stood still for 4 frames, or something like that.

[–]madebyollin 1 point2 points  (0 children)

Totally feasible! I did a 4X interpolation test using SepConv on a Howl's Moving Castle clip (posted here) and the basic idea works fine for animation. As /u/RedditNamesAreShort stated, animation keyframes are usually on twos or sparser (i.e. not full 24 FPS), and there are also frequent jump cuts, so you need to check the magnitude of the difference between frames to make sure that you only interpolate between two successive keyframes of the same scene, while keeping the keyframe timestamps fixed. Naively interpolating between pairs of frames means you end up with jerky motion and weird morphing cuts like in the clip I posted.

Will try to get it working eventually and publish the wrapper scripts; unfortunately I don't have a GPU machine on hand right now to develop with...

Edit: looks like there's already a wrapper script with basic video support here, so the diffing is the only remaining work.

Edit 2: Oh wow, I forgot that different parts of the frame will be animated at different rates and offset from each other. That makes things harder, but it's definitely still doable... on the other hand, detecting jump cuts turns out to work fine.
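
Edit 3: to be concrete, the diff check I have in mind is nothing fancier than this (thresholds are made up; frames are uint8 arrays):

```python
import numpy as np

def should_interpolate(frame_a, frame_b, hold_thresh=0.5, cut_thresh=30.0):
    """Decide whether to synthesize frames between two successive source frames."""
    diff = np.mean(np.abs(frame_a.astype(np.float32) - frame_b.astype(np.float32)))
    if diff < hold_thresh:   # near-duplicate (held keyframe): nothing to interpolate
        return False
    if diff > cut_thresh:    # huge change: likely a jump cut, don't morph across it
        return False
    return True
```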

[–]Jerome_Eugene_Morrow 0 points1 point  (0 children)

It's interesting. I've always had a concept in my head that the next step for NNs was going to be as a sort of assistant to humans. It could certainly take over a lot of the duties that colorists and in-betweeners do in anime. Really exciting to imagine how much more art we might be able to get by decreasing that overhead for artists.

[–]Burn1nsun 6 points7 points  (2 children)

Tried a random image with a blue preset style, and a gray/white winter-forest type style.

https://i.imgur.com/a5Q3u3Ar.jpg

https://i.imgur.com/J5X26KP.jpg

Works surprisingly well even if the style/reference image isn't necessarily an anime image.

[–]zergling103 3 points4 points  (1 child)

OwO What's this?

[–][deleted] 3 points4 points  (0 children)

The thing that makes thots obsolete.

[–]Xx_JUAN_TACO_xX 4 points5 points  (3 children)

Why does everybody get amazing results, but when I try it's trash?

[–]Colopty 2 points3 points  (2 children)

Have you tried:

1: A better dataset?

2: More layers?

3: Picking a better random seed?

If you want crappy results though, here's my attempt at generating santa pictures that I made in between juggling family christmas activities. Hopefully it makes you feel better.

[–]Xx_JUAN_TACO_xX 0 points1 point  (1 child)

I'm not even talking about making my own, just trying their website.
Nice job with the horror santa

[–]Colopty 0 points1 point  (0 children)

Oh yeah. Personally I feel like I got the best results when not using the color hints; whenever I used those, the color seemed to bleed in bad ways. I might just have used them incorrectly though. It also didn't work too well when I used a real photo as the sketch, though I guess that's to be expected since it's not made for that. Just keep experimenting.

Also, thank.

[–]TragedyOfAClown 9 points10 points  (3 children)

Picture of Emma Watson. This is really cool. Good Work.

[–]imguralbumbot 4 points5 points  (0 children)

Hi, I'm a bot for linking direct images of albums with only 1 image

https://i.imgur.com/xU6tWYw.jpg

[–]zzzthelastuser 0 points1 point  (1 child)

She has got that different eye color syndrome thing.

[–]zopiac 0 points1 point  (0 children)

Heterochromia

[–]Inprobamur 2 points3 points  (0 children)

This is so cool, just tried with some random pictures that are not even anime.

https://i.imgur.com/2c6oiwL.jpg

[–]Theonlycatintheworld 2 points3 points  (1 child)

Has machine learning gone too far?

[–]PervertWhenCorrected 5 points6 points  (4 children)

Machine Learning "Omae Wa Mou Shindeiru"

Me "Nani?!"

[–]AnvaMiba 12 points13 points  (3 children)

This meme is already dead.

[–]PervertWhenCorrected 8 points9 points  (2 children)

/u/AnvaMiba "Kono mīmu wa sudeni shinde imasu"

Me "Nani?!"

[–]muntoo 0 points1 point  (1 child)

I... ummm... Nani?

[–]columbus8myhw 0 points1 point  (0 children)

90% sure that first line is "This meme is already dead" in Japanese

[–]bitchgotmyhoney 1 point2 points  (4 children)

This would be useful to apply to video game meshes, to add more variety in game

[–][deleted]  (2 children)

[deleted]

    [–]transpostmeta 1 point2 points  (0 children)

    Ocarina of Time would be amazing to play with this.

    [–]Jerome_Eugene_Morrow 0 points1 point  (0 children)

    Finally we can have a Super GameBoy that works. It only took 25 years!

    [–]PENIS_SHAPED_LADDER 0 points1 point  (0 children)

    This guy just independently reinvented swapped color palette sprites. Square Enix, hire this man.

    [–]Stradivariuz 1 point2 points  (0 children)

    this is pretty damn cool

    here are my two (newbie) attempts at coloring:

    https://i.imgur.com/JlFslSX.jpg

    https://i.imgur.com/oAppgDG.png

    sketches courtesy of artgerm

    [–][deleted]  (7 children)

    [deleted]

      [–]q914847518[S] 4 points5 points  (6 children)

      OK, you can upload the image here if you think it is anime related. I will give you a good result. If the result is good enough, maybe I can even add it to the ones I am showing.

      [–][deleted]  (5 children)

      [deleted]

        [–]q914847518[S] 5 points6 points  (3 children)

        https://github.com/lllyasviel/style2paints/tree/master/valiox

        I have prepared a page for you.

        My PC crashed for several minutes, but you can check how much time I used for each image via the Windows clock in the screenshots.

        You uploaded so many images that I randomly selected some. If you are still not satisfied, I will finish all of them.

        Any other requirements, sir?

        [–][deleted]  (1 child)

        [deleted]

          [–]q914847518[S] 3 points4 points  (0 children)

          It is OK. Sometimes we just need a few tricks, such as trying more references. The toggles are also important. Just try more modes, more references, and add some pointed hints! You will like it.

          [–]madebyollin 0 points1 point  (0 children)

          Really nice work, lllyasviel!

          [–]q914847518[S] 4 points5 points  (0 children)

          Fine, give me a few minutes.

          [–]pnloyd 0 points1 point  (0 children)

          I don't see any style transfer in these.

          [–]Uncouth_Bardbarian 0 points1 point  (0 children)

          Illya? Is that an FSN reference?

          [–]NextDysonSphere 0 points1 point  (0 children)

          This is the most amazing work I've ever seen in this field!!! Nice work!

          BTW, do you guys plan to publish your paper any time soon?