That might be a parse error according to the validator, but it's actually a pretty commonly used hack for old versions of Internet Explorer - real browsers drop such an invalid declaration, while IE (up to 6 I think, although 7 still accepted some other character IIRC) parsed it anyway and ignored the asterisk. What's more - this quirk is commonly used with, guess what, the "zoom" property, which is one of the common ways to trigger "hasLayout" mode in the IE engine (although you'd most likely use it with value "1" instead of "2").
So this NN simply wrote a compatibility hack for IE6 :)
If you find this combination of natural computation and the arts (or styling, in this case) interesting, you might want to take a look at papers published in these tracks:
Someone correct me if I am wrong here, but these char-rnn based outputs are great to look at and experiment with - but they don't seem to be of any practical use. All they are capable of showing is that an RNN can "remember" things. The biggest question is: how do you make any use of these things?
Is it possible to nudge the RNN in a particular direction, so that it produces something we want? Perhaps there is an answer to this in Alex Graves's paper on handwriting generation. A more thorough explanation or exploration in this direction would perhaps help.
Is anyone actually working on generating CSS (conditioned!)?
No experience actually trying, but I'd suspect the first step would be training a system on combinations of CSS and HTML; all the really interesting behavior comes from the interaction between the two.
Then I imagine you could do interesting stuff by e.g. constraining the HTML and seeing what kind of CSS was spit out.
It could be a nice experiment to have a basic HTML structure and a picture, and then attempt to generate the style until the rendered HTML matches the image.
Yes, a generative model of HTML+CSS is definitely a direction I'm nodding towards. (I discussed it briefly in the previous section about reinforcement learning in general.) I'm still hazy on the full architecture: you need an RNN to generate the CSS, you probably want to feed in a particular HTML page to target the CSS onto, the users' browsers generate the reward signal, and you can create images of the HTML+CSS combo as rendered in a web browser and bring in a convolutional network somehow... There doesn't seem to be anything quite like this in the literature that I've come across.
(Actually, there's a surprising dearth of reinforcement learning in general. Very few blog posts or demos or introductory materials. It makes it hard to understand what is new about DQN or how the whole system works on a concrete coding level.)
Reinforcement learning seems like overkill if you're just trying to match a single target. RL agents want to maximize reward accumulated over time, while you'd be happy to find a single good state. You could frame this as stochastic optimization (MCMC or simulated annealing) over the CSS string, with some combination of RNN and convnet serving as proposal.
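To make the stochastic-optimization framing concrete, here's a minimal simulated-annealing sketch. The real render-and-compare step is stood in for by a toy `score` function over a numeric parameter vector, and the random-walk `propose` is where an RNN or convnet proposal would plug in - all of the names and values here are illustrative assumptions, not anyone's actual system:

```python
import math
import random

def score(css_params, target):
    """Toy stand-in for render-and-compare: squared distance between
    the candidate's numeric parameters and a known target rendering."""
    return sum((a - b) ** 2 for a, b in zip(css_params, target))

def propose(params, scale=5.0):
    """Perturb one parameter at random; in the thread's framing, an RNN
    or convnet would supply smarter proposals than this random walk."""
    new = list(params)
    i = random.randrange(len(new))
    new[i] += random.gauss(0, scale)
    return new

def simulated_annealing(init, target, steps=5000, t0=10.0):
    current, current_cost = init, score(init, target)
    best, best_cost = current, current_cost
    for step in range(steps):
        temp = t0 * (1 - step / steps) + 1e-9  # linear cooling schedule
        cand = propose(current)
        cand_cost = score(cand, target)
        # Accept downhill moves always, uphill moves with Boltzmann probability
        if cand_cost < current_cost or random.random() < math.exp((current_cost - cand_cost) / temp):
            current, current_cost = cand, cand_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost

# e.g. [width, height, font-size, margin] in px, target known from a mockup
target = [320.0, 240.0, 16.0, 8.0]
best, cost = simulated_annealing([100.0, 100.0, 10.0, 0.0], target)
```

Swapping the toy `score` for an actual headless-browser render plus image comparison is where the real cost lives, which is why the proposal distribution matters so much.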
The best paper award at CVPR this year (http://www.cv-foundation.org/openaccess/content_cvpr_2015/ht...) has an architecture somewhat like this: instead of a web browser they're using a 3D rendering engine, and searching for pose parameters that cause their rendered image to match an observed image. This is just Bayesian inference using MCMC, where they train a deep net to function as a data-driven proposal distribution.
If your goal were to actually get a working system, you'd probably want to do inference directly on a parameter vector encoding all the relevant quantities - heights, widths, font sizes, colors, and so on, of the various boxes - that you could programmatically ground out into a CSS file. Trying to do inference over the raw text is making things artificially hard since you have to put so much work into even just getting correct syntax. Though maybe that's part of the fun. :-)
I think the optimal way to do this would be to use an RNN that can walk up and down the DOM. So it starts at the HTML tag, and chooses which path to go up and down, visiting each node. Then iterate through every style attribute and predict what value it should have.
This would create a simple way of generating CSS styles for a document, without dealing with the complicated issues of RNNs having limited memory and producing correct syntax.
Then you can use these predictions as a prior probability over what the CSS styles should be, and use some kind of Bayesian optimization to find the optimal settings in the fewest experiments.
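The DOM-walk idea above can be sketched in a few lines. This is a minimal toy, assuming a hand-rolled `Node` tree and a stubbed `predict_value` standing in for the RNN's per-property prediction; the point is that enumerating a fixed property list per node yields syntactically valid CSS by construction:

```python
# Toy DOM-walk CSS generator. `predict_value` is a placeholder for the
# RNN that would predict a value for each (node, property) pair.

STYLE_PROPERTIES = ["color", "font-size", "margin"]

class Node:
    def __init__(self, tag, node_id, children=()):
        self.tag = tag
        self.node_id = node_id
        self.children = list(children)

def predict_value(node, prop, depth):
    """Placeholder for the learned predictor; here just a fixed rule."""
    if prop == "font-size":
        return f"{max(10, 18 - 2 * depth)}px"
    if prop == "margin":
        return f"{4 * depth}px"
    return "#333333"

def generate_css(node, depth=0):
    """Depth-first walk over the DOM, emitting one rule per node.
    No free-form text generation, so no invalid syntax to worry about."""
    decls = "; ".join(f"{p}: {predict_value(node, p, depth)}" for p in STYLE_PROPERTIES)
    rules = [f"#{node.node_id} {{ {decls}; }}"]
    for child in node.children:
        rules.extend(generate_css(child, depth + 1))
    return rules

dom = Node("html", "root", [
    Node("div", "header"),
    Node("div", "body", [Node("p", "intro")]),
])
css = "\n".join(generate_css(dom))
```

The predicted values then form the prior that a Bayesian-optimization loop would refine against whatever reward signal you choose.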
The "reward signal", aka objective function, seems to be the challenging part here. In the parent post's suggestion, all you'd end up with is a neural network that could /maybe/ reproduce a picture (assuming CSS is expressive enough and the mapping is approximately learnable). It'd be more interesting to have some "quality" measure that actually meant something to evaluate outputs.
If the original image were created in a vector-based program -- in other words, something where the x,y,width,height parameters are known and stored -- you can load the generated HTML+CSS in a known reference browser and enumerate the x,y,width,height of the matching elements in the DOM. If your fitness function is a golf score, then something like:
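A hypothetical sketch of such a golf-score fitness function (lower is better, zero means every box landed exactly where the mockup says). In a real system `rendered_boxes` would come from a headless browser - e.g. querying each element's bounding rectangle - but here both inputs are plain dicts, and the penalty value is an arbitrary assumption:

```python
def fitness(target_boxes, rendered_boxes):
    """Sum of absolute geometry errors over all target elements.
    Boxes are (x, y, width, height) tuples keyed by element id.
    Elements missing from the render are charged a large penalty."""
    MISSING_PENALTY = 10_000
    score = 0
    for elem_id, (x, y, w, h) in target_boxes.items():
        if elem_id not in rendered_boxes:
            score += MISSING_PENALTY
            continue
        rx, ry, rw, rh = rendered_boxes[elem_id]
        score += abs(x - rx) + abs(y - ry) + abs(w - rw) + abs(h - rh)
    return score

# Known geometry from the vector mockup vs. what the browser rendered
target = {"header": (0, 0, 800, 60), "sidebar": (0, 60, 200, 540)}
render = {"header": (0, 0, 800, 58), "sidebar": (0, 64, 200, 536)}
# fitness(target, render) == 2 + 8 == 10
```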
Yeah, but you need to render in a headless browser. That might take 0.1 seconds per page, which is extremely slow in the context of reinforcement learning.
I thought about trying to do MCMC over a beam search through the rnn output, but ran out of time and patience.
My point was that this is boring. A fitness function that evaluated some notion of how "pretty" a page was would be cooler than being able to regenerate a screenshot's CSS (in a likely very complex form).
That would be really interesting to see. I bet there would be convoluted CSS hacks we never even dreamed were possible, and they wouldn't be practical at all. :)
Letting my imagination run wild, I'd add a speech recognition module, generating CSS from a spoken description of layout etc.
After some busy work along that line, we would arrive at a generation of PLs where you don't write source code but have it generated from interpreted speech input.
I'm not so sure I (as a programmer) would really want that, but it would seem a rather obvious line of development.
You can do ML, but you won't match the performance of state-of-the-art deep learning networks. Training on CPU isn't impossible: Gwern said it was a 20x slowdown, others said it was only as bad as 10x. But for smaller models it should be feasible.