Growing Neural Cellular Automata: A Differentiable Model of Morphogenesis (distill.pub)
210 points by eyvindn on Feb 11, 2020 | 46 comments



Also check out the "Moveable Feast Machine", Robust-first Computing, and this Distributed City Generation example:

https://news.ycombinator.com/item?id=21858577

DonHopkins on Oct 26, 2017 | on: Cryptography with Cellular Automata (1985) [pdf]

A "Moveable Feast Machine" is a "Robust First" asynchronous distributed fault tolerant cellular-automata-like computer architecture. It's similar to a Cellular Automata, but it different in several important ways, for the sake of "Robust First Computing". These differences give some insight into what CA really are, and what their limitations are.

Cellular Automata are synchronous and deterministic, and a rule can only change the value of the current (center) cell. Because all cells are evaluated at once (so the evaluation order doesn't matter), it's necessary to double-buffer the "before" and "after" grids. Moveable Feast Machines, by contrast, are like asynchronous, non-deterministic cellular automata with large event windows that can modify adjacent cells.
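To make the difference concrete, here's a minimal toy sketch (plain Python/NumPy; the rule functions are hypothetical stand-ins): the classic CA needs a double buffer so every cell reads the same "before" grid, while an MFM-style update fires random sites one at a time and may rewrite the whole local window in place.

    import numpy as np

    def ca_step(grid, rule):
        # Classic CA: every cell reads the same "before" grid and
        # writes only its own value into a fresh "after" grid.
        new = grid.copy()
        h, w = grid.shape
        for y in range(1, h - 1):
            for x in range(1, w - 1):
                new[y, x] = rule(grid[y-1:y+2, x-1:x+2])  # sets center cell only
        return new

    def mfm_events(grid, rule, n_events):
        # MFM-style: random sites fire one at a time, in place, and
        # the rule may rewrite the entire local window, not just the center.
        h, w = grid.shape
        for _ in range(n_events):
            y = np.random.randint(1, h - 1)
            x = np.random.randint(1, w - 1)
            grid[y-1:y+2, x-1:x+2] = rule(grid[y-1:y+2, x-1:x+2])
        return grid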

Here's a great example with an amazing demo and explanation, and some stuff I posted about it earlier:

https://news.ycombinator.com/item?id=14236973

Robust-first Computing: Distributed City Generation:

https://www.youtube.com/watch?v=XkSXERxucPc


David Ackley's[1] real-life hardware implementation of Robust-First Computing is at a new YouTube channel called the T2 Tile Project[2]. He jumpstarted the new channel after retiring from the University of New Mexico. He releases new updates every Tuesday like clockwork. This is all he thinks, talks, dreams about, and builds. It's amazing.

[1] https://www.cs.unm.edu/~ackley/

[2] https://www.youtube.com/channel/UC1M91QuLZfCzHjBMEKvIc-A


Ackley is literally thinking hundreds of years ahead of the hardware we have. It's a shame that his ideas will probably never make much of a dent in the current computing ecosystem.


One day Von Neumann architecture will be superseded by something more similar to cellular automata. Von Neumann architecture makes sense from a logical perspective which is not concerned with physical space or scaling efficiency. Chips designed as distributed meshes similar to neurons in the brain will eventually take us much further.


From what I understand, what we call Von Neumann architecture was actually his 'first draft' prototype, and what you describe is what he would have built next - if he hadn't died of brain cancer.


I posted about this and typed in some stuff from his book earlier:

https://news.ycombinator.com/item?id=21858465

John von Neumann's 29-state cellular automaton is (ironically) a classic, decidedly "non-von Neumann" architecture.

https://en.wikipedia.org/wiki/Von_Neumann_cellular_automaton

He wrote the book on "Theory of Self-Reproducing Automata":

https://archive.org/details/theoryofselfrepr00vonn_0

He designed a 29-state cellular automaton architecture to implement a universal constructor that could reproduce itself (which he worked out on paper, amazingly):

https://en.wikipedia.org/wiki/Von_Neumann_universal_construc...

He actually philosophized about three different kinds of universal constructors at different levels of reality:

First, the purely deterministic and relatively harmless mathematical kind referenced above: an idealized, abstract 29-state cellular automaton that could reproduce itself with a Universal Constructor, but was quite brittle, synchronous, and intolerant of errors. These have been digitally implemented in the real world on modern computing machinery, and they make great virtual pets, kind of like digital tribbles, but not as cute and fuzzy.

https://github.com/SimHacker/CAM6/blob/master/javascript/CAM...

Second, the physical, mechanical, and potentially dangerous kind, which is robust and error-tolerant enough to work in the real world (given enough resources), and is now a popular theme in sci-fi: the self-reproducing robot swarms called "Von Neumann Probes" on the astronomical scale, or "Gray Goo" on the nanotech scale.

https://en.wikipedia.org/wiki/Self-replicating_spacecraft#Vo...

https://grey-goo.fandom.com/wiki/Von_Neumann_probe

>The von Neumann probe, nicknamed the Goo, was a self-replicating nanomass capable of traversing through keyholes, which are wormholes in space. The probe was named after Hungarian-American scientist John von Neumann, who popularized the idea of self-replicating machines.

Third, the probabilistic, quantum mechanical kind, which could mutate and model evolutionary processes, and rip holes in the space-time continuum, which he unfortunately (or fortunately, for the sake of humanity) didn't have time to fully explore before his tragic death.

p. 99 of "Theory of Self-Reproducing Automata":

>Von Neumann had been interested in the applications of probability theory throughout his career; his work on the foundations of quantum mechanics and his theory of games are examples. When he became interested in automata, it was natural for him to apply probability theory here also. The Third Lecture of Part I of the present work is devoted to this subject. His "Probabilistic Logics and the Synthesis of Reliable Organisms from Unreliable Components" is the first work on probabilistic automata, that is, automata in which the transitions between states are probabilistic rather than deterministic. Whenever he discussed self-reproduction, he mentioned mutations, which are random changes of elements (cf. p. 86 above and Sec. 1.7.4.2 below). In Section 1.1.2.1 above and Section 1.8 below he posed the problems of modeling evolutionary processes in the framework of automata theory, of quantizing natural selection, and of explaining how highly efficient, complex, powerful automata can evolve from inefficient, simple, weak automata. A complete solution to these problems would give us a probabilistic model of self-reproduction and evolution. [9]

[9] For some related work, see J. H. Holland, "Outline for a Logical Theory of Adaptive Systems", and "Concerning Efficient Adaptive Systems".

https://www.deepdyve.com/lp/association-for-computing-machin...

https://deepblue.lib.umich.edu/bitstream/handle/2027.42/5578...

https://www.worldscientific.com/worldscibooks/10.1142/10841


Thank you very much for this. I originally learned about the Von Neumann / CA connection here on HN, but never really found a lot of sources for it. I've been deeply wanting to learn more. You've posted a lot of awesome sources above - is there anything else you'd care to add, for someone who wants to know as much as possible about what Von Neumann was up to?


Von Neumann was a friend and colleague of Claude Shannon, the founder of information theory (the mathematical theory of communication). Back then, it was a really small world of scientists working in this field. Von Neumann, Turing, and Shannon spent a lot of time complementing each other's work. Shannon eventually showed that information encoding/decoding is optimally done with collections of bits (a name coined by John Tukey; John Archibald Wheeler later coined the phrase "it from bit"). At some point Shannon suggested to Turing that his machine could be whittled down to only two states. Von Neumann was inspired by Turing's work, which led him to create the Von Neumann architecture. Von Neumann was brilliant, but he relied heavily on Shannon and Turing, and vice versa. And to answer your question... Shannon wrote a paper on that very subject out of deep admiration for his friend Von Neumann. https://projecteuclid.org/download/pdf_1/euclid.bams/1183522...


Very cool. Thank you.


> One day Von Neumann architecture will be superseded by something more similar to cellular automata.

At which point we will have to call it the Cellular Ulam-Neumann architecture ;)


Inasmuch as CAs are a specific type of systolic array, yes, I think we'll move more in that direction, towards systolic arrays.


You mean like the internet?


One of the authors here.

Feel free to ask any questions!

We encourage you to play with the attached Colab with which you can train models from scratch in <30min.


Very cool. I've only just skimmed the article (and quickly), but I was wondering: instead of generating pictures, could one apply this to generating a CA (or parameters for a CA) that does certain types of computations? This would allow us to evolve or find CAs that could perform certain types of multi-stage computations in a highly parallel fashion and could be readily implemented in FPGAs (for example).

Oh, sounds like I just found a fun project to work on.


We have a lot of future applications and directions in mind, including the feasibility of applying this to distributed computing and consensus. In particular, the idea of training a model to generate CA parameters, instead of having a CA fit a specific pattern, is of great interest to us and to the synthetic biology field. Feel free to get in touch.


This is a wonderful idea. Would you be interested in collaborating on this project?


Dang! I started working mid-last year on practically exactly this as a side project (I was going to call the paper "Towards the horse", if you get the reference :) ). Congratulations on making it work, I am really psyched. Can you remember what triggered the motivation to work on this? Somehow I remember there being something on reaction-diffusion equations on HN around that time. Had you tried using a single Laplace operator instead of two Sobel filters? My approach is to model the reaction-diffusion as a sum of a 1x1 convolution (reaction) and a depthwise 3x3 convolution with a fixed kernel multiplied by a learnable constant (diffusion). However, with that parameterization a single seed pixel obviously will not work. Any thoughts?
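Concretely, the update I have in mind looks something like this (a TensorFlow sketch; the names and the 16-channel width are just my assumptions, not anyone's actual code):

    import tensorflow as tf

    CHANNELS = 16  # assumed state width, as in the article

    class ReactionDiffusionStep(tf.keras.layers.Layer):
        def __init__(self):
            super().__init__()
            # Reaction: learnable 1x1 convolution mixing the channels.
            self.reaction = tf.keras.layers.Conv2D(CHANNELS, 1)
            # Diffusion: fixed 5-point Laplacian, one learnable rate per channel.
            self.rates = tf.Variable(tf.zeros(CHANNELS))
            lap = tf.constant([[0., 1., 0.],
                               [1., -4., 1.],
                               [0., 1., 0.]])
            self.kernel = tf.tile(tf.reshape(lap, [3, 3, 1, 1]),
                                  [1, 1, CHANNELS, 1])

        def call(self, state):
            diffusion = tf.nn.depthwise_conv2d(
                state, self.kernel, strides=[1, 1, 1, 1], padding='SAME')
            return state + self.reaction(state) + self.rates * diffusion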


Diffusion-reaction systems were an inspiration in general. Using a Laplace operator (or the discretised equivalent for our 2D grid) might have trouble learning to generate these patterns - the Laplacian wouldn’t always provide unique information as to where a cell might be based on its neighbours. It’s possible the network would learn to exploit the hidden channels to bypass this directional invariance. Starting from a single pixel in such a setting would indeed need some mechanism to break the symmetry (stochastic updates as used here, for instance).
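To see why, compare the kernels themselves (an illustrative NumPy snippet): mirroring the neighbourhood flips the Sobel responses but leaves the Laplacian response unchanged, so a cell loses the "which side is the gradient on" signal.

    import numpy as np

    # Sobel pair: anti-symmetric, so the response tells a cell
    # *which direction* a neighbouring gradient comes from.
    sobel_x = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]])
    sobel_y = sobel_x.T

    # Laplacian: rotationally symmetric, so rotated or mirrored
    # neighbourhoods produce exactly the same response.
    laplace = np.array([[0,  1, 0],
                        [1, -4, 1],
                        [0,  1, 0]])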


Now, having scrolled down to the bottom of the paper, I've seen there's a section on the genesis of the idea.


Hi

I just see a website that doesn't seem to do anything particularly interesting with a CA. Is there a paper that explains what's going on? (I might have missed the link, but I did look.)


Congratulations, it's really cool. It's late here now, I will play with it tomorrow. I actually have a lot of questions!

- How well do the CA rules "compress" the image compared to a typical image compression algorithm?

- Have you tried 3D? Animations?

- What happens if you inject a small amount of noise into the learned parameters? Does the image get destroyed, or does it mutate into something similar?

- Is it practical with more complex images, for example, a human face?


1) This is often the first question many people familiar with NNs ask, and rightly so. Compression was not one of our goals with this article, and it would look like a terrible compression algorithm if that were its purpose. In fact, the model displayed here is about 8.3k parameters (a back-of-the-envelope count is sketched after these answers), although the WebGL model is quantized (more info on this in the last section), and each model learns to encode an image consisting of 44x44x3 = 5808 integers. We made no attempt to minimize this number. The key thing to bear in mind is that all the cells have the exact same rule and the image generation starts from a single one of them - meaning they have to learn to communicate locally with their neighbours to self-organize into the correct pattern. This is a very non-trivial task, and the majority of the model's parameters are likely going towards this communication protocol and growth behavior.

2) We have not tried 3D. As for animations, some of our earliest experiments suggested one could achieve “animations” by applying the loss at key-points to have the model learn to iterate through these points across several time steps.

3) One could argue the WebGL implementation does this to some extent by quantizing the learned weights we take from the Tensorflow training code. The model remains very resilient and worked out of the box in almost all cases. Moreover, if one tried to inject explicit noise into the CA at a given location, some models would have no problem adapting to it, while others would fail miserably. Some early experiments yielded some remarkably resistant models, able to survive continuous, globally occurring noise. We suspect explicitly training them with noise injected would let us drive the models towards more consistently resistant behaviors.

4) One of the main obstacles to larger patterns at the moment is memory usage during a forward/backward pass. There are optimizations and tricks we plan to employ to generate larger and more complex patterns, which may be discussed in a follow-up.
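As a back-of-the-envelope check on the ~8.3k figure in answer 1 (assuming the architecture described in the article: a 16-channel state, identity plus two Sobel filters giving a 48-value perception vector, one 128-unit hidden layer, and a bias-free, zero-initialized output layer):

    channels = 16
    perception = channels * 3                    # identity + Sobel_x + Sobel_y = 48
    hidden = 128

    hidden_layer = perception * hidden + hidden  # weights + biases = 6272
    output_layer = hidden * channels             # no bias = 2048
    print(hidden_layer + output_layer)           # 8320, i.e. ~8.3k parameters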


Dave Ackley, who developed the Moveable Feast Machine, had some interesting thoughts about moving from 2D to 3D grids of cells:

https://news.ycombinator.com/item?id=21131468

DonHopkins 4 months ago | on: Wolfram Rule 30 Prizes

Very beautiful and artistically rendered! Those would make great fireworks and weapons in Minecraft! From a different engineering perspective, Dave Ackley had some interesting things to say about the difficulties of going from 2D to 3D, which I quoted in an earlier discussion about visual programming:

https://news.ycombinator.com/item?id=18497585

David Ackley, who developed the two-dimensional CA-like "Moveable Feast Machine" architecture for "Robust First Computing", touched on moving from 2D to 3D in his retirement talk:

https://youtu.be/YtzKgTxtVH8?t=3780

"Well 3D is the number one question. And my answer is, depending on what mood I'm in, we need to crawl before we fly."

"Or I say, I need to actually preserve one dimension to build the thing and fix it. Imagine if you had a three-dimensional computer, how you can actually fix something in the middle of it? It's going to be a bit of a challenge."

"So fundamentally, I'm just keeping the third dimension in my back pocket, to do other engineering. I think it would be relatively easy to imaging taking a 2D model like this, and having a finite number of layers of it, sort of a 2.1D model, where there would be a little local communication up and down, and then it was indefinitely scalable in two dimensions."

"And I think that might in fact be quite powerful. Beyond that you think about things like what about wrap-around torus connectivity rooowaaah, non-euclidian dwooraaah, aaah uuh, they say you can do that if you want, but you have to respect indefinite scalability. Our world is 3D, and you can make little tricks to make toruses embedded in a thing, but it has other consequences."

Here's more stuff about the Moveable Feast Machine:

https://news.ycombinator.com/item?id=15560845

https://news.ycombinator.com/item?id=14236973

The most amazing mind blowing demo is Robust-first Computing: Distributed City Generation:

https://www.youtube.com/watch?v=XkSXERxucPc

And a paper about how that works:

https://www.cs.unm.edu/~ackley/papers/paper_tsmall1_11_24.pd...

Plus there's a lot more here:

https://movablefeastmachine.org/

Now he's working on a hardware implementation of indefinitely scalable robust first computing:

https://www.youtube.com/channel/UC1M91QuLZfCzHjBMEKvIc-A


I'm wondering these things too. You say in the paper that your net is quite small - I wonder if you could first extend it to regenerative sprite animations for some fun procedural NPCs in a roguelike? If you can efficiently add or change a pixel or two per frame in a sprite+collision mask, imagine battle sequences where things get hacked off, maybe regrow if you wait too long, etc.


Nice work @eyvindn! A few questions:

1. Am I just imagining things, or do the results depend on the speed?

2. I'm able to erase the shape very easily at max speed, despite setting it to persist. Have you studied how much shape loss is required for the shape to vanish? (example video: https://youtu.be/zMQkTyzdphc)


1. The results do not depend on speed in principle. The difference between 60FPS and "max" can be several orders of magnitude depending on your hardware, so at higher speeds you may be seeing instabilities that only show up over much longer horizons.

Interactions will play out very differently at different speeds because you are interacting with a sped-up/slowed down version of the CA.

2. We haven’t done any rigorous studies of regenerative capability, although this is certainly on our to-do list. From empirically playing with them, models seem to be more susceptible to damage to the centre of their bodies than to limbs, likely as a result of “growing” outwards.


This is because the center point of your models does not expect a foreground/background boundary. When you put a hole at the center, the boundaries seek to stabilize the system by bubbling outward. The best way to put it is that there is no stable state where a doughnut hole exists at the center. An interference pattern ripples outward and destroys the system.


Hi, very interesting work. I just posted a comment with some related work from my group. Please check it out and let me know if you need any info. Best, Stefano Nichele


Thank you. We will take a look at this. Currently we are working through a backlog of communication.


Have you looked at other applications of this approach for tasks like semantic segmentation or depth estimation?


Yes. Several other applications (both with and without an underlying 2D grid) are in the pipeline which we hope to publicise soon.


Spectacular. I think most people might be lost on the significance of what has been done here. I suspect that's because the solution is framed in terms of morphogenesis. That is, in fact, not what is going on here. A more suitable metaphor would be the functional basis of a visual hallucination. The very same mechanics of the demonstrated result are described in Oliver Sacks's book on the subject. Learned patterns are procedurally generated through probabilistic inference over adjacent feature vectors. Dreams work in a similar way. To prove that this is what is going on, I suggest training the CA on multiple frames of an animation of the lizard walking.

The research is really promising, but the conclusion the authors come to is that this system can be used to generate self-organizing agents that evolve and replicate. That result isn't possible with this approach. However, they've actually created something much better than the intended result... a real-time auto-complete for visual scenes.


The authors' other work on deep dreaming better reflects what you describe here.

The astonishing results with regard to morphogenesis cannot be emphasized enough: a complex structure is robustly encoded in a single function and can be reproduced from a single grid cell.



I played with "regenerating" models by filling the field with multiple overlapping copies and repeatedly changing the angle, with rather weak results.

As the shapes are disintegrated and rebuilt, patterns can form: the eye forms thick, straight brown-and-white "sausages"; the yellow striped fish forms similar but smaller strips; the butterfly and the pretzel exhibit crawling shapes with partial features and empty space; the Christmas tree can develop stripes on a green background instead of balls; the lizard reforms very effectively, recovering empty space; the ladybug reforms partially (heads in a sea of wing material); the explosion produces a uniform patch of the interior section as growth extinguishes the border.


This kind of reminds me of this comment (Moveable Feast Machine): https://news.ycombinator.com/item?id=17877155


David Ackley's talks are amazing and mind expanding.

Public Lecture: Artificial Life for Bigger & Safer Computing

Talk presented October 8, 2014 as part of the John von Neumann Public Lecture series at the Center for Complexity and Collective Computation (C4) in Madison, Wisconsin. Video recorded by Alan Ruby of the Wisconsin Institute for Discovery.

https://www.youtube.com/watch?v=NqSnoJ-VGH4

He stakes out a bold, iconoclastic position: the ridiculously successful deterministic approach to computing is a dead end that will not scale. But he's not just Clifford Stoll predicting that nobody's ever going to shop over the internet (bless his heart for owning up to that ;) - he's taking a much longer-term view, and his beliefs don't contradict von Neumann, but actually reflect what von Neumann predicted.

John von Neumann actually predicted that his own approach was going to fall down.

The future "...logic of automata will differ from the present system of formal logic in two relevant respects:

1. The actual length ... of the chains of operations will have to be considered.

2. The operations of logic ... will all have to be treated by procedures which allow exceptions (malfunctions) with low but non-zero probabilities."

"...natural organisms are constructed to make errors as ... harmless as possible. Artificial automata are designed to make errors as ... disastrous as possible ... We are ... much more 'scared' by the occurrence of an isolated error and by the malfunction ... behind it. Our behavior is clearly that of overcaution, generated by ignorance."

-John von Neumann, 1948


I think I broke the Matrix: https://i.ibb.co/8zGRcNd/Screenshot-20200211-230621-Firefox-...

I erased the lizard except for the tail, and it grew back very weirdly. How robust is this method?


>The models described in this article run on the powerful GPU of a modern computer or a smartphone. Yet, let's speculate about what a "more physical" implementation of such a system could look like. We can imagine it as a grid of tiny independent computers, simulating individual cells. Each of those computers would require approximately 10Kb of ROM to store the "cell genome": neural network weights and the control code, and about 256 bytes of RAM for the cell state and intermediate activations. The cells must be able to communicate their 16-value state vectors to neighbors. Each cell would also require an RGB-diode to display the color of the pixel it represents. A single cell update would require about 10k multiply-add operations and does not have to be synchronised across the grid. We propose that cells might wait for random time intervals between updates.

>The system described above is uniform and decentralised. Yet, our method provides a way to program it to reach the predefined global state, and recover this state in case of multi-element failures and restarts. We therefore conjecture this kind of modeling may be used for designing reliable, self-organising agents. On the more theoretical machine learning front, we show an instance of a decentralized model able to accomplish remarkably complex tasks. We believe this direction to be opposite to the more traditional global modeling used in the majority of contemporary work in the deep learning field, and we hope this work to be an inspiration to explore more decentralized learning modeling.

I've been hoping that someone would build the above computer for over 20 years. I used to just get laughed at, or people would mock me for not knowing what I was talking about (even though I have a computer engineering degree). That transitioned to people at least speculating about what I was saying, and it will soon transition to being old hat. Oh well.

But I look at computers today with a billion transistors, running at the same 3 GHz as they did 20 years ago, and it breaks my heart. I imagine what might have been, if we had arrays of hundreds or thousands of z80 or x86 or 68000 or DEC Alpha or MIPS chips to do whatever we wanted with. Imagine the emergent behavior that might have happened. Instead we're locked into this proprietary, narrow way of working with shaders and specialized neural computers doing SIMD, and I just find it all so uninspiring. But I'm hopeful that some of the recent work getting these older processors running on FPGAs will bear fruit.

https://news.ycombinator.com/item?id=17833829

https://news.ycombinator.com/item?id=19329083

https://news.ycombinator.com/item?id=10089807

https://news.ycombinator.com/item?id=3267428

https://news.ycombinator.com/item?id=19885901

https://www.google.com/search?q=processor+on+FPGA+site:news....
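The quoted design above (one tiny computer per cell, a ~16-value state vector, no global clock, random waits between updates) is easy to caricature in a few lines of software; a toy sketch, emphatically not the article's code, with a trivial stand-in rule:

    import numpy as np

    H, W, CH = 64, 64, 16
    grid = np.zeros((H, W, CH), dtype=np.float32)

    def cell_rule(window):
        # Stand-in for the real ~10k multiply-add update: each cell
        # sees only the 3x3 neighbourhood of 16-value state vectors.
        return window.mean(axis=(0, 1))

    def run(n_events):
        # No global clock: cells fire in random order, as if each
        # waited a random interval between updates.
        for _ in range(n_events):
            y = np.random.randint(1, H - 1)
            x = np.random.randint(1, W - 1)
            grid[y, x] = cell_rule(grid[y-1:y+2, x-1:x+2])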


Game changing!


Why?


This is the foundation for 'programming' the morphogenesis of multi-cellular synthetic life in the far future. Of course, you first have to get the programming of monocellular life to work.

Also, the reverse might be possible, i.e., decompiling genetic code from phenotype and so on.


Foundation for programming morphogenesis? Yes.

But morphogenesis as a concept is fundamentally flawed - it isn't thoroughly grounded in theory. Computing has the Turing test. Do concepts like morphogenesis have any such tests? Nope. So building something on top of half-baked concepts will not scale, sustain, or serve as a foundation.

If we had really understood morphogenesis in the past, and it were 100% captured in theory, we would already have a virtual world mimicking the biological world by now.


Yes, morphological models can be tested. You can hypothesize about an organism's reactions to structural and chemical alterations (morphogens). Diffusion-reaction models are available for certain kinds of morphogenetic behavior. See also https://www.brandeis.edu/now/2014/march/turingpnas.html
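As a concrete, runnable instance of such a diffusion-reaction model, the classic Gray-Scott system produces Turing-style spots and stripes from two coupled chemicals (a minimal NumPy sketch; the parameter values are just one well-known pattern-forming regime):

    import numpy as np

    def laplacian(z):
        # 5-point stencil with periodic (wrap-around) boundaries.
        return (np.roll(z, 1, 0) + np.roll(z, -1, 0) +
                np.roll(z, 1, 1) + np.roll(z, -1, 1) - 4 * z)

    def gray_scott(n=128, steps=10000, Du=0.16, Dv=0.08, f=0.035, k=0.065):
        u = np.ones((n, n))
        v = np.zeros((n, n))
        u[n//2-5:n//2+5, n//2-5:n//2+5] = 0.50   # seed a small perturbation
        v[n//2-5:n//2+5, n//2-5:n//2+5] = 0.25
        for _ in range(steps):
            uvv = u * v * v
            u += Du * laplacian(u) - uvv + f * (1 - u)
            v += Dv * laplacian(v) + uvv - (f + k) * v
        return v  # spots or stripes, depending on (f, k)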


>Also, the reverse might be possible, i.e., decompiling genetic code from phenotype and so on.

I don't know anything about decompiling or ML, but doesn't the black-box nature of NNs rule this out?


NNs are not a black box in the sense that you cannot see what is going on inside them, but in the sense that we currently understand what we see very poorly. You have a function approximator that you have to translate into some other representation. As long as you know how the basic building blocks map, or can be approximated, it might be possible.



