
[–]avturchin 6 points7 points  (0 children)

Another middle way to AGI may be to create a functional model of the human brain, that is, a block diagram of a few tens of black boxes roughly corresponding to cortical regions or functions. Data from actual brains could be used to fine-tune the firing patterns of these blocks (that is, which of them are activated together). However, the inside of each block may be different from human neural circuitry.
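A minimal sketch of what such a block-scheme model might look like in code, purely as an illustration; the module sizes, the gating mechanism, and the `region_activity` supervision are my assumptions, not anything specified in the comment:

```python
# Hypothetical sketch: a few black-box modules whose co-activation pattern is
# fit to recorded region-level brain data, while each block's internals stay
# ordinary neural-network machinery.
import torch
import torch.nn as nn

class BlockSchemeModel(nn.Module):
    def __init__(self, n_blocks=8, in_dim=128, block_dim=64, out_dim=10):
        super().__init__()
        self.blocks = nn.ModuleList([
            nn.Sequential(nn.Linear(in_dim, block_dim), nn.ReLU(),
                          nn.Linear(block_dim, block_dim))
            for _ in range(n_blocks)
        ])
        self.gates = nn.Linear(in_dim, n_blocks)   # which blocks switch on for a stimulus
        self.head = nn.Linear(n_blocks * block_dim, out_dim)

    def forward(self, x):
        gate = torch.sigmoid(self.gates(x))                      # (batch, n_blocks)
        feats = torch.stack([b(x) for b in self.blocks], dim=1)  # (batch, n_blocks, block_dim)
        return self.head((feats * gate.unsqueeze(-1)).flatten(1)), gate

model = BlockSchemeModel()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)

# Toy data: stimuli, task labels, and a recorded per-region activation pattern
# (e.g. normalized fMRI ROI averages) for the same stimuli -- all made up here.
x = torch.randn(32, 128)
y = torch.randint(0, 10, (32,))
region_activity = torch.rand(32, 8)

opt.zero_grad()
logits, gate = model(x)
task_loss = nn.functional.cross_entropy(logits, y)
# "Fine-tune the firing pattern": push block co-activation toward the brain data.
firing_loss = nn.functional.mse_loss(gate, region_activity)
(task_loss + 0.1 * firing_loss).backward()
opt.step()
```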

[–]wassname 2 points3 points  (0 children)

Hey /u/gwern, if it's not much trouble could you also summarize the empirical results of the papers you linked, if you read that part? Or perhaps point out the ones that have the strongest results. I'm skimming them now but I find it hard to read a lot of technical material and have limited time :(

There are a lot of cool AGI and ML ideas, and in retrospect some of the most promising ones (GANs, backprop, batch norm) were not obvious. So I try to rate them on their empirical results so far. From skimming, this seems to have worked quite well so far, which is surprising because it sounds like the data isn't that rich and is focused on visual attention, which supervised ML is already quite good at.

[–]gwern[S] 1 point2 points  (1 child)

And on the flip side, one possibility is that the BCI will allow powerful interaction of a sort simply not possible now, by using brain activations as supervision for understanding material in a way which transports those extremely complex abstractions into the computer in a software-understandable form.

To give a quick random example: imagine the BCI records your global activations as you read that Reddit post about deep learning augmented by EEG/MRI/etc data; a year later while reading HN about something AI, you want to leave a comment & think to yourself 'what was that thing about using labels from brain imaging' which produces similar activations and the BCI immediately pulls up 10 hits in a sidebar, and you glance over and realize the top one is the post you were thinking of and you can immediately start rereading it. And then of course your brain activations could be decoded into a text summary for the comment which you can slightly edit and then post...
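A toy sketch of the retrieval half of that scenario, assuming some BCI pipeline already produces fixed-length activation embeddings; the `ActivationLog` class and the document references are hypothetical:

```python
# Log a brain-activation embedding alongside whatever you were reading, then
# later query the log with the embedding produced while trying to recall it.
import numpy as np

class ActivationLog:
    def __init__(self):
        self.embeddings, self.documents = [], []

    def record(self, embedding, document_ref):
        self.embeddings.append(embedding / np.linalg.norm(embedding))
        self.documents.append(document_ref)

    def recall(self, query, k=10):
        E = np.stack(self.embeddings)
        q = query / np.linalg.norm(query)
        scores = E @ q                        # cosine similarity against the whole log
        top = np.argsort(-scores)[:k]
        return [(self.documents[i], float(scores[i])) for i in top]

# Toy usage with random vectors standing in for decoded activations.
log = ActivationLog()
log.record(np.random.randn(256), "reddit.com/r/MachineLearning/that-deep-learning-post")
log.record(np.random.randn(256), "news.ycombinator.com/some-ai-thread")
print(log.recall(np.random.randn(256), k=2))   # the sidebar's "10 hits", here just 2
```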

One of the things I find frustrating about BCIs is that everyone is working hard on them without a good idea of what exactly one would do with them (aside from the most obvious things like 'robot hand'): https://twitter.com/gwern/status/1037784639233040385 It's very handwavy: "it'll be a memory prosthetic increasing IQ 20 points!" 'yeah but how' 'uh'. I don't need a detailed prototype laying out every step; even just a generic description would do. What's the VisiCalc or visual text editor of BCI? You can describe them, the way Engelbart or Alan Kay could describe their systems on paper, without needing to actually make them or know all the details. But no one's done so for BCIs that I've seen. As enormous as WaitButWhy's discussion of Neuralink is, the examples kinda boil down to 'maybe you could have a little TV in your mind'.

Taking a brain-imitation approach seems to help me imagine more concretely what could be done with a BCI.

So even with just this surface recording data you can imagine doing a lot. You could use the embedding as an annotation for all input streams, like lifelogging. There are probably tons of specific applications you can imagine just on the paradigm of associating mental embeddings with screenshots/text/emails/documents/video timestamps: it's automatic semantic tagging of persons, places, times, subjects, emotions...

It could be used as feedback too. Perhaps there's an embedding which corresponds to coding or deep thought, in which case all notifications are automatically disabled, except for notifications about emails where an RNN predictor predicts high importance based on alertness/excitedness embeddings of earlier emails. Or neurofeedback: the simplest version being to make you calm down. (I remember Gmail had a 'beer' feature, I think, where it would offer to delay emails you sent late at night or make you solve arithmetic puzzles to be sure you wanted to send them.)
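A toy sketch of the notification-gating part of this; the 'deep work' prototype, the thresholds, and the importance scores are all made up for illustration:

```python
# Gate notifications on whether the current activation embedding resembles a
# stored "deep work" prototype; let predicted-important items through anyway.
import numpy as np

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

def should_deliver(notification, current_embedding, deep_work_prototype,
                   focus_threshold=0.8, importance_threshold=0.9):
    """Suppress notifications while the embedding looks like deep work,
    unless a separate predictor rates the notification as important."""
    in_deep_work = cosine(current_embedding, deep_work_prototype) > focus_threshold
    return (not in_deep_work) or (notification["predicted_importance"] > importance_threshold)

# Toy usage with made-up vectors and scores.
proto = np.random.randn(256)
now = proto + 0.1 * np.random.randn(256)                     # currently "in the zone"
email = {"subject": "build is broken", "predicted_importance": 0.95}
print(should_deliver(email, now, proto))                     # delivered despite focus
```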

[–]FractalNerve 0 points1 point  (1 child)

So in essence, DL using a fully randomized input feed and a target feed that is a good approximation for a dataset of many fNIR feeds that doesn't overfit. Would you agree that this might work? Shall we try? Where do I get fNIR datasets?

In a little more detail, but very generically: given that a person's instinctive neural activity in response to an input stimulus (an observation from the environment) is outcome-dependent, we might as well just use a single dataset to generate a ground base model without an imaginary environment. This reduces the space of good approximations to one that is drastically less hard to "brute force" and is deep-learnable.

We assume that the number of neural parameters operating in the real brain is not infinitely large, and that it is maximally predictive for only a very few strongly correlating parameters.

Then we can build an fNIR data prediction model which predicts most of the neural activity of a new person's brain reasonably well.
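One minimal reading of this, sketched as a standard encoding-model regression on synthetic data; the shared-weights assumption, the numbers, and the noise levels are mine rather than anything the comment specifies:

```python
# Regress fNIRS-style channel activity onto stimulus features for one subject,
# then check how well the fit transfers to a (synthetic) new person.
import numpy as np
from sklearn.linear_model import RidgeCV

rng = np.random.default_rng(0)
n_features, n_channels = 64, 32
W_shared = rng.normal(size=(n_features, n_channels))   # structure shared across people

def subject_data(n_trials, subject_noise=0.5):
    """Synthetic data: shared structure plus per-subject variation plus noise."""
    W = W_shared + subject_noise * rng.normal(size=W_shared.shape)
    X = rng.normal(size=(n_trials, n_features))                      # stimulus features
    Y = X @ W + 0.5 * rng.normal(size=(n_trials, n_channels))        # channel responses
    return X, Y

X_train, Y_train = subject_data(500)                                 # "training" subject
model = RidgeCV(alphas=np.logspace(-2, 3, 10)).fit(X_train, Y_train)

X_test, Y_test = subject_data(200)                                   # a new subject
print("held-out R^2 on a new subject:", round(model.score(X_test, Y_test), 3))
```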

After that I assume we need to synthesize a huge amount of fNIR data cleverly, without leaking a bias or too much noise into the data.

Sorry, too lazy to look up references or papers right now, or to academically overload this post. But happy to get a model into the real world, if you like to work together.

[–]gwern[S] 2 points3 points  (0 children)

So in essence, DL using a fully randomized input feed and a target feed that is a good approximation for a dataset of many fNIR feeds that doesn't overfit. Would you agree that this might work? Shall we try? Where do I get fNIR datasets?

I wasn't thinking of imitating a specific brain, necessarily. I was thinking more of meta-learning: each brain is drawn from the distribution of brains, you don't care much about each specific brain or the 'average' brain, you want to capture the generic algorithm which each brain is instantiating in a somewhat different way. So for example, you might treat it MAML-style. Take a set of images, expose the set to a set of humans while recording their brains; now sample each human's recording set and train the seed NN to try to match its activations; take a gradient step in the seed NN to minimize the loss; repeat. Does this get you a much more human-like, generalizable, sample-efficient CNN which you can then apply to any other dataset and which performs better than your standard ResNet?
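A rough sketch of that loop in first-order-MAML style; the data loading, the projection of recordings into a 128-dimensional target space, and the activation-matching loss are all placeholders:

```python
# Each human subject is a "task": the inner loop adapts a seed CNN so its
# embedding matches that subject's recorded responses to a batch of images;
# the outer loop updates the seed so it adapts well across subjects.
import copy
import torch
import torch.nn as nn

seed_net = nn.Sequential(                            # the "seed NN"
    nn.Conv2d(3, 16, 3, padding=1), nn.ReLU(), nn.AdaptiveAvgPool2d(4),
    nn.Flatten(), nn.Linear(16 * 4 * 4, 128),
)
meta_opt = torch.optim.Adam(seed_net.parameters(), lr=1e-3)

def activation_loss(net, images, targets):
    # Match the network's embedding to (a projection of) the recorded brain response.
    return nn.functional.mse_loss(net(images), targets)

def sample_subject_batch():
    # Placeholder for "expose images to a human while recording their brain":
    # (support, query) pairs of images and recorded responses, entirely synthetic.
    return ((torch.randn(8, 3, 32, 32), torch.randn(8, 128)),
            (torch.randn(8, 3, 32, 32), torch.randn(8, 128)))

for step in range(100):
    meta_opt.zero_grad()
    for _ in range(4):                               # a few subjects per meta-batch
        (sup_x, sup_y), (qry_x, qry_y) = sample_subject_batch()
        fast = copy.deepcopy(seed_net)               # per-subject copy (first-order MAML)
        fast.zero_grad()
        inner_opt = torch.optim.SGD(fast.parameters(), lr=1e-2)
        activation_loss(fast, sup_x, sup_y).backward()
        inner_opt.step()                             # one inner gradient step per subject
        inner_opt.zero_grad()
        activation_loss(fast, qry_x, qry_y).backward()
        # Accumulate the post-adaptation query gradients back onto the seed.
        for p_seed, p_fast in zip(seed_net.parameters(), fast.parameters()):
            p_seed.grad = p_fast.grad.clone() if p_seed.grad is None else p_seed.grad + p_fast.grad
    meta_opt.step()
```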

But happy to get a model into the real world, if you like to work together.

Oh, I'm not working on this. I just thought it was neat research that's underappreciated in this subreddit (learning music tagging from EEGs! using eyetracking to train a DRL agent's attention mechanism! How can you not find these papers bonkers and cool?), and I also wanted to make a point to Anders Sandberg that there's a lot of possible space in between pure DL and brute-force WBE which FHI might want to think about if they ever update their WBE Roadmap. Let's make better use of human brains than just single-labeling cat images or drawing some bounding boxes!

[–]theonlyduffman 0 points1 point  (1 child)

This is an interesting area.

Ultimately, I do think we need to train human imitators in order to mitigate problems with Goodhart's Curse / overoptimization. Whether that's done with human interaction data, brain data, or a combination of the two is an open question.

I've skimmed a few of the papers just now. I think the results are quite far from being powerfully useful for AGI. One way of describing the obstacle is that these kinds of papers often reproduce brain activity at too high and too inaccurate a level of resolution to be useful. If you have an ML model that reproduces brain activity only at a high level and with low reliability, you really don't have a model that will be able to do much hard thinking on new problems. The quite low-dimensional embeddings they use can support some unreliable zero-shot (zero brain-data) results, but those seem quite far from useful. A related way of seeing things is that these models won't capture the important logic of how the brain works because their predictions are too heuristic. As a more general remark, these models seem pretty far from the capability frontier of ordinary ML models. It's plausible that this could all change with higher-res imaging, but I would bet against it.

Independently of this assessment, on a several-year timescale, I'd expect this could be a fruitful way to design awesome lie-detection.

[–]gwern[S] 0 points1 point  (0 children)

It's a little unfair to judge them for not being SOTA in anything. They have had orders of magnitude less effort put into them than standard approaches, after all. There is not anything like an ImageNet of brain semantic annotations. Consider this more akin to DL in 2008 than DL in 2018.

What is interesting is that these prototype approaches work at all. If you had asked me, 'can you use EEG signals to meaningfully improve music or image classification' I would have been amused at the suggestion and said of course not. What could EEG signals possibly convey that the NN or SVM or other algorithm couldn't learn much more easily on its own?

Brain imaging approaches have been increasing exponentially in precision and resolution for decades now, so the trend there is good, independent of the specific lines of research. Plus VR headsets will come online soon. Once you can capture eyetracking with a $500 headset you bought for gaming^WSerious Research Purposes and a few lines of code in Unity, why wouldn't you?

So, I think this is an untapped paradigm that very few people even know is a thing, much less are thinking about what hybrid approaches are possible or running serious large-scale research projects on it, of the kind we more usually talk about in this subreddit.

[–]wassname 0 points1 point  (0 children)

I wonder if you can do the same thing with OpenWorm data, except at a much finer scale. Perhaps this isn't much different from current OpenWorm work, but the focus would be not on functional worm behavior, or on building models that match neurons, but on finding architectures that mimic neuron behavior. Or am I off base here?

[–]gwern[S] 0 points1 point  (0 children)

While I think the idea is still sound, it looks like some of these papers may not be reporting real results, due to not following best practices in brain imaging (such as randomizing stimulus order), allowing carryover and other systematic biases, and thus letting the classifiers achieve high performance through test-set leakage:

"Training on the test set? An analysis of Spampinato et al. [31]", Li et al 2018:

A recent paper [31] claims to classify brain processing evoked in subjects watching ImageNet stimuli as measured with EEG and to use a representation derived from this processing to create a novel object classifier. That paper, together with a series of subsequent papers [8, 15, 17, 20, 21, 30, 35], claims to revolutionize the field by achieving extremely successful results on several computer-vision tasks, including object classification, transfer learning, and generation of images depicting human perception and thought using brain-derived representations measured through EEG. Our novel experiments and analyses demonstrate that their results crucially depend on the block design that they use, where all stimuli of a given class are presented together, and fail with a rapid-event design, where stimuli of different classes are randomly intermixed. The block design leads to classification of arbitrary brain states based on block-level temporal correlations that tend to exist in all EEG data, rather than stimulus-related activity. Because every trial in their test sets comes from the same block as many trials in the corresponding training sets, their block design thus leads to surreptitiously training on the test set. This invalidates all subsequent analyses performed on this data in multiple published papers and calls into question all of the purported results. We further show that a novel object classifier constructed with a random codebook performs as well as or better than a novel object classifier constructed with the representation extracted from EEG data, suggesting that the performance of their classifier constructed with a representation extracted from EEG data does not benefit at all from the brain-derived representation. Our results calibrate the underlying difficulty of the tasks involved and caution against sensational and overly optimistic, but false, claims to the contrary.

Discussion: https://www.reddit.com/r/MachineLearning/comments/a8p0l8/p_training_on_the_test_set_an_analysis_of/
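A synthetic illustration of the confound Li et al. describe: trials that carry only a slowly drifting block-level signal and no stimulus information at all can still be 'classified' far above chance when train and test trials share blocks. The numbers and noise levels here are arbitrary, and the label shuffle is only a crude analogue of a rapid-event design:

```python
# Simulate a block design where all trials of a class come from one block that
# shares a slow drift state; trial-level cross-validation then leaks that state.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n_classes, trials_per_block, n_features = 10, 50, 64

X, y = [], []
for cls in range(n_classes):                        # block design: one class per block
    drift = rng.normal(size=n_features)             # block-specific "brain state"
    for _ in range(trials_per_block):
        X.append(drift + rng.normal(scale=0.5, size=n_features))
        y.append(cls)
X, y = np.array(X), np.array(y)

# Trial-level splits put same-block trials in both train and test: inflated accuracy.
leaky = cross_val_score(LogisticRegression(max_iter=1000), X, y, cv=5).mean()
print(f"block design, trial-level split: {leaky:.2f}")          # far above 1/10 chance

# Rapid-event designs intermix classes, so drift no longer encodes the label;
# shuffling the labels here mimics that decoupling and accuracy collapses.
y_shuffled = rng.permutation(y)
honest = cross_val_score(LogisticRegression(max_iter=1000), X, y_shuffled, cv=5).mean()
print(f"rapid-event analogue: {honest:.2f}")                    # ~0.10
```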
