Sunday, November 06, 2016

AI and media


so we have this thing now where liberals can choose to only read liberal media, and conservatives select conservative media. steve was pointing out that as more content is created by AI, it won't just be liberal vs conservative, but your own micro-customized perspective fed back in your face.

i guess it seems almost inevitable, if the internet is designed to get you "the information you want", then that's whatever matches your views. you could change the objective to giving "the information you *should* have", but then somebody else has to decide what that is. maybe this is the trap all intelligent life falls into, and why we don't see them everywhere in the galaxy?

one glimmer of hope would be if we can build AIs that don't just maximize a value function, but are actually alive in a deeper sense, that they're on the edge of life and death, with a feeling of mystery and wonder.

Thursday, October 06, 2016

not thought through email 2

you know how after you learn an association, if you present one of the two items alone, a representation of the other one appears after a short delay? e.g.,
http://www.nature.com/nature/journal/v354/n6349/abs/354152a0.html
https://elifesciences.org/content/4/e04919

i was thinking about this in the context of the remerge model:
http://content.apa.org/journals/rev/119/3/573

it could be that the dynamics of hippocampus are particularly good for stitching together the topology of space using little chunks of sequence. and more recently in evolution this is also used for stitching together arbitrary concepts.

when two or more concepts are linked together for the first time, this is initially stored in recurrent MTL dynamics, such that presenting either item alone triggers a fast sequence including the other item. (dharsh's idea, if i understand right.)

but maybe, after more learning (or sleep/replay), some PFC neurons, which are repeatedly being exposed to rapid sequential activation of one item after the other, become specifically sensitive to the new compound. as in tea-jelly.

now the tea-jelly cells in PFC can become building blocks for new sequences in MTL. maybe one reason the PFC is huge in humans is that we have to know about crazy amounts of stuff, like global warming and tensorflow. these are things you can't get from a stacked convnet on the visual stream; only from binding multiple things?

this could also fit with the idea that episodic learning and statistical learning are part of the same mechanism. when you learn a new concept, it's originally from a single example or a small number of examples. it would be interesting if the final geometry of the abstract concept, even after it's been well consolidated, still somewhat depends on which examples you first learned it from.


not thought through email 1

re: http://www.annualreviews.org/doi/pdf/10.1146/annurev-psych-122414-033625

would some of the trajectories have to be very long? like they give the example of college acceptance's value being conditional on whether you graduate...

how do you decide what to store in a trace? would a trajectory hop from homework to college application to graduation?

episodic trajectories might be one way of looking at "conflicting beliefs". you could easily have two episodes that start from different starting points and constitute logically incompatible beliefs. like, "everyone should earn a fair wage, therefore we should set a high minimum wage". and, "a higher minimum wage forces employers to reduce full-time employees".

how separable is the state generalization problem from the episodic idea? (although there's the good point about advantages of calculating similarity at decision time.)

they don't talk much (or maybe i missed it) about content-based lookup for starting trajectories. is it basically "for free" that you start trajectories near where you are now (or near where you're simulating yourself to be)? is it like rats where the trajectories are constantly going forward from your current position, maybe in theta rhythm?

i was thinking about a smooth continuum between "episodic" and "statistical/model-based". what if we picture the episodic trajectories as being stored in an RNN. when you experience a new trajectory, you could update the weights of the RNN such that whenever the RNN comes near the starting state of that trajectory, it will probably play out the exact sequence. but you could also update the weights in a different way (roughly: with a smaller update) such that the network is influenced by the statistics of the sequence, but doesn't deterministically play out that particular sequence.
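a rough toy of that continuum (nothing principled, just an asymmetric-hebbian sequence memory in numpy, with all the numbers made up): the same trajectory is written into the weights with a small or a large learning rate, and the dynamics go from being merely biased by it to replaying it almost deterministically from its starting state.

import numpy as np

rng = np.random.default_rng(0)
N = 200                                        # neurons
T = 6                                          # trajectory length
patterns = rng.choice([-1, 1], size=(T, N))    # the new trajectory, as +/-1 states

# pre-existing weights standing in for all the other "statistical" structure
W_background = rng.normal(0, 1.0 / np.sqrt(N), size=(N, N))

def store_trajectory(W, patterns, eta):
    # asymmetric hebbian rule: link each state to its successor
    W = W.copy()
    for t in range(len(patterns) - 1):
        W += eta * np.outer(patterns[t + 1], patterns[t]) / N
    return W

def replay(W, cue, steps):
    # run the dynamics forward from a cue
    x = cue.copy()
    states = [x]
    for _ in range(steps):
        x = np.sign(W @ x)
        states.append(x)
    return states

for eta in [0.2, 5.0]:                         # weak vs strong write
    W = store_trajectory(W_background, patterns, eta)
    states = replay(W, patterns[0], T - 1)
    overlaps = [round(float(states[t] @ patterns[t]) / N, 2) for t in range(T)]
    print("eta =", eta, "overlap with stored trajectory:", overlaps)

with the weak write the replay only drifts a little toward the stored sequence; with the strong write it chains through the whole thing.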

this view is also kind of nice because episodes aren't separate from your statistical learning. some episodes from early in life might even form a big part of your statistical beliefs. especially as the episodes are replayed more and more, over time, from hippocampus out into cortex.

is the sampling of trajectories influenced by current goals?
(e.g., https://elifesciences.org/content/4/e06063)
like, the current goal representations in PFC could be continually exerting pressure on the sampling dynamics to produce outcomes that look like the goal representations?


Tuesday, September 27, 2016

for the previous post, one issue is how the sensory system can maintain both the current inputs and the forward-modeled consequences. maybe one mechanism is to hold them in different phases of oscillations.

Monday, September 26, 2016

mixed selectivity, brain-wide decisions, planning and imagination


laurence hunt has been pointing out how decisions aren't computed in the brain by one region or circuit doing a particular computation and then passing the finished result on to another region or circuit that does a different computation.

instead, millisecond by millisecond, decision-related signals appear almost simultaneously all over cortex:
http://www.markussiegel.net/download/siegel_science_2015.pdf

another example is how downstream circuits can be biased by an upstream calculation before that calculation is finished:
http://www.jneurosci.org/content/32/24/8373.full.pdf

these things make sense intuitively because neurons have a lot of connections to each other. laurence also points out that one of the canonical computations in the brain is lateral inhibition. he outlines a big picture of parallel competition across the brain - with different areas emphasizing competition in different feature spaces:
http://www.nature.com/neuro/journal/v17/n11/abs/nn.3836.html

and this matches up with "mixed selectivity". for example, if you look at the cells that project to midbrain dopamine neurons, you might think these cells would contain cleanly separated representations of the different signals you need to calculate RPE, like reward, expectation, etc. but instead, many of the neurons are somewhat correlated with several of these signals:
http://www.cell.com/neuron/abstract/S0896-6273(16)30510-4
it's a signal smoothie.

stefano fusi and others observe that keeping this high dimensionality is important in order to respond to task demands:
http://www.nature.com/nature/journal/v497/n7451/abs/nature12160.html
http://www.nature.com/neuro/journal/v17/n12/abs/nn.3865.html
i like how this is analogous to life. keeping lots of potential around is probably what life is. i wonder how this connects to mate lengyel's idea that moment-to-moment variability in neural signals encodes uncertainty in the beliefs they represent.

i also like this because it fits exactly with the idea of continuously metabolizing the world. compared to "decision making", i think this is a fundamentally different way of looking at what the brain is doing. for one thing, you generate rather than select actions. i suppose this is good in the real world where the action dimensionality is very high and candidate actions are rarely proposed to you. for another thing, it means that in a "stable state", you can keep emitting actions from your current abstract goal representation. maybe it gets unfolded into different things depending on the current inputs and internal state...

but now we get to the really interesting part. let's start with laurence's model of lots of brain areas processing feature competitions in parallel, with simultaneous influences on one another. what i've been thinking is that this links together planning and imagination in a concrete way.

the key to this, in my current (probably infantile) thinking, is that PFC networks may be very good at holding on to their representations (both because individual neurons have longer time constants, and because of network-level properties). if you think of PFC as being the top of a sensory hierarchy, you could look at it as encoding a very abstract state of the world. or if you think of it as being the top of a motor hierarchy, you could view it as encoding a very abstract action plan. these two things are the same thing.

now, let's say that in the feature competitions, you include some bits of forward model -- i.e., how your potential actions will change the world -- and some modelling of the dynamics of the world itself (let's lump these both under the name "forward model" for now). as the overall state of the brain evolves forwards (under its lateral inhibition dynamics), your current abstract action plan will push the modelled future state of the world toward whatever the predicted consequences are. these predicted consequences are processed through the sensory/state/value hierarchy and compared against current goals. if they match current goals, fine, you continue unfolding your current action plan. if they don't match, then this is where the hyper-stability of PFC comes in. PFC, as the top of the sensory hierarchy, should be pushed toward believing the predicted consequences. but because it's pathologically stable, it doesn't get its own state pushed around. instead, it pushes on the action side of the hierarchy to make the abstract action representations a little bit more like something that produces consequences that match its beliefs.

importantly, when the current goals are not working (maybe signalled by something like low tonic dopamine), the PFC can let go of its hyper-stability and reorganize around new goals/abstract action plans. e.g.:
http://science.sciencemag.org/content/338/6103/135
spiritually speaking, this can feel difficult. the more of the long-term goal-patterns are released, the more it feels like letting go of the self, which is what we're afraid of.

(here's a very ignorable side note. it wouldn't have to be PFC alone that "insists" on goals. this insistence could be distributed over the whole brain too. the PFC idea is just a clean way to visualize the story.)

this process is an energy minimization in activation space (to satisfy as many of the constraints imposed by weights as possible). but it's nice that you never have to compute the energy or its gradient. i have no idea how learning would work in this kind of system but i'm sure people must have worked on it. something hebbian and simple?
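to make that concrete for myself, here's a bare-bones hopfield-style toy (a textbook thing, not a claim about the brain story above): asynchronous updates settle the activations so as to satisfy as many weight-constraints as possible, the implicit energy only ever goes down without being computed, and the learning rule is exactly the hebbian, local kind of thing i was imagining.

import numpy as np

rng = np.random.default_rng(1)
N = 100
patterns = rng.choice([-1, 1], size=(3, N))      # some stored "beliefs"

# hebbian and simple: stamp each pattern into symmetric weights
W = sum(np.outer(p, p) for p in patterns) / N
np.fill_diagonal(W, 0)

def settle(x, n_sweeps=5):
    # asynchronous updates: each unit aligns with its total input, which can
    # only lower the energy E(x) = -0.5 * x @ W @ x, which we never compute
    x = x.copy()
    for _ in range(n_sweeps):
        for i in rng.permutation(len(x)):
            x[i] = 1 if W[i] @ x >= 0 else -1
    return x

# cue with a corrupted version of pattern 0 and let the constraints pull it back
cue = patterns[0].copy()
cue[rng.choice(N, size=20, replace=False)] *= -1
print("overlap with stored pattern:", float(settle(cue) @ patterns[0]) / N)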

so what i've just described is a way of doing planning, right? but you never need any part of the brain doing something like tree search over the whole abstract state space. instead, little bits of forward model nested in various areas can unfold in different kinds of feature spaces, to different lengths of time, but these computations are being influenced by constraints from ongoing computations in other areas.

this could explain pavlovian pruning:
http://journals.plos.org/ploscompbiol/article?id=10.1371%2Fjournal.pcbi.1002410
because if you start simulating bad consequences in one part of the feature space, this suppresses the search. although maybe this kind of explanation is overkill.

i suppose this fits with replay because the dynamics of the brain have learned the statistics of the dynamics of the real world. it makes sense that little bits of the brain are perpetually playing out the little snippets of dynamics that they know about. obviously it's much more complicated with learning and stuff. these are just some stray thoughts.

where does dopamine come in? some more speculative thoughts. if you get a reward prediction error, this means the overall action plan/expectations didn't account for everything, so maybe you need to update your action plan. (or, i could see this going the other way, that a positive RPE means you should stabilize your abstract action plan, and engage more action along it.) this loosely matches with seamans/yang or cohen/braver/brown ideas that dopamine modulates PFC representational stability. rui costa's work seems consistent with this being at some level more abstract than just simple actions, e.g.:
http://www.nature.com/neuro/journal/v17/n3/abs/nn.3632.html

what about "episodic RL"?
http://www.annualreviews.org/doi/pdf/10.1146/annurev-psych-122414-033625
sam and nathaniel's model remembers discrete sequences, but maybe the whole system is on a continuum from non-parametric to parametric. when you experience a sequence/trajectory, this gets written into the weights of your RNNs. if it's written in really strongly, it becomes a path that future activity can almost deterministically follow (if it comes near the starting state). if it's written in weakly, it just influences the future dynamics a bit.

what about striatum? forward models ("what will happen if i do this?") and dynamics models are probably relatively easy to learn: you just match what you observe. i suppose inverse models are harder to learn ("what action should i do?"). but the striatum and habits maybe cache little pieces of inverse model. (i haven't thought through how this fully fits in yet.) it is a nice idea that the continuum between "model-based" and "model-free" has to do with which parts of forward-model-energy-minimization you replace with bits of inverse model.

there are some ML systems that need to be fed a "goal state", like this:
https://arxiv.org/abs/1609.05143
these kinds of systems should mesh well with the metaphors i've been thinking about here.
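i don't know the internals of those systems well, but the structural point is simple: the goal is just another input, held alongside the current state, so the same network unfolds different behaviour depending on which goal is clamped. a minimal made-up sketch (all dimensions and the two-layer architecture are arbitrary):

import numpy as np

rng = np.random.default_rng(2)
state_dim, goal_dim, hidden, n_actions = 16, 16, 32, 4
W1 = rng.normal(0, 0.1, size=(hidden, state_dim + goal_dim))
W2 = rng.normal(0, 0.1, size=(n_actions, hidden))

def policy(state, goal):
    # the "goal state" is concatenated with the current state
    h = np.tanh(W1 @ np.concatenate([state, goal]))
    logits = W2 @ h
    p = np.exp(logits - logits.max())
    return p / p.sum()                 # probabilities over actions

state, goal = rng.normal(size=state_dim), rng.normal(size=goal_dim)
print(policy(state, goal))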

this is getting so schizophrenic, let me close by reflecting on how the building blocks of nervous systems are things like central pattern generators. it's oscillations modulating oscillations.



Friday, September 16, 2016

exploration

once again i'm pretty sure i'm not thinking anything original here, it seems to be a mashup of other people's ideas. but it's me starting to internalize it maybe.

so, let's say in the brain you have an abstract representation of your action-plan, which predicts through forward models the consequences, and which also continuously contrasts these predicted consequences against goals. the system attempts to minimize the energy of its activations continuously, like a big multiple constraint satisfaction problem, with all comparisons/prediction errors happening in parallel.

how does exploration work in this system? as i understand karl's model, he has a term for epistemic value. or, in something like dan russo's information-directed sampling, you try to optimize your knowledge about which actions lead to reward.
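to make "epistemic value" concrete for myself, a tiny made-up discrete example: for each candidate action, ask how much the expected observation would reduce uncertainty about the hidden state (the mutual information between state and observation under that action). an uninformative action scores zero; an "open the door" kind of action scores high.

import numpy as np

def entropy(p):
    p = p[p > 0]
    return -np.sum(p * np.log(p))

p_s = np.array([0.5, 0.5])                    # prior over 2 hidden states

# p(o | s, a): action 0 tells you nothing, action 1 mostly reveals the state
likelihood = {
    0: np.array([[0.5, 0.5],                  # rows: states, cols: observations
                 [0.5, 0.5]]),
    1: np.array([[0.9, 0.1],
                 [0.1, 0.9]]),
}

for a, p_o_given_s in likelihood.items():
    p_o = p_s @ p_o_given_s                   # predictive distribution p(o | a)
    cond_entropy = np.sum(p_s * np.array([entropy(row) for row in p_o_given_s]))
    info_gain = entropy(p_o) - cond_entropy   # mutual information I(state; observation | a)
    print(f"action {a}: expected information gain = {info_gain:.3f}")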

how could i fit this into my intuitive framework?

generally, i think exploration is the drive to metabolize the outside world. why is this a property of living systems? because life is fundamentally dynamic. the energy of that metabolism is what keeps moving the system forward.

so what is exploration in my continuous action-generation system? why isn't it just making the actions that make predictions look most like goals? i guess one possible answer is simply that it's minimizing prediction errors everywhere. on the surface, it seems like this has a problem - that you should avoid exploring because it would generate surprise.

but i had an idea about this - maybe exploration is only relative to something that has *already* caused a prediction error. for example, let's say there's a door in your environment which you've never seen. before you see it, you obviously can't choose to explore behind it. when you see it, *this* is what generates the prediction errors. especially because you know some stuff about doors, and your generative models fill in lots of uncertainty about what's behind the door. your exploration is then to minimize this uncertainty.

and finally, we avoid the darkroom problem because of the dopamine stuff, and hyper-stability in PFC etc.

Tuesday, August 23, 2016


sofie suggested a while ago that rest might be kind of a fundamental thing. the "other" thing besides the whole motivational gradient.

i was thinking about how bad some states can feel, and often in here there's a feeling of "needing to do something". like restlessness.

so aside from having or not having a path to a goal, is there also a separate dimension where you might or might not feel like something needs to be done? could this be related to why bad isn't exactly the negative of good?

Thursday, June 09, 2016

feeling like things are a little unfair


we don't like being taken advantage of. for example, if you always clean up after your housemate, and they never clean up after you, this doesn't sit well.

to avoid being taken advantage of, we roughly track how much we've done for someone and how much they've done for us. it's annoying if someone does this tracking excessively, but i think everybody does it to some degree.

however, this tracking is not an exact science. there's lots of uncertainty. and most importantly, i think we probably tend (more often than not) to give ourselves the benefit of this doubt.

what this means is that if things are "objectively" fair, each person will feel like they're giving slightly more than the other person, in the long run.

so this is probably the feeling we should be expecting, and accepting.

Thursday, May 26, 2016

multi-voxel pattern analysis


A commonly-held idea is that Multi-Voxel Pattern Analysis has something to do with patterns over voxels.

The flames of this misconception are probably fanned by images like this:

[http://www.cogsci.mq.edu.au/research/projects/thebrainthatadapts/]

Because we never saw this kind of image in review papers about single-voxel fMRI analysis, we tend to think MVPA is special in being able to detect patterns like the one colorfully shown.

What do we mean by a "pattern", exactly? In particular, people often get the idea that MVPA is special because it's sensitive to cases where nearby voxels might encode the stimulus in opposite directions. This intuitively fits with the image above.

But the truth is, analyses that treat each voxel independently are perfectly happy to tell you about nearby voxels encoding a stimulus in opposite directions. Suppose you have two stimulus conditions, like face vs. house. At each voxel independently, you can perform an ANOVA against these category labels. If two adjacent voxels encode the categories in opposite directions, the F-statistics at these voxels will both be large.

You can spatially smooth these F-statistics and align them between subjects, and get a statistical map of where in the brain encodes information about faces and houses.

So, even considering one voxel at a time, you can still pick up the pattern of positive and negative encoding shown in the colorful image above.
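Here's a small simulation of that point (made-up data; assumes scipy is available): two voxels encode face vs. house in opposite directions, and a one-way ANOVA at each voxel separately still gives a large F-statistic at both.

import numpy as np
from scipy.stats import f_oneway

rng = np.random.default_rng(0)
n = 50
face_v1, house_v1 = rng.normal(1.0, 1.0, n), rng.normal(-1.0, 1.0, n)   # voxel 1: face > house
face_v2, house_v2 = rng.normal(-1.0, 1.0, n), rng.normal(1.0, 1.0, n)   # voxel 2: house > face

for name, face, house in [("voxel 1", face_v1, house_v1),
                          ("voxel 2", face_v2, house_v2)]:
    F, p = f_oneway(face, house)
    print(f"{name}: F = {F:.1f}, p = {p:.2g}")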

One thing MVPA does do is sacrifice spatial resolution to gain sensitivity. By including multiple voxels, more information about the variable of interest (e.g., faces vs. houses) is pooled together. The tradeoff is that we don't know which voxels in that pool contain information about faces and houses.

Consider multiple regression, a typical MVPA approach. Our prediction of the response y (e.g., faceness vs. houseness) is related to:

beta_1 * voxel_1 + beta_2 * voxel_2 + ... + beta_n * voxel_n

In other words, our prediction is roughly a weighted average of the predictions that the individual voxels make. (Although we can find better beta coefficients with multiple regression than with n separate single-voxel regressions.)

A weighted average can be much better than a prediction from a single voxel - but there's no magical "pattern" information.
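To make the pooling point concrete, here's a toy simulation (made-up data and dimensions): thirty voxels each carry a weak signal about the stimulus; a multiple regression over all of them predicts held-out labels much better than any single voxel, and its prediction is still just a weighted sum of the voxel responses.

import numpy as np

rng = np.random.default_rng(3)
n_trials, n_voxels = 200, 30
y = rng.choice([-1.0, 1.0], n_trials)                            # faceness vs. houseness
X = 0.3 * y[:, None] + rng.normal(size=(n_trials, n_voxels))     # weak signal in every voxel

train, test = slice(0, 100), slice(100, 200)
beta, *_ = np.linalg.lstsq(X[train], y[train], rcond=None)       # the beta_1 ... beta_n above
multi_acc = np.mean(np.sign(X[test] @ beta) == y[test])

single_accs = []
for i in range(n_voxels):
    direction = np.sign(np.corrcoef(X[train, i], y[train])[0, 1])  # orient each voxel on training data
    single_accs.append(np.mean(np.sign(direction * X[test, i]) == y[test]))

print("multi-voxel accuracy:", round(multi_acc, 2),
      "| best single voxel:", round(max(single_accs), 2))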

Conversely, you can use many of the methods typically applied to MVPA to analyze single voxels, including representational similarity analysis. For example, here's a representational dissimilarity matrix constructed from real data from a single MEG sensor:




MVPA does bring some extra benefits if we allow for nonlinearities. Multi-voxel analysis can detect encodings that are invisible to single voxel analysis, like this one:



Neither voxel by itself carries information about red versus blue, but a 2D Gaussian kernel can separate them.

(2D linear classification, unlike 2D linear regression, could perform above chance on these example data, by drawing a boundary that puts all the blue points on one side, and half of the red points on the other side.)
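Here's a simulation of that nonlinear case (made-up data; assumes scikit-learn): "blue" trials form a central cluster, "red" trials surround them, a linear classifier can only manage the above-chance trick described in the parenthetical, and an RBF (Gaussian) kernel classifier separates them almost perfectly.

import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(4)
n = 200
blue = rng.normal(0, 0.5, size=(n, 2))                              # central cluster
angle = rng.uniform(0, 2 * np.pi, n)
red = 2 * np.c_[np.cos(angle), np.sin(angle)] + rng.normal(0, 0.3, size=(n, 2))  # surrounding ring

X = np.vstack([blue, red])
y = np.r_[np.zeros(n), np.ones(n)]
train = rng.permutation(2 * n)[:n]                                  # half the trials for training
test = np.setdiff1d(np.arange(2 * n), train)

for kernel in ["linear", "rbf"]:
    acc = SVC(kernel=kernel).fit(X[train], y[train]).score(X[test], y[test])
    print(f"{kernel} kernel accuracy: {acc:.2f}")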



Friday, April 08, 2016


in relation to the previous post, maybe every thought/belief/perception/narrative inherently has some *value*. it constitutes a motivational gradient.

this would explain why every time a thought comes up, it in some way implies an action.

action vs inhibition


this started from thinking about zen. if you're in sitting meditation, you sometimes catch yourself starting to "do" something. it could be either external (like i need to scratch an itch, i need to adjust my position because my leg hurts, etc), or internal (which is often more subtle, like i'm exciting myself by thinking about some interesting thought).

but the premise of zen is that you don't do that stuff, you just sit. normally, outside of zen, a feeling of "need for action" usually gets discharged immediately as an action (internal or external), which puts us in unconscious loops. not taking the actions breaks the loops, and i think this is part of the value of zen.

but then, i was thinking that sometimes the thing i do habitually is *inaction*. like, i'm afraid to show myself too much, i'm afraid to scream around other people, to look like a baby, or to look stupid by doing something. in that case, would the zen philosophy say that i should try actually *doing* those things?

challenging yourself vs accepting yourself



something i've been interested in recently is the balance between accepting yourself and challenging yourself. maybe we could think of it like a spectrum.

on one extreme, you have accepting yourself in a kind of opium-like way. telling yourself it's ok, focusing on the positive.

closer to the middle, there's a "self-acceptance" that observes that your wants and fears are just carrots and sticks on the treadmill. if you're "motivated" by something, you ask: what's the nature of the thing that's driving me? what's my subjective experience like right now, and what will it be like if i change things to make them how i want? this kind of approach doesn't necessarily prescribe actions per se. i think this is the concept people often associate with zen (but not necessarily what zen teachers teach).

crossing the mid-point of this spectrum, a bit on the other side there's the challenging-yourself approach that says you can always do better. you should specifically identify what you're afraid of happening, and identify what sort of actions you're perpetually doing to keep it from happening. then try not doing those actions (which is very scary), staying conscious through it...

far on the "challenging yourself" end of the spectrum, there's picking a goal (based on what you want at some level), and working toward it concretely, like fixing a cabinet. going for what you want. there's obviously a lot of value in this.

i guess it's often easier, and tempting, to go for the endpoints of this spectrum rather than the middle. maybe the middle is like "passionate equanimity".

but, it's interesting that using the endpoint strategies can paradoxically lead us toward the middle strategies. for example, we might have picked a goal, but now we see that the things blocking us from getting there are our own attachments and anxiety.

UPDATE (2020 May 14): wiki says this about gandhi and WWII. a nice example of challenging vs accepting.

Gandhi's views came under heavy criticism in Britain when it was under attack from Nazi Germany, and later when the Holocaust was revealed. He told the British people in 1940, "I would like you to lay down the arms you have as being useless for saving you or humanity. You will invite Herr Hitler and Signor Mussolini to take what they want of the countries you call your possessions... If these gentlemen choose to occupy your homes, you will vacate them. If they do not give you free passage out, you will allow yourselves, man, woman, and child, to be slaughtered, but you will refuse to owe allegiance to them." George Orwell remarked that Gandhi's methods confronted 'an old-fashioned and rather shaky despotism which treated him in a fairly chivalrous way', not a totalitarian Power, 'where political opponents simply disappear.'
https://en.wikipedia.org/wiki/Mahatma_Gandhi

UPDATE (2021 Jan 17): a more concrete personal example of this duality. say i've done something selfish. on one hand, i might rightly regret it and treat myself like a child who needs some loving discipline, forcing myself to move over an energy barrier in order to see some uncomfortable things about myself and how i'm affecting others. on the other hand, i might accept that i was doing the best i could at the time, like maybe i was feeling really anxious and backed into a corner, or i just didn't have the presence of mind to act better at the time.

how do you know when to do each one? should they come in cycles, like heating and cooling? or are they two different ways of looking at the same thing?