Hi all,
Welcome to the next installment of whatever this is. March turned out to be pretty good for thinking about things (it helped that I had a week off work), so I had plenty of raw material and this has ended up quite long. I’ll rant about some old stuff from Less Wrong, and then I’ll rant about the Spekkens toy model, and then I’ll rant about entropy. Happy Easter!
A Human’s Guide To Words, and the symbol grounding problem
I’ve spent a lot of time this month on my favourite game of ‘trying to understand what the hell Eliezer Yudkowsky is talking about’. This particular iteration of it started off in January when I was catching up on a couple of months of internet, and came across this exchange between Scott Alexander and David Chapman about Yudkowsky’s ‘A Human’s Guide To Words’ sequence on Less Wrong. Scott says:
My own version of this experience was reading Eliezer Yudkowsky’s A Human’s Guide To Words, which caused a bunch of high-level philosophical ideas to slip neatly into place for me. Last week David Chapman wrote about what was clearly the same thing, even centering around the same key example of whether Pluto is a planet. A Gender Studies major I know claims (I can’t confirm) that the same thing is a major part of queer theory too. But Chapman’s version and queer theory don’t make a lot of sense to me; I was able to understand the former only because I already knew what he was talking about, and I have to take any statements about the latter on pure faith.
David Chapman pops up in the comments to say ‘nope, they don’t look the same to me’, and I would agree, but actually isolating the differences clearly has been a bit of a pain, and the upshot of this is that I now have 5000ish words of a ranty unfinished blog post with the troll title ‘The Thing Structure of Clusterspace’ and I’m not sure what to do with it. Looking back after a couple of months, I’m hoping that I can cannibalise it for parts and get something more focussed and charitable out of it. But that requires fixing some bits I don’t understand well.
Back in January when I started writing the post I read all of A Human’s Guide To Words, and unfortunately for me this process feels nothing like ‘a bunch of philosophical ideas slipping neatly into place’. I always find Yudkowsky confusing to read – there are clearly assumptions he has that I don’t share, but it takes me ages to dig out what they are explicitly, rather than just having a nonverbal feeling of ‘something doesn’t make sense’, and I haven’t yet dug them all out.
I focussed on The Cluster Structure of Thingspace where Yudkowsky talks about concepts as empirical clusters in an abstract configuration space, using a robin as an example:
… you can see a robin as a robin—brown tail, red breast, standard robin shape, maximum flying speed when unladen, its species-typical DNA and individual alleles. Or you could see a robin as a single point in a configuration space whose dimensions described everything we knew, or could know, about the robin.
… Different robins will have strong correlations between “volume” and “mass”, so the robin-points will be lined up in a fairly linear string, in those two dimensions—but the correlation won’t be exact, so we do need two separate dimensions.
This is the benefit of viewing robins as points in space: You couldn’t see the linear lineup as easily if you were just imagining the robins as cute little wing-flapping creatures.
I honestly find this passage really baffling. I isolated some areas of disagreement, but never really got them into a coherent argument:
- Empirical clusters are useful, but rely on already having the axes, i.e. you already have to have reified some things into concepts to be able to do this in the first place. This particular post skims over this weirdly quickly, and also hides the issue somewhat by using very clean ‘physicist concepts’ like mass and volume, which are fairly fundamental and have become familiar through constant use, so you’re less likely to think of making this objection.
- I really don’t know what he means by including ‘everything we knew, or could know’. Is he suggesting that this process ends somewhere, with a large finite number of clean formal statements about the robin? That statement is the one where I have the biggest ‘this doesn’t make sense’ nonverbal gap, but until I understand what his position is it’s hard to resolve it.
- How is he dealing with vague, fuzzy categories? ‘Fluffiness’, for example, is not a clean ‘physicist concept’. I could operationalise fluffiness in various numerical ways if I really wanted to (a crude one could be something to do with the average ratio of air to feathers on its surface), but saying that any one of these operationalisations is fluffiness is pretty silly.
- The general problem of categories being fuzzy at the edges, e.g. there being no precise answer for exactly which atoms are in the robin and which aren’t. Tbh I find this one incredibly boring to think or write about, because it’s sort of obvious and I can’t imagine anyone actually being confused about it, but it’s clearly related and I would have to explain why (and possibly I will struggle at that if I have to do it explicitly).
- To be fair, Yudkowsky does have some story about concepts in other posts, this isn’t complete naive empiricism. In ‘How an Algorithm Feels From Inside’ he talks about how humans also give things mental labels. We don’t just think ‘here’s a thing with a red breast and a brown tail that has feathers and can fly’. We give the whole pile of features the label ‘robin’, and talk about that. In his version this is interpreted in terms of a neural network structure, where instead of having a bunch of labels ‘red breast’, ‘brown tail’, ‘feathers’, ‘can fly’ that are all interconnected with each other, there’s a cheap solution where they’re all wired together through a central ‘robin’ node which fires if enough of them are activated. This already does a fair bit of explanatory work: it deals well with non-central examples that manage to fire the central node despite not having all the standard properties, and he’s able to use it to dissolve some dumb arguments that have gone wrong at the labelling level.
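To check my own reading of that, here’s a tiny sketch of the central-node idea – entirely my own toy illustration, nothing from the sequence – where a ‘robin’ label node fires whenever enough of its feature nodes are active:

```python
# Toy illustration (my own, not Yudkowsky's): a central 'robin' node that fires
# when enough of the feature nodes wired into it are active.

def central_node_fires(features, threshold=3):
    """Return True if at least `threshold` of the feature nodes are active."""
    return sum(features.values()) >= threshold

typical_robin = {"red breast": 1, "brown tail": 1, "feathers": 1, "can fly": 1}
weird_robin   = {"red breast": 1, "brown tail": 1, "feathers": 1, "can fly": 0}
aeroplane     = {"red breast": 0, "brown tail": 0, "feathers": 0, "can fly": 1}

print(central_node_fires(typical_robin))  # True
print(central_node_fires(weird_robin))    # True - a non-central example still fires
print(central_node_fires(aeroplane))      # False
```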
The point at which I got bogged down was in understanding how far that network structure goes. Clearly it doesn’t go very far if the nodes are all handed to you – there has to be an account of how we form them in the first place, or we’re straight back to the question of how we picked the axis labels. Trying to understand Yudkowsky’s position on this has been very confusing!
Anyway in March I accidentally happened on an old PDF of his, ‘Levels of Organization in General Intelligence’, which has confused me further. It’s a really interesting document and takes an approach I find somewhat more promising than his later stuff. It dates from before his ‘Bayesian Enlightenment’, when he decided that Bayes was the single principle underlying cognition, and he was thinking in terms of a much more patchwork approach to AI grounded in (afaict) current research on neuroscience etc. It’s also a lot more polished and optimised for looking like a normal person than the rest of his writing from that period (I think he cleaned it up later?) which makes it less fun than the crackpot autodidact stuff, but with the same general no-one-weird-trick outlook.
I asked nostalgebraist, the internet’s foremost Yudkowsky Studies scholar, because I had a feeling he’d written about this sometime, and he pointed me in the direction of this old post of his which I basically agree with.
Now – hopefully I’m starting to get to the point – back then Yudkowsky did seem to be aware of the issue of ‘where the nodes come from’, and refers to this as ‘the symbol grounding problem’, which seems to have been the term of art in AI at the time (maybe still is? I think this particular term comes from this paper by Harnad… anyway afaict it’s something like AI’s version of the old old philosophical problem of how [language, symbols, etc] gain their meaning and probably has many other names. I don’t know the history of that problem at even a superficial level but I’m starting to want to. I’d like to have a broad-strokes understanding of where to place this, but I just don’t at the moment).
It sounds like at this time he was in the general school of ‘symbols have to ground out in our interaction with the world’, and he cites a great paper by Chalmers, French and Hofstadter on this which I really enjoyed reading. This sounds promising! Unfortunately I can’t figure out the specifics of what he’s proposing (not necessarily his fault), and he decides to chuck in his own new autodidact term of ‘holonic conflict resolution’, which does nothing to make things any clearer.
I have no idea how any of this connects to How an Algorithm Feels From Inside and where the nodes come from. I don’t know whether he threw these questions out wholesale with his Bayesian Enlightenment or whether they’re still in there somehow.
Just to confuse things further, I also looked forwards, to what he’s been writing recently. This is mostly here on his Arbital site. There’s one page that’s particularly interesting to me, on something called ‘The ontology identification problem for unreflective diamond maximizers’.
Here he’s interested in the important problem of ontology shifts, where you go from say a classical atomic hypothesis to an underlying quantum-mechanical one. He seems to be going about the problem of representing ontologies with some kind of explicitly structuralist toy problem where:
…as an unrealistic example, imagine that there are little LISP tokens representing environmental objects, and that the environmental-object-type of carbon-objects is encoded by the integer 6. Imagine also that each atom, inside this representation, is followed by a list of the other atoms to which it’s covalently bound. Then when the AI is imagining a carbon atom participating in a diamond, inside the representation we would see an object of type 6, followed by a list containing exactly four other 6-objects.
This is pure structuralism in the sense that you’ve got some thing-in-the-head that represents diamond by virtue of the symbols having the same structural relations to each other that the diamond atoms have out in the world. Obviously he’s chosen it as a toy example and clearly states that it’s unrealistic, but I don’t know, it just looks like such an unpromising research angle to me. I’m generally baffled about what MIRI are doing though and have no intuition for why they’re trying any of the things they’re trying.
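Just to make sure I’m parsing the quoted example correctly, here’s roughly how I picture that representation – my own rendering in Python rather than actual LISP tokens, so the details are invented:

```python
# My own rendering of the quoted toy representation: a carbon atom is an object
# of 'type 6', and a diamond-carbon is bound to exactly four other 6-objects.

class Atom:
    def __init__(self, element_type):
        self.element_type = element_type   # 6 stands in for carbon
        self.bonds = []                    # the other Atoms it's covalently bound to

def looks_like_diamond_carbon(atom):
    """The purely structural criterion: a 6-object bound to four other 6-objects."""
    return (atom.element_type == 6
            and len(atom.bonds) == 4
            and all(b.element_type == 6 for b in atom.bonds))

carbon = Atom(6)
carbon.bonds = [Atom(6), Atom(6), Atom(6), Atom(6)]
print(looks_like_diamond_carbon(carbon))   # True
```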
It’s pretty opaque even to me why I feel this need to spend hours of my free time understanding the position of one person on the internet, but it doesn’t feel pointless. Every time I go round this loop I seem to turn up an interesting paper or grasp something slightly better, so it’s obviously filling in gaps in my own understanding, or surfacing bits of implicit understanding, or something. It’s been absorbing a lot of the same sort of energy that I’d otherwise spend on writing blog posts, though, which is one reason that the blog has gone a bit quiet.
(Another is that a lot of my reading at the moment seems to be in this general how-language-and-symbols-ground-out area, and as I said above it’s a huge field where I have no idea of even the broad outlines and am just not at the point where I want to write anything. I’m not even at the ‘let’s talk excitably about this thing I sort of understand’ level where it’s fun to write. At the same time, a lot of what I wrote about last year is already looking obvious enough to be boring!)
The Spekkens toy model
Last time I talked a bit about this paper by van Enk that I’m concentrating on understanding this year. This paper is an extension of a better-known paper by Robert Spekkens, which is a favourite of mine. It introduces a very simple toy model – literally just four boxes and a rule about how much knowledge we can obtain – which reproduces a number of the counterintuitive features of quantum mechanics (not all of them though). This month I’m just going to introduce the basic setup of the model that’s needed to understand the van Enk model – I won’t really get on to what you can do with it.
None of this section requires an understanding of quantum mechanics. It’s just a thing that could be in one of four boxes.
We can think of these four boxes as being arranged in a fully Ribbonfarm-compliant 2×2 as follows:

The system is in one of these four states (labelled the ontic states as they correspond to what’s actually there). But an observer might not know the exact state, so an observer also has an epistemic state representing their knowledge of the system, which takes the form of a probability distribution over the four states. For example, they might just have no idea what state it’s in:

So in this case their epistemic state just assigns ¼ probability to each box.
The core of the Spekkens model is that the observer is unable to know the true ontic state. This is just assumed as a basic axiom of the model, the Knowledge Balance Principle, which states that
At most you can obtain half the knowledge of the system.
We need to be able to quantify what we mean by knowledge, but fortunately in this case we can pick something pretty simple. There are four boxes, so two yes/no questions would be enough to pick out one ontic state for definite. Spekkens defines three canonical yes/no questions, which are the following:
0 – Is it in the top two boxes?
1 – Is it in the left hand two boxes?
2 – Is it in the two boxes making up the top-left/bottom-right diagonal?
(I’m starting from 0 to match what van Enk does later.)
I’ve drawn the six canonical epistemic states corresponding to all of the possible answers to one question below:

Now if you obtained the answer to any two of these questions, you’d know the ontic state for certain, which is against the Knowledge Balance Principle. So in practice, the principle means that you can only learn the answer to one question.
(The model manages this by shuffling the ontic state around after each question in a way that stops you ever learning everything. That gives rise to some ‘dynamics’ for the system. But that’s not particularly relevant right now.)
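Here’s a quick sketch of the setup so far. The box numbering is my own (1 and 2 across the top row, 3 and 4 across the bottom), but the point is just that each canonical question picks out a set of boxes, one answer always leaves two possibilities, and any two answers together would pin down the ontic state exactly:

```python
from itertools import combinations, product

boxes = {1, 2, 3, 4}  # my own numbering: 1, 2 on the top row, 3, 4 on the bottom

# Each canonical question is the set of boxes for which the answer is 'yes'
questions = {
    0: {1, 2},  # is it in the top two boxes?
    1: {1, 3},  # is it in the left-hand two boxes?
    2: {1, 4},  # is it on the top-left/bottom-right diagonal?
}

def consistent_boxes(answers):
    """Boxes compatible with a dict of {question: yes/no answer}."""
    possible = set(boxes)
    for q, yes in answers.items():
        possible &= questions[q] if yes else boxes - questions[q]
    return possible

# Knowing one answer always leaves two boxes...
print(consistent_boxes({0: True}))   # {1, 2}

# ...but any two answers would pin down the ontic state exactly,
# which is what the Knowledge Balance Principle forbids.
for q1, q2 in combinations(questions, 2):
    for a1, a2 in product([True, False], repeat=2):
        assert len(consistent_boxes({q1: a1, q2: a2})) == 1
print("any two answers single out exactly one box")
```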
Pictures are more fun than algebra, but sometimes algebra is preferable for actually calculating shit. So the van Enk model formalises all this stuff into algebra instead of boxes. He defines two variables X_a and X_b, which can take two values, 0 or 1. I’ve drawn these on the boxes as follows:

So then the canonical questions become
0 – Does X_a = 0?
1 – Does X_b = 0?
2 – Does X_a + X_b = 0?
Now question 2 might look wrong… if X_a = 1 and X_b = 1, as in the bottom-right cell, then X_a + X_b = 2 rather than 0, even though that cell is on the diagonal. But the other feature of the formalisation is that we work mod 2, so that 1 + 1 = 0 and the bottom-right cell gives the right answer after all.
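In code, the algebraic version looks something like this. I’m assuming X_a labels the rows (0 = top) and X_b labels the columns (0 = left), which is the labelling that makes the three questions come out right:

```python
# The four ontic states as (X_a, X_b) pairs, working mod 2.
# Assumed labelling: X_a = 0 for the top row, X_b = 0 for the left column.
states = [(xa, xb) for xa in (0, 1) for xb in (0, 1)]

def answer(question, xa, xb):
    """True/False answer to each canonical question for a given ontic state."""
    if question == 0:
        return xa == 0                 # is it in the top two boxes?
    if question == 1:
        return xb == 0                 # is it in the left-hand two boxes?
    if question == 2:
        return (xa + xb) % 2 == 0      # is it on the top-left/bottom-right diagonal?

for xa, xb in states:
    print((xa, xb), [answer(q, xa, xb) for q in range(3)])
# The bottom-right box (1, 1) answers yes to question 2, because 1 + 1 = 0 mod 2.
```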
Next van Enk introduces some probabilities Q_0, Q_1, Q_2, one for each of the three questions 0 to 2. If we know for sure that the answer to the first question is 0, he writes this as Q_0(0) = 1. (At the same time, we’d also know that Q_0(1) = 0, as the answer definitely isn’t 1.)
The six canonical epistemic states are then formalised as sets of probabilities for the answers to the three questions, where you know one answer for certain and have basically no idea about the other two. E.g. ‘it’s in the top two boxes’ corresponds to the set
Q_0(0) = 1
Q_1(0) = ½
Q_2(0) = ½
This is unnecessarily complicating things for the original Spekkens model, which just uses the six canonical states. For those, you may as well just draw the boxes. But van Enk’s extension of the model is basically something like ‘what if we ask a bit of question 0, a bit of question 1 and a bit of question 2, such that somehow we end up with half the knowledge total?’ In this case there could be all sorts of values for Q_0, Q_1 and Q_2 as long as the knowledge requirement is satisfied.
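To connect the Q notation back to the boxes, here’s a sketch (same assumed labelling as above) that computes Q_0, Q_1 and Q_2 from a probability distribution over the four ontic states – the ‘top two boxes’ state gives exactly the values listed above:

```python
# Compute Q_0, Q_1, Q_2 from a distribution over ontic states (X_a, X_b),
# using the same assumed labelling as before (X_a = 0 top row, X_b = 0 left column).

def Q(dist):
    """dist maps (xa, xb) -> probability; returns P(answer = 0) for each question."""
    return {
        0: sum(p for (xa, xb), p in dist.items() if xa == 0),
        1: sum(p for (xa, xb), p in dist.items() if xb == 0),
        2: sum(p for (xa, xb), p in dist.items() if (xa + xb) % 2 == 0),
    }

# 'It's in the top two boxes': probability 1/2 on each of the two top boxes.
top_two = {(0, 0): 0.5, (0, 1): 0.5, (1, 0): 0.0, (1, 1): 0.0}
print(Q(top_two))   # {0: 1.0, 1: 0.5, 2: 0.5} - i.e. Q_0(0) = 1, Q_1(0) = 1/2, Q_2(0) = 1/2

# The completely ignorant state gives 1/2 for every question.
uniform = {s: 0.25 for s in [(0, 0), (0, 1), (1, 0), (1, 1)]}
print(Q(uniform))   # {0: 0.5, 1: 0.5, 2: 0.5}
```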
However at this point, we also need to get more sophisticated about the definition of knowledge. This is the tricky bit of the van Enk paper and I spent some time thinking about it this month…
Entropy
I have a bad habit of getting interested in one thing, getting a short way with it, and then jumping focus to something else, or just following a rabbit hole til I’m miles away from the original topic. I’ve got ‘explore’ down, but have never really cracked the ‘exploit’ bit. It’s not as bad as it could be, because there’s a fairly small set of things in physics that I like to jump between, so eventually I end up back at the same one again. I’d like to try having more focus though.
That’s one of the reasons I’m trying something different this year. The van Enk paper is 7 pages long and pretty self-contained, and I expect to spend all year thinking about it (there’s enough going on in those 7 pages that that should be possible). Of course nothing is ever completely self-contained, and all papers leak somewhere, so there is still the danger of following the leaks down some pretty convoluted rabbit holes.
The paper has one bad leak. As described in the previous section, the model depends on only being able to access ‘half the information’ in the toy system. In the original Spekkens model, this just means ‘knowing that the state is in one of two boxes, out of four total’, so this isn’t really a very complicated concept. The van Enk model extends this to partial knowledge, so you could for example know a little about all four states and somehow have it all add up to half the information. This requires a lot more poking around working out exactly how to quantify ‘half the information’.
The most obvious way of quantifying information would be to use the Shannon entropy. This paper doesn’t do that. Instead, it mostly works with a more general version of entropy, the Rényi entropy, which is actually more like a family of entropies that includes the Shannon entropy as a special case. It also includes another special case called the collision entropy, and that’s the one that produces the really interesting, ‘quantum-like’ model with the negative probabilities and the right Bell’s theorem violation and so forth.
I haven’t spent huge amounts of time on this, and I’m not very used to thinking about entropy in the information sense rather than the traditional thermodynamics sense, but the underlying idea of the Rényi entropy is something like this. We’re trying to distribute probabilities between four boxes, and the total will add up to 1. (Forget the ‘half the information’ constraint for now, this is just any way of carving up the probabilities between boxes.) We could write this as a vector, for example [0.2, 0.4, 0.3, 0.1].
If your vector is [0.25, 0.25, 0.25, 0.25] then you know nothing at all about which box it’s in and your probability is completely spread out. If it’s [0, 0, 1, 0], you know for sure that it’s in the third box and your probability is as concentrated as possible. [0.2, 0.4, 0.3, 0.1] is somewhere in between.
Entropy is a measure of how concentrated the probability is. We want highly concentrated systems to have low entropy, and highly spread out systems to have high entropy.
The Rényi entropy uses a measure of concentration called ‘majorisation’, which works in the following way:
1 – Reorder the elements of the vector in descending order:
[1 0 0 0]
[0.4 0.3 0.2 0.1]
[0.25 0.25 0.25 0.25]
2 – Compute partial sums (i.e. the sums of the first n elements in the vector):
[1 1 1 1]
[0.4 0.7 0.9 1]
[0.25 0.5 0.75 1]
If every partial sum of one vector x is at least as big as the corresponding partial sum of another vector y, we say that x ‘majorises’ y. [1 0 0 0] majorises the others because each of its partial sums is at least as high as the corresponding one for the other vectors – it gets to 1 immediately and stays there. Whereas [0.25 0.25 0.25 0.25] is majorised by everything – it’s the slowest way to get to 1.
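That check is easy to put into code:

```python
import numpy as np

def partial_sums(v):
    """Sort in descending order, then take the running (partial) sums."""
    return np.cumsum(sorted(v, reverse=True))

def majorises(x, y):
    """True if every partial sum of x is at least the corresponding one of y."""
    return bool(np.all(partial_sums(x) >= partial_sums(y)))

certain  = [1, 0, 0, 0]
middling = [0.2, 0.4, 0.3, 0.1]
uniform  = [0.25, 0.25, 0.25, 0.25]

print(partial_sums(middling))        # [0.4 0.7 0.9 1. ], up to floating point noise
print(majorises(certain, middling))  # True  - [1, 0, 0, 0] majorises everything
print(majorises(middling, uniform))  # True
print(majorises(uniform, middling))  # False - the uniform vector is majorised by everything
```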
If vector x majorises vector y, it’s more concentrated and should have lower entropy – writing the entropy as H because people always do, you want H(x) ≤ H(y). Functions of this sort are called Schur-concave, apparently.
Anyway, the Rényi entropies all have this same Schur-concave behaviour. They depend on a free parameter p, and roughly speaking the larger p is, the more weight it gives to the more probable states. More precisely, the Rényi entropy is related to the p-norm… the Shannon entropy is the p=1 case (strictly, the p → 1 limit) and the collision entropy is the p=2 case. Those are the two norms that come up all the time, and the 2-norm is all over the place in quantum physics, so this looks quite nice!
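To pin the definition down for myself (the formula isn’t spelled out above, this is just the standard one): the Rényi entropy of order p is H_p(x) = log( Σ_i x_i^p ) / (1 - p), with the Shannon entropy recovered in the p → 1 limit and the collision entropy at p = 2. A quick sketch:

```python
import numpy as np

def renyi_entropy(probs, p):
    """Rényi entropy H_p(x) = log2(sum_i x_i**p) / (1 - p), in bits.
    The p = 1 case is taken as the limit, which gives the Shannon entropy."""
    probs = np.asarray(probs, dtype=float)
    probs = probs[probs > 0]                      # convention: 0 log 0 = 0
    if np.isclose(p, 1.0):
        return -np.sum(probs * np.log2(probs))    # Shannon entropy
    return np.log2(np.sum(probs ** p)) / (1 - p)

x = [0.2, 0.4, 0.3, 0.1]
print(renyi_entropy(x, 1))            # Shannon entropy
print(renyi_entropy(x, 2))            # collision entropy: -log2 of the sum of squares
print(renyi_entropy([0.25] * 4, 2))   # maximally spread out: 2 bits, the maximum
print(renyi_entropy([1, 0, 0, 0], 2)) # maximally concentrated: 0 bits (numpy may print -0.0)
```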
I’m still pretty hazy on exactly why this quantity might be good as an entropy measure for quantum mechanics, though… van Enk quotes a paper that argues for it, but I don’t find that paper very clear, and I then found this paper that’s critical of it, and basically it looks like there’s a fair bit of controversy there. This could be an annoying rabbit hole!
Final thoughts
OK, well if you’ve actually read all of these ramblings you are probably some kind of April fool 🙂
Seriously though, thanks again for following along with this experiment. No idea what the reading experience is like, but I’m definitely enjoying writing these and plan to do at least a few more. I’m impressed at how well it’s motivating me to do things like reviewing and writing up, which I normally think of as very dull and unappealing.
I started an extremely basic diary for the month in a Google Doc to keep track of what I was doing, which is more than I can normally be bothered to do. Most days I wrote a very short comment, but a couple of times a long spontaneous rant appeared which I could stick into here with a few edits. So some of the work was done before the end-of-the-month deadline.
Next month is probably going to be a low effort sort of consolidation month, where I write a load of stuff up and don’t try and push forward much. Also the days are getting longer, so I want to spend more of my weekends out walking and less time churning through old crap on the internet. I have a fair bit of leftover material from March that I didn’t get around to, so I may just end up writing about that.
Again, if this isn’t what you expected please let me know and I’ll unsubscribe you! And if you have any questions or comments, I’d like to hear them…
Cheers,
Lucy