Bell’s theorem and Mermin’s machine

> Anybody who’s not bothered by Bell’s theorem has to have rocks in his head.

— ‘A distinguished Princeton physicist’, as told to David Mermin

This post is a long, idiosyncratic discussion of the Bell inequalities in quantum physics. There are plenty of good introductions already, so this is a bit of a weird thing to spend my time writing. But I wanted something very specific, and couldn’t find an existing version that had all the right pieces. So of course I had to spend far too much time making one.

My favourite introduction is Mermin’s wonderful Quantum Mysteries for Anyone. This is an absolute classic of clear explanation, and lots of modern pop science discussions derive from it. It’s been optimised for giving a really intense gut punch of NOTHING IN THE WORLD MAKES SENSE ANY MORE, which I’d argue is the main thing you want to get out of learning about the Bell inequalities.

However, at some point if you get serious you’ll want to actually calculate things, which means you’ll need to make the jump from Mermin’s version to the kind of exposition you see in a textbook. The most common modern version of the Bell inequalities you’ll see is the CHSH inequality, which looks like this:

| E(a,b) + E(\bar{a}, b) + E(a, \bar{b}) - E(\bar{a}, \bar{b}) | \leq 2

(It doesn’t matter what all of that means at the moment… I’ll get to that later.) The standard derivations of this tend to involve a lot of fussing with algebraic rearrangements and integrals full of \lambdas and so forth. The final result is less of a gut punch and more of a diffuse feeling of unease: "well I guess this number has to be between -2 and 2, but it isn’t".

This feels like a problem to me. There’s a 1929 New Yorker cartoon which depicts ordinary people in the street walking around dumbstruck by Einstein’s theory of general relativity. This is a comic idea because the theory was famously abstruse (particularly back then when good secondary explanations were thin on the ground). But the Bell inequalities are accessible to anyone with a very basic knowledge of maths, and weirder than anything in relativity. I genuinely think that everyone should be walking down the street clutching their heads in shock at the Bell inequalities, and a good introduction should help deliver you to this state. (If you don’t have rocks in your head, of course. In that case nothing will help you.)

The CHSH inequality is also a bit of an opaque black box. For example, why is there a minus sign in front of one of the Es but not the others? I was in a discussion group a few years back with a bunch of postdocs and PhD students, all of us with a pretty strong interest in quantum foundations, and CHSH came up at some point. None of us had much of a gut sense for what that minus sign was doing… it was just something that turned up during some algebra.

I wanted to trace a path from Mermin’s explanation to the textbook one, in the hope of propagating some of that intuitive force forward. I wrote an early draft of the first part of this post for a newsletter in 2018 but couldn’t see how to make the rest of it work, so I dropped it. This time I had a lot more success using some ideas I learned in the meantime. I ended up taking a detour through a third type of explanation, the ‘logical Bell inequalities’ approach of Abramsky and Hardy. This is a general method that can be used on a number of other similar ‘no-go theorems’, not just Bell’s original. It gives a lot more insight into what’s actually going on (including that pesky minus sign). It’s also surprisingly straightforward: the main result is a few steps of propositional logic.

That bit of propositional logic is the most mathematically involved part of this post. The early part just requires some arithmetic and the willingness to follow what Mermin calls ‘a simple counting argument on the level of a newspaper braintwister’. No understanding of the mathematics of quantum theory is needed at all! That’s because I’m only talking about why the results of quantum theory are weird, and not how the calculations that produce those results are done.

If you also want to learn to do the calculations, starting from a basic knowledge of linear algebra and complex numbers, I really like Michael Nielsen and Andy Matuschak’s Quantum Country, which covers the basic principles of quantum mechanics and also the Bell inequalities. You’d need to do the ‘Quantum computing for the very curious’ part, which introduces a lot of background ideas, and then the ‘Quantum mechanics distilled’ part, which has the principles and the Bell stuff.

There’s also nothing about how the weirdness should be interpreted, because that is an enormous 90-year-old can of rotten worms and I would like to finish this post some time in my life 🙂

Mermin’s machine

So, on to Mermin’s explanation. I can’t really improve on it, and it would be a good idea to go and read that now instead, and come back to my version afterwards. I’ve repeated it here anyway though, partly for completeness and partly because I’ve changed some notation and other details to mesh better with the Abramsky and Hardy version I’ll come to later.

(Boring paragraph on exactly what I changed, skip if you don’t care: I’ve switched Mermin’s ‘red’ and ‘green’ to ‘true’ and ‘false’, and the dial settings from 1,2,3 on both sides to a, a', a'' on the left side and b, b', b'' on the right side. I’ve also made one slightly more substantive change. Mermin explains at the end of his paper that in his setup, ‘One detector flashes red or green according to whether the measured spin is along or opposite to the field; the other uses the opposite color convention’. I didn’t want to introduce the complication of having the two detectors with opposite wiring, and have made them both respond the same way, flashing T for along the field and F for opposite. But I also wanted to keep Mermin’s results. To do that I had to change the dial positions of the right hand dial, so that a is opposite b, a' is opposite b', and a'' is opposite b''. )

Anyway, Mermin introduces the following setup:

[Figure: Mermin’s machine – a source in the middle, firing particles at a detector on each side.]
The machine in the middle is the source. It fires out some kind of particle – photons, electrons, frozen peas, whatever. We don’t really care how it works, we’ll just be looking at why the results are weird.

The two machines on the right and left side are detectors. Each detector has a dial with three settings. On the left they’re labelled a, a' and a''. On the right, they’re b, b' and b''.

On the top of each are two lights marked T and F for true and false. (Again, we don’t really care what’s true or false, we’re keeping everything at a kind of abstract, operational level and not going into the practical details. It’s just two possible results of a measurement.)

It’s vital to this experiment that the two detectors cannot communicate at all. If they can, there’s nothing weird about the results. So assume that a lot of work has gone into making absolutely sure that the detectors are definitely not sharing information in any way at all.

Now the experiment just consists of firing out pairs of particles, one to each detector, with the dials set to different values, and recording which light (T or F) flashes on each detector. So you get a big list of results of the form

ab'TF, a''bFT, a'b'FF, ...

The second important point, other than the detectors not being able to communicate, is that you have a free choice of setting the dials. You can set them both beforehand, or when the particles are both ‘in flight’, or even set the right hand dial after the left hand detector has already received its particle but before the right hand particle gets there. It doesn’t matter.

Now you do like a million billion runs of this experiment, enough to convince you that the results are not some weird statistical fluctuation, and analyse the results. You end up with the following table:


Dial setting (T,T) (T,F) (F,T) (F,F)
ab 1/2 0 0 1/2
ab' 1/8 3/8 3/8 1/8
ab'' 1/8 3/8 3/8 1/8
a'b 1/8 3/8 3/8 1/8
a'b' 1/2 0 0 1/2
a'b'' 1/8 3/8 3/8 1/8
a''b 1/8 3/8 3/8 1/8
a''b' 1/8 3/8 3/8 1/8
a''b'' 1/2 0 0 1/2

Each dial setting has a row, and the entries in that row give the probabilities for getting the different results. So for instance if you set the dials to a' and b, there’s a 1/8 chance of getting (T,T).

This doesn’t obviously look particularly weird at first sight. It only turns out to be weird when you start analysing the results. Mermin condenses two results from this table which are enough to show the weirdness. The first is:

Result 1: This result relates to the cases where the two dials are set to ab, a'b', or a''b''. In these cases both lights always flash the same. So you might get ab TT, ab FF, a'b' TT etc, but never ab TF or a''b'' FT.

This is pretty easy to explain. The detectors can’t communicate, so if they do the same thing it must be something to do with the properties of the particles they are receiving. We can explain it straightforwardly by postulating that each particle has an internal state with three properties, one for each dial position. Each of these takes two possible values which we label T or F. We can write these states as e.g.

TTF

TTF

where the entries on the top line refer to the left hand particle’s state when the dial is in the a, a' and a'' positions respectively, and the bottom line refers to the right hand particle’s state when the dial is in the b, b', b'' positions.

Result 1 implies that the states of the two particles must always be the same. So the state above is an allowed one, but e.g.

TTF

TFF

isn’t.

Mermin says:

> This hypothesis is the obvious way to account for what happens in [Result 1]. I cannot prove that it is the only way, but I challenge the reader, given the lack of connections between the devices, to suggest any other.

Because the second particle will always have the same state as the first one, I’ll save some typing and just write the first one out as a shorthand. So the first example state will just become TTF.

Now on to the second result. This one covers the remaining options for dial settings, ab', a''b and the like.

Result 2: For the remaining dial settings, the lights flash the same 1/4 of the time, and different 3/4 of the time.

This looks quite innocuous on first sight. It’s only when you start to consider how it meshes with Result 1 that things get weird.

(This is the part of the explanation that requires some thinking ‘on the level of a newspaper braintwister’. It’s fairly painless and will be over soon.)

Our explanation for Result 1 is that particles in each run of the experiment have an underlying state, and both particles have the same state. Let’s go through the implications of this, starting with the example state TTF.

I’ve enumerated the various options for the dials in the table below. For example, if the left dial is a and the right dial is b', we know that the left detector will light up T and the right will light up T, so the two lights are the same.


Dial setting Lights
ab' same
ab'' different
a'b same
a'b'' different
a''b different
a''b' different

Overall there’s a 1/3 chance of being the same and a 2/3 chance of being different. You can convince yourself that this is also true for all the states with two Ts and an F or vice versa: TTF, TFF, TFT, FTT, FTF, FFT.

That leaves TTT and FFF as the other two options. In those cases the lights will flash the same no matter what the dials are set to.

So whatever the underlying state is, the chance of the two lights being the same is at least ⅓. But this is incompatible with Result 2, which says that the probability is ¼.

(The thinky part is now done.)
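
If you’d rather outsource the braintwister to a computer, here’s a quick brute-force version of the same argument (my own sketch, not part of Mermin’s paper). It enumerates all eight possible hidden states and counts how often the two lights agree over the six unpaired dial settings:

```python
from itertools import product

# Indices 0, 1, 2 stand for a, a', a'' on the left and b, b', b'' on the right.
# These are the six dial settings covered by Result 2.
unpaired = [(0, 1), (0, 2), (1, 0), (1, 2), (2, 0), (2, 1)]

for state in product('TF', repeat=3):   # both particles share this state
    same = sum(state[left] == state[right] for left, right in unpaired)
    print(''.join(state), same / len(unpaired))
# Every state gives 1/3 or 1.0, so p(same) >= 1/3 -- never the 1/4 of Result 2.
```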

So Results 1 and 2 together are completely bizarre. No assignment of states will work. But this is exactly what happens in quantum mechanics!

You probably can’t do it with frozen peas, though. The details don’t matter for this post, but here’s a very brief description if you want it: the particles should be two spin-half particles prepared in a specific ‘singlet’ state, the dials should connect to magnets that can be oriented in three states at 120 degree angles from each other, and the lights on the detectors measure spin along and opposite to the field. The magnets should be set up so that the state for setting a on the left hand side is oriented at 180 degrees from the state for setting b on the right hand side; similarly a' should be opposite b' and a'' opposite b''. I’ve drawn the dials on the machine to match this. Quantum mechanics then says that the probabilities of the different results are

p(T,T) = p(F,F) = \frac{1}{2}\cos^2{\frac{\theta}{2}}

p(T,F) = p(F,T) = \frac{1}{2}\sin^2{\frac{\theta}{2}}

where \theta is the angle between the left-hand magnet orientation and the reverse of the right-hand one (remember the two sides are offset by 180 degrees, so \theta = 0 for the paired settings ab, a'b' and a''b'', and 120 degrees for every other combination). This reproduces the numbers in the table above.
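
Here’s a tiny Python sketch (mine, not Mermin’s) plugging those two angles into the formulas, as a sanity check against the table:

```python
import math

def p_same(theta_deg):
    """p(T,T) = p(F,F) = (1/2) cos^2(theta/2)."""
    return 0.5 * math.cos(math.radians(theta_deg) / 2) ** 2

def p_diff(theta_deg):
    """p(T,F) = p(F,T) = (1/2) sin^2(theta/2)."""
    return 0.5 * math.sin(math.radians(theta_deg) / 2) ** 2

# Paired settings (ab, a'b', a''b''): theta = 0
print(p_same(0), p_diff(0))      # 0.5 0.0
# All other settings (ab', a''b, ...): theta = 120
print(p_same(120), p_diff(120))  # 0.125 0.375
```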

Once more with less thinking

Mermin’s argument is clear and compelling. The only problem with it is that you have to do some thinking. There are clever details that apply to this particular case, and if you want to do another case you’ll have to do more thinking. Not good. This is where Abramsky and Hardy’s logical Bell approach comes in. It requires more upfront setup (so actually more thinking in the short term – this section title is kind of a lie, sorry) but can then be applied systematically to all kinds of problems.

This first involves reframing the entries in the probability table in terms of propositional logic. For example, we can write the result (T,F) for (a',b) as a' \land \lnot b. Then the entries of the table correspond to the probabilities we assign to each statement: in this case, \text{prob}(a' \land \lnot b) = \frac{3}{8}.

Now, look at the bracketed cells in three rows of the grid:


Dial setting (T,T) (T,F) (F,T) (F,F)
ab [1/2] 0 0 [1/2]
ab' 1/8 3/8 3/8 1/8
ab'' 1/8 3/8 3/8 1/8
a'b 1/8 3/8 3/8 1/8
a'b' [1/2] 0 0 [1/2]
a'b'' 1/8 3/8 3/8 1/8
a''b 1/8 3/8 3/8 1/8
a''b' 1/8 3/8 3/8 1/8
a''b'' [1/2] 0 0 [1/2]

These correspond to the three propositions

\phi_1 = (a\land b) \lor (\lnot a \land\lnot b)

\phi_2 = (a'\land b') \lor (\lnot a' \land\lnot b')

\phi_3 = (a''\land b'') \lor (\lnot a'' \land\lnot b'') ,

which can be written more simply as

\phi_1 = a \leftrightarrow b

\phi_2 = a' \leftrightarrow b'

\phi_3 = a'' \leftrightarrow b''.

where the \leftrightarrow stands for logical equivalence. This also means that a can be substituted for b, and so on, which will be useful in a minute.

Next, look at the bracketed cells in these three rows:


Dial setting (T,T) (T,F) (F,T) (F,F)
ab 1/2 0 0 1/2
ab' 1/8 [3/8] [3/8] 1/8
ab'' 1/8 [3/8] [3/8] 1/8
a'b 1/8 3/8 3/8 1/8
a'b' 1/2 0 0 1/2
a'b'' 1/8 [3/8] [3/8] 1/8
a''b 1/8 3/8 3/8 1/8
a''b' 1/8 3/8 3/8 1/8
a''b'' 1/2 0 0 1/2

These correspond to

\phi_4 = (a\land \lnot b') \lor (\lnot a \land b')

\phi_5 = (a\land \lnot b'') \lor (\lnot a \land b'')

\phi_6 = (a'\land \lnot b'') \lor (\lnot a' \land b'') ,

which can be simplified to

\phi_4 = a \oplus b'

\phi_5 = a \oplus b''

\phi_6 = a' \oplus b''.

where the \oplus stands for exclusive or.

Now it can be shown quite quickly that these six propositions are mutually contradictory. First use the first three propositions to get rid of b, b' and b'', leaving

a \oplus a'

a \oplus a''

a' \oplus a''

You can check that these are contradictory by drawing out the truth table, or maybe just by looking at them, or maybe by considering the following stupid dialogue for a while (this post is long and I have to entertain myself somehow):


Grumpy cook 1: You must have either beans or chips but not both.

Me: OK, I’ll have chips.

Grumpy cook 2: Yeah, and also you must have either beans or peas but not both.

Me: Fine, looks like I’m having chips and peas.

Grumpy cook 3: Yeah, and also you must have either chips or peas but not both.

Me:

Me: OK let’s back up a bit. I’d better have beans instead of chips.

Grumpy cook 1: You must have either beans or chips but not both.

Me: I know. No chips. Just beans.

Grumpy cook 2: Yeah, and also you must have either beans or peas but not both.

Me: Well I’ve already got to have beans. But I can’t have them with chips or peas. Got anything else?

Grumpy cook 3: NO! And remember, you must have either chips or peas.

Me: hurls tray
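
Or you can make a computer draw out the truth table (my sketch):

```python
from itertools import product

# Look for any assignment satisfying a XOR a', a XOR a'', a' XOR a''.
sat = [bits for bits in product([False, True], repeat=3)
       if bits[0] != bits[1] and bits[0] != bits[2] and bits[1] != bits[2]]
print(sat)   # [] -- no assignment works: the three XORs are contradictory
```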


So, yep, the six propositions are inconsistent. But this wouldn’t necessarily matter, as some of the propositions are only probabilistically true. So you could imagine that, if you carefully set some of them to false in the right ways in each run, you could avoid the contradiction. However, we saw with Mermin’s argument above that this doesn’t save the situation – the propositions have ‘too much probability in total’, in some sense, to allow you to do this. Abramsky and Hardy’s logical Bell inequalities will quantify this vague ‘too much probability in total’ idea.

Logical Bell inequalities

This bit involves a few lines of logical reasoning. We’ve got a set of propositions \phi_i (six of them in this example case, N in general), each with probability p_i. Let P be the probability of all of them happening together. Call this combined statement

\Phi = \bigwedge_i \phi_i.

Then

1 - P = \text{prob}\left( \lnot\Phi\right) = \text{prob}\left(\bigvee_i \lnot\phi_i\right)

where the second equality is de Morgan’s law. The probability of this disjunction is at most the sum of the probabilities of all the \lnot\phi_i s (the union bound):

1 - P \leq \sum_i \text{prob}(\lnot\phi_i)

= \sum_i (1 - p_i)

= N - \sum_i p_i .

where N is the total number of propositions. Rearranging gives

\sum_i p_i \leq N + P - 1.

Now suppose the \phi_i are jointly contradictory, as in the Mermin example above, so that the combined probability P = 0. This gives the logical Bell inequality

\sum_i p_i \leq N-1 .

This is the precise version of the ‘too much probability’ idea. In the Mermin case, there are six propositions, three with probability 1 and three with probability ¾, which sum to 5.25. This is greater than N-1 = 5, so the inequality is violated.
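
In code, the check is a one-liner (my sketch):

```python
p = [1, 1, 1, 3/4, 3/4, 3/4]          # the six Mermin propositions
print(sum(p), sum(p) <= len(p) - 1)   # 5.25 False -- the inequality is violated
```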

This inequality can be applied to lots of different setups, not just Mermin’s. Abramsky and Hardy use the CHSH inequality mentioned in the introduction to this post as their first example. This is probably the most common example used to introduce Bell’s theorem, though the notation is usually somewhat different. I’ll go through Abramsky and Hardy’s version and then connect it back to the standard textbook notation.

The CHSH inequality

The CHSH experiment only uses two settings on each side, not three. I’ve drawn a ‘CHSH machine’ in the style of Mermin’s machine to illustrate it:

[Figure: the CHSH machine – the same layout as Mermin’s machine, but with only two dial settings on each side.]
There are two settings a and \bar{a} on the left side, 60 degrees apart. And there are two settings b and \bar{b} on the right side, also 60 degrees apart, with b opposite a. This leads to the following table:


Dial setting (T,T) (T,F) (F,T) (F,F)
ab [1/2] 0 0 [1/2]
a\bar{b} [3/8] 1/8 1/8 [3/8]
\bar{a}b [3/8] 1/8 1/8 [3/8]
\bar{a}\bar{b} 1/8 [3/8] [3/8] 1/8

Now it’s just a case of following the same reasoning as for the Mermin case. The bracketed cells correspond, row by row, to the propositions

\phi_1 = (a \land b) \lor (\lnot a \land \lnot b) = a \leftrightarrow b

\phi_2 = (a \land \bar{b}) \lor (\lnot a \land \lnot \bar{b}) = a \leftrightarrow \bar{b}

\phi_3 = (\bar{a} \land b) \lor (\lnot \bar{a} \land \lnot b) = \bar{a} \leftrightarrow b

\phi_4 = (\lnot \bar{a} \land \bar{b}) \lor (\bar{a} \land \lnot \bar{b}) = \bar{a} \oplus \bar{b}

As with Mermin’s example, these four propositions can be seen to be contradictory. Rather than trying to make up more stupid dialogues, I’ll just follow the method in the paper. First use \phi_3 to replace \bar{a} with b in \phi_4:

\phi_4 = b \oplus \bar{b} .

Then use \phi_1 to swap out b again, this time with a:

\phi_4 = a \oplus \bar{b} .

Finally use \phi_2 to swap out a with \bar{b}, leaving

\bar{b} \oplus \bar{b}

which is clearly contradictory.

(Sidenote: I guess these sort of arguments to show a contradiction do involve some thinking, which is what I was trying to avoid earlier. But in each case you could just draw out a truth table, which is a stupid method that a computer could do. So I think it’s reasonable to say that this is less thinking than Mermin’s method.)
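
Here’s a computer doing exactly that for the four CHSH propositions (my sketch again, not from the paper):

```python
from itertools import product

def all_four(a, b, a_bar, b_bar):
    # phi_1, phi_2, phi_3 are biconditionals; phi_4 is an exclusive or
    return (a == b) and (a == b_bar) and (a_bar == b) and (a_bar != b_bar)

print(any(all_four(*bits) for bits in product([False, True], repeat=4)))  # False
```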

Again, this violates the logical Bell inequality. In total, we have

\sum_i p_i = 1 + \frac{3}{4}  + \frac{3}{4}  + \frac{3}{4} = 3.25 > 3.

The textbook version of this inequality is a bit different. For a start, it uses an ‘expectation value’ for each proposition rather than a straightforward probability, where truth is associated with +1 and falsity with -1. So each proposition \phi_i has an expectation value E_i with

E_i = (+1)\cdot p_i + (-1)\cdot (1-p_i) = 2p_i -1.

Then summing over the E_is gives

\sum_i E_i = \sum_i (2p_i-1) = 2\sum_i p_i - N

and then, using the previous form of the logical Bell inequality,

\sum_i E_i \leq 2(N-1) - N = N-2.

A similar argument for -E_i shows that \sum_i E_i \geq -(N-2), so that this is a bound above and below:

|\sum_i E_i| \leq N - 2.

In this case N = 4 and so the inequality becomes |\sum_i E_i| \leq 2. However adding up the E_is associated to the propositions \phi_i gives 2.5, so the inequality is violated.

There’s still a little further to go to get the textbook version, but we’re getting close. The textbook version writes the CHSH inequality as

| E(a,b) + E(\bar{a}, b) + E(a, \bar{b}) - E(\bar{a}, \bar{b}) | \leq 2,

where the expectation value is written in the form

E(a,b) = \int A(a,\lambda) B(b, \lambda)\rho(\lambda) d\lambda.

The \lambda are ‘hidden variables’ – properties of the particles that dispose them to act in various ways. For example, in the Mermin case, we imagined them to have hidden states, like

TFF

TFF

that controlled their response to each dial, and showed that any choice of these hidden states would lead to a contradiction.

For a given \lambda, A(a, \lambda) and B(b, \lambda) are the values measured by the left and right hand machines respectively. In our case these values are always either +1 (if the machine flashes T) or -1 (if the machine flashes F). The CHSH argument can also be adapted to a more realistic case where some experimental runs have no detection at all, and the outcome can also be 0, but this simple version won’t do that.

For the dial settings a and b, all we care about with these hidden variables is whether they make the machines respond true or false. So in our case \lambda just ranges over four possibilities, \lambda \in \{ a\land b, a\land \lnot b, \lnot a\land b, \lnot a\land\lnot b \}, and the integral becomes a sum:

E(a,b) = (+1 \times +1)\cdot p(a\land b) + (+1 \times -1)\cdot p(a\land \lnot b) + (-1 \times +1)\cdot p(\lnot a\land b) + (-1 \times -1)\cdot p(\lnot a\land \lnot b)

= p(a\land b) + p(\lnot a\land \lnot b) - p(a\land \lnot b) - p(\lnot a\land b).

= p((a\land b) \lor (\lnot a\land \lnot b)) - p((a\land \lnot b) \lor (\lnot a\land b)),

where the probabilities add within each bracket because the two events in it are mutually exclusive.

Now that first proposition, (a\land b) \lor (\lnot a\land \lnot b), is just \phi_1 from earlier, which had probability p_1. And the second one covers all the remaining possibilities, so it has probability 1-p_1. So

E(a,b) = p_1 - (1-p_1) = 2p_1 - 1 = E_1.

The argument goes through exactly the same way for E(a, \bar{b}) and E(\bar{a}, b). The last case, E(\bar{a}, \bar{b}), is slightly different. We get

E(\bar{a}, \bar{b}) = p((\bar{a}\land \bar{b}) \lor (\lnot \bar{a}\land \lnot \bar{b})) - p((\bar{a}\land \lnot \bar{b}) \lor (\lnot \bar{a}\land \bar{b}))

following the same logic as before. But this time \phi_4 matches the second proposition (\bar{a}\land \lnot \bar{b}) \lor(\lnot \bar{a}\land \bar{b}), not the first, so that

E(\bar{a}, \bar{b}) = (1-p_4) - p_4 = 1 - 2p_4 = -E_4.

This is where the minus sign in the CHSH inequality comes in! We have

|\sum_i E_i| = | E(a, b) + E(a, \bar{b}) + E(\bar{a}, b) - E(\bar{a}, \bar{b}) | \leq 2.
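
Numerically, reading everything off the CHSH table (my sketch):

```python
def E(p_tt, p_tf, p_ft, p_ff):
    # T counts as +1 and F as -1, so E = p(same) - p(different)
    return p_tt + p_ff - p_tf - p_ft

S = (E(1/2, 0, 0, 1/2)          # E(a, b)         = +1
     + E(3/8, 1/8, 1/8, 3/8)    # E(a, b_bar)     = +1/2
     + E(3/8, 1/8, 1/8, 3/8)    # E(a_bar, b)     = +1/2
     - E(1/8, 3/8, 3/8, 1/8))   # E(a_bar, b_bar) = -1/2
print(S)   # 2.5 -- bigger than the classical bound of 2
```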

So we end up with the standard inequality, but with a bit more insight into where the pieces come from. Also, importantly, it’s easy to extend to other situations. For example, you could follow the same method with the six Mermin propositions from earlier to make a kind of ‘Mermin-CHSH inequality’:

|\sum_i E_i| = | E(a, b) + E(a', b') + E(a'', b'') - E(a, b') - E(a, b'') - E(a', b'') | \leq 4.
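
Checking this against the Mermin table from earlier (my sketch, using the three paired rows and three unpaired rows):

```python
def E(p_tt, p_tf, p_ft, p_ff):
    return p_tt + p_ff - p_tf - p_ft

paired = E(1/2, 0, 0, 1/2)         # E(a,b) = E(a',b') = E(a'',b'') = +1
unpaired = E(1/8, 3/8, 3/8, 1/8)   # E(a,b') = E(a,b'') = E(a',b'') = -1/2
print(3 * paired - 3 * unpaired)   # 4.5 -- bigger than the bound of 4
```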

Or you could have three particles, or a different set of measurements, or you could investigate what happens with other tables of correlations that don’t appear in quantum physics… this is a very versatile setup. The original paper has many more examples.

Final thoughts

There are still some loose ends that it would be good to tie up. I’d like to understand exactly how the inequality-shuffling in a ‘textbook-style’ proof of the CHSH inequality connects to Abramsky and Hardy’s version. Presumably some of it is replicating the same argument, but in a more opaque form. But also some of it must need to deal with the fact that it’s a more general setting, and includes things like measurements returning 0 as well as +1 or -1. It would be nice to figure out which bits are which. I think Bell’s original paper didn’t have the zero thing either, so that could be one place to look.

On the other hand… that all sounds a bit like work, and I can’t be bothered for now. I’d rather apply some of this to something interesting. My next post is probably going to make some connections between the logical Bell inequalities and my previous two posts on negative probability.

If you know the answers to my questions above and can save me some work, please let me know in the comments! Also, I’d really like to know if I’ve got something wrong. There are a lot of equations in this post and I’m sure to have cocked up at least one of them. More worryingly, I might have messed up some more conceptual points. If I’ve done that I’m even more keen to know!

Speedrun: The Vygotsky Circle

I did a ‘speedrun’ post a couple of months ago where I set a one hour timer and tried to find out as much as I could about Marx’s theory of alienation. That turned out to be pretty fun, so I’m going to try it again with another topic where I have about an hour’s worth of curiosity.

I saw a wikipedia link to something called ‘the Vygotsky Circle’ a while back. I didn’t click the link (don’t want to spoil the fun!) but from the hoverover it looks like that includes Vygotsky, Luria and… some other Russian psychologists, I guess? I’d heard of those two, but I only have the faintest idea of what they did. Here’s the entirety of my current knowledge:

  • Vygotsky wrote a book called Thought and Language. Something about internalisation?
  • Luria’s the one who went around pestering peasants with questions about whether bears in the Arctic are white. And presumably a load of other stuff… he pops up in pop books with some frequency. E.g. I think he did a study of someone with an extraordinary memory?

That’s about it, so plenty of room to learn more. And also anything sounds about ten times more interesting if it’s a Circle. Suddenly it’s an intellectual movement, not a disparate bunch of nerds. So… let’s give this a go.


OK first go to that wiki article.

> The Vygotsky Circle (also known as Vygotsky–Luria Circle[1][2]) was an influential informal network of psychologists, educationalists, medical specialists, physiologists, and neuroscientists, associated with Lev Vygotsky (1896–1934) and Alexander Luria (1902–1977), active in 1920-early 1940s in the Soviet Union (Moscow, Leningrad and Kharkiv).

So who’s in it?

> The Circle included altogether around three dozen individuals at different periods, including Leonid Sakharov, Boris Varshava, Nikolai Bernstein, Solomon Gellerstein, Mark Lebedinsky, Leonid Zankov, Aleksei N. Leontiev, Alexander Zaporozhets, Daniil Elkonin, Lydia Bozhovich, Bluma Zeigarnik, Filipp Bassin, and many others. German-American psychologist Kurt Lewin and Russian film director and art theorist Sergei Eisenstein are also mentioned as the “peripheral members” of the Circle.

OK that’s a lot of people! Hm this is a very short article. Maybe the Russian one is longer? Nope. So this is the entirety of the history of the Circle given:

> The Vygotsky Circle was formed around 1924 in Moscow after Vygotsky moved there from the provincial town of Gomel in Belarus. There at the Institute of Psychology he met graduate students Zankov, Solov’ev, Sakharov, and Varshava, as well as future collaborator Aleksander Luria.[5]:427–428 The group grew incrementally and operated in Moscow, Kharkiv, and Leningrad; all in the Soviet Union. From the beginning of World War II 1 Sept 1939 to the start of the Great Patriotic War, 22 June 1941, several centers of post-Vygotskian research were formed by Luria, Leontiev, Zankov, and Elkonin. The Circle ended, however, when the Soviet Union was invaded by Germany to start the Great Patriotic War.
>
> However, by the end of 1930s a new center was formed around 1939 under the leadership of Luria and Leontiev. In the after-war period this developed into the so-called the “School of Vygotsky-Leontiev-Luria”. Recent studies show that this “school” never existed as such.
>
> There are two problems that are related to the Vygotsky circle. First was the historical recording of the Soviet psychology with innumerable gaps in time and prejudice. Second was the almost exclusive focus on the person, Lev Vygotsky, himself to the extent that the scientific contributions of other notable characters have been considerably downplayed or forgotten.

This is all a bit more nebulous than I was hoping for. Lots of references and sources at least. May end up just covering Vygotsky and Luria.

OK Vygotsky wiki article. What did he do?

> He is known for his concept of the zone of proximal development (ZPD): the distance between what a student (apprentice, new employee, etc.) can do on their own, and what they can accomplish with the support of someone more knowledgeable about the activity. Vygotsky saw the ZPD as a measure of skills that are in the process of maturing, as supplement to measures of development that only look at a learner’s independent ability.
>
> Also influential are his works on the relationship between language and thought, the development of language, and a general theory of development through actions and relationships in a socio-cultural environment.

OK here’s the internalisation thing I vaguely remembered hearing about:

> … the majority of his work involved the study of infant and child behavior, as well as the development of language acquisition (such as the importance of pointing and inner speech[5]) …

Influenced by Piaget, but differed on inner speech:

> Piaget asserted that egocentric speech in children “dissolved away” as they matured, while Vygotsky maintained that egocentric speech became internalized, what we now call “inner speech”.

Not sure I’ve picked a good topic this time, pulls in way too many directions so this is going to be very shallow and skip around. And ofc there’s lots of confusing turbulent historical background, and all these pages refer to various controversies of interpretation 😦 Skip to Luria, can always come back:

> Alexander Romanovich Luria (Russian: Алекса́ндр Рома́нович Лу́рия, IPA: [ˈlurʲɪjə]; 16 July 1902 – 14 August 1977) was a Russian neuropsychologist, often credited as a father of modern neuropsychological assessment. He developed an extensive and original battery of neuropsychological tests during his clinical work with brain-injured victims of World War II, which are still used in various forms. He made an in-depth analysis of the functioning of various brain regions and integrative processes of the brain in general. Luria’s magnum opus, Higher Cortical Functions in Man (1962), is a much-used psychological textbook which has been translated into many languages and which he supplemented with The Working Brain in 1973.
>
> … became famous for his studies of low-educated populations in the south of the Soviet Union showing that they use different categorization than the educated world (determined by functionality of their tools).

OK so this was early on.

Some biographical stuff. Born in Kazan, studied there, then moved to Moscow where he met Vygotsky. And others:

> During the 1920s Luria also met a large number of scholars, including Aleksei N. Leontiev, Mark Lebedinsky, Alexander Zaporozhets, Bluma Zeigarnik, many of whom would remain his lifelong colleagues.

Leontiev’s turned up a few times, open in another tab.

OK the phrase ‘cultural-historical psychology’ has come up. Open the wikipedia page:

> Cultural-historical psychology is a branch of avant-garde and futuristic psychological theory and practice of the “science of Superman” associated with Lev Vygotsky and Alexander Luria and their Circle, who initiated it in the mid-1920s–1930s.[1] The phrase “cultural-historical psychology” never occurs in the writings of Vygotsky, and was subsequently ascribed to him by his critics and followers alike, yet it is under this title that this intellectual movement is now widely known.

This all sounds like a confusing mess where I’d need to learn way more background than I’m going to pick up in an hour. Back to Luria. Here’s the peasant-bothering stuff:

> The 1930s were significant to Luria because his studies of indigenous people opened the field of multiculturalism to his general interests.[12] This interest would be revived in the later twentieth century by a variety of scholars and researchers who began studying and defending indigenous peoples throughout the world. Luria’s work continued in this field with expeditions to Central Asia. Under the supervision of Vygotsky, Luria investigated various psychological changes (including perception, problem solving, and memory) that take place as a result of cultural development of undereducated minorities. In this regard he has been credited with a major contribution to the study of orality.

That last bit has a footnote to Ong’s Orality and Literacy. Another place I’ve seen the name before.

> In 1933, Luria married Lana P. Lipchina, a well-known specialist in microbiology with a doctorate in the biological sciences.

Then studied aphasia:

> In his early neuropsychological work in the end of the 1930s as well as throughout his postwar academic life he focused on the study of aphasia, focusing on the relation between language, thought, and cortical functions, particularly on the development of compensatory functions for aphasia.

This must be another pop-science topic where I’ve come across him before. Hm where’s the memory bit? Oh I missed it:

> Apart from his work with Vygotsky, Luria is widely known for two extraordinary psychological case studies: The Mind of a Mnemonist, about Solomon Shereshevsky, who had highly advanced memory; and The Man with a Shattered World, about a man with traumatic brain injury.

Ah this turns out to be late on in his career:

> Among his late writings are also two extended case studies directed toward the popular press and a general readership, in which he presented some of the results of major advances in the field of clinical neuropsychology. These two books are among his most popular writings. According to Oliver Sacks, in these works “science became poetry”.[31]
>
> In The Mind of a Mnemonist (1968), Luria studied Solomon Shereshevskii, a Russian journalist with a seemingly unlimited memory, sometimes referred to in contemporary literature as “flashbulb” memory, in part due to his fivefold synesthesia.
>
> In The Man with the Shattered World (1971) he documented the recovery under his treatment of the soldier Lev Zasetsky, who had suffered a brain wound in World War II.

OK 27 minutes left. I’ll look up some of the other characters. Leontiev first. Apparently he was ‘a Soviet developmental psychologist, philosopher and the founder of activity theory.’ What’s activity theory?

> Activity theory (AT; Russian: Теория деятельности)[1] is an umbrella term for a line of eclectic social sciences theories and research with its roots in the Soviet psychological activity theory pioneered by Sergei Rubinstein in 1930s. At a later time it was advocated for and popularized by Alexei Leont’ev. Some of the traces of the theory in its inception can also be found in a few works of Lev Vygotsky,[2]. These scholars sought to understand human activities as systemic and socially situated phenomena and to go beyond paradigms of reflexology (the teaching of Vladimir Bekhterev and his followers) and classical conditioning (the teaching of Ivan Pavlov and his school), psychoanalysis and behaviorism.

So maybe he founded it or maybe he just advocated for it. This is all a bit of a mess. But, ok, it’s an umbrella term for moving past behaviourism.

> One of the strengths of AT is that it bridges the gap between the individual subject and the social reality—it studies both through the mediating activity. The unit of analysis in AT is the concept of object-oriented, collective and culturally mediated human activity, or activity system.

This all looks sort of interesting, but a bit vague, and will probably take me down some other rabbithole. Back to Leontiev.

> After Vygotsky’s early death, Leont’ev became the leader of the research group nowadays known as the Kharkov School of Psychology and extended Vygotsky’s research framework in significantly new ways.

Oh shit completely missed the whole thing about Vygotsky’s early death. Back to him… died aged 37! Of tuberculosis. Mostly became famous after his death, and through the influence of his students. Ah this bit on his influence might be useful. Soviet influence first:

> In the Soviet Union, the work of the group of Vygotsky’s students known as the Vygotsky Circle was responsible for Vygotsky’s scientific legacy.[42] The members of the group subsequently laid a foundation for Vygotskian psychology’s systematic development in such diverse fields as the psychology of memory (P. Zinchenko), perception, sensation, and movement (Zaporozhets, Asnin, A. N. Leont’ev), personality (Lidiya Bozhovich, Asnin, A. N. Leont’ev), will and volition (Zaporozhets, A. N. Leont’ev, P. Zinchenko, L. Bozhovich, Asnin), psychology of play (G. D. Lukov, Daniil El’konin) and psychology of learning (P. Zinchenko, L. Bozhovich, D. El’konin), as well as the theory of step-by-step formation of mental actions (Pyotr Gal’perin), general psychological activity theory (A. N. Leont’ev) and psychology of action (Zaporozhets).

That at least says something about what all of those names did. Open Zinchenko tab first.

Then North American influence:

> In 1962 a translation of his posthumous 1934 book, Thinking and Speech, published with the title, Thought and Language, did not seem to change the situation considerably.[citation needed] It was only after an eclectic compilation of partly rephrased and partly translated works of Vygotsky and his collaborators, published in 1978 under Vygotsky’s name as Mind in Society, that the Vygotsky boom started in the West: originally, in North America, and later, following the North American example, spread to other regions of the world.[citation needed] This version of Vygotskian science is typically associated with the names of its chief proponents Michael Cole, James Wertsch, their associates and followers, and is relatively well known under the names of “cultural-historical activity theory” (aka CHAT) or “activity theory”.[45][46][47] Scaffolding, a concept introduced by Wood, Bruner, and Ross in 1976, is somewhat related to the idea of ZPD, although Vygotsky never used the term.

Ah so Thought and Language was posthumous.

Then a big pile of controversy about how his work was interpreted. Now we’re getting headings like ‘Revisionist movement in Vygotsky Studies’, think I’ll bail out now. 16 minutes left.

OK let’s try Zinchenko page.

> The main theme of Zinchenko’s research is involuntary memory, studied from the perspective of the activity approach in psychology. In a series of studies, Zinchenko demonstrated that recall of the material to be remembered strongly depends on the kind of activity directed on the material, the motivation to perform the activity, the level of interest in the material and the degree of involvement in the activity. Thus, he showed that following the task of sorting material in experimental settings, human subjects demonstrate a better involuntary recall rate than in the task of voluntary material memorization.

This influenced Leontiev and activity theory. That’s about all the detail there is. What to do next? Look up some of the other people I guess. Try a few, they’re all very short articles, give up with that.

Fine I’ll just google ‘vygotsky thought and language’ and see what I get. MIT Press description:

> Vygotsky’s closely reasoned, highly readable analysis of the nature of verbal thought as based on word meaning marks a significant step forward in the growing effort to understand cognitive processes. Speech is, he argues, social in origins. It is learned from others and, at first, used entirely for affective and social functions. Only with time does it come to have self-directive properties that eventually result in internalized verbal thought. To Vygotsky, “a word is a microcosm of human consciousness.”

OK, yeah that does sound interesting.

Not finding great sources. 8 minutes left. Zone of proximal development section of Vygotsky’s page:

> “Zone of Proximal Development” (ZPD) is a term Vygotsky used to characterize an individual’s mental development. He originally defined the ZPD as “the distance between the actual developmental level as determined by independent problem solving and the level of potential development as determined through problem solving under adult guidance or in collaboration with more capable peers.” He used the example of two children in school who originally could solve problems at an eight-year-old developmental level (that is, typical for children who were age 8). After each child received assistance from an adult, one was able to perform at a nine-year-old level and one was able to perform at a twelve-year-old level. He said “This difference between twelve and eight, or between nine and eight, is what we call the zone of proximal development.” He further said that the ZPD “defines those functions that have not yet matured but are in the process of maturation, functions that will mature tomorrow but are currently in an embryonic state.” The zone is bracketed by the learner’s current ability and the ability they can achieve with the aid of an instructor of some capacity.

ZPD page itself:

> Vygotsky spent a lot of time studying the impact of school instruction on children and noted that children grasp language concepts quite naturally, but that math and writing did not come as naturally. Essentially, he concluded that because these concepts were taught in school settings with unnecessary assessments, they were of more difficulty to learners. Piaget believed that there was a clear distinction between development and teaching. He said that development is a spontaneous process that is initiated and completed by the children, stemming from their own efforts. Piaget was a proponent of independent thinking and critical of the standard teacher-led instruction that was common practice in schools.

But also:

> … He believed that children would not advance very far if they were left to discover everything on their own. It’s crucial for a child’s development that they are able to interact with more knowledgeable others. They would not be able to expand on what they know if this wasn’t possible.

OK 3 minutes left. Let’s wildly skip between tabs learning absolutely nothing. Hm maybe this would have been interesting? ‘Vygotsky circle as a personal network of scholars: restoring connections between people and ideas’.

Ding! Didn’t get much past reading the title.


Well that didn’t work as well as the alienation one. Sprawling topic, and I wasn’t very clear on what I wanted to get out of it. History of the Circle itself or just some random facts about what individual people in it did? I mostly ended up with the second one, and not much insight into what held it together conceptually, beyond some vague idea about ‘going beyond behaviourism’/’looking at general background of human activity, not just immediate task’.

Still, I guess I know a bit more about these people than I did going in, and would be able to orient more quickly if I wanted to find out anything specific.

The Mane Six as Mitford Sisters


[Written as part of Notebook Blog Month.]

I’ve saved the most important topic for last. As far as I can tell, nobody on the internet has tackled the vital question of how My Little Pony characters map to Mitford sisters. So I’m going to fix that.

As a bit of background, the Mitfords were a wildly eccentric English aristocratic family. The novelist Nancy Mitford is probably the most famous of them, but her five sisters were an impressively bizarre mix of communists, fascists, socialites, farmers and Hitler obsessives. (There’s also a brother who nobody cares about.) I’m not particularly well up on Mitford lore, but I am a big fan of Nancy Mitford’s The Pursuit of Love, and I’m fascinated by eccentrics of all kinds, so I know the basics.

My Little Pony also has six main characters. (Plus Spike.) They turn out to match up surprisingly closely with Mitford sisters, right up to the point where they don’t, and then I just have to make it up.

So, first of all, Nancy Mitford is Twilight Sparkle. This one is completely obvious. In The Pursuit of Love, Nancy is the narrator of a fictionalised version of the Mitfords’ lives, with her as the quieter, more studious observer. Done.

Deborah Mitford was a famous socialite known for… being social and stuff. Lots of witty correspondence with other witty socialite types. She also ran a big stately home that was open to the public, which is kind of like organising parties if you squint hard enough. So Pinkie Pie.

Jessica Mitford was the most adventurous and rebellious of the sisters, running away to Spain and then later becoming an activist in the US, where she worked on civil rights campaigns and joined the Communist Party. Also investigated unscrupulous business practices in the funeral home industry for some reason. Clearly has to be Rainbow Dash.

Unity Valkyrie Mitford is the oddest of the lot. She became completely obsessed with Hitler, stalked him around Munich, eventually made her way into his inner circle, shot herself in the head when Britain declared war on Germany, survived the attempt and lasted out almost another ten years before dying of meningitis caused by swelling around the bullet, which was never removed. I feel like this is a job for Rarity, partly because of the rhyming name, partly because she’s the only one capable of pulling off this much drama.

OK, this is the point where the mapping gets a bit trickier. The two remaining sisters are Pamela and Diana. Pamela was the most retiring of the Mitfords, staying out of the public eye, at least in comparison to the others. She was practical-minded, loved animals and the countryside and managed a farm for a while. A pretty good fit for either Applejack or Fluttershy. She did still manage to do some weird Mitford stuff, marrying a bisexual millionaire physicist and then becoming the ‘companion’ of an Italian horsewoman after they divorced.

Diana is the dud Mitford. She was another fascist, and not even a spectacularly bizarre one like Unity. Mainly known for marrying Oswald Mosley, leader of the British Union of Fascists, editing a fascist magazine, and spending some time in prison during the war for being a fascist.

I don’t really want to lumber either of them with Diana, but I need to make a choice, so I’ve introduced an outside tie break. Fluttershy is best pony. So Fluttershy gets Pamela Mitford and poor old Applejack is stuck with Diana Mitford.

I’m sure everyone’s relieved that this major open question has been definitively answered at last.

Bullshitting about bullshit jobs


[Written as part of Notebook Blog Month.]

Today’s topic is bullshit jobs. I’ve done no preparation for this beyond a reread of David Graeber’s essay, On the Phenomenon of Bullshit Jobs: A Work Rant. This piece was enormously popular when it came out in 2013, so presumably there’s a large secondary literature of commentaries and follow-ups. I haven’t read any of it, this is me just bullshitting from first principles. So I’m probably repeating a lot of obvious talking points.

Rereading the article, a couple of things jumped out straight away:

  • A lot of the specifics of his argument are not particularly convincing (I’ll get to this in a minute).
  • It rings deeply true anyway, because we all know deep in our hearts and viscera that so many jobs are full of useless bullshit. It’s not surprising that it was so popular.

So, I’ll briefly go over his thesis. Advances in productivity and automation should have freed up lots of time by now, and we should have got the 15 hour working week that John Maynard Keynes expected. But clearly we haven’t. Graeber rules out the consumer treadmill as an explanation:

> The standard line today is that he didn’t figure in the massive increase in consumerism. Given the choice between less hours and more toys and pleasures, we’ve collectively chosen the latter. This presents a nice morality tale, but even a moment’s reflection shows it can’t really be true. Yes, we have witnessed the creation of an endless variety of new jobs and industries since the ’20s, but very few have anything to do with the production and distribution of sushi, iPhones, or fancy sneakers.

Instead, he points the finger at what he describes as whole new classes of jobs:

> … rather than allowing a massive reduction of working hours to free the world’s population to pursue their own projects, pleasures, visions, and ideas, we have seen the ballooning of not even so much of the ‘service’ sector as of the administrative sector, up to and including the creation of whole new industries like financial services or telemarketing, or the unprecedented expansion of sectors like corporate law, academic and health administration, human resources, and public relations. And these numbers do not even reflect on all those people whose job is to provide administrative, technical, or security support for these industries, or for that matter the whole host of ancillary industries (dog-washers, all-night pizza delivery) that only exist because everyone else is spending so much of their time working in all the other ones.
>
> These are what I propose to call ‘bullshit jobs’.
>
> It’s as if someone were out there making up pointless jobs just for the sake of keeping us all working.

So, in this view there are a bunch of jobs that are bullshit. Here’s another sample list further down:

> A world without teachers or dock-workers would soon be in trouble, and even one without science fiction writers or ska musicians would clearly be a lesser place. It’s not entirely clear how humanity would suffer were all private equity CEOs, lobbyists, PR researchers, actuaries, telemarketers, bailiffs or legal consultants to similarly vanish.

It’s not very satisfying to me to leave this at the level of a big binary list of jobs that are bullshit (telemarketers, corporate law) and jobs that are not (teachers, tube drivers). To my mind the bullshit is much more fractally distributed throughout the whole economy. It’s definitely true that some jobs are much more prone to gathering bullshit than others. But most of the ‘bullshit’ jobs Graeber lists serve some useful functions. I’m not sure what he’s got against actuaries – insurance seems like a reasonable thing to me, and somebody needs to work out how much it should cost. And some level of financial and legal work needs to go on. (Some of these I do find hard to defend at all. I think telemarketing might actually be pure bullshit given a narrow enough definition? Does anyone need to bother other people by phone any more? I mean I really hate phones so I could be biased here, but that does sound like bullshit to me.)

At the other end of the scale, a lot of his ‘non-bullshit’ jobs get mixed up with bullshit too. Teachers are always having to grapple with the latest bullshit government initiative, for example. The bullshit is mixed right through everything.

I want to probe a bit deeper into what factors are upstream of jobs becoming bullshit. There definitely seem to be warning signs for bullshit. For a start, jobs are particularly likely to contain a lot of bullshit if they contain a lot of abstraction layers. For example:

  • selling abstract things (financial derivatives) rather than concrete things (potatoes)
  • managing people who do things, rather than doing things directly
  • contracting out work to a second company, rather than doing it yourself.
  • producing hard-to-measure output (potato marketing board) rather than obvious results (potato farmer)

These aren’t bad things intrinsically, they need to happen to some extent or we’d all be stuck individually bartering potatoes all day. But they provide places for the bullshit to get in.

For the rest of this post I’m going to play with one potential taxonomy of bullshit jobs. This isn’t supposed to be a Grand Unified Theory of Bullshit Jobs, it’s just me playing around. I don’t have time to try multiple taxonomies in a notebook post like this, so if this one turns out to not be very insightful then I just have to deal with that I suppose. Anyway, it’s inspired by Frankfurt’s characterisation of bullshit in his classic On Bullshit:

> It is impossible for someone to lie unless he thinks he knows the truth. Producing bullshit requires no such conviction. A person who lies is thereby responding to the truth, and he is to that extent respectful of it. When an honest man speaks, he says only what he believes to be true; and for the liar, it is correspondingly indispensable that he considers his statements to be false. For the bullshitter, however, all these bets are off: he is neither on the side of the true nor on the side of the false. His eye is not on the facts at all, as the eyes of the honest man and of the liar are, except insofar as they may be pertinent to his interest in getting away with what he says. He does not care whether the things he says describe reality correctly. He just picks them out, or makes them up, to suit his purpose.

The main feature of bullshit, as Frankfurt explains it, is that it is indifferent to the truth, rather than outright false. It’s produced as a side effect of some other self-serving process.

This already ties in quite well with the idea of bullshit seeping through abstraction layers. Abstraction layers tend to be places where caring breaks down – it’s just easier to care about potatoes than financial derivatives. In my taxonomy I’m going to explore three kinds of not caring:

  1. Nobody cares about the problem
  2. Nobody cares whether the solution could possibly fix the problem
  3. The incentives push against caring anyway

I’ll go through the eight options. First there’s the Platonic ideal of a non-bullshit job:

  • Care about problem, care about solution, good incentives. Honest artisan hand crafts a beautiful table to sell directly etc etc.

Then we get into the region of somewhat bullshit jobs:

  • Care about problem, care about solution, bad incentives. This is where teaching often ends up, for example (good teachers, I mean). Loves their subject, wants to teach it well, but also has to tick the boxes for ‘learning outcomes’ or w/e.

  • Care about problem, don’t care about solution, good incentives. Caring about the problem but not whether the solution could possibly fix it is a funny one, but something like it happens quite often once a bit of self-deception gets involved, and some bad incentives. Doing it with good incentives is harder. Maybe a widget factory boss gets infatuated with some kind of trendy-but-useless management methodology. They genuinely want to sell more widgets, the market for widgets functions well, but there’s a layer of cargo-cult stupidity in the middle. That’s the best I can do, maybe somebody else can come up with a better example.

  • Don’t care about problem, care about solution, good incentives. This is somebody putting in a solid day of work at a job they don’t particularly care for intrinsically, but with decent working conditions and high standards for what counts as a good job done.

None of these seem like the canonical bullshit job to me, but they are definitely likely to contain bullshit elements. Then we get towards the real bullshit:

  • Care about the problem, don’t care about the solution, bad incentives. This is the self-delusion thing again. Maybe this is a charity employee who genuinely cares about the cause but has some motivated reasoning going on about whether the thing they’re doing could possibly help. If the charity is able to fundraise whether the work is useful or not, you get this situation.

  • Don’t care about the problem, care about the solution, bad incentives. This is the typical academic with a very specific pet hammer, churning out papers that use it in dubious ways.

  • Don’t care about the problem, don’t care about the solution, good incentives. This is just working a job you don’t care about again, but this time without being held back by any kind of standards of professionalism or craftsmanship. Working conditions are still good.

Then finally we get to:

  • Don’t care about the problem, don’t care about the solution, incentives are terrible anyway. You can get here quite easily by switching the remaining yes to a no in the examples above: the academic doesn’t even care about whether the technique is carried out right, the charity employee is completely indifferent to the cause rather than well-meaning but self-deluded, the apathetic worker also has a horrible boss and isn’t paid well. Then you just have pure bullshit.

I’ll try it on one real example, that job I did where I walked around hospitals measuring things:

There were two hospitals being merged together on a new site, and the project management office needed to collect data on how much storage space the new hospital would need for medical supplies. I’m not sure what the best way of doing this would be, but maybe it would involve, I don’t know, some Fermi estimates based on their current storage requirements, plus some efficiencies for the single site. What they actually did was make a giant spreadsheet of every sort of item ordered by the hospital (bandages, prosthetics, tiny orthopaedic screws) and then employ EIGHT OF US to go round the hospitals with tape measures FOR WEEKS tracking down and measuring every individual item on the list, including the tiny orthopaedic screws.

If that doesn’t score highly on the bullshitometer then something is really up.

So… the original problem, ‘how much storage space do we need’, is a good one, and presumably somebody somewhere really cared about the answer. Once this had been filtered through a couple of management layers and subcontractors most of the caring had been lost. I’ll give it a generous half a point.

Definitely nobody cared whether the solution could possibly fix the problem. Zero points.

Incentives were also awful. The work had been contracted out through a couple of layers, and the people who had to find the answer had no particular reason to care beyond coming up with a number that kept the layer above happy. Zero points again.

So I make that 0.5/3. Definitely a bullshit job. So it passes this basic sanity check. (Not a great one, as I’d have had exactly this example at the back of my mind when I came up with the criteria!)

So… was that particular taxonomy any good? Not particularly, but it did get me thinking through the space of possibilities. I do think that the general strategy of looking for places where not-caring gets in is a good one for spotting bullshit.

“Neoliberalism”

[image: IMG_20200625_110020941]
[Written as part of Notebook Blog Month.]

Everybody hates neoliberalism, it’s the law. But what is it?

This is probably the topic I’m most ignorant about and ill-prepared-for on the whole list, and I wasn’t going to do it. But it’s good prep for the bullshit jobs post, which was a popular choice, so I’m going to try. I’m going to be trying to articulate my current thoughts, rather than attempting to say anything original. And also I’m not really talking about neoliberalism as a coherent ideology or movement. (I think I’d have to do another speedrun just to have a chance of saying something sensible.) More like “neoliberalism”, scarequoted, as a sort of diffuse cloud of associations that the term brings to mind. Here’s my cloud (very UK-centric):

  • Big amorphous companies with bland generic names like Serco or Interserve, providing an incoherent mix of services to the public sector, with no obvious specialism beyond winning government contracts
  • Public private partnerships
  • Metrics! Lots of metrics!
  • Incuriosity about specifics. E.g. management by pushing to make a number go up, rather than any deep engagement with the particulars of the specific problem
  • Food got really good over this period. I think this actually might be relevant and not just something that happened at the same time
  • Low cost short-haul airlines becoming a big thing (in Europe anyway – don’t really understand how widespread this is)
  • Thinking you’re on a public right of way but actually it’s a private street owned by some shopping centre or w/e. With private security and lots of CCTV
  • Post-industrial harbourside developments with old warehouses converted into a Giraffe and a Slug and Lettuce
  • A caricatured version of Tony Blair’s disembodied head is floating over the top of this whole scene like a barrage balloon. I don’t think this is important but I thought you’d like to know

I’ve had this topic vaguely in mind since I read a blog post by Timothy Burke, a professor of modern history, a while back. The post itself has a standard offhand ‘boo neoliberalism’ side remark, but then when challenged in the comments he backs it up with an excellent, insightful sketch of what he means. (Maybe this post should just have been a copy of this comment, instead of my ramblings.)

I’m sensitive to the complaint that “neoliberalism” is a buzz word that can mean almost everything (usually something the speaker disapproves of).

A full fleshing out is more than I can provide, though. But here’s some sketches of what I have in mind:

1) The Reagan-Thatcher assault on “government” and aligned conceptions of “the public”–these were not merely attempts to produce new efficiencies in government, but a broad, sustained philosophical rejection of the idea that government can be a major way to align values and outcomes, to tackle social problems, to restrain or dampen the power of the market to damage existing communities. “The public” is not the same, but it was an additional target: the notion that citizens have shared or collective responsibilities, that there are resources and domains which should not be owned privately but instead open to and shared by all, etc. That’s led to a conception of citizenship or social identity that is entirely individualized, privatized, self-centered, self-affirming, and which accepts no responsibility to shared truths, facts, or mechanisms of dispute and deliberation.

2) The idea of comprehensively measuring, assessing, quantifying performance in numerous domains; insisting that values which cannot be measured or quantified are of no worth or usefulness; and constantly demanding incremental improvements from all individuals and organizations within these created metrics. This really began to take off in the 1990s and is now widespread through numerous private and public institutions.

3) The simultaneous stripping bare of ordinary people to numerous systems of surveillance, measurement, disclosure, monitoring, maintenance (by both the state and private entities) while building more and more barriers to transparency protecting the powerful and their most important private and public activities. I think especially notable since the late 1990s and the rise of digital culture. A loss of workplace and civil protections for most people (especially through de-unionization) at the same time that the powerful have become increasingly untouchable and unaccountable for a variety of reasons.

4) Nearly unrestrained global mobility for capital coupled with strong restrictions on labor (both in terms of mobility and in terms of protection). Dramatically increased income inequality. Massive “shadow economies” involving illegal or unsanctioned but nevertheless highly structured movements of money, people, and commodities. Really became visible by the early 1990s.

A lot of the features in my association cloud match pretty well: metrics, surveillance, privatisation. Didn’t really pick up much from point 4. I think 2 is the one which interests me most. My read on the metric stuff is that there’s a genuinely useful tool here that really does work within its domain of application but is disastrous when applied widely to everything. The tool goes something like:

  • let go of a need for top-down control
  • fragment the system into lots of little bits, connected over an interface of numbers (money, performance metrics, whatever)
  • try to improve the system by hammering on the little bits in ways such that the numbers go in the direction you want. This could be through market forces, or through metrics-driven performance improvements.

If your problem is amenable to this kind of breakdown, I think it actually works pretty well. This is why I think ‘food got good’ is actually relevant and not a coincidence. It fits this playbook quite nicely:

  • It’s a known problem. People have been selling food for a long time and have some well-tested ideas about how to cook, prep, order supplies, etc. There’s innovation on top of that, but it’s not some esoteric new research field.
  • Each individual purchase (of a meal, cake, w/e) is small and low-value. So the domain is naturally fragmented into lots of tiny bits.
  • This also means that lots of people can afford to be customers, increasing the number of tiny bits
  • Fast feedback. People know whether they like a croissant after minutes, not years.
  • Relevant feedback. People just tell you whether they like your croissants, which is the thing you care about. You don’t need to go search for some convoluted proxy measure of whether they like your croissants.
  • Lowish barriers to entry. Not especially capital-intensive to start a cafe or market stall compared with most businesses.
  • Lowish regulations. There are rules for food safety, but it’s not like building planes or something.
  • No lock-in for customers. You can go to the donburi stall today and the pie and mash stall tomorrow.
  • All of this means that the interface layer of numbers can be an actual market, rather than some faked-up internal market of metrics to optimise. And it’s a pretty open market that most people can access in some form. People don’t go out and buy trains, but they do go out and buy sandwiches.

There’s another very important, less wonky factor that breaks you out of the dry break-it-into-numbers method I listed above. You ‘get to cheat’ by bringing in emotional energy that ‘comes along for free’. People actually like food! They start cafes because they want to, even when it’s a terrible business idea. They already intrinsically give a shit about the problem, and markets are a thin interface layer over the top rather than most of the thing. This isn’t going to carry over to, say, airport security or detergent manufacturing.

As you get further away from an idealised row of spherical burger vans things get more complicated and ambiguous. Low cost airlines are a good example. These actually did a good job of fragmenting the domain into lots of bits that were lumped together by the older incumbents. And it’s worked pretty well, by bringing down prices to the point where far more people can afford to travel. (Of course there’s also the climate change considerations. If you ignore those it seems like a very obvious Good Thing, once you include them it’s somewhat murkier I suppose.)

The price you pay is that the experience gets subtly degraded at many points by the optimisation, and in aggregate these tend to produce a very unsubtle crappiness. For a start there’s the simple overhead of buying the fragmented bits separately. You have to click through many screens of a clunky web application and decide individually about whether you want food, whether you want to choose your own seat, whether you want priority queuing, etc. All the things you’d just have got as default on the old, expensive package deal. You also have to say no to the annoying ads trying to upsell you on various deals on hotels, car rentals and travel insurance.

Then there are all the ways the flight itself becomes crappier. It’s at a crap airport a long way from the city you want to get to, with crappy transport links. The flight is a cheap slot at some crappy time of the early morning. The plane is old and crappily fitted out. You’re having a crappy time lugging around the absolute maximum amount of hand luggage possible to avoid the extra hold luggage fee. (You’ve got pretty good at optimising numbers yourself.)

This is often still worth it, but can easily tip into just being plain Too Crappy. I’ve definitely over-optimised flight booking for cheapness and regretted it (normally when my alarm goes off at three in the morning).

Low cost airlines seem basically like a good idea, on balance. But then there are the true disasters, the domains that have none of the natural features that the neoliberal playbook works on. A good example is early-stage, exploratory academic research. I’ve spent too long on this post already. You can fill in the depressing details yourself.

Some rambling thoughts about visual imagery

[image: IMG_20200621_115251676]

[Written as part of Notebook Blog Month.]

I’ve got some half-written drafts for topics on the original list which I want to finish soon, but for now I seem to be doing better by going off-list and rambling about whatever’s in my head. Today it’s visual imagery.

I’ve ended up reading a bunch of things vaguely connected with mnemonics in the last couple of weeks. I’m currently very bad at concentrating on books properly, but I’m still reading at a similar rate, so everything is in this weird quarter-read state. Anyway here’s the list of things I’ve started:

  • Moonwalking with Einstein by Joshua Foer. Pop book about learning to compete in memory championships. This is good and an easy read, so there is some chance I’ll actually finish it.
  • Orality and Literacy by Walter Ong. One of the references I followed up. About oral cultures in general but there is stuff on memorisation (e.g. repetitive passages in Homer being designed for easy memorisation when writing it down is not an option)
  • Brienne Yudkowsky’s posts on mnemonics
  • These two interesting posts by AllAmericanBreakfast on Less Wrong this week about experimenting with memory palaces to learn information for a chemistry exam.

Those last two posts are interesting to me because they’re written by someone in the very early stages of fiddling around with this stuff who doesn’t consider themself to naturally have a good visual imagination. I’d put myself in the same category, but probably worse. Actually I’m really confused about what ‘visual imagery’ even is. I have some sort of – stuff? – that has a sort of visual component, maybe mixed in with some spatial/proprioceptive/tactile stuff. Is that what people mean by ‘visual imagery’? I guess so? It’s very transitory and hard to pin down in my case, though, and I don’t feel like I make a lot of use out of it. The idea of using these crappy materials to make something elaborate like a memory palace sounds like a lot of work. But maybe it would work better if I spent more time on it.

The thing that jumped out of the first post for me was this bit:

I close my eyes and allow myself to picture nothing, or whatever random nonsense comes to mind. No attempt to control.

Then I invite the concept of a room into mind. I don’t picture it clearly. There’s a vague sense, though, of imagining a space of some kind. I can vaguely see fleeting shadowy walls. I don’t need to get everything crystal clear, though.

This sounded a lot more fun and approachable to me than crafting a specific memory palace to memorise specific things. I didn’t even get to the point of ‘inviting the concept of a room in’, just allowed any old stuff to come up, and that worked ok for me. I’m not sure how much of this ‘imagery’ was particularly visual, but I did find lots of detailed things floating into my head. It seems to work better if I keep a light touch and only allow some very gentle curiosity-based steering of the scene.

Here’s the one I found really surprising and cool. I was imagining an intricately carved little jade tortoise for some reason, and put some mild curiosity into what its eyes were made of. And I discovered that they were tiny yellow plastic fake gemstones that were weirdly familiar. So I asked where I recognised them from (this was quite heavy-handed questioning that dragged me out of the imagery). And it turns out that they were from a broken fish brooch I had as a kid. I prised all the fake stones off with a knife at some point to use for some project I don’t remember.

I haven’t thought about that brooch in, what, 20 years? But I remember an impressive amount of detail about it! I’ve tried to draw it above. Some details like the fins are a best guess, but the blue, green and yellow stones in diagonal stripes are definitely right. It’s interesting that this memory is still sitting there and can be brought up by the right prompt.

I think I’ll play with this exercise a bit more and see what other rubbish I can dredge up.

Doing things on purpose

[image: IMG_20200619_175149647]

[Written as part of Notebook Blog Month.]

Veering off the list again today. This is a quick post and I’m not trying to make a very coherent point, just chuck out some thoughts. I read two things this morning that were about, more or less, ‘doing things’, and now I want to talk about that.

The first was Venkatesh Rao’s latest post on his Art of Gig newsletter, talking about bad questions newbies waste time on to put off actually having to act:

Here are some examples of bad questions.

LLC or S-corp? (or equivalent question in other countries). The right answer is “probably LLC,” but if you don’t trust me, sure, go with S-corp. It’s not too costly to fix this if you get this wrong.

Blogging to attract inbound leads, or proactive email pitches? The obvious answer is the right one: try both, see what works, double down. Cheap effort.

Targeted, researched pitches versus spray-and-pray? Targeted, obviously. But sure, waste your time on spray-and-pray for a while. Maybe you’re one of the exceptions.

These all have similar ‘answers’: make a cursory stab at researching the problem to rule out obvious dumb ideas, then try one of the remaining options and see what happens. Then you’ll have more information:

They’re bad questions because you’re trying to fix an information deficit (which calls for trial and error) by over-analyzing information you do have.

The second was T. Greer at The Scholar’s Stage, talking about cultures that build:

The Americans of 1918 had carved towns, cities, and states out of the wilderness, and had practical experience building the school boards, sheriff departments, and the county, city, and state governments needed to manage them. Also within the realm of lived experience was the expansion of small towns into (unprecedentedly large) metropolises and the invention of the America’s first multi-national conglomerates. The progressive movement had spent the last three decades experimenting with new forms of government and administration at first the state and then the federal level, while American civic society saw a similar explosion in new social organizations. These include some famous names: the NRA, the American Red Cross, the Salvation Army, the American Bar Association, the Sierra Club, 4-H, the VFW, Big Brothers, the NAACP, the Boys Scouts, the PTA, the United Way, the American Legion, and the ACLU. [3] To a large extent we wander in the ruins of the world this generation built.

Greer argues that this kind of local civic action provides an education in doing things on purpose that then bubbles up to the larger scale:

I understand that the self-organizing neighborhood committee that removes a tree that blocks their street does not go on to build the Empire State Building. My argument is slightly different. To consistently create brilliant poets, you need a society awash in mediocre, even tawdry poetry. Brilliant minds will find their way towards poem writing when poem writing and poem reading is the thing that people do.

Now, I’m some nerd who likes to read a lot of text on the internet. I’m not actually very good at doing things. But I’ve slowly been pushing up my ability to just try things quickly, and that’s going well for me. I guess this notebook blog month thing is one example. I think I want some kind of extra writing discipline on the days-to-weeks level, and this is an easy experiment, so I might as well just try it and see what happens. I don’t think it quite fits what I want, but I’ve learned a whole lot more from doing it than I would have done by abstractly thinking about what I might want. Next month I can do something different.

I got a big one-time boost in my ability to do things on purpose from a single week-long physics workshop I went to a few years ago. It took me from ‘vaguely trying to keep learning physics in my spare time’ to ‘actually focussing on a topic and pushing towards doing serious work outside of academia’. Part of that was finding a community of people with similar interests. Part of it was that the workshop was built from the ground up to encourage agency, in a way that matches Greer’s start-from-everyday-decisions pattern. We were in a beautiful mountain cabin in the middle of nowhere in the Austrian Alps (if you were wondering what the picture is about this time, that’s my attempt at drawing the side of the cabin). So all cooking and cleaning had to be done by us. There was a rota for meals, but everything else was on a whoever-notices-it-fixes-it basis, with no distinction between the organisers and the rest of us. If the bin is overflowing or you need more firewood from the barn, sort it out yourself.

This low-level agency flowed up into higher-level decisions about the workshop. Everyone had to give a talk, and anyone could suggest discussion sessions for the afternoons based on questions that came up. Everybody took turns at moderating discussions and taking notes. If somebody had an idea for a good topic while putting the plates away, they could just gather some interested people and try it.

It also flowed out beyond the workshop, into the small society that runs it. I ended up joining this and it works the same way: anybody can suggest an event, get together some organisers and start running it. Or start some other new initiative, like a newsletter. People are already used to making decisions themselves from the workshop, so new members actually do this. We take turns to go on the board that ‘runs the society’ each year, but this is really just dealing with any admin that comes up and organising a general meeting once a year. Everything else seems to… just happen.

There are obvious downsides to this setup as well. It’s hard to scale; if you’re not fussy enough about entry you get taken over by cranks; if you’re too fussy about entry you turn into an old boys’ club that won’t listen to anybody with slightly different ideas. But it’s definitely good at solving the problem of getting people to do interesting stuff on purpose.

The Bristol Bridge Problem

[image: IMG_20200616_064914978]

[Written as part of Notebook Blog Month.]

Last year I walked the Bristol Bridge Walk with my brother. This is a 28 mile circuit in the style of Königsberg’s famous bridge problem, where every bridge in Bristol (or more accurately, every footbridge across the Avon) has to be crossed exactly once. It was designed by Thilo Gross, who figured out that Bristol’s bridge problem can be solved, unlike Königsberg’s. At least until they build more bridges and mess it up.

I wrote a Twitter thread on this a while back but thought I’d try an expanded version here. The interesting bit for me is how much of Gross’s work in creating this route was in defining the problem, mapping all the mess of modern Bristol onto a clean mathematical model.

The original Königsberg bridge problem has the following structure, with four land masses and seven bridges:

[image: IMG_20200616_065357229]

Euler worked out that it wasn’t possible to cross each bridge exactly once. The solution is simple and short enough that I may as well reproduce it here. He found a clever reframing of the problem, where the four land masses become nodes of a graph, and the seven bridges become edges between them:

[image: IMG_20200616_065341923]

Now, ignore the start and end of the walk for a minute. For the rest of the time, every time you enter a land mass by a bridge, you also leave by a bridge. So if every bridge has been crossed exactly once, then for any node of the graph apart from the start and end ones, there must be an even number of edges coming out of it.

Now this can’t possibly work for Königsberg, because all four nodes have an odd number of edges coming out of them (three each for the two banks and the eastern land mass, five for the central island), and you can only get away with having the two start and end nodes with odd numbers of edges. So the problem is insoluble.
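To make the parity argument concrete, here’s a minimal Python sketch (my own illustration, not anything from Euler or Gross; the land mass names are just labels for the picture above):

```python
from collections import Counter

# The seven bridges of Königsberg as edges between four land masses.
konigsberg = [
    ("island", "north"), ("island", "north"),  # two bridges to the north bank
    ("island", "south"), ("island", "south"),  # two bridges to the south bank
    ("island", "east"), ("north", "east"), ("south", "east"),
]

def odd_degree_nodes(bridges):
    """Return the nodes touched by an odd number of bridges."""
    degree = Counter()
    for a, b in bridges:
        degree[a] += 1
        degree[b] += 1
    return [node for node, d in degree.items() if d % 2]

# In a connected graph, a walk crossing every edge exactly once exists
# only if at most two nodes have odd degree (the start and the end).
odd = odd_degree_nodes(konigsberg)
print(odd)  # all four land masses come out odd, so the walk is impossible
```

Euler’s condition is ‘at most two odd nodes’; Königsberg has four, so no amount of cleverness about the route can help.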

Bristol has a similar overall structure to Königsberg, so the problem maps over quite nicely. There’s one main river, the Avon, with two islands in the middle of it: Spike Island and Redcliffe. Here’s a map:

[image: bristol_map]

So there are four nodes. But modern Bristol has a whole lot more bridges. 45 in fact. (Or maybe 47 now. The two new arena bridges were closed when I did the walk. They don’t change anything though, you just cross one and immediately cross back on the other.) Here’s the graph:

[image: IMG_20200616_072002529]

A is the north bank, B is the south, C is Spike Island and D is Redcliffe. There are so many bridges between Redcliffe and the two banks that I just wrote the numbers instead of drawing each edge. There are also several internal bridges linking the north bank to itself, for reasons I’ll get to.

(I came up with that graph myself, and it might not exactly match Gross’s. I tried to construct it to match his descriptions, but there might be some edge cases I counted slightly differently to him.)

Ignoring the internal edges linking A to A (which can be removed), all nodes have an even number of edges. A has 20 without the internal edges, B has 18, C has 12 and D has 26. So it’s a soluble problem.
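The same check in the same Python sketch style, using just the degree counts above (I haven’t written out the full 45-edge list, so this takes the counts on trust):

```python
# Degrees from the graph above, with the internal A-to-A bridges left out:
# each internal bridge adds 2 to A's degree, so it can never change parity.
bristol_degrees = {"A": 20, "B": 18, "C": 12, "D": 26}

odd_nodes = [n for n, d in bristol_degrees.items() if d % 2]
if not odd_nodes:
    print("All four nodes are even: an every-bridge-exactly-once circuit exists.")
```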

Now, coming up with that graph was a good deal more complicated than just counting bridges. This is the bit I find interesting as an example of problem formation, mapping complicated stuff in the real world onto a clean mathematical model with an unambiguous technical solution. Gross explains all the work he had to do to figure out the scope of the problem, and what should actually count as a bridge:

It required about three solid days of work spread over a fortnight. Two thirds of the time was spent just finding out where all the bridges were, and what was to count as a bridge and what not. Only bridges that are walkable are included and a judgement had to be made about some smaller bridges (not comparable in size to Königsberg’s bridges) on tributaries of the Avon that would have led the resulting walk long distances through some dull areas. Some bridges comprise two separate structures of segregated traffic and are counted as two.

It turns out that Bristol is like Königsberg on the large scale, but when it comes to the details there are a whole bunch of judgement calls. These generally have a reasonable, common-sense resolution, so the resulting solution doesn’t feel arbitrary, but it does mean that solving the problem is not just a quick matter of drawing some nodes and edges. I’ll go through a load of these below.

First up, Gross needed to decide what counts as a bridge. This is purpose-dependent: rail- and car-only bridges don’t count in this solution because the whole point is to be able to walk it.

Next, it’s not exactly true that there are four nodes. If you look closely at the map, there are two little islets off of Spike Island:

[image: islets]

The left islet is kind of complicated, and worth looking at on a satellite view:

[image: small_bridges]

The two bottom footbridges linking the islet to Spike Island are genuine public bridges over the Avon and should definitely be included. The two top ones are little lock gate thingies. I don’t think they’re reliably walkable, and they probably shouldn’t count. So it’s a reasonable choice to ignore them and count the islet as part of the north bank node for the purposes of this graph.

The right islet is simpler: one bridge on, one bridge off. It kind of doesn’t matter much what you do with it – it could be counted separately (in which case it’s a new node with two edges), or shoved in with either the north bank or Spike Island. Gross went for the second option, preserving the similarity with Königsberg. Then the south bridge gets counted as the bridge between the north bank and Spike Island, and the second one gets counted as an internal bridge linking the north bank to itself. (At least that’s what I did. Not sure exactly how Gross formalised it.)

Also, there are a couple of minor rivers emptying into the Avon. There’s the Trym:

[image: trym]

In a sense this cuts the north bank into two: west of the Trym and east of the Trym. Should these be two nodes? And should you have to traverse all the bridges along the Trym as well, right out into the suburbs of Bristol?

Gross goes with no. It’s a small river and is just not a big deal: I don’t think that which side of the Trym you’re on is part of many people’s mental geography in Bristol, in the way that ‘north of the Avon’ (or ‘east of the motorway’) would be. (While walking this we crossed it without noticing, and had to go back to get a photo of the bridge.) Also walking to the outer suburbs and back would be a big hassle and make an already long walk really tedious. So both sides of the Trym are the same node, and the bridges over the Trym that do get crossed as part of this walk are internal bridges linking the north bank to itself.

There’s a similar argument for the Frome:

[image: frome]

This one’s easier, actually, because it’s mostly culverted in the centre and doesn’t really divide up Bristol at all. There are a couple of smaller rivers in the south, the Malago (also culverted) and Brislington Brook (tiny) which are even easier to rule out.

So we’re settled on four nodes. Now for the edges. This is mostly more straightforward, but even here there are special cases:

[image: double_bridge]

Is a bridge like this two bridges or one bridge? The aerial photo makes them look very much like one bridge, as you can see the whole intersection, but as a pedestrian they feel more like two. It doesn’t matter too much, but it would be annoying if the rule was inconsistent in different places. In this case they all get counted as two.

One last decision. There’s a single ‘outlier bridge’ in Bristol, the motorway bridge at Avonmouth. Here’s a map of the whole route area:

[image: avonmouth_st_annes]

This bridge does have a footpath, so it ought to count. But it’s way out in the west, miles downstream from the next bridge, the Clifton Suspension Bridge. It adds maybe a third of the route just on its own. It would be a reasonable decision to leave it out, but also annoying, because it’s such a big important bridge and the walk wouldn’t feel complete. So it’s kept in. This actually works out quite well, because one of the nicest parts of the walk is the path through Leigh Woods on the way back from Avonmouth.

So… is the Bristol Bridge Problem soluble? It is, in the sense that you can make reasonable arguments for dividing Bristol up into nodes and edges in a way that gives you a soluble problem. It isn’t, in the sense that you could probably make different reasonable arguments for dividing it up in a different way that doesn’t. (It’s fairly robust to small changes, but if you dropped the Avonmouth outlier and counted the double bridges as single ones you could maybe get something that isn’t soluble). In the end the arguments for it being soluble were all good enough that I was happy to go along with them for the sake of an interesting walk. It’s soluble enough for my purposes.

I really like this as a demonstration of the work of problem formation. It’s easy to understand without a specific technical background: the mathematical model is a simple graph, and the decisions about how to map Bristol on to it are all easy-to-understand common sense sorts of decisions. But it’s still not a trivial toy problem you’d find in a textbook. Gross had to put days of work into finding a good formalisation, and the decisions are genuinely hard to call without being completely arbitrary. It’s a great example of mapping between a technical problem and the mess of the world.

Having opinions in public

[image: IMG_20200614_131146919]

[Written as part of Notebook Blog Month.]

I’m veering off the post ideas list today to talk about something that’s come up during the month. David MacIver’s been on a bit of a campaign recently to get people to try a daily writing practice, and has a good Twitter thread and blog post on how to get started. (His notebook blog is the main inspiration for my current notebook experiment.)

It turns out I have some opinions on this, focussed on a slightly different but complementary area: getting over emotional blocks around writing in public at all. If I had time I might try and rewrite this into a more generally applicable ‘advice’ type format. But I don’t, so I’m just going to ramble about what I did, and hope it’s possible to extract something useful out of that. Probably the main points are the classic ‘start with something low-stakes and gradually ramp up’, but maybe the details will turn out to be helpful too.

(I do still have an emotional block around anything that looks too much like ‘advice’ or ‘teaching’, actually. Like it’s claiming more authority than I have? I’m feeling some of that even writing this rambling sort of post – I mean, it’s not like I write some amazingly well-written mega-popular blog. But I do 1. regularly write posts that 2. some people read and get something useful out of, and that’s probably a more helpful reference point for people starting out anyway.)

The mechanics of writing blog posts came pretty easily to me. I know how to write in a fairly easy and unforced conversational voice, and have the result be coherent and entertaining enough that people will bother to read it. I have a decent intuitive grasp of how to explain why I care about a topic, and how to give enough background that people can follow along. I didn’t have to work on any of this deliberately and don’t have much insight into how I do it. I’m sure I could still improve a lot at these things, but it’s not been a focus for me.

Getting into a habit of writing frequently and coming up with topics to write about was hard at first, but over time both of these problems just solved themselves. I didn’t specifically put much thought into either, and David’s advice for getting round these at the start (minimal success criterion of ‘just write one sentence’, random book page to generate easy prompts) looks good to me.

What didn’t come easily for me, at all, was hitting the damn Publish button. Even under a pseudonym. I actually started from a level that seems extremely pathetic in retrospect, where I struggled to write comments on other people’s blogs under a consistent pseudonym. I’d write one comment under some name, and then I’d go back and nerve myself to look at it the next day, and it would look so cringily pointless and stupid that I’d never be able to make myself use that name again.

Come to think of it, I originally had this problem even with writing I didn’t publish. I’d write it, and the next day it would look atrociously bad, so I’d just delete it or bin it. So I have very little surviving writing from before age 25 or so. I’m not too sure how I solved that one, except that I kept going and eventually it looked less bad (combination of my writing improving and me dropping my standards way down from the very stupid ‘everything should be crafted carefully at the word level with no stale metaphors, Politics and the English Language style’ ones I’d picked up). I think this sort of ‘taste gap’ is very common when your ability to recognise good writing gets too far ahead of your ability to produce it, but I just had a really overly strong disgust reaction for some reason.

Solving writing in public was also quite accidental at first. I had a lot of time to waste and spent some of it going down the rabbit hole of something called rationalist-adjacent tumblr. I think it’s still going, but the peak was probably around 2015 or so. It’s hard to describe exactly what it was like, but a short version is ‘a bunch of clever but bored and unproductive people with fantastically distinctive writing styles and aesthetics having endless repetitive stupid arguments about dust specks and cupcakes and trying to befriend anybody who came along to join the argument.’ (Those two links give a better idea.) I never quite became part of the community there, more of a lurker and occasional commenter, but I did start writing. Mainly because it was incredibly low-stakes, and there was an easy ladder of increasing difficulty. First just reading and liking posts, then replying with the odd comment, then writing my own posts on low-controversy topics nobody was going to start a big fuss about. I started exploring my thoughts on mathematical intuition and discovering that I had something to say.

Around this time I started figuring out what was happening, and stretching my abilities deliberately. One big influence was Sarah Constantin’s post A Return To Discussion. This was aimed at people who were fleeing the spotlight of public discussion:

What I have noticed personally is that people have gotten intimidated by more formal and public kinds of online conversation. I know quite a few people who used to keep a “real blog” and have become afraid to touch it, preferring instead to chat on social media. It’s a weird kind of locus for perfectionism — nobody ever imagined that blogs were meant to be masterpieces. But I do see people fleeing towards more ephemeral, more stream-of-consciousness types of communication, or communication that involves no words at all (reblogging, image-sharing, etc.) There seems to be a fear of becoming too visible as a distinctive writing voice.

But it also turned out to be useful for someone trying to learn how to do it in the first place. It called out some of my strategies and pushed me towards doing better:

You can preempt embarrassment by declaring that you’re doing something shitty on purpose. That puts you in a position of safety. You move to a space for trashy, casual, unedited talk, and you signal clearly that you don’t want to be taken seriously, in order to avoid looking pretentious and being deflated by criticism. I think that a lot of online mannerisms, like using all-lowercase punctuation, or using really self-deprecating language, or deeply nested meta-levels of meme irony, are ways of saying “I’m cool because I’m not putting myself out there where I can be judged. Only pompous idiots are so naive as to think their opinions are actually valuable.”

Very on point for someone with a tumblr called ‘drossbucket’! (That I’m-being-shitty-on-purpose thing did get me writing, though, even if it’s limiting in the long run. If you need it to start writing at all, I’d say go with it.)

That post also gets into the moral dimension of being able to have opinions in public, in terms of accountability and open debate. I’m not going to get into that, partly because it would take too long and partly because I don’t feel like I’m there yet. Still too conflict-averse to stick my neck out on most things that matter.

Still, I’ve kept working on it, increasing the difficulty level in small steps. I got this wordpress blog and moved over there, starting off by writing the same sort of short posts I wrote on tumblr. Then I started upping the quality standards and writing longer pieces that I’d obviously put time and thought into. No hiding behind the shitty-on-purpose label if it turned out bad. I got a Twitter account and started talking to people there. Then I attached the blog to my real name, but didn’t advertise that much. Then I added my real name and face to Twitter as well.

I was helped a lot by ending up in a fantastic corner of Twitter where people are really thoughtful and encourage each other with their writing projects, rather than trying to knock each other down and look clever all the time. Same for blog comments. I’d never have managed it otherwise. I’m not at the point where I could venture into some fraught culture war quagmire, or open up about anything deeply personal and upsetting. (I’m not even sure that I want to be able to do those things!) But by now there’s a pretty wide scope of topics where I’m able to blab out some thoughts with my name attached. And I want to keep pushing the scope of what I can do. Either directly on this blog, or by spinning up new alts to explore something different and merging them in.

So, yeah, I guess my advice did boil down to ‘start small and ramp up’ in the end. I hope the details are interesting to read though. Good luck to anyone trying this!

Thin technical terminology

[image: IMG_20200612_094448768]

[Written as part of Notebook Blog Month.]

Short one today on a weird aesthetic preference I have. It’s not particularly important but I’m curious whether anybody else has this, or whether it even makes sense to other people.

This relates back to the two types of symbol I was talking about a few posts back in Roses and traffic lights. Symbols as centres of clusters of rich associative meaning (‘rose symbolism in poetry’), vs symbols that are deliberately kept free of secondary associations so that they can stand for one unambiguous technical meaning (‘red traffic light means stop’).

Anyway the thing I really viscerally dislike is when I’m trying to learn a new technical topic, and a word that has a rich set of associations in everyday life is suddenly being repurposed for the ‘traffic light’ type of thin technical meaning. It’s just… bleurgh… and slows me down when learning certain things.

I’ll give an example. One place I’ve noticed this strongly is reading about cryptography and networking protocols. The technical vocabulary of ‘keys’ and ‘certificates’ and ‘sockets’ takes very ordinary words that have vivid sensory associations, and abstracts out one very specific core property. A cryptographic key isn’t metallic or shiny or toothed like a ‘proper’ key, but it does have the core property of having to be exactly correct, or it won’t open the thing you want to get at. I agree that this is the abstract heart of being a key, but somehow the word is still unpleasant without its cloud of visual and tactile connotations.
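Here’s a tiny sketch of that ‘exactly correct’ property, using the third-party Python cryptography library (my own toy example, nothing to do with any particular protocol):

```python
from cryptography.fernet import Fernet, InvalidToken

key = Fernet.generate_key()            # 32 random bytes, base64-encoded
token = Fernet(key).encrypt(b"meet by the old bridge")

# The exactly-right key opens it.
assert Fernet(key).decrypt(token) == b"meet by the old bridge"

# Any other key opens nothing at all. No teeth, no jiggling it in the
# lock, no partial fit: one bit off and you just get an error.
try:
    Fernet(Fernet.generate_key()).decrypt(token)
except InvalidToken:
    print("wrong key, no dice")
```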

(There’s a connection to the Bell Jar stuff I quoted in the cognitive decoupling elite post – it’s similar to Esther’s dislike of the physics class.)

I know that if I stuck with the topic it would become better again. I’d start developing associations between the technical words, and then they’d start to have this rich meaning again. If I do get to this other side then technical topics suddenly become REALLY INTERESTING, and I can get obsessive. But I have to put in a lot of activation energy to get over this initial phase, and a lot of the time I just can’t be bothered.

I have the feeling that some people do have the ability to rapidly get interested in stuff just based on thin meanings, like mathematical definitions or rules for puzzles… but I don’t really know. Maybe they are just quicker at converting to thick meaning?