Speedrun: Abacus schools

(This is a speedrun post, where I set a one hour timer to see what I can find out about a subject. See the category tag for more examples.)

I’m currently reading Catarina Dutilh Novaes’s Formal Languages in Logic, and one part of the section on the historical development of mathematical notation jumped out at me as potentially interesting. Abbaco (‘abacus’) schools were a kind of practical school in medieval southern Europe that trained the sons of merchants and artisans in useful mathematics for bookkeeping and business. Apparently the mathematical culture associated with these schools actually went beyond the university education of the time in some respects, and helped push forward the development of algebra:

Indeed, modern algebra (and its notation) will ultimately emerge from the sub-scientific tradition of the abbaco schools, rather than the somewhat solidified academic tradition taught at the medieval universities.

I find this sort of semi-informal institution on the edges of academia intriguing… I’m not sure how much I care about the details, but it seems worth an hour of investigation at least. There’s also a mention of Leonardo da Vinci and Dante Alighieri attending these schools, which could be interesting to follow up.

This speedrun session is also a bit different because we’re trying out a group speedrun event, and David MacIver and Eve Bigaj have also joined. Let’s see how it goes… As usual I typed this as I went and have done only minor tidying up afterwards, so there may be a bunch of typos and dodgy formatting.


There’s a wikipedia article, but it isn’t very long. Looks like there are a few other useful links though

Abacus school is a term applied to any Italian school or tutorial after the 13th century, whose commerce-directed curriculum placed special emphasis on mathematics, such as algebra, among other subjects. These schools sprang after the publication of Fibonacci’s Book of the Abacus and his introduction of the Hindu-Arabic numeral system. In Fibonacci’s viewpoint, this system, originating in India around 400 BCE, and later adopted by the Arabs, was simpler and more practical than using the existing Roman numeric tradition. Italian merchants and traders quickly adopted the structure as a means of producing accountants, clerks, and so on, and subsequently abacus schools for students were established.

So, yep, practical education for merchants and traders.

Significant for a couple of reasons. First they got rid of Roman numerals.

The number of Roman characters a merchant needed to memorize to carry out financial transactions as opposed to Hindu-numerals made the switch practical. Commercialists were first introduced to this new system through Leonardo Fibonacci, who came from a business family and had studied Arabic math. Being convinced of its uses, abacus schools were therefore created and dominated by wealthy merchants, with some exceptions

Also they were instrumental in rising literacy levels.

Nothing about algebra here! Another thing on the search page mentioned Cardano though so hopefully there will be a link.

Then there’s a bunch of stuff about the school system.

Italian abacus school systems differed more in their establishment than in their curriculum during the Middle Ages. For example, institutions and appointed educators were set up in a number of ways, either through commune patronage or independent masters’ personal funds. Some abbaco teachers tutored privately in homes. All instructors, however, were contractually bound to their agreement which usually meant that they could supplement their salary with tuition fees or other rates.

Could be an overlap here with medieval guild funding of universities (e.g. in Bologna), another subject I’m considering speedrunning on.

Independent teachers could also be hired by the commune, but for lower wages.[19] Most times, free-lance masters were contracted by a group of parents in a similar fashion to that of communal agreements, thus establishing their own school if the number of students being tutored was significant in size.[20] Abbaco apprentices training to become masters could also tutor household children and pay for their studies simultaneously.

Last (short) section is on the curriculum.

Arithmetic, geometry, bookkeeping, reading and writing in the vernacular were the basic elementary and secondary subjects in the abbaco syllabus for most institutions, which began in the fall, Mondays through Saturdays.

… Mathematical problems dealt with the everyday exchange of different types of goods or monies of differing values, whether it was in demand or in good quality, and how much of it was being traded. Other problems dealt with distribution of profits, where each member invested a certain sum and may have later withdrawn a portion of that amount

Well that wasn’t a very informative article. There isn’t one in Italian either, just Arabic (same info as English) and Persian (a stub where I’m not going to even bother to hit translate). So I need to leave wikipedia very early.

OK, this looks good and more what I was after. ‘Solving the Cubic with Cardano – Aspects of Abbaco Mathematics’ by William Branson.

To understand the abbaco mathematics used by Cardano, we have to step back and look at the medieval tradition of abbaco schools and their masters. Though the subject is a fascinating and deep one, there is one particular aspect of this tradition that is crucial in the following account: abbaco masters thought in terms of canonical problems, and one particular canonical problem, the “Problem of Ten,” arises in the solution of the cubic that we will examine.

Quick summary of what they were, similar to wikipedia.

Abbaco mathematics was rhetorical—in Cardano’s time, most of the algebraic symbols with which we are so familiar were either recently invented, concurrent with the Ars Magna, or were well in the future. For example, ‘+’ and ‘–’ were first recorded in the 1480s, and were not in common use in 1545, when the Ars Magna was published. Robert Recorde would not invent the equals sign until 1557, and the use of letters and exponential notation would have to await Francois Viete in the 1590s and the Geometrie of Rene Descartes of 1637 [Note 2]. What Descartes would write as \(x^3=ax+b\), Cardano wrote as “cubus aequalis rebus & numero” [Cardano 1662, Chapter 12, p. 251].

OK this is similar to what Dutilh Novaes was saying, people were solving problems that were algebraic in nature with unknowns to solve for, but the notation was still very wordy.

Rhetorical formulas can be difficult to remember, so algebraic rules were presented with canonical examples, which encoded the rules as algorithms within the examples. Thus, the mind of the abbaco master was a storehouse of such canonical examples, to which he compared the new problems that he came across in his work. When he recognized a parallel structure between the new problem and a canonical problem, he could solve the new problem by making appropriate substitutions into the canonical example.

So these ‘wordy’ forms still had some kind of canonical structures, it wasn’t just free text but was a kind of notation.

Such canonical examples occurred even in the foundational texts of abbaco mathematics, including the Algebra of al-Khwarizmi. An important example for us, one that occurs implicitly in Cardano’s solution to the cubic, is the “problem of ten” [Note 3]. Most abbaco texts had such problems, and one from Robert of Chester’s 1215 translation of al-Khwarizmi’s Algebra into Latin [al-Khwarizmi, p. 111] ran as follows:

Denarium numerum sic in duo diuido, vt vna parte cum altera multiplicata, productum multiplicationis in 21 terminetur. Iam ergo vnam partem, rem proponimus quam cum 10 sine re, quae alteram partem habent, multiplicamus…

In his translation of this passage into English, Louis Karpinski used \(x\) for ‘rem’ (thing), and so I offer my own translation, without symbols [Note 4]:

Ten numbers in two parts I divide in such a way, in order that one part with the other multiplied has the product of the multiplication conclude with 21. Now therefore one part we declare the thing, and then, with 10 without the thing, which the other part is, we multiply…

My god I can’t even be bothered to read all of that… very glad we don’t do maths like that now…

The structure of the “problem of ten” was that of a number \(a\) broken into two parts \(x\) and \(y\), with a condition on the parts; symbolically: \[x+y=a\,\,{\rm and}\,\,f(x,y)=b\] for some function \(f(x,y)\) and number \(b\). The usual method of solution was to express the two parts as “thing” and “number minus thing” and then to substitute into the condition, as al-Khwarizmi did above. The “problem of ten” was canonical for quadratic problems, and served as a way to remember the rules for solving such problems.
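
To make the ‘thing’ and ‘ten minus thing’ substitution concrete, here is a minimal sketch of the al-Khwarizmi example quoted above (split 10 into two parts whose product is 21). The function name problem_of_ten is made up for illustration, and finishing with the modern quadratic formula is my own shortcut rather than the rhetorical procedure an abbaco master would have used.

```python
import math

def problem_of_ten(total, product):
    """Split `total` into two parts whose product is `product`.

    Call one part the "thing" x, so the other part is total - x.
    Substituting into the condition gives x * (total - x) = product,
    i.e. x^2 - total*x + product = 0, which is an ordinary quadratic.
    (Hypothetical helper, not taken from any of the sources.)
    """
    discriminant = total * total - 4 * product
    if discriminant < 0:
        raise ValueError("no real split exists for these numbers")
    root = math.sqrt(discriminant)
    smaller = (total - root) / 2
    larger = (total + root) / 2
    return smaller, larger

# The example from Robert of Chester's translation: parts of 10 with product 21.
print(problem_of_ten(10, 21))  # (3.0, 7.0)
```

So the two parts are 3 and 7. Any other problem with the same shape can reuse the recipe by swapping in different numbers, which seems to be roughly what ‘canonical’ means here.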

This was used in Cardano’s solution to the cubic, apparently, but there’s no more detail on this page, it just ends there. Looks like a book extract or something.

There’s another MAA page on abbaco schools, though, so I’ll read that next. This is ‘Background: The Abbaco Tradition’ by Randy K. Schwarz.

Bit more detail on where these schools were:

They arose first in northern Italy, whose economy was the most vibrant in Europe during this period (Spiesser 2003, pp. 34-35). A banker and official in Florence, Italy, reported that in 1345 at least 1,000 boys in that city alone were receiving instruction in abbaco and algorismo (Biggs 2009, p. 73). Such schools also began to appear in neighboring southern France, and a few in Catalonia (the area around Barcelona, Spain) and coastal North Africa. These four regions of the western Mediterranean had extensive trade and cultural ties with one another at the time, so it isn’t surprising that they shared methods of practical mathematics and its instruction (Høyrup 2006).

Mentions the Fibonacci book again as a common ancestor. Ah so this is why Fibonacci knew this stuff:

He was only a boy, he reports, when his father, a customs official representing Pisan merchants at their trading enclave of Bugia, in what is now Algeria, brought him to the customs house there to be taught Hindu-Arabic numerals and arithmetic (Sigler 2002, pp. 3, 15)

This article is part of a series on something called the Pamiers manuscript, which translated some of this into French maybe? or some language in modern France anyway. look up later if time.

Nice picture of teaching in an abbaco school here.

In general, the abbaco texts offered practical, simplified treatments in which mathematical techniques were distilled into easy-to-remember rules and algorithms. The focus was on how to carry these out rather than on justifying the theory behind them. At the same time, the books were often innovative in their solutions to particular problems and especially in their pedagogical approach: their presentation was popular, and they introduced the use of illustrations and vernacular languages to the history of mathematics textbooks.

Reference here to something called Swetz 1987, ‘Capitalism and Arithmetic: The New Math of the 15th Century’.

OK this article finishes here too… and I still have 34 minutes, this might be a difficult speedrun for finding information. I may as well skim the intro page and find out what the Pamiers manuscript is while I’m here.

Pamiers is in the far south of France, south of Toulouse near the Pyrenees. Written in the Languedocian language.

One of the striking features of the Pamiers manuscript is the fact that it includes the world’s earliest known instance in which a negative number was accepted as the answer to a problem for purely mathematical reasons. The fact that this occurred in the context of a commercial arithmetic, rather than a more scholastic or theoretical work, is a surprise.

Ah, nice, this is the sort of thing I was hoping for, new ideas coming up in the context of practical problems.

Back to wikipedia for now, what else can I find?

I found a pdf by Albrecht Heeffer which is very short but does mention one interesting book.

The abbaco or abbacus tradition (spelled with double b to distinguish it from the material calculating device called ‘abacus’) has the typical characteristics of a so-called ‘sub-scientific’ tradition of mathematical practice (coined by Jens Høyrup). It is supported by lay culture, e.g. merchants, artisans and surveyors. Knowledge is disseminated through master-apprentice relationships, often within family relations. Texts, as far as they are extant, are written in the vernacular. The tradition is open to foreign influences, including cross-cultural practices. Typically, the tradition is underrepresented in the history of mathematics.

Dutilh Novaes also mentioned the Høyrup book so maybe that is what I should really be reading. It’s this ‘sub-scientific’ angle that I’m interested in.

Abbaco masters made subtle but important contributions to the development of early symbolism. Their two centuries of algebraic practice paved the road for the development of symbolic algebra during the sixteenth century. They introduced mathematical techniques such as complete induction which is believed to have emerged a century later

Yeah, ok, so this is an interesting subject but I probably need to be reading books to find the good bits, rather than skimming the internet. Similar to Vygotsky speedrun maybe.

Let’s find out what this Høyrup book is called. Ah it must be this book mentioned on his wikipedia page: ‘Jacopo da Firenze’s Tractatus Algorismi and Early Italian Abbacus Culture.’ Yes I’m definitely going to buy these chapters off Springer for 25.95 euros each, sounds like a great idea.

Ah here’s a copy of a pdf by Høyrup! It’s 34 pages so I don’t have time to go into the details, but I can skim it. Hm also it looks like it’s mainly arguing about the centrality of Fibonacci in the tradition, I’m not interested in that, I’m interested in the sub-scientific thing.

First though I’d like to chase up that thing about Dante and da Vinci.

20 minutes left.

Search ‘da Vinci abbacco school’, oh god the results are full of random schools named after him and references to The Da Vinci Code. Must include: abbaco.

I have found another vaguely useful paper though, ‘The Market for Luca Pacioli’s Summa Arithmetica’ by Alan Sangster and others. Something here about the two-track nature of education in Renaissance Italy, with these schools at the practical end.

The curriculum of the vernacular schools emerged from the merchant culture and was designed to prepare sons of merchants and craftsmen for their future working lives [Grendler, 1990]. There was another parallel set of schools, the Latin (either scholastic or humanist) schools, where the sons of the privileged were taught in Latin.

The two sets of schools taught very different subjects. The Latin schools sought to teach the future leaders of society and those that aided them, e.g., secretaries and lawyers [Grendler,1989, p. 311]. They specialized in the trivium of grammar, rhetoric, and logic… On the rare occasions when mathematics was taught in these schools, it took the form of “classical or medieval Latin mathematics” [Grendler, 1989, p. 309]. In contrast to the vernacular schools, boys leaving the humanist schools often went to university.

Hang on, why don’t I just look on da Vinci’s wikipedia page? It just says the following:

Despite his family history, Leonardo only received a basic and informal education in (vernacular) writing, reading and math, possibly because his artistic talents were recognized early.

which would at least be consistent with going to one of these schools. And Dante Alighieri:

Not much is known about Dante’s education; he presumably studied at home or in a chapter school attached to a church or monastery in Florence.

Hm, so what did Dutilh Novaes say? Ah, it’s a quote from Heeffer 2007, ‘Humanist Repudiation of Eastern Influences in Early Modern Mathematics’. Pdf is here. Should have looked this up to start with!

Actually I’m confused because, although this is very relevant looking, it doesn’t have the quote in it at all. Ah well, I may as well read it for the rest of the time anyway (only 5 minutes left!). The thing about Dante and da Vinci isn’t really important.

Here’s some more on the sub-scientific idea:

Jens Høyrup coined the term sub-scientific mathematics for a long tradition of practice which has been neglected by historians. As a scholar working on a wide period of mathematical practice, from Babylonian algebra to the seventeenth century, Høyrup has always paid much attention to the more informal transmission of mathematical knowledge which he calls sub-scientific structures.

This is pretty complicated to skim quickly.

The sub-scientific tradition was a cross-cultural amalgam of several traditions. Merchant type arithmetic and recreational problems show a strong similarity with Indian sources. Algebra descended from the Arabs. By the time Regiomontanus learned algebra in Italy it was practiced by abbaco masters for more than 250 years. The tradition of surveying and mensuration within practical geometry goes back to Babylonian times.

Some stuff on ‘proto-algebraic rules’.

Our main hypothesis is that many recipes or precepts for arithmetical problem solving, in abbaco texts and arithmetic books before the second half of the sixteenth century, are based on proto-algebraic rules. We call these rules proto-algebraic because they are, or could be based originally on algebraic derivations. Yet their explanation, communication and application do not involve algebra at all. Proto-algebraic rules are disseminated together with the problems to which they can be applied. The problem functions as a vehicle for the transmission of this sub-scientific structure. Little attention has yet been given to sub-scientific mathematics or proto-algebraic rules.

Ding! Time’s up.


Hm, that was kind of annoying to do a speedrun on, because the Wikipedia article was so short and I had to jump quickly to a bunch of other sources which all either had very limited detail or way too much detail. I never did get to the bottom of the Dante and da Vinci thing.

I’m also still not that clear on the details of exactly what new techniques they introduced, but looks like they were relevant to Cardano’s solution of the cubic, and also to the use of negative numbers in problems. They also introduced a bunch of schematic templates for solving problems, which later developed into modern algebraic notation.

The idea of ‘sub-scientific’ traditions sounds interesting more generally too, maybe I should look up the Høyrup book. Overall this looks like a topic where I’m better off reading books and papers than skimming random web pages.

Crackpot time 3: speculations will turn out well?

In 2017 I wrote two posts about my experiences with attempting to do physics outside of academia, which I called Crackpot Time 1 and Crackpot Time 2. At the time I was trying to reconnect to a more expansive, free-ranging energy that I had lost during the hyperfocus on technical details required for Ph.D. work. I was enjoying the ‘crackpot’ label as a kind of tongue-in-cheek pointer to the style of thinking I was trying to cultivate. I wanted to directly attack any topic that looked interesting, without fussing about whether the topic was ‘too ambitious’, or ‘too difficult’, or ‘not my field’. Small details like a total lack of relevant expertise didn’t matter.

I had a lot of this kind of energy in 2017, which was a very good year intellectually for me. I went to two deeply unusual and inspiring physics workshops that immediately raised my ambitions for what it would be possible for me to do in my spare time alongside a full time job. At the same time I was starting to take my side interest in mathematical intuition more seriously, and get oriented reading some phenomenology for the first time, so it was an intense time where I felt like the horizon was opening up fast in all directions. I started this blog and cranked out a bunch of short, unpolished but enthusiastic blog posts to try and make some sense of my thoughts.

I’ve been meaning to write another Crackpot Time update ever since, but just… never have. Partly that’s because I started a monthly newsletter practice in 2018 that took over some of the same role. But also it’s the standard inspiring workshop problem: the inspired feeling eventually wears off and then you have to do the hard bit, which is doing the actual work. This is less immediately exciting and doesn’t autogenerate breathless updates about how amazing everything is, so they stopped appearing. I’ve finally decided to crank one out anyway, even if it’s effortful and uninspired.

At the beginning of 2020 I got this fortune cookie for Chinese New Year. Perfect fortune for a crackpot, right?

I’m now trying to evaluate whether speculations did in fact turn out well. It’s weirdly hard to decide. I’m normally at least somewhat confused by my progress – trying to do independent work in a complicated domain is slow and ambiguous at the best of times – but I think this is the most confused I’ve been in a long time. Long 2020 has obviously been enormously strange for everyone, and then on top of that I’m in a hard-to-interpret stuck phase. This is my best attempt to explain what I’ve been up to, and where I’m at now.

Focus and accountability

I’m not going to try and go over everything I’ve done since 2017, nobody cares including me, but I’ll do a few quick catch-up paragraphs to get me to the beginning of 2020. I had two good strategic ideas at the start of 2018. The first was to pick a very specific topic to focus on. My natural tendency is to dissipate my energies going partway down some interesting rabbithole before getting distracted by something else, and only end up with a very vague high-level understanding of anything. Useful for getting a sense of the territory, useless for making any sort of meaningful contribution to it.

To counteract that, I picked a single 8-page paper, A toy model for quantum mechanics by S. J. van Enk, as my focus for the whole year. I had some sense that this particular paper would be a good anchor for me, and that turned out to be correct. The core toy model is very concrete and easy to play around with, but touches on a number of ideas in quantum foundations that interest me – negative probabilities, the phase space formulation of quantum physics, the Spekkens toy model. There are also potential intriguing connections to my favourite recurring fascination, retrocausal interpretations of quantum physics. Having to stick close to the anchor paper meant I could explore aspects of these big topics without disappearing off into uselessly ungrounded speculation.

The second good idea was to use a monthly email newsletter as an accountability mechanism, inspired by this post. This wasn’t a Substack or anything, just a bog standard email that I sent out to a handful of people. I’d ramble a bit about what I’d done in the month, and that gave me a bit more incentive to stay on track. I stuck fairly closely to the area of this paper for the whole of 2018 and didn’t stray much further in 2019 either. This gave me far more focussed knowledge than I’d managed to pick up before working on my own.

At the beginning of 2018 I wrote the following:

My plan for 2018 is to go beyond just learning some physics in my spare time and to do ‘something novel’, interpreted broadly. ‘Novel’ in this case doesn’t have to mean original research (though that would definitely count) – I’m thinking of a wider conception of what counts as a novel contribution, in the style of Chris Olah and Shan Carter’s Research debt essay (I wrote some comments on it here).

I’ve never been too fussed about whether anything I do is original in the sense required for an academic physics paper, as a completely new technical contribution to the field. But my ambitions are higher than just passively making notes from a textbook. I want to follow my own curiosity trail through a subject, write down what I notice on the way, and highlight ideas and connections that currently aren’t available in digested blog post form. The sort of work that Olah and Carter call ‘research distillation’ in the essay linked above.

This took longer to spin up than I was initially hoping for, and I spent most of 2018 just learning background and writing notes. I finally got going in 2019 and had a few thoughts on negative probabilities from a somewhat novel angle, which produced a couple of posts and a mildly popular twitter thread. So that takes me up to 2020, and the fortune cookie.

Long 2020

In early 2020 I had a tedious hour+ two-bus commute to work and sometimes skimmed some interesting-looking papers on my phone. Otherwise I wasn’t getting much done, because my energy was sapped by the stupid commute. I decided to have a twitter break in February to claw back whatever time I could, which worked fairly well. Some time near the end I spent a Saturday holed up in a corner of Bath University library where I had an idea for a very basic toy model that was quite limited by itself but maybe extensible in some interesting way. I was excited to figure out what it could do and started fiddling around with that for the next week or two.

I got back on twitter on March 1 to discover everything had been replaced by coronavirus panic, which was a big shock to me because I had almost completely ignored it until then. So I started catching up on panic, and the toy model went out of my head for the next couple of months along with everything else that wasn’t covid. I no longer had the bus commute, but I also couldn’t think properly, so that didn’t help much.

After a couple of months my brain came back online at least partially, but the toy model was completely dropped. (I still haven’t managed to pay it any consistent attention, it’s a loose thread at the moment.) Instead I remembered the papers I’d been reading on the bus. I’d been learning about Abramsky and Hardy’s logical Bell inequality work, and I realised that I could use the tools from this to finish off a half-baked idea for a post on Bell’s theorem that I had, connecting a classic popular-science explanation to the version you’d find in a textbook. The logical Bell inequality techniques made a natural bridge between the two, and over the summer I was able to use this idea to extend my scrappy notes into a full post that I was pretty happy with. I was finally managing the kind of distillation work I’d been thinking about at the start of 2018.

After that I was on a roll, and found a second use for the logical Bell techniques. In my 2019 posts on negative probabilities I used a very simple toy model created by Dan Piponi as an illustrative example. I picked it because it was simple, but I was also intrigued by its relation to quantum physics – it’s structurally similar to qubit phase space, but the specific numbers are different. In a sense it’s even further from classical physics, with the negative probability being more negative than anything allowed in quantum physics.

I’d noticed before that this was interestingly parallel to a much more well-known case of something being ‘worse than quantum physics’, the Popescu-Rohrlich box, but thought it was only a vague similarity. Once I had the logical Bell tools I realised that there was an exact numerical analogy. I couldn’t find this described anywhere else, so I started writing it up.

Unfortunately this took long enough that it took me into the long depressing UK lockdown winter. The news was a constant stream of miserable statistics from the new covid variant mixed in with increasingly batshit US election nonsense, the weather was dark and grey, and working from home was getting more and more tedious. I eventually managed to finish the ‘worse than quantum mechanics’ stuff and get it out as two blog posts, but that overstressed my limited ability to care about things and once I published the posts I lost interest. I made some very half-hearted attempts to find out more about whether this was actually novel, and when this wasn’t completely straightforward I just dropped it. That was some time around February and I still haven’t picked it up again.

So… now what?

I’m writing this up now because I suddenly have a lot of free time. I’ve just quit my job – last day was last Friday – and haven’t lined up another one. I’m planning at least a couple of months off before I start thinking seriously about getting a new job. So this would be the perfect time to pick this up again. I’m not too bothered if I can’t get my attention back round to physics, because I have other weird projects that I am still keen to work on, but it does seem like a shame to just drop all this stuff. I’m not going to push it though.

The thing I’m feeling most is the lack of social support. I’m not naturally plugged in to a community of people in quantum foundations who are thinking about similar topics, so it can be difficult to keep motivation. David MacIver has a great newsletter post on Maintaining Niche Interests, where he talks about struggling with the same problem:

“Nobody actually wants to know” is a bit unfair. It’s more like… there are people who are interested, but they are both less interested than I am in the subject, and also I don’t talk to them much. The people who I talk to on a regular basis are not interested, because this is mostly not their field.

I feel it even more keenly in comparison with some of my other interests that I talk about on this blog and newsletter and on Twitter, where I do have some sort of community. I can talk about some pretty niche topics – Derrida, Vygotsky, the Prussian education system – and get meaningful informed responses from other people. Book recommendations, suggestions for related areas to explore, that sort of thing. It’s not the same as being in a densely-networked in-person research group, but it goes a surprisingly long way.

The pandemic has definitely made it worse. I do normally get some sense of shared community from the physics society I’m in, which organises workshops and meetups (including the two really inspiring ones I went to in 2017). But it’s very much a community built around meeting in person, rather than around producing large quantities of English-language text on the public internet. We’ve tried a few online calls and talks, but it’s not the same.

Even without the pandemic, though, I struggle with this. I’m just not very good at collaborating when it comes to physics. A lot of this is rooted in defensiveness – I’m just weird for a physicist, kind of slow and mediocre technically and with an odd thinking style, highly focussed on examples and weak on abstraction. I go into any interaction worrying that I’m going to look stupid and expecting to not be able to get my point across, which makes it even harder to get my point across, which… you get the idea. It’s difficult. I think I could make good incremental progress on this in the same way I made progress on blogging, but getting the right supportive environment to start the feedback loop going is tricky. Physics culture is not known for providing what I want.

In the meantime I’m going to keep plugging on with other projects and not force anything. After all, it’s been a strange enough year that I should probably feel happy that I did anything at all. Hopefully my interest in physics will return soon and I can get a better sense of whether speculations have turned out well.

Speedrun: The Prussian education system

This is another of my research speedrun experiments – I’ve made a category for them now, so look at the earlier ones if you want to know more.

Today’s topic was inspired by this tweet:

I’d noticed this one too. If you hang around parts of the internet where people talk about how School Is Bad a lot, someone will eventually bring up ‘the Prussian education system’ and how it was designed to indoctrinate factory workers or something. There is never any detail beyond this, we all nod sagely and move on.

Presumably there is more to learn about this topic. Let’s set that one hour timer and find out…


Ok, so… um… where’s Prussia? Somewhere round where Germany is now presumably, but which bit?

Prussia was a historically prominent German state that originated in 1525 with a duchy centered on the region of Prussia on the southeast coast of the Baltic Sea… Prussia, with its capital first in Königsberg and then, when it became the Kingdom of Prussia in 1701, in Berlin, decisively shaped the history of Germany.

Ah, so it included Königsberg, of bridge fame. And a big stripe of Baltic coast at its peak (1870 map).

My historical knowledge is not great and this will be a problem for contextualising all this stuff. Ah well, just get a quick sense of time and space. Done space, in terms of time we have:

The name Prussia derives from the Old Prussians; in the 13th century, the Teutonic Knights—an organized Catholic medieval military order of German crusaders—conquered the lands inhabited by them. In 1308, the Teutonic Knights conquered the region of Pomerelia with (Danzig) Gdańsk.

Then bla bla bla usual complicated mid european wars…

The union of Brandenburg and the Duchy of Prussia in 1618 led to the proclamation of the Kingdom of Prussia in 1701.

Prussia entered the ranks of the great powers shortly after becoming a kingdom,[5][6][7][8] and exercised most influence in the 18th and 19th centuries.

Then lots of complicated 20th century history.

… The Kingdom ended in 1918 along with other German monarchies that collapsed as a result of the German Revolution.

etc etc up to

Prussia existed de jure until its formal abolition by the Allied Control Council Enactment No. 46 of 25 February 1947.

Right I am now an expert on Prussia, ten minutes down.

Next is the wikipedia article on the Prussian education system.

The Prussian education system refers to the system of education established in Prussia…

yep I got that bit…

… as a result of educational reforms in the late 18th and early 19th century, which has had widespread influence since. The Prussian education system was introduced as a basic concept in the late 18th century and was significantly enhanced after Prussia’s defeat in the early stages of the Napoleonic Wars. The Prussian educational reforms inspired other countries and remains important as a biopower in the Foucaultian sense for nation-building.

Oh so is Foucault the source of this meme?? ‘Biopower’ is a bit of jargon I hadn’t heard before, open in new tab.

The term itself is not used in German literature, which refers to the primary aspects of the Humboldtian education ideal respectively as the Prussian reforms; however, the basic concept remains fruitful and has led to various debates and controversies.

Open the Humboldtian thing in another tab.

I’ll go through the wikipedia page sections in turn.

Origin

The basic foundations of a generic Prussian primary education system were laid out by Frederick the Great with the Generallandschulreglement, a decree of 1763 which was written by Johann Julius Hecker. Hecker had already before (in 1748) founded the first teacher’s seminary in Prussia.

Haha wtf:

His concept of providing teachers with the means to cultivate mulberries for homespun silk, which was one of Frederick’s favorite projects, found the King’s favour.

So this is in some way related to the king’s pet mulberry growing project??

It expanded the existing schooling system significantly and required that all young citizens, both girls and boys, be educated by mainly municipality-funded schools from the age of 5 to 13 or 14.

OK so this was one of the first systems of tax funded compulsory education. (compare the UK where this happened in the 1880s, it was still fresh history at the time of Lark Rise)

Topics are reading, writing and god stuff:

The Prussian system consisted of an eight-year course of primary education, called Volksschule. It provided not only basic technical skills needed in a modernizing world (such as reading and writing), but also music (singing) and religious (Christian) education in close cooperation with the churches and tried to impose a strict ethos of duty, sobriety and discipline. Mathematics and calculus were not compulsory at the start, and taking such courses required additional payment by parents.

There were also later educational stages preparing for university.

Oh wow so it already had national testing and a national curriculum (that was a big controversy in the UK in the 1990s).

The Prussian system, after its modest beginnings, succeeded in reaching compulsory attendance, specific training for teachers, national testing for all students (both female and male students), a prescribed national curriculum for each grade and mandatory kindergarten.

So it really did have a lot of the features of modern schooling, I see why it comes up so often. Teacher training as well, and credential gating for the civil service:

In 1810, Prussia introduced state certification requirements for teachers, which significantly raised the standard of teaching.[9] The final examination, Abitur, was introduced in 1788, implemented in all Prussian secondary schools by 1812 and extended to all of Germany in 1871. Passing the Abitur was a prerequisite to entering the learned professions and higher echelons of the civil service.

Outreach

The overall system was soon widely admired for its efficiency and reduction of illiteracy, and inspired education leaders in other German states and a number of other countries, including Japan and the United States.

The Japan link could be interesting… won’t follow that tangent…

The underlying Humboldtian educational ideal of brothers Alexander and Wilhelm von Humboldt was about much more than primary education; it strived for academic freedom and the education of both cosmopolitan-minded and loyal citizens from the earliest levels. The Prussian system had strong backing in the traditional German admiration and respect for Bildung as an individual’s drive to cultivate oneself from within.

These reforms ‘… had a background in the middle and upper middle strata of society and were pioneered by the Bildungsbürgertum.’ Look up that word: ‘a social class that emerged in mid-18th century Germany as an educated class of the bourgeoisie with an educational ideal based on idealistic values and classical antiquity. The Bildungsbürgertum could be described as the intellectual and economic upper bourgeoisie’

The concept as such faced strong resistance both from the top, as major players in the ruling nobility feared increasing literacy among peasants and workers would raise unrest, and from the very poor, who preferred to use their children as early as possible for rural or industrial labor.

Reformers got their chance after the defeat of Prussia in the Napoleonic Wars.

In 1809 Wilhelm von Humboldt, having been appointed minister of education, promoted his idea of a generic education based on a neohumanist ideal of broad general knowledge, in full academic freedom without any determination or restriction by status, profession or wealth.

Now some stuff on interaction with the nationalist movement, featuring my friend Fichte from The Roots of Romanticism. OK so he was keen on education reform as a part of his German nationalism project:

Fichte and other philosophers, such as the Brothers Grimm, tried to circumvent the nobility’s resistance to a common German nation state via proposing the concept of a Kulturnation, nationhood without needing a state but based on a common language, musical compositions and songs, shared fairy tales and legends and a common ethos and educational canon.

Then something about a guy called Jahn who liked gymnastics a lot and shoehorned a bunch of it into the curriculum. The forefather of horrible PE lessons.

Also privileging of High German as an official language.

Now a lot of stuff about Pietism.

Pietist theology stressed the need for "inner spirituality" (Innerlichkeit [de]), to be found through the reading of Scripture. Consequently, Pietists helped form the principles of the modern public school system, including the stress on literacy, while more Calvinism-based educational reformers (English and Swiss) asked for externally oriented, utilitarian approaches and were critical of internally soul searching idealism.

Oh I see, this is important, Pietism actually wanted people to read! Yeah so there’s a whole cluster of interest groups coming together.

Shit I’m 30 minutes in and need to speed up a bit. This is all too interesting! Though normally the wiki article tails off later anyway, so maybe I’m ok.

Some stuff about attitudes to teachers:

Generations of Prussian and also German teachers, who in the 18th century often had no formal education and in the very beginning often were untrained former petty officers, tried to gain more academic recognition, training and better pay and played an important role in various protest and reform movements throughout the 19th and into the 20th century… There is a long tradition of parody and ridicule, where teachers were being depicted in a janus-faced manner as either authoritarian drill masters or, on the other hand, poor wretches which were suffering the constant spite of pranking pupils, negligent parents and spiteful local authorities.

Open ‘Biedermeier’ tab though I don’t have time to look at it… ‘an era in Central Europe between 1815 and 1848 during which the middle class grew in number and the arts appealed to common sensibilities’.

Spread to other countries

Austria first under Maria Theresa, then widely after the French Revolution. Estonia and Latvia, Norway and Sweden, Finnish nationalist movement.

France and the UK took longer, ‘France due to conflicts between a radical secular state and the Catholic Church’ and UK just because of generally not liking change I think. Some stuff in the US too, Horace Mann and the common school movement in Massachusetts.

Now a section about tensions between Prussian system and Anglo culture:

The basic concept of a state-oriented and administered mass educational system is still not granted in the English-speaking world, where either the role of the state as such or the role of state control specifically in education faces still (respectively again) considerable skepticism… One of the important differences is that in the German tradition, there is stronger reference to the state as an important principle, as introduced for example by Hegel’s philosophy of the state, which is in opposition to the Anglo-American contract-based idea of the state.

Ah here’s a bit on the interaction between the Prussian system and military and industrial aims:

Early Prussian reformers took major steps to abandon both serfdom and the line formation as early as 1807 and introduced mission-type tactics in the Prussian military in the same year. The latter enlarged freedom in execution of overall military strategies and had a major influence in the German and Prussian industrial culture, which profited from the Prussian reformers’ introduction of greater economic freedom. The mission-type concept, which was kept by later German armed forces, required a high level of understanding, literacy (and intense training and education) at all levels and actively invited involvement and independent decision making by the lower ranks.

Ah so I’m nearly at the end of the article with 18 minutes to go, the rest is postwar legacy and I’d rather stay more in the historical period. I’ll look up Humboldt first and then maybe Foucault’s biopower thing if time?

Humboldtian model

Haha that’s confusing, there are two different Humboldts with two different ideals:

This article is about Wilhelm von Humboldt’s university concept. For the romantic ideal of science related to Alexander von Humboldt, see Humboldtian science.

So this goes beyond vocational training:

Sometimes called simply the Humboldtian model, it integrates the arts and sciences with research to achieve both comprehensive general learning and cultural knowledge, and it is still followed today.

From his letter to the Prussian king:

There are undeniably certain kinds of knowledge that must be of a general nature and, more importantly, a certain cultivation of the mind and character that nobody can afford to be without. People obviously cannot be good craftworkers, merchants, soldiers or businessmen unless, regardless of their occupation, they are good, upstanding and – according to their condition – well-informed human beings and citizens.

Greek classics are important:

Humboldt believed that study of the Hellenic past would help the German national consciousness, reconciling it with modernity but distinguishing it from French culture, which he saw as rooted in the Roman tradition.

Academic freedom independent from political/economic/religious influences.

Study should be guided by humanistic ideals and free thought, and knowledge should be formed on the basis of logic, reason, and empiricism rather than authority, tradition, or dogma.

University reform:

The University of Berlin, founded in 1810 under the influence of Wilhelm von Humboldt and renamed the Humboldt University of Berlin after World War II, is traditionally seen as the model institution of the 19th century.

Fichte was appointed by Humboldt there.

The university’s features included a unity in teaching and research, the pursuit of higher learning in the philosophy faculty, freedom of study for students (Lernfreiheit, contrasted with the prescriptive curricula of the French system) and corporate autonomy for universities despite state funding.

Don’t have time to check now, but I wonder how this interacted with the history of the Ph.D. system. I know that started in Germany…

Haha, France banned beards:

It was in competition with the post-Revolutionary French concept of the grandes écoles. The French system lacked the freedom of German universities and instead imposed severe discipline and control over curriculum, awarding of degrees, conformity of views, and personal habits, instituting, for example, a ban on beards in 1852.

OK 9 minutes left for Foucault’s biopower:

It relates to the practice of modern nation states and their regulation of their subjects through "an explosion of numerous and diverse techniques for achieving the subjugations of bodies and the control of populations".[1] Foucault first used the term in his lecture courses at the Collège de France,[2][3] and the term first appeared in print in The Will to Knowledge, Foucault’s first volume of The History of Sexuality. In Foucault’s work, it has been used to refer to practices of public health, regulation of heredity, and risk regulation, among many other regulatory mechanisms often linked less directly with literal physical health.

So, control of the state over people’s bodies.

Modern power, according to Foucault’s analysis, becomes encoded into social practices as well as human behavior, as the human subject gradually acquiesces to subtle regulations and expectations of the social order. It is an integral feature and essential to the workings of—and makes possible the emergence of—the modern nation state, capitalism, etc.

Hm this is going to take me a long way off topic. The article has no mention of Prussian anything. Let’s go back to something else for 5 minutes… what’s this Bildungsbürgertum article…

As a class of wealthy non-noble people, emerging first in the free imperial cities, they gained material wealth, social position and a better education, which was based on Humboldt’s educational ideal. The idea of Bildung (i.e. culture, education) was shaped by a belief in human perfectibility, specifically that an individual’s potential could be realized through a classical education.

In the late absolutist management state there existed a need for a large number of educated officials to implement reforms. To avoid a violent revolution, as in France, a national class was formed that had access to cultural education and thus to political positions. As a result, many educational institutions were established, significantly more in Germany. The universities established in Germany, including the Humboldt University, became a model for modern universities in other countries. This new class was not primarily defined politically or economically, but mainly culturally.

And the Biedermeier article?

Although the term itself derives from a literary reference from the period, it is used mostly to denote the artistic styles that flourished in the fields of literature, music, the visual arts and interior design.

The Biedermeier period does not refer to the era as a whole, but to a particular mood and set of trends that grew out of the unique underpinnings of the time in Central Europe

Ah so the word comes from a parody:

The term "Biedermeier" appeared first in literary circles in the form of a pseudonym, Gottlieb Biedermaier, used by the country doctor Adolf Kussmaul and lawyer Ludwig Eichrodt in poems that the duo had published in the Munich satirical weekly Fliegende Blätter in 1850.[4]

The verses parodied the people of the era, namely Samuel Friedrich Sauter, a primary teacher and sort of amateurish poet, as depoliticized and petit-bourgeois.


Time’s up! That went pretty well. In terms of sources I didn’t even leave Wikipedia because there was plenty there, so maybe not the most exciting from that perspective. Got a bit distracted down rabbit holes at the end, but that normally happens.

I definitely know a bit more than just ‘boo Prussian education system’ now, and the historical background was interesting. It meshed pretty well with The Roots of Romanticism in terms of time and place, so I had a bit more context than I was expecting.

I’d still like to know why it’s such a meme online… does it trace through Foucault or something else? If you have any leads, let me know!

Funny Turns

After a discussion about obscure Google Scholar hits on twitter last night, I just remembered this long list I made a few years ago. If you dig around postmodern/continental stuff long enough you discover there are a lot of Turns. Linguistic turns, rhetorical turns, hermeneutic turns… I never really did figure out what it was all about (potential speedrun question?)

But what are the weirder ones? I previously did this with the ‘X and its Discontents’ snowclone, which was funnier because people use it for very specific things like Newport or the Lawn Chemical Economy. This time it was mostly long boring abstract adjectives, which is maybe why I never published it. Still, here they are…

Linguistic
Postmodern
Hermeneutic
Interpretive
Mobility
Affective
Boy
Pragmatic
Practice
Cultural
Cognitive
Communicative
Corporeal
Complexity
Constructive
Constructivist
Spatial
Social
Sociological
Sociopolitical
Argumentative
Multilingual
Relational
Semantic
Semiotic
Structural
Systemic
Governance
Ontological
Reflexive
Rhetorical
Computational
Digital
Empirical
Ideational
Educational
Postsecular
Spiritual
Ideological
Action
Local
Narrative
Translational
Demotic
Archival
Performative
Deliberative
Iconic
Postcolonial
Decolonial
Territorial
Infrastructure
Intersectional
Neuroscientific
Transnational
Descriptive
Practical
Material
Participatory
Deconstructive
Leaderist
Cosmopolitan
Biographical
Spectral
Qualitative
Moral
Normative
Visual
Theoretical
Curatorial
Evolutionary
Ecological
Algorithmic
Neoliberal
Intercultural
Ethnographic
Consumerist
Geological
Animal

Speedrun: “Sensemaking”

This is a genre of post I’ve been experimenting with where I pick a topic, set a one hour timer and see what I can find out in that time. Previously: Marx on alienation and the Vygotsky Circle.

I’ve been seeing the term ‘sensemaking’ crop up more and more often. I even went to a workshop with the word in the title last year! I quite like it, and god knows we could all do with making more sense right now, but I’m pretty vague on the details. Are there any nuances of meaning that I’m missing by interpreting it in its everyday sense? I have a feeling that it has a kind of ecological tinge, group sensemaking more than individual sensemaking, but I could be off the mark.

Also, what’s the origin of the term? I get the impression that it’s associated with some part of the internet that’s not too distant from my own corner, but I’m not exactly sure which one. Time to find out…


OK start with wikipedia:

https://en.wikipedia.org/wiki/Sensemaking

> Sensemaking or sense-making is the process by which people give meaning to their collective experiences. It has been defined as "the ongoing retrospective development of plausible images that rationalize what people are doing" (Weick, Sutcliffe, & Obstfeld, 2005, p. 409). The concept was introduced to organizational studies by Karl E. Weick in the 1970s and has affected both theory and practice.

Who’s Weick?

> Karl Edward Weick (born October 31, 1936) is an American organizational theorist who introduced the concepts of "loose coupling", "mindfulness", and "sensemaking" into organizational studies.

And, um, what’s organizational studies?

> Organizational studies is "the examination of how individuals construct organizational structures, processes, and practices and how these, in turn, shape social relations and create institutions that ultimately influence people".[1]

OK, something sociology-related. It’s a stub so probably not a huge subfield?

Weick ‘key contributions’ subheadings: ‘enactment’, ‘loose coupling’, ‘sensemaking’, ‘mindfulness’, ‘organizational information theory’

> Although he tried several degree programs within the psychology department, the department finally built a degree program specifically for Weick and fellow student Genie Plog called "organizational psychology".[3]

Only quoting this bc Genie Plog is a great name.

So, enactment: ‘certain phenomena are created by being talked about’. Fine.

Loose coupling:

> Loose coupling in Weick’s sense is a term intended to capture the necessary degree of flex between an organization’s internal abstraction of reality, its theory of the world, on the one hand, and the concrete material actuality within which it finally acts, on the other.

Hm that could be interesting but might take me too far off topic.

Sensemaking:

> People try to make sense of organizations, and organizations themselves try to make sense of their environment. In this sense-making, Weick pays attention to questions of ambiguity and uncertainty, known as equivocality in organizational research that adopts information processing theory.

bit vague but the next bit is more concrete:

> His contributions to the theory of sensemaking include research papers such as his detailed analysis of the breakdown of sensemaking in the case of the Mann Gulch disaster,[8] in which he defines the notion of a ‘cosmology episode’ – a challenge to assumptions that causes participants to question their own capacity to act.

Mann Gulch was a big firefighting disaster:

> As the team approached the fire to begin fighting it, unexpected high winds caused the fire to suddenly expand, cutting off the men’s route and forcing them back uphill. During the next few minutes, a "blow-up" of the fire covered 3,000 acres (1,200 ha) in ten minutes, claiming the lives of 13 firefighters, including 12 of the smokejumpers. Only three of the smokejumpers survived. The fire would continue for five more days before being controlled.

> The United States Forest Service drew lessons from the tragedy of the Mann Gulch fire by designing new training techniques and safety measures that developed how the agency approached wildfire suppression. The agency also increased emphasis on fire research and the science of fire behavior.

This is interesting but I’m in danger of tab explosion here. Keep a tab open with the paper and move on. Can’t resist opening the cosmology episode page though:

> A cosmology episode is a sudden loss of meaning, followed eventually by a transformative pivot, which creates the conditions for revised meaning.

ooh nice. Weick again:

> "Representations of events normally hang together sensibly within the set of assumptions that give them life and constitute a ‘cosmos’ rather than its opposite, a ‘chaos.’ Sudden losses of meaning that can occur when an event is represented electronically in an incomplete, cryptic form are what I call a ‘cosmology episode.’ Representations in the electronic world can become chaotic for at least two reasons: The data in these representations are flawed, and the people who manage those flawed data have limited processing capacity. These two problems interact in a potentially deadly vicious circle."

This is the kind of page that looks like it was written by one enthusiast. But it is pretty interesting. Right, back to Weick.

‘Mindfulness’: this is at a collective, organisational level

> The effective adoption of collective mindfulness characteristics by an organization appears to cultivate safer cultures that exhibit improved system outcomes.

I’m not going to look up ‘organizational information theory’, I have a bit of a ‘systems thinking’ allergy and I don’t wanna.

Right, back to sensemaking article. Roots in social psychology. ‘Shifting the focus from organizations as entities to organizing as an activity.’

‘Seven properties of sensemaking’. Ugh I hate these sort of numbered lists but fine.

  1. Identity. ‘who people think they are in their context shapes what they enact and how they interpret events’

  2. Retrospection. ‘the point of retrospection in time affects what people notice (Dunford & Jones, 2000), thus attention and interruptions to that attention are highly relevant to the process’.

  3. Enaction. ‘As people speak, and build narrative accounts, it helps them understand what they think, organize their experiences and control and predict events’

  4. Social activity. ‘plausible stories are preserved, retained or shared’.

  5. Ongoing. ‘Individuals simultaneously shape and react to the environments they face… As Weick argued, "The basic idea of sensemaking is that reality is an ongoing accomplishment that emerges from efforts to create order and make retrospective sense of what occurs"’

  6. Extract cues from the context.

  7. Plausibility over accuracy.

The sort of gestalt I’m getting is that it focusses on social rather than individual thinking, and action-oriented contextual in-the-thick-of-it doing rather than abstract planning ahead. Some similar terminology to ethnomethodology I think? e.g. accountability.

Ah yeah: ‘Sensemaking scholars are less interested in the intricacies of planning than in the details of action’

> The sensemaking approach is often used to provide insight into factors that surface as organizations address either uncertain or ambiguous situations (Weick 1988, 1993; Weick et al., 2005). Beginning in the 1980s with an influential re-analysis of the Bhopal disaster, Weick’s name has come to be associated with the study of the situated sensemaking that influences the outcomes of disasters (Weick 1993).

‘Categories and related concepts’:

> The categories of sensemaking included: constituent-minded, cultural, ecological, environmental, future-oriented, intercultural, interpersonal, market, political, prosocial, prospective, and resourceful. The sensemaking-related concepts included: sensebreaking, sensedemanding, sense-exchanging, sensegiving, sensehiding, and sense specification.

Haha OK it’s this sort of ‘fluidity soup’ that I have an allergy to. Too many of these buzzwords together. ‘Systems thinking’ is just a warning sign.

‘Other applications’: military stuff. Makes sense, lots of uncertainty and ambiguity there. Patient safety (looks like another random paragraph added by an enthusiast).

There’s a big eclectic ‘see also’ list. None of those are jumping out as the obvious next follow. Back to google. What I really want to know is why people are using this word now in some internet subcultures. Might be quite youtube centred? In which case there is no hope of tracking it down in one speedrun.

Oh yeah let’s look at google images:

Looks like businessy death by powerpoint contexts, not so helpful.

31 minutes left. Shit this goes quick!!

Google is giving me lots of video links. One is Daniel Schmachtenberger, ‘The War on Sensemaking’. Maybe this is the subcultural version I’ve been seeing? His name is familiar. Ok google ‘daniel schmachtenberger sensemaking’. Rebel Wisdom. Yep I’ve vaguely heard of that.

OK here is a Medium post about that series, by Andrew Sweeny:

> There is a war going on in our current information ecosystem. It is a war of propaganda, emotional manipulation, blatant or unconscious lies. It is nothing new, but is reaching a new intensity as our technology evolves. The result is that it has become harder and harder to make sense of the world, with potentially fatal consequences. If we can’t make sense of the world, neither can we make good decisions or meet the many challenges we face as a species.

Yes this is the sort of context I was imagining:

> In War on Sensemaking, futurist and visionary Daniel Schmachtenberger outlines in forensic detail the dynamics at play in this new information ecology — one in which we are all subsumed. He explores how companies, government, and media take advantage of our distracted and vulnerable state, and how we as individuals can develop the discernment and sensemaking skills necessary to navigate this new reality. Schmachtenberger has an admirable ability to diagnose this issue, while offering epistemological and practical ways to help repair the dark labyrinth of a broken information ecology.

It’d be nice to trace the link from Weick to this.

Some stuff about zero sum games and bullshit. Mentions Vervaeke.

> Schmachtenberger also makes the point that in order to become a good sensemaker we need ‘stressors’ — demands that push our mind, body, and heart beyond comfort, and beyond the received wisdom we have inherited. It is not enough to passively consume information: we first need to engage actively with the information ecology we live in and start being aware of how we respond to it, where it is coming from, and why it is being used.

Getting the sense that ‘information ecology’ is a key phrase round here.

Oh yeah ‘Game B’! I’ve heard that phrase around. Some more names: ‘Jordan Hall, Jim Rutt, Bonnita Roy’.

‘Sovereignty’: ‘become responsibility for our own shit’… ‘A real social, ‘kitchen sink level’ of reality must be cultivated to avoid the dangers of too much abstraction, individualism, and idealism.’ Seems like a good idea.

‘Rule Omega’. This one is new to me:

> Rule Omega is simple, but often hard to put into practice. The idea is that every message contains some signal and some noise, and we can train ourselves to distinguish truth and nonsense — to separate the wheat from the chaff. If we disapprove of 95% of a distasteful political rant, for instance, we could train ourselves to hear the 5% that is true.

> Rule Omega means learning to recognise the signal within the noise. This requires a certain attunement and generosity towards the other, especially those who think differently than we do. And Rule Omega can only be applied to those who are willing to engage in a different game, and work with each other in good faith.

Also seems like a Good Thing. Then some stuff about listening to people outside your bubble. Probably a link here to ‘memetic tribes’ type people.

This is a well written article, glad I picked something good.

‘Information war’ and shadow stuff:

> Certainly there are bad actors and conspiracies to harm us, but there is also the ‘shadow within’. The shadow is the unacknowledged part we play in the destruction of the commons and in the never-ending vicious cycle of narrative war. We need to pay attention to the subtle lies we tell ourselves, as much as the ‘big’ lies that society tells us all the time. The trouble is: we can’t help being involved in destructive game theory logic, to a greater or lesser degree.

‘Anti-rivalrous systems’. Do stuff that increases value for others as well as yourself. Connection to ‘anti-rivalrous products’ in economics.

‘Information immune system’. Yeah this is nice! It sort of somehow reminds me of the old skeptics movement in its attempts to help people escape nonsense, but rooted in a warmer and more helpful set of background ideas, and with less tribal outgroup bashing. Everything here sounds good and if it helps people out of ideology prisons I’m all for it. Still kind of curious about intellectual underpinnings… like is there a straight line from Weick to this or did they just borrow a resonant phrase?

‘The dangers of concepts’. Some self-awareness that these ideas can be used to create more bullshit and misinformation themselves.

> As such it can be dangerous to outsource our sensemaking to concepts — instead we need to embody them in our words and actions. Wrestling with the snake of self-deception and illusion and trying to build a better world in this way is a tough game. But it is the only game worth playing.

Games seem to be a recurring motif. Maybe Finite and Infinite Games is another influence.

OK 13 minutes left, what to do? Maybe trace out the link? google ‘schmachtenberger weick’. Not finding much. I’m now on some site called Conversational Leadership which seems to be connected to this scene somehow. Ugh not sure what to do. Back to plain old google ‘sensemaking’ search.

Let’s try this article by Laura McNamara, an organizational anthropologist. Nice job title! Yeah her background looks really interesting:

> Principal Member of Technical Staff at Sandia National Laboratories. She has spent her career partnering with computer scientists, software engineers, physicists, human factors experts, I/O psychologists, and analysts of all sorts.

OK maybe she is trying to bridge the gap between old and new usages:

> Sensemaking is a term that gets thrown around a lot without much consideration about where the concept came from or what it really means. If sensemaking theory is democratizing, that’s a good thing.

6 minutes left so I won’t get through all of this. Pick some interesting bits.

> One of my favorite books about sensemaking is Karl Weick’s, Sensemaking in Organizations. I owe a debt of thanks to the nuclear engineer who suggested I read it. This was back in 2001, when I was at Los Alamos National Laboratory (LANL). I’d just finished my dissertation and was starting a postdoctoral position in the statistics group, and word got around that the laboratories had an anthropologist on staff. My nuclear engineer friend was working on a project examining how management changes were impacting team dynamics in one of LANL’s radiochemistry bench laboratories. He called me asking if I had time to work on the project with him, and he asked if I knew much about “sensemaking.” Apparently, his officemate had recently married a qualitative evaluation researcher, who suggested that both of these LANL engineers take the time to read Karl Weick’s book Sensemaking in Organizations.

> My nuclear engineer colleague thought it was the most brilliant thing he’d ever read and was shocked, SHOCKED, that I’d never heard of sensemaking or Karl Weick. I muttered something about anthropologists not always being literate in organizational theory, got off the phone, and immediately logged onto Amazon and ordered it.

Weick’s influences:

> … a breathtakingly broad array of ideas – Emily Dickinson, Anthony Giddens, Pablo Neruda, Edmund Leach…

‘Recipe for sensemaking:’

> Chapter Two of Sensemaking in Organizations contains what is perhaps Weick’s most cited sentence, the recipe for sensemaking: “How can I know what I think until I see what I say?”

And this from the intro paragraph, could be an interesting reference:

> in his gorgeous essay Social Things (which you should read if you haven’t already), Charles Lemert reminds us that social science articulates our native social intelligence through instruments of theory, concepts, methods, language, discourse, texts. Really good sociology and anthropology sharpen that intelligence. They’re powerful because they enhance our understanding of what it means to be human, and they really should belong to everyone.

Something about wiki platforms for knowledge sharing:

> For example, back in 2008, my colleague Nancy Dixon and I did a brief study—just a few weeks—examining how intelligence analysts were responding to the introduction of Intellipedia, a wiki platform intended to promote knowledge exchange and cross-domain collaboration across the United States Intelligence community.

DING! Time’s up.


That actually went really well! Favourite speedrun so far, felt like I found out a lot. Most of the references I ended up on were really well-written and clear this time, no wading through rubbish.

I’m still curious to trace the link between Weick and the recent subculture. Also I might read more of the disaster stuff, and read that last McNamara article more carefully. Lots to look into! If anyone has any other suggestions, please leave a comment 🙂

Worse than quantum physics, part 2

This is Part 2 of a two part explanation — Part 1 is here. It won’t make much sense on its own!

In this post I’m going to get into the details of the analogy I set up last time. So far I’ve described how the PR box is ‘worse than quantum physics’ in a specific sense: it violates the CHSH inequality more strongly than any quantum system, pushing past the Tsirelson bound of 2\sqrt{2} to reach the maximum possible value of 4. I also introduced Piponi’s box example, another even simpler ‘worse than quantum physics’ toy system.

This time I’ll explain the connection between Piponi’s box and qubit phase space, and then show that a similar CHSH-inequality-like ‘logical Bell inequality’ holds there too. In this case the quantum system has a Tsirelson-like bound of \sqrt{3}, interestingly intermediate between the classical limit of 1 and the maximum possible value of 3 obtained by Piponi’s box. Finally I’ll dump a load of remaining questions into a Discussion section in the hope that someone can help me out here.

A logical Bell inequality for the Piponi box

Here’s the table from the last post again:


Measurement T F
a 1 0
b 1 0
a \oplus b 1 0

As with the PR box, we can use the yellow highlighted cells in the table (the three entries in the T column) to get a version of Abramsky and Hardy’s logical Bell inequality \sum p_i \leq N-1, this time with N = 3 cells. These cells correspond to the three mutually incompatible propositions a, b, a\oplus b, with combined probability \sum p_i = 3, which violates the inequality by the maximum amount.

Converting to expected values E_i = 2p_i -1 gives

\sum E_i = 3 > N-2 = 1.
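
To spell out where the N-2 comes from (a short sketch, assuming one expectation value E_i = 2p_i - 1 for each of the N propositions): the logical Bell inequality \sum p_i \leq N-1 turns into

\sum_{i=1}^{N} E_i = 2\sum_{i=1}^{N} p_i - N \leq 2(N-1) - N = N-2.

For the Piponi box every p_i = 1, so every E_i = 1 and the sum is 3, against a classical limit of 3 - 2 = 1.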

So that’s the Piponi box ↔ PR box part of the analogy sorted. Next I want to talk about the qubit phase space ↔ Bell state part. But first it will be useful to rewrite the table of Piponi box results in a way that makes the connection to qubit phase space more obvious:



The four boxes represent the four ‘probabilities’ P(a,b) introduced in the previous post, which can be negative. To recover the values in the table, add up rows, columns or diagonals of the diagram. For example, to find p(\lnot a), sum up the left hand column:

p(\lnot a) = P(\lnot a, b) + P(\lnot a, \lnot b) = \frac{1}{2} - \frac{1}{2} = 0.

Or to find p(a \oplus b), sum up the top-left-to-bottom-right diagonal:

p(a \oplus b) = P(a, \lnot b) + P(\lnot a, b) = \frac{1}{2} + \frac{1}{2} = 1.

I made the diagram below to show how this works in general, and now I’m not sure whether that was a good idea. It’s kind of busy and looking at the example above is probably a lot more helpful. On the other hand, I’ve gone through the effort of making it now and someone might find it useful, so here it is:


Qubit phase space

That’s the first part of the analogy done, between the PR box and Piponi’s box model. Now for the second part, between the CHSH system and qubit phase space. I want to show that the same set of measurements that I used for Piponi’s box also crops up in quantum mechanics as measurements on the phase space of a single qubit. This quantum case also violates the classical bound of \sum E_i = 1, but, as with the Tsirelson bound for an entangled qubit system, it doesn’t reach the maximum possible value. Instead, it tops out at \sum E_i = \sqrt{3}.

The measurements a, b, a\oplus b can be instantiated for a qubit in the following way. For a qubit |\psi\rangle, take

p(a)  = \langle \psi | Q_z | \psi \rangle ,

p(b) = \langle \psi | Q_x | \psi \rangle ,

with Q_i  = \frac{1}{2}(I-\sigma_i) for the Pauli matrices \sigma_i. The a\oplus b diagonal measurements then turn out to correspond to

p(a\oplus b) = \langle \psi | Q_y | \psi \rangle ,

completing the set of measurements.

This is the qubit phase space I described in my second post on negative probability – for more details on how this works and how the corresponding P(a,b)s are calculated, see for example the papers by Wootters on finite-state Wigner functions and Picturing Qubits in Phase Space.
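
If you want to check these three measurements numerically, here is a minimal sketch in plain Python. The function names (`expectation`, `q_probability`, `measurement_probabilities`) are my own labels for illustration, not anything from the papers, and the state is assumed to be normalised.

```python
# Pauli matrices as nested lists (rows) of complex numbers.
SIGMA_X = [[0, 1], [1, 0]]
SIGMA_Y = [[0, -1j], [1j, 0]]
SIGMA_Z = [[1, 0], [0, -1]]


def expectation(matrix, psi):
    """Return the real part of <psi| matrix |psi> for a two-component state psi."""
    total = 0j
    for i in range(2):
        for j in range(2):
            total += complex(psi[i]).conjugate() * matrix[i][j] * psi[j]
    return total.real


def q_probability(sigma, psi):
    """p = <psi| Q |psi> with Q = (I - sigma)/2, assuming psi is normalised."""
    return 0.5 * (1.0 - expectation(sigma, psi))


def measurement_probabilities(psi):
    """Probabilities of getting T for the a, b and a XOR b measurements."""
    return {
        "a": q_probability(SIGMA_Z, psi),
        "b": q_probability(SIGMA_X, psi),
        "a_xor_b": q_probability(SIGMA_Y, psi),
    }


# The state |0> is the column vector (1, 0).
ket0_probs = measurement_probabilities([1, 0])
print(ket0_probs)
```

Any other state can be passed in as a two-element list of amplitudes, as long as the amplitudes are normalised.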

As a simple example, in the case of the qubit state |0\rangle these measurements give

p(a) = 0

p(b) = \frac{1}{2}

p(a\oplus b) = \frac{1}{2},

leading to the following phase space:



A Tsirelson-like bound for qubit phase space

Now, we want to find the qubit state |\psi\rangle which gives the largest value of \sum p_i. To do this, I wrote out |\psi\rangle in the general Bloch sphere form |\psi\rangle = \cos(\theta / 2) |0\rangle + e^{i\phi} \sin(\theta / 2) |1\rangle and then maximised the value of the highlighted cells in the table:

\sum p_i = p(a) + p(b) + p(a\oplus b) = \frac{3}{2} - \frac{1}{2}(\cos\theta + \sin\theta\cos\phi + \sin\theta\sin\phi )

This is a straightforward calculation but the details are kind of fiddly, so I’ve relegated them to a separate page (like the boring technical appendix at the back of a paper, but blog post style). Anyway the upshot is that this quantity is maximised when \phi = \frac{5\pi}{4} , \sin\theta = \frac{\sqrt{2}}{\sqrt{3}} and \cos\theta = -\frac{1}{\sqrt{3}}, leading to the following table:


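Measurement T F
a \frac{1}{2}\left(1 + \frac{1}{\sqrt{3}} \right) \frac{1}{2}\left(1 - \frac{1}{\sqrt{3}} \right)
b \frac{1}{2}\left(1 + \frac{1}{\sqrt{3}} \right) \frac{1}{2}\left(1 - \frac{1}{\sqrt{3}} \right)
a \oplus b \frac{1}{2}\left(1 + \frac{1}{\sqrt{3}} \right) \frac{1}{2}\left(1 - \frac{1}{\sqrt{3}} \right)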

The corresponding qubit phase space, if you’re interested, is the following:


Notice the negative ‘probability’ in the bottom left, with a value of around -0.183. This is in fact the most negative value possible for qubit phase space.

This time, adding up the numbers in the yellow-highlighted cells of the table gives

\sum p_i = \frac{3}{2}\left(1 + \frac{1}{\sqrt{3}} \right),

or, in terms of expectation values,

\sum E_i = \sum (2p_i - 1) =   \sqrt{3}.

So \sqrt{3} is our Tsirelson-like bound for this system, in between the classical limit of 1 and the Piponi box value of 3.
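
As a sanity check on the \sqrt{3}, here is a rough numerical sketch that evaluates the formula for \sum p_i at the quoted maximiser and also runs a crude grid search over the Bloch sphere. The grid search is my own addition and is only a check, not a proof. One quick way to see why nothing larger turns up: the bracket \cos\theta + \sin\theta\cos\phi + \sin\theta\sin\phi is the dot product of the unit Bloch vector with (1,1,1), so it can’t go below -\sqrt{3}.

```python
import math


def sum_p(theta, phi):
    """Sum of the three T probabilities for the Bloch-sphere state (theta, phi)."""
    return 1.5 - 0.5 * (
        math.cos(theta)
        + math.sin(theta) * math.cos(phi)
        + math.sin(theta) * math.sin(phi)
    )


def sum_e(theta, phi):
    """The same quantity as expectation values: sum of E_i = 2 p_i - 1 over 3 cells."""
    return 2.0 * sum_p(theta, phi) - 3.0


# The maximiser quoted in the post: cos(theta) = -1/sqrt(3), phi = 5 pi / 4.
THETA_MAX = math.acos(-1.0 / math.sqrt(3.0))
PHI_MAX = 5.0 * math.pi / 4.0


def grid_max(steps=200):
    """Crude grid search over theta in [0, pi] and phi in [0, 2 pi)."""
    best = -math.inf
    for i in range(steps + 1):
        theta = math.pi * i / steps
        for j in range(steps):
            phi = 2.0 * math.pi * j / steps
            best = max(best, sum_e(theta, phi))
    return best


print(sum_e(THETA_MAX, PHI_MAX), math.sqrt(3.0), grid_max())
```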


Further questions

As with all of my physics blog posts, I end up with more questions than I started with. Here are a few of them:

Is this analogy already described in some paper somewhere? If so, please point me at it!

Numerology. Why \sqrt{3} and not some other number? As a first step, I can do a bit of numerology and notice that \sqrt{3} = \sqrt{N/2}, where N=6 is the number of cells in the table, and that this rule also fits the CHSH bound of 2\sqrt{2}, where there are N=16 cells.

I can also try this formula on the Mermin example from my Bell post. In that case N=36, so the upper bound implied by the rule would be 3\sqrt{2} … which turns out to be correct. (I didn’t find the upper bound in the post, but you can get it by putting \tfrac{1}{8}(2+\sqrt 2) in all the highlighted cells of the table, similarly to CHSH.)

The Mermin example is close enough to CHSH that it’s not really an independent data point for my rule, but it’s reassuring that it still fits, at least.
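
The arithmetic behind this bit of numerology is easy to check. This sketch just compares \sqrt{N/2} against the three bounds quoted above; `conjectured_bound` is my own name for the guessed rule, and nothing here tests whether the rule generalises.

```python
import math


def conjectured_bound(n_cells):
    """The guessed rule: the quantum bound on sum E_i equals sqrt(N / 2)."""
    return math.sqrt(n_cells / 2)


# name -> (number of table cells N, quantum bound quoted in the post)
CASES = {
    "qubit phase space": (6, math.sqrt(3)),
    "CHSH": (16, 2 * math.sqrt(2)),
    "Mermin": (36, 3 * math.sqrt(2)),
}

for name, (n_cells, quoted) in CASES.items():
    print(name, n_cells, conjectured_bound(n_cells), quoted)
```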

What does this mean? Does it generalise? I don’t know. There’s a big literature on different families of Bell results and their upper bounds, and I don’t know my way around it.

Information causality. OK, playing around with numbers is fine, but what does it mean conceptually? Again, I don’t really know my way around the literature. I know there’s a bunch of papers, starting from this one by Pawlowski et al, that introduces a physical principle called ‘information causality’. According to that paper, this states that, for a sender Alice and a receiver Bob,

> the information gain that Bob can reach about the previously unknown to him data set of Alice, by using all his local resources and m classical bits communicated by Alice, is at most m bits.

This principle somehow leads to the Tsirelson bound… as you can see I have not looked into the details yet. This is probably what I should do next. It’s very much phrased in terms of having two separated systems, so I don’t know whether it can be applied usefully in my case of a single qubit.

If you have any insight into any of these questions, or you notice any errors in the post, please let me know in the comments below, or by email.

Worse than quantum physics

I’m still down the rabbithole of thinking way too much about quantum foundations and negative probabilities, and this time I came across an interesting analogy, which I will attempt to explain in this post and the next one. This should follow on nicely from my last post, where I talked about one of the most famous weird features of quantum physics, the violation of the Bell inequalities.

It’s not necessary to read all of that post to understand this one, but you will need to be somewhat familiar with the Bell inequalities (and the CHSH inequality in particular) from somewhere else. For the more technical parts, you’ll also need to know a little bit about Abramsky and Hardy’s logical Bell formulation, which I also covered in the last post. But the core idea probably makes some kind of sense without that background.

So, in that last post I talked about the CHSH inequality and how quantum physics violates the classical upper limit of 2. The example I went through in the post is designed to make the numbers easy, and reaches a value of 2.5, but it’s possible to pick a set of measurements that pushes it further again, to a maximum of 2\sqrt{2} (which is about 2.828). This value is known as the Tsirelson bound.

This maximum value is higher than anything allowed by classical physics, but doesn’t reach the absolute maximum that’s mathematically attainable. The CHSH inequality is normally written something like this:

| E(a,b) + E(\bar{a}, b) + E(a, \bar{b}) - E(\bar{a}, \bar{b}) | \leq 2.

Each of the Es has to be between -1 and +1, so if it was possible to always measure +1 for the first three and -1 for the last one you’d get 4.
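
Both numbers can be checked by brute force. The sketch below assumes the simplest classical model, where each dial setting has a predetermined result of +1 or -1. Probabilistic mixtures of these are averages, so they can’t do any better. The function names are my own.

```python
from itertools import product


def chsh(e_ab, e_abar_b, e_a_bbar, e_abar_bbar):
    """CHSH combination, with the minus sign on the last correlator."""
    return e_ab + e_abar_b + e_a_bbar - e_abar_bbar


def classical_max():
    """Largest |CHSH| when every setting has a fixed outcome of +1 or -1."""
    best = 0
    for a, a_bar, b, b_bar in product((1, -1), repeat=4):
        value = chsh(a * b, a_bar * b, a * b_bar, a_bar * b_bar)
        best = max(best, abs(value))
    return best


# Three correlators at +1 and the last one at -1: the algebraic maximum.
ALGEBRAIC_MAX = chsh(1, 1, 1, -1)

print(classical_max(), ALGEBRAIC_MAX)
```

Here `a_bar` and `b_bar` are just the second dial settings on each side, following the notation of the inequality.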

This kind of hypothetical ‘superquantum correlation’ is interesting because of the potential to illuminate what’s special about the Tsirelson bound – why does quantum mechanics break the classical limit, but not go all the way? So systems that are ‘worse than quantum physics’ and push all the way to 4 are studied as toy models that can hopefully illuminate something about the constraints on quantum mechanics. The standard example is known as the Popescu-Rohrlich (PR) box, introduced in this paper.

This sounds familiar…

I was reading up on the PR box a while back, and it reminded me of something else I looked into. In my blog posts on negative probability, I used a simple example due to Dan Piponi. This example has the same general structure as measurements on a qubit, but it’s also ‘worse than quantum mechanics’, in the sense that one of the probabilities is more negative than anything allowed in quantum mechanics. Qubits are somewhere in the middle, in between classical systems and the Piponi box.

I immediately noticed the similarity, but at first I thought it was probably something superficial and didn’t investigate further. But after learning about Abramsky and Hardy’s logical formulation of the Bell inequalities, which I covered in the last post, I realised that there was an exact analogy.

This is really interesting to me, because I had no idea that there was any sort of Tsirelson bound equivalent for a single particle system. I’ve already spent quite a bit of time in the last couple of years thinking about the phase space of a single qubit, because it seems to me that a lot of essential quantum weirdness is hidden in there already, before you even consider entanglement with a second qubit – you’ve already got the negative probabilities, after all. But I wasn’t expecting this other analogy to turn up.

I haven’t come across this result in the published literature. But I also haven’t done anything like a thorough search, and it’s quite difficult to because Piponi’s example is in a blog post, rather than a paper. So maybe it’s new, or maybe it’s too simple to write down and stuck in the ghost library, or maybe it’s all over the place and I just haven’t found it yet. I really don’t know, and it seemed like the easiest thing was to just write it up and then try and find out once I had something concrete to point at. I am convinced it hasn’t been written up at anything like a blog-post-style introductory level, so hopefully this can be useful however it turns out.

Post structure

I decided to split this argument into two shorter parts and post them separately, to make it more readable. This first part is just background on the Tsirelson bound and the PR box – there’s nothing new here, but it was useful for me to collect the background I need in one place. I also give a quick description of Piponi’s box model.

In the second post, I’ll move on to explaining the single qubit analogy. This is the interesting bit!

The Tsirelson bound: Mermin’s machine again

To illustrate how Tsirelson’s bound is attained, I’ll go back to Mermin’s machine from the last post. I’ll use the same basic setup as before, but move the settings on the detectors:


This time the two settings on each detector are at right angles to each other, and the right hand detector settings are rotated 45 degrees from the left hand detector. As before, quantum mechanics says that the probabilities of different combinations of lights flashing will obey

p(T,T) = p(F,F) = \frac{1}{2}\cos^2\left(\frac{\theta}{2}\right),

p(T,F) = p(F,T) = \frac{1}{2}\sin^2\left(\frac{\theta}{2}\right),

where \theta is the angle between the detector settings. The numbers are more hassly than Mermin’s example, which was picked for simplicity – here’s the table of probabilities:


Dial setting (T,T) (T,F) (F,T) (F,F)
ab \tfrac{1}{8}(2+\sqrt 2) \tfrac{1}{8}(2-\sqrt 2) \tfrac{1}{8}(2-\sqrt 2) \tfrac{1}{8}(2+\sqrt 2)
ab' \tfrac{1}{8}(2-\sqrt 2) \tfrac{1}{8}(2+\sqrt 2) \tfrac{1}{8}(2+\sqrt 2) \tfrac{1}{8}(2-\sqrt 2)
a'b \tfrac{1}{8}(2+\sqrt 2) \tfrac{1}{8}(2-\sqrt 2) \tfrac{1}{8}(2-\sqrt 2) \tfrac{1}{8}(2+\sqrt 2)
a'b' \tfrac{1}{8}(2+\sqrt 2) \tfrac{1}{8}(2-\sqrt 2) \tfrac{1}{8}(2-\sqrt 2) \tfrac{1}{8}(2+\sqrt 2)

Then we follow the logical Bell procedure of the last post: take a set of mutually contradictory propositions (the highlighted cells, which are the eight entries equal to \tfrac{1}{8}(2+\sqrt 2)) and find their combined probability. This gives \sum p_i = 2+\sqrt 2, or, converting to expectation values E_i = 2p_i - 1,

\sum E_i = 2\sqrt 2 .

This is the Tsirelson bound.
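
Here is a small sketch that rebuilds the table from the \cos^2 and \sin^2 formulas and adds up the expectation values. The 45 degree angles come from the setup described above. The 135 degrees for ab' is my inference from the table (it is the row with the entries swapped), so treat that as an assumption.

```python
import math


def joint_probs(theta_deg):
    """Return (p(T,T), p(T,F), p(F,T), p(F,F)) for an angle theta in degrees."""
    half = math.radians(theta_deg) / 2
    same = 0.5 * math.cos(half) ** 2
    diff = 0.5 * math.sin(half) ** 2
    return (same, diff, diff, same)


# Angle between the two detector settings for each row of the table.
# The 135 degree entry is inferred from the table, not stated in the text.
ANGLES = {"ab": 45, "ab'": 135, "a'b": 45, "a'b'": 45}


def tsirelson_sum():
    """Sum of E_i = 2 p_i - 1 over the four propositions picked out in the table."""
    total = 0.0
    for setting, theta in ANGLES.items():
        tt, tf, ft, ff = joint_probs(theta)
        # Proposition: "lights agree" for the 45 degree rows, "lights differ" for ab'.
        p = tf + ft if setting == "ab'" else tt + ff
        total += 2 * p - 1
    return total


print(joint_probs(45)[0], (2 + math.sqrt(2)) / 8)
print(tsirelson_sum(), 2 * math.sqrt(2))
```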

The PR box

The idea of the PR box is to get the highest violation of the inequality possible, by shoving all of the probability into the highlighted cells, like this:

Dial setting (T,T) (T,F) (F,T) (F,F)
ab 1/2 0 0 1/2
a\bar{b} 0 1/2 1/2 0
\bar{a}b 1/2 0 0 1/2
\bar{a}\bar{b} 1/2 0 0 1/2

This time, adding up all the highlighted boxes gives the maximum \sum E_i = 4 .

Signalling

This is kind of an aside in the context of this post, but the original motivation for the PR box was to demonstrate that you could push past the quantum limit while still not allowing signalling between the two devices: if you only have access to the left hand box, for example, you can’t learn anything about the right hand box’s dial setting. Say you set the left hand box to dial setting a. If the right hand box was set to b you’d end up measuring T with a probability of

p(T,T| a,b) + p(T,F| a,b) = \frac{1}{2} + 0 = \frac{1}{2}.

If the right hand box was set to \bar{b} instead you’d still get \frac{1}{2}:

p(T,T| a,\bar{b}) + p(T,F| a,\bar{b}) = 0 + \frac{1}{2} = \frac{1}{2}.

The same conspiracy holds if you set the left hand box to \bar{a}, so whatever you do you can’t find out anything about the right hand box.
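
The full no-signalling check is only a few sums, so here is a sketch that runs through all of them using the PR box table above. The dictionary layout and function names are my own choices, with `a_` and `b_` standing for the barred settings.

```python
# PR box table: (left dial, right dial) -> p(T,T), p(T,F), p(F,T), p(F,F).
PR_BOX = {
    ("a", "b"): (0.5, 0.0, 0.0, 0.5),
    ("a", "b_"): (0.0, 0.5, 0.5, 0.0),
    ("a_", "b"): (0.5, 0.0, 0.0, 0.5),
    ("a_", "b_"): (0.5, 0.0, 0.0, 0.5),
}


def left_prob_true(left, right):
    """Probability that the left box shows T, given both dial settings."""
    tt, tf, _ft, _ff = PR_BOX[(left, right)]
    return tt + tf


def right_prob_true(left, right):
    """Probability that the right box shows T, given both dial settings."""
    tt, _tf, ft, _ff = PR_BOX[(left, right)]
    return tt + ft


def is_no_signalling():
    """True if neither box's marginal depends on the other box's dial."""
    for left in ("a", "a_"):
        if left_prob_true(left, "b") != left_prob_true(left, "b_"):
            return False
    for right in ("b", "b_"):
        if right_prob_true("a", right) != right_prob_true("a_", right):
            return False
    return True


print(is_no_signalling())
```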

Negative probabilities

Another interesting feature of the PR box, which will be directly relevant here, is the connection to negative probabilities. Say you want to explain the results of the PR box in terms of underlying probabilities P(a,a',b,b') for all of the settings at once. This can’t be done in terms of normal probabilities, which is not surprising: this property of having consistent results independent of the measurement settings you choose is exactly what’s broken down for non-classical systems like the CHSH system and the PR box.

However you can reproduce the results if you allow some negative probabilities. In the case of the PR box, you end up with the following:


P(T,T,T,T) = \frac{1}{2}

P(T,T,T,F) = 0

P(T,T,F,T) = -\frac{1}{2}

P(T,T,F,F) = 0

P(T,F,T,T) = 0

P(T,F,T,F) = 0

P(T,F,F,T) = \frac{1}{2}

P(T,F,F,F) = 0

P(F,T,T,T) = -\frac{1}{2}

P(F,T,T,F) = \frac{1}{2}

P(F,T,F,T) = \frac{1}{2}

P(F,T,F,F) = 0

P(F,F,T,T) = 0

P(F,F,T,F) = 0

P(F,F,F,T) = 0

P(F,F,F,F) = 0

(I got these from Abramsky and Brandenburger’s An Operational Interpretation of Negative Probabilities and No-Signalling Models.) To get back the probabilities in the table above, sum up all relevant Ps for each dial setting. As an example, take the top left cell of the table above. To get the probability of (T,T) for dial setting (a,b), sum up all cases where a and b are both T:

P(T,T,T,T) + P(T,T,T,F) + P(T,F,T,T) + P(T,F,T,F) = \frac{1}{2}

In this way we recover the values of all the measurements in the table – it’s only the Ps that are negative, not anything we can actually measure. This feature, along with the way that the number -\tfrac{1}{2} crops up specifically, is what reminded me of Piponi’s blog post.
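
The same summing-up can be done for every cell at once. This is a minimal sketch using the list of Ps above, with the tuple ordered as (a, a', b, b'); the helper names are mine.

```python
from itertools import product

# Quasi-probabilities P(a, a', b, b') as listed above; all other entries are 0.
P = {outcome: 0.0 for outcome in product("TF", repeat=4)}
P[("T", "T", "T", "T")] = 0.5
P[("T", "T", "F", "T")] = -0.5
P[("T", "F", "F", "T")] = 0.5
P[("F", "T", "T", "T")] = -0.5
P[("F", "T", "T", "F")] = 0.5
P[("F", "T", "F", "T")] = 0.5

# Position of each setting inside the 4-tuple key.
INDEX = {"a": 0, "a'": 1, "b": 2, "b'": 3}


def marginal(left, right, left_result, right_result):
    """Sum P over all entries consistent with the two measured results."""
    i, j = INDEX[left], INDEX[right]
    return sum(
        value
        for outcome, value in P.items()
        if outcome[i] == left_result and outcome[j] == right_result
    )


def all_marginals():
    """Measurable probabilities for every dial pair and every result pair."""
    return {
        (left, right, x, y): marginal(left, right, x, y)
        for left in ("a", "a'")
        for right in ("b", "b'")
        for x in "TF"
        for y in "TF"
    }


for key, value in all_marginals().items():
    print(key, value)
```

Every measurable probability comes out as 0 or 1/2, even though two of the underlying Ps are negative. Note that with this ordering the anti-correlated row comes out as the a'b' pair, so the match to the table above is up to a relabelling of the settings.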

Piponi’s box model

The device in Piponi’s example is a single box containing two bits a and b, and you can make one of three measurements: the value of a, the value of b, or the value of a \oplus b. The result is either T or F, with probabilities that obey the following table:


Measurement T F
a 1 0
b 1 0
a \oplus b 1 0

These measurements are inconsistent and can’t be described with any normal probabilities P(a,b), but, as with the PR box, they can with negative probabilities:

P(T,T) = \frac{1}{2}

P(T,F) = \frac{1}{2}

P(F,T) = \frac{1}{2}

P(F,F) = -\frac{1}{2}

For example, the probability of measuring a\oplus b and getting F is

P(T,T) + P(F,F) = \frac{1}{2} - \frac{1}{2} = 0,

as in the table above.
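
To recover the whole table rather than one cell, you can do all six sums in a few lines. This is just a sketch with names of my own choosing; the four values of P(a,b) are the ones listed above.

```python
# Quasi-probabilities P(a, b) for Piponi's box, as listed above.
P = {
    ("T", "T"): 0.5,
    ("T", "F"): 0.5,
    ("F", "T"): 0.5,
    ("F", "F"): -0.5,
}


def prob_a(result):
    """Probability that measuring a gives `result` ('T' or 'F')."""
    return sum(v for (a, _b), v in P.items() if a == result)


def prob_b(result):
    """Probability that measuring b gives `result`."""
    return sum(v for (_a, b), v in P.items() if b == result)


def prob_xor(result):
    """Probability that measuring a XOR b gives `result`."""
    def xor(a, b):
        return "T" if a != b else "F"
    return sum(v for (a, b), v in P.items() if xor(a, b) == result)


# measurement -> (probability of T, probability of F)
TABLE = {
    "a": (prob_a("T"), prob_a("F")),
    "b": (prob_b("T"), prob_b("F")),
    "a xor b": (prob_xor("T"), prob_xor("F")),
}

print(TABLE)
```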

Notice that -\frac{1}{2} crops up again! The similarities to the PR box go deeper, though. The PR box is a kind of extreme version of the CHSH state of two entangled qubits – same basic mathematics but pushing the correlations up higher. Analogously, Piponi’s box is an extreme version of the phase space for a single qubit. In both cases, quantum mechanics is perched intriguingly in the middle between classical mechanics and these extreme systems. I’ll go through the details of the analogy in the next post.

Bell’s theorem and Mermin’s machine

> Anybody who’s not bothered by Bell’s theorem has to have rocks in his head.

— ‘A distinguished Princeton physicist’, as told to David Mermin

This post is a long, idiosyncratic discussion of the Bell inequalities in quantum physics. There are plenty of good introductions already, so this is a bit of a weird thing to spend my time writing. But I wanted something very specific, and couldn’t find an existing version that had all the right pieces. So of course I had to spend far too much time making one.

My favourite introduction is Mermin’s wonderful Quantum Mysteries for Anyone. This is an absolute classic of clear explanation, and lots of modern pop science discussions derive from it. It’s been optimised for giving a really intense gut punch of NOTHING IN THE WORLD MAKES SENSE ANY MORE, which I’d argue is the main thing you want to get out of learning about the Bell inequalities.

However, at some point if you get serious you’ll want to actually calculate things, which means you’ll need to make the jump from Mermin’s version to the kind of exposition you see in a textbook. The most common modern version of the Bell inequalities you’ll see is the CHSH inequality, which looks like this:

| E(a,b) + E(\bar{a}, b) + E(a, \bar{b}) - E(\bar{a}, \bar{b}) | \leq 2

(It doesn’t matter what all of that means, at the moment… I’ll get to that later.) The standard sort of derivations of this tend to involve a lot of fussing with algebraic rearrangements and integrals full of \lambdas and so forth. The final result is less of a gut punch and more of a diffuse feeling of unease: "well I guess this number has to be between -2 and 2, but it isn’t".

This feels like a problem to me. There’s a 1929 New Yorker cartoon which depicts ordinary people in the street walking around dumbstruck by Einstein’s theory of general relativity. This is a comic idea because the theory was famously abstruse (particularly back then when good secondary explanations were thin on the ground). But the Bell inequalities are accessible to anyone with a very basic knowledge of maths, and weirder than anything in relativity. I genuinely think that everyone should be walking down the street clutching their heads in shock at the Bell inequalities, and a good introduction should help deliver you to this state. (If you don’t have rocks in your head, of course. In that case nothing will help you.)

It’s also a bit of an opaque black box. For example, why is there a minus sign in front of one of the Es but not the others? I was in a discussion group a few years back with a bunch of postdocs and PhD students, all of us with a pretty strong interest in quantum foundations, and CHSH came up at some point. None of us had much of a gut sense for what that minus sign was doing… it was just something that turned up during some algebra.

I wanted to trace a path from Mermin’s explanation to the textbook one, in the hope of propagating some of that intuitive force forward. I wrote an early draft of the first part of this post for a newsletter in 2018 but couldn’t see how to make the rest of it work, so I dropped it. This time I had a lot more success using some ideas I learned in the meantime. I ended up taking a detour through a third type of explanation, the ‘logical Bell inequalities’ approach of Abramsky and Hardy. This is a general method that can be used on a number of other similar ‘no-go theorems’, not just Bell’s original. It gives a lot more insight into what’s actually going on (including that pesky minus sign). It’s also surprisingly straightforward: the main result is a few steps of propositional logic.

That bit of propositional logic is the most mathematically involved part of this post. The early part just requires some arithmetic and the willingness to follow what Mermin calls ‘a simple counting argument on the level of a newspaper braintwister’. No understanding of the mathematics of quantum theory is needed at all! That’s because I’m only talking about why the results of quantum theory are weird, and not how the calculations that produce those results are done.

If you also want to learn to do the calculations, starting from a basic knowledge of linear algebra and complex numbers, I really like Michael Nielsen and Andy Matuschak’s Quantum Country, which covers the basic principles of quantum mechanics and also the Bell inequalities. You’d need to do the ‘Quantum computing for the very curious’ part, which introduces a lot of background ideas, and then the ‘Quantum mechanics distilled’ part, which has the principles and the Bell stuff.

There’s also nothing about how the weirdness should be interpreted, because that is an enormous 90-year-old can of rotten worms and I would like to finish this post some time in my life 🙂

Mermin’s machine

So, on to Mermin’s explanation. I can’t really improve on it, and it would be a good idea to go and read that now instead, and come back to my version afterwards. I’ve repeated it here anyway though, partly for completeness and partly because I’ve changed some notation and other details to mesh better with the Abramsky and Hardy version I’ll come to later.

(Boring paragraph on exactly what I changed, skip if you don’t care: I’ve switched Mermin’s ‘red’ and ‘green’ to ‘true’ and ‘false’, and the dial settings from 1,2,3 on both sides to a, a', a'' on the left side and b, b', b'' on the right side. I’ve also made one slightly more substantive change. Mermin explains at the end of his paper that in his setup, ‘One detector flashes red or green according to whether the measured spin is along or opposite to the field; the other uses the opposite color convention’. I didn’t want to introduce the complication of having the two detectors with opposite wiring, and have made them both respond the same way, flashing T for along the field and F for opposite. But I also wanted to keep Mermin’s results. To do that I had to change the dial positions of the right hand dial, so that a is opposite b, a' is opposite b', and a'' is opposite b''. )

Anyway, Mermin introduces the following setup:



The machine in the middle is the source. It fires out some kind of particle – photons, electrons, frozen peas, whatever. We don’t really care how it works, we’ll just be looking at why the results are weird.

The two machines on the right and left side are detectors. Each detector has a dial with three settings. On the left they’re labelled a, a' and a''. On the right, they’re b, b' and b''.

On the top of each are two lights marked T and F for true and false. (Again, we don’t really care what’s true or false, we’re keeping everything at a kind of abstract, operational level and not going into the practical details. It’s just two possible results of a measurement.)

It’s vital to this experiment that the two detectors cannot communicate at all. If they can, there’s nothing weird about the results. So assume that a lot of work has gone into making absolutely sure that the detectors are definitely not sharing information in any way at all.

Now the experiment just consists of firing out pairs of particles, one to each detector, with the dials set to different values, and recording whether the lights flash T or F. So you get a big list of results of the form

ab'TF, a''bFT, a'b'FF, ...

The second important point, other than the detectors not being able to communicate, is that you have a free choice of setting the dials. You can set them both beforehand, or when the particles are both ‘in flight’, or even set the right hand dial after the left hand detector has already received its particle but before the right hand particle gets there. It doesn’t matter.

Now you do like a million billion runs of this experiment, enough to convince you that the results are not some weird statistical fluctuation, and analyse the results. You end up with the following table:


Dial setting (T,T) (T,F) (F,T) (F,F)
ab 1/2 0 0 1/2
ab' 1/8 3/8 3/8 1/8
ab'' 1/8 3/8 3/8 1/8
a'b 1/8 3/8 3/8 1/8
a'b' 1/2 0 0 1/2
a'b'' 1/8 3/8 3/8 1/8
a''b 1/8 3/8 3/8 1/8
a''b' 1/8 3/8 3/8 1/8
a''b'' 1/2 0 0 1/2

Each dial setting has a row, and the entries in that row give the probabilities for getting the different results. So for instance if you set the dials to a' and b, there’s a 1/8 chance of getting (T,T).

This doesn’t obviously look particularly weird at first sight. It only turns out to be weird when you start analysing the results. Mermin condenses two results from this table which are enough to show the weirdness. The first is:

Result 1: This result relates to the cases where the two dials are set to ab, a'b', or a''b''. In these cases both lights always flash the same colour. So you might get ab TT, ab FF, a'b' TT etc, but never ab TF or a''b'' FT.

This is pretty easy to explain. The detectors can’t communicate, so if they do the same thing it must be something to do with the properties of the particles they are receiving. We can explain it straightforwardly by postulating that each particle has an internal state with three properties, one for each dial position. Each of these takes two possible values which we label T or F. We can write these states as e.g.

TTF

TTF

where the entries on the top line refer to the left hand particle’s state when the dial is in the a, a' and a'' positions respectively, and the bottom line refers to the right hand particle’s state when the dial is in the b, b', b'' position.

Result 1 implies that the states of the two particles must always be the same. So the state above is an allowed one, but e.g.

TTF

TFF

isn’t.

Mermin says:

> This hypothesis is the obvious way to account for what happens in [Result 1]. I cannot prove that it is the only way, but I challenge the reader, given the lack of connections between the devices, to suggest any other.

Because the second particle will always have the same state to the first one, I’ll save some typing and just write the first one out as a shorthand. So the first example state will just become TTF.

Now on to the second result. This one covers the remaining options for dial settings, ab', a''b and the like.

Result 2: For the remaining settings, the lights flash the same colour 1/4 of the time, and different colours 3/4 of the time.

This looks quite innocuous on first sight. It’s only when you start to consider how it meshes with Result 1 that things get weird.

(This is the part of the explanation that requires some thinking ‘on the level of a newspaper braintwister’. It’s fairly painless and will be over soon.)

Our explanation for result 1 is that particles in each run of the experiment have an underlying state, and both particles have the same state. Let’s go through the implications of this, starting with the example state TTF.

I’ve enumerated the various options for the dials in the table below. For example, if the left dial is a and the right dial is b', we know that the left detector will light up T and the right will light up T, so the two lights are the same.


Dial setting Lights
ab' same
ab'' different
a'b same
a'b'' different
a''b different
a''b' different

Overall there’s a 1/3 chance of being the same and a 2/3 chance of being different. You can convince yourself that this is also true for all the states with two Ts and an F or vice versa: TTF, TFF, TFT, FTT, FTF, FFT.

That leaves TTT and FFF as the other two options. In those cases the lights will flash the same colour no matter what the dial is set to.

So whatever the underlying state is, the chance of the two lights being the same is at least ⅓. But this is incompatible with Result 2, which says that the probability is ¼.
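If you'd rather not trust the braintwister, the enumeration is small enough to brute-force. Here's a minimal sketch, assuming (as Result 1 forces) that both particles carry the same three-letter state; the names `SETTINGS` and `same_fraction` are just mine for illustration.

```python
from fractions import Fraction
from itertools import product

# Dial positions are indexed 0, 1, 2 (a, a', a'' on the left; b, b', b'' on the right).
# Result 2 only concerns the six settings where the two dials differ.
SETTINGS = [(i, j) for i in range(3) for j in range(3) if i != j]

def same_fraction(state):
    """Fraction of mismatched dial settings on which both lights agree."""
    same = sum(state[i] == state[j] for i, j in SETTINGS)
    return Fraction(same, len(SETTINGS))

# One entry per hidden state, e.g. "TTF" -> 1/3.
results = {"".join(s): same_fraction(s) for s in product("TF", repeat=3)}
for state, frac in results.items():
    print(state, frac)

lowest = min(results.values())
print("lowest chance of matching lights:", lowest)
```

Every state comes out at either 1/3 or 1, so no mixture of hidden states can get the matching probability down to the 1/4 that Result 2 demands.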

(The thinky part is now done.)

So Results 1 and 2 together are completely bizarre. No assignment of states will work. But this is exactly what happens in quantum mechanics!

You probably can’t do it with frozen peas, though. The details don’t matter for this post, but here’s a very brief description if you want it: the particles should be two spin-half particles prepared in a specific ‘singlet’ state, the dials should connect to magnets that can be oriented in three states at 120 degree angles from each other, and the lights on the detectors measure spin along and opposite to the field. The magnets should be set up so that the state for setting a on the left hand side is oriented at 180 degrees from the state for setting b on the right hand side; similarly a' should be opposite b' and a'' opposite b''. I’ve drawn the dials on the machine to match this. Quantum mechanics then says that the probabilities of the different results are

p(T,T) = p(F,F) = \frac{1}{2}\cos^2{\frac{\theta}{2}}

p(T,F) = p(F,T) = \frac{1}{2}\sin^2{\frac{\theta}{2}}

where \theta is the angle between the magnet states on the left and right sides. This reproduces the numbers in the table above.
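As a quick sanity check on those formulas, here's a small sketch that plugs in the angles. I'm assuming that \theta is measured so that matching settings (a with b, a' with b', a'' with b'') give \theta = 0 and every mismatched pair gives \theta = 120 degrees, since that's the reading that reproduces the table; the function name `probabilities` is made up for this example.

```python
import math

def probabilities(theta_degrees):
    """Outcome probabilities from the two formulas above, for a given angle."""
    half = math.radians(theta_degrees) / 2
    p_same = 0.5 * math.cos(half) ** 2   # p(T,T) = p(F,F)
    p_diff = 0.5 * math.sin(half) ** 2   # p(T,F) = p(F,T)
    return {"TT": p_same, "TF": p_diff, "FT": p_diff, "FF": p_same}

matched = probabilities(0)       # assumed angle for rows ab, a'b', a''b''
mismatched = probabilities(120)  # assumed angle for all the other rows

print(matched)
print(mismatched)
```

Up to floating point error, the matched rows come out as 1/2, 0, 0, 1/2 and the mismatched rows as 1/8, 3/8, 3/8, 1/8.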

Once more with less thinking

Mermin’s argument is clear and compelling. The only problem with it is that you have to do some thinking. There are clever details that apply to this particular case, and if you want to do another case you’ll have to do more thinking. Not good. This is where Abramsky and Hardy’s logical Bell approach comes in. It requires more upfront setup (so actually more thinking in the short term – this section title is kind of a lie, sorry) but can then be applied systematically to all kinds of problems.

This first involves reframing the entries in the probability table in terms of propositional logic. For example, we can write the result (T,F) for (a',b) as a' \land \lnot b. Then the entries of the table correspond to the probabilities we assign to each statement: in this case, \text{prob}(a' \land \lnot b) = \frac{3}{8}.

Now, look at the following highlighted cells in three rows of the grid (the (T,T) and (F,F) cells of the ab, a'b' and a''b'' rows):


Dial setting (T,T) (T,F) (F,T) (F,F)
ab 1/2 0 0 1/2
ab' 1/8 3/8 3/8 1/8
ab'' 1/8 3/8 3/8 1/8
a'b 1/8 3/8 3/8 1/8
a'b' 1/2 0 0 1/2
a'b'' 1/8 3/8 3/8 1/8
a''b 1/8 3/8 3/8 1/8
a''b' 1/8 3/8 3/8 1/8
a''b'' 1/2 0 0 1/2

These correspond to the three propositions

\phi_1 = (a\land b) \lor (\lnot a \land\lnot b)

\phi_2 = (a'\land b') \lor (\lnot a' \land\lnot b')

\phi_3 = (a''\land b'') \lor (\lnot a'' \land\lnot b'') ,

which can be written more simply as

\phi_1 = a \leftrightarrow b

\phi_2 = a' \leftrightarrow b'

\phi_3 = a'' \leftrightarrow b''.

where the \leftrightarrow stands for logical equivalence. This also means that a can be substituted for b, and so on, which will be useful in a minute.

Next, look at the highlighted cells in these three rows (the (T,F) and (F,T) cells of the ab', ab'' and a'b'' rows):


Dial setting (T,T) (T,F) (F,T) (F,F)
ab 1/2 0 0 1/2
ab' 1/8 3/8 3/8 1/8
ab'' 1/8 3/8 3/8 1/8
a'b 1/8 3/8 3/8 1/8
a'b' 1/2 0 0 1/2
a'b'' 1/8 3/8 3/8 1/8
a''b 1/8 3/8 3/8 1/8
a''b' 1/8 3/8 3/8 1/8
a''b'' 1/2 0 0 1/2

These correspond to

\phi_4 = (a\land \lnot b') \lor (\lnot a \land b')

\phi_5 = (a\land \lnot b'') \lor (\lnot a \land b'')

\phi_6 = (a'\land \lnot b'') \lor (\lnot a' \land b'') ,

which can be simplified to

\phi_4 = a \oplus b'

\phi_5 = a \oplus b''

\phi_6 = a' \oplus b''.

where the \oplus stands for exclusive or.

Now it can be shown quite quickly that these six propositions are mutually contradictory. First use the first three propositions to get rid of b, b' and b'', leaving

a \oplus a'

a \oplus a''

a' \oplus a''

You can check that these are contradictory by drawing out the truth table, or maybe just by looking at them, or maybe by considering the following stupid dialogue for a while (this post is long and I have to entertain myself somehow):


Grumpy cook 1: You must have either beans or chips but not both.

Me: OK, I’ll have chips.

Grumpy cook 2: Yeah, and also you must have either beans or peas but not both.

Me: Fine, looks like I’m having chips and peas.

Grumpy cook 3: Yeah, and also you must have either chips or peas but not both.

Me:

Me: OK let’s back up a bit. I’d better have beans instead of chips.

Grumpy cook 1: You must have either beans or chips but not both.

Me: I know. No chips. Just beans.

Grumpy cook 2: Yeah, and also you must have either beans or peas but not both.

Me: Well I’ve already got to have beans. But I can’t have them with chips or peas. Got anything else?

Grumpy cook 3: NO! And remember, you must have either chips or peas.

Me: hurls tray
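For anyone who'd prefer the truth table to the canteen, it's a few lines of brute force. This is just a sketch of that check; the variable names `a`, `a1`, `a2` stand in for a, a' and a''.

```python
from itertools import product

def xor(x, y):
    """Exclusive or: true when exactly one of x and y is true."""
    return x != y

# Try all eight truth assignments and keep the ones satisfying all three
# propositions: a xor a', a xor a'', a' xor a''.
satisfying = [
    (a, a1, a2)
    for a, a1, a2 in product([True, False], repeat=3)
    if xor(a, a1) and xor(a, a2) and xor(a1, a2)
]

print(satisfying)
```

The list comes back empty: three true/false values can't all differ from each other.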


So, yep, the six highlighted propositions are inconsistent. But this wouldn’t necessarily matter, as some of the propositions are only probabilistically true. So you could imagine that, if you carefully set some of them to false in the right ways in each run, you could avoid the contradiction. However, we saw with Mermin’s argument above that this doesn’t save the situation – the propositions have ‘too much probability in total’, in some sense, to allow you to do this. Abramsky and Hardy’s logical Bell inequalities will quantify this vague ‘too much probability in total’ idea.

Logical Bell inequalities

This bit involves a few lines of logical reasoning. We’ve got a set of propositions \phi_i (six of them in this example case, N in general), each with probability p_i. Let P be the probability of all of them happening together. Call this combined statement

\Phi = \bigwedge_i \phi_i.

Then

1 - P = \text{prob}\left( \lnot\Phi\right) = \text{prob}\left(\bigvee_i \lnot\phi_i\right)

where the second equality is de Morgan’s law. This is no greater than the sum of the probabilities of all the \lnot\phi_i s:

1 - P \leq \sum_i \text{prob}(\lnot\phi_i)

= \sum_i (1 - p_i)

= N - \sum_i p_i .

where N is the total number of propositions. Rearranging gives

\sum_i p_i \leq N + P - 1.

Now suppose the \phi_i are jointly contradictory, as in the Mermin example above, so that the combined probability P = 0. This gives the logical Bell inequality

\sum_i p_i \leq N-1 .

This is the precise version of the ‘too much probability’ idea. In the Mermin case, there are six propositions, three with probability 1 and three with probability ¾, which sum to 5.25. This is greater than N-1 = 5, so the inequality is violated.
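The arithmetic is simple enough to do in your head, but here's a tiny sketch that packages it up so it can be reused for other sets of propositions. The helper name `logical_bell_gap` is my own invention, not something from the paper.

```python
from fractions import Fraction

def logical_bell_gap(probabilities):
    """Return sum(p_i) - (N - 1). A positive value violates the inequality."""
    return sum(probabilities) - (len(probabilities) - 1)

# The six Mermin propositions: three with probability 1, three with probability 3/4.
mermin = [Fraction(1)] * 3 + [Fraction(3, 4)] * 3

total = sum(mermin)
gap = logical_bell_gap(mermin)
print(total, gap)
```

The gap comes out as 1/4, which is the 5.25 versus 5 mismatch from the paragraph above.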

This inequality can be applied to lots of different setups, not just Mermin’s. Abramsky and Hardy use the CHSH inequality mentioned in the introduction to this post as their first example. This is probably the most common example used to introduce Bell’s theorem, though the notation is usually somewhat different. I’ll go through Abramsky and Hardy’s version and then connect it back to the standard textbook notation.

The CHSH inequality

The CHSH experiment only uses two settings on each side, not three. I’ve drawn a ‘CHSH machine’ in the style of Mermin’s machine to illustrate it:



There are two settings a and \bar{a} on the left side, 60 degrees apart. And there are two settings b and \bar{b} on the right side, also 60 degrees apart, with b opposite a. This leads to the following table:


Dial setting (T,T) (T,F) (F,T) (F,F)
ab 1/2 0 0 1/2
a\bar{b} 3/8 1/8 1/8 3/8
\bar{a}b 3/8 1/8 1/8 3/8
\bar{a}\bar{b} 1/8 3/8 3/8 1/8

Now it’s just a case of following the same reasoning as for the Mermin case. The highlighted cells (the (T,T) and (F,F) cells in the first three rows, and the (T,F) and (F,T) cells in the last row) correspond to the propositions

\phi_1 = (a \land b) \lor (\lnot a \land \lnot b) = a \leftrightarrow b

\phi_2 = (a \land \bar{b}) \lor (\lnot a \land \lnot \bar{b}) = a \leftrightarrow \bar{b}

\phi_3 = (\bar{a} \land b) \lor (\lnot \bar{a} \land \lnot b) = \bar{a} \leftrightarrow b

\phi_4 = (\lnot \bar{a} \land \bar{b}) \lor (\bar{a} \land \lnot \bar{b}) = \bar{a} \oplus \bar{b}

As with Mermin’s example, these four propositions can be seen to be contradictory. Rather than trying to make up more stupid dialogues, I’ll just follow the method in the paper. First use \phi_3 to replace \bar{a} with b in \phi_4:

\phi_4 = b \oplus \bar{b} .

Then use \phi_1 to swap out b again, this time with a:

\phi_4 = a \oplus \bar{b} .

Finally use \phi_2 to swap out a with \bar{b}, leaving

\bar{b} \oplus \bar{b}

which is clearly contradictory.

(Sidenote: I guess these sort of arguments to show a contradiction do involve some thinking, which is what I was trying to avoid earlier. But in each case you could just draw out a truth table, which is a stupid method that a computer could do. So I think it’s reasonable to say that this is less thinking than Mermin’s method.)
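Here's roughly what that stupid method looks like for the four CHSH propositions. It's a sketch only: `a_bar` and `b_bar` stand for \bar{a} and \bar{b}, and the function name `propositions` is just for illustration.

```python
from itertools import product

def propositions(a, a_bar, b, b_bar):
    """Truth values of phi_1 to phi_4 for one assignment of the four variables."""
    return [
        a == b,          # phi_1: a <-> b
        a == b_bar,      # phi_2: a <-> b-bar
        a_bar == b,      # phi_3: a-bar <-> b
        a_bar != b_bar,  # phi_4: a-bar xor b-bar
    ]

# For each of the 16 rows of the truth table, count how many propositions hold.
counts = [sum(propositions(*row)) for row in product([True, False], repeat=4)]
most_satisfied = max(counts)

print(counts)
print("most propositions satisfied at once:", most_satisfied)
```

No row satisfies all four, and the best any row manages is three, which is the N-1 on the right hand side of the logical Bell inequality.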

Again, this violates the logical Bell inequality. In total, we have

\sum_i p_i = 1 + \frac{3}{4}  + \frac{3}{4}  + \frac{3}{4} = 3.25 > 3.

The textbook version of this inequality is a bit different. For a start, it uses an ‘expectation value’ for each proposition rather than a straightforward probability, where truth is associated with +1 and falsity with -1. So each proposition \phi_i has an expectation value E_i with

E_i = (+1)\cdot p_i + (-1)\cdot (1-p_i) = 2p_i -1.

Then summing over the E_is gives

\sum_i E_i = \sum_i (2p_i-1) = 2\sum_i p_i - N

and then, using the previous form of the logical Bell inequality,

\sum_i E_i \leq 2(N-1) - N = N-2.

A similar argument for -E_i shows that \sum_i E_i \geq -(N-2), so that this is a bound above and below:

|\sum_i E_i| \leq N - 2.

In this case N = 4 and so the inequality becomes |\sum_i E_i| \leq 2. However adding up the E_is associated to the propositions \phi_i gives 2.5, so the inequality is violated.
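The conversion from probabilities to expectation values is mechanical, so here's a short sketch of it for the four CHSH propositions, using exact fractions to avoid any rounding fuss.

```python
from fractions import Fraction

# Probabilities of phi_1 to phi_4, read off the CHSH table.
p = [Fraction(1), Fraction(3, 4), Fraction(3, 4), Fraction(3, 4)]

# E_i = (+1) * p_i + (-1) * (1 - p_i) = 2 * p_i - 1
E = [2 * p_i - 1 for p_i in p]

total = sum(E)
bound = len(p) - 2  # N - 2

print(E, total, bound)
```

The sum is 5/2 against a bound of 2, matching the 3.25 versus 3 violation in the probability form.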

There’s still a little further to go to get the textbook version, but we’re getting close. The textbook version writes the CHSH inequality as

| E(a,b) + E(\bar{a}, b) + E(a, \bar{b}) - E(\bar{a}, \bar{b}) | \leq 2.

where the expectation value is written in the form

E(a,b) = \int A(a,\lambda) B(b, \lambda)\rho(\lambda) d\lambda.

The \lambda are ‘hidden variables’ – properties of the particles that dispose them to act in various ways. For example, in the Mermin case, we imagined them to have hidden states, like

TFF

TFF

that controlled their response to each dial, and showed that any choice of these hidden states would lead to a contradiction.

For a given \lambda, A(a, \lambda) and B(b, \lambda) are the values measured by the left and right hand machines respectively. In our case these values are always either +1 (if the machine flashes T) or -1 (if the machine flashes F). The CHSH argument can also be adapted to a more realistic case where some experimental runs have no detection at all, and the outcome can also be 0, but this simple version won’t do that.

For the dial settings a and b, all we care about with these hidden variables is whether they make the machines respond true or false. So in our case \lambda is just a set of four variables, \lambda = \{ a\land b, a\land \lnot b, \lnot a\land b, \lnot a\land\lnot b \}, and the integral can just become a sum:

E(a,b) = (+1 \times +1)\cdot p(a\land b) + (+1 \times -1)\cdot p(a\land \lnot b) + (-1 \times +1)\cdot p(\lnot a\land b) + (-1 \times -1)\cdot p(\lnot a\land \lnot b)

= p(a\land b) + p(\lnot a\land \lnot b) - p(a\land \lnot b) - p(\lnot a\land b)

= p((a\land b) \lor (\lnot a\land \lnot b)) - p((a\land \lnot b) \lor(\lnot a\land b)).

Now that first proposition (a\land b) \lor (\lnot a\land \lnot b) is just \phi_1 from earlier, which had probability p_1. And the second one covers all the remaining possibilities, so it has probability 1-p_1. So

E(a,b) = p_1 - (1-p_1) = 2p_1 - 1 = E_1.

The argument goes through exactly the same way for E(a, \bar{b}) and E(\bar{a}, b). The last case, E(\bar{a}, \bar{b}), is slightly different. We get

E(\bar{a}, \bar{b}) = p((\bar{a}\land \bar{b}) \lor (\lnot \bar{a}\land \lnot \bar{b})) - p((\bar{a}\land \lnot \bar{b}) \lor(\lnot \bar{a}\land \bar{b}))

following the same logic as before. But this time \phi_4 matches the second proposition (\bar{a}\land \lnot \bar{b}) \lor(\lnot \bar{a}\land \bar{b}), not the first, so that

E(\bar{a}, \bar{b}) = (1-p_4) - p_4 = 1 - 2p_4 = -E_4.

This is where the minus sign in the CHSH inequality comes in! We have

|\sum_i E_i| = | E(a, b) + E(a, \bar{b}) + E(\bar{a}, b) - E(\bar{a}, \bar{b}) | \leq 2.
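To see the minus sign doing its job, here's a sketch that computes each E(x, y) straight from the rows of the CHSH table and then forms the textbook combination. The dictionary `TABLE` and the function `correlation` are names I've made up for this example.

```python
from fractions import Fraction as F

# Rows of the CHSH table: probabilities of (T,T), (T,F), (F,T), (F,F).
TABLE = {
    ("a", "b"):         (F(1, 2), F(0), F(0), F(1, 2)),
    ("a", "b_bar"):     (F(3, 8), F(1, 8), F(1, 8), F(3, 8)),
    ("a_bar", "b"):     (F(3, 8), F(1, 8), F(1, 8), F(3, 8)),
    ("a_bar", "b_bar"): (F(1, 8), F(3, 8), F(3, 8), F(1, 8)),
}

def correlation(setting):
    """E(x, y): +1 when the lights agree, -1 when they differ, weighted by probability."""
    tt, tf, ft, ff = TABLE[setting]
    return tt + ff - tf - ft

chsh = (
    correlation(("a", "b"))
    + correlation(("a", "b_bar"))
    + correlation(("a_bar", "b"))
    - correlation(("a_bar", "b_bar"))
)
print(chsh)
```

E(\bar{a}, \bar{b}) is -1/2 on its own, so subtracting it is what pushes the total up to 5/2, past the bound of 2.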

So we end up with the standard inequality, but with a bit more insight into where the pieces come from. Also, importantly, it’s easy to extend to other situations. For example, you could follow the same method with the six Mermin propositions from earlier to make a kind of ‘Mermin-CHSH inequality’:

|\sum_i E_i| = | E(a, b) + E(a', b') + E(a'', b'') - E(a, b') - E(a, b'') - E(a', b'') | \leq 4.
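Here's a sketch comparing the two sides for this ‘Mermin-CHSH’ combination. The hidden-variable side brute-forces every deterministic assignment of +1 or -1 to the three settings on each side (without even assuming the two particles share a state), and the quantum side uses the correlations implied by the Mermin table: 1 for matching settings, and 1/4 - 3/4 = -1/2 otherwise. All the names here are illustrative.

```python
from fractions import Fraction
from itertools import product

# Terms of the combination: (left dial index, right dial index, sign).
# Indices 0, 1, 2 stand for the unprimed, primed and double-primed settings.
TERMS = [(0, 0, +1), (1, 1, +1), (2, 2, +1), (0, 1, -1), (0, 2, -1), (1, 2, -1)]

def combination(correlation):
    """Evaluate the six-term sum for a function correlation(i, j) = E(a_i, b_j)."""
    return sum(sign * correlation(i, j) for i, j, sign in TERMS)

def quantum(i, j):
    """Correlations read off the Mermin table."""
    return Fraction(1) if i == j else Fraction(-1, 2)

quantum_value = combination(quantum)

# Largest absolute value over all 64 deterministic hidden-variable assignments.
classical_max = max(
    abs(combination(lambda i, j: left[i] * right[j]))
    for left in product([1, -1], repeat=3)
    for right in product([1, -1], repeat=3)
)

print(quantum_value, classical_max)
```

The deterministic assignments top out at 4, while the table gives 9/2, so the same kind of violation shows up here too.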

Or you could have three particles, or a different set of measurements, or you could investigate what happens with other tables of correlations that don’t appear in quantum physics… this is a very versatile setup. The original paper has many more examples.

Final thoughts

There are still some loose ends that it would be good to tie up. I’d like to understand exactly how the inequality-shuffling in a ‘textbook-style’ proof of the CHSH inequality connects to Abramsky and Hardy’s version. Presumably some of it is replicating the same argument, but in a more opaque form. But also some of it must need to deal with the fact that it’s a more general setting, and includes things like measurements returning 0 as well as +1 or -1. It would be nice to figure out which bits are which. I think Bell’s original paper didn’t have the zero thing either, so that could be one place to look.

On the other hand… that all sounds a bit like work, and I can’t be bothered for now. I’d rather apply some of this to something interesting. My next post is probably going to make some connections between the logical Bell inequalities and my previous two posts on negative probability.

If you know the answers to my questions above and can save me some work, please let me know in the comments! Also, I’d really like to know if I’ve got something wrong. There are a lot of equations in this post and I’m sure to have cocked up at least one of them. More worryingly, I might have messed up some more conceptual points. If I’ve done that I’m even more keen to know!

Speedrun: The Vygotsky Circle

I did a ‘speedrun’ post a couple of months ago where I set a one hour timer and tried to find out as much as I could about Marx’s theory of alienation. That turned out to be pretty fun, so I’m going to try it again with another topic where I have about an hour’s worth of curiosity.

I saw a wikipedia link to something called ‘the Vygotsky Circle’ a while back. I didn’t click the link (don’t want to spoil the fun!) but from the hoverover it looks like that includes Vygotsky, Luria and… some other Russian psychologists, I guess? I’d heard of those two, but I only have the faintest idea of what they did. Here’s the entirety of my current knowledge:

  • Vygotsky wrote a book called Thought and Language. Something about internalisation?
  • Luria’s the one who went around pestering peasants with questions about whether bears in the Arctic are white. And presumably a load of other stuff… he pops up in pop books with some frequency. E.g. I think he did a study of someone with an extraordinary memory?

That’s about it, so plenty of room to learn more. And also anything sounds about ten times more interesting if it’s a Circle. Suddenly it’s an intellectual movement, not a disparate bunch of nerds. So… let’s give this a go.


OK first go to that wiki article.

The Vygotsky Circle (also known as Vygotsky–Luria Circle[1][2]) was an influential informal network of psychologists, educationalists, medical specialists, physiologists, and neuroscientists, associated with Lev Vygotsky (1896–1934) and Alexander Luria (1902–1977), active in 1920-early 1940s in the Soviet Union (Moscow, Leningrad and Kharkiv).

So who’s in it?

The Circle included altogether around three dozen individuals at different periods, including Leonid Sakharov, Boris Varshava, Nikolai Bernstein, Solomon Gellerstein, Mark Lebedinsky, Leonid Zankov, Aleksei N. Leontiev, Alexander Zaporozhets, Daniil Elkonin, Lydia Bozhovich, Bluma Zeigarnik, Filipp Bassin, and many others. German-American psychologist Kurt Lewin and Russian film director and art theorist Sergei Eisenstein are also mentioned as the “peripheral members” of the Circle.

OK that’s a lot of people! Hm this is a very short article. Maybe the Russian one is longer? Nope. So this is the entirety of the history of the Circle given:

The Vygotsky Circle was formed around 1924 in Moscow after Vygotsky moved there from the provincial town of Gomel in Belarus. There at the Institute of Psychology he met graduate students Zankov, Solov’ev, Sakharov, and Varshava, as well as future collaborator Aleksander Luria.[5]:427–428 The group grew incrementally and operated in Moscow, Kharkiv, and Leningrad; all in the Soviet Union. From the beginning of World War II 1 Sept 1939 to the start of the Great Patriotic War, 22 June 1941, several centers of post-Vygotskian research were formed by Luria, Leontiev, Zankov, and Elkonin. The Circle ended, however, when the Soviet Union was invaded by Germany to start the Great Patriotic War.

However, by the end of 1930s a new center was formed around 1939 under the leadership of Luria and Leontiev. In the after-war period this developed into the so-called the “School of Vygotsky-Leontiev-Luria”. Recent studies show that this “school” never existed as such.

There are two problems that are related to the Vygotsky circle. First was the historical recording of the Soviet psychology with innumerable gaps in time and prejudice. Second was the almost exclusive focus on the person, Lev Vygotsky, himself to the extent that the scientific contributions of other notable characters have been considerably downplayed or forgotten.

This is all a bit more nebulous than I was hoping for. Lots of references and sources at least. May end up just covering Vygotsky and Luria.

OK Vygotsky wiki article. What did he do?

He is known for his concept of the zone of proximal development (ZPD): the distance between what a student (apprentice, new employee, etc.) can do on their own, and what they can accomplish with the support of someone more knowledgeable about the activity. Vygotsky saw the ZPD as a measure of skills that are in the process of maturing, as supplement to measures of development that only look at a learner’s independent ability.

Also influential are his works on the relationship between language and thought, the development of language, and a general theory of development through actions and relationships in a socio-cultural environment.

OK here’s the internalisation thing I vaguely remembered hearing about:

… the majority of his work involved the study of infant and child behavior, as well as the development of language acquisition (such as the importance of pointing and inner speech[5]) …

Influenced by Piaget, but differed on inner speech:

Piaget asserted that egocentric speech in children “dissolved away” as they matured, while Vygotsky maintained that egocentric speech became internalized, what we now call “inner speech”.

Not sure I’ve picked a good topic this time, pulls in way too many directions so this is going to be very shallow and skip around. And ofc there’s lots of confusing turbulent historical background, and all these pages refer to various controversies of interpretation 😦 Skip to Luria, can always come back:

Alexander Romanovich Luria (Russian: Алекса́ндр Рома́нович Лу́рия, IPA: [ˈlurʲɪjə]; 16 July 1902 – 14 August 1977) was a Russian neuropsychologist, often credited as a father of modern neuropsychological assessment. He developed an extensive and original battery of neuropsychological tests during his clinical work with brain-injured victims of World War II, which are still used in various forms. He made an in-depth analysis of the functioning of various brain regions and integrative processes of the brain in general. Luria’s magnum opus, Higher Cortical Functions in Man (1962), is a much-used psychological textbook which has been translated into many languages and which he supplemented with The Working Brain in 1973.

… became famous for his studies of low-educated populations in the south of the Soviet Union showing that they use different categorization than the educated world (determined by functionality of their tools).

OK so this was early on.

Some biographical stuff. Born in Kazan, studied there, then moved to Moscow where he met Vygotsky. And others:

During the 1920s Luria also met a large number of scholars, including Aleksei N. Leontiev, Mark Lebedinsky, Alexander Zaporozhets, Bluma Zeigarnik, many of whom would remain his lifelong colleagues.

Leontiev’s turned up a few times, open in another tab.

OK the phrase ‘cultural-historical psychology’ has come up. Open the wikipedia page:

Cultural-historical psychology is a branch of avant-garde and futuristic psychological theory and practice of the “science of Superman” associated with Lev Vygotsky and Alexander Luria and their Circle, who initiated it in the mid-1920s–1930s.[1] The phrase “cultural-historical psychology” never occurs in the writings of Vygotsky, and was subsequently ascribed to him by his critics and followers alike, yet it is under this title that this intellectual movement is now widely known.

This all sounds like a confusing mess where I’d need to learn way more background than I’m going to pick up in an hour. Back to Luria. Here’s the peasant-bothering stuff:

The 1930s were significant to Luria because his studies of indigenous people opened the field of multiculturalism to his general interests.[12] This interest would be revived in the later twentieth century by a variety of scholars and researchers who began studying and defending indigenous peoples throughout the world. Luria’s work continued in this field with expeditions to Central Asia. Under the supervision of Vygotsky, Luria investigated various psychological changes (including perception, problem solving, and memory) that take place as a result of cultural development of undereducated minorities. In this regard he has been credited with a major contribution to the study of orality.

That last bit has a footnote to Ong’s Orality and Literacy. Another place I’ve seen the name before.

In 1933, Luria married Lana P. Lipchina, a well-known specialist in microbiology with a doctorate in the biological sciences.

Then studied aphasia:

In his early neuropsychological work in the end of the 1930s as well as throughout his postwar academic life he focused on the study of aphasia, focusing on the relation between language, thought, and cortical functions, particularly on the development of compensatory functions for aphasia.

This must be another pop-science topic where I’ve come across him before. Hm where’s the memory bit? Oh I missed it:

Apart from his work with Vygotsky, Luria is widely known for two extraordinary psychological case studies: The Mind of a Mnemonist, about Solomon Shereshevsky, who had highly advanced memory; and The Man with a Shattered World, about a man with traumatic brain injury.

Ah this turns out to be late on in his career:

Among his late writings are also two extended case studies directed toward the popular press and a general readership, in which he presented some of the results of major advances in the field of clinical neuropsychology. These two books are among his most popular writings. According to Oliver Sacks, in these works “science became poetry”.[31]

In The Mind of a Mnemonist (1968), Luria studied Solomon Shereshevskii, a Russian journalist with a seemingly unlimited memory, sometimes referred to in contemporary literature as “flashbulb” memory, in part due to his fivefold synesthesia.

In The Man with the Shattered World (1971) he documented the recovery under his treatment of the soldier Lev Zasetsky, who had suffered a brain wound in World War II.

OK 27 minutes left. I’ll look up some of the other characters. Leontiev first. Apparently he was ‘a Soviet developmental psychologist, philosopher and the founder of activity theory.’ What’s activity theory?

Activity theory (AT; Russian: Теория деятельности)[1] is an umbrella term for a line of eclectic social sciences theories and research with its roots in the Soviet psychological activity theory pioneered by Sergei Rubinstein in 1930s. At a later time it was advocated for and popularized by Alexei Leont’ev. Some of the traces of the theory in its inception can also be found in a few works of Lev Vygotsky,[2]. These scholars sought to understand human activities as systemic and socially situated phenomena and to go beyond paradigms of reflexology (the teaching of Vladimir Bekhterev and his followers) and classical conditioning (the teaching of Ivan Pavlov and his school), psychoanalysis and behaviorism.

So maybe he founded it or maybe he just advocated for it. This is all a bit of a mess. But, ok, it’s an umbrella term for moving past behaviourism.

One of the strengths of AT is that it bridges the gap between the individual subject and the social reality—it studies both through the mediating activity. The unit of analysis in AT is the concept of object-oriented, collective and culturally mediated human activity, or activity system.

This all looks sort of interesting, but a bit vague, and will probably take me down some other rabbithole. Back to Leontiev.

After Vygotsky’s early death, Leont’ev became the leader of the research group nowadays known as the Kharkov School of Psychology and extended Vygotsky’s research framework in significantly new ways.

Oh shit completely missed the whole thing about Vygotsky’s early death. Back to him… died aged 37! Of tuberculosis. Mostly became famous after his death, and through the influence of his students. Ah this bit on his influence might be useful. Soviet influence first:

In the Soviet Union, the work of the group of Vygotsky’s students known as the Vygotsky Circle was responsible for Vygotsky’s scientific legacy.[42] The members of the group subsequently laid a foundation for Vygotskian psychology’s systematic development in such diverse fields as the psychology of memory (P. Zinchenko), perception, sensation, and movement (Zaporozhets, Asnin, A. N. Leont’ev), personality (Lidiya Bozhovich, Asnin, A. N. Leont’ev), will and volition (Zaporozhets, A. N. Leont’ev, P. Zinchenko, L. Bozhovich, Asnin), psychology of play (G. D. Lukov, Daniil El’konin) and psychology of learning (P. Zinchenko, L. Bozhovich, D. El’konin), as well as the theory of step-by-step formation of mental actions (Pyotr Gal’perin), general psychological activity theory (A. N. Leont’ev) and psychology of action (Zaporozhets).

That at least says something about what all of those names did. Open Zinchenko tab as first.

Then North American influence:

In 1962 a translation of his posthumous 1934 book, Thinking and Speech, published with the title, Thought and Language, did not seem to change the situation considerably.[citation needed] It was only after an eclectic compilation of partly rephrased and partly translated works of Vygotsky and his collaborators, published in 1978 under Vygotsky’s name as Mind in Society, that the Vygotsky boom started in the West: originally, in North America, and later, following the North American example, spread to other regions of the world.[citation needed] This version of Vygotskian science is typically associated with the names of its chief proponents Michael Cole, James Wertsch, their associates and followers, and is relatively well known under the names of “cultural-historical activity theory” (aka CHAT) or “activity theory”.[45][46][47] Scaffolding, a concept introduced by Wood, Bruner, and Ross in 1976, is somewhat related to the idea of ZPD, although Vygotsky never used the term.

Ah so Thought and Language was posthumous.

Then a big pile of controversy about how his work was interpreted. Now we’re getting headings like ‘Revisionist movement in Vygotsky Studies’, think I’ll bail out now. 16 minutes left.

OK let’s try Zinchenko page.

The main theme of Zinchenko’s research is involuntary memory, studied from the perspective of the activity approach in psychology. In a series of studies, Zinchenko demonstrated that recall of the material to be remembered strongly depends on the kind of activity directed on the material, the motivation to perform the activity, the level of interest in the material and the degree of involvement in the activity. Thus, he showed that following the task of sorting material in experimental settings, human subjects demonstrate a better involuntary recall rate than in the task of voluntary material memorization.

This influenced Leontiev and activity theory. That’s about all the detail there is. What to do next? Look up some of the other people I guess. Try a few, they’re all very short articles, give up with that.

Fine, I’ll just google ‘vygotsky thought and language’ and see what I get. MIT Press description:

Vygotsky’s closely reasoned, highly readable analysis of the nature of verbal thought as based on word meaning marks a significant step forward in the growing effort to understand cognitive processes. Speech is, he argues, social in origins. It is learned from others and, at first, used entirely for affective and social functions. Only with time does it come to have self-directive properties that eventually result in internalized verbal thought. To Vygotsky, “a word is a microcosm of human consciousness.”

OK, yeah that does sound interesting.

Not finding great sources. 8 minutes left. Zone of proximal development section of Vygotsky’s page:

“Zone of Proximal Development” (ZPD) is a term Vygotsky used to characterize an individual’s mental development. He originally defined the ZPD as “the distance between the actual developmental level as determined by independent problem solving and the level of potential development as determined through problem solving under adult guidance or in collaboration with more capable peers.” He used the example of two children in school who originally could solve problems at an eight-year-old developmental level (that is, typical for children who were age 8). After each child received assistance from an adult, one was able to perform at a nine-year-old level and one was able to perform at a twelve-year-old level. He said “This difference between twelve and eight, or between nine and eight, is what we call the zone of proximal development.” He further said that the ZPD “defines those functions that have not yet matured but are in the process of maturation, functions that will mature tomorrow but are currently in an embryonic state.” The zone is bracketed by the learner’s current ability and the ability they can achieve with the aid of an instructor of some capacity.

ZPD page itself:

Vygotsky spent a lot of time studying the impact of school instruction on children and noted that children grasp language concepts quite naturally, but that math and writing did not come as naturally. Essentially, he concluded that because these concepts were taught in school settings with unnecessary assessments, they were of more difficulty to learners. Piaget believed that there was a clear distinction between development and teaching. He said that development is a spontaneous process that is initiated and completed by the children, stemming from their own efforts. Piaget was a proponent of independent thinking and critical of the standard teacher-led instruction that was common practice in schools.

But also:

… He believed that children would not advance very far if they were left to discover everything on their own. It’s crucial for a child’s development that they are able to interact with more knowledgeable others. They would not be able to expand on what they know if this wasn’t possible.

OK 3 minutes left. Let’s wildly skip between tabs learning absolutely nothing. Hm maybe this would have been interesting? ‘Vygotsky circle as a personal network of scholars: restoring connections between people and ideas’.

Ding! Didn’t get much past reading the title.


Well that didn’t work as well as the alienation one. Sprawling topic, and I wasn’t very clear on what I wanted to get out of it. History of the Circle itself or just some random facts about what individual people in it did? I mostly ended up with the second one, and not much insight into what held it together conceptually, beyond some vague idea about ‘going beyond behaviourism’/‘looking at general background of human activity, not just immediate task’.

Still, I guess I know a bit more about these people than I did going in, and would be able to orient more quickly if I wanted to find out anything specific.

The Mane Six as Mitford Sisters


[Written as part of Notebook Blog Month.]

I’ve saved the most important topic for last. As far as I can tell, nobody on the internet has tackled the vital question of how My Little Pony characters map to Mitford sisters. So I’m going to fix that.

As a bit of background, the Mitfords were a wildly eccentric English aristocratic family. The novelist Nancy Mitford is probably the most famous of them, but her five sisters were an impressively bizarre mix of communists, fascists, socialites, farmers and Hitler obsessives. (There’s also a brother who nobody cares about.) I’m not particularly well up on Mitford lore, but I am a big fan of Nancy Mitford’s The Pursuit of Love, and I’m fascinated by eccentrics of all kinds, so I know the basics.

My Little Pony also has six main characters. (Plus Spike.) They turn out to match up surprisingly closely with Mitford sisters, right up to the point where they don’t, and then I just have to make it up.

So, first of all, Nancy Mitford is Twilight Sparkle. This one is completely obvious. In The Pursuit of Love, Nancy is the narrator of a fictionalised version of the Mitfords’ lives, with her as the quieter, more studious observer. Done.

Deborah Mitford was a famous socialite known for… being social and stuff. Lots of witty correspondence with other witty socialite types. She also ran a big stately home that was open to the public, which is kind of like organising parties if you squint hard enough. So Pinkie Pie.

Jessica Mitford was the most adventurous and rebellious of the sisters, running away to Spain and then later becoming an activist in the US, where she worked on civil rights campaigns and joined the Communist Party. Also investigated unscrupulous business practices in the funeral home industry for some reason. Clearly has to be Rainbow Dash.

Unity Valkyrie Mitford is the oddest of the lot. She became completely obsessed with Hitler, stalked him around Munich, eventually made her way into his inner circle, shot herself in the head when Britain declared war on Germany, survived the attempt and lasted out almost another ten years before dying of meningitis caused by swelling around the bullet, which was never removed. I feel like this is a job for Rarity, partly because of the rhyming name, partly because she’s the only one capable of pulling off this much drama.

OK, this is the point where the mapping gets a bit trickier. The two remaining sisters are Pamela and Diana. Pamela was the most retiring of the Mitfords, staying out of the public eye, at least in comparison to the others. She was practical-minded, loved animals and the countryside and managed a farm for a while. A pretty good fit for either Applejack or Fluttershy. She did still manage to do some weird Mitford stuff, marrying a bisexual millionaire physicist and then becoming the ‘companion’ of an Italian horsewoman after they divorced.

Diana is the dud Mitford. She was another fascist, and not even a spectacularly bizarre one like Unity. Mainly known for marrying Oswald Mosley, leader of the British Union of Fascists, editing a fascist magazine, and spending some time in prison during the war for being a fascist.

I don’t really want to lumber either of them with Diana, but I need to make a choice, so I’ve introduced an outside tie break. Fluttershy is best pony. So Fluttershy gets Pamela Mitford and poor old Applejack is stuck with Diana Mitford.

I’m sure everyone’s relieved that this major open question has been definitively answered at last.