August 2019: the shitpost-to-scholarship pipeline

Hi,

I got perturbed this month! I managed to find a new job, which I’ll be starting mid-September. It’s another programming job, but in a really interesting area, working on software for satellite ground control systems. It sounds like they’re keen for me to get up to speed on the domain as well as I can, so I guess I’ll be filling my head with information about satellites for the next few months, and also trying to improve my programming skills. It’s going to be a disorientating shift after several years of a boring job that gives me lots of leftover energy for weird projects outside of work, but I’m excited to be doing something vaguely physics-y again. Thanks for all the perturbation suggestions – I may get to some of the others later!

For the next few months I’m going to need to concentrate on getting my head round this job. Also I’m temporarily going to have an annoying bus commute, so I’ll have less free time anyway. I’m not sure how this will impact the newsletter yet, so apologies in advance if it drops off a bit for a while. Maybe I’ll write much less, or maybe I’ll find I still have plenty of energy to write, or maybe I’ll braindump about satellites for a bit. No idea. 

Anyway this month I still had time to think about lots of bullshit, some of which I managed to write up. I’m going to talk about entropy further down, but first I’m going to talk about…

The shitpost-to-scholarship pipeline

This is something I’m considering writing a blog post about, but I’m still figuring out what I’m trying to do with it. Also, it’s kind of an obvious idea, so maybe someone else has already written something good that I should know about. I’d be very interested in any thoughts you have 🙂

Anyway, the rough idea is… there’s an extraordinary explosion of creative idea generation going on online. And there’s this fascinating kind of pipeline where people will start feeling out the vague beginnings of an idea through twitter threads and dumb throwaway posts and blog comments and email conversations, and then if something looks promising they’ll discuss more and pull in bits of other people’s ideas, and gradually build up to more thought out, polished work.

I’m excited about this culture for a lot of reasons. It’s a kind of online version of the casual, unobservable ‘dark matter’ part of academia, the part you can’t access by looking at published work – all the throwing around wild claims over coffee in the common room and in the pub on Friday evenings, bits of ‘yeah, that paper is unreadable, but this is what they’re really talking about’ insight from people in the know, standing round the whiteboard trying to figure something out, group meeting gossip, and the like. And it’s an incredibly vivid and alive version, at a time when large parts of normal academia have become rigid and bureaucratised and plain boring. This seems important to me: it’s the shitposting engine that produces the raw generative power that can drive more focussed work further down the line.

There’s a kind of wild energy; people aren’t afraid to go after big topics. We’ve got ourselves free of the constraints of the academic pimple factory:

An example of Little History is an essay by Matt Might (clearly a Marvel superhero in a counterfactual universe) titled The Illustrated Guide to a Ph.D. Go read it. It’ll only take a minute. It frames the sum of all human knowledge as a big circular bubble, and your PhD as a little pimple on the surface of it. I’ll call this the Mighty Diagram. It gets passed around in graduate student circles with depressing frequency.

Instead of a dent in the universe, you get a pimple on an uncritically proceduralist conceptualization of the frontier of knowledge as the sum of all the peer-reviewed academic literature in the world.

What makes this essay utterly horrifying is that it is actually an accurate description of what a PhD is; it calibrates academic career expectations correctly and offers an accurate sense of perspective on the peer-reviewed life. I suspect Matt Might sincerely intended the essay as a helpful guide to academic survival, but its effect is to put aspiring scholars in their place, rather than help them find a sense of place in the universe. It’s a You Are Here map for your intellectual journey at the end of a PhD, you disgusting little pimple, you. Kneel before this awe-inspiring edifice of knowledge that you’re lucky to be allowed to add a pimple to.

This rings very true with my own experience of academia, and the mindset it got me into. I personally found that after a couple of years out of there my thinking kind of cleared and became more expansive, and I was able to have good ideas again.

Unfortunately, this pipeline only goes so far. It currently tends to dump ideas out somewhere around the ‘insight porn’ point – ideas that you read, think ‘oh that’s clever’, hit the like button, maybe comment on or talk about for a bit, then completely forget a week later. In the best case, a fragment of the idea or a bit of new jargon escapes into the local thought soup and can be combined with other ideas that are currently percolating. Sometimes this can be quite a powerful effect on its own. But there are still a lot of places academia can get to that just can’t be reached this way.

One of my favourite examples of this dynamic is Sarah Perry’s theory of mess. This is a genuinely great idea, and it’s not just a vague ‘insight’ – it’s an initial sketch of a satisfying explanatory theory of what mess is, complete with some very convincing examples and thought experiments (put a kaleidoscope filter on your mess, and it’s no longer mess!). But as far as I can tell, it got the same treatment as everything else that goes down the pipe – we all liked it and moved on. No real discussion (that I know of) of how to test it, or what has already been done in this line, or probing to see where it might fall down. Does it work? Who knows! On to the next idea!

Now, there’s an obvious explanation for why this happens. Most of us are not doing this as a full time job. We’re fitting this into the spare time we get, alongside paid work or other responsibilities. So we’re only really interested in doing the enjoyable parts of the idea generation process. Chucking around ideas is easy and fun, whereas checking whether they actually work is hard and boring. It’s not a big surprise that people prefer easy and fun work to hard and boring work.

There’s a lot of truth to this, but I think it’s slightly too cynical, in that it makes the first part of the pipeline sound too easy and the second part sound too hard. Chucking around ideas is easy, but to be able to do that we need to have some good ones to chuck around, and that’s not exactly trivial. We have some advantage in being able to go after very broad, vague, ambiguous, undeveloped topics, and slowly clear fog. There’s no pressure to quickly get to a point where we can publish something. And at the same time, polishing up ideas is hardly some unrelenting tedious grind. Calculating can be fun, testing can be fun, writing up can be fun. If your eventual aim is to publish in traditional academia then there are some definite unfun parts, like altering your conversational blog post style to fit a more academic register, but this is only a small part of the process.

For me at least, it just feels unsatisfying to leave ideas at the insight porn stage. There’s a natural pull in the direction of getting further down the pipeline, rather than a tedious sense of duty.  I’ve been playing around with some haphazard experiments of my own, and I think I’ve got past the insight porn stage too with some of them, but nowhere near as far as I’d like. I’ll go through a couple of examples.

A few years ago, I wrote a tumblr post called stupid bat and ball, title all lower case, 700 words of low-effort writing not far above the shitpost level. I wasn’t really expecting it to go anywhere further. But it did contain a small core of insight – the bat and ball question of the Cognitive Reflection Test is different to the other two questions in some respect, and so the questions don’t really form a natural set. When I got the wordpress blog I reposted it, and eventually it attracted some really good comments that probed the mechanics of the bat and ball question much more deeply than I had. So I realised that this idea probably was worth investigating and that I should up my game a bit, and I started reading some of the literature. I discovered that the bat and ball question came first and the others were picked ‘to be like it’, with no elaboration of the process for picking them, which confirmed my suspicion that not much work went into question validation. And I found a fascinating ‘extremely rough draft’ follow-up paper showing how ridiculously sticky the wrong answer is.
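(For anyone who hasn’t seen it, the question itself is: a bat and a ball cost $1.10 in total, and the bat costs $1.00 more than the ball – how much does the ball cost? The answer that jumps out is 10 cents, but writing the algebra out, if the ball costs b dollars then

$$ b + (b + 1.00) = 1.10 \;\Rightarrow\; 2b = 0.10 \;\Rightarrow\; b = 0.05, $$

so the ball costs 5 cents and the bat $1.05. The ‘$1.00 more’ is a difference between prices rather than a price itself, which is where the abstract/concrete point in the comments below comes in.)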

The comments to this post pushed things further again, coming up with more detailed explorations of how the difficulty relates to the way the problem maps numbers to an abstract quantity (the difference in price), but fools you into mapping it to a concrete one (the price of the bat). @_awbery pointed out that this abstract/concrete confusion is completely missing from the other two questions, where all the quantities map to concrete objects. And anders devised a set of ‘similar questions’ that turn up the level of abstractness one step at a time. These comments point towards something like rat-running experiments for the Cognitive Reflection Test, getting an understanding of how the tools we’re trying to use actually work before using them to make inferences about abstractions like ‘cognitive reflection’. I do think a potentially valuable contribution could be made here. 

But… I’m not really the person to do it. (Even if I cared more about this specific question than I do. I’d pretty much used up my remaining store of shits-to-give on writing the blog post, and didn’t even have enough left to engage with the comments as fully as I’d have liked to.) Doing psych research without fooling yourself sounds like an absolute minefield even if you know what you’re doing, and I have no expertise at all. 

So I guess in this case I quit the pipeline at the level of having a sort of slapdash lit review with some pointers to interesting ways to take it further. Not the most impressive result. But the interesting bit for me was the distance I travelled from the original tumblr post, which I’d put no effort into at all, and the way the project took on a life of its own, with other people helping to propel this considerably further than I’d ever thought to take it myself.

My other example is all my thinking about negative probability in the last year and a half. Although it sounds superficially like a kind of a crackpot topic, there are deep links to quantum mechanics on phase space, and I’ve been using my fascination with this as a serious starting point to learn all kinds of interesting things in quantum foundations/quantum information. I’ve been experimenting with the discipline of using a single paper as my focus, and this has been incredibly helpful for keeping me on track, and damping down my normal habit of wandering from subject to subject too quickly to pick up anything useful. 

I’m more serious about this project than the bat and ball one – it actually connects to an enduring deep interest rather than something I blundered into by accident. Again, I’m not yet as far down the pipeline as I want to be, but I’ve got past the vague insight level. My last couple of posts explored an intriguing decomposition of the Wigner function for a qubit that I found myself, and that I can see some potential use for in interpreting negative probabilities. Since then I’ve had quite a few more ideas that I want to investigate, and I’ve started to link things into a more coherent picture. There’s also a lot more I could be doing in terms of making contact with people in academia and asking questions (something I’m rather bad at). I can definitely see how to push further.

It’s still really funny to me that I’m cheerfully crashing about between cognitive psych and quantum foundations, with a few clueless forays into reading Derrida for good measure. Whereas in academia I’d have felt daring if I tried to pivot from burst to continuous sources of gravitational waves from neutron stars, or something. Obviously this is too scattered for me to get anything done, and I need to get better at idea triage. But there’s something really psychologically healthy about this mindset of just taking a direct run at whatever I feel like, instead of thinking ‘oh, that’s outside my field, I can’t think about that.’ I want to keep this even as I hopefully learn to focus my efforts more usefully.

I’m running out of time to write this now, so I guess I’ll end with some questions:

  • What are examples of people navigating the whole shitpost-to-scholarship pipeline successfully on the public internet? I’m particularly interested in people who are trying for academia-style focussed research on specific object-level questions, rather than big-picture synthesis or popularisation. It occurs to me that Gwern might be a good one, but I don’t know his stuff all that well. Where does he shitpost? Reddit?
  • Is there any kind of institutional support out there, or is it all just individual weird nerds pursuing individual weird research programs?
  • Has anyone written about this well already?

Entropy

Oh god I’ve gone down the entropy rabbithole. I could be stuck here for a while. I finally wrote up my negative probability posts (1, 2) and one of the big leftover confusions was the weird entropy measure that cropped up. This was something called the collision entropy, a member of the family of Rényi entropies that includes the Shannon entropy along with all kinds of more obscure ones. 
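(Standard definitions, for reference – these aren’t specific to the paper. The Rényi entropy of order α of a distribution p is

$$ H_\alpha(p) = \frac{1}{1-\alpha} \log_2 \sum_i p_i^\alpha. $$

The α → 1 limit recovers the Shannon entropy $-\sum_i p_i \log_2 p_i$, and α = 2 gives the collision entropy $H_2(p) = -\log_2 \sum_i p_i^2$ – so called because $\sum_i p_i^2$ is the probability that two independent samples from p ‘collide’ on the same outcome.)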

I still don’t understand why the collision entropy is the one we want in quantum physics. But I am starting to get my head around why the Shannon entropy won’t work, so I’ll try and explain that as best I can (which won’t be brilliantly, as I’m still digesting).

I’m not going to give any background on what the Shannon entropy is or why you’d care about it – I just want to get going with this stupid braindump straight away, so this isn’t going to be the most accessible write-up. There should be some nice explanations around if you’re interested – probably lots, as it crops up so much in machine learning. Here are the ones I know about as a starting point:

  • I like this LessWrong article where Yudkowsky gets to the point quickly for once by just giving a few examples.
  • There’s also this really excellent visual introduction to information theory by Chris Olah, though it covers a lot more and you’d have to extract the relevant parts. 
  • I’ll say more about this one later, but I looked up Shannon’s original Bell Labs paper and it’s very lucid and conversational, plus you get to see Markov chain text generation 1948 style: [image: Shannon’s Markov chain approximations to English]. Again, it covers a lot more than just the entropy definition, so might not be the best choice, but it’s interesting to look at.

Now the paper I’ve been using is Brukner and Zeilinger’s Conceptual Inadequacy of the Shannon Information in Quantum Measurements. I’ve been complaining that I don’t understand their other paper on this, but for some reason I’d forgotten about this one, and I’ve been having a bit more success with it.

This paper quotes straight from Shannon’s original one, where he gives three conditions that the entropy ought to follow. The first is that the entropy should be a continuous function of all the probabilities, which seems reasonable: you wouldn’t want a sudden jump in the entropy if one of them shifted a tiny bit. The second is that for the special case where all the probabilities are the same, p_i = 1/n for all i, the entropy should be a monotonically increasing function of n. This is also reasonable: as he puts it, ‘with equally likely events there is more choice, or uncertainty, when there are more possible events’.
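(As a sanity check on the second condition – this is a standard fact rather than anything from the paper – the form Shannon eventually derives gives

$$ H\!\left(\tfrac{1}{n}, \ldots, \tfrac{1}{n}\right) = \log_2 n $$

for the equally-likely case, which is indeed monotonically increasing in n.)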

The third is more interesting. Here it is reprinted in B+Z:

[image: Shannon’s third condition, as reprinted in B+Z]

So if you split a choice into two choices, you want the entropy to also split in a sensible way. (I really like the fact Shannon explains the rule by giving an example. I think most papers in more formal venues would write out the general case, which would look a lot more opaque, and maybe provide an example later if you’re lucky.)
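To spell the example out (this is Shannon’s own, and easy to check numerically): splitting the three-way choice with probabilities (1/2, 1/3, 1/6) into a first 50/50 choice, followed half the time by a second choice with probabilities (2/3, 1/3), should leave the entropy unchanged:

$$ H\!\left(\tfrac12, \tfrac13, \tfrac16\right) = H\!\left(\tfrac12, \tfrac12\right) + \tfrac12\, H\!\left(\tfrac23, \tfrac13\right). $$

The 1/2 weighting on the second term is there because the second choice only happens half the time; both sides come out to about 1.46 bits.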

Shannon shows that these three conditions together are enough to pick out a unique form for the entropy, and this is now the famous Shannon entropy. The first two conditions seem pretty unassailable for anything you would want to call an entropy, but the third one maybe relies on more assumptions. It makes sense in the case where there really are underlying properties of a system that are independent of how you decompose your choice into measurements, but this doesn’t hold in general in the quantum case.
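(For the record, the unique form the three conditions pin down is

$$ H = -K \sum_i p_i \log p_i, $$

where the constant K just sets the units – K = 1 with logs base 2 gives bits.)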

(Nitpicky aside: B+Z say that ‘in classical measurements it is always possible to assign to a system simultaneously attributes corresponding to all possible measurements, here a_i, b_j, and a_i ∧ b_j’. This is often true, but I don’t think it’s true of all things you’d count as classical measurements – for example, I’ve talked about time-frequency measurements before, which have the same sort of Fourier-pair trade-off as position and momentum in quantum mechanics. But I need to think about this more.)

B+Z then go on to give examples of why the Shannon entropy doesn’t work in QM. They do it in a slightly indirect way, where they state consequences that can be derived from Shannon’s form of the entropy and show these don’t hold any more. These consequences are:

  • H(B) ≥ H(B|A). The uncertainty of B is never increased by knowledge of A.
  • H(A) + H(B|A) = H(B) + H(A|B). Information is indifferent to the order of acquisition.
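Here H(B|A) is the usual conditional entropy,

$$ H(B|A) = \sum_a p(a)\, H(B \mid A = a), $$

i.e. the entropy of the B outcomes averaged over the possible outcomes of the A measurement.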

The first one can be shown to be untrue by e.g. a set of polarisers:

[image: sequence of polarisers illustrating how an intermediate measurement increases uncertainty]
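Here’s a minimal numerical version of the polariser example in Python – my own toy reconstruction, not code from the paper. Prepare a horizontally polarised photon, let B be the horizontal/vertical measurement, and let A be the intermediate 45° measurement:

```python
import numpy as np

def shannon(probs):
    """Shannon entropy in bits, ignoring zero-probability outcomes."""
    p = np.array([q for q in probs if q > 0])
    return float(-(p * np.log2(p)).sum() + 0.0)  # + 0.0 tidies up -0.0

# Horizontally polarised input state.
psi = np.array([1.0, 0.0])

# Measurement B: horizontal/vertical basis.
B = [np.array([1.0, 0.0]), np.array([0.0, 1.0])]

# Measurement A: the 45-degree basis.
A = [np.array([1.0, 1.0]) / np.sqrt(2), np.array([1.0, -1.0]) / np.sqrt(2)]

# H(B) with no intermediate measurement: the outcome is certain.
pB = [abs(b @ psi) ** 2 for b in B]
print("H(B)   =", shannon(pB))  # 0.0 bits

# H(B|A): measure A first (the state collapses onto an A basis vector),
# then average the entropy of the B outcomes over A's outcomes.
pA = [abs(a @ psi) ** 2 for a in A]
H_B_given_A = sum(p_a * shannon([abs(b @ a) ** 2 for b in B])
                  for p_a, a in zip(pA, A))
print("H(B|A) =", H_B_given_A)  # 1.0 bit, so H(B) < H(B|A)
```

Learning the 45° outcome makes the final horizontal/vertical outcome maximally uncertain, so knowledge of A has increased the uncertainty of B – exactly a violation of the first bullet point.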

The second one can be shown to be untrue by e.g. two different spin measurements (see Figure 5 of the paper for details).

I still want to understand this in a more direct way, by seeing exactly how Shannon’s third rule breaks in QM, but I’m getting there. Next up after that will be understanding B+Z’s argument for why they pick the collision entropy in particular. Something to do with ‘it’s invariant under unitary transformations, also it’s related to another quantity that comes up’. Maybe this is as good as I’m going to find in the literature, but I’ll need to keep digging for a while to be sure.

Next month

Absolutely no idea. I’ll see how the job works out. 

Cheers,

Lucy