Social Anxiety

It’s becoming increasingly clear that I need to get the hell off social media. That’s sort of a perverse thing to declare in a blog post, but it’s a long time since WT could be reasonably described as ‘social’, and indeed the very notion of social media has changed beyond recognition in the interim. When we talk about it now, what we really mean is Facebook and — overwhelmingly — Twitter.

Ah, Twitter. That simple little ‘microblogging platform’. Convenience conveyor of outrage morsels to the ever-ravenous masses. Who would have guessed, when it launched back in 2006, or even in 2009 when I finally capitulated and posted my first ambivalent tweet, just what it would become?

Perhaps everyone, at some level. Perhaps we all got exactly what we always wanted.

In any case, here we are. The perpetual emotion machine perpetually emotes. The unceasing, unstoppable global cacophony of Pavlovian bells chimes and chimes and we cringing Quasimodos grunt and drool. We tailor our perfect personal Morse code-sparking electrode and wire it into the emotional centres of our brain, delivering a stream of tiny shocks just incrementally powerful enough to maintain that sense of entrenched helpless rage, perfectly balanced on the critical edge of disgust and impotence.

Evidently, not everyone is as susceptible to this as I am. Although the general drift of politics and culture lately strongly suggests that enough people are for it to be a significant problem. Will Twitter actually be the thing that finally immanentises the eschaton? Who knows. It at least seems to be in with a chance.

I made my escape years ago, weary of the bottomless aggravation and just the sheer amount of time and emotional energy it takes to be constantly furious about things you have absolutely no influence over anyway. The release of resources was invigorating — I could read books instead of soundbites, write apps instead of peevish, unfunny one-liners.

But, of course, the respite was only temporary. As respites always are. The awful unbearable momentum of 2016, the enervating saturation of utter fucking woe, herded me back into the quagmire. That my return coincided with the depth-plumbing, barrel-scraping shitbaggery of the EU referendum was, well, no coincidence. Just one more reason to hate and despise and wish eternal unendurable agonies onto the fucking Brexiteers, the way they made it seem imperative to pay attention to their relentless malfeasance. The laughable, nauseating, tear-jerking dark heart of it. The horror. The horror.

So, back I went. And have spent the last six months in a state of diffuse despair and ambient anxiety. Unable to stop watching the dismal succession of slow-motion disasters, train wreck upon plane crash upon supertanker collision, one after another. Angry, frustrated, unable actually to concentrate or get things done because what is the fucking point anyway?

Just read another tweet, another live blog, another poisonous troll comment crowing about the dismantling of modernity. Another headline, another traducement, another barefaced opportunistic lie that frogmarches the famous tolerant open democracy we’re always so proud of towards fascism.

England Prevails.

https://twitter.com/damiengwalter/status/783598265509957632

It gets worse. It gets worse. It just keeps getting worse, and there seems absolutely no prospect of that trend abating. (Yeah, Hillary will probably win in the US, but it’s already too late to stop that country’s political discourse getting dragged into the sewer too. Lap it up, cucks!)

So, time to back the fuck off. Step away from the firehose. Ignorance is bliss. This unremitting angst is purposeless, ineffective. Forget it and do something else instead. Maybe there is nothing I can achieve with the rescued time that has any merit either — what does, objectively? — but I hope it’ll at least be more fun.

Averages Are Evil

The basis of science, and one of its main claims to epistemic validity, is observation and measurement. We do experiments and follow evidence. We enquire into the workings of the world by means of data.

Unfortunately, data is not always well-behaved. It is tricksy and wayward and noisy, subject to contamination and confounding and being not what it seems. The more complex the processes being studied, the more data is needed and the more sources of error there are. And most life science processes are very complex indeed.

Over the years statisticians have come up with a wide variety of techniques for wrestling useful information out of noisy data, ranging from the straightforward to the eye-wateringly complicated. But the best-known and most widely used, even by non-scientists, is much older and simpler still: the average, usually in the form of the arithmetic mean.

Averaging is easy: add up all your data points and divide by how many there were. Formal notation is frankly superfluous, but I’ve got a MathJax and I’m gonna use it, so for data points $x_1$, $x_2$, …, $x_n$:

$$\bar{x} = \frac{1}{n} \sum^n_{i=1} x_i$$

The intuition behind averaging is as appealingly straightforward as the calculation: we’re amortising the errors over the measurements. Sometimes we’ll have measured too high, sometimes too low, and we hope that it will roughly balance out.
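For the sceptical, here is a minimal sketch of that balancing-out in Python. Everything in it is assumed for illustration: the "true" value of 10, the Gaussian noise and the sample size are all made up, and the seed is fixed only so the run is repeatable.

```python
import random
import statistics

random.seed(0)  # fixed seed so the sketch is repeatable

TRUE_VALUE = 10.0  # hypothetical "true" quantity we are trying to measure

# Each measurement is the true value plus symmetric (Gaussian) noise.
measurements = [TRUE_VALUE + random.gauss(0, 1.0) for _ in range(1000)]

mean = statistics.fmean(measurements)
error_of_mean = abs(mean - TRUE_VALUE)
worst_single_error = max(abs(m - TRUE_VALUE) for m in measurements)

print(f"mean of measurements: {mean:.3f}")
print(f"error of the mean:    {error_of_mean:.3f}")
print(f"worst single error:   {worst_single_error:.3f}")
```

The mean should land much closer to the assumed true value than the worst individual measurement does, which is the whole point of the exercise.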

Because it’s easy to understand and easy to use, averaging is used a lot. I mean A LOT. All the time, for everything. Which is often fine, because it’s actually a pretty useful statistic when applied in an appropriate context. But often it’s a horrible mistake, the sort of thing that buttresses false hypotheses and leads to stupid wrong conclusions. Often it is doing literally the opposite of what the user actually wants, which is to reveal what is going on in the data. Averages can all too easily hide that instead.

Obviously this is not really the fault of the poor old mean. It’s the fault of the scientist who isn’t thinking correctly about what their analysis is actually doing. But averaging is so universal, so ubiquitous, that people just take it for granted without much pause for thought.

The fundamental problem with averaging is, vexingly, also the thing that makes it appealing: it reduces a potentially complex set of data into something much simpler. Complex data sets are difficult to understand, so reduction is often desirable. But in the process a lot of information gets thrown away. Whether or not that’s an issue depends very much on what the data is.

In the simple error model described above, the extra data really is just noise — errors in the measurement process that obscure the single true value that we wish to know. This is the ideal use case for the mean, its whole raison d’être. Provided our error distribution is symmetric — which is to say, we’re about equally likely to get errors either way — we will probably end up with a reasonable estimate of the truth by taking the mean of a bunch of measurements. We don’t really care about the stuff we’re throwing away.

Histogram of a unimodal data set.
For unimodal data, the mean (indicated here by the dashed line) can give a reasonable estimate of the typical (or even “true”) value in the population.

However, this is a very specific kind of problem, and many — perhaps most — sets of data that we might be interested in aren’t like that. It’s actually pretty rare to be looking for a single true value in a data set, because most realistic populations are diverse. If the distribution of the data is not unimodal — meaning clustered around one central value — then the average is going to mislead.

Histogram of non-unimodal data
When the data is not unimodal, the mean value (dashed line) may be completely unrepresentative.
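A rough sketch of the same effect, using invented numbers: two made-up clusters centred on 2 and 8, which are not the data behind the figure above. The mean lands in the empty gap between them, where hardly any observations live.

```python
import random
import statistics

random.seed(1)  # fixed seed so the sketch is repeatable

# Two hypothetical subpopulations, tightly clustered around 2 and 8.
low = [random.gauss(2.0, 0.5) for _ in range(500)]
high = [random.gauss(8.0, 0.5) for _ in range(500)]
data = low + high

mean = statistics.fmean(data)

# How many observations actually sit within 1 unit of the mean?
near_mean = sum(1 for x in data if abs(x - mean) < 1.0)

print(f"mean: {mean:.2f}")
print(f"observations within 1 unit of the mean: {near_mean} of {len(data)}")
```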

What is the average human height or weight? That seems like a plausible use of the mean, but it’s barely even a meaningful question. A malnourished premature newborn and the world’s tallest adult simply aren’t commensurate. It’s like taking the average of a lightbulb and a school bus. The result tells you nothing useful about either.

This problem is significantly compounded when you start wanting to compare data sets. Which is something we always want to do.

You can, of course, compare two means. One of the most basic and widely used statistical tests, Student’s t-test, will do exactly that. But to do so is explicitly to assert that those means do indeed capture what you want to compare. In a diverse population (and again, most realistic populations are diverse) that is a strong assumption, one that needs to be justified with evidence.

Let’s say you’ve got two sets of observations. The sets are related in some way — they might be from patients before and after a treatment, or children before and after a year of schooling, or shoes worn on left and right feet. The raw data look like this:

Raw Data

You’re looking for a difference — a change, let’s call it an improvement — between these two sets, so you take the means:

Mean values of each group

Well, that’s kind of promising: observation 2 definitely looks better. Maybe you draw a line from one to the other to emphasise the change. Obviously there are some differences across the population, so you throw in an error bar to show the spread:

Difference of means with SD error bar

Here I’ve shown the standard deviation, a common and (given some distributional assumptions) useful measure of the variability in a data set. Very often people will instead use a different measure, the standard error of the mean (often shortened to standard error). This is a terrible practice that should be ruthlessly stamped out, but everyone keeps on doing it because it makes their error bars smaller:

Mean change with SEM
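For reference, the standard error of the mean is just the sample standard deviation divided by the square root of the number of observations, which is why it always looks tidier. A small sketch with a made-up data set (the helper name `sd_and_sem` is mine, not a library function):

```python
import math
import statistics


def sd_and_sem(values):
    """Return (sample standard deviation, standard error of the mean)."""
    sd = statistics.stdev(values)         # spread of the data itself
    sem = sd / math.sqrt(len(values))     # uncertainty in the estimated mean
    return sd, sem


# Hypothetical observations, invented for illustration.
small = [2, 4, 4, 4, 5, 5, 7, 9]
# The same values repeated 16 times: same spread, far more observations.
large = small * 16

sd_small, sem_small = sd_and_sem(small)
sd_large, sem_large = sd_and_sem(large)

print(f"n={len(small):3d}  SD={sd_small:.3f}  SEM={sem_small:.3f}")
print(f"n={len(large):3d}  SD={sd_large:.3f}  SEM={sem_large:.3f}")
```

The SD stays roughly where it was, because the data are no less variable. The SEM shrinks simply because n grew, so it says nothing about how diverse the population is.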

While you’re at it, you might perform the aforementioned t-test on the data and boldly assert there’s a less than 5% probability* the observed improvement could have happened by chance. Huzzah! Write your Nature paper, file your patent, prepare to get rich.

But what we’ve done here is gather a bunch of data — maybe through years of tedious and costly experiments — and then throw most of it away. Some such compression is inevitable when data sets are large, but it needs to be done judiciously. In this case the sets are not actually large — and I’ve concocted them to make a point — so let’s claw back that discarded information and take another look.

In using the means to assess the improvement we implicitly assumed the population changes were homogeneous. If instead we look at all the changes individually, a different picture emerges:

Pairing data between observations

It’s pretty clear that not everyone is responding the same way. Our ostensible improvement is far from universal. In fact there are two radically different subsets in this data:

Paired data with group colouring

Fully half of the subjects are significantly worse off after treatment — and in fact they were the ones who were doing best to begin with. That’s something we’d really better investigate before marketing our product.
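The same trap is easy to reproduce in a few lines. The numbers below are invented for this sketch and are not the values plotted above. They are rigged, as mine were, so that the mean goes up while half the subjects go down.

```python
import statistics

# Hypothetical paired observations, invented for illustration:
# each position is one subject measured before and after.
before = [2, 3, 4, 5, 10, 11, 12, 13]
after = [12, 13, 14, 15, 7, 8, 9, 10]

mean_before = statistics.fmean(before)
mean_after = statistics.fmean(after)

# Per-subject changes: the information that comparing two means throws away.
changes = [a - b for b, a in zip(before, after)]

# Starting values of the subjects who improved and of those who got worse.
improved = [b for b, c in zip(before, changes) if c > 0]
worsened = [b for b, c in zip(before, changes) if c < 0]

print(f"mean before: {mean_before}, mean after: {mean_after}")
print(f"individual changes: {changes}")
print(f"starting values of subjects who improved:  {improved}")
print(f"starting values of subjects who got worse: {worsened}")
```

Compare the two means and you see a healthy gain. Look at the pairs and you see that every subject who started high ended up worse off.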

If this were real data, we would want to know what distinguishes the two groups. Is it the mere fact of having a high initial level of whatever it is we’re measuring? Is there some other obvious distinction like sex or smoking? Some specific disease state? Or is there a complex interplay of physiological and social factors that leads to the different outcome? There might be an easy answer, or it might be completely intractable.

Of course, it’s not real data, so the question is meaningless. But the general shape of the problem is not just an idle fiction. It’s endemic. This is rudimentary data analysis stuff, tip of the iceberg, Stats 101, and people get it wrong all the sodding time. They’re looking at populations they know are drastically heterogeneous, but they can scrape a significant p-value by comparing means and that’s all that matters.

Stop it. Don’t be that person. Don’t toss your data away. Recognise its structure. Plot it all. Don’t hide its skew. Don’t make unwarranted assumptions. And don’t take an average unless you actually fucking mean it.


* I’ll save the rant about significance tests for another time.

Fear of Music

On a possibly happier note, there’s this:

Some other SoundCloud “albums” passed unreported here, but this one entertains me for some reason and is the first in ages to make it to BandCamp — rest assured as always that no-one is expected to buy the fucking thing, BC just represents a different grouping mechanism.

And yes, this is my third post in three days. Some kind of inertial shift? No promises whatsoever.

Plate Tectonics

I happened to be reading Guy Gavriel Kay’s latest in June, which seemed grimly apt. A common pattern in several of his novels — though perhaps less in Children of Earth and Sky, as it turns out — is of events accumulating into a huge societal transformation, whose enormity is not apparent until afterwards. Errors of judgement, missed opportunities, subtly shifting political alliances and conflicts of interest conspire to bring about the end of a golden age, destroy a fragile civilisation, harden and coarsen and entrench attitudes in a wearied population. Life goes on — what else should it do? — just a bit less well. Only in retrospect do we perceive the knife edge on which it was so finely balanced.

These are fictions, of course, heightened and romanticised, and who knows how well they capture anything of the real historical moments on which Kay draws — the fall of the Tang and Northern Song, Byzantium and al-Andalus. An appeal to a Golden Age is always dangerous. Nostalgia corrupts; just look around us now.

Still, it certainly feels like one of those moments. A collective surrender to the imp of the perverse; a yearning for things to be made worse. And it’s not over yet, of course.

I’ve intermittently felt the urge to write about it, but it’s been difficult to summon much enthusiasm for blogging amidst the ruins, the trembling ground, the overwhelming sense of unmooring. A part of my identity is leaching away.

In the poisoned discourse of 2016, we are told that it is arrogant and elitist and anti-democratic to complain about the abrogation of a whole population’s rights at the whim of a narrow majority of actively-deceived voters, to rail against the generational betrayal of the young by delusional elderly racists, or to point out that contradictory goals do not suddenly become reconcilable just because an uneasy coalition of people with opposing aims all declared a desire for their own particular fantasy of not the status quo.

We voted for magic. Now bring me my unicorn!

Well, fuck that shit. Fuck the vanity of uninformed opinion, the false equivalence of visceral prejudice with expertise, the active disdain for reality. Fuck the shameless lies and pandering of nauseating hucksters like Gove and Johnson, peddling random policy baubles and then backing away with an insouciant shrug. Fuck the sociopathic (and ongoing) rabble-rousing of haterags like the Express and Mail and the cowed pseudo-balance of the BBC. Fuck the insistence of the ignorant that their vapid views be listened to and taken seriously.

That seething mass of mutually-incompatible twattery who make up the 52% are wrong and their misexpression of misdesire deserves no fucking respect at all. Literally every single reason for voting Leave boils down to one or both of evil and stupid.

I am prepared to accept that most of those people are not evil.

 

Aurora

“[…] People do seem to get addicted to their resentments. It must be like an endorphin, or a brain action in a temporal region, near the religious and epileptic nodes. I read a paper saying as much.”

“Fine for you, but let’s stick to the problem at hand. People feeling resentment are not going to give up on it when they are told they are drug addicts enjoying a religious seizure.”