Should we trust London police (and therefore the Mayor’s/Home Office’s) claims on crime stats?

I just read a report in the Islington newspaper about crime statistics.

Ten days ago I was sitting outside a cafe in Islington.

Thanks to a few years of working in a nightclub and a couple of years in Russia followed by episodes like the referendum, I have developed a greater than average degree of paranoia. This is almost always irritating but occasionally useful. 

Typing away on my computer, I sensed a scooter’s noise was too close.

I looked up and my eyes locked on those of someone about a second away from driving his scooter into my table.

Between a mask across his nose and a hat, all I could see were the eyes of a young black male (roughly 15-25). I think it extremely unlikely I could have identified him in a lineup.

A second later his bike hit my table and he grabbed my laptop.

I grabbed it back, we wrestled, he nearly fell off his bike and, after a half-second pause when I thought ‘fuckhesabouttogetoffhisbikehithimfirst’, he whizzed off down the pavement, nearly smashing into a pedestrian.

This was seen by half a dozen people. Two were calling the cops within seconds. 

I stood on the street and cursed my stupidity in fighting over a laptop (no Carole, no evidence of global conspiracies there).

The cops told both witnesses they would come.

I hung around for an hour or so. They never came despite having been told the whole scene was captured on CCTV.

A few days later, as I was typing on my laptop inside the same cafe, the same scene played out.

I saw it through the window as a guy grabbed a laptop from a girl and whizzed off. She was sitting between two parents each with a small child. Both parents were rightly worried about the potential for such attacks to lead to a collision between escaping bike and toddler. My wife and I often sit there with our toddler.

She called the cops and told them it would be on CCTV (I’d told her).

The cops said they’d come.

I hung around for an hour or so.

They never came.

These two incidents came after a spate of knife attacks in the half square mile around this cafe and the colonisation of Rosemary Gardens by various gangs at various times of the day.

According to the Islington Gazette today (HERE), the local cops are claiming that moped crime is 60% down.

I know from my nightclub days that when local cops need to show a fall in crime for political reasons there are all sorts of ways in which they can easily cheat numbers.

As far as I understand it, neither of the two moped attacks above would be recorded in the stats. There was no attempt to watch CCTV footage or gather evidence once they knew the people concerned were not claiming injuries.

Should I trust official statistics such as those announced by Islington police today or am I right to be sceptical? Are there any serious statistical papers estimating what sort of errors are likely in such statistics? (NB. Polling companies often misstate the definition of ‘margin of error’ in their own polls, so it is common for fields to have ropey ideas about error rates.)

Apart from whether there is a rigorous process for gathering such statistics locally, is there a Red Team that acts across London to review local processes?

Do Corbyn and Thornberry (local MPs) believe the official statistics? Does the Mayor? Do the Home Secretary and Prime Minister? (I imagine that the Mayor’s office has no real capacity to interrogate official figures and is more or less completely reliant on what he is told?)

Ps. Later that day I called 101 (the police non-emergency number) to see what would happen if I tried to report it. A recorded message said long delays, use the website. I went to the website, started to fill it in, the page crashed, and I abandoned ship. Doubtless I should have pushed harder to ensure it was recorded but I guess my behaviour is roughly typical, so many similar incidents involving other people are probably going unrecorded, which is my main point.

Pps. At the least, saying ‘we will come’ then not coming leads local people to conclude a) you can’t rely on the police, b) they’re giving up. So if they are not going to come, it would be wiser to say so and explain why. It’s always interesting when such basic processes are wrong. E.g the way health systems kill thousands every year needlessly because they don’t use simple checklists to avoid central line infections. People in politics tend to spend far too much time on higher profile issues affecting few people and too little time on such basic processes that affect thousands or millions and which we know how to do much better… Cf. blog on expertise which is also relevant to the new money for the NHS.

On the referendum #27: Banks, Russia, conspiracies and Vote Leave

Dear Tory MPs, ministers, donors and peers who supported the January 2016 coup against Vote Leave…

Remember how I and Victoria Woodcock told you repeatedly Banks was not someone who should play a significant role, that his conduct would destroy the credibility of an official campaign, and a ‘unified campaign’ with him would be a ‘total disaster’?

Remember how I and Victoria Woodcock told you repeatedly that he could not be trusted?

Remember how in horrific meeting after horrific meeting you said that we didn’t understand politics and we needed to ‘unite’ and ‘use his social media operation cos he’s got hundreds of thousands of Likes’?

Remember how we clutched our heads and said ‘Facebook doesn’t work like that, he’s spinning you all bullshit, the media will sink the whole campaign if Banks is involved and we refuse to contemplate it’?

Remember how you then tried to engineer the coup, partly also because Banks had told so many of you (cunningly) that the most important factor in winning was ‘you must represent us in the debates on stage with Nigel in front of millions’ and ‘we need your experience, not all these kids Cummings has hired’?

Of course it’s true that the Remain Establishment are doing whatever they can to discredit the referendum, the Observer has invented stuff for two years (including loony conspiracy theories about Banks, me, Mercer, AIQ, Russia etc), and Banks’s actual role in the 10 week campaign was trivial other than causing us embarrassment. Yes it’s true that Banks was a net drag on the result and we’d have won by more if he’d been dropped down one of his defunct mines in summer 2015 and the effort wasted dealing with him had been spent making Vote Leave much stronger earlier. From grassroots to digital, everything would have happened earlier, bigger and better but for that debilitating distraction which meant VL staff had to fight Banks and the entire Establishment simultaneously.

But all that does not change how close you all came to destroying our chance of winning by putting him in charge of the whole thing.

I know you’ll all be wanting to write to those Vote Leave staff who called your bluff and who, unlike you, displayed moral courage under pressure and made many personal sacrifices while you were on the beach or shooting, in order to apologise and thank them personally so here are some of the names of those who told you on 25 January 2016 they would all be out the door in 5 minutes if you persisted and handed power to Banks, and thereby nudged reality down a different branching history:

  • Richard Howell
  • Oliver Lewis
  • Rob Oxley
  • Stephen Parkinson
  • James Starkie
  • Paul Stephenson
  • Jonny Suart
  • Nick Varley
  • Cleo Watson
  • Victoria Woodcock

But for their actions that day, Vote Leave would have been destroyed, Farage and Banks would have run the official campaign with Bill Cash as legal adviser fighting with DD to be on the Today programme, Boris and Gove would have gone on a long holiday rather than flush their reputations down the toilet, Remain would have won 60-40 and Osborne would today be scanning the horizon for the right moment to take over before the 2020 election.

You’re welcome…

Dominic

Ps. Another branching history… If Cameron and Osborne had simply delayed the vote to 2017, Vote Leave would have ceased to exist in spring 2016, Banks and Farage would have been in charge with Cash/DD et al, and Remain would very likely have cruised to victory last year. Our extreme action on and after 25 January only worked because of the time pressure imposed by the Government. Without it, the consensus was, as people said at the time, ‘we’d have a year to rebuild without you and your crazy ideas’. In history books, luck is always underplayed and the talent of individuals is usually overplayed. As I’ve said many times, Vote Leave could only win because the Establishment’s OODA loops are broken — as the Brexit negotiations painfully demonstrate daily — and they are systematically bad at decisions, and this created just enough space for us to win.

Pps. Although Banks and Leave.EU HQ were hopeless, many of its volunteers did great work and ignored ‘the horror, the horror’ in London among the egomaniacs. Although Farage told them not to help VL post-designation, most of them ignored him and did help us (see comments below).

‘Politics is a job that can really only be compared with navigation in uncharted waters. One has no idea how the weather or the currents will be or what storms one is in for. In politics, there is the added fact that one is largely dependent on the decisions of others, decisions on which one was counting and which then do not materialise; one’s actions are never completely one’s own. And if the friends on whose support one is relying change their minds, which is something that one cannot vouch for, the whole plan miscarries… One’s enemies one can count on – but one’s friends!’ Otto von Bismarck.

‘Everything in war is very simple, but the simplest thing is difficult. The difficulties accumulate and end by producing a kind of friction that is inconceivable unless one has experienced war… Countless minor incidents – the kind you can never really foresee – combine to lower the general level of performance, so that one always falls short of the intended goal.  Iron will-power can overcome this friction … but of course it wears down the machine as well… Friction is the only concept that … corresponds to the factors that distinguish real war from war on paper.  The … army and everything else related to it is basically very simple and therefore seems easy to manage. But … each part is composed of individuals, every one of whom retains his potential of friction… This tremendous friction … is everywhere in contact with chance, and brings about effects that cannot be measured… Friction … is the force that makes the apparently easy so difficult… Finally … all action takes place … in a kind of twilight, which like fog or moonlight, often tends to make things seem grotesque and larger than they really are.  Whatever is hidden from full view in this feeble light has to be guessed at by talent, or simply left to chance.’ Clausewitz.

 

On the referendum #26: How to change science funding post-Brexit [updated with comment by Alan Kay]

There was an excellent piece in the Telegraph yesterday by two young neuroscientists on how SW1 should be thinking about science post-Brexit. The byline says that James Phillips works at Janelia, a US lab that has explicitly tried to learn about how to fund science research from the famous successes of Bell Labs, the ARPA-PARC project that invented the internet and PC, and similar efforts. He must see every day how science funding can work so much better than is normal in Britain.

Today, the UK a) ties research up in appalling bureaucracy, such as requiring multi-stage procurement processes literally to change a lightbulb, and b) does not fund it enough. The bureaucracy around basic science is so crazy that a glitch in paperwork means thousands of animals are secretly destroyed, something the public would be appalled to learn.

Few in SW1 take basic science research seriously. And in all the debates over Brexit, practically the entire focus is 1980s arguments over the mechanism for regulating product markets created by Delors to centralise power in Brussels — the Internal Market (aka Single Market). Thirty years after they committed to this mechanism and two years after the referendum that blew it up, most MPs still don’t understand what it is and how it works. Dismally, the last two years have been a sort of remedial education programme and there has been practically zero discussion of how Britain could help create the future.

During the referendum, Vote Leave argued that the dreadful Cameron/Osborne immigration policy (including the net migration target) was damaging and said we should make Britain MORE welcoming to scientists. Obviously Remain-SW1 likes to pretend that the May/Hammond Remain team’s shambles is the only possible version of Brexit. Nothing could be further from the truth. If the government had funded the NHS, ditched the ‘tens of thousands’ absurdity, and, for example, given maths, physics and computer science PhDs ‘free movement’ then things would be very different now — and Corbyn would probably be a historical footnote.

Regardless of how you voted in the referendum, reasonable people outside the rancid environment of SW1 should pressure their MPs to take their responsibilities to science x100 more seriously than they do.

I strongly urge you to read it all, send it to your MP, and politely ask for action…

(Their phrase ‘creating the future’ invokes Alan Kay’s famous line — the best way to predict the future is to invent it.)


Science holds the key, by James & Matthew Phillips

The 2008 crisis should have led us to reshape how our economy works. But a decade on, what has really changed? The public knows that the same attitude that got us into the previous economic crisis will not bring us long-term prosperity, yet there is little vision from our leaders of what the future should look like. Our politicians are sleeping, yet have no dreams. To solve this, we must change emphasis from creating “growth” to creating the future: the former is an inevitable product of the latter.

Britain used to create the future, and we must return to this role by turning to scientists and engineers. Science defined the last century by creating new industries. It will define this century too: robotics, clean energy, artificial intelligence, cures for disease and other unexpected advances lie in wait. The country that gives birth to these industries will lead the world, and yet we seem incapable of action.

So how can we create new industries quickly? A clue lies in a small number of institutes that produced a strikingly large number of key advances. Bell Labs produced much of the technology underlying computing. The Palo Alto Research Centre did the same for the internet. There are simple rules of thumb about how great science arises, embodied in such institutes. They provided ambitious long-term funding to scientists, avoided unnecessary bureaucracy and chased high-risk, high-reward projects.

Today, scientists spend much of their time completing paperwork. A culture of endless accountability has arisen out of a fear of misspending a single pound. We’ve seen examples of routine purchases of LEDs that cost under £10 having to go through a nine-step bureaucratic review process.

Scientists on the cusp of great breakthroughs can be slowed by years mired in review boards and waiting on a decision from on high. Their discoveries are thus made, and capitalised on, elsewhere. We waste money, miss patents, lose cures and drive talented scientists away to high-paid jobs. You don’t cure cancer with paperwork. Rather than invigilate every single decision, we should do spot checks retrospectively, as is done with tax returns.

A similar risk aversion is present in the science funding process. Many scientists are forced to specify years in advance what they intend to do, and spend their time continually applying for very short, small grants. However, it is the unexpected, the failures and the accidental, which are the inevitable cost and source of fruit in the scientific pursuit. It takes time, it takes long-term thinking, it takes flexibility. Peter Higgs, the Nobel laureate who predicted the Higgs boson, says he wouldn’t stand a chance of being funded today for lack of a track record. This leads scientists collectively to pursue incremental, low-risk, low-payoff work.

The current funding system is also top-down, prescriptive and homogenous, administered centrally from London. It is slow to respond to change and cut off from the real world.

We should return to funding university departments more directly, allowing more rapid, situation-aware decision-making of the kind present in start-ups, and create a diversity of funding systems. This is how the best research facilities in history operated, yet we do not learn their key lesson: that science cannot be managed by central edict, but flourishes through independent inquiry.

While Britain built much of modern science, today it neglects it, lagging behind other comparable nations in funding, and instead prioritising a financial industry prone to blowing up. Consider that we spent more money bailing out the banks in a single year than we have on science in the entirety of history.

We scarcely pause to consider the difference in return on investment. Rather than prop up old industries, we should invest in world-leading research institutes with a specific emphasis on high-risk, high-payoff research.

Those who say this is not government’s role fail the test of history. Much great science has come from government investment in times of crisis. Without Nasa, there would be no SpaceX. These government investments were used to provide a long-term, transformative vision on a scale that cannot be achieved through private investment alone – especially where there is a high risk of failure but high reward in success. The payoff of previous investments was enormous, so why not replicate the defence funding agencies that led to them with peacetime civilian equivalents?

In order to be the nation where new discoveries are made, we must take decisive steps to make the UK a magnet for talented young scientists.

However, a recent report on ensuring a successful UK research endeavour scarcely mentioned young scientists at all. An increased focus on this goal, alongside simple steps like long-term funding and guaranteed work visas for their spouses, would go a long way. In short, we should be to scientific innovation what we are to finance: a highly connected nerve centre for the global economy.

The political candidate that can leverage a pro-science platform to combine economic stimulus with the reality of economic pragmatism will transform the UK. We should lead the future by creating it.

James Phillips is a PhD student in neuroscience at the HHMI Janelia Research Campus in the US and the University of Cambridge. 
Matthew Phillips is a PhD student in neuroscience at the Sainsbury Wellcome Centre, University College London


UPDATE

Alan Kay, the brilliant researcher I mentioned above, happened to read this blog and posted this comment, which I paste below…

[From Alan Kay]

Good advice! However, I’m afraid that currently in the US there is nothing like the fabled Bell Labs or ARPA-PARC funding, at least in computing where I’m most aware of what is and is not happening (I’m the “Alan Kay” of the famous quote).

It is possible that things were still better a few years ago in the US than in the UK (I live in London half the year and in Los Angeles the other half). But I have some reasons to doubt. Since the new “president”, the US does not even have a science advisor, nor is there any sign of desire for one.

A visit to the classic Bell Labs of its heyday would reveal many things. One of the simplest was a sign posted randomly around: “Either do something very useful, or very beautiful”. Funders today won’t fund the second at all, and are afraid to fund at the risk level needed for the first.

It is difficult to sum up ARPA-PARC, but one interesting perspective on this kind of funding was that it was both long range and stratospherically visionary, and part of the vision was that good results included “better problems” (i.e. “problem finding” was highly valued and funded well) and good results included “good people” (i.e. long range funding should also create the next generations of researchers). In fact, virtually all of the researchers at Xerox PARC had their degrees funded by ARPA; they were “research results” who were able to get better research results.

Once the “D” was put on ARPA in the early 70s, it was no longer able to do what it did in the 60s. NSF in the US never did this kind of funding. I spent quite a lot of time on some of the NSF Advisory Boards and it was pretty much impossible to bridge the gap between what was actually needed and the difficulties the Foundation has with congressional oversight (and some of the stipulations of their mission).

Bob Noyce (one of the founders of Intel) used to say “Wealth is created by Scientists, Engineers and Artists, everyone else just moves it around”.

Einstein said “We cannot solve important problems of the world using the same level of thinking we used to create them”.

A nice phrase by Vi Hart is “We must insure human wisdom exceeds human power”.

To make it to the 22nd century at all, and especially in better shape than we are now, we need to heed all three of these sayings, and support them as the civilization we are sometimes trying to become. It’s the only context in which “The best way to predict the future is to invent it” makes any useful sense.

Effective action #4b: ‘Expertise’, prediction and noise, from the NHS killing people to Brexit

In part A I looked at extreme sports as some background to the question of true expertise and the crucial nature of fast high quality feedback.

This blog looks at studies comparing expertise in many fields over decades, including work by Tetlock and Kahneman, and problems like why people don’t learn to use even simple tools to stop children dying unnecessarily. There is a summary of some basic lessons at the end.

The reason for writing about this is that we will only improve the performance of government (at individual, team and institutional levels) if we reflect on:

  • what expertise really is and why some very successful fields cultivate it effectively while others, like government, do not;
  • how to select much higher quality people (it’s insane that people as ignorant and limited as me can have the influence we do in the way we do — us limited duffers can help in limited ways but why do we deliberately exclude ~100% of the most intelligent, talented, relentless, high performing people from fields with genuine expertise, why do we not have people like Fields Medallist Tim Gowers or Michael Nielsen as Chief Scientist sitting ex officio in Cabinet?);
  • how to train people effectively to develop true expertise in skills relevant to government: it needs different intellectual content (PPE/economics are NOT good introductory degrees) and practice in practical skills (project management, making predictions and in general ‘thinking rationally’) with lots of fast, accurate feedback;
  • how to give them effective tools: e.g the Cabinet Room is worse in this respect than it was in July 1914 — at least then the clock and fireplace worked, and Lord Salisbury in the 1890s would walk round the Cabinet table gathering papers to burn in the grate — while today No10 is decades behind the state-of-the-art in old technologies like TV, doesn’t understand simple tools like checklists, and is nowhere with advanced technologies;
  • and how to ‘program’ institutions differently so that 1) people are more incentivised to optimise things we want them to optimise, like error-correction and predictive accuracy, and less incentivised to optimise bureaucratic process, prestige, and signalling as our institutions now do to a dangerous extent, and, connected, so that 2) institutions are much better at building high performance teams rather than continue normal rules that make this practically illegal, and so that 3) we have ‘immune systems’ to minimise the inevitable failures of even the best people and teams.

In SW1 now, those at the apex of power practically never think in a serious way about the reasons for the endemic dysfunctional decision-making that constitutes most of their daily experience or how to change it. What looks like omnishambles to the public and high performers in technology or business is seen by Insiders, always implicitly and often explicitly, as ‘normal performance’. ‘Crises’ such as the collapse of Carillion or our farcical multi-decade multi-billion ‘aircraft carrier’ project occasionally provoke a few days of headlines but it’s very rare anything important changes in the underlying structures and there is no real reflection on system failure.

This fact is why, for example, a startup created in a few months could win a referendum that should have been unwinnable. It was the systemic and consistent dysfunction of Establishment decision-making systems over a long period, with very poor mechanisms for good accurate feedback from reality, that created the space for a guerrilla operation to exploit.

This makes it particularly ironic that even after Westminster and Whitehall have allowed their internal consensus about UK national strategy to be shattered by the referendum, there is essentially no serious reflection on this system failure. It is much more psychologically appealing for Insiders to blame ‘lies’ (Blair and Osborne really say this without blushing), devilish use of technology to twist minds and so on. Perhaps the most profound aspect of broken systems is that they cannot reflect on the reasons why they’re broken — never mind take effective action. Instead of serious thought, we have high status Insiders like Campbell reduced to bathos, whining on social media about Brexit ‘impacting mental health’. This lack of reflection is why Remain-dominated Insiders lurched from failure over the referendum to failure over negotiations. OODA loops across SW1 are broken and this is very hard to fix — if you can’t orient to reality how do you even see your problem well? (NB. It should go without saying that there is a faction of pro-Brexit MPs, ‘campaigners’ and ‘pro-Brexit economists’ who are at least as disconnected from reality as the May/Hammond bunker, often more so.)


In the commercial world, big companies mostly die within a few decades because they cannot maintain an internal system to keep them aligned to reality, while startups pop up to replace them. These two factors create learning at a system level — there is lots of micro failure but macro productivity/learning in which useful information is compressed and abstracted. In the political world, big established failing systems control the rules, suck in more and more resources rather than go bust, make it almost impossible for startups to contribute and so on. Even failures on the scale of the 2008 Crash or the 2016 referendum do not necessarily make broken systems face reality, at least quickly. Watching Parliament’s obsession with trivia in the face of the Cabinet’s and Whitehall’s contemptible failure to protect the interests of millions in the farcical Brexit negotiations is like watching the secretary of the Singapore Golf Club objecting to guns being placed on the links as the Japanese troops advanced.

Neither of the main parties has internalised the reality of these two crises. The Tories won’t face reality on things like corporate looting and the NHS, Labour won’t face reality on things like immigration and the limits of bureaucratic centralism. Neither can cope with the complexity of Brexit and both just look like I would in the ring with a professional fighter — baffled, terrified and desperate for a way to escape. There are so many simple ways to improve performance — and their own popularity! — but the system is stuck in such a closed loop it wilfully avoids seeing even the most obvious things and suppresses Insiders who want to do things differently…

But… there is a network of almost entirely younger people inside or close to the system thinking ‘we could do so much better than this’. Few senior Insiders are interested in these questions but that’s OK — few of them listened before the referendum either. It’s not the people now in power and running the parties and Whitehall who will determine whether we make Brexit a platform to contribute usefully to humanity’s biggest challenges but those who take over.

Doing better requires reflecting on what we know about real expertise…

*

How to distinguish between fields dominated by real expertise and those dominated by confident ‘experts’ who make bad predictions?

We know a lot about the distinction between fields in which there is real expertise and fields dominated by bogus expertise. Daniel Kahneman, who has published some of the most important research about expertise and prediction, summarises the two fundamental tests to ask about a field: 1) is there enough informational structure in the environment to allow good predictions, and 2) is there timely and effective feedback that enables error-correction and learning.

‘To know whether you can trust a particular intuitive judgment, there are two questions you should ask: Is the environment in which the judgment is made sufficiently regular to enable predictions from the available evidence? The answer is yes for diagnosticians, no for stock pickers. Do the professionals have an adequate opportunity to learn the cues and the regularities? The answer here depends on the professionals’ experience and on the quality and speed with which they discover their mistakes. Anesthesiologists have a better chance to develop intuitions than radiologists do. Many of the professionals we encounter easily pass both tests, and their off-the-cuff judgments deserve to be taken seriously. In general, however, you should not take assertive and confident people at their own evaluation unless you have independent reason to believe that they know what they are talking about.’ (Emphasis added.)

In fields where these two elements are present there is genuine expertise and people build new knowledge on the reliable foundations of previous knowledge. Some fields make a transition from stories (e.g Icarus) and authority (e.g ‘witch doctor’) to quantitative models (e.g modern aircraft) and evidence/experiment (e.g some parts of modern medicine/surgery). As scientists have said since Newton, they stand on the shoulders of giants.

How do we assess predictions / judgement about the future?

‘Good judgment is often gauged against two gold standards – coherence and correspondence. Judgments are coherent if they demonstrate consistency with the axioms of probability theory or propositional logic. Judgments are correspondent if they agree with ground truth. When gold standards are unavailable, silver standards such as consistency and discrimination can be used to evaluate judgment quality. Individuals are consistent if they assign similar judgments to comparable stimuli, and they discriminate if they assign different judgments to dissimilar stimuli.

‘Coherence violations range from base rate neglect and confirmation bias to overconfidence and framing effects (Gilovich, Griffin & Kahneman, 2002; Kahneman, Slovic & Tversky, 1982). Experts are not immune. Statisticians (Christensen-Szalanski & Bushyhead, 1981), doctors (Eddy, 1982), and nurses (Bennett, 1980) neglect base rates. Physicians and intelligence professionals are susceptible to framing effects and financial investors are prone to overconfidence.

‘Research on correspondence tells a similar story. Numerous studies show that human predictions are frequently inaccurate and worse than simple linear models in many domains (e.g. Meehl, 1954; Dawes, Faust & Meehl, 1989). Once again, expertise doesn’t necessarily help. Inaccurate predictions have been found in parole officers, court judges, investment managers in the US and Taiwan, and politicians. However, expert predictions are better when the forecasting environment provides regular, clear feedback and there are repeated opportunities to learn (Kahneman & Klein, 2009; Shanteau, 1992). Examples include meteorologists, professional bridge players, and bookmakers at the racetrack, all of whom are well-calibrated in their own domains.‘ (Tetlock, How generalizable is good judgment?, 2017.)

In another 2017 piece Tetlock explored the studies further. In the 1920s researchers built simple models based on expert assessments of 500 ears of corn and the price they would fetch in the market. They found that ‘to everyone’s surprise, the models that mimicked the judges’ strategies nearly always performed better than the judges themselves’ (Tetlock, cf. ‘What Is in the Corn Judge’s Mind?’, Journal of the American Society of Agronomy, 1923). Banks found the same when they introduced models for credit decisions.
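
To make the corn-judge result concrete, here is a minimal sketch of ‘bootstrapping’ in Python with invented cues and numbers (nothing here is from the 1923 study): regress the judge’s own ratings on the cues, then use the fitted model instead of the judge. Because the model applies the judge’s strategy with perfect consistency, it tends to track the truth better than the judge does.

```python
# A minimal sketch of "bootstrapping" a judge, with invented data.
import numpy as np

rng = np.random.default_rng(0)

# Invented cues for 200 past cases (e.g. kernel size, row quality, colour).
cues = rng.normal(size=(200, 3))
true_quality = cues @ np.array([0.5, 0.3, 0.2])

# The judge weighs the right cues but adds random noise to each judgment.
judge = true_quality + rng.normal(scale=0.8, size=200)

# Model-of-the-judge: regress the judge's ratings on the cues.
X = np.column_stack([cues, np.ones(len(cues))])
weights, *_ = np.linalg.lstsq(X, judge, rcond=None)
model = X @ weights

# The model mimics the judge's strategy but strips out the noise.
print("judge vs truth:", round(np.corrcoef(judge, true_quality)[0, 1], 2))
print("model vs truth:", round(np.corrcoef(model, true_quality)[0, 1], 2))
```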

‘In other fields, from predicting the performance of newly hired salespeople to the bankruptcy risks of companies to the life expectancies of terminally ill cancer patients, the experience has been essentially the same. Even though experts usually possess deep knowledge, they often do not make good predictions…

‘When humans make predictions, wisdom gets mixed with “random noise.”… Bootstrapping, which incorporates expert judgment into a decision-making model, eliminates such inconsistencies while preserving the expert’s insights. But this does not occur when human judgment is employed on its own…

‘In fields ranging from medicine to finance, scores of studies have shown that replacing experts with models of experts produces superior judgments. In most cases, the bootstrapping model performed better than experts on their own. Nonetheless, bootstrapping models tend to be rather rudimentary in that human experts are usually needed to identify the factors that matter most in making predictions. Humans are also instrumental in assigning scores to the predictor variables (such as judging the strength of recommendation letters for college applications or the overall health of patients in medical cases). What’s more, humans are good at spotting when the model is getting out of date and needs updating…

‘Human experts typically provide signal, noise, and bias in unknown proportions, which makes it difficult to disentangle these three components in field settings. Whether humans or computers have the upper hand depends on many factors, including whether the tasks being undertaken are familiar or unique. When tasks are familiar and much data is available, computers will likely beat humans by being data-driven and highly consistent from one case to the next. But when tasks are unique (where creativity may matter more) and when data overload is not a problem for humans, humans will likely have an advantage…

‘One might think that humans have an advantage over models in understanding dynamically complex domains, with feedback loops, delays, and instability. But psychologists have examined how people learn about complex relationships in simulated dynamic environments (for example, a computer game modeling an airline’s strategic decisions or those of an electronics company managing a new product). Even after receiving extensive feedback after each round of play, the human subjects improved only slowly over time and failed to beat simple computer models. This raises questions about how much human expertise is desirable when building models for complex dynamic environments. The best way to find out is to compare how well humans and models do in specific domains and perhaps develop hybrid models that integrate different approaches.’ (Tetlock)

Kahneman also recently published new work relevant to this.

‘Research has confirmed that in many tasks, experts’ decisions are highly variable: valuing stocks, appraising real estate, sentencing criminals, evaluating job performance, auditing financial statements, and more. The unavoidable conclusion is that professionals often make decisions that deviate significantly from those of their peers, from their own prior decisions, and from rules that they themselves claim to follow.’

In general, organisations spend almost no effort figuring out how noisy the predictions made by senior staff are and how much this costs. Kahneman has done some ‘noise audits’ and shown companies that management make MUCH more variable predictions than people realise.

‘What prevents companies from recognizing that the judgments of their employees are noisy? The answer lies in two familiar phenomena: Experienced professionals tend to have high confidence in the accuracy of their own judgments, and they also have high regard for their colleagues’ intelligence. This combination inevitably leads to an overestimation of agreement. When asked about what their colleagues would say, professionals expect others’ judgments to be much closer to their own than they actually are. Most of the time, of course, experienced professionals are completely unconcerned with what others might think and simply assume that theirs is the best answer. One reason the problem of noise is invisible is that people do not go through life imagining plausible alternatives to every judgment they make.

‘High skill develops in chess and driving through years of practice in a predictable environment, in which actions are followed by feedback that is both immediate and clear. Unfortunately, few professionals operate in such a world. In most jobs people learn to make judgments by hearing managers and colleagues explain and criticize—a much less reliable source of knowledge than learning from one’s mistakes. Long experience on a job always increases people’s confidence in their judgments, but in the absence of rapid feedback, confidence is no guarantee of either accuracy or consensus.’
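
To make the idea of a ‘noise audit’ concrete, here is a minimal sketch with invented data; the metric used (spread of judgments across professionals assessing the same case) is one reasonable choice, not necessarily Kahneman’s exact one.

```python
# A minimal sketch of a "noise audit", with invented data.
import numpy as np

rng = np.random.default_rng(1)

n_cases, n_pros = 50, 8
true_values = rng.uniform(100, 1000, size=n_cases)  # e.g. claim values in £k

# Each professional assesses the same cases, with a personal bias
# plus case-by-case random noise.
bias = rng.normal(scale=50, size=n_pros)
judgments = (true_values[:, None]
             + bias[None, :]
             + rng.normal(scale=100, size=(n_cases, n_pros)))

# For each case: spread of the professionals' judgments relative to their mean.
spread = judgments.std(axis=1) / judgments.mean(axis=1)
print(f"median relative disagreement: {np.median(spread):.0%}")
# In real audits this number is typically far larger than managers expect.
```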

Reviewing the point that Tetlock makes about simple models beating experts in many fields, Kahneman summarises the evidence:

‘People have competed against algorithms in several hundred contests of accuracy over the past 60 years, in tasks ranging from predicting the life expectancy of cancer patients to predicting the success of graduate students. Algorithms were more accurate than human professionals in about half the studies, and approximately tied with the humans in the others. The ties should also count as victories for the algorithms, which are more cost-effective…

‘The common assumption is that algorithms require statistical analysis of large amounts of data. For example, most people we talk to believe that data on thousands of loan applications and their outcomes is needed to develop an equation that predicts commercial loan defaults. Very few know that adequate algorithms can be developed without any outcome data at all — and with input information on only a small number of cases. We call predictive formulas that are built without outcome data “reasoned rules,” because they draw on commonsense reasoning.

‘The construction of a reasoned rule starts with the selection of a few (perhaps six to eight) variables that are incontrovertibly related to the outcome being predicted. If the outcome is loan default, for example, assets and liabilities will surely be included in the list. The next step is to assign these variables equal weight in the prediction formula, setting their sign in the obvious direction (positive for assets, negative for liabilities). The rule can then be constructed by a few simple calculations.

‘The surprising result of much research is that in many contexts reasoned rules are about as accurate as statistical models built with outcome data. Standard statistical models combine a set of predictive variables, which are assigned weights based on their relationship to the predicted outcomes and to one another. In many situations, however, these weights are both statistically unstable and practically unimportant. A simple rule that assigns equal weights to the selected variables is likely to be just as valid. Algorithms that weight variables equally and don’t rely on outcome data have proved successful in personnel selection, election forecasting, predictions about football games, and other applications.

‘The bottom line here is that if you plan to use an algorithm to reduce noise, you need not wait for outcome data. You can reap most of the benefits by using common sense to select variables and the simplest possible rule to combine them…

‘Uncomfortable as people may be with the idea, studies have shown that while humans can provide useful input to formulas, algorithms do better in the role of final decision maker. If the avoidance of errors is the only criterion, managers should be strongly advised to overrule the algorithm only in exceptional circumstances.
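
Kahneman’s ‘reasoned rule’ recipe is simple enough to show directly. A minimal sketch for the loan-default example, with invented variables and figures; note that no outcome data is used anywhere, which is the whole point.

```python
# A minimal sketch of a "reasoned rule" for loan-default risk.
import numpy as np

applicants = np.array([
    # assets, liabilities, years_trading, missed_payments (invented)
    [500.0, 200.0, 12.0, 0.0],
    [ 80.0, 300.0,  2.0, 3.0],
    [250.0, 250.0,  6.0, 1.0],
])

# Signs set by common sense: more assets/years trading -> lower risk,
# more liabilities/missed payments -> higher risk.
signs = np.array([-1.0, +1.0, -1.0, +1.0])

# Standardise each variable so equal weights are meaningful, then sum.
z = (applicants - applicants.mean(axis=0)) / applicants.std(axis=0)
risk_score = (z * signs).sum(axis=1)
print(risk_score)  # higher score = higher predicted default risk
```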

Jim Simons is a mathematician and founder of the world’s most successful ‘quant fund’, Renaissance Technologies. While market prices appear close to random and are therefore extremely hard to predict, they are not quite random and the right models/technology can exploit these small and fleeting opportunities. One of the lessons he learned early was: Don’t turn off the model and go with your gut. At Renaissance, they trust models over instincts. The Bridgewater hedge fund led by Ray Dalio is similar. After near destruction early in his career, Dalio explicitly turned towards explicit model building as the basis for decisions combined with radical attempts to create an internal system that incentivises the optimisation of error-correction. It works.

*

People fail to learn from even the great examples of success and the simplest lessons

One of the most interesting meta-lessons of studying high performance, though, is that simply demonstrating extreme success does NOT lead to much learning. For example:

  • ARPA and PARC created the internet and PC. The PARC research team was an extraordinary collection of about two dozen people who were managed in a very unusual way that created super-productive processes extremely different to normal bureaucracies. XEROX, which owned PARC, had the entire future of the computer industry in its own hands, paid for by its own budgets, yet it let Bill Gates and Steve Jobs steal everything and then shut down the research team that did it. And then, as Silicon Valley grew on the back of these efforts, almost nobody, including most of the billionaires who got rich from the dynamics created by ARPA-PARC, studied the nature of the organisation and processes and copied it. Even today, those trying to do edge-of-the-art research in a similar way to PARC right at the heart of the Valley ecosystem are struggling for long-term patient funding. As Alan Kay, one of the PARC team, said, ‘The most interesting thing has been the contrast between appreciation/exploitation of the inventions/contributions [of PARC] versus the almost complete lack of curiosity and interest in the processes that produced them.’ ARPA survived being abolished in the 1970s but it was significantly changed and is no longer the freewheeling place that it was in the 1960s when it funded the internet. In many ways DARPA’s approach now is explicitly different to the old ARPA (the addition of the ‘D’ was a sign of internal bureaucratic changes).


  • ‘Systems management’ was invented in the 1950s and 1960s (partly based on wartime experience of large complex projects) to deal with the classified ICBM project and Apollo. It put man on the moon, then NASA largely abandoned the approach and reverted to being (relative to 1963-9) a normal bureaucracy. Most of Washington has ignored the lessons ever since — look for example at the collapse of ObamaCare’s rollout, after which Insiders said ‘oh, looks like it was a system failure, wonder how we deal with this’, mostly unaware that America had developed a successful approach to such projects half a century earlier. This is particularly interesting given that China also studied Mueller’s approach to systems management in Apollo and is, as we speak, copying it in projects across China. The EU’s bureaucracy is, like Whitehall, an anti-checklist to high level systems management — i.e they violate almost every principle of effective action.
  • Buffett and Munger are the most successful investment partnership in world history. Every year for half a century they have explained some basic principles, particularly concerning incentives, behind organisational success. Practically no public companies take their advice and all around us in Britain we see vast corporate looting and politicians of all parties failing to act — they don’t even read the Buffett/Munger lessons and think about them. Even when given these lessons to read, they won’t read them (I know this because I’ve tried).

Perhaps you’re thinking — well, learning from these brilliant examples might be intrinsically really hard, much harder than Cummings thinks. I don’t think this is quite right. Why? Partly because millions of well-educated and normally-ethical people don’t learn even from much simpler things.

I will explore this separately soon but I’ll give just one example. The world of healthcare unnecessarily kills and injures people on a vast scale. Two aspects of this are 1) a deep resistance to learning from the success of very simple tools like checklists and 2) a deep resistance to facing the fact that most medical experts do not understand statistics properly and their routine misjudgements cause vast suffering, while warped incentives encourage widespread lies about statistics and irrational management. E.g people are constantly told things like ‘you’ve tested positive for X therefore you have X’ and they then kill themselves. We KNOW how to practically eliminate certain sorts of medical injury/death. We KNOW how to teach and communicate statistics better. (Cf. Professor Gigerenzer for details. He was the motivation for including things like conditional probabilities in the new National Curriculum.) These are MUCH simpler than building ICBMs, putting man on the moon, creating the internet and PC, or being great investors. Yet our societies don’t do them.
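
To see how badly ‘you’ve tested positive for X therefore you have X’ can mislead, here is a worked example in the natural-frequency style Gigerenzer teaches; the numbers are illustrative, not from any real test.

```python
# Base-rate example with natural frequencies (illustrative numbers).
base_rate = 0.001      # 1 in 1,000 people actually have condition X
sensitivity = 0.99     # P(positive | have X)
false_pos_rate = 0.05  # P(positive | don't have X)

population = 100_000
have_x = population * base_rate                      # 100 people have X
true_pos = have_x * sensitivity                      # ~99 correctly flagged
false_pos = (population - have_x) * false_pos_rate   # ~4,995 wrongly flagged

p = true_pos / (true_pos + false_pos)
print(f"P(have X | positive) = {p:.1%}")  # ~1.9%, not 99%
```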

Why?

Because we do not incentivise error-correction and predictive accuracy. People are not incentivised to consider the cost of their noisy judgements. Where incentives and culture are changed, performance magically changes. It is the nature of the systems, not (mostly) the nature of the people, that is the crucial ingredient in learning from proven simple success. In healthcare like in government generally, people are incentivised to engage in wasteful/dangerous signalling to a terrifying degree — not rigorous thinking and not solving problems.

I have experienced the problem with checklists first hand in the Department for Education when trying to get the social worker bureaucracy to think about checklists in the context of avoiding child killings like Baby P. Professionals tend to see them as undermining their status, and bureaucracies fight against learning even when some great officials try really hard (as some in the DfE did, such as Pamela Dow and Victoria Woodcock). ‘Social work is not the same as an airline, Dominic.’ No shit. Airlines can handle millions of people without killing one of them because they align incentives with predictive accuracy and error-correction.

Some appalling killings are inevitable but the social work bureaucracy will keep allowing unnecessary killings because they will not align incentives with error-correction. Undoing flawed incentives threatens the system so they’ll keep killing children instead — and they’re not particularly bad people, they’re normal people in a normal bureaucracy. The pilot dies with the passengers. The ‘CEO’ on over £150,000 a year presiding over another unnecessary death despite constantly increasing taxpayers’ money pouring in? Issue a statement that ‘this must never happen again’, tell the lawyers to redact embarrassing cockups on the grounds of ‘protecting someone’s anonymity’ (the ECHR is a great tool to cover up death by incompetence), fuck off to the golf course, and wait for the media circus to move on.

Why do so many things go wrong? Because usually nobody is incentivised to work relentlessly to suppress entropy, never mind come up with something new.

*

We can see some reasonably clear conclusions from decades of study on expertise and prediction in many fields.

  • Some fields are like extreme sport or physics: genuine expertise emerges because of fast effective feedback on errors.
  • Abstracting human wisdom into models often works better than relying on human experts as models are often more consistent and less noisy.
  • Models are also often cheaper and simpler to use.
  • Models do not have to be complex to be highly effective — quite the opposite, often simpler models outperform more sophisticated and expensive ones.
  • In many fields (which I’ve explored before but won’t go into again here) low tech very simple checklists have been extremely effective: e.g flying aircraft or surgery.
  • Successful individuals like Warren Buffett and Ray Dalio also create cognitive checklists to trap and correct normal cognitive biases that degrade individual and team performance.
  • Fields make progress towards genuine expertise when they make a transition from stories (e.g Icarus) and authority (e.g ‘witch doctor’) to quantitative models (e.g modern aircraft) and evidence/experiment (e.g some parts of modern medicine/surgery).
  • In the intellectual realm, maths and physics are fields dominated by genuine expertise and provide a useful benchmark to compare others against. They are also hierarchical. Social sciences have little in common with this.
  • Even when we have great examples of learning and progress, and we can see the principles behind them are relatively simple and do not require high intelligence to understand, they are so psychologically hard and run so counter to the dynamics of normal big organisations, that almost nobody learns from them. Extreme success is ‘easy to learn from’ in one sense and ‘the hardest thing in the world to learn from’ in another sense.

It is fascinating how remarkably little interest there is in the world of politics/government, and social sciences analysing politics/government, about all this evidence. This is partly because politics/government is an anti-learning and anti-expertise field, partly because the social sciences are swamped by what Feynman called ‘cargo cult science’ with very noisy predictions, little good feedback and learning, and a lot of chippiness at criticism whether it’s from statistics experts or the ‘ignorant masses’. Fields like ‘education research’ and ‘political science’ are particularly dreadful and packed with charlatans but much of economics is not much better (much pro- and anti-Brexit mainstream economics is classic ‘cargo cult’).

I have found there is overwhelmingly more interest in high technology circles than in government circles, but in high technology circles there is also a lot of incredulity and naivety about how government works — many assume politicians are trying and failing to achieve high performance and don’t realise that in fact nobody is actually trying. This illusion extends to many well-connected businessmen who just can’t internalise the reality of the apex of power. I find that uneducated people on £20k living hundreds of miles from SW1 generally have a more accurate picture of daily No10 work than extremely well-connected billionaires.

This is all sobering and is another reason to be pessimistic about the chances of changing government from ‘normal’ to ‘high performance’ — but, pessimism of the intellect, optimism of the will…

If you are in Whitehall now watching the Brexit farce or abroad looking at similar, you will see from page 26 HERE a checklist for how to manage complex government projects at world class levels (if you find this interesting then read the whole paper). I will elaborate on this. I am also thinking about a project to look at the intersection of (roughly) five fields in order to make large improvements in the quality of people, ideas, tools, and institutions that determine political/government decisions and performance:

  • the science of prediction across different fields (e.g early warning systems, the Tetlock/IARPA project showing dramatic performance improvements),
  • what we know about high performance (individual/team/organisation) in different fields (e.g China’s application of ‘systems management’ to government),
  • technology and tools (e.g Bret Victor’s work, Michael Nielsen’s work on cognitive technologies, work on human-AI ‘minotaur’ teams),
  • political/government decision making affecting millions of people and trillions of dollars (e.g WMD, health), and
  • communication (e.g crisis management, applied psychology).

Progress requires attacking the ‘system of systems’ problem at the right ‘level’. Attacking the problems directly — let’s improve policy X and Y, let’s swap ‘incompetent’ A for ‘competent’ B — cannot touch the core problems, particularly the hardest meta-problem that government systems bitterly fight improvement. Solving the explicit surface problems of politics and government is best approached by a more general focus on applying abstract principles of effective action. We need to surround relatively specific problems with a more general approach. Attack at the right level will see specific solutions automatically ‘pop out’ of the system. One of the most powerful simplicities in all conflict (almost always unrecognised) is: ‘winning without fighting is the highest form of war’. If we approach the problem of government performance at the right level of generality then we have a chance to solve specific problems ‘without fighting’ — or, rather, without fighting nearly so much and the fighting will be more fruitful.

This is not a theoretical argument. If you look carefully at ancient texts and modern case studies, you see that applying a small number of very simple, powerful, but largely unrecognised principles (that are very hard for organisations to operationalise) can produce extremely surprising results.

How to jump from the Idea to Reality? More soon…


Ps. Just as I was about to hit publish on this, the DCMS Select Committee released their report on me. The sentence about the Singapore golf club at the top comes to mind.