On the referendum #24H: Facebook, data science, technology, elections, and transparency

This blog has two short parts: A) a simple point about Wednesday’s committee hearing, B) some interesting evidence from a rare expert on the subject of data and campaigns, and a simple idea to improve regulation of elections. (And a PS. on hack Jane Merrick spreading more fake news.) There is a very short UPDATE re Facebook posted the next day, highlighted in BOLD below.

A. Re Wednesday’s Select Committee and Facebook letter

Correspondence from Facebook was published and used by the Committee to suggest that Vote Leave/AIQ have lied about when they started working together.

Henry de Zoete was introduced to AIQ on 31 March 2016. (This is all clear in emails that I think have been given to the Electoral Commission — if not they easily could be.)

AIQ did zero work for VL before then and, obviously, did not have access to VL’s Facebook page before we had even spoken to them.

If Facebook is saying that AIQ was running ads for VL in February 2016, then Facebook is wrong. [UPDATE: actually, if you read Facebook’s letter carefully, they correct their own error in a table where they use the timeframe for AIQ activity of “15 April – 23 June”. “15 April” of course fits with the date of VL’s introduction to AIQ I gave in this blog, and is the first day of the official campaign. The MPs either didn’t read the letter properly or chose to use the date which gave them a news story.]

VL was running stuff on FB in February as Facebook says. But this was done by us, NOT by AIQ.

Probably Facebook has looked at the VL FB page, seen activity in February, seen AIQ doing stuff shortly after and wrongly concluded that the earlier activity was also done by AIQ. It wasn’t and any further investigations will show this.

This isn’t actually important as regards the legal claims and the EC investigation but I make the point in the interests of trying to clarify FACTS: so far the fake news inquiry has spread fake news around the world and clarified little. Also note how the Committee drops correspondence on the day of the hearing to maximise their chances of creating embarrassing moments for witnesses. This is the behaviour of people happy to see false memes spread, not the behaviour of truth-seeking MPs.

The Committee is now threatening me with ‘contempt of Parliament’. Their behaviour in seeking headlines rather than cooperating with witnesses over dates for evidence is the sort of behaviour that has increased the contempt of the public for MPs over the last 20 years, which of course contributed to the referendum result. The Committee doesn’t understand Vote Leave. We had to deal with threats from MPs every day for a year, including from the PM/Chancellor and their henchmen who could actually back up serious threats. We ignored that. Why would you think we’re going to worry about EMPTY threats? If you think I care about ‘reputational damage’, you are badly advised.

B. Rare expertise on the subject of data and elections from Eitan Hersh to US Senate

Eitan Hersh wrote a book in 2015 called Hacking the Electorate. It’s pretty much the best book I’ve seen on the use of data science in US elections and what good evidence shows works and does not work.

As I wrote after the referendum, we tried hard in Vote Leave to base decisions on the best EVIDENCE for what works in campaigns and we spent time tracking down a wide variety of studies. Usually in politics everything is done on hunches. Inevitably, the world of ‘communications’ / PR / advertising / marketing is full of charlatans flogging snake oil. It is therefore very easy to do things and spend money just because it’s conventional. Because we were such a huge underdog we had to take some big gambles and we wanted to optimise the effectiveness of our core message as much as possible — if you know the science, you can focus more effectively. The constraints of time, money, and the appalling in-fighting meant we never pushed this nearly as far as I wanted but we tried hard.

For example, one of the few things about advertising which seems logical and has good evidence to support it is — try to get your message in front of people as close to the decision point as possible. That’s why we spent almost the whole campaign testing things (via polls, focus groups, online etc) then dropped most of our marketing budget in the last few days of the campaign. Similarly, Robert Cialdini wrote one of the few very good books on persuasion — Influence — and ideas from that informed how we wrote campaign materials. We were happy to take risks and look stupid. We came across a study where researchers had used as a control a leaflet with zero branding only to find, much to their surprise if I remember right, that it worked much better than all the other examples. We therefore experimented with leaflets stripped of all branding (‘The Facts’) which unleashed another wave of attacks from SW1 (‘worst thing I’ve seen in politics, amateur hour’ etc), but sure enough in focus groups people loved it (the IN campaign clearly found the same because they started copying this).

Of course, all sorts of decisions could not be helped by reliable evidence. But it is a much healthier process to KNOW when you’re taking a punt. Most political operations — and government — don’t try to be rigorous about decision-making or force themselves to think about what they know with what confidence. They are dominated by seniority, not evidence. Our focus on evidence was connected to creating a culture in which people could say to senior people ‘you’re wrong’. This is invaluable. I made many awful mistakes but was mostly saved from the consequences because we had a culture in which people could say ‘you’re wrong’ and fix them fast.

This is relevant to Hersh’s evidence and the conspiracy theories…

Hersh’s evidence should be read by everybody interested in the general issues of data and elections and the recent conspiracy theories in particular. I won’t go into these conspiracies again.

Here are some quotes…

‘Based on the information I have seen from public reports about Cambridge Analytica, it is my opinion that its targeting practices in 2016 ought not to be a major cause for concern in terms of unduly influencing the election outcome…

‘In every election, the news media exaggerate the technological feats of political campaigns…

‘The latest technology used by the winning campaign is often a good storyline, even if it’s false. Finally, campaign consultants have a business interest in appearing to offer a special product to future clients, and so they are often eager to embellish their role in quotes to the media…

‘I found that commercial data did not turn out to be very useful to campaigns. Even while campaigns touted the hundreds or thousands of data points they had on individuals, campaigns’ predictive models did not rely very much on these fields. Relative to information like age, gender, race, and party affiliation, commercial measures of product preferences did not add very much explanatory power about Americans’ voting behavior…

‘Many commercial fields simply are not highly correlated with political dispositions. And even those that are might not provide added information to a campaign’s predictive models…

‘Nearly everything Mr. Nix articulates here [in a video describing CA’s methods] is not new. Based on what we know from past work, it is also likely to have been ineffective. Cambridge Analytica’s definition of a persuadable voter is someone who is likely to vote but the campaign isn’t sure who they will vote for. This is a common campaign convention for defining persuadability. It also bears virtually no relationship to which voters are actually persuadable, undecided, or cross-pressured on issues, as I discuss in Hacking the Electorate… Cambridge Analytica’s strategy of contacting likely voters who are not surely supportive of one candidate over the other but who support gun rights and who are predicted to bear a particular personality trait is likely to give them very little traction in moving voters’ opinions. And indeed, I have seen no evidence presented by the firm or by anyone suggesting the firm’s strategies were effective at doing this…

‘As many journalists have observed, building a psychological profile by connecting Facebook “likes” to survey respondents who took a personality test would lead to inaccurate predictions. Facebook “likes” might be correlated with traits like openness and neuroticism, but the correlation is likely to be weak. The weak correlation means that the prediction will have lots of false positives…

‘In campaign targeting models I have studied, predictions of which voters are black or Hispanic are wrong about 25-30% of the time. Models of traits such as issue positions or personality traits are likely to be much less accurate. They are less accurate because they are less stable and because available information like demographic correlates and Facebook “likes” are probably only weakly related to them…

‘In a series of experiments, a colleague and I found that voters penalize candidates for mis-targeting such that any gains made through a successful target are often canceled out by losses attributable to mistargets…

‘I am skeptical that Cambridge Analytica manipulated voters in a way that affected the election…

[Hersh then says ‘The skepticism I offer comes with a high degree of uncertainty’ and describes some of the gaps in what we know about such things. He also calls on Facebook to make its data available to researchers.]

‘News, both real and fake, is disseminated among users because it feels good to share. The kinds of news and content that often piques our interest appeals to our basest instincts; we are drawn to extremism, provocation, and outrage.’
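Hersh’s point about weak correlations and false positives can be made concrete with a little arithmetic. The sketch below is mine, not his: the function name and all the numbers are hypothetical, chosen only to show how a model that sounds fairly accurate can still leave most of a target list mis-targeted when the trait is uncommon.

```python
def targeting_precision(base_rate: float, sensitivity: float, specificity: float) -> float:
    """Share of flagged voters who really have the trait.

    base_rate:   fraction of all voters who have the trait
    sensitivity: fraction of voters with the trait that the model flags
    specificity: fraction of voters without the trait that the model leaves alone
    """
    true_positives = base_rate * sensitivity
    false_positives = (1.0 - base_rate) * (1.0 - specificity)
    return true_positives / (true_positives + false_positives)


# Hypothetical inputs (NOT figures from Hersh's testimony): 20% of voters
# have the trait and the model is right 70% of the time for both groups.
precision = targeting_precision(base_rate=0.20, sensitivity=0.70, specificity=0.70)
print(f"{precision:.0%} of targeted voters actually have the trait")
```

Under those assumed numbers only about 37% of the people on the target list have the trait, so most of the tailored messages land on the wrong voters, which is exactly where Hersh’s mis-targeting penalty bites.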

Transparency — two simple ideas to improve things

In the last section Hersh discusses some broad points about transparency and social media. These things are important as I said after the referendum. Sadly, the focus on conspiracy theories has diverted the media and MPs away from serious issues.

I have zero legal responsibility for Vote Leave now — I ceased to be a director as part of our desperate rearguard action during the coup that kicked off on 25 January 2016. But I wouldn’t mind if Facebook wanted to take ALL of Vote Leave’s Facebook data that may be still sitting in ad manager etc — data normally considered very sensitive and never published by campaigns — and put the whole lot on its website available for download by anybody (excluding personal data so no individuals could be identified, which presumably would be illegal).

Why?

In principle I agree with Hersh and think serious academic scrutiny would be good.
In the interests of the VL team, it would prove what I have been saying and prove aspects of the conspiracy theories wrong. We never saw/used/wanted the data improperly acquired by CA. We did practically no ‘microtargeting’ in the normal sense of the term and zero using so-called ‘psychographics’ for exactly the reason described above — we tried to base decisions on good evidence and the good evidence from experts like Hersh was that it was not a good use of time and money. We focused on other things.

Here is another idea.

Why not have a central platform (managed by a much-reformed and updated Electoral Commission with serious powers) and oblige all permitted participants in elections to upload samples of all digital ads to this platform (say daily?) for public inspection by anybody who wants to look. After the election, further data on buy size, audience etc could be made automatically available alongside each sample. This would add only a tiny admin burden to a campaign but it would ensure that there is a full and accurate public record of digital campaigning.
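To show how small the technical core of such a platform could be, here is a minimal sketch. Everything in it is an assumption for illustration: the class names, the fields and the in-memory storage are hypothetical, and nothing like this exists at the Electoral Commission as far as I know. A campaign hands over the ad, the platform fingerprints and time-stamps it, and anybody can list what a campaign has registered.

```python
import hashlib
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class AdRecord:
    """One registered ad sample (hypothetical schema)."""
    campaign: str
    sha256: str            # fingerprint of the uploaded creative
    uploaded_at: datetime  # set by the platform, not by the campaign


@dataclass
class AdRegistry:
    """In-memory stand-in for the central platform."""
    records: list[AdRecord] = field(default_factory=list)

    def upload(self, campaign: str, ad_bytes: bytes) -> AdRecord:
        # The platform computes the hash and the timestamp itself,
        # so a campaign cannot backdate or quietly alter a sample.
        record = AdRecord(
            campaign=campaign,
            sha256=hashlib.sha256(ad_bytes).hexdigest(),
            uploaded_at=datetime.now(timezone.utc),
        )
        self.records.append(record)
        return record

    def for_campaign(self, campaign: str) -> list[AdRecord]:
        # Public inspection: list everything a campaign has registered.
        return [r for r in self.records if r.campaign == campaign]


registry = AdRegistry()
registry.upload("Campaign A", b"<bytes of ad image 1>")
registry.upload("Campaign B", b"<bytes of ad image 2>")
print(len(registry.for_campaign("Campaign A")))
```

A real system would need authentication, durable storage and a public front end, but the obligation on the campaign itself is a single upload call per ad.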

Of course, this idea highlights an obvious point — there has never been any requirement on the parties to do this with paper documents. Part of the reason for the rage against Vote Leave in SW1 is that the referendum victory was something done to SW1 and the parties, not something done by them, hence partly their scrutiny of our methods. (This is also partly why the MPs are struggling so much to get to grips with the consequences.) There are no silver bullets but this simple measure would do some good and I cannot see a reasonable objection. Professional campaigners and marketers would hate this as they profit from a lack of transparency and flogging snake oil but their concerns should be ignored. Will the parties support such transparency for themselves in future elections?

One of the many opportunities of Brexit, as I’ve said before, comes in how we regulate such things. American law massively reflects the interests of powerful companies. EU law, including GDPR, is a legal and bureaucratic nightmare. The UK has, thanks to Brexit, a chance to regulate data better than either. This principle applies to many other fields, from CRISPR and genetic engineering to artificial intelligence and autonomous vehicles, which in the EU will be controlled by the ECJ interpreting the Charter of Fundamental Rights (and be bad for Europe’s economies and democracies). MPs could usefully consider these great opportunities instead of nodding along as officials do their best to get ministers to promise to maintain every awful set of EU rules until judgement day.

The issue of data-technology-elections is going to become more and more important fast. While the field is dominated by charlatans, it is clear that there is vast scope for non-charlatans to exploit technology and potentially do things far more effective, and potentially dangerous for democracy, than CA has claimed (wrongly) to do. Having spent some time in Silicon Valley since the referendum, it is obvious to me that it is/will be possible to have a decisive impact on a UK election using advanced technology. The limiting factors will be cash and a very small number of highly able people: i.e. an operation to change an election could scale very effectively and stay hidden to a remarkable degree. The laws are a joke. MPs haven’t mastered the 70 year old technology of TV. How do you think they’d cope with people using tools like Generative Adversarial Networks (GANs), never mind what will be available within five years? The gaps in technical skill between commercial fields are extreme and getting wider as the west coast of America and coastal China suck in people with extreme skills. Old media companies already cannot compete with the likes of Google and the skill gaps, and their consequences, grow every day.

But but but — technology alone will very rarely be the decisive factor: ‘people, ideas, and machines — in that order’, shouted Colonel Boyd at audiences, and this will remain true until/unless the machines get smarter than the people. The most important thing for campaigns (and governments) to get right is how they make decisions. If you do this right, you will exploit technology successfully. If you don’t — like the Tories in 2017 who created a campaign organisation violating every principle of effective action — no advantage in technology or cash will save you. And to get this right, you should study examples from the ancient world to modern projects like ARPA-PARC and Apollo (see here).

Anyway, I urge you to read Hersh’s evidence and ponder his warnings at the end; it will only take ~15 minutes. If interested, I also urge you to read some of the work by Rand Waltzman who ran a DARPA project on technology and social media. He has mostly been ignored in Washington as far as I can see but he should not be. He would be one of the most useful people in the world for MPs and hacks interested in these issues to speak to.

https://www.eitanhersh.com/uploads/7/9/7/5/7975685/hersh_written_testimony_senate_judiciary.pdf

PS. I’ve just been sent a blog on The Times website by Jane Merrick. It includes this regarding the latest odd news about a C4 drama:

‘Yet as with Mandelson, Cummings seems to complain about everything that is ever written about him, and so his reaction from his Twitter account — @odysseanproject (don’t ask) — was this: “What’s the betting this will be a Remain love-in and dire.” Oh how humbly he does brag!

I’ve had hacks email me asking me to ‘defend’ things on that Twitter account.

1. That is not my twitter account — it is a fake account. It’s interesting how many hacks complain about fake news while spreading it themselves. If you’re going to make claims about anonymous Twitter accounts (as she does elsewhere in her blog), try not to get confused by obvious parodies.

2. She also doesn’t mention that her husband, Toby Helm, was the SW1 equivalent of the guy in Scream chasing me and Henry de Zoete around Westminster for two years with a carving knife and a scream mask. The Observer promised the lobby I’d be marched out of the DfE in handcuffs. Nothing happened. Why? Because hate clouded their judgement, they botched the facts, and their claims were bullshit. Sound familiar?

[Update: The Times has cut that passage from the blog.]

5 thoughts on “On the referendum #24H: Facebook, data science, technology, elections, and transparency”

Tom W Huxley | May 18, 2018 at 22:08

“Why not have a central platform (managed by a much-reformed and updated Electoral Commission with serious powers) and oblige all permitted participants in elections to upload samples of all digital ads to this platform (say daily?) for public inspection by anybody who wants to look. After the election, further data on buy size, audience etc could be made automatically available alongside each sample. This would add only a tiny admin burden to a campaign but it would ensure that there is a full and accurate public record of digital campaigning.”

At what level of election campaigning would this apply? Having just gone through a local election cycle I can see this requirement being a nightmare on small and one-man operations; we had enough trouble creating the adverts to begin with (alongside all the other stuff we had to do) to target these tiny and inconveniently-shaped wards, by ourselves (the party is no help at all with this kind of thing, despite promises to the contrary, and likely never will be). Having to submit samples to the Electoral Commission *every day* on top would just mean we likely wouldn’t have the capacity to do any Facebook advertising without breaking electoral law.

- dominiccummings | May 22, 2018 at 10:04
  
  Hi Tom
  I was thinking national, not local.
  I don’t think you’re right re admin burden.
  Having created ad X, in my system you’d literally just drag a copy of it over to the EC platform and drop it in your registered ‘folder’ so to speak, the platform would time stamp it.
  I.e. I think it would and should take someone no more than say 30 secs to fulfil this obligation, it would be about as simple as dragging a picture and dropping it in a folder on your laptop. With FB’s cooperation you could (I think) automate this step and just have it post automatically to the EC platform. Perhaps if the UK did this then FB would then extend the feature globally, no downside for them really and might take some heat off them.
  Arguably registering further information would be more of a burden and I don’t know if that’s necessary. If a copy of every pic is registered then if there are disputes the EC could always ask for further details. That’s probably a better approach than what I suggested in blog.
  Cheers
  Dom
  
  - Tom W Huxley | May 22, 2018 at 19:31
    
    In principle that could be really easy but in practice I have my doubts that Facebook and the EC (mainly the EC) would cooperate to make that work so smoothly!
    
    - dominiccummings | May 22, 2018 at 19:54
      
      For sure that’s an issue, like any simple idea it could be turned into a nightmare.
      d
      
Pingback: On the referendum #33: High performance government, ‘cognitive technologies’, Michael Nielsen, Bret Victor, & ‘Seeing Rooms’ – Dominic Cummings's Blog
