On the referendum #24D: Walter Mitty, CA, and the Guardian/Observer’s own ‘personal data harvesting’

[Update: Re the Guardian/Observer doing what Cambridge Analytica did (i.e. harvesting personal data not just from person X who used their app but also from all their friends, just like the central allegation vs CA!): shouldn’t the Guardian/Observer have to inform all those they did this to a) that they did it, b) whether they still ‘hold’ all this data in the ICO’s definition, c) what use they are making now of this harvested data? And if they now believe such behaviour is evil, will they destroy all such data they gathered via their Facebook app? The Guardian/Observer probably holds considerably more personal data harvested via Facebook than Cambridge Analytica ever did… Wylie pointed out yesterday, rightly, the lack of powers for the ICO. If the Guardian/Observer just deleted all this dodgy data today, then, as Wylie said, we would be none the wiser. Presumably Carole will push internally for ‘full transparency’?

Further, and wouldn’t this be the irony of all ironies, the Guardian itself uses ‘behavioural targeting’ and shares this data with advertisers. Did they supply CA with any data?! It seems the Guardian/Observer never checked out what they are doing themselves before they unleashed this virus of a story…]

Pic: screenshot (28 March 2018)

Hugo is half right but not in the way he thinks…

The most obvious point about the whistleblower story is that the one thing Carole has undoubtedly done is provide good evidence for something Vote Leave knew in 2015: it would be lunacy for Vote Leave to ally with Arron Banks and Cambridge Analytica. Unfortunately for Carole, her global conspiracy depends on claiming that I was secretly coordinating everything with Banks all along, which is one of the reasons the story has fizzled out. The lobby knows this is untenable.

Wylie made many allegations during his testimony to the DCMS committee. Most had nothing to do with Vote Leave so I won’t comment. Regarding Vote Leave/BeLeave, on issue after issue it’s the same story with the whistleblowers: on the BeLeave bank account, expenses, who set up what and when, the ‘shared drive’ (which I think they have lied to the Observer and its lawyers about) and so on, their stories change and they will be shown to be either mistaken or lying. The Electoral Commission gave us written permission to donate to BeLeave: this is a fact supported by documents presented in the High Court, though the media keeps writing that I ‘claim’ this. VL staff and Darren Grimes behaved reasonably in trying to strike the right balance between cooperating in certain ways, which we were legally allowed and obliged to do, and ‘coordinating’ in the legal sense, which is very opaque but which we continue to believe we did not do (see long blog for details).

I will explore one of Wylie’s central claims about data that goes to the heart of the VL angle of this story. The issues around Facebook, data, targeting etc are partly quite technical. If they are to investigate such issues properly then the MPs need expert support or they are wide open to charlatans. I am very far from an expert but I’ll try to explain why one of Wylie’s central claims should not be believed.

As reported by the BBC:

‘[Wylie] said he was sure Aggregate IQ had drawn on Cambridge Analytica databases during the referendum, saying it “baffled” him how a firm in the UK for only a couple of months had “created a massive targeting operation” without access to data.’

His claim was that Vote Leave must have used the Cambridge Analytica data, passed through AIQ, to win the referendum. Wylie, describing the alleged links between Cambridge Analytica, AIQ and the specific dataset, said:

“You can’t do online targeting if you don’t have access to the database. You just can’t” [11:49:50]

There are two very clear problems with this story.

1. The Facebook data is on US voters so would have been useless in the referendum.
2. Far from being impossible, it is actually incredibly easy to set up totally legitimate/lawful targeting on Facebook without any electoral data, as the Guardian/Observer knows, because it does it itself and runs ‘masterclasses’ teaching people how to do it.

And, amusingly, it turns out that the Guardian/Observer itself was ‘harvesting personal data’ via its own Facebook app! Did Carole know and when did she know it?!

Problem 1: The FB data was on US voters

In his testimony, and in the extensive reporting on the subject, it has been made very clear that the Facebook data was specific to US voters.

As Paul-Olivier Dehaye said in the Committee meeting with Wylie, describing the dataset:

“A few hundred plus millions of Americans’ whose data is being processed by this company” [13:26:50]

Or as the Guardian itself reports:

“harvested millions of Facebook profiles of US voters”

However, data on US voters was irrelevant to the referendum and obviously wasn’t used. And Wylie accidentally undermined his own claim. When under pressure on his own possibly illegal use of the FB data, Wylie suddenly blurted out the truth:

“No, because I didn’t have any UK data, I couldn’t physically offer” [to use the Cambridge Analytica dataset for Vote Leave] [13:44:50]

Which is true. There was no UK dataset and the US dataset was never used by Vote Leave or any other party in the referendum. Any responsible journalist should stop claiming anything to the contrary.

Problem 2: It’s extremely easy to set up completely legitimate targeting on Facebook

Wylie claims that:

“it “baffled” him how a firm in the UK for only a couple of months had “created a massive targeting operation” without access to data.” [BBC]

Or in an extended quote from Wylie:

“Strongly encourage looking at this question of where did they get the data? When I met with Dom Cummings, in Nov 2015, one of the things that was apparent is that Vote Leave at the time actually didn’t have any data. That’s in November 2015. Dom Cummings, in part, wanted to meet with me because he was really interested in Cambridge Analytica. He wanted to create the quote-unquote Palantir for politics. But it became apparent that if you don’t even have the electoral register, let alone a social database, you can’t really do this, or you can’t do it legally.” [11:21:50 to 11:24:00]

However, almost any company that does any online marketing could easily explain to the committee how easy it is to do completely legitimate targeting. In fact, the Guardian itself can train them how to do this in its own masterclass on social media and digital targeting!

Pic: Guardian masterclass in online marketing


It is total rubbish to suggest that this is difficult for a political campaign, let alone impossible. It can be done, start to finish, in less than an hour by someone who knows what they are doing. When I go and give evidence to the DCMS, which sadly will have to await multiple legal actions, I could demo why they were lied to by getting someone to start from scratch on a laptop at the start of the session and shouting ‘finished’ when the DCMS is running its own targeted Facebook ad campaign with zero use of electoral data.

What actually happened

As has been discussed publicly, what actually happened is relatively simple. Through a combination of focus groups and polling, we were aware that the people we wanted to reach were in particular demographic categories, basically ‘between 35-55, outside London and Scotland, excluding UKIP supporters and associated characteristics, and some other criteria’. We created ads, mainly focussed on the NHS, that AIQ put onto Facebook. These were targeted at this very broad segment of society, completely legitimately and with no use of American voter data (obviously!), to reach about 20% of the voting population of the UK. Our use of so-called micro-targeting was minimal. Further, we made ZERO use of so-called ‘psychographic’ marketing because our campaign was informed by looking at what serious science suggests works, and Big5/OCEAN profiling for politics is very marginal (and expensive) at best.
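To make the point concrete, here is a minimal sketch in Python of what a broad segment like this amounts to: a handful of criteria handed to the ad platform, with no voter database held by the advertiser. Everything in it is hypothetical. The `AudienceSpec` class, its field names and the sample profiles are invented for illustration and are not Facebook's advertising API, and the 'associated characteristics' and 'other criteria' mentioned above are deliberately left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AudienceSpec:
    """Hypothetical description of a broad demographic segment.

    Illustrative only: not a real advertising-platform API.
    """
    age_min: int = 35
    age_max: int = 55
    excluded_regions: frozenset = frozenset({"London", "Scotland"})
    excluded_interests: frozenset = frozenset({"UKIP"})

    def matches(self, age: int, region: str, interests: set) -> bool:
        """Return True if a (made-up) profile falls inside the segment."""
        if not (self.age_min <= age <= self.age_max):
            return False
        if region in self.excluded_regions:
            return False
        # Exclude anyone whose interests overlap the excluded set.
        return not (self.excluded_interests & set(interests))


spec = AudienceSpec()

# Made-up profiles, standing in for users the platform already knows about.
print(spec.matches(42, "Yorkshire", {"NHS"}))  # inside the segment
print(spec.matches(42, "London", set()))       # excluded by region
print(spec.matches(50, "Wales", {"UKIP"}))     # excluded by interest
```

The design point is that the advertiser supplies only the criteria; the matching against real users happens on the platform's side, which is why no electoral register or third-party dataset is needed.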

Our best tools were not super-sophisticated digital targeting — the supposed Jedi mind-bending superpowers that Carole thinks we have — but 1) learning from books thousands of years old about how to manage complex operations and 2) listening hard to the public rather than the pundits.

Conclusion

We have two competing claims.

From Chris Wylie, that it is almost impossible for Vote Leave to have done targeting without access to the Cambridge Analytica dataset.

From Vote Leave, a claim that this is not only possible, it is trivially simple.

If you want to decide between these two, we suggest that you sign up to the Guardian/Observer masterclass on targeting and see for yourself!

There is an amusing kicker. The Guardian/Observer themselves run targeted ads on Facebook.

Pic: The Guardian/Observer targeted Facebook ads


As previously discussed, Chris Wylie comments that “You can’t do online targeting if you don’t have access to the database. You just can’t”. He also states on the record that he had access to the complete Cambridge Analytica data, and gives no concrete evidence that he deleted it. While this data was useless for Vote Leave, which didn’t operate in the US, it would be useful for the Guardian, which does operate there. If you believe Wylie’s testimony, you might conclude that the only way the Guardian could be doing online targeting is through access to Wylie’s version of the Cambridge Analytica dataset!

Of course, I don’t believe this. I believe that the Guardian, just like almost every other company doing digital advertising, and just like Vote Leave, is operating completely within the law. But to spare itself further embarrassment over its Walter Mitty whistleblower, the Guardian should admit that it’s possible to do targeting without access to the Cambridge Analytica dataset, just as it does itself. And of course this directly conflicts with one of the many bullshit claims by Wylie about the Cambridge Analytica story.

Could I have an answer on the record, please, Carole?

Further, in another amusing irony, check out the Guardian/Observer’s own iterating privacy policy — or should we say, ‘anti-privacy policy’?! — on its Facebook app. Yup, they were themselves doing just what they are claiming (falsely) Vote Leave exploited and what they say is destroying democracy — harvesting personal data via their app! Nice work Carole, but please remember — Vote Leave never stooped so low!

Pic: screenshot (28 March 2018)

Finally, in another extreme irony alert… Fair Vote, the vehicle for Blair and Osborne to attack Vote Leave and campaign for a second referendum (funded by??) is … wait for it, chasing people around the internet recruiting supporters USING TARGETED ADS!

Pic: screenshot (28 March 2018)

Dear Observer, Channel 4, Fair Vote, you are an absolute bunch of charlatans — start getting ready for Brexit, your weak Zoolander story is going nowhere.

For the avoidance of any doubt, as I have said many times for over a year…

VL did not work with Cambridge Analytica directly or indirectly.
We never had, or sought access to, the FB data in question.
Microtargeting is an important issue but played practically no part in the VL campaign.

PS. Also, it’s not my job to protect Facebook, but hacks keep circulating a video of Zuckerberg saying to the BBC that he would not ‘sell people’s data’, with comments to the effect that ‘what a liar Zuck is’ etc. Practically nobody seems to realise that Facebook did not sell that data! Facebook consistently put user experience ahead of revenue (which is partly why it has flourished while competitors blew up) and its business model does not involve selling personal data, as per this clip.

As I suggested a year ago, newspapers need to hire specialists who actually understand these issues to advise their political reporters/pundits. There are real issues about data/elections/platforms, but pundits make sensible debate harder when they accuse campaigns and Facebook of things they never did. Stick to the facts, guys; that’s tricky enough to deal with…

If you want to read someone who actually understands this story, ignore people like Hugo Rifkind and read this.

Another excellent piece, one of the best I’ve read on CA snake oil.

Some links

http://www.bbc.co.uk/news/uk-politics-43558876

https://www.theguardian.com/guardian-masterclasses/2017/feb/15/how-to-develop-a-social-media-strategy-for-your-retail-business-digital-course

https://www.theguardian.com/guardian-masterclasses/2015/jul/03/advanced-social-media-for-businesses

https://www.theguardian.com/news/2018/mar/17/cambridge-analytica-facebook-influence-us-election

https://dominiccummings.com/wp-content/uploads/2017/01/20170130-referendum-22-numbers.pdf

https://parliamentlive.tv/Event/Index/28e9cccd-face-47c4-92b3-7f2626cd818e

LA Times piece with lots of comments from various people about why CA is snake oil.