Thursday, May 14, 2015

Open letter to Google on RTBF processes

I'm one of 80 signatories to an open letter to Google requesting more transparency from the company over how it processes so-called ‘right to be forgotten’ requests. The letter was drafted and coordinated by Julia Powles at Cambridge University and Ellen P. Goodman at Rutgers University School of Law.

Copy below.
"What We Seek
Aggregate data about how Google is responding to the >250,000 requests to delist links thought to contravene data protection from name search results. We should know if the anecdotal evidence of Google’s process is representative: What sort of information typically gets delisted (e.g., personal health) and what sort typically does not (e.g., about a public figure), in what proportions and in what countries?
Why It’s Important
Google and other search engines have been enlisted to make decisions about the proper balance between personal privacy and access to information. The vast majority of these decisions face no public scrutiny, though they shape public discourse. What’s more, the values at work in this process will/should inform information policy around the world. A fact-free debate about the RTBF is in no one’s interest.
Why Google
Google is not the only search engine, but no other private entity or Data Protection Authority has processed anywhere near the same number of requests (most have dealt with several hundred at most). Google has by far the best data on the kinds of requests being made, the most developed guidelines for handling them, and the most say in balancing informational privacy with access in search. We address this letter to Google, but the request goes out to all search engines subject to the ruling.

One year ago, the European Court of Justice, in Google Spain v AEPD and Mario Costeja González, determined that Google and other search engines must respond to users’ requests under EU data protection law concerning search results on queries of their names. This has become known as the Right to Be Forgotten (RTBF) ruling. The undersigned have a range of views about the merits of the ruling. Some think it rightfully vindicates individual data protection/privacy interests. Others think it unduly burdens freedom of expression and information retrieval. Many think it depends on the facts.
We all believe that implementation of the ruling should be much more transparent for at least two reasons: (1) the public should be able to find out how digital platforms exercise their tremendous power over readily accessible information; and (2) implementation of the ruling will affect the future of the RTBF in Europe and elsewhere, and will more generally inform global efforts to accommodate privacy rights with other interests in data flows.
Google reports that it has received over 250,000 individual requests concerning one million URLs in the past year. It also reports that it has delisted from name search results just over 40% of the URLs that it has reviewed. In various venues, Google has shared some 40 examples of delisting requests granted and denied (including 22 examples on its website), and it has revealed the top sources of material requested to be delisted (amounting to less than 8% of total candidate URLs). Most of the examples surfaced more than six months ago, with minimal transparency since then. While Google’s decisions will seem reasonable enough to most, in the absence of real information about how representative these are, the arguments about the validity and application of the RTBF are impossible to evaluate with rigour.
Beyond anecdote, we know very little about what kind and quantity of information is being delisted from search results, what sources are being delisted and on what scale, what kinds of requests fail and in what proportion, and what are Google’s guidelines in striking the balance between individual privacy and freedom of expression interests.
The RTBF ruling addresses the delisting of links to personal information that is “inaccurate, inadequate, irrelevant, or excessive for the purposes of data processing,” and which holds no public interest. Both opponents and supporters of the RTBF are concerned about overreach. Because there is no formal involvement of original sources or public representatives in the decision-making process, there can be only incidental challenges to information that is delisted, and few safeguards for the public interest in information access. Data protection authorities seem content to rely on search engines’ application of the ruling’s balancing test, citing low appeal rates as evidence that the balance is being appropriately struck. Of course, this statistic reveals no such thing. So the sides do battle in a data vacuum, with little understanding of the facts — facts that could assist in developing reasonable solutions.
Peter Fleischer, Google Global Privacy Counsel, reportedly told the 5th European Data Protection Days on May 4 that, “Over time, we are building a rich program of jurisprudence on the [RTBF] decision.” (Bhatti, Bloomberg, May 6). It is a jurisprudence built in the dark. For example, Mr. Fleischer is quoted as saying that the RTBF is “about true and legal content online, not defamation.” This is an interpretation of the scope and meaning of the ruling that deserves much greater elaboration, substantiation, and discussion.
We are not the only ones who want more transparency. Google’s own Advisory Council on the RTBF in February 2015 recommended more transparency, as did the Article 29 Working Party in November 2014. Both recommended that data controllers should be as transparent as possible by providing anonymised and aggregated statistics as well as the process and criteria used in delisting decisions. The benefits of such transparency extend to those who request that links be delisted, those who might make such requests, those who produce content that is or might be delisted, and the wider public who might or do access such material. Beyond this, transparency eases the burden on search engines by helping to shape implementation guidelines and revealing aspects of the governing legal framework that require clarification.
Naturally, there is some tension between transparency and the very privacy protection that the RTBF is meant to advance. The revelations that Google has made so far show that there is a way to steer clear of disclosure dangers. Indeed, the aggregate information that we seek threatens privacy far less than the scrubbed anecdotes that Google has already released, or the notifications that it is giving to webmasters registered with Google webmaster tools. The requested data is divorced from individual circumstances and requests. Here is what we think, at a minimum, should be disclosed:
  1. Categories of RTBF requests/requesters that are excluded or presumptively excluded (e.g., alleged defamation, public figures) and how those categories are defined and assessed.
  2. Categories of RTBF requests/requesters that are accepted or presumptively accepted (e.g., health information, address or telephone number, intimate information, information older than a certain time) and how those categories are defined and assessed.
  3. Proportion of requests and successful delistings (in each case by % of requests and URLs) that concern categories including (taken from Google anecdotes): (a) victims of crime or tragedy; (b) health information; (c) address or telephone number; (d) intimate information or photos; (e) people incidentally mentioned in a news story; (f) information about subjects who are minors; (g) accusations for which the claimant was subsequently exonerated, acquitted, or not charged; and (h) political opinions no longer held.
  4. Breakdown of overall requests (by % of requests and URLs, each according to nation of origin) according to the WP29 Guidelines categories. To the extent that Google uses different categories, such as past crimes or sex life, a breakdown by those categories. Where requests fall into multiple categories, that complexity too can be reflected in the data.
  5. Reasons for denial of delisting (by % of requests and URLs, each according to nation of origin). Where a decision rests on multiple grounds, that complexity too can be reflected in the data.
  6. Reasons for grant of delisting (by % of requests and URLs, each according to nation of origin). As above, multi-factored decisions can be reflected in the data.
  7. Categories of public figures denied delisting (e.g., public official, entertainer), including whether a Wikipedia presence is being used as a general proxy for status as a public figure.
  8. Source (e.g., professional media, social media, official public records) of material for delisted URLs by % and nation of origin (with top 5–10 sources of URLs in each category).
  9. Proportion of overall requests and successful delistings (each by % of requests and URLs, and with respect to both, according to nation of origin) concerning information first made available by the requestor (and, if so, (a) whether the information was posted directly by the requestor or by a third party, and (b) whether it is still within the requestor’s control, such as on his/her own Facebook page).
  10. Proportion of requests (by % of requests and URLs) where the information is targeted to the requester’s own geographic location (e.g., a Spanish newspaper reporting on a Spanish person about a Spanish auction).
  11. Proportion of searches for delisted pages that actually involve the requester’s name (perhaps in the form of % of delisted URLs that garnered certain threshold percentages of traffic from name searches).
  12. Proportion of delistings (by % of requests and URLs, each according to nation of origin) for which the original publisher or the relevant data protection authority participated in the decision.
  13. Specification of (a) types of webmasters that are not notified by default (e.g., malicious porn sites); (b) proportion of delistings (by % of requests and URLs) where the webmaster additionally removes information or applies robots.txt at source; and (c) proportion of delistings (by % of requests and URLs) where the webmaster lodges an objection.
As of now, only about 1% of requesters denied delisting are appealing those decisions to national Data Protection Authorities. Webmasters are notified in more than a quarter of delisting cases (Bloomberg, May 6). They can appeal the decision to Google, and there is evidence that Google may revise its decision. In the remainder of cases, the entire process is silent and opaque, with very little public process or understanding of delisting.
The ruling effectively enlisted Google into partnership with European states in striking a balance between individual privacy and public discourse interests. The public deserves to know how the governing jurisprudence is developing. We hope that Google, and all search engines subject to the ruling, will open up."
Full list of signatories, who have the additional honour of riding high in the TechnoLlama approval ratings, available at the Guardian and medium.com.

Monday, May 11, 2015

VE Day, freedom, justice and technology policy


An edited version of the following piece is available at The Conversation.

On the 70th anniversary of VE Day, it became clear that 36.9% of those who turned out in the UK general election the previous day had voted for the Conservative party. With turnout at 66.1%, this means that 24.3% of those eligible to vote gave the Tories an unexpected slim majority of seats in Parliament.
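For anyone who wants to check the sum, here is a minimal sketch of the arithmetic, assuming the two rounded percentages quoted above are the only inputs (the function name is just an illustrative choice). Multiplying the rounded figures gives roughly 24.4%, so the 24.3% quoted presumably rests on unrounded inputs or on truncation rather than rounding.

```python
def share_of_electorate(vote_share_pct: float, turnout_pct: float) -> float:
    """Share of all eligible voters who backed a party, as a percentage.

    vote_share_pct: the party's share of votes cast (percent).
    turnout_pct: share of eligible voters who actually voted (percent).
    """
    return vote_share_pct * turnout_pct / 100


# Rounded figures quoted in the post (assumed inputs, not official raw counts).
tory_share = share_of_electorate(36.9, 66.1)
print(f"{tory_share:.1f}% of eligible voters")
```

Either way, fewer than one in four eligible voters backed the party that now holds the majority.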

What does this mean for technology policy over the next five years, now that the Tories have been joyfully unleashed from their Liberal Democrat coalition partners? Well, the big technology issues will continue to be things like communications infrastructure, surveillance, big data, human rights, encryption, security and intellectual property.

Communications infrastructure
Any progressive modern government indulging in rhetoric about being a world leader in technology needs to put its money where its mouth is. Forget petty little projects like HS2, forget austerity, and pour gigantic pots of the green stuff into building broadband infrastructure to every single corner of this little island. I'm not talking about promising households "up to 2Mbps" or even "up to 20Mbps" but a neutral network that delivers speeds upwards of one terabit per second to every household by 2020.

If we harbour serious ambitions to be technology leaders, then give every member of the population access to technology and robust neutral high speed networks. Then get out of the way and watch the human, constructive, social, technological, economic, commercial magic in that technological playpen.

Surveillance
Sadly, much of what passes for technology policy is likely to continue to be driven by the Home Office. Home Secretary Theresa May's obsession with the thoroughly discredited Communications Data Bill, aka the Snoopers' charter, is a top priority. Though the BBC reports (from 07:50) that the Snoopers' charter is being handed to Mrs May's best mate, Michael Gove, in his new Justice Secretary role. Whichever of them takes it on (and Home Office & Justice ministry insiders still believe it is the Home Office's responsibility), the government mean to reinforce and expand information systems and laws facilitating mass surveillance.

No amount of explanation that mass surveillance is dangerous and doesn't work ever gets through to the largely technologically illiterate MPs in Parliament - proportionately few have any scientific or technology training or background. Government ministers tend to avoid understanding something when their jobs, and future tilts at the Tory leadership, depend on them not understanding it.

Effective law enforcement and intelligence work is complex, messy and difficult. Unfortunately Home and Justice Secretaries, under the gaze of the 24/7 news media, don't have the luxury of managing complex, messy and difficult. They need clear, simple and immediate actions and apparent solutions. 

Mrs May is concerned to be perceived as tough on crime and terrorism, repeatedly declaring a determination to give the security services everything they need - a phrase which, unlike the approval rating it gets in the UK, sends a shiver of fear down the spines of most continental Europeans, whose countries have experienced histories of totalitarian governance. Law enforcement and security services need to use modern digital technologies intelligently in their work and through targeted data preservation regimes – not the mass surveillance regime they are currently operating – engage in technological surveillance of individuals about whom they have reasonable cause to harbour suspicion. That is not, however, the same as building and enabling the legal operation of an infrastructure of mass surveillance.

The Intelligence and Security Committee of Parliament recently reported that "bulk collection" of communications data is acceptable because most of the data is only ever "seen" by computers. By that measure Mrs May & Mr Gove should install a top of the range CCTV camera in every room in every building in the country. They'll only use the footage if it becomes necessary, you see.

Yet not once, in the 14 years since the dreadful 9/11 attacks, on either side of the Atlantic, have mass surveillance magic terrorist catching machines ever been publicly,* credibly and specifically shown to identify a terrorist suspect pre-emptively.

Big data
Successive UK governments have long had a poor record of managing large tranches of electronic data. On the promise of major medical breakthroughs, both partners in the previous coalition government bought into the notion of big data in healthcare. The Tories are keen on privatising the health service. Making medical data available on a wide scale to medical researchers and industry is considered a no-brainer in the newly populated corridors of power.

The now dismantled Nu Labour National Programme for IT in the NHS was arguably the largest ever government IT disaster. But the coalition were, and now the new Conservative government are, keen to press forward with privacy-destroying management of NHS information systems and the further centralisation of medical records in the ill-conceived care.data programme.

Government can't lose by claiming they are spending more on the NHS, even if it is on information systems that won't work and that, in practice, cause untold havoc in the long term.

There is a serious question about facilitating constructive, enlightening, ethical, socially useful research into healthcare big data whilst protecting medical confidentiality. But it should not be done by letting government circumvent human rights law guarantees of privacy. It needs to be governed by a set of principles such as those set out in the Nuffield Council on Bioethics report on the ethical use of data.
  • treat people as individuals worthy of respect and not as industrial raw material - the principle of respect for persons
  • abide by all relevant laws including human rights laws - the principle of respect for established human rights
  • tell people what you are doing with their data and consult them properly and regularly - the principle of participation of those with morally relevant interests
  • account for what you've done with the data and tell people if things go wrong - the principle of accounting for decisions

Human Rights Act
It's a little sad that, 70 years on from VE Day, 24.3% of eligible voters of the great British public apparently support the abolition of the Human Rights Act. In an information age, information and, by proxy, technology policy becomes human rights policy. The new government want to replace the Human Rights Act with a proper British Bill of Rights for proper, deserving, British people and not be dictated to by those unelected European judges. The Tories' proposals for the Bill of Rights suggest it would include all of the rights currently protected by the European Convention on Human Rights (and, therefore, the Human Rights Act) but judicial protection would be denied to those rights judged to be “trivial”. Just as a matter of interest, which of -
  • Obligation to respect Human Rights
  • Right to life
  • Prohibition of torture
  • Prohibition of slavery and forced labour
  • Right to liberty and security
  • Right to a fair trial
  • No punishment without law
  • Right to respect for private and family life
  • Freedom of thought, conscience and religion
  • Freedom of expression
  • Freedom of assembly and association
  • Right to an effective remedy
  • Prohibition of discrimination
  • Prohibition of abuse of rights
- would you consider trivial? If you'd like a lesson on what any of these mean in practice, may I recommend a couple of minutes a day at the recently launched Rights Info site. Pay particular attention to the human rights myths section - every one of these myths has been trotted out by the mainstream media and governing politicians of varying flavours in the past ten years. Theresa May featured the pet cat story in her Tory conference speech in 2011. Michael Gove has apparently been given charge of the Bill of Rights job.

Plans to ban encryption
Much mockery has been poured by the tech community on David Cameron's plans and the FBI's desire to ban encryption and provide mandatory back doors to government. Basically it won't work as a crime/terrorism prevention measure. And it definitely would not be cost effective. Banning Transport Layer Security (TLS) and its predecessor, Secure Sockets Layer (SSL), would be very costly. It would lead to further mass data silos, useless for preventing nefarious actors from engaging in attacks.

Think of it as every householder delivering copies of their door keys to the local police station, just in case the police would like to search any house without breaking the door down. No self-respecting crime syndicate is going to systematically break into multiple houses when they can break into the police station first to get all the keys; or encourage or coerce an insider to provide them with copies of the keys.

The rest
I could go on – there are a host of really important technology related issues to play out in the next 5 years, including: 
  • Web blocking & age verification – think of the children and the poor copyright conglomerates
  • Counter Terrorism & Security Act 2015 "prevent" duties – about to spawn a veritable bureaucratic industry
  • EU data protection measures promised by the end of 2015 – that’ll be fun
  • Intellectual property – a complex black hole that determines control of information
- but you, dear reader, have already been inordinately patient. 

So I will close with a question.

70 years on from VE Day, given the surveillance society we have built since the turn of the century and intend to expand and reinforce over the next 5 years, can we really be seen to be honouring the sacrifice of those who died to win our freedoms and a lasting peace, founded on justice and good will; or, as John Naughton so eloquently puts it, does our indolence constitute “a shocking case study of what complacent ignorance can do to a democracy”?

* Updated for clarification to add the word "publicly".