Teresa Scassa - Blog


On June 13, 2018 the Supreme Court of Canada handed down a decision that may have implications for how issues of bias in algorithmic decision-making in Canada will be dealt with. Ewert v. Canada is the result of an eighteen-year struggle by Mr. Ewert, a federal inmate and Métis man, to challenge the use of certain actuarial risk-assessment tools to make decisions about his carceral needs and about his risk of recidivism. His concerns, raised in his initial grievance in 2000, have been that these tools were “developed and tested on predominantly non-Indigenous populations and that there was no research confirming that they were valid when applied to Indigenous persons.” (at para 12) After his grievances went nowhere, he eventually sought a declaration in Federal Court that the tests breached his rights to equality and to due process under the Canadian Charter of Rights and Freedoms, and that they were also a breach of the Corrections and Conditional Release Act (CCRA), which requires the Correctional Service of Canada (CSC) to “take all reasonable steps to ensure that any information about an offender that it uses is as accurate, up to date and complete as possible.” (s. 24(1)) Although the Charter arguments were unsuccessful, the majority of the Supreme Court of Canada agreed with the trial judge that the CSC had breached its obligations under the CCRA. Two justices in dissent agreed with the Federal Court of Appeal that neither the Charter nor the CCRA had been breached.

Although this is not explicitly a decision about ‘algorithmic decision-making’ as the term is used in the big data and artificial intelligence (AI) contexts, the basic elements are present. An assessment tool developed and tested using a significant volume of data is used to generate predictive data to aid in decision-making in individual cases. The case also highlights a common concern in the algorithmic decision-making context: that either the data used to develop and train the algorithm, or the assumptions coded into the algorithm, create biases that can lead to inaccurate predictions about individuals who fall outside the dominant group that has influenced the data and the assumptions.

As such, my analysis is not about the particular circumstances of Mr. Ewert, nor is it about the impact of the judgment within the correctional system in Canada. Instead, I parse the decision to see what it reveals about how courts might approach issues of bias in algorithmic decision-making, and what impact the decision may have in this emerging context.

1. ‘Information’ and ‘accuracy’

A central feature of the decision of the majority in Ewert is its interpretation of s. 24(1) of the CCRA. To repeat the wording of this section, it provides that “The Service shall take all reasonable steps to ensure that any information about an offender that it uses is as accurate, up to date and complete as possible.” [My emphasis] In order to conclude that this provision was breached, it was necessary for the majority to find that Mr. Ewert’s test results were “information” within the meaning of this section, and that the CSC had not taken all reasonable steps to ensure its accuracy.

The dissenting justices took the view that when s. 24(1) referred to “information” and to the requirement to ensure its accuracy, the statute included only the kind of personal information collected from inmates, information about the offence committed, and a range of other information specified in s. 23 of the Act. The dissenting justices preferred the view of the CSC that “information” meant ““primary facts” and not “inferences or assessments drawn by the Service”” (at para 107). The majority disagreed. It found that when Parliament intended to refer to specific information in the CCRA it did so. When it used the term “information” in an unqualified way, as it did in s. 24(1), it had a much broader meaning. Thus, according to the majority, “the knowledge the CSC might derive from the impugned tools – for example, that an offender has a personality disorder or that there is a high risk that an offender will violently reoffend – is “information” about that offender” (at para 33). This interpretation of “information” is an important one. According to the majority, profiles and predictions applied to a person are “information” about that individual.

In this case, the Crown had argued that s. 24(1) should not apply to the predictive results of the assessment tools because it imposed an obligation to ensure that “information” is “as accurate” as possible. It argued that the term “accurate” was not appropriate to the predictive data generated by the tools. Rather, the tools “may have “different levels of predictive validity, in the sense that they predict poorly, moderately well or strongly””. (at para 43) The dissenting justices were clearly influenced by this argument, finding that: “a psychological test can be more or less valid or reliable, but it cannot properly be described as being “accurate” or “inaccurate”.” (at para 115) According to the dissent, all that was required was that accurate records of an inmate’s test scores must be maintained – not that the tests themselves must be accurate. The majority disagreed. In its view, the concept of accuracy could be adapted to different types of information. When applied to psychological assessment tools, “the CSC must take steps to ensure that it relies on test scores that predict risks strongly rather than those that do so poorly.” (at para 43)
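The majority's distinction between tools that predict risks "strongly" rather than "poorly" maps onto what statisticians call predictive validity. One standard way to summarize it is the AUC: the probability that the tool ranks a randomly chosen reoffender above a randomly chosen non-reoffender. The following is a minimal illustrative sketch, not anything drawn from the actual tools at issue; all scores and outcomes are hypothetical.

```python
# Illustrative sketch of "predictive validity" as the majority frames it:
# a tool that "predicts strongly" ranks actual reoffenders above
# non-reoffenders most of the time. AUC is one common summary measure.
# All scores and outcomes below are purely hypothetical.

def auc(scores, outcomes):
    """Probability that a randomly chosen positive case (outcome 1)
    receives a higher risk score than a randomly chosen negative
    case (outcome 0); ties count as half."""
    pos = [s for s, y in zip(scores, outcomes) if y == 1]
    neg = [s for s, y in zip(scores, outcomes) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# A tool may separate outcomes well for one (hypothetical) group ...
print(auc([9, 8, 7, 2, 1], [1, 1, 1, 0, 0]))  # → 1.0 (predicts strongly)

# ... yet be no better than chance for another (hypothetical) group —
# the kind of cross-group variance at the heart of Mr. Ewert's claim.
print(auc([5, 5, 5, 5], [1, 0, 1, 0]))        # → 0.5 (predicts poorly)
```

On this reading, "accuracy" in s. 24(1) is not about whether a score was transcribed correctly, but about whether the tool's scores actually carry predictive value for the population to which they are applied.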

It is worth noting that the Crown also argued that the assessment tools were important in decision-making because “the information derived from them is objective and thus mitigates against bias in subjective clinical assessments” (at para 41). While the underlying point is that the tools might produce more objective assessments than individual psychologists who might bring their own biases to an assessment process, the use of the term “objective” to describe the output is troubling. If the tools incorporate biases, or are not appropriately sensitive to cultural differences, then the output is ‘objective’ in only a very narrow sense of the word, and the use of the word masks underlying issues of bias. Interestingly, the majority took the view that if the tools are considered useful “because the information derived from them can be scientifically validated. . . this is all the more reason to conclude that s. 24(1) imposes an obligation on the CSC to take reasonable steps to ensure that the information is accurate.” (at para 41)

It should be noted that while this discussion all revolves around the particular wording of the CCRA, Principle 4.6 of Schedule I of the Personal Information Protection and Electronic Documents Act (PIPEDA) contains the obligation that: “Personal information shall be as accurate, complete, and up-to-date as is necessary for the purposes for which it is to be used.” Further, s. 6(2) of the Privacy Act provides that: “A government institution shall take all reasonable steps to ensure that personal information that is used for an administrative purpose by the institution is as accurate, up-to-date and complete as possible.” A similar interpretation of “information” and “accuracy” in these statutes could be very helpful in addressing issues of bias in algorithmic decision-making more broadly.

2. Reasonable steps to ensure accuracy

According to the majority, “[t]he question is not whether the CSC relied on inaccurate information, but whether it took all reasonable steps to ensure that it did not.” (at para 47). This distinction is important – it means that Mr. Ewert did not have to show that his actual test scores were inaccurate, something that would be quite burdensome for him to do. According to the majority, “[s]howing that the CSC failed to take all reasonable steps in this respect may, as a practical matter, require showing that there was some reason for the CSC to doubt the accuracy of information in its possession about an offender.” (at para 47, my emphasis) The majority noted that the trial judge had found that “the CSC had long been aware of concerns regarding the possibility of psychological and actuarial tools exhibiting cultural bias.” (at para 49) The concerns had led to research being carried out in other jurisdictions about the validity of the tools when used to assess certain other cultural minority groups. The majority also noted that the CSC had carried out research “into the validity of certain actuarial tools other than the impugned tools when applied to Indigenous offenders” (at para 49) and that this research had led to those tools no longer being used. However, in this case, in spite of concerns, the CSC had taken no steps to assess the validity of the tools, and it continued to apply them to Indigenous offenders. The majority noted that the CCRA, which set out guiding principles in s. 4, specifically required correctional policies and practices to respect cultural, linguistic and other differences and to take into account “the special needs of women, aboriginal peoples, persons requiring mental health care and other groups” (s. 4(g)). The majority found that this principle “represents an acknowledgement of the systemic discrimination faced by Indigenous persons in the Canadian correctional system.” (at para 53) As a result, it found it incumbent on the CSC to give “meaningful effect” to this principle “in performing all of its functions”. In particular, the majority found that “this provision requires the CSC to ensure that its practices, however neutral they may appear to be, do not discriminate against Indigenous persons.” (at para 54)

The majority observed that although it has been 25 years since this principle was added to the legislation, “there is nothing to suggest that the situation has improved in the realm of corrections” (at para 60). It expressed dismay that “the gap between Indigenous and non-Indigenous offenders has continued to widen on nearly every indicator of correctional performance”. (at para 60) It noted that “Although many factors contributing to the broader issue of Indigenous over-incarceration and alienation from the criminal justice system are beyond the CSC’s control, there are many matters within its control that could mitigate these pressing societal problems. . . Taking reasonable steps to ensure that the CSC uses assessment tools that are free of cultural bias would be one.” (at para 61) [my emphasis]

According to the majority of the Court, therefore, what is required by s. 24(1) of the CCRA is for the CSC to carry out research into whether and to what extent the assessment tools it uses “are subject to cross-cultural variance when applied to Indigenous offenders.” (at para 67) Any further action would depend on the results of the research.

What is interesting here is that the onus is placed on the CSC (influenced by the guiding principles in the CCRA) to take positive steps to verify the validity of the assessment tools on which it relies. The Court does not specify who is meant to carry out the research in question, what standards it should meet, or how extensive it should be. These are important issues. It should be noted that discussions of algorithmic bias often consider solutions involving independent third-party assessment of the algorithms or the data used to develop them.
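The kind of validation research the Court contemplates — testing whether a tool is "subject to cross-cultural variance" — can take several forms. One simple form is a calibration check: does the tool's average predicted risk match the observed reoffence rate within each group? The sketch below is a deliberately simplified illustration under hypothetical data; real validation studies are far more involved, and the group names, numbers, and threshold here are all invented.

```python
# Simplified sketch of a cross-cultural validity (calibration) check:
# compare each group's mean predicted risk against its observed
# reoffence rate, and flag groups where the tool is miscalibrated.
# All group names, scores, outcomes, and the threshold are hypothetical.

def calibration_gap(predicted, observed):
    """Absolute difference between mean predicted risk (0..1) and the
    observed rate of the outcome (0/1 values) for one group."""
    return abs(sum(predicted) / len(predicted)
               - sum(observed) / len(observed))

# Hypothetical validation data: (predicted risks, observed outcomes).
records = {
    "group_a": ([0.2, 0.4, 0.6, 0.8], [0, 0, 1, 1]),  # gap ≈ 0.0
    "group_b": ([0.6, 0.7, 0.8, 0.9], [0, 0, 1, 0]),  # gap = 0.5
}

THRESHOLD = 0.1  # hypothetical tolerance for miscalibration
flagged = {g for g, (p, o) in records.items()
           if calibration_gap(p, o) > THRESHOLD}
print(flagged)  # → {'group_b'}: the tool over-predicts risk for group_b
```

A finding like the flagged group above is exactly the sort of "reason to doubt the accuracy of information" that, on the majority's reasoning, would oblige the CSC to act — which is why who conducts such research, and to what standard, matters so much.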

3. The Charter arguments

Two Charter arguments were raised by counsel for Mr. Ewert. The first was a s. 7 due process argument. Counsel for Mr. Ewert argued that reliance on the assessment tools violated his right to liberty and security of the person in a manner that was not in accordance with the principles of fundamental justice. The tools were argued to fall short of the principles of fundamental justice because of their arbitrariness (lacking any rational connection to the government objective) and overbreadth. The Court was unanimous in finding that reliance on the tools was not arbitrary, stating that “The finding that there is uncertainty about the extent to which the tests are accurate when applied to Indigenous offenders is not sufficient to establish that there is no rational connection between reliance on the tests and the relevant government objective.” (at para 73) Without further research, the extent and impact of any cultural bias could not be known.

Mr. Ewert also argued that the results of the use of the tools infringed his right to equality under s. 15 of the Charter. The Court gave little time or attention to this argument, finding that there was not enough evidence to show that the tools had a disproportionate impact on Indigenous inmates when compared to non-Indigenous inmates.

The Charter is part of the Constitution and applies only to government action. There are many instances in which governments may come to rely upon algorithmic decision-making. While concerns might be raised about bias and discriminatory impacts from these processes, this case demonstrates the challenge faced by those who would raise such arguments. The decision in Ewert suggests that in order to establish discrimination, it will be necessary either to demonstrate discriminatory impacts or effects, or to show how the algorithm itself and/or the data used to develop it incorporate biases or discriminatory assumptions. Establishing any of these things will impose a significant evidentiary burden on the party raising the issue of discrimination. Even where the Charter does not apply and individuals must rely upon human rights legislation, establishing discrimination with complex (and likely inaccessible or non-transparent) algorithms and data will be highly burdensome.

Concluding thoughts

This case raises important and interesting issues that are relevant in algorithmic decision-making of all kinds. The result obtained in this case favoured Mr. Ewert, but it should be noted that it took him 18 years to achieve this result, and he required the assistance of a dedicated team of lawyers. There is clearly much work to do to ensure that fairness and transparency in algorithmic decision-making are accessible and realizable.

Mr. Ewert’s success was ultimately based, not upon human rights legislation or the Charter, but upon federal legislation which required the keeping of accurate information. As noted above, PIPEDA and the Privacy Act impose a similar requirement on organizations that collect, use or disclose personal information to ensure the accuracy of that information. Using the interpretive approach of the Supreme Court of Canada in Ewert v. Canada, this statutory language may provide a basis for supporting a broader right to fair and unbiased algorithmic decision-making. Yet, as this case also demonstrates, it may be challenging for those who feel they are adversely impacted to make their case, absent evidence of long-standing and widespread concerns about particular tests in specific contexts.

 

Published in Privacy

In October 2016, the data analytics company Geofeedia made headlines when the California chapter of the American Civil Liberties Union (ACLU) issued the results of a major study which sought to determine the extent to which police services in California were using social media data analytics. These analytics were based upon geo-referenced information posted by ordinary individuals to social media websites such as Twitter and Facebook. Information of this kind is treated as “public” in the United States because it is freely contributed by users to a public forum. Nevertheless, the use of social media data analytics by police raises important civil liberties and privacy questions. In some cases, users may not be aware that their tweets or posts contain additional meta data including geolocation information. In all cases, the power of data analytics permits rapid cross-referencing of data from multiple sources, permitting the construction of profiles that go well beyond the information contributed in single posts.

The extent to which social media data analytics are used by police services is difficult to assess because there is often inadequate transparency both about the actual use of such services and the purposes for which they are used. Through a laborious process of filing freedom of information requests the ACLU sought to find out which police services were contracting for social media data analytics. The results of their study showed widespread use. What they found in the case of Geofeedia went further. Although Geofeedia was not the only data analytics company to mine social media data and to market its services to government authorities, its representatives had engaged in email exchanges with police about their services. In these emails, company employees used two recent sets of protests against police as examples of the usefulness of social media data analytics. These protests were those that followed the death in police custody of Freddie Gray, a young African-American man who had been arrested in Baltimore, and the shooting death by police of Michael Brown, an eighteen-year-old African-American man in Ferguson, Missouri. By explicitly offering services that could be used to monitor those who protested police violence against African Americans, the Geofeedia emails aggravated a climate of mistrust and division, and confirmed a belief held by many that authorities were using surveillance and profiling to target racialized communities.

In a new paper, just published in the online, open-access journal SCRIPTed, I use the story around the discovery of Geofeedia’s activities and the backlash that followed to frame a broader discussion of police use of social media data analytics. Although this paper began as an exploration of the privacy issues raised by the state’s use of social media data analytics, it shifted into a paper about transparency. Clearly, privacy issues – as well as other civil liberties questions – remain of fundamental importance. Yet, the reality is that without adequate transparency there simply is no easy way to determine whether police are relying on social media data analytics, on what scale and for what purposes. This lack of transparency makes it difficult to hold anyone to account. The ACLU’s work to document the problem in California was painstaking and time consuming, as was a similar effort by the Brennan Center for Justice, also discussed in this paper. And, while the Geofeedia case provided an important example of the real problems that underlie such practices, it only came to light because Geofeedia’s employees made certain representations by email instead of in person or over the phone. A company need only direct that email not be used for these kinds of communications for the content of these communications to disappear from public view.

My paper examines the use of social media data analytics by police services, and then considers a range of different transparency issues. I explore some of the challenges to transparency that may flow from the way in which social media data analytics are described or characterized by police services. I then consider transparency from several different perspectives. In the first place I look at transparency in terms of developing explicit policies regarding social media data analytics. These policies are not just for police, but also for social media platforms and the developers that use their data. I then consider transparency as a form of oversight. I look at the ways in which greater transparency can cast light on the activities of the providers and users of social media data and data analytics. Finally, I consider the need for greater transparency around the monitoring of compliance with policies (those governing police or developers) and the enforcement of these policies.

A full text of my paper is available here under a CC Licence.

Published in Privacy

The social sciences research community has been buzzing over the announcement on May 17, 2016 that the Social Sciences Research Network (SSRN) has been acquired by Elsevier Publishing Group.

SSRN is a digital repository that was created in order to enable researchers in the social sciences to share their work in advance of its publication. Prior to the launch of SSRN, long delays between submission and print publication of papers had been a significant problem for researchers – particularly those working in rapidly changing and evolving fields. In addition, it was not always easy to find out who was working in similar areas or to be aware of developing trends in research as a result of the long publication delays. SSRN allows researchers to publish working papers, conference papers, and pre-print versions of accepted papers – as well as (where permitted by journals) published versions of papers. Access to the database is free to anyone with an Internet connection. This too is important for sharing academic research more broadly – many published academic journals sit behind digital paywalls making broader public access impractical or impossible. SSRN has been a game-changer, and it is now widely used by academics around the world as a vehicle for sharing research.

Elsevier is a commercial publisher which has, in the past, focused primarily on the fields of science, technology and health. It publishes over 2000 international journals. In recent years it has developed other information “solutions”. These include not only digital publishing platforms, but also data analytics, as well as tools to enhance and facilitate collaboration among researchers.

The controversy over the acquisition of SSRN lies in the deep distrust many researchers seem to have about the willingness of a commercial publisher known for its top-dollar subscriptions and generally restrictive access policies to maintain a publicly accessible information dissemination service that is free to both academics and the broader public. The founders of SSRN maintain that Elsevier, which also publishes open access journals, understands the need for broad sharing of research and has no intention of placing the site behind a paywall. They argue that SSRN’s acquisition by Elsevier will only enhance the services it can offer to scholars.

Critics of the sale of SSRN to Elsevier raise a number of concerns. One of these is, of course, whether SSRN will genuinely continue to be available as a free resource for sharing research. The reassurances of both Elsevier and SSRN’s founders are firm in this respect. Nevertheless, there are concerns that Elsevier might more strictly police what content is available on SSRN. It is likely the case that some academics post articles to which their publishers hold the copyright on the view that enough time has passed since publication to make free dissemination normatively if not legally acceptable.

The potential that access to some content might be limited is only one of the issues that should be on scholars’ radar – and it is probably not the most important one. By acquiring SSRN, Elsevier will enhance its expanding analytics capability – and data analytics are an important part of its business model. Researchers should consider the nature and extent of these analytics and how they might impact on the publication, dissemination, valuation and support for research in other venues and other contexts. For example, how might granting agencies or governments use proprietary data analytics to make decisions about what research to fund or not fund? Will universities purchase data from Elsevier to use in the evaluation of their researchers for tenure, promotion, or other purposes? Does it serve the academic community to have so much data – and its analytic potential – in the hands of a single private sector organization? Given that this data might have important ramifications for scholars, and, by extension, for society, are there any governance, accountability or oversight mechanisms that will provide insight into how the data is collected or analyzed?

Essentially, the noble project that was SSRN has evolved into a kind of Facebook for academics. Researchers post their articles and conference papers to share with the broader community – and will continue to do so. While for researchers these works are what define them and are the “value” that they contribute to the site, the real commercial value lies in the data that can be mined from SSRN. Who collaborates with whom? How many times is a paper read or downloaded? Who cites whom, and how often? The commercialization of SSRN should be of concern to academics, but it is data governance and not copyright that should be the focus of attention.

Published in Copyright Law
