
On December 7, 2021, the privacy commissioners of Quebec, British Columbia and Alberta issued orders against the US-based company Clearview AI, following its refusal to voluntarily comply with the findings in the joint investigation report they issued along with the federal privacy commissioner on February 3, 2021.

Clearview AI gained worldwide attention in early 2020 when a New York Times article revealed that its services had been offered to law enforcement agencies, in a largely non-transparent manner, in many countries around the world. Clearview AI’s technology also has many other potential applications, including in the private sector. It built its massive database of over 10 billion images by scraping photographs from publicly accessible websites across the Internet and deriving biometric identifiers from the images. Users of its services upload a photograph of a person. The service then analyzes that image and compares it with the stored biometric identifiers. Where there is a match, the user is provided with all matching images and their metadata, including links to the sources of each image.
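The matching step described here follows a standard face-recognition pattern: derive a numeric embedding (a “facial vector”) from each scraped image, then compare an uploaded probe photo against the stored embeddings. The sketch below illustrates that generic pattern using the open-source face_recognition library as a stand-in; the database layout, function names and the 0.6 tolerance are illustrative assumptions, and nothing here reflects Clearview AI’s actual system.

```python
# Illustrative sketch of the generic embed-and-match pipeline described above,
# using the open-source face_recognition library. Not Clearview AI's code.
import face_recognition
import numpy as np

# Hypothetical store of biometric identifiers (128-dimension embeddings)
# derived from scraped images, kept alongside their source metadata.
database = [
    # {"encoding": <numpy array>, "source_url": "https://example.com/photo"},
]

def search(probe_path: str, tolerance: float = 0.6) -> list:
    """Return source links for stored images that match the uploaded photo."""
    probe = face_recognition.load_image_file(probe_path)
    encodings = face_recognition.face_encodings(probe)
    if not encodings or not database:
        return []  # no face found in the probe, or nothing to match against
    distances = face_recognition.face_distance(
        np.array([rec["encoding"] for rec in database]), encodings[0])
    return [rec["source_url"]
            for rec, d in zip(database, distances) if d <= tolerance]
```

The key point for the legal analysis that follows is that once the embedding is computed and stored, the link back to the source image (and its metadata) persists in the database regardless of where the original photo was hosted.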

Clearview AI has been the target of investigation by data protection authorities around the world. France’s Commission Nationale de l'Informatique et des Libertés has found that Clearview AI breached the General Data Protection Regulation (GDPR). Australia and the UK conducted a joint investigation which similarly found the company to be in violation of their respective data protection laws. The UK commissioner has since issued a provisional view, stating its intent to levy a substantial fine. Legal proceedings are currently underway in Illinois, a state which has adopted biometric privacy legislation. Canada’s joint investigation report issued by the federal, Quebec, B.C. and Alberta commissioners found that Clearview AI had breached the federal Personal Information Protection and Electronic Documents Act, as well as the private sector data protection laws of each of the named provinces.

The Canadian joint investigation set out a series of recommendations for Clearview AI. Specifically, it recommended that Clearview AI cease offering its facial recognition services in Canada, “cease the collection, use and disclosure of images and biometric facial arrays collected from individuals in Canada”, and delete any such data in its possession. Clearview AI responded by saying that it had temporarily ceased providing its services in Canada, and that it was willing to maintain that suspension for a further 18 months. It also indicated that if it offered services in Canada again, it would require its clients to adopt a policy regarding facial recognition technology, and it would offer an audit trail of searches.

On the second and third recommendations, Clearview AI responded that it was simply not possible to determine which photos in its database were of individuals in Canada. It also reiterated its view that images found on the Internet are publicly available and free for use in this manner. It concluded that it had “already gone beyond its obligations”, and that while it was “willing to make some accommodations and met some of the requests of the Privacy Commissioners, it cannot commit itself to anything that is impossible and or [sic] required by law.” (Letter reproduced at para 3 of Order P21-08).

In this post I consider three main issues that flow from the orders issued by the provincial commissioners. The first relates to the cross-border reach of Canadian law. The second relates to enforcement (or lack thereof) in the Canadian context, particularly as compared with what is available in other jurisdictions such as the UK and the EU. The third issue relates to the interest shown by the commissioners in a compromise volunteered by Clearview AI in the ongoing Illinois litigation – and what this might mean for Canadians’ privacy.

 

1. Jurisdiction

Clearview AI maintains that Canadian laws do not apply to it. It argues that it is a US-based company with no physical presence in Canada. Although it initially provided its services to Canadian law enforcement agencies (see this CBC article for details of the use of Clearview by Toronto Police Services), it had since ceased to do so – thus, it no longer had clients in Canada. It scraped its data from platform companies such as Facebook and Instagram, and while many Canadians have accounts with such companies, Clearview’s scraping activities involved access to data hosted on platforms outside of Canada. It therefore argued that not only did it not operate in Canada, but it also had no ‘real and substantial’ connection to Canada.

The BC Commissioner does not directly address this issue. In his Order, he finds a hook for jurisdiction by referring to the personal data as having been “collected from individuals in British Columbia without their consent”, although it is clear there is no direct collection. He also notes Clearview’s active contemplation of resuming its services in Canada. Alberta’s Commissioner makes a brief reference to jurisdiction, simply stating that “Provincial privacy legislation applies to any private sector organization that collects, uses and discloses information of individuals within that province” (at para 12). The Quebec Commissioner, by contrast, gives a thorough discussion of the jurisdictional issues. In the first place, she notes that some of the images came from public Quebec sources (e.g., newspaper websites). She also observes that nothing indicates that images scraped from Quebec sources have been removed from the database; they therefore continue to be used and disclosed by the company.

Commissioner Poitras cited the Federal Court decision in Lawson for the principle that PIPEDA could apply to a US-based company that collected personal information from Canadian sources – so long as there is a real and substantial connection to Canada. She found a connection to Quebec in the free accounts offered to, and used by, Quebec law enforcement officials. She noted that the RCMP, which operates in Quebec, had also been a paying client of Clearview’s. When Clearview AI was used by clients in Quebec, those clients uploaded photographs to the service in the search for a match. This also constituted a collection of personal information by Clearview AI in Quebec.

Commissioner Poitras found that the location of Clearview’s business and its servers is not a determinative jurisdictional factor for a company that offers its services online around the world, and that collects personal data from the Internet globally. She found that Clearview AI’s database was at the core of its services, and a part of that database was comprised of data from Quebec and about Quebeckers. Clearview had offered its service in Quebec, and its activities had a real impact on the privacy of Quebeckers. Commissioner Poitras noted that millions of images of Quebeckers were appropriated by Clearview without the consent of the individuals in the images; these images were used to build a global biometric facial recognition database. She found that it was particularly important not to create a situation where individuals are denied recourse under quasi-constitutional laws such as data protection laws. These elements in combination, in her view, would suffice to create a real and substantial connection.

Commissioner Poitras did not accept that Clearview’s suspension of Canadian activities changed the situation. She noted that information that had been collected in Quebec remained in the database, which continued to be used by the company. She stated that a company could not appropriate the personal information of a substantial number of Quebeckers, commercialise this information, and then avoid the application of the law by saying they no longer offered services in Quebec.

The jurisdictional questions are both important and thorny. This case is different from cases such as Lawson and Globe24h.com, where the connections with Canada were more straightforward. In Lawson, there was clear evidence that the company offered its services to clients in Canada. It also directly obtained some of its data about Canadians from Canadian sources. In Globe24h.com, there was likewise evidence that Canadians were being charged by the Romanian company to have their personal data removed from the database. In addition, the data came from Canadian court decisions that were scraped from websites located in Canada. In Clearview AI, while some of the scraped data may have been hosted on servers located in Canada, most were scraped from offshore social media platform servers. If Clearview AI stopped offering its services in Canada and stopped scraping data from servers located in Canada, what recourse would Canadians have? The Quebec Commissioner attempts to address this question, but her reasons are based on factual connections that might not be present in the future, or in cases involving other data-scraping respondents. What is needed is a theory of real and substantial connection – one that specifically addresses the scraping of data from third-party websites, contrary to those websites’ terms of use and to the legal expectations of the sites’ users – that can anchor the jurisdiction of Canadian law even when the scraper has no other connection to Canada.

Canada is not alone in facing these jurisdictional issues – Australia’s orders to Clearview AI are currently under appeal, and the jurisdiction of the Australian Commissioner to make such orders will be one of the issues on appeal. A jurisdictional case – one that is convincing not just to privacy commissioners but to the foreign courts that may one day have to determine whether to enforce Canadian decisions – needs to be made.

 

2. Enforcement

At the time the facts of the Clearview AI investigation arose, all four commissioners had limited enforcement powers. The three provincial commissioners could issue orders requiring an organization to change its practices. The federal commissioner has no order-making powers, but can apply to the Federal Court to ask that court to issue orders. The relative impotence of the commissioners is illustrated by Clearview’s hubristic response, cited above, indicating that it had already “gone beyond its obligations”. Clearly, it considered that nothing the commissioners had to say on the matter amounted to an obligation.

The Canadian situation can be contrasted with that in the EU, where commissioners’ orders requiring organizations to change their non-compliant practices are now reinforced by the power to levy significant administrative monetary penalties (AMPs). The same situation exists in the UK. There, the data commissioner has just issued a preliminary enforcement notice and a proposed fine of £17M against Clearview AI. As noted earlier, the enforcement situation is beginning to change in Canada – Quebec’s newly amended legislation permits the levying of substantial AMPs. When some version of Bill C-11 is reintroduced in Parliament in 2022, it will likely also contain the power to levy AMPs. BC and Alberta may eventually follow suit. When this happens, the challenge will be first, to harmonize enforcement approaches across those jurisdictions; and second, to ensure that these penalties can meaningfully be enforced against offshore companies such as Clearview AI.

On the enforcement issue, it is perhaps also worth noting that the orders issued by the three Commissioners in this case are all slightly different. The Quebec Commissioner orders Clearview AI to cease collecting images of Quebeckers without consent, and to cease using these images to create biometric identifiers. She also orders the destruction, within 90 days of receipt of the order, of all images collected without the consent of Quebeckers, as well as the destruction of the corresponding biometric identifiers. Alberta’s Commissioner orders that Clearview cease offering its services to clients in Alberta, cease the collection and use of images and biometrics collected from individuals in Alberta, and delete the same from its databases. BC’s order prohibits Clearview AI from offering, to clients in British Columbia, services that use data collected from British Columbians without their consent. He also orders that Clearview AI use “best efforts” to cease its collection, use and disclosure of images and biometric identifiers of British Columbians without their consent, and to use the same “best efforts” to delete images and biometric identifiers collected without consent.

It is to these “best efforts” that I next turn.

 

3. The Illinois Compromise

All three Commissioners make reference to a compromise offered by Clearview AI in the course of ongoing litigation in Illinois under Illinois’ Biometric Information Privacy Act. By referring to “best efforts” in his Order, the BC Commissioner seems to be suggesting that something along these lines would be an acceptable compromise in his jurisdiction.

In its response to the Canadian commissioners, Clearview AI raised the issue that it cannot easily know which photographs in its database are of residents of particular provinces, particularly since these are scraped from the Internet as a whole – and often from social media platforms hosted outside Canada.

Yet Clearview AI has indicated that it has changed some of its business practices to avoid infringing Illinois law. This includes “cancelling all accounts belonging to any entity based in Illinois” (para 12, BC Order). It also includes blocking from any searches all images in the Clearview database that are geolocated in Illinois. Going forward, it offers to create a “geofence” around Illinois. This means that it “will not collect facial vectors from any scraped images that contain metadata associating them with Illinois” (para 12, BC Order). It will also “not collect facial vectors from images stored on servers that are displaying Illinois IP addresses or websites with URLs containing keywords such as ‘Chicago’ or ‘Illinois’.” Clearview also apparently offers to create an “opt-out” mechanism whereby people can ask to have their photos excluded from the database. Finally, it will require its clients not to upload photos of Illinois residents. If such a photo is uploaded, and it contains Illinois-related metadata, no search will be performed.
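To see what such a geofence amounts to in practice, here is a minimal sketch of metadata-based filtering of the kind described in the BC Order. The bounding box, helper names and metadata fields are all illustrative assumptions, not Clearview AI’s actual implementation:

```python
# Illustrative sketch only: simplified metadata-based geofence filtering of the
# kind described above. All names and values here are hypothetical.

ILLINOIS_KEYWORDS = {"illinois", "chicago"}

# Rough bounding box for Illinois (lat/lon); a real geofence would use an
# accurate polygon, not a rectangle.
IL_LAT = (36.9, 42.6)
IL_LON = (-91.6, -87.0)

def in_illinois(lat: float, lon: float) -> bool:
    return IL_LAT[0] <= lat <= IL_LAT[1] and IL_LON[0] <= lon <= IL_LON[1]

def should_skip(image_meta: dict, source_url: str) -> bool:
    """Return True if no facial vectors should be computed for this image."""
    # 1. Skip if the source URL suggests an Illinois-based site.
    url = source_url.lower()
    if any(kw in url for kw in ILLINOIS_KEYWORDS):
        return True
    # 2. Skip if embedded GPS metadata places the photo inside the geofence.
    gps = image_meta.get("gps")  # e.g. {"lat": 41.88, "lon": -87.63}
    if gps and in_illinois(gps["lat"], gps["lon"]):
        return True
    return False
```

The sketch makes the gap plain: an image with no GPS metadata and a neutral source URL passes every check, which is precisely the weakness discussed below.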

The central problem with accepting the ‘Illinois compromise’ is that it allows a service built on illegally scraped data to continue operating with only a reduced privacy impact. Ironically, it also requires individuals who wish to benefit from the compromise to provide more personal data in their online postings: many people deliberately suppress geolocation information from their photographs to protect their privacy, yet the ‘Illinois compromise’ can only exclude photos that contain geolocation data. Even with geolocation turned on, it would not exclude, for example, the vacation pics of BC residents taken outside of BC. Further, limiting the scraping of images from Illinois-based sites will not prevent the photos of Illinois-based individuals from being included in the database a) if they are already in there, and b) if the images are posted on social media platforms hosted elsewhere.

Clearview AI is a business built upon data collection practices that are illegal in a large number of countries outside the US. The BC Commissioner is clearly of the opinion that a compromise solution is the best that can be hoped for, and he may be right in the circumstances. Yet it is a bitter pill to think that such flouting of privacy laws will ultimately be rewarded, as Clearview gets to keep and commercialize its facial recognition database. Accepting such a compromise could limit the harms of the improper exploitation of personal data, but it does not stop the exploitation of that data in all circumstances. And even this unhappy compromise may be out of reach for Canadians given the rather toothless nature of our current laws – and the jurisdictional challenges discussed earlier.

If anything, this situation cries out for global and harmonized solutions. Notably, it requires the US to do much more to bring its wild-west approach to personal data exploitation in line with the approaches of its allies and trading partners. It will also require better cooperation on enforcement across borders. It may also call for social media giants to take more responsibility when it comes to companies that flout their terms and conditions to scrape their sites for personal data. The Clearview AI situation highlights these issues – as well as the dramatic impacts data misuse may have on privacy as personal data continues to be exploited for use in powerful AI technologies.

Published in Privacy

 

A joint ruling from the federal Privacy Commissioner and his provincial counterparts in Quebec, B.C., and Alberta has found that U.S.-based company Clearview AI breached Canadian data protection laws when it scraped photographs from social media websites to create the database it used to support its facial recognition technology. According to the report, the database contained the biometric data of “a vast number of individuals in Canada, including children.” Investigations of complaints under public sector data protection laws about police use of Clearview AI’s services are still ongoing.

The Commissioners’ findings are unequivocal. The information collected by Clearview AI is sensitive biometric data. Express consent was required for its collection and use, and Clearview AI did not obtain consent. The company’s argument that consent was not required because the information was publicly available was firmly rejected. The Commissioners described Clearview AI’s actions as constituting “the mass identification and surveillance of individuals by a private entity in the course of commercial activity.” (at para 72) In defending itself, Clearview AI put forward arguments that were clearly at odds with Canadian law. It also resisted the jurisdiction of the Canadian Commissioners, notwithstanding the fact that it collected the personal data of Canadians and offered its commercial services to Canadian law enforcement agencies. Clearview AI did not accept the Commissioners’ findings, and “has not committed to following” the recommendations.

At the time of this report, Bill C-11, a bill to reform Canada’s current data protection law, is before Parliament. The goal of this post is to consider what difference Bill C-11 might make to the outcome of complaints like this one should it be passed into law. I consider both the substantive provisions of the bill and its new enforcement regime.

Consent

As under the current Personal Information Protection and Electronic Documents Act (PIPEDA), consent is a core requirement of Bill C-11. To collect, use or disclose personal information, an organization must either obtain valid consent, or its activities must fall into one of the exceptions to consent. In the Clearview AI case, there was no consent, and the disputed PIPEDA exception to the consent requirement was the one for ‘publicly available personal information’. While this exception seems broad on its face, to qualify, the information must fall within the parameters set out in the Regulations Specifying Publicly Available Personal Information. These regulations focus on certain categories of publicly available information – such as registry information (land titles, for example), court registries and decisions, published telephone directory information, and public business information listings. In most cases, the regulations provide that the use of the information must also relate directly to the purposes for which it was made public. The regulations also contain an exception for “personal information that appears in a publication, including a magazine, book or newspaper, in printed or electronic form, that is available to the public, where the individual has provided the information.”

The interpretation of this provision was central to Clearview AI’s defense of its practices. It argued that social media postings were “personal information that appears in a publication.” The Commissioners adopted a narrow interpretation consistent with this being an exception in quasi-constitutional legislation. They distinguished between the types of publications mentioned in the exception and uncurated, dynamic social-media sites. The Commissioners noted that unlike newspapers or magazines, individuals retain a degree of control over the content of their social media sites. They also observed that to find that all information on the internet falls within the publicly available information exception “would create an extremely broad exemption that undermines the control users may otherwise maintain over their information at the source.” (at para 65) Finally, the Commissioners observed that the exception applied to information provided by the data subject, but that photographs were scraped by Clearview AI regardless of whether they were posted by the data subject or by someone else.

Would the result be any different under Bill C-11? In section 51, Bill C-11 replicates the “publicly available information exception” for collection, use or disclosure of personal information. Like PIPEDA, it also leaves the definition of this term to regulations. However, Canadians should be aware that there has been considerable pressure to expand the regulations so that personal information shared on social media sites is exempted from the consent requirement. For example, in past hearings into PIPEDA reform, the House of Commons ETHI Committee at one point appeared swayed by industry arguments that PIPEDA should be amended to include websites and social media within this exception. Bill C-11 does not resolve this issue; but if passed, it might well be on the table in the drafting of regulations. If nothing else, the Clearview AI case provides a stark illustration of just how important this issue is to the privacy of Canadians.

However, data scrapers may be able to look elsewhere in Bill C-11 for an exception to consent. Bill C-11 contains new exceptions to consent for “business operations”, which I have criticized here. One of these exceptions would almost certainly be relied upon by a company in Clearview AI’s position if the bill were passed. The exceptions allow for the collection and use of personal information without an individual’s knowledge or consent if, among other things, it is for “an activity in the course of which obtaining the individual’s consent would be impracticable because the organization does not have a direct relationship with the individual” (s. 18(2)(e)). A company that scrapes data from social media sites to create a facial recognition database would find it impracticable to get consent because it has no direct relationship with any of the affected individuals. The exception seems to fit.

That said, s. 18(1) does set some general guardrails. The one that seems relevant in this case is that the exceptions to consent are only available where “a reasonable person would expect such a collection or use for that activity”. Hopefully, collection of images from social media websites to fuel facial recognition technology would not be something that a reasonable person would expect; certainly, the Commissioners would not find it to be so. In addition, section 12 of Bill C-11 requires that information be collected or used “only for purposes that a reasonable person would consider appropriate in the circumstances” (a requirement carried over from PIPEDA, s. 5(3)). In their findings, the Commissioners ruled that the collection and use of images by Clearview AI was for a purpose that a reasonable person would find inappropriate. The same conclusion could be reached under Bill C-11.

There is reason to be cautiously optimistic, then, that Bill C-11 would lead to the same result on a similar set of facts: the conclusion that the wholesale scraping of personal data from social media sites to build a facial recognition database without consent is not permitted. However, the scope of the exception in s. 18(2)(e) is still a matter of concern. The more exceptions that an organization pushing the boundaries feels it can wriggle into, the more likely it will be to engage in privacy-compromising activities. In addition, there may be a range of different uses for scraped data, and “what a reasonable person would expect” is a rather squishy buffer between privacy and wholesale data exploitation.

Enforcement

Bill C-11 is meant to substantially increase enforcement options when it comes to privacy. Strong enforcement is particularly important in cases where organizations are not interested in accepting the guidance of regulators. This is certainly the case with Clearview AI, which expressly rejected the Commissioners’ findings. Would Bill C-11 strengthen the regulator’s hand?

The Report of Findings in this case reflects the growing trend of having the federal and provincial commissioners that oversee private sector data protection laws jointly investigate complaints involving issues that affect individuals across Canada. This cooperation is important as it ensures consistent interpretation of what is meant to be substantially similar legislation across jurisdictions. Nothing in Bill C-11 would prevent the federal Commissioner from continuing to engage in this cross-jurisdictional collaboration – in fact, subsection 116(2) expressly encourages it.

Some will point to the Commissioner’s new order-making powers as another way to strengthen his enforcement hand. Under Bill C-11, the Commissioner would be able to direct an organization to take measures to comply with the legislation or to cease activities that are in contravention of the legislation (s. 92(2)). This is a good thing. However, these orders would be subject to appeal to the new Personal Information Protection and Data Tribunal (the Tribunal). By contrast, orders of the Commissioners of BC and Alberta are final, subject only to judicial review.

In addition, it is not just the orders of the Commissioner that are appealable under C-11, but also his findings. This raises questions about how the new structure under Bill C-11 might affect cooperative inquiries like the one in this case. Conclusions shared with other Commissioners can be appealed by respondents to the Tribunal, which owes no deference to the Commissioner on questions of law. As I and others have already noted, the composition of the Tribunal is somewhat concerning; Bill C-11 would require a minimum of only one member of the Tribunal to have expertise in privacy law. While it is true that proceedings before the Federal Court were de novo, and thus the Commissioner was afforded no formal deference in that context either, access to the Federal Court was more limited than the wide-open appeals route to the Tribunal. The Bill C-11 structure really does seem to shift the authority to interpret and apply the law away from the Commissioner and toward the mysterious and not necessarily expert Tribunal.

Bill C-11 also has a much-touted new power to issue substantial fines for breach of the legislation. Interestingly, however, this does not seem to be the kind of case in which a fine would be available. Fines, provided for under s. 93(1) of Bill C-11, are available only with respect to the breach of certain obligations under the statute (these are listed in s. 93(1)). Playing fast and loose with the requirement to obtain consent is not one of them. This is interesting, given the supposedly central place consent plays within the Bill. Further thought might need to be given to the list of ‘fine-able’ contraventions.

Overall, then, although C-11 could lead to a very similar result on similar facts, the path to that result may be less certain. It is also not clear that there is anything in the enforcement provisions of the legislation that will add heft to the Commissioner’s findings. In practical terms, the decisions that matter will be those of the Tribunal, and it remains to be seen how well this Tribunal will serve Canadians.

Published in Privacy

An interesting case from Quebec demonstrates the tension between privacy and transparency when it comes to public registers that include personal information. It also raises issues around ownership and control of data, including the measures used to prevent data scraping. The way the litigation was framed means that not all of these questions are answered in the decision, leaving some lingering public policy questions.

Quebec’s Enterprise Registrar oversees a registry, in the form of a database, of all businesses in Quebec, including corporations, sole proprietorships and partnerships. The Registrar is empowered to do so under the Act respecting the legal publicity of enterprises (ALPE), which also establishes the database. The Registrar is obliged to make this register publicly accessible, including remotely by technological means, and basic use of the database is free of charge.

The applicant in this case is OpenCorporates, a U.K.-based organization dedicated to ensuring total corporate transparency. According to its website, OpenCorporates has created and maintains “the largest open database of companies in the world”. It currently has data on companies located in over 130 jurisdictions. Most of this data is drawn from reliable public registries. In addition to providing a free, searchable public resource, OpenCorporates also sells structured data to financial institutions, government agencies, journalists and other businesses. The money raised from these sales finances its operations.

OpenCorporates gathers its data using a variety of means. In 2012, it began to scrape data from Quebec’s Enterprise Register. Data scraping involves the use of ‘bots’ to visit and automatically harvest data from targeted web pages. It is a common data-harvesting practice, widely used by journalists, civil society actors and researchers, as well as companies large and small. As common as it may be, it is not always welcome, and there has been litigation in Canada and around the world about the legality of data scraping practices, chiefly in contexts where the defendant is attempting to commercialize data scraped from a business rival.
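As a generic illustration of the technique (not of OpenCorporates’ own tooling), a minimal scraper might look like the sketch below. The URL and CSS selectors are placeholders, and a real scraper would also need to honour robots.txt, rate limits and site terms of service:

```python
# Illustrative sketch only: fetch a page and harvest structured records from
# it. The selectors below are hypothetical, not the Quebec register's markup.
import requests
from bs4 import BeautifulSoup

def scrape_listing(url: str) -> list[dict]:
    resp = requests.get(url, timeout=10)
    resp.raise_for_status()
    soup = BeautifulSoup(resp.text, "html.parser")
    records = []
    for row in soup.select("div.company-record"):  # hypothetical selector
        name = row.select_one(".name")
        address = row.select_one(".address")
        if name and address:
            records.append({
                "name": name.get_text(strip=True),
                "address": address.get_text(strip=True),
            })
    return records
```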

In 2016 the Registrar changed the terms of service for the Enterprise Register. These changes essentially prohibited web scraping activities, as well as the commercialization of data extracted from the site. The new terms also prohibit certain types of information analyses; for example, they bar searches for data according to the name and address of a particular person. All visitors to the site must agree to the Terms of Service. The Registrar also introduced technological measures to make it more difficult for bots to scrape its data.

Opencorporates Ltd. c. Registraire des entreprises du Québec is not a challenge to the Register’s new, restrictive terms and conditions. Instead, because the Registrar also sent OpenCorporates a cease and desist letter demanding that it stop using the data it had collected prior to the change in Terms of Service, OpenCorporates sought a declaration from the Quebec Superior Court that it was entitled to continue to use this earlier data.

The Registrar acknowledged that nothing in the ALPE authorizes it to control uses made of any data obtained from its site. Further, until it posted the new terms and conditions for the site, nothing limited what users could do with the data. The Registrar argued that it had the right to control the pre-2016 data because of the purpose of the Register. It argued that the ALPE established the Register as the sole source of public data on Quebec businesses, and that the database was designed to protect the personal information that it contained (i.e. the names and addresses of directors of corporations). For example, it does not permit extensive searches by name or address. OpenCorporates, by contrast, permits the searching of all of its data, including by name and address.

The court characterized the purpose of the Register as being to protect individuals and corporations that interact with other corporations by assuring them easy access to identity information, including the names of those persons associated with a corporation. An electronic database gives users the ability to make quick searches, and from a distance. Quebec’s Act to Establish a Legal Framework for Information Technology provides that where a document contains personal information and is made public for particular purposes, any extensive searches of the document must be limited to those purposes. This law places the onus on the person responsible for providing access to the document to put in place appropriate technological protection measures. Under the ALPE, the Registrar can carry out more comprehensive searches of the database on behalf of users, who must make their request to the Registrar. Even then, the ALPE prohibits the Registrar from using the name or address of an individual as the basis for a search. According to the Registrar, a member of the public has the right to know, once they have the name of a company, with whom they are dealing; they do not have the right to determine the number of companies to which a physical person is linked. By contrast, this latter type of search is one that could be carried out using the OpenCorporates database.
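The distinction the Registrar draws here is essentially a query-gatekeeping rule: lookups keyed to a company are permitted, while reverse searches keyed to an individual are not. A minimal sketch of such a gatekeeper, with entirely hypothetical field names, might look like this:

```python
# Illustrative sketch only: a query gatekeeper of the kind the ALPE scheme
# implies, allowing lookups by company while refusing searches keyed to an
# individual's name or address. Field names are hypothetical.

ALLOWED_KEYS = {"company_name", "enterprise_number"}
BARRED_KEYS = {"person_name", "person_address"}

def validate_query(query: dict) -> None:
    """Raise ValueError if the query uses a barred search basis."""
    for key in query:
        if key in BARRED_KEYS:
            raise ValueError(
                f"searches by {key!r} are not permitted: the register may not "
                "be used to list the companies linked to a given individual")
        if key not in ALLOWED_KEYS:
            raise ValueError(f"unsupported search key: {key!r}")

# Looking up a company by name is fine...
validate_query({"company_name": "Exemple Inc."})
# ...but a reverse search by a person's name would be refused:
# validate_query({"person_name": "Jane Doe"})  # raises ValueError
```

A database like OpenCorporates’, which indexes the same records without such a gatekeeper, is what makes the barred reverse search possible.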

The court noted that it was not its role to consider the legality of OpenCorporates’ database, nor to consider the use made by others of that database. It also observed that individuals concerned about potential privacy breaches facilitated by OpenCorporates might have recourse under Quebec privacy law. Justice Rogers’ focus was on the specific question of whether the Registrar could prevent OpenCorporates from using the data it gathered prior to the change of terms of service in 2016. On this point, the judge ruled in favour of OpenCorporates. In her view, OpenCorporates’ gathering of this data was not in breach of any law that the Registrar could rely upon (leaving aside any potential privacy claims by individuals whose data was scraped). Further, she found that nothing in the ALPE gave the Registrar a monopoly on the creation and maintenance of a database of corporate data. She observed that the use made by OpenCorporates of the data was not contrary to the purpose of the ALPE, which was to create greater corporate transparency and to protect those who interacted with corporations. She ruled that nothing in the ALPE obligated the Registrar to eliminate all privacy risks. The names and addresses of those involved with corporations are public information; the goal of the legislation is to facilitate digital access to the data while at the same time placing limits on bulk searches. Nothing in the ALPE prevented another organization from creating its own database of Quebec businesses. Since OpenCorporates did not breach any laws or terms of service in collecting the information between 2012 and 2016, nothing prevented it from continuing to use that information in its own databases. Justice Rogers issued a declaration to the effect that the Registrar was not permitted to prevent OpenCorporates from publishing and distributing the data it collected from the Register prior to 2016.

While this was a victory for OpenCorporates, it did not do much more than ensure its right to continue to use data that will become increasingly dated. There is perhaps some value in the Court’s finding that the existence of a public database does not, on its own, preclude the creation of derivative databases. However, the decision leaves some important questions unanswered. In the first place, it alludes to, but offers no opinion on, the ability to challenge the inclusion of the data in the OpenCorporates database on privacy grounds. While a breach of privacy argument might be difficult to maintain in the case of public data regarding corporate ownership, it is hard to predict how it might play out in court. This is far less sensitive data than that involved in the scraping of court decisions litigated before the Federal Court in A.T. v. Globe24h.com; there is a public interest in making the specific personal information available in the Registry; and the use made by OpenCorporates is far less exploitative than in Globe24h.com. Nevertheless, the privacy issues remain a latent difficulty. Overall, the decision tells us little about how to strike an appropriate balance between the values of transparency and privacy. The legislation and the Registrar’s approach are designed to make it difficult to track corporate ownership or involvement across multiple corporations. The result is rigorous protection of information with low privacy value and a strong public dimension, with transparency weakened as a result. It is worth noting that another lawsuit against the Register may be in the works. It is reported that the CBC is challenging the decision of the Registrar to prohibit searches by names of directors and managers of companies as a breach of the right to freedom of expression.

Because the terms of service were not directly at issue in the case, there is also little to go on with respect to the impact of such terms. To what extent can terms of service limit what can be done with publicly accessible data made available over the Internet? The recent U.S. case of hiQ Labs Inc. v. LinkedIn Corp. raises interesting questions about freedom of expression and the right to harvest publicly accessible data. This and other important issues remain unaddressed in what is ultimately an interesting but unsatisfying court decision.

 

Published in Privacy

Last year I attended a terrific workshop at UBC’s Allard School of Law. The workshop was titled ‘Property in the City’, and panelists presented work on a broad range of issues relating to law in the urban environment. A special issue of the UBC Law Review has just been published featuring some of the output of this workshop. The issue contains my own paper (discussed below and available here) that explores skirmishes over access to and use of Airbnb platform data.

Airbnb is a ‘sharing economy’ platform that facilitates the booking of short-term accommodation. The company is premised on the idea that many urban dwellers have excess space – rooms in homes or apartments – or have space they do not use at certain periods of the year (entire homes or apartments while on vacation, for example) – and that a digital marketplace can maximize efficient use of this space by matching those seeking temporary accommodation with those having excess space. The Airbnb web site claims that it “connects people to unique travel experiences at any price point” and at the same time “is the easiest way for people to monetize their extra space and showcase it to an audience of millions.”

This characterization of Airbnb is open to challenge. Several studies, including ones by the Canadian Centre for Policy Alternatives, the City of Vancouver, and the NY State Attorney General suggest that a significant number of units for rent on Airbnb are offered as part of commercial enterprises. The description also belies Airbnb’s disruptive impact. The re-characterization and commodification of ‘surplus’ private spaces neatly evades the regulatory frameworks designed for the marketing of short-term accommodation and leaves licensed short-term accommodation providers complaining that their highly regulated businesses are being undermined by competition from those not bearing the same regulatory burdens. At the same time, many housing advocates and city officials are concerned about the impact of platforms such as Airbnb on the availability and affordability of long-term housing.

These challenges are made more difficult to address by the fact that the data needed to understand the impact of platform companies, along with data about short-term rentals that would otherwise be captured through regulatory processes, are effectively privatized in the hands of Airbnb. Data deficits of this kind pose a challenge to governments, civil society and researchers.

My paper explores the impact of a company such as Airbnb on cities from the perspective of data. I argue that platform-based, short-term rental activities have a fundamental impact on what data are available to municipal governments who struggle to regulate in the public interest, as well as to civil society groups and researchers that attempt to understand urban housing issues. The impacts of platform companies are therefore not just disruptive of incumbent industries; they disrupt planning and regulatory processes by masking activities and creating data deficits. My paper considers some of the currently available solutions to the data deficits, which range from self-help type recourses such as data scraping to entering into data-sharing agreements with the platform companies. Each of these solutions has its limits and drawbacks. I argue that further action may be required by governments to ensure their data needs are adequately met.

Although this paper focuses on Airbnb, it is worth noting that the data deficits discussed in the paper are merely part of a larger context in which evolving technologies shift control over some kinds of data from public to private hands. Ensuring the ability of governments and civil society to collect, retain, and share data of a sufficient quality to both enable and enhance governance, transparency, and accountability should be a priority for municipal governments, and should also be supported by law and policy at the provincial and federal levels.

 

 

Skirmishes over the right to freely access and use “publicly available” data hosted by internet platform companies have led to an interesting decision from the U.S. District Court for the Northern District of California. The decision is on a motion for an interlocutory injunction, so it does not decide the merits of the competing claims. Nevertheless, it provides insight into a set of issues that are likely only to increase in importance as these rich troves of data are mined by competitors, opportunistic businesses, big data giants, researchers and civil society actors.

The parties in hiQ Labs Inc. v. LinkedIn Corp. are companies whose business models are based upon career-related personal information provided by professionals. LinkedIn offers a professional networking platform to over 500 million users, and it is easily the leading company in its space. hiQ, for its part, is a data analytics company with two main products aimed at enterprises. The first is “Keeper”, a product which informs corporations about which of their employees are at greatest risk of being poached by other companies. The second is “Skill Mapper”, which provides businesses with summaries of the skills of their employees. For both of its products, hiQ relies on data that it scrapes from LinkedIn’s publicly accessible web pages.

Data featured on LinkedIn’s site are provided by users who create accounts and populate their profiles with a broad range of information about their background and skills. LinkedIn members have some control over the extent to which their information will be shared by others. They can choose to limit access to their profile information to only their close contacts or to an expanded list of contacts. Alternatively, they can provide access to all other members of LinkedIn. They also have the option to make their profiles entirely public. These public profiles are searchable by search engines such as Google. It is the data in the fully public profiles that is scraped and used by hiQ.

hiQ is not the only company that scrapes data from LinkedIn as part of an independent business model. In fact, LinkedIn has only recently attempted to take legal action against a large number of users of its data. hiQ was just one of many companies that received a cease and desist letter from LinkedIn. Because being cut off from the LinkedIn data would effectively decimate its business, hiQ responded by seeking a declaration from the California court that its activities were legal. The recent decision from the court is in relation to hiQ’s request for an interlocutory injunction that will allow it to continue to access the LinkedIn data pending resolution of the substantive legal issues raised by both sides.

hiQ argued that in moving against its data scraping activities, LinkedIn engaged in unfair business practices, and violated its free speech rights under the California constitution. LinkedIn, for its part, argued that hiQ’s data scraping activities violated the Computer Fraud and Abuse Act (CFAA), as well as the digital locks provisions of the Digital Millennium Copyright Act (DMCA) (although these latter claims do not feature in the decision on the interlocutory injunction).

Like other platform companies, access to and use of LinkedIn’s site is governed by website Terms of Service (TOS). These TOS prohibit data scraping. When LinkedIn demanded that hiQ cease scraping data from its site, it also implemented technological protection measures to prevent access by hiQ to its data. LinkedIn’s claims under the CFAA and the DMCA are based largely on the circumvention of these technological barriers by hiQ.

The court ultimately granted the injunction barring LinkedIn from limiting hiQ’s access to its publicly available data pending the resolution of the issues in the case. In doing so, it expressed its doubts that the CFAA applied to hiQ’s activity, noting that if it did, it would “profoundly impact open access to the Internet.” It also found that attempts by LinkedIn to block hiQ’s access might be in breach of state law as anti-competitive behavior. In reaching its decision, the court had some interesting things to say about the importance of access to publicly accessible data, and the privacy rights of those who provided the data. These issues are highlighted in the discussion below.

In deciding whether to grant an interlocutory injunction, a court must assess both the possibility of irreparable harm and the balance of convenience as between the parties. In this case, the court found that denying hiQ access to LinkedIn data would essentially put it out of business – causing it irreparable harm. LinkedIn argued that it was imperative that it be allowed to protect its data because of its users’ privacy interests. While hiQ only scraped data from public profiles, LinkedIn argued that even those users with public profiles had privacy interests. It noted that 50 million of its users with public profiles had selected its “Do Not Broadcast” feature, which prevents profile updates from being broadcast to a user’s connections. LinkedIn described this as a privacy feature that would essentially be circumvented by routine data scraping.

The court was not convinced. In the first place, it found that there might be many reasons besides privacy concerns that motivated users to choose “Do Not Broadcast”. It gave as an example the concern by users that their connections not be spammed by endless notifications. The court also noted that LinkedIn had its own service for professional recruiters that kept them apprised of updates even from users who had implemented “Do Not Broadcast”. The court dismissed arguments by LinkedIn that this was different because users had consented to such sharing in their privacy policy. The court stated: “It is unlikely, however, that most users’ actual privacy expectations are shaped by the fine print of a privacy policy buried in the User Agreement that likely few, if any, users have actually read.” [Emphasis in original] This is interesting, because the court discounts the relevance of a privacy policy in informing users’ expectations of privacy. Essentially, the court finds that users who make their profiles public have no real expectation of privacy in that information. LinkedIn could therefore not rely on its users’ privacy interests to justify its actions.

In assessing whether the parties raised serious questions going to the merits of the case, the court considered LinkedIn’s arguments about the CFAA. The CFAA essentially criminalizes intentional access to a computer without authorization, or in a way that exceeds the authorization provided, with the result that information is obtained. The question, therefore, was whether hiQ’s continued access to the LinkedIn site after LinkedIn expressly revoked any permission and tried to bar its access, was a violation of the CFAA. The court dismissed the cases cited by LinkedIn in support of its position, noting that these cases involved unauthorized access to password protected sites as opposed to accessing publicly available information.

The court observed that the CFAA was enacted largely to deal with the problem of computer hacking. It noted that if the application of the law was extended to publicly accessible websites it would greatly expand the scope of the legislation with serious consequences. The court noted that this would mean that “merely viewing a website in contravention of a unilateral directive from a private company would be a crime.” [Emphasis in original] It went on to note that “The potential for such exercise of power over access to publicly viewable information by a private entity weaponized by the potential of criminal sanctions is deeply concerning.” The court placed great emphasis on the importance of an open internet. It noted that “LinkedIn, here, essentially seeks to prohibit hiQ from viewing a sign publicly visible to all”. It clearly preferred an interpretation of the CFAA that would be limited to unauthorized access to a computer system through some form of “authentication gateway”.

The court also found that hiQ raised serious questions that LinkedIn’s behavior might fall afoul of competition laws in California. It noted that LinkedIn is in a dominant position in the field of professional networking, and that it might be leveraging its position to get a “competitively unjustified advantage in a different market.” It also accepted that it was possible that LinkedIn was denying its competitors access to an essential facility that it controls.

The court was not convinced by hiQ’s arguments that the technological barriers erected by LinkedIn violated the free speech guarantees in the California Constitution. Nevertheless, it found that on balance the public interest favoured the granting of the injunction to hiQ pending the outcome of litigation on the merits.

This dispute is extremely interesting and worth following. There are a growing number of platforms that host vast stores of publicly accessible data, and these data are often relied upon by upstart businesses (as well as established big data companies, researchers, and civil society) for a broad range of purposes. The extent to which a platform company can control its publicly accessible data is an important question, and one which, as the California court points out, will have significant public policy ramifications. The related privacy issues – where the data is also personal information – are also important and interesting. These latter issues may be treated differently in different jurisdictions depending upon the applicable data protection laws.

Published in Privacy
