The U.S. has cleared the way for the use of citizen science by federal government agencies and departments in a new law titled the American Competitiveness and Innovation Act (ACIA) (awaiting presidential signature).

The ACIA as a whole should be of interest to Canadians, as it lays out the principles for how the National Science Foundation (NSF) in the United States should approach its mandate to support scientific research. Earlier bills failed to reach acceptable compromises; some of these would have restricted types of scientific research funded by the NSF to specific sectors. This has echoes of the controversial choices in Canada under the previous government to focus on applied rather than basic scientific research. The American Competitiveness and Innovation Act has moved away from this narrow approach and sets out two main criteria for funding scientific research: intellectual merit and broader public impacts.

The ACIA contains a distinct section titled the Crowdsourcing and Citizen Science Act (CCSA) which paves the way for the use by government agencies and departments of scientific research practices based upon distributed public participation. The CCSA defines citizen science as “a form of open collaboration in which individuals or organizations participate voluntarily in the scientific process in various ways.” (§402(3)(c)(1)) The level of participation can vary, and may include public participation in the development of research questions or in project design, in conducting research, in collecting, analyzing or interpreting data, in developing technologies and applications, in making discoveries and in solving problems. In its preamble, the CCSA acknowledges some of the unique benefits of crowd-sourced research, including cost-effectiveness, providing hands-on learning opportunities, and encouraging greater citizen engagement.

The CCSA specifically empowers the heads of federal science agencies to make use of crowdsourcing and citizen science to conduct research projects that will advance their missions. It enables the use of volunteers in research – something that might otherwise become entangled in red tape. The Act also directs agencies to draft appropriate policies to govern participant consent, and to address “privacy, intellectual property, data ownership, compensation, service, program and other terms of use to the participant in a clear and reasonable manner.” (§402(4))

Significantly, the CCSA also mandates that any data collected through citizen science research enabled under the legislation should be made available to the public as open data in a machine-readable format unless to do so is against the law. It also requires the agency to provide notifications to the public about the expected use of the data, any ownership issues relating to the data, and how the data will be made available to the public. (I note that these issues are addressed in my co-authored guide Managing Intellectual Property Rights in Citizen Science published by the Wilson Center Commons Lab.) The statute also encourages agencies, where possible, to make any technologies, applications or code that are developed as part of the project available to the public. This legislated commitment to open research data and open source technology is an important public policy statement.

One barrier to the use of crowdsourcing and citizen science in the government context is the fear of liability within the risk-averse culture of governments. The CCSA addresses this by providing that participants in citizen science projects enabled under the statute agree to assume all risks of participation, and to waive any claims of liability against the federal government or its agencies.

The CCSA permits federal agencies to partner with community groups, other government agencies, or the private sector for the purposes of carrying out citizen science research. After a two-year grace period, the statute also requires the filing of reports on any citizen science or crowdsourcing projects carried out under the CCSA, and contains detailed requirements for the content of any such report.

The inclusion in this science and innovation bill of provisions that are specifically designed to facilitate and encourage the use of citizen science by governments is a significant development. It is one that should be of interest to a federal government in Canada that is attempting to carve out space for itself as open, pro-science and keen to engage citizens. Citizen science has significant potential in many fields of scientific research; it also brings with it benefits in terms of education, citizen engagement, and community development.


Monday, 19 December 2016 08:52

Open licensing of real time data

Written by Teresa Scassa

Municipalities are under growing pressure to become “smart”. In other words, they are expected to reap the benefits of sophisticated data analytics carried out on more and better data collected via sensors embedded throughout the urban environment. As municipalities embrace smart cities technology, a growing number of the new sensors will capture data in real time. Municipalities are also increasingly making their data open to developers and civil society alike. If municipal governments decide to make real-time data available as open data, what should an open real-time data license look like? This is a question Alexandra Diebel and I explore in a new paper just published in the Journal of e-Democracy.

Our paper looks at how ten North American public transit authorities (6 in the U.S. and 4 in Canada) currently make real-time GPS public transit data available as open data. We examine the licenses used by these municipalities both for static transit data (timetables, route data) and for real-time GPS data (for example, data about where transit vehicles are along their routes in real time). Our research reveals differences in how these types of data are licensed, even when both types of data are referred to as “open” data.

There is no complete consensus on the essential characteristics of open data. Nevertheless, most definitions require that to be open, data must be: (1) made available in a reusable format; (2) prepared according to certain standards; and (3) available under an open license with minimal restrictions or conditions imposed on reuse. In our paper, we focus on the third element – open licensing. To date, most of what has been written about open licensing in general, and the licensing of open data in particular, has focused on the licensing of static data. Static data sets are typically downloaded through an open data portal in a one-time operation (although static data sets may still be periodically updated). By contrast, real-time data must be accessed on an ongoing basis and often at fairly short intervals such as every few seconds.

The need to access data from a host server at frequent intervals places a greater demand on the resources of the data custodian – in this case often cash-strapped municipalities or public agencies. The frequent access required may also present security challenges, as servers may be vulnerable to distributed denial-of-service attacks. In addition, where municipal governments or their agencies have negotiated with private sector companies for the hardware and software to collect and process real-time data, the contracts with those companies may require certain terms and conditions to find their way into open licenses. Each of these factors may have implications for how real-time data is made available as open data. The greater commercial value of real-time data may also motivate some public agencies to alter how they make such data available to the public.
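A rough, back-of-the-envelope sketch can illustrate the difference in server load between static and real-time access. The figures used here (500 client applications, a 5-second polling interval, and one daily download for static data) are hypothetical assumptions chosen only for illustration; they are not drawn from the paper, and the function name is likewise invented for this example.

```python
def requests_per_day(poll_interval_seconds: float, clients: int) -> int:
    """Estimate daily requests when every client polls at a fixed interval.

    Both parameters are illustrative assumptions, not measured values.
    """
    if poll_interval_seconds <= 0:
        raise ValueError("poll_interval_seconds must be positive")
    seconds_per_day = 24 * 60 * 60
    polls_per_client = int(seconds_per_day // poll_interval_seconds)
    return polls_per_client * clients


# Hypothetical scenario: 500 client applications.
static_load = requests_per_day(24 * 60 * 60, 500)  # one download per day each
realtime_load = requests_per_day(5, 500)           # one request every 5 seconds

print(f"static data:    {static_load:,} requests/day")
print(f"real-time data: {realtime_load:,} requests/day")
```

Under these assumed numbers, the same set of users generates several orders of magnitude more requests for real-time data than for a static data set. This is the kind of resource pressure that may lead a data custodian to add conditions such as API registration.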

While our paper focuses on real-time GPS public transit data, similar issues will likely arise in a variety of other contexts where ‘open’ real-time data are at issue. We consider how real-time data is licensed, and we identify additional terms and conditions that are imposed on users of ‘open’ real-time data. While some of these terms and conditions might be explained by the particular exigencies of real-time data (such as requirements to register for the API to access the data), others are more difficult to explain. Our paper concludes with some recommendations for the development of a standard for open real-time data licensing.

This paper is part of ongoing research carried out as part of Geothink, a partnership grant project funded by the Social Sciences and Humanities Research Council of Canada.


Note: I was invited by Canada’s Information Commissioner and the Schools of Journalism and Communication, and Public Policy and Administration at Carleton University to participate in a workshop to launch Right to Know Week 2016. This was a full afternoon workshop featuring many interesting speakers and discussions. This blog post is based on my remarks at this event.

For the last 5 years or so, governments at all levels across Canada have been embracing the open government agenda. In doing so, they have expressed, in various ways, new commitments to open data, to the proactive disclosure of government information, and to new forms of citizen engagement. Given that the core goals of the open government movement are to increase government transparency and accountability in the broader public interest, these developments are positive ones.

There is a risk, however, that public commitments to open government have become a bit of a ‘feel good’ thing for governments. After all, what government doesn’t want to publicly commit to being open, transparent and accountable? As a result, it is important to look behind the rhetoric and to examine the nature of the commitments made to open government in Canada and to question how meaningful and enduring they really are.

For the most part, commitments to open government in Canada have been manifested in declarations, policy documents, and directives. These documents express government policy and provide direction to government actors and institutions. Yet they are “soft law” at best. They are not enacted through a process of legislative debate, they are not expressed in laws that would have to be formally repealed or amended in order to be altered, there are no enforcement or compliance mechanisms, and they remain subject to change at the whim of the government in power. Directives and policies, of course, can provide rapid and responsive mechanisms for operationalizing changes in government direction, and so I am not criticizing decisions to set open government in motion through these various means. But I am suggesting that a longer term commitment to open government might require some of these measures to be expressed in and supported by legislation in order to become properly entrenched.

For example, much effort has been invested by the federal government in creating an open licence to facilitate reuse of government data and information. After a slow and sometimes painful process, we now have a pretty good open government licence. It is based on the UK OGL and is very user friendly compared to earlier iterations. It is bilingual and it can be customized to be used by governments at all levels in Canada (for example, a version of this licence was just adopted by the City of Ottawa). This reduces the burden on provincial and municipal governments contemplating open government and it creates the potential for greater legal interoperability (when users combine data or information from a number of different governments in Canada).

But let us not forget why we need an open government licence in Canada. An open licence permits the public to make use of works that are protected by copyright without the need to ask permission or pay royalties, and with as few restrictions on reuse as possible. Government works in Canada – and this includes court decisions, statutes, Hansard, government reports, studies, to name just a few – are protected by copyright under section 12 of the Copyright Act. One might well ask why, instead of toiling for years to come up with the current open licence, the government has not shown its commitment to openness by abolishing Crown copyright. It’s not as radical as it might sound. In the U.S., s. 105 of the Copyright Act expressly denies protection to works of the U.S. government without any obvious negative consequences. In the U.S., these works are automatically in the public domain. This legislated, hard law solution makes the commitment real and relatively permanent. Yet as things stand in Canada, government works are protected by copyright by default, and governments choose which works to make available under the open licence and which they wish to provide under more onerous licence terms. They can also decide at some point to tear up the open licence and go back to the way things used to be. Crown copyright in its current incarnation sets the default at ‘closed’.

It is true that some aspects of open government are already part of our legislative framework. We have had freedom of information/access to information laws for decades now in Canada, and these laws enshrine the principle of the public’s right to access information in the hands of government. However, the access to information laws that we have are ‘first generation’ when it comes to open government. The federal Act is currently being reviewed by Parliament, and we might see some legislative change, though how much and how significant remains to be seen. As Mary Francoli has pointed out, there wasn’t really a need for further review – the new government had plenty of material on which to take action in proposing amendments to the Act.

The many deficiencies in the Access to Information Act have been well documented. For example, in 2015 the Information Commissioner set out 85 proposed reforms to the statute to modernize and improve it. The June 2016 Report by the Standing Committee on Access to Information, Privacy and Ethics on its Review of the Access to Information Act takes up many of these proposals in its own recommendations for extensive reforms to the Act. We are now awaiting the government’s response to this report. Rather than review the many recommendations already made, I will highlight those that relate to my broader point about enshrining open government principles in legislation.

The Access to Information Act as it currently stands is premised on a model of individuals asking for information from government, waiting patiently while government puts together the requested information, and then complaining to the Commissioner when too much information is redacted or withheld. Open government promises both information and data proactively, in reusable formats, and without significant restrictions on reuse. While proactive disclosure of information and open data cannot replace the access to information model (which is, itself, capable of considerable improvement), they will provide quicker, cheaper and more effective access in many areas. Yet the Access to Information Act does not currently contain any statement about proactive disclosure. Proactive disclosure – also referred to as “open by default” – is not really “open by default” unless the law says it is. Until then, it is just an aspirational statement and not a legal requirement. We see a proliferation of policies and directives at all levels of government that talk about proactive disclosure, but there are no firm legal commitments to this practice, or to open data. And, although I have been focussing predominantly on the federal regime, these issues are relevant across all levels of government in Canada.

A core principle of open data is that the data sets provided by governments should be made available in open, accessible and reusable formats. Proactive disclosure of information should also be in reusable formats. Access under the conventional regime is also enhanced when the information disclosed is in formats that facilitate analysis and reuse. Yet even under the existing access model, there is no default requirement to provide requested information in open, accessible and reusable formats. It is important to remember that it is not enough just to provide ‘access’ – the nature and quality of the access provided is relevant. The format in which information is provided in a digital age can create a barrier to the processing or analysis of information once accessed.

I would like, also, to venture onto territory that is not addressed in the calls for reform to access to information laws. Another challenge that I see for open data (and open information) in Canada relates to the sources of government data. I am concerned about the lack of controls over the use of taxpayer dollars to create closed data. As we move into the big data era, governments will be increasingly tempted to source their data for decision-making from private sector suppliers rather than to generate it in-house. We are seeing this already; an example is found in recent decisions of some municipal governments to source data about urban cycling patterns from cycling app companies. There will also be instances where governments contract with the private sector to install sensors to collect data, or to process it, and then pay licence fees for access to the resulting proprietary data in the hands of the private sector companies. In these cases, the terms of the license agreements may limit public access to the data or may place significant restrictions on its reuse. This is a big issue. All the talk about open government data will not do much good if the data on which the government relies is not characterized as “government data”. It is important that governments develop transparent policies around contracts for the collection, supply or processing of data that ensure that our rights as members of the public to access and reuse this data – paid for with our tax dollars – are preserved. Even better, it might be worth seeing some principle to this effect enshrined in the law.

Wednesday, 07 September 2016 08:45

New Report on Licensing Digitized Traditional Knowledge

Written by Teresa Scassa

A new report from uOttawa’s Canadian Internet Policy and Public Interest Clinic (CIPPIC) prepared in collaboration with Carleton’s Geomatics and Cartographic Research Centre (GCRC) proposes a strategy for protecting traditional knowledge that is shared in the digital and online context. The report proposes the use of template licences that will allow Indigenous communities to set the parameters for information sharing consistent with cultural norms.

Traditional knowledge – defined by the World Intellectual Property Organization as “the intellectual and intangible cultural heritage, practices and knowledge systems of traditional communities, including indigenous and local communities” – is poorly protected by contemporary intellectual property (IP) regimes. At the root of the failed protection is the reality that Western IP systems were designed according to a particular vision of creativity and innovation rooted in the rise of the industrial revolution. They are a product of a particular social, economic and ideological environment and do not necessarily transplant well to other contexts.

The challenge of protecting indigenous cultural objects, practices and traditional knowledge has received considerable attention – at least on the international stage – as it is a problem that has been exacerbated by globalization. There are countless instances where multinational corporations have used traditional knowledge or cultural heritage to their profit – and without obvious benefit to the source communities. Internationally, the Nagoya Protocol on Access and Benefit Sharing seeks to provide a framework for the appropriate sharing of traditional knowledge regarding plant and genetic resources. Innovative projects such as Mukurtu provide a licensing framework for Indigenous digital cultural heritage. What CIPPIC’s report tackles is a related but distinct issue: how can Indigenous communities share traditional knowledge about themselves or their communities while still maintaining a measure of control that is consistent with their cultural norms regarding that information?

For years now, the GCRC has worked with Indigenous communities in Canada to provide digital infrastructure for cybercartographic atlases that tell stories about those communities and their land. These multimedia atlases offer rich, interactive experiences. For example, the Inuit Siku (Sea Ice) Atlas documents Inuit knowledge of sea ice. The Lake Huron Treaty Atlas is a complex multimedia web of knowledge that is still evolving. These atlases are built upon an open platform developed by the GCRC and that can be adapted by interested communities.

The GCRC sought out the assistance of CIPPIC to explore the possibility of creating a licensing framework that could assist Indigenous communities in setting parameters for the sharing and reuse of their traditional knowledge in these contexts. The idea was to reduce the burden of information management for those sharing information and for those seeking to use it through a series of template licences that can be adapted by communities to suit particular categories of knowledge and contexts of sharing. This is a complex task, and there remains much work to be done, but what CIPPIC proposes offers a glimpse into what might be possible.

A 2016 European Commission report titled Survey report: data management in Citizen Science projects provides interesting insights into how such projects manage the data they collect. Proper management is, of course, essential to ensure that the collected data can be used and reused by project leaders as well as by other downstream users. It is relevant as well to the protection of the privacy of citizen participants. The authors of this report surveyed a large number of citizen science projects. From the 121 responses received they distilled findings that explore the diversity of the citizen science projects, and that reveal a troubling lack of thorough data management practices. A significant shortcoming for many projects was the lack of appropriate data licences to govern reuse of either raw or aggregate data collected.

There has been growing pressure on those carrying out research using public resources to make the fruits of the research – including the research data – publicly available for consultation, verification or reuse. But doing so is not as simple as a binary open/closed choice. There are a number of different questions that researchers must address: Should the raw data be made open or only the aggregate data? Should it be immediately available or available only after an embargo period? Is all data suitable for release or should some be protected for public policy reasons (such as protecting privacy)? And what, if any, terms and conditions should be imposed on reuse?

The authors of the EC report, Sven Schade and Chrysi Tsinaraki, found that overall there was a relatively high level of data sharing from citizen science projects. Significantly, 38% of the respondents to their survey provided access to their raw data; 37% provided access to aggregate data and 30% provided access to both. One interesting observation in this respect was that 68% of those respondents who provided access to their raw data also included within this dataset personal identifiers of citizen contributors to the project. Such data might be advertently collected, as where individuals provide personal information with their data uploads. In some cases, the scope of personal information might be significant. Contributions to a project might include geolocation information and geodemographic details. Schade and Tsinaraki asked respondents about their practices when it came to obtaining informed consent to data collection from project participants; they found that 25% of respondents did not obtain such consent whereas 53% relied upon a generic terms of use document to obtain consent. It was not entirely clear whether the consent being sought related to privacy issues or to obtaining any necessary rights to use or disseminate the data being collected (which might, for example, include copyright protected photographs). In any event, the results of the survey suggest that there is a significant lack of attention to both privacy and IP rights issues in citizen science projects.

On the issue of data licensing, Schade and Tsinaraki found that the conditions imposed on reuse by different projects varied. A majority of those who made data available believed that the data was in the public domain, while others imposed conditions such as non-commercial or share-alike restrictions. When asked which license they used to achieve these goals, 32 out of 56 respondents indicated that they used one of the commonly available template licences such as Creative Commons or Open Data Commons. A surprising number of respondents indicated that no particular licence was used. While data released in this way might be presumed to be “open”, the usefulness of the data might well be hampered by a lack of clarity regarding the scope of permitted reuse.

In addition to providing access to data, the authors of the Report asked whether citizen science researchers allowed open access to research results (presumably in the form of published papers and other output). While the overwhelming majority of projects indicated that they used open access options (ranging from public domain dedication to open access with conditions), Schade and Tsinaraki also found that 14 of the projects they considered used licences with terms that were not consistent with the reuse conditions that the researchers had identified. Clearly there is a need for greater support for projects in developing or choosing appropriate licences.

Although many of the projects indicated that they provided access to their data, the duration of that access was less certain. The authors found that 42% of projects intended to guarantee access to their data only within the lifespan of the project. The authors also found that 40% of projects that provide data access do not provide comprehensive metadata along with the data. This would certainly limit the value of the data for reuse. Both these issues are important in the context of citizen science projects, which are often grant-funded and temporally limited. The ability to archive and preserve research data and to make it available for meaningful access and reuse should be part of researchers’ data management plans, and is something which should be supported by research institutions and funding agencies.

Overall, the Report provides data that suggests that the burgeoning field of citizen science needs more support when it comes to all aspects of data management. Proper data management practices will help citizen science researchers to meet their own objectives, to share their data effectively and appropriately, and to protect the rights and interests of participants.

Note: In 2015 I drafted a report, with Haewon Chung, for the Wilson Center Commons Lab titled Managing Intellectual Property Rights in Citizen Science. This report addresses many licensing issues related to the collection, sharing and reuse of citizen science data and outputs. It is available under a Creative Commons Licence.


