A Canadian court has just handed down a decision in a case that interweaves interesting issues about copyright in data with issues around how the government can limit the scope of these rights in what it views as the public interest. The case is complex – it involves a large number of defendants and is tied to a range of other lawsuits relating to the regulatory regime for oil and gas exploration in Canada. The complexity of the case is such that I will divide my analysis over two blog posts. This – the first – will address the issues around whether there is copyright in the data submitted to the regulator; the second blog post will deal with the issues relating to the curtailment of the copyright within the context of the regulatory regime.
The plaintiff in this case and in the mass of related litigation is Geophysical Service Inc. (GSI). GSI is a Canadian company in the business of carrying out marine seismic surveys and licensing the data that it collects and compiles as a result of its activities. It claims that its flood of litigation around the copyright and regulatory regime issues results from the fact that the government’s approach is driving it out of business. As copyright is often touted as providing incentives to create and innovate, GSI’s precarious status as an innovator in this area sets an interesting context for the issues raised in the litigation.
In a nutshell, GSI – like other companies in this field – had to obtain a licence from the national regulator to conduct its expensive, time- and labour-intensive work. A condition of the licence was that the data it generated and processed into information products would be submitted to the appropriate regulatory bodies that oversee offshore oil and gas exploration. It is this data and the related information products that GSI claims are protected by copyright law. Under the statutes governing the regulatory process, data submitted to the regulator can be made public after a 5-year period. GSI was in the business of selling its data and information products to companies engaged in oil and gas exploration. GSI argued that the fact that the same data and analysis could be released to the public after 5 years, and was, as a matter of policy, released between 5 and 15 years after its submission, made its business ultimately unsustainable. It argued, therefore, that it had copyright in the data it collected and in the analytics it carried out on the data. It then argued that the regulator, by releasing this data to the public before the expiration of the copyright term, infringed its copyrights. It also maintained that the other private sector companies that made use of its data, obtained from the public sources, violated its copyrights.
The first issue, therefore, was whether the seismic data and related information products produced by GSI amounted to original works that could be protected by copyright law. It is a basic principle of copyright law that there can be no copyright in facts – facts are in the public domain. At the same time, however, it is possible to have copyright in a compilation of facts – so long as that compilation meets the requirements of originality. According to the Supreme Court of Canada in CCH Canadian v. Law Society of Upper Canada, originality requires that a work: a) is not copied; b) reflects an exercise of skill and judgment; and c) can be attributed to a human author. In this case, the defendants argued that the GSI data was ‘copied’ from the environment (i.e. it was factual material not protected by copyright law); that its collection and compilation did not involve sufficient skill and judgment because it was in part automated, and in part collected and compiled according to industry standards; and that the technology-assisted and highly human- and other resource-intensive process involved in its collection and compilation meant that it did not originate from an identifiable human author.
Justice Eidsvik of the Alberta Court of Queen’s Bench found resoundingly for the plaintiff on the copyright issues. She carefully considered the manner in which the seismic data was both collected and processed. She found that both the raw data and the processed data constituted “works” within the meaning of the Copyright Act. She analogized the raw seismic data to a literary work or a literary compilation. She also found that some of the seismic sections – data represented as squiggly lines – would fall within the definition of an artistic work. Both “works” in this case met the necessary threshold for originality. She found that the creation and compilation of the seismic data required significant levels of skill, noting that “The data is created, not merely collected, through the intervention of human skill” (at para 79). The collection of this seismic data requires a complex series of choices. She accepted the analogy that it was like taking a photograph. Justice Eidsvik observed:
In this case, the photograph is not just a quick snapshot; rather, it is one that requires careful selection of the location, angle of technological instruments (e.g. the size and depth of the airguns, the length and depth of the streamers, and the number and placement of hydrophones), and finally the filtering and refining of the product. (at para 80)
She also found apt an analogy from one of the expert witnesses between the creation of the data and the conducting of a symphony, where the conductor “ensures that some instruments are played louder, or softer, or faster or slower, to make a beautiful creation. The same types of decisions are made on board the seismic acquisition ship to obtain “beautiful” raw seismic data.” (at para 81)
Having found copyright in the compilation of raw data, it is not surprising that the judge also found copyright in the processed data. She found that substantial skill and judgment went into the processing of the data, stating that “The raw data is not simply pumped into a computer and a useful product comes out.” (at para 83) She found that the quality of the processed data is very much dependent upon the participation of a skilled processor, and that different companies would produce different processed data from the same raw data depending upon the skill of the processor involved.
Justice Eidsvik also found that the requisite human author was present. In doing so, she addressed the Telstra decision from Australia’s Federal Court, which had found no copyright in a telephone directory in part because it was created following a largely automated process in which there was relatively little human input. In this case, she found the human input to be a significant factor in determining the quality of the output at both the acquisition stage and the processing stage. She reviewed the few Canadian cases involving compilations of data, noting that in cases where human input is more significant in terms of the choices made in arranging the facts, the courts accept that the compilation is original.
Justice Eidsvik rejected the argument that it is necessary to identify a specific human author in order to find copyright in a complex factual work. She accepted that a team of “authors” could create a factual compilation. Nevertheless, she was also prepared to identify in this case the head of the seismic crew on the ship as the author of the raw data and the person in charge of the computing as the author of the processed data. She noted as well that in this case the actual owner of the copyright would be the employer of both of these individuals – GSI.
In finding copyright in both the raw and the processed data, Justice Eidsvik was careful to note that she was not deviating from the principle that there could be no copyright in facts or ideas. She found that the “seismic data is an expression of GSI’s views of what the image of the subsurface of the surveyed areas represents.” (at para 97) The raw facts – the features of the subsurface – are there for anyone to see and are in the public domain – but the data collected about those facts is authored. Critical data theorists will recognize here the seeds of the essential subjectivity of collected data, where choices are made as to how to collect the data, and according to what parameters.
Justice Eidsvik also rejected the idea that the works at issue lacked originality because their collection and compilation were dictated by “practical considerations, utility or externally imposed requirements.” (at para 105) Notwithstanding the presence of industry standards that would influence some of the decision-making involved in the collection and processing of the data, she found that “the original skill and judgment that comes to bear on the final product of the seismic work far outweighs the portion of “hard wired” industry standards in play.” (at para 105)
Based on the facts of this case, it is not surprising that Justice Eidsvik would conclude that there was copyright in both the compilation of seismic data and in the processed data. Her extensive review of the process by which the data is first collected and then processed reveals a substantial amount of skill and judgment. In a “datified” society, the decision may give some comfort to those who collect and process all manner of data: their products – whether compilations of raw data or processed data (analytics) – are works that can be protected under copyright law. Such protection will depend on an ability to show that the collection and/or processing involves choices motivated by skill and judgment, rather than mechanical decision-making or compliance with industry norms or standards.
While for GSI it was a victory to have copyright confirmed in its data products, the victory was largely Pyrrhic. The second part of the decision – and the part that I will consider in a subsequent blog post – deals with the regulatory regime, which the court ultimately finds to have effectively expropriated this copyright interest. Stay tuned!