Text and Data Mining and UK Law

In 2014, the UK government introduced a number of changes to its 1988 Copyright Act. One of those changes related to Text and Data Mining (TDM): it introduced a new exception to copyright, i.e., it gave users permission to do things that were previously legally uncertain. The exception (Section 29A of the UK Copyright Act) allows researchers to make copies of any copyright material for the purpose of “computational analysis” (i.e., TDM) if they already have “lawful access”, for example because the researcher (or their employer) has purchased or subscribed to it. This exception only permits the making of copies for TDM for non-commercial research.

The exception permits any published or unpublished in-copyright work to be copied for the purpose of TDM. This includes sound, film/video, artistic works, journal articles, textual materials, tables and databases, as well as data. It overrides any contractual term that states you cannot undertake such copying and analysis.

This all sounds great, but there are two important caveats. The first is that the research must be non-commercial. What matters is the purpose of the research, not the type of organisation carrying it out. A commercial organisation can use the exception if the research in question is for non-commercial purposes. Conversely, a not-for-profit organisation such as a University cannot take advantage of the exception if the research in question is for a commercial purpose, e.g., with the intention of selling the results of the analysis. However, “non-commercial” is not defined in the law or by case law, which leaves some cases unclear. For example, what if a University researcher is doing research that is partly or fully funded by a for-profit company, which will have access to the results and may well use them to launch new commercial products?

The second caveat relates to potential damage to a vendor’s online system, such as one providing digital access to the full text of a range of journals. The exception states: “Publishers and content providers are able to apply reasonable measures to maintain their network security or stability.” Although the exception also states that “these measures should not prevent or unreasonably restrict researcher’s ability to text and data mine”, in practice many publishers have imposed limits on how much can be downloaded. I have yet to see any clear evidence from a publisher showing that TDM activities do slow down their systems, and I suspect their rules are designed simply to frustrate researchers. As it would be illegal to try to bypass any measures imposed by publishers, both researchers and the librarians who maintain such subscriptions on behalf of their users are very reluctant to permit heavy TDM activities.

In my view, researchers and librarians are being unnecessarily nervous about antagonising publishers on this issue. As a result, in practice the new exception has not helped researchers as much as was originally hoped.

Charles Oppenheim
