This is the comment I wrote after Nature reported that a big academic publisher now allows limited text mining of its papers. It was originally sent to the LibLicense list and has been reviewed by a lawyer who is an expert in copyright law.
As a data mining specialist, I’ve followed the various discussions about mining scholarly publications for some time, and I’ve noticed considerable confusion about the legal nature of text mining and the true origin of the restrictions on it. The discussions far too often turn to copyright law, which unnecessarily muddies the problem. Below is my take on this topic.
1) It’s important to observe that the current restrictions on text mining are technical, not legal. Publishers impose technical limits on how much content can be downloaded in a given period of time, and if someone downloads too much, the university may get cut off from the publisher’s servers. This is regulated legally, of course, but only by the agreement signed between the university and the publisher, not by general law, and least of all by copyright law. The exact terms are a matter of mutual agreement between the parties, who can agree on whatever they want, so blaming copyright for limited bandwidth to the publisher’s servers is unreasonable.
2) Restrictions apply to subscription content alone. There is no way to impose restrictions on …