Digitization of protected works: Software for the detection of out-of-commerce works





Out-of-commerce works, Bibliographic catalogues, Bibliographic comparator, Web scraping, Parser, Out-of-print works, Computer programmes, Libraries, Book, Digitization.


The digitization of protected works for cultural preservation is gaining importance in light of the European Commission's latest proposal for a directive on copyright in the digital single market, which is pending final approval. The proposal may represent an opportunity for European libraries, which could create digital collections from works that are manifestly outside commercial channels. Doing so requires a set of computer programs capable of extracting information from catalogues and providing a first detection of the owners of the works. This research addresses the methodology for developing software that cross-checks library bibliographic catalogues against commercial catalogues in order to determine whether a library's books are present in or absent from the market. The paper explains the difficulties encountered during development, which stem from the heterogeneity of the catalogues consulted, and the solutions adopted. It concludes that building this type of application is feasible and very useful, since out-of-commerce works can be identified correctly in more than 90% of cases on average. However, problems remain in differentiating between editions and in interpreting false positives, which arise from factors such as the algorithms that commercial catalogues use to suggest works automatically.
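
The cross-checking step described above can be illustrated with a minimal sketch. It is not the software presented in the article: the function names, the record fields (`title`, `author`), and the 0.9 similarity threshold are all assumptions made for illustration. The sketch assumes that results from a commercial catalogue have already been retrieved (for example by web scraping) and parsed into dictionaries.

```python
import unicodedata
from difflib import SequenceMatcher


def normalize(text):
    """Lowercase, strip accents and punctuation, collapse whitespace."""
    decomposed = unicodedata.normalize("NFKD", text)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    cleaned = "".join(ch if ch.isalnum() else " " for ch in no_accents.lower())
    return " ".join(cleaned.split())


def similarity(a, b):
    """Similarity ratio between 0.0 and 1.0 on the normalized strings."""
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def is_in_commerce(record, commercial_results, threshold=0.9):
    """Return True if any commercial result matches the library record.

    `record` and each item of `commercial_results` are hypothetical dicts
    with "title" and "author" keys. The threshold is an assumed value.
    """
    for candidate in commercial_results:
        title_ok = similarity(record["title"], candidate["title"]) >= threshold
        author_ok = similarity(record["author"], candidate["author"]) >= threshold
        if title_ok and author_ok:
            return True
    return False


# Usage: a library record checked against two sets of scraped results.
library_record = {"title": "Don Quijote de la Mancha", "author": "Miguel de Cervantes"}
matching = [{"title": "Don Quijote de La Mancha.", "author": "Miguel de Cervantes"}]
suggested_only = [{"title": "Cien años de soledad", "author": "Gabriel García Márquez"}]

print(is_in_commerce(library_record, matching))
print(is_in_commerce(library_record, suggested_only))
```

Requiring both title and author to match is one simple way to discard unrelated works that a commercial catalogue suggests automatically, which the abstract names as a source of false positives. The sketch does not attempt to tell editions apart; that would need further fields such as publisher or year.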




