Which of the metadata with relevance for bibliometrics are the same and which are different when switching from Microsoft Academic Graph to OpenAlex?
DOI:
https://doi.org/10.3145/epi.2023.mar.09
Keywords:
Subject classification, Fields of study, Concepts, Bibliographic data, Metadata, Document types, Citation analysis, Bibliometrics, Microsoft Academic Graph, MAG, OpenAlex
Abstract
Following the announcement that Microsoft Academic Graph (MAG) would be retired, the non-profit organization OurResearch announced that it would provide a similar resource under the name OpenAlex. We therefore compare the metadata relevant to bibliometric analyses in the latest MAG snapshot with those in an early OpenAlex snapshot. Practically all works from MAG were transferred to OpenAlex with their bibliographic data preserved (publication year, volume, first and last page, DOI) along with the number of references, all of which are important ingredients of citation analysis. More than 90% of the MAG documents have equivalent document types in OpenAlex. Among the remaining documents, the reclassifications to the OpenAlex document types journal-article and book-chapter in particular seem to be correct and amount to more than 7%, so that the document type specifications have improved significantly from MAG to OpenAlex. As a further item of bibliometrically relevant metadata, we looked at the paper-based subject classification in MAG and in OpenAlex. We found significantly more documents with a subject classification assignment in OpenAlex than in MAG. On the first and second levels, the classification structure is nearly identical. We present data on the subject reclassifications on both levels in tabular and graphical form. Assessing the consequences of the abundant subject reclassifications for field-normalized bibliometric evaluations is beyond the scope of the present paper. Apart from this open question, OpenAlex seems to be overall at least as well suited for bibliometric analyses as MAG for publication years before 2021, and possibly better suited because of its broader coverage of document type assignments.
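The document type comparison described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' code: the function name `compare_doc_types`, the equivalence table `EQUIVALENT_TYPES`, and the toy records are all assumptions made for illustration, and the mapping shown is not claimed to be the one used in the paper. The sketch cross-tabulates the document types of works present in both snapshots and reports the share whose types count as equivalent.

```python
from collections import Counter

# Hypothetical equivalence table between MAG and OpenAlex document types.
# Illustrative only; not the mapping used in the paper.
EQUIVALENT_TYPES = {
    "Journal": "journal-article",
    "BookChapter": "book-chapter",
    "Book": "book",
}


def compare_doc_types(mag_types, openalex_types):
    """Cross-tabulate document types for works present in both snapshots.

    Both arguments map a work ID to a document type (None if unassigned).
    Returns (share_equivalent, transitions), where transitions counts
    (mag_type, openalex_type) pairs over the shared work IDs.
    """
    shared_ids = mag_types.keys() & openalex_types.keys()
    transitions = Counter(
        (mag_types[work_id], openalex_types[work_id]) for work_id in shared_ids
    )
    # A pair is equivalent only if the MAG type is assigned and maps
    # to the observed OpenAlex type.
    equivalent = sum(
        count
        for (mag_type, oa_type), count in transitions.items()
        if mag_type in EQUIVALENT_TYPES and EQUIVALENT_TYPES[mag_type] == oa_type
    )
    share = equivalent / len(shared_ids) if shared_ids else 0.0
    return share, transitions


# Toy records, invented for illustration (work ID -> document type).
mag = {1: "Journal", 2: "Journal", 3: None, 4: "BookChapter", 5: "Book"}
openalex = {
    1: "journal-article",
    2: "journal-article",
    3: "journal-article",
    4: "book-chapter",
    6: "book",
}

share, transitions = compare_doc_types(mag, openalex)
print(f"Share with equivalent document types: {share:.0%}")
for (mag_type, oa_type), count in sorted(transitions.items(), key=str):
    print(f"{mag_type} -> {oa_type}: {count}")
```

In the toy data, work 3 has no document type in MAG but is a journal-article in OpenAlex, which mirrors the kind of reclassification the abstract reports. The transition counts are also the input an alluvial-style diagram would need.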
Copyright (c) 2023 Profesional de la información
This work is licensed under a Creative Commons Attribution 4.0 International License.
Dissemination conditions of the articles once they are published
Authors can freely disseminate their articles on websites, social networks and repositories.
However, the following conditions must be respected:
- Only the editorial version should be made public. Please do not publish preprints, postprints or proofs.
- Along with this copy, a specific mention of the publication in which the text has appeared must be included, also adding a clickable link to the URL: http://www.profesionaldelainformacion.com
- Only the final editorial version should be made public. Please do not publish preprints, postprints or proofs.
- Along with that copy, a specific mention of the publication in which the text has appeared must be included, also adding a clickable link to the URL: http://revista.profesionaldelainformacion.com
Profesional de la información journal offers the articles in open access with a Creative Commons BY license.