Which of the metadata with relevance for bibliometrics are the same and which are different when switching from Microsoft Academic Graph to OpenAlex?

Authors

  • Thomas Scheidsteger Max Planck Institute for Solid State Research https://orcid.org/0000-0001-8351-2498
  • Robin Haunschild Max Planck Institute for Solid State Research

DOI:

https://doi.org/10.3145/epi.2023.mar.09

Keywords:

Subject classification, Fields of study, Concepts, Bibliographic data, Metadata, Document types, Citation analysis, Bibliometrics, Microsoft Academic Graph, MAG, OpenAlex

Abstract

With the announcement of the retirement of Microsoft Academic Graph (MAG), the non-profit organization OurResearch announced that they would provide a similar resource under the name OpenAlex. We therefore compare the metadata relevant to bibliometric analyses in the latest MAG snapshot with those in an early OpenAlex snapshot. Practically all works from MAG were transferred to OpenAlex, preserving their bibliographic data (publication year, volume, first and last page, and DOI) as well as the number of references, all of which are important ingredients of citation analysis. More than 90% of the MAG documents have equivalent document types in OpenAlex. Of the remaining ones, the reclassifications to the OpenAlex document types journal-article and book-chapter in particular seem to be correct and amount to more than 7%, so that the document type specifications have improved significantly from MAG to OpenAlex. As another item of bibliometrically relevant metadata, we looked at the paper-based subject classification in MAG and in OpenAlex. We found significantly more documents with a subject classification assignment in OpenAlex than in MAG. On the first and second levels, the classification structure is nearly identical. We present data on the subject reclassifications on both levels in tabular and graphical form. Assessing the consequences of the abundant subject reclassifications for field-normalized bibliometric evaluations is beyond the scope of the present paper. Apart from this open question, OpenAlex seems overall to be at least as suitable as MAG for bibliometric analyses for publication years before 2021, or maybe even better because of its broader coverage of document type assignments.

Published

2023-03-04

How to Cite

Scheidsteger, T., & Haunschild, R. (2023). Which of the metadata with relevance for bibliometrics are the same and which are different when switching from Microsoft Academic Graph to OpenAlex? Profesional de la información, 32(2). https://doi.org/10.3145/epi.2023.mar.09