Back in 2014, Geoffrey Bilder blogged about the kick-off of an initiative between Crossref and Wikimedia to better integrate scholarly literature into the world’s largest knowledge space, Wikipedia. Since then, Crossref has been working to coordinate activities with Wikimedia: Joe Wass has worked with them to create a live stream of content being cited in Wikipedia; and we’re including Wikipedia in Event Data, a new service to launch later this year. In that time, we’ve also seen Wikipedia’s importance grow in terms of the volume of DOI referrals.
How can we keep this momentum going and continue to improve the way we link Wikipedia articles with the formal literature? We invited Alex Stinson, a project manager at The Wikipedia Library (and one of our first guest bloggers) to explain more:
Wikipedia provides the most public gateway to academic and scholarly research. With millions of citations to academic as well as non-academic but reliable sources, like those produced by newspapers, its ecosystem of 5 million English Wikipedia articles and 35 million articles in hundreds of languages provides the first stop for researchers in both scholarly and informal research situations. The practice of “checking Wikipedia” has become ubiquitous in a number of fields; for example, Wikipedia is the most visited source of medical information online, even providing the first stop for many medical students and medical practitioners when looking for medical literature.
The Wikipedia Library program helps Wikipedia’s volunteer editors access and use the best sources in their research and citations. Through partnerships with over fifty leading publishers and aggregators, like JSTOR, Project Muse, Elsevier, Newspapers.com, Highbeam, Oxford University Press and others, we have been able to give over 3000 of our most prolific volunteers access to over 5500 accounts. These are clear, win-win relationships where Wikipedia editors get to use these databases to improve Wikipedia, while in turn linking to authoritative resources and enhancing their discovery.
JSTOR has been working with us since 2012, providing over 500 accounts to our editors. Kristen Garlock at JSTOR writes:
“We’re very happy to collaborate with the Wikipedia Library to provide JSTOR access to Wikipedia editors. Supporting the initiative to increase editor access to scholarly resources and improve the quality of information and sources on Wikipedia has the potential to help all Wikipedia readers. In addition to providing more discoverability for our institutional subscribers, introducing new audiences to the scholarship on JSTOR will help them discover access opportunities like our Register & Read program.”
There are strong signals that Wikipedia’s role in the citation ecosystem helps ensure the best materials reach the public through its over 400 million monthly readers:
Two of our access partners have found that around half of the referrals arriving from Wikipedia were able to authenticate into their subscription resources, suggesting that a large portion of our readers can take advantage of subscriptions provided by scholarly institutions.
Wikipedia is highly influential in the open access ecosystem as well, with a recent study showing higher citation rates for OA materials than for those behind a paywall.
Altmetrics tools (such as Altmetric.com, ImpactStory or Plum Analytics) are recognizing Wikipedia’s importance by including Wikipedia citations in their impact metrics.
Despite these advances, we think this is only the beginning of Wikipedia’s impact on the landscape of scholarly research and discovery. Wikipedia can become a highly integrated research platform within the broader research ecosystem, where the best scholarship is summarized and discoverable, and where Wikipedia effectively becomes the front matter to all research.
However, there are some clear barriers to fulfilling this vision. Currently, most citations on Wikipedia are stored as free text and are not readily available in machine-readable formats; our community is working to fix this. Wikipedia also has major systematic gaps in topics where either we lack volunteer interest or Wikipedia reflects larger systemic biases within society or scholarship. We need the help of volunteers, experts, industry partners, and information technologists to grow Wikipedia’s collection of citations, especially around key missing areas, and to transform existing citations into structured formats.
Wikidata, Wikipedia’s sister project, which crowdsources structured metadata, offers an excellent opportunity for improving the impact of Wikipedia in research. Having Wikipedia citations stored in this structured ecosystem, connecting metadata with semantic meaning, would allow the citations in Wikipedia to become the backbone for discovery tools that emphasize the hand-curated interrelationships between authoritative sources and the knowledge collected by Wikipedia and Wikidata editors.
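To make the gap between free-text and structured citations a little more concrete, here is a minimal sketch of one small step in that direction: pulling a DOI out of a free-text citation and returning it as a simple record. It is only an illustration, not a description of how Wikipedia or Wikidata tooling works. The function name `extract_citation`, the record fields (`raw`, `doi`, `url`), and the sample citation are all hypothetical, and the pattern covers only the common modern DOI shape.

```python
import re

# Illustrative pattern for the common modern DOI shape: "10.", a 4-9 digit
# registrant prefix, "/", then a suffix. It is an assumption of this sketch
# and does not cover every DOI ever registered.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/[-._;()/:A-Za-z0-9]+")


def extract_citation(free_text):
    """Return a small structured record for one free-text citation.

    The field names ("raw", "doi", "url") are hypothetical, chosen for
    this sketch only.
    """
    match = DOI_PATTERN.search(free_text)
    if match is None:
        return {"raw": free_text, "doi": None, "url": None}
    # Sentence punctuation often trails a DOI in running text; trim it.
    doi = match.group(0).rstrip(".,;")
    return {"raw": free_text, "doi": doi, "url": "https://doi.org/" + doi}


# Made-up citation; 10.1000/xyz123 is a placeholder, not a real article.
record = extract_citation("Doe, J. (2015). An example article. doi:10.1000/xyz123.")
print(record["doi"])
```

A record like this is still far from a full structured citation, which would also carry authors, titles, and dates, but it shows why a persistent identifier is a useful first hook for linking a citation to machine-readable metadata.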
We need more collaborators to realize the full vision of Wikipedia supporting research in the most effective ways:
We need publishers with subscription databases to help us give our editors access to those databases through The Wikipedia Library’s access partnership program. These high-quality source materials allow our editors to expose that research in a number of languages and for millions of readers.
We need your expertise to build our structured metadata ecosystem, by helping Wikidata map and collect citation data.
We need the larger research community to promote Wikipedia as a scholarly communications tool and make contributing to Wikipedia an important part of the social responsibility of experts. Wider citation of sources in Wikipedia ensures widespread discovery and dissemination of that research.