Blog

2022 public data file of more than 134 million metadata records now available

In 2020 we released our first public data file, something we’ve turned into an annual affair supporting our commitment to the Principles of Open Scholarly Infrastructure (POSI). We’ve just posted the 2022 file, which can now be downloaded via torrent, as in years past.

We aim to publish these in the first quarter of each year, though as you may notice, we’re a little behind our intended schedule. We delayed this release so that it could include critical new metadata fields, including resource URLs and titles with markup.

Amendments to membership terms to open reference distribution and include UK jurisdiction

Tl;dr

Forthcoming amendments to Crossref’s membership terms will include:

  1. Removal of ‘reference distribution preference’ policy: all references in Crossref will be treated as open metadata from 3rd June 2022.

  2. An addition to sanctions jurisdictions: the United Kingdom will be added to sanctions jurisdictions that Crossref needs to comply with.


Sponsors and members have been emailed today with the 60-day notice needed for changes in terms.

Reference distribution preferences

In 2017, when we consolidated our metadata services under Metadata Plus, we made it possible for members to set a preference for the distribution of references to Open, Limited, or Closed. Prior to the 2017 change, we acted as a broker of 1:1 feeds of parts of metadata for parts of our community - clearly a role that was not scalable.

With a little help from your Crossref friends: Better metadata

We talk so much about more and better metadata that a reasonable question might be: what is Crossref doing to help?

Members and their service partners do the heavy lifting to provide Crossref with metadata and we don’t change what is supplied to us. One reason is that members can and often do change their records (important note: updated records do not incur fees!). However, we do a fair amount of behind-the-scenes work to check and report on the metadata as well as to add context and relationships. As a result, some of what you see in the metadata (and some of what you don’t) is facilitated, added or updated by Crossref.

A Registry of Editorial Boards - a new trust signal for scholarly communications?

Background

Perhaps, like us, you’ve noticed that it is not always easy to find information on who is on a journal’s editorial board and, when you do, it is often unclear when it was last updated. The editorial board details might be displayed in multiple places (such as the publisher’s website and the platform where the content is hosted), which may or may not be in sync. Retrieving this information for any kind of analysis always requires manually checking and exporting the data from a website (as illustrated by the Open Editors research and its dataset).

A ROR-some update to our API

Earlier this year, Ginny posted an exciting update on Crossref’s progress with adopting ROR, the Research Organization Registry for affiliations, announcing that we’d started the collection of ROR identifiers in our metadata input schema. 🦁

The capacity to accept ROR IDs to help reliably identify institutions is really important but the real value comes from their open availability alongside the other metadata registered with us, such as for publications like journal articles, book chapters, preprints, and for other objects such as grants. So today’s news is that ROR IDs are now connected in Crossref metadata and openly available via our APIs. 🎉
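As a rough illustration of what that open availability could look like in practice, here is a minimal Python sketch. It rests on several assumptions that are not confirmed in this post: that the works endpoint accepts a `has-ror-id` filter, and that ROR IDs appear under each author's `affiliation` entries as an `id` list with an `id-type` of `ROR`. The helper names and the sample record are hypothetical, so check the REST API documentation before relying on any of it.

```python
from urllib.parse import urlencode

# Assumed endpoint for the public REST API.
WORKS_ENDPOINT = "https://api.crossref.org/works"


def ror_works_url(rows=5, record_type=None):
    """Build a query URL for records that carry at least one ROR ID.

    The `has-ror-id` filter name is an assumption; filters are joined
    with commas.
    """
    filters = ["has-ror-id:true"]
    if record_type:
        filters.append(f"type:{record_type}")
    params = {"filter": ",".join(filters), "rows": rows}
    # Keep ':' and ',' unescaped so the URL stays readable.
    return f"{WORKS_ENDPOINT}?{urlencode(params, safe=':,')}"


def extract_ror_ids(record):
    """Collect ROR IDs from one record (a dict parsed from JSON).

    Assumes the shape author -> affiliation -> id -> {"id", "id-type"}.
    Missing keys are treated as empty rather than as errors.
    """
    found = []
    for author in record.get("author", []):
        for affiliation in author.get("affiliation", []):
            for identifier in affiliation.get("id", []):
                if identifier.get("id-type") == "ROR":
                    found.append(identifier["id"])
    return found


# Made-up record for illustration only; the ROR ID is a placeholder.
SAMPLE_RECORD = {
    "author": [
        {
            "family": "Example",
            "affiliation": [
                {"name": "Example University",
                 "id": [{"id": "https://ror.org/EXAMPLE", "id-type": "ROR"}]}
            ],
        },
        {"family": "NoAffiliation"},
    ]
}
```

Calling `ror_works_url(record_type="journal-article")` gives a URL that can be pasted into a browser, and `extract_ror_ids` can then be run over each item in the response.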

Come and get your grant metadata!

Tl;dr: Metadata for the (currently 26,000) grants that have been registered by our funder members is now available via the REST API. This is quite a milestone in our program to include funding in Crossref infrastructure and a step forward in our mission to connect all.the.things. This post gives you all the queries you might need to satisfy your curiosity and start to see what’s possible with deeper analysis. So have a look and see what useful things you can discover.
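For a quick start, a query for grant records might be sketched like this in Python. It is an illustration rather than one of the post's own queries: it assumes grants can be selected with a `type:grant` filter on the works endpoint, that the response reports a count under `message` / `total-results`, and that an optional `mailto` parameter identifies the caller. The function names are hypothetical.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Assumed endpoint for the public REST API.
WORKS_ENDPOINT = "https://api.crossref.org/works"


def grants_url(rows=0, mailto=None):
    """Build a URL that selects grant records.

    rows=0 asks for no items, which is enough when only the count matters.
    """
    params = {"filter": "type:grant", "rows": rows}
    if mailto:
        params["mailto"] = mailto
    # Keep ':' and '@' unescaped so the URL stays readable.
    return f"{WORKS_ENDPOINT}?{urlencode(params, safe=':@')}"


def total_results(payload):
    """Read the result count from a parsed API response."""
    return payload["message"]["total-results"]


def count_grants(mailto=None):
    """Fetch the current number of registered grants (needs network access)."""
    with urlopen(grants_url(rows=0, mailto=mailto), timeout=30) as response:
        return total_results(json.load(response))
```

Running `count_grants()` makes one live request; raising `rows` in `grants_url` returns actual grant records to explore instead of only the count.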

Lesson learned, the hard way: Let’s not do that again!

TL;DR

We missed an error that led to the resource resolution URLs of some 500,000+ records being incorrectly updated. We have reverted the incorrect resolution URLs affected by this problem. We’re also putting in place checks and changes in our processes to ensure this does not happen again.

How we got here

Our technical support team was contacted in late June by Wiley about updating resolution URLs for their content. It’s a common request of our technical support team, one meant to make the URL update process more efficient, but this was a particularly large request. Shortly thereafter, we were provided with nearly 1,200 separate files by Atypon on behalf of Wiley in order to update the resolution URLs of ~9 million records. We manually spot-checked over 50 of these files because, prior to this issue, our technical support team did not have a mechanism to automatically check for errors. That labor-intensive review did not turn up any problems. That is, those 50 samples had none of the header errors that were found later.

Some rip-RORing news for affiliation metadata

We’ve just added to our input schema the ability to include affiliation information using ROR identifiers. Members who register content using XML can now include ROR IDs, and we’ll add the capability to our manual content registration form, participation reports, and metadata retrieval APIs in the near future. And we are inviting members to a Crossref/ROR webinar on 29th September at 3pm UTC.

The background

We’ve been working on the Research Organization Registry (ROR) as a community initiative for the last few years. Along with the California Digital Library and DataCite, our staff has been involved in setting the strategy, planning governance and sustainability, developing technical infrastructure, hiring/loaning staff, and engaging with people in person and online. In our view, it’s the best current model of a collaborative initiative between like-minded open scholarly infrastructure (OSI) organizations.

RFP: Help evaluate the reach and effects of metadata

Jennifer Kemp – 2021 July 21

In Metadata, Community

UPDATE, 14 October 2021:

We received several excellent proposals in response to this RFP and we’d like to thank everyone involved for their time and enthusiasm.

We are excited to announce the two projects that have been selected, to run through early 2023. Stay tuned!

With or Without: Measuring Impacts of Books Metadata
This project will test the premise that academic books metadata improves discoverability and usage by assessing the impact of book chapter records with DOIs (distinct from metadata associated with the entire book) with associated chapter and book attributes. The study aims to prove or disprove its hypothesis and rank metadata attributes by their association with successful content discovery and access. The findings will be considered alongside similar metadata research in order to develop a metadata efficacy framework, which can be used to determine the return on metadata investments by publishers and service providers.

An Advisory Group for Preprints

We are delighted to announce the formation of a new Advisory Group to support us in improving preprint metadata. Preprints have grown in popularity over the last few years, with increasing focus brought by the need to rapidly disseminate knowledge in the midst of a global pandemic. We have supported metadata deposits for preprints under the record type ‘posted content’ since 2016, and members currently register a total of around 17,000 new preprint metadata records each month.