Blog

Crossref to Auto-Update ORCID Records

In the next few weeks, authors with an ORCID iD will be able to have Crossref automatically push information about their published work to their ORCID record. It’s something that ORCID users have been asking for and we’re pleased to be the first to develop the integration. 230 publishers already include ORCID iDs in their metadata deposits with us, and currently there are 248,000 DOIs that include ORCID iDs.

Best Practices for Depositing Funding Data

Crossref’s funding data initiative (FundRef) encourages publishers to deposit information about the funding sources of authors’ research as acknowledged in their papers. The funding data comprises funder name and identifier, and grant number or numbers. Funding data can be deposited on its own or with the rest of the metadata for an item of content.

Introducing the Crossref Labs DOI Chronograph

tl;dr http://chronograph.labs.crossref.org

At Crossref we mint DOIs for publications and send them out into the world, but we like to hear how they’re getting on out there. Obviously, DOIs are used heavily within the formal scholarly literature and for citations, but they’re increasingly being used outside of formal publications in places we didn’t expect. With our DOI Event Tracking / ALM pilot project we’re collecting information about how DOIs are mentioned on the open web to try and build a picture about new methods of citation.

♫ Researchers just wanna have funds ♫

Cyndi Lauper

Summary

You can use a new Crossref API to query all sorts of interesting things about who funded the research behind the content Crossref members publish.

Background

Back in May 2013 we launched Crossref’s FundRef service. It can be summarized like this:

  • Crossref keeps and manages a canonical list of Funder Names (ephemeral) and associated identifiers (persistent).
  • We encourage our members (or anybody, really; the list is available under a CC-Zero license waiver) to use this list for collecting information on who funded the research behind the content that our members publish.
  • We then ask that our members deposit this data in their normal Crossref metadata deposits.

And that was cool.

PDF-Extract

Crossref Labs is happy to announce the first public release of “pdf-extract”, an open source set of tools and libraries for extracting citation references (and, eventually, other semantic metadata) from PDFs. We first demonstrated this tool to Crossref members at our annual meeting last year. See the pdf-extract labs page for a detailed introduction to this new set of tools.

If you are unable to download and install the tool, you can play with an experimental web interface called “Extracto.” Be warned, Extracto is running on a very feeble server using an erratic and slow internet connection. The only guarantee that we can make about using it is that it will repeatedly fall over and annoy you. The weasel has spoken.

Turning DOIs into formatted citations

Today two new record types were added to dx.doi.org resolution for Crossref DOIs. These allow anyone to retrieve DOI bibliographic metadata as formatted bibliographic entries. To perform the formatting we’re using the citation style language processor citeproc-js, which supports a shed load of citation styles and locales.

In fact, all the styles and locales found in the CSL repositories, including many common styles such as bibtex, apa, ieee, harvard, vancouver and chicago, are supported. First off, if you’d like to try citation formatting without using content negotiation, there’s a simple web UI that allows input of a DOI, style and locale selection. If you’re more into accessing the web via your favorite programming language, have a look at these content negotiation curl examples.

To make a request for the new “text/bibliography” record type:

$ curl -LH "Accept: text/bibliography; style=bibtex" http://dx.doi.org/10.1038/nrd842
@article{Atkins_Gershell_2002, title={From the analyst’s couch: Selective anticancer drugs}, volume={1}, DOI={10.1038/nrd842}, number={7}, journal={Nature Reviews Drug Discovery}, author={Atkins, Joshua H. and Gershell, Leland J.}, year={2002}, month={Jul}, pages={491-492}}

A locale can be specified with the “locale” record type parameter, like this:

$ curl -LH "Accept: text/bibliography; style=mla; locale=fr-FR" http://dx.doi.org/10.1038/nrd842
Atkins, Joshua H., et Leland J. Gershell. « From the analyst’s couch: Selective anticancer drugs ». Nature Reviews Drug Discovery 1.7 (2002): 491-492.

You may want to process metadata through CSL yourself. For this use case, there’s another new record type, “application/citeproc+json”, that returns metadata in a citeproc-friendly JSON form:

$ curl -LH "Accept: application/citeproc+json" http://dx.doi.org/10.1038/nrd842
{"volume":"1","issue":"7","DOI":"10.1038/nrd842","title":"From the analyst’s couch: Selective anticancer drugs","container-title":"Nature Reviews Drug Discovery","issued":{"date-parts":[[2002,7]]},"author":[{"family":"Atkins","given":"Joshua H."},{"family":"Gershell","given":"Leland J."}],"page":"491-492","type":"article-journal"}

Finally, to retrieve lists of supported styles and locales, see:

Content Negotiation for Crossref DOIs

So does anybody remember the posting DOIs and Linked Data: Some Concrete Proposals?

Well, we went with option “D.”

From now on, DOIs, expressed as HTTP URIs, can be used with content-negotiation.

Let’s get straight to the point. If you have curl installed, you can start playing with content-negotiation and Crossref DOIs right away:

curl -D - -L -H "Accept: application/rdf+xml" "http://dx.doi.org/10.1126/science.1157784"

curl -D - -L -H "Accept: text/turtle" "http://dx.doi.org/10.1126/science.1157784"
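
For readers who would rather script this than use curl, the same requests can be sketched in Python with only the standard library. This is an illustration, not an official client: the helper names `negotiated_request` and `fetch` are made up here, and the media types are simply the two from the curl commands above. The top-level code only builds a request and prints it; nothing is sent over the network until `fetch` is called.

```python
from urllib.request import Request, urlopen

DOI_RESOLVER = "http://dx.doi.org/"  # the resolver used in the curl commands above


def negotiated_request(doi: str, media_type: str) -> Request:
    """Build (but do not send) a GET request asking for one representation.

    Hypothetical helper for illustration; not part of any Crossref library.
    """
    return Request(DOI_RESOLVER + doi, headers={"Accept": media_type})


def fetch(doi: str, media_type: str) -> str:
    """Send the request and return the body as text.

    urlopen follows redirects by default, much like curl's -L flag.
    """
    with urlopen(negotiated_request(doi, media_type), timeout=30) as resp:
        return resp.read().decode("utf-8", errors="replace")


# Inspect the request without touching the network.
req = negotiated_request("10.1126/science.1157784", "text/turtle")
print(req.full_url)
print(req.get_header("Accept"))
```

Calling `fetch("10.1126/science.1157784", "application/rdf+xml")` would then perform the first curl request; what comes back depends on what the resolver serves at the time.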

Add Crossref metadata to PDFs using XMP

Geoffrey Bilder – 2009 December 09

In Metadata, PDF, XMP

In order to encourage publishers and other content producers to embed metadata into their PDFs, we have released an experimental tool called “pdfmark”. This open source tool allows you to add XMP metadata to a PDF. What’s really cool is that if you give the tool a Crossref DOI, it will look up the metadata in Crossref and then apply said metadata to the PDF. More detail can be found on the pdfmark page on the Crossref Labs site. The usual weasel words and excuses about “experiments” apply.

Recommendations on RSS Feeds for Scholarly Publishers

We’re pleased to announce that a Crossref working group has released a set of best practice recommendations for scholarly publishers producing RSS feeds.

Variations in practice amongst publisher feeds can be irritating for end-users, but they can be insurmountable for automated processes. RSS feeds are increasingly being consumed by knowledge discovery and data mining services. In these cases, variations in date formats, the practice of lumping all authors together in one <dc:creator> element, or generating invalid XML can render the RSS feed useless to the service accessing it.
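
To make the author example concrete, here is a small sketch of what a consuming service runs into. The feed fragment and the names in it are invented for illustration; they are not taken from any publisher's feed or from the recommendations themselves. One item has a `<dc:creator>` element per author, the other lumps all authors into a single element.

```python
import xml.etree.ElementTree as ET

# Dublin Core element namespace, in ElementTree's {uri}tag notation.
DC = "{http://purl.org/dc/elements/1.1/}"

# Invented feed fragment, for illustration only.
FEED = """<rss version="2.0" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <item>
      <title>One creator element per author</title>
      <dc:creator>Doe, Jane</dc:creator>
      <dc:creator>Roe, Richard</dc:creator>
    </item>
    <item>
      <title>All authors lumped together</title>
      <dc:creator>Doe, Jane, Roe, Richard</dc:creator>
    </item>
  </channel>
</rss>"""


def creators(item):
    """Return the text of every dc:creator element in an item."""
    return [el.text.strip() for el in item.findall(DC + "creator") if el.text]


root = ET.fromstring(FEED)
authors_by_title = {
    item.findtext("title"): creators(item) for item in root.iter("item")
}
for title, names in authors_by_title.items():
    print(title, "->", names)
```

The first item yields two separate author strings. The second yields a single string, and because commas also appear inside each name, a machine cannot reliably tell where one author ends and the next begins.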

Citation Typing Ontology

I was happy to read David Shotton’s recent Learned Publishing article, Semantic Publishing: The Coming Revolution in scientific journal publishing, and see that he and his team have drafted a Citation Typing Ontology.*

Anybody who has seen me speak at conferences knows that I often like to proselytize about the concept of the “typed link”, a notion that hypertext pioneer Randy Trigg discussed extensively in his 1983 Ph.D. thesis. Basically, Trigg points out something that should be fairly obvious: a citation (i.e. “a link”) is not always a “vote” in favor of the thing being cited.

In fact, there are all sorts of reasons that an author might want to cite something. They might be elaborating on the item cited, they might be critiquing the item cited, they might even be trying to refute the item cited. (For an exhaustive and entertaining survey of the use and abuse of citations in the humanities, Anthony Grafton’s The Footnote: A Curious History is a rich source of examples.)

Unfortunately, the naive assumption that a citation is tantamount to a vote of confidence has become enshrined in everything from the way in which we measure scholarly reputation, to the way in which we fund universities and the way in which search engines rank their results. The distorting effect of this assumption is profound. If nothing else, it leads to a perverse situation in which people will often discuss books, articles, and blog postings that they disagree with without actually citing the relevant content, just so that they can avoid inadvertently conferring “whuffie” on the item being discussed. This can’t be right.

Having said that, there has been a half-hearted attempt to introduce a gross level of link typology with the introduction of the “nofollow” link attribute, an initiative started by Google in order to try to address the increasing problem of “Spamdexing”. But this is a pretty ham-fisted form of link typing, particularly in the way it is implemented by the Wikipedia, where Crossref DOI links to formally published scholarly literature have a “nofollow” attribute attached to them but, inexplicably, items with a PMID are not so hobbled (view the HTML source of this page, for example). Essentially, this means that the Wikipedia is a black-hole of reputation. That is, it absorbs reputation (through links to the Wikipedia), but it doesn’t let reputation back out again. Hell, I feel dirty for even linking to it here ;-).

Anyway, scholarly publishers should certainly read Shotton’s article because it is full of good and practical ideas about what can be done with today’s technology in order to help us move beyond the “digital incunabula” that the industry is currently churning out. The sample semantic article that Shotton’s team created is inspirational and I particularly encourage people to look at the source file for the ontology-enhanced bibliography, which reveals just how much more useful metadata can be associated with the humble citation.

And now I wonder whether CiteULike, Connotea, 2Collab or Zotero will consider adding support for the Citation Typing Ontology into their respective services?
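
To illustrate why the type of a link matters, here is a toy sketch of typed citations. The three labels come from the reasons for citing listed earlier in this post (elaborating, critiquing, refuting); they are illustrative only and are not the vocabulary of the Citation Typing Ontology. The paper identifiers are placeholders.

```python
from collections import Counter
from dataclasses import dataclass
from enum import Enum


class CitationType(Enum):
    # Illustrative labels only; not the Citation Typing Ontology's own terms.
    ELABORATES = "elaborates"
    CRITIQUES = "critiques"
    REFUTES = "refutes"


@dataclass(frozen=True)
class Citation:
    citing: str
    cited: str
    kind: CitationType


# Placeholder data: three papers cite "paper-x" for three different reasons.
citations = [
    Citation("paper-a", "paper-x", CitationType.ELABORATES),
    Citation("paper-b", "paper-x", CitationType.CRITIQUES),
    Citation("paper-c", "paper-x", CitationType.REFUTES),
]

# The naive view: every citation counts as one vote.
untyped_count = sum(1 for c in citations if c.cited == "paper-x")

# The typed view: the same citations, broken down by why they were made.
typed_counts = Counter(c.kind for c in citations if c.cited == "paper-x")

print(untyped_count)
print({kind.value: n for kind, n in typed_counts.items()})
```

An untyped count credits "paper-x" with three votes, while the typed breakdown shows that one of those citations is an attempt to refute it.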
* Disclosure:
a) I am on the editorial board of Learned Publishing
b) Crossref has consulted with David Shotton on the subject of semantically enhancing journal articles