Description
One of the greatest gotchas of the DOI protocol is its case insensitivity as explained in the handbook:
DOI names are case insensitive, using ASCII case folding for comparison of text. (Case insensitivity for DOI names applies only to ASCII characters. DOI names which differ in the case of non-ASCII Unicode characters may be different identifiers.) 10.123/ABC is identical to 10.123/AbC. All DOI names are converted to upper case upon registration, which is a common practice for making any kind of service case insensitive. The same is true with resolution. If a DOI name were registered as 10.123/ABC, then 10.123/abc will resolve it and an attempt to register 10.123/AbC would be rejected with the error message that this DOI name already existed.
The passage above makes ALL CAPS seem like the canonical form, but I don't think there is actually a standard for reproducibly casing DOIs in practice. For example, does Crossref use the same casing conventions as LibGen? Without a canonical case, any analysis that requires unique identification needs to convert all DOIs to either lowercase or uppercase first.
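As a rough sketch of what that conversion could look like, the snippet below lowercases only the ASCII letters, since the handbook limits case insensitivity to ASCII characters. The function name `normalize_doi` and the choice of lowercase over uppercase are assumptions made for illustration, not an existing convention in our resources.

```python
import string

# Translation table that maps only A-Z to a-z.
# Non-ASCII characters are left untouched, following the handbook's
# statement that case insensitivity applies only to ASCII characters.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def normalize_doi(doi: str) -> str:
    """Strip surrounding whitespace and lowercase ASCII letters (hypothetical helper)."""
    return doi.strip().translate(_ASCII_LOWER)


print(normalize_doi("10.123/AbC"))  # prints 10.123/abc
```

Plain `str.lower()` would also fold non-ASCII letters, which could merge DOIs that the handbook treats as potentially distinct, so the translation table is the more conservative choice.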
This issue is a reminder to go through our resources and consistently case the DOIs and deal with any duplicate DOI issues that arise.
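One way to surface the duplicates in a given resource is to group DOIs by their case-folded form and report every group that contains more than one spelling. The following is a minimal sketch under that assumption; `find_case_collisions` and the sample DOIs are hypothetical, and real resources would need their own loading code.

```python
import string
from collections import defaultdict

# ASCII-only lowercasing table; non-ASCII characters are left as-is.
_ASCII_LOWER = str.maketrans(string.ascii_uppercase, string.ascii_lowercase)


def find_case_collisions(dois):
    """Map each case-folded DOI to its distinct original spellings,
    keeping only DOIs that appear with more than one casing."""
    variants = defaultdict(set)
    for doi in dois:
        variants[doi.translate(_ASCII_LOWER)].add(doi)
    return {key: sorted(forms) for key, forms in variants.items() if len(forms) > 1}


# Illustrative input only, not real data.
sample = ["10.123/ABC", "10.123/abc", "10.123/AbC", "10.456/xyz"]
print(find_case_collisions(sample))
# prints {'10.123/abc': ['10.123/ABC', '10.123/AbC', '10.123/abc']}
```

Exact repeats of the same spelling are collapsed by the set, so this only reports casing conflicts. Counting plain duplicates would need a separate pass.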