Issue/Id-space names or identifiers
We are looking at minting URIs possessing fields for id-space designation and record-within-id-space designation. This note considers the question of what the id-space designators should look like.
The tradition with DBXREFs is to identify id-spaces (databanks, databases, repositories, or particular partitions of these) with short alphanumeric names. For example, the UniProt DBXREF databank table has the following entry:
AC    : DB-0001
Abbrev: AGD
Name  : Ashbya genome database
Ref   : BMC Genomics 8:9-9(2007); PubMed=17212814; DOI=10.1186/1471-2164-8-9;
LinkTp: Explicit
Server: http://agd.vital-it.ch/
Db_URL: agd.vital-it.ch/Ashbya_gossypii/geneview?gene=%s
We could follow this convention, establishing our own registry of short id-space names such as 'agd', and minting URIs that look like http://sharedname.org/agd/12345... We could initialize this registry from one or more DBXREF tables such as the ones from UniProt, NCBI, or OBO.
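The registry idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names (parse_dbxref_entry, register, mint_uri, source_url), the lowercasing of the abbreviation, and the in-memory dict are all hypothetical choices, not part of any agreed Shared Names design. The entry text and the base URI are taken from this note.

```python
from urllib.parse import quote

BASE = "http://sharedname.org/"  # base URI as used in this note

# The UniProt DBXREF entry quoted in this note.
ENTRY_TEXT = """
AC    : DB-0001
Abbrev: AGD
Name  : Ashbya genome database
Ref   : BMC Genomics 8:9-9(2007); PubMed=17212814; DOI=10.1186/1471-2164-8-9;
LinkTp: Explicit
Server: http://agd.vital-it.ch/
Db_URL: agd.vital-it.ch/Ashbya_gossypii/geneview?gene=%s
"""


def parse_dbxref_entry(text):
    """Parse the 'Key: value' lines of one databank entry into a dict."""
    entry = {}
    for line in text.strip().splitlines():
        if not line.strip():
            continue
        # Split on the first colon only, so URLs in the value stay intact.
        key, _, value = line.partition(":")
        entry[key.strip()] = value.strip()
    return entry


def register(registry, entry):
    """Add an entry under its lowercased abbreviation (an assumed convention)."""
    name = entry["Abbrev"].lower()
    if name in registry:
        raise ValueError(f"id-space name already registered: {name}")
    registry[name] = entry
    return name


def mint_uri(registry, id_space, record_id):
    """Build a shared-name URI from an id-space name and a record designator."""
    if id_space not in registry:
        raise KeyError(f"unknown id-space: {id_space}")
    return f"{BASE}{id_space}/{quote(str(record_id), safe='')}"


def source_url(registry, id_space, record_id):
    """Expand the entry's Db_URL template for one record."""
    template = registry[id_space]["Db_URL"]
    return template.replace("%s", quote(str(record_id), safe=""))


registry = {}
name = register(registry, parse_dbxref_entry(ENTRY_TEXT))
print(mint_uri(registry, name, 12345))
print(source_url(registry, name, 12345))
```

Keeping the whole DBXREF entry in the registry, rather than just the short name, means the shared-name URI and the databank's own URL template stay linked in one place.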
Users like alphanumeric id-space designators because they are easy to read and remember, and databank owners like them because they give exposure to their brand name.
The problem with alphanumeric id-space names is that id-spaces frequently change their names and/or branding. For example, around 2003 'LocusLink' (often abbreviated LL) changed its name to 'Entrez Gene'. (The system was reengineered at the same time, but the id-space remained almost completely compatible across the switch.) Many sources still refer to Entrez Gene using the abbreviation 'LL', causing some awkwardness and confusion.
This same problem comes up in the publishing industry, where publishers are subject not only to name changes but also to mergers and acquisitions. Recognizing this reality, the Handle System (of which the DOI system is a part) decided to designate publishers with numbers, not alphanumeric strings, as a way to deflect rebranding pressures. Rebranding doesn't affect the DOI, since the old brand name doesn't occur in the DOI; only its numerical representative does, and no one cares much about that.
Using numerical designators would protect the Shared Names system against possible threats related to ownership of a brand. The ownership could be of either the legal kind or the "moral" kind (in the sense of general recognition that a name or acronym "belongs" to some recognized entity). The failure scenario is that the Shared Names user community starts using a URI, say http://sharedname.org/acme/123. Years later, the owner of the Acme name overhauls its namespace; record 123 is now called record 112233. Shared Names responds by setting up a server that translates the old record designators to the new ones and forwards to the new Acme web site. Acme then claims that Shared Names has no right to refer to record 112233 under the old (stable) name acme/123, since it ought to be Acme that decides what its own record numbers mean. The user community is now in a pickle.
As in the DOI case, use of numeric id-space designators protects against this kind of trouble.
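To make the numeric alternative concrete, here is a minimal Python sketch of a registry entry whose URI depends only on a number, with the brand kept as a mutable label. Everything specific in it is an assumption for illustration: the IdSpace class, the id-space number 42, the replacement brand 'NewCo', and the URI shape http://sharedname.org/42/123 are hypothetical. Only the Acme record numbers come from the scenario above.

```python
from dataclasses import dataclass, field

BASE = "http://sharedname.org/"  # base URI as used in this note


@dataclass
class IdSpace:
    number: int                      # stable, brand-neutral designator
    label: str                       # current brand name, free to change
    former_labels: list = field(default_factory=list)
    # Old record designator -> current designator at the source.
    record_map: dict = field(default_factory=dict)

    def rename(self, new_label):
        """Record a rebranding without touching the numeric designator."""
        self.former_labels.append(self.label)
        self.label = new_label


def mint_uri(space, record_id):
    """The URI uses only the number, so the brand never appears in it."""
    return f"{BASE}{space.number}/{record_id}"


def current_source_record(space, record_id):
    """Translate a stable record designator to the source's current one."""
    return space.record_map.get(record_id, record_id)


# Hypothetical id-space number 42 standing in for 'Acme'.
acme = IdSpace(number=42, label="Acme")
uri_before = mint_uri(acme, "123")

# The owner rebrands and renumbers: record 123 becomes record 112233.
acme.rename("NewCo")
acme.record_map["123"] = "112233"
uri_after = mint_uri(acme, "123")

print(uri_before == uri_after)
print(current_source_record(acme, "123"))
```

The shared-name URI is identical before and after the rebranding; only the label and the translation table change, and neither is visible in the URI.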
Regardless of which system we choose (branded / alphanumeric vs. neutral / numeric) we should make the choice consciously and provide a rationale.
Trademark
For information on fair use of trademarks please see
